[testing] Integration test: FrontierRunner → merge → analyze #47

Mock all API connectors with canned ESI responses. Run:

1. `yentlbench run --provider api --model mock-gpt mock-claude` (3 vignettes × 4 variants)
2. `yentlbench merge`
3. `yentlbench analyze`

Assert:
- 8 `.run.json` files produced (2 models × 4 variants)
- `cost_summary.json` written with correct token counts
- `merged_evaluations.csv` contains columns for both mock models
- No real API endpoints called
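The assertions above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the project's actual test: the file naming (`<model>__<variant>.run.json`), the `cost_summary.json` schema, the CSV column names, and the helpers `no_network`, `check_outputs`, and `write_fake_outputs` are all hypothetical, and `write_fake_outputs` merely stands in for the three `yentlbench` steps so the checker can be exercised on its own.

```python
import csv
import json
import socket
import tempfile
from contextlib import contextmanager
from pathlib import Path
from unittest import mock

MODELS = ["mock-gpt", "mock-claude"]
VARIANTS = ["v1", "v2", "v3", "v4"]  # placeholder names; the real variants may differ


@contextmanager
def no_network():
    """Fail loudly if in-process code tries to open a real socket connection."""
    with mock.patch.object(
        socket.socket,
        "connect",
        side_effect=RuntimeError("real network call attempted"),
    ):
        yield


def check_outputs(out_dir, expected_tokens):
    """Apply the issue's file-level assertions to a finished output directory."""
    out_dir = Path(out_dir)

    # 2 models x 4 variants = 8 run files
    run_files = sorted(out_dir.glob("*.run.json"))
    assert len(run_files) == len(MODELS) * len(VARIANTS), run_files

    # Assumed schema: {model: {"prompt_tokens": int, "completion_tokens": int}}
    cost = json.loads((out_dir / "cost_summary.json").read_text())
    assert cost == expected_tokens

    # Each mock model should appear in at least one column name
    with open(out_dir / "merged_evaluations.csv", newline="") as fh:
        header = next(csv.reader(fh))
    for model in MODELS:
        assert any(model in col for col in header), f"no column for {model}"


def write_fake_outputs(out_dir, tokens):
    """Stand-in for run/merge/analyze: writes files in the assumed layout."""
    out_dir = Path(out_dir)
    for model in MODELS:
        for variant in VARIANTS:
            path = out_dir / f"{model}__{variant}.run.json"
            path.write_text(json.dumps({"model": model, "variant": variant}))
    (out_dir / "cost_summary.json").write_text(json.dumps(tokens))
    with open(out_dir / "merged_evaluations.csv", "w", newline="") as fh:
        writer = csv.writer(fh)
        writer.writerow(["vignette_id", "variant"] + [f"{m}_esi" for m in MODELS])


if __name__ == "__main__":
    tokens = {m: {"prompt_tokens": 120, "completion_tokens": 12} for m in MODELS}
    with tempfile.TemporaryDirectory() as tmp, no_network():
        write_fake_outputs(tmp, tokens)  # replace with the real pipeline calls
        check_outputs(tmp, tokens)
```

In a real test, the three pipeline steps would replace `write_fake_outputs` and run inside `no_network()`. Patching `socket.socket.connect` only covers code running in the same Python process, so if the CLI is invoked through a subprocess, the "no real API endpoints called" check would need a different mechanism, such as asserting on call counts of the mocked connectors.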
