Sprint 1: Gemini cost wins #37

@jakebromberg

Goal

Cut the Gemini full-corpus extraction cost from roughly $160-800 to roughly $30-160 by wiring two existing Gemini API features (context caching and the Batch API) into core/gemini.py and trimming the response schema to the fields the model actually needs to produce. This lays the cost-baseline groundwork that Sprint 2 (Flash tier decision) and Sprint 3 (quality long-tail) both depend on.

Sprint scope

Type  Issue                                                                  Effort
PR    A1 Context caching in core/gemini.py                                   ~half day
PR    A2 Gemini Batch API for one-shot corpus run                            ~1-2 days
PR    B Drop derivable Entry.artist_guess / track_guess; add core/parse.py   ~half day
PR    G Mark Modal adapters as research-only (docs)                          ~15 min

Success criteria

  • Billed input-text tokens drop ~75% on cached calls vs the first call (10-page sanity sample).
  • Batch-mode run lands at ~50% of real-time pricing for the same workload.
  • Output tokens drop ~20-30% per page after the schema audit.
  • All existing tests pass; the 5-golden calibration score is unchanged or better.
  • scripts/calibrate_models.py and CLAUDE.md no longer quote the $1200-1800 Modal-corpus band in a way that reads as planned spend.
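
The first three criteria are numeric thresholds, so they could be checked mechanically after the 10-page sanity sample. The sketch below is one possible way to do that, not part of the repo: `fractional_drop`, `check_criteria`, every parameter name, and the 5-point `tolerance` (a stand-in for the "~" in the criteria) are hypothetical. It assumes the token counts and costs have already been collected from the API responses and billing data.

```python
def fractional_drop(before: float, after: float) -> float:
    """Return the fraction by which `after` is lower than `before` (0.75 == 75% drop)."""
    if before <= 0:
        raise ValueError("baseline must be positive")
    return 1.0 - after / before


def check_criteria(
    first_input_tokens: int,   # billed input-text tokens on the first (uncached) call
    cached_input_tokens: int,  # billed input-text tokens on a cached call
    realtime_cost: float,      # cost of the workload at real-time pricing
    batch_cost: float,         # cost of the same workload in batch mode
    output_before: int,        # output tokens per page before the schema audit
    output_after: int,         # output tokens per page after the schema audit
    tolerance: float = 0.05,   # assumed slack for the "~" in each criterion
) -> dict[str, bool]:
    """Compare measured numbers against the sprint's three numeric success criteria."""
    if realtime_cost <= 0:
        raise ValueError("realtime_cost must be positive")
    input_drop = fractional_drop(first_input_tokens, cached_input_tokens)
    output_drop = fractional_drop(output_before, output_after)
    batch_ratio = batch_cost / realtime_cost
    return {
        # ~75% fewer billed input-text tokens on cached calls
        "cached_input_drop_ok": input_drop >= 0.75 - tolerance,
        # batch run lands at ~50% of real-time pricing
        "batch_ratio_ok": batch_ratio <= 0.50 + tolerance,
        # output tokens drop by at least the low end of ~20-30%
        "output_drop_ok": output_drop >= 0.20 - tolerance,
    }


if __name__ == "__main__":
    # Illustrative numbers only, not measurements from this project.
    print(check_criteria(10_000, 2_500, 4.0, 2.0, 1_200, 900))
```

The numbers in the `__main__` block are made up for illustration; real inputs would come from the sanity-sample run.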

Out of scope

  • Gemini Flash tier decision (Sprint 2).
  • Per-quadrant Gemini re-extraction for low-confidence pages (Sprint 3).
  • Phase 2 schema work (date normalization, library reconciliation, etc.).

Context

The Modal explorations (modal-qwen-vl, modal-qwen-vl-quad, modal-churro) ship better infrastructure than model quality: none of them matches Gemini 3 Pro's 68/76 calibration baseline (Modal Qwen-VL-quad scores 50/76 at ~6× the per-page cost). Production stays on Gemini. The cost optimizations in this sprint compound multiplicatively (caching × batch × schema-trim ≈ 5-10× lower per-page billing) without touching extraction quality.
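
To make the "compound multiplicatively" claim concrete, here is a rough back-of-envelope sketch. The default factors come from the success criteria above (75% cached-input drop, 50% batch pricing, 25% as the midpoint of the 20-30% output trim). The function name and `input_cost_share`, the fraction of the per-page bill that comes from input tokens, are assumptions; that share is not stated in this issue. The model also optimistically treats every call as a cache hit and all input as cacheable.

```python
def combined_cost_multiplier(
    input_cost_share: float,            # ASSUMPTION: fraction of per-page cost from input tokens
    cached_input_factor: float = 0.25,  # ~75% drop in billed input tokens on cached calls
    batch_factor: float = 0.50,         # batch mode at ~50% of real-time pricing
    output_trim: float = 0.25,          # midpoint of the ~20-30% output-token reduction
) -> float:
    """Return new_cost / old_cost for one page under the simplified model."""
    if not 0.0 <= input_cost_share <= 1.0:
        raise ValueError("input_cost_share must be between 0 and 1")
    # Caching only reduces the input side; the schema trim only reduces the output side.
    input_part = input_cost_share * cached_input_factor
    output_part = (1.0 - input_cost_share) * (1.0 - output_trim)
    # The batch discount applies to the whole bill.
    return batch_factor * (input_part + output_part)


if __name__ == "__main__":
    for share in (0.5, 0.7, 0.9, 1.0):
        m = combined_cost_multiplier(share)
        print(f"input share {share:.1f}: multiplier {m:.3f} (~{1 / m:.1f}x lower)")
```

Under these assumptions, an input share of 0.7 gives a multiplier of about 0.20 (roughly 5× lower), and an all-input bill gives 0.125 (8× lower). In this simplified model, reaching the upper end of the 5-10× range would need a cached-input drop larger than 75%.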

Related

  • Sprint 2 — Flash tier decision (downstream).
  • Sprint 3 — Quality long-tail (optional, downstream).
