PR A: Batch mode + context caching in core/gemini.py #40

## Problem

`core/gemini.py` calls Gemini one page at a time at real-time pricing with no context caching. The prompt prefix (~2-3K tokens of `PAGE_EXTRACTION_PROMPT` plus the JSON response schema) is identical across the entire ~16K-page corpus, so paying the full input rate for it on every call wastes ~75% of what that portion costs. Gemini also exposes a batch mode at ~50% of real-time pricing with a 24h SLA, which suits a one-shot corpus extraction.
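A rough back-of-envelope version of the prefix arithmetic, as a sketch only. The 2,500-token prefix is an assumed midpoint of the ~2-3K estimate, and the 25% and 50% rates are derived from the approximate figures above rather than from a price sheet. Results are in full-rate token equivalents, not dollars, and cache storage fees are ignored.

```python
PAGES = 16_000          # approximate corpus size quoted above
PREFIX_TOKENS = 2_500   # assumed midpoint of the ~2-3K prefix estimate
CACHED_RATE = 0.25      # assumed: cached tokens billed at ~25% of the input rate
BATCH_RATE = 0.50       # assumed: batch billed at ~50% of real-time


def prefix_cost(pages: int, prefix_tokens: int, rate: float = 1.0) -> float:
    """Cost of the shared prefix, in full-rate token equivalents."""
    return pages * prefix_tokens * rate


baseline = prefix_cost(PAGES, PREFIX_TOKENS)
cached = prefix_cost(PAGES, PREFIX_TOKENS, CACHED_RATE)
batched = prefix_cost(PAGES, PREFIX_TOKENS, BATCH_RATE)

print(f"baseline prefix cost: {baseline:,.0f} token equivalents")
print(f"with caching:         {cached:,.0f} ({1 - cached / baseline:.0%} saved)")
print(f"with batch only:      {batched:,.0f} ({1 - batched / baseline:.0%} saved)")
```

Image tokens and output tokens are not covered here; they are unaffected by caching and would need their own line in a real estimate.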

## End state

`GeminiClient` supports both optimizations:

1. **Context caching.** A `cachedContent` resource is created at the start of a corpus run from `PAGE_EXTRACTION_PROMPT` + the response schema. Subsequent `extract_page` calls reference the cached prefix and only pay full rate for the image plus their own request specifics.
2. **Batch mode (opt-in).** A `batch=True` path (constructor flag or sibling method) routes through Gemini's batch API and reconciles results back to the existing `jobs.db` state machine.

The real-time, non-cached path remains the default; both new modes are opt-in so dev iteration isn't affected.
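One possible shape for the opt-in flags, as a minimal sketch. `GeminiClientSketch`, `plan_request`, and the `cache_name` field are hypothetical names for illustration; the real `GeminiClient` in `core/gemini.py` may be structured differently. The sketch makes no API calls and only shows that both features default to off and can be toggled independently.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GeminiClientSketch:
    """Hypothetical stand-in for GeminiClient; not the real class."""

    batch: bool = False               # opt-in; real-time stays the default
    cache_name: Optional[str] = None  # set once a cachedContent resource exists

    def plan_request(self, page_path: str) -> dict:
        """Describe how a page would be sent, without calling any API."""
        return {
            "page": page_path,
            "mode": "batch" if self.batch else "realtime",
            # None means the full prompt and schema are sent inline.
            "cached_content": self.cache_name,
        }


default = GeminiClientSketch()
opted_in = GeminiClientSketch(batch=True, cache_name="cachedContents/example")

print(default.plan_request("page_0001.png"))
print(opted_in.plan_request("page_0001.png"))
```

Keeping the two flags as separate fields means a run can enable caching without batch, or batch without caching.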

## Where

- [`core/gemini.py`](https://github.com/WXYC/flowsheet-digitization/blob/main/core/gemini.py) — the `GeminiClient` wrapper. New cache lifecycle methods + a batch submission path.
- [`core/pipeline.py`](https://github.com/WXYC/flowsheet-digitization/blob/main/core/pipeline.py) — corpus orchestration; creates the cache once at corpus-run start, polls batch jobs.
- [`core/jobs.py`](https://github.com/WXYC/flowsheet-digitization/blob/main/core/jobs.py) — if batch mode needs an intermediate "submitted" status, add it without breaking the existing `pending → rendered → processing → completed` flow.
- [`cli.py`](https://github.com/WXYC/flowsheet-digitization/blob/main/cli.py) — `--batch` flag on the corpus-extract subcommand.
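A sketch of how the `jobs.py` and `cli.py` changes might fit together. The placement of `submitted` (an optional branch between `rendered` and `processing`) is an assumption, as are the `corpus-extract` subcommand spelling and the use of `argparse`; the existing CLI may use a different framework.

```python
import argparse

# Existing flow from jobs.db plus an assumed, batch-only "submitted" branch.
TRANSITIONS = {
    "pending": {"rendered"},
    "rendered": {"processing", "submitted"},
    "submitted": {"processing"},
    "processing": {"completed"},
    "completed": set(),
}


def can_transition(current: str, new: str) -> bool:
    """Return True if moving from `current` to `new` is allowed."""
    return new in TRANSITIONS.get(current, set())


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with an opt-in --batch flag (hypothetical layout)."""
    parser = argparse.ArgumentParser(prog="cli.py")
    sub = parser.add_subparsers(dest="command", required=True)
    extract = sub.add_parser("corpus-extract")  # assumed subcommand name
    extract.add_argument(
        "--batch",
        action="store_true",
        help="route extraction through the batch API (default: real-time)",
    )
    return parser


args = build_parser().parse_args(["corpus-extract", "--batch"])
print(args.command, args.batch)
```

Because `rendered → processing` is still allowed, rows written by the current real-time path never pass through the new status.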

## Constraints

- Context caching has a minimum-size threshold (currently ~1024 tokens). Confirm the prompt+schema combined comfortably exceeds it. If not, skip caching gracefully.
- Batch mode is async: submit → poll → fetch. The existing pipeline is sync per page, so the batch path needs a polling loop bounded by the SLA. Don't block dev iteration on the 24h SLA; keep real-time as the default.
- Cache TTL is configurable. For a one-shot corpus run the default is fine; if/when we re-run periodically, surface the TTL knob.
- The 34 existing corpus JSONs must remain valid (no schema mutation that would invalidate them).
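The threshold check and the SLA-bounded polling loop could look roughly like this. The state names in `TERMINAL_STATES` are placeholders rather than the API's actual enum values, and `get_state` is a hypothetical callable that the real code would back with a batch-status lookup. `sleep` and `clock` are injectable so the loop can be tested without waiting.

```python
import time
from typing import Callable

MIN_CACHE_TOKENS = 1024  # threshold quoted above; confirm for the model in use
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "CANCELLED", "EXPIRED"}  # placeholders


def should_cache(prefix_tokens: int, minimum: int = MIN_CACHE_TOKENS) -> bool:
    """Skip caching gracefully when the prefix is below the minimum size."""
    return prefix_tokens >= minimum


def poll_until_done(
    get_state: Callable[[], str],
    timeout_s: float = 24 * 60 * 60,  # bounded by the 24h SLA
    interval_s: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll until the job reaches a terminal state or the timeout elapses."""
    deadline = clock() + timeout_s
    while True:
        state = get_state()
        if state in TERMINAL_STATES:
            return state
        if clock() >= deadline:
            raise TimeoutError(f"batch job still {state!r} after {timeout_s}s")
        sleep(interval_s)
```

For example, `poll_until_done(lambda: lookup_state(job_name))` would block until the job finishes, where `lookup_state` is whatever status call the pipeline ends up using.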

## Acceptance criteria

- [ ] `GeminiClient.extract_page` uses `cachedContent` when a cache exists; falls back to the un-cached path cleanly when it doesn't.
- [ ] Pipeline creates the cache once at corpus-run start; subsequent calls reuse it.
- [ ] Batch mode is wired via opt-in flag; submission and polling complete end-to-end against a small (≤10-page) test set.
- [ ] Sanity check: billed input-text tokens drop ~75% on a 10-page run (after the first call); the batch run's total cost is ~50% of the real-time cost for the same pages.
- [ ] All existing tests pass; new tests cover the cache lifecycle and the batch submit/poll flow (mocked).
- [ ] `external_api`-marker tests exercise both paths against the real API in the scheduled workflow.
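A mocked test for the "create once, reuse" criterion might be sketched as follows. `run_corpus`, `create_cache`, and the `cached_content` keyword on `extract_page` are hypothetical names used for illustration; the real pipeline entry point and client methods will differ.

```python
from unittest.mock import MagicMock


def run_corpus(client, pages):
    """Hypothetical orchestration: create the cache once, reuse it per page."""
    cache_name = client.create_cache()
    return [client.extract_page(p, cached_content=cache_name) for p in pages]


def test_cache_created_once():
    client = MagicMock()
    client.create_cache.return_value = "cachedContents/test"

    run_corpus(client, ["p1.png", "p2.png", "p3.png"])

    # The cache is created exactly once for the whole run.
    client.create_cache.assert_called_once()
    # Every page call reuses the same cache handle.
    assert client.extract_page.call_count == 3
    for call in client.extract_page.call_args_list:
        assert call.kwargs["cached_content"] == "cachedContents/test"
```

A batch submit/poll test can follow the same pattern with its own mock, which keeps the two features independently debuggable.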

## Notes for implementer

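- The Gemini SDK exposes `client.caches.create(...)` for creating the cache and a corresponding `cached_content` parameter on generate calls. In the `google-genai` Python SDK, `system_instruction`, `contents`, and `ttl` are passed through `config=types.CreateCachedContentConfig(...)`, and `cached_content` is set on `types.GenerateContentConfig`; check the exact signature for the SDK version in use. The response schema is intended to stay attached to the cache; verify that the cache config accepts it, and otherwise keep passing it per request.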
- Batch mode uses a different endpoint (`client.batches.create` / equivalent). Responses come back as a single payload that you iterate over against your submission order; keep the order aligned with `jobs.db` rows.
- Don't conflate the two features in tests — they're independent and one should be debuggable without the other.
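Order-based reconciliation could be sketched like this. The response shape (a dict with an optional `"error"` key) is an assumption for illustration, not the API's actual payload format, and `reconcile` is a hypothetical helper name.

```python
def reconcile(job_ids: list, responses: list) -> list:
    """Pair each jobs.db row id with the response at the same position.

    Returns (job_id, ok, response) tuples. Refuses to pair anything if the
    counts differ, since a silent misalignment would corrupt the state machine.
    """
    if len(job_ids) != len(responses):
        raise ValueError(
            f"submitted {len(job_ids)} requests but got {len(responses)} responses"
        )
    results = []
    for job_id, response in zip(job_ids, responses):
        ok = not response.get("error")  # assumed error marker
        results.append((job_id, ok, response))
    return results


pairs = reconcile([11, 12], [{"rows": []}, {"error": "quota exceeded"}])
for job_id, ok, _ in pairs:
    print(job_id, "ok" if ok else "needs retry")
```

If the batch API allows a per-request key, matching on that key would be more robust than relying on position alone.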

## Related

- Sprint 1 parent: #37
