PR F: Header/footer-only selective recall when those fields come back null #45

## Problem

When Gemini returns `page_date_raw=null` or `comments_raw=null` but `core.page_layout.detect_page_layout` shows that the corresponding band (header strip / footer strip) has content, the model has most likely missed that specific field. Re-running the full page is overkill; re-calling Gemini against just the cropped strip costs less than a cent and targets only the missing field.

## End state

A post-extraction step in `core/pipeline.py` that, for each completed page:

1. checks whether `page_date_raw` is null AND the header strip has detected content, and/or
2. checks whether `comments_raw` is null AND the footer strip has detected content,

then, for each missing-but-likely-present field, calls Gemini against just the cropped strip with the existing `HEADER_EXTRACTION_PROMPT` / `FOOTER_EXTRACTION_PROMPT`. The recovered value is merged into the `PageResult` and persisted.
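A minimal sketch of the trigger check described above. It assumes the page is available as a plain dict and that the layout result can be reduced to a band-name-to-boolean mapping; the real return shape of `detect_page_layout` is not shown here, so `band_has_content` and `strips_to_recall` are hypothetical names.

```python
from typing import Dict, List, Optional

# Band name -> the field that band is expected to populate.
BAND_FIELDS = {"header": "page_date_raw", "footer": "comments_raw"}


def strips_to_recall(
    page: dict,
    band_has_content: Optional[Dict[str, bool]],
) -> List[str]:
    """Return the bands worth re-querying: at most "header" and "footer".

    band_has_content is an assumed summary of the layout detection result;
    None stands for "layout detection failed".
    """
    if band_has_content is None:
        # No layout evidence: err on the side of not recalling.
        return []
    return [
        band
        for band, field in BAND_FIELDS.items()
        # Both conditions must hold: field is null AND the band has content.
        if page.get(field) is None and band_has_content.get(band, False)
    ]
```

A missing key is treated the same as an explicit null here; whether that is right depends on how `PageResult` is serialized.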

## Where

- [`core/pipeline.py`](https://github.com/WXYC/flowsheet-digitization/blob/main/core/pipeline.py) — the recall step after the main extract.
- [`core/gemini.py`](https://github.com/WXYC/flowsheet-digitization/blob/main/core/gemini.py) — a header-only / footer-only extract method (or a generic `extract_strip(image_bytes, prompt, wire_schema)` that takes the cropped strip and the prompt).
- `_crop_header_strip` / `_crop_footer_strip` currently live in `scripts/calibrate_models.py`. This PR likely moves them to `core/` (e.g. `core/crops.py`) since they'd now be used in production, not just calibration. Update Modal adapter imports accordingly.
- [`core/prompts.py`](https://github.com/WXYC/flowsheet-digitization/blob/main/core/prompts.py) — `HEADER_EXTRACTION_PROMPT` and `FOOTER_EXTRACTION_PROMPT` already exist (built for `modal-qwen-vl-quad`); reuse as-is.
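One possible shape for the generic `extract_strip`, sketched without committing to any Gemini SDK call. The extra `generate` parameter is a hypothetical injection point that stands in for the real client wrapper in `core/gemini.py`, and `wire_schema` is assumed to be a dict.

```python
import json
from typing import Callable, Optional

# Hypothetical transport: (image_bytes, prompt, wire_schema) -> raw JSON text.
# In core/gemini.py this would wrap the actual Gemini client call.
GenerateFn = Callable[[bytes, str, dict], str]


def extract_strip(
    image_bytes: bytes,
    prompt: str,
    wire_schema: dict,
    generate: GenerateFn,
) -> Optional[dict]:
    """Run one strip-only extraction; return the parsed object or None."""
    try:
        raw = generate(image_bytes, prompt, wire_schema)
        parsed = json.loads(raw)
    except Exception:
        # Broad on purpose for this sketch: any transport or parse failure
        # is treated like a null result, and the caller does not retry.
        return None
    # Anything other than a JSON object is also treated as "nothing recovered".
    return parsed if isinstance(parsed, dict) else None
```

Injecting the transport keeps the function testable with a fake, which is how the recall unit tests could avoid real API calls.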

## Constraints

- Recall fires only when BOTH conditions hold: the field is null AND the layout shows content in that band. False positives are expensive (extra API calls); err on the side of not recalling.
- Cap recalls at 2 per page (one header + one footer). Do not retry failed recalls: if the strip-only call also returns null, accept it.
- Persist the recovered field by writing it back to the on-disk JSON. The 34 existing corpus JSONs predate this change, so they aren't candidates for the recall pass.
- Add a `recovered_from_strip_recall` flag (or similar metadata) on the page so downstream consumers can distinguish "Gemini got this on the first try" from "we had to recall." Useful for quality analysis.
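A sketch of the merge-and-persist step under these constraints. It assumes the page JSON is a flat dict and records `recovered_from_strip_recall` as a list of recovered field names, which is only one possible reading of "flag (or similar metadata)". `merge_recall` and `write_page_json` are hypothetical helpers.

```python
import json
import os
import tempfile
from pathlib import Path

RECALL_FIELDS = ("page_date_raw", "comments_raw")


def merge_recall(page: dict, recovered: dict) -> dict:
    """Return a copy of page with recovered values filled in and flagged."""
    merged = dict(page)
    filled = list(merged.get("recovered_from_strip_recall", []))
    for field in RECALL_FIELDS:
        value = recovered.get(field)
        # Fill only fields that are still null; never overwrite a
        # first-pass value, and ignore a recall that came back null.
        if merged.get(field) is None and value is not None:
            merged[field] = value
            filled.append(field)
    merged["recovered_from_strip_recall"] = filled
    return merged


def write_page_json(page: dict, path: Path) -> None:
    """Write via a temp file and rename so readers never see a partial file."""
    fd, tmp = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump(page, fh, indent=2)
    os.replace(tmp, path)
```

An empty list means the page went through the recall-aware pipeline without needing a recall; the pre-existing corpus JSONs would simply lack the key.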

## Acceptance criteria

- [ ] Recall triggers only on null-field + layout-content-detected pages.
- [ ] Recovered value is merged into the on-disk `PageResult`.
- [ ] No regression on pages that don't trigger the recall.
- [ ] Unit tests cover: no recall needed, header-only recall, footer-only recall, both recalls, recall returns null again (graceful accept), layout detection fails (don't recall).
- [ ] Integration test confirms the recovered page is queryable just like a normally-extracted one.
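The six unit-test cases could be table-driven, roughly as below. The function under test, `run_recall`, is a compact hypothetical stand-in for the real pipeline step; the fake strip caller replaces the crop-and-call-Gemini path, and plain asserts are used so the sketch does not depend on a test framework.

```python
from typing import Callable, Dict, Optional

# Hypothetical strip caller: given "header" or "footer", returns the
# recovered string or None. In production it would crop and call Gemini.
StripCaller = Callable[[str], Optional[str]]

BANDS = {"header": "page_date_raw", "footer": "comments_raw"}


def run_recall(page: dict, layout: Optional[Dict[str, bool]],
               call_strip: StripCaller) -> dict:
    """Single-pass recall: at most one call per band, no retries."""
    result = dict(page)
    if layout is None:  # layout detection failed, so do not recall
        return result
    for band, field in BANDS.items():
        if result.get(field) is None and layout.get(band):
            result[field] = call_strip(band)  # may stay None; accepted
    return result


def check_cases() -> int:
    """Run the six listed cases; return how many were checked."""
    both = {"header": True, "footer": True}
    cases = [
        # (name, page_date_raw, comments_raw, layout, answers, expected calls)
        ("no recall needed", "d", "c", both, {}, []),
        ("header-only recall", None, "c", both, {"header": "d"}, ["header"]),
        ("footer-only recall", "d", None, both, {"footer": "c"}, ["footer"]),
        ("both recalls", None, None, both,
         {"header": "d", "footer": "c"}, ["header", "footer"]),
        ("recall returns null again", None, None, both, {},
         ["header", "footer"]),
        ("layout detection fails", None, None, None, {}, []),
    ]
    for name, date, comments, layout, answers, expected_calls in cases:
        calls = []

        def fake_strip(band, answers=answers, calls=calls):
            calls.append(band)
            return answers.get(band)

        page = {"page_date_raw": date, "comments_raw": comments}
        out = run_recall(page, layout, fake_strip)
        assert calls == expected_calls, name
        # Each field is the first-pass value, else the strip answer, else None.
        assert out["page_date_raw"] == (date or answers.get("header")), name
        assert out["comments_raw"] == (comments or answers.get("footer")), name
    return len(cases)
```

Asserting on the recorded calls, not just the output, is what verifies the cap of two recalls per page and the no-retry rule.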

## Notes for implementer

The Modal-quad work built `HEADER_EXTRACTION_PROMPT` and `FOOTER_EXTRACTION_PROMPT` and a parallel set of wire schemas (`HEADER_WIRE_SCHEMA`, `FOOTER_WIRE_SCHEMA`). For Gemini, you can either reuse the same prompts and rely on Gemini's `response_schema` (the natural Gemini path) or define a tiny header/footer-only Pydantic model just for these recalls. The latter mirrors the `GeminiPageResult`/`PageResult` discipline.

## Related

- Sprint 3 parent: #39
