fix(verifier): share inter-block boundary line between top and bottom partitions #60
Merged
…tom partitions

The reattribution pass in `partition_row_lines_by_quadrant` was popping the trailing top-band line and moving it into the bottom band. That kept the bottom-quadrant cropper aligned (the original PR purpose) but stripped the top-quadrant cropper of row N-1's bottom endpoint. When the top band had exactly `sum(spans)` lines remaining after the pop, clean-pairing in `_assign_row_bboxes` (which needs `len(lines) >= sum(spans) + 1`) failed by one and fell back to median-gap stepping anchored at `lines[0]`. On pages with a tall first row (the Hour/Jock cell signature), every subsequent row drifted below the printed grid by ~26 px and bboxes ended up bisecting handwriting. This was Alex's page-34 report and the dominant cluster in the bbox-pathology sweep.

The boundary line is a single printed grid line that bounds row N-1 of the top block AND the hour-jock cell of the bottom block; semantically it belongs to both partitions. The fix inserts it into the bottom band without removing it from the top.

Measured on the 7 strict-detector candidate pages (the only ones flagged across all 84 deployed bundles): edge-aligned bbox edges 53.5% -> 64.9%; strict pathology count 42 -> 8. Page 04 (covered by the prior reattribution test) and page 25 (the original page-04-era test fixture) remain visually correct.
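The difference between popping and sharing the boundary line can be sketched as follows. This is a minimal illustration, not the project's code: `reattribute_pop`, `reattribute_share`, `clean_pairing_possible`, and all coordinates are hypothetical names and values; only the `len(lines) >= sum(spans) + 1` precondition is taken from the description above.

```python
from typing import List, Tuple


def clean_pairing_possible(lines: List[float], spans: List[int]) -> bool:
    """Precondition quoted above: one more grid line than the total row span."""
    return len(lines) >= sum(spans) + 1


def reattribute_pop(top: List[float], bottom: List[float]) -> Tuple[List[float], List[float]]:
    """Old behaviour (sketch): move the trailing top-band line into the bottom band."""
    return top[:-1], [top[-1]] + bottom


def reattribute_share(top: List[float], bottom: List[float]) -> Tuple[List[float], List[float]]:
    """New behaviour (sketch): copy the boundary line into the bottom band, keep it in the top."""
    return list(top), [top[-1]] + bottom


# Hypothetical y-coordinates in px: four top-band rows need five grid lines.
top_lines = [0.0, 101.0, 176.0, 251.0, 326.0]
bottom_lines = [427.0, 502.0, 577.0]
top_spans = [1, 1, 1, 1]

top_old, bottom_old = reattribute_pop(top_lines, bottom_lines)
top_new, bottom_new = reattribute_share(top_lines, bottom_lines)

print(clean_pairing_possible(top_old, top_spans))  # one line short after the pop
print(clean_pairing_possible(top_new, top_spans))  # boundary line still present
```

In both variants the bottom band receives the boundary line, so the bottom-quadrant cropper sees the same input; only the top band differs.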
The bottom-band-shared-boundary fix on `page_layout.partition_row_lines_by_quadrant` only changes geometry for pages where the trailing top-band gap exceeds 1.3 x median spacing AND the top band had exactly `sum(spans)` lines remaining after the prior pop. Across the 84 deployed bundles that's 7 pages. All 7 now show correct top-quadrant bbox heights matching the printed grid (row 0 ~ 101 px Hour/Jock cell, subsequent rows ~ 75 px) instead of the prior fixed 75-px median-step drift. Strict-detector sweep across all 84 bundles: pathologies 42 -> 8.

The 19 pages Alex has already verified are NOT in the changed set (her `verified.json` files override bundle geometry once she's saved; the bundles she's currently working through and the ones she hasn't reached yet pick up the fix on the next deploy).
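The ~26 px figure follows from the row heights quoted above (101 - 75 = 26). The sketch below compares clean pairing with a fixed median step on hypothetical grid coordinates. `bboxes_clean` and `bboxes_median_step` are illustrative stand-ins, not the actual `_assign_row_bboxes` logic, and the sketch reproduces only the size of the misalignment; its direction depends on details of the real fallback that are not shown here.

```python
from statistics import median
from typing import List, Tuple

BBox = Tuple[float, float]  # (top_y, bottom_y) in px


def bboxes_clean(lines: List[float], n_rows: int) -> List[BBox]:
    """Clean pairing: row i is bounded by consecutive grid lines i and i + 1."""
    return [(lines[i], lines[i + 1]) for i in range(n_rows)]


def bboxes_median_step(lines: List[float], n_rows: int) -> List[BBox]:
    """Fallback sketch: step a fixed median gap from lines[0], ignoring the tall first row."""
    gaps = [b - a for a, b in zip(lines, lines[1:])]
    step = median(gaps)
    return [(lines[0] + i * step, lines[0] + (i + 1) * step) for i in range(n_rows)]


# Hypothetical grid: a 101 px Hour/Jock row followed by 75 px rows.
grid = [0.0, 101.0, 176.0, 251.0, 326.0]
popped = grid[:-1]  # the top band after the boundary line was popped

clean = bboxes_clean(grid, 4)
fallback = bboxes_median_step(popped, 4)

# Absolute offset of each later row's top edge from the printed grid line.
offsets = [abs(c[0] - f[0]) for c, f in zip(clean[1:], fallback[1:])]
print(offsets)
```

Every row after the first is misplaced by the same amount because the fallback treats the tall first row as if it had the median height.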
Closes #59.
Summary
`1990-04apr0106-page34`: row crops were sliding ~26 px below the printed grid and bisecting handwriting. `partition_row_lines_by_quadrant`'s reattribution pass was popping the trailing top-band line and moving it into the bottom band. The line bounds BOTH bands; removing it from the top stripped row N-1 of its bottom endpoint, causing `_assign_row_bboxes` to fall back to median-gap stepping anchored at `lines[0]`, which drifts on pages with a tall first row (the Hour/Jock cell).
Numbers
Strict-detector sweep across the 84 deployed bundles: strict pathology count 42 -> 8.
Only 7 of the 84 bundles regenerated to new bytes. The remaining 77 are bit-identical, confirming the fix is surgical.
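One way to check a "bit-identical" claim like this is to compare content digests of the bundle files before and after regeneration. The sketch below assumes the bundles are plain files under two directory trees; `digest_tree`, `changed_files`, and the demo file names are hypothetical and not taken from the repository.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Dict, List


def digest_tree(root: Path) -> Dict[str, str]:
    """Map each file path (relative to root) to the SHA-256 hex digest of its bytes."""
    return {
        str(p.relative_to(root)): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(before: Dict[str, str], after: Dict[str, str]) -> List[str]:
    """Paths whose digest differs, or that exist on only one side."""
    return sorted(k for k in before.keys() | after.keys() if before.get(k) != after.get(k))


# Demo with throwaway files standing in for bundles (names are made up).
with tempfile.TemporaryDirectory() as tmp:
    old, new = Path(tmp, "old"), Path(tmp, "new")
    old.mkdir()
    new.mkdir()
    (old / "page_04.json").write_bytes(b"unchanged geometry")
    (old / "page_34.json").write_bytes(b"old geometry")
    (new / "page_04.json").write_bytes(b"unchanged geometry")
    (new / "page_34.json").write_bytes(b"new geometry")
    changed = changed_files(digest_tree(old), digest_tree(new))

print(changed)
```

Run against the real before and after bundle directories, a check of this kind should list exactly the 7 regenerated bundles if the other 77 are byte-for-byte unchanged.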
Test plan
`test_partition_row_lines_shares_boundary_line_with_bottom_block`); confirmed red before the fix, green after.
`.seed/verifier/` updated with the 7 affected bundles so Railway picks up the fix on next deploy.
Risk + scope
`1990-04apr0106` pages 01-19) are unaffected. None are in the 7-bundle changed set, and their `verified.json` overlays would override bundle geometry anyway.