Roll-up: SAM fine-tuning, canvas selection, editable annotations, undo/redo, and more (#73, #75, #40, #24, #80)#84
Merged
bnsreenu merged 19 commits intoJun 26, 2026
Conversation
Ultralytics ships no SAM trainer (SAM.train() -> NotImplementedError), so
this adds a custom decoder (optionally encoder) fine-tuning loop that reuses
Ultralytics' own SAM2Predictor forward methods (get_im_features /
prompt_inference) under autograd, with focal+dice loss + AdamW. Fine-tuned
checkpoints save as {"model": state_dict} and reload through the unchanged
SAM(path) inference path, appearing in the SAM model selector.
- training/sam_trainer.py: SAMFineTuner engine + geometry/loss/naming helpers
- training/sam_dataset.py + io/export_formats.export_sam_dataset: project and
prepared-folder dataset producers
- dialogs/sam_trainer_dialog.py: config dialog (reuses TrainingInfoDialog)
- controllers/sam_train_controller.py: menu, GPU gate, SAMTrainingThread,
selector registration; wired in annotator_window
- inference/sam_utils.py: custom-model registry so fine-tuned models load by path
- tests/integration/test_sam_finetuning.py: geometry/loss/dataset/API-drift
guards + opt-in end-to-end (SAM_TRAIN_E2E=1)
- docs: ADR-021, building-block, runtime, glossary; CLAUDE.md structure
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- Encoder path: backprop once per image (sum instance losses) so the shared encoder feature graph isn't freed mid-loop; batch_size now accumulates over images. Fixes "backward through the graph a second time" crash on any image with >1 instance when fine-tuning the image encoder. - Concurrency: lock SAM inference UI (tools, model selector, fine-tune menu) for the duration of a run; restore in training_finished on success AND error. The trainer uses its OWN SAM instance — corrected the false "drives the resident model" narrative in code docstrings, ADR-021, and runtime view; the real hazard is two models on one CUDA context. - Threading: convert in-memory slice QImages to numpy on the GUI thread before handing to the worker (matches _qimage_to_numpy cross-thread contract). - Harden export_sam_dataset path components with os.path.basename. - Add opt-in encoder-path multi-instance regression test. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- _launch wraps setup-through-start() in try/except so a failure before the worker thread starts re-enables the SAM inference UI (otherwise tools stay disabled until restart — training_finished was the only other unlock site). - export_sam_dataset slice lookup uses `is not None` (consistency with sam_dataset.py). - Add controller tests: UI lock toggle + lock-recovery on setup failure. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The Unicode → in the progress print raises UnicodeEncodeError on a Windows cp1252 console. Use -> so a console-launched run can't crash on the export log. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Found during real GUI testing on non-square images. - Mask shift (correctness): training loss mapped decoder logits to the image with a naive F.interpolate, but SAM2 letterboxes (pad bottom/right) and inference crops that padding via ops.scale_masks(padding=False). The decoder learned padding-shifted targets -> masks shifted down on non-square images. Fix: run logits through the SAME ops.scale_masks(padding=False) inference uses. Validated on a landscape image: fine-tuned mask centroid dY=-0.1px, IoU=0.994 (was visibly shifted). Square test images had hidden the bug. - Progress dialog: clear the reused log on launch; disable Stop on completion so a post-finish click can't strand it on "Stopping...". SAM instance only; shared TrainingInfoDialog class untouched (YOLO unaffected). - Device label: show GPU name (cuda:0 read as "not detected"). - Light mode: drop inline color:palette(text) on the config-dialog note and the pre-existing DINO sidebar caption (No Hardcoded Colors Rule). - Tests: ops.scale_masks padding=False crop-geometry guard + opt-in landscape shift regression. ADR-021 coordinate-frame consequence documented. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The loaded canvas + updated window title already make a successful open obvious; the modal was just an extra click. Error dialogs unchanged. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The ClassThresholdTable header styled itself inline with background-color: palette(mid); color: palette(text). Those palette() references resolve against the OS palette (dark on Windows), so in the app's light mode the header painted dark-on-light, out of step with the rest of the UI. Light mode (default_stylesheet) also had no QHeaderView::section rule at all, while dark mode did. Strip the inline palette() colours (keep only structural font-weight / padding) and add a light QHeaderView::section rule to default_stylesheet, symmetric with the dark sheet. Header colour now follows the active theme. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…bnsreenu#75) Give the canvas the same selection power the annotation list already had, and make selection legible. Selection (idle mode — no drawing/SAM tool active): - single click selects the smallest mask under the cursor (segmentation or bbox); click on empty space clears - Shift+click toggles a mask; drag draws a rubber band that box-selects; Shift+drag adds. Esc cancels an in-progress band. Ctrl stays pan. - Delete removes the selected mask(s). Canvas selection is unified with the annotation list via AnnotationController.apply_canvas_selection, which mirrors onto the QListWidget (signals blocked) and matches by dict value-equality; Delete/Merge/Change Class are reused unchanged. Canvas Delete gates on the list selection (CanvasContext) so a stale highlight can't pop a spurious warning. Selection rendering — no longer recolours the mask red (invisible on a red-class mask). The mask keeps its class colour; a class-colour- independent overlay is drawn last: a dashed selection-blue bounding box plus bright handle squares at corners/edge-midpoints (open-garden-planner CAD style; handles are visual-only, resize is upstream bnsreenu#40). Class colours: default palette no longer starts on red — core/constants default_class_color (curated tab10-style, red last, muted) replaces the Qt.GlobalColor(n % 16 + 7) formula at the 4 assignment sites. Default fill opacity 0.3 -> 0.2 so masks don't bury the image. Saved projects keep their persisted colours. Tests: unit (hit-testing, gesture resolution, Esc-cancel, render-no- recolor, palette) + integration (controller selection mirror, canvas- delete gating). arc42: ADR-022 (+amendment), runtime, crosscutting, glossary, CLAUDE.md. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…nu#32, bnsreenu#36) Bundles three related annotation-geometry fixes: now makes its 8 selection handles (from bnsreenu#75) draggable: drag a handle to resize (opposite side anchored, stays rectangular, >=1px), drag the interior to move. No new mode, modeled on open-garden-planner's ResizeHandle (anchor- from-handle + per-handle cursors). Move is drag-gated past the click threshold so a plain click still selects (nested-mask click-through preserved). On release the box is clamped and persisted via bboxEditCommitted -> commit_bbox_edit; Esc reverts. coordinates into the image rectangle on commit (clamp_segmentation/clamp_bbox, per-coordinate so vertex count is preserved). Drawn shapes were already clipped. polygon to the image via a shapely intersection (clip_polygon_to_bounds, buffer(0) repair, largest part); polygons left fully outside are dropped. Geometry helpers live in utils.py. Docs: ADR-023 (bbox editing) + ADR-024 (clamp vs clip), runtime view, crosscutting, glossary, CLAUDE.md patterns + shortcuts; backlog rows removed. Tests: 30 new (geometry, bbox-edit lifecycle, controller persistence, augmenter clip). 234 passed, ruff clean. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…p collapse P1 (data loss): a bbox selected via the annotation LIST stored a value-equal copy (item.data(UserRole) round-trips as a copy) in highlighted_annotations, so the in-place handle-drag mutated a throwaway object and the edit was silently lost. _single_selected_bbox now resolves to the live object inside image_label.annotations by value-equality; _commit_bbox_drag re-points highlighted_annotations at that live object so re-selection survives the rebuild. P1 (geometry): clamp_bbox collapsed a moved box that crossed the top/left edge to a 1px sliver (independent-corner clamping is right for resize, wrong for move). Added fit_bbox_inside (size-preserving translate-inside) and use it for the move path; clamp_bbox stays the resize/trim clamp. P2: clip_polygon_to_bounds now drops shapely's duplicate closing vertex to match the app's unclosed flat-ring convention. Tests: +7 (fit_bbox_inside, top/left-off move preserves size, live-object resolution, list-selection-after-switch persistence). 241 passed, ruff clean. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- Split _single_selected_bbox (cheap, geometry-only, safe per-hover) from _live_annotation (the value-equality scan that resolves to the saved object); the press handler now resolves only on drag-arm, so the O(n) scan no longer runs on every hover frame. - Comment that the post-commit highlighted_annotations reassignment is a no-op for canvas-click selection and only matters for a stale list-selection copy. - Note the exact-tuple == in clip_polygon_to_bounds' closing-vertex drop. Re-review verdict: merge-ready, no P0/P1. 241 passed, ruff clean. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ped annotations Manual GUI testing found the handles inert on every shape: the feature only armed for annotations with a "bbox" key, but in-app shapes (drawn rectangles, polygons, SAM/DINO masks) are all stored as "segmentation" — only imported COCO/YOLO boxes get a bbox key. So _single_selected_bbox always returned None and the drawn handles did nothing. Generalize to any single selected shape (user-confirmed: resize scales, drag- inside moves): - _single_selected_shape: any single annotation with a bounding box (seg or bbox). - _begin_shape_edit records kind: "seg" scales/translates vertices (_scale_segmentation / _translate_segmentation); "bbox" edits [x,y,w,h]. - _sync_bbox_key keeps an imported annotation's bbox consistent with its edited segmentation (feeds export/training). - Commit clamp by kind+mode: polygon move slides inside (shape-preserving) then clamp_segmentation as a safety net; polygon resize clamp_segmentation; box move fit_bbox_inside; box resize clamp_bbox. Docs (ADR-023, ADR-022 amendment, runtime, crosscutting, glossary, CLAUDE.md) updated to "shape editing". Tests: +6 (polygon scale/move/clamp/bbox-key-sync, polygon resize persistence). 247 passed, ruff clean. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Senior-review follow-ups: - _annotation_bbox / _annotation_contains crashed (None[0::2]) on imported bbox-only annotations, which carry "segmentation": None — exactly bnsreenu#40's original target, and the hover/select path now calls _annotation_bbox on every selected annotation. Both now fall through to the bbox key when segmentation is None/empty. Test locks selectable + resizable. - Refresh the stale self.bbox_edit init comment (now lists kind/orig_seg, "shape") and the hover-cursor comment (bbox -> shape). 248 passed, ruff clean. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…nsreenu#24 Closes the remaining piece of #24 (the "mask complexity — less ↔ more points" control; point add/remove was already covered by the SAM-points tool). - utils.simplify_polygon: Douglas-Peucker (cv2.approxPolyDP) thinning to a Detail % vertex budget (100 = raw), binary-searched epsilon. - Annotations panel: QListWidget → QTableWidget (ID | Class | Area | Detail %), mirroring the DINO ClassThresholdTable idiom (per-row spinbox, SelectRows, NoEditTriggers, stylesheet-only header). Col 0's UserRole holds the annotation (the bnsreenu#75 value-equality marker). Selection mirror uses setRangeSelected (additive) — selectRow() replaces in ExtendedSelection. count/item/selectedItems → rowCount/item(r,0)/row-deduped selectedIndexes across the bridge + CRUD. - Per-row Detail % spinbox → on_detail_pct_changed: resolves the live drawn object (_live_annotation), lazy-captures segmentation_raw, simplifies (or restores raw at 100), recomputes bbox, refreshes Area+UserRole in place, saves. valueChanged connected after the initial setValue so table builds don't fire it. - New keys segmentation_raw + detail_pct round-trip via .iap (ann.copy → JSON); exports emit the effective (simplified) segmentation. Docs: ADR-025, runtime, crosscutting, glossary; CLAUDE.md pattern row + backlog reconciled (drop closed bnsreenu#63 + done #24, add bnsreenu#74). Tests: +11 (simplify unit + table/persistence/export integration); existing bnsreenu#75/bnsreenu#40 selection tests adapted to the table. 259 passed, ruff clean on changed files. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
P1 (data loss): reshaping a thinned polygon with the bnsreenu#40 handles orphaned its segmentation_raw, so a later Detail %=100 reverted the reshape. _clamp_edited_shape now invalidates the baseline on a seg edit (drop segmentation_raw, reset detail_pct=100) — the edited geometry becomes the new raw. P1 (stale selection): on_detail_pct_changed mutated the live object but left highlighted_annotations holding the pre-simplify value, so the selection overlay drew stale geometry and a subsequent handle drag (_live_annotation) couldn't re-match. Now re-points highlighted_annotations at the mutated object. P2: corrected the binary-search comment (keeps the richest polygon within budget) and ADR-025's reversibility note (reshape resets the raw baseline). Tests: +2 seam tests (thin→reshape→baseline-invalidated; detail-change keeps selection resolvable for a handle drag). 261 passed, ruff clean. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Senior re-review round 2 (production fixes confirmed correct): - test_detail_change_keeps_selection_resolvable_for_handle_drag selected by identity (apply_canvas_selection), so it passed pre-fix too. Now selects via the table (update_highlighted_annotations stores a UserRole *copy*), so it actually exercises the stale-highlight bug and fails on pre-fix code. - Delete the orphaned delete_annotation (controller method + window delegate, no callers): its removeRow-without-rebuild was incompatible with the index- bound per-row spinbox lambdas — a tripwire for the next single-row-delete. - Glossary: Annotation List is a QTableWidget now, not a QListWidget. 261 passed, ruff clean on changed files. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
… fixes Bundles seven related changes: Dataset splitter (dialogs/dataset_splitter.py) - bnsreenu#81: compute_split_counts() + bounded slices so a 0% subset yields exactly 0 images (was leaking the flooring remainder into test). - bnsreenu#80: auto-balance the train/val/test spin boxes so they always sum to 100 (val absorbs train; test→train absorb val; etc.). Frictionless destructive ops (annotation_controller.py) - Delete: removed the confirmation AND success dialogs — instant. - Merge: removed the keep/delete prompt (always replaces originals with the union) and the success dialog. Undo/redo for annotation edits (ADR-026) - New controllers/annotation_history.py: snapshot-based, per-image-key undo/redo stacks with deep-equal dedup and a depth cap. - record_history() choke-point before every synchronous mutation (finish poly/rect, delete, merge, change-class, eraser, SAM/DINO accept); deferred bbox/paint/temp gestures capture a baseline at start (editBaselineRequested) and push it at commit. - Detail-% drags coalesce to one undo entry. - Ctrl+Z / Ctrl+Y / Ctrl+Shift+Z as ApplicationShortcuts; undo/redo no-op during load, modal, text focus, or an in-flight gesture. Tool mode (annotator_window.py, sam_controller.py, image_label.py) - F: single activate_tool() choke-point makes all six tools (incl. SAM) mutually exclusive — a SAM tool can no longer be active alongside a manual tool. - G: Esc cancels the in-progress shape AND returns to selection mode (selectModeRequested -> activate_tool(None)). Docs: ADR-026, crosscutting concepts (tool activation, Esc), runtime view, CLAUDE.md shortcuts + patterns table. Tests: +114 (splitter math/balance, AnnotationHistory unit, history integration incl. mid-paint-undo + number-parity guards, tool mode). 378 passed, 3 skipped. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Vertex edits (double-click polygon edit) were the one annotation edit Ctrl+Z couldn't reverse. Their Enter-commit only refreshed the list and relied on a later save to persist, and Esc silently kept the in-place vertex drags instead of reverting them. - start_polygon_edit captures the segmentation + emits editBaselineRequested (undo baseline) on entry. - New polygonEditCommitted signal -> commit_polygon_edit syncs all_annotations (save_current_annotations) + pushes the baseline + refreshes the list. Gated on an actual geometry change. - Esc now reverts the drags from the captured original; _editing_polygon_orig is cleared on commit/cancel/delete/exit/clear. Tests: vertex-edit commit->undo and Esc-reverts-without-history. 380 passed, 3 skipped. ADR-026 + CLAUDE.md updated. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
switch_slice didn't call exit_editing_mode the way switch_image does, so editing a polygon then switching slice left editing_polygon and the new _editing_polygon_orig pointing at the previous slice's object (stale overlay). One-line parity fix; closes the leak for multi-dim stacks. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This PR rolls up the five feature sets that were submitted as separate PRs and subsequently closed. It is rebased cleanly on the current upstream
masterand passes the full smoke test suite (31 tests).Architecture / docs included:
Quality:
annotation_history.pyandtraining/are new packagesCloses #73, closes #75, closes #40, closes #32, closes #36, closes #24, closes #80, closes #81.