Closed
37 commits
f0286fc
docs: Add temporary self-deleting upstream-issue backlog to CLAUDE.md
cofade Jun 12, 2026
a4bc753
feat: Add temp-roadmap skill for self-removing CLAUDE.md backlogs
cofade Jun 12, 2026
6bc51db
feat: Filter image list by annotation status (upstream #27)
cofade Jun 12, 2026
a68dfb4
feat: Fall back to CPU on unsupported CUDA compute capability (upstre…
cofade Jun 12, 2026
65ea399
docs: arc42 updates for #27/#57 + execute backlog deletion hook
cofade Jun 12, 2026
36da1ae
fix: Address senior review findings on filter wiring and device probe
cofade Jun 12, 2026
8ef082c
fix: Hide non-matching current row from filtered image list (#27)
cofade Jun 19, 2026
58af8bf
Merge pull request #19 from cofade/feature/issue-27-57-image-filter-a…
cofade Jun 19, 2026
e833a56
fix: Save trained YOLO model with .save() not .export() (#30)
cofade Jun 19, 2026
24e96f9
fix: Open all text files as UTF-8 to prevent Windows charmap crash (#44)
cofade Jun 19, 2026
384467a
fix: Edit smallest containing polygon for nested annotations (#33)
cofade Jun 19, 2026
3853bcd
feat: Sort image list alphabetically (#60)
cofade Jun 19, 2026
bf8a806
fix: Handle LZW/compressed TIFF without imagecodecs gracefully (#56)
cofade Jun 19, 2026
feb7e20
docs+test: arc42 for #60/#56, #44 encoding guard test, execute backlo…
cofade Jun 19, 2026
a02bde0
chore: Clear pre-existing lint (F401/F811/F541/F841) in touched files
cofade Jun 19, 2026
bb34c3b
fix: Address senior review on quick-wins (P1 codec heuristic + P2s)
cofade Jun 19, 2026
e28ba00
test: Pin that bare 'compression' no longer matches codec heuristic (…
cofade Jun 19, 2026
9937d8a
Merge pull request #20 from cofade/feature/quick-wins-30-44-33-60-56
cofade Jun 19, 2026
bb865f6
feat: Fine-tune SAM 2 / 2.1 on user annotations (bnsreenu#73)
cofade Jun 19, 2026
7d7f8c6
fix: Address senior review on SAM fine-tuning
cofade Jun 20, 2026
c870cac
fix: Restore SAM UI if training setup fails; review nits
cofade Jun 20, 2026
6892f3d
fix: ASCII arrow in export_sam_dataset print (cp1252 console safety)
cofade Jun 20, 2026
570de46
fix: Correct SAM fine-tuning mask shift + UI papercuts (PR #21)
cofade Jun 20, 2026
6aee148
chore: Drop redundant "Project Opened" success dialog
cofade Jun 20, 2026
7952e2c
Merge pull request #21 from cofade/feature/issue-73-sam-finetuning
cofade Jun 20, 2026
b448f36
fix: Light-mode contrast for DINO threshold-table headers
cofade Jun 20, 2026
077e604
feat: Canvas mask selection, multi-delete & visible selection styling…
cofade Jun 21, 2026
a6face1
Merge pull request #22 from cofade/feature/issue-75-canvas-selection
cofade Jun 21, 2026
57179b5
feat: Editable bboxes + out-of-bounds clamp/clip (bnsreenu#40, #32, #36)
cofade Jun 21, 2026
5f0f780
fix: Address senior-review P1s — list-selection edit loss + move clam…
cofade Jun 21, 2026
28f80fb
refactor: Address senior-review P2s (perf + clarifying comments)
cofade Jun 21, 2026
4ca24a0
fix: #40 resize/move now works on any shape, not just bbox-typed anno…
cofade Jun 21, 2026
422eb2b
fix: Harden bbox-only hit-test + refresh stale shape-edit comments
cofade Jun 21, 2026
533e7c3
Merge pull request #23 from cofade/feature/issue-40-bbox-edit-bounds
cofade Jun 21, 2026
89b9a57
feat: Reversible per-annotation mask complexity (Detail %) — closes b…
cofade Jun 22, 2026
58b8cf3
fix: Address senior-review P1s at the #24×#40 seam
cofade Jun 22, 2026
4a66d49
test+cleanup: Make P1b test a real guard; drop dead delete_annotation
cofade Jun 22, 2026
66 changes: 66 additions & 0 deletions .claude/skills/temp-roadmap/SKILL.md
@@ -0,0 +1,66 @@
---
name: temp-roadmap
description: Create a temporary, self-removing roadmap/backlog section in CLAUDE.md from a set of identified work items (issue triage, audit findings, review feedback). Use when the user asks to "track these items", "write this list into CLAUDE.md as a plan", "create a backlog/roadmap in CLAUDE.md", or wants a work list that cleans itself up as PRs land.
---

# Temporary Self-Removing Roadmap in CLAUDE.md

Turn a set of identified work items into a CLAUDE.md section that shrinks with
every PR and disappears entirely when the work is done — so CLAUDE.md returns
to its clean state without manual housekeeping.

## When to use

- After issue triage, a code audit, or a review produced a concrete list of
work items that will be resolved across several future PRs.
- NOT for single-PR task tracking (use the todo list) and NOT for permanent
guidance (write a normal CLAUDE.md section or arc42 doc instead).

## Structure to write into CLAUDE.md

Insert a section near the top of CLAUDE.md (after the docs index, before
project structure):

```markdown
## <Topic> Backlog (TEMPORARY section — self-deleting)

**Deletion hook:** When a PR resolves one of the items below, DELETE its row
from this table **in the same PR**. When the last row is gone, delete this
entire section so CLAUDE.md returns to its clean state. Never let a finished
item linger here.

Items validated <YYYY-MM-DD>; source: <issue tracker / audit / review link>.

| Item | Size | Task |
|------|------|------|
| #123 | quick win | One-line actionable description with file hints (`path/file.py`, function name, reporter-verified fix if any) |
| #124 | medium | ... |
| #125 | blocked | ... — name what it is blocked on so it can be re-checked |
```

## Rules

1. **One line per item.** Concise but self-sufficient: a future session must be
able to start the item from the row alone — include file paths, function
names, and known pitfalls ("do NOT use setSortingEnabled — currentRowChanged
fires switch_image").
2. **Classify every item**: `quick win` / `medium` / `large` / `blocked`.
For `blocked`, state the blocker.
3. **Reference the source** (issue number, ticket, review comment) so context
can be recovered.
4. **Date the validation** — rows describe code state at a point in time;
re-verify before implementing if the date is old.
5. **The deletion hook is mandatory** and must appear verbatim-in-spirit at the
top of the section. It is the whole point: the section is a consumable, not
documentation.
6. Mark items currently being worked on with *(in progress on this branch)* so
parallel sessions don't double-pick them.

## Per-PR workflow (executing the hook)

1. Pick item(s), implement on a feature branch.
2. In the same PR: delete the resolved row(s) from the table.
3. If the table is now empty: delete the entire section, including the heading
and the deletion-hook paragraph.
4. The PR diff thus always shows both the fix and the backlog shrinking —
reviewers can see progress without a separate tracker.
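
The per-PR workflow above is mechanical enough to sketch in code. The following is a minimal illustration and not part of this PR: `resolve_item` is a hypothetical helper. It assumes the section heading contains the literal text `TEMPORARY section`, that the section ends at the next `## ` heading, and that the first table cell holds the item id.

```python
def resolve_item(markdown: str, item: str) -> str:
    """Drop the row for `item`; drop the whole section once no rows remain."""
    lines = markdown.split("\n")

    # Locate the temporary section by its heading (assumed marker text).
    start = next(
        (i for i, line in enumerate(lines)
         if line.startswith("## ") and "TEMPORARY section" in line),
        None,
    )
    if start is None:
        return markdown  # no backlog section present, nothing to do

    # The section runs until the next level-2 heading or the end of the file.
    end = next(
        (i for i in range(start + 1, len(lines)) if lines[i].startswith("## ")),
        len(lines),
    )
    section = lines[start:end]

    def first_cell(row: str) -> str:
        return row.strip().strip("|").split("|")[0].strip()

    table_rows = [i for i, line in enumerate(section) if line.startswith("|")]
    data_rows = table_rows[2:]  # skip the header row and the |---| separator
    resolved = {i for i in data_rows if first_cell(section[i]) == item}
    remaining = [i for i in data_rows if i not in resolved]

    if remaining:
        kept = [line for i, line in enumerate(section) if i not in resolved]
    else:
        kept = []  # last row gone: remove heading, hook paragraph and table

    return "\n".join(lines[:start] + kept + lines[end:])
```

Calling it once per resolved item reproduces steps 2 and 3 of the workflow: the table shrinks row by row, and the heading and deletion-hook paragraph disappear together with the last row.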
6 changes: 4 additions & 2 deletions .gitignore
@@ -9,10 +9,12 @@ models/
*.iap

# Local tooling — .claude is mostly local state, but the senior-reviewer
-# agent definition under .claude/agents/ is tracked (referenced by
-# CLAUDE.md as a mandatory quality gate), so un-ignore that subtree.
+# agent definition under .claude/agents/ and project skills under
+# .claude/skills/ are tracked (referenced by CLAUDE.md / used as shared
+# workflow tooling), so un-ignore those subtrees.
.claude/*
!.claude/agents/
!.claude/skills/
.venv

# Python cache and build artifacts
36 changes: 33 additions & 3 deletions CLAUDE.md
@@ -36,6 +36,21 @@ For detailed architecture and design information, see **[docs/](docs/)**:

See [docs/README.md](docs/README.md) for full documentation index.

## Upstream Issue Backlog (TEMPORARY section — self-deleting)

**Deletion hook:** When a PR resolves one of the items below, DELETE its row
from this table **in the same PR**. When the last row is gone, delete this
entire section so CLAUDE.md returns to its clean state. Never let a finished
item linger here.

Issue numbers refer to https://github.com/bnsreenu/digitalsreeni-image-annotator/issues
(validated 2026-06-22; already-fixed issues have close-request comments posted, not listed here).

| Issue | Size | Task |
|-------|------|------|
| #74 | large | MLflow experiment tracking for model training (SAM fine-tuning / YOLO) |
| #35 | large | Keypoint annotation tool |

## Project Structure

```
@@ -47,14 +62,17 @@ src/digitalsreeni_image_annotator/
├── __init__.py # Public API re-exports
├── core/ # constants, annotation_utils, image_utils
-├── controllers/ # 7 controllers (project, image, sam, dino,
-│ # yolo, annotation, class) + io_controller
+├── controllers/ # 8 controllers (project, image, sam,
+│ # sam_train, dino, yolo, annotation,
+│ # class) + io_controller
├── widgets/
│ ├── image_label.py # ImageLabel canvas widget (dispatcher)
│ ├── canvas_context.py # CanvasContext read accessor (ADR-018)
│ └── tools/ # Per-tool handlers (ADR-019): rectangle,
│ # polygon, paint, eraser
├── inference/ # sam_utils.py, dino_utils.py
├── training/ # SAM fine-tuning (ADR-021): sam_trainer.py
│ # (SAMFineTuner), sam_dataset.py
├── io/ # export_formats.py, import_formats.py
├── ui/ # menu_bar, sidebar, shortcuts, theme, stylesheets
└── dialogs/ # Standalone tool dialogs (statistics,
@@ -76,8 +94,10 @@ src/digitalsreeni_image_annotator/
| `SAMController` | controllers/sam_controller.py | SAM model picker, debounce, in-flight guard (ADR-013) |
| `DINOController` | controllers/dino_controller.py | DINO single/batch detection, batch review, temp-class workflow |
| `YOLOController` | controllers/yolo_controller.py | YOLO training menu + prediction wiring |
-| `SAMUtils` | inference/sam_utils.py | Load SAM models, run inference |
+| `SAMUtils` | inference/sam_utils.py | Load SAM models (built-in + fine-tuned), run inference |
| `DINOUtils` | inference/dino_utils.py | Grounding-DINO model load + inference |
| `SAMFineTuner` | training/sam_trainer.py | Fine-tune SAM 2 decoder/encoder via custom loop over Ultralytics SAM2Model (ADR-021) |
| `SAMTrainController` | controllers/sam_train_controller.py | SAM fine-tune menu, GPU gate, training thread, selector registration |

See [Building Block View](docs/05_building_block_view.md) for detailed class documentation.

@@ -169,6 +189,11 @@ See [Runtime View](docs/06_runtime_view.md#multi-dimensional-image-loading) for
| GPU model unload | `model.cpu()` → `gc.collect()` → `torch.cuda.empty_cache()` + `ipc_collect()` + `synchronize()` — full reclaim requires app restart due to per-process CUDA context | Setting refs to None alone leaves circular refs pinned and shows zero Task Manager drop. See [Releasing Model GPU Memory](docs/08_crosscutting_concepts.md#releasing-model-gpu-memory). |
| Export image-path lookup | Exact-key match first, substring fallback only | `"bee.jpg" in "honeybee.jpg"` is True — substring-only matching writes the wrong file. See [Export Format Filename Matching](docs/08_crosscutting_concepts.md#export-format-filename-matching). |
| F2 / global shortcuts | Use `QShortcut` with `Qt.ShortcutContext.ApplicationShortcut`, not `keyPressEvent` | `QTableWidget` consumes F2 for in-cell edit before it bubbles up. |
| Canvas ↔ list selection sync | Canvas selection (idle-mode click/Shift/rubber-band) drives the annotation list via `apply_canvas_selection`; mirror the list with `blockSignals(True/False)` and match annotations by **value-equality**, never identity | PyQt round-trips `UserRole` dicts as copies and `image_label.annotations` is a deepcopy, so identity is never stable; un-blocked `setSelected` recurses through `update_highlighted_annotations`. Multi-select uses **Shift** (Ctrl stays pan). See [ADR-022](docs/09_architecture_decisions.md#adr-022-canvas-mask-selection-unified-with-the-annotation-list). |
| Selection rendering | Don't recolour a selected mask. Keep its class colour; draw a class-colour-independent overlay (dashed `_SELECTION_COLOR` blue bounding box + bright handle squares at corners/edge-midpoints, OGP-style) in a final pass — `_draw_selection_overlay`. For a single selected shape those handles are draggable (resize/move, any shape). Default class colours come from `core/constants.py::default_class_color` (red last, muted) | Red selection was invisible on a red-class mask; a thin dashed outline alone was too faint; the handles carry the visibility. See ADR-022 amendment. |
| Shape editing (#40) | Direct manipulation on the selection handles for **any** single selected shape (`_single_selected_shape()` — most shapes are `"segmentation"`, not `"bbox"`; gating on a bbox key made it unreachable). `_begin_shape_edit` records `kind`: a `"seg"` polygon **scales** its vertices (`_scale_segmentation`) / translates them; a `"bbox"` edits `[x,y,w,h]`. `_draw_selection_overlay` + `_bbox_handle_at` share `_bbox_handle_points` (visual == grab); resize anchors the opposite side (`_resize_bbox`); interior drag moves, **drag-gated**. `_sync_bbox_key` keeps an imported bbox consistent. Clamp + `bboxEditCommitted` → `commit_bbox_edit` on release; Esc reverts | Handles drawn since #75 were visual-only; dispatch sits before the rubber-band branch but stays gated on `_is_select_mode()`. Names keep the `bbox_edit` prefix = "edit via the bounding-box handles". See [ADR-023](docs/09_architecture_decisions.md). |
| Bounds enforcement (#32/#36) | No commit may persist coords outside the image. **Clamp** manual edits (`clamp_segmentation`/`clamp_bbox`, per-coordinate, count-preserving) at edit commit; **clip** augmented polygons (`clip_polygon_to_bounds`, shapely intersection, may drop → `None`, augmenter must `continue`). Drawn shapes already clip in `finish_polygon`/`finish_rectangle` | Clamp keeps vertex correspondence mid-edit; clip is geometrically correct for batch augmentation. See [ADR-024](docs/09_architecture_decisions.md). |
| Annotations table + Detail % (#24) | The Annotations panel is a **`QTableWidget`** (ID \| Class \| Area \| Detail %), not a list — col 0's UserRole holds the annotation (the #75 value-equality marker). Selection-mirror uses **`setRangeSelected`** (additive); `selectRow()` replaces in `ExtendedSelection`. Per-row Detail % spinbox → `on_detail_pct_changed` resolves the live obj (`_live_annotation`), simplifies from a lazy-captured `segmentation_raw` (`simplify_polygon`, 100=raw), refreshes Area+UserRole in place, saves. Connect `valueChanged` **after** the initial `setValue` so building the table doesn't fire it | Re-homing the selection bridge onto a table is the risk; reversibility needs the raw preserved. See [ADR-025](docs/09_architecture_decisions.md). |
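
The clamp half of the bounds-enforcement row can be illustrated with a short sketch. This is not the project's code: the functions below reuse the names `clamp_bbox` and `clamp_segmentation` from the table, but their signatures are assumptions, as is the flat `[x0, y0, x1, y1, ...]` polygon layout. Only the `[x, y, w, h]` bbox layout is taken from the text above.

```python
from typing import List, Sequence


def _clamp(value: float, low: float, high: float) -> float:
    return min(max(value, low), high)


def clamp_bbox(bbox: Sequence[float], width: float, height: float) -> List[float]:
    """Clamp an [x, y, w, h] box so both corners lie inside the image."""
    x, y, w, h = bbox
    x0 = _clamp(x, 0.0, width)
    y0 = _clamp(y, 0.0, height)
    x1 = _clamp(x + w, 0.0, width)
    y1 = _clamp(y + h, 0.0, height)
    return [x0, y0, x1 - x0, y1 - y0]


def clamp_segmentation(seg: Sequence[float], width: float, height: float) -> List[float]:
    """Clamp a flat polygon per coordinate; the vertex count never changes."""
    return [
        _clamp(v, 0.0, width if i % 2 == 0 else height)  # even index = x, odd = y
        for i, v in enumerate(seg)
    ]
```

Because every coordinate is clamped independently, vertex `i` of the output still corresponds to vertex `i` of the input, which is the property the table gives as the reason to clamp rather than clip during an interactive edit.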

## Development Workflow

@@ -245,6 +270,11 @@ See [Risks and Technical Debt](docs/11_risks_and_technical_debt.md) for full lis
|--------|--------|
| Ctrl+Wheel | Zoom |
| Ctrl+Drag | Pan |
| Click / Shift+Click (no tool) | Select / toggle mask |
| Drag / Shift+Drag (no tool) | Rubber-band select / add |
| Drag handle / inside (one shape selected) | Resize (scale) / move the shape |
| Double-click | Vertex-edit mode |
| Delete | Delete selected mask(s) |
| Enter | Finish/Accept |
| Esc | Cancel |
| Up/Down | Navigate slices |
8 changes: 8 additions & 0 deletions README.md
@@ -99,6 +99,14 @@ python -c "import torch; print(torch.cuda.is_available(), torch.cuda.get_device_

You should see `True` and your GPU name. For other platforms or driver combinations, use the official selector at <https://pytorch.org/get-started/locally/>.

#### Older NVIDIA GPUs (Pascal / Maxwell)

PyTorch ≥ 2.8 wheels no longer include kernels for GPUs older than Volta (compute capability < 7.0), e.g. the GTX 10xx series (sm_61). On such cards the app detects the mismatch, warns once, and automatically runs inference on the CPU instead of crashing with `CUDA error: no kernel image is available`. To keep using the GPU, install an older PyTorch that still supports it:

```bash
pip install torch==2.4.1 torchvision==0.19.1 --index-url https://download.pytorch.org/whl/cu121
```
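
How the app performs this check is not shown in the README. As a rough sketch of the idea, assuming a fixed minimum compute capability of 7.0 (taken from the sentence above), a fallback could look like the following. `choose_device` and `probe_device` are hypothetical names; `torch.cuda.is_available()` and `torch.cuda.get_device_capability()` are the only PyTorch calls used.

```python
from typing import Optional, Tuple

# Assumption: threshold taken from the "compute capability < 7.0" wording above.
MIN_CAPABILITY: Tuple[int, int] = (7, 0)


def choose_device(
    cuda_available: bool,
    capability: Optional[Tuple[int, int]],
    min_capability: Tuple[int, int] = MIN_CAPABILITY,
) -> str:
    """Return "cuda" only when a GPU is present and new enough, else "cpu"."""
    if not cuda_available or capability is None:
        return "cpu"
    if capability < min_capability:  # tuple comparison: (6, 1) < (7, 0)
        return "cpu"
    return "cuda"


def probe_device() -> str:
    """Query PyTorch if it is installed; fall back to CPU otherwise."""
    try:
        import torch
    except ImportError:
        return "cpu"
    if not torch.cuda.is_available():
        return "cpu"
    return choose_device(True, torch.cuda.get_device_capability())
```

A fixed threshold is a simplification. Comparing the device's `sm_XY` string against `torch.cuda.get_arch_list()` would track what the installed wheel was actually built for, at the cost of slightly more parsing.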

## Usage

1. Run the DigitalSreeni Image Annotator application:
25 changes: 22 additions & 3 deletions docs/05_building_block_view.md
@@ -33,7 +33,8 @@ src/digitalsreeni_image_annotator/
├── utils.py # Cross-cutting utilities
├── core/ # Constants, annotation utils, image utils
│ ├── constants.py
-│ └── annotation_utils.py
+│ ├── annotation_utils.py
+│ └── torch_utils.py # Shared torch device resolution + CPU fallback (#57)
├── widgets/
│ ├── image_label.py # ImageLabel - canvas widget; dispatcher
│ ├── canvas_context.py # CanvasContext - narrow read view (ADR-018)
@@ -155,6 +156,23 @@ The earlier subprocess approach is documented as
[ADR-011](09_architecture_decisions.md#adr-011-run-torch-based-workers-in-isolated-subprocesses)
(Superseded).

### SAM Fine-Tuning Subsystem (`training/`)

Lets users fine-tune SAM 2 / 2.1 on their own annotations, since
Ultralytics ships no SAM trainer (ADR-021). Distinct from `inference/`
because it is *training*, not inference.

| Module | Responsibility |
|--------|----------------|
| `training/sam_trainer.py` | `SAMFineTuner` — custom decoder (optionally encoder) fine-tuning loop reusing `SAM2Predictor.get_im_features` / `prompt_inference` under autograd, focal+dice loss, AdamW, checkpoint save+reload-verify. Also geometry helpers (`polygon_to_mask`, `mask_to_xyxy`, `mask_to_point`), `make_custom_filename`, `list_custom_models`, and the `SampleGroup` lazy-rasterising dataset item. |
| `training/sam_dataset.py` | `build_groups_from_project` (live `all_annotations`) and `build_groups_from_folder` (prepared dataset) → `list[SampleGroup]`, mirroring `export_yolo_v5plus` image resolution. |
| `io/export_formats.py::export_sam_dataset` | Writes `images/` + `manifest.json` (authoritative bbox/segmentation specs) for an inspectable, re-trainable on-disk dataset. |

Fine-tuned checkpoints save as `{"model": state_dict}` and reload
through the unchanged `SAM(path)` inference path; `SAMUtils` gains a
`custom_models` registry so they appear in the SAM selector alongside
the eight built-ins.
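
The table above names a focal+dice loss without giving details. As a rough illustration of what that combination computes, here is a dependency-free sketch over flattened per-pixel probabilities. It is not the code in `sam_trainer.py`: the function names, the `alpha`/`gamma` defaults, the smoothing term, and the equal weighting of the two terms are all assumptions.

```python
import math
from typing import Sequence


def dice_loss(probs: Sequence[float], targets: Sequence[int], eps: float = 1.0) -> float:
    """1 - Dice coefficient, with additive smoothing `eps`."""
    intersection = sum(p * t for p, t in zip(probs, targets))
    total = sum(probs) + sum(targets)
    return 1.0 - (2.0 * intersection + eps) / (total + eps)


def focal_loss(
    probs: Sequence[float],
    targets: Sequence[int],
    alpha: float = 0.25,
    gamma: float = 2.0,
) -> float:
    """Mean binary focal loss; easy pixels are down-weighted by (1 - p_t)^gamma."""
    losses = []
    for p, t in zip(probs, targets):
        p = min(max(p, 1e-7), 1.0 - 1e-7)  # keep log() finite
        p_t = p if t == 1 else 1.0 - p     # probability given to the true class
        a_t = alpha if t == 1 else 1.0 - alpha
        losses.append(-a_t * (1.0 - p_t) ** gamma * math.log(p_t))
    return sum(losses) / len(losses)


def focal_dice_loss(
    probs: Sequence[float],
    targets: Sequence[int],
    focal_weight: float = 1.0,  # assumption: the real weighting is not documented here
    dice_weight: float = 1.0,
) -> float:
    return focal_weight * focal_loss(probs, targets) + dice_weight * dice_loss(probs, targets)
```

The two terms are complementary: the focal term works per pixel and emphasises hard pixels, while the dice term scores the overlap of the whole mask and is therefore less sensitive to foreground/background imbalance.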

### DINO Subsystem (Grounding DINO + SAM pipeline)

LLM-assisted detection: the user gives free-form text phrases per class,
@@ -195,7 +213,7 @@ export time — see [Cross-cutting Concepts](08_crosscutting_concepts.md)).

## Level 3: Controllers

-Seven `QObject` controllers plus an `io_controller` helper module
+Eight `QObject` controllers plus an `io_controller` helper module
carve `ImageAnnotator` into single-responsibility owners that the
orchestrator delegates to. Each `QObject` controller holds `self.mw
= main_window` and owns one slice of behaviour; the
@@ -208,12 +226,13 @@ the controller graph.
| Controller | Responsibility |
|------------|----------------|
| `ProjectController` | `.iap` save/load, auto-save, backup/restore, missing-image prompts, window-title sync. Owns the `is_loading_project` autosave guard (load/save round-trip safety, v0.8.12). |
-| `ImageController` | Open / load / switch images and slices. TIFF + CZI loaders, the multi-dim `DimensionDialog`, the `[-ndim:]` axis-slice bug fix from the v0.9.0 era. |
+| `ImageController` | Open / load / switch images and slices. TIFF + CZI loaders (with `imagecodecs` codec-error handling — #56), the multi-dim `DimensionDialog`, the `[-ndim:]` axis-slice bug fix from the v0.9.0 era. Image-list annotation-status filter (`image_has_annotations`, `apply_image_filter` — #27) and alphabetical sort (`sort_image_list` — #60). |
| `AnnotationController` | Annotation CRUD, list sorting, highlight, edit-mode entry/exit, `finish_polygon`, `finish_rectangle`, `replace_annotations` (eraser path). Validates writes before mutating `all_annotations`. |
| `ClassController` | Class add / delete / rename / colour / visibility. `update_slice_list_colors`, `is_class_visible`. |
| `SAMController` | SAM box/points tool lifecycle, debounce timer, `_sam_inference_in_flight` re-entrancy guard (ADR-013), model picker. |
| `DINOController` | Single + batch detection, batch review navigation, temp-annotation accept/reject, custom-model browse, `DINOReviewEventFilter` ownership (ADR-015). |
| `YOLOController` | Training menu, `TrainingThread`, prediction dialog, result processing. |
| `SAMTrainController` | SAM fine-tuning menu, GPU gate, `SAMTrainingThread`, config dialog, registers fine-tuned checkpoints into the SAM selector (ADR-021). |
| `io_controller` *(module-level functions, not a class)* | Thin UI wrappers around the pure `io/export_formats.py` and `io/import_formats.py` modules. |
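
The image-list filter and sort that `ImageController` gains in this PR can be pictured with a small, Qt-free sketch. It is illustrative only: `image_has_annotations` borrows a name from the table but its signature here is a guess, `visible_images` and the three mode strings are hypothetical, and `all_annotations` is assumed to map image name to a dict of class name to annotation list.

```python
from typing import Dict, Iterable, List

# Assumed shape: {"img.png": {"class_name": [annotation, ...]}}
Annotations = Dict[str, Dict[str, list]]


def image_has_annotations(image_name: str, all_annotations: Annotations) -> bool:
    """True if any class holds at least one annotation for this image."""
    per_class = all_annotations.get(image_name, {})
    return any(len(items) > 0 for items in per_class.values())


def visible_images(
    image_names: Iterable[str],
    all_annotations: Annotations,
    mode: str = "all",
) -> List[str]:
    """Return the names to show, sorted alphabetically (case-insensitive)."""
    names = sorted(image_names, key=str.casefold)
    if mode == "all":
        return names
    if mode == "annotated":
        return [n for n in names if image_has_annotations(n, all_annotations)]
    if mode == "unannotated":
        return [n for n in names if not image_has_annotations(n, all_annotations)]
    raise ValueError(f"unknown filter mode: {mode!r}")
```

Computing the sorted, filtered name list up front and then populating the widget keeps the ordering logic independent of Qt signals, which matters when changing the current row triggers an image switch.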

Communication: `ImageLabel` does not import controllers directly —