Closed
27 commits
f0286fc
docs: Add temporary self-deleting upstream-issue backlog to CLAUDE.md
cofade Jun 12, 2026
a4bc753
feat: Add temp-roadmap skill for self-removing CLAUDE.md backlogs
cofade Jun 12, 2026
6bc51db
feat: Filter image list by annotation status (upstream #27)
cofade Jun 12, 2026
a68dfb4
feat: Fall back to CPU on unsupported CUDA compute capability (upstre…
cofade Jun 12, 2026
65ea399
docs: arc42 updates for #27/#57 + execute backlog deletion hook
cofade Jun 12, 2026
36da1ae
fix: Address senior review findings on filter wiring and device probe
cofade Jun 12, 2026
8ef082c
fix: Hide non-matching current row from filtered image list (#27)
cofade Jun 19, 2026
58af8bf
Merge pull request #19 from cofade/feature/issue-27-57-image-filter-a…
cofade Jun 19, 2026
e833a56
fix: Save trained YOLO model with .save() not .export() (#30)
cofade Jun 19, 2026
24e96f9
fix: Open all text files as UTF-8 to prevent Windows charmap crash (#44)
cofade Jun 19, 2026
384467a
fix: Edit smallest containing polygon for nested annotations (#33)
cofade Jun 19, 2026
3853bcd
feat: Sort image list alphabetically (#60)
cofade Jun 19, 2026
bf8a806
fix: Handle LZW/compressed TIFF without imagecodecs gracefully (#56)
cofade Jun 19, 2026
feb7e20
docs+test: arc42 for #60/#56, #44 encoding guard test, execute backlo…
cofade Jun 19, 2026
a02bde0
chore: Clear pre-existing lint (F401/F811/F541/F841) in touched files
cofade Jun 19, 2026
bb34c3b
fix: Address senior review on quick-wins (P1 codec heuristic + P2s)
cofade Jun 19, 2026
e28ba00
test: Pin that bare 'compression' no longer matches codec heuristic (…
cofade Jun 19, 2026
9937d8a
Merge pull request #20 from cofade/feature/quick-wins-30-44-33-60-56
cofade Jun 19, 2026
bb865f6
feat: Fine-tune SAM 2 / 2.1 on user annotations (bnsreenu#73)
cofade Jun 19, 2026
7d7f8c6
fix: Address senior review on SAM fine-tuning
cofade Jun 20, 2026
c870cac
fix: Restore SAM UI if training setup fails; review nits
cofade Jun 20, 2026
6892f3d
fix: ASCII arrow in export_sam_dataset print (cp1252 console safety)
cofade Jun 20, 2026
570de46
fix: Correct SAM fine-tuning mask shift + UI papercuts (PR #21)
cofade Jun 20, 2026
6aee148
chore: Drop redundant "Project Opened" success dialog
cofade Jun 20, 2026
7952e2c
Merge pull request #21 from cofade/feature/issue-73-sam-finetuning
cofade Jun 20, 2026
b448f36
fix: Light-mode contrast for DINO threshold-table headers
cofade Jun 20, 2026
077e604
feat: Canvas mask selection, multi-delete & visible selection styling…
cofade Jun 21, 2026
66 changes: 66 additions & 0 deletions .claude/skills/temp-roadmap/SKILL.md
@@ -0,0 +1,66 @@
---
name: temp-roadmap
description: Create a temporary, self-removing roadmap/backlog section in CLAUDE.md from a set of identified work items (issue triage, audit findings, review feedback). Use when the user asks to "track these items", "write this list into CLAUDE.md as a plan", "create a backlog/roadmap in CLAUDE.md", or wants a work list that cleans itself up as PRs land.
---

# Temporary Self-Removing Roadmap in CLAUDE.md

Turn a set of identified work items into a CLAUDE.md section that shrinks with
every PR and disappears entirely when the work is done — so CLAUDE.md returns
to its clean state without manual housekeeping.

## When to use

- After issue triage, a code audit, or a review produced a concrete list of
work items that will be resolved across several future PRs.
- NOT for single-PR task tracking (use the todo list) and NOT for permanent
guidance (write a normal CLAUDE.md section or arc42 doc instead).

## Structure to write into CLAUDE.md

Insert a section near the top of CLAUDE.md (after the docs index, before
project structure):

```markdown
## <Topic> Backlog (TEMPORARY section — self-deleting)

**Deletion hook:** When a PR resolves one of the items below, DELETE its row
from this table **in the same PR**. When the last row is gone, delete this
entire section so CLAUDE.md returns to its clean state. Never let a finished
item linger here.

Items validated <YYYY-MM-DD>; source: <issue tracker / audit / review link>.

| Item | Size | Task |
|------|------|------|
| #123 | quick win | One-line actionable description with file hints (`path/file.py`, function name, reporter-verified fix if any) |
| #124 | medium | ... |
| #125 | blocked | ... — name what it is blocked on so it can be re-checked |
```

## Rules

1. **One line per item.** Concise but self-sufficient: a future session must be
able to start the item from the row alone — include file paths, function
names, and known pitfalls ("do NOT use setSortingEnabled — currentRowChanged
fires switch_image").
2. **Classify every item**: `quick win` / `medium` / `large` / `blocked`.
For `blocked`, state the blocker.
3. **Reference the source** (issue number, ticket, review comment) so context
can be recovered.
4. **Date the validation** — rows describe code state at a point in time;
re-verify before implementing if the date is old.
5. **The deletion hook is mandatory** and must appear verbatim-in-spirit at the
top of the section. It is the whole point: the section is a consumable, not
documentation.
6. Mark items currently being worked on with *(in progress on this branch)* so
parallel sessions don't double-pick them.

## Per-PR workflow (executing the hook)

1. Pick item(s), implement on a feature branch.
2. In the same PR: delete the resolved row(s) from the table.
3. If the table is now empty: delete the entire section, including the heading
and the deletion-hook paragraph.
4. The PR diff thus always shows both the fix and the backlog shrinking —
reviewers can see progress without a separate tracker.
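
The per-PR workflow above can be sketched as a small text transformation. This is a minimal illustration only: the skill describes a manual edit made in the PR, and `resolve_item` is a hypothetical helper name, not part of the repository. It assumes the section starts at a `## ` heading, ends at the next `## ` heading, and contains one pipe table whose first column holds the item reference.

```python
import re


def resolve_item(text, heading, item):
    """Drop the row for `item`; drop the whole section once no rows remain.

    Hypothetical helper. Assumes no "## " lines inside fenced code blocks
    between the section heading and the next real heading.
    """
    lines = text.splitlines()
    # Section spans from its "## " heading to the next "## " heading (or EOF).
    start = next((i for i, ln in enumerate(lines)
                  if ln.startswith("## ") and heading in ln), None)
    if start is None:
        return text
    end = next((i for i in range(start + 1, len(lines))
                if lines[i].startswith("## ")), len(lines))
    section = lines[start:end]

    # The table body begins right after the |---|---| separator row.
    sep = next((i for i, ln in enumerate(section)
                if re.fullmatch(r"\|[\s:|-]+\|", ln.strip())), None)
    if sep is None:
        return text
    first, last = sep + 1, sep + 1
    while last < len(section) and section[last].lstrip().startswith("|"):
        last += 1

    kept = [row for row in section[first:last]
            if row.split("|")[1].strip() != item]
    # Empty table: remove heading, deletion hook, and table together.
    new_section = section[:first] + kept + section[last:] if kept else []
    trailing = "\n" if text.endswith("\n") else ""
    return "\n".join(lines[:start] + new_section + lines[end:]) + trailing


DOC = """# CLAUDE.md

## Demo Backlog (TEMPORARY section)

**Deletion hook:** delete a row when its PR lands.

| Item | Size | Task |
|------|------|------|
| #1 | quick win | first task |
| #2 | medium | second task |

## Project Structure
"""

after_one = resolve_item(DOC, "Demo Backlog", "#1")
after_two = resolve_item(after_one, "Demo Backlog", "#2")
```

Resolving `#1` leaves the section with a single row; resolving `#2` removes the heading and the deletion-hook paragraph along with the table, which is the end state rule 3 of the workflow asks for.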
6 changes: 4 additions & 2 deletions .gitignore
@@ -9,10 +9,12 @@ models/
*.iap

# Local tooling — .claude is mostly local state, but the senior-reviewer
# agent definition under .claude/agents/ is tracked (referenced by
# CLAUDE.md as a mandatory quality gate), so un-ignore that subtree.
# agent definition under .claude/agents/ and project skills under
# .claude/skills/ are tracked (referenced by CLAUDE.md / used as shared
# workflow tooling), so un-ignore those subtrees.
.claude/*
!.claude/agents/
!.claude/skills/
.venv

# Python cache and build artifacts
35 changes: 32 additions & 3 deletions CLAUDE.md
@@ -36,6 +36,24 @@ For detailed architecture and design information, see **[docs/](docs/)**:

See [docs/README.md](docs/README.md) for full documentation index.

## Upstream Issue Backlog (TEMPORARY section — self-deleting)

**Deletion hook:** When a PR resolves one of the items below, DELETE its row
from this table **in the same PR**. When the last row is gone, delete this
entire section so CLAUDE.md returns to its clean state. Never let a finished
item linger here.

Issue numbers refer to https://github.com/bnsreenu/digitalsreeni-image-annotator/issues
(validated 2026-06-12; already-fixed issues have close-request comments posted, not listed here).

| Issue | Size | Task |
|-------|------|------|
| #32 + #36 | medium | Annotations can extend outside image bounds (manual edit + Image Augmenter) → clamp coords on commit; clip augmented polygons to image rect (shapely intersection). Silently poisons training data |
| #40 | medium | True bbox editing (move whole box, drag edges, keep rectangularity). Currently bbox annotations aren't editable at all — `start_polygon_edit` only matches `"segmentation"` |
| #63 | blocked | SAM 3 support — blocked on Ultralytics shipping SAM 3; re-check their releases before attempting |
| #35 | large | Keypoint annotation tool |
| #24 | large | Magic-wand-style point add/remove mask refinement (partially covered by SAM point prompts) |

## Project Structure

```
@@ -47,14 +65,17 @@ src/digitalsreeni_image_annotator/
├── __init__.py # Public API re-exports
├── core/ # constants, annotation_utils, image_utils
├── controllers/ # 7 controllers (project, image, sam, dino,
│ # yolo, annotation, class) + io_controller
├── controllers/ # 8 controllers (project, image, sam,
│ # sam_train, dino, yolo, annotation,
│ # class) + io_controller
├── widgets/
│ ├── image_label.py # ImageLabel canvas widget (dispatcher)
│ ├── canvas_context.py # CanvasContext read accessor (ADR-018)
│ └── tools/ # Per-tool handlers (ADR-019): rectangle,
│ # polygon, paint, eraser
├── inference/ # sam_utils.py, dino_utils.py
├── training/ # SAM fine-tuning (ADR-021): sam_trainer.py
│ # (SAMFineTuner), sam_dataset.py
├── io/ # export_formats.py, import_formats.py
├── ui/ # menu_bar, sidebar, shortcuts, theme, stylesheets
└── dialogs/ # Standalone tool dialogs (statistics,
@@ -76,8 +97,10 @@ src/digitalsreeni_image_annotator/
| `SAMController` | controllers/sam_controller.py | SAM model picker, debounce, in-flight guard (ADR-013) |
| `DINOController` | controllers/dino_controller.py | DINO single/batch detection, batch review, temp-class workflow |
| `YOLOController` | controllers/yolo_controller.py | YOLO training menu + prediction wiring |
| `SAMUtils` | inference/sam_utils.py | Load SAM models, run inference |
| `SAMUtils` | inference/sam_utils.py | Load SAM models (built-in + fine-tuned), run inference |
| `DINOUtils` | inference/dino_utils.py | Grounding-DINO model load + inference |
| `SAMFineTuner` | training/sam_trainer.py | Fine-tune SAM 2 decoder/encoder via custom loop over Ultralytics SAM2Model (ADR-021) |
| `SAMTrainController` | controllers/sam_train_controller.py | SAM fine-tune menu, GPU gate, training thread, selector registration |

See [Building Block View](docs/05_building_block_view.md) for detailed class documentation.

@@ -169,6 +192,8 @@ See [Runtime View](docs/06_runtime_view.md#multi-dimensional-image-loading) for
| GPU model unload | `model.cpu()` → `gc.collect()` → `torch.cuda.empty_cache()` + `ipc_collect()` + `synchronize()` — full reclaim requires app restart due to per-process CUDA context | Setting refs to None alone leaves circular refs pinned and shows zero Task Manager drop. See [Releasing Model GPU Memory](docs/08_crosscutting_concepts.md#releasing-model-gpu-memory). |
| Export image-path lookup | Exact-key match first, substring fallback only | `"bee.jpg" in "honeybee.jpg"` is True — substring-only matching writes the wrong file. See [Export Format Filename Matching](docs/08_crosscutting_concepts.md#export-format-filename-matching). |
| F2 / global shortcuts | Use `QShortcut` with `Qt.ShortcutContext.ApplicationShortcut`, not `keyPressEvent` | `QTableWidget` consumes F2 for in-cell edit before it bubbles up. |
| Canvas ↔ list selection sync | Canvas selection (idle-mode click/Shift/rubber-band) drives the annotation list via `apply_canvas_selection`; mirror the list with `blockSignals(True/False)` and match annotations by **value-equality**, never identity | PyQt round-trips `UserRole` dicts as copies and `image_label.annotations` is a deepcopy, so identity is never stable; un-blocked `setSelected` recurses through `update_highlighted_annotations`. Multi-select uses **Shift** (Ctrl stays pan). See [ADR-022](docs/09_architecture_decisions.md#adr-022-canvas-mask-selection-unified-with-the-annotation-list). |
| Selection rendering | Don't recolour a selected mask. Keep its class colour; draw a class-colour-independent overlay (dashed `_SELECTION_COLOR` blue bounding box + bright handle squares at corners/edge-midpoints, OGP-style) in a final pass — `_draw_selection_overlay`. Handles are visual-only (resize = #40). Default class colours come from `core/constants.py::default_class_color` (red last, muted) | Red selection was invisible on a red-class mask; a thin dashed outline alone was too faint; the handles carry the visibility. See ADR-022 amendment. |
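
The "Export image-path lookup" row in the table above can be illustrated with a short sketch. This is an assumption-laden example, not the project's code: `find_image_path` and the `image_paths` mapping (file name to full path) are hypothetical names chosen for illustration.

```python
from typing import Dict, Optional


def find_image_path(filename: str, image_paths: Dict[str, str]) -> Optional[str]:
    """Resolve a file name to a path: exact key first, substring only as fallback.

    Hypothetical helper; `image_paths` maps a key (file name) to a full path.
    """
    # 1. Exact key match always wins, so "bee.jpg" never resolves to "honeybee.jpg".
    if filename in image_paths:
        return image_paths[filename]
    # 2. Fallback: first key that contains the requested name.
    for key, path in image_paths.items():
        if filename in key:
            return path
    return None
```

With substring matching alone, a lookup for `bee.jpg` would return whichever key happens to contain it first; checking the exact key before the fallback removes that dependence on dictionary order.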

## Development Workflow

@@ -245,6 +270,10 @@ See [Risks and Technical Debt](docs/11_risks_and_technical_debt.md) for full lis
|--------|--------|
| Ctrl+Wheel | Zoom |
| Ctrl+Drag | Pan |
| Click / Shift+Click (no tool) | Select / toggle mask |
| Drag / Shift+Drag (no tool) | Rubber-band select / add |
| Double-click | Vertex-edit mode |
| Delete | Delete selected mask(s) |
| Enter | Finish/Accept |
| Esc | Cancel |
| Up/Down | Navigate slices |
8 changes: 8 additions & 0 deletions README.md
@@ -99,6 +99,14 @@ python -c "import torch; print(torch.cuda.is_available(), torch.cuda.get_device_

You should see `True` and your GPU name. For other platforms or driver combinations, use the official selector at <https://pytorch.org/get-started/locally/>.

#### Older NVIDIA GPUs (Pascal / Maxwell)

PyTorch ≥ 2.8 wheels no longer include kernels for GPUs older than Volta (compute capability < 7.0), e.g. the GTX 10xx series (sm_61). On such cards the app detects the mismatch, warns once, and automatically runs inference on the CPU instead of crashing with `CUDA error: no kernel image is available`. To keep using the GPU, install an older PyTorch that still supports it:

```bash
pip install torch==2.4.1 torchvision==0.19.1 --index-url https://download.pytorch.org/whl/cu121
```
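
The fallback decision described above can be sketched as a pure function. This is a minimal illustration under stated assumptions, not the app's actual device probe: `choose_device` is a hypothetical name, and the `(7, 0)` threshold simply restates the "older than Volta" cut-off from the paragraph above.

```python
from typing import Optional, Tuple


def choose_device(cuda_available: bool,
                  capability: Optional[Tuple[int, int]],
                  min_supported: Tuple[int, int] = (7, 0)) -> str:
    """Return "cuda" only when the GPU's compute capability is supported.

    Hypothetical helper. `capability` is a (major, minor) pair such as the
    one returned by torch.cuda.get_device_capability().
    """
    if not cuda_available or capability is None:
        return "cpu"
    # Tuples compare element-wise, so (6, 1) < (7, 0) < (8, 6).
    return "cuda" if tuple(capability) >= tuple(min_supported) else "cpu"


# Feeding it from PyTorch (left as a comment so the sketch runs without torch):
#   import torch
#   cap = torch.cuda.get_device_capability() if torch.cuda.is_available() else None
#   device = choose_device(torch.cuda.is_available(), cap)
```

A GTX 10xx card reports `(6, 1)` and therefore resolves to `"cpu"`, while a card reporting `(7, 0)` or higher stays on `"cuda"`.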

## Usage

1. Run the DigitalSreeni Image Annotator application:
25 changes: 22 additions & 3 deletions docs/05_building_block_view.md
@@ -33,7 +33,8 @@ src/digitalsreeni_image_annotator/
├── utils.py # Cross-cutting utilities
├── core/ # Constants, annotation utils, image utils
│ ├── constants.py
│ └── annotation_utils.py
│ ├── annotation_utils.py
│ └── torch_utils.py # Shared torch device resolution + CPU fallback (#57)
├── widgets/
│ ├── image_label.py # ImageLabel - canvas widget; dispatcher
│ ├── canvas_context.py # CanvasContext - narrow read view (ADR-018)
@@ -155,6 +156,23 @@ The earlier subprocess approach is documented as
[ADR-011](09_architecture_decisions.md#adr-011-run-torch-based-workers-in-isolated-subprocesses)
(Superseded).

### SAM Fine-Tuning Subsystem (`training/`)

Lets users fine-tune SAM 2 / 2.1 on their own annotations, since
Ultralytics ships no SAM trainer (ADR-021). Distinct from `inference/`
because it is *training*, not inference.

| Module | Responsibility |
|--------|----------------|
| `training/sam_trainer.py` | `SAMFineTuner` — custom decoder (optionally encoder) fine-tuning loop reusing `SAM2Predictor.get_im_features` / `prompt_inference` under autograd, focal+dice loss, AdamW, checkpoint save+reload-verify. Also geometry helpers (`polygon_to_mask`, `mask_to_xyxy`, `mask_to_point`), `make_custom_filename`, `list_custom_models`, and the `SampleGroup` lazy-rasterising dataset item. |
| `training/sam_dataset.py` | `build_groups_from_project` (live `all_annotations`) and `build_groups_from_folder` (prepared dataset) → `list[SampleGroup]`, mirroring `export_yolo_v5plus` image resolution. |
| `io/export_formats.py::export_sam_dataset` | Writes `images/` + `manifest.json` (authoritative bbox/segmentation specs) for an inspectable, re-trainable on-disk dataset. |

Fine-tuned checkpoints save as `{"model": state_dict}` and reload
through the unchanged `SAM(path)` inference path; `SAMUtils` gains a
`custom_models` registry so they appear in the SAM selector alongside
the eight built-ins.
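
To make the role of the geometry helpers concrete, here is a dependency-free sketch of how a binary mask could be turned into a bounding-box prompt and a point prompt. The names `mask_bbox_xyxy` and `mask_point` are hypothetical and deliberately differ from the repository's `mask_to_xyxy` / `mask_to_point`, whose signatures are not shown here; the inclusive pixel-index convention and the nearest-to-centroid rule are assumptions.

```python
def mask_bbox_xyxy(mask):
    """Tight (x_min, y_min, x_max, y_max) box around the foreground pixels.

    `mask` is a list of rows of 0/1 values; coordinates are inclusive pixel
    indices. Returns None for an empty mask. Hypothetical helper.
    """
    ys = [y for y, row in enumerate(mask) if any(row)]
    if not ys:
        return None
    xs = [x for row in mask for x, value in enumerate(row) if value]
    return (min(xs), min(ys), max(xs), max(ys))


def mask_point(mask):
    """Foreground pixel closest to the mask centroid, as an (x, y) pair.

    Snapping to a real foreground pixel keeps the prompt inside the object
    even when the centroid of a concave shape falls on background.
    """
    points = [(x, y) for y, row in enumerate(mask)
              for x, value in enumerate(row) if value]
    if not points:
        return None
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    return min(points, key=lambda p: (p[0] - cx) ** 2 + (p[1] - cy) ** 2)
```

Either result can serve as the prompt for one training instance, matching the "prompt (bbox/point)" choice offered during fine-tuning.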

### DINO Subsystem (Grounding DINO + SAM pipeline)

LLM-assisted detection: the user gives free-form text phrases per class,
@@ -195,7 +213,7 @@ export time — see [Cross-cutting Concepts](08_crosscutting_concepts.md)).

## Level 3: Controllers

Seven `QObject` controllers plus an `io_controller` helper module
Eight `QObject` controllers plus an `io_controller` helper module
carve `ImageAnnotator` into single-responsibility owners that the
orchestrator delegates to. Each `QObject` controller holds `self.mw
= main_window` and owns one slice of behaviour; the
@@ -208,12 +226,13 @@ the controller graph.
| Controller | Responsibility |
|------------|----------------|
| `ProjectController` | `.iap` save/load, auto-save, backup/restore, missing-image prompts, window-title sync. Owns the `is_loading_project` autosave guard (load/save round-trip safety, v0.8.12). |
| `ImageController` | Open / load / switch images and slices. TIFF + CZI loaders, the multi-dim `DimensionDialog`, the `[-ndim:]` axis-slice bug fix from the v0.9.0 era. |
| `ImageController` | Open / load / switch images and slices. TIFF + CZI loaders (with `imagecodecs` codec-error handling — #56), the multi-dim `DimensionDialog`, the `[-ndim:]` axis-slice bug fix from the v0.9.0 era. Image-list annotation-status filter (`image_has_annotations`, `apply_image_filter` — #27) and alphabetical sort (`sort_image_list` — #60). |
| `AnnotationController` | Annotation CRUD, list sorting, highlight, edit-mode entry/exit, `finish_polygon`, `finish_rectangle`, `replace_annotations` (eraser path). Validates writes before mutating `all_annotations`. |
| `ClassController` | Class add / delete / rename / colour / visibility. `update_slice_list_colors`, `is_class_visible`. |
| `SAMController` | SAM box/points tool lifecycle, debounce timer, `_sam_inference_in_flight` re-entrancy guard (ADR-013), model picker. |
| `DINOController` | Single + batch detection, batch review navigation, temp-annotation accept/reject, custom-model browse, `DINOReviewEventFilter` ownership (ADR-015). |
| `YOLOController` | Training menu, `TrainingThread`, prediction dialog, result processing. |
| `SAMTrainController` | SAM fine-tuning menu, GPU gate, `SAMTrainingThread`, config dialog, registers fine-tuned checkpoints into the SAM selector (ADR-021). |
| `io_controller` *(module-level functions, not a class)* | Thin UI wrappers around the pure `io/export_formats.py` and `io/import_formats.py` modules. |

Communication: `ImageLabel` does not import controllers directly —
66 changes: 66 additions & 0 deletions docs/06_runtime_view.md
@@ -66,6 +66,37 @@ User presses Enter
└─> update() to show final annotation
```

## Mask Selection & Deletion on the Canvas (issue #75)

Active only when no drawing/SAM tool is selected (`ImageLabel._is_select_mode()`).
Double-click still enters vertex-edit; Ctrl+drag still pans.

```
User clicks / drags on image (no tool active)
├─> ImageLabel mouse press/move/release
│ ├─> click → annotation_at(pos) (smallest mask, seg or bbox)
│ ├─> click empty → [] (clears selection)
│ ├─> drag → annotations_in_rect(rect) (rubber band, bounds-intersect)
│ └─> Shift → toggle (click) / add (drag)
├─> emit canvasSelectionChanged(annotations, mode) mode = replace|add|toggle
└─> AnnotationController.apply_canvas_selection()
├─> compute new set from highlighted_annotations + annotations per mode
├─> image_label.highlighted_annotations = new (blue selection overlay)
├─> mirror onto annotation_list (blockSignals while selecting)
└─> enable Merge (≥2) / Change Class (≥1)

User presses Delete (canvas focused)
├─> ImageLabel.keyPressEvent → deleteSelectionRequested
└─> AnnotationController.delete_selected_annotations() (confirm → remove → re-sort → autosave)
```

The canvas and the list share one selection (matched by dict value-equality), so
Delete/Merge/Change-Class behave the same from either surface. See ADR-022.
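
The selection update in the flow above can be sketched as follows. This is an illustrative reading of the `replace|add|toggle` modes, not the body of `apply_canvas_selection`; the function name `apply_selection` and the annotation dict keys are assumptions. The point it demonstrates is that membership is tested with `==` on the dicts, so a deep copy of an annotation still counts as the same annotation.

```python
import copy


def apply_selection(current, picked, mode):
    """Combine the current selection with newly picked annotations.

    Hypothetical helper. Annotations are plain dicts and are matched by
    value-equality, because copies made by Qt or deepcopy never share identity.
    """
    if mode == "replace":
        return list(picked)
    if mode not in ("add", "toggle"):
        raise ValueError(f"unknown selection mode: {mode!r}")
    result = list(current)
    for ann in picked:
        if ann in result:              # list membership compares with ==
            if mode == "toggle":
                result.remove(ann)     # remove() also matches by ==
        else:
            result.append(ann)
    return result


cell = {"category_name": "cell", "segmentation": [0, 0, 4, 0, 4, 4]}
nucleus = {"category_name": "nucleus", "bbox": [1, 1, 2, 2]}
cell_copy = copy.deepcopy(cell)        # same value, different object

toggled_off = apply_selection([cell], [cell_copy], "toggle")
added = apply_selection([cell], [cell_copy, nucleus], "add")
replaced = apply_selection([cell, nucleus], [nucleus], "replace")
cleared = apply_selection([cell], [], "replace")
```

An identity check (`is`) would treat `cell_copy` as a new annotation and select it a second time instead of toggling it off.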

## SAM-Assisted Annotation (SAM-box / SAM-points)

```
@@ -370,3 +401,38 @@ User clicks "Export" > "YOLO v8/v11"
└─> Return
```

## SAM Fine-Tuning (annotate → train → use)

See [ADR-021](09_architecture_decisions.md#adr-021-sam-fine-tuning-via-a-custom-loop-over-the-ultralytics-sam2-module).

```
User: SAM Fine-Tune (beta) > Train on Current Project…
├─> build_groups_from_project(all_annotations, image_paths, slices, image_slices)
│ polygons/bboxes → SampleGroup(image_loader, specs) (masks rasterised lazily)
├─> _gpu_gate(): resolve_torch_device(); if "cpu" → warn + let user back out
├─> SAMTrainConfigDialog: base model, epochs, lr, batch, prompt (bbox/point),
│ "also fine-tune image encoder?"
├─> deactivate_sam_tools() + lock SAM inference UI (tools, selector, menu)
│ trainer loads its OWN SAM instance; locking avoids a 2nd model on the same CUDA context
└─> SAMTrainingThread → SAMFineTuner.train(...)
│ build predictor (one warmup predict), pin device, apply freeze policy
│ for each epoch / image / instance:
│ set_image → get_im_features (no_grad when encoder frozen)
│ prompt_inference(bbox|point) under enable_grad → mask logits
│ focal+dice loss → backward → AdamW step (every batch_size instances)
│ progress_signal → TrainingInfoDialog (Stop supported)
│ save {"model": state_dict} as <name>_<base_token>.pt → reload-verify via SAM()
└─> training_finished: register in SAMUtils.custom_models,
add "★ <name>" to the SAM selector and select it
→ SAM-box / SAM-points now use the fine-tuned model

Offline variant: "Prepare SAM Dataset…" → export_sam_dataset (images/ + manifest.json),
then "Train from Dataset Folder…" → build_groups_from_folder → same training path.
```
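
The "focal+dice loss" step in the flow above can be illustrated numerically without a tensor library. This is a sketch of the two standard loss terms on a flattened mask, not `SAMFineTuner`'s code: the function names, the `alpha`/`gamma` defaults, the smoothing term, and the equal weighting are all assumptions, and the real loop would compute the same quantities on tensors so that gradients can flow.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def focal_loss(logits, targets, alpha=0.25, gamma=2.0):
    """Mean binary focal loss: down-weights pixels that are already easy."""
    total = 0.0
    for z, t in zip(logits, targets):
        p = sigmoid(z)
        p_t = p if t == 1 else 1.0 - p          # probability of the true class
        a_t = alpha if t == 1 else 1.0 - alpha  # class-balance weight
        total += -a_t * (1.0 - p_t) ** gamma * math.log(max(p_t, 1e-12))
    return total / len(logits)


def dice_loss(logits, targets, smooth=1.0):
    """1 - soft Dice overlap between predicted probabilities and the mask."""
    probs = [sigmoid(z) for z in logits]
    intersection = sum(p * t for p, t in zip(probs, targets))
    return 1.0 - (2.0 * intersection + smooth) / (sum(probs) + sum(targets) + smooth)


def focal_dice_loss(logits, targets, focal_weight=1.0, dice_weight=1.0):
    """Weighted sum of both terms; the weights here are placeholders."""
    return (focal_weight * focal_loss(logits, targets)
            + dice_weight * dice_loss(logits, targets))


target = [1, 1, 0, 0]
good = focal_dice_loss([8.0, 8.0, -8.0, -8.0], target)   # confident and correct
bad = focal_dice_loss([-8.0, -8.0, 8.0, 8.0], target)    # confident and wrong
```

The focal term penalises per-pixel errors while the Dice term scores overall overlap, so a prediction that matches the mask scores close to zero on both and an inverted prediction scores high on both.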