diff --git a/.claude/agents/senior-reviewer.md b/.claude/agents/senior-reviewer.md index 2cb8486..58d047b 100644 --- a/.claude/agents/senior-reviewer.md +++ b/.claude/agents/senior-reviewer.md @@ -7,7 +7,7 @@ color: red You are a senior staff engineer with 20 years of experience. You have shipped systems that outlived three reorgs. You have seen every flavour of "we'll clean this up later." You are in a bad mood today. You give honest, direct, unsweetened feedback. You do NOT pad with praise. You call out sloppiness, missing rigor, hand-waving, and architecture-by-vibes. You are fair — if something is genuinely good, you grudgingly say so in one sentence — but the default is critical. -You are NOT the author. Treat this as an independent review of pending changes for the DigitalSreeni Image Annotator (PyQt5 desktop app for scientific image annotation with SAM 2 integration). +You are NOT the author. Treat this as an independent review of pending changes for the DigitalSreeni Image Annotator (PyQt6 desktop app for scientific image annotation with SAM 2 integration). ## Operating principles @@ -24,7 +24,7 @@ The default scope is the diff between the current branch and upstream master (`g Cover the following dimensions; only report findings, not the dimensions themselves: 1. **Correctness against the user story / acceptance criteria.** Identify gaps (claimed but not implemented), overreach (scope creep), and silent regressions in adjacent code. -2. **Code quality and patterns.** Does new code follow existing patterns in the codebase, or did the author invent a parallel mechanism? Premature abstractions, copy-paste duplication, defensive code for impossible states, swallowed exceptions, fallbacks that hide failures, half-finished implementations. PyQt5 specifics: signal/slot wiring, widget lifecycle, threading off the GUI thread, coordinate-system bugs. +2. **Code quality and patterns.** Does new code follow existing patterns in the codebase, or did the author invent a parallel mechanism? Premature abstractions, copy-paste duplication, defensive code for impossible states, swallowed exceptions, fallbacks that hide failures, half-finished implementations. PyQt6 specifics: signal/slot wiring, widget lifecycle, threading off the GUI thread (the `_run_sync` event-loop-pump pattern in `sam_utils.py` and re-entrancy guards at call sites), coordinate-system bugs, enum namespacing (`Qt.AlignmentFlag.AlignCenter` etc. — Qt6 is strict). 3. **Tests.** This project has no automated tests (yet). For any new feature, flag whether manual testing instructions are at least present in the commit message or a plan file. If a feature could regress silently, that's P1 minimum. 4. **Documentation accuracy.** Where the change touches behaviour described in docs (`CLAUDE.md`, arc42 chapters under `docs/`), do the docs still match? Documentation drift is debt that compounds; flag it. 5. **Cross-document consistency.** When several docs reference the same concept, do they agree after the change? Re-grep for stale references. @@ -35,8 +35,7 @@ Cover the following dimensions; only report findings, not the dimensions themsel - Coordinate system conventions respected (zoom_factor, offset_x/y)? - `is_loading_project` guard checked before save operations? - DINO config persisted in `.iap` with backward compat? - - No torch/transformers imports in main process (subprocess-only)? - - **Worker subprocess PyQt isolation (ADR-011).** If `sam_worker.py` or `dino_worker.py` was touched, run `python tools/check_worker_isolation.py`. Exit code 0 means both workers can be imported without pulling PyQt5 into the interpreter; non-zero means the WinError 1114 DLL load-order bug has been re-introduced. The script uses `importlib.abc.MetaPathFinder.find_spec` (the modern API) plus a `sys.modules` sweep to catch leaks even if a finder is bypassed. Negative-test verified. + - **In-process inference re-entrancy (ADR-013).** SAM/DINO inference runs on a `QThread` while the calling thread pumps its event loop via `_run_sync`. The torch/ultralytics/transformers model objects are not thread-safe, so a second call must not start while a first is still running. Verify `_inference_in_flight` guard in `sam_utils._run_sync` still raises `InferenceBusyError` on re-entry, and that timer-driven call sites (especially `apply_sam_prediction` in `annotator_window.py`) carry their own busy guard. Silently returning `None` on a load failure would be a regression — exceptions must propagate out of the worker via `_InferenceThread._exc`. ## How to investigate diff --git a/.github/workflows/tests.yml b/.github/workflows/tests.yml index 69308a5..4bdf156 100644 --- a/.github/workflows/tests.yml +++ b/.github/workflows/tests.yml @@ -13,7 +13,7 @@ jobs: fail-fast: false matrix: os: [ubuntu-latest, windows-latest, macos-latest] - python-version: ['3.10', '3.11', '3.12'] + python-version: ['3.10', '3.11', '3.12', '3.13'] steps: - uses: actions/checkout@v4 @@ -27,9 +27,14 @@ jobs: if: runner.os == 'Linux' run: | sudo apt-get update - sudo apt-get install -y libxcb-xinerama0 libxcb-icccm4 libxcb-image0 libxcb-keysyms1 libxcb-randr0 libxcb-render-util0 libxcb-shape0 libxcb-xfixes0 libxkbcommon-x11-0 libdbus-1-3 - # For headless Qt testing - sudo apt-get install -y xvfb x11-utils libxkbcommon-x11-0 libxcb-cursor0 + # Qt6 Linux runtime: XCB plugin set + libEGL + xkbcommon + + # libxcb-cursor0 (required by Qt 6; was optional in Qt 5). + # Plus xvfb for headless test runs. + sudo apt-get install -y \ + libxcb-xinerama0 libxcb-icccm4 libxcb-image0 libxcb-keysyms1 \ + libxcb-randr0 libxcb-render-util0 libxcb-shape0 libxcb-xfixes0 \ + libxcb-cursor0 libxkbcommon-x11-0 libdbus-1-3 libegl1 libgl1 \ + xvfb x11-utils - name: Cache pip packages uses: actions/cache@v4 diff --git a/CLAUDE.md b/CLAUDE.md index 2cfb7cb..ae5b906 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -2,7 +2,7 @@ This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository. -DigitalSreeni Image Annotator - PyQt5 desktop app for image annotation with SAM 2 integration and multi-dimensional image support. +DigitalSreeni Image Annotator - PyQt6 desktop app for image annotation with SAM 2 integration and multi-dimensional image support. **Fork of**: https://github.com/bnsreenu/digitalsreeni-image-annotator @@ -20,9 +20,9 @@ python -m src.digitalsreeni_image_annotator.main ## Tech Stack -Python 3.10+ | PyQt5 5.15.11 | Ultralytics 8.3.27 (SAM 2) | NumPy | OpenCV | Shapely +Python 3.10+ | PyQt6 6.7+ | Ultralytics 8.3.27 (SAM 2) | NumPy | OpenCV | Shapely -**No automated tests exist** - all testing is manual. +**Test suite**: `tests/` (pytest + pytest-qt). 65 tests pass on PyQt6. ## Documentation @@ -83,11 +83,16 @@ self.all_annotations[self.image_file_name].append({ ### SAM Integration +SAM runs in-process; the Ultralytics model object lives on `SAMUtils` +and persists across calls. Inference runs on a background QThread but +the public API is synchronous — see ADR-013 in +`docs/09_architecture_decisions.md`. + ```python -# Load model (first use downloads, ~40-400MB) -self.sam_utils.change_sam_model("SAM 2 tiny") +# Load model on first selection (downloads weights if missing, ~40-400MB) +self.sam_utils.change_sam_model("SAM 2 tiny") # blocks UI thread via QEventLoop spin -# Run inference +# Run inference (also runs on worker thread, returns when done) prediction = self.sam_utils.apply_sam_points( qimage, positive_points=[(x1, y1)], @@ -99,8 +104,8 @@ prediction = self.sam_utils.apply_sam_points( ## Important Notes ### Platform Support -- ✅ Windows, macOS fully supported -- ⚠️ Linux has XCB issues, limited testing +- ✅ Windows, macOS, Linux supported (PyQt6 native integration improved over PyQt5) +- Linux runtime needs libxcb-cursor0 (Qt6 requires this; was optional under Qt5) ### Critical: Project Loading **Always check `is_loading_project` flag before saving!** Autosave during load corrupts files (v0.8.12 fix). @@ -124,9 +129,24 @@ See [Cross-cutting Concepts](docs/08_crosscutting_concepts.md#coordinate-systems - Slices extracted with names like `stack.tif_T0_Z5_C0` - Each slice annotated independently - Stored in `image_slices` dict +- TIFF axis hint: `load_tiff` reads `tifffile.series[0].axes` and pre-fills the dimension dialog; ndim≥5 had a `[-ndim:]` slice bug that produced 2560 wrong slices on a 5D `TZCYX` file — see arc42 if you touch this See [Runtime View](docs/06_runtime_view.md#multi-dimensional-image-loading) for workflow. +### Patterns introduced in v0.9.0 (read before touching these areas) + +| Area | Pattern | Why | +|------|---------|-----| +| Pan / zoom-to-cursor in scroll area | Use `event.globalPosition()` for pan; derive post-zoom offset from `viewport().width()`, not `self.width()` | Widget-local coords absorb half the pan delta as the widget shifts; `self.width()` is stale during zoom-out before layout settles. See [Pan + Zoom Reference Frames](docs/08_crosscutting_concepts.md#pan--zoom-reference-frames). | +| Dark mode contrast | No hardcoded `background:` / `color:` in widget `setStyleSheet(...)` | Hardcoded greys override `soft_dark_stylesheet.py` and punch bright boxes into the sidebar. Add a global rule first, then write the widget. See [No Hardcoded Colors Rule](docs/08_crosscutting_concepts.md#dark-mode--no-hardcoded-colors-rule). | +| DINO review state | `image_label.temp_annotations` is a single field, **not** per-image — must be re-synced from `dino_batch_results` on every image/slice switch via `_refresh_dino_temp_for_current` | Otherwise the first image's masks bleed onto every subsequent slice during navigation. See [DINO Temp Annotations](docs/08_crosscutting_concepts.md#dino-temp-annotations--single-field-many-images). | +| DINO batch over stacks | Use `_collect_dino_batch_work_items()` to flatten regular images + every loaded slice; don't iterate `self.all_images` directly | Multi-dim images appear in `all_images` as a single entry — slices live in `self.image_slices[base_name]` and were silently skipped. | +| DINO Enter/Escape during review | Application-wide `_DINOReviewEventFilter`, gated on pending temp_annotations + no modal + no text input | `QListWidget` consumes Enter for `itemActivated` before `ImageLabel.keyPressEvent` sees it. See [ADR-015](docs/09_architecture_decisions.md#adr-015-application-wide-event-filter-for-dino-review-shortcuts). | +| Auto-accept dropdown | Honored by **both** `run_dino_detection_single` and `run_dino_detection_batch` | Easy to forget in the single path because the combo is labeled "batch". | +| GPU model unload | `model.cpu()` → `gc.collect()` → `torch.cuda.empty_cache()` + `ipc_collect()` + `synchronize()` — full reclaim requires app restart due to per-process CUDA context | Setting refs to None alone leaves circular refs pinned and shows zero Task Manager drop. See [Releasing Model GPU Memory](docs/08_crosscutting_concepts.md#releasing-model-gpu-memory). | +| Export image-path lookup | Exact-key match first, substring fallback only | `"bee.jpg" in "honeybee.jpg"` is True — substring-only matching writes the wrong file. See [Export Format Filename Matching](docs/08_crosscutting_concepts.md#export-format-filename-matching). | +| F2 / global shortcuts | Use `QShortcut` with `Qt.ShortcutContext.ApplicationShortcut`, not `keyPressEvent` | `QTableWidget` consumes F2 for in-cell edit before it bubbles up. | + ## Development Workflow **CRITICAL: Always use feature branches — NEVER commit directly to master.** @@ -151,7 +171,7 @@ Before opening a PR, verify at minimum: 4. **Dark mode** — toggle and check rendering of new UI elements 5. **Save/load roundtrip** — if the feature touches `.iap` project files, save, close, reopen, verify state restored 6. **Adjacent features** — verify no regression in SAM, annotation tools, export formats -7. **Subprocess features** — if touching `sam_worker.py` or `dino_worker.py`, verify inference still works (model loads, returns masks/boxes) +7. **Inference features** — if touching `sam_utils.py` or `dino_utils.py`, verify the model loads end-to-end (no silent load failure), returns masks/boxes, and the UI stays responsive during inference (timers, redraws, progress dialog cancels keep firing — see ADR-013) ### arc42 Documentation Update Rules @@ -172,7 +192,9 @@ Before every PR, run the senior reviewer agent (`.claude/agents/senior-reviewer. This is **mandatory** — the agent performs an independent end-of-implementation review: - Reads the actual diff, not commit messages - Ranks issues P0 (blocks merge) / P1 (should fix) / P2 (nit) -- Checks CLAUDE.md compliance (feature branches, coordinate systems, `is_loading_project` guards, DINO config persistence, subprocess isolation) +- Checks CLAUDE.md compliance (feature branches, coordinate systems, `is_loading_project` guards, DINO config persistence, in-process inference re-entrancy guards) + +**Run it in the foreground** — never `run_in_background: true`. The review is a blocking quality gate: the next steps (address P0s, push, open PR) depend on its findings. Launch the agent and wait for the result before doing anything else, then iterate until clean. Address all P0s before merging. Address P1s unless there's explicit justification. diff --git a/PYTHON314_SETUP.md b/PYTHON314_SETUP.md deleted file mode 100644 index 820c4d8..0000000 --- a/PYTHON314_SETUP.md +++ /dev/null @@ -1,239 +0,0 @@ -# Python 3.14 Setup Complete ✓ - -## Summary - -Successfully set up the testing infrastructure for the DigitalSreeni Image Annotator project with **Python 3.14.2** support. - -## What Was Fixed - -### 1. Dependency Compatibility - -**Problem**: Python 3.14 is bleeding-edge and requires specific dependency versions. - -**Solution**: -- Updated `numpy>=2.4.0` (required for Python 3.14) -- Changed all pinned versions in [setup.py](setup.py) to flexible constraints (`>=`) -- Updated [requirements.txt](requirements.txt) with Python 3.14-compatible versions - -### 2. PyTorch/Torch DLL Loading Issue - -**Problem**: PyTorch has DLL compatibility issues with Python 3.14 on Windows, causing: -``` -OSError: [WinError 1114] A dynamic link library (DLL) initialization routine failed. -Error loading "torch\lib\c10.dll" -``` - -**Solution**: Modified test imports to load modules directly by file path using `importlib.util`, bypassing the package `__init__.py` that imports torch: - -```python -import importlib.util - -# Import module directly by file path -module_path = os.path.join(os.path.dirname(__file__), '..', '..', 'src', 'digitalsreeni_image_annotator', 'utils.py') -spec = importlib.util.spec_from_file_location("utils", module_path) -utils = importlib.util.module_from_spec(spec) -spec.loader.exec_module(utils) -``` - -This allows tests to run without loading PyTorch. - -## Test Results - -### All Tests Passing ✓ - -```bash -.venv\Scripts\python.exe -m pytest tests/unit/ -v -``` - -**Results:** -- ✅ **47 tests passed** (27 utils + 20 conversions) -- ✅ **100% coverage** on [utils.py](src/digitalsreeni_image_annotator/utils.py) -- ✅ **16% coverage** on [image_label.py](src/digitalsreeni_image_annotator/image_label.py) (get_image_coordinates method) -- ✅ **2% overall coverage** (baseline established) - -### Test Suite Breakdown - -1. **[tests/unit/test_utils.py](tests/unit/test_utils.py)** - 27 tests - - `TestCalculateArea`: 9 tests (polygons, bboxes, edge cases) - - `TestCalculateBbox`: 9 tests (various polygon shapes) - - `TestNormalizeImage`: 9 tests (8-bit, 16-bit, float conversion) - -2. **[tests/unit/test_conversions.py](tests/unit/test_conversions.py)** - 20 tests - - `TestGetImageCoordinates`: 11 tests (zoom, pan, screen-to-image) - - `TestCoordinateConversionProperties`: 6 tests (parametrized) - - `TestEdgeCases`: 3 tests (edge cases) - -3. **[tests/integration/test_export_formats.py](tests/integration/test_export_formats.py)** - 20+ tests (not yet run) - - COCO JSON export - - YOLO format export - - Pascal VOC export - - Multi-dimensional slices - -## Files Modified - -### Configuration Files -- [requirements.txt](requirements.txt) - Updated to `numpy>=2.4.0`, flexible versions -- [setup.py](setup.py) - Changed pinned versions to flexible constraints -- [pytest.ini](pytest.ini) - Created pytest configuration - -### Test Files Created -- [tests/](tests/) - Test directory structure - - [tests/conftest.py](tests/conftest.py) - Pytest fixtures - - [tests/unit/test_utils.py](tests/unit/test_utils.py) - 27 utility function tests - - [tests/unit/test_conversions.py](tests/unit/test_conversions.py) - 20 coordinate conversion tests - - [tests/integration/test_export_formats.py](tests/integration/test_export_formats.py) - 20+ export tests - -### Documentation -- [TESTING.md](TESTING.md) - Complete testing guide -- [PYTHON314_SETUP.md](PYTHON314_SETUP.md) - This file -- [.github/workflows/tests.yml](.github/workflows/tests.yml) - CI/CD pipeline - -## How to Run Tests - -### Using Virtual Environment - -```bash -# Activate venv (Windows) -.venv\Scripts\activate - -# Run all unit tests -pytest tests/unit/ -v - -# Run specific test file -pytest tests/unit/test_utils.py -v - -# Run with coverage -pytest tests/unit/ -v --cov=src/digitalsreeni_image_annotator --cov-report=html - -# View coverage report -start htmlcov/index.html -``` - -### Direct Execution (without activating venv) - -```bash -# Windows -.venv\Scripts\python.exe -m pytest tests/unit/ -v - -# Run single test -.venv\Scripts\python.exe -m pytest tests/unit/test_utils.py::TestCalculateArea::test_polygon_area_square -v -``` - -## Dependencies Installed in .venv - -### Core Dependencies -- PyQt5 5.15.11 -- numpy 2.4.2 (Python 3.14 compatible) -- Pillow 12.1.0 -- opencv-python 4.13.0.90 -- shapely 2.1.2 -- ultralytics 8.4.9 (with torch 2.10.0) -- scikit-image 0.26.0 -- And 20+ other dependencies - -### Test Dependencies -- pytest 9.0.2 -- pytest-qt 4.5.0 -- pytest-cov 7.0.0 -- pytest-mock 3.15.1 -- coverage 7.13.2 - -## Known Limitations - -### PyTorch Integration Tests - -Tests that require SAM (Segment Anything Model) or torch will currently fail due to DLL loading issues. Workarounds: - -1. **Mock torch/SAM** in tests (future work) -2. **Skip torch-dependent tests** on Python 3.14 (future work) -3. **Wait for PyTorch update** with full Python 3.14 support - -### CI/CD Pipeline - -The GitHub Actions workflow ([.github/workflows/tests.yml](.github/workflows/tests.yml)) is configured for Python 3.10, 3.11, 3.12. To add Python 3.14: - -1. Wait for PyTorch to add Windows Python 3.14 support -2. Update workflow matrix to include `'3.14'` - -## Next Steps - -### Milestone 1.1 Complete ✓ - -- ✅ Test infrastructure setup -- ✅ Python 3.14 compatibility -- ✅ 47 unit tests passing -- ✅ 100% coverage on utils.py -- ✅ CI/CD pipeline configured - -### Milestone 1.2: PyQt6 Migration (Next) - -1. Update all PyQt5 imports to PyQt6 -2. Handle Qt enum changes (Qt.AlignLeft → Qt.AlignmentFlag.AlignLeft) -3. Update signal/slot syntax -4. Test for 100% feature parity -5. Update documentation - -### Future Testing Work - -1. **Add more unit tests** - - Polygon operations (Shapely) - - QImage ↔ NumPy conversions - - Export format helpers - -2. **Add integration tests** - - Test export formats (when ready) - - Test project save/load - - Test annotation workflows - -3. **Add UI tests** (pytest-qt) - - Test annotation creation - - Test SAM integration (when torch works) - - Test video loading (Phase 2) - -## Verification Commands - -```bash -# Verify Python version -.venv\Scripts\python.exe --version -# Output: Python 3.14.2 - -# Verify numpy version -.venv\Scripts\python.exe -c "import numpy; print(numpy.__version__)" -# Output: 2.4.2 - -# Run all tests -.venv\Scripts\python.exe -m pytest tests/unit/ -v -# Output: 47 passed - -# Check coverage -.venv\Scripts\python.exe -m pytest tests/unit/ --cov=src/digitalsreeni_image_annotator --cov-report=term -# Output: 2% coverage (utils.py at 100%) -``` - -## Resources - -- [pytest documentation](https://docs.pytest.org/) -- [numpy 2.4 release notes](https://numpy.org/doc/stable/release/2.4.0-notes.html) -- [PyTorch compatibility matrix](https://pytorch.org/get-started/locally/) -- [Python 3.14 what's new](https://docs.python.org/3.14/whatsnew/3.14.html) - -## Troubleshooting - -### "Module not found" errors - -Make sure you're using the venv Python: -```bash -.venv\Scripts\python.exe -m pytest tests/unit/ -v -``` - -### Torch DLL errors - -This is expected with Python 3.14. Tests are designed to work around this by importing modules directly. - -### Coverage warnings - -If you see "already imported", this is normal due to our direct import workaround. Coverage still tracks correctly. - ---- - -**Status**: ✅ Ready for PyQt6 Migration (Milestone 1.2) diff --git a/README.md b/README.md index 5565eb0..104ab92 100644 --- a/README.md +++ b/README.md @@ -4,7 +4,7 @@ ![License](https://img.shields.io/badge/license-MIT-green) ![PyPI version](https://img.shields.io/pypi/v/digitalsreeni-image-annotator.svg?style=flat-square) -A powerful and user-friendly tool for annotating images with polygons and rectangles, built with PyQt5. Now with additional supporting tools for comprehensive image processing and dataset management. +A powerful and user-friendly tool for annotating images with polygons and rectangles, built with PyQt6. Now with additional supporting tools for comprehensive image processing and dataset management. ## Support the Project @@ -66,7 +66,7 @@ Dr. Sreenivas Bhattiprolu ## Operating System Requirements -This application is built using PyQt5 and has been tested on macOS and Windows. It may experience compatibility issues on Linux systems, particularly related to the XCB plugin for PyQt5. Extensive testing on Linux systems has not been done yet. +This application is built using PyQt6 and runs on macOS, Windows and Linux. On Linux you'll need the standard Qt 6 runtime libraries (notably `libxcb-cursor0`, `libegl1`, `libgl1`, and the XCB plugin set) — `sudo apt install libxcb-cursor0 libegl1 libgl1 libxcb-xinerama0 libxkbcommon-x11-0` covers the common ones on Debian/Ubuntu. ## Installation diff --git a/TESTING.md b/TESTING.md index 8474c8f..ea3ede7 100644 --- a/TESTING.md +++ b/TESTING.md @@ -56,7 +56,7 @@ This document describes the testing infrastructure for the DigitalSreeni Image A 6. **CI/CD Pipeline** ✓ - Created [.github/workflows/tests.yml](.github/workflows/tests.yml) - Multi-platform testing: Ubuntu, Windows, macOS - - Multi-version testing: Python 3.10, 3.11, 3.12 + - Multi-version testing: Python 3.10, 3.11, 3.12, 3.13 - Automated coverage reporting (Codecov integration) - Coverage report artifacts @@ -130,46 +130,14 @@ pytest -m "not slow" -v - **Phase 2 Target**: 80% code coverage - **Phase 3 Target**: 90% code coverage -## Known Issues +## Headless Testing -### Python 3.14 + PyTorch Compatibility +All tests run under `QT_QPA_PLATFORM=offscreen` so they work in CI and SSH +sessions without a display. The Linux runner needs the Qt 6 platform-plugin +deps (`libxcb-cursor0`, `libegl1`, `libgl1`, etc. — see +[`.github/workflows/tests.yml`](.github/workflows/tests.yml) for the full list). -**Issue**: PyTorch (torch) has DLL loading issues with Python 3.14 on Windows, causing access violations when importing ultralytics/SAM. - -**Workaround**: Tests use `importlib.util.spec_from_file_location()` to import modules directly by file path, bypassing the package `__init__.py` that imports torch. This allows unit tests to run without loading PyTorch. - -**Impact**: -- ✓ Unit tests work fine (utils, conversions) -- ✓ Integration tests that don't use SAM work -- ⚠️ Tests requiring SAM/torch will need mocking or skipping until PyTorch adds full Python 3.14 support - -**Dependencies updated for Python 3.14**: -- `numpy>=2.4.0` (Python 3.14 requires numpy 2.4+) -- Other dependencies use latest compatible versions - -### Virtual Environment - -To use the project's .venv with Python 3.14: -```bash -# Windows -.venv\Scripts\activate -.venv\Scripts\python.exe -m pytest tests/unit/ -v - -# Linux/macOS -source .venv/bin/activate -python -m pytest tests/unit/ -v -``` - -## Next Steps - -### Milestone 1.2: PyQt6 Migration - -- Update all PyQt5 imports to PyQt6 -- Handle Qt enum changes -- Test for feature parity -- Update documentation - -### Future Testing Work +## Future Testing Work 1. **Add UI Tests** (pytest-qt) - Test annotation creation workflows diff --git a/docs/01_introduction_and_goals.md b/docs/01_introduction_and_goals.md index b2bf839..a351e87 100644 --- a/docs/01_introduction_and_goals.md +++ b/docs/01_introduction_and_goals.md @@ -2,7 +2,7 @@ ## Overview -DigitalSreeni Image Annotator is a PyQt5-based desktop application for annotating images with polygons, rectangles, and paint tools. It integrates SAM 2 (Segment Anything Model) for semi-automated annotations and supports multi-dimensional images (TIFF stacks, CZI files). +DigitalSreeni Image Annotator is a PyQt6-based desktop application for annotating images with polygons, rectangles, and paint tools. It integrates SAM 2 (Segment Anything Model) for semi-automated annotations and supports multi-dimensional images (TIFF stacks, CZI files). **Repository**: https://github.com/cofade/digitalsreeni-image-annotator (fork of https://github.com/bnsreenu/digitalsreeni-image-annotator) diff --git a/docs/02_architecture_constraints.md b/docs/02_architecture_constraints.md index 668287f..5d9c304 100644 --- a/docs/02_architecture_constraints.md +++ b/docs/02_architecture_constraints.md @@ -5,7 +5,7 @@ | Constraint | Description | Rationale | |------------|-------------|-----------| | **Python 3.10+** | Minimum Python version | Required for modern type hints and library compatibility | -| **PyQt5** | GUI framework | Cross-platform, mature, rich widget set | +| **PyQt6 6.7+** | GUI framework | Cross-platform, mature, rich widget set; improved Linux/XCB integration over PyQt5 | | **Ultralytics** | SAM integration | Simplified SAM model loading, includes PyTorch | | **Desktop Application** | Not web-based | Direct file system access, better performance for large images | @@ -23,17 +23,18 @@ |----------|--------|-------| | **Windows** | ✅ Fully Supported | Primary development platform | | **macOS** | ✅ Fully Supported | Tested and working | -| **Linux** | ⚠️ Limited Support | XCB plugin issues, minimal testing | +| **Linux** | ✅ Supported | Qt6 native integration; runtime needs libxcb-cursor0 | -### Linux-Specific Issues -- PyQt5 XCB platform plugin conflicts -- Workaround: Remove `QT_QPA_PLATFORM_PLUGIN_PATH` environment variable on startup (see [main.py](../src/digitalsreeni_image_annotator/main.py:15-19)) +### Linux Runtime Requirements +- `libxcb-cursor0` (required by Qt 6, was optional under Qt 5) +- `libegl1`, `libgl1` for software rendering fallback +- `libxkbcommon-x11-0` and the standard XCB plugin set ## Conventions | Convention | Description | |------------|-------------| -| **Code Style** | Follow existing PyQt5 patterns | +| **Code Style** | Follow existing PyQt6 patterns (fully-qualified enum names) | | **UI Modes** | Support both light and dark mode | | **Image Paths** | Store absolute paths in project files | | **Annotations** | Polygon (segmentation) or bbox (rectangle) format | diff --git a/docs/05_building_block_view.md b/docs/05_building_block_view.md index 661c0fa..ab5359d 100644 --- a/docs/05_building_block_view.md +++ b/docs/05_building_block_view.md @@ -8,7 +8,7 @@ │ │ │ ┌──────────┐ ┌──────────┐ ┌──────────┐ │ │ │ GUI │ │ SAM 2 │ │ YOLO │ │ -│ │ (PyQt5) │ │(Ultraly.)│ │ Trainer │ │ +│ │ (PyQt6) │ │(Ultraly.)│ │ Trainer │ │ │ └──────────┘ └──────────┘ └──────────┘ │ │ │ │ ┌──────────────────────────────────────┐ │ @@ -80,24 +80,34 @@ sam_negative_points: list # SAM negative points ### SAMUtils (sam_utils.py) -**Responsibility**: SAM model loading and inference - -**Key Attributes**: -```python -sam_models: dict # Available SAM model variants -current_sam_model: str # Currently loaded model -sam_model: SAM # Ultralytics SAM instance -``` - -**Key Methods**: -- `change_sam_model(model_name)`: Load SAM model -- `apply_sam_points(image, positive_points, negative_points)`: Run inference -- `qimage_to_numpy(qimage)`: Convert QImage to numpy array -- `mask_to_polygon(mask)`: Convert SAM mask to polygon contours - -Inference does not run in-process. `SAMUtils._send_request()` spawns -`sam_worker.py` as a subprocess (PyQt-free) and exchanges JSON over -stdin/stdout. See [ADR-011](09_architecture_decisions.md#adr-011-run-torch-based-workers-in-isolated-subprocesses). +**Responsibility**: SAM model loading and inference (in-process). + +**Key state** (on the `SAMUtils` instance): +- `sam_models: dict` — available SAM model variants (class-level, exposed for the UI dropdown) +- `current_sam_model: str | None` — name of the currently loaded model; `None` if unloaded +- `_model: ultralytics.SAM | None` — the loaded model object (private) + +**Key public methods**: +- `change_sam_model(model_name)` — load a SAM model. Blocks the calling thread (with the UI's event loop pumping) until weights are downloaded and the model is in memory. Raises on load failure. +- `apply_sam_points(image, positive_points, negative_points)` — point-prompted segmentation. +- `apply_sam_prediction(image, bbox)` — single bbox-prompted segmentation. +- `apply_sam_predictions_batch(image, bboxes)` — multi-bbox segmentation in one model call (used by the DINO pipeline). +- `unload()` — drop the cached model and free GPU/CPU memory. Wired to the Tools → "Unload AI Models" menu entry. + +**Module-level helpers** (not class methods): +- `_qimage_to_numpy(qimage)` — convert a `QImage` to an owned numpy array (always copies; see ADR-013 on lifetime safety). +- `_mask_to_polygon(mask)` — convert a SAM mask tensor into polygon contour vertices. +- `_run_sync(fn, *args, **kwargs)` — run `fn` on a worker `QThread`, pump the calling thread's event loop until done, re-raise any exception. Serialises concurrent calls via the `_inference_in_flight` flag; re-entry raises `InferenceBusyError`. + +Inference runs in-process on a background `QThread`. `SAMUtils._run_sync()` +spawns the thread, pumps the caller's event loop until done, and returns +the result — keeping the API synchronous-looking from call sites while +the UI stays responsive. Model objects (Ultralytics `SAM`) live on the +`SAMUtils` singleton and persist across calls. See +[ADR-013](09_architecture_decisions.md#adr-013-in-process-inference-with-qthread-wrapping). +The earlier subprocess approach is documented as +[ADR-011](09_architecture_decisions.md#adr-011-run-torch-based-workers-in-isolated-subprocesses) +(Superseded). ### DINO Subsystem (Grounding DINO + SAM pipeline) @@ -107,30 +117,30 @@ segmentation masks. | Module | Responsibility | |--------|----------------| -| `dino_utils.py` | `DINOUtils` — parent-side façade. Resolves model paths via `models_base_dir()` and forwards detection requests to the worker. | -| `dino_worker.py` | Standalone subprocess that loads `transformers.GroundingDinoForObjectDetection` and runs inference. No PyQt imports. | +| `dino_utils.py` | `DINOUtils` — in-process Grounding DINO wrapper. Resolves model paths via `models_base_dir()`, loads `transformers.AutoModelForZeroShotObjectDetection` lazily on first use, caches it across calls, runs inference on a worker `QThread` (same `_run_sync` pattern as `SAMUtils`). | | `dino_phrase_editor.py` | Two widgets: `ClassThresholdTable` (per-class box/text/NMS thresholds) and `PhraseEditorPanel` (per-class phrase list). These widgets are the **single source of truth** for phrases and thresholds; project save/load reads/writes them via `get_all_phrases()` / `set_phrases()` and `get_thresholds_dict()` / `set_thresholds()`. | | `dino_merge_dialog.py` | Standalone dialog: merges accumulated DINO+SAM annotations across images into a training-ready COCO JSON. | -**Detection request shape** (parent → worker): +**Detection call signature** (in-process): ```python -{ - "image_path": "/abs/path/to/temp.png", - "class_configs": [ - {"name": "drone", "phrases": ["drone", "quadcopter"], - "box_thr": 0.10, "txt_thr": 0.25, "nms_thr": 0.50}, - ... - ], - "model_path": "/abs/path/to/models/grounding-dino-base" -} +DINOUtils().detect( + qimage, # PyQt6.QtGui.QImage + class_configs=[ + {"name": "drone", "phrases": ["drone", "quadcopter"], + "box_thr": 0.10, "txt_thr": 0.25, "nms_thr": 0.50}, + ... + ], + model_name="grounding-dino-base", # or custom_model_path=... +) ``` -**Detection response shape** (worker → parent): +**Detection return value**: ```python -{"results": [ - {"class_name": "drone", "bbox": [x1, y1, x2, y2], "score": 0.93, "label": "drone"}, - ... -]} +[ + {"class_name": "drone", "bbox": [x1, y1, x2, y2], "score": 0.93, "label": "drone"}, + ... +] +# or [] if no boxes survived filtering, or None on error ``` DINO's xyxy boxes feed directly into `SAMUtils.apply_sam_predictions_batch()`, diff --git a/docs/06_runtime_view.md b/docs/06_runtime_view.md index 32b6fa3..45c7125 100644 --- a/docs/06_runtime_view.md +++ b/docs/06_runtime_view.md @@ -20,7 +20,7 @@ │ ├─> Show Main Window │ - └─> Enter Event Loop (app.exec_()) + └─> Enter Event Loop (app.exec()) ``` ## Annotation Creation - Manual Polygon @@ -154,16 +154,17 @@ User clicks "Detect Current Image" │ ├─> DINOUtils.detect(qimage, class_configs, model_name) │ │ - │ ├─> Save QImage to a temp PNG - │ ├─> Spawn dino_worker.py subprocess (PyQt-free) - │ ├─> Send JSON request over stdin - │ ├─> Worker loads GroundingDinoForObjectDetection - │ ├─> Worker runs inference per phrase, applies per-class NMS + │ ├─> Convert QImage to numpy (on calling thread) + │ ├─> _run_sync: spawn QThread, pump caller's event loop while waiting + │ ├─> On the worker thread: + │ │ - Load (or reuse cached) GroundingDinoForObjectDetection + │ │ - Run inference per phrase, apply per-class NMS + │ │ - Apply cross-class NMS │ └─> Returns [{class_name, bbox: [x1,y1,x2,y2], score, label}, ...] │ ├─> Feed DINO bboxes into SAMUtils.apply_sam_predictions_batch() │ │ - │ ├─> Spawn sam_worker.py subprocess + │ ├─> Convert QImage to numpy, run Ultralytics SAM on worker thread │ └─> Returns one {segmentation: [...], score: ...} per bbox │ ├─> Build temp_annotations (segmentation + class + score + source="dino") diff --git a/docs/08_crosscutting_concepts.md b/docs/08_crosscutting_concepts.md index d61c6c2..42efbd5 100644 --- a/docs/08_crosscutting_concepts.md +++ b/docs/08_crosscutting_concepts.md @@ -26,6 +26,24 @@ Annotations are stored in image coordinates (unzoomed, absolute pixels): - **Polygon**: Flattened list `[x1, y1, x2, y2, ...]` - **Rectangle**: COCO format `[x, y, width, height]` +### Pan + Zoom Reference Frames + +Two non-obvious gotchas live in `ImageLabel.mouseMoveEvent` / +`wheelEvent`: + +- **Pan must use `event.globalPosition()`, not `event.position()`.** + Widget-local coords absorb half the cursor delta during a scrollbar + move (the widget shifts under the cursor mid-drag) → effective + half-speed pan. The global frame is stable. +- **Zoom-to-cursor must compute the post-zoom `offset_x/y` + analytically from the viewport, not read `self.offset_x` after the + zoom call.** `update_scaled_pixmap()` only *relaxes* the minimum + size on zoom-out; the widget hasn't shrunk by the time + `update_offset()` runs, so `self.width()` is stale and the offset + comes out wrong. Use `viewport().width()` + `scaled_pixmap.width()` + to derive the offset directly. Zoom-in worked by accident because + the widget grows immediately when `setMinimumSize` enlarges it. + ## Image Format Conversions ### QImage ↔ NumPy Array @@ -133,6 +151,24 @@ First use downloads models, subsequent uses load from cache: sam_model = SAM("sam2_t.pt") # Downloads if not present ``` +### Releasing Model GPU Memory + +`SAMUtils.unload()` and `DINOUtils.unload()` must do **three** things, +in order: + +1. Drop the cached Python references (`self._model = None`, etc.). +2. **`gc.collect()`** to break circular references inside Ultralytics + / Transformers model objects (config ↔ model, processor ↔ + tokenizer). Without this, the C++/CUDA backing memory stays pinned + until Python's cyclic GC runs on its own schedule, which can be + many seconds or never. Task Manager / `nvidia-smi` will show zero + drop in GPU memory. +3. **`torch.cuda.empty_cache()`** (plus `torch.cuda.ipc_collect()`) so + the PyTorch allocator returns the freed blocks to the OS / driver. + +Skipping step 2 was the cause of "Tools → Unload AI Models does +nothing visible" in v0.9.0 manual testing. + ### Model Size Recommendations | Model | Size | RAM Usage | Speed | Recommendation | @@ -161,6 +197,30 @@ else: - Text labels use high-contrast colors - Background grid adjusted for dark backgrounds +### Dark Mode — No Hardcoded Colors Rule + +**Do not hardcode `background`, `color`, or other palette-dependent +values in widget `setStyleSheet(...)` calls.** They override both the +default OS look *and* `soft_dark_stylesheet.py`, leaving bright +rectangles on the dark sidebar. Past offenders that bit us: + +- `ClassThresholdTable` header had `background: #e0e0e0;` → bright bar + across the top of the DINO panel in dark mode. +- `lbl_dino_status` had `background: #f5f5f5;` → bright box where the + "No DINO model loaded" status sat. + +Either leave the property out of the inline stylesheet so the global +sheet wins, or use Qt's palette role functions (`palette(base)`, +`palette(mid)`, `palette(text)`, …) which resolve at paint time +against the active palette. Inline hardcoded greys are an anti-pattern. + +When introducing a new widget type that doesn't have a rule in +`soft_dark_stylesheet.py` yet — add the rule there *first*, then build +the widget. Otherwise the widget uses the OS default in dark mode, +which on Windows means barely-visible radio-button indicators and +white-on-white headers (the dataset splitter radio buttons hit this +before they were styled). + ## Thread Safety for YOLO Training ### Training Thread @@ -279,3 +339,108 @@ print(f"Loading project from: {project_path}") ``` **Note**: No formal logging framework is used. Output goes to console. + +## DINO Temp Annotations — Single Field, Many Images + +`ImageLabel.temp_annotations` is a **single list on the image_label**, +not a per-image cache. It holds the pending DINO+SAM masks shown as +an overlay while the user decides accept/reject. The per-image batch +cache is `ImageAnnotator.dino_batch_results` (a dict keyed by image +name) — `image_label.temp_annotations` is only ever set to one image's +slice of that dict at a time. + +Consequences this codebase has tripped over: + +- **Image/slice switches must re-sync** `temp_annotations` from + `dino_batch_results` for the new image (load if pending, clear if + not). Otherwise masks from the previously-viewed image visually + bleed onto every slice the user navigates to. See + `_refresh_dino_temp_for_current()`. +- **Enter / Escape during review** must work even when the focus is on + slice_list / image_list / a button — `QListWidget` consumes + Enter for itemActivated before `ImageLabel.keyPressEvent` ever sees + it. Solved with an application-wide event filter + (`_DINOReviewEventFilter`) that fires only while + `temp_annotations` has DINO items and skips modal dialogs and text + inputs. Setting `image_label.setFocus()` synchronously inside + `_show_dino_batch_review` was not enough — Qt's focus handling + raced the click event that opened the review and the canvas + often didn't end up focused. `QTimer.singleShot(0, …)` defers until + the current event chain settles. +- **Auto-accept dropdown applies to both paths.** The batch-mode + combo ("Review before accepting" / "Auto-accept all detections") + controls **both** "Detect Current Image" and "Detect All Images". + Only checking it in `run_dino_detection_batch` and not + `run_dino_detection_single` produced a confusing "auto-accept + doesn't actually auto-accept for single image" bug. +- **Batch detection must enumerate slices, not just `all_images`.** + Multi-dim images live in `all_images` as a single entry with + `is_multi_slice=True`, and their actual slice QImages live under + `self.image_slices[base_name]`. The first cut of + `run_dino_detection_batch` iterated `all_images` and skipped the + multi-slice entries with a console log — leaving stack-based + projects unable to use "Detect All Images" at all. Batch jobs go + through `_collect_dino_batch_work_items()` which flattens regular + images + every loaded slice into a `(name, QImage)` list. +- **Review navigation must handle slice names.** Slice names like + `stack_T1_Z1_C1` are not in `image_list`. After collecting batch + results for slices, `_navigate_to_image_or_slice()` finds the + parent image via `os.path.splitext` matching and then activates + the specific row in `slice_list`. Without this, batch review on + slices either silently no-op'd or showed the first regular + image's masks on a slice. + +## Multi-dimensional TIFF Axis Defaults + +`load_tiff` extracts `tif.series[0].axes` (e.g. `"TZCYX"`) and maps +it through `{T:T, Z:Z, C:C, S:S, Y:H, X:W}` to populate the +`DimensionDialog` combo boxes. This is what lets a user open an +ImageJ-style 5D TIFF and just click OK. + +When the metadata is missing or unfamiliar, fall back to the +hand-crafted defaults keyed on `ndim`: + +| ndim | default labels | +|------|---------------| +| 3 | `Z H W` | +| 4 | `T Z H W` | +| 5 | `T Z C H W` | +| 6 | `T Z C S H W` | + +**Do not** use `default_dimensions[-ndim:]` of a shorter list to +"extend" defaults — that silently degrades for `ndim ≥ 5`: the final +combo gets no default and inherits the first item ("T"), which is +the wrong axis. The 5D TZCYX bug that produced 2560 one-row slices +on a `(2,5,2,256,256)` file came from exactly this. + +## Export Format Filename Matching + +`export_formats.py` historically looked up image paths via substring +match: + +```python +image_path = next( + (path for name, path in image_paths.items() if image_name in name), + None, +) +``` + +That is fragile — `"bee.jpg" in "honeybee.jpg"` returns True and you +write the wrong file. The COCO, YOLO v4, and YOLO v5+ exports all +share this code path. + +**Always try the exact key first; fall back to substring only if no +exact key matches.** Pattern: + +```python +image_path = image_paths.get(image_name) +if image_path is None: + image_path = next( + (path for name, path in image_paths.items() if image_name in name), + None, + ) +``` + +The substring fallback is kept for backward compatibility with old +projects that may have stored normalised image names (e.g. without +extension); new code should prefer the exact-key path. diff --git a/docs/09_architecture_decisions.md b/docs/09_architecture_decisions.md index 3ec508b..755f6dd 100644 --- a/docs/09_architecture_decisions.md +++ b/docs/09_architecture_decisions.md @@ -1,23 +1,12 @@ # Architecture Decisions -## ADR-001: Use PyQt5 Instead of PyQt6 +## ADR-001: GUI Framework Choice -**Status**: Accepted - -**Context**: Need a mature, cross-platform GUI framework with rich widgets - -**Decision**: Use PyQt5 5.15.11 +**Status**: Superseded by [ADR-014](#adr-014-migrate-from-pyqt5-to-pyqt6) -**Rationale**: -- Mature ecosystem with extensive documentation -- Better backwards compatibility -- Proven stability on Windows/macOS -- Original project used PyQt5 +**Original decision (historical)**: Use PyQt5 5.15.11. Chosen because the upstream project used PyQt5, PyQt5's ecosystem was more mature at the time, and migration carried risk. -**Consequences**: -- ✅ Stable, well-tested -- ✅ Large community support -- ⚠️ PyQt6 is newer but would require migration +**Superseding decision**: The project migrated to PyQt6 6.7+ in the same PR that introduced in-process AI inference. See [ADR-014](#adr-014-migrate-from-pyqt5-to-pyqt6) for the rationale (mainly: PyQt6 eliminated the WinError 1114 DLL load-order conflict that motivated ADR-011, unblocking the subprocess removal in ADR-013). --- @@ -218,7 +207,7 @@ ## ADR-011: Run Torch-based Workers in Isolated Subprocesses -**Status**: Accepted +**Status**: Superseded by [ADR-013](#adr-013-in-process-inference-with-qthread-wrapping) **Context**: Both SAM 2 (via Ultralytics) and Grounding DINO (via transformers) load PyTorch into the process. On Windows + Python 3.14, importing PyQt5 first and then loading PyTorch causes `WinError 1114` (DLL load order conflict between Qt and Torch native dependencies). The application is fundamentally PyQt5-based, so we cannot reorder these imports. @@ -237,8 +226,10 @@ - ⚠️ Need UTF-8 forced on both ends of the pipe (`PYTHONIOENCODING=utf-8` in env, `encoding="utf-8", errors="replace"` on parent) — Windows cp1252 default crashes on non-ASCII bytes in torch warnings - ⚠️ Two near-identical worker scripts to maintain (`sam_worker.py` mirrors the pattern from `dino_worker.py`) +**Superseded by**: Migrating to PyQt6 (ADR-013) eliminated the underlying DLL conflict. The subprocess hop, JSON marshalling, and `check_worker_isolation.py` tooling were removed in the same PR. + **Related**: -- Implementation: `sam_utils.py` / `sam_worker.py`, `dino_utils.py` / `dino_worker.py` +- Implementation (historical): `sam_utils.py` / `sam_worker.py`, `dino_utils.py` / `dino_worker.py` - Original SAM-only version landed in #65 (Python 3.14 support) - DINO subprocess pattern landed alongside the DINO feature @@ -265,6 +256,128 @@ --- +## ADR-013: In-process Inference with QThread Wrapping + +**Status**: Accepted + +**Context**: ADR-011 introduced a subprocess hop for every SAM and DINO inference call to work around a PyQt5 + Torch DLL load-order conflict on Windows + Python 3.14. The workaround cost a fresh `python sam_worker.py` / `dino_worker.py` spawn per inference (~1-2 s warm latency, model reloaded from disk on every call) plus a temp-PNG marshal of the image. + +Migrating the GUI from PyQt5 to PyQt6 (same PR) eliminates the DLL conflict — verified by `tools/check_pyqt6_torch_coexistence.py` importing PyQt6 → torch → transformers → ultralytics cleanly in one process on Windows+Py3.14 (the original failure case) and the Linux/macOS test matrix. + +**Decision**: Run SAM and DINO inference directly inside the main Python process. Keep the model objects on the `SAMUtils` / `DINOUtils` singletons so they persist across calls. Wrap each inference in a short-lived `QThread` to keep the UI thread responsive; the public API blocks the caller via a nested `QEventLoop` so call sites in `annotator_window.py` stay synchronous-looking. + +**Rationale**: +- The latency win is the whole point. Subprocess spawn + Python startup + model reload was ~1-2 s every call; in-process with a cached model is ~50-500 ms. +- Threading via a nested `QEventLoop` (the `_run_sync` helper in `sam_utils.py`) lets the calling thread keep pumping events — timers, repaints, progress dialog cancels still work — while inference runs on the QThread. Existing call sites need no refactor. +- Torch and transformers are imported lazily on first inference, so app startup stays fast for users who never touch SAM/DINO. +- `_qimage_to_numpy` already exists; converting the QImage on the calling thread (not on the worker) keeps Qt objects single-threaded as required. + +**Consequences**: +- ✅ Each inference is ~1-2 s faster on Windows; less dramatic on macOS/Linux but still smoother. +- ✅ Cached model survives between calls — opening a DINO model once costs once. The DINO model stays on its compute device (CPU or CUDA) for its full lifetime; the old worker shuffled CPU↔GPU per call, defeating the caching gain on PCIe. Call `DINOUtils.unload()` / `SAMUtils.unload()` to free GPU memory explicitly. +- ✅ UI stays responsive during batch DINO+SAM runs (the calling thread's `QEventLoop` still processes events). +- ✅ One source of truth per model — no more keeping `sam_utils.py` and `sam_worker.py` aligned. +- ✅ Exceptions from the inference worker (model load failures, CUDA errors) propagate out of `_run_sync` rather than being printed and silently turned into `None`. The `change_sam_model` error path in `annotator_window.py` actually catches now. +- ⚠️ A crash in torch (CUDA OOM, segfault) now takes the app down where the subprocess used to absorb it. Mitigation: inference is wrapped in `try/except` at the `_run_sync` boundary; the user sees an error dialog instead of a frozen UI. +- ⚠️ Model RAM stays resident until the user closes the app (or invokes the `unload()` method). +- ⚠️ Re-entrancy is a real hazard, addressed with belt-and-braces: + - `_run_sync` sets a module-level `_inference_in_flight` flag and raises `InferenceBusyError` if re-entered. Same-thread re-entry can happen because the calling thread pumps its event loop while waiting (a timer fire, a click on an un-disabled widget, etc.). A `QMutex` would not help — same-thread re-acquisition deadlocks on a non-recursive mutex and is meaningless on a recursive one. + - The known re-entry vector — the SAM debounce timer firing during an in-flight inference — is guarded at the call site: `apply_sam_prediction` in `annotator_window.py` carries its own `_sam_inference_in_flight` flag and skips. Batch DINO already disables its trigger buttons. + - The two-layer design is intentional: the call-site flag handles the common case quietly; the `_run_sync` flag is the safety net that surfaces unknown re-entry vectors as a real exception rather than corrupting the model with concurrent `.forward()` calls (torch / ultralytics / transformers model objects are not thread-safe). + +**Related**: +- Implementation: `sam_utils.py`, `dino_utils.py` (both refactored in the same PR that retires ADR-011). +- Smoke test: `tools/check_pyqt6_torch_coexistence.py` (gate that gated this whole change). +- Supersedes: [ADR-011](#adr-011-run-torch-based-workers-in-isolated-subprocesses). + +--- + +## ADR-014: Migrate from PyQt5 to PyQt6 + +**Status**: Accepted + +**Context**: The project shipped on PyQt5 5.15+ (ADR-001) from inception. Two pressures combined to motivate a migration: +1. The PyQt5 + Torch DLL load-order conflict on Windows + Python 3.14 (ADR-011) forced an entire subprocess isolation layer (`sam_worker.py`, `dino_worker.py`, `check_worker_isolation.py`) that added ~1-2 s latency per inference. The conflict only manifests on PyQt5 — Qt6's packaging reshuffle eliminates it. +2. PyQt5 is in maintenance mode. PyQt6 is the actively developed line, gets new Qt6.x features, and has better Linux native integration (XCB plugin paths in particular). + +**Decision**: Migrate the GUI binding from PyQt5 (`>=5.15.0`) to PyQt6 (`>=6.7.0`). Land in a single PR alongside the subprocess-removal work (ADR-013), gated behind `tools/check_pyqt6_torch_coexistence.py` to confirm the DLL conflict is actually gone on Windows + Python 3.14. + +**Rationale**: +- Two coupled changes share most of their cost (touching every file that imports PyQt5) so doing them in one PR avoids paying the migration tax twice. +- Most PyQt5→PyQt6 differences are enum namespacing (`Qt.AlignCenter` → `Qt.AlignmentFlag.AlignCenter`) and module relocations (`QAction` moves from `QtWidgets` to `QtGui`) — mechanical, codemod-able. The behavioural risk is in event APIs (`event.pos()` → `event.position()`, returning `QPointF` not `QPoint`) and a handful of removed widgets (`QDesktopWidget` → `QGuiApplication.primaryScreen()`). +- The existing test suite (65 pytest-qt tests, mostly exercising coordinate transforms) serves as the regression safety net. + +**Consequences**: +- ✅ Subprocess workers retired; inference is in-process with cached models (see [ADR-013](#adr-013-in-process-inference-with-qthread-wrapping)). +- ✅ Cleaner Linux story — `libxcb-cursor0` is required by Qt 6 (was optional under Qt 5), but the platform plugin path mess is gone. +- ✅ Long support runway: PyQt6 is the maintained binding. +- ⚠️ One-time migration cost: ~30 files touched, enum namespacing across `annotator_window.py` (300+ references), `event.pos()` → `event.position()` rewrite in `image_label.py`. +- ⚠️ PyQt6 is GPLv3 / commercial like PyQt5. Switching to PySide6 (LGPL) was considered and rejected to stay close to the existing `pyqtSignal`/`pyqtSlot` API. +- ✅ All `.exec_()` call sites in `src/` migrated to `.exec()` in the v0.9.0 fix-pack — the PyQt5 alias is gone from this codebase. + +**Verification**: +- `tools/check_pyqt6_torch_coexistence.py` imports PyQt6 → torch → torchvision → transformers → ultralytics in that order. Run before merging on the Windows + Python 3.14 target. +- 65 tests pass on the new binding under `QT_QPA_PLATFORM=offscreen`. +- Full app constructs and renders headlessly; snake-game easter egg validates the `QDesktopWidget` → `QGuiApplication.primaryScreen()` replacement. + +**Related**: +- Supersedes: [ADR-001](#adr-001-gui-framework-choice). +- Unblocks: [ADR-013](#adr-013-in-process-inference-with-qthread-wrapping). + +--- + +## ADR-015: Application-wide Event Filter for DINO Review Shortcuts + +**Status**: Accepted (v0.9.0) + +**Context**: During DINO batch / single-image review, the user has +to accept (Enter) or reject (Escape) pending masks. The keyboard +handling was originally in `ImageLabel.keyPressEvent`, which only +fires when the canvas has focus. In practice the user clicks slice +entries, image entries, or buttons during review — focus moves to +those widgets and Enter is consumed locally (e.g. `QListWidget` +emits `itemActivated`), never reaching the canvas. The result: Enter +and Escape silently failed during the most common review workflow. + +Three options were considered: + +1. **Force focus back to the canvas on every UI interaction** — + intrusive, breaks normal navigation (Tab/Arrow keys on lists), and + fragile because Qt's focus chain is not always predictable. +2. **Global `QShortcut` with ApplicationShortcut context** — fires + regardless of focus but unconditionally hijacks Enter / Escape, + breaking modal dialogs (Enter activates default button) and inline + editing in `QLineEdit` / `QInputDialog`. +3. **Application-wide `QObject` event filter** that intercepts only + when DINO temp_annotations are pending, and only when the focused + widget is not a text input and no modal dialog is active. + +**Decision**: Option 3. Implement `_DINOReviewEventFilter`, install it +on `QApplication.instance()` once at startup, and gate the +interception on three conditions: pending DINO temp_annotations, +no active modal widget, focus not on `QLineEdit`/`QTextEdit`. + +**Consequences**: +- ✅ Enter/Escape works regardless of which widget holds focus during + DINO review. +- ✅ Modal dialogs and text-input fields are unaffected. +- ✅ Pattern is reusable for any future "review pending state" feature. +- ⚠️ Adds a per-key-press function call cost to the entire app. The + filter short-circuits in three cheap checks before any work, so the + overhead is negligible (≤ a few μs per keystroke). +- ⚠️ Single global filter means future review-state features must + share it or layer additional filters; if more review modes appear, + collapse them into a strategy registry rather than installing + multiple top-level filters. + +**Related**: +- Implementation: `annotator_window.py` (`_DINOReviewEventFilter` + class, `installEventFilter` call in `__init__`). +- Cross-cuts: documented in + [Cross-cutting Concepts → DINO Temp Annotations](08_crosscutting_concepts.md#dino-temp-annotations--single-field-many-images). + +--- + ## Decisions Under Consideration ### Consider pytest-qt for Utility Testing diff --git a/docs/12_glossary.md b/docs/12_glossary.md index 1d04fbb..3cc8309 100644 --- a/docs/12_glossary.md +++ b/docs/12_glossary.md @@ -56,8 +56,8 @@ A 2D image extracted from a multi-dimensional image stack. Named with format `{f ### Stack A multi-dimensional image, typically a TIFF or CZI file with multiple 2D slices in Z-dimension (depth). -### Subprocess Worker -A standalone Python script (`sam_worker.py`, `dino_worker.py`) that runs ML model inference in its own process, communicating with the GUI parent via JSON over stdin/stdout. The split is required to avoid a Windows + Python 3.14 DLL load-order conflict between PyQt5 and PyTorch — see [ADR-011](09_architecture_decisions.md#adr-011-run-torch-based-workers-in-isolated-subprocesses). +### Subprocess Worker (historical) +A standalone Python script (`sam_worker.py`, `dino_worker.py`) that ran ML model inference in its own process to dodge a PyQt5 + Torch DLL load-order conflict on Windows + Python 3.14. Removed once the codebase migrated to PyQt6 (the conflict no longer manifests). See [ADR-011](09_architecture_decisions.md#adr-011-run-torch-based-workers-in-isolated-subprocesses) (Superseded) and [ADR-013](09_architecture_decisions.md#adr-013-in-process-inference-with-qthread-wrapping). ### TIFF Stack Multi-page TIFF file containing multiple 2D images, often used for Z-stacks in microscopy. diff --git a/requirements.txt b/requirements.txt index 6102f60..9bae7be 100644 --- a/requirements.txt +++ b/requirements.txt @@ -1,5 +1,5 @@ # Core dependencies -PyQt5>=5.15.0 +PyQt6>=6.7.0 Pillow>=10.0.0 numpy>=2.0.0 # pip resolves 2.4+ on Py3.14, 2.2.x on Py3.10 (last 3.10-compatible) opencv-python>=4.8.0 diff --git a/setup.py b/setup.py index 13b6e65..04c20f4 100644 --- a/setup.py +++ b/setup.py @@ -10,7 +10,7 @@ setup( name="digitalsreeni-image-annotator", - version="0.8.12", # Updated version number + version="0.9.0", # PyQt6 + in-process inference author="Dr. Sreenivas Bhattiprolu", author_email="digitalsreeni@gmail.com", description="A tool for annotating images using manual and automated tools, supporting multi-dimensional images and SAM2-assisted annotations", @@ -29,7 +29,7 @@ ], python_requires=">=3.10", install_requires=[ - "PyQt5>=5.15.0", + "PyQt6>=6.7.0", "numpy>=2.0.0", # pip resolves 2.4+ on Py3.14, 2.2.x on Py3.10 (last 3.10-compatible) "Pillow>=10.0.0", "tifffile>=2023.0.0", diff --git a/src/digitalsreeni_image_annotator/__init__.py b/src/digitalsreeni_image_annotator/__init__.py index 873a420..452fbaf 100644 --- a/src/digitalsreeni_image_annotator/__init__.py +++ b/src/digitalsreeni_image_annotator/__init__.py @@ -7,7 +7,7 @@ @DigitalSreeni Dr. Sreenivas Bhattiprolu """ -__version__ = "0.8.12" +__version__ = "0.9.0" __author__ = "Dr. Sreenivas Bhattiprolu" from .annotator_window import ImageAnnotator diff --git a/src/digitalsreeni_image_annotator/annotation_statistics.py b/src/digitalsreeni_image_annotator/annotation_statistics.py index 6b6da44..f0cafe8 100644 --- a/src/digitalsreeni_image_annotator/annotation_statistics.py +++ b/src/digitalsreeni_image_annotator/annotation_statistics.py @@ -1,7 +1,7 @@ import plotly.graph_objects as go from plotly.subplots import make_subplots -from PyQt5.QtWidgets import QDialog, QVBoxLayout, QTextBrowser, QPushButton, QHBoxLayout -from PyQt5.QtCore import Qt +from PyQt6.QtWidgets import QDialog, QVBoxLayout, QTextBrowser, QPushButton, QHBoxLayout +from PyQt6.QtCore import Qt import tempfile import os import webbrowser @@ -11,7 +11,7 @@ def __init__(self, parent=None): super().__init__(parent) self.setWindowTitle("Annotation Statistics") self.setGeometry(100, 100, 600, 400) - self.setWindowFlags(self.windowFlags() | Qt.Window) + self.setWindowFlags(self.windowFlags() | Qt.WindowType.Window) self.initUI() def initUI(self): diff --git a/src/digitalsreeni_image_annotator/annotation_utils.py b/src/digitalsreeni_image_annotator/annotation_utils.py index 3ff8b80..6013e89 100644 --- a/src/digitalsreeni_image_annotator/annotation_utils.py +++ b/src/digitalsreeni_image_annotator/annotation_utils.py @@ -1,6 +1,6 @@ -from PyQt5.QtWidgets import QListWidgetItem -from PyQt5.QtGui import QColor -from PyQt5.QtCore import Qt +from PyQt6.QtWidgets import QListWidgetItem +from PyQt6.QtGui import QColor +from PyQt6.QtCore import Qt class AnnotationUtils: @staticmethod @@ -9,11 +9,11 @@ def update_annotation_list(self, image_name=None): current_name = image_name or self.current_slice or self.image_file_name annotations = self.all_annotations.get(current_name, {}) for class_name, class_annotations in annotations.items(): - color = self.image_label.class_colors.get(class_name, QColor(Qt.white)) + color = self.image_label.class_colors.get(class_name, QColor(Qt.GlobalColor.white)) for i, annotation in enumerate(class_annotations, start=1): item_text = f"{class_name} - {i}" item = QListWidgetItem(item_text) - item.setData(Qt.UserRole, annotation) + item.setData(Qt.ItemDataRole.UserRole, annotation) item.setForeground(color) self.annotation_list.addItem(item) @@ -23,17 +23,17 @@ def update_slice_list_colors(self): item = self.slice_list.item(i) slice_name = item.text() if slice_name in self.all_annotations and any(self.all_annotations[slice_name].values()): - item.setForeground(QColor(Qt.green)) + item.setForeground(QColor(Qt.GlobalColor.green)) else: - item.setForeground(QColor(Qt.black) if not self.dark_mode else QColor(Qt.white)) + item.setForeground(QColor(Qt.GlobalColor.black) if not self.dark_mode else QColor(Qt.GlobalColor.white)) @staticmethod def update_annotation_list_colors(self, class_name=None, color=None): for i in range(self.annotation_list.count()): item = self.annotation_list.item(i) - annotation = item.data(Qt.UserRole) + annotation = item.data(Qt.ItemDataRole.UserRole) if class_name is None or annotation['category_name'] == class_name: - item_color = color if class_name else self.image_label.class_colors.get(annotation['category_name'], QColor(Qt.white)) + item_color = color if class_name else self.image_label.class_colors.get(annotation['category_name'], QColor(Qt.GlobalColor.white)) item.setForeground(item_color) @staticmethod @@ -57,10 +57,10 @@ def save_current_annotations(self): @staticmethod def add_annotation_to_list(self, annotation): class_name = annotation['category_name'] - color = self.image_label.class_colors.get(class_name, QColor(Qt.white)) + color = self.image_label.class_colors.get(class_name, QColor(Qt.GlobalColor.white)) annotations = self.image_label.annotations.get(class_name, []) item_text = f"{class_name} - {len(annotations)}" item = QListWidgetItem(item_text) - item.setData(Qt.UserRole, annotation) + item.setData(Qt.ItemDataRole.UserRole, annotation) item.setForeground(color) self.annotation_list.addItem(item) \ No newline at end of file diff --git a/src/digitalsreeni_image_annotator/annotator_window.py b/src/digitalsreeni_image_annotator/annotator_window.py index 5ed0d14..ffc94ee 100644 --- a/src/digitalsreeni_image_annotator/annotator_window.py +++ b/src/digitalsreeni_image_annotator/annotator_window.py @@ -2,6 +2,7 @@ import json import os import shutil +import traceback import warnings from datetime import datetime @@ -9,11 +10,20 @@ import numpy as np import shapely from czifile import CziFile -from PyQt5.QtCore import Qt, QThread, QTimer, pyqtSignal -from PyQt5.QtGui import QColor, QFont, QIcon, QImage, QKeySequence, QPalette, QPixmap -from PyQt5.QtWidgets import ( - QAbstractItemView, +from PyQt6.QtCore import QEvent, QObject, Qt, QThread, QTimer, pyqtSignal +from PyQt6.QtGui import ( QAction, + QColor, + QFont, + QIcon, + QImage, + QKeySequence, + QPalette, + QPixmap, + QShortcut, +) +from PyQt6.QtWidgets import ( + QAbstractItemView, QApplication, QButtonGroup, QCheckBox, @@ -74,7 +84,7 @@ import_yolo_v5plus, process_import_format, ) -from .sam_utils import SAMUtils +from .sam_utils import InferenceBusyError, SAMUtils from .slice_registration import SliceRegistrationTool from .snake_game import SnakeGame from .soft_dark_stylesheet import soft_dark_stylesheet @@ -143,6 +153,44 @@ def get_dimensions(self): return [combo.currentText() for combo in self.combos] +class _DINOReviewEventFilter(QObject): + """Application-wide event filter that lets Enter / Escape accept or + reject pending DINO temp_annotations regardless of which widget has + focus. Without this, clicking a slice/image entry in a list moves + focus there and Enter is consumed by the list's itemActivated + handler before it can reach ImageLabel.keyPressEvent. + + Suppressed when a modal dialog is active or focus is on a text-input + widget so we don't break dialog default-button behaviour or + in-cell editing. + """ + + def __init__(self, main_window: "ImageAnnotator"): + super().__init__(main_window) + self.main_window = main_window + + def eventFilter(self, obj, event): + if event.type() != QEvent.Type.KeyPress: + return False + key = event.key() + if key not in (Qt.Key.Key_Return, Qt.Key.Key_Enter, Qt.Key.Key_Escape): + return False + app = QApplication.instance() + if app is None or app.activeModalWidget() is not None: + return False + focused = app.focusWidget() + if isinstance(focused, (QLineEdit, QTextEdit)): + return False + temp = self.main_window.image_label.temp_annotations + if not temp or not any(a.get("source") == "dino" for a in temp): + return False + if key in (Qt.Key.Key_Return, Qt.Key.Key_Enter): + self.main_window.accept_dino_results() + else: + self.main_window.reject_dino_results() + return True + + class ImageAnnotator(QMainWindow): def __init__(self): super().__init__() @@ -206,6 +254,11 @@ def __init__(self): self.sam_inference_timer.setSingleShot(True) self.sam_inference_timer.timeout.connect(self.apply_sam_prediction) + # Guards against re-entrant `apply_sam_prediction` calls — the + # debounce timer can fire while an earlier inference is still + # pumping inside _run_sync. See apply_sam_prediction(). + self._sam_inference_in_flight = False + # Create sam_magic_wand_button self.sam_magic_wand_button = QPushButton("Magic Wand") self.sam_magic_wand_button.setCheckable(True) @@ -225,12 +278,19 @@ def __init__(self): } # Also, add the options in create_menu_bar method self.current_font_size = "Medium" - # Dark mode control - self.dark_mode = False + # Dark mode control. Default on — matches the look most users + # expect from a 2025-era desktop annotation tool; toggle with + # Settings → Toggle Dark Mode (Ctrl+D). + self.dark_mode = True # Default annotations sorting self.current_sort_method = "class" # Default sorting method + # DINO batch review state. Initialised eagerly here so the + # consumers don't each carry a `hasattr` check (one forgotten + # check would crash with AttributeError). + self.dino_batch_results: dict[str, list] = {} + # Setup UI components self.setup_ui() @@ -245,6 +305,23 @@ def __init__(self): # YOLO Trainer self.yolo_trainer = None self.setup_yolo_menu() + + # F2 → Snake game (Easter egg). Registered as a global QShortcut + # so it fires regardless of which widget has focus — putting it + # in keyPressEvent didn't work because QTableWidget (DINO + # threshold table) and other focusable children consume F2 + # before it bubbles up to the main window. + self._snake_shortcut = QShortcut(QKeySequence("F2"), self) + self._snake_shortcut.setContext(Qt.ShortcutContext.ApplicationShortcut) + self._snake_shortcut.activated.connect(self.launch_snake_game) + + # Enter/Escape for DINO temp_annotations need to work even when + # focus is on slice_list / image_list / a button — none of which + # forward the key to ImageLabel.keyPressEvent. Application-wide + # event filter intercepts these keys but only when DINO results + # are pending review, and skips modal dialogs + text inputs. + self._dino_review_filter = _DINOReviewEventFilter(self) + QApplication.instance().installEventFilter(self._dino_review_filter) # Start in maximized mode self.showMaximized() @@ -404,6 +481,12 @@ def open_specific_project(self, project_file): # Now save once after everything is loaded self.is_loading_project = False # Clear loading flag + # Reveal the phrase editor if any classes exist — the + # per-class selectRow inside add_class was skipped during + # load (see add_class). Selecting row 0 is enough; the + # user can switch rows freely afterwards. + if self.dino_class_table.rowCount() > 0: + self.dino_class_table.selectRow(0) self.save_project(show_message=False) # Save once after loading self.initialize_yolo_trainer() @@ -527,11 +610,11 @@ def handle_missing_images(self, missing_images): self, "Missing Images", message, - QMessageBox.Yes | QMessageBox.No, - QMessageBox.Yes, + QMessageBox.StandardButton.Yes | QMessageBox.StandardButton.No, + QMessageBox.StandardButton.Yes, ) - if reply == QMessageBox.Yes: + if reply == QMessageBox.StandardButton.Yes: self.load_missing_images(missing_images) else: self.remove_missing_images(missing_images) @@ -574,11 +657,11 @@ def prompt_load_missing_images(self, missing_images): self, "Load Missing Images", message, - QMessageBox.Yes | QMessageBox.No, - QMessageBox.Yes, + QMessageBox.StandardButton.Yes | QMessageBox.StandardButton.No, + QMessageBox.StandardButton.Yes, ) - if reply == QMessageBox.Yes: + if reply == QMessageBox.StandardButton.Yes: self.load_missing_images(missing_images) def load_missing_images(self, missing_images): @@ -649,13 +732,13 @@ def close_project(self): self, "Close Project", "Do you want to save the current project before closing?", - QMessageBox.Yes | QMessageBox.No | QMessageBox.Cancel, + QMessageBox.StandardButton.Yes | QMessageBox.StandardButton.No | QMessageBox.StandardButton.Cancel, ) - if reply == QMessageBox.Yes: + if reply == QMessageBox.StandardButton.Yes: self.remove_all_temp_annotations() # Remove temp annotations before saving self.save_project(show_message=False) # Save without showing a message - elif reply == QMessageBox.Cancel: + elif reply == QMessageBox.StandardButton.Cancel: return # User cancelled the operation # Clear all data @@ -683,10 +766,10 @@ def delete_selected_class(self): self, "Delete Class", f"Are you sure you want to delete the class '{class_name}'?", - QMessageBox.Yes | QMessageBox.No, - QMessageBox.No, + QMessageBox.StandardButton.Yes | QMessageBox.StandardButton.No, + QMessageBox.StandardButton.No, ) - if reply == QMessageBox.Yes: + if reply == QMessageBox.StandardButton.Yes: self.delete_class( class_name ) # Sreeni note: Implement this method to handle class deletion @@ -745,11 +828,11 @@ def save_project(self, show_message=True): f"The project structure requires all images to be in an 'images' subdirectory. " f"{len(images_to_copy)} images need to be copied to the correct location. " f"Do you want to copy these images?", - QMessageBox.Yes | QMessageBox.No, - QMessageBox.Yes, + QMessageBox.StandardButton.Yes | QMessageBox.StandardButton.No, + QMessageBox.StandardButton.Yes, ) - if reply == QMessageBox.Yes: + if reply == QMessageBox.StandardButton.Yes: for file_name, src_path, dst_path in images_to_copy: try: shutil.copy2(src_path, dst_path) @@ -888,10 +971,10 @@ def auto_save(self): self, "No Project", "You need to save the project before auto-saving. Would you like to save now?", - QMessageBox.Yes | QMessageBox.No, - QMessageBox.Yes, + QMessageBox.StandardButton.Yes | QMessageBox.StandardButton.No, + QMessageBox.StandardButton.Yes, ) - if reply == QMessageBox.Yes: + if reply == QMessageBox.StandardButton.Yes: self.save_project() else: return @@ -916,7 +999,7 @@ def show_project_details(self): dialog = ProjectDetailsDialog(self, stats_dialog) - if dialog.exec_() == QDialog.Accepted: + if dialog.exec() == QDialog.DialogCode.Accepted: if dialog.were_changes_made(): self.project_notes = dialog.get_notes() self.save_project(show_message=False) @@ -981,7 +1064,7 @@ def activate_sam_magic_wand(self): # Set the current tool self.image_label.current_tool = "sam_magic_wand" self.image_label.sam_magic_wand_active = True - self.image_label.setCursor(Qt.CrossCursor) + self.image_label.setCursor(Qt.CursorShape.CrossCursor) # Update UI based on the current tool self.update_ui_for_current_tool() @@ -1004,7 +1087,7 @@ def deactivate_sam_magic_wand(self): self.image_label.sam_magic_wand_active = False self.sam_magic_wand_button.setChecked(False) self.sam_magic_wand_button.setEnabled(False) # Disable the button - self.image_label.setCursor(Qt.ArrowCursor) + self.image_label.setCursor(Qt.CursorShape.ArrowCursor) # Clear any SAM-related temporary data self.image_label.sam_bbox = None @@ -1041,10 +1124,10 @@ def toggle_sam_magic_wand(self): ) self.sam_magic_wand_button.setChecked(False) return - self.image_label.setCursor(Qt.CrossCursor) + self.image_label.setCursor(Qt.CursorShape.CrossCursor) self.image_label.sam_magic_wand_active = True else: - self.image_label.setCursor(Qt.ArrowCursor) + self.image_label.setCursor(Qt.CursorShape.ArrowCursor) self.image_label.sam_magic_wand_active = False self.image_label.sam_bbox = None @@ -1056,56 +1139,84 @@ def schedule_sam_prediction(self): self.sam_inference_timer.start(1000) def apply_sam_prediction(self): - if self.image_label.current_tool == "sam_box": - if self.image_label.sam_bbox is None: - print("SAM bbox is None") + # Re-entry guard: if a previous SAM call is still in flight, the + # event-loop pump inside _run_sync can deliver this timer fire + # before the first call returns. Bail and rely on the user + # clicking again (which restarts the debounce) to issue a fresh + # inference with the up-to-date point set. + if self._sam_inference_in_flight: + return + self._sam_inference_in_flight = True + try: + try: + if self.image_label.current_tool == "sam_box": + if self.image_label.sam_bbox is None: + print("SAM bbox is None") + return + x1, y1, x2, y2 = self.image_label.sam_bbox + bbox = [min(x1, x2), min(y1, y2), max(x1, x2), max(y1, y2)] + prediction = self.sam_utils.apply_sam_prediction(self.current_image, bbox) + self.image_label.sam_bbox = None + elif self.image_label.current_tool == "sam_points": + # Always use all points! + pos_points = self.image_label.sam_positive_points + neg_points = self.image_label.sam_negative_points + print( + f"[SAM-POINTS] Predicting with {len(pos_points)} positive points: {pos_points} " + f"and {len(neg_points)} negative points: {neg_points}" + ) + if not pos_points: + print("No positive points for SAM-points") + return + prediction = self.sam_utils.apply_sam_points( + self.current_image, + pos_points, + neg_points, + ) + else: + return + except InferenceBusyError: + # Re-entry safety net from sam_utils. The call-site flag + # above should catch this first, but if a different + # caller drives inference concurrently we just skip — + # the user keeps interacting; their next click will + # restart the debounce. return - x1, y1, x2, y2 = self.image_label.sam_bbox - bbox = [min(x1, x2), min(y1, y2), max(x1, x2), max(y1, y2)] - prediction = self.sam_utils.apply_sam_prediction(self.current_image, bbox) - self.image_label.sam_bbox = None - elif self.image_label.current_tool == "sam_points": - # Always use all points! - pos_points = self.image_label.sam_positive_points - neg_points = self.image_label.sam_negative_points - print( - f"[SAM-POINTS] Predicting with {len(pos_points)} positive points: {pos_points} " - f"and {len(neg_points)} negative points: {neg_points}" - ) - if not pos_points: - print("No positive points for SAM-points") + except Exception as exc: + traceback.print_exc() + QMessageBox.critical( + self, + "SAM Error", + f"SAM inference failed:\n\n{exc}\n\n" + "See the log for details.", + ) return - prediction = self.sam_utils.apply_sam_points( - self.current_image, - pos_points, - neg_points, - ) - else: - return - if prediction: - temp_annotation = { - "segmentation": prediction["segmentation"], - "category_id": self.class_mapping[self.current_class], - "category_name": self.current_class, - "score": prediction["score"], - } - self.image_label.temp_sam_prediction = temp_annotation - self.image_label.update() - elif prediction is None: - QMessageBox.information( - self, - "SAM", - "No mask matches the given constraints. " - "Try adjusting the box or point positions." - ) - else: - print("Failed to generate prediction") + if prediction: + temp_annotation = { + "segmentation": prediction["segmentation"], + "category_id": self.class_mapping[self.current_class], + "category_name": self.current_class, + "score": prediction["score"], + } + self.image_label.temp_sam_prediction = temp_annotation + self.image_label.update() + elif prediction is None: + QMessageBox.information( + self, + "SAM", + "No mask matches the given constraints. " + "Try adjusting the box or point positions." + ) + else: + print("Failed to generate prediction") - # Only clear box/points for box mode, not for points mode! - if self.image_label.current_tool == "sam_box": - self.image_label.sam_bbox = None - self.image_label.update() + # Only clear box/points for box mode, not for points mode! + if self.image_label.current_tool == "sam_box": + self.image_label.sam_bbox = None + self.image_label.update() + finally: + self._sam_inference_in_flight = False def accept_sam_prediction(self): if self.image_label.temp_sam_prediction: @@ -1129,14 +1240,6 @@ def setup_slice_list(self): self.image_list_layout.addWidget(QLabel("Slices:")) self.image_list_layout.addWidget(self.slice_list) - def qimage_to_numpy(self, qimage): - width = qimage.width() - height = qimage.height() - ptr = qimage.bits() - ptr.setsize(height * width * 4) - arr = np.frombuffer(ptr, np.uint8).reshape((height, width, 4)) - return arr[:, :, :3] # Slice off the alpha channel - def open_images(self): file_names, _ = QFileDialog.getOpenFileNames( self, @@ -1246,14 +1349,14 @@ def closeEvent(self, event): self, "Unsaved Changes", "You have unsaved changes. Do you want to save them before closing?", - QMessageBox.Yes | QMessageBox.No | QMessageBox.Cancel, + QMessageBox.StandardButton.Yes | QMessageBox.StandardButton.No | QMessageBox.StandardButton.Cancel, ) - if reply == QMessageBox.Yes: + if reply == QMessageBox.StandardButton.Yes: if self.image_label.temp_paint_mask is not None: self.image_label.commit_paint_annotation() if self.image_label.temp_eraser_mask is not None: self.image_label.commit_eraser_changes() - elif reply == QMessageBox.Cancel: + elif reply == QMessageBox.StandardButton.Cancel: event.ignore() return @@ -1275,14 +1378,14 @@ def switch_slice(self, item): self, "Unsaved Changes", "You have unsaved changes. Do you want to save them?", - QMessageBox.Yes | QMessageBox.No | QMessageBox.Cancel, + QMessageBox.StandardButton.Yes | QMessageBox.StandardButton.No | QMessageBox.StandardButton.Cancel, ) - if reply == QMessageBox.Yes: + if reply == QMessageBox.StandardButton.Yes: if self.image_label.temp_paint_mask is not None: self.image_label.commit_paint_annotation() if self.image_label.temp_eraser_mask is not None: self.image_label.commit_eraser_changes() - elif reply == QMessageBox.Cancel: + elif reply == QMessageBox.StandardButton.Cancel: return else: self.image_label.discard_paint_annotation() @@ -1313,6 +1416,10 @@ def switch_slice(self, item): # Reset zoom level to default (1.0) self.set_zoom(1.0) + # Sync DINO temp_annotations to the new slice (carry over masks + # from the previous slice was a reported bug). + self._refresh_dino_temp_for_current() + def switch_image(self, item): if item is None: return @@ -1394,6 +1501,10 @@ def switch_image(self, item): self.update_image_info() self.clear_slice_list() + # Sync DINO temp_annotations to the new image (mask carry-over + # bug from single-image review and batch review). + self._refresh_dino_temp_for_current() + def adjust_zoom_to_fit(self): if not self.current_image: return @@ -1411,7 +1522,7 @@ def adjust_zoom_to_fit(self): def activate_current_slice(self): if self.current_slice: # Ensure the current slice is selected in the slice list - items = self.slice_list.findItems(self.current_slice, Qt.MatchExactly) + items = self.slice_list.findItems(self.current_slice, Qt.MatchFlag.MatchExactly) if items: self.slice_list.setCurrentItem(items[0]) @@ -1437,6 +1548,7 @@ def load_tiff( self, image_path, dimensions=None, shape=None, force_dimension_dialog=False ): print(f"Loading TIFF file: {image_path}") + axes_hint = None with TiffFile(image_path) as tif: print(f"TIFF tags: {tif.pages[0].tags}") @@ -1447,6 +1559,28 @@ def load_tiff( except KeyError: print("No ImageDescription metadata found") + # Try to read axis labels from the tifffile series. ImageJ / + # OME-TIFF stores axes like "TZCYX" — we can prefill the + # dimension dialog with the right labels so the user just + # clicks OK instead of guessing per axis. Map tifffile's + # axes vocabulary (T,Z,C,S,Y,X) to the app's (T,Z,C,S,H,W). + try: + series_axes = tif.series[0].axes if tif.series else None + if series_axes: + axis_map = { + "T": "T", "Z": "Z", "C": "C", "S": "S", + "Y": "H", "X": "W", + } + mapped = [axis_map.get(a) for a in series_axes] + if all(a is not None for a in mapped): + axes_hint = mapped + print(f"TIFF series axes: {series_axes} → dimension hint: {axes_hint}") + else: + unknown = [a for a in series_axes if axis_map.get(a) is None] + print(f"TIFF series axes had unknown labels {unknown}, no hint applied") + except Exception as e: + print(f"Could not read TIFF series axes: {e}") + # Check if it's a multi-page TIFF if len(tif.pages) > 1: print(f"Multi-page TIFF detected. Number of pages: {len(tif.pages)}") @@ -1471,7 +1605,8 @@ def load_tiff( dimensions = None self.process_multidimensional_image( - image_array, image_path, dimensions, force_dimension_dialog + image_array, image_path, dimensions, force_dimension_dialog, + axes_hint=axes_hint, ) def load_czi( @@ -1505,7 +1640,8 @@ def load_regular_image(self, image_path): self.current_slice = None def process_multidimensional_image( - self, image_array, image_path, dimensions=None, force_dimension_dialog=False + self, image_array, image_path, dimensions=None, + force_dimension_dialog=False, axes_hint=None, ): file_name = os.path.basename(image_path) base_name = os.path.splitext(file_name)[0] @@ -1515,16 +1651,43 @@ def process_multidimensional_image( if dimensions is None or force_dimension_dialog: if image_array.ndim > 2: - default_dimensions = ( - ["Z", "H", "W"] if image_array.ndim == 3 else ["T", "Z", "H", "W"] - ) - default_dimensions = default_dimensions[-image_array.ndim :] + # Prefer the loader's metadata-derived hint (e.g. ImageJ + # TIFF axes='TZCYX'). Fall back to a hand-crafted default + # that covers ndim 3..6 so a user clicking OK without + # tweaking the combos gets a sensible result. The earlier + # `default_dimensions[-ndim:]` slice silently degraded for + # ndim≥5: one axis ended up unset and inherited the combo + # box's first item ("T"), producing 2560 wrong slices for + # a 5D TZCYX file. + if axes_hint and len(axes_hint) == image_array.ndim: + default_dimensions = list(axes_hint) + print(f"Applying axes hint as default dims: {default_dimensions}") + else: + if axes_hint and len(axes_hint) != image_array.ndim: + print( + f"Ignoring axes hint (length {len(axes_hint)} " + f"vs ndim {image_array.ndim})" + ) + ndim_defaults = { + 3: ["Z", "H", "W"], + 4: ["T", "Z", "H", "W"], + 5: ["T", "Z", "C", "H", "W"], + 6: ["T", "Z", "C", "S", "H", "W"], + } + # ndim ≥ 7 falls into the generic case: pad with + # "T" at the front so H / W are still the last two + # axes — that way "click OK" still produces a + # sensible 2D slice even on exotic inputs. + default_dimensions = ndim_defaults.get( + image_array.ndim, + ["T"] * max(0, image_array.ndim - 2) + ["H", "W"], + ) # Show a progress dialog progress = QProgressDialog( "Assigning dimensions...", "Cancel", 0, 100, self ) - progress.setWindowModality(Qt.WindowModal) + progress.setWindowModality(Qt.WindowModality.WindowModal) progress.setMinimumDuration(0) progress.setValue(10) QApplication.processEvents() @@ -1533,12 +1696,11 @@ def process_multidimensional_image( dialog = DimensionDialog( image_array.shape, file_name, self, default_dimensions ) - dialog.setWindowFlags( - dialog.windowFlags() & ~Qt.WindowContextHelpButtonHint - ) + # Qt6 no longer shows the "?" help button by default; + # the old WindowContextHelpButtonHint clear is gone. progress.setValue(50) QApplication.processEvents() - if dialog.exec_(): + if dialog.exec(): dimensions = dialog.get_dimensions() print(f"Assigned dimensions: {dimensions}") if "H" in dimensions and "W" in dimensions: @@ -1598,7 +1760,7 @@ def create_slices(self, image_array, dimensions, image_path): # Create and show progress dialog progress = QProgressDialog("Loading slices...", "Cancel", 0, 100, self) - progress.setWindowModality(Qt.WindowModal) + progress.setWindowModality(Qt.WindowModality.WindowModal) progress.setMinimumDuration(0) # Show immediately # Handle 2D images @@ -1673,8 +1835,11 @@ def add_slice_to_list(self, slice_name): QColor(40, 40, 40) ) # Very dark gray background for all items if slice_name in self.all_annotations: - item.setForeground(QColor(60, 60, 60)) # Dark gray text - item.setBackground(QColor(173, 216, 230)) # Light blue background + # Muted steel-blue + light text; the prior light-blue + # (173, 216, 230) bg + dark-gray text was painfully + # bright on a dark sidebar. + item.setForeground(QColor(235, 235, 235)) + item.setBackground(QColor(58, 95, 140)) else: item.setForeground(QColor(200, 200, 200)) # Light gray text else: @@ -1736,19 +1901,19 @@ def activate_slice(self, slice_name): self.image_label.update() - items = self.slice_list.findItems(slice_name, Qt.MatchExactly) + items = self.slice_list.findItems(slice_name, Qt.MatchFlag.MatchExactly) if items: self.slice_list.setCurrentItem(items[0]) def array_to_qimage(self, array): if array.ndim == 2: height, width = array.shape - return QImage(array.data, width, height, width, QImage.Format_Grayscale8) + return QImage(array.data, width, height, width, QImage.Format.Format_Grayscale8) elif array.ndim == 3 and array.shape[2] == 3: height, width, _ = array.shape bytes_per_line = 3 * width return QImage( - array.data, width, height, bytes_per_line, QImage.Format_RGB888 + array.data, width, height, bytes_per_line, QImage.Format.Format_RGB888 ) else: raise ValueError( @@ -1760,16 +1925,16 @@ def update_slice_list(self): for slice_name, _ in self.slices: item = QListWidgetItem(slice_name) if slice_name in self.all_annotations: - item.setForeground(QColor(Qt.green)) + item.setForeground(QColor(Qt.GlobalColor.green)) else: item.setForeground( - QColor(Qt.black) if not self.dark_mode else QColor(Qt.white) + QColor(Qt.GlobalColor.black) if not self.dark_mode else QColor(Qt.GlobalColor.white) ) self.slice_list.addItem(item) # Select the current slice if self.current_slice: - items = self.slice_list.findItems(self.current_slice, Qt.MatchExactly) + items = self.slice_list.findItems(self.current_slice, Qt.MatchFlag.MatchExactly) if items: self.slice_list.setCurrentItem(items[0]) @@ -1789,9 +1954,9 @@ def keyPressEvent(self, event): super().keyPressEvent(event) return - if event.key() == Qt.Key_F2: - self.launch_snake_game() - elif event.key() == Qt.Key_Delete: + # F2 (Snake game) is wired as a global QShortcut in __init__ + # so it works when child widgets have focus. Don't re-handle here. + if event.key() == Qt.Key.Key_Delete: # Handle deletions if self.class_list.hasFocus() and self.class_list.currentItem(): self.delete_class(self.class_list.currentItem()) @@ -1801,14 +1966,14 @@ def keyPressEvent(self, event): self.delete_selected_annotations() elif self.image_list.hasFocus() and self.image_list.currentItem(): self.delete_selected_image() - elif event.key() == Qt.Key_Up or event.key() == Qt.Key_Down: + elif event.key() == Qt.Key.Key_Up or event.key() == Qt.Key.Key_Down: # Handle slice navigation if self.slice_list.hasFocus(): current_row = self.slice_list.currentRow() - if event.key() == Qt.Key_Up and current_row > 0: + if event.key() == Qt.Key.Key_Up and current_row > 0: self.slice_list.setCurrentRow(current_row - 1) elif ( - event.key() == Qt.Key_Down + event.key() == Qt.Key.Key_Down and current_row < self.slice_list.count() - 1 ): self.slice_list.setCurrentRow(current_row + 1) @@ -1816,13 +1981,13 @@ def keyPressEvent(self, event): else: # Pass the event to the parent for default handling super().keyPressEvent(event) - elif event.key() == Qt.Key_Return or event.key() == Qt.Key_Enter: + elif event.key() == Qt.Key.Key_Return or event.key() == Qt.Key.Key_Enter: # Handle accepting visible temporary classes if self.has_visible_temp_classes(): self.accept_visible_temp_classes() else: super().keyPressEvent(event) - elif event.key() == Qt.Key_Escape: + elif event.key() == Qt.Key.Key_Escape: # Handle rejecting visible temporary classes if self.has_visible_temp_classes(): self.reject_visible_temp_classes() @@ -1835,7 +2000,7 @@ def keyPressEvent(self, event): def has_visible_temp_classes(self): for i in range(self.class_list.count()): item = self.class_list.item(i) - if item.text().startswith("Temp-") and item.checkState() == Qt.Checked: + if item.text().startswith("Temp-") and item.checkState() == Qt.CheckState.Checked: return True return False @@ -1944,11 +2109,11 @@ def import_annotations(self): self, "Missing Images", message, - QMessageBox.Yes | QMessageBox.No, - QMessageBox.No, + QMessageBox.StandardButton.Yes | QMessageBox.StandardButton.No, + QMessageBox.StandardButton.No, ) - if reply == QMessageBox.No: + if reply == QMessageBox.StandardButton.No: print("Import cancelled due to missing images") QMessageBox.information( self, @@ -2188,13 +2353,13 @@ def update_annotation_list(self, image_name=None): if not class_name.startswith( "Temp-" ): # Only show non-temporary annotations - color = self.image_label.class_colors.get(class_name, QColor(Qt.white)) + color = self.image_label.class_colors.get(class_name, QColor(Qt.GlobalColor.white)) for annotation in class_annotations: number = annotation.get("number", 0) area = calculate_area(annotation) item_text = f"{class_name} - {number:<3} Area: {area:.2f}" item = QListWidgetItem(item_text) - item.setData(Qt.UserRole, annotation) + item.setData(Qt.ItemDataRole.UserRole, annotation) item.setForeground(color) self.annotation_list.addItem(item) @@ -2217,12 +2382,14 @@ def update_slice_list_colors(self): slice_name = item.text() if self.dark_mode: - # Dark mode + # Dark mode (annotated colors match add_slice_to_list — + # muted steel-blue, light text; not the prior glaring + # light-blue bg) if slice_name in self.all_annotations and any( self.all_annotations[slice_name].values() ): - item.setForeground(QColor(60, 60, 60)) # Dark gray text - item.setBackground(QColor(173, 216, 230)) # Light blue background + item.setForeground(QColor(235, 235, 235)) + item.setBackground(QColor(58, 95, 140)) else: item.setForeground(QColor(200, 200, 200)) # Light gray text item.setBackground(QColor(40, 40, 40)) # Very dark gray background @@ -2247,14 +2414,14 @@ def update_slice_list_colors(self): def update_annotation_list_colors(self, class_name=None, color=None): for i in range(self.annotation_list.count()): item = self.annotation_list.item(i) - annotation = item.data(Qt.UserRole) + annotation = item.data(Qt.ItemDataRole.UserRole) # Update only the item for the specific class if class_name is provided if class_name is None or annotation["category_name"] == class_name: item_color = ( color if class_name else self.image_label.class_colors.get( - annotation["category_name"], QColor(Qt.white) + annotation["category_name"], QColor(Qt.GlobalColor.white) ) ) item.setForeground(item_color) @@ -2300,7 +2467,7 @@ def save_current_annotations(self): def setup_class_list(self): """Set up the class list widget.""" self.class_list = QListWidget() - self.class_list.setContextMenuPolicy(Qt.CustomContextMenu) + self.class_list.setContextMenuPolicy(Qt.ContextMenuPolicy.CustomContextMenu) self.class_list.customContextMenuRequested.connect(self.show_class_context_menu) self.class_list.itemClicked.connect(self.on_class_selected) self.sidebar_layout.addWidget(QLabel("Classes:")) @@ -2317,7 +2484,7 @@ def setup_tool_buttons(self): manual_layout.setSpacing(5) manual_label = QLabel("Manual Tools") - manual_label.setAlignment(Qt.AlignCenter) + manual_label.setAlignment(Qt.AlignmentFlag.AlignCenter) manual_layout.addWidget(manual_label) manual_buttons_layout = QHBoxLayout() @@ -2340,7 +2507,7 @@ def setup_tool_buttons(self): automated_layout.setSpacing(5) automated_label = QLabel("Automated Tools") - automated_label.setAlignment(Qt.AlignCenter) + automated_label.setAlignment(Qt.AlignmentFlag.AlignCenter) automated_layout.addWidget(automated_label) automated_buttons_layout = QHBoxLayout() @@ -2368,7 +2535,7 @@ def setup_tool_buttons(self): def setup_annotation_list(self): """Set up the annotation list widget.""" self.annotation_list = QListWidget() - self.annotation_list.setSelectionMode(QAbstractItemView.ExtendedSelection) + self.annotation_list.setSelectionMode(QAbstractItemView.SelectionMode.ExtendedSelection) self.annotation_list.itemSelectionChanged.connect( self.update_highlighted_annotations ) @@ -2380,17 +2547,17 @@ def create_menu_bar(self): project_menu = menu_bar.addMenu("&Project") new_project_action = QAction("&New Project", self) - new_project_action.setShortcut(QKeySequence.New) + new_project_action.setShortcut(QKeySequence.StandardKey.New) new_project_action.triggered.connect(self.new_project) project_menu.addAction(new_project_action) open_project_action = QAction("&Open Project", self) - open_project_action.setShortcut(QKeySequence.Open) + open_project_action.setShortcut(QKeySequence.StandardKey.Open) open_project_action.triggered.connect(self.open_project) project_menu.addAction(open_project_action) save_project_action = QAction("&Save Project", self) - save_project_action.setShortcut(QKeySequence.Save) + save_project_action.setShortcut(QKeySequence.StandardKey.Save) save_project_action.triggered.connect(self.save_project) project_menu.addAction(save_project_action) @@ -2472,11 +2639,17 @@ def create_menu_bar(self): dicom_converter_action.triggered.connect(self.show_dicom_converter) tools_menu.addAction(dicom_converter_action) + tools_menu.addSeparator() + + unload_models_action = QAction("Unload AI Models (Free GPU Memory)", self) + unload_models_action.triggered.connect(self.unload_ai_models) + tools_menu.addAction(unload_models_action) + # Help Menu help_menu = menu_bar.addMenu("&Help") help_action = QAction("&Show Help", self) - help_action.setShortcut(QKeySequence.HelpContents) + help_action.setShortcut(QKeySequence.StandardKey.HelpContents) help_action.triggered.connect(self.show_help) help_menu.addAction(help_action) @@ -2484,6 +2657,34 @@ def change_font_size(self, size): self.current_font_size = size self.apply_theme_and_font() + def unload_ai_models(self): + """Drop cached SAM/DINO model objects to free GPU/CPU memory. + + Useful on constrained GPUs (e.g. 8 GB) where SAM 2 base + DINO + base together exhaust VRAM. After unload, the next inference + call will re-load the model from disk (~1-3 s). + """ + self.sam_utils.unload() + self.dino_utils.unload() + # Reset the dropdowns to a neutral state so the user knows they + # need to re-pick the model. + self.sam_model_selector.setCurrentIndex(0) + if hasattr(self, "dino_model_selector"): + self.dino_model_selector.setCurrentIndex(0) + self.dino_model_loaded = False + self.lbl_dino_status.setText("No DINO model loaded") + self.btn_detect_single.setEnabled(False) + self.btn_detect_batch.setEnabled(False) + QMessageBox.information( + self, + "Models Unloaded", + "SAM and DINO models have been unloaded from memory.\n\n" + "Note: PyTorch keeps a per-process CUDA context that survives " + "this unload (typically a few hundred MB visible in Task Manager / " + "nvidia-smi). To fully reclaim GPU memory, restart the app.\n\n" + "Re-select a SAM/DINO model to use AI tools again.", + ) + def setup_sidebar(self): self.sidebar = QWidget() self.sidebar_layout = QVBoxLayout(self.sidebar) @@ -2493,7 +2694,7 @@ def setup_sidebar(self): def create_section_header(text): label = QLabel(text) label.setProperty("class", "section-header") - label.setAlignment(Qt.AlignLeft) + label.setAlignment(Qt.AlignmentFlag.AlignLeft) return label # Import functionality @@ -2521,7 +2722,7 @@ def create_section_header(text): # Class list (without the "Classes" header) self.class_list = QListWidget() - self.class_list.setContextMenuPolicy(Qt.CustomContextMenu) + self.class_list.setContextMenuPolicy(Qt.ContextMenuPolicy.CustomContextMenu) self.class_list.customContextMenuRequested.connect(self.show_class_context_menu) self.class_list.itemClicked.connect(self.on_class_selected) self.sidebar_layout.addWidget(self.class_list) @@ -2614,8 +2815,12 @@ def create_section_header(text): self.lbl_dino_status = QLabel("No DINO model loaded") self.lbl_dino_status.setWordWrap(True) + # No hardcoded background — let the active stylesheet (light or + # dark) provide it via QLabel rules. Hardcoded #f5f5f5 used to + # punch a bright rectangle into the dark sidebar. self.lbl_dino_status.setStyleSheet( - "color:#888;font-size:11px;background:#f5f5f5;padding:4px;border-radius:3px;") + "font-size:11px;padding:4px;border-radius:3px;" + "border:1px solid palette(mid);") dino_layout.addWidget(self.lbl_dino_status) # Threshold table @@ -2668,7 +2873,7 @@ def create_section_header(text): # Annotations list subsection annotation_layout.addWidget(QLabel("Annotations")) self.annotation_list = QListWidget() - self.annotation_list.setSelectionMode(QAbstractItemView.ExtendedSelection) + self.annotation_list.setSelectionMode(QAbstractItemView.SelectionMode.ExtendedSelection) self.annotation_list.itemSelectionChanged.connect( self.update_highlighted_annotations ) @@ -2731,11 +2936,11 @@ def toggle_sam_box(self): self.image_label.current_tool = "sam_box" self.image_label.sam_box_active = True self.image_label.sam_points_active = False - self.image_label.setCursor(Qt.CrossCursor) + self.image_label.setCursor(Qt.CursorShape.CrossCursor) else: self.image_label.current_tool = None self.image_label.sam_box_active = False - self.image_label.setCursor(Qt.ArrowCursor) + self.image_label.setCursor(Qt.CursorShape.ArrowCursor) self.update_ui_for_current_tool() def toggle_sam_points(self): @@ -2744,14 +2949,14 @@ def toggle_sam_points(self): self.image_label.current_tool = "sam_points" self.image_label.sam_points_active = True self.image_label.sam_box_active = False - self.image_label.setCursor(Qt.CrossCursor) + self.image_label.setCursor(Qt.CursorShape.CrossCursor) self.image_label.sam_positive_points = [] self.image_label.sam_negative_points = [] else: self.sam_inference_timer.stop() self.image_label.current_tool = None self.image_label.sam_points_active = False - self.image_label.setCursor(Qt.ArrowCursor) + self.image_label.setCursor(Qt.CursorShape.ArrowCursor) self.image_label.sam_positive_points = [] self.image_label.sam_negative_points = [] self.update_ui_for_current_tool() @@ -2809,8 +3014,8 @@ def update_annotation_list_with_sorted(self, sorted_annotations): area = calculate_area(annotation) item_text = f"{class_name} - {number:<3} Area: {area:.2f}" item = QListWidgetItem(item_text) - item.setData(Qt.UserRole, annotation) - color = self.image_label.class_colors.get(class_name, QColor(Qt.white)) + item.setData(Qt.ItemDataRole.UserRole, annotation) + color = self.image_label.class_colors.get(class_name, QColor(Qt.GlobalColor.white)) item.setForeground(color) self.annotation_list.addItem(item) @@ -2824,8 +3029,8 @@ def change_sam_model(self, model_name): self, "SAM Model Error", f"Failed to load SAM model '{model_name}':\n\n{str(e)}\n\n" - "If you are on Python 3.14, PyTorch may not yet be fully supported. " - "Try reinstalling torch/ultralytics for your platform." + "Check that the model weights are downloadable and that torch " + "is correctly installed for your platform / GPU." ) self.sam_model_selector.setCurrentIndex(0) return @@ -2989,6 +3194,7 @@ def run_dino_detection_single(self): self.lbl_dino_status.setText("Detecting...") QApplication.processEvents() + print(f"[DINO] detect_single: model={model_name!r} class_configs={class_configs}") try: results = self.dino_utils.detect( self.current_image, class_configs, @@ -3007,9 +3213,15 @@ def run_dino_detection_single(self): self.btn_detect_batch.setEnabled(True) if results is None: + print("[DINO] detect_single: results=None (model resolution failure)") self.lbl_dino_status.setText("No detections.") return + print(f"[DINO] detect_single: got {len(results)} result(s)") + if results: + for i, r in enumerate(results[:3]): + print(f"[DINO] result[{i}] class={r['class_name']!r} score={r['score']:.3f} bbox={r['bbox']}") + if not results: self.lbl_dino_status.setText("No detections found.") return @@ -3017,23 +3229,62 @@ def run_dino_detection_single(self): self.lbl_dino_status.setText(f"{len(results)} detection(s). Running SAM...") QApplication.processEvents() - # Batch SAM segmentation + # Batch SAM segmentation. Wrap in try/except for the same reason + # as the DINO call above — sam_utils raises on model load + # failure / CUDA OOM / re-entry now, instead of returning None. bboxes = [r["bbox"] for r in results] - sam_results = self.sam_utils.apply_sam_predictions_batch( - self.current_image, bboxes - ) + print(f"[SAM] batch call: {len(bboxes)} bbox(es), first 3 = {bboxes[:3]}") + try: + sam_results = self.sam_utils.apply_sam_predictions_batch( + self.current_image, bboxes + ) + except Exception as e: + traceback.print_exc() + QMessageBox.critical(self, "SAM Error", str(e)) + self.lbl_dino_status.setText("SAM segmentation failed.") + return if sam_results is None: + print("[SAM] batch returned None (no SAM model loaded)") QMessageBox.warning(self, "SAM Error", "Failed to segment detections with SAM.") self.lbl_dino_status.setText("SAM segmentation failed.") return - # Build temp annotations + n_errors = sum(1 for s in sam_results if "error" in s) + n_ok = sum(1 for s in sam_results if "segmentation" in s) + print(f"[SAM] batch returned {len(sam_results)} result(s): {n_ok} ok, {n_errors} error(s)") + + # Honor the batch-mode dropdown for the single-image case too: + # "Auto-accept" means commit straight to annotations without + # showing the temp-review overlay. The dropdown name is "batch" + # historically but it controls both paths. + image_name = self.current_slice or self.image_file_name + auto_accept = ( + self.dino_batch_mode.currentText() == "Auto-accept all detections" + ) + if auto_accept: + self._commit_dino_results(image_name, results, sam_results) + n_committed = sum(1 for s in sam_results if "error" not in s) + self.image_label.temp_annotations = [] + self.image_label.update() + self.update_annotation_list() + # Refresh slice list so the freshly-annotated slice picks + # up the highlight color; review-mode's accept_dino_results + # already does this, the auto-accept path didn't. + self.update_slice_list_colors() + self.auto_save() + self.lbl_dino_status.setText( + f"Loaded: {model_name} | {n_committed} mask(s) auto-accepted" + ) + print(f"[DINO] auto-accept: committed {n_committed} mask(s) to {image_name}") + return + + # Review mode — build temp annotations and let user accept/reject temp_annotations = [] for r, s in zip(results, sam_results): if "error" in s: - print(f" SAM failed for {r['class_name']}: {s['error']}") + print(f"[SAM] failed for {r['class_name']}: {s['error']}") continue temp_annotations.append({ "segmentation": s["segmentation"], @@ -3044,13 +3295,15 @@ def run_dino_detection_single(self): }) self.image_label.temp_annotations = temp_annotations - # Focus the canvas so Enter/Esc accept/reject without an extra click. - self.image_label.setFocus() + # Defer setFocus until after the click event chain settles — + # synchronous setFocus often loses to whatever widget is still + # processing the original click. + QTimer.singleShot(0, self.image_label.setFocus) self.image_label.update() self.lbl_dino_status.setText( f"Loaded: {model_name} | {len(temp_annotations)} mask(s) ready" ) - print(f"DINO detection: {len(results)} boxes, {len(temp_annotations)} masks.") + print(f"[DINO] detection complete: {len(results)} boxes, {len(temp_annotations)} masks attached to canvas") def run_dino_detection_batch(self): if not self.dino_model_loaded: @@ -3080,38 +3333,34 @@ def run_dino_detection_batch(self): return auto_accept = self.dino_batch_mode.currentText() == "Auto-accept all detections" - total = len(self.all_images) + + # Build a flat list of (display_name, qimage) work items covering + # both regular images (loaded from disk) and multi-dim image + # slices (already QImages in memory). Slices live in + # self.image_slices[base_name], indexed by their slice_name + # (e.g. "stack_T1_Z1_C1"). The earlier implementation only + # iterated self.all_images and skipped multi-slice entries with + # a console warning, leaving slice-based projects unable to use + # Detect All. + work_items = self._collect_dino_batch_work_items() + if not work_items: + QMessageBox.information( + self, "Detect All Images", + "No images or slices available to process." + ) + return + total = len(work_items) progress = QProgressDialog("Running LLM Detection...", "Cancel", 0, total, self) - progress.setWindowModality(Qt.WindowModal) + progress.setWindowModality(Qt.WindowModality.WindowModal) progress.setMinimumDuration(0) - for idx, img_info in enumerate(self.all_images): + for idx, (image_name, qimage) in enumerate(work_items): if progress.wasCanceled(): break progress.setValue(idx) QApplication.processEvents() - image_name = img_info["file_name"] - image_path = self.image_paths.get(image_name) - if not image_path: - # Multi-dimensional image slices live in self.image_slices, - # not self.image_paths — batch detection on stacks isn't - # supported yet. Surface the skip rather than dropping silently. - print(f" Skipping '{image_name}': no entry in image_paths " - "(multi-dimensional slices aren't supported in batch).") - continue - if not os.path.exists(image_path): - print(f" Skipping '{image_name}': file missing on disk " - f"({image_path}).") - continue - - # Load image as QImage for DINO + SAM - from PIL import Image as PILImage - pil_img = PILImage.open(image_path).convert("RGB") - qimage = QImage(pil_img.tobytes(), pil_img.width, pil_img.height, - pil_img.width * 3, QImage.Format_RGB888) - try: results = self.dino_utils.detect( qimage, class_configs, @@ -3126,7 +3375,11 @@ def run_dino_detection_batch(self): continue bboxes = [r["bbox"] for r in results] - sam_results = self.sam_utils.apply_sam_predictions_batch(qimage, bboxes) + try: + sam_results = self.sam_utils.apply_sam_predictions_batch(qimage, bboxes) + except Exception as e: + print(f" SAM failed for {image_name}: {e}") + continue if sam_results is None: continue @@ -3145,10 +3398,56 @@ def run_dino_detection_batch(self): "Detections have been saved to annotations." ) self.update_annotation_list() + # Multi-dim stacks commonly auto-accept across dozens of + # slices; the slice list must show which ones gained + # annotations or the user can't tell what happened. + self.update_slice_list_colors() self.auto_save() else: self._show_dino_batch_review() + def _collect_dino_batch_work_items(self): + """Return a flat ``[(name, QImage), …]`` list for batch DINO. + + Regular images are loaded from disk via PIL → QImage. Multi-dim + images contribute one entry per slice from ``self.image_slices``; + slices that haven't been materialised yet (the parent image was + never opened in this session) are skipped with a console log. + """ + from PIL import Image as PILImage + items = [] + for img_info in self.all_images: + file_name = img_info["file_name"] + if img_info.get("is_multi_slice", False): + base_name = os.path.splitext(file_name)[0] + slices = self.image_slices.get(base_name, []) + if not slices: + print(f" Skipping multi-slice image '{file_name}': " + "no slices loaded (open the image first to " + "materialise its slices).") + continue + for slice_name, qimage in slices: + items.append((slice_name, qimage)) + else: + image_path = self.image_paths.get(file_name) + if not image_path or not os.path.exists(image_path): + print(f" Skipping '{file_name}': missing image path.") + continue + try: + pil_img = PILImage.open(image_path).convert("RGB") + qimage = QImage( + pil_img.tobytes(), + pil_img.width, + pil_img.height, + pil_img.width * 3, + QImage.Format.Format_RGB888, + ) + items.append((file_name, qimage)) + except Exception as e: + print(f" Skipping '{file_name}': failed to load ({e}).") + print(f"[DINO] batch work items: {len(items)} total") + return items + def _commit_dino_results(self, image_name, dino_results, sam_results): """Commit DINO+SAM results to annotations for a single image. @@ -3197,9 +3496,6 @@ def _commit_dino_results(self, image_name, dino_results, sam_results): def _store_dino_batch_results(self, image_name, dino_results, sam_results): """Store results for batch review mode.""" - if not hasattr(self, "dino_batch_results"): - self.dino_batch_results = {} - valid = [] for r, s in zip(dino_results, sam_results): if "error" not in s: @@ -3213,25 +3509,110 @@ def _store_dino_batch_results(self, image_name, dino_results, sam_results): self.dino_batch_results[image_name] = valid def _show_dino_batch_review(self): - """Navigate to first image with batch results for review.""" - if not hasattr(self, "dino_batch_results") or not self.dino_batch_results: + """Navigate to first image with batch results for review. + + If the next entry refers to an image/slice that's no longer in + the project (e.g. the source was removed between detection and + review), pop the orphan and try the next entry so the user + doesn't get stuck with un-reviewable results. + """ + if not self.dino_batch_results: QMessageBox.information(self, "Batch Detection", "No detections found in any image.") return - first = next(iter(self.dino_batch_results)) - # Switch to first image with results + # Drain orphans up front. Navigate to the entry: it may be a + # regular image (key in image_list) or a slice (key in some + # image_slices[base_name]). _navigate_to_image_or_slice handles + # both. After the switch, switch_image / switch_slice's tail + # call to _refresh_dino_temp_for_current copies + # dino_batch_results[first] into image_label.temp_annotations + # and defers setFocus on the canvas — nothing to repeat here. + while self.dino_batch_results: + first = next(iter(self.dino_batch_results)) + if self._navigate_to_image_or_slice(first): + return + print(f"[DINO] dropping orphan batch result for {first!r} " + "(no matching image or slice in project)") + self.dino_batch_results.pop(first, None) + # Drained all entries without a single navigable target. + QMessageBox.warning( + self, "Batch Detection", + "Detections were produced but none of them map to an image " + "or slice still in the project. Results discarded.", + ) + + def _navigate_to_image_or_slice(self, name: str) -> bool: + """Switch the UI to a regular image or a slice by name. + + Returns True if a match was found and the switch was issued. + Used by batch-review navigation, which mixes regular image + names and slice names in ``dino_batch_results``. + """ + # Regular image — match in image_list directly for i in range(self.image_list.count()): item = self.image_list.item(i) - if item and item.text() == first: + if item and item.text() == name: self.image_list.setCurrentRow(i) self.switch_image(item) - break - self.image_label.temp_annotations = self.dino_batch_results.get(first, []) - self.image_label.setFocus() + return True + # Slice — find which multi-dim image contains it, switch to + # that parent image first, then activate the specific slice + # via slice_list. + for base_name, slices in self.image_slices.items(): + if not any(s_name == name for s_name, _ in slices): + continue + # Find the parent file in image_list. The file_name in the + # list includes the extension (e.g. "stack.tif") while + # base_name is the stem ("stack"), so match by stripping + # the extension and comparing for equality. + for i in range(self.image_list.count()): + item = self.image_list.item(i) + if not item: + continue + file_name = item.text() + if os.path.splitext(file_name)[0] == base_name: + self.image_list.setCurrentRow(i) + self.switch_image(item) + # switch_image populates slice_list. Now find the slice. + for s_i in range(self.slice_list.count()): + s_item = self.slice_list.item(s_i) + if s_item and s_item.text() == name: + self.slice_list.setCurrentRow(s_i) + self.switch_slice(s_item) + return True + break + return False + return False + + def _refresh_dino_temp_for_current(self): + """Sync ``image_label.temp_annotations`` to whatever the + currently-displayed image/slice has stored in + ``dino_batch_results``. Called from switch_slice / switch_image. + + Why this exists: ``temp_annotations`` is a single field on + ``ImageLabel``, not a per-image cache. Without this sync, masks + from the previously-viewed image bleed onto every slice the + user navigates to. During a batch review the user expects each + image to show its own pending detections; outside batch review, + switching simply discards the pending overlay. + """ + new_image = self.current_slice or self.image_file_name + pending = self.dino_batch_results.get(new_image, []) if new_image else [] + if pending: + # Re-stamp the "temp" flag in case it was stripped by a + # previous accept path; this list also feeds the paintEvent + # which expects dicts with "segmentation" + "category_name". + self.image_label.temp_annotations = list(pending) + self.lbl_dino_status.setText( + f"Review: {new_image} ({len(pending)} detection(s))" + ) + QTimer.singleShot(0, self.image_label.setFocus) + else: + if self.image_label.temp_annotations: + print("[DINO] temp annotations cleared on switch " + f"(no pending batch results for {new_image!r})") + self.image_label.temp_annotations = [] self.image_label.update() - self.lbl_dino_status.setText( - f"Review: {first} ({len(self.image_label.temp_annotations)} detections)" - ) def accept_dino_results(self): """Accept current temp_annotations (called from keyPressEvent).""" @@ -3262,10 +3643,9 @@ def accept_dino_results(self): self.image_label.temp_annotations = [] # Clear batch results if reviewing - if hasattr(self, "dino_batch_results"): - self.dino_batch_results.pop(image_name, None) - if self.dino_batch_results: - self._show_dino_batch_review() + self.dino_batch_results.pop(image_name, None) + if self.dino_batch_results: + self._show_dino_batch_review() self.save_current_annotations() self.update_slice_list_colors() self.image_label.update() @@ -3276,10 +3656,9 @@ def reject_dino_results(self): """Discard current temp_annotations.""" self.image_label.temp_annotations = [] image_name = self.current_slice or self.image_file_name - if hasattr(self, "dino_batch_results"): - self.dino_batch_results.pop(image_name, None) - if self.dino_batch_results: - self._show_dino_batch_review() + self.dino_batch_results.pop(image_name, None) + if self.dino_batch_results: + self._show_dino_batch_review() self.image_label.update() self.lbl_dino_status.setText("Results discarded.") print("DINO results discarded.") @@ -3354,20 +3733,20 @@ def setup_image_area(self): self.scroll_area = QScrollArea() self.scroll_area.setWidgetResizable(True) - self.scroll_area.setHorizontalScrollBarPolicy(Qt.ScrollBarAsNeeded) - self.scroll_area.setVerticalScrollBarPolicy(Qt.ScrollBarAsNeeded) + self.scroll_area.setHorizontalScrollBarPolicy(Qt.ScrollBarPolicy.ScrollBarAsNeeded) + self.scroll_area.setVerticalScrollBarPolicy(Qt.ScrollBarPolicy.ScrollBarAsNeeded) # Use the already initialized image_label - self.image_label.setAlignment(Qt.AlignCenter) + self.image_label.setAlignment(Qt.AlignmentFlag.AlignCenter) self.scroll_area.setWidget(self.image_label) self.image_layout.addWidget(self.scroll_area) - self.zoom_slider = QSlider(Qt.Horizontal) + self.zoom_slider = QSlider(Qt.Orientation.Horizontal) self.zoom_slider.setMinimum(10) self.zoom_slider.setMaximum(500) self.zoom_slider.setValue(100) - self.zoom_slider.setTickPosition(QSlider.TicksBelow) + self.zoom_slider.setTickPosition(QSlider.TickPosition.TicksBelow) self.zoom_slider.setTickInterval(50) self.zoom_slider.valueChanged.connect(self.zoom_image) self.image_layout.addWidget(self.zoom_slider) @@ -3388,7 +3767,7 @@ def setup_image_list(self): self.image_list.currentRowChanged.connect( lambda row: self.switch_image(self.image_list.currentItem()) ) - self.image_list.setContextMenuPolicy(Qt.CustomContextMenu) + self.image_list.setContextMenuPolicy(Qt.ContextMenuPolicy.CustomContextMenu) self.image_list.customContextMenuRequested.connect(self.show_image_context_menu) self.image_list_layout.addWidget(self.image_list) @@ -3399,7 +3778,7 @@ def setup_image_list(self): ########## ### Tools ########## I love useful image processing tools :) def open_dataset_splitter(self): self.dataset_splitter = DatasetSplitterTool(self) - self.dataset_splitter.setWindowModality(Qt.ApplicationModal) + self.dataset_splitter.setWindowModality(Qt.WindowModality.ApplicationModal) self.dataset_splitter.show_centered(self) def show_annotation_statistics(self): @@ -3470,7 +3849,7 @@ def clear_all(self, new_project=False, show_messages=True): "Clear All", "Are you sure you want to clear all images and annotations? This action cannot be undone.", ) - if reply != QMessageBox.Yes: + if reply != QMessageBox.StandardButton.Yes: return # Clear images @@ -3534,7 +3913,7 @@ def clear_all(self, new_project=False, show_messages=True): self.image_label.drawing_sam_bbox = False self.image_label.temp_sam_prediction = None - self.image_label.setCursor(Qt.ArrowCursor) # Reset cursor to default + self.image_label.setCursor(Qt.CursorShape.ArrowCursor) # Reset cursor to default self.sam_model_selector.setCurrentIndex(0) # Reset to "Pick a SAM Model" self.current_sam_model = None # Reset the current SAM model @@ -3572,7 +3951,7 @@ def update_image_info(self, additional_info=None): def show_question(self, title, message): return QMessageBox.question( - self, title, message, QMessageBox.Yes | QMessageBox.No, QMessageBox.No + self, title, message, QMessageBox.StandardButton.Yes | QMessageBox.StandardButton.No, QMessageBox.StandardButton.No ) def show_image_context_menu(self, position): @@ -3588,7 +3967,7 @@ def show_image_context_menu(self, position): if self.is_multi_dimensional(file_name): redefine_dimensions_action = menu.addAction("Redefine Dimensions") - action = menu.exec_(self.image_list.mapToGlobal(position)) + action = menu.exec(self.image_list.mapToGlobal(position)) if action == delete_action: self.remove_image() @@ -3641,11 +4020,11 @@ def redefine_dimensions(self, file_name): "Redefine Dimensions", "Redefining dimensions will cause all associated annotations to be lost. " "Do you want to continue?", - QMessageBox.Yes | QMessageBox.No, - QMessageBox.No, + QMessageBox.StandardButton.Yes | QMessageBox.StandardButton.No, + QMessageBox.StandardButton.No, ) - if reply == QMessageBox.Yes: + if reply == QMessageBox.StandardButton.Yes: # Remove existing annotations for this file base_name = os.path.splitext(file_name)[0] @@ -3867,7 +4246,7 @@ def load_annotations(self): # Reload the current image if it exists, otherwise load the first image if self.image_file_name and self.image_file_name in self.all_annotations: self.switch_image( - self.image_list.findItems(self.image_file_name, Qt.MatchExactly)[0] + self.image_list.findItems(self.image_file_name, Qt.MatchFlag.MatchExactly)[0] ) elif self.all_images: self.switch_image(self.image_list.item(0)) @@ -3883,7 +4262,7 @@ def clear_highlighted_annotation(self): def update_highlighted_annotations(self): selected_items = self.annotation_list.selectedItems() self.image_label.highlighted_annotations = [ - item.data(Qt.UserRole) for item in selected_items + item.data(Qt.ItemDataRole.UserRole) for item in selected_items ] self.image_label.update() # Force a redraw of the image label @@ -3911,14 +4290,14 @@ def delete_selected_annotations(self): self, "Delete Annotations", f"Are you sure you want to delete {len(selected_items)} annotation(s)?", - QMessageBox.Yes | QMessageBox.No, - QMessageBox.No, + QMessageBox.StandardButton.Yes | QMessageBox.StandardButton.No, + QMessageBox.StandardButton.No, ) - if reply == QMessageBox.Yes: + if reply == QMessageBox.StandardButton.Yes: # Create a list of annotations to remove annotations_to_remove = [] for item in selected_items: - annotation = item.data(Qt.UserRole) + annotation = item.data(Qt.ItemDataRole.UserRole) annotations_to_remove.append((annotation["category_name"], annotation)) # Remove annotations from image_label.annotations @@ -3968,9 +4347,9 @@ def merge_annotations(self): ) return - class_name = selected_items[0].data(Qt.UserRole)["category_name"] + class_name = selected_items[0].data(Qt.ItemDataRole.UserRole)["category_name"] if not all( - item.data(Qt.UserRole)["category_name"] == class_name + item.data(Qt.ItemDataRole.UserRole)["category_name"] == class_name for item in selected_items ): QMessageBox.warning( @@ -3983,7 +4362,7 @@ def merge_annotations(self): polygons = [] original_annotations = [] for item in selected_items: - annotation = item.data(Qt.UserRole) + annotation = item.data(Qt.ItemDataRole.UserRole) original_annotations.append(annotation) if "segmentation" in annotation: points = zip( @@ -4058,16 +4437,16 @@ def are_all_polygons_connected(polygons): msg_box = QMessageBox(self) msg_box.setWindowTitle("Merge Annotations") msg_box.setText("Do you want to keep the original annotations?") - msg_box.setIcon(QMessageBox.Question) + msg_box.setIcon(QMessageBox.Icon.Question) - keep_button = msg_box.addButton("Keep", QMessageBox.YesRole) - delete_button = msg_box.addButton("Delete", QMessageBox.NoRole) - cancel_button = msg_box.addButton("Cancel", QMessageBox.RejectRole) + keep_button = msg_box.addButton("Keep", QMessageBox.ButtonRole.YesRole) + delete_button = msg_box.addButton("Delete", QMessageBox.ButtonRole.NoRole) + cancel_button = msg_box.addButton("Cancel", QMessageBox.ButtonRole.RejectRole) msg_box.setDefaultButton(cancel_button) msg_box.setEscapeButton(cancel_button) - msg_box.exec_() + msg_box.exec() if msg_box.clickedButton() == cancel_button: return @@ -4102,11 +4481,11 @@ def delete_selected_image(self): "Delete Image", f"Are you sure you want to delete the image '{file_name}'?\n\n" "This will remove the image and all its associated annotations.", - QMessageBox.Yes | QMessageBox.No, - QMessageBox.No, + QMessageBox.StandardButton.Yes | QMessageBox.StandardButton.No, + QMessageBox.StandardButton.No, ) - if reply == QMessageBox.Yes: + if reply == QMessageBox.StandardButton.Yes: # Remove from all data structures self.image_list.takeItem(self.image_list.row(current_item)) self.image_paths.pop(file_name, None) @@ -4242,11 +4621,11 @@ def add_class(self, class_name=None, color=None): item.setIcon(QIcon(pixmap)) # Set visibility state - item.setData(Qt.UserRole, True) + item.setData(Qt.ItemDataRole.UserRole, True) # Set checkbox - item.setFlags(item.flags() | Qt.ItemIsUserCheckable) - item.setCheckState(Qt.Checked) + item.setFlags(item.flags() | Qt.ItemFlag.ItemIsUserCheckable) + item.setCheckState(Qt.CheckState.Checked) self.class_list.addItem(item) @@ -4254,16 +4633,22 @@ def add_class(self, class_name=None, color=None): self.current_class = class_name print(f"Class added successfully: {class_name}") - # Sync DINO phrase/threshold state - self.dino_class_table.add_class(class_name) + # Sync DINO phrase/threshold state. Select the newly added + # row so the phrase editor below the table reveals itself — + # it hides by default and only becomes visible when a row is + # selected (set_active_class). Skip the row-select during + # project load: classes are added in a loop and we don't want + # N row-selection signals firing during bulk restoration; the + # caller will select an appropriate row after load completes. + row_added = self.dino_class_table.add_class(class_name) self.dino_phrase_panel.on_class_added(class_name) + if row_added and not self.is_loading_project: + self.dino_class_table.selectRow(self.dino_class_table.rowCount() - 1) if not self.is_loading_project: self.auto_save() except Exception as e: print(f"Error adding class: {e}") - import traceback - traceback.print_exc() def update_class_item_color(self, item, color): @@ -4283,18 +4668,18 @@ def update_class_list(self): # Store the visibility state item.setData( - Qt.UserRole, self.image_label.class_visibility.get(class_name, True) + Qt.ItemDataRole.UserRole, self.image_label.class_visibility.get(class_name, True) ) # Set checkbox - item.setFlags(item.flags() | Qt.ItemIsUserCheckable) - item.setCheckState(Qt.Checked if item.data(Qt.UserRole) else Qt.Unchecked) + item.setFlags(item.flags() | Qt.ItemFlag.ItemIsUserCheckable) + item.setCheckState(Qt.CheckState.Checked if item.data(Qt.ItemDataRole.UserRole) else Qt.CheckState.Unchecked) self.class_list.addItem(item) # Re-select the current class if it exists if self.current_class: - items = self.class_list.findItems(self.current_class, Qt.MatchExactly) + items = self.class_list.findItems(self.current_class, Qt.MatchFlag.MatchExactly) if items: self.class_list.setCurrentItem(items[0]) elif self.class_list.count() > 0: @@ -4313,9 +4698,9 @@ def update_class_selection(self): def toggle_class_visibility(self, item): class_name = item.text() - is_visible = item.checkState() == Qt.Checked + is_visible = item.checkState() == Qt.CheckState.Checked self.image_label.set_class_visibility(class_name, is_visible) - item.setData(Qt.UserRole, is_visible) + item.setData(Qt.ItemDataRole.UserRole, is_visible) self.image_label.update() def change_annotation_class(self): @@ -4337,12 +4722,12 @@ def change_annotation_class(self): class_combo.addItem(class_name) layout.addWidget(class_combo) - button_box = QDialogButtonBox(QDialogButtonBox.Ok | QDialogButtonBox.Cancel) + button_box = QDialogButtonBox(QDialogButtonBox.StandardButton.Ok | QDialogButtonBox.StandardButton.Cancel) button_box.accepted.connect(class_dialog.accept) button_box.rejected.connect(class_dialog.reject) layout.addWidget(button_box) - if class_dialog.exec_() == QDialog.Accepted: + if class_dialog.exec() == QDialog.DialogCode.Accepted: new_class = class_combo.currentText() current_name = self.current_slice or self.image_file_name @@ -4356,7 +4741,7 @@ def change_annotation_class(self): ) for item in selected_items: - annotation = item.data(Qt.UserRole) + annotation = item.data(Qt.ItemDataRole.UserRole) old_class = annotation["category_name"] # Remove from old class @@ -4454,7 +4839,7 @@ def toggle_tool(self): self.update_ui_for_current_tool() def wheelEvent(self, event): - if event.modifiers() == Qt.ControlModifier: + if event.modifiers() == Qt.KeyboardModifier.ControlModifier: delta = event.angleDelta().y() if self.image_label.current_tool == "paint_brush": self.paint_brush_size = max(1, self.paint_brush_size + delta // 120) @@ -4495,9 +4880,9 @@ def update_ui_for_current_tool(self): self.image_label.current_tool == "sam_magic_wand" and self.sam_magic_wand_button.isEnabled() ): - self.image_label.setCursor(Qt.CrossCursor) + self.image_label.setCursor(Qt.CursorShape.CrossCursor) else: - self.image_label.setCursor(Qt.ArrowCursor) + self.image_label.setCursor(Qt.CursorShape.ArrowCursor) def on_class_selected(self, current=None, previous=None): if not self.image_label.check_unsaved_changes(): @@ -4536,7 +4921,7 @@ def show_class_context_menu(self, position): item = self.class_list.itemAt(position) if item: - action = menu.exec_(self.class_list.mapToGlobal(position)) + action = menu.exec(self.class_list.mapToGlobal(position)) if action == rename_action: self.rename_class(item) @@ -4551,7 +4936,7 @@ def show_class_context_menu(self, position): def change_class_color(self, item): class_name = item.text() - current_color = self.image_label.class_colors.get(class_name, QColor(Qt.white)) + current_color = self.image_label.class_colors.get(class_name, QColor(Qt.GlobalColor.white)) color = QColorDialog.getColor( current_color, self, f"Select Color for {class_name}" ) @@ -4641,11 +5026,11 @@ def delete_class(self, item=None): "Delete Class", f"Are you sure you want to delete the class '{class_name}'?\n\n" "This will remove all annotations associated with this class.", - QMessageBox.Yes | QMessageBox.No, - QMessageBox.No, + QMessageBox.StandardButton.Yes | QMessageBox.StandardButton.No, + QMessageBox.StandardButton.No, ) - if reply == QMessageBox.Yes: + if reply == QMessageBox.StandardButton.Yes: # Proceed with deletion # Remove class color self.image_label.class_colors.pop(class_name, None) @@ -4774,13 +5159,13 @@ def finish_polygon(self): self.auto_save() # Auto-save after adding a polygon annotation def highlight_annotation(self, item): - self.image_label.highlighted_annotation = item.data(Qt.UserRole) + self.image_label.highlighted_annotation = item.data(Qt.ItemDataRole.UserRole) self.image_label.update() def delete_annotation(self): current_item = self.annotation_list.currentItem() if current_item: - annotation = current_item.data(Qt.UserRole) + annotation = current_item.data(Qt.ItemDataRole.UserRole) category_name = annotation["category_name"] self.image_label.annotations[category_name].remove(annotation) self.annotation_list.takeItem(self.annotation_list.row(current_item)) @@ -4789,7 +5174,7 @@ def delete_annotation(self): def add_annotation_to_list(self, annotation): class_name = annotation["category_name"] - color = self.image_label.class_colors.get(class_name, QColor(Qt.white)) + color = self.image_label.class_colors.get(class_name, QColor(Qt.GlobalColor.white)) annotations = self.image_label.annotations.get(class_name, []) number = max([ann.get("number", 0) for ann in annotations] + [0]) + 1 annotation["number"] = number @@ -4797,7 +5182,7 @@ def add_annotation_to_list(self, annotation): item_text = f"{class_name} - {number:<3} Area: {area:.2f}" item = QListWidgetItem(item_text) - item.setData(Qt.UserRole, annotation) + item.setData(Qt.ItemDataRole.UserRole, annotation) item.setForeground(color) self.annotation_list.addItem(item) @@ -4926,14 +5311,14 @@ def exit_edit_mode(self): def highlight_annotation_in_list(self, annotation): for i in range(self.annotation_list.count()): item = self.annotation_list.item(i) - if item.data(Qt.UserRole) == annotation: + if item.data(Qt.ItemDataRole.UserRole) == annotation: self.annotation_list.setCurrentItem(item) break def select_annotation_in_list(self, annotation): for i in range(self.annotation_list.count()): item = self.annotation_list.item(i) - if item.data(Qt.UserRole) == annotation: + if item.data(Qt.ItemDataRole.UserRole) == annotation: self.annotation_list.setCurrentItem(item) break @@ -5081,7 +5466,7 @@ def load_prediction_model(self): self.initialize_yolo_trainer() dialog = LoadPredictionModelDialog(self) - if dialog.exec_() == QDialog.Accepted: + if dialog.exec() == QDialog.DialogCode.Accepted: model_path = dialog.model_path yaml_path = dialog.yaml_path if model_path and yaml_path: @@ -5143,14 +5528,14 @@ def show_train_dialog(self): layout.addWidget(imgsz_label) layout.addWidget(imgsz_input) - button_box = QDialogButtonBox(QDialogButtonBox.Ok | QDialogButtonBox.Cancel) + button_box = QDialogButtonBox(QDialogButtonBox.StandardButton.Ok | QDialogButtonBox.StandardButton.Cancel) button_box.accepted.connect(dialog.accept) button_box.rejected.connect(dialog.reject) layout.addWidget(button_box) dialog.setLayout(layout) - if dialog.exec_() == QDialog.Accepted: + if dialog.exec() == QDialog.DialogCode.Accepted: epochs = int(epochs_input.text()) imgsz = int(imgsz_input.text()) self.start_training(epochs, imgsz) @@ -5244,16 +5629,16 @@ def show_predict_dialog(self): layout.addWidget(conf_label) layout.addWidget(conf_input) - button_box = QDialogButtonBox(QDialogButtonBox.Cancel) + button_box = QDialogButtonBox(QDialogButtonBox.StandardButton.Cancel) predict_button = QPushButton("Predict") - button_box.addButton(predict_button, QDialogButtonBox.AcceptRole) + button_box.addButton(predict_button, QDialogButtonBox.ButtonRole.AcceptRole) button_box.accepted.connect(dialog.accept) button_box.rejected.connect(dialog.reject) layout.addWidget(button_box) dialog.setLayout(layout) - if dialog.exec_() == QDialog.Accepted: + if dialog.exec() == QDialog.DialogCode.Accepted: selected_images = [item.text() for item in image_list.selectedItems()] conf = conf_input.value() self.yolo_trainer.set_conf_threshold(conf) @@ -5394,8 +5779,8 @@ def verify_current_class(self): def accept_visible_temp_classes(self): visible_temp_classes = [ item.text() - for item in self.class_list.findItems("Temp-*", Qt.MatchWildcard) - if item.checkState() == Qt.Checked + for item in self.class_list.findItems("Temp-*", Qt.MatchFlag.MatchWildcard) + if item.checkState() == Qt.CheckState.Checked ] for temp_class_name in visible_temp_classes: @@ -5455,8 +5840,8 @@ def select_first_primary_class(self): def reject_visible_temp_classes(self): visible_temp_classes = [ item.text() - for item in self.class_list.findItems("Temp-*", Qt.MatchWildcard) - if item.checkState() == Qt.Checked + for item in self.class_list.findItems("Temp-*", Qt.MatchFlag.MatchWildcard) + if item.checkState() == Qt.CheckState.Checked ] for temp_class_name in visible_temp_classes: @@ -5469,9 +5854,9 @@ def reject_visible_temp_classes(self): self.image_label.update() def is_class_visible(self, class_name): - items = self.class_list.findItems(class_name, Qt.MatchExactly) + items = self.class_list.findItems(class_name, Qt.MatchFlag.MatchExactly) if items: - return items[0].checkState() == Qt.Checked + return items[0].checkState() == Qt.CheckState.Checked return False def check_temp_annotations(self): @@ -5485,10 +5870,10 @@ def check_temp_annotations(self): self, "Temporary Annotations", "There are temporary annotations that will be discarded. Do you want to continue?", - QMessageBox.Yes | QMessageBox.No, - QMessageBox.No, + QMessageBox.StandardButton.Yes | QMessageBox.StandardButton.No, + QMessageBox.StandardButton.No, ) - if reply == QMessageBox.Yes: + if reply == QMessageBox.StandardButton.Yes: for temp_class in temp_classes: del self.image_label.annotations[temp_class] del self.image_label.class_colors[temp_class] diff --git a/src/digitalsreeni_image_annotator/coco_json_combiner.py b/src/digitalsreeni_image_annotator/coco_json_combiner.py index af18766..c2eb5c6 100644 --- a/src/digitalsreeni_image_annotator/coco_json_combiner.py +++ b/src/digitalsreeni_image_annotator/coco_json_combiner.py @@ -1,16 +1,16 @@ import json import os -from PyQt5.QtWidgets import (QDialog, QVBoxLayout, QHBoxLayout, QPushButton, +from PyQt6.QtWidgets import (QDialog, QVBoxLayout, QHBoxLayout, QPushButton, QFileDialog, QLabel, QMessageBox, QApplication) -from PyQt5.QtCore import Qt +from PyQt6.QtCore import Qt class COCOJSONCombinerDialog(QDialog): def __init__(self, parent=None): super().__init__(parent) self.setWindowTitle("COCO JSON Combiner") self.setGeometry(100, 100, 400, 300) - self.setWindowFlags(self.windowFlags() | Qt.Window) - self.setWindowModality(Qt.ApplicationModal) + self.setWindowFlags(self.windowFlags() | Qt.WindowType.Window) + self.setWindowModality(Qt.WindowModality.ApplicationModal) self.json_files = [] self.initUI() diff --git a/src/digitalsreeni_image_annotator/dataset_splitter.py b/src/digitalsreeni_image_annotator/dataset_splitter.py index 50b0d7b..2a1d849 100644 --- a/src/digitalsreeni_image_annotator/dataset_splitter.py +++ b/src/digitalsreeni_image_annotator/dataset_splitter.py @@ -2,9 +2,9 @@ import json import shutil import random -from PyQt5.QtWidgets import (QDialog, QVBoxLayout, QHBoxLayout, QPushButton, QFileDialog, +from PyQt6.QtWidgets import (QDialog, QVBoxLayout, QHBoxLayout, QPushButton, QFileDialog, QLabel, QSpinBox, QRadioButton, QButtonGroup, QMessageBox, QComboBox) -from PyQt5.QtCore import Qt +from PyQt6.QtCore import Qt import yaml from PIL import Image @@ -13,7 +13,7 @@ def __init__(self, parent=None): super().__init__(parent) self.setWindowTitle("Dataset Splitter") self.setGeometry(100, 100, 500, 300) - self.setWindowFlags(self.windowFlags() | Qt.Window) + self.setWindowFlags(self.windowFlags() | Qt.WindowType.Window) self.initUI() def initUI(self): diff --git a/src/digitalsreeni_image_annotator/dicom_converter.py b/src/digitalsreeni_image_annotator/dicom_converter.py index dc84cc7..b53bc60 100644 --- a/src/digitalsreeni_image_annotator/dicom_converter.py +++ b/src/digitalsreeni_image_annotator/dicom_converter.py @@ -2,10 +2,10 @@ import json import numpy as np from datetime import datetime -from PyQt5.QtWidgets import (QDialog, QVBoxLayout, QHBoxLayout, QPushButton, QFileDialog, +from PyQt6.QtWidgets import (QDialog, QVBoxLayout, QHBoxLayout, QPushButton, QFileDialog, QLabel, QProgressDialog, QRadioButton, QButtonGroup, QMessageBox, QApplication, QGroupBox) -from PyQt5.QtCore import Qt +from PyQt6.QtCore import Qt import pydicom from pydicom.pixel_data_handlers.util import apply_voi_lut import tifffile @@ -15,8 +15,8 @@ def __init__(self, parent=None): super().__init__(parent) self.setWindowTitle("DICOM to TIFF Converter") self.setGeometry(100, 100, 600, 300) - self.setWindowFlags(self.windowFlags() | Qt.Window) - self.setWindowModality(Qt.ApplicationModal) # Add modal behavior + self.setWindowFlags(self.windowFlags() | Qt.WindowType.Window) + self.setWindowModality(Qt.WindowModality.ApplicationModal) # Add modal behavior # Initialize variables first self.input_file = "" @@ -97,7 +97,7 @@ def select_input(self): "Select DICOM File", "", file_filter, - options=QFileDialog.Options() + options=QFileDialog.Option(0) ) if file_name: @@ -115,7 +115,7 @@ def select_output(self): self, "Select Output Directory", "", - QFileDialog.ShowDirsOnly | QFileDialog.DontResolveSymlinks + QFileDialog.Option.ShowDirsOnly | QFileDialog.Option.DontResolveSymlinks ) if directory: @@ -182,7 +182,7 @@ def convert_dicom(self): try: # Create progress dialog progress = QProgressDialog("Processing DICOM file...", "Cancel", 0, 100, self) - progress.setWindowModality(Qt.WindowModal) + progress.setWindowModality(Qt.WindowModality.WindowModal) progress.setMinimumWidth(400) progress.show() diff --git a/src/digitalsreeni_image_annotator/dino_merge_dialog.py b/src/digitalsreeni_image_annotator/dino_merge_dialog.py index 5b2d9f3..44bb6ac 100644 --- a/src/digitalsreeni_image_annotator/dino_merge_dialog.py +++ b/src/digitalsreeni_image_annotator/dino_merge_dialog.py @@ -8,11 +8,12 @@ import math import os import random +import traceback from collections import defaultdict from pathlib import Path -from PyQt5.QtCore import Qt -from PyQt5.QtWidgets import ( +from PyQt6.QtCore import Qt +from PyQt6.QtWidgets import ( QApplication, QComboBox, QDialog, @@ -77,7 +78,7 @@ def _build_ui(self): layout.addWidget(info) form = QFormLayout() - form.setLabelAlignment(Qt.AlignRight) + form.setLabelAlignment(Qt.AlignmentFlag.AlignRight) def browse_row(placeholder, pick_dir=True): row = QHBoxLayout() @@ -316,4 +317,4 @@ def _build_coco(imgs): def show_dino_merge_dialog(parent=None): dialog = DinoMergeDialog(parent) - dialog.exec_() + dialog.exec() diff --git a/src/digitalsreeni_image_annotator/dino_phrase_editor.py b/src/digitalsreeni_image_annotator/dino_phrase_editor.py index dd2d9d9..73e931d 100644 --- a/src/digitalsreeni_image_annotator/dino_phrase_editor.py +++ b/src/digitalsreeni_image_annotator/dino_phrase_editor.py @@ -6,9 +6,9 @@ Ported from annotation_tool_v4.py and adapted for integration. """ -from PyQt5.QtCore import Qt -from PyQt5.QtGui import QColor -from PyQt5.QtWidgets import ( +from PyQt6.QtCore import Qt +from PyQt6.QtGui import QAction, QColor +from PyQt6.QtWidgets import ( QAbstractItemView, QDoubleSpinBox, QHeaderView, @@ -17,6 +17,7 @@ QLabel, QListWidget, QListWidgetItem, + QMenu, QMessageBox, QPushButton, QTableWidget, @@ -48,18 +49,25 @@ def __init__(self, parent=None): self.setHorizontalHeaderLabels( ["Class", "Box thr", "Txt thr", "NMS thr"]) self.horizontalHeader().setSectionResizeMode( - _COL_NAME, QHeaderView.Stretch) + _COL_NAME, QHeaderView.ResizeMode.Stretch) + # Fixed-width threshold columns — wide enough for "0,99" plus + # spin arrows plus frame, with margin for macOS Retina font + # metrics (where 72px clipped the down-arrow on some setups). for col in (_COL_BOX, _COL_TXT, _COL_NMS): self.horizontalHeader().setSectionResizeMode( - col, QHeaderView.ResizeToContents) - self.setSelectionBehavior(QAbstractItemView.SelectRows) - self.setEditTriggers(QAbstractItemView.NoEditTriggers) + col, QHeaderView.ResizeMode.Fixed) + self.setColumnWidth(col, 88) + self.setSelectionBehavior(QAbstractItemView.SelectionBehavior.SelectRows) + self.setEditTriggers(QAbstractItemView.EditTrigger.NoEditTriggers) self.verticalHeader().setVisible(False) self.setMaximumHeight(160) + # No hardcoded background colors — pick them up from the active + # stylesheet so the table integrates with both light and dark + # mode. The earlier "background: #e0e0e0" produced a bright bar + # across the top of the panel in dark mode. self.setStyleSheet( "QTableWidget { font-size: 11px; }" - "QHeaderView::section { font-size: 11px; font-weight: bold; " - " background: #e0e0e0; padding: 2px; }" + "QHeaderView::section { font-size: 11px; font-weight: bold; padding: 2px; }" ) def _make_spin(self, value=0.25): @@ -68,7 +76,7 @@ def _make_spin(self, value=0.25): sp.setSingleStep(0.05) sp.setDecimals(2) sp.setValue(value) - sp.setFrame(False) + sp.setFrame(True) sp.setStyleSheet("font-size: 11px;") return sp @@ -179,6 +187,8 @@ def __init__(self, parent=None): self.phrase_list = QListWidget() self.phrase_list.setMaximumHeight(90) self.phrase_list.setStyleSheet("font-size: 11px;") + self.phrase_list.setContextMenuPolicy(Qt.ContextMenuPolicy.CustomContextMenu) + self.phrase_list.customContextMenuRequested.connect(self._show_phrase_context_menu) layout.addWidget(self.phrase_list) btn_row = QHBoxLayout() @@ -214,7 +224,12 @@ def _refresh_list(self): self.phrase_list.clear() if self._active_class is None: return - for i, phrase in enumerate(self._phrases[self._active_class]): + # Defensive .get(): set_phrases() replaces self._phrases + # wholesale. The danger is cross-project state carryover — + # Project A leaves a row selected (_active_class != None), then + # Project B loads via set_phrases(kept) where `kept` doesn't + # contain that class. KeyError on _phrases[active] was a P1. + for i, phrase in enumerate(self._phrases.get(self._active_class, [])): item = QListWidgetItem(phrase) if i == 0: item.setForeground(QColor("#2E75B6")) @@ -251,6 +266,57 @@ def _remove_phrase(self): self._phrases[self._active_class].pop(row) self._refresh_list() + def _show_phrase_context_menu(self, position): + """Right-click menu on a phrase row. Rename is allowed for every + row including row 0 (the class-name phrase); delete is still + locked for row 0 — handled in _remove_phrase, not here. + """ + if self._active_class is None: + return + item = self.phrase_list.itemAt(position) + if item is None: + return + row = self.phrase_list.row(item) + + menu = QMenu(self) + rename_action = QAction("Rename Phrase", self) + rename_action.triggered.connect(lambda: self._rename_phrase(row)) + menu.addAction(rename_action) + menu.exec(self.phrase_list.mapToGlobal(position)) + + def _rename_phrase(self, row: int): + """Prompt for a new phrase text and replace the row. Mirrors the + class-rename flow in annotator_window: validate non-empty, strip, + reject duplicates within the same class. Row 0 may be renamed — + it just can't be removed. + """ + if self._active_class is None or row < 0: + return + phrases = self._phrases.get(self._active_class, []) + if row >= len(phrases): + return + current = phrases[row] + text, ok = QInputDialog.getText( + self, "Rename Phrase", + f'New text for "{current}":', + text=current, + ) + if not (ok and text.strip()): + return + new_phrase = text.strip().rstrip(".") + if new_phrase == current: + return + existing_lower = [p.lower() for i, p in enumerate(phrases) if i != row] + if new_phrase.lower() in existing_lower: + QMessageBox.information(self, "Duplicate", + "That phrase already exists for this class.") + return + phrases[row] = new_phrase + self._refresh_list() + # Restore selection on the renamed row so a follow-up rename + # works without re-clicking. + self.phrase_list.setCurrentRow(row) + def on_class_added(self, class_name: str): if class_name not in self._phrases: self._phrases[class_name] = [class_name] @@ -261,10 +327,14 @@ def on_class_removed(self, class_name: str): self.set_active_class(None) def get_phrases_for(self, class_name: str) -> list[str]: - phrases = self._phrases.get(class_name, [class_name]) - if class_name not in phrases: - phrases = [class_name] + phrases - return phrases + # Return the user-edited phrase list as-is. The class-name + # phrase (row 0) was historically auto-prepended here as a + # safety net, but that defeated the row-0 rename feature: the + # user would rename "cell" → "small green blob" and DINO would + # still receive ["cell", "small green blob"] because the + # original was re-injected. Trust the editor's state. Fall back + # to a single-phrase list only when nothing was ever stored. + return list(self._phrases.get(class_name, [class_name])) def get_all_phrases(self) -> dict[str, list[str]]: return dict(self._phrases) diff --git a/src/digitalsreeni_image_annotator/dino_utils.py b/src/digitalsreeni_image_annotator/dino_utils.py index 248432b..b6663f3 100644 --- a/src/digitalsreeni_image_annotator/dino_utils.py +++ b/src/digitalsreeni_image_annotator/dino_utils.py @@ -1,21 +1,30 @@ """ -Grounding DINO utilities --- delegates to an isolated subprocess. +Grounding DINO utilities — runs HF Transformers DINO in-process. -On Windows + Python 3.14, loading PyTorch after PyQt5 causes -WinError 1114. Running DINO in a clean subprocess avoids the issue. +History +------- +Earlier versions delegated to a ``dino_worker.py`` subprocess to dodge +the same Windows + Python 3.14 + PyQt5 DLL conflict that motivated the +SAM worker (the now-superseded ADR-011). With PyQt6 in place (ADR-014) +the conflict is gone and we run inference directly — saves a process +spawn per call and lets the model stay resident on its compute device. + +Threading model +--------------- +Same as ``sam_utils.SAMUtils``: inference runs on a worker thread; the +caller's thread pumps its event loop while waiting via ``_run_sync``. +torch + transformers are imported lazily on first detect call. """ -import json +from __future__ import annotations + import os -import subprocess -import sys -import tempfile -import traceback from pathlib import Path -from PyQt5.QtGui import QImage +from PyQt6.QtCore import QObject, pyqtSignal +from PyQt6.QtGui import QImage -from .sam_utils import _qimage_to_numpy +from .sam_utils import _qimage_to_numpy, _run_sync from .utils import models_base_dir @@ -24,14 +33,20 @@ "grounding-dino-tiny", ] +# Area filter: discard DINO boxes that cover more than this fraction of +# the image. Catches degenerate "whole image" detections from generic +# phrases ("object", "thing", etc.). Same value used in the old worker. +MAX_AREA_FRAC = 0.70 + +# Default IoU threshold for cross-class NMS across all classes. +DEFAULT_CROSS_CLASS_NMS_THR = 0.50 + def _gdino_local_path(model_name: str) -> str: """Canonical local install path for a Grounding DINO model.""" return os.path.join(models_base_dir(), model_name) -# Kept for backwards compatibility / external callers. Computed lazily via -# the helper so it always agrees with sam_worker / annotator_window. GDINO_MODEL_PATHS = { "grounding-dino-base": _gdino_local_path("grounding-dino-base"), "grounding-dino-tiny": _gdino_local_path("grounding-dino-tiny"), @@ -43,129 +58,283 @@ def _gdino_local_path(model_name: str) -> str: } -class DINOUtils: - """Thin wrapper that forwards DINO work to a subprocess worker.""" +class DINOUtils(QObject): + """In-process Grounding DINO wrapper with a cached model.""" + + model_changed = pyqtSignal(str) def __init__(self): - self._worker_script = os.path.join( - os.path.dirname(os.path.abspath(__file__)), "dino_worker.py" - ) + super().__init__() + self._proc = None # AutoProcessor instance + self._model = None # AutoModelForZeroShotObjectDetection + self._loaded_model_path: str | None = None + self._device: str | None = None # set on first load + + # ── model lifecycle ─────────────────────────────────────────────── - def _send_request(self, request: dict) -> dict: - """Spawn the DINO worker, send JSON, and return parsed response.""" - env = os.environ.copy() - for possible in ("VIRTUAL_ENV", "CONDA_PREFIX"): - v = os.environ.get(possible) - if v: - env[possible] = v - break - # Force the worker to write UTF-8 so cp1252 (Windows) doesn't choke - # on non-ASCII bytes from torch/transformers warnings. - env["PYTHONIOENCODING"] = "utf-8" - - proc = subprocess.run( - [sys.executable, self._worker_script], - input=json.dumps(request) + "\n", - capture_output=True, - text=True, - encoding="utf-8", - errors="replace", - env=env, + def _resolve_device(self) -> str: + """Pick CUDA if available; honour DINO_DEVICE env override.""" + env = os.environ.get("DINO_DEVICE") + if env: + return env + try: + import torch + return "cuda" if torch.cuda.is_available() else "cpu" + except Exception: + return "cpu" + + def _load_model_blocking(self, model_path: str) -> None: + """Load (cache) the Grounding DINO model for ``model_path``.""" + # Lazy imports so app startup doesn't pay the torch+transformers + # tax for users who never run detection. + from transformers import ( + AutoModelForZeroShotObjectDetection, + AutoProcessor, ) - if proc.returncode != 0: - err_text = proc.stderr.strip() if proc.stderr else "(no stderr)" - raise RuntimeError( - f"DINO worker exited with code {proc.returncode}.\nstderr: {err_text}" - ) + device = self._resolve_device() + print(f"[DINO] Loading from {model_path} on {device} ...") + if not Path(model_path).exists(): + print(f"[DINO] Local path missing; will attempt HF hub download.") - # Echo worker stdout (includes device diagnostics) to parent console - lines = (proc.stdout or "").strip().splitlines() - for line in lines[:-1]: - print(line) + proc = AutoProcessor.from_pretrained(model_path) + model = AutoModelForZeroShotObjectDetection.from_pretrained(model_path) + model.eval().to(device) - try: - return json.loads(lines[-1]) - except (json.JSONDecodeError, IndexError): - out_text = proc.stdout.strip() if proc.stdout else "(no stdout)" - raise RuntimeError( - f"DINO worker returned non-JSON output.\nstdout: {out_text}" - ) - - @staticmethod - def _save_image_temp(image: QImage) -> str: - """Convert QImage to a temporary file and return the path.""" - arr = _qimage_to_numpy(image) - tmp = tempfile.NamedTemporaryFile(suffix=".png", delete=False) - from PIL import Image as PILImage - PILImage.fromarray(arr).save(tmp.name) - tmp.close() - return tmp.name + self._proc = proc + self._model = model + self._loaded_model_path = model_path + self._device = device + print("[DINO] Model loaded successfully.") + + def unload(self) -> None: + """Drop the cached model so its GPU/CPU memory comes back. - def detect(self, image, class_configs, model_name="grounding-dino-base", - custom_model_path=None): + Recipe mirrors ``SAMUtils.unload`` — see that docstring for the + full justification. Caveat: PyTorch keeps a per-process CUDA + context that survives unload (~200-500 MB residual). Full + reclaim requires restarting the app. """ - Run text-prompted detection. - - Parameters - ---------- - image : QImage - The image to detect objects in. - class_configs : list[dict] - Each dict: {"name": str, "phrases": [str], "box_thr": float, - "txt_thr": float, "nms_thr": float} - model_name : str - One of GDINO_MODEL_NAMES, or "custom". - custom_model_path : str | None - Local path for custom/fine-tuned model. - - Returns - ------- - list[dict] | None - Each dict: {"class_name", "bbox": [x1,y1,x2,y2], "score", "label"} - Returns None on error. + import gc + try: + if self._model is not None: + self._model.cpu() + except Exception as e: + print(f"[DINO] unload: warning moving model to CPU: {e}") + self._proc = None + self._model = None + self._loaded_model_path = None + self._device = None + gc.collect() + try: + import torch + if torch.cuda.is_available(): + torch.cuda.synchronize() + torch.cuda.empty_cache() + torch.cuda.ipc_collect() + except Exception: + pass + print("[DINO] unload complete") + + # ── inference ───────────────────────────────────────────────────── + + def detect( + self, + image: QImage, + class_configs: list[dict], + model_name: str = "grounding-dino-base", + custom_model_path: str | None = None, + cross_class_nms_thr: float | None = None, + ): + """Run text-prompted detection. Returns list of dicts: + + {"class_name": str, "bbox": [x1, y1, x2, y2], + "score": float, "label": str} + + Returns ``None`` on error (model resolution failure or runtime + exception). An empty list means "ran, no boxes survived + filtering". """ - model_path = custom_model_path - if model_path is None: - model_path = GDINO_MODEL_PATHS.get(model_name) + model_path = custom_model_path or GDINO_MODEL_PATHS.get(model_name) if model_path is None: print(f"Unknown DINO model: {model_name}") return None - # Both branches (preset and custom) now produce absolute paths: - # GDINO_MODEL_PATHS is built from models_base_dir(); the custom - # path comes from QFileDialog.getExistingDirectory which is always - # absolute. No further normalisation needed. + # Marshal to numpy on the calling thread, with the array + # already copied off the QImage buffer (see _qimage_to_numpy + # — the .copy() there is what actually makes this safe). The + # worker thread then operates on memory it fully owns. + image_np = _qimage_to_numpy(image) - tmp_path = None - try: - tmp_path = self._save_image_temp(image) - request = { - "action": "detect", - "image_path": tmp_path, - "class_configs": class_configs, - "model_path": model_path, - } - result = self._send_request(request) - except Exception: - traceback.print_exc() - return None - finally: - if tmp_path: - try: - os.unlink(tmp_path) - except OSError: - pass - - if "error" in result: - print(f"DINO worker error: {result['error']}") - return None + return _run_sync( + self._detect_blocking, + image_np, + list(class_configs), + model_path, + cross_class_nms_thr, + ) + + def _detect_blocking( + self, + image_np, + class_configs: list[dict], + model_path: str, + cross_class_nms_thr: float | None, + ): + # We're already on a worker thread (called via _run_sync). Load + # the model directly here when needed — calling _run_sync from + # within would deadlock against the outer QEventLoop. Don't + # swallow load errors: let them propagate so _InferenceThread + # captures them and _run_sync re-raises to the caller, which + # can show a real error dialog instead of "No detections." + if self._loaded_model_path != model_path or self._model is None: + self._load_model_blocking(model_path) - return result.get("results", []) + import torch + from PIL import Image as PILImage + from torchvision.ops import nms + + image_pil = PILImage.fromarray(image_np).convert("RGB") + device = self._device or "cpu" + + all_boxes, all_scores, all_labels = [], [], [] + # Model lives on `device` permanently after the first load + # (set in _load_model_blocking). Earlier code shuffled it + # CPU↔GPU on every call, defeating the in-process caching + # win documented in ADR-013 — moving a 1.9 GB DINO base + # over PCIe costs hundreds of ms per call. unload() is + # the explicit way to free GPU memory when the user wants to. + for cfg in class_configs: + boxes, scores, labels = self._run_for_class(image_pil, cfg, device) + if len(boxes): + all_boxes.append(boxes) + all_scores.append(scores) + all_labels.extend(labels) + + if not all_boxes: + print('[DINO] total candidates pre-CCNMS: 0 (no class produced any boxes)') + return [] + + all_boxes = torch.cat(all_boxes, dim=0) + all_scores = torch.cat(all_scores, dim=0) + print(f'[DINO] total candidates pre-CCNMS: {len(all_boxes)}') + + # Cross-class NMS — drop boxes that overlap heavily across + # classes so the user doesn't get two near-identical masks + # for one object. + cc_thr = ( + cross_class_nms_thr + if cross_class_nms_thr is not None + else DEFAULT_CROSS_CLASS_NMS_THR + ) + cross_keep = nms(all_boxes, all_scores, cc_thr).tolist() + all_boxes = all_boxes[cross_keep] + all_scores = all_scores[cross_keep] + all_labels = [all_labels[i] for i in cross_keep] + print( + f'[DINO] cross-class NMS (iou={cc_thr:.2f}): ' + f'{len(cross_keep)} survivor(s)' + ) + + results = [] + for i in range(len(all_boxes)): + box = all_boxes[i].numpy().tolist() + results.append({ + "class_name": all_labels[i], + "bbox": [float(v) for v in box], + "score": float(all_scores[i].item()), + "label": all_labels[i], + }) + print(f'[DINO] detect() returning {len(results)} result(s)') + return results + + def _run_for_class(self, image_pil, class_cfg, device): + """Single DINO inference for one class. Returns (boxes, scores, labels).""" + import torch + from torchvision.ops import nms + + # Use the phrases provided by the caller verbatim. The earlier + # auto-prepend of class_cfg["name"] silently overrode any + # rename of row-0 in the phrase editor (see ADR-015 area + arc42 + # DINO Temp Annotations section). If the user emptied phrases + # entirely, fall back to the class name as the single prompt. + phrases = list(class_cfg.get("phrases") or [class_cfg["name"]]) + clean_phrases = [p.strip().rstrip(".") for p in phrases if p.strip()] + if not clean_phrases: + clean_phrases = [class_cfg["name"]] + prompt = " . ".join(clean_phrases) + " ." + + box_thr = class_cfg.get("box_thr", 0.25) + txt_thr = class_cfg.get("txt_thr", 0.25) + nms_thr = class_cfg.get("nms_thr", 0.50) + + print( + f'[DINO] Class: "{class_cfg["name"]}" ' + f'({len(clean_phrases)} phrase(s), ' + f'box={box_thr:.2f} txt={txt_thr:.2f} nms={nms_thr:.2f})' + ) + + inputs = self._proc( + images=image_pil, + text=prompt, + return_tensors="pt", + ).to(device) + + with torch.no_grad(): + outputs = self._model(**inputs) + + det = self._proc.post_process_grounded_object_detection( + outputs, + inputs.input_ids, + threshold=box_thr, + text_threshold=txt_thr, + target_sizes=[image_pil.size[::-1]], + )[0] + + boxes = det["boxes"].cpu() + scores = det["scores"].cpu() + raw_labels = det.get("text_labels", det.get("labels", [])) + + top_scores = [float(s) for s in scores[:5].tolist()] if len(scores) else [] + print( + f'[DINO] post_process: {len(boxes)} raw box(es), ' + f'top scores={top_scores}' + ) + + if len(boxes) == 0: + return torch.zeros((0, 4)), torch.zeros(0), [] + + # Area filter + iw, ih = image_pil.size + area = iw * ih + keep = [ + i for i, b in enumerate(boxes) + if ((b[2] - b[0]) * (b[3] - b[1])).item() / area < MAX_AREA_FRAC + ] + print(f'[DINO] after area filter (<{MAX_AREA_FRAC} of image): {len(keep)} kept') + if not keep: + return torch.zeros((0, 4)), torch.zeros(0), [] + + boxes = boxes[keep] + scores = scores[keep] + raw_labels = [raw_labels[i] for i in keep] + + # Per-class NMS + keep2 = nms(boxes, scores, nms_thr).tolist() + print(f'[DINO] after per-class NMS (iou={nms_thr:.2f}): {len(keep2)} kept') + boxes = boxes[keep2] + scores = scores[keep2] + raw_labels = [raw_labels[i] for i in keep2] + + # Override DINO's free-text labels to our canonical class name + norm_labels = [class_cfg["name"]] * len(raw_labels) + return boxes, scores, norm_labels + + # ── model download ──────────────────────────────────────────────── def download_model(self, model_name: str): - """ - Download model from Hugging Face Hub into the canonical local path. + """Download model from Hugging Face Hub into the canonical local path. + Returns the absolute local path on success, or None on error. """ try: diff --git a/src/digitalsreeni_image_annotator/dino_worker.py b/src/digitalsreeni_image_annotator/dino_worker.py deleted file mode 100644 index 31a26ca..0000000 --- a/src/digitalsreeni_image_annotator/dino_worker.py +++ /dev/null @@ -1,231 +0,0 @@ -""" -Standalone Grounding DINO worker --- runs in an isolated subprocess. - -No PyQt5 imports. Loads torch/transformers in a clean process. -Communication: stdin -> JSON request, stdout -> JSON response. -""" - -from __future__ import annotations - -import json -import os -import sys -import traceback -from pathlib import Path - -import numpy as np -import torch -from PIL import Image -from torchvision.ops import nms - -# --- constants -------------------------------------------------------------- - -DEVICE = "cuda" if torch.cuda.is_available() else "cpu" -CPU_DEVICE = "cpu" -MAX_AREA_FRAC = 0.70 -DEFAULT_CROSS_CLASS_NMS_THR = 0.50 - -# cached models (loaded once per worker lifetime) -_gdino_proc = None -_gdino_model = None -_loaded_model_path = None - - -# --- helpers ---------------------------------------------------------------- - -def _log(msg: str): - print(f"[DINO] {msg}", flush=True) - - -def _load_models(model_path: str): - """Load (cache) Grounding DINO model.""" - global _gdino_proc, _gdino_model, _loaded_model_path - - if _loaded_model_path == model_path and _gdino_model is not None: - return _gdino_proc, _gdino_model - - from transformers import ( - AutoProcessor, - AutoModelForZeroShotObjectDetection, - ) - - _log(f"Loading Grounding DINO from {model_path} ...") - if not Path(model_path).exists(): - _log("Local path missing; will attempt HF hub download.") - - proc = AutoProcessor.from_pretrained(model_path) - model = AutoModelForZeroShotObjectDetection.from_pretrained(model_path) - model.eval().to(DEVICE) - - _gdino_proc = proc - _gdino_model = model - _loaded_model_path = model_path - _log("Model loaded successfully.") - return proc, model - - -def _run_dino_for_class(image_pil, class_cfg, gdino_proc, gdino_model): - """Single DINO inference for one class. Returns (boxes, scores, labels).""" - phrases = class_cfg.get("phrases", [class_cfg["name"]]) - if class_cfg["name"] not in phrases: - phrases = [class_cfg["name"]] + list(phrases) - - clean_phrases = [p.strip().rstrip(".") for p in phrases if p.strip()] - prompt = " . ".join(clean_phrases) + " ." - - box_thr = class_cfg.get("box_thr", 0.25) - txt_thr = class_cfg.get("txt_thr", 0.25) - nms_thr = class_cfg.get("nms_thr", 0.50) - - _log( - f' Class: "{class_cfg["name"]}" ' - f'({len(clean_phrases)} phrase(s), ' - f'box={box_thr:.2f} txt={txt_thr:.2f} nms={nms_thr:.2f})' - ) - - inputs = gdino_proc( - images=image_pil, - text=prompt, - return_tensors="pt", - ).to(DEVICE) - - with torch.no_grad(): - outputs = gdino_model(**inputs) - - det = gdino_proc.post_process_grounded_object_detection( - outputs, - inputs.input_ids, - threshold=box_thr, - text_threshold=txt_thr, - target_sizes=[image_pil.size[::-1]], - )[0] - - boxes = det["boxes"].cpu() - scores = det["scores"].cpu() - raw_labels = det.get("text_labels", det.get("labels", [])) - - if len(boxes) == 0: - return torch.zeros((0, 4)), torch.zeros(0), [] - - # Area filter: discard boxes covering > MAX_AREA_FRAC of image - iw, ih = image_pil.size - area = iw * ih - keep = [ - i for i, b in enumerate(boxes) - if ((b[2] - b[0]) * (b[3] - b[1])).item() / area < MAX_AREA_FRAC - ] - if not keep: - return torch.zeros((0, 4)), torch.zeros(0), [] - - boxes = boxes[keep] - scores = scores[keep] - raw_labels = [raw_labels[i] for i in keep] - - # Per-class NMS - keep2 = nms(boxes, scores, nms_thr).tolist() - boxes = boxes[keep2] - scores = scores[keep2] - raw_labels = [raw_labels[i] for i in keep2] - - # Single-class pass: override all labels to canonical name - norm_labels = [class_cfg["name"]] * len(raw_labels) - - return boxes, scores, norm_labels - - -def run_dino_detection(image_path: str, class_configs: list[dict], - model_path: str, cross_class_nms_thr: float | None = None) -> list[dict]: - """ - Run DINO detection. Returns list of: - {"class_name": str, "bbox": [x1, y1, x2, y2], "score": float, "label": str} - """ - gdino_proc, gdino_model = _load_models(model_path) - image_pil = Image.open(image_path).convert("RGB") - - all_boxes, all_scores, all_labels = [], [], [] - - gdino_model.to(DEVICE) - for cfg in class_configs: - boxes, scores, labels = _run_dino_for_class(image_pil, cfg, gdino_proc, gdino_model) - if len(boxes): - all_boxes.append(boxes) - all_scores.append(scores) - all_labels.extend(labels) - - gdino_model.to(CPU_DEVICE) - if DEVICE == "cuda": - torch.cuda.empty_cache() - - if not all_boxes: - return [] - - all_boxes = torch.cat(all_boxes, dim=0) - all_scores = torch.cat(all_scores, dim=0) - - # Cross-class NMS - cc_thr = cross_class_nms_thr if cross_class_nms_thr is not None else DEFAULT_CROSS_CLASS_NMS_THR - cross_keep = nms(all_boxes, all_scores, cc_thr).tolist() - all_boxes = all_boxes[cross_keep] - all_scores = all_scores[cross_keep] - all_labels = [all_labels[i] for i in cross_keep] - - results = [] - for i in range(len(all_boxes)): - box = all_boxes[i].numpy().tolist() - results.append({ - "class_name": all_labels[i], - "bbox": [float(v) for v in box], - "score": float(all_scores[i].item()), - "label": all_labels[i], - }) - - return results - - -# --- main ------------------------------------------------------------------- - -def main(): - raw = sys.stdin.read() - if not raw.strip(): - return - - try: - request = json.loads(raw) - except json.JSONDecodeError as exc: - print(json.dumps({"error": f"Invalid JSON: {exc}"})) - return - - action = request.get("action") - if action != "detect": - print(json.dumps({"error": f"Unknown action: {action}"})) - return - - image_path = request.get("image_path") - class_configs = request.get("class_configs", []) - model_path = request.get("model_path", "models/grounding-dino-base") - cc_nms = request.get("cross_class_nms_thr") - - if not image_path or not class_configs: - print(json.dumps({"error": "Missing image_path or class_configs."})) - return - - env_device = os.environ.get("DINO_DEVICE") - if env_device: - global DEVICE - DEVICE = env_device - _log(f"Using device override: {DEVICE}") - else: - _log(f"Using device: {DEVICE}") - - try: - results = run_dino_detection( - image_path, class_configs, model_path, - cross_class_nms_thr=cc_nms - ) - print(json.dumps({"results": results})) - except Exception: - print(json.dumps({"error": traceback.format_exc()})) - - -if __name__ == "__main__": - main() diff --git a/src/digitalsreeni_image_annotator/export_formats.py b/src/digitalsreeni_image_annotator/export_formats.py index 66e283d..fd98c11 100644 --- a/src/digitalsreeni_image_annotator/export_formats.py +++ b/src/digitalsreeni_image_annotator/export_formats.py @@ -1,5 +1,5 @@ import json -from PyQt5.QtGui import QImage +from PyQt6.QtGui import QImage from .utils import calculate_area, calculate_bbox import yaml import os @@ -184,10 +184,17 @@ def export_yolo_v4(all_annotations, class_mapping, image_paths, slices, image_sl qimage.save(save_path) img_width, img_height = qimage.width(), qimage.height() else: - # Handle regular images - image_path = next((path for name, path in image_paths.items() if image_name in name), None) + # Handle regular images. Exact key match first; substring + # fallback (the original behaviour) is fragile when one image + # name is a prefix of another. + image_path = image_paths.get(image_name) + if image_path is None: + image_path = next( + (path for name, path in image_paths.items() if image_name in name), + None, + ) if not image_path or image_path.lower().endswith(('.tif', '.tiff', '.czi')): - print(f"Skipping file: {image_name}") + print(f"[YOLO v4] skipping {image_name!r}: no image path / TIFF source") continue file_name_img = image_name dst_path = os.path.join(images_dir, file_name_img) @@ -200,13 +207,16 @@ def export_yolo_v4(all_annotations, class_mapping, image_paths, slices, image_sl label_file = os.path.splitext(file_name_img)[0] + '.txt' with open(os.path.join(labels_dir, label_file), 'w') as f: for class_name, class_annotations in annotations.items(): + if class_name not in class_to_index: + print(f"[YOLO v4] warning: class {class_name!r} not in class_mapping, skipped") + continue class_index = class_to_index[class_name] for ann in class_annotations: - if 'segmentation' in ann: + if 'segmentation' in ann and ann['segmentation']: polygon = ann['segmentation'] normalized_polygon = [coord / img_width if i % 2 == 0 else coord / img_height for i, coord in enumerate(polygon)] f.write(f"{class_index} " + " ".join(map(lambda x: f"{x:.6f}", normalized_polygon)) + "\n") - elif 'bbox' in ann: + elif 'bbox' in ann and ann['bbox']: x, y, w, h = ann['bbox'] x_center = (x + w/2) / img_width y_center = (y + h/2) / img_height @@ -261,9 +271,16 @@ def export_yolo_v5plus(all_annotations, class_mapping, image_paths, slices, imag # Create a mapping of slice names to their QImage objects slice_map = {slice_name: qimage for slice_name, qimage in slices} + print(f"[YOLO v5+] export: {len(all_annotations)} image entries, " + f"{len(image_paths)} known image paths, " + f"{len(class_to_index)} class(es) → {list(class_to_index.keys())}") + + label_files_written = 0 for image_name, annotations in all_annotations.items(): + print(f"[YOLO v5+] image={image_name!r} annotation-classes={list(annotations.keys()) if annotations else '(none)'}") # Skip if there are no annotations for this image/slice if not annotations: + print(f"[YOLO v5+] skipping: no annotations") continue # For simplicity, we'll put all data in the train directory @@ -281,7 +298,7 @@ def export_yolo_v5plus(all_annotations, class_mapping, image_paths, slices, imag if qimage: break if qimage is None: - print(f"No image data found for slice {image_name}, skipping") + print(f"[YOLO v5+] skipping: no image data for slice {image_name}") continue file_name_img = f"{image_name}.png" save_path = os.path.join(images_dir, file_name_img) @@ -289,36 +306,59 @@ def export_yolo_v5plus(all_annotations, class_mapping, image_paths, slices, imag qimage.save(save_path) img_width, img_height = qimage.width(), qimage.height() else: - # Handle regular images - image_path = next((path for name, path in image_paths.items() if image_name in name), None) - if not image_path or image_path.lower().endswith(('.tif', '.tiff', '.czi')): - print(f"Skipping file: {image_name}") + # Handle regular images. Use exact-key match first; only fall + # back to substring match if no exact key is found (substring + # match was the original behaviour but it produces wrong hits + # when one image name is a prefix of another). + image_path = image_paths.get(image_name) + if image_path is None: + image_path = next( + (path for name, path in image_paths.items() if image_name in name), + None, + ) + if not image_path: + print(f"[YOLO v5+] skipping: no image_paths entry for {image_name!r}") + continue + if image_path.lower().endswith(('.tif', '.tiff', '.czi')): + print(f"[YOLO v5+] skipping: TIFF/CZI source {image_name!r} (use slice export)") continue file_name_img = image_name dst_path = os.path.join(images_dir, file_name_img) if not os.path.exists(dst_path): shutil.copy2(image_path, dst_path) + print(f"[YOLO v5+] copied image → {dst_path}") img = QImage(image_path) img_width, img_height = img.width(), img.height() # Write YOLO format annotation label_file = os.path.splitext(file_name_img)[0] + '.txt' - with open(os.path.join(labels_dir, label_file), 'w') as f: + label_path = os.path.join(labels_dir, label_file) + ann_lines = 0 + with open(label_path, 'w') as f: for class_name, class_annotations in annotations.items(): + if class_name not in class_to_index: + print(f"[YOLO v5+] warning: class {class_name!r} not in class_mapping, skipped") + continue class_index = class_to_index[class_name] for ann in class_annotations: - if 'segmentation' in ann: + if 'segmentation' in ann and ann['segmentation']: polygon = ann['segmentation'] - normalized_polygon = [coord / img_width if i % 2 == 0 else coord / img_height + normalized_polygon = [coord / img_width if i % 2 == 0 else coord / img_height for i, coord in enumerate(polygon)] f.write(f"{class_index} " + " ".join(map(lambda x: f"{x:.6f}", normalized_polygon)) + "\n") - elif 'bbox' in ann: + ann_lines += 1 + elif 'bbox' in ann and ann['bbox']: x, y, w, h = ann['bbox'] x_center = (x + w/2) / img_width y_center = (y + h/2) / img_height w = w / img_width h = h / img_height f.write(f"{class_index} {x_center:.6f} {y_center:.6f} {w:.6f} {h:.6f}\n") + ann_lines += 1 + print(f"[YOLO v5+] wrote {ann_lines} annotation line(s) → {label_path}") + label_files_written += 1 + + print(f"[YOLO v5+] export complete: {label_files_written} label file(s) written") # Create YAML file names = list(class_mapping.keys()) diff --git a/src/digitalsreeni_image_annotator/help_window.py b/src/digitalsreeni_image_annotator/help_window.py index ba5033e..24a0b01 100644 --- a/src/digitalsreeni_image_annotator/help_window.py +++ b/src/digitalsreeni_image_annotator/help_window.py @@ -1,5 +1,5 @@ -from PyQt5.QtWidgets import QDialog, QVBoxLayout, QTextBrowser -from PyQt5.QtCore import Qt +from PyQt6.QtWidgets import QDialog, QVBoxLayout, QTextBrowser +from PyQt6.QtCore import Qt from .soft_dark_stylesheet import soft_dark_stylesheet from .default_stylesheet import default_stylesheet diff --git a/src/digitalsreeni_image_annotator/image_augmenter.py b/src/digitalsreeni_image_annotator/image_augmenter.py index 114960d..e78c3b6 100644 --- a/src/digitalsreeni_image_annotator/image_augmenter.py +++ b/src/digitalsreeni_image_annotator/image_augmenter.py @@ -4,18 +4,18 @@ import numpy as np import json -from PyQt5.QtWidgets import (QDialog, QVBoxLayout, QHBoxLayout, QPushButton, +from PyQt6.QtWidgets import (QDialog, QVBoxLayout, QHBoxLayout, QPushButton, QFileDialog, QLabel, QMessageBox, QSpinBox, QCheckBox, QDoubleSpinBox, QProgressBar, QApplication) -from PyQt5.QtCore import Qt +from PyQt6.QtCore import Qt class ImageAugmenterDialog(QDialog): def __init__(self, parent=None): super().__init__(parent) self.setWindowTitle("Image Augmenter") self.setGeometry(100, 100, 400, 600) - self.setWindowFlags(self.windowFlags() | Qt.Window) - self.setWindowModality(Qt.ApplicationModal) + self.setWindowFlags(self.windowFlags() | Qt.WindowType.Window) + self.setWindowModality(Qt.WindowModality.ApplicationModal) self.input_dir = "" self.output_dir = "" self.coco_file = "" @@ -174,7 +174,7 @@ def select_coco_json(self): self.coco_check.setChecked(True) # Automatically check the box when a file is loaded def toggle_elastic_deformation(self, state): - if state == Qt.Checked: + if state == Qt.CheckState.Checked: self.elastic_check.setChecked(False) self.elastic_check.setEnabled(False) else: diff --git a/src/digitalsreeni_image_annotator/image_label.py b/src/digitalsreeni_image_annotator/image_label.py index 24c10c9..e22accf 100644 --- a/src/digitalsreeni_image_annotator/image_label.py +++ b/src/digitalsreeni_image_annotator/image_label.py @@ -14,8 +14,8 @@ import cv2 import numpy as np from PIL import Image -from PyQt5.QtCore import QPoint, QPointF, QRectF, QSize, Qt -from PyQt5.QtGui import ( +from PyQt6.QtCore import QPoint, QPointF, QRectF, QSize, Qt +from PyQt6.QtGui import ( QBrush, QColor, QFont, @@ -29,7 +29,7 @@ QPolygonF, QWheelEvent, ) -from PyQt5.QtWidgets import QApplication, QLabel, QMessageBox +from PyQt6.QtWidgets import QApplication, QLabel, QMessageBox warnings.filterwarnings("ignore", category=UserWarning) @@ -52,7 +52,7 @@ def __init__(self, parent=None): self.end_point = None self.highlighted_annotations = [] self.setMouseTracking(True) - self.setFocusPolicy(Qt.StrongFocus) + self.setFocusPolicy(Qt.FocusPolicy.StrongFocus) self.original_pixmap = None self.scaled_pixmap = None self.pan_start_pos = None @@ -131,8 +131,8 @@ def update_scaled_pixmap(self): self.scaled_pixmap = self.original_pixmap.scaled( scaled_size.width(), scaled_size.height(), - Qt.KeepAspectRatio, - Qt.SmoothTransformation, + Qt.AspectRatioMode.KeepAspectRatio, + Qt.TransformationMode.SmoothTransformation, ) super().setPixmap(self.scaled_pixmap) self.setMinimumSize(self.scaled_pixmap.size()) @@ -305,7 +305,7 @@ def paintEvent(self, event): super().paintEvent(event) if self.scaled_pixmap: painter = QPainter(self) - painter.setRenderHint(QPainter.Antialiasing) + painter.setRenderHint(QPainter.RenderHint.Antialiasing) # Draw the image painter.drawPixmap( int(self.offset_x), int(self.offset_y), self.scaled_pixmap @@ -328,12 +328,12 @@ def paintEvent(self, event): painter.translate(self.offset_x, self.offset_y) painter.scale(self.zoom_factor, self.zoom_factor) for pt in self.sam_positive_points: - painter.setPen(QPen(Qt.green, 6 / self.zoom_factor, Qt.SolidLine)) - painter.setBrush(QBrush(Qt.green)) + painter.setPen(QPen(Qt.GlobalColor.green, 6 / self.zoom_factor, Qt.PenStyle.SolidLine)) + painter.setBrush(QBrush(Qt.GlobalColor.green)) painter.drawEllipse(QPointF(pt[0], pt[1]), 4, 4) for pt in self.sam_negative_points: - painter.setPen(QPen(Qt.red, 6 / self.zoom_factor, Qt.SolidLine)) - painter.setBrush(QBrush(Qt.red)) + painter.setPen(QPen(Qt.GlobalColor.red, 6 / self.zoom_factor, Qt.PenStyle.SolidLine)) + painter.setBrush(QBrush(Qt.GlobalColor.red)) painter.drawEllipse(QPointF(pt[0], pt[1]), 4, 4) painter.restore() # Draw temporary paint mask @@ -353,7 +353,7 @@ def draw_temp_annotations(self, painter): for annotation in self.temp_annotations: color = QColor(255, 165, 0, 128) # Semi-transparent orange - painter.setPen(QPen(color, 2 / self.zoom_factor, Qt.DashLine)) + painter.setPen(QPen(color, 2 / self.zoom_factor, Qt.PenStyle.DashLine)) painter.setBrush(QBrush(color)) # Prefer segmentation polygon over bbox when both are present @@ -423,7 +423,7 @@ def draw_temp_paint_mask(self, painter): self.temp_paint_mask.shape[1], self.temp_paint_mask.shape[0], self.temp_paint_mask.shape[1], - QImage.Format_Grayscale8, + QImage.Format.Format_Grayscale8, ) mask_pixmap = QPixmap.fromImage(mask_image) painter.setOpacity(0.5) @@ -443,7 +443,7 @@ def draw_temp_eraser_mask(self, painter): self.temp_eraser_mask.shape[1], self.temp_eraser_mask.shape[0], self.temp_eraser_mask.shape[1], - QImage.Format_Grayscale8, + QImage.Format.Format_Grayscale8, ) mask_pixmap = QPixmap.fromImage(mask_image) painter.setOpacity(0.5) @@ -469,7 +469,7 @@ def draw_tool_size_indicator(self, painter): # Draw filled circle with lower opacity painter.setOpacity(0.3) - painter.setPen(Qt.NoPen) + painter.setPen(Qt.PenStyle.NoPen) painter.setBrush(color) painter.drawEllipse( QPointF(self.cursor_pos[0], self.cursor_pos[1]), size, size @@ -477,8 +477,8 @@ def draw_tool_size_indicator(self, painter): # Draw circle outline with full opacity painter.setOpacity(1.0) - painter.setPen(QPen(color.darker(150), 1 / self.zoom_factor, Qt.SolidLine)) - painter.setBrush(Qt.NoBrush) + painter.setPen(QPen(color.darker(150), 1 / self.zoom_factor, Qt.PenStyle.SolidLine)) + painter.setBrush(Qt.BrushStyle.NoBrush) painter.drawEllipse( QPointF(self.cursor_pos[0], self.cursor_pos[1]), size, size ) @@ -489,7 +489,7 @@ def draw_tool_size_indicator(self, painter): font = QFont() font.setPointSize(10) painter.setFont(font) - painter.setPen(QPen(Qt.black)) # Use black color for better visibility + painter.setPen(QPen(Qt.GlobalColor.black)) # Use black color for better visibility # Convert cursor position back to screen coordinates screen_x = self.cursor_pos[0] * self.zoom_factor + self.offset_x @@ -504,7 +504,7 @@ def draw_tool_size_indicator(self, painter): ) text = f"Size: {size}" - painter.drawText(text_rect, Qt.AlignLeft | Qt.AlignVCenter, text) + painter.drawText(text_rect, Qt.AlignmentFlag.AlignLeft | Qt.AlignmentFlag.AlignVCenter, text) painter.restore() @@ -515,7 +515,7 @@ def draw_paint_mask(self, painter): self.paint_mask.shape[1], self.paint_mask.shape[0], self.paint_mask.shape[1], - QImage.Format_Grayscale8, + QImage.Format.Format_Grayscale8, ) mask_pixmap = QPixmap.fromImage(mask_image) painter.setOpacity(0.5) @@ -533,7 +533,7 @@ def draw_eraser_mask(self, painter): self.eraser_mask.shape[1], self.eraser_mask.shape[0], self.eraser_mask.shape[1], - QImage.Format_Grayscale8, + QImage.Format.Format_Grayscale8, ) mask_pixmap = QPixmap.fromImage(mask_image) painter.setOpacity(0.5) @@ -548,7 +548,7 @@ def draw_sam_bbox(self, painter): painter.save() painter.translate(self.offset_x, self.offset_y) painter.scale(self.zoom_factor, self.zoom_factor) - painter.setPen(QPen(Qt.red, 2 / self.zoom_factor, Qt.SolidLine)) + painter.setPen(QPen(Qt.GlobalColor.red, 2 / self.zoom_factor, Qt.PenStyle.SolidLine)) x1, y1, x2, y2 = self.sam_bbox painter.drawRect(QRectF(min(x1, x2), min(y1, y2), abs(x2 - x1), abs(y2 - y1))) painter.restore() @@ -563,15 +563,15 @@ def check_unsaved_changes(self): self.main_window, "Unsaved Changes", "You have unsaved changes. Do you want to save them?", - QMessageBox.Yes | QMessageBox.No | QMessageBox.Cancel, + QMessageBox.StandardButton.Yes | QMessageBox.StandardButton.No | QMessageBox.StandardButton.Cancel, ) - if reply == QMessageBox.Yes: + if reply == QMessageBox.StandardButton.Yes: if self.temp_paint_mask is not None: self.commit_paint_annotation() if self.temp_eraser_mask is not None: self.commit_eraser_changes() return True - elif reply == QMessageBox.No: + elif reply == QMessageBox.StandardButton.No: self.discard_paint_annotation() self.discard_eraser_changes() return True @@ -614,19 +614,19 @@ def draw_annotations(self, painter): if not self.main_window.is_class_visible(class_name): continue - color = self.class_colors.get(class_name, QColor(Qt.white)) + color = self.class_colors.get(class_name, QColor(Qt.GlobalColor.white)) for annotation in class_annotations: if annotation in self.highlighted_annotations: - border_color = Qt.red - fill_color = QColor(Qt.red) + border_color = Qt.GlobalColor.red + fill_color = QColor(Qt.GlobalColor.red) else: border_color = color fill_color = QColor(color) fill_color.setAlphaF(self.fill_opacity) - text_color = Qt.white if self.dark_mode else Qt.black - painter.setPen(QPen(border_color, 2 / self.zoom_factor, Qt.SolidLine)) + text_color = Qt.GlobalColor.white if self.dark_mode else Qt.GlobalColor.black + painter.setPen(QPen(border_color, 2 / self.zoom_factor, Qt.PenStyle.SolidLine)) painter.setBrush(QBrush(fill_color)) if "segmentation" in annotation: @@ -656,7 +656,7 @@ def draw_annotations(self, painter): QFont("Arial", int(12 / self.zoom_factor)) ) painter.setPen( - QPen(text_color, 2 / self.zoom_factor, Qt.SolidLine) + QPen(text_color, 2 / self.zoom_factor, Qt.PenStyle.SolidLine) ) painter.drawText( centroid, @@ -666,13 +666,13 @@ def draw_annotations(self, painter): elif "bbox" in annotation: x, y, width, height = annotation["bbox"] painter.drawRect(QRectF(x, y, width, height)) - painter.setPen(QPen(text_color, 2 / self.zoom_factor, Qt.SolidLine)) + painter.setPen(QPen(text_color, 2 / self.zoom_factor, Qt.PenStyle.SolidLine)) painter.drawText( QPointF(x, y), f"{class_name} {annotation.get('number', '')}" ) if self.current_annotation: - painter.setPen(QPen(Qt.red, 2 / self.zoom_factor, Qt.SolidLine)) + painter.setPen(QPen(Qt.GlobalColor.red, 2 / self.zoom_factor, Qt.PenStyle.SolidLine)) points = [QPointF(float(x), float(y)) for x, y in self.current_annotation] if len(points) > 1: painter.drawPolyline(QPolygonF(points)) @@ -687,7 +687,7 @@ def draw_annotations(self, painter): # Draw temporary SAM prediction if self.temp_sam_prediction: temp_color = QColor(255, 165, 0, 128) # Semi-transparent orange - painter.setPen(QPen(temp_color, 2 / self.zoom_factor, Qt.DashLine)) + painter.setPen(QPen(temp_color, 2 / self.zoom_factor, Qt.PenStyle.DashLine)) painter.setBrush(QBrush(temp_color)) segmentation = self.temp_sam_prediction["segmentation"] @@ -716,8 +716,8 @@ def draw_current_rectangle(self, painter): painter.scale(self.zoom_factor, self.zoom_factor) x1, y1, x2, y2 = self.current_rectangle - color = self.class_colors.get(self.main_window.current_class, QColor(Qt.red)) - painter.setPen(QPen(color, 2 / self.zoom_factor, Qt.SolidLine)) + color = self.class_colors.get(self.main_window.current_class, QColor(Qt.GlobalColor.red)) + painter.setPen(QPen(color, 2 / self.zoom_factor, Qt.PenStyle.SolidLine)) painter.drawRect(QRectF(float(x1), float(y1), float(x2 - x1), float(y2 - y1))) painter.restore() @@ -744,12 +744,12 @@ def draw_editing_polygon(self, painter): ) ] color = self.class_colors.get( - self.editing_polygon["category_name"], QColor(Qt.white) + self.editing_polygon["category_name"], QColor(Qt.GlobalColor.white) ) fill_color = QColor(color) fill_color.setAlphaF(self.fill_opacity) - painter.setPen(QPen(color, 2 / self.zoom_factor, Qt.SolidLine)) + painter.setPen(QPen(color, 2 / self.zoom_factor, Qt.PenStyle.SolidLine)) painter.setBrush(QBrush(fill_color)) painter.drawPolygon(QPolygonF(points)) # Changed QPolygon to QPolygonF - Sreeni @@ -779,12 +779,45 @@ def set_zoom(self, zoom_factor): self.update() def wheelEvent(self, event: QWheelEvent): - if event.modifiers() == Qt.ControlModifier: + if event.modifiers() == Qt.KeyboardModifier.ControlModifier: + if not self.original_pixmap or not self.scaled_pixmap: + event.accept() + return + + cursor_widget_pos = event.position() + # Image-space coords of the pixel under the cursor BEFORE zoom. + img_x = (cursor_widget_pos.x() - self.offset_x) / self.zoom_factor + img_y = (cursor_widget_pos.y() - self.offset_y) / self.zoom_factor + + scrollbar_h = self.main_window.scroll_area.horizontalScrollBar() + scrollbar_v = self.main_window.scroll_area.verticalScrollBar() + old_scroll_h = scrollbar_h.value() + old_scroll_v = scrollbar_v.value() + delta = event.angleDelta().y() if delta > 0: self.main_window.zoom_in() else: self.main_window.zoom_out() + + # Compute the post-zoom offset analytically from the + # viewport size and the new scaled-pixmap size. Reading + # self.offset_x here is unreliable on zoom-OUT: setMinimumSize + # in update_scaled_pixmap only relaxes the minimum, so the + # widget hasn't shrunk yet when update_offset ran. self.width() + # is stale → offset_x is wrong → cursor drifts. The viewport + # width is always current. + viewport = self.main_window.scroll_area.viewport() + new_scaled_w = self.scaled_pixmap.width() + new_scaled_h = self.scaled_pixmap.height() + new_offset_x = max(0, (viewport.width() - new_scaled_w) / 2) + new_offset_y = max(0, (viewport.height() - new_scaled_h) / 2) + + new_widget_x = img_x * self.zoom_factor + new_offset_x + new_widget_y = img_y * self.zoom_factor + new_offset_y + scrollbar_h.setValue(int(round(new_widget_x - cursor_widget_pos.x() + old_scroll_h))) + scrollbar_v.setValue(int(round(new_widget_y - cursor_widget_pos.y() + old_scroll_v))) + event.accept() else: super().wheelEvent(event) @@ -792,26 +825,29 @@ def wheelEvent(self, event: QWheelEvent): def mousePressEvent(self, event: QMouseEvent): if not self.original_pixmap: return - if event.modifiers() == Qt.ControlModifier and event.button() == Qt.LeftButton: - self.pan_start_pos = event.pos() - self.setCursor(Qt.ClosedHandCursor) + if event.modifiers() == Qt.KeyboardModifier.ControlModifier and event.button() == Qt.MouseButton.LeftButton: + # Track pan in global (screen) coords so the reference frame + # doesn't shift when the scrollbar moves the widget under the + # cursor — previously caused effective half-speed pan. + self.pan_start_pos = event.globalPosition() + self.setCursor(Qt.CursorShape.ClosedHandCursor) event.accept() return - pos = self.get_image_coordinates(event.pos()) + pos = self.get_image_coordinates(event.position()) if self.current_tool == "sam_points" and self.sam_points_active: - if event.button() == Qt.LeftButton: + if event.button() == Qt.MouseButton.LeftButton: self.sam_positive_points.append(pos) self.update() self.main_window.schedule_sam_prediction() return - elif event.button() == Qt.RightButton: + elif event.button() == Qt.MouseButton.RightButton: self.sam_negative_points.append(pos) self.update() self.main_window.schedule_sam_prediction() return - if event.button() == Qt.LeftButton: + if event.button() == Qt.MouseButton.LeftButton: if self.current_tool == "sam_box" and self.sam_box_active: self.sam_bbox = [pos[0], pos[1], pos[0], pos[1]] self.drawing_sam_bbox = True @@ -839,15 +875,16 @@ def mousePressEvent(self, event: QMouseEvent): def mouseMoveEvent(self, event: QMouseEvent): if not self.original_pixmap: return - self.cursor_pos = self.get_image_coordinates(event.pos()) - if event.modifiers() == Qt.ControlModifier and event.buttons() == Qt.LeftButton: + self.cursor_pos = self.get_image_coordinates(event.position()) + if event.modifiers() == Qt.KeyboardModifier.ControlModifier and event.buttons() == Qt.MouseButton.LeftButton: if self.pan_start_pos: - delta = event.pos() - self.pan_start_pos + cur = event.globalPosition() + delta = cur - self.pan_start_pos scrollbar_h = self.main_window.scroll_area.horizontalScrollBar() scrollbar_v = self.main_window.scroll_area.verticalScrollBar() - scrollbar_h.setValue(scrollbar_h.value() - delta.x()) - scrollbar_v.setValue(scrollbar_v.value() - delta.y()) - self.pan_start_pos = event.pos() + scrollbar_h.setValue(scrollbar_h.value() - int(delta.x())) + scrollbar_v.setValue(scrollbar_v.value() - int(delta.y())) + self.pan_start_pos = cur event.accept() return @@ -874,22 +911,22 @@ def mouseMoveEvent(self, event: QMouseEvent): elif self.current_tool == "rectangle" and self.drawing_rectangle: self.end_point = pos self.current_rectangle = self.get_rectangle_from_points() - elif self.current_tool == "paint_brush" and event.buttons() == Qt.LeftButton: + elif self.current_tool == "paint_brush" and event.buttons() == Qt.MouseButton.LeftButton: self.continue_painting(pos) - elif self.current_tool == "eraser" and event.buttons() == Qt.LeftButton: + elif self.current_tool == "eraser" and event.buttons() == Qt.MouseButton.LeftButton: self.continue_erasing(pos) self.update() def mouseReleaseEvent(self, event: QMouseEvent): if not self.original_pixmap: return - if event.modifiers() == Qt.ControlModifier and event.button() == Qt.LeftButton: + if event.modifiers() == Qt.KeyboardModifier.ControlModifier and event.button() == Qt.MouseButton.LeftButton: self.pan_start_pos = None - self.setCursor(Qt.ArrowCursor) + self.setCursor(Qt.CursorShape.ArrowCursor) event.accept() else: - pos = self.get_image_coordinates(event.pos()) - if event.button() == Qt.LeftButton: + pos = self.get_image_coordinates(event.position()) + if event.button() == Qt.MouseButton.LeftButton: if ( self.sam_box_active and self.drawing_sam_bbox @@ -923,8 +960,8 @@ def mouseReleaseEvent(self, event: QMouseEvent): def mouseDoubleClickEvent(self, event): if not self.pixmap(): return - pos = self.get_image_coordinates(event.pos()) - if event.button() == Qt.LeftButton: + pos = self.get_image_coordinates(event.position()) + if event.button() == Qt.MouseButton.LeftButton: if self.drawing_polygon and len(self.current_annotation) > 2: self.finish_polygon() else: @@ -942,12 +979,12 @@ def get_image_coordinates(self, pos): return (int(x), int(y)) def keyPressEvent(self, event: QKeyEvent): - if event.key() == Qt.Key_Return or event.key() == Qt.Key_Enter: - if self.temp_annotations and any( - a.get("source") == "dino" for a in self.temp_annotations - ): - self.main_window.accept_dino_results() - elif self.temp_annotations: + if event.key() == Qt.Key.Key_Return or event.key() == Qt.Key.Key_Enter: + # DINO temp_annotations are accepted via the application-wide + # _DINOReviewEventFilter (see ADR-015) so Enter works regardless + # of focus. The branch below only catches non-DINO temp state + # (legacy YOLO model-prediction review path). + if self.temp_annotations: self.accept_temp_annotations() elif self.temp_sam_prediction: self.main_window.accept_sam_prediction() @@ -965,17 +1002,16 @@ def keyPressEvent(self, event: QKeyEvent): self.commit_eraser_changes() else: self.finish_current_annotation() - elif event.key() == Qt.Key_Escape: + elif event.key() == Qt.Key.Key_Escape: if self.sam_points_active: self.main_window.sam_inference_timer.stop() self.sam_positive_points = [] self.sam_negative_points = [] self.clear_temp_sam_prediction() self.update() - elif self.temp_annotations and any( - a.get("source") == "dino" for a in self.temp_annotations - ): - self.main_window.reject_dino_results() + # DINO temp_annotations are rejected via the application-wide + # _DINOReviewEventFilter (see ADR-015). Branch below catches + # non-DINO temp state only. elif self.temp_annotations: self.discard_temp_annotations() elif self.sam_magic_wand_active: @@ -992,7 +1028,7 @@ def keyPressEvent(self, event: QKeyEvent): self.discard_eraser_changes() else: self.cancel_current_annotation() - elif event.key() == Qt.Key_Delete: + elif event.key() == Qt.Key.Key_Delete: if self.editing_polygon: self.main_window.delete_selected_annotations() self.editing_polygon = None @@ -1000,7 +1036,7 @@ def keyPressEvent(self, event: QKeyEvent): self.hover_point_index = None self.main_window.enable_tools() self.update() - elif event.key() == Qt.Key_Minus: + elif event.key() == Qt.Key.Key_Minus: if self.current_tool == "paint_brush": self.main_window.paint_brush_size = max( 1, self.main_window.paint_brush_size - 1 @@ -1009,7 +1045,7 @@ def keyPressEvent(self, event: QKeyEvent): elif self.current_tool == "eraser": self.main_window.eraser_size = max(1, self.main_window.eraser_size - 1) print(f"Eraser size: {self.main_window.eraser_size}") - elif event.key() == Qt.Key_Equal: + elif event.key() == Qt.Key.Key_Equal: if self.current_tool == "paint_brush": self.main_window.paint_brush_size += 1 print(f"Paint brush size: {self.main_window.paint_brush_size}") @@ -1069,7 +1105,7 @@ def handle_editing_click(self, pos, event): ] for i, point in enumerate(points): if self.distance(pos, point) < 10 / self.zoom_factor: - if event.modifiers() & Qt.ShiftModifier: + if event.modifiers() & Qt.KeyboardModifier.ShiftModifier: # Delete point del self.editing_polygon["segmentation"][i * 2 : i * 2 + 2] else: diff --git a/src/digitalsreeni_image_annotator/image_patcher.py b/src/digitalsreeni_image_annotator/image_patcher.py index 86da55f..3585a22 100644 --- a/src/digitalsreeni_image_annotator/image_patcher.py +++ b/src/digitalsreeni_image_annotator/image_patcher.py @@ -1,12 +1,12 @@ import os import numpy as np -from PyQt5.QtWidgets import (QDialog, QVBoxLayout, QHBoxLayout, QPushButton, QLabel, QFileDialog, +from PyQt6.QtWidgets import (QDialog, QVBoxLayout, QHBoxLayout, QPushButton, QLabel, QFileDialog, QSpinBox, QProgressBar, QMessageBox, QListWidget, QDialogButtonBox, QGridLayout, QComboBox, QApplication, QScrollArea, QWidget) -from PyQt5.QtCore import Qt, QThread, pyqtSignal -from PyQt5.QtCore import QTimer, QEventLoop +from PyQt6.QtCore import Qt, QThread, pyqtSignal +from PyQt6.QtCore import QTimer, QEventLoop from tifffile import TiffFile, imwrite from PIL import Image import traceback @@ -37,7 +37,7 @@ def initUI(self): self.combos.append(combo) layout.addLayout(grid_layout) - self.button_box = QDialogButtonBox(QDialogButtonBox.Ok | QDialogButtonBox.Cancel) + self.button_box = QDialogButtonBox(QDialogButtonBox.StandardButton.Ok | QDialogButtonBox.StandardButton.Cancel) self.button_box.accepted.connect(self.accept) self.button_box.rejected.connect(self.reject) layout.addWidget(self.button_box) @@ -123,7 +123,7 @@ def save_patches(self, image, base_name, extension): class ImagePatcherTool(QDialog): def __init__(self, parent=None): super().__init__(parent) - self.setWindowModality(Qt.ApplicationModal) + self.setWindowModality(Qt.WindowModality.ApplicationModal) self.dimensions = {} self.input_files = [] self.output_dir = "" @@ -188,7 +188,7 @@ def initUI(self): # Add the patch info label to the container self.patch_info_label = QLabel() - self.patch_info_label.setAlignment(Qt.AlignLeft | Qt.AlignTop) + self.patch_info_label.setAlignment(Qt.AlignmentFlag.AlignLeft | Qt.AlignmentFlag.AlignTop) patch_info_layout.addWidget(self.patch_info_label) # Set the container as the scroll area's widget @@ -238,9 +238,9 @@ def check_tiff_dimensions(self, file_path): if images.ndim > 2: file_name = os.path.basename(file_path) dialog = DimensionDialog(images.shape, file_name, self) - dialog.setWindowModality(Qt.ApplicationModal) - result = dialog.exec_() - if result == QDialog.Accepted: + dialog.setWindowModality(Qt.WindowModality.ApplicationModal) + result = dialog.exec() + if result == QDialog.DialogCode.Accepted: dimensions = dialog.get_dimensions() if 'H' in dimensions and 'W' in dimensions: self.dimensions[file_path] = dimensions @@ -282,10 +282,10 @@ def start_patching(self): def get_dimensions(self, shape, file_name): dialog = DimensionDialog(shape, file_name, self) - dialog.setWindowModality(Qt.ApplicationModal) - result = dialog.exec_() + dialog.setWindowModality(Qt.WindowModality.ApplicationModal) + result = dialog.exec() - if result == QDialog.Accepted: + if result == QDialog.DialogCode.Accepted: dimensions = dialog.get_dimensions() if 'H' in dimensions and 'W' in dimensions: self.dimensions[file_name] = dimensions diff --git a/src/digitalsreeni_image_annotator/import_formats.py b/src/digitalsreeni_image_annotator/import_formats.py index 5fc80e2..6de3669 100644 --- a/src/digitalsreeni_image_annotator/import_formats.py +++ b/src/digitalsreeni_image_annotator/import_formats.py @@ -4,13 +4,13 @@ import yaml from PIL import Image -from PyQt5.QtCore import QRectF -from PyQt5.QtGui import QColor -from PyQt5.QtWidgets import QMessageBox, QFileDialog +from PyQt6.QtCore import QRectF +from PyQt6.QtGui import QColor +from PyQt6.QtWidgets import QMessageBox, QFileDialog import os import json -from PyQt5.QtWidgets import QMessageBox +from PyQt6.QtWidgets import QMessageBox def import_coco_json(file_path, class_mapping): try: @@ -241,9 +241,9 @@ def import_yolo_v4(yaml_file_path, class_mapping): message += "Do you want to continue importing the remaining data?" reply = QMessageBox.question(None, "Import Issues", message, - QMessageBox.Yes | QMessageBox.No, QMessageBox.No) + QMessageBox.StandardButton.Yes | QMessageBox.StandardButton.No, QMessageBox.StandardButton.No) - if reply == QMessageBox.No: + if reply == QMessageBox.StandardButton.No: raise ValueError("Import cancelled due to missing files.") return imported_annotations, image_info diff --git a/src/digitalsreeni_image_annotator/main.py b/src/digitalsreeni_image_annotator/main.py index 1c2c890..83b187f 100644 --- a/src/digitalsreeni_image_annotator/main.py +++ b/src/digitalsreeni_image_annotator/main.py @@ -9,12 +9,13 @@ import sys import os -from PyQt5.QtWidgets import QApplication +from PyQt6.QtWidgets import QApplication from .annotator_window import ImageAnnotator -# To address Linux errors, by removing the QT_QPA_PLATFORM_PLUGIN_PATH -# environment variable on Linux systems, which allows the application -# to use the system's Qt platform plugins instead of potentially conflicting ones +# Legacy defensive cleanup from the PyQt5 era: a stale +# QT_QPA_PLATFORM_PLUGIN_PATH could shadow Qt's bundled XCB plugin and +# break startup on Linux. PyQt6 packaging is more robust about this, but +# the pop is cheap and harmless to keep. if sys.platform.startswith("linux"): os.environ.pop("QT_QPA_PLATFORM_PLUGIN_PATH", None) @@ -25,7 +26,7 @@ def main(): app = QApplication(sys.argv) window = ImageAnnotator() window.show() - sys.exit(app.exec_()) + sys.exit(app.exec()) if __name__ == "__main__": main() \ No newline at end of file diff --git a/src/digitalsreeni_image_annotator/project_details.py b/src/digitalsreeni_image_annotator/project_details.py index 892f0ab..f0c9785 100644 --- a/src/digitalsreeni_image_annotator/project_details.py +++ b/src/digitalsreeni_image_annotator/project_details.py @@ -1,7 +1,7 @@ -from PyQt5.QtWidgets import (QDialog, QVBoxLayout, QTextEdit, QPushButton, QLabel, +from PyQt6.QtWidgets import (QDialog, QVBoxLayout, QTextEdit, QPushButton, QLabel, QDialogButtonBox, QScrollArea, QWidget) -from PyQt5.QtCore import Qt -from PyQt5.QtGui import QFont +from PyQt6.QtCore import Qt +from PyQt6.QtGui import QFont import os from datetime import datetime @@ -87,7 +87,7 @@ def format_datetime(date_string): else: formatted_stats.append(f"

{line}

") stats_label = QLabel("".join(formatted_stats)) - stats_label.setTextFormat(Qt.RichText) + stats_label.setTextFormat(Qt.TextFormat.RichText) stats_label.setWordWrap(True) scroll_layout.addWidget(stats_label) @@ -101,7 +101,7 @@ def format_datetime(date_string): layout.addWidget(self.notes_edit) # Buttons - button_box = QDialogButtonBox(QDialogButtonBox.Ok | QDialogButtonBox.Cancel) + button_box = QDialogButtonBox(QDialogButtonBox.StandardButton.Ok | QDialogButtonBox.StandardButton.Cancel) button_box.accepted.connect(self.accept) button_box.rejected.connect(self.reject) layout.addWidget(button_box) diff --git a/src/digitalsreeni_image_annotator/project_search.py b/src/digitalsreeni_image_annotator/project_search.py index 79b830b..c19c7e7 100644 --- a/src/digitalsreeni_image_annotator/project_search.py +++ b/src/digitalsreeni_image_annotator/project_search.py @@ -1,7 +1,7 @@ -from PyQt5.QtWidgets import (QDialog, QVBoxLayout, QHBoxLayout, QLineEdit, QPushButton, +from PyQt6.QtWidgets import (QDialog, QVBoxLayout, QHBoxLayout, QLineEdit, QPushButton, QDateEdit, QLabel, QListWidget, QDialogButtonBox, QFormLayout, QFileDialog, QMessageBox) -from PyQt5.QtCore import Qt, QDate +from PyQt6.QtCore import Qt, QDate import os import json from datetime import datetime @@ -57,7 +57,7 @@ def setup_ui(self): layout.addWidget(self.results_list) # Buttons - button_box = QDialogButtonBox(QDialogButtonBox.Close) + button_box = QDialogButtonBox(QDialogButtonBox.StandardButton.Close) button_box.rejected.connect(self.reject) layout.addWidget(button_box) @@ -216,4 +216,4 @@ def open_selected_project(self, item): def show_project_search(parent): dialog = ProjectSearchDialog(parent) - dialog.exec_() \ No newline at end of file + dialog.exec() \ No newline at end of file diff --git a/src/digitalsreeni_image_annotator/sam_utils.py b/src/digitalsreeni_image_annotator/sam_utils.py index 079a248..0ef90dc 100644 --- a/src/digitalsreeni_image_annotator/sam_utils.py +++ b/src/digitalsreeni_image_annotator/sam_utils.py @@ -1,21 +1,43 @@ """ -SAM utilities — delegates to an isolated subprocess to avoid DLL conflicts. - -On Windows + Python 3.14, loading PyTorch after PyQt5 causes -WinError 1114. Running SAM in a clean subprocess avoids the issue. +SAM 2 utilities — runs Ultralytics SAM in-process. + +History +------- +Earlier versions delegated to a ``sam_worker.py`` subprocess to dodge +``WinError 1114`` on Windows + Python 3.14 + PyQt5 (the now-superseded +ADR-011). Migrating to PyQt6 (ADR-014) eliminated that DLL load-order +conflict, so we run the model directly here — saves a ~1-2 s spawn +per call and lets us keep the model resident. See ADR-013 for the +threading and re-entrancy story. + +Threading model +--------------- +Inference runs on a worker thread (QThread) so the UI stays +responsive. The public API still looks synchronous — the caller +gets the result returned — but the call site's thread (typically +the UI thread) keeps pumping events via a nested QEventLoop while +the worker churns. The Qt event loop processing during the wait +means button clicks, redraws and progress dialog cancels all +continue to flow. Callers that disabled buttons before the call +remain protected from re-entry; callers that didn't (e.g. simple +click-segment) should make sure they themselves are idempotent +under a possible second click. + +torch / ultralytics are imported lazily on first inference so app +startup stays fast for users who never touch SAM. """ -import json +from __future__ import annotations + import os -import subprocess -import sys -import tempfile import traceback -from pathlib import Path +import cv2 import numpy as np -from PIL import Image -from PyQt5.QtGui import QImage +from PyQt6.QtCore import QEventLoop, QObject, QThread, pyqtSignal +from PyQt6.QtGui import QImage + +from .utils import models_base_dir MODEL_NAMES = [ @@ -40,219 +62,453 @@ "SAM 2.1 large": "sam2.1_l.pt", } +# SAM weights live under /sam/, parallel to DINO models. +SAM_MODELS_DIR = os.path.join(models_base_dir(), "sam") + -def _qimage_to_numpy(qimage): - """QImage → RGB numpy array.""" +def _qimage_to_numpy(qimage: QImage) -> np.ndarray: + """QImage → RGB numpy array. Returned array is a fresh copy. + + The naive ``np.frombuffer(qimage.constBits().asarray(N))`` aliases + the QImage's pixel buffer. That's a problem in two ways: (1) the + returned array is invalidated if the QImage is mutated or freed, + and (2) we hand the array across a thread boundary to the + inference worker, where Qt's threading rules make any read from + the QImage memory dicey. Always ``.copy()`` so the worker thread + owns its own buffer for the duration of the call. + """ width = qimage.width() height = qimage.height() fmt = qimage.format() - if fmt == QImage.Format_Grayscale8: + if fmt == QImage.Format.Format_Grayscale8: buffer = qimage.constBits().asarray(height * width) img = np.frombuffer(buffer, np.uint8).reshape((height, width)) - return np.stack((img,) * 3, -1) + return np.stack((img,) * 3, -1) # np.stack already returns a copy - if fmt in (QImage.Format_RGB32, QImage.Format_ARGB32, QImage.Format_ARGB32_Premultiplied): + if fmt in ( + QImage.Format.Format_RGB32, + QImage.Format.Format_ARGB32, + QImage.Format.Format_ARGB32_Premultiplied, + ): buffer = qimage.constBits().asarray(height * width * 4) img = np.frombuffer(buffer, np.uint8).reshape((height, width, 4)) - return img[:, :, :3] + return img[:, :, :3].copy() - if fmt == QImage.Format_RGB888: + if fmt == QImage.Format.Format_RGB888: buffer = qimage.constBits().asarray(height * width * 3) - return np.frombuffer(buffer, np.uint8).reshape((height, width, 3)) + img = np.frombuffer(buffer, np.uint8).reshape((height, width, 3)) + return img.copy() - # Fallback - converted = qimage.convertToFormat(QImage.Format_RGB32) + # Fallback: convert via Qt. ``converted`` is a local QImage that + # goes out of scope at function return, so we MUST copy before + # the buffer is freed. + converted = qimage.convertToFormat(QImage.Format.Format_RGB32) buffer = converted.constBits().asarray(height * width * 4) img = np.frombuffer(buffer, np.uint8).reshape((height, width, 4)) - return img[:, :, :3] - - -class SAMUtils: - """Thin wrapper that forwards SAM work to a subprocess worker.""" + return img[:, :, :3].copy() + + +# ── geometry helpers ──────────────────────────────────────────────────────── + +def _mask_to_polygon(mask: np.ndarray) -> list | None: + contours, _ = cv2.findContours( + (mask > 0).astype(np.uint8), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE + ) + polygons = [] + for contour in contours: + if cv2.contourArea(contour) > 10: + polygon = contour.flatten().tolist() + if len(polygon) >= 6: + polygons.append(polygon) + if not polygons: + return None + biggest = max( + polygons, + key=lambda p: cv2.contourArea(np.array(p).reshape(-1, 2)), + ) + return biggest + + +def _bbox_of_contour(contour: list) -> tuple[float, float, float, float]: + pts = np.array(contour).reshape(-1, 2) + return ( + float(pts[:, 0].min()), + float(pts[:, 1].min()), + float(pts[:, 0].max()), + float(pts[:, 1].max()), + ) + + +def _bbox_area(bbox: list) -> float: + return float(max(0, bbox[2] - bbox[0]) * max(0, bbox[3] - bbox[1])) + + +def _check_points(contour: list, positive: list, negative: list) -> bool: + cnt = np.array(contour, dtype=np.int32).reshape(-1, 1, 2) + for x, y in positive: + if cv2.pointPolygonTest(cnt, (float(x), float(y)), False) < 0: + return False + for x, y in negative: + if cv2.pointPolygonTest(cnt, (float(x), float(y)), False) >= 0: + return False + return True + + +def _predicted_bbox_area_ratio(pred_contour: list, user_bbox: list) -> float: + px1, py1, px2, py2 = _bbox_of_contour(pred_contour) + user_area = _bbox_area(user_bbox) + if user_area == 0: + return 0.0 + pred_area = max(0, px2 - px1) * max(0, py2 - py1) + return pred_area / user_area + + +def _bbox_constraints_ok(contour, user_bbox) -> bool: + ratio = _predicted_bbox_area_ratio(contour, user_bbox) + if ratio < 0.20: + return False + ux, uy, ux2, uy2 = user_bbox + uw, uh = ux2 - ux, uy2 - uy + px, py, px2, py2 = _bbox_of_contour(contour) + pw, ph = px2 - px, py2 - py + if pw < 0.5 * uw or ph < 0.5 * uh: + return False + if pw > 1.5 * uw or ph > 1.5 * uh: + return False + return True + + +# ── threading scaffolding ────────────────────────────────────────────────── + +class _InferenceThread(QThread): + """Runs a callable on a background thread. + + Captures both the return value AND any exception raised, so + ``_run_sync`` can re-raise on the calling thread. Swallowing + exceptions inside the worker was the cause of silent + model-load failures (review P0). + + We use QThread (not QRunnable) because QRunnable's signal/slot + story requires a separate QObject anyway and we want a minimal + wrapper. Lifetime is bounded by the QEventLoop in _run_sync. + """ + + finished_with_result = pyqtSignal() + + def __init__(self, fn, *args, **kwargs): + super().__init__() + self._fn = fn + self._args = args + self._kwargs = kwargs + self._result = None + self._exc: BaseException | None = None + + def run(self): + try: + self._result = self._fn(*self._args, **self._kwargs) + except BaseException as exc: # noqa: BLE001 - rebroadcast verbatim + # Capture rather than print — _run_sync will re-raise on the + # calling thread so try/except at the call site actually catches. + # We DO emit a traceback to the console here so the failure + # mode is visible even if a caller swallows the exception in + # a broad except block (debugging silent-failure regressions). + print(f"[InferenceThread] {type(exc).__name__}: {exc}") + traceback.print_exc() + self._exc = exc + self.finished_with_result.emit() + + +class InferenceBusyError(RuntimeError): + """Raised when ``_run_sync`` is re-entered before the first call returns. + + See ``_run_sync`` for the full story. Callers that drive inference + from timers or user events should catch this and skip rather than + treating it as "no result found". + """ + + +# Module-level busy flag. ``_run_sync`` pumps the calling thread's +# event loop while inference runs, so a timer fire or user click can +# call back into ``_run_sync`` on the same thread before the first +# call returns. Two concurrent ``model(...)`` calls would race on the +# torch/ultralytics object (not thread-safe) and produce garbled +# masks or CUDA errors. A QMutex won't help: it's the same thread +# trying to re-acquire, which deadlocks a non-recursive mutex and is +# meaningless for a recursive one. A simple flag with an explicit +# exception is the honest fix — callers learn about the re-entry +# instead of silently getting None back. +_inference_in_flight = False + + +def _run_sync(fn, *args, **kwargs): + """Run fn on a worker thread; pump the calling thread's event loop + until done; return the result. Re-raises exceptions on the caller. + + Looks synchronous to callers but keeps the UI alive — timers, + repaints and progress dialog cancels continue to fire during the + wait. Re-entry from the same thread (the only kind that can happen + here) raises :class:`InferenceBusyError` rather than corrupting + the model with concurrent forward passes. + + **Call from the GUI thread only.** The module-level + ``_inference_in_flight`` flag is not protected for cross-thread + access; if a future contributor drives inference from a non-GUI + worker thread (e.g. a background patching/training thread), the + flag becomes a true race. The check below is a tripwire — kept + as an explicit ``raise`` rather than ``assert`` so it survives + ``python -O``. + """ + from PyQt6.QtCore import QCoreApplication, QThread as _QThread + app = QCoreApplication.instance() + if app is not None and _QThread.currentThread() is not app.thread(): + raise RuntimeError( + "_run_sync must be called from the GUI thread. " + "See ADR-013 — the re-entry guard is GUI-thread-local." + ) + global _inference_in_flight + if _inference_in_flight: + raise InferenceBusyError( + "Another SAM/DINO inference is still running. " + "Wait for it to finish or cancel before issuing a new call." + ) + _inference_in_flight = True + try: + thread = _InferenceThread(fn, *args, **kwargs) + loop = QEventLoop() + thread.finished_with_result.connect(loop.quit) + thread.start() + loop.exec() + thread.wait() + if thread._exc is not None: + raise thread._exc + return thread._result + finally: + _inference_in_flight = False + + +# ── public class ─────────────────────────────────────────────────────────── + +class SAMUtils(QObject): + """Runs Ultralytics SAM 2 in-process with a cached model.""" # Exposed for backward compat with annotator_window.py UI setup sam_models = MODEL_FILES.copy() + model_changed = pyqtSignal(str) # emitted with new model name after load + def __init__(self): - self.current_sam_model = None - # Invoke the worker script directly so the package __init__.py - # (which imports PyQt5) does not run inside the subprocess. - self._worker_script = os.path.join( - os.path.dirname(os.path.abspath(__file__)), "sam_worker.py" - ) + super().__init__() + self.current_sam_model: str | None = None + self._model = None # ultralytics.SAM instance once loaded + self._loaded_model_file: str | None = None + + # ── model lifecycle ──────────────────────────────────────────────── - def change_sam_model(self, model_name): + def change_sam_model(self, model_name: str) -> None: if model_name == "Pick a SAM Model": self.current_sam_model = None + self._model = None + self._loaded_model_file = None print("SAM model unset") return if model_name not in MODEL_NAMES: raise ValueError(f"Unknown SAM model: {model_name}") + # Load on a worker thread to avoid stalling the UI on the + # ~1-3 s torch model-load. Behaves synchronously to callers and + # re-raises any load-time exception (network, corrupt weights, + # CUDA OOM) — only flip `current_sam_model` AFTER success so + # callers don't see a stale name on failure. + _run_sync(self._load_model_blocking, model_name) self.current_sam_model = model_name - print(f"Selected SAM model: {model_name}") - - def _send_request(self, request: dict) -> dict: - """Spawn the SAM worker, send JSON, and return parsed response.""" - env = os.environ.copy() - # Propagate the virtual environment - for possible in ("VIRTUAL_ENV", "CONDA_PREFIX"): - v = os.environ.get(possible) - if v: - env[possible] = v - break - # Force the worker to write UTF-8 so cp1252 (Windows) doesn't choke - # on non-ASCII bytes from torch/transformers warnings. - env["PYTHONIOENCODING"] = "utf-8" - - proc = subprocess.run( - [sys.executable, self._worker_script], - input=json.dumps(request) + "\n", - capture_output=True, - text=True, - encoding="utf-8", - errors="replace", - env=env, - ) - - if proc.returncode != 0: - err_text = proc.stderr.strip() if proc.stderr else "(no stderr)" - raise RuntimeError( - f"SAM worker exited with code {proc.returncode}.\nstderr: {err_text}" - ) - - # Echo worker stdout (includes GPU/CPU diagnostics) to parent console - lines = (proc.stdout or "").strip().splitlines() - for line in lines[:-1]: - print(line) - - try: - return json.loads(lines[-1]) - except (json.JSONDecodeError, IndexError): - out_text = proc.stdout.strip() if proc.stdout else "(no stdout)" - raise RuntimeError( - f"SAM worker returned non-JSON output.\nstdout: {out_text}" - ) + self.model_changed.emit(model_name) + print(f"SAM model loaded: {model_name}") + + def _load_model_blocking(self, model_name: str) -> None: + # Lazy import keeps app startup fast for users who never use SAM. + from ultralytics import SAM + self._log_device() + model_file = os.path.join(SAM_MODELS_DIR, MODEL_FILES[model_name]) + os.makedirs(os.path.dirname(model_file), exist_ok=True) + self._model = SAM(model_file) + self._loaded_model_file = model_file @staticmethod - def _save_image_temp(image: QImage) -> str: - """Convert QImage to a temporary file and return the path.""" - arr = _qimage_to_numpy(image) - tmp = tempfile.NamedTemporaryFile(suffix=".png", delete=False) - Image.fromarray(arr).save(tmp.name) - tmp.close() - return tmp.name - - def apply_sam_points(self, image, positive_points, negative_points): - if not self.current_sam_model: - print("No SAM model selected.") - return None + def _log_device() -> None: try: - tmp_path = self._save_image_temp(image) - request = { - "image_path": tmp_path, - "model_name": self.current_sam_model, - "points": { - "positive": [list(p) for p in positive_points], - "negative": [list(p) for p in negative_points], - }, - } - result = self._send_request(request) + import torch + if torch.cuda.is_available(): + dev = torch.cuda.get_device_name(0) + print(f"[SAM] Using CUDA: {torch.version.cuda} — {dev}") + else: + print("[SAM] No GPU available, running on CPU") except Exception: - traceback.print_exc() - return None - finally: - try: - os.unlink(tmp_path) - except OSError: - pass - - if "error" in result: - print(f"SAM worker error: {result['error']}") - return None + pass + + def unload(self) -> None: + """Free GPU/CPU memory held by the loaded model. + + Three-step recipe required to actually drop the GPU allocation: + + 1. Move the underlying nn.Module to CPU + (``self._model.model.cpu()``). Without this the GPU tensors + stay resident even after Python drops its reference, because + the allocator pool keeps them. + 2. Drop the Python references, ``gc.collect()`` to break the + Ultralytics circular refs. + 3. ``torch.cuda.empty_cache()`` + ``ipc_collect()`` + + ``synchronize()`` so the allocator releases freed blocks + back to the driver and pending kernels complete first. + + Caveat: PyTorch keeps a CUDA context per process. Even after a + clean unload, Task Manager / ``nvidia-smi`` show the residual + context (~200-500 MB depending on driver). To reclaim that the + user must restart the app. + """ + import gc + try: + if self._model is not None and hasattr(self._model, "model"): + self._model.model.cpu() + except Exception as e: + print(f"[SAM] unload: warning moving model to CPU: {e}") + self._model = None + self.current_sam_model = None + self._loaded_model_file = None + gc.collect() + try: + import torch + if torch.cuda.is_available(): + torch.cuda.synchronize() + torch.cuda.empty_cache() + torch.cuda.ipc_collect() + except Exception: + pass + print("[SAM] unload complete") - return { - "segmentation": result["segmentation"], - "score": result["score"], - } + # ── inference ────────────────────────────────────────────────────── - def apply_sam_prediction(self, image, bbox): - if not self.current_sam_model: + def apply_sam_points(self, image: QImage, positive_points, negative_points): + if not self.current_sam_model or self._model is None: print("No SAM model selected.") return None - try: - tmp_path = self._save_image_temp(image) - request = { - "image_path": tmp_path, - "model_name": self.current_sam_model, - "bboxes": list(bbox), - } - result = self._send_request(request) - except Exception: - traceback.print_exc() + if not positive_points: + print("No positive points for SAM-points") return None - finally: - try: - os.unlink(tmp_path) - except OSError: - pass - - if "error" in result: - print(f"SAM worker error: {result['error']}") - return None - - return { - "segmentation": result["segmentation"], - "score": result["score"], - } + return _run_sync( + self._sam_points_blocking, + _qimage_to_numpy(image), + list(positive_points), + list(negative_points), + ) - def apply_sam_predictions_batch(self, image, bboxes): - """ - Segment multiple bounding boxes in a single subprocess call. - - Parameters - ---------- - image : QImage - bboxes : list[list[float]] - List of [x1, y1, x2, y2] boxes. - - Returns - ------- - list[dict] | None - Each dict: {"segmentation": [...], "score": float} - or {"error": str} if that box failed. - """ - if not self.current_sam_model: + def _sam_points_blocking(self, image_np, positive_points, negative_points): + all_points = [positive_points + negative_points] + all_labels = [([1] * len(positive_points)) + ([0] * len(negative_points))] + results = self._model(image_np, points=all_points, labels=all_labels) + + masks = results[0].masks.data.cpu().numpy() + confidences = results[0].boxes.conf.cpu().numpy() + + best_result = None + best_score = -1.0 + for i, mask in enumerate(masks): + contour = _mask_to_polygon(mask) + if contour is None: + continue + if not _check_points(contour, positive_points, negative_points): + continue + mask_pixels = int(mask.sum()) + if mask_pixels > best_score: + score = float(confidences[i]) if i < len(confidences) else 0.0 + best_score = mask_pixels + best_result = {"segmentation": contour, "score": score} + + return best_result + + def apply_sam_prediction(self, image: QImage, bbox): + if not self.current_sam_model or self._model is None: print("No SAM model selected.") return None - if not bboxes: - return [] + return _run_sync( + self._sam_bbox_blocking, + _qimage_to_numpy(image), + list(bbox), + ) - try: - tmp_path = self._save_image_temp(image) - request = { - "image_path": tmp_path, - "model_name": self.current_sam_model, - "bboxes": [list(b) for b in bboxes], - } - result = self._send_request(request) - except Exception: - traceback.print_exc() + def _sam_bbox_blocking(self, image_np, bbox): + results = self._model(image_np, bboxes=[bbox]) + res = results[0] + if not (hasattr(res, "masks") and res.masks is not None): return None - finally: - try: - os.unlink(tmp_path) - except OSError: - pass - - if "error" in result: - print(f"SAM worker error: {result['error']}") + + masks = res.masks.data.cpu().numpy() + confidences = ( + res.boxes.conf.cpu().numpy() + if hasattr(res.boxes, "conf") + else np.zeros(len(masks)) + ) + + best = None + best_pixels = -1 + for i, mask in enumerate(masks): + contour = _mask_to_polygon(mask) + if contour is None: + continue + if not _bbox_constraints_ok(contour, bbox): + continue + pixels = int(mask.sum()) + if pixels > best_pixels: + best_pixels = pixels + score = float(confidences[i]) if i < len(confidences) else 0.0 + best = {"segmentation": contour, "score": score} + + return best + + def apply_sam_predictions_batch(self, image: QImage, bboxes: list): + if not self.current_sam_model or self._model is None: + print("[SAM] apply_sam_predictions_batch: no SAM model selected") return None + if not bboxes: + print("[SAM] apply_sam_predictions_batch: empty bbox list") + return [] + print(f"[SAM] apply_sam_predictions_batch: running on {len(bboxes)} bbox(es)") + return _run_sync( + self._sam_batch_blocking, + _qimage_to_numpy(image), + [list(b) for b in bboxes], + ) - if isinstance(result, list): - return result + def _sam_batch_blocking(self, image_np, bboxes): + results = self._model(image_np, bboxes=bboxes) + res = results[0] + if not (hasattr(res, "masks") and res.masks is not None): + # Build a fresh dict per bbox so callers can mutate one + # entry without affecting the others (a `[d] * N` would + # alias the same dict N times). + return [{"error": "No mask generated."} for _ in bboxes] + + masks = res.masks.data.cpu().numpy() + confidences = ( + res.boxes.conf.cpu().numpy() + if hasattr(res.boxes, "conf") + else np.zeros(len(masks)) + ) - # Fallback: single result wrapped in list - return [result] + output = [] + for i in range(len(masks)): + mask = masks[i] + score = float(confidences[i]) if i < len(confidences) else 0.0 + contour = _mask_to_polygon(mask) + if contour is None: + output.append({"error": "No valid mask polygon."}) + continue + + user_bbox = bboxes[i] + if not _bbox_constraints_ok(contour, user_bbox): + output.append({"error": "Mask failed bbox constraints."}) + continue + + output.append({"segmentation": contour, "score": score}) + return output diff --git a/src/digitalsreeni_image_annotator/sam_worker.py b/src/digitalsreeni_image_annotator/sam_worker.py deleted file mode 100644 index f5c58e9..0000000 --- a/src/digitalsreeni_image_annotator/sam_worker.py +++ /dev/null @@ -1,288 +0,0 @@ -""" -Standalone SAM worker — runs in an isolated subprocess. - -This script is intentionally free of PyQt5 imports so it can load -torch/ultralytics in a clean process where the parent GUI's loaded -DLLs do not interfere. - -Communication: - stdin -> JSON request (image path + model + prompts) - stdout -> JSON response (polygon + score or error) -""" - -from __future__ import annotations - -import io -import json -import os -import sys -import traceback - -import cv2 -import numpy as np -from PIL import Image - - -# IMPORTANT: do not import from the digitalsreeni_image_annotator package. -# The package __init__.py imports PyQt5 transitively (via annotator_window), -# which would trigger the WinError 1114 DLL load-order bug ADR-011 exists to -# prevent. Inlined helper below mirrors utils.models_base_dir(); keep in sync. -def _models_base_dir() -> str: - pkg_anchor = os.path.dirname(os.path.dirname(os.path.dirname( - os.path.abspath(__file__)))) - if "site-packages" not in pkg_anchor.replace(os.sep, "/"): - return os.path.join(pkg_anchor, "models") - return os.path.join(os.getcwd(), "models") - - -# SAM weights live under /sam/, parallel to the DINO models -# directories (e.g. /grounding-dino-base/). -SAM_MODELS_DIR = os.path.join(_models_base_dir(), "sam") - -MODELS = { - "SAM 2 tiny": os.path.join(SAM_MODELS_DIR, "sam2_t.pt"), - "SAM 2 small": os.path.join(SAM_MODELS_DIR, "sam2_s.pt"), - "SAM 2 base": os.path.join(SAM_MODELS_DIR, "sam2_b.pt"), - "SAM 2 large": os.path.join(SAM_MODELS_DIR, "sam2_l.pt"), - "SAM 2.1 tiny": os.path.join(SAM_MODELS_DIR, "sam2.1_t.pt"), - "SAM 2.1 small": os.path.join(SAM_MODELS_DIR, "sam2.1_s.pt"), - "SAM 2.1 base": os.path.join(SAM_MODELS_DIR, "sam2.1_b.pt"), - "SAM 2.1 large": os.path.join(SAM_MODELS_DIR, "sam2.1_l.pt"), -} - - -# ── helpers ────────────────────────────────────────────────────────────────── - -def _log_device(): - try: - import torch - - if torch.cuda.is_available(): - dev = torch.cuda.get_device_name(0) - print(f"[SAM] Using CUDA: {torch.version.cuda} — {dev}") - else: - print("[SAM] No GPU available, running on CPU") - except Exception: - pass - - -def load_image(image_path: str) -> np.ndarray: - img = Image.open(image_path) - if img.mode != "RGB": - img = img.convert("RGB") - return np.array(img) - - -def mask_to_polygon(mask: np.ndarray) -> list | None: - contours, _ = cv2.findContours( - (mask > 0).astype(np.uint8), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE - ) - polygons = [] - for contour in contours: - if cv2.contourArea(contour) > 10: - polygon = contour.flatten().tolist() - if len(polygon) >= 6: - polygons.append(polygon) - if not polygons: - return None - # Return the polygon with the largest area (ignore tiny noise holes) - biggest = max(polygons, key=lambda p: cv2.contourArea(np.array(p).reshape(-1, 2))) - return biggest - - -def _bbox_of_contour(contour: list) -> tuple[float, float, float, float]: - pts = np.array(contour).reshape(-1, 2) - return float(pts[:, 0].min()), float(pts[:, 1].min()), float(pts[:, 0].max()), float(pts[:, 1].max()) - - -def _bbox_area(bbox: list) -> float: - return float(max(0, bbox[2] - bbox[0]) * max(0, bbox[3] - bbox[1])) - - -def _check_points(contour: list, positive: list, negative: list) -> bool: - """Return True iff all positive points are inside and all negative outside.""" - cnt = np.array(contour, dtype=np.int32).reshape(-1, 1, 2) - for x, y in positive: - if cv2.pointPolygonTest(cnt, (float(x), float(y)), False) < 0: - return False - for x, y in negative: - if cv2.pointPolygonTest(cnt, (float(x), float(y)), False) >= 0: - return False - return True - - -def _predicted_bbox_area_ratio(pred_contour: list, user_bbox: list) -> float: - """Ratio of predicted contour bbox area over user-drawn bbox area.""" - px1, py1, px2, py2 = _bbox_of_contour(pred_contour) - user_area = _bbox_area(user_bbox) - if user_area == 0: - return 0.0 - pred_area = max(0, px2 - px1) * max(0, py2 - py1) - return pred_area / user_area - - -# ── core SAM runner ────────────────────────────────────────────────────────── - -def _is_single_bbox(value) -> bool: - """Return True if value is a single bbox [x1,y1,x2,y2] (list of 4 numbers).""" - return ( - isinstance(value, list) - and len(value) == 4 - and all(isinstance(v, (int, float)) for v in value) - ) - - -def _filter_and_rank_masks( - masks, confidences, user_bbox, positive_pts, negative_pts -) -> dict | None: - """Apply hard constraints and return the best single mask as a dict.""" - best_result = None - best_score = -1.0 - - for i, mask in enumerate(masks): - contour = mask_to_polygon(mask) - if contour is None: - continue - - score = float(confidences[i]) if i < len(confidences) else 0.0 - mask_pixels = int(mask.sum()) - - if user_bbox is not None: - ratio = _predicted_bbox_area_ratio(contour, user_bbox) - if ratio < 0.20: - continue - ux, uy, ux2, uy2 = user_bbox - uw, uh = ux2 - ux, uy2 - uy - px, py, px2, py2 = _bbox_of_contour(contour) - pw, ph = px2 - px, py2 - py - if pw < 0.5 * uw or ph < 0.5 * uh: - continue - if pw > 1.5 * uw or ph > 1.5 * uh: - continue - - if positive_pts is not None and negative_pts is not None: - if not _check_points(contour, positive_pts, negative_pts): - continue - - if mask_pixels > best_score: - best_score = mask_pixels - best_result = { - "segmentation": contour, - "score": score, - "mask_pixels": mask_pixels, - } - - return best_result - - -def run_sam( - image_path: str, - model_name: str, - bboxes: list | None = None, - points: dict | None = None, -) -> dict | list[dict]: - from ultralytics import SAM - - _log_device() - - model_file = MODELS[model_name] - os.makedirs(os.path.dirname(model_file), exist_ok=True) - sam_model = SAM(model_file) - image_np = load_image(image_path) - - if points is not None: - pos = points.get("positive", []) - neg = points.get("negative", []) - all_points = [pos + neg] - all_labels = [([1] * len(pos)) + ([0] * len(neg))] - if not all_points[0]: - return {"error": "No points provided."} - results = sam_model(image_np, points=all_points, labels=all_labels) - user_bbox = None - positive_pts = pos - negative_pts = neg - - masks = results[0].masks.data.cpu().numpy() - confidences = results[0].boxes.conf.cpu().numpy() - best = _filter_and_rank_masks(masks, confidences, user_bbox, positive_pts, negative_pts) - - if best is None: - return {"error": "No SAM mask matches the given constraints. Try repositioning positive/negative points."} - return {"segmentation": best["segmentation"], "score": best["score"]} - - elif bboxes is not None: - is_batch = not _is_single_bbox(bboxes) - sam_bboxes = bboxes if is_batch else [bboxes] - - # Ultralytics always returns [Results] (single Results object) - results = sam_model(image_np, bboxes=sam_bboxes) - res = results[0] - - if not (hasattr(res, "masks") and res.masks is not None): - return [{"error": "No mask generated."}] * len(sam_bboxes) if is_batch else {"error": "No mask generated."} - - masks = res.masks.data.cpu().numpy() # (N, H, W) - confidences = res.boxes.conf.cpu().numpy() if hasattr(res.boxes, "conf") else np.zeros(len(masks)) - - output = [] - for i in range(len(masks)): - mask = masks[i] - score = float(confidences[i]) if i < len(confidences) else 0.0 - contour = mask_to_polygon(mask) - if contour is None: - output.append({"error": "No valid mask polygon."}) - continue - - mask_pixels = int(mask.sum()) - user_bbox = sam_bboxes[i] - - # hard constraints per bbox - ratio = _predicted_bbox_area_ratio(contour, user_bbox) - if ratio < 0.20: - output.append({"error": "Mask too small relative to box."}) - continue - ux, uy, ux2, uy2 = user_bbox - uw, uh = ux2 - ux, uy2 - uy - px, py, px2, py2 = _bbox_of_contour(contour) - pw, ph = px2 - px, py2 - py - if pw < 0.5 * uw or ph < 0.5 * uh: - output.append({"error": "Mask dimensions too small."}) - continue - if pw > 1.5 * uw or ph > 1.5 * uh: - output.append({"error": "Mask dimensions too large."}) - continue - - output.append({"segmentation": contour, "score": score}) - - return output if is_batch else output[0] - - else: - return {"error": "No prompts provided."} - - -def main(): - raw = sys.stdin.read() - if not raw.strip(): - return - - try: - request = json.loads(raw) - except json.JSONDecodeError as exc: - print(json.dumps({"error": f"Invalid JSON: {exc}"})) - return - - image_path = request.get("image_path") - model_name = request.get("model_name", "SAM 2 tiny") - bboxes = request.get("bboxes") - points = request.get("points") - - try: - result = run_sam(image_path, model_name, bboxes=bboxes, points=points) - except Exception: - result = {"error": traceback.format_exc()} - - print(json.dumps(result)) - - -if __name__ == "__main__": - main() diff --git a/src/digitalsreeni_image_annotator/slice_registration.py b/src/digitalsreeni_image_annotator/slice_registration.py index dd8408c..4ac0618 100644 --- a/src/digitalsreeni_image_annotator/slice_registration.py +++ b/src/digitalsreeni_image_annotator/slice_registration.py @@ -1,7 +1,7 @@ -from PyQt5.QtWidgets import (QDialog, QVBoxLayout, QHBoxLayout, QPushButton, QFileDialog, +from PyQt6.QtWidgets import (QDialog, QVBoxLayout, QHBoxLayout, QPushButton, QFileDialog, QLabel, QComboBox, QMessageBox, QProgressDialog, QRadioButton, QButtonGroup, QSpinBox, QApplication, QGroupBox, QDoubleSpinBox) -from PyQt5.QtCore import Qt +from PyQt6.QtCore import Qt from pystackreg import StackReg from skimage import io import tifffile @@ -14,8 +14,8 @@ def __init__(self, parent=None): super().__init__(parent) self.setWindowTitle("Slice Registration") self.setGeometry(100, 100, 600, 400) - self.setWindowFlags(self.windowFlags() | Qt.Window) - self.setWindowModality(Qt.ApplicationModal) # Add modal behavior + self.setWindowFlags(self.windowFlags() | Qt.WindowType.Window) + self.setWindowModality(Qt.WindowModality.ApplicationModal) # Add modal behavior # Initialize variables first self.input_path = "" @@ -193,7 +193,7 @@ def select_input(self): self, "Select Directory with Images", "", - QFileDialog.ShowDirsOnly | QFileDialog.DontResolveSymlinks + QFileDialog.Option.ShowDirsOnly | QFileDialog.Option.DontResolveSymlinks ) else: path, _ = QFileDialog.getOpenFileName( @@ -201,7 +201,7 @@ def select_input(self): "Select TIFF Stack", "", "TIFF Files (*.tif *.tiff)", - options=QFileDialog.Options() + options=QFileDialog.Option(0) ) if path: @@ -219,7 +219,7 @@ def select_output(self): self, "Select Output Directory", "", - QFileDialog.ShowDirsOnly | QFileDialog.DontResolveSymlinks + QFileDialog.Option.ShowDirsOnly | QFileDialog.Option.DontResolveSymlinks ) if directory: @@ -242,7 +242,7 @@ def register_slices(self): progress.setLabelText("Loading images...") progress.setMinimum(0) progress.setMaximum(100) - progress.setWindowModality(Qt.WindowModal) + progress.setWindowModality(Qt.WindowModality.WindowModal) progress.setMinimumWidth(400) progress.show() QApplication.processEvents() diff --git a/src/digitalsreeni_image_annotator/snake_game.py b/src/digitalsreeni_image_annotator/snake_game.py index 8bac104..1a3d390 100644 --- a/src/digitalsreeni_image_annotator/snake_game.py +++ b/src/digitalsreeni_image_annotator/snake_game.py @@ -1,8 +1,8 @@ import sys import random -from PyQt5.QtWidgets import QApplication, QWidget, QDesktopWidget, QMessageBox -from PyQt5.QtGui import QPainter, QColor -from PyQt5.QtCore import Qt, QTimer +from PyQt6.QtWidgets import QApplication, QWidget, QMessageBox +from PyQt6.QtGui import QPainter, QColor, QGuiApplication +from PyQt6.QtCore import Qt, QTimer class SnakeGame(QWidget): def __init__(self): @@ -23,18 +23,19 @@ def initUI(self): self.timer.timeout.connect(self.update_game) self.timer.start(100) - self.setFocusPolicy(Qt.StrongFocus) + self.setFocusPolicy(Qt.FocusPolicy.StrongFocus) self.show() def center(self): qr = self.frameGeometry() - cp = QDesktopWidget().availableGeometry().center() + screen = QGuiApplication.primaryScreen() + cp = screen.availableGeometry().center() qr.moveCenter(cp) self.move(qr.topLeft()) def paintEvent(self, event): painter = QPainter(self) - painter.setRenderHint(QPainter.Antialiasing) + painter.setRenderHint(QPainter.RenderHint.Antialiasing) # Draw snake painter.setBrush(QColor(0, 255, 0)) @@ -52,15 +53,15 @@ def paintEvent(self, event): def keyPressEvent(self, event): key = event.key() - if key == Qt.Key_Left and self.direction != 'RIGHT': + if key == Qt.Key.Key_Left and self.direction != 'RIGHT': self.direction = 'LEFT' - elif key == Qt.Key_Right and self.direction != 'LEFT': + elif key == Qt.Key.Key_Right and self.direction != 'LEFT': self.direction = 'RIGHT' - elif key == Qt.Key_Up and self.direction != 'DOWN': + elif key == Qt.Key.Key_Up and self.direction != 'DOWN': self.direction = 'UP' - elif key == Qt.Key_Down and self.direction != 'UP': + elif key == Qt.Key.Key_Down and self.direction != 'UP': self.direction = 'DOWN' - elif key == Qt.Key_Escape: + elif key == Qt.Key.Key_Escape: self.close() def update_game(self): @@ -110,4 +111,4 @@ def game_over(self): if __name__ == '__main__': app = QApplication(sys.argv) ex = SnakeGame() - sys.exit(app.exec_()) \ No newline at end of file + sys.exit(app.exec()) \ No newline at end of file diff --git a/src/digitalsreeni_image_annotator/soft_dark_stylesheet.py b/src/digitalsreeni_image_annotator/soft_dark_stylesheet.py index bb66105..20033b5 100644 --- a/src/digitalsreeni_image_annotator/soft_dark_stylesheet.py +++ b/src/digitalsreeni_image_annotator/soft_dark_stylesheet.py @@ -146,4 +146,147 @@ QListWidget::item { color: none; } + +/* Form controls — radio + check were invisible-when-selected on the + default OS theme under dark mode. Tables and spin boxes used to + render with a bright header bar in the DINO panel because of + hardcoded #e0e0e0 in code (now removed). */ + +QRadioButton { + color: #E0E0E0; + spacing: 6px; +} + +QRadioButton::indicator { + width: 14px; + height: 14px; + border-radius: 8px; + border: 1px solid #6A6A6A; + background-color: #3A3A3A; +} + +QRadioButton::indicator:checked { + background-color: #4DA3FF; + border: 2px solid #BCD7FF; +} + +QRadioButton::indicator:hover { + border-color: #8A8A8A; +} + +QCheckBox { + color: #E0E0E0; + spacing: 6px; +} + +QCheckBox::indicator { + width: 14px; + height: 14px; + border-radius: 3px; + border: 1px solid #6A6A6A; + background-color: #3A3A3A; +} + +QCheckBox::indicator:checked { + background-color: #4DA3FF; + border: 1px solid #BCD7FF; +} + +QCheckBox::indicator:hover { + border-color: #8A8A8A; +} + +QGroupBox { + color: #E0E0E0; + border: 1px solid #4A4A4A; + border-radius: 4px; + margin-top: 8px; + padding-top: 8px; +} + +QGroupBox::title { + subcontrol-origin: margin; + left: 8px; + padding: 0 4px; +} + +QTableWidget { + background-color: #2F2F2F; + color: #E0E0E0; + gridline-color: #4A4A4A; + border: 1px solid #4A4A4A; +} + +QTableWidget::item:selected { + background-color: #4A4A4A; + color: #FFFFFF; +} + +QHeaderView::section { + background-color: #3A3A3A; + color: #E0E0E0; + border: 1px solid #4A4A4A; + padding: 4px; +} + +QSpinBox, QDoubleSpinBox { + background-color: #3A3A3A; + color: #E0E0E0; + border: 1px solid #4A4A4A; + border-radius: 3px; + padding: 2px; +} + +QSpinBox::up-button, QSpinBox::down-button, +QDoubleSpinBox::up-button, QDoubleSpinBox::down-button { + background-color: #4A4A4A; + border: 1px solid #5A5A5A; + width: 16px; +} + +QSpinBox::up-button:hover, QSpinBox::down-button:hover, +QDoubleSpinBox::up-button:hover, QDoubleSpinBox::down-button:hover { + background-color: #5A5A5A; +} + +QSpinBox::up-arrow, QDoubleSpinBox::up-arrow { + border-left: 4px solid transparent; + border-right: 4px solid transparent; + border-bottom: 4px solid #E0E0E0; + width: 0px; + height: 0px; +} + +QSpinBox::down-arrow, QDoubleSpinBox::down-arrow { + border-left: 4px solid transparent; + border-right: 4px solid transparent; + border-top: 4px solid #E0E0E0; + width: 0px; + height: 0px; +} + +QComboBox { + background-color: #3A3A3A; + color: #E0E0E0; + border: 1px solid #4A4A4A; + border-radius: 3px; + padding: 3px 6px; +} + +QComboBox:hover { + border-color: #6A6A6A; +} + +QComboBox::drop-down { + border-left: 1px solid #4A4A4A; + width: 18px; +} + +QComboBox QAbstractItemView { + background-color: #2F2F2F; + color: #E0E0E0; + selection-background-color: #4A4A4A; + selection-color: #FFFFFF; + border: 1px solid #4A4A4A; +} """ \ No newline at end of file diff --git a/src/digitalsreeni_image_annotator/stack_interpolator.py b/src/digitalsreeni_image_annotator/stack_interpolator.py index c549c5c..9506161 100644 --- a/src/digitalsreeni_image_annotator/stack_interpolator.py +++ b/src/digitalsreeni_image_annotator/stack_interpolator.py @@ -1,9 +1,9 @@ import os import numpy as np -from PyQt5.QtWidgets import (QDialog, QVBoxLayout, QHBoxLayout, QPushButton, QFileDialog, +from PyQt6.QtWidgets import (QDialog, QVBoxLayout, QHBoxLayout, QPushButton, QFileDialog, QLabel, QComboBox, QMessageBox, QProgressDialog, QRadioButton, QButtonGroup, QGroupBox, QDoubleSpinBox, QApplication) -from PyQt5.QtCore import Qt +from PyQt6.QtCore import Qt from scipy.interpolate import RegularGridInterpolator from skimage import io import tifffile @@ -13,8 +13,8 @@ def __init__(self, parent=None): super().__init__(parent) self.setWindowTitle("Stack Interpolator") self.setGeometry(100, 100, 600, 400) - self.setWindowFlags(self.windowFlags() | Qt.Window) - self.setWindowModality(Qt.ApplicationModal) # Added window modality + self.setWindowFlags(self.windowFlags() | Qt.WindowType.Window) + self.setWindowModality(Qt.WindowModality.ApplicationModal) # Added window modality # Initialize variables self.input_path = "" @@ -166,7 +166,7 @@ def select_input(self): self, "Select Directory with Images", "", - QFileDialog.ShowDirsOnly | QFileDialog.DontResolveSymlinks + QFileDialog.Option.ShowDirsOnly | QFileDialog.Option.DontResolveSymlinks ) else: path, _ = QFileDialog.getOpenFileName( @@ -174,7 +174,7 @@ def select_input(self): "Select TIFF Stack", "", "TIFF Files (*.tif *.tiff)", - options=QFileDialog.Options() + options=QFileDialog.Option(0) ) if path: @@ -192,7 +192,7 @@ def select_output(self): self, "Select Output Directory", "", - QFileDialog.ShowDirsOnly | QFileDialog.DontResolveSymlinks + QFileDialog.Option.ShowDirsOnly | QFileDialog.Option.DontResolveSymlinks ) if directory: @@ -207,7 +207,7 @@ def select_output(self): def load_images(self): try: progress = QProgressDialog("Loading images...", "Cancel", 0, 100, self) - progress.setWindowModality(Qt.WindowModal) + progress.setWindowModality(Qt.WindowModality.WindowModal) progress.show() QApplication.processEvents() @@ -274,7 +274,7 @@ def interpolate_stack(self): try: # Create progress dialog progress = QProgressDialog("Processing...", "Cancel", 0, 100, self) - progress.setWindowModality(Qt.WindowModal) + progress.setWindowModality(Qt.WindowModality.WindowModal) progress.setWindowTitle("Interpolation Progress") progress.setMinimumDuration(0) progress.setMinimumWidth(400) diff --git a/src/digitalsreeni_image_annotator/stack_to_slices.py b/src/digitalsreeni_image_annotator/stack_to_slices.py index 9754d11..c59a2a7 100644 --- a/src/digitalsreeni_image_annotator/stack_to_slices.py +++ b/src/digitalsreeni_image_annotator/stack_to_slices.py @@ -1,9 +1,9 @@ import os import numpy as np -from PyQt5.QtWidgets import (QDialog, QVBoxLayout, QHBoxLayout, QPushButton, +from PyQt6.QtWidgets import (QDialog, QVBoxLayout, QHBoxLayout, QPushButton, QFileDialog, QLabel, QMessageBox, QComboBox, QGridLayout, QWidget, QProgressDialog, QApplication) -from PyQt5.QtCore import Qt +from PyQt6.QtCore import Qt from tifffile import TiffFile from czifile import CziFile from PIL import Image @@ -49,8 +49,8 @@ def __init__(self, parent=None): super().__init__(parent) self.setWindowTitle("Stack to Slices") self.setGeometry(100, 100, 400, 200) - self.setWindowFlags(self.windowFlags() | Qt.Window) - self.setWindowModality(Qt.ApplicationModal) + self.setWindowFlags(self.windowFlags() | Qt.WindowType.Window) + self.setWindowModality(Qt.WindowModality.ApplicationModal) self.dimensions = None self.initUI() @@ -98,8 +98,8 @@ def process_czi(self): def get_dimensions(self, shape): dialog = DimensionDialog(shape, os.path.basename(self.file_name), self) - dialog.setWindowModality(Qt.ApplicationModal) - if dialog.exec_(): + dialog.setWindowModality(Qt.WindowModality.ApplicationModal) + if dialog.exec(): self.dimensions = dialog.get_dimensions() self.convert_button.setEnabled(True) else: @@ -133,7 +133,7 @@ def save_slices(self, image_array, output_dir): total_slices = np.prod([image_array.shape[i] for i in slice_indices]) progress = QProgressDialog("Saving slices...", "Cancel", 0, total_slices, self) - progress.setWindowModality(Qt.WindowModal) + progress.setWindowModality(Qt.WindowModality.WindowModal) progress.setWindowTitle("Progress") progress.setMinimumDuration(0) progress.setValue(0) diff --git a/src/digitalsreeni_image_annotator/utils.py b/src/digitalsreeni_image_annotator/utils.py index 492a386..60be366 100644 --- a/src/digitalsreeni_image_annotator/utils.py +++ b/src/digitalsreeni_image_annotator/utils.py @@ -15,7 +15,7 @@ def models_base_dir() -> str: """Return the absolute path of the `models/` directory used for ML weights. - Resolution strategy (single source of truth used by sam_worker, dino_utils, + Resolution strategy (single source of truth used by sam_utils, dino_utils, and annotator_window so all three agree on where weights live): 1. Editable / dev install: package source lives at diff --git a/src/digitalsreeni_image_annotator/yolo_trainer.py b/src/digitalsreeni_image_annotator/yolo_trainer.py index 8451eb8..2232908 100644 --- a/src/digitalsreeni_image_annotator/yolo_trainer.py +++ b/src/digitalsreeni_image_annotator/yolo_trainer.py @@ -1,6 +1,6 @@ import os -from PyQt5.QtWidgets import QFileDialog, QMessageBox -from PyQt5.QtWidgets import (QDialog, QVBoxLayout, QHBoxLayout, QPushButton, +from PyQt6.QtWidgets import QFileDialog, QMessageBox +from PyQt6.QtWidgets import (QDialog, QVBoxLayout, QHBoxLayout, QPushButton, QLineEdit, QLabel, QFileDialog, QDialogButtonBox) import yaml import numpy as np @@ -11,8 +11,8 @@ from collections import deque -from PyQt5.QtWidgets import QDialog, QVBoxLayout, QTextEdit, QPushButton -from PyQt5.QtCore import Qt, pyqtSignal, QObject +from PyQt6.QtWidgets import QDialog, QVBoxLayout, QTextEdit, QPushButton +from PyQt6.QtCore import Qt, pyqtSignal, QObject class TrainingInfoDialog(QDialog): stop_signal = pyqtSignal() @@ -80,7 +80,7 @@ def __init__(self, parent=None): layout.addLayout(yaml_layout) # OK and Cancel buttons - self.button_box = QDialogButtonBox(QDialogButtonBox.Ok | QDialogButtonBox.Cancel) + self.button_box = QDialogButtonBox(QDialogButtonBox.StandardButton.Ok | QDialogButtonBox.StandardButton.Cancel) self.button_box.accepted.connect(self.accept) self.button_box.rejected.connect(self.reject) layout.addWidget(self.button_box) diff --git a/tests/conftest.py b/tests/conftest.py index 3ed607f..a62db8b 100644 --- a/tests/conftest.py +++ b/tests/conftest.py @@ -16,7 +16,7 @@ @pytest.fixture(scope="session") def qt_application(): """Create a QApplication instance for the test session.""" - from PyQt5.QtWidgets import QApplication + from PyQt6.QtWidgets import QApplication app = QApplication.instance() if app is None: app = QApplication([]) diff --git a/tests/integration/test_export_formats.py b/tests/integration/test_export_formats.py index 503818e..654e5aa 100644 --- a/tests/integration/test_export_formats.py +++ b/tests/integration/test_export_formats.py @@ -10,7 +10,7 @@ import tempfile import shutil from pathlib import Path -from PyQt5.QtGui import QImage +from PyQt6.QtGui import QImage from src.digitalsreeni_image_annotator.export_formats import ( export_coco_json, export_yolo_v5plus, @@ -30,7 +30,7 @@ def temp_output_dir(): @pytest.fixture def sample_image(): """Create a sample QImage for testing.""" - image = QImage(100, 100, QImage.Format_RGB32) + image = QImage(100, 100, QImage.Format.Format_RGB32) image.fill(0xFFFFFFFF) # White background return image diff --git a/tests/unit/test_conversions.py b/tests/unit/test_conversions.py index a8d1b98..e10ba87 100644 --- a/tests/unit/test_conversions.py +++ b/tests/unit/test_conversions.py @@ -8,8 +8,8 @@ import sys import os import importlib.util -from PyQt5.QtCore import QPoint, QSize -from PyQt5.QtGui import QPixmap +from PyQt6.QtCore import QPoint, QSize +from PyQt6.QtGui import QPixmap # Import image_label module directly by file path to avoid torch dependency issues image_label_path = os.path.join(os.path.dirname(__file__), '..', '..', 'src', 'digitalsreeni_image_annotator', 'image_label.py') diff --git a/tools/check_pyqt6_torch_coexistence.py b/tools/check_pyqt6_torch_coexistence.py new file mode 100644 index 0000000..c630eeb --- /dev/null +++ b/tools/check_pyqt6_torch_coexistence.py @@ -0,0 +1,104 @@ +""" +Phase 0 gate: confirm PyQt6 and PyTorch can coexist in one process. + +Why this exists +--------------- +The historical ADR-011 documented that on Windows + Python 3.14, +importing PyQt5 first and then loading PyTorch triggers +``WinError 1114`` (DLL load-order conflict between Qt's and Torch's +native deps). That motivated the now-deleted subprocess isolation +layer (sam_worker.py, dino_worker.py, check_worker_isolation.py). + +Migrating to PyQt6 *should* eliminate the conflict — Qt6 reshuffled +its DLL packaging — but that is a hypothesis. This script is the +mechanical check. Run it before deleting any worker code. + +The crucial bit: ``import PyQt6.QtCore`` alone does NOT load Qt's +native platform plugin (qwindows.dll on Windows, libqxcb on Linux). +The plugin is loaded lazily by ``QApplication.__init__``. That's +where the WinError 1114 actually triggers. So this script +constructs a ``QApplication`` after importing both PyQt6 and torch +to exercise the real interaction. + +Usage +----- + python tools/check_pyqt6_torch_coexistence.py + +Run it especially on Windows + Python 3.14. Exit code 0 means the +combination loads cleanly *and* QApplication constructs without +crashing; exit code 1 means at least one stage failed. +""" + +from __future__ import annotations + +import platform +import sys +import traceback + + +def _try(label: str, fn) -> bool: + print(f"[{label}] running ...", flush=True) + try: + result = fn() + except BaseException: # catch SystemExit / segfault recovery too + print(f"[{label}] FAILED:") + traceback.print_exc() + return False + if result is not None and hasattr(result, "__version__"): + print(f"[{label}] OK — version: {result.__version__}", flush=True) + else: + print(f"[{label}] OK", flush=True) + return True + + +def _construct_qapplication(): + """Force Qt's platform plugin to load. + + On Windows this is where qwindows.dll gets loaded, which is the + site of the historical WinError 1114. We use 'offscreen' so the + script runs in a headless CI / SSH context. + """ + import os + # Don't clobber an existing user setting — they may want to test + # the real platform plugin specifically. + os.environ.setdefault("QT_QPA_PLATFORM", "offscreen") + from PyQt6.QtWidgets import QApplication + app = QApplication.instance() or QApplication(sys.argv) + return app + + +def main() -> int: + print(f"Python: {sys.version}") + print(f"Platform: {platform.platform()}") + print(f"Machine: {platform.machine()}") + print("-" * 60) + + # Order matters: PyQt first, then Torch, then Transformers. + # This is the exact order the running app loads them in + # (annotator_window imports PyQt at startup; torch is pulled + # in by ultralytics/transformers when the user picks a model). + ok = True + ok &= _try("PyQt6.QtCore", lambda: __import__("PyQt6.QtCore", fromlist=["QtCore"])) + ok &= _try("PyQt6.QtWidgets", lambda: __import__("PyQt6.QtWidgets", fromlist=["QtWidgets"])) + ok &= _try("PyQt6.QtGui", lambda: __import__("PyQt6.QtGui", fromlist=["QtGui"])) + ok &= _try("torch", lambda: __import__("torch")) + ok &= _try("torchvision", lambda: __import__("torchvision")) + ok &= _try("transformers", lambda: __import__("transformers")) + ok &= _try("ultralytics", lambda: __import__("ultralytics")) + # THIS is the real test — load the Qt platform plugin AFTER torch + # is in the address space. Pure import_module above does not load + # the platform plugin, so a green result without this step would + # be a false positive. + ok &= _try("QApplication construct (loads Qt platform plugin)", _construct_qapplication) + + print("-" * 60) + if ok: + print("RESULT: PyQt6 + Torch coexist cleanly, QApplication constructs.") + print(" Subprocess removal unblocked.") + return 0 + print("RESULT: at least one stage failed. Investigate before merging.") + return 1 + + +if __name__ == "__main__": + sys.exit(main()) diff --git a/tools/check_worker_isolation.py b/tools/check_worker_isolation.py deleted file mode 100644 index e20c030..0000000 --- a/tools/check_worker_isolation.py +++ /dev/null @@ -1,134 +0,0 @@ -""" -Smoke check: importing either ML worker subprocess script must NOT pull -PyQt5 into the interpreter. - -ADR-011 (docs/09_architecture_decisions.md) requires that `sam_worker.py` -and `dino_worker.py` run in a Qt-free process — loading PyQt5 alongside -PyTorch on Windows + Python 3.14 triggers `WinError 1114`. Both workers -have already shipped, been broken, and been fixed once on this exact -invariant; this script is the mechanical guard so it doesn't happen -a third time. - -Usage: - python tools/check_worker_isolation.py - -Exit code 0 = both workers clean. Exit code 1 = PyQt5 leaked into the -import of at least one worker, or a real error occurred. Prints what -went wrong. -""" - -from __future__ import annotations - -import importlib.abc -import importlib.util -import os -import sys - - -REPO_ROOT = os.path.dirname(os.path.dirname(os.path.abspath(__file__))) -WORKERS = [ - os.path.join(REPO_ROOT, "src", "digitalsreeni_image_annotator", "sam_worker.py"), - os.path.join(REPO_ROOT, "src", "digitalsreeni_image_annotator", "dino_worker.py"), -] - - -class _PyQt5Tripwire(importlib.abc.MetaPathFinder): - """Raise on any attempt to import PyQt5 or a PyQt5 submodule.""" - - def __init__(self): - self.tripped = False - self.tripped_by: str | None = None - - def find_spec(self, fullname, path=None, target=None): - if fullname == "PyQt5" or fullname.startswith("PyQt5."): - self.tripped = True - self.tripped_by = fullname - raise ImportError( - f"PyQt5 leaked into worker subprocess via import of {fullname!r}" - ) - return None - - -def _check_one( - worker_path: str, tripwire: _PyQt5Tripwire, pyqt_before: set[str] -) -> tuple[bool, str]: - """Return (ok, message). ok=False means the worker either leaked PyQt5 - or failed to load for an unrelated reason. ``pyqt_before`` is the set of - PyQt5-related sys.modules keys that existed *before* this exec, so we - can diff and only flag new leaks (avoids false positives when this - script is invoked from a Qt-loaded interpreter).""" - name = f"_check_{os.path.basename(worker_path)[:-3]}" - try: - spec = importlib.util.spec_from_file_location(name, worker_path) - if spec is None or spec.loader is None: - return False, f"Cannot create import spec for {worker_path}" - module = importlib.util.module_from_spec(spec) - spec.loader.exec_module(module) - except ImportError as e: - # The tripwire raises ImportError to abort the load. Distinguish - # PyQt5 leakage (the thing we care about) from a missing third-party - # dependency that just makes the smoke check unusable in this env. - if tripwire.tripped: - return False, str(e) - return False, ( - f"Skipped {worker_path}: missing dependency ({e}). " - "Install project requirements to run this check." - ) - except Exception as e: - return False, f"Unexpected error loading {worker_path}: {type(e).__name__}: {e}" - - # Belt-and-braces: even if the tripwire didn't fire, confirm no NEW - # PyQt5 modules landed in sys.modules during this exec. The tripwire - # catches first-import; this catches the case where a future bug - # bypassed the finder. Diff against the pre-exec snapshot so PyQt5 - # already loaded in the caller's interpreter doesn't false-positive. - pyqt_now = {m for m in sys.modules if m == "PyQt5" or m.startswith("PyQt5.")} - newly_loaded = pyqt_now - pyqt_before - if newly_loaded: - return False, ( - f"PyQt5 modules {sorted(newly_loaded)!r} appeared in sys.modules " - f"during exec of {worker_path} — leaked past the tripwire." - ) - return True, f"OK: {os.path.basename(worker_path)} loaded without PyQt5." - - -def main() -> int: - tripwire = _PyQt5Tripwire() - sys.meta_path.insert(0, tripwire) - try: - all_ok = True - skipped_for_deps = False - for worker in WORKERS: - # Snapshot PyQt5-related sys.modules entries before each exec so - # the diff is per-worker and survives a pre-loaded PyQt5. - pyqt_before = { - m for m in sys.modules if m == "PyQt5" or m.startswith("PyQt5.") - } - ok, msg = _check_one(worker, tripwire, pyqt_before) - print(msg) - if not ok: - if "missing dependency" in msg: - skipped_for_deps = True - else: - all_ok = False - # Reset the tripwire between workers — a leak in one shouldn't - # mask a leak in the next. - tripwire.tripped = False - tripwire.tripped_by = None - finally: - try: - sys.meta_path.remove(tripwire) - except ValueError: - pass - - if not all_ok: - return 1 - if skipped_for_deps: - # Couldn't fully verify; surface this so CI/reviewer notices. - print("\nNote: at least one worker was skipped due to missing deps. " - "Re-run after `pip install -e .` for a complete check.") - return 0 - - -if __name__ == "__main__": - sys.exit(main())