Release 0.3.0 by guillaumejaume · Pull Request #204 · mahmoodlab/TRIDENT

guillaumejaume · 2026-04-15T16:11:54Z

Multi-format readers, multi-GPU pipeline, run tracking, and lock hardening

TL;DR

This branch promotes TRIDENT from a single-GPU CLI into a production-grade WSI
processing pipeline:

Two new WSI readers (Zeiss CZI, OpenSlide-DICOM) and a corruption-tolerant
fallback in the OpenSlide path.
Multi-GPU + multi-CPU sharding via --gpus 0 1 2 3 (and --gpus -1 -1 …),
in both standard and cache-pipeline modes.
Self-describing .lock files + safe stale-lock cleanup (--clear_dead_locks),
replacing the previous "delete every lock at startup" behavior.
Per-run / per-slide state and reports: summary.md, runs/<id>.json,
wsi_states/<slide>__<hash>.json. Re-running on the same --job_dir is now
idempotent by design.
New encoder: KEEP (patch). GenBio-PathFM, already in main, is now properly
surfaced, and offline support is extended to gigapath.
Big test bump: 14 new test files (~2.6k lines), including real-data
end-to-end integration tests and a real multi-GPU equivalence stress test.
Docs overhaul: index / quickstart / installation / tutorials / api / FAQ
rewritten to surface the actually-useful features (multi-GPU, resume, cache,
lock cleanup, run reports), with corrected output-directory paths.

No breaking CLI changes. --gpu N (singular) still works; --gpus is preferred.

What's new

1. New WSI readers

Zeiss CZI (trident/wsi_objects/CZIWSI.py, --reader_type czi,
pip install -e ".[czi]").
DICOM through OpenSlide (--reader_type openslide on .dcm).
Corruption-tolerant level reads in OpenSlideWSI: when a single pyramid
level is corrupt, TRIDENT now falls back to the next level instead of failing
the whole slide.
WSIFactory updated to dispatch CZI and DCM cleanly.
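
The level fallback described above can be sketched roughly as follows. This is a minimal illustration, not the code in OpenSlideWSI: `read_with_level_fallback` and its `read_level` callback are hypothetical names, and "next level" is assumed to mean the next higher pyramid index.

```python
from typing import Callable, Optional, Tuple, TypeVar

T = TypeVar("T")


def read_with_level_fallback(
    read_level: Callable[[int], T],
    level: int,
    level_count: int,
) -> Tuple[int, T]:
    """Try `level`, then each following level, and return the first that reads.

    `read_level` is a hypothetical callback that reads one pyramid level and
    raises on corruption. The level actually used is returned with the data.
    """
    last_error: Optional[Exception] = None
    for candidate in range(level, level_count):
        try:
            return candidate, read_level(candidate)
        except Exception as exc:  # a real reader would catch a narrower type
            last_error = exc
    raise RuntimeError(f"no readable level at or after {level}") from last_error
```

A caller would need to rescale coordinates for the level that was actually returned, which is why the sketch returns the level index alongside the data.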

2. Multi-GPU / multi-CPU pipeline

New --gpus flag (nargs='+'): pending slides are sharded round-robin across
the listed device IDs.
Multi-CPU fallback: --gpus -1 -1 runs N independent CPU workers (useful for
segmentation-only or otsu pipelines on machines without a GPU).
Smart dedup: duplicate positive GPU IDs are deduplicated (running two
workers on the same CUDA device wastes memory), but -1 entries are kept
(each is an independent CPU worker).
Cross-platform multiprocessing context: prefers forkserver on POSIX,
spawn on Windows / when CUDA is in use, so the pipeline works on Linux,
macOS, and Windows.
DataLoader pickling fallback in WSI.py: when the chosen mp context can't
pickle a complex object (e.g. WSIPatcher), TRIDENT transparently retries
with a different context, then with single-process loading.
Backward-compatible: --gpu N still works; if both are given, --gpus wins
and a one-line warning is printed.
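
To make the flag semantics above concrete, here is a small sketch of the parsing, dedup, and round-robin rules. It is an illustration under stated assumptions, not TRIDENT's implementation: `build_parser`, `resolve_devices`, and `shard_round_robin` are hypothetical helpers, the default of device 0 is assumed, and any non-negative ID is treated as a GPU.

```python
import argparse
import warnings
from typing import List, Optional, Sequence


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    parser.add_argument("--gpu", type=int, default=None)   # legacy, singular
    parser.add_argument("--gpus", type=int, nargs="+", default=None)
    return parser


def resolve_devices(gpu: Optional[int], gpus: Optional[Sequence[int]]) -> List[int]:
    """Pick the device list: --gpus wins over --gpu; dedup GPUs, keep every -1."""
    if gpus is not None:
        if gpu is not None:
            warnings.warn("--gpu is ignored because --gpus was given")
        requested = list(gpus)
    elif gpu is not None:
        requested = [gpu]
    else:
        requested = [0]  # assumed default for this sketch
    seen = set()
    devices: List[int] = []
    for device in requested:
        if device >= 0:
            if device in seen:
                continue  # a second worker on the same CUDA device wastes memory
            seen.add(device)
        devices.append(device)  # each -1 is its own CPU worker
    return devices


def shard_round_robin(slides: Sequence[str], n_workers: int) -> List[List[str]]:
    """Worker i gets slides i, i + n_workers, i + 2 * n_workers, and so on."""
    return [list(slides[i::n_workers]) for i in range(n_workers)]
```

For example, `--gpus 0 1 1 -1 -1` resolves to four workers (`[0, 1, -1, -1]`), and five slides sharded across two workers split three and two.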

3. Lock hardening

Self-describing locks: .lock files are now JSON containing pid,
hostname, created_at, so TRIDENT (and operators) can tell which process
owns a given lock.
Safe stale-lock cleanup: new --clear_dead_locks flag (and
trident.IO.clear_dead_locks(...) API) removes locks only when one of:
1. the target output already exists,
2. the writer PID is dead on this host, or
3. the lock is unreadable / legacy and older than --dead_lock_max_age_hours
  (default 24h).
Active locks from running jobs are never removed.
The previous startup-time cleanup_files() (which deleted all .lock
files unconditionally — dangerous for multi-user / multi-job dirs) has been
split: cache cleanup is now its own opt-in step (cleanup_cache), and lock
cleanup is gated by --clear_dead_locks.
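
The three removal conditions can be sketched as below. This is a hedged illustration only, not the body of trident.IO.clear_dead_locks: `write_lock`, `pid_is_alive`, and `lock_is_dead` are hypothetical names, `created_at` is assumed to be epoch seconds, lock age is taken from file mtime, and the PID probe is POSIX-only (do not use `os.kill(pid, 0)` on Windows).

```python
import json
import os
import socket
import time
from pathlib import Path


def write_lock(lock_path: Path) -> None:
    """Write a self-describing lock (field names from the PR; format assumed)."""
    payload = {
        "pid": os.getpid(),
        "hostname": socket.gethostname(),
        "created_at": time.time(),  # assumption: epoch seconds
    }
    lock_path.write_text(json.dumps(payload))


def pid_is_alive(pid: int) -> bool:
    """POSIX-only probe: signal 0 checks existence without sending a signal."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # process exists but belongs to another user
    return True


def lock_is_dead(lock_path: Path, target_path: Path, max_age_hours: float = 24.0) -> bool:
    # 1. The target output already exists.
    if target_path.exists():
        return True
    try:
        info = json.loads(lock_path.read_text())
        pid, host = int(info["pid"]), info["hostname"]
    except (OSError, ValueError, KeyError, TypeError):
        # 3. Unreadable or legacy lock: only dead once it is old enough.
        age_hours = (time.time() - lock_path.stat().st_mtime) / 3600
        return age_hours > max_age_hours
    # 2. The writer PID is dead on this host.
    if host == socket.gethostname():
        return not pid_is_alive(pid)
    return False  # lock from another host cannot be probed, so keep it
```

Keeping locks from other hosts is the conservative choice in this sketch: a PID can only be checked on the machine that wrote it.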

4. Run tracking and per-slide state

New trident/State.py and trident/Summary.py give every run:

summary.md: appended once per run; counts (completed / skipped /
errored), per-encoder breakdown, and a short error list.
runs/<run_id>.json: per-run manifest (CLI args, timestamps, status).
wsi_states/<slide>__<hash>.json: per-slide machine-readable state
with task-level status, attempts (timings), outputs, and resume info.

Re-running on the same --job_dir skips already-completed (and unlocked)
work. This makes long jobs tolerant to wall-time cutoffs, node failures, and
SIGKILL-by-scheduler.
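
As a rough picture of how per-slide state files enable this skip-on-rerun behavior, consider the sketch below. Everything in it is an assumption made for illustration: the helper names, the `tasks`/`status` schema, and the choice of a truncated SHA-1 of the slide path for `<hash>` are not taken from trident/State.py, and the lock check is omitted.

```python
import hashlib
import json
from pathlib import Path


def state_path(job_dir: Path, slide_path: str) -> Path:
    """Build wsi_states/<slide>__<hash>.json (hash input is an assumption)."""
    slide = Path(slide_path).stem
    digest = hashlib.sha1(slide_path.encode("utf-8")).hexdigest()[:8]
    return job_dir / "wsi_states" / f"{slide}__{digest}.json"


def should_skip(job_dir: Path, slide_path: str, task: str) -> bool:
    """True when a previous run recorded this task as completed."""
    path = state_path(job_dir, slide_path)
    if not path.exists():
        return False
    state = json.loads(path.read_text())
    return state.get("tasks", {}).get(task, {}).get("status") == "completed"


def mark_completed(job_dir: Path, slide_path: str, task: str) -> None:
    """Record task completion, preserving any other tasks already stored."""
    path = state_path(job_dir, slide_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    state = json.loads(path.read_text()) if path.exists() else {"tasks": {}}
    state.setdefault("tasks", {})[task] = {"status": "completed"}
    path.write_text(json.dumps(state, indent=2))
```

Including a hash in the filename keeps two slides with the same stem in different folders from sharing a state file, which is one plausible reason for the `<slide>__<hash>` naming.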

5. Patch / slide encoders

KEEP patch encoder added (768-d, Astaxanthin/KEEP), with both online
and local-directory loading.
gigapath slide encoder now honors local_ckpts.json for offline
clusters.
BasePatchEncoder.ensure_valid_weights_path accepts either a checkpoint
file or a model directory (needed by HF-style local mirrors like KEEP).
trident-doctor no longer flags missing HF token unless --check-gated is
passed (eliminates false alarms for users who only run non-gated models).
Removed dataclasses / pydantic from trident/: zero runtime dependency
on either.

6. `Processor` API

New selected_wsi_paths= kwarg lets a worker process a pre-sharded slice of
slides without re-running discovery.
Processor now uses ExitStack for slide context management, so a failure
during init releases the slides that were already opened.
mpp lookup from --custom_list_of_wsis now respects per-slide ordering via
the wsi column (previously it was a positional list and could mismatch
mpp to the wrong slide).
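
The ExitStack behavior mentioned above follows a standard-library pattern, sketched here with a stand-in class. `FakeSlide` and `open_all` are hypothetical and only mimic a slide handle; the actual Processor code may be organized differently.

```python
from contextlib import ExitStack
from typing import List, Sequence, Tuple


class FakeSlide:
    """Hypothetical stand-in for a slide handle; not a TRIDENT class."""

    def __init__(self, name: str, log: List[str]) -> None:
        self.name, self.log = name, log

    def __enter__(self) -> "FakeSlide":
        if self.name.startswith("bad"):
            raise IOError(f"cannot open {self.name}")
        self.log.append(f"open {self.name}")
        return self

    def __exit__(self, *exc_info) -> bool:
        self.log.append(f"close {self.name}")
        return False


def open_all(names: Sequence[str], log: List[str]) -> Tuple[ExitStack, List[FakeSlide]]:
    """Open every slide, or close the ones already opened if any open fails."""
    with ExitStack() as stack:
        slides = [stack.enter_context(FakeSlide(n, log)) for n in names]
        # Every open succeeded: hand ownership to the caller so the slides
        # stay open; the caller closes them later via the returned stack.
        return stack.pop_all(), slides
```

If the third slide fails to open, the first two are closed in reverse order before the exception propagates; on success, calling `close()` on the returned stack releases everything.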

7. Documentation

A full pass to surface the actually-useful features instead of generic bullet
points:

docs/index.rst: new "Highlights" section organized as Pipeline / Scale /
Reliability / Models / Formats / Operability.
docs/quickstart.rst: rewritten as a working reference with sections on
outputs, resume / skip behavior, multi-GPU + multi-worker, caching, stage-only
examples, common failure modes, and a tight cheat-sheet table; the
auto-generated parser help is still included verbatim at the bottom.
docs/installation.rst: documents .[czi] and .[omezarr] extras,
expands trident-doctor usage (profiles, --check-gated, --format json for
CI), and clarifies what .[full] does and doesn't include.
docs/tutorials.rst: new recipes for multi-GPU production runs and
resuming after a crash.
docs/api.rst: adds KEEP and GenBio-PathFM rows; new "Notes for
power users" section on resume / lock cleanup / multi-GPU from Python.
docs/faq.rst: documents --clear_dead_locks and adds a multi-GPU FAQ
entry.
docs/generated/run_batch_of_slides_help.txt regenerated (now includes
--gpus, --clear_dead_locks, --dead_lock_max_age_hours, keep).
README.md: rewritten Key Features list (specific encoder names, multi-GPU
details, cache pipeline, run reports, lock cleanup); fixed wrong output paths
(./trident_processed/20x_256px/... → ./trident_processed/20x_256px_0px_overlap/...).

Sphinx build is clean (no warnings).

Test coverage

14 new test files, ~2.6k lines added.

| Tier | New tests |
| --- | --- |
| Fast unit | `test_run_batch_of_slides.py` (lock cleanup, dead-lock cleanup, GPU dedup, parser); `test_processor_selected_wsi_paths.py`; `test_processor_czi_discovery.py`; `test_summary_md.py`; `test_wsi_states_v2.py` |
| CZI reader | `test_czi_reader.py`; `test_czi_huggingface_feature_extraction.py` |
| Multi-GPU equivalence (mocked) | `test_multi_gpu_equivalence_patch_encoders.py` (`uni_v1`, `conch_v1`) |
| Multi-GPU equivalence (real) | `test_real_multi_gpu_equivalence.py`: launches real subprocesses on `cuda:0` and `cuda:1`, processes 2 real WSIs end-to-end, and asserts content-identical coords + features against the single-GPU run |
| Heavy real-data integration | `test_run_batch_of_slides_integration_outputs.py`: 7 tests covering exact UNI v1 first-embedding values, idempotent re-runs, `--task all` ≡ `seg → coords → feat`, single-worker ≡ multi-worker (CPU), coords determinism, `--wsi_cache` ≡ non-cache, `--dump_patches` count |

Verified locally

Fast tier: 64 passed, 45 skipped (intentionally gated).
Integration tier (TRIDENT_RUN_INTEGRATION_TESTS=1): 69 passed.
GPU tier (TRIDENT_RUN_GPU_TESTS=1): 15 passed.
Heavy real-data tier: 7 passed (~2.5 min on CPU).
Real multi-GPU stress test (2× CUDA): 1 passed (~30 s).
Combined full sweep: 80 passed, 0 failures, 0 flakes.

Sphinx build: clean.

Migration notes

--gpu is still supported but --gpus is preferred. If both are passed,
--gpus wins and TRIDENT prints a one-line deprecation warning.
.lock cleanup is no longer automatic on startup. If you previously
relied on TRIDENT wiping locks for you, add --clear_dead_locks to the next
run. Existing scripts that don't pass the flag are now safer (they won't
step on another job's locks) but may need this one-time migration.
--wsi_cache directory is still wiped at startup (unchanged).
Output directory name unchanged: features still land under
<job_dir>/<mag>x_<patch>px_<overlap>px_overlap/features_<encoder>/<slide>.h5
— the README previously documented the wrong path; the actual code is
unchanged.
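
For scripts that locate features programmatically, the layout above can be expressed as a small helper. This is a convenience sketch: `feature_path` is a hypothetical function, not part of the TRIDENT API, and it only formats the documented pattern.

```python
from pathlib import Path
from typing import Union


def feature_path(
    job_dir: Union[str, Path],
    mag: int,
    patch_size: int,
    overlap: int,
    encoder: str,
    slide: str,
) -> Path:
    """Format <job_dir>/<mag>x_<patch>px_<overlap>px_overlap/features_<encoder>/<slide>.h5."""
    return (
        Path(job_dir)
        / f"{mag}x_{patch_size}px_{overlap}px_overlap"
        / f"features_{encoder}"
        / f"{slide}.h5"
    )
```

With `job_dir="./trident_processed"`, 20x magnification, 256 px patches, and 0 px overlap, the parent folder is `20x_256px_0px_overlap`, matching the corrected README path.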

guillaumejaume · 2026-04-20T17:13:46Z

@copilot resolve the merge conflicts in this pull request

guillaumejaume · 2026-04-21T15:59:08Z

@winglet0996, could you check this PR? Let me know if critical things are missing. thx!

winglet0996 · 2026-04-21T16:31:11Z

@guillaumejaume Thanks for the excellent engineering! I think everything works quite well and I just noticed the docs need update accordingly? Much appreciated!

guillaumejaume added 2 commits April 15, 2026 17:57

feat: czi wsi reader

9d21fd3

feat: support dcm from openslide

1ebfeca

guillaumejaume marked this pull request as draft April 15, 2026 16:12

guillaumejaume added 5 commits April 16, 2026 09:36

feat: wsi state and summary

6320c62

docs: improve readme

bba10b4

docs: improve readme

4e91047

docs: improve faq

22ee223

Fix formatting and clarify optional install profiles

b12135d

Merge branch 'main' into gja/feature/slidereaders

fa9557c

winglet0996 mentioned this pull request Apr 20, 2026

multigpu_v2 #208

Merged

winglet0996 and others added 4 commits April 21, 2026 13:25

multigpu_v2 (#208)

20ac88e

test: increase coverage of integration tests

2fbc21f

feat: handle dead locks

559843f

fix: better handling of locks

d4b2d22

docs: improve

df6ad90

guillaumejaume changed the title ~~Gja/feature/slidereaders~~ Release 0.3.0 May 5, 2026

guillaumejaume marked this pull request as ready for review May 5, 2026 13:47

guillaumejaume merged commit e0dbde6 into main May 5, 2026
2 checks passed

guillaumejaume deleted the gja/feature/slidereaders branch May 29, 2026 12:07

Release 0.3.0 #204: guillaumejaume merged 13 commits into main from gja/feature/slidereaders
