Docker one-command WSI TIL inference + smoke test #3

Open
YoniSchirris wants to merge 9 commits into main from claude/docker-wsi-inference-2KL7d

YoniSchirris (Collaborator) commented May 26, 2026

Summary

  • Adds a one-command, Dockerised pipeline to run end-to-end ECTIL TIL inference on whole-slide images: tissue mask (FESI) → DLUP tiling → RetCCL feature extraction → MeanMIL + GatedAttention → TIL score. Supports a single slide or a directory of slides, writes a timestamped run dir with per-slide artifacts (tils_score.json, tile_predictions.csv, features.h5, thumbnail/mask/heatmap PNGs) and an aggregate tils_scores.csv.
  • Fixes the Docker build: pins setuptools/wheel/packaging as a matched set with pip==23.3.2 (the editable install otherwise died on canonicalize_version(strip_trailing_zero=)), and relaxes python_requires from ==3.10.9 to >=3.10,<3.11 so conda-resolved 3.10.x patches install.
  • Adds --overwrite-mpp so slides without an embedded spacing (many TCGA SVS) can be opened/tiled instead of raising dlup's UnsupportedSlideError. Off by default (None); when set, it supplies the slide's native base spacing so dlup can resample to the model's target --mpp 0.5 (the value used in the extraction experiments — see configs/datamodule/encoder/retccl.yaml). For TCGA 40x diagnostic slides that native spacing is 0.25.
  • Adds tools/infer/run_demo.sh: a standalone smoke test (plain bash, no external tooling) that downloads the RetCCL + ECTIL weights and 5 TCGA-BRCA slides, builds the image, runs both single-slide and directory modes on CPU, and asserts the outputs before printing SMOKE TEST PASSED/FAILED.
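The spacing arithmetic behind --overwrite-mpp can be sketched as below. This is a minimal illustration under stated assumptions, not dlup's implementation: the helper names scaling_for and level0_extent are hypothetical, and the 512-pixel tile size is a placeholder chosen only to make the numbers concrete.

```python
def scaling_for(native_mpp: float, target_mpp: float) -> float:
    """Factor applied to level-0 pixel dimensions to reach target_mpp."""
    if native_mpp <= 0 or target_mpp <= 0:
        raise ValueError("spacings must be positive")
    return native_mpp / target_mpp


def level0_extent(tile_size_px: int, native_mpp: float, target_mpp: float) -> float:
    """Side length, in level-0 pixels, of the region covered by one tile."""
    return tile_size_px / scaling_for(native_mpp, target_mpp)


# TCGA 40x diagnostic slide (0.25 mpp) resampled to the model's 0.5 mpp.
# The tile size of 512 is a hypothetical value, not taken from the configs.
print(scaling_for(0.25, 0.5))         # 0.5
print(level0_extent(512, 0.25, 0.5))  # 1024.0
```

A wrong --overwrite-mpp value shifts this factor, so tiles would cover a different physical area than the model was trained on; that is presumably why the flag is off by default.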

Verification

Smoke test passed end-to-end on CPU (both modes, all assertions green). Real TIL scores on 5 TCGA-BRCA slides:

Slide          TIL score   Tiles
TCGA-AC-A23G   5.3%        179
TCGA-OL-A5RW   10.3%       102
TCGA-OL-A5RX   7.9%        205
TCGA-OL-A5RZ   24.0%       266
TCGA-OL-A5S0   12.7%       147

Test plan

This is a single command. It downloads everything, builds the container, runs both inference modes, and validates the output. No weights, no slides, and no Python deps need to be set up by hand.

1. Clone and enter the repo

git clone https://github.com/NKI-AI/ectil
cd ectil

You need Docker running, plus curl and python3 on the host. python3 is used only to bootstrap gdown and parse the result JSON; all the heavy C deps live inside the image. If anything is missing, the script refuses to start and tells you exactly what.
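For illustration, a pre-flight check of this kind could look roughly like the sketch below. It is not the logic in run_demo.sh (which is plain bash); missing_tools is a hypothetical helper, and it only checks that the executables are on PATH, not that the Docker daemon is actually running.

```python
import shutil

# Tools named in the test plan as host prerequisites.
REQUIRED_TOOLS = ("docker", "curl", "python3")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools that cannot be found on PATH, in the order given."""
    return [tool for tool in tools if which(tool) is None]


missing = missing_tools()
if missing:
    print("missing prerequisites:", ", ".join(missing))
else:
    print("all prerequisites found on PATH")
```

The which parameter is injectable so the check can be exercised without touching the real PATH.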

2. Run the one command

./tools/infer/run_demo.sh

That's it. Everything below happens automatically, and it's idempotent — re-running skips anything already downloaded.
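The "skip anything already downloaded" behaviour can be sketched as follows. This is an assumption about the general pattern, not a transcription of run_demo.sh: ensure_file is a hypothetical helper, the fetch callable stands in for whatever downloader is used, and writing to a temporary .part file first is a design choice of this sketch so that an interrupted download is not mistaken for a finished one.

```python
from pathlib import Path
from typing import Callable


def ensure_file(dest: Path, fetch: Callable[[Path], None]) -> bool:
    """Fetch dest only if it is missing or empty.

    Returns True if a download ran, False if the existing file was reused.
    """
    if dest.is_file() and dest.stat().st_size > 0:
        return False
    dest.parent.mkdir(parents=True, exist_ok=True)
    # Download to a sibling temp file, then rename into place.
    tmp = dest.with_name(dest.name + ".part")
    fetch(tmp)
    tmp.replace(dest)
    return True
```

Calling it twice with the same destination runs the fetcher once; the second call returns False without touching the network.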

3. What you should see scroll past

The script narrates 7 stages. Roughly:

==> [1/7] Pre-flight checks ...
    docker, curl present; daemon running.
==> [2/7] RetCCL encoder weights ...        # ~94 MB from Google Drive
    -> model_zoo/retccl/retccl_best_ckpt.pth
==> [3/7] ECTIL classifier weights ...      # ~4.7 MB
    -> model_zoo/ectil/tcga/fold_0/epoch_065_step_858_weights_only.ckpt
==> [4/7] TCGA-BRCA slides (5 total) ...    # downloaded to data/wsi/demo/*.svs
    have 5 slides in data/wsi/demo.
==> [5/7] Building Docker image 'ectil-inference' (cached after first build) ...
==> [6/7] Running SINGLE-SLIDE mode (...) ...   # --wsi <one .svs>  -> 1 result
==> [6/7] Running DIRECTORY mode (all 5 slides ...) ...   # --wsi data/wsi/demo -> 5 results
==> [7/7] Validating outputs ...
    [ok]   tils_scores.csv has 1 data row(s)
    [ok]   5 per-slide subdir(s)
    [ok]   score sane for TCGA-...: 0.xx
    ...

The two [6/7] runs are the point of the test: single-slide (--wsi <one .svs>, expect 1 row) and directory (--wsi data/wsi/demo, expect 5 rows) modes.
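The row-count expectation for the two modes can be checked by hand with a few lines of Python. This is a sketch, assuming tils_scores.csv has exactly one header row followed by one row per slide; count_data_rows is a hypothetical helper and the column layout is not assumed.

```python
import csv
from pathlib import Path


def count_data_rows(csv_path: Path) -> int:
    """Number of non-empty rows after the header row."""
    with open(csv_path, newline="") as fh:
        rows = [row for row in csv.reader(fh) if row]
    return max(len(rows) - 1, 0)


# Usage after a demo run (paths as written by the smoke test):
#   count_data_rows(Path("data/inference_output/demo_single/tils_scores.csv"))  # expect 1
#   count_data_rows(Path("data/inference_output/demo_dir/tils_scores.csv"))     # expect 5
```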

4. What "pass" looks like

It ends with a per-slide score table and:

============================================================
SMOKE TEST PASSED
TIL score (single-slide TCGA-...): 0.xxxx   |   run dir: data/inference_output/demo_single

and exits 0. On any problem it prints SMOKE TEST FAILED with the offending [FAIL] lines and exits 1.

5. Inspect the artifacts (optional)

ls data/inference_output/demo_dir/          # config.json, tils_scores.csv, 5 per-slide subdirs
cat data/inference_output/demo_dir/tils_scores.csv

Each per-slide subdir holds: tils_score.json, tile_predictions.csv, features.h5, and thumbnail.png / mask.png / mask_overlay.png / attention_heatmap.png / til_heatmap.png. Spot-check a til_heatmap.png against the slide to sanity-check the spatial predictions.
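To check a per-slide subdir against the file list above without eyeballing it, something like the following sketch works. The file names are the ones listed in this description; missing_artifacts is a hypothetical helper, not part of the repo.

```python
from pathlib import Path

# Per-slide outputs named in the PR description.
EXPECTED_ARTIFACTS = (
    "tils_score.json",
    "tile_predictions.csv",
    "features.h5",
    "thumbnail.png",
    "mask.png",
    "mask_overlay.png",
    "attention_heatmap.png",
    "til_heatmap.png",
)


def missing_artifacts(slide_dir: Path) -> list[str]:
    """Return the expected file names that are absent from slide_dir."""
    return [name for name in EXPECTED_ARTIFACTS if not (slide_dir / name).is_file()]
```

An empty list means the subdir is complete; it says nothing about whether the contents are sensible, which is what the heatmap spot-check is for.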

Everything downloaded (data/wsi, model_zoo/**) and written (data/inference_output) is gitignored.


Requesting @NUltee as reviewer (recent commits on main).

🤖 Generated with Claude Code

claude and others added 5 commits May 26, 2026 08:44
Provide a single end-to-end entry point (ectil/inference.py) that takes a WSI
and ECTIL classifier weights and runs mask -> tiling -> RetCCL -> ECTIL,
auto-loading RetCCL. Reuses the existing DLUP tiling/FESI mask, RetCCL encoder,
and MeanMIL + GatedAttention components so results match extract.py + eval.py.

Per slide it writes the final TIL score (tils_score.json), per-tile TIL and
attention scores (tile_predictions.csv), the generated feature dataset
(features.h5), thumbnail/mask/mask-overlay images, and attention/TIL heatmaps.

Add a Dockerfile (conda-based, mirrors README install), .dockerignore, an
example run script, and README usage. Weights are mounted at runtime.

https://claude.ai/code/session_018mX7wRvMnm23m4uf44Upq8

--wsi now accepts a directory of slides (recursively globbed by extension,
including .mrxs, whose companion data directory is never matched). Slides that
fail are skipped and recorded rather than aborting the run. RetCCL and the ECTIL
classifier are loaded once and reused across all slides.

Each run writes a timestamped <output>/<run_name>/ directory (override with
--run-name) containing config.json, an aggregate tils_scores.csv (one row per
slide, written incrementally), and a per-slide subdir. Per-slide tils_score.json
now embeds the full run config.

https://claude.ai/code/session_018mX7wRvMnm23m4uf44Upq8

The editable install failed inside the image because pip 23.3.2 was paired
with a setuptools whose _core_metadata calls canonicalize_version with
strip_trailing_zero, a kwarg the resolved packaging lacked. Pin
setuptools/wheel/packaging as a matched set. Also relax python_requires from
==3.10.9 to >=3.10,<3.11 so conda-resolved 3.10.x patch releases install.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Add --overwrite-mpp so TCGA-style SVS that lack an embedded micron-per-pixel
can be opened and tiled (dlup otherwise raises UnsupportedSlideError);
forwarded to both SlideImage.from_file_path and from_standard_tiling. Decode
the slide thumbnail a single time and reuse it for the saved PNG and both
heatmap overlays, hoist csv/PIL imports, and derive --mask-function choices
from AvailableMaskFunctions.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

run_demo.sh downloads the RetCCL encoder and ECTIL classifier weights and five
TCGA-BRCA slides, builds the Docker image, runs both single-slide and directory
inference on CPU, and asserts the expected per-slide outputs before printing a
SMOKE TEST PASSED/FAILED verdict. Ignore the demo's data/inference_output run dir.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

YoniSchirris requested a review from NUltee May 26, 2026 13:49

YoniSchirris and others added 2 commits May 26, 2026 15:57
Lead with WSI inference (smoke test, Docker, direct), move the manuscript-
reproduction details into a collapsible section, and surface run_demo.sh.
Fix broken example commands (missing line-continuation backslashes in the
extract and eval snippets), a malformed markdown link, the clone URL
(YoniSchirris -> nki-ai, also in setup.py), and note --overwrite-mpp for
TCGA slides that lack an embedded spacing.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

continuumio/miniconda3:latest was bumped to conda 26.x on 2026-04-29.
On a clean pull the conda-forge solve for the WSI libs (openslide /
pixman / libvips) could swap python for graalpy, after which
`pip install torch==2.4.1` fails because torch has no GraalPy wheels:

    Could not find a version that satisfies the requirement torch==2.4.1

Mac developers didn't see this because they had the old :latest cached.

- Pin the base image to continuumio/miniconda3:24.11.1-0.
- Constrain `python=3.10.9` to the *_cpython build at conda create time
  and write `python 3.10.9` to conda-meta/pinned so the second
  conda install can't swap CPython out when resolving the WSI deps.
- Add .gitattributes (eol=lf for *.sh / Dockerfile) so shell scripts
  don't get CRLF on Windows checkouts and break bash.

@YoniSchirris (Collaborator, Author)

@NUltee — pushed dbb4868 which should fix the smoke-test failure you hit on Windows at stage 5/7 (Could not find a version that satisfies the requirement torch==2.4.1, with the frozen graalpy pip hook warning).

Root cause: continuumio/miniconda3:latest was bumped to conda 26.x on 2026-04-29. On your clean Windows pull, the conda-forge solve for the WSI libs (openslide / pixman / libvips) was swapping CPython for GraalPy in the env, after which pip install torch==2.4.1 fails because torch ships no GraalPy wheels on PyPI. You were right that Windows-vs-Mac shouldn't matter for a containerised build — what actually differed was that my Mac had the old :latest cached and yours pulled the new one. Mutable tag, not OS.

Fix:

  • Pinned the base image to continuumio/miniconda3:24.11.1-0.
  • Constrained python=3.10.9 to the *_cpython build at conda create time and wrote python 3.10.9 to conda-meta/pinned as belt-and-braces so a future solver can't swap CPython out when resolving the WSI deps.
  • Added .gitattributes enforcing eol=lf on *.sh and Dockerfile so a Windows checkout with core.autocrlf=true doesn't give you CRLF shell scripts that bash refuses to run.
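If you want to confirm whether a checkout is affected by the CRLF issue before rebuilding, a quick scan like the one below would do. It is a sketch only: crlf_scripts and has_crlf are hypothetical helpers, and the patterns simply mirror the *.sh / Dockerfile rules from the new .gitattributes.

```python
from pathlib import Path


def has_crlf(path: Path) -> bool:
    """True if the file contains at least one Windows-style line ending."""
    return b"\r\n" in Path(path).read_bytes()


def crlf_scripts(root: Path, patterns=("*.sh", "Dockerfile")) -> list[Path]:
    """Files under root matching the patterns that contain CRLF line endings."""
    hits = []
    for pattern in patterns:
        hits.extend(p for p in root.rglob(pattern) if p.is_file() and has_crlf(p))
    return sorted(hits)


# Example: list offending files in the current checkout.
for offender in crlf_scripts(Path(".")):
    print(offender)
```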

Could you do a git pull && docker build --no-cache -t ectil-inference . and re-run ./tools/infer/run_demo.sh? If it still trips, grab the build log around the failing pip install — specifically python --version and which python — and I'll dig further.

Adds .github/workflows/docker.yml: on push to main and on v* tags,
build linux/amd64 and push to ghcr.io/nki-ai/ectil-inference with
tags :latest (main), :vX.Y.Z (releases), and :sha-<short> (per
commit). PRs that touch the Dockerfile / requirements / workflow
trigger a build-only validation (no push).

Updates the docs and the runnable wrapper to prefer pulling the
published image over building locally, while keeping the local-build
path as a one-liner alternative. The local-build smoke test
(tools/infer/run_demo.sh) is intentionally left building from source
so it keeps validating the Dockerfile end-to-end.

Note: the first successful push from main creates the package under
the nki-ai org and defaults to private; flip it to public in the
GitHub package settings to let unauthenticated users docker pull.

@YoniSchirris (Collaborator, Author)

Follow-up: pushed 85a8f09 adding a GHCR publish workflow so future users don't need to build at all.

What's in it:

  • .github/workflows/docker.yml — on push to main and on v* tags, builds linux/amd64 and pushes to ghcr.io/nki-ai/ectil-inference with tags :latest (main), :vX.Y.Z (releases), and :sha-<short> (per commit). PRs that touch the Dockerfile / requirements / the workflow itself trigger a build-only validation (no push), so this PR will exercise the workflow once it kicks off.
  • Updated README.md, Dockerfile header, and tools/infer/infer_docker.sh to prefer docker pull ghcr.io/nki-ai/ectil-inference:latest, with the local docker build kept as the fallback. tools/infer/run_demo.sh is left building locally on purpose — it's the end-to-end validator for the Dockerfile itself.

Heads-up for whoever merges:

  1. The first push from main creates the package under the nki-ai org and defaults to private. To let unauthenticated users docker pull, flip it to public at https://github.com/orgs/NKI-AI/packages → ectil-inference → Package settings → Change visibility.
  2. If org settings restrict Actions from creating packages, the publish step will 403. Fix at Org settings → Actions → General → "Workflow permissions" → ensure GITHUB_TOKEN has packages: write (the workflow already requests it).
  3. Single-arch (linux/amd64) only. Apple-silicon Macs will run the image under emulation (Rosetta/qemu), which works for CPU inference but is slow; happy to add linux/arm64 if anyone needs it natively.

Without this the PR build is build-only, which means anyone reviewing
this PR still has to docker build locally - defeating the point of
publishing the image in the first place.

Same-repo PRs now push a :pr-<number> tag so reviewers can
`docker pull ghcr.io/nki-ai/ectil-inference:pr-3` to exercise the
branch's container without building. Forked PRs stay build-only
(no writable GITHUB_TOKEN; we don't want unreviewed fork code
pushing to our registry anyway).

@YoniSchirris (Collaborator, Author)

Good catch from offline — pushed 0b4445a so this PR is actually testable via pull, not just post-merge.

Same-repo PRs now publish a :pr-<number> tag, so once the workflow on this PR finishes, @NUltee can:

docker pull ghcr.io/nki-ai/ectil-inference:pr-3
docker run --rm \
    -v /path/to/slides:/input:ro \
    -v /path/to/weights:/weights:ro \
    -v /path/to/output:/output \
    ghcr.io/nki-ai/ectil-inference:pr-3 \
        --wsi /input/slide.svs \
        --classifier-weights /weights/ectil_fold_0_weights_only.ckpt \
        --retccl-weights /weights/retccl_best_ckpt.pth \
        --output /output

Forked PRs stay build-only (no writable token from a fork, and we don't want unreviewed fork code pushing to the registry).
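Putting the tagging rules from this thread in one place, the decision can be sketched as a small function. This is an illustration of the rules as described here, not the workflow file itself: image_tags is hypothetical, the 7-character short SHA is an assumption, and so is attaching the sha- tag only to pushes. The path filter (PRs only trigger when they touch the Dockerfile, requirements, or the workflow) is not modelled.

```python
from typing import Optional


def image_tags(event: str, ref: str, sha: str,
               pr_number: Optional[int] = None,
               from_fork: bool = False) -> list[str]:
    """Tags to push for one workflow run; an empty list means build-only."""
    if event == "pull_request":
        # Forked PRs never push; same-repo PRs get a pr-<number> tag.
        if from_fork or pr_number is None:
            return []
        return [f"pr-{pr_number}"]
    if event != "push":
        return []
    if ref == "refs/heads/main":
        tags = ["latest"]
    elif ref.startswith("refs/tags/v"):
        tags = [ref[len("refs/tags/"):]]
    else:
        return []
    # Assumption: a per-commit tag using a 7-character short SHA.
    tags.append(f"sha-{sha[:7]}")
    return tags


print(image_tags("pull_request", "refs/pull/3/merge", "0b4445a", pr_number=3))
# ['pr-3']
```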

Two-step caveat for the very first run: the GHCR package doesn't exist yet, so this PR's workflow run will be the one that creates it under nki-ai and it'll default to private. To pull without auth, flip it to public at https://github.com/orgs/NKI-AI/packages → ectil-inference → Package settings → Change visibility. After that the :pr-3 tag is pullable directly.
