This repository contains the full pipeline for detecting metallic implant anomalies in pelvic MR images using unsupervised deep learning. The work is based on the SynthRAD 2023 Pelvis dataset and evaluates ten anomaly detection models from five model families in a one-class classification setting, where only normal (implant-free) scans are used for training.
Four controlled experiments were conducted:
| # | Research Question | Approach |
|---|---|---|
| 1 | Which anomaly detection model family best identifies metallic implant artifacts in pelvic MR under standardized conditions? | Benchmark 10 models across 5 families (Reconstruction, Knowledge Distillation, Memory Bank, Normalizing Flow, One-Class) on replicated-channel NIfTI slices from a single center |
| 2 | How does input representation affect anomaly detection — does synthetic color encoding or local 3D context improve localization over a plain replicated baseline? | Re-evaluate the best model per family across three input formats: replicated-channel NIfTI, bone colormap PNG, and 2.5D consecutive slices |
| 3 | Does replacing an ImageNet backbone with a RadImageNet backbone improve anomaly detection for pelvic MR? | Swap the frozen WRN50 (ImageNet) encoder in FastFlow for a ResNet50 pretrained on RadImageNet; evaluate across all three input formats |
| 4 | Do the best single-center configurations generalize to a multi-center setting with scanner and protocol variability? | Apply the top model–format combination per family from Exp 2 to a multi-center dataset (centers A + C) and compare against single-center performance |
Each experiment builds on the previous: Exp 1 establishes the model family ranking; Exp 2 tests whether a different input format improves the best models; Exp 3 isolates the effect of a domain-specific backbone on FastFlow; Exp 4 evaluates generalizability of the best configurations to a multi-center cohort.
The full project pipeline spans four modules, from raw MR/CT volumes to final evaluation and visualization:
| Stage | Module | Description |
|---|---|---|
| 1. Preprocessing | `data-preprocessing/` | Converts 3D MR/CT NIfTI volumes into normalised 2D slices (PNG or NIfTI). Generates binary anomaly labels via CT Hounsfield-Unit thresholding refined by the MR signal, and exports body masks for downstream filtering. |
| 2. Training | `model-training/` | Unified training and feature-extraction pipeline for ten anomaly detection models from five families: Normalizing Flow (FastFlow, CFlow), Knowledge Distillation (RD4AD, STFPM), Memory Bank (PatchCore, CFA), One-Class (DeepSVDD, CutPaste), and Reconstruction (DRAEM, Dinomaly). All models were trained following the environment and run commands documented in `SETUP.md`. |
| 3. Post-Processing | `post_processing/` | Refines raw binary prediction masks through body masking, morphological closing, and a 3D volumetric persistence filter. Computes pixel-, slice-, and patient-level evaluation metrics. |
| 4. Visualization | `visualizations/` | Jupyter notebooks and scripts for dataset statistics, NIfTI channel diagnostics, mask refinement analysis, and standardised side-by-side model prediction comparisons. |
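The label-generation step in Stage 1 can be pictured with a minimal sketch. The thresholds and the exact refinement rule are not stated in this README, so the function name `implant_label` and the parameters `hu_threshold` and `mr_void_fraction` are assumptions for illustration only, not the repository's implementation. The sketch marks a voxel as implant when the CT value is very high (metal-like) and the co-registered MR shows a signal void at the same location.

```python
import numpy as np


def implant_label(ct_hu, mr, hu_threshold=2000.0, mr_void_fraction=0.1):
    """Binary implant mask from co-registered CT (in HU) and MR arrays.

    Hypothetical rule: the values of hu_threshold and mr_void_fraction
    are illustrative assumptions, not taken from the project.
    """
    ct_hu = np.asarray(ct_hu, dtype=np.float32)
    mr = np.asarray(mr, dtype=np.float32)
    if ct_hu.shape != mr.shape:
        raise ValueError("CT and MR must be co-registered and share a shape")

    # Step 1: CT thresholding, metal is far denser than cortical bone.
    metal = ct_hu >= hu_threshold

    # Step 2: MR refinement, keep only voxels where the MR signal is
    # very low relative to the brightest voxel (a signal void).
    signal_void = mr <= mr_void_fraction * float(mr.max())

    return (metal & signal_void).astype(np.uint8)
```

A slice would then be treated as anomalous when its label contains at least one nonzero pixel; the real scripts in `data-preprocessing/` may use different thresholds or additional clean-up.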
```
MR-OOD-Anomaly-Detection/
│
├── resources/                        # Diagrams for this README
│   ├── experiments_overview.png
│   ├── experiment_design.png
│   └── pipeline_overview.png
│
├── data-preprocessing/               # Stage 1: MR/CT volume → 2D slice dataset
│   ├── README.md
│   ├── labels/                       # CT HU stats, patient split assignments, OOD label files
│   └── scripts/src/                  # Processing driver scripts + utility modules
│
├── model-training/                   # Stage 2: Train and extract anomaly maps
│   ├── README.md
│   ├── SETUP.md
│   ├── train.py                      # Unified training entry point
│   ├── extract.py                    # Unified anomaly map + mask extraction
│   ├── config/                       # Per-model YAML configurations
│   ├── models/                       # Model registry (flow, kd, memory, recon)
│   ├── data/                         # NIfTI → PNG dataset conversion
│   ├── scripts/                      # setup / train / extract / run_pipeline shell scripts
│   ├── Deep-SVDD/                    # BMAD Deep-SVDD implementation
│   └── pytorch-cutpaste/             # BMAD CutPaste implementation
│
├── post_processing/                  # Stage 3: Refine masks + evaluate
│   ├── README.md
│   ├── main_pipeline.py              # End-to-end pipeline entry point
│   ├── apply_bodymask.py             # Body mask application
│   ├── filter_prediction_masks_consecutive.py  # 3D persistence filter
│   ├── evaluate_model_outputs.py     # Pixel / slice / patient metrics
│   ├── compute_pixel_metrics.py
│   ├── postprocess_utils.py
│   ├── morphology/                   # Morphological processing + NIfTI reconstruction
│   ├── config/                       # Morphology tuning configuration
│   └── results/                      # Pipeline overview diagrams
│
├── visualizations/                   # Stage 4: Reporting & analysis notebooks + scripts
│   ├── README.md
│   ├── visualize.py                  # CLI tool: 3-panel MR / GT / prediction plots
│   ├── visualize_processed_prediction_masks.py   # Raw / masked / filtered mask panels
│   ├── visualize_processed_anomaly_maps.py       # Body-masked anomaly map comparison
│   ├── visualize_anomaly_thresholded_outputs.py  # Anomaly map + thresholded binary
│   ├── convert_to_bone_colormap.py   # NIfTI → bone-colormap PNG
│   ├── channel_test.ipynb            # NIfTI channel format diagnostics
│   ├── dataset_stats.ipynb           # Patient / slice distribution charts
│   ├── mask_refinement.ipynb         # Mask refinement procedure visualizations
│   ├── discussion_visualization.ipynb  # Multi-model GT vs. prediction analysis
│   └── figures/                      # Output figures
│
└── README.md                         # This file
```
```bash
bash model-training/scripts/setup.sh
```

See `model-training/SETUP.md` for manual installation steps and dataset variant configuration.
```bash
# Single-center, PNG output (bone colormap)
python data-preprocessing/scripts/src/sc_dataset_processing_png.py \
    --dir_pelvis /path/to/Task1/pelvis \
    --dir_output /path/to/dataset
```

See `data-preprocessing/README.md` for all output formats (PNG, NIfTI replicated, NIfTI consecutive) and multi-center variants.
```bash
# Full pipeline: setup → train → extract (example: FastFlow)
bash model-training/scripts/run_pipeline.sh fastflow /path/to/dataset exp1 0

# Or step by step:
python model-training/train.py --config model-training/config/fastflow.yaml \
    --data_root /path/to/dataset --name exp1_fastflow
python model-training/extract.py --config model-training/config/fastflow.yaml \
    --checkpoint model-training/results/fastflow/exp1_fastflow/checkpoints/last.ckpt \
    --output_dir /path/to/extract_out
```

```bash
python post_processing/main_pipeline.py \
    --input-dir /path/to/extract_out/prediction_masks/test \
    --body-mask-dir /path/to/dataset \
    --output-root post_process_outputs \
    --ground-truth-dir /path/to/dataset/test
```

Metrics are written to `post_process_outputs/metrics/metrics_summary.json`. See `post_processing/README.md` for morphology tuning and 3D volume inspection.
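To make the refinement idea concrete, here is a small sketch of body masking followed by a slice-wise persistence filter. It is not the code in `post_processing/`: the function names, the `min_run` parameter, and the rule itself (keep predictions only on slices that belong to a run of consecutive positive slices) are assumptions chosen to illustrate why isolated single-slice detections can be discarded as noise.

```python
import numpy as np


def apply_body_mask(pred, body):
    """Keep only predicted pixels that fall inside the body mask."""
    pred = np.asarray(pred).astype(bool)
    body = np.asarray(body).astype(bool)
    return (pred & body).astype(np.uint8)


def persistence_filter(volume, min_run=2):
    """Zero out predictions on slices that are not part of a run of at
    least `min_run` consecutive slices containing a prediction.

    volume: binary array of shape (num_slices, height, width).
    min_run is a hypothetical parameter for this sketch.
    """
    volume = np.asarray(volume).astype(bool)
    positive = volume.any(axis=(1, 2))      # one flag per slice
    keep = np.zeros_like(positive)

    start = None
    # A trailing False closes a run that reaches the last slice.
    for i, flag in enumerate(list(positive) + [False]):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            if i - start >= min_run:
                keep[start:i] = True
            start = None

    return (volume & keep[:, None, None]).astype(np.uint8)
```

With `min_run=2`, a prediction that appears on a single slice with empty neighbours is removed, while predictions spanning two or more adjacent slices are kept unchanged.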
```bash
# Side-by-side MR / ground truth / prediction comparison
python visualizations/visualize.py \
    --mr_path /path/to/mr.nii.gz \
    --predicted_mask_path /path/to/pred.nii.gz \
    --ground_truth_path /path/to/gt.nii.gz \
    --output_directory ./figures \
    --model_name "FastFlow"
```

See `visualizations/README.md` for the full notebook suite.
| Model | Family |
|---|---|
| FastFlow | Normalizing Flow |
| CFlow | Normalizing Flow |
| RD4AD | Knowledge Distillation |
| STFPM | Knowledge Distillation |
| PatchCore | Memory Bank |
| CFA | Memory Bank |
| DeepSVDD | One-Class |
| CutPaste | One-Class |
| DRAEM | Reconstruction |
| Dinomaly | Reconstruction |
The pipeline is built around the SynthRAD 2023 Pelvis dataset. The preprocessing step produces three dataset variants:
| Variant | Format | Description |
|---|---|---|
| `synth23_pelvis_v7_png` | PNG | Single-slice, bone colormap, 3-channel |
| `synth23_pelvis_v7_nifti_3ch_rep` | NIfTI | Single-slice replicated across 3 channels |
| `synth23_pelvis_v7_nifti_con` | NIfTI | Three consecutive slices (prev / current / next) |
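The difference between the two NIfTI variants can be sketched in a few lines of NumPy. The function names, the channel-last layout, and the clamping of neighbours at the first and last slice are assumptions made for this illustration; the preprocessing scripts may handle volume boundaries and axis order differently.

```python
import numpy as np


def replicated_slice(volume, index):
    """Replicated variant: the same 2D slice copied into 3 channels."""
    current = volume[index]
    return np.stack([current, current, current], axis=-1)


def consecutive_slices(volume, index):
    """2.5D variant: previous, current, and next slice as 3 channels.

    At the volume boundaries the missing neighbour is replaced by the
    current slice (an assumption for this sketch).
    """
    last = volume.shape[0] - 1
    prev_index = max(index - 1, 0)
    next_index = min(index + 1, last)
    return np.stack(
        [volume[prev_index], volume[index], volume[next_index]], axis=-1
    )
```

Both functions take a volume of shape `(num_slices, height, width)` and return an array of shape `(height, width, 3)`, so a model with a 3-channel backbone can consume either format without architectural changes.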
All variants share the same folder layout expected by `model-training/`:

```
<dataset_root>/
├── train/good/
├── valid/good/img/
├── valid/Ungood/img/
├── valid/Ungood/label/
├── test/good/img/                     # ID patient slices
├── test/good/bodymask/
├── test/Ungood/img/                   # Only annotated OOD slices
├── test/Ungood/bodymask/
├── test/Ungood/label/                 # Ground-truth masks for OOD slices
└── test/Ungood_whole_patient_scans/   # ALL slices of OOD patients (for patient-level eval)
    ├── img/
    ├── bodymask/
    └── label/                         # GT masks (nonzero on annotated slices, zero elsewhere)
```
Why two Ungood folders?
`Ungood/` contains only the slices with visible artifacts and is used for pixel- and slice-level evaluation. `Ungood_whole_patient_scans/` contains every slice of the same patients and is used for patient-level evaluation, where the system must decide from the full volume whether to flag the patient.
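A patient-level decision of this kind can be sketched as follows. The rule (flag a patient when at least `min_positive_slices` slices contain a predicted pixel) and both function names are assumptions for illustration; the actual criterion lives in `post_processing/evaluate_model_outputs.py` and may differ.

```python
import numpy as np


def flag_patient(volume_mask, min_positive_slices=1):
    """Flag a patient when enough slices of the refined prediction
    volume (num_slices, height, width) contain a positive pixel.

    min_positive_slices is a hypothetical parameter.
    """
    volume_mask = np.asarray(volume_mask).astype(bool)
    positive_slices = int(volume_mask.any(axis=(1, 2)).sum())
    return positive_slices >= min_positive_slices


def patient_confusion(predicted_flags, true_flags):
    """Count TP / FP / FN / TN over a list of patients."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for predicted, actual in zip(predicted_flags, true_flags):
        if predicted and actual:
            counts["tp"] += 1
        elif predicted and not actual:
            counts["fp"] += 1
        elif actual:
            counts["fn"] += 1
        else:
            counts["tn"] += 1
    return counts
```

Evaluating on whole scans rather than on annotated slices alone matters here: false positives on the artifact-free slices of an implant patient still count toward the patient being flagged, so pixel-level and patient-level scores can diverge.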
Each module ships its own requirements file:
```bash
# Training environment
pip install -r model-training/requirements-all.txt \
    --extra-index-url https://download.pytorch.org/whl/cu124

# Post-processing environment
pip install -r post_processing/requirements.txt
```

Key packages: `anomalib==2.2.0`, `torch==2.6.0`, `nibabel==5.3.2`, `opencv-python==4.8.1.78`, `scipy==1.10.1`, `scikit-learn`.
This project relies on several open-source tools and public datasets. Please cite them if you use this code.
All models (except DeepSVDD and CutPaste) are trained and evaluated through Anomalib.
```bibtex
@inproceedings{akcay2022anomalib,
  title        = {Anomalib: A Deep Learning Library for Anomaly Detection},
  author       = {Akcay, Samet and Ameln, Dick and Vaidya, Ashwin and
                  Lakshmanan, Barath and Ahuja, Nilesh and Genc, Utku},
  booktitle    = {2022 IEEE International Conference on Image Processing (ICIP)},
  pages        = {1706--1710},
  year         = {2022},
  organization = {IEEE}
}
```

The DeepSVDD and CutPaste implementations are adapted from the BMAD benchmark suite, whose taxonomy also guided our model selection.
```bibtex
@article{bao2023bmad,
  title   = {{BMAD}: Benchmarks for Medical Anomaly Detection},
  author  = {Bao, Jinan and Sun, Hanshi and Deng, Hanqiu and
             He, Yinsheng and Zhang, Zhaoxiang and Li, Xingyu},
  journal = {arXiv preprint arXiv:2306.11876},
  year    = {2023}
}
```

Experiment 3 uses RadImageNet pretrained weights as a domain-specific backbone replacement for FastFlow.
```bibtex
@article{mei2022radimagenet,
  title   = {{RadImageNet}: An Open Radiologic Deep Learning Research Dataset for Effective Transfer Learning},
  author  = {Mei, Xueyan and Liu, Zelong and Robson, Philip M. and Marinelli, Brett and
             Huang, Mingqian and Doshi, Amish and Jacobi, Adam and Cao, Chendi and
             Link, Katherine E. and Yang, Thomas and Wang, Ying and Greenspan, Hayit and
             Deyer, Timothy and Fayad, Zahi A. and Yang, Yang},
  journal = {Radiology: Artificial Intelligence},
  volume  = {4},
  number  = {5},
  pages   = {e210315},
  year    = {2022},
  doi     = {10.1148/ryai.210315}
}
```

All experiments use the pelvis subset of the SynthRAD2023 challenge dataset.
```bibtex
@article{thummerer2023synthrad,
  title   = {{SynthRAD2023} Grand Challenge dataset: Generating synthetic {CT} for radiotherapy},
  author  = {Thummerer, Adrian and van der Bijl, Erik and Galapon Jr, Aubin and
             Verhoeff, Joost J.C. and Langendijk, Johannes A. and Both, Stefan and
             van den Berg, Cornelis A.T. and Maspero, Matteo},
  journal = {Medical Physics},
  volume  = {50},
  number  = {7},
  pages   = {4664--4674},
  year    = {2023},
  doi     = {10.1002/mp.16529}
}
```

