Feature/ssb benchmark by kkirchheim · Pull Request #132 · kkirchheim/pytorch-ood

kkirchheim · 2026-05-12T13:51:15Z

No description provided.

Implements the Semantic Split Benchmark from "Dissecting Out-of-Distribution Detection and Open-Set Recognition: A Critical Analysis of Methods and Benchmarks" (IJCV 2024, arxiv 2408.16757). This is pytorch-ood's first fine-grained open-set recognition benchmark, evaluating OOD detectors on CUB-200-2011, Stanford Cars, and FGVC-Aircraft with semantic similarity-based splits (Easy=far-OOD, Hard=near-OOD).

New files:
- benchmark/img/ssb.py: Private dataset classes (_CUB200, _StanfordCars, _FGVCAircraft) and public benchmarks (CUB_SSB, StanfordCars_SSB, Aircraft_SSB). Includes load_ssb_splits() utility to fetch .pkl splits from the repo.
- tests/test_benchmark_ssb.py: 25 unit tests covering dataset filtering, benchmark structure, and OSCR metric.
- examples/benchmarks/example_ssb_cub.py: End-to-end usage example.

New metric:
- utils/metrics.py: oscr_score() for Open-Set Classification Rate. Measures joint accuracy on known classes + rejection of unknown classes. Also added proper __all__ export for public functions (calibration_error, aurra, fpr_at_tpr, oscr_score).

Implementation details:
- Datasets accept classes=[...] parameter for subsetting with label remapping.
- CUB & Aircraft support auto-download via download=True.
- Stanford Cars requires manual download (original host dead, guide provided).
- All dataset implementations aligned with reference code from Visual-AI repo.
- OSCR uses O(n log n) searchsorted for efficiency.

Also: Added linting best practices to CLAUDE.md.

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
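For readers unfamiliar with OSCR, the following is a minimal sketch of how a sort-plus-searchsorted computation of the metric could look. It is not the PR's `oscr_score()`; the function name `oscr_sketch`, its arguments, and the convention that a higher confidence means "more likely a known class" are all assumptions made for illustration (the library's metric may take outlier scores with the opposite sign, or handle ties differently). The sketch integrates the Correct Classification Rate over the false positive rate with a step rule, counting ties between known and unknown samples pessimistically.

```python
import numpy as np


def oscr_sketch(conf_known, correct_known, conf_unknown):
    """Hypothetical OSCR sketch; NOT the signature used in pytorch-ood.

    conf_known:    confidence per known-class sample (higher = more "known")
    correct_known: True where the closed-set prediction was correct
    conf_unknown:  confidence per unknown-class sample
    """
    conf_known = np.asarray(conf_known, dtype=float)
    correct_known = np.asarray(correct_known, dtype=bool)
    conf_unknown = np.asarray(conf_unknown, dtype=float)
    if conf_known.size == 0 or conf_unknown.size == 0:
        raise ValueError("need at least one known and one unknown sample")

    # Sort the confidences of correctly classified known samples: O(n log n).
    sorted_correct = np.sort(conf_known[correct_known])

    # For each unknown sample, count the correct known samples whose
    # confidence is strictly above it (one binary search per unknown).
    n_at_or_below = np.searchsorted(sorted_correct, conf_unknown, side="right")
    n_above = sorted_correct.size - n_at_or_below

    # CCR at the threshold given by each unknown sample. Each unknown
    # advances the FPR by 1 / n_unknown, so the mean is the step-rule area.
    ccr = n_above / conf_known.size
    return float(ccr.mean())


# Perfect separation, but only one of two known samples classified correctly.
print(oscr_sketch([0.9, 0.8], [True, False], [0.1, 0.2]))
```

Under these assumptions the value can also be read as the fraction of (known, unknown) pairs in which the known sample is classified correctly and ranked above the unknown one.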

Add :no-index: directive to all autoclass blocks in dataset/img/__init__.py to suppress duplicate documentation warnings. This tells Sphinx to use the source module as the primary documentation location instead of treating the __init__.py docstring as an alternative location.

Also updated CLAUDE.md to document that docs should be built with the sphinx-39 conda environment.

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>

codecov · 2026-05-12T13:55:57Z

Codecov Report

❌ Patch coverage is 91.56627% with 21 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
src/pytorch_ood/utils/metrics.py	18.18%	18 Missing ⚠️
tests/test_benchmark_ssb.py	98.67%	3 Missing ⚠️


kkirchheim and others added 3 commits May 12, 2026 14:42

Draft for SSB benchmark

0ef45c0

kkirchheim merged commit 732a92e into dev May 12, 2026
4 checks passed


kkirchheim merged 3 commits into dev from feature/ssb-benchmark
