Fix RescaleIntensity on images larger than 2**24 voxels by fepegar · Pull Request #1474 · TorchIO-project/torchio

fepegar · 2026-06-14T13:25:35Z

Description

torch.quantile raises RuntimeError: quantile() input tensor is too large for inputs with more than 2**24 (16,777,216) elements. RescaleIntensity/Normalize computes its input range with torch.quantile, so it fails on high-resolution volumes. For example:

import torchio as tio
t1 = tio.datasets.Colin27(2008).t1            # ~57M voxels (362x434x362)
tio.RescaleIntensity(out_min=0, out_max=1)(t1)
# RuntimeError: quantile() input tensor is too large

This replaces the torch.quantile call in _percentile_range with a small _quantile helper built on torch.kthvalue, which has no 2**24-element limit and is much faster on large tensors. The helper computes a single quantile of a 1D tensor with linear interpolation to reproduce the default torch.quantile behaviour, validates that q is within [0, 1], and its docstring links to @ego-thales' solution from pytorch/pytorch#157431 (comment). Results are exact for all tensor sizes (no subsampling/approximation), so behaviour is unchanged for tensors that previously worked.
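A minimal sketch of how such a helper could look, assuming a 1D input and the default linear interpolation of torch.quantile. The name `quantile_kthvalue` is hypothetical and the body is an illustration of the technique, not the code added by this PR:

```python
import math

import torch


def quantile_kthvalue(values: torch.Tensor, q: float) -> float:
    """Hypothetical sketch: one quantile of a tensor via torch.kthvalue."""
    if not 0 <= q <= 1:
        raise ValueError(f"q must be in [0, 1], got {q}")
    flat = values.flatten().float()
    n = flat.numel()
    # Fractional 0-based rank, as in linear interpolation between neighbours.
    position = q * (n - 1)
    lower = math.floor(position)
    upper = min(lower + 1, n - 1)
    weight = position - lower
    # torch.kthvalue is 1-based: k=1 returns the smallest element.
    low_value = torch.kthvalue(flat, lower + 1).values
    high_value = torch.kthvalue(flat, upper + 1).values
    return float((low_value + weight * (high_value - low_value)).item())
```

On small inputs the result can be compared directly against `torch.quantile(values, q)`, which is the parity check the description mentions.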

TestQuantile adds coverage for parity with torch.quantile on a small tensor, the ValueError for out-of-range q, an interior quantile on a >2**24 tensor, and a full RescaleIntensity run on an image that exceeds the limit. Lint, format, and tests pass.

Checklist

I have read the CONTRIBUTING docs and have a developer setup ready
Changes are
- Non-breaking (would not break existing functionality)
- Breaking (would cause existing functionality to change)
Tests added or modified to cover the changes
In-line docstrings updated
Documentation updated
This pull request is ready to be reviewed

torch.quantile raises "input tensor is too large" for inputs with more than 2**24 elements, so RescaleIntensity/Normalize failed on high-resolution volumes (e.g. Colin27(2008), ~57M voxels), even with the default 0/100 percentiles. Route percentile computation through a helper that returns min/max for the 0 and 100 endpoints (exact, and avoids torch.quantile for the common default) and estimates interior quantiles of oversized tensors from a deterministic strided subsample.
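The approach in this commit message could be sketched roughly as follows. The function name `approx_quantile` and the `max_elements` parameter are hypothetical, and the stride formula is the one shown in the first review round:

```python
import torch

MAX_ELEMENTS = 2**24  # size limit of torch.quantile reported in this PR


def approx_quantile(
    values: torch.Tensor,
    q: float,
    max_elements: int = MAX_ELEMENTS,
) -> float:
    """Hypothetical sketch: exact endpoints, subsampled interior quantiles."""
    flat = values.flatten()
    # Endpoints never need torch.quantile: they are just the min and max.
    if q == 0:
        return float(flat.min().item())
    if q == 1:
        return float(flat.max().item())
    # Deterministic strided subsample: keep every `step`-th element.
    if flat.numel() > max_elements:
        step = flat.numel() // max_elements + 1
        flat = flat[::step]
    return float(torch.quantile(flat.float(), q).item())
```

The endpoints stay exact regardless of size, while interior quantiles of oversized inputs are estimates whose accuracy depends on how representative the strided sample is.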

for more information, see https://pre-commit.ci

github-actions · 2026-06-14T13:26:12Z

📖 Docs Preview

Preview of the documentation for this PR:

🔗 https://smokeshow.helpmanual.io/21612l3n3o356s4l2i24/

Built from 54e89a3

Copilot

Pull request overview

This pull request updates Normalize/RescaleIntensity to avoid torch.quantile() failures on very large images (> 2**24 voxels) by introducing a _quantile helper that uses exact min/max for endpoint percentiles and a deterministic strided subsample for interior percentiles. This keeps default behavior working on high-resolution volumes while preserving existing behavior for tensors under the PyTorch quantile size limit.

Changes:

Route percentile computation through a new _quantile() helper that avoids torch.quantile() on 0%/100% and subsamples oversized tensors for interior quantiles.
Add tests that exercise endpoint percentiles, interior percentiles on oversized tensors, and a full RescaleIntensity call on a large tensor.
Normalize CNAME formatting (no functional code impact).

Reviewed changes

Copilot reviewed 2 out of 3 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| `src/torchio/transforms/intensity/normalize.py` | Adds `_quantile()` and uses it in `_percentile_range()` to handle tensors exceeding PyTorch’s quantile size limit. |
| `tests/test_normalize.py` | Adds coverage for large-tensor percentile behavior and a `RescaleIntensity` regression test. |
| `CNAME` | Trims/normalizes formatting of the domain entry. |


Cast the (possibly subsampled) values to float only after striding, so an oversized non-float32 tensor never materializes a full-size float copy.
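The ordering matters because slicing with a step returns a view, while `.float()` on a non-float tensor allocates a new one. A small illustration, with a made-up `volume` tensor and `step` value:

```python
import torch

# Stand-in for a large integer-typed image (kept small here for illustration).
volume = torch.zeros(4_000, dtype=torch.int16)
step = 4

# Cast first: a float32 copy of every element is allocated, then strided.
cast_then_stride = volume.float()[::step]

# Stride first: the slice is a view, so only the kept elements are converted.
stride_then_cast = volume[::step].float()
```

Both orderings produce the same values; only the size of the intermediate float allocation differs.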

Copilot

Pull request overview

Copilot reviewed 2 out of 3 changed files in this pull request and generated 2 comments.

fepegar · 2026-06-14T22:20:48Z

+    if sample.numel() > _MAX_QUANTILE_ELEMENTS:
+        step = sample.numel() // _MAX_QUANTILE_ELEMENTS + 1
+        sample = sample[::step]


[Generated by a coding agent]

Fixed in 64e1559. Subsampling now targets ~1e6 values (_QUANTILE_SUBSAMPLE_SIZE = 1_000_000), well under the 2**24 limit, so torch.quantile runs on at most ~1M elements and stays fast.

fepegar · 2026-06-14T22:20:49Z

+    def test_percentile_range_interior_subsamples(self) -> None:
+        from torchio.transforms.intensity.normalize import _percentile_range
+
+        values = torch.linspace(0.0, 100.0, self.LIMIT + 1000).reshape(1, -1, 1, 1)
+        low, high = _percentile_range(values, None, 25.0, 75.0, "t1")
+        assert 24.0 < low < 26.0
+        assert 74.0 < high < 76.0


[Generated by a coding agent]

Fixed in 64e1559. The interior-subsample test now monkeypatches _QUANTILE_SUBSAMPLE_SIZE to a small value and uses a small tensor, so it exercises the strided-subsample branch quickly. The >2**24 integration is still covered by the endpoint and full RescaleIntensity tests (which use the min/max path).

Subsample oversized inputs to ~1e6 values (well under torch.quantile's 2**24 limit) so percentile estimation stays fast and low-memory on high-resolution volumes. Exercise the subsample branch in tests via a small monkeypatched cap instead of building a >2**24 tensor.

Copilot

Pull request overview

Copilot reviewed 2 out of 3 changed files in this pull request and generated 2 comments.

fepegar · 2026-06-14T22:29:27Z

+    if q <= 0:
+        return float(values.min().item())
+    if q >= 1:
+        return float(values.max().item())


[Generated by a coding agent]

Fixed in 10ea2fd. The fast path now only triggers for exact q == 0 / q == 1 (min/max). Out-of-range percentiles fall through to torch.quantile, which validates q and raises as before. Added a regression test (test_invalid_percentile_still_raises).

fepegar · 2026-06-14T22:29:28Z

+    if sample.numel() > _QUANTILE_SUBSAMPLE_SIZE:
+        step = sample.numel() // _QUANTILE_SUBSAMPLE_SIZE + 1
+        sample = sample[::step]
+    return float(torch.quantile(sample.float(), q).item())


[Generated by a coding agent]

This is a deliberate trade-off (the previous review round asked to cap the work going into torch.quantile at ~1e6 to keep it fast and bounded), so subsampling above ~1e6 stays. I documented the approximation explicitly at the public Normalize API: the 0/100 endpoints are exact (min/max), and interior percentiles of very large images are estimated from a deterministic subsample (10ea2fd).

Treat only q==0 and q==1 as exact min/max endpoints so out-of-range percentiles still fall through to torch.quantile and raise (instead of being silently clamped). Document at the public Normalize API that interior percentiles of very large images are estimated from a subsample.

Copilot

Pull request overview

Copilot reviewed 2 out of 3 changed files in this pull request and generated 1 comment.

Build the oversized input with zeros and a single max voxel and assert sentinel voxels, avoiding extra full-tensor min/max reductions.

Copilot

Pull request overview

Copilot reviewed 2 out of 3 changed files in this pull request and generated 1 comment.

fepegar · 2026-06-14T22:48:18Z

+    if sample.numel() > _QUANTILE_SUBSAMPLE_SIZE:
+        step = sample.numel() // _QUANTILE_SUBSAMPLE_SIZE + 1
+        sample = sample[::step]


[Generated by a coding agent]

Fixed in a7d7fb6. Switched the subsample stride to ceiling division (step = -(-sample.numel() // _QUANTILE_SUBSAMPLE_SIZE)), so the strided subsample stays as close to the cap as possible while never exceeding it. The old numel // target + 1 undersampled whenever numel was an exact multiple of the cap, e.g. keeping only ~2/3 of the cap when numel was twice the cap. Added a test asserting the subsample size stays within the cap.

`numel // target + 1` undersamples when numel is an exact multiple of the cap (e.g. ~2/3 of the cap when numel is twice the cap). Ceiling division keeps the subsample as close to the target size as possible while still never exceeding it.
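The difference between the two stride formulas can be checked with plain integer arithmetic. In this sketch the helper names are hypothetical and `range` stands in for the strided slice `values[::step]`:

```python
def kept_elements(numel: int, step: int) -> int:
    """Length of values[::step] for a 1D sequence of `numel` elements."""
    return len(range(0, numel, step))


def compare_strides(numel: int, target: int) -> tuple[int, int]:
    """Return the subsample sizes for the old and the new stride formula."""
    old_step = numel // target + 1  # floor division plus one
    new_step = -(-numel // target)  # ceiling division
    return kept_elements(numel, old_step), kept_elements(numel, new_step)


for numel in (1001, 2000, 3000, 10_007):
    print(numel, compare_strides(numel, target=1000))
```

With a target of 1000, an input of 2000 elements keeps 667 values under the old formula and exactly 1000 under ceiling division. For sizes that are not exact multiples of the target, both formulas give the same stride.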

Copilot

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

fepegar · 2026-06-14T22:53:37Z

+    def test_quantile_subsample_stays_within_cap(
+        self,
+        monkeypatch: pytest.MonkeyPatch,
+    ) -> None:
+        # The strided subsample must never exceed the target size, including
+        # for exact multiples of the cap.
+        from torchio.transforms.intensity import normalize as norm
+
+        target = 1000
+        monkeypatch.setattr(norm, "_QUANTILE_SUBSAMPLE_SIZE", target)
+        for numel in (target + 1, 2 * target, 3 * target, 10 * target + 7):
+            values = torch.arange(numel, dtype=torch.float32)
+            step = -(-values.numel() // target)
+            assert values[::step].numel() <= target
+


[Generated by a coding agent]

Fixed in cec0d4d. The test now monkeypatches torch.quantile with a spy that records the numel() of the tensor actually passed in, and asserts every recorded size stays within the cap. This fails if _quantile() changes its subsampling strategy (or stops subsampling) rather than re-implementing the stride math.

fepegar · 2026-06-14T22:53:39Z

+
+            Note:
+                The `0` and `100` percentiles are computed exactly (as
+                the min and max). For very large images, *interior*
+                percentiles are estimated from a deterministic strided
+                subsample to keep the computation fast and bounded in
+                memory.


[Generated by a coding agent]

Fixed in cec0d4d. Flattened the nested Note: block into plain parameter text under percentile_high to avoid docstring-parser ambiguity and keep Note: reserved for top-level sections.

- Subsample-cap test now spies on torch.quantile to assert the actual tensor length passed in stays within the cap, so it fails if the subsampling strategy changes.
- Out-of-range quantile test accepts ValueError as well as RuntimeError, since torch's exception type is not guaranteed across versions.
- Flatten the nested Note: block in the percentile_high docstring into plain parameter text for consistent rendering.
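As an illustration of the second point, a check that tolerates either exception type might look like the sketch below. The helper name `quantile_rejects` is hypothetical, and the PR's own test may be written differently (for example with pytest.raises):

```python
import torch


def quantile_rejects(q: float) -> bool:
    """Return True if torch.quantile refuses the given q."""
    values = torch.arange(10, dtype=torch.float32)
    try:
        torch.quantile(values, q)
    except (ValueError, RuntimeError):
        # Accept either type so the check does not depend on which
        # exception a given torch version raises.
        return True
    return False
```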

Copilot

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

fepegar · 2026-06-14T23:03:24Z

+        data = torch.zeros(self.LIMIT + 1000).reshape(1, -1, 1, 1)
+        data[0, 0, 0, 0] = 1000.0
+        result = tio.RescaleIntensity(out_min=0, out_max=1)(tio.ScalarImage(data))
+        assert result.data[0, 0, 0, 0].item() == pytest.approx(1.0, abs=1e-4)
+        assert result.data[0, 1, 0, 0].item() == pytest.approx(0.0, abs=1e-4)


[Generated by a coding agent]

Fixed in 54f7b14. The test now passes copy=False so the oversized ScalarImage is not deep-copied before transforming, removing the extra full-size allocation while keeping the end-to-end coverage.

Pass copy=False so the >2**24-element ScalarImage is not deep-copied before transforming, reducing the test's memory footprint.

Copilot

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated no new comments.

Extract _subsample_for_quantile and add a subsample_size parameter to _quantile and _percentile_range (defaulting to _QUANTILE_SUBSAMPLE_SIZE). Tests now inject a small cap directly and assert the cap bound on the real production helper, removing the module-constant monkeypatch and the torch.quantile spy.

Copilot

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

fepegar · 2026-06-14T23:27:51Z

+    if values.numel() <= target:
+        return values
+    step = -(-values.numel() // target)
+    return values[::step]


[Generated by a coding agent]

Fixed in 65c3091. _subsample_for_quantile now raises a clear ValueError for a non-positive target (instead of a ZeroDivisionError or odd negative-stride behaviour) now that subsample_size is caller-exposed. Added a parametrized test for target in {0, -1}.

Now that subsample_size is exposed to callers, _subsample_for_quantile raises ValueError for a non-positive target instead of a ZeroDivisionError or odd negative-stride behaviour.
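A sketch of what the guarded helper could look like, built from the snippet under review plus an explicit check. The public-style name `subsample_for_quantile` and the error message are assumptions:

```python
import torch


def subsample_for_quantile(values: torch.Tensor, target: int) -> torch.Tensor:
    """Hypothetical sketch: strided subsample of a 1D tensor, capped at `target`."""
    if target <= 0:
        # Fail early with a clear message instead of dividing by zero
        # or computing a negative stride further down.
        raise ValueError(f"target must be positive, got {target}")
    if values.numel() <= target:
        return values
    step = -(-values.numel() // target)  # ceiling division
    return values[::step]
```

Validating at the top keeps the error close to the caller's mistake, which matters once the cap is an argument rather than a module constant.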

Copilot

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

fepegar · 2026-06-14T23:35:37Z

+        # Exercise the subsample branch quickly via a small cap passed
+        # through the public parameter (no monkeypatching).


[Generated by a coding agent]

Fixed in 9d96e4a. Reworded the comment to say the cap is injected through the helper's subsample_size argument, avoiding any implication that _percentile_range is public API.

_percentile_range is an internal helper, not public API; reword the comment to say the cap is injected through its subsample_size argument.

Copilot

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated no new comments.

fepegar · 2026-06-15T20:44:29Z

@copilot undo all changes and use this instead (add in the docstring a link to @ego-thales' solution): pytorch/pytorch#157431 (comment)

Copilot · 2026-06-15T20:56:21Z

> @copilot undo all changes and use this instead (add in the docstring a link to @ego-thales' solution): pytorch/pytorch#157431 (comment)

[Generated by a coding agent]

Done in c672080. Reverted the previous strided-subsample approach and replaced it with @ego-thales' torch.kthvalue-based solution: a small _quantile helper now computes percentiles with kthvalue (no 2**24-element limit) and linear interpolation, and its docstring links to pytorch/pytorch#157431 (comment). _percentile_range routes through it, and TestQuantile covers parity with torch.quantile, invalid q, an interior quantile on a >2**24 tensor, and an end-to-end RescaleIntensity.

fepegar and others added 2 commits June 14, 2026 14:25

[pre-commit.ci] auto fixes from pre-commit.com hooks

618f2ec


fepegar requested a review from Copilot June 14, 2026 21:56

Copilot started reviewing on behalf of fepegar June 14, 2026 21:57