Non-deterministic decode output for 4-band RGBA JP2 (uninitialized memory) #410

@boxerab

Summary

Decoding a 4-band (RGBA) JP2 to TIFF produces different output bytes on each run of the same command, independent of thread count. The variation is small (~10 bytes / a handful of pixels) but reproducible.

Version

libgrokj2k 20.3.4

Reproduction

Test file: stefan_full_rgba.jp2 (RGBA; e.g. from the GDAL autotest corpus autotest/gdrivers/data/jpeg2000/stefan_full_rgba.jp2).

for i in 1 2 3; do grk_decompress -i stefan_full_rgba.jp2 -o /tmp/s$i.tif -H 1; md5sum /tmp/s$i.tif; done

Yields three different md5sums, e.g.:

d98619f19bf5f35e35c8634d3c33d522  /tmp/s1.tif
694ae60dd93d25242b1903c63d4651be  /tmp/s2.tif
01a00f468954f1c7ba4b516b22148683  /tmp/s3.tif
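
The comparison above can be automated by counting distinct checksums instead of reading them by eye. A minimal sketch, assuming `md5sum`, `awk`, and `sort` are available; the helper name `count_distinct_hashes` is hypothetical and not part of grok:

```shell
# Hypothetical helper (not part of grok): print how many distinct MD5 sums
# occur among the given files. A deterministic decoder should yield 1.
count_distinct_hashes() {
  md5sum "$@" | awk '{ print $1 }' | sort -u | wc -l | tr -d ' '
}
```

Called as `count_distinct_hashes /tmp/s1.tif /tmp/s2.tif /tmp/s3.tif` after the loop, any result other than 1 indicates non-deterministic output.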

cmp -l /tmp/s1.tif /tmp/s2.tif shows ~10 differing bytes (the first near file offset 49904, in the pixel-data region).
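
To track the size and position of the difference across many runs, the `cmp -l` output can be reduced to two numbers. A small sketch; both helper names are hypothetical, and the offsets are the 1-based decimal byte numbers that `cmp -l` prints:

```shell
# Hypothetical helper: print the number of byte positions at which two
# files differ. cmp exits with status 1 when the files differ, so that
# status is deliberately ignored.
count_diff_bytes() {
  { cmp -l "$1" "$2" 2>/dev/null || true; } | wc -l | tr -d ' '
}

# Hypothetical helper: print the 1-based offset of the first differing
# byte, or nothing if the files are identical.
first_diff_offset() {
  { cmp -l "$1" "$2" 2>/dev/null || true; } | awk 'NR == 1 { print $1 }'
}
```

If the first differing offset stays at the same place from run to run, that would support the idea of a fixed set of unwritten pixels rather than scattered corruption.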

Key observations (localizing the cause)

  • Reproduces at -H 1 (single-threaded) as well as -H 8. A single-threaded run cannot race with itself, so this points to uninitialized memory rather than a concurrency bug.
  • The raw single-component PGM decode of the same data is deterministic; only the multi-band TIFF output varies. So the non-determinism is in the multi-component assembly / output path (composite or alpha handling), not the core wavelet/T1 decode.
  • The variation is not a function of thread count: the same command run twice at a fixed -H still produces different output.
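
One way to test the uninitialized-memory hypothesis from the observations above is to run the same single-threaded decode under Valgrind's Memcheck, which reports uses of uninitialised values, including uninitialised bytes passed to write(). This is a sketch only: it assumes valgrind is installed and that grk_decompress was built with debug symbols, the wrapper name `run_memcheck` is hypothetical, and it has not been run against this file.

```shell
# Hypothetical wrapper: decode single-threaded under Memcheck.
# $1: input JP2 path, $2: output TIFF path.
# --track-origins=yes asks Memcheck to report where an uninitialised
# value was allocated, at the cost of a slower run.
run_memcheck() {
  valgrind --tool=memcheck --track-origins=yes \
    grk_decompress -i "$1" -o "$2" -H 1
}
```

For example, `run_memcheck stefan_full_rgba.jp2 /tmp/vg.tif`. Any origin Memcheck reports would help narrow down which buffer in the output path is involved.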

Expected

Decode output is bit-for-bit deterministic for a given input and parameters.

Likely area

A buffer in the multi-component / RGBA compositing or alpha-channel path is probably not fully initialized before being written to the output image (likely a few edge/padding pixels).
