Verify content digest on the batch CAS download path #656

Open
evilgensec wants to merge 2 commits into bazelbuild:master from evilgensec:verify-batch-download-digest

@evilgensec

What

BatchDownloadBlobsWithStats stores each blob returned by BatchReadBlobs keyed by the digest the server asserts in the response, without re-hashing the returned bytes:

res[digest.NewFromProtoUnvalidated(r.Digest)] = bi

There is no length check and no NewFromBlob(r.Data) == requested check. A CAS that returns bytes which do not match the requested digest (a buggy or compromised server, a caching/forwarding proxy, or an on-path party on a non-TLS connection) is therefore taken at face value, and the bytes are later written to disk by downloadBatch / downloadNonUnified.

The streamed path already guards against this: readBlobStreamed tees the stream through digest.NewFromReader and returns an error of the form "calculated digest %s != expected digest %s" on mismatch. Because useBatchOps defaults to true and makeBatches groups every sub-MaxBatchSize blob into a batch, the unverified batch path is the common route for most blobs.
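
For readers unfamiliar with the tee-and-hash pattern the streamed path relies on, here is a minimal standalone sketch using only the Go standard library. It is not the SDK's code: the `Digest` type, `digestOf`, `copyVerified`, and `verifyBytes` are hypothetical names introduced for illustration, and SHA-256 is assumed as the digest function.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
)

// Digest is a hypothetical stand-in for the SDK's digest type:
// a lowercase hex SHA-256 hash plus the blob size in bytes.
type Digest struct {
	Hash string
	Size int64
}

// digestOf computes the Digest of an in-memory blob.
func digestOf(data []byte) Digest {
	sum := sha256.Sum256(data)
	return Digest{Hash: hex.EncodeToString(sum[:]), Size: int64(len(data))}
}

// copyVerified copies src to dst while feeding every byte read into a
// SHA-256 hasher, then compares the calculated digest with want.
func copyVerified(dst io.Writer, src io.Reader, want Digest) error {
	h := sha256.New()
	n, err := io.Copy(dst, io.TeeReader(src, h))
	if err != nil {
		return err
	}
	got := Digest{Hash: hex.EncodeToString(h.Sum(nil)), Size: n}
	if got != want {
		return fmt.Errorf("calculated digest %s/%d != expected digest %s/%d",
			got.Hash, got.Size, want.Hash, want.Size)
	}
	return nil
}

// verifyBytes is a convenience wrapper that runs copyVerified over an
// in-memory blob and discards the copied output.
func verifyBytes(data []byte, want Digest) error {
	return copyVerified(io.Discard, bytes.NewReader(data), want)
}
```

Note that in this sketch the bytes have already reached `dst` by the time a mismatch is detected, so a caller would need to discard or delete the destination when an error is returned.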

Change

Verify the returned (post-decompression) bytes hash to the requested digest before storing them, mirroring the streamed path, and key the result by the verified digest. Adds TestBatchDownloadBlobsDigestMismatch.
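
The check described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the diff in this PR: `Digest`, `response`, `digestOf`, and `collectVerified` are hypothetical stand-ins for the SDK's types and for one entry of a BatchReadBlobs reply, SHA-256 is assumed, and decompression is assumed to have happened already.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// Digest is a hypothetical stand-in for the SDK's digest type.
type Digest struct {
	Hash string
	Size int64
}

// response is a hypothetical stand-in for one BatchReadBlobs entry:
// the digest the server claims, and the (already decompressed) bytes.
type response struct {
	Claimed Digest
	Data    []byte
}

// digestOf computes the Digest of an in-memory blob.
func digestOf(data []byte) Digest {
	sum := sha256.Sum256(data)
	return Digest{Hash: hex.EncodeToString(sum[:]), Size: int64(len(data))}
}

// collectVerified stores each returned blob keyed by the digest calculated
// from its bytes. It fails if a response claims a digest that was never
// requested, or if the bytes do not hash to the claimed digest.
func collectVerified(requested []Digest, responses []response) (map[Digest][]byte, error) {
	want := make(map[Digest]bool, len(requested))
	for _, d := range requested {
		want[d] = true
	}
	out := make(map[Digest][]byte, len(responses))
	for _, r := range responses {
		if !want[r.Claimed] {
			return nil, fmt.Errorf("response for unrequested digest %s/%d",
				r.Claimed.Hash, r.Claimed.Size)
		}
		got := digestOf(r.Data)
		if got != r.Claimed {
			return nil, fmt.Errorf("calculated digest %s/%d != expected digest %s/%d",
				got.Hash, got.Size, r.Claimed.Hash, r.Claimed.Size)
		}
		out[got] = r.Data // keyed by the verified digest, not the asserted one
	}
	return out, nil
}
```

Comparing the whole `Digest` struct covers both the hash and the length in one step, which is why the sketch has no separate size check.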

BatchDownloadBlobsWithStats stored each returned blob keyed by the server-asserted digest without re-hashing the returned bytes, so a CAS that returns data not matching the requested digest would be accepted and written to disk by the download paths. The streamed path (readBlobStreamed) already verifies the calculated digest against the expected one; this makes the batch path consistent by verifying the returned bytes hash to the requested digest before storing them. Adds a regression test.
Copilot AI review requested due to automatic review settings May 30, 2026 04:39
BatchDownloadBlobs now verifies that returned bytes hash to the requested
digest, so TestBatchReadBlobsIndividualRequestRetries failed: it served
blobs under fabricated digest.TestNew digests that never matched their
data, which the new check correctly rejects. Give the two data-returning
blobs real content-hash digests, make the per-request digest comparison
order-independent (the content-hash digests no longer sort in the
fabricated a,b,c,d order), and fail fast on a request-count mismatch
instead of panicking. Write-path tests are unaffected.
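
One way to make such a comparison order-independent is to sort copies of both sides before comparing them. The helper below is a hypothetical sketch of that idea, not the code used in the test: `sameDigestSet` is an invented name, and digests are represented as plain hash strings for brevity.

```go
package main

import "sort"

// sameDigestSet reports whether got and want contain the same digest
// strings, ignoring order. It sorts copies, so the inputs are not modified,
// and it returns false on a length mismatch instead of indexing out of range.
func sameDigestSet(got, want []string) bool {
	if len(got) != len(want) {
		return false
	}
	g := append([]string(nil), got...)
	w := append([]string(nil), want...)
	sort.Strings(g)
	sort.Strings(w)
	for i := range g {
		if g[i] != w[i] {
			return false
		}
	}
	return true
}
```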