Pool the dictionary buffer when training a Zstandard dictionary#129125
Open
alinpahontu2912 wants to merge 1 commit into
Open
Pool the dictionary buffer when training a Zstandard dictionary#129125alinpahontu2912 wants to merge 1 commit into
alinpahontu2912 wants to merge 1 commit into
Conversation
ZstandardDictionary.Train allocated 'new byte[maxDictionarySize]' on every call. Dictionary sizes are typically tens to hundreds of KB (zstd recommends up to ~100 KB, but the API allows more), so each training call paid for a fresh GC allocation that often landed on the LOH. Rent the buffer from ArrayPool<byte>.Shared instead. Create copies the trained slice into an exact-sized array before returning, so the rented buffer can be returned immediately. Use clearArray: true on Return because the trained dictionary is derived from caller-supplied samples and must not linger in the shared pool. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Contributor
|
Tagging subscribers to this area: @dotnet/area-system-collections |
Contributor
There was a problem hiding this comment.
Pull request overview
This PR reduces repeated large allocations in ZstandardDictionary.Train by renting the native training output buffer from ArrayPool<byte>.Shared instead of allocating a new byte[] every call, then returning the rented buffer after copying the trained dictionary into its own array.
Changes:
- Replace per-call
new byte[maxDictionarySize]withArrayPool<byte>.Shared.Rent(maxDictionarySize)for the native training output buffer. - Ensure the rented dictionary buffer is returned in a
finallyblock, currently usingclearArray: trueto avoid retaining caller-derived data in the pool.
Comment on lines
+152
to
+153
| // Clear before returning: the trained dictionary is derived from caller-supplied samples. | ||
| ArrayPool<byte>.Shared.Return(dictionaryBuffer, clearArray: true); |
rzikm
approved these changes
Jun 8, 2026
MihaZupan
reviewed
Jun 8, 2026
Comment on lines
+152
to
+153
| // Clear before returning: the trained dictionary is derived from caller-supplied samples. | ||
| ArrayPool<byte>.Shared.Return(dictionaryBuffer, clearArray: true); |
Member
There was a problem hiding this comment.
Suggested change
| // Clear before returning: the trained dictionary is derived from caller-supplied samples. | |
| ArrayPool<byte>.Shared.Return(dictionaryBuffer, clearArray: true); | |
| ArrayPool<byte>.Shared.Return(dictionaryBuffer); |
We practically never clear buffers outside of crypto, I don't see a reason to do it here
| ZstandardUtils.ThrowIfError(dictSize); | ||
| return Create(dictionaryBuffer.AsSpan(0, (int)dictSize)); | ||
| } | ||
| finally |
Member
There was a problem hiding this comment.
We already have a try/finally for the lengthsArray buffer, can we avoid even more nesting by reusing the existing blocks?
This was referenced Jun 8, 2026
Open
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
ZstandardDictionary.Trainallocatesnew byte[maxDictionarySize]on every call to hold the trained dictionary returned by the nativeZDICT_trainFromBuffer. Upstream zstd guidance puts a reasonable dictionary size at ~100 KB and the CLI defaults to 112,640 bytes — both above the .NET 85,000-byte LOH threshold. So for the recommended-and-up sizes, every training call costs a fresh Large Object Heap allocation that survives until the next gen2 collection. This change rents the buffer fromArrayPool<byte>.Sharedand returns it (withclearArray: true) once the trained dictionary has been copied into its own array.Why dictionary sizes are LOH-heavy
Upstream zstd (
zdict.h):CLI manpage (
zstd.1.md):--maxdict=#default is 112640 bytes.The managed wrapper repeats this guidance (
ZstandardDictionary.cs:85-87) and only guards the lower bound (ThrowIfLessThan(maxDictionarySize, 256, ...)at line 128) — there is no upper-bound check, so MB-sized buffers are also legal..NET LOH threshold is 85,000 bytes, so the recommended ~100 KB dictionary and the 112,640-byte CLI default both land on the LOH on every call.
Empirical evidence
I modelled the exact allocation pattern (
new byte[N]vsRent/Return) and measured per-call allocations withGC.GetAllocatedBytesForCurrentThread(Release build, 1,000 iterations, workstation GC, .NET 11 preview):new byte[]per callRentper call (steady state)For the recommended/default dictionary sizes, every
Traincall previously produced a ~100 KB LOH allocation that only a gen2 collection could reclaim. After this change, steady-stateTrainallocations for the dictionary buffer drop to zero.Reproducer
Program.cs(run from a scratchnet11.0console project,dotnet run -c Release):Output (workstation GC, single thread):
The 24-byte overhead per allocation is the 64-bit
byte[]object header (sync block + method table pointer + length).Rentreports 0 bytes/iter because the pool returns the same array each iteration after warm-up. The pre-PR path stays on the SOH for the first two rows and lands on the LOH for the last three.