[mono] Fix lock-free mempool chunk under-allocation by 8 bytes#129843
Open
pavelsavara wants to merge 3 commits into
lock_free_mempool_chunk_new sized each chunk by reserving sizeof(LockFreeMempoolChunk) (24 bytes on 64-bit), but chunk->mem is then aligned up to 16 bytes, so the usable region starts at offset 32 and a single-page chunk only has pagesize-32 usable bytes. A request whose length is congruent to -24 (mod pagesize) - e.g. 4072 on 4 KB pages, 16360 on 16 KB pages - produced a freshly allocated chunk whose size (pagesize-32) is smaller than the requested length, tripping g_assert (chunk->pos + size <= GINT_TO_UINT(chunk->size)); in lock_free_mempool_alloc0 and aborting the runtime. This pool is used on the async AOT unwind/exception-info decode path (mono_aot_get_unwind_info / decode_exception_debug_info with async==TRUE), which is driven by the EventPipe SampleProfiler stack walk, so the crash showed up intermittently as a SIGABRT during eventpipe tracing tests under Mono LLVM full-AOT on x64. Fix: reserve the 16-byte-aligned header size in the sizing loop so the chunk always has room for the request.
Pull request overview
This PR adjusts the lock-free mempool chunk sizing logic in Mono so the chunk-capacity guarantee matches the allocator’s actual usable payload after the chunk header is pointer-aligned. This prevents lock-free allocations from tripping the g_assert in lock_free_mempool_alloc0 when allocations land on specific page-boundary sizes.
Changes:
- Update the chunk-sizing loop in lock_free_mempool_chunk_new to reserve a 16-byte-aligned header size (ALIGN_TO(sizeof(LockFreeMempoolChunk), 16)) instead of the raw sizeof(...).
- Ensure the computed chunk->size (derived from chunk->mem after ALIGN_PTR_TO(..., 16)) is always sufficient for the requested allocation size.
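A minimal sketch of the sizing rule described in the list above, assuming a 24-byte header and 16-byte data alignment as stated in the PR description. This is not the Mono source: `align_up`, `fixed_chunk_total`, and `usable_bytes` are hypothetical helpers that only model the arithmetic.

```c
#include <stddef.h>

/* Assumed values from the PR description: 24-byte header on 64-bit,
 * data pointer aligned to 16 bytes. */
#define HEADER_SIZE 24u
#define DATA_ALIGN  16u

/* Round n up to a multiple of align (align must be a power of two). */
static size_t align_up(size_t n, size_t align) {
    return (n + align - 1) & ~(align - 1);
}

/* Post-fix sizing (hypothetical model): grow by whole pages until the
 * space left after the 16-byte-aligned header covers the request. */
static size_t fixed_chunk_total(size_t len, size_t pagesize) {
    size_t reserved = align_up(HEADER_SIZE, DATA_ALIGN); /* 32 */
    size_t total = pagesize;
    while (total - reserved < len)
        total += pagesize;
    return total;
}

/* Capacity that remains once the data pointer is aligned past the header,
 * assuming the chunk base itself is page-aligned. */
static size_t usable_bytes(size_t total) {
    return total - align_up(HEADER_SIZE, DATA_ALIGN);
}
```

Under these assumptions, `fixed_chunk_total(4072, 4096)` returns 8192, and `usable_bytes(8192)` is 8160, so the boundary request fits. Because the loop and the capacity calculation subtract the same reserved amount, the loop's guarantee and the real capacity can no longer disagree.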
BrzVlad
approved these changes
Jun 25, 2026
lewing
approved these changes
Jun 25, 2026
Summary
lock_free_mempool_chunk_new (src/mono/mono/metadata/memory-manager.c) under-sizes each chunk by 8 bytes, which can trip the assert in lock_free_mempool_alloc0 and abort the runtime.
The sizing loop reserves sizeof(LockFreeMempoolChunk) (24 bytes on 64-bit). But chunk->mem is then aligned up to 16 bytes, so the data starts at offset 32, and the usable chunk->size is size - 32, which is 8 bytes less than the loop guaranteed. For a request whose length is congruent to -24 (mod pagesize), e.g. 4072 on 4 KB pages or 16360 on 16 KB pages, the fresh chunk (pos == 0) has chunk->size < len, so the assert in lock_free_mempool_alloc0, g_assert (chunk->pos + size <= GINT_TO_UINT(chunk->size));, fires and aborts the runtime with a SIGABRT.
Why it showed up as a flaky eventpipe crash
This pool is used only on the async AOT unwind / exception-info decode path (mono_aot_get_unwind_info / decode_exception_debug_info, the if (async) branch), which is driven by the EventPipe SampleProfiler stack walk. The crash therefore appeared intermittently as a SIGABRT during the tracing/eventpipe/providervalidation and rundownvalidation tests on the Mono LLVM full-AOT (x64) leg: it only triggers when the profiler asynchronously decodes a method whose unwind / EH-info blob happens to be exactly a boundary size. The code is pre-existing (it dates back many years) and is not specific to any recent change.
Fix
Reserve the 16-byte-aligned header size (32) in the sizing loop so the usable chunk->size always covers the request. mono_valloc returns page-aligned memory, so chunk->mem is always exactly chunk + 32; reserving ALIGN_TO(sizeof(LockFreeMempoolChunk), 16) makes the loop's guarantee match the chunk's real capacity.
Evidence
1. Deterministic reproduction of the under-sizing (standalone simulation of the exact chunk_new math).
2. Reproduced the exact assert in a real Mono LLVM full-AOT runtime. An instrumented build plus a one-shot probe that requests pagesize-24 bytes from the real allocator (with the SampleProfiler active so the async path is live) reproduces the CI signature exactly.
3. After the fix (same probe and SampleProfiler workload, instrumentation kept only to observe): the boundary request now gets the next page-multiple (8192 - 32 = 8160 >= 4072), and the SampleProfiler workload runs clean with all async allocations satisfied, including boundary-crossing ones (e.g. size=8208 -> chunk_size=12256).
Note
This pull request was authored with the assistance of GitHub Copilot.