[mono] Fix lock-free mempool chunk under-allocation by 8 bytes #129843

Open
pavelsavara wants to merge 3 commits into dotnet:main from pavelsavara:mono-lockfree-mempool-chunk-align
@pavelsavara (Member)

Summary

lock_free_mempool_chunk_new (src/mono/mono/metadata/memory-manager.c) under-sizes each chunk by 8 bytes, which can trip the assert in lock_free_mempool_alloc0 and abort the runtime.

The sizing loop reserves sizeof(LockFreeMempoolChunk) (24 bytes on 64-bit):

size = mono_pagesize ();
while (size - sizeof (LockFreeMempoolChunk) < GINT_TO_UINT(len))
    size += mono_pagesize ();
...
chunk->mem  = ALIGN_PTR_TO ((char*)chunk + sizeof (LockFreeMempoolChunk), 16); // -> offset 32
chunk->size = ((char*)chunk + size) - chunk->mem;                              // = size - 32

But chunk->mem is then aligned up to 16 bytes, so the data starts at offset 32, and the usable chunk->size is size - 32, which is 8 bytes less than the loop guaranteed. For a request whose length is congruent to -24 (mod pagesize), e.g. 4072 on 4 KB pages or 16360 on 16 KB pages, the fresh chunk (pos == 0) has chunk->size < len, so the assert in lock_free_mempool_alloc0:

g_assert (chunk->pos + size <= GINT_TO_UINT(chunk->size));

fires and aborts the runtime with a SIGABRT.
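
The shortfall can be checked outside the runtime. The following is a minimal sketch, not the runtime code: it hard-codes the 24-byte header size quoted above, and CHUNK_HEADER_BYTES, MEM_ALIGN and old_usable_size are names invented here for illustration. It also assumes the chunk itself is page-aligned, so the data offset is simply the header size rounded up to 16.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values: 24 is the header size quoted for 64-bit,
   16 is the alignment applied to chunk->mem. */
#define CHUNK_HEADER_BYTES ((size_t)24)
#define MEM_ALIGN          ((size_t)16)

/* Usable bytes of a chunk whose total size is chosen by the pre-fix loop. */
size_t
old_usable_size (size_t pagesize, size_t len)
{
	size_t size = pagesize;

	/* Pre-fix guarantee: only the raw header size is reserved. */
	while (size - CHUNK_HEADER_BYTES < len)
		size += pagesize;

	/* The data actually starts at the header size rounded up to 16 (32 here). */
	size_t mem_offset = (CHUNK_HEADER_BYTES + MEM_ALIGN - 1) & ~(MEM_ALIGN - 1);

	return size - mem_offset;
}
```

Under these assumptions, old_usable_size (4096, 4072) evaluates to 4064, which is 8 bytes short of the request, while a request of 4064 bytes still fits in a single page.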

Why it showed up as a flaky eventpipe crash

This pool is used only on the async AOT unwind / exception-info decode path (mono_aot_get_unwind_info / decode_exception_debug_info, the if (async) branch), which is driven by the EventPipe SampleProfiler stack walk. The crash therefore appeared intermittently as a SIGABRT during the tracing/eventpipe/providervalidation and rundownvalidation tests on the Mono LLVM full-AOT (x64) leg. It triggers only when the profiler asynchronously decodes a method whose unwind / EH-info blob results in a request of exactly one of the boundary lengths described above. The code is pre-existing (it dates back many years) and is not specific to any recent change.

Fix

-	while (size - sizeof (LockFreeMempoolChunk) < GINT_TO_UINT(len))
+	while (size - ALIGN_TO (sizeof (LockFreeMempoolChunk), 16) < GINT_TO_UINT(len))

Reserve the 16-byte-aligned header size (32) in the sizing loop so the usable chunk->size always covers the request. mono_valloc returns page-aligned memory, so chunk->mem is always exactly chunk + 32; reserving ALIGN_TO(sizeof(LockFreeMempoolChunk), 16) makes the loop's guarantee match the chunk's real capacity.
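
The effect of the changed loop condition can be sketched in isolation. This is an illustrative model rather than the runtime code: ALIGN_UP is a local stand-in for the runtime's ALIGN_TO macro, HEADER_BYTES hard-codes the 24-byte figure quoted for 64-bit, and fixed_usable_size and check_fixed_sizing are hypothetical helper names. As above, it assumes a page-aligned chunk.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-in for ALIGN_TO; 'align' must be a power of two. */
#define ALIGN_UP(val, align) \
	(((size_t)(val) + ((size_t)(align) - 1)) & ~((size_t)(align) - 1))

/* Assumed header size, taken from the 24-byte figure quoted for 64-bit. */
#define HEADER_BYTES ((size_t)24)

/* Usable bytes of a chunk whose total size is chosen by the fixed loop. */
size_t
fixed_usable_size (size_t pagesize, size_t len)
{
	size_t reserved = ALIGN_UP (HEADER_BYTES, 16); /* 32 */
	size_t size = pagesize;

	/* The loop now reserves the same 32 bytes that the data offset consumes. */
	while (size - reserved < len)
		size += pagesize;

	return size - reserved;
}

/* Checks that every request up to three pages fits in its fresh chunk. */
void
check_fixed_sizing (size_t pagesize)
{
	for (size_t len = 1; len <= 3 * pagesize; len++)
		assert (fixed_usable_size (pagesize, len) >= len);
}
```

In this model, fixed_usable_size (4096, 4072) yields 8160 and fixed_usable_size (4096, 8208) yields 12256, the same chunk sizes reported under Evidence.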

Evidence

1. Deterministic reproduction of the under-sizing (standalone simulation of the exact chunk_new math):

sizeof(LockFreeMempoolChunk)=24
pagesize=4096  UNDERSIZED len=4072  chunk_size=4064  deficit=8
pagesize=4096  UNDERSIZED len=8168  chunk_size=8160  deficit=8
pagesize=16384 UNDERSIZED len=16360 chunk_size=16352 deficit=8

2. Reproduced the exact assert in a real Mono LLVM full-AOT runtime. An instrumented build plus a one-shot probe that requests pagesize-24 bytes from the real allocator (with the SampleProfiler active so the async path is live) reproduces the CI signature exactly:

lfm_chunk_new UNDERSIZED len=4072 allocsize=4096 chunk_size=4064 deficit=8
* Assertion at .../src/mono/mono/metadata/memory-manager.c, condition 'pos + size <= chunk->size' not met
Got a SIGABRT while executing native code.

3. After the fix (same probe and SampleProfiler workload, instrumentation kept only to observe):

PROBE pagesize=4096 probe_len=4072 chunk_size=8160 assert=OK
...
DONE   (corerun exit 0, no assert)

The boundary request now gets the next page-multiple (8192 - 32 = 8160 >= 4072), and the SampleProfiler workload runs clean with all async allocations satisfied — including boundary-crossing ones (e.g. size=8208 -> chunk_size=12256).

Note

This pull request was authored with the assistance of GitHub Copilot.

lock_free_mempool_chunk_new sized each chunk by reserving
sizeof(LockFreeMempoolChunk) (24 bytes on 64-bit), but chunk->mem is then
aligned up to 16 bytes, so the usable region starts at offset 32 and a
single-page chunk only has pagesize-32 usable bytes. A request whose length
is congruent to -24 (mod pagesize) - e.g. 4072 on 4 KB pages, 16360 on 16 KB
pages - produced a freshly allocated chunk whose size (pagesize-32) is smaller
than the requested length, tripping

  g_assert (chunk->pos + size <= GINT_TO_UINT(chunk->size));

in lock_free_mempool_alloc0 and aborting the runtime.

This pool is used on the async AOT unwind/exception-info decode path
(mono_aot_get_unwind_info / decode_exception_debug_info with async==TRUE),
which is driven by the EventPipe SampleProfiler stack walk, so the crash
showed up intermittently as a SIGABRT during eventpipe tracing tests under
Mono LLVM full-AOT on x64.

Fix: reserve the 16-byte-aligned header size in the sizing loop so the chunk
always has room for the request.

Copilot AI (Contributor) left a comment

Pull request overview

This PR adjusts the lock-free mempool chunk sizing logic in Mono so the chunk-capacity guarantee matches the allocator’s actual usable payload after the start of the data area, which follows the chunk header, is aligned up to 16 bytes. This prevents lock-free allocations from tripping the g_assert in lock_free_mempool_alloc0 when allocations land on specific page-boundary sizes.

Changes:

  • Update the chunk-sizing loop in lock_free_mempool_chunk_new to reserve a 16-byte-aligned header size (ALIGN_TO(sizeof(LockFreeMempoolChunk), 16)) instead of the raw sizeof(...).
  • Ensure the computed chunk->size (derived from chunk->mem after ALIGN_PTR_TO(..., 16)) is always sufficient for the requested allocation size.

@pavelsavara enabled auto-merge (squash) June 25, 2026 10:48