perf(store): 8-byte tagged-pointer Entry slot - first CLEAR memory win vs redis 8.8.0 (round 5)#265
Merged
Merged
Conversation
…n vs redis 8.8.0 (round 5) Shrink the per-shard `hashbrown::HashTable<Entry>` slot from 16 bytes to 8 by turning `Entry` from a 16-byte enum (`Str(Box<[u8]>)` fat pointer | `Coll(Box< CollEntry>)`) into a single 8-byte `NonNull<u8>` TAGGED POINTER: - low bit 0 = a manually-allocated Str THIN blob `[u32 total_len][round-3 blob]` (the length moves INTO the allocation, so the pointer is one word, not a fat ptr+len), align 8; - low bit 1 = `Box::into_raw(Box<CollEntry>)`. Both allocations are >= 2-aligned, so the low tag bit is always free. This keeps the Round-3 single-allocation-per-key + no-key-duplication win and adds the slot shrink, roughly halving `table_bytes_per_key`. This LIFTS `#![forbid(unsafe_code)]` on ironcache-store (authorized) in favor of `#![deny(unsafe_op_in_unsafe_fn)]`. All unsafe is CONFINED to one heavily documented `Entry` impl in kvobj.rs (manual alloc/dealloc, strict-provenance tag set/clear via `map_addr`, the access reconstructions, Drop, Clone); every unsafe block carries a `// SAFETY:` justification. The blob CONTENT is still parsed with SAFE bounds-checked slicing. The Store waist (ValueRef/RmwEntry/side-traits) is UNCHANGED; only the rmw type-dispatch in lib.rs swaps the old enum match for `obj.as_coll_val_mut()` now that `Entry` is opaque. Result (the optimization-campaign goal: beat redis 8.8.0 on memory): - head-to-head (128B, 300k keys, macOS): bytes/key 221.5 -> 199.69 vs redis 218.61 = 0.91x, CLEARLY BELOW redis (Round 3 was parity at 221.5). - memmodel (allocator-true): table_bytes_per_key 26.2 -> 13.11; size_of::<Entry> 16 -> 8. Soundness hardening from a 3-lens adversarial review (UB / aliasing / behavioral parity). Aliasing = SOUND, parity = PRESERVED; the UB lens found and this fixes two issues miri's executed paths could not catch: - CRITICAL: the new u32 total-length prefix would TRUNCATE for a single value > 4 GiB, so Drop would dealloc with a wrong Layout (UB). It was reachable because APPEND grew a value unbounded (no proto-max-bulk-len check). This was a regression the prefix introduced (round 3's Box<[u8]> carried a usize length). Fix: (1) a hard `expect` in alloc_str_blob so the truncation branch is gone (a controlled panic backstop, not UB); (2) cap APPEND at 512 MB returning the exact Redis error (new ErrorReply::string_exceeds_max), matching Redis checkStringLength AND keeping every value < 4 GiB so the backstop is unreachable in practice. - HIGH: the tag scheme needs the Str blob and Box<CollEntry> >= 2-aligned, guarded only by a release-stripped debug_assert. Added two `const` assertions (STR_ALIGN >= 2, align_of::<CollEntry>() >= 2) so a future alignment-breaking edit fails the BUILD instead of silently corrupting the tag. Gates: 849 tests green; clippy -D warnings, fmt, invariant-lint clean; miri under -Zmiri-strict-provenance clean across the store lib (incl. 8 dedicated Entry unsafe-path tests) AND every integration test. The jemalloc accounting test is miri-ignored (FFI not miri-executable; documented, non-UB). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Zeke <ezequiel.lares@outlook.com>
perf-gate (A5)Same-runner ratchet of HEAD against the merge-base (both rebuilt and measured in this job).
Overall: WARN
|
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What
Shrink the per-shard
hashbrown::HashTable<Entry>slot from 16 bytes to 8 by turningEntryfrom a 16-byte enum (Str(Box<[u8]>)fat pointer |Coll(Box<CollEntry>)) into a single 8-byteNonNull<u8>tagged pointer:0= a manually-allocated Str thin blob[u32 total_len][round-3 blob](the length moves into the allocation, leaving a one-word pointer), align 8;1=Box::into_raw(Box<CollEntry>).Both allocations are >= 2-aligned, so the low tag bit is always free. This keeps the Round-3 single-allocation-per-key + no-key-duplication win and adds the slot shrink, roughly halving
table_bytes_per_key.This lifts
#![forbid(unsafe_code)]onironcache-store(authorized) in favor of#![deny(unsafe_op_in_unsafe_fn)]. Allunsafeis confined to one heavily documentedEntryimpl inkvobj.rs(manual alloc/dealloc, strict-provenance tag set/clear viamap_addr, the access reconstructions,Drop,Clone); every block carries a// SAFETY:justification. The blob content is still parsed with safe bounds-checked slicing. The Store waist is unchanged; only the rmw type-dispatch inlib.rsswaps the old enum match forobj.as_coll_val_mut()now thatEntryis opaque.Result (the optimization-campaign goal: beat redis 8.8.0 on memory)
table_bytes_per_keysize_of::<Entry>()This is the first clear memory win over redis 8.8.0 (round 3 was at parity). Memory bytes-per-key is the reliable metric on any box; the macOS throughput number remains contention-bound and non-authoritative (a clean speed verdict needs pinned Linux).
Soundness (3-lens adversarial review: UB / aliasing / behavioral parity)
Aliasing = SOUND, parity = PRESERVED. The UB lens found and this PR fixes two issues miri's executed paths could not catch:
u32total-length prefix would truncate for a single value > 4 GiB, soDropwoulddeallocwith a wrongLayout(UB). Reachable becauseAPPENDgrew a value unbounded (noproto-max-bulk-lencheck). This was a regression the prefix introduced (round 3'sBox<[u8]>carried ausizelength). Fix: (1) a hardexpectinalloc_str_blobso the truncation branch is gone (a controlled panic backstop, not UB); (2) capAPPENDat 512 MB returning the exact Redis error (newErrorReply::string_exceeds_max), matching RedischeckStringLengthand keeping every value < 4 GiB so the backstop is unreachable in practice.Box<CollEntry>>= 2-aligned, guarded only by a release-strippeddebug_assert. Added twoconstassertions (STR_ALIGN >= 2,align_of::<CollEntry>() >= 2) so a future alignment-breaking edit fails the build.Gates
clippy -D warnings,fmt, invariant-lint clean.-Zmiri-strict-provenanceclean across the store lib (incl. 8 dedicatedEntryunsafe-path tests) AND every integration test (primitives, keyspace, eviction, the four collection in-place suites, watch). The jemallocaccountingtest is miri-ignored (FFI not miri-executable; documented, non-UB).🤖 Generated with Claude Code