feat(bench): reproducible run script + competitor matrix (A3, #8) #258
Merged
The one scripted invocation that reproduces a published run end to end (BENCHMARK.md #8), plus the living competitor matrix it is measured against.

- scripts/bench/run.sh: builds the release binaries, pins the server and client to DISJOINT cores over loopback where taskset exists (configurable SERVER_CORES/CLIENT_CORES; warns and runs unpinned without taskset), boots the server with a readiness probe + a trap that always kills it (no orphan), warms the hot keyset write-only so the read-heavy pass hits (~90%), then runs three measured passes into a versioned out dir: memmodel -> memory.json, loadgen closed -> closed.json (peak QPS), loadgen open -> open.json + open.hgrm (coordinated-omission-free tail). Every knob is an overridable env var; a --smoke mode runs in seconds; a manifest.json records knobs + host facts + version + UTC stamp + matrix pointer. shellcheck-clean, bash-3.2-safe.
- docs/bench/COMPETITORS.md: the committed, DATED (2026-06-16) competitor matrix. Pinned Valkey 9.1.0 / Redis 8.8.0 / Dragonfly v1.39.0 with the per-entry memory facts the A1 model is compared against (Redis kvobj in 8.2; Valkey embedded key/value in 8.0 + the 8.1 hashtable redesign; Dragonfly dashtable overhead) and jemalloc defaults. Bumps require an explicit PR; A4's head-to-head (#96) installs the pinned Valkey version.
- scripts/bench/README.md: usage + pinning methodology + matrix/A4 relationship.
- .gitignore: bench-results/ (run artifacts are never committed).

Adversarial review fixes:
(1) the server shard count now matches the cores it is pinned to (count_cores of the server pinned set, not the host total), so the thread-per-core engine is not oversubscribed and QPS-per-core stays honest;
(2) a pre-launch connect check fails fast if the port is already serving (SO_REUSEPORT would otherwise let a stale server silently co-reside and mix numbers);
(3) corrected the Dragonfly BSL change date to 2030-07-01 (per its LICENSE.md).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Zeke <ezequiel.lares@outlook.com>
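
For readers who have not opened the script: the "pin where taskset exists, otherwise warn and run unpinned" behaviour described above could look roughly like the sketch below. This is an illustration only, not the contents of scripts/bench/run.sh. The helper name `run_pinned` and the default core lists are assumptions; `SERVER_CORES` and `CLIENT_CORES` are the knob names from the commit message.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the pinning fallback; NOT the actual run.sh.

# Knob names come from the commit message; the default values are assumed.
SERVER_CORES="${SERVER_CORES:-0-3}"
CLIENT_CORES="${CLIENT_CORES:-4-7}"

# run_pinned CORES CMD [ARGS...]
# Runs CMD pinned to CORES when taskset is on PATH; otherwise warns on
# stderr and runs CMD unpinned. (run_pinned is an illustrative name.)
run_pinned() {
  local cores="$1"
  shift
  if command -v taskset >/dev/null 2>&1; then
    taskset -c "$cores" "$@"
  else
    echo "warning: taskset not found; running unpinned: $1" >&2
    "$@"
  fi
}
```

In this sketch the server would be started with something like `run_pinned "$SERVER_CORES" <server command> &` and each load generator pass with `run_pinned "$CLIENT_CORES" <client command>`, keeping the two core sets disjoint.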
A3: the one scripted invocation that reproduces a published run end to end, plus the dated competitor matrix it's measured against.
--smoke runs in seconds. shellcheck-clean.
Adversarial review caught + fixed: (1) shards now match the pinned server-core count (was NCPU on a half-box pin -> oversubscription distorting QPS-per-core); (2) pre-launch port-free check (SO_REUSEPORT would let a stale server co-reside and mix numbers); (3) corrected the Dragonfly BSL change date to 2030-07-01.
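
A minimal sketch of review fixes (1) and (2), for illustration only. `count_cores` is the name used in the commit message, but this body is an assumed implementation that handles only comma-separated cores and ranges (no stride syntax such as `0-10:2`). `port_in_use` is a hypothetical name, and it relies on bash's `/dev/tcp` redirection, which the real script may or may not use.

```shell
#!/usr/bin/env bash
# Hypothetical sketch; the real helpers in run.sh may differ.

# count_cores LIST
# Counts the CPUs in a taskset-style list such as "0-3,8".
# Assumption: only "N" and "N-M" entries appear (no strides).
count_cores() {
  local list="$1" total=0 part lo hi
  local IFS=','
  for part in $list; do
    case "$part" in
      '')  ;;                                   # skip empty fields
      *-*) lo="${part%-*}"; hi="${part#*-}"
           total=$((total + hi - lo + 1)) ;;    # inclusive range
      *)   total=$((total + 1)) ;;              # single core
    esac
  done
  echo "$total"
}

# port_in_use HOST PORT
# Returns 0 if a TCP connect to HOST:PORT succeeds, non-zero otherwise.
# The connection is opened in a subshell so the descriptor closes at once.
port_in_use() {
  (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null
}
```

With these, the shard count would be derived as `count_cores "$SERVER_CORES"` rather than from the host CPU total, and the launch step would abort when `port_in_use` succeeds before the server has been started.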
Smoke-verified end to end (clean shutdown, no orphan, correct manifest). shellcheck clean; invariant lints unaffected (no Rust changed).
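
The "clean shutdown, no orphan" property comes from the readiness probe + trap pattern. A rough sketch of that pattern, under stated assumptions: `SERVER_PID`, `cleanup`, and `wait_ready` are illustrative names, the probe is a plain TCP connect via bash's `/dev/tcp`, and the retry count and 0.1 s interval are made up for the example.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the trap + readiness probe; NOT the actual run.sh.

SERVER_PID=""

# cleanup: kill and reap the background server if it is still alive.
# Installed on EXIT so it runs on success and on failure.
cleanup() {
  if [ -n "$SERVER_PID" ] && kill -0 "$SERVER_PID" 2>/dev/null; then
    kill "$SERVER_PID" 2>/dev/null
    wait "$SERVER_PID" 2>/dev/null
  fi
  return 0
}
trap cleanup EXIT

# wait_ready HOST PORT [TRIES]
# Polls until a TCP connect succeeds; gives up after TRIES attempts.
wait_ready() {
  local host="$1" port="$2" tries="${3:-50}" i=0
  while [ "$i" -lt "$tries" ]; do
    if (exec 3<>"/dev/tcp/$host/$port") 2>/dev/null; then
      return 0
    fi
    i=$((i + 1))
    sleep 0.1
  done
  return 1
}
```

In this sketch the server is launched in the background, `SERVER_PID=$!` is recorded immediately, and the measured passes start only after `wait_ready` returns 0. If the probe times out, the script exits and the EXIT trap still reaps the server.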
🤖 Generated with Claude Code