docs(bench): efficiency findings + optimization scope (A6) #261

Merged: ELares merged 1 commit into main from feat/perf-findings on Jun 16, 2026

@ELares (Owner) commented Jun 16, 2026

A6, the "scope the gap" step. With the harness complete (A1-A5), this records where IronCache stands and scopes the optimization precisely instead of a speculative store rewrite.

Indicative head-to-head (IronCache vs redis-server 7.2.1 stand-in, unpinned macOS, 300k keys):

  • bytes-per-key 527 vs 245 = ~2.1x heavier -- the reliable finding (deterministic used_memory delta) and the real gap.
  • qps-per-core ~parity (contention-bound on this box, not authoritative); IronCache p50/p99 far better (1006us/2.5ms vs 4187us/65ms).
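
The arithmetic behind the first bullet can be sketched as follows. This is an illustration only: the function and parameter names are hypothetical, and it assumes bytes-per-key is the difference between two `used_memory` samples (before and after loading the keys) divided by the key count, which is one plausible reading of "deterministic used_memory delta".

```rust
/// Bytes per key from a memory delta: (after - before) / keys.
/// `used_before` and `used_after` are hypothetical names for two
/// used_memory samples taken around the key load.
fn bytes_per_key(used_before: u64, used_after: u64, keys: u64) -> f64 {
    assert!(keys > 0, "need at least one key");
    used_after.saturating_sub(used_before) as f64 / keys as f64
}

/// How many times heavier `ours` is than `theirs`.
fn heaviness_ratio(ours: f64, theirs: f64) -> f64 {
    ours / theirs
}
```

With the reported figures, `heaviness_ratio(527.0, 245.0)` is about 2.15, which rounds to the ~2.1x quoted above.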

docs/bench/FINDINGS.md decomposes the memory gap into a fat KvObj value (~160 B/slot) plus Swiss-table slack (~210 B/key) and prioritizes the levers, each a separate future effort now protected by the A5 gate:

  • L1: box the large KvObj variants (highest impact).
  • L2: a Dashtable-style index.
  • L3: load-factor tuning.

Throughput needs a pinned-Linux-vs-valkey run before optimizing; io_uring (#28) is the lever if a real gap appears. This PR is docs-only.
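
To illustrate why the L1 lever (boxing large variants) shrinks the slot, here is a minimal sketch. `FatValue` and `SlimValue` are hypothetical stand-ins, not IronCache's real KvObj, and the 152-byte payload is chosen only for illustration. A Rust enum is at least as large as its largest variant, so moving the large payload behind a `Box` leaves a pointer-sized field inline.

```rust
use std::mem::size_of;

// Hypothetical stand-in for a value enum with one large inline variant.
#[allow(dead_code)]
enum FatValue {
    Int(i64),
    Inline([u8; 152]), // every slot pays for this, even when it holds an Int
}

// Same variants, but the large payload lives on the heap.
#[allow(dead_code)]
enum SlimValue {
    Int(i64),
    Boxed(Box<[u8; 152]>), // inline cost is one pointer
}

/// Returns (fat slot size, slim slot size) in bytes.
fn slot_sizes() -> (usize, usize) {
    (size_of::<FatValue>(), size_of::<SlimValue>())
}
```

The exact sizes depend on the compiler's layout choices, so the sketch only relies on the slim enum being smaller than the fat one. The trade is an extra allocation and pointer hop for the boxed variant.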

🤖 Generated with Claude Code

The "scope the gap" step of the performance track. With the harness complete
(A1-A5), this records where IronCache stands against the bar and scopes the
optimization precisely, rather than rewriting the store waist speculatively.

Indicative head-to-head (IronCache vs a redis-server 7.2.1 stand-in, unpinned
macOS, 300k keys via scripts/bench/headtohead.sh):
- bytes-per-key 527 vs 245 = ~2.1x HEAVIER. This is the reliable finding (a
  deterministic used_memory delta, not contention-sensitive) and the real gap.
- qps-per-core ~parity (7151 vs 7528) but contention-bound on an unpinned,
  co-resident box, so NOT authoritative; IronCache's far lower p50/p99 (1006us
  vs 4187us; 2.5ms vs 65ms) suggests headroom a pinned run would expose.

docs/bench/FINDINGS.md decomposes the memory gap with the A1 model (a fat
per-slot KvObj value sized for its largest inline variant, ~160 B/slot, plus
~210 B/key Swiss-table slack at load) and scopes the levers, each as its own
future effort now protected by the A5 perf-gate: L1 box the large KvObj variants
to shrink the slot (highest impact), L2 a Dashtable-style compact index, L3
load-factor tuning; and for throughput, confirm on pinned Linux vs valkey 9.1.0
before optimizing (io_uring #28 is the lever if a real gap appears). A speculative
store rewrite is deliberately deferred to the authoritative bar.
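
As a rough sketch of how table slack and the L3 load-factor lever interact, the following estimates the empty-slot overhead per live key. It assumes a power-of-two bucket array that keeps the load at or below 7/8, a policy common to Swiss-table implementations but not confirmed here for IronCache; the function names are hypothetical. It is not the A1 model and is not expected to reproduce the ~210 B/key figure.

```rust
/// Buckets needed for `keys` entries under an assumed policy:
/// a power-of-two bucket count with load factor <= 7/8.
fn buckets_for(keys: usize) -> usize {
    // ceil(keys * 8 / 7), then round up to the next power of two.
    let min_buckets = (keys * 8 + 6) / 7;
    min_buckets.next_power_of_two()
}

/// Bytes of empty-slot overhead attributed to each live key.
fn slack_bytes_per_key(keys: usize, slot_bytes: usize) -> f64 {
    assert!(keys > 0, "need at least one key");
    let empty_slots = buckets_for(keys) - keys;
    (empty_slots * slot_bytes) as f64 / keys as f64
}
```

Under these assumptions, 300k keys land in 524,288 buckets (about 57% full), so a larger slot multiplies the cost of every empty bucket. That is why shrinking the slot (L1) and tuning the load factor (L3) compound.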

README points to the findings.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Zeke <ezequiel.lares@outlook.com>
@ELares ELares merged commit d772216 into main Jun 16, 2026
2 checks passed
@ELares ELares deleted the feat/perf-findings branch June 16, 2026 06:57
@github-actions

perf-gate (A5)

Same-runner ratchet of HEAD against the merge-base (both rebuilt and measured in this job).
PASS = within the noise band, WARN = a real move inside budget (does not fail), FAIL = past budget in the bad direction.

| metric | base | head | delta% | band | budget | verdict |
| --- | --- | --- | --- | --- | --- | --- |
| qps_median (peak) | 71601.53 | 71952.52 | 0.49% | +/-5.03% | drop <= 15% | PASS |
| bytes_per_key int | 239.99 | 239.99 | 0.00% | det | rise <= 5% | PASS |
| bytes_per_key embstr | 240.09 | 240.09 | 0.00% | det | rise <= 5% | PASS |
| bytes_per_key raw | 496.10 | 496.10 | 0.00% | det | rise <= 5% | PASS |

Overall: PASS

  • qps: noisy on shared CI, so the band comes from the base reps spread (floored at 5%); a drop is only a regression past the 15% budget.
  • bytes_per_key: deterministic (allocator-true memmodel), so a tight 5% rise budget; any rise beyond it FAILs.
  • Open-loop tails / criterion micro-benches are reported-not-failed (tail noise is high) and are not part of this ratchet.
  • An intentional perf trade is landed by raising the relevant budget in this PR with a documented reason (CI never auto-commits a baseline).
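
The verdict rules above can be sketched as a small classifier. This is an assumption-laden illustration, not the gate's actual code: the type and function names are hypothetical, a deterministic ("det") metric is modeled as a zero-width band, and a move outside the band in the good direction is treated as WARN (a "real move"), which the text above does not spell out.

```rust
#[derive(Debug, PartialEq)]
enum Verdict {
    Pass,
    Warn,
    Fail,
}

/// Which direction counts as a regression for a metric.
#[derive(Clone, Copy)]
enum BadDirection {
    Drop, // e.g. qps
    Rise, // e.g. bytes_per_key
}

/// Signed percentage change of head relative to base.
fn delta_pct(base: f64, head: f64) -> f64 {
    (head - base) / base * 100.0
}

/// FAIL past budget in the bad direction, PASS inside the noise band,
/// WARN for any other move.
fn classify(base: f64, head: f64, band_pct: f64, budget_pct: f64, bad: BadDirection) -> Verdict {
    let delta = delta_pct(base, head);
    // Size of the move in the bad direction; positive means worse.
    let worse = match bad {
        BadDirection::Drop => -delta,
        BadDirection::Rise => delta,
    };
    if worse > budget_pct {
        Verdict::Fail
    } else if delta.abs() <= band_pct {
        Verdict::Pass
    } else {
        Verdict::Warn
    }
}
```

Applied to the qps row of the table, `classify(71601.53, 71952.52, 5.03, 15.0, BadDirection::Drop)` yields `Verdict::Pass`, since the 0.49% move sits inside the 5.03% band.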
