db/datastruct/btindex: interpolation search in BTree leaf#21794
Draft
sudeepdino008 wants to merge 1 commit into
Search the BTree leaf window by interpolation instead of binary search: estimate the target position from the bytes after the bound keys' common prefix, falling back to binary after BtInterpBudget (8) probes. Default on. Cold .kv reads speed up 1.5-2.4x because interpolated probes cluster near the target, so they cause far fewer distinct cold page faults than binary's midpoint jumps. No index/format change; M unchanged.

Cold btnav µs/op (M=256, mainnet 0-8192 files, vmtouch -e, 15k keys):

| Files | Binary | Interpolation | Speedup |
|---|---|---|---|
| storage | 171 | 108 | 1.58x |
| accounts | 147 | 94 | 1.56x |
| code | 258 | 109 | 2.37x |
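The position estimate described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the helper names `commonPrefixLen`, `suffixValue`, and `estimate` are hypothetical, and it assumes the target lies between the two bound keys (so it shares their common prefix) and that 8 bytes after the prefix are enough to rank keys.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// commonPrefixLen returns how many leading bytes a and b share.
func commonPrefixLen(a, b []byte) int {
	n := len(a)
	if len(b) < n {
		n = len(b)
	}
	for i := 0; i < n; i++ {
		if a[i] != b[i] {
			return i
		}
	}
	return n
}

// suffixValue reads up to 8 bytes of k, starting at offset skip, as a
// big-endian integer. Shorter keys are zero-padded on the right.
func suffixValue(k []byte, skip int) uint64 {
	var buf [8]byte
	if skip < len(k) {
		copy(buf[:], k[skip:])
	}
	return binary.BigEndian.Uint64(buf[:])
}

// estimate guesses where target sits in the window [l, r), assuming keys are
// spread evenly between the bound keys klo and khi (hypothetical helper).
// For a non-empty window the result is always in [l, r-1].
func estimate(l, r uint64, klo, khi, target []byte) uint64 {
	if r <= l+1 {
		return l
	}
	skip := commonPrefixLen(klo, khi)
	lo, hi, t := suffixValue(klo, skip), suffixValue(khi, skip), suffixValue(target, skip)
	if hi <= lo || t <= lo {
		return l
	}
	if t >= hi {
		return r - 1
	}
	// float64 sidesteps overflow in (t-lo)*(r-l); the lost precision is
	// acceptable because the result is only a starting guess.
	frac := float64(t-lo) / float64(hi-lo)
	pos := l + uint64(frac*float64(r-l))
	if pos >= r {
		pos = r - 1
	}
	return pos
}

func main() {
	klo := []byte{0xAA, 0xBB, 0x00}
	khi := []byte{0xAA, 0xBB, 0xFF}
	target := []byte{0xAA, 0xBB, 0x80}
	fmt.Println(estimate(0, 256, klo, khi, target))
}
```

Skipping the common prefix matters because keys with a long shared prefix would otherwise map to nearly identical integers, leaving the fraction with almost no resolution.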
ca1bb01 to b43326e
Collaborator
On bloatnet:
Context:
(I benchmarked this PR, not #21799.)
Contributor
Pull request overview
This PR introduces an interpolation-based search strategy when narrowing the B+tree leaf window, aiming to improve cold-read latency by keeping probes spatially clustered in the .kv data file. It adds an environment-controlled toggle and a probe budget, with fallback to binary search.
Changes:
- Add `BT_INTERP`/`BT_INTERP_BUDGET` knobs to enable interpolation search with a bounded probe budget.
- Extend `BpsTree.Get` to narrow `[l,r)` using interpolation estimates (with binary fallback after the budget).
- Add a new test asserting interpolation search returns identical `(value, ok, offset)` results as binary search across multiple budgets.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| db/datastruct/btindex/interp_search_test.go | Adds an equivalence test comparing interpolation vs binary results across budgets. |
| db/datastruct/btindex/btree_index.go | Introduces env-configurable globals to enable/parameterize interpolation search. |
| db/datastruct/btindex/bps_tree.go | Updates pivot search to return bounding keys and implements interpolation narrowing in Get, plus helpers. |
Comment on lines +16 to +19

    func TestInterpEquivBinary(t *testing.T) {
    	t.Parallel()
    	saveInterp, saveBudget := BtInterp, BtInterpBudget
    	defer func() { BtInterp, BtInterpBudget = saveInterp, saveBudget }()
Comment on lines +367 to +371

    // Interpolation search narrows the window with position estimates from the
    // bound keys; after BtInterpBudget probes fall back to binary. The final
    // small window is handed to the linear scan below either way.
    if BtInterp && len(klo) > 0 && len(khi) > 0 {
    	probes := uint64(0)
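The budgeted narrowing that this excerpt begins can be sketched in isolation. The version below is a simplified stand-in, not the PR's `BpsTree.Get`: it searches a sorted `[]uint64` rather than byte keys in a `.kv` file, runs the window down to a single position instead of handing a small window to a linear scan, and the names `fraction` and `lowerBound` are hypothetical.

```go
package main

import "fmt"

// fraction returns where t lies between lo and hi, clamped to [0, 1].
// Callers must ensure hi > lo.
func fraction(lo, hi, t uint64) float64 {
	if t <= lo {
		return 0
	}
	if t >= hi {
		return 1
	}
	return float64(t-lo) / float64(hi-lo)
}

// lowerBound returns the first index i with keys[i] >= target (len(keys) if
// there is none) and the number of probes used. The first `budget` probes
// are interpolated from the bound keys; later probes use the binary midpoint.
func lowerBound(keys []uint64, target uint64, budget int) (idx, probes int) {
	l, r := 0, len(keys)
	if r == 0 {
		return 0, 0
	}
	// Bound keys for the window; assumed known up front, as with pivots.
	klo, khi := keys[0], keys[r-1]
	for l < r {
		m := l + (r-l)/2 // binary midpoint, also the fallback
		if probes < budget && khi > klo {
			m = l + int(fraction(klo, khi, target)*float64(r-l))
			if m >= r {
				m = r - 1
			}
		}
		probes++
		if k := keys[m]; k < target {
			l, klo = m+1, k // target is to the right of m
		} else {
			r, khi = m, k // target is at or to the left of m
		}
	}
	return l, probes
}

func main() {
	keys := make([]uint64, 1<<12)
	for i := range keys {
		keys[i] = uint64(i) * 7
	}
	for _, budget := range []int{0, 8} {
		idx, probes := lowerBound(keys, 12345, budget)
		fmt.Println("budget", budget, "idx", idx, "probes", probes)
	}
}
```

Because any probe position inside the window keeps the lower-bound invariant, the budget affects only how many probes are spent, never the result. On skewed keys an interpolated probe can shrink the window by a single slot, which is the degenerate case the budget bounds.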
Search the BTree leaf window by interpolation instead of binary search. Leaf keys are near-uniform, so interpolating on the bytes after the bound keys' common prefix lands near the target and keeps probes spatially clustered, so cold reads hit far fewer distinct `.kv` pages.

After a probe budget (`BtInterpBudget`, default 8) it falls back to binary search, bounding degenerate windows. Default on (`BT_INTERP=true`). No index/format change; `M` unchanged (256).

Cold read latency (btnav µs/op, M=256, mainnet `0-8192` files, `vmtouch -e`, 15k random keys)

The win is page locality, not fewer probes. Interpolated probes land near the target, so successive probes hit the same/adjacent `.kv` pages (already faulted); binary's midpoint jumps fault new cold pages. The clearest evidence: on `storage`, pure interpolation does more probes than binary yet is still fastest cold, so probe count can't be what's driving it.

Probe counts are a secondary, sometimes-counterintuitive metric (sim, 20k keys, budget 8): accounts 7.04→2.86, code 7.04→2.83, storage 7.04→4.30. The budget is exhausted (falling back to binary) only on `storage` (20 B shared prefix → interpolation estimates degrade) ~10% of the time, never on accounts/code. Pure interpolation (unbounded budget) ties budget 8 on cold latency; budget 8 just caps the worst-case probe tail (`storage` max 242→16) at ~no latency cost.

Correctness

`TestInterpEquivBinary` asserts interp returns identical `(value, ok, offset)` as binary for 50k hits + misses across budgets `{0,1,2,4,8,2²⁰}`. Existing btree tests pass with interp default-on. Validated on mainnet (storage/accounts/code).
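The page-locality argument can be illustrated with a toy model that counts how many distinct "pages" each strategy touches. Everything here is an assumption made for illustration: keys are fixed-size `uint64` values, a page holds a hypothetical `keysPerPage` keys, and `probeIndexes` and `distinctPages` are made-up names. Real `.kv` entries are variable-length, so this shows the mechanism and does not reproduce the PR's measurements.

```go
package main

import "fmt"

// probeIndexes runs a lower-bound search for target over sorted keys and
// returns every index it reads. With interp=false it probes binary
// midpoints; with interp=true it interpolates from the bound keys.
func probeIndexes(keys []uint64, target uint64, interp bool) []int {
	l, r := 0, len(keys)
	if r == 0 {
		return nil
	}
	var probed []int
	klo, khi := keys[0], keys[r-1]
	for l < r {
		m := l + (r-l)/2
		if interp && khi > klo {
			frac := 0.0
			switch {
			case target >= khi:
				frac = 1
			case target > klo:
				frac = float64(target-klo) / float64(khi-klo)
			}
			m = l + int(frac*float64(r-l))
			if m >= r {
				m = r - 1
			}
		}
		probed = append(probed, m)
		if k := keys[m]; k < target {
			l, klo = m+1, k
		} else {
			r, khi = m, k
		}
	}
	return probed
}

// distinctPages counts the distinct pages touched by the probed indexes,
// assuming keysPerPage fixed-size keys per page (a simplification).
func distinctPages(probed []int, keysPerPage int) int {
	seen := make(map[int]struct{})
	for _, i := range probed {
		seen[i/keysPerPage] = struct{}{}
	}
	return len(seen)
}

func main() {
	keys := make([]uint64, 1<<16)
	for i := range keys {
		keys[i] = uint64(i) * 10 // perfectly uniform toy keys
	}
	const keysPerPage = 16
	target := uint64(400000)
	bin := probeIndexes(keys, target, false)
	itp := probeIndexes(keys, target, true)
	fmt.Println("binary: probes", len(bin), "pages", distinctPages(bin, keysPerPage))
	fmt.Println("interp: probes", len(itp), "pages", distinctPages(itp, keysPerPage))
}
```

In this model the early binary probes are far apart and each lands on a different page, while interpolated probes on near-uniform keys start close to the target and stay there. That is the effect the description credits for the cold-read speedup.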