[wip] db/datastruct/btindex: fix OOM in Build() — stream nodes, use off-heap EF#21778

Open
AskAlexSharov wants to merge 6 commits into main from alex/bt_streaming_build3_36

Conversation

@AskAlexSharov AskAlexSharov commented Jun 13, 2026

Collaborator

based on: #21777

  • .bt file format change

Fixes two sources of OOM during BTree index build for mainnet-scale snapshots:

Primary (~3.5 GB heap spike): NewEliasFano(keysWritten, maxOffset) allocates the full EF bit arrays on the Go heap. It is replaced with NewEliasFanoOffHeap, which backs them with an mmap'd temp file, so the OS pages the data and the Go heap is unaffected.
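The general off-heap technique can be sketched as follows. This is a minimal illustration, not the implementation of NewEliasFanoOffHeap: the `offHeapWords` type and its methods are hypothetical, it assumes a Unix-like OS where `syscall.Mmap` is available, and it stores plain 64-bit words rather than EF bit arrays.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"os"
	"syscall"
)

// offHeapWords is a hypothetical fixed-size array of uint64 values backed by
// an mmap'd temp file instead of a Go-heap slice.
type offHeapWords struct {
	f    *os.File
	data []byte // mapped region; not allocated by the Go allocator
}

// newOffHeapWords creates a temp file large enough for n words and maps it.
func newOffHeapWords(n int) (*offHeapWords, error) {
	if n <= 0 {
		return nil, fmt.Errorf("n must be positive, got %d", n)
	}
	f, err := os.CreateTemp("", "offheap-*.tmp")
	if err != nil {
		return nil, err
	}
	size := n * 8
	// Extend the file to its final size; the new range reads as zeros.
	if err := f.Truncate(int64(size)); err != nil {
		f.Close()
		os.Remove(f.Name())
		return nil, err
	}
	data, err := syscall.Mmap(int(f.Fd()), 0, size,
		syscall.PROT_READ|syscall.PROT_WRITE, syscall.MAP_SHARED)
	if err != nil {
		f.Close()
		os.Remove(f.Name())
		return nil, err
	}
	return &offHeapWords{f: f, data: data}, nil
}

// Set stores v at word index i (little-endian is an arbitrary choice here).
func (o *offHeapWords) Set(i int, v uint64) {
	binary.LittleEndian.PutUint64(o.data[i*8:], v)
}

// Get loads the word at index i.
func (o *offHeapWords) Get(i int) uint64 {
	return binary.LittleEndian.Uint64(o.data[i*8:])
}

// Close unmaps the region, closes the file, and deletes it.
func (o *offHeapWords) Close() error {
	err := syscall.Munmap(o.data)
	if cerr := o.f.Close(); err == nil {
		err = cerr
	}
	if rerr := os.Remove(o.f.Name()); err == nil {
		err = rerr
	}
	return err
}

func main() {
	w, err := newOffHeapWords(1 << 20) // 8 MiB of file-backed storage
	if err != nil {
		panic(err)
	}
	defer w.Close()
	w.Set(12345, 42)
	fmt.Println(w.Get(12345))
}
```

Because the pages belong to a file mapping, the kernel can write them back and evict them under memory pressure, and they do not count toward the Go heap that the garbage collector manages.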

Secondary (~576 MB): nodes := make([]Node, 0, keysWritten/M) accumulated every M-th key in memory before writing. This is eliminated by swapping the file layout so that nodes stream directly to the output file during ETL Load, with the EF appended last.

New file format (v1): [0x01 version][M(8)][nodesCount(8)][nodes...][EF]
Old file format (v0): [EF][nodesCount(8)][nodes...]
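A minimal sketch of encoding and parsing the v1 header described above. The field widths (1-byte version, 8-byte M, 8-byte nodesCount) come from the layout line; the big-endian byte order and the names `headerV1`, `encodeHeaderV1`, and `decodeHeaderV1` are assumptions made for illustration and may not match the PR.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

const (
	versionV1    = 0x01
	headerSizeV1 = 1 + 8 + 8 // version + M + nodesCount
)

// headerV1 (hypothetical) holds the fixed-size fields at the start of a v1 file.
type headerV1 struct {
	M          uint64
	NodesCount uint64
}

// encodeHeaderV1 lays out [0x01][M(8)][nodesCount(8)].
// Big-endian is an assumption; the PR description does not state the byte order.
func encodeHeaderV1(h headerV1) []byte {
	buf := make([]byte, headerSizeV1)
	buf[0] = versionV1
	binary.BigEndian.PutUint64(buf[1:9], h.M)
	binary.BigEndian.PutUint64(buf[9:17], h.NodesCount)
	return buf
}

// decodeHeaderV1 parses the header and returns the remaining bytes
// (the node records followed by the EF section).
func decodeHeaderV1(data []byte) (headerV1, []byte, error) {
	if len(data) < headerSizeV1 {
		return headerV1{}, nil, errors.New("data shorter than v1 header")
	}
	if data[0] != versionV1 {
		return headerV1{}, nil, fmt.Errorf("unexpected version byte 0x%02x", data[0])
	}
	h := headerV1{
		M:          binary.BigEndian.Uint64(data[1:9]),
		NodesCount: binary.BigEndian.Uint64(data[9:17]),
	}
	return h, data[headerSizeV1:], nil
}

func main() {
	raw := encodeHeaderV1(headerV1{M: 256, NodesCount: 3})
	raw = append(raw, []byte("nodes-and-ef")...) // stand-in for the real payload
	h, rest, err := decodeHeaderV1(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(h.M, h.NodesCount, len(rest))
}
```

The sketch only handles v1 input; how the real reader tells a v0 file from a v1 file is not covered here.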

M is now stored in the file header, so the reader is self-contained: OpenBtreeIndexWithDecompressor reads M from v1 files and ignores the caller-supplied parameter. The BtIndex.M() accessor exposes it.

…ndex build

NewEliasFano allocates a multi-GB contiguous bit array on the Go heap for
large snapshots (mainnet: ~2B keys → ~3.5 GB). NewEliasFanoOffHeap backs
the same arrays with a mmap'd temp file so the OS pages it, avoiding the
heap spike.

Copilot AI left a comment

Contributor

Pull request overview

This PR updates the BtIndex on-disk format and build process to avoid mainnet-scale OOMs during B-tree index construction by (1) moving Elias–Fano (EF) build buffers off-heap and (2) streaming node records directly to the output instead of accumulating them in memory.

Changes:

  • Switch EF construction in BtIndexWriter.Build() to eliasfano32.NewEliasFanoOffHeap(...) (mmap-backed temp file) to eliminate large Go-heap spikes.
  • Change the index file layout to v1 ([version][nodesCount][nodes...][EF]) so node entries can be streamed during ETL Load.
  • Update index open logic to support both v0 and v1 layouts, and tighten the test assertion to catch regressions where too many nodes are stored.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| db/datastruct/btindex/btree_index.go | Writes v1 header + streams nodes during Build; reads v0/v1 formats on open; uses off-heap EF builder. |
| db/datastruct/btindex/btree_index_test.go | Adds an upper bound assertion for the number of cached nodes. |
| db/datastruct/btindex/bps_tree.go | Removes node-list encoding helper; updates node decoding helper to return bytes consumed. |

Comment thread db/datastruct/btindex/btree_index.go
Comment thread db/datastruct/btindex/bps_tree.go