txhash: cold-store streamhash index — build + read (#728) #780
Draft
tamirms wants to merge 1 commit into
Implements the buildable slice of #728: the cold (immutable-file) half of the txhash store. It adds a per-index streamhash MPHF over (txhash, ledgerSeq) and a ColdReader that resolves a tx hash to the ledger it was committed in.

- cold_format.go: per-index format with a 3-byte ledger-seq payload (offset from a 4-byte MinLedger anchor) and a 1-byte fingerprint; ColdReader.Lookup(hash) -> ledgerSeq, ErrNotFound on miss, idempotent mmap Close. Rejects a payload width other than ColdPayloadSize at open.
- cold_index.go + cold_merge.go: BuildColdIndex merges per-chunk sorted .bin files into one streamhash index via a parallel O_DIRECT fan-in merge (ported from streamhash cmd/bench), fed single-pass into the sorted-mode builder so I/O, merge CPU, and the MPHF build overlap. Header/size and payload-budget guards; ctx cancellation; first-error-wins pipeline.
- odirect_{linux,other}.go: O_DIRECT page-cache bypass on Linux, cached-open fallback on tmpfs/EINVAL, no-op elsewhere.

Benchmark-driven tuning (warm macOS + cold Linux NVMe over 382M real keys):

- streamhash block-build workers default to NumCPU/2 (the e2e gate; ~2.7x over serial, saturates at NumCPU/2).
- merge leaves capped at NumCPU/2 (not a fixed 32): NumCPU/2 leaves + NumCPU/2 builders fill the cores without oversubscription; ~+18% e2e cold vs NumCPU, neutral warm.
- mergeBatchSize 16384 (~+5-7% e2e, fewer channel hand-offs).
- k1 merge tiebreak kept: keying on k0 only was ~8% faster, but the index is built rarely/offline, and the tiebreak makes byte-reproducibility rest on streamhash being deterministic-given-input rather than insensitive to the within-block order of same-prefix keys.
- Sorted k-way merge kept over NewUnsortedBuilder (measured 1.7-6x faster for the pre-sorted inputs); loser-tree / 4-ary heap / wider fan-in rejected with cold data.

Out of scope (blocked on unbuilt deps): production build wiring to the cold txhash ingester (#765) and the getTransaction read assembly over the tx-details-by-hash view (#764) + cold ledger reader (#725). The .bin input format is the documented seam for #765.

Tests cover build/query round-trip, miss, concurrent reads, fan-in tree, large-file refill, and error/format guards. Benchmarks (warm + skip-guarded real-data) document and reproduce the tuning choices.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
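For readers unfamiliar with anchored payloads, here is a minimal sketch of how a ledger sequence could be packed into a 3-byte offset from a 4-byte MinLedger anchor, including a payload-budget guard. It is not the code from cold_format.go: the names encodeLedger, decodeLedger, coldPayloadSize, and errPayloadBudget are hypothetical, and the little-endian byte order is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// coldPayloadSize mirrors the 3-byte payload width described in the commit
// message. The identifier and the little-endian layout are assumptions.
const coldPayloadSize = 3

// maxOffset is the largest ledger offset a 3-byte payload can hold (2^24 - 1).
const maxOffset = 1<<(8*coldPayloadSize) - 1

// errPayloadBudget is a hypothetical error for offsets that do not fit.
var errPayloadBudget = errors.New("ledger offset exceeds 3-byte payload budget")

// encodeLedger stores ledgerSeq as a 3-byte offset from the minLedger anchor.
func encodeLedger(minLedger, ledgerSeq uint32) ([coldPayloadSize]byte, error) {
	var out [coldPayloadSize]byte
	if ledgerSeq < minLedger {
		return out, fmt.Errorf("ledger %d is below anchor %d", ledgerSeq, minLedger)
	}
	off := ledgerSeq - minLedger
	if off > maxOffset {
		return out, errPayloadBudget
	}
	out[0] = byte(off)
	out[1] = byte(off >> 8)
	out[2] = byte(off >> 16)
	return out, nil
}

// decodeLedger reverses encodeLedger: anchor plus the stored offset.
func decodeLedger(minLedger uint32, p [coldPayloadSize]byte) uint32 {
	off := uint32(p[0]) | uint32(p[1])<<8 | uint32(p[2])<<16
	return minLedger + off
}

func main() {
	const anchor = 50_000_000
	p, err := encodeLedger(anchor, anchor+123_456)
	if err != nil {
		panic(err)
	}
	fmt.Println(decodeLedger(anchor, p) == anchor+123_456)
}
```

Under these assumptions a single index can span at most 2^24 ledgers past its anchor, which is the kind of limit a payload-budget guard would enforce at build time.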
Part of #728 — implements the buildable slice: the cold (immutable-file) half of the txhash store.
What this adds
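A per-index streamhash MPHF over `(txhash, ledgerSeq)` and a `ColdReader` that resolves a tx hash to the ledger it was committed in.
- `cold_format.go`: per-index on-disk format with a 3-byte ledger-seq payload (offset from a 4-byte `MinLedger` anchor) and a 1-byte fingerprint. `ColdReader.Lookup(hash) -> ledgerSeq`, `ErrNotFound` on miss, idempotent mmap `Close`; rejects a payload width ≠ `ColdPayloadSize` at open.
- `cold_index.go` + `cold_merge.go`: `BuildColdIndex` merges per-chunk sorted `.bin` files into one streamhash index via a parallel O_DIRECT fan-in merge (ported from streamhash `cmd/bench`), fed single-pass into the sorted-mode builder so I/O, merge CPU, and the MPHF build overlap. Header/size + payload-budget guards, `ctx` cancellation, first-error-wins pipeline.
- `odirect_{linux,other}.go`: O_DIRECT page-cache bypass on Linux (cached-open fallback on tmpfs/EINVAL; no-op elsewhere).
Tuning (benchmark-driven: warm macOS + cold Linux NVMe over 382M real keys)
- streamhash block-build workers default to NumCPU/2 (the e2e gate; ~2.7× over serial, saturates at NumCPU/2).
- Merge leaves capped at NumCPU/2 (not a fixed 32): NumCPU/2 leaves + NumCPU/2 builders fill the cores without oversubscription; ~+18% e2e cold vs NumCPU, neutral warm.
- `mergeBatchSize` 16384 (~+5–7% e2e).
- k1 merge tiebreak kept: keying on k0 only was ~8% faster, but the index is built rarely/offline, and the tiebreak makes byte-reproducibility rest on streamhash being deterministic-given-input rather than insensitive to the within-block order of same-prefix keys.
- Sorted k-way merge kept over `NewUnsortedBuilder` (measured 1.7–6× faster for the pre-sorted inputs); loser-tree / 4-ary heap / wider fan-in rejected with cold data.
Out of scope (blocked on unbuilt deps)
- Production build wiring to the cold txhash ingester (#765). The `.bin` input format is the documented seam for it.
- `getTransaction` read assembly over the tx-details-by-hash view (#764, "XDR view extractors: events, tx-hashes, tx-details, tx-pages from LedgerCloseMetaView") + cold ledger reader (#725, "Ledger query path over packfiles (immutable-file reader)"). `ColdReader` stops at `ledgerSeq`.
Tests / benchmarks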
Build/query round-trip, miss, concurrent reads, fan-in tree, large-file refill, and error/format guards. Benchmarks (warm + skip-guarded real-data) document and reproduce the tuning choices.
`go test -race` + `golangci-lint` clean.
🤖 Generated with Claude Code
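To illustrate why the k1 tiebreak matters for reproducibility, here is a minimal sketch of a sorted k-way merge over pre-sorted runs using `container/heap`. It is an in-memory simplification, not the parallel O_DIRECT fan-in merge in `cold_merge.go`: the `entry` type, the `mergeRuns` function, and the reading of k0/k1 as two 64-bit key words are assumptions made for illustration.

```go
package main

import (
	"container/heap"
	"fmt"
)

// entry is a hypothetical record. k0 and k1 are assumed to be two 64-bit
// words of the key; ledger is the associated ledger sequence.
type entry struct {
	k0, k1 uint64
	ledger uint32
}

// less orders by k0 and breaks ties on k1, giving a total order over
// distinct keys regardless of which run an entry came from.
func less(a, b entry) bool {
	if a.k0 != b.k0 {
		return a.k0 < b.k0
	}
	return a.k1 < b.k1
}

// cursor tracks the read position within one pre-sorted run.
type cursor struct {
	run []entry
	pos int
}

// mergeHeap is a min-heap of cursors keyed on each cursor's current entry.
type mergeHeap []*cursor

func (h mergeHeap) Len() int { return len(h) }
func (h mergeHeap) Less(i, j int) bool {
	return less(h[i].run[h[i].pos], h[j].run[h[j].pos])
}
func (h mergeHeap) Swap(i, j int) { h[i], h[j] = h[j], h[i] }
func (h *mergeHeap) Push(x any)   { *h = append(*h, x.(*cursor)) }
func (h *mergeHeap) Pop() any {
	old := *h
	n := len(old)
	c := old[n-1]
	*h = old[:n-1]
	return c
}

// mergeRuns merges pre-sorted runs into a single sorted slice.
func mergeRuns(runs [][]entry) []entry {
	h := &mergeHeap{}
	total := 0
	for _, r := range runs {
		if len(r) > 0 {
			*h = append(*h, &cursor{run: r})
			total += len(r)
		}
	}
	heap.Init(h)
	out := make([]entry, 0, total)
	for h.Len() > 0 {
		c := (*h)[0] // cursor holding the smallest current entry
		out = append(out, c.run[c.pos])
		c.pos++
		if c.pos == len(c.run) {
			heap.Pop(h) // run exhausted: drop its cursor
		} else {
			heap.Fix(h, 0) // restore heap order after advancing
		}
	}
	return out
}

func main() {
	a := []entry{{1, 5, 10}, {2, 1, 11}}
	b := []entry{{1, 3, 12}, {3, 0, 13}}
	fmt.Println(mergeRuns([][]entry{a, b}))
}
```

With the tiebreak in place, entries that share k0 come out in k1 order whether the runs are passed as (a, b) or (b, a). Keying on k0 alone would leave that order dependent on heap layout, which is the trade-off the tuning note describes.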