
execution/bal: serve BALs up to and beyond the WSP #21764

Open
taratorio wants to merge 6 commits into main from worktree-bal-retention

Conversation

@taratorio

Member

Why

EIP-7928 requires nodes to serve BALs for the weak subjectivity period (3533 epochs ≈ 113,056 blocks ≈ 15.7 days). Erigon only stores BALs in MDBX for the last MaxReorgDepth (96) blocks — the exec-stage prune (#19110) deletes everything older — so eth/71, engine_getPayloadBodiesBy*V2 and eth_getBlockAccessList could not serve anything beyond the reorg window.

Full/archive nodes keep account/storage/code domain history well past the WSP (default prune distance 262,144 blocks), so older BALs don't need to be stored at all: they can be re-derived on demand by re-executing the block against historical state — the same approach already used by the receipts generator for devp2p GetReceipts.

What

New execution/bal package:

  • RederiveBlockAccessList — replays a full block (init system calls at access index −1, every transaction, finalize at len(txns)) recording I/O into a VersionedIO, mirroring the block builder's BAL recording sequence.
  • Regenerator — caching wrapper around it: LRU (BAL_LRU, default 1024 entries ≈ 100MB), per-block-hash dedup mutex, BAL_EXEC_CONCURRENCY replay semaphore (default GOMAXPROCS/2), history reader positioned at the block's first txNum. The re-derived BAL's hash is verified against header.BlockAccessListHash before it is cached or served — the node only ever serves BALs it derived itself from canonical state; a mismatch (e.g. non-canonical block) degrades to "not available". Pruned history surfaces as state.PrunedError; pre-Amsterdam headers (no BAL commitment) return nil.

Wired into all three read paths, DB-first with regeneration fallback:

| Read path | Before (block older than 96) | After |
|---|---|---|
| eth/71 GetBlockAccessLists | 0x80 sentinel | regenerated BAL; 0x80 only beyond history |
| engine_getPayloadBodiesBy{Hash,Range}V2 | blockAccessList: null | regenerated; null only beyond history |
| eth_getBlockAccessList | error 4444 | regenerated; 4444 only beyond history |
  • devp2p: eth.BlockAccessListGetter interface, fallback inside AnswerGetBlockAccessListsQuery; MultiClient owns a Regenerator next to its receipts generator. The existing 2MiB/1024-lookup response limits bound per-request replay work.
  • engine API: fallback in the execmodule payload-bodies getters (serves in-process and gRPC consumers); beginOverlayOrRo now returns kv.TemporalTx so the getters' existing tx is reused for regeneration.
  • RPC: BaseAPI.balRegenerator, constructed when the engine is a full rules.Engine.

BAL storage and pruning are unchanged — this PR only adds the serving fallback.

Testing

All new tests run with t.Parallel():

  • execution/bal: insert a blockgen Amsterdam chain (transfers, contract creation with SSTORE init code, multi-tx and empty blocks), prune all stored BAL rows (asserted 0 rows left), then require that the regenerated bytes are byte-equal to blockgen's independently computed canonical BALs and that their hash equals the header commitment; pre-Amsterdam returns nil.
  • p2p/protocols/eth: handler fallback unit tests (stored wins, regenerated served, getter empty/error/unknown → sentinel, getter not consulted when stored).
  • execution/execmodule: payload bodies serve stored BALs, then after pruning regenerate identical bytes via both ByHash and ByRange.
  • rpc/jsonrpc: eth_getBlockAccessList returns the regenerated BAL after the stored rows are pruned.

make lint clean, full test suites of all touched packages pass, race-clean.

@yperbasis added the Glamsterdam (https://eips.ethereum.org/EIPS/eip-7773) label on Jun 12, 2026
@yperbasis requested a review from Copilot on Jun 12, 2026, 11:41

Copilot AI left a comment

Contributor


Pull request overview

Adds a regeneration-based fallback to serve EIP-7928 Block Access Lists (BALs) beyond Erigon’s reorg-depth DB retention window by re-executing blocks against historical state, and wires it into RPC, Engine API, and eth/71 devp2p serving paths. This aligns BAL availability with the weak subjectivity period requirements without changing existing BAL storage/pruning.

Changes:

  • Introduces execution/bal with RederiveBlockAccessList + Regenerator (LRU cache, per-block de-dup, bounded replay concurrency, history reader positioning, header hash verification).
  • Adds DB-first + regeneration fallback for BAL serving in:
    • RPC eth_getBlockAccessList
    • Engine API engine_getPayloadBodiesBy{Hash,Range}V2
    • eth/71 GetBlockAccessLists
  • Adds tests covering regeneration after pruning across the three serving surfaces.

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 5 comments.

| File | Description |
|---|---|
| rpc/jsonrpc/eth_block_access_list.go | Adds BAL regeneration fallback when stored BAL is missing/pruned. |
| rpc/jsonrpc/eth_block_access_list_test.go | Tests RPC BAL regeneration after pruning stored BAL rows. |
| rpc/jsonrpc/eth_api.go | Adds bal.Regenerator construction into BaseAPI when a full rules engine is available. |
| p2p/sentry/sentry_multi_client/sentry_multi_client.go | Wires BAL regeneration into eth/71 handler using temporal tx + BlockAccessListGetter. |
| p2p/protocols/eth/handlers.go | Extends eth/71 BAL handler to optionally regenerate missing/pruned BALs. |
| p2p/protocols/eth/handlers_test.go | Adds unit tests for handler regeneration fallback behavior and edge cases. |
| execution/execmodule/getters.go | Adds BAL regeneration fallback for payload bodies getters; switches getter tx to kv.TemporalTx. |
| execution/execmodule/exec_module.go | Instantiates a BAL regenerator inside ExecModule. |
| execution/execmodule/exec_module_test.go | Tests payload bodies BAL regeneration after pruning stored BAL rows. |
| execution/bal/regenerator.go | Implements BAL regeneration with caching, concurrency bounds, and historical state setup. |
| execution/bal/regenerator_test.go | Tests regenerator reproduces canonical BAL bytes and handles pre-Amsterdam. |
| execution/bal/rederive.go | Implements full-block replay to re-derive BALs consistent with builder recording sequence. |
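To make the replay ordering in execution/bal/rederive.go concrete, the sketch below lists the access indices the PR description gives for a block: −1 for the init system calls and len(txns) for finalize. The `step` type and `replayOrder` function are hypothetical, and numbering the transactions 0 to len(txns)−1 is an assumption inferred from those two endpoints.

```go
package main

import "fmt"

// step is one recorded unit of block replay in this sketch.
type step struct {
	accessIndex int
	kind        string
}

// replayOrder returns the recording sequence for a block with txCount
// transactions: init system calls, each transaction in order, then finalize.
func replayOrder(txCount int) []step {
	steps := make([]step, 0, txCount+2)
	steps = append(steps, step{-1, "init system calls"})
	for i := 0; i < txCount; i++ {
		steps = append(steps, step{i, "transaction"}) // assumed 0-based numbering
	}
	return append(steps, step{txCount, "finalize"})
}

func main() {
	for _, s := range replayOrder(2) {
		fmt.Printf("%2d  %s\n", s.accessIndex, s.kind)
	}
}
```

An empty block still yields two steps (−1 and 0), which is why the test chain in the description includes empty blocks.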


Comment thread execution/bal/regenerator.go
Comment thread execution/bal/regenerator.go
Comment thread execution/bal/regenerator.go
Comment thread execution/bal/regenerator.go
Comment thread execution/bal/regenerator.go
pull Bot pushed a commit to Dustin4444/erigon that referenced this pull request Jun 13, 2026
…ture set (erigontech#21771)

## Problem

`eest-spec-enginextests-benchmark-30m-parallel` failed on erigontech#21764 with
the Go linker dying mid-build
([run](https://github.com/erigontech/erigon/actions/runs/27421370747/job/81047607869)):

```
link: mapping output file failed: no space left on device
```

GitHub's `ubuntu-24.04` hosted pool currently serves two disk flavors —
72G and 145G root volumes. The same shard on the same branch passed 45
minutes earlier on a 145G runner (110G free after cleanup) and failed on
a 72G one (36G free after cleanup).

The job's disk demand sits right at that 36G line because every
`eest-spec-*` shard provisions **all three** EEST corpora through the
blanket `test-fixtures-eest` Make target, while
`tools/run-eest-spec-test.sh` only ever reads one of them:

| corpus | tarball | extracted |
|---|---|---|
| eest_benchmark | 1.0G | 5.3G |
| eest_devnet | 0.6G | 10G |
| eest_stable | 0.4G | 8.1G |

36G free − ~8G (Go toolchain + mod/build cache restore) − 2G tarballs −
23.4G extraction ≈ 2.6G left when `go build` starts → ENOSPC at link.
The ±1-2G run-to-run variance is why it flakes instead of failing
deterministically.
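The budget above can be reproduced with a short Go sketch. The figures are taken from the table and the sentence above; the `corpus` type and `headroom` function are made up for illustration and count only the terms that appear in that sentence.

```go
package main

import "fmt"

// corpus holds the tarball and extracted sizes from the table, in GB.
type corpus struct {
	name               string
	tarballG, extractG float64
}

var corpora = []corpus{
	{"eest_benchmark", 1.0, 5.3},
	{"eest_devnet", 0.6, 10},
	{"eest_stable", 0.4, 8.1},
}

// headroom returns the free space left when `go build` starts, after the
// toolchain/cache restore and after downloading and extracting the given sets.
func headroom(freeG, toolchainG float64, sets []corpus) float64 {
	left := freeG - toolchainG
	for _, c := range sets {
		left -= c.tarballG + c.extractG
	}
	return left
}

func main() {
	// 72G runner: 36G free after cleanup, ~8G for toolchain and caches.
	fmt.Printf("all three corpora: %.1fG left\n", headroom(36, 8, corpora))
	// prints "all three corpora: 2.6G left"
}
```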

## Fix

- Move fixture provisioning into `tools/run-eest-spec-test.sh`, which
already resolves the shard's corpus from its name — each shard now
downloads/extracts only the set it reads. Worst-case footprint drops
from ~35G to ~22G (devnet shards), ~17G for the benchmark shard that
flaked.
- The zkevm/non-zkevm Makefile rule split existed solely to pick the
fixture prerequisite, so the four pattern rules collapse to two
(race/non-race).
- Add `--download-only` to `tools/test-fixtures.sh` and use it in
`cache-warming-eest-fixtures.yml`. The warmer caches only the `.tar.gz`
files but was extracting all four corpora (~27G) — on a runner that
doesn't even run the cleanup-space step, i.e. the same ENOSPC waiting to
happen on a 72G runner whenever the probe misses.

Local `make eest-spec-*` behaviour is unchanged: provisioning is a
sentinel-guarded no-op when the corpus is already extracted, and `make
test-fixtures-eest`/`test-fixtures-zkevm` remain for prefetching
everything.

## Verification

TDD note: infra/shell change with no Go behaviour; verified functionally
instead of via Go tests:

- Synthetic end-to-end test of `test-fixtures.sh` (tiny `file://`
manifest in a temp dir): download-only cold / idempotent re-run /
normal-mode extract from existing tarball / sentinel no-op /
download-only re-download after tarball deletion — all pass.
- `make -n` for one shard per class (plain, race, zkevm-race,
benchmark): correct binary (`evm` vs `evm.race`), no blanket fixture
extraction in any recipe.
- Smoke run of `run-eest-spec-test.sh statetests-stable` against a
populated cache: in-script provisioning hits the cached fast-path and
proceeds into the run stage.
- `actionlint` on both workflows; `shellcheck` adds no new findings;
`make lint` clean twice.