Add ledger arena for warp sync by justinfrevert · Pull Request #1650 · midnightntwrk/midnight-node

justinfrevert · 2026-06-05T20:28:20Z

Overview

Part 1 of ~3 for #1648. Prepares ledger arena storage for warp sync, and adds the ledger sync process and the handlers for ledger data transfer. (This is covered under the upcoming Q3 statement of work regarding sync performance.)

🗹 TODO before merging

Ready

📌 Submission Checklist

All commits are signed off (git commit -s) for the DCO
Changes are backward-compatible (or flagged if breaking)
Pull request description explains why the change is needed
Self-reviewed the diff
I have included a change file, or skipped for this reason:
If the changes introduce a new feature, I have bumped the node minor version
Update documentation (if relevant)
Updated AGENTS.md if build commands, architecture, or workflows changed
No new todos introduced

🧪 Testing Evidence

Please describe any additional testing aside from CI:

Additional tests are provided (if possible)

🔱 Fork Strategy

Node Runtime Update
Node Client Update
Other:
N/A

Links

Signed-off-by: Justin Frevert <justinfrevert@gmail.com>

…er-state version

Warp serialize (server), import (client), and genesis-arena-init hardcoded ledger_9 (LedgerState v16). A fresh node syncing onto a network governed by an older ledger version (e.g. a real devnet whose genesis+tip arena is v13/ledger_8) then panics: genesis-init can't deserialize the v13 genesis_state, and serve/recover would mismatch.

Parse the 'ledger-state[vNN]' tag from the StateKey / genesis_state and dispatch to the matching compiled-in module (v5->ledger_7, v13->ledger_8, v16->ledger_9). Resolves the deferred per-version dispatch (review finding #5).

Assisted-by: Claude:claude-opus-4-8
Signed-off-by: Justin Frevert <justinfrevert@gmail.com>
(cherry picked from commit 85db776)
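The per-version dispatch described in this commit message could look roughly like the sketch below. It is not the PR's code: the `LedgerModule` enum and all three function names are hypothetical, and it assumes the `ledger-state[vNN]` tag is available as UTF-8 text, which may not match the real key encoding. Only the version-to-module mapping is taken from the commit message.

```rust
/// Ledger modules compiled into the node, as named in the commit message.
/// (Hypothetical enum; the node dispatches to real modules instead.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LedgerModule {
    Ledger7,
    Ledger8,
    Ledger9,
}

/// Extracts `NN` from a tag of the form `ledger-state[vNN]`.
/// Assumption: the tag is plain text; the real StateKey encoding may differ.
pub fn parse_ledger_state_version(tag: &str) -> Option<u32> {
    tag.strip_prefix("ledger-state[v")?
        .strip_suffix(']')?
        .parse()
        .ok()
}

/// Maps a LedgerState version to the module that can read it
/// (v5 -> ledger_7, v13 -> ledger_8, v16 -> ledger_9).
pub fn module_for_version(version: u32) -> Option<LedgerModule> {
    match version {
        5 => Some(LedgerModule::Ledger7),
        13 => Some(LedgerModule::Ledger8),
        16 => Some(LedgerModule::Ledger9),
        _ => None,
    }
}

/// Parses the tag and dispatches. `None` means the tag is malformed or the
/// version is not compiled in, so a caller can return an error rather than panic.
pub fn dispatch(tag: &str) -> Option<LedgerModule> {
    parse_ledger_state_version(tag).and_then(module_for_version)
}
```

With this shape, `dispatch("ledger-state[v13]")` selects the ledger_8 module, and an unknown version such as v99 yields `None` instead of a failed deserialization.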

The recovery monitor sourced peers from SyncingService::peers_info(), which chain_sync.restart() empties on benign post-warp UnknownParent announcements, stalling 1000-scale recovery with "no peers" while libp2p connections were still up.

Source candidate peers from the network layer (connected + reserved peers, which survive restarts), falling back to sync peers only if there are none.

Assisted-by: Claude:claude-opus-4-8
Signed-off-by: Justin Frevert <justinfrevert@gmail.com>
(cherry picked from commit 930aa4e)
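The peer-selection rule in this commit message can be sketched as follows. This is an illustration under assumptions: `PeerId` is a `u64` stand-in for the real libp2p peer id, and `candidate_peers` is a hypothetical name, not a function from the PR.

```rust
use std::collections::BTreeSet;

/// Stand-in for a libp2p peer id; the node uses its own `PeerId` type.
pub type PeerId = u64;

/// Returns recovery candidates: network-layer peers (connected + reserved)
/// when there are any, otherwise the sync layer's peers as a fallback.
/// Duplicates are removed and the result is sorted.
pub fn candidate_peers(
    connected: &[PeerId],
    reserved: &[PeerId],
    sync_peers: &[PeerId],
) -> Vec<PeerId> {
    let network: BTreeSet<PeerId> =
        connected.iter().chain(reserved.iter()).copied().collect();
    if !network.is_empty() {
        network.into_iter().collect()
    } else {
        // Only consulted when the network layer reports no peers at all.
        let fallback: BTreeSet<PeerId> = sync_peers.iter().copied().collect();
        fallback.into_iter().collect()
    }
}
```

Because the sync peer list is only a fallback, an emptied sync peer set after a restart no longer leaves the monitor without candidates while connections are still up.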

…stead of dropping them

A warp-synced node never progressed past its target on a real network (observed: pinned at the warp target for 9+ hours against a devnet fork, ~8,400 sync req/s). Three root causes, all fixed here:

1. GatedBlockImport returned Ok(ImportResult::MissingState) for gated blocks. Substrate silently drops such blocks while chain_sync's best_queued_number stays advanced, so peer ancestor searches can only match genesis (the warp gap has no headers) and "common too far behind" re-arms the search forever; peers stuck in AncestorSearch ignore announcements, starving tip AND gap sync permanently. (Returning Err instead is no better: each restart re-issues an identical block request and BlockRequestHandler reputation-bans the node: "Same block request multiple times".) Fix: defer by awaiting the recovery gate inside import_block. This cannot deadlock under the new gate scope (below): a gated block can only exist after the state-sync target imported, and recovery uses only the client + request-response network, never the import queue.
2. The gate covered every !with_state() block, including gap-sync (block-history) blocks, which import with skip_execution and never touch the arena. Gate now defers only blocks that would execute: StateAction::Execute, or ExecuteIfPossible with parent state present.
3. The monitor re-armed the gate and re-downloaded the whole arena on every restart of a post-warp node, because chain_sync reports an active gap sync as warp_sync: Some(DownloadingBlocks). The monitor now decides by arena content at startup via the new midnight_node_ledger::has_ledger_state (cheap root lookup, per-version dispatch): present -> skip, absent -> recover.

Validated end-to-end against a local-environment fork of a real devnet ledger_8 snapshot (6 validators, mock authorities): fresh --sync warp node went genesis -> fully synced (state + verified 7.3MB arena + full 554k block history) in ~83s, follows the tip with finality, container restart does not re-gate; 0 import errors, 0 NoLedgerState, 0 peer bans.

Assisted-by: Claude:claude-fable-5
Signed-off-by: Justin Frevert <justinfrevert@gmail.com>
(cherry picked from commit 9332cf5ed9f471a1e29a3cfceea480e04f25f17b)
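Root causes 2 and 3 in this commit message both come down to small decision rules, sketched below. The `StateAction` enum here is a simplified local stand-in for Substrate's enum of the same name (the real one carries data and type parameters), and `should_defer`, `StartupDecision` and `startup_decision` are hypothetical names used only for illustration.

```rust
/// Simplified stand-in for Substrate's `StateAction`; the real enum carries data.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum StateAction {
    ApplyChanges,
    Execute,
    ExecuteIfPossible,
    Skip,
}

/// True if importing the block would execute it, and so would read the ledger
/// arena. Only these blocks need to wait for arena recovery.
pub fn should_defer(action: StateAction, parent_state_present: bool) -> bool {
    match action {
        StateAction::Execute => true,
        StateAction::ExecuteIfPossible => parent_state_present,
        // State-carrying and skip-execution (gap-sync) imports never touch the arena.
        StateAction::ApplyChanges | StateAction::Skip => false,
    }
}

/// What the monitor does at startup (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum StartupDecision {
    SkipRecovery,
    Recover,
}

/// Decide by arena content rather than by the reported sync status:
/// present -> skip, absent -> recover.
pub fn startup_decision(arena_has_ledger_state: bool) -> StartupDecision {
    if arena_has_ledger_state {
        StartupDecision::SkipRecovery
    } else {
        StartupDecision::Recover
    }
}
```

Under these rules a gap-sync block (`Skip`) is never deferred, and a restarted post-warp node that already holds the arena does not re-arm the gate.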

Signed-off-by: justinfrevert <81839854+justinfrevert@users.noreply.github.com>

Klapeyron · 2026-06-15T07:12:48Z

+
+	/// Arm the gate: the warp path was taken, so authoring + import must wait for verification.
+	pub fn arm(&self) {
+		self.recovery_pending.store(true, Ordering::Release);


Maybe it would simplify the code if we persisted this flag in the database. That would give us support for node restarts and keep the fetched data and the flag in the same location. Also, AuxData::insert_aux supports transactional operations if we put multiple keys together.

node/src/warp_ledger_sync/monitor.rs checks the relevant state key on boot, so we can resume partial states.
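For readers following this thread, here is a minimal sketch of an in-memory gate that is re-derived from stored state on boot, which is the approach the reply describes. Only `arm`, the `recovery_pending` field and `ledger_recovery_in_progress` appear in the PR excerpts; `RecoveryGate`, `release` and `arm_if_state_missing` are hypothetical names.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// In-memory gate: while pending, authoring and executing imports must wait.
#[derive(Debug, Default)]
pub struct RecoveryGate {
    recovery_pending: AtomicBool,
}

impl RecoveryGate {
    /// Arm the gate: the warp path was taken, so work must wait for verification.
    pub fn arm(&self) {
        self.recovery_pending.store(true, Ordering::Release);
    }

    /// Hypothetical counterpart to `arm`, called once the arena is verified.
    pub fn release(&self) {
        self.recovery_pending.store(false, Ordering::Release);
    }

    /// Acquire load pairs with the Release stores above.
    pub fn ledger_recovery_in_progress(&self) -> bool {
        self.recovery_pending.load(Ordering::Acquire)
    }

    /// Boot-time rule (hypothetical helper): instead of persisting the flag,
    /// derive it from whether the ledger state is already in the database.
    pub fn arm_if_state_missing(&self, has_ledger_state: bool) {
        if !has_ledger_state {
            self.arm();
        }
    }
}
```

The trade-off discussed in the thread is visible here: the flag itself is not stored, so it stays consistent with the data only as long as the boot-time check reflects what was actually fetched.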

Klapeyron · 2026-06-15T07:22:26Z

+		);
+
+		let blob = Arc::new(blob);
+		self.cache = Some((target, blob.clone()));


What is the average blob size there? Is it safe to keep that in memory?

I just pulled it for mainnet, and it's a few hundred MiB.
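The diff line above suggests a single-entry cache keyed by the target. A rough sketch of that pattern follows, assuming a `u64` stand-in for the block hash; `BlobCache`, `blob_for`'s closure parameter and the `builds` counter are hypothetical and exist only to make the behavior observable.

```rust
use std::sync::Arc;

/// Single-entry cache: at most one blob is retained by the cache at a time.
/// `u64` stands in for the block hash type.
pub struct BlobCache {
    cache: Option<(u64, Arc<Vec<u8>>)>,
    builds: usize,
}

impl BlobCache {
    pub fn new() -> Self {
        Self { cache: None, builds: 0 }
    }

    /// Returns the cached blob for `target`, or builds and caches a new one.
    /// Replacing the entry drops the cache's reference to the previous blob;
    /// its memory is freed once no in-flight response still holds an `Arc`.
    pub fn blob_for(&mut self, target: u64, build: impl FnOnce() -> Vec<u8>) -> Arc<Vec<u8>> {
        if let Some((cached, blob)) = &self.cache {
            if *cached == target {
                return blob.clone();
            }
        }
        let blob = Arc::new(build());
        self.builds += 1;
        self.cache = Some((target, blob.clone()));
        blob
    }

    /// Number of times a blob had to be built (for illustration only).
    pub fn builds(&self) -> usize {
        self.builds
    }
}
```

With one entry, the cache itself holds roughly one blob's worth of memory, so at the sizes mentioned in the reply the cost is a few hundred MiB plus whatever in-flight responses still reference.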

Klapeyron · 2026-06-15T07:33:49Z

+
+	fn handle_request(&mut self, payload: &[u8]) -> Result<Vec<u8>, HandleError> {
+		let req = LedgerSyncRequest::<B::Hash>::decode(&mut &payload[..])?;
+		let blob = self.blob_for(req.target_hash)?;


Is it expensive to compute the blob for a specific hash? It may be worth narrowing the set of available hashes, e.g. making target_hash available only for blocks that rotate the session, or even only for the last finalized session boundary.

GRANDPA should already follow that path of session-boundary hops. To avoid depending on that, we could either run GRANDPA warp sync first or retrieve the session-boundary block via a runtime API such as Session::current_index.

Given that, maybe we could also adjust the protocol to target a session index rather than a block hash?

It's a good idea for protection. Given the scope, I think we can work on this separately: #1723.
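The restriction proposed in this thread (deferred to #1723) could be sketched as a check that runs before any blob is computed. Everything below is hypothetical: the `Hash` alias stands in for `B::Hash`, and `ServeableTarget` and `TargetRejected` are illustrative names, not types from the PR.

```rust
/// Stand-in for the block hash type (`B::Hash` in the handler above).
pub type Hash = [u8; 32];

/// Hypothetical rejection reasons; the PR's `HandleError` variants are not shown.
#[derive(Debug, PartialEq, Eq)]
pub enum TargetRejected {
    NoBoundaryKnown,
    NotLatestBoundary,
}

/// Tracks the only target the server is willing to build a blob for.
#[derive(Debug, Default)]
pub struct ServeableTarget {
    latest_boundary: Option<Hash>,
}

impl ServeableTarget {
    /// Called when a session-boundary block is finalized.
    pub fn note_finalized_session_boundary(&mut self, hash: Hash) {
        self.latest_boundary = Some(hash);
    }

    /// Rejects any request that does not target the latest finalized boundary,
    /// so an arbitrary `target_hash` cannot trigger an expensive blob build.
    pub fn check(&self, target: &Hash) -> Result<(), TargetRejected> {
        match &self.latest_boundary {
            None => Err(TargetRejected::NoBoundaryKnown),
            Some(allowed) if allowed == target => Ok(()),
            Some(_) => Err(TargetRejected::NotLatestBoundary),
        }
    }
}
```

A handler would call `check` on the decoded `target_hash` first and only then look up or build the blob.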

Klapeyron · 2026-06-15T09:07:05Z

+	}
+}
+
+/// Wraps the node's inner [`SyncOracle`] (the `SyncingService`) so AURA reports "still syncing"


AURA has internal logic to skip block authorship while it is in major_sync. It is located in the Verifier that is used for block production & block import:
https://github.com/paritytech/polkadot-sdk/blob/8ae2e6a47aef16e392b4a951b7165c87d9f1e75b/substrate/client/consensus/aura/src/import_queue.rs#L157
Maybe we can check whether that works while warp sync is active, and set the flag if that is not the case?

We do use the internal AURA check ahead of our own check on warp sync status:

inner.is_major_syncing() || gate.ledger_recovery_in_progress()

Does that make sense for now?
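The combined check quoted in the reply can be sketched as a wrapper around a sync oracle. The `SyncOracle` trait below is a local stand-in with the same two methods as `sp_consensus::SyncOracle`, so the example compiles on its own; `Gate`, `GatedSyncOracle` and `FixedOracle` are hypothetical names.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Local stand-in mirroring the two methods of `sp_consensus::SyncOracle`.
pub trait SyncOracle {
    fn is_major_syncing(&self) -> bool;
    fn is_offline(&self) -> bool;
}

/// Minimal gate exposing the query used in the reply above.
pub struct Gate {
    recovery_pending: AtomicBool,
}

impl Gate {
    pub fn new(pending: bool) -> Self {
        Self { recovery_pending: AtomicBool::new(pending) }
    }

    pub fn set(&self, pending: bool) {
        self.recovery_pending.store(pending, Ordering::Release);
    }

    pub fn ledger_recovery_in_progress(&self) -> bool {
        self.recovery_pending.load(Ordering::Acquire)
    }
}

/// Reports "still syncing" while the inner oracle says so OR recovery is pending.
pub struct GatedSyncOracle<O> {
    pub inner: O,
    pub gate: Arc<Gate>,
}

impl<O: SyncOracle> SyncOracle for GatedSyncOracle<O> {
    fn is_major_syncing(&self) -> bool {
        // The inner check is evaluated first, as in the reply.
        self.inner.is_major_syncing() || self.gate.ledger_recovery_in_progress()
    }

    fn is_offline(&self) -> bool {
        self.inner.is_offline()
    }
}

/// Test double with a fixed answer (for illustration only).
pub struct FixedOracle {
    pub major_syncing: bool,
}

impl SyncOracle for FixedOracle {
    fn is_major_syncing(&self) -> bool {
        self.major_syncing
    }

    fn is_offline(&self) -> bool {
        false
    }
}
```

Since consumers of the oracle see a single `is_major_syncing` answer, any authorship-skipping logic keyed on major sync also applies while ledger recovery is pending.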

justinfrevert and others added 5 commits June 5, 2026 13:27

Add ledger arena for warp sync

957a145

Signed-off-by: Justin Frevert <justinfrevert@gmail.com>

deadlock fix

6b8f16a

Signed-off-by: Justin Frevert <justinfrevert@gmail.com>

Fix GatedBlockImport behavior

0072284

Signed-off-by: Justin Frevert <justinfrevert@gmail.com>

Remove spec guide references

16dc948

Signed-off-by: Justin Frevert <justinfrevert@gmail.com>

Merge branch 'main' into warp-ledger-sync

1ba21c7

justinfrevert marked this pull request as ready for review June 10, 2026 04:59

justinfrevert requested a review from a team as a code owner June 10, 2026 04:59

justinfrevert and others added 7 commits June 9, 2026 22:41

Merge branch 'main' into warp-ledger-sync

5d1c382

Merge branch 'main' into warp-ledger-sync

1be408d

change file

3ea1d4b

Signed-off-by: Justin Frevert <justinfrevert@gmail.com>

Merge branch 'main' into warp-ledger-sync

f92bcbb

Signed-off-by: justinfrevert <81839854+justinfrevert@users.noreply.github.com>

gilescope reviewed Jun 11, 2026

Comment thread ledger/src/versions/common/mod.rs Outdated

gilescope reviewed Jun 11, 2026

Comment thread ledger/src/versions/common/mod.rs

gilescope previously approved these changes Jun 11, 2026

Merge branch 'main' into warp-ledger-sync

3214d93

justinfrevert mentioned this pull request Jun 12, 2026

Ledger arena snapshot size solution #1687

Open

docs: remove spec references

a9df85c

Signed-off-by: Justin Frevert <justinfrevert@gmail.com>

justinfrevert dismissed gilescope’s stale review via a9df85c June 12, 2026 03:56

justinfrevert and others added 2 commits June 12, 2026 05:35

Merge branch 'main' into warp-ledger-sync

c27eeb3

chore: formatting

333d0a0

Signed-off-by: Justin Frevert <justinfrevert@gmail.com>

justinfrevert enabled auto-merge June 12, 2026 18:39

Klapeyron reviewed Jun 15, 2026

justinfrevert and others added 4 commits June 17, 2026 13:15

Merge branch 'main' into warp-ledger-sync

d65942a

Only serve ledger arena warp data if non-authority node

72b353a

Signed-off-by: Justin Frevert <justinfrevert@gmail.com>

Compress/decompress ledger data throughout warp sync

524ac65

Signed-off-by: Justin Frevert <justinfrevert@gmail.com>

Merge branch 'main' into warp-ledger-sync

630b98b

justinfrevert mentioned this pull request Jun 19, 2026

Improvement: Warp sync ledger arena data by session #1723

Open

justinfrevert and others added 3 commits June 18, 2026 18:26

compatibility table update for ledger 9

13cbd06

Signed-off-by: Justin Frevert <justinfrevert@gmail.com>

Merge branch 'main' into warp-ledger-sync

518ea41

Merge branch 'main' into warp-ledger-sync

6e2379a

justinfrevert wants to merge 23 commits into main from warp-ledger-sync

3 participants
