celo-reth import-celo-state storage_v2 by piersy · Pull Request #186 · celo-org/celo-kona

piersy · 2026-05-21T17:14:46Z

Switches support from storage_v1 to storage_v2 in the import-celo-state command.

Upstream reth (updated to v2.2.0) now skips writing to static files during the import and writes everything to the MDBX database.

This sidesteps the problem we previously had with writing static files (worked around with a fork of upstream reth in celo-org/optimism#437): each write incremented the block number and writes were batched, so we couldn't write the whole state in one go, which led to mismatching block numbers.

Tested with a minimal import file; the full import is yet to be tested.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: eb0001be6b

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

palango · 2026-05-26T12:30:35Z

+                StaticFileSegment::AccountChangeSets,
+                StaticFileSegment::StorageChangeSets,
+            ] {
+                let mut writer = static_file_provider.latest_writer(segment)?;


Could we use writer.ensure_at_block(CEL2_MIGRATION_BLOCK_NUMBER) here instead of manually implementing the loop?

/// Ensures that the writer is positioned at the specified block number.
///
/// If the writer is positioned at a greater block number than the specified one, the writer
/// will NOT be unwound and the error will be returned.
pub fn ensure_at_block(&mut self, advance_to: BlockNumber) -> ProviderResult<()> {
    let current_block = if let Some(current_block_number) = self.current_block_number() {
        current_block_number
    } else {
        self.increment_block(0)?;
        0
    };

    match current_block.cmp(&advance_to) {
        Ordering::Less => {
            for block in current_block + 1..=advance_to {
                self.increment_block(block)?;
            }
        }
        Ordering::Equal => {}
        Ordering::Greater => {
            return Err(ProviderError::UnexpectedStaticFileBlockNumber(
                self.writer.user_header().segment(),
                current_block,
                advance_to,
            ));
        }
    }

    Ok(())
}
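To make the behaviour of the quoted helper easier to check, here is a minimal, self-contained sketch of the same control flow. `MockWriter` and `UnexpectedBlock` are hypothetical stand-ins invented for illustration; they are not reth types, and the real writer does I/O and returns `ProviderResult`.

```rust
use std::cmp::Ordering;

/// Hypothetical stand-in for a static-file segment writer. It only tracks
/// the highest block number written and how many increments happened.
#[derive(Debug, Default)]
pub struct MockWriter {
    pub current_block: Option<u64>,
    pub increments: u64,
}

/// Hypothetical error mirroring the "writer is already past the target" case.
#[derive(Debug, PartialEq, Eq)]
pub struct UnexpectedBlock {
    pub current: u64,
    pub requested: u64,
}

impl MockWriter {
    /// Records that the writer has moved on to `block`.
    fn increment_block(&mut self, block: u64) {
        self.current_block = Some(block);
        self.increments += 1;
    }

    /// Same shape as the quoted `ensure_at_block`: start at block 0 if the
    /// writer is empty, advance one block at a time, never unwind.
    pub fn ensure_at_block(&mut self, advance_to: u64) -> Result<(), UnexpectedBlock> {
        let current = match self.current_block {
            Some(block) => block,
            None => {
                self.increment_block(0);
                0
            }
        };

        match current.cmp(&advance_to) {
            Ordering::Less => {
                for block in current + 1..=advance_to {
                    self.increment_block(block);
                }
            }
            Ordering::Equal => {}
            Ordering::Greater => {
                return Err(UnexpectedBlock { current, requested: advance_to });
            }
        }
        Ok(())
    }
}
```

In this model, calling `ensure_at_block(5)` on an empty writer performs six increments (blocks 0 through 5), a second call with the same target is a no-op, and a call with a lower target returns the error without changing the writer.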

Yep, looks like a better option.

palango · 2026-05-26T13:07:19Z

Were you able to run this?

I ran into this error when testing:

thread 'tokio-rt' (3314490) panicked at /mnt/ssd/rust/cargo/registry/src/index.crates.io-1949cf8c6b5b557f/alloy-trie-0.9.5/src/hash_builder/mod.rs:146:9:
add_leaf key Nibbles(0x52df0bdf5a5f92d8037cf11e50f13d8017aefc99d20a73c826416df79570d481) self.key Nibbles(0x52df0bdf5a5f92d8037cf11e50f13d8017aefc99d20a73c826416df79570d481)
stack backtrace:
   0: __rustc::rust_begin_unwind
   1: core::panicking::panic_fmt
   2: alloy_trie::hash_builder::HashBuilder<K>::add_leaf
   3: reth_trie::trie::StorageRoot<T,H>::calculate
   4: reth_trie::trie::StateRoot<T,H>::calculate
   5: reth_db_common::init::compute_state_root_chunked
   6: reth_db_common::init::init_from_state_dump
   7: tokio::runtime::context::runtime::enter_runtime
   8: <tokio::runtime::blocking::task::BlockingTask<T> as core::future::future::Future>::poll
   9: tokio::runtime::task::core::Core<T,S>::poll
  10: tokio::runtime::task::harness::Harness<T,S>::poll
  11: tokio::runtime::blocking::pool::Inner::run
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

thread 'main' (3314449) panicked at /mnt/ssd/rust/cargo/git/checkouts/reth-e231042ee7db3fb7/88505c7/crates/cli/runner/src/lib.rs:171:63:
Failed to join task: JoinError::Panic(Id(10), "add_leaf key Nibbles(0x52df0bdf5a5f92d8037cf11e50f13d8017aefc99d20a73c826416df79570d481) self.key Nibbles(0x52df0bdf5a5f92d8037cf11e50f13d8017aefc99d20a73c826416df79570d481)", ...)
stack backtrace:
   0: __rustc::rust_begin_unwind
   1: core::panicking::panic_fmt
   2: core::result::unwrap_failed
   3: <core::pin::Pin<P> as core::future::future::Future>::poll
   4: <core::future::poll_fn::PollFn<F> as core::future::future::Future>::poll
   5: tokio::runtime::context::runtime::enter_runtime
   6: reth_cli_runner::CliRunner::run_blocking_until_ctrl_c
   7: celo_reth::main
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
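The panic message prints the same key for both `key` and `self.key`, which is consistent with a leaf being added twice to a builder that expects strictly increasing keys. The sketch below models that reading only; `OrderedLeaves` and `feed_chunks` are hypothetical names, and the idea that the repeat happens at a chunk boundary is an assumption, not something the trace proves.

```rust
/// Hypothetical model of a builder that requires every new leaf key to be
/// strictly greater than the previous one.
#[derive(Debug, Default)]
pub struct OrderedLeaves {
    pub last_key: Option<Vec<u8>>,
    pub count: usize,
}

impl OrderedLeaves {
    /// Accepts `key` only if it sorts strictly after the last accepted key.
    /// Returns an error instead of panicking so callers can inspect it.
    pub fn add_leaf(&mut self, key: &[u8]) -> Result<(), String> {
        if let Some(last) = &self.last_key {
            if key <= last.as_slice() {
                return Err(format!("add_leaf key {:02x?} self.key {:02x?}", key, last));
            }
        }
        self.last_key = Some(key.to_vec());
        self.count += 1;
        Ok(())
    }
}

/// Feeds several chunks of sorted keys through one builder. If a chunk
/// starts with the key that ended the previous chunk, the second insert
/// is rejected, which is the failure shape assumed here.
pub fn feed_chunks(chunks: &[Vec<Vec<u8>>]) -> Result<usize, String> {
    let mut builder = OrderedLeaves::default();
    for chunk in chunks {
        for key in chunk {
            builder.add_leaf(key)?;
        }
    }
    Ok(builder.count)
}
```

With chunks `[[01, 02], [03]]` the model accepts three leaves; with `[[01, 52], [52, 60]]` it rejects the second `52`, because an equal key is not strictly greater.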

palango · 2026-05-26T14:00:46Z

Heads up on the empty changesets: history stays empty after import, because IndexAccountHistory builds AccountsHistory entirely from changeset contents. So on an archive node (no pruning), a historical query for an imported account that hasn't been touched yet returns empty for any block under the current tip (history_info hits NotYetWritten, historical.rs:861-868) until something writes to it. v1 didn't have this; write_account_to_db wrote an AccountsHistory[addr] = [31M] entry for every account. Latest-state reads are unaffected, so a forward-only node is fine. The open question is whether eth_getProof/proofs-history needs historical lookups of the snapshot to work; if so, this fix isn't enough on its own.

piersy · 2026-05-27T12:31:50Z

Were you able to run this?

I run into this error when testing:

Hey @palango, by testing do you mean running a full import or something else? I ran the state import with the first 100K lines of the full state dump.

palango · 2026-05-27T12:42:13Z

Hey @palango, by testing do you mean running a full import or something else? I ran the state import with the first 100K lines of the full state dump.

Exactly. Independently, a challenger tried this out and ran into the same issue.

piersy · 2026-05-27T12:42:22Z

Heads up on the empty changesets: history stays empty after import, because IndexAccountHistory builds AccountsHistory entirely from changeset contents. So on an archive node (no pruning), a historical query for an imported account that hasn't been touched yet returns empty for any block under the current tip — history_info hits NotYetWritten (historical.rs:861-868) — until something writes to it. v1 didn't have this; write_account_to_db wrote an AccountsHistory[addr] = [31M] entry for every account. Latest-state reads are unaffected, so a forward-only node is fine. The open question is whether eth_getProof/proofs-history needs historical lookups of the snapshot to work — if so this fix isn't enough on its own.

Hey @palango, I think we need to add this: as you say, queries for any account untouched since the import will not return anything, and eth_getProof does accept a block number as input. So the safe approach seems to be to add a history entry at the migration block for every account.
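A simplified model of the lookup discussed here may help show why the per-account entry matters. Everything in it is hypothetical: `HistoryIndex`, its methods and the block numbers used are placeholders, and the real `history_info` logic in reth handles sharding and pruning that this sketch ignores.

```rust
use std::collections::BTreeMap;

pub type Address = [u8; 20];

/// Simplified outcome of a historical lookup, named after the variants
/// mentioned in the discussion.
#[derive(Debug, PartialEq, Eq)]
pub enum HistoryInfo {
    /// No usable history: the reader treats the account as absent.
    NotYetWritten,
    /// The value at the queried block is stored in the changeset of this block.
    InChangeset(u64),
    /// No later change exists, so the latest state is the answer.
    InPlainState,
}

/// Hypothetical history index: for each address, the sorted list of blocks
/// at which it changed.
#[derive(Debug, Default)]
pub struct HistoryIndex {
    pub changed_at: BTreeMap<Address, Vec<u64>>,
}

impl HistoryIndex {
    /// Records that `addr` changed at `block`, keeping the list sorted.
    pub fn record(&mut self, addr: Address, block: u64) {
        let blocks = self.changed_at.entry(addr).or_default();
        blocks.push(block);
        blocks.sort_unstable();
        blocks.dedup();
    }

    /// Looks up the state of `addr` as of the end of `block`.
    pub fn lookup(&self, addr: &Address, block: u64) -> HistoryInfo {
        let Some(blocks) = self.changed_at.get(addr) else {
            return HistoryInfo::NotYetWritten;
        };
        if blocks.is_empty() {
            return HistoryInfo::NotYetWritten;
        }
        // Find the first recorded change strictly after the queried block.
        match blocks.iter().position(|b| *b > block) {
            Some(0) => HistoryInfo::NotYetWritten, // first write is after `block`
            Some(i) => HistoryInfo::InChangeset(blocks[i]),
            None => HistoryInfo::InPlainState,
        }
    }
}
```

In this model an imported account with no entry yields `NotYetWritten` for every block, while recording a single entry at the migration block makes queries at or after that block fall through to the latest state, which is the behaviour the proposed fix is after.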

piersy · 2026-05-27T12:43:35Z

Independently, a challenger tried this out and ran into the same issue.

Ok, I will try with the full dump. Do you recall how long it took to fail?

palango · 2026-05-27T12:43:37Z

Hey @palango, I think we need to add this since as you say queries for any account untouched since the import will not return anything and eth_getProof does accept the block number as input. So the safe approach seems to be add a history entry at the migration block for every account.

Alternatively, we could also just leave the migration as a v1 database and then run the v2-migration on top. But we need to check if this circumvents all of these issues.

palango · 2026-05-27T12:56:50Z

Ok, I will try with the full dump. Do you recall how long it took to fail?

It happened in the final step of validating the state root, so probably after 45-60 minutes.

palango · 2026-05-27T13:26:46Z

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 4cdb017d36

chatgpt-codex-connector · 2026-05-27T21:16:07Z

+    let mut acct_cs_cursor = tx.cursor_dup_write::<tables::AccountChangeSets>()?;
+    acct_cs_cursor.append_dup(block, AccountBeforeTx { address: *address, info: None })?;
+
+    tx.put::<tables::AccountsHistory>(ShardedKey::new(*address, u64::MAX), history_list.clone())?;


Write v2 history/change sets to their routed backends

When this command runs it now requires storage_v2, but these writes still bypass the v2 routing and put account changesets/history directly into MDBX. In storage v2, account changesets are read from static files and account history from RocksDB, so an imported datadir will have empty v2 backends for the migration state even though MDBX contains rows that normal v2 readers ignore; the same direct-write pattern is repeated for storage changesets/history below. This breaks historical lookups and any operation that needs the migration-block change/history data after import, so the import path should use the provider/EitherWriter routing or explicitly populate the v2 static/RocksDB destinations.
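The routing concern can be pictured as a small lookup from table to backend. The mapping below is taken only from this review comment (changesets in static files, history in RocksDB under storage v2) and is an assumption for illustration; `Table`, `Backend` and `destination` are hypothetical names, not reth APIs.

```rust
/// Hypothetical identifiers for the tables touched by the import.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Table {
    AccountChangeSets,
    StorageChangeSets,
    AccountsHistory,
    StoragesHistory,
}

/// Hypothetical storage backends a write can land in.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Backend {
    Mdbx,
    StaticFiles,
    RocksDb,
}

/// Picks the backend that readers will consult for `table`.
/// Assumption from the review comment: under storage v2, changesets are read
/// from static files and history from RocksDB; otherwise everything is in MDBX.
pub fn destination(table: Table, storage_v2: bool) -> Backend {
    if !storage_v2 {
        return Backend::Mdbx;
    }
    match table {
        Table::AccountChangeSets | Table::StorageChangeSets => Backend::StaticFiles,
        Table::AccountsHistory | Table::StoragesHistory => Backend::RocksDb,
    }
}
```

A direct `tx.put` corresponds to always choosing `Backend::Mdbx`; under the assumed v2 mapping those rows would never be consulted by readers, which is the mismatch the comment describes.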

chatgpt-codex-connector · 2026-05-27T21:16:07Z

+            StateRootProgress::Progress(state, _, updates) => {
+                let updated_len = provider.write_trie_updates(updates)?;
+                total_flushed_updates += updated_len;


Commit trie updates between progress batches

For a full Celo migration dump, root_with_progress() can return many Progress batches, but each batch is written into the same MDBX write transaction here and the transaction is not committed until the whole root finishes. MDBX retains dirty pages for the open transaction, so this can exhaust memory or the DB map during the import; the previous chunked path existed to release those pages between batches. Keep the boundary-key workaround, but still commit/reopen at a safe progress boundary or threshold before importing mainnet-sized state.
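The suggestion amounts to bounding how much uncommitted work a single write transaction holds. A minimal sketch of that pattern follows, using a hypothetical `MockTx` that only counts pending writes. It does not model MDBX itself, and a real implementation would also need to persist the intermediate root-computation state so that it can resume at a safe boundary.

```rust
/// Hypothetical transaction model: tracks uncommitted writes, the number of
/// commits, and the largest amount of uncommitted work ever held at once.
#[derive(Debug, Default)]
pub struct MockTx {
    pub pending: usize,
    pub commits: usize,
    pub max_pending: usize,
}

impl MockTx {
    /// Buffers `n` writes in the open transaction.
    pub fn write(&mut self, n: usize) {
        self.pending += n;
        self.max_pending = self.max_pending.max(self.pending);
    }

    /// Commits buffered writes, releasing them. A commit with nothing
    /// pending is treated as a no-op.
    pub fn commit(&mut self) {
        if self.pending > 0 {
            self.commits += 1;
            self.pending = 0;
        }
    }
}

/// Writes each progress batch and commits once the amount of pending work
/// reaches `threshold`, then commits whatever is left at the end.
/// Returns the total number of updates written.
pub fn flush_batches(tx: &mut MockTx, batches: &[usize], threshold: usize) -> usize {
    let mut total = 0;
    for &len in batches {
        tx.write(len);
        total += len;
        if tx.pending >= threshold {
            tx.commit();
        }
    }
    tx.commit();
    total
}
```

With batches of 40, 40, 40, 40 and 10 updates and a threshold of 100, the model commits twice and never holds more than 120 uncommitted updates, instead of holding all 170 until the end.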

piersy added 2 commits May 21, 2026 17:49

celo-reth(import): Support storage_v2 layout

c322b27

celo-reth(import): Force storage_v2 for import

eb0001b

piersy requested review from jcortejoso and palango May 21, 2026 17:15

chatgpt-codex-connector Bot reviewed May 21, 2026

Comment thread crates/celo-reth/src/state_import.rs Outdated

palango reviewed May 26, 2026

piersy added 2 commits May 27, 2026 11:12

Fix formatting

ff7a4c5

Use writer.ensure_at_block rather than inrementing blocks manually

674e137

chatgpt-codex-connector Bot reviewed May 27, 2026
