Skip to content

[KOL-91] Implement Ethereum reorg mitigation#512

Open
side2k wants to merge 11 commits into
mainfrom
ethereum-bridge-4-reorgs
Open

[KOL-91] Implement Ethereum reorg mitigation#512
side2k wants to merge 11 commits into
mainfrom
ethereum-bridge-4-reorgs

Conversation

@side2k

@side2k side2k commented Mar 25, 2026

Copy link
Copy Markdown
Contributor

Task: https://fpcomplete.atlassian.net/browse/KOL-91

  • add confirmation_depth(Option<u64>) field to ChainConfig
  • in Ethereum listener, poller now is the single source of data
  • Ethereum subscriber now forms a FIFO-queue of hints - these hints are used by poller to skip chunks of blocks
  • in existing integration tests for Ethereum, explicitly configure confirmation_depth to None
  • add one more integration test with confirmation_depth set to 1, and inside this test ensure that event is not processed until extra block is generated
  • update docs accordingly

@cloudflare-workers-and-pages

cloudflare-workers-and-pages Bot commented Mar 25, 2026

Copy link
Copy Markdown

Deploying kolme with  Cloudflare Pages  Cloudflare Pages

Latest commit: 7a04403
Status: ✅  Deploy successful!
Preview URL: https://119341b8.kolme.pages.dev
Branch Preview URL: https://ethereum-bridge-4-reorgs.kolme.pages.dev

View logs

@side2k side2k force-pushed the ethereum-bridge-4-reorgs branch from feaa983 to bc6bc97 Compare March 25, 2026 19:33
@side2k side2k requested a review from snoyberg March 25, 2026 19:47
@side2k side2k marked this pull request as ready for review March 26, 2026 09:15
Comment thread packages/kolme/src/core/types.rs Outdated
Comment thread packages/kolme/src/core/types.rs Outdated
/// This setting is generic for future chain listeners, but currently enforced
/// only by the Ethereum listener.
#[serde(default = "ConfirmationDepth::use_default")]
pub confirmation_depth: ConfirmationDepth,

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I originally hadn't considered this as part of the chain config itself, but rather a config set in the listeners. I can see the advantage of having it as part of the chain config, but it's also somewhat unenforceable. Putting it here does force a migration of data, but it's not a big deal. And even though the existing chains don't use a confirmation depth, they probably should, so I'm OK with this being present for all chain types.

Any thoughts on chain config vs listener-specific config?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Chain config seems to be the single source of truth for all nodes - so storing confirmation depth there seems more robust.
Could you please explain what did you mean by unenforceable?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unenforceable in the sense that it's the listeners themselves who will be deciding whether sufficient confirmations are present, and they can choose to ignore this setting if they so choose. With other verifications, external entities can confirm whether a validator cheated (e.g., issuing a withdrawal to a chain without a valid Kolme-chain action, or a processor producing invalid blocks). For this case, there would still be a valid transaction on the external chain, it's just a question of whether the listener added it earlier than they should have.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

they can choose to ignore this setting

But this depends on listener's code, and not on where setting is stored, if I understand you correctly?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct. That's my point: "enforcing" this at the chain level doesn't add any extra protection.

Comment thread packages/kolme/src/listener/ethereum.rs Outdated
EthereumChain::Mainnet | EthereumChain::Sepolia => POLL_INTERVAL_DEFAULT,
};
let mut next_ws_retry = tokio::time::Instant::now();
let mut subscriber_task: Option<tokio::task::JoinHandle<()>> = None;

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This approach risks the spawned task outliving its parent, which I don't think is intended. There are a few ways to approach this, but I'm not really sure what the goal of the overall code is, so I'm not sure what to propose.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The goal is to launch a subscriber, which gets events and stores their ids and block ids as hints for the poller, so it can skip blocks and jump straight to a next event blocks, when it becomes eligible.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean why does this need to be a separate task instead of in the current task?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because subscriber runs concurrently with a poller. When subscriber receives events, they're beyond the reach of the poller, but if it has issues running (e.g. RPC provider does not support WS) - we don't want it to affect poller's work at all.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will fix the "exit when parent is done" thing.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The approach doesn't work in general: panics will not lead to the subtask being aborted. The part of goal here that I was unclear on is why do we need both an external loop and a separate polling task, plus why we didn't need it previously.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why we didn't need it previously

We did not need it, because subscriber was the default source of data and polling was a fallback. So, subscriber, technically, had its own loop, just not in parallel with poller.

For the design, that has been implemented in current PR, I did not want stream.next().await to block poller. Now, that I've looked more thoroughly and found out that tokio::select!() can be used to orchestrate poller and subscriber and, potentially, make whole thing more clean. I'll implement it and update you.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed this.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see any new commits on the PR, can you link to the change?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, forgot to push. Fixed.

/// only by the Ethereum listener.
#[serde(default = "ConfirmationDepth::use_default")]
pub confirmation_depth: ConfirmationDepth,
pub confirmation_depth: Option<u64>,

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This still leaves two questions:

  1. Do we want to allow for a "default" value here, which may change implicitly with code base updates? I lean towards no.
  2. Do we want this as part of chain config as opposed to runtime listener config.

Comment thread packages/kolme/src/listener/ethereum.rs Outdated
EthereumChain::Mainnet | EthereumChain::Sepolia => POLL_INTERVAL_DEFAULT,
};
let mut next_ws_retry = tokio::time::Instant::now();
let mut subscriber_task: Option<tokio::task::JoinHandle<()>> = None;

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The approach doesn't work in general: panics will not lead to the subtask being aborted. The part of goal here that I was unclear on is why do we need both an external loop and a separate polling task, plus why we didn't need it previously.

Also:
- do some refactoring and cleanup
- separate Startup and Runtime errors from the subscriber - the former lead to subscriber restart for a limited amount of times, the latter do not

@snoyberg snoyberg left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not comfortable with the approach here, and it still comes back to my original confusion about why all this code is needed. If we're legitimately just running two tasks simultaneously, why not use select! or JoinSet? The code structure looks like it's allowing for optionality in the subscriber_supervisor which is never actually needed.


impl SubscriberTaskGuard {
fn set(&mut self, handle: tokio::task::JoinHandle<()>) {
self.handle = Some(handle);

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This won't abort an existing handle.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Elaborate please?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants