feat(op-reth): gate payload builder on interop failsafe#21448
Conversation
The payload builder never read the supervisor interop failsafe, so it kept sealing interop txs into unsafe blocks during a failsafe window until async pool eviction caught up, and it never excluded interop txs that bypassed the filter (e.g. private or locally-submitted txs). Thread a shared InteropFailsafe handle (Arc<AtomicBool>, mirroring SdmPostExecOptIn) from the txpool interop filter client (writer) through node setup into OpBuilderConfig (reader). When the failsafe is active, the build loop now excludes every interop tx, detected intrinsically via its CrossL2Inbox access list rather than the pool's interop-deadline marker, so unfiltered interop txs are caught too.
karlfloersch
left a comment
There was a problem hiding this comment.
Reth is not my strong suit, but I'll recount my understanding of the fix. We currently - correctly - check interop txs when they are sent to reth. However, the revalidation checks were not actually hooked in properly and they weren't actually dropping the interop txs when they become invalid or the failsafe enables. The old logic didn't actually drop the interop txs. This change fixes the revalidation logic so that it actually drops invalid interop messages
This LGTM (and definitely seems to fix a bug) tho I wouldn't say that I have full confidence (even still) that I understand the shape of the reth tx pool enough to say this is definitively correct. That said, the AI vouches for it at least!
|
Your understanding is correct for #21447 but this one is the simpler PR where the builder is acting as a final gate to exclude all executing messages if the fail safe is active. Previously we were relying on these being removed from the tx pool (which they were in the case of failsafe) but that's async and could have bugs - a direct if statement in the builder itself is a much simpler, more reliable final check to apply the failsafe. |
einar-oplabs
left a comment
There was a problem hiding this comment.
Looks like reth code 👍 . Consider adding a test or two more.
Thread one shared InteropFailsafe handle through three builds (off, on, off) so the test proves the gate is re-evaluated per build and the mark_invalid an excluded interop tx triggers does not persist once the failsafe clears.
…nterop-l2-audit * origin/develop: fix(op-rbuilder): append trailing SDM PostExec tx in the standard builder (#21332) docs(ai): clarify OP Stack Rust crates are migrated, not vendored (#21469) docs(ai): add branch CI-failure diagnosis to ci-ops (#21468) test(alloy-op-evm): cover included-but-reverted tx warming a later tx (#21466) op-core/types: add DepositTx, PostExecTx and Receipt types (#21356) fix(alloy-op-evm): skip SDM warming rebates for never-cold accesses; harden test suite (#21434) feat(op-reth): gate payload builder on interop failsafe (#21448) docs: add OPCMv2 Cantina audit report (#21458) docs: add notice about Sepolia estimateGas issue (#21456) feat(op-chain-ops): interop bridge/failsafe test tools (check-lagoon, interop-smoke) (#21395) fix(ctb): skip interop L2 genesis upgrade tests under custom gas token (#21444) fix(alloy-op-evm): roll back SDM block-warming for declined op-rbuilder candidates (#21359) # Conflicts: # docs/security-reviews/README.md
What & why
The op-reth payload builder never read the supervisor interop failsafe. During a failsafe window it kept sealing interop txs into unsafe blocks until async pool eviction caught up, and it never excluded interop txs that bypassed the filter (e.g. private or locally-submitted txs). This is a divergence from op-geth's miner, which gates production on the failsafe — and op-geth is the implementation being removed, so op-reth must own this gate.
Change
InteropFailsafehandle (Arc<AtomicBool>) inreth-optimism-txpool, mirroring the existingSdmPostExecOptInpattern.InteropFilterClient(the writer) now stores/updates this shared handle, with a builder.failsafe(...)option to inject it;OpBuilderConfig(the reader) carries a clone.execute_best_transactionsexcludes every interop tx while the failsafe is active, detected intrinsically via itsCrossL2Inboxaccess list (is_interop_tx) rather than the pool's interop-deadline marker — so interop txs that never went through the filter are caught too.OpNode→OpPoolBuilder(writer, via the filter client) and theOpPayloadBuilderservice builder →OpBuilderConfig(reader).The flag is read once per build (a consistent snapshot for the block); this does not change failsafe-detection latency (still bounded by the ~1s filter poll) — it closes the race where an in-flight build seals interop txs before pool eviction drains them, and covers interop txs that bypassed the filter.
Ref: RETH-3 / hardening H5.