pre-flight fcu for speed bump #4194

Draft
advaita-saha wants to merge 8 commits into master from preflight-fcu

Conversation

advaita-saha commented Apr 30, 2026
Contributor

Makes the fCU call super-fast with minimal downside.
Testing shows a significant performance improvement and better attestations on the CL side (fewer attestation misses).

Comment thread execution_chain/core/chain/forked_chain.nim

# Enqueue the actual forkchoice apply. For ack-only fCUs we don't await
# the result. This avoids the response getting blocked behind the shared queue worker
let fcuFut = chain.queueForkChoice(headHash, finalizedBlockHash, safeBlockHash)

Contributor

Is it possible for the queueForkChoice job to fail in the background? Or, after running preflightForkChoice and returning success to the caller, should the job always succeed?

advaita-saha May 1, 2026
Contributor Author

Theoretically, it cannot fail.
The only possibility is a corrupt db, a broken db connection, or some other db-level failure.

Contributor

Actually, it might be possible for it to fail when processing multiple heads. The removeBlockFromCache function can remove blocks from the hashToBlock table, which can cause forkChoice to fail. removeBlockFromCache gets called from two places: updateBase and updateFinalized. If those functions get called asynchronously before the forkChoice task runs, then you get a silent failure.

Contributor Author

Well, queueForkChoice just adds the fCU call to the asyncQueue, so nothing else calls it asynchronously, and the queue is processed one item at a time.

Contributor Author

> removeBlockFromCache gets called from two places: updateBase and updateFinalized. If those functions get called asynchronously before the forkChoice task runs, then you get a silent failure.

I think that is not possible in the current design with the asyncQueue implementation.

Contributor

Hmm, I'm not sure. Maybe it's OK because the single queue enforces the ordering, as long as there are no await calls that yield to other tasks before the forkChoice task is added to the queue.

Contributor Author

@jangko any thoughts on this ?

Contributor

> Hmm, I'm not sure. Maybe it's OK because the single queue enforces the ordering, as long as there are no await calls that yield to other tasks before the forkChoice task is added to the queue.

Since queueForkChoice is a template and contains an await call (which is good for rate-limiting purposes), the await yields to the event loop, so something else could run before the forkChoice task is added to the queue. Maybe this is fine as long as no other code calls updateBase or updateFinalized outside of the task queue flow.

Contributor Author

Yes, that is true.
For now, by design, updateBase and updateFinalized are not called outside the queue.

I will add some comments on this.
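
To make the ordering argument in this thread concrete, here is a small Python model. It is not the Nim code from this PR: `asyncio.Queue` stands in for the asyncQueue, a set stands in for hashToBlock, and the item kinds `"fcu"` and `"prune"` are hypothetical names. It only illustrates that a single consumer applies items in enqueue order, so a prune enqueued after an fCU cannot invalidate it, while a prune enqueued before it can.

```python
import asyncio


async def worker(queue, chain, log):
    # Single consumer: items are applied strictly in enqueue order.
    while True:
        item = await queue.get()
        if item is None:  # sentinel: stop the worker
            break
        kind, block = item
        if kind == "prune":
            chain.discard(block)  # models removeBlockFromCache
        else:  # "fcu"
            # Record whether the head was still present when applied.
            log.append((block, block in chain))


async def main():
    chain = {"A", "B"}  # stand-in for hashToBlock
    queue = asyncio.Queue()
    log = []
    task = asyncio.create_task(worker(queue, chain, log))

    assert "B" in chain  # the synchronous "preflight" check
    await queue.put(("fcu", "B"))    # enqueued right after the preflight
    await queue.put(("prune", "B"))  # a later prune runs after that fCU
    await queue.put(("fcu", "B"))    # this fCU is behind the prune
    await queue.put(None)
    await task
    return log


result = asyncio.run(main())
print(result)
```

The first fCU still finds its head and the second does not, which is why the argument depends on every mutation going through the queue.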

jangko commented May 1, 2026
Contributor

LGTM

    return err("Cannot find head block: " & headHash.short)

  if safeHash != zeroHash32 and c.hashToBlock.getOrDefault(safeHash).isNil:
    return err("Cannot find safe block: " & safeHash.short)

Contributor

This check is more strict now than the implementation in forkChoice which would silently ignore if the safeHash doesn't exist in the hashToBlock map. Not sure if this change is desired or not.
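
A minimal Python sketch of the difference described above, assuming the backend only records the safe block when it is found (as the `if safe.isOk:` branch in the diff suggests). The function names, the `state` dict and the hash strings are hypothetical stand-ins, not the Nim API.

```python
ZERO = "0x0"  # stand-in for zeroHash32


def preflight_safe(known, safe_hash):
    """Strict: an unknown non-zero safe hash is reported as an error."""
    if safe_hash != ZERO and safe_hash not in known:
        return "Cannot find safe block: " + safe_hash
    return None


def apply_safe(known, safe_hash, state):
    """Lenient: record the safe block only when it is known."""
    if safe_hash in known:
        state["fcuSafe"] = safe_hash
    return None  # an unknown safe hash is silently ignored


known = {"0xaa", "0xbb"}
state = {}
strict = preflight_safe(known, "0xcc")
lenient = apply_safe(known, "0xcc", state)
print(strict, lenient, state)
```

With the same unknown safe hash, the strict check rejects the request while the lenient path accepts it and leaves the state untouched.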


# Enqueue the actual forkchoice apply. For ack-only fCUs we don't await
# the result. This avoids the response getting blocked behind the shared queue worker
let fcuFut = chain.queueForkChoice(headHash, finalizedBlockHash, safeBlockHash)

Contributor

Looks like there is nothing logging the result of the forkChoice call. Even if we don't expect any failures we should probably log something. Perhaps the queueForkChoice asyncHandler could log the result of forkChoice.

Contributor Author

Sure, I will add some logs. They will help identify issues.
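
One way to log the outcome of a result that nobody awaits is a completion callback. The sketch below is a Python model of that idea, not the Nim change itself: `apply_forkchoice`, `ack_only_fcu`, the logger name and the hash strings are hypothetical. The in-memory handler exists only so the example can show what was logged.

```python
import asyncio
import logging

logger = logging.getLogger("fcu")


class ListHandler(logging.Handler):
    """Collects log records in memory so the demo can inspect them."""

    def __init__(self):
        super().__init__()
        self.lines = []

    def emit(self, record):
        self.lines.append((record.levelname, record.getMessage()))


def log_outcome(task):
    # Runs when the task completes, even though no caller awaits it.
    if task.cancelled():
        logger.warning("forkchoice apply cancelled")
    elif task.exception() is not None:
        logger.error("forkchoice apply failed: %s", task.exception())
    else:
        logger.debug("forkchoice apply ok: %s", task.result())


async def apply_forkchoice(head):
    await asyncio.sleep(0)  # pretend to wait behind the queue
    if head == "missing":
        raise LookupError("Cannot find head block: " + head)
    return head


async def ack_only_fcu(head):
    task = asyncio.create_task(apply_forkchoice(head))
    task.add_done_callback(log_outcome)
    return task  # returned only so the demo can wait for completion


async def main():
    t1 = await ack_only_fcu("0xabc")
    t2 = await ack_only_fcu("missing")
    await asyncio.gather(t1, t2, return_exceptions=True)


handler = ListHandler()
logger.addHandler(handler)
logger.setLevel(logging.DEBUG)
logger.propagate = False
asyncio.run(main())
print(handler.lines)
```

Reading `task.exception()` in the callback also marks the error as retrieved, so a failure is reported once through the logger instead of surfacing as an unhandled-exception warning.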

Comment thread execution_chain/beacon/api_handler/api_forkchoice.nim
  # Payload generation reads `chain.latestHeader`, so we must wait for the
  # apply to complete before assembling the block.
  (await fcuFut).isOkOr:
    return invalidFCU(error, chain, header)

Contributor

Since we end up awaiting here for the block building it might be better to not use the async queue flow at all if attrsOpt.isSome. Block building would be faster in that case.

Contributor Author

Yes, that PR is also ready on top of this one. I am just testing it and ironing out a few issues
(it is a little difficult to test because I don't have many validators, so there are very few proposals to test with).

advaita-saha requested a review from tersec May 1, 2026 17:04
                          safeHash: Hash32 = zeroHash32):
                            Result[void, string] =
  ## Pre-validate a forkchoice update against the current in-memory chain,
  ## mirroring the failure paths inside `forkChoice()`. Run this from the RPC

Contributor

While these are likely to be relatively static, this is an ongoing maintenance point: right now the synchronization between the two has been checked manually (I've been checking it as well, as did you and @bhartnett). But if the engine API rules change at some point, e.g., Gloas allowing the backwards head movement in the engine API that has so far been prohibited, then the two falling out of sync is a risk.

  ## Invariant: between this check and the worker actually running our
  ## enqueued item, the only mutators of `c.heads`/`hashToBlock` are
  ## previously-queued `validateBlock` (adds heads, never removes the
  ## validated one) and `updateFinalized` (only prunes branches that don't

Contributor

After:

This is still somewhat true, but e.g., the checks at:

  # If the head block is already in our canonical chain, the beacon client is
  # probably resyncing. Ignore the update.
  # See point 2 of fCUV1 specification
  # https://github.com/ethereum/execution-apis/blob/v1.0.0-beta.4/src/engine/paris.md#specification-1
  if chain.isCanonicalAncestor(header.number, headHash):
    notice "Ignoring beacon update to old head",
      headHash = headHash.short,
      headNumber = header.number,
      base = chain.baseNumber,
      pendingFCU = chain.pendingFCU.short,
      resolvedFinNum = chain.resolvedFinNumber,
      resolvedFinHash = chain.resolvedFinHash.short
    return validFCU(Opt.none(Bytes8), headHash)

proc isCanonicalAncestor*(c: ForkedChainRef,
                          blockNumber: BlockNumber,
                          blockHash: Hash32): bool =
  if blockNumber >= c.latest.number:
    return false
  if blockHash == c.latest.hash:
    return false
  if c.base.number < c.latest.number:
    # The current canonical chain in memory is headed by
    # latest.header
    for it in ancestors(c.latest):
      if it.hash == blockHash and it.number == blockNumber:
        return true

aren't valid anymore in Amsterdam. So it's not really that the head becomes not-valid; it was as valid as it ever was, it just might not be the head anymore.

This pre-flight check still looks ok, I don't see how it can result in an ultimately incorrect outcome as a result of this change, but the reasoning for it differs.

      var found = false
      for it in ancestors(head):
        if it == fin:
          found = true

Contributor

Why set this indicator variable rather than directly early-return?
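
For comparison, here are the two shapes side by side as a small Python model (not the Nim code). The toy `PARENT` map and both function names are hypothetical, and the model assumes `ancestors` yields the head itself first. It only shows that the early-return form gives the same answers as the flag form.

```python
# Toy chain: A1 - A2 - A3, with B3 forking off A2.
PARENT = {"A1": None, "A2": "A1", "A3": "A2", "B3": "A2"}
ERR = "Invalid finalizedHash: block not in argument head ancestor lineage"


def ancestors(block):
    # Assumption: walks from the block itself back to the root.
    while block is not None:
        yield block
        block = PARENT[block]


def check_with_flag(head, fin):
    found = False
    for it in ancestors(head):
        if it == fin:
            found = True
            break
    if not found:
        return ERR
    return None


def check_with_early_return(head, fin):
    for it in ancestors(head):
        if it == fin:
            return None  # found: nothing left to check
    return ERR


agree = all(
    check_with_flag(h, f) == check_with_early_return(h, f)
    for h in PARENT
    for f in PARENT
)
print(agree)
```

The early return works here because nothing else has to run once the finalized block is found; if more checks were added after the loop, the flag form would be needed again.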

Comment on lines 222 to 870
@@ -296,6 +300,11 @@ func updateFinalized(c: ForkedChainRef, finalized: BlockRef, fcuHead: BlockRef)
c.latest = candidate

proc updateBase(c: ForkedChainRef, base: BlockRef): uint =
## Invariant: must only be called from inside a queue handler (currently
## `processUpdateBase`, enqueued via `queueUpdateBase`). `preflightForkChoice`
## runs synchronously in the RPC handler and relies on this proc — and
## `updateFinalized` — never mutating `c.heads`/`hashToBlock` outside the
## single-consumer worker.
##
## A1 - A2 - A3 D5 - D6
## / /
@@ -743,7 +752,10 @@ proc forkChoice*(c: ForkedChainRef,
     if safe.isOk:
       c.fcuSafe.number = safe.number
       c.fcuSafe.hash = safeHash
-      ?safe.txFrame.fcuSafe(c.fcuSafe)
+      # Use `.expect(...)` like the sibling fcuHead/fcuFinalized writes so a
+      # DB-level failure here is a hard fault rather than a recoverable err -
+      # the RPC handler may have already acked the fCU before we run.
+      safe.txFrame.fcuSafe(c.fcuSafe).expect("fcuSafe OK")

   if headHash == c.latest.hash:
     if finalizedHash == zeroHash32:
@@ -813,6 +825,46 @@ template queueForkChoice*(c: ForkedChainRef,
   await c.queue.addLast(item)
   item.responseFut

+func preflightForkChoice*(c: ForkedChainRef,
+                          headHash: Hash32,
+                          finalizedHash: Hash32,
+                          safeHash: Hash32 = zeroHash32):
+                            Result[void, string] =
+  ## Pre-validate a forkchoice update against the current in-memory chain,
+  ## mirroring the failure paths inside `forkChoice()`. Run this from the RPC
+  ## handler before enqueuing so we don't ack a request that the queue worker
+  ## would later reject (since the ack has already been sent by then).
+  ##
+  ## Invariant: between this check and the worker actually running our
+  ## enqueued item, the only mutators of `c.heads`/`hashToBlock` are
+  ## previously-queued `validateBlock` (adds heads, never removes the
+  ## validated one) and `updateFinalized` (only prunes branches that don't
+  ## contain the finalized block we already verified is on head's chain).
+  ## So a successful preflight remains valid by the time the worker runs.
+  let head = c.hashToBlock.getOrDefault(headHash)
+  if head.isNil:
+    return err("Cannot find head block: " & headHash.short)
+
+  if safeHash != zeroHash32 and c.hashToBlock.getOrDefault(safeHash).isNil:
+    return err("Cannot find safe block: " & safeHash.short)
+
+  if finalizedHash != zeroHash32:
+    let fin = c.hashToBlock.getOrDefault(finalizedHash)
+    if fin.isNil:
+      return err("Cannot find finalized block: " & finalizedHash.short)
+    if fin.number > head.number:
+      return err("Invalid finalizedHash: block is newer than head block")
+    if c.heads.len > 1:
+      var found = false
+      for it in ancestors(head):
+        if it == fin:
+          found = true
+          break
+      if not found:
+        return err("Invalid finalizedHash: block not in argument head ancestor lineage")
+
+  ok()
+
 func resolvedFinHash*(c: ForkedChainRef): Hash32 =
   c.latestFinalized.hash

Contributor

There might be a mismatch here between the preflight and the full fork choice handling. When finalizedHash == zeroHash32, i.e. the chain has never finalized, and there's, e.g., a reorg so that the condition in

  if headHash == c.latest.hash:
    if finalizedHash == zeroHash32:
      # Do nothing if the new head already our current head
      # and there is no request to new finality.
      return ok()
  let
    # Find the unique branch where `headHash` is a member of.
    head = ?c.findHeadPos(headHash)
    # Finalized block must be parent or on the new canonical chain which is
    # represented by `head`.
    finalized = ?c.findFinalizedPos(finalizedHash, head)

doesn't fire, then it will try to run findFinalizedPos, which per

func findFinalizedPos(
    c: ForkedChainRef;
    hash: Hash32;
    head: BlockRef,
      ): Result[BlockRef, string] =
  ## Find header for argument `itHash` on argument `head` ancestor chain.
  ##
  # OK, new finalized stays on the argument head branch.
  # ::
  #            - B3 - B4 - B5 - B6
  #          /              ^    ^
  # A1 - A2 - A3            |    |
  #                       head  CCH
  #
  # A1, A2, B3, B4, B5: valid
  # A3, B6: invalid
  # Find `hash` on the ancestor lineage of `head`
  let fin = c.hashToBlock.getOrDefault(hash)
  if fin.isOk:
    if fin.number > head.number:
      return err("Invalid finalizedHash: block is newer than head block")
    # There is no point traversing the DAG if there is only one branch.
    # Just return the node.
    if c.heads.len == 1:
      return ok(fin)
    for it in ancestors(head):
      if it == fin:
        return ok(fin)
  err("Invalid finalizedHash: block not in argument head ancestor lineage")

won't find the 0x0 finalized block hash and returns an error.

At that point, the preflight and backend handling might have a mismatch. It actually seems like preflight here might have the correct handling and the gap might be in the backend handler.
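
The suspected gap can be reproduced with a small differential check. This is a Python model built only from the snippets quoted in this thread, not the real Nim code: the toy chain, `LATEST` and `HEADS` are hypothetical, `findHeadPos` is assumed to fail only for an unknown head, and everything forkChoice does outside the quoted lines is left out.

```python
ZERO = "0x0"  # stand-in for zeroHash32
# Toy chain: A1 - A2 - A3, with B3 - B4 forking off A2.
PARENT = {"A1": None, "A2": "A1", "A3": "A2", "B3": "A2", "B4": "B3"}
NUMBER = {"A1": 1, "A2": 2, "A3": 3, "B3": 3, "B4": 4}
HEADS = ["A3", "B4"]
LATEST = "A3"
LINEAGE_ERR = "Invalid finalizedHash: block not in argument head ancestor lineage"


def ancestors(block):
    while block is not None:
        yield block
        block = PARENT[block]


def preflight(head, fin):
    if head not in PARENT:
        return "Cannot find head block"
    if fin != ZERO:
        if fin not in PARENT:
            return "Cannot find finalized block"
        if NUMBER[fin] > NUMBER[head]:
            return "Invalid finalizedHash: block is newer than head block"
        if len(HEADS) > 1 and fin not in ancestors(head):
            return LINEAGE_ERR
    return None


def find_finalized_pos(fin, head):
    if fin in PARENT:
        if NUMBER[fin] > NUMBER[head]:
            return "Invalid finalizedHash: block is newer than head block"
        if len(HEADS) == 1 or fin in ancestors(head):
            return None
    return LINEAGE_ERR  # also reached when fin is the zero hash


def fork_choice(head, fin):
    if head == LATEST and fin == ZERO:
        return None
    if head not in PARENT:  # assumed behaviour of findHeadPos
        return "Cannot find head block"
    return find_finalized_pos(fin, head)


# Compare only accept/reject, since the ack depends on that alone.
mismatches = [
    (h, f)
    for h in sorted(PARENT)
    for f in [ZERO] + sorted(PARENT)
    if (preflight(h, f) is None) != (fork_choice(h, f) is None)
]
print(mismatches)
```

In this model every disagreement has a zero finalized hash and a head other than the current latest block, which matches the case described above: the preflight accepts and the backend rejects.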

4 participants