fix(ci): stop fuzz jobs from oversubscribing the CPU #21517

Merged
wwared merged 1 commit into develop from fix/fuzz-no-double-parallelization on Jun 23, 2026

wwared commented Jun 22, 2026

Problem

Go fuzz CI jobs (`cannon-fuzz`, `fuzz-golang-*`, `op-e2e-fuzz`) intermittently fail with a bare `context deadline exceeded` at the `-fuzztime` boundary. This is a timeout, not a real crash. Three such failures occurred on 2026-06-22 across two jobs (e.g. in `FuzzStateHintRead` and `FuzzEncodeDecodeWithdrawal`), against ~0 failures in the prior 745+ runs per job.

Cause

Each fuzz recipe runs N targets concurrently via `parallel`, and each `go test -fuzz` defaults its worker count to `GOMAXPROCS`. On the 8-core CI box that's 8 × 8 = 64 fuzz worker processes. Under that oversubscription a worker can be starved of CPU and miss the `-fuzztime` deadline, which the engine reports as `context deadline exceeded`.
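
The arithmetic above can be sketched in a few lines of Go. This is only an illustration, not code from the repository: `totalFuzzWorkers` is a hypothetical helper, and it assumes the outer runner starts one target per core, as in the 8 × 8 example. It counts fuzz workers only and ignores the coordinating `go test` processes.

```go
package main

import (
	"fmt"
	"runtime"
)

// totalFuzzWorkers returns how many fuzz worker processes exist at once when
// `targets` fuzz targets run concurrently and each one uses
// `workersPerTarget` workers. Hypothetical helper, for illustration only.
func totalFuzzWorkers(targets, workersPerTarget int) int {
	return targets * workersPerTarget
}

func main() {
	// GOMAXPROCS(0) reports the current setting without changing it. This is
	// the value `go test -fuzz` falls back to when -parallel is not given.
	cores := runtime.GOMAXPROCS(0)

	// Assumption: the outer runner launches one target per core.
	fmt.Printf("cores: %d\n", cores)
	fmt.Printf("workers with default -parallel: %d\n", totalFuzzWorkers(cores, cores))
	fmt.Printf("workers with -parallel=1:       %d\n", totalFuzzWorkers(cores, 1))
}
```

On an 8-core machine the default case is 64 workers competing for 8 cores, while `-parallel=1` brings the count back to 8.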

Fix

Pass `-parallel=1` so each target uses a single fuzz worker. The only parallelism left is then across targets (≈ one per core, so no oversubscription). Applied in the shared `go_fuzz` helper (covers op-node, op-batcher, op-chain-ops, op-service, op-challenger) and the two bespoke recipes (cannon, op-e2e).
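
As a rough sketch of what a single-worker invocation looks like, the snippet below assembles the `go test` arguments for one target. It does not reproduce the repository's `go_fuzz` recipe: `fuzzArgs`, the package path `./example`, and the `10s` budget are assumptions made for illustration. Only the flags `-fuzz`, `-fuzztime`, and `-parallel` are real `go test` flags.

```go
package main

import (
	"fmt"
	"strings"
)

// fuzzArgs builds the `go test` arguments for fuzzing exactly one target with
// a single fuzz worker. Hypothetical helper, not the repository's recipe.
func fuzzArgs(target, pkg, fuzztime string) []string {
	return []string{
		"test",
		"-fuzz=^" + target + "$", // anchor the pattern so only one target matches
		"-fuzztime=" + fuzztime,  // duration budget, left unchanged by the fix
		"-parallel=1",            // one fuzz worker instead of GOMAXPROCS
		pkg,                      // -fuzz accepts a single package
	}
}

func main() {
	// The package path and duration here are placeholders.
	args := fuzzArgs("FuzzStateHintRead", "./example", "10s")
	fmt.Println("go " + strings.Join(args, " "))
}
```

Keeping the worker count in the per-target command, rather than lowering the outer runner's concurrency, leaves all targets running side by side while each one stays within a single core.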

The duration-based `-fuzztime` budgets are unchanged, so wall time is unaffected (verified locally: identical wall time, no throughput regression).

This is a CI-scheduling flake reproducible only under contention, so there's no unit-test regression guard.

Closes #21516

Each fuzz recipe runs N targets concurrently via `parallel`, and each
`go test -fuzz` defaults to GOMAXPROCS workers — N×N worker processes on
an N-core CI box. A starved worker can miss the `-fuzztime` deadline and
fail the target with a bare `context deadline exceeded` (not a real crash).

Pass `-parallel=1` so each target uses a single fuzz worker, leaving the
only parallelism across targets. Fixed in the shared `go_fuzz` helper
(op-node, op-batcher, op-chain-ops, op-service, op-challenger) and the two
bespoke recipes (cannon, op-e2e). Duration `-fuzztime` budgets are
unchanged, so CI wall-time is unaffected.
wwared requested a review from a team as a code owner June 22, 2026 23:50
wwared enabled auto-merge June 22, 2026 23:52
wwared added this pull request to the merge queue Jun 23, 2026
Merged via the queue into develop with commit e546bf7 Jun 23, 2026
101 checks passed
wwared deleted the fix/fuzz-no-double-parallelization branch June 23, 2026 08:38
Development

Successfully merging this pull request may close these issues.

flaky test: Go fuzz jobs context deadline exceeded (cannon-fuzz, fuzz-golang-op-chain-ops)