Drop write-lane wrap from BatchWriteItem entry points #568

Merged
Osvaldo Andrade (osvaldoandrade) merged 1 commit into master from fix/write-lane-fsm-deadlock
Jun 23, 2026
Conversation

@osvaldoandrade

Collaborator

Summary

  • 720da17 wrapped PutItemWithCtx, DeleteItemWithCtx and BatchWriteItemCtx in runWriteCtx, which holds a write-lane worker across CommitBatch -> Replicate. The leader FSM's ApplyCommittedBatch also takes the write lane (the invariant from #428, "[Wave 2] Bypass write-lane in CommitBatch to restore group-commit", pinned by TestApplyCommittedBatchUsesWriteLane). Under sustained load every worker is parked in Replicate and the FSM cannot acquire a slot, so every write deadlocks until RPC_TIMEOUT.
  • Reverts the high-level entry-point lane wrap and restores the workers*2 dispatch buffer that 720da17 dropped to 0. The lane still throttles FSM applies and Set / Delete, which is where it was designed to live. Ctx variants keep their signatures for cancellation and observability.
  • SL threading on writes will need to ride on the batch payload and be dispatched at ApplyCommittedBatch, not at the caller goroutine. Tracked as follow-up.

Bench (8-node, 24 shards, RF=3, 64 write workers, 5m sustained)

| Phase | baseline cab738b | 720da17 broken | master + fix |
| --- | --- | --- | --- |
| write_only | 110 911 rows/s, 0 err, 5m | 589 rows/s, 64 err, 30s abort | 113 975 rows/s, 0 err, 5m |
| read_seed | 155 461 rows/s | 506 rows/s, 64 err | 185 742 rows/s |
| read_only | 109 724 rows/s | 150 439 rows/s | 109 338 rows/s |
| mixed/write | 34 763 rows/s | 388 rows/s, 64 err | 32 017 rows/s |
| mixed/read | 25 054 rows/s | 172 288 rows/s, 512 err | 23 679 rows/s |

write_only recovery factor: ~193x (113 975 / 589 rows/s).

Test plan

  • go test ./internal/storage/adapter/pebble/... -race
  • go test ./internal/server/...
  • scripts/bench/bench_8node_matrix.sh PASS on all 5 phases, 0 errors anywhere
  • TestApplyCommittedBatchUsesWriteLane still pins the FSM-side lane invariant
  • TestCommitBatchBypassesWriteLane unchanged
  • TestCtxAwareMethodsUseServiceLevelShares updated: write-side assertion removed (it was asserting the deadlock-causing behaviour); read-side SL routing still verified

720da17 routed PutItemWithCtx, DeleteItemWithCtx and BatchWriteItemCtx
through the write lane to thread service levels through the DRR
scheduler. Under sustained write load this deadlocks: the entry point
holds one of N write-lane workers across CommitBatch -> Replicate,
which blocks waiting for raft to commit + apply. The leader FSM then
calls ApplyCommittedBatch, which itself takes the write lane
(TestApplyCommittedBatchUsesWriteLane pins this). With every worker
parked at <-req.done in Replicate, the FSM cannot acquire a slot and
the round-trip stalls until RPC_TIMEOUT.
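
The cycle described above can be reproduced in miniature. The following is a minimal sketch, not the project's code: `lane`, `replicate`, `write` and `runLoad` are hypothetical names, the lane is modelled as a plain slot semaphore rather than the DRR scheduler, and a short timeout stands in for RPC_TIMEOUT. A single slot is used so the outcome does not depend on goroutine scheduling; with N slots the same situation should arise once N entry points hold them all.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

var errNoSlot = errors.New("timed out waiting for a write-lane slot")

// lane is a hypothetical stand-in for the write lane: a fixed pool of slots.
type lane struct{ slots chan struct{} }

func newLane(workers int) *lane { return &lane{slots: make(chan struct{}, workers)} }

// run executes fn while holding a slot, or gives up after timeout
// (the timeout plays the role of RPC_TIMEOUT in this sketch).
func (l *lane) run(timeout time.Duration, fn func() error) error {
	select {
	case l.slots <- struct{}{}:
		defer func() { <-l.slots }()
		return fn()
	case <-time.After(timeout):
		return errNoSlot
	}
}

// replicate models CommitBatch -> Replicate: the caller blocks until the
// FSM-side apply, which itself needs a lane slot, has finished.
func replicate(l *lane, timeout time.Duration) error {
	done := make(chan error, 1)
	go func() { // stands in for the leader FSM calling ApplyCommittedBatch
		done <- l.run(timeout, func() error { return nil })
	}()
	return <-done
}

// write issues one write. wrapEntry=true mimics the entry-point lane wrap;
// wrapEntry=false mimics the bypass restored by this change.
func write(l *lane, wrapEntry bool, timeout time.Duration) error {
	if wrapEntry {
		return l.run(timeout, func() error { return replicate(l, timeout) })
	}
	return replicate(l, timeout)
}

// runLoad fires concurrent writes and returns how many failed.
func runLoad(workers, writers int, wrapEntry bool) int {
	l := newLane(workers)
	errs := make(chan error, writers)
	for i := 0; i < writers; i++ {
		go func() { errs <- write(l, wrapEntry, 100*time.Millisecond) }()
	}
	failed := 0
	for i := 0; i < writers; i++ {
		if err := <-errs; err != nil {
			failed++
		}
	}
	return failed
}

func main() {
	fmt.Println("lane-wrapped entry points, failed writes:", runLoad(1, 3, true))
	fmt.Println("bypassing entry points, failed writes:", runLoad(1, 3, false))
}
```

In this model every wrapped write should end in the timeout, because the slot its apply needs is the one the entry point is still holding, while the bypassing run should finish with no failures.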

Restore the #428 invariant: high-level write entries bypass the lane.
The lane still throttles FSM applies and Set / Delete, which is where
it was designed to live. Ctx variants keep their signatures for
cancellation and observability, but no longer reserve a worker. SL
threading on writes will need to ride on the batch payload and be
dispatched at ApplyCommittedBatch, not at the caller goroutine.
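
One possible shape for that follow-up, as a rough sketch only: the service level is serialized with the batch and read back at apply time, so the entry point never reserves a worker. Everything here is an assumption for illustration (`ServiceLevel`, `Batch`, `Lane`, `encodeWrite`, `applyCommittedBatch`, and the JSON encoding); the PR does not show the real payload format or scheduler API.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ServiceLevel is a hypothetical label; the real type is not shown in the PR.
type ServiceLevel string

// Batch is a hypothetical replicated payload. The service level travels with
// the data, so the apply side sees the value the caller chose.
type Batch struct {
	SL   ServiceLevel `json:"sl"`
	Keys []string     `json:"keys"`
}

// encodeWrite models a high-level entry point: it tags the batch with a
// service level and takes no lane slot.
func encodeWrite(sl ServiceLevel, keys ...string) ([]byte, error) {
	return json.Marshal(Batch{SL: sl, Keys: keys})
}

// Lane is a hypothetical write lane that counts applied batches per level.
type Lane struct{ applied map[ServiceLevel]int }

func NewLane() *Lane { return &Lane{applied: make(map[ServiceLevel]int)} }

// Submit stands in for handing work to a scheduler queue keyed by service
// level; here it simply runs fn inline and records the level.
func (l *Lane) Submit(sl ServiceLevel, fn func()) {
	fn()
	l.applied[sl]++
}

// applyCommittedBatch models the FSM-side apply, the only place in this
// sketch where the lane is used.
func applyCommittedBatch(l *Lane, payload []byte, store map[string]bool) error {
	var b Batch
	if err := json.Unmarshal(payload, &b); err != nil {
		return err
	}
	l.Submit(b.SL, func() {
		for _, k := range b.Keys {
			store[k] = true
		}
	})
	return nil
}

func main() {
	payload, err := encodeWrite("interactive", "user#1", "user#2")
	if err != nil {
		panic(err)
	}
	lane, store := NewLane(), map[string]bool{}
	if err := applyCommittedBatch(lane, payload, store); err != nil {
		panic(err)
	}
	fmt.Println(lane.applied["interactive"], len(store))
}
```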

Also restore dispatch's workers*2 buffer, which 720da17 dropped to 0:
without it the dispatcher serializes worker handoffs and loses
pipelining, and nothing in 720da17 motivated the change.
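
The effect of the buffer size can be seen with plain Go channels. This is a small sketch, assuming the dispatch queue behaves like a Go channel of capacity workers*2; `acceptedWhileBusy` is a hypothetical helper, not a function from the repository. It counts how many requests a dispatcher can hand off without blocking while every worker is busy, which is modelled here by having no receiver at all.

```go
package main

import "fmt"

// acceptedWhileBusy reports how many requests can be enqueued without
// blocking when no worker is currently receiving from the channel.
func acceptedWhileBusy(buffer, requests int) int {
	dispatch := make(chan int, buffer)
	accepted := 0
	for i := 0; i < requests; i++ {
		select {
		case dispatch <- i:
			accepted++
		default:
			// The dispatcher would block here until a worker becomes free.
			return accepted
		}
	}
	return accepted
}

func main() {
	const workers = 4
	fmt.Println(acceptedWhileBusy(workers*2, 10)) // prints 8
	fmt.Println(acceptedWhileBusy(0, 10))         // prints 0
}
```

With an unbuffered channel every handoff has to rendezvous with a free worker, so the dispatcher can only advance one handoff at a time; the workers*2 buffer lets it stay ahead of the workers.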

8-node bench (24 shards, RF=3, 64 write workers, 5m sustained):
  baseline cab738b:  write_only 110 911 rows/s, 0 errors, 5m
  720da17 broken:    write_only       589 rows/s, 64 errors, 30s abort
  master + fix:      write_only 113 975 rows/s, 0 errors, 5m
@osvaldoandrade Osvaldo Andrade (osvaldoandrade) merged commit 042fa50 into master Jun 23, 2026
7 checks passed
@osvaldoandrade Osvaldo Andrade (osvaldoandrade) deleted the fix/write-lane-fsm-deadlock branch June 23, 2026 18:59