The WebUI clarification popup had a response-delivery failure: users
submitted answers in the popup, but the agent still fell through to the
timeout fallback message. Three bugs conspired:
1. No stable clarify_id — _ClarifyEntry had no unique identifier, so
the frontend could not reference a specific pending prompt. The
backend used FIFO resolution which silently failed for stale/late
responses.
2. Frontend hid the card before confirmation — respondClarify() called
hideClarifyCard(true, 'sent') BEFORE the API call completed. If the
backend rejected the response, the card was already gone and the
user's draft was discarded.
3. Backend lied about success — _resolve_clarify_legacy() returned
bool(resolved) or not bool(clarify_id). Since the frontend never
sent clarify_id, the backend always reported ok:true even when
nothing was resolved.
Changes:
api/clarify.py:
- _ClarifyEntry now auto-generates a stable clarify_id (uuid4.hex[:12])
- submit_pending() injects clarify_id into the data dict visible to the
frontend via SSE and polling
- New resolve_clarify_by_id() for O(1) lookup by id instead of FIFO pop
api/routes.py:
- _resolve_clarify_legacy() uses resolve_clarify_by_id when clarify_id
is provided; returns actual bool result (no more unconditional True)
- _handle_clarify_respond() returns HTTP 409 + {ok:false, stale:true}
when resolution fails
static/messages.js:
- respondClarify() now sends clarify_id in the POST body
- Waits for a positive backend acknowledgement before hiding the card
- Saves a draft copy before POST and restores it on failure
- On 409/network error: re-enables controls, shows error toast
- Guards against parallel-SSE race where clearing the cache after a
successful response could erase a newly queued next prompt (codex P1)
tests:
- Updated test_sprint30.py for new ack-before-hide behaviour
- Updated test_clarify_unblock.py for 409 on stale responses
Closes#2639.
Replaces the 1.5s HTTP polling loop for clarify with a Server-Sent Events endpoint at /api/clarify/stream that pushes clarify events to the browser instantly. Mirrors the approval SSE pattern from v0.50.248 (#1350) including all the correctness lessons:
- Atomic subscribe + initial snapshot under clarify._lock
- _clarify_sse_notify called inside _lock for ordering guarantees (no notify-out-of-order race)
- Notify passes head=q[0].data (head-fidelity, not the just-appended entry)
- resolve_clarify also calls notify after pop so trailing clarifies surface immediately (no stuck-clarify bug)
- Empty-state notify with None,0 after pop-empty so frontend hides the card
- 30s keepalive comments, _CLIENT_DISCONNECT_ERRORS handling
- Bounded queue (maxsize=16) with silent drop on full
- Frontend: EventSource with automatic 3s HTTP polling fallback on onerror
Co-authored-by: fxd-jason <wujiachen7@gmail.com>