Replace the earlier frontend-reset approach with a backend side-channel
approach that preserves the queue (event, data) tuple shape.
Problem (Opus catch):
- Live SSE frames emitted by _sse() in api/streaming.py:2296 carried no
'id:' field. Only journal-replay frames (via _sse_with_id) emitted IDs.
- Frontend's _lastRunJournalSeq cursor stayed at 0 during live streaming.
- Mid-stream error → reconnect-to-replay arrived with after_seq=0.
- Server replayed every journaled event from seq 1.
- assistantText (closure-scoped) had accumulated all live tokens already
→ double-rendered output.
Fix:
- api/config.py: STREAM_LAST_EVENT_ID: dict = {} module-level dict.
- api/streaming.py put(): capture journal event_id, write to
STREAM_LAST_EVENT_ID[stream_id]. Keep queue tuple as (event, data).
- api/routes.py _handle_sse_stream: read STREAM_LAST_EVENT_ID[stream_id]
at emit time, use _sse_with_id when set.
- api/streaming.py finally block: pop STREAM_LAST_EVENT_ID for cleanup.
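
The four fix steps above can be sketched as follows. This is a minimal
illustration, not the code in api/streaming.py or api/routes.py: the
in-memory journal, journal_append(), emit_frame(), and the exact frame
layout inside _sse()/_sse_with_id() are all assumptions made for the example.

```python
import json
import queue

# Module-level side-channel (lives in api/config.py per the commit message).
STREAM_LAST_EVENT_ID: dict = {}

# Hypothetical in-memory journal: stream_id -> list of (event, data).
_journal: dict = {}


def journal_append(stream_id, event, data):
    """Hypothetical journal write; returns the new 1-based sequence number."""
    entries = _journal.setdefault(stream_id, [])
    entries.append((event, data))
    return len(entries)


def _sse(event, data):
    # Assumed frame layout: no 'id:' field.
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"


def _sse_with_id(event, data, event_id):
    # Assumed frame layout: the same frame preceded by an 'id:' field.
    return f"id: {event_id}\n" + _sse(event, data)


def put(q, stream_id, event, data):
    """Journal first, record the id out-of-band, keep the queue item a 2-tuple."""
    STREAM_LAST_EVENT_ID[stream_id] = journal_append(stream_id, event, data)
    q.put((event, data))


def emit_frame(q, stream_id):
    """Consumer side: read the side-channel at emit time."""
    event, data = q.get()
    event_id = STREAM_LAST_EVENT_ID.get(stream_id)
    if event_id is not None:
        return _sse_with_id(event, data, event_id)
    return _sse(event, data)


def cleanup(stream_id):
    """Mirror of the finally block: drop the side-channel entry."""
    STREAM_LAST_EVENT_ID.pop(stream_id, None)
```

In this sketch the id is read when the frame is emitted, so a frame is tagged
with the most recently journaled id for the stream. If the producer runs
ahead of the consumer, that id may be newer than the frame itself.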
Why side-channel instead of 3-tuple:
- Earlier attempt (queue tuple → (event, data, event_id)) broke 4 existing
tests: test_cancel_interrupt, test_sprint42, test_sprint51,
test_issue1857_usage_overwrite. These all unpack 'event, data = q.get()'.
- Frontend-reset approach (reset assistantText before replay) broke 3
other tests: test_smooth_text_fade, test_streaming_markdown,
test_streaming_race_fix. _wireSSE must NOT reset accumulators because
legacy reconnect doesn't replay events; only journal-replay does.
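
The queue-contract breakage is ordinary tuple unpacking. A standalone
illustration (not taken from the listed tests; the payload values are made up):

```python
import queue

q = queue.Queue()

# Legacy consumers unpack exactly two values from each queue item.
q.put(("token", {"text": "hi"}))
event, data = q.get()

# A 3-tuple item makes the same unpack raise ValueError.
q.put(("token", {"text": "hi"}, 7))
try:
    event, data = q.get()
    unpack_error = None
except ValueError as exc:
    unpack_error = str(exc)  # "too many values to unpack (expected 2)"
```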
Side-channel preserves both invariants:
- Queue contract stays (event, data) — legacy consumers unbroken.
- Frontend accumulators stay alive on _wireSSE — legacy reconnect unbroken.
- Live SSE emits 'id:' so the journal cursor advances correctly.
6 regression tests added in test_stage364_opus_live_sse_event_id.py.
1 existing test (test_run_journal_streaming_static.test_streaming_journals_sse_events_before_queue_delivery) updated to be tuple-shape-agnostic.
Test results:
- Full pytest: 5713 passed, 10 skipped, 1 xfailed, 2 xpassed, 0 failed
- Previously-failing 5 tests: ALL PASS
- 6 new regression tests: ALL PASS
Opus advisor caught that the new run-journal replay path could double-render
when the live stream errors mid-stream:
- Live SSE frames emitted by _sse() in api/streaming.py:2296 carry no 'id:'
field. Only _sse_with_id() (used in _replay_run_journal at routes.py:5853)
emits IDs.
- During live streaming, EventSource.lastEventId stays empty, so the frontend's
_lastRunJournalSeq stays at 0.
- If the server dies mid-stream, the error reconnect handler opens replay with
after_seq=0 — server replays every journaled event from seq 1.
- assistantText accumulator (closure scope in messages.js) carries over from
the live phase. The token handler unconditionally appends d.text. Double-
rendered text.
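
The failure sequence above can be reproduced in a small simulation. The
variable names stand in for the frontend state (assistantText,
_lastRunJournalSeq). The three-token journal and the replay() helper are
invented for the example and are not the messages.js or routes.py code.

```python
def replay(journal, after_seq):
    """Return journaled (seq, text) entries with seq > after_seq."""
    return [(seq, text) for seq, text in journal if seq > after_seq]


def run(live_frames_carry_id):
    journal = [(1, "Hel"), (2, "lo "), (3, "world")]
    assistant_text = ""  # stands in for the closure-scoped accumulator
    last_seq = 0         # stands in for _lastRunJournalSeq

    # Live phase: two tokens arrive, then the stream errors.
    for seq, text in journal[:2]:
        assistant_text += text
        if live_frames_carry_id:
            last_seq = seq  # the cursor only advances when a frame has an id

    # Reconnect-to-replay: the server sends everything after the cursor.
    for seq, text in replay(journal, last_seq):
        assistant_text += text  # the token handler appends unconditionally
        last_seq = seq
    return assistant_text


print(run(live_frames_carry_id=False))  # Hello Hello world
print(run(live_frames_carry_id=True))   # Hello world
```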
Fix: reset assistantText, reasoningText, liveReasoningText, segmentStart, and
set _smdReconnect=true before opening the replay EventSource. Next live token
clears assistantBody.innerHTML to match the reset accumulator.
4 regression tests added in test_stage364_opus_replay_doublerender_fix.py.
Revert-fix verification confirms 3/4 tests fail against reverted code.
This is the TWO-LAYER catch in action: agent self-verified the producer→
consumer chain works end-to-end (Step 3 in agent-side-empirical-verification.md
PASSED for #2283), and Opus independently caught a separate frontend coupling
issue. Both checks are required, and both fired.
v0.51.71 — Release AU:
- PR #2349 (fixes #2345) — Stale-stream cleanup without touching updated_at
- PR #2343 (refs #2147) — Profiles vs workspaces help card
- PR #2283 (refs #1925) — WebUI run event journal replay (RFC slice 1)
Also relabeled #2283's CHANGELOG entry with proper PR #2283 attribution
(it had been added without the PR number prefix during the contributor PR)
and filled in #2349's 'PR TBD' placeholder.
v0.51.70 — Release AS (this batch):
- PR #2337 (compression snapshot runtime-clear branch 2)
- PR #2334 (turn-journal fcntl lock)
- PR #2342 (INFLIGHT reattach pending user row)
- PR #2339 (workspace panel edge reopen toggle)
v0.51.69 — Release AT (retroactive — these PRs shipped at v0.51.69
tag yesterday but were never moved out of Unreleased at release time;
restoring proper attribution):
- PR #2332, #2333, #2322, #2326, #2327, #2328, #2330, #2331
CHANGELOG drift detected via Pitfall 6 in test-augmentation pitfalls
doc — Unreleased section contained 8 orphan PRs that shipped at the
v0.51.69 tag but were never sectioned correctly. Retroactively splicing
the v0.51.69 header to attribute them properly so future release notes
don't mis-attribute work to v0.51.70.
Replace the hardcoded Skyly cancellation wording with the configured bot_name from settings, falling back to Hermes when unset.
Keep the client-side fallback in sync by using window._botName if the session refresh after cancellation fails.
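
A minimal sketch of the server-side fallback, assuming settings is a plain
dict. resolve_bot_name(), cancellation_message(), the message wording, and
the choice to treat a blank value as unset are hypothetical, not taken from
the change itself.

```python
DEFAULT_BOT_NAME = "Hermes"


def resolve_bot_name(settings):
    """Return the configured bot_name, or the default when unset or blank."""
    name = (settings.get("bot_name") or "").strip()
    return name or DEFAULT_BOT_NAME


def cancellation_message(settings):
    # Placeholder wording; the real cancellation string is not shown here.
    return f"{resolve_bot_name(settings)} cancelled the task."
```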
Co-authored-by: Obryn 🐉 <obryn-ai@dotbeeps.dev>