Commit Graph

221 Commits

Author SHA1 Message Date
starship-s 2e9ca283dc fix: display canonical cache hit percentage 2026-05-19 02:27:12 -06:00
Dennis Soong ea978a1989 fix: surface auto-compression handoff 2026-05-19 10:45:43 +08:00
nesquena-hermes 91099051c6 Stage 384: PR #2505
# Conflicts:
#	CHANGELOG.md
2026-05-18 22:44:02 +00:00
Michael Lam e94827f460 fix: stop replaying reasoning-only history 2026-05-18 10:50:42 -07:00
Hermes Agent 42b97d15f6 fix: clear fallback streaming warnings 2026-05-18 12:21:59 -05:00
Michael Lam f3f9f3ed40 fix: allow keyless named custom endpoints 2026-05-18 04:27:31 -07:00
nesquena-hermes 935d9e6402 Stage 379: PR #2461
# Conflicts:
#	CHANGELOG.md
2026-05-17 23:35:18 +00:00
swftwolfzyq b2ee7e365f Merge latest origin/master into codex/workspace-prefix-display-fix 2026-05-17 23:44:16 +08:00
swftwolfzyq 3553e63a51 Merge origin/master into codex/workspace-prefix-display-fix 2026-05-17 23:39:12 +08:00
starship-s 625d8d02fd fix: preserve memory lifecycle mark ordering 2026-05-17 05:16:46 -06:00
starship-s eb70a6dc5d fix: align WebUI memory commits with CLI boundaries 2026-05-17 05:04:57 -06:00
starship-s aecad0f427 [verified] Fix WebUI memory session lifecycle commits 2026-05-17 03:30:06 -06:00
nesquena-hermes 47c210899e Stage 374: PR #2421 — fix(cache-tokens): surface provider prompt-cache read/write tokens in WebUI usage by @Michaelyklam (fixes #2419)
Co-authored-by: Michael Lam <michael@example.local>
2026-05-17 02:49:34 +00:00
nesquena-hermes 8a950cfbdd Stage 373: PR #2417 — fix(streaming): stale compaction task resume on fresh greetings (closes #2308, supersedes #2309)
Co-authored-by: Frank Song <franksong2702@gmail.com>
2026-05-17 00:22:22 +00:00
Hermes Agent b937cf3583 Stage 370: PR #2390 — Fix live progress Activity grouping by @franksong2702
# Conflicts:
#	CHANGELOG.md
2026-05-16 20:21:58 +00:00
Hermes Agent ade7401ae1 Stage 369: PR #2396 — fix(streaming): preserve session agents for credential pools by @starship-s 2026-05-16 20:03:44 +00:00
starship-s 727e3c9c8f fix(streaming): preserve session agents for credential pools 2026-05-16 13:05:25 -06:00
Frank Song 2dfe3ffb42 Fix live progress activity grouping 2026-05-16 23:37:44 +08:00
Michael Lam 962b3840e6 fix: strip historical images in text mode 2026-05-16 03:55:12 -07:00
Hermes Agent b293bf8bc5 stage-364: Opus-caught live SSE event_id fix (side-channel approach)
Replace the earlier frontend-reset approach with a backend side-channel
approach that preserves the queue (event, data) tuple shape.

Problem (Opus catch):
- Live SSE frames emitted by _sse() in api/streaming.py:2296 carried no
  'id:' field. Only journal-replay frames (via _sse_with_id) emitted IDs.
- Frontend's _lastRunJournalSeq cursor stayed at 0 during live streaming.
- Mid-stream error → reconnect-to-replay arrived with after_seq=0.
- Server replayed every journaled event from seq 1.
- assistantText (closure-scoped) had accumulated all live tokens already
  → double-rendered output.

Fix:
- api/config.py: STREAM_LAST_EVENT_ID: dict = {} module-level dict.
- api/streaming.py put(): capture journal event_id, write to
  STREAM_LAST_EVENT_ID[stream_id]. Keep queue tuple as (event, data).
- api/routes.py _handle_sse_stream: read STREAM_LAST_EVENT_ID[stream_id]
  at emit time, use _sse_with_id when set.
- api/streaming.py finally block: pop STREAM_LAST_EVENT_ID for cleanup.

Why side-channel instead of 3-tuple:
- Earlier attempt (queue tuple → (event, data, event_id)) broke 4 existing
  tests: test_cancel_interrupt, test_sprint42, test_sprint51,
  test_issue1857_usage_overwrite. These all unpack 'event, data = q.get()'.
- Frontend-reset approach (reset assistantText before replay) broke 3
  other tests: test_smooth_text_fade, test_streaming_markdown,
  test_streaming_race_fix. _wireSSE must NOT reset accumulators because
  legacy reconnect doesn't replay events; only journal-replay does.

Side-channel preserves both invariants:
- Queue contract stays (event, data) — legacy consumers unbroken.
- Frontend accumulators stay alive on _wireSSE — legacy reconnect unbroken.
- Live SSE emits 'id:' so the journal cursor advances correctly.
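
The side-channel idea above can be sketched roughly as follows. This is a minimal illustration, not the code in api/streaming.py or api/routes.py: only STREAM_LAST_EVENT_ID and the (event, data) tuple shape come from the message above, while put(), format_sse(), emit_next() and cleanup() are hypothetical stand-ins with assumed signatures.

```python
import queue

# Module-level side channel: stream_id -> last journaled event id.
STREAM_LAST_EVENT_ID: dict = {}


def put(q, stream_id, event, data, event_id=None):
    """Producer side: record the event id out-of-band, keep the 2-tuple."""
    if event_id is not None:
        STREAM_LAST_EVENT_ID[stream_id] = event_id
    q.put((event, data))  # queue contract stays (event, data)


def format_sse(event, data, event_id=None):
    """Format one SSE frame; the 'id:' line is added only when an id is known."""
    lines = []
    if event_id is not None:
        lines.append(f"id: {event_id}")
    lines.append(f"event: {event}")
    lines.append(f"data: {data}")
    return "\n".join(lines) + "\n\n"


def emit_next(q, stream_id):
    """Consumer side: unpack the legacy 2-tuple, read the id at emit time."""
    event, data = q.get()
    return format_sse(event, data, STREAM_LAST_EVENT_ID.get(stream_id))


def cleanup(stream_id):
    """Drop the side-channel entry once the stream ends."""
    STREAM_LAST_EVENT_ID.pop(stream_id, None)
```

In this simplified sketch the id read at emit time is the most recently recorded one, so it matches the dequeued event only while the consumer keeps pace with the producer.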

6 regression tests added in test_stage364_opus_live_sse_event_id.py.
1 existing test (test_run_journal_streaming_static.test_streaming_journals_sse_events_before_queue_delivery) updated to be tuple-shape-agnostic.

Test results:
- Full pytest: 5713 passed, 10 skipped, 1 xfailed, 2 xpassed, 0 failed
- Previously-failing 5 tests: ALL PASS
- 6 new regression tests: ALL PASS
2026-05-16 03:58:54 +00:00
Frank Song 3b96035af0 Add WebUI run event journal replay 2026-05-16 02:58:34 +00:00
Michael Lam 0e91f89ce3 fix: clear runtime fields on loaded compression snapshots 2026-05-15 17:55:35 -07:00
Hermes Agent 62e4d9b2f5 Merge pull request #2327 into stage-362
fix: use assistant name in cancel copy (dotBeeps)
2026-05-15 22:55:35 +00:00
Michael Lam 6799ec56cf test: retarget compression snapshot runtime regression 2026-05-15 15:29:28 -07:00
dot 🐶 3add6f450f fix: use assistant name in cancel copy
Replace the hardcoded Skyly cancellation wording with the configured bot_name from settings, falling back to Hermes when unset.

Keep the client-side fallback in sync by using window._botName if the session refresh after cancellation fails.

Co-authored-by: Obryn 🐉 <obryn-ai@dotbeeps.dev>
2026-05-15 16:00:30 -04:00
Hermes Agent 29d13953d6 stage-361: apply Opus SHOULD-FIX — allow _attachment_root() in _build_native_multimodal_message 2026-05-15 19:55:34 +00:00
Hermes Agent a8a27eeb7d stage-360: Opus follow-up — update _ENV_LOCK docstring to reflect narrow-lock semantics
Opus stage-360 review caught that the docstring at api/streaming.py:40-43
said 'around the entire agent run' which is no longer accurate after the
narrow-lock refactor. The lock is now held only briefly for the env-mutation
critical section; the agent runs outside the lock and the finally block
re-acquires to atomically restore env vars.

Docstring now points to both narrow-lock implementations as references:
- _run_agent_streaming at line ~2719 (the original pattern)
- profile_env_for_background_worker at api/profiles.py:715 (added stage-360)
2026-05-15 19:05:37 +00:00
Hermes Agent fb0e664a10 stage-360 maintainer fix: narrow _ENV_LOCK to env mutation only in profile_env_for_background_worker
#2299 introduced profile_env_for_background_worker() in api/profiles.py and
changed _ENV_LOCK from threading.Lock() to threading.RLock(). Both changes
were incorrect:

1. RLock masked rather than fixed the underlying deadlock. The QA
   test_env_lock_is_non_reentrant test exists precisely to enforce
   non-reentrance — RLock would let a single thread hold _ENV_LOCK across
   nested critical sections, which hides bugs while still allowing
   different-thread races.

2. The original context manager held _ENV_LOCK for the ENTIRE 'yield'
   duration, meaning the lock was held for the full background worker's
   runtime (title generation, compression, update summary — possibly
   many seconds). That blocked ALL other sessions on _ENV_LOCK, which
   the QA test_third_message_completes runtime test caught as a timeout
   on the third sequential message.

Fix: mirror the narrow-lock pattern from _run_agent_streaming:
  - Acquire _ENV_LOCK only for env mutation (set runtime_env + patch
    skill modules)
  - Release immediately, yield to worker (no lock held)
  - Reacquire in finally to restore env + skill modules

Restored _ENV_LOCK to threading.Lock(). All 20 QA tests now pass,
including test_third_message_completes (previously timing out, now finishing in 35s).
2026-05-15 17:11:45 +00:00
Hermes Agent 3b05929f1a Merge pull request #2299 into stage-360
Fix profile-scoped auxiliary routing for background workers (starship-s)
2026-05-15 16:15:39 +00:00
Hermes Agent b2ebbebf01 Merge pull request #2279 into stage-360
Fix WebUI stream completion recovery gaps (franksong2702, closes #2262 + #2168)
2026-05-15 16:15:38 +00:00
Hermes Agent 75a2464821 stage-359: apply Opus SHOULD-FIX — symmetric runtime-field clearing on snapshot load-and-mark path 2026-05-15 15:27:24 +00:00
Hermes Agent fb8b91019e Merge pull request #2295 into stage-359
fix: clear runtime fields on compression snapshots (ai-ag2026)

# Conflicts:
#	CHANGELOG.md
#	api/streaming.py
2026-05-15 15:06:35 +00:00
Hermes Agent 4826a31fbc Merge pull request #2285 into stage-359
fix: hide pre-compression snapshots from sidebar (dso2ng, refs #2230)

# Conflicts:
#	CHANGELOG.md
2026-05-15 14:55:19 +00:00
starship-s abb6057304 test(profiles): keep profile module reloads isolated 2026-05-15 04:14:09 -06:00
Frank Song cadcf983d5 Tighten silent failure shrink detection 2026-05-15 18:04:53 +08:00
starship-s 4ffecdd7c9 refactor(profiles): consolidate background profile env 2026-05-15 03:58:40 -06:00
Dennis Soong eb31b4ed1e test: tighten compression snapshot preservation coverage 2026-05-15 17:31:37 +08:00
starship-s aa1c7c24f4 fix(profiles): route background aux workers via session profile 2026-05-15 03:02:42 -06:00
ai-ag2026 3a4259476d fix: clear runtime fields on compression snapshots 2026-05-15 09:20:19 +02:00
Dennis Soong bfccdc5c94 fix: hide pre-compression snapshots from sidebar 2026-05-15 11:20:17 +08:00
Frank Song 5f9b9c02b2 Fix WebUI stream completion recovery gaps 2026-05-15 08:36:48 +08:00
fxd-jason 1e80b51560 fix: align usage-overwrite test FakeAgent with real agent message format
The FakeAgent in test_issue1857_usage_overwrite returned only 2 messages
(user + assistant) without the conversation history. The real agent always
returns the full history plus new messages. This mismatch caused the new
_has_new_assistant_reply helper (which checks only messages beyond the
pre-turn offset) to see len(result)==len(prev) and incorrectly flag the
turn as a silent failure.

Fix: prepend conversation_history to the FakeAgent's response so the
message list mirrors production behavior.
2026-05-14 14:48:08 +08:00
fxd-jason 120ec5eba2 fix: silent failure detection scans only new messages, not full history
When a provider error (401/429/rate-limit) causes the agent to return
without producing a new assistant reply, the WebUI should emit an
apperror event so the user sees an inline error. However, the detection
logic scanned ALL messages in result['messages'] — which includes the
full conversation history. If any prior turn had an assistant response,
_assistant_added would be True and the apperror would be silently
skipped, leaving the user staring at a blank response.

Extract a helper _has_new_assistant_reply(all_messages, prev_count)
that only inspects messages beyond the pre-turn history offset. Apply
it to both the main detection path and the self-heal/retry path.
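
One possible shape for such a helper, as a sketch only. The signature (all_messages, prev_count) follows the message above; the message format (dicts with 'role' and string 'content') and the treatment of whitespace-only content as "no reply" are assumptions.

```python
def has_new_assistant_reply(all_messages, prev_count):
    """Return True if a non-empty assistant message exists beyond prev_count."""
    # Only look past the pre-turn history; a shrunken list yields no new messages.
    new_messages = all_messages[max(prev_count, 0):]
    for msg in new_messages:
        if msg.get("role") != "assistant":
            continue
        content = msg.get("content")
        # Assumption: only non-blank string content counts as a reply.
        if isinstance(content, str) and content.strip():
            return True
    return False
```

With this shape, an assistant reply from an earlier turn can no longer mask a turn that produced nothing new.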

Tests: 15 new cases covering history masking, empty content, whitespace,
edge-case shrinks, and multi-assistant scenarios.
2026-05-14 14:34:19 +08:00
Hermes Agent 3d34a72ee8 stage-353: apply Opus SHOULD-FIX — unconditional parent_session_id stamp on compression rotation
Opus identified that PR #2227's preservation block had two related bugs in
the parent_session_id handling:

1. During preservation save: code did
     _old_parent = s.parent_session_id
     s.parent_session_id = None
     s.save(touch_updated_at=False, skip_index=True)
     s.parent_session_id = _old_parent
   The save persisted parent=None to disk. The in-memory restoration didn't
   reach the disk copy. Result: a /branch fork session that subsequently
   compressed lost its 'Forked from X' badge on the preserved old snapshot.

2. Stamping the continuation: code did
     if not s.parent_session_id:
         s.parent_session_id = old_sid
   The 'if not' guard skipped the stamp when the session already had a
   parent_session_id from a prior fork. Result: fork-of-fork compression
   broke lineage — the continuation jumped back to the original fork parent
   instead of the just-preserved immediate predecessor snapshot.

Fix (matches Opus's recommendation):
  - Remove the parent clearing during preservation save (preserve as-is)
  - Drop the 'if not' guard; always stamp continuation to old_sid

This makes the lineage chain consistent: new → old → old.parent → ... root.
Traversal from the continuation always walks through the just-preserved
snapshot to get to its parent's parent, never jumping over the snapshot.
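
The lineage invariant can be illustrated with a small in-memory model. Everything here is hypothetical (the Session dataclass, the dict-backed store, rotate_for_compression, lineage); only the parent_session_id field and the unconditional stamp to old_sid come from the message above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Session:
    session_id: str
    parent_session_id: Optional[str] = None


def rotate_for_compression(session, new_sid, store):
    """Preserve the old snapshot as-is, then stamp the continuation."""
    old_sid = session.session_id
    # The preserved snapshot keeps its own parent (no clearing to None).
    store[old_sid] = Session(old_sid, session.parent_session_id)
    session.session_id = new_sid
    session.parent_session_id = old_sid  # unconditional, no 'if not' guard
    store[new_sid] = session
    return session


def lineage(sid, store):
    """Walk parent links from sid towards the root."""
    chain, seen = [], set()
    while sid is not None and sid not in seen:
        chain.append(sid)
        seen.add(sid)
        node = store.get(sid)
        sid = node.parent_session_id if node else None
    return chain
```

For a fork of "root" that compresses into "fork-2", the walk from "fork-2" passes through "fork" before reaching "root", which is the new → old → old.parent ordering described above.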

Two new regression tests pin both invariants:
  - test_parent_session_id_stamped_unconditionally (no 'if not' guard)
  - test_old_session_parent_preserved_during_archive_save (no parent=None)

Both pass against the fix. All 8 tests in the file pass.
2026-05-14 03:59:02 +00:00
RØG3R L!M4 5bbf18324c fix: preserve session history during compression rotation (#2223)
The previous implementation renamed old_sid.json → new_sid.json during
context compression, destroying the only persistent copy of the full
conversation history. If the summarisation LLM call also failed, the
user was left with zero recoverable messages.

Fix:
- Remove the destructive old_path.rename(new_path) call
- Preserve old_sid.json as an immutable pre-compression archive
- Create new_sid.json as a fresh file via s.save()
- Set parent_session_id on the continuation session for lineage
- Save in-memory messages to old_sid.json if they're newer than disk
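
A rough sketch of a non-destructive rotation, under stated assumptions: sessions are JSON files named <sid>.json, "newer than disk" is approximated by the in-memory list being longer than the stored one, and rotate_session_file is a hypothetical name. The real code goes through s.save() rather than writing files directly.

```python
import json
from pathlib import Path


def rotate_session_file(session_dir, old_sid, new_sid, messages, summary_messages):
    """Keep old_sid.json as an archive and write new_sid.json as a fresh file."""
    old_path = Path(session_dir) / f"{old_sid}.json"
    new_path = Path(session_dir) / f"{new_sid}.json"

    archived = {"session_id": old_sid, "messages": []}
    if old_path.exists():
        archived = json.loads(old_path.read_text())

    # Assumption: a longer in-memory history means the disk copy is stale.
    if len(messages) > len(archived.get("messages", [])):
        archived["messages"] = messages
        old_path.write_text(json.dumps(archived))

    # The continuation is a new file; the archive is never renamed or removed.
    continuation = {
        "session_id": new_sid,
        "parent_session_id": old_sid,
        "messages": summary_messages,
    }
    new_path.write_text(json.dumps(continuation))
    return old_path, new_path
```

Since nothing is renamed, a failure while producing summary_messages would still leave old_sid.json with the full history.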

Test: test_issue2223_compression_no_rename.py (6 tests, all passing)
2026-05-14 03:02:44 +00:00
Frank Song 28ec3af697 fix: strip only leading user-asking wrapper line
Refs #2215 Fix B: remove the mid-response stripping hazard without losing leading multi-line wrapper cleanup.

The pattern now strips only a leading 'the user is asking' wrapper line and preserves the visible answer that follows. Add regression coverage for both the leading-wrapper and mid-response prose cases.
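
As an illustration of anchoring the match to the start of the text, here is a simplified sketch. The regex and the name strip_leading_wrapper are assumptions, and it handles only a single leading wrapper line, not the multi-line wrapper cleanup mentioned above.

```python
import re

# \A anchors to the very start of the text, so mid-response prose never matches.
_LEADING_WRAPPER = re.compile(
    r"\A[ \t]*the user is asking[^\n]*\n+",
    re.IGNORECASE,
)


def strip_leading_wrapper(text):
    """Remove one leading 'the user is asking' line, keep everything after it."""
    return _LEADING_WRAPPER.sub("", text, count=1)
```

The same phrase appearing later in the response is left untouched because the pattern cannot match anywhere but position zero.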
2026-05-14 09:14:28 +08:00
Frank Song dc213d47b8 fix: preserve literal thinking tags 2026-05-14 07:13:34 +08:00
Hermes Agent 7209e89ef4 stage-350: apply Opus SHOULD-FIX — tighten _partial_already_present dedup scope
Opus flagged that PR #2151's cancel-handler partial-dedup loop used a
substring check that was too broad: any short prior assistant reply
('OK', 'Here is the answer:') would dedup a longer new partial containing
it, silently dropping the partial and resurrecting the #893 data-loss bug.

Tightened to only dedup against actual prior _partial=True markers with
exact (whitespace-stripped) content match. Three new regression tests
added (short-non-partial-prefix-does-not-dedup, exact-partial-match-still-
dedups, same-content-non-partial-does-not-dedup).
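
The tightened check might look roughly like this. It is a sketch: the _partial marker and the exact, whitespace-stripped comparison come from the message above, while the function name, signature, and dict-based message shape are assumptions.

```python
def partial_already_present(messages, partial_text):
    """True only if a prior _partial assistant message has the same content."""
    target = partial_text.strip()
    for msg in messages:
        if msg.get("role") != "assistant":
            continue
        if not msg.get("_partial"):
            continue  # ordinary replies never dedup a partial
        if str(msg.get("content", "")).strip() == target:
            return True  # exact match, not a substring check
    return False
```

A short earlier reply such as 'OK' therefore cannot suppress a longer partial that happens to contain it.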

10/10 partial-cancel tests pass after the fix. Also updated CHANGELOG with
the conflict-resolution notes for #2151 vs #2136 and the #2178 test-fix.
2026-05-13 21:11:01 +00:00
Hermes Agent 3f851051cf Merge pull request #2151 into stage-350
fix: clarify cancelled chat turn status (Jordan-SkyLF)

Conflict resolution on api/streaming.py:4549-4567 (the cancel-handler
ownership guard). Both this PR and the already-shipped PR #2136 add a
guard at the same site against stale stream writebacks, from different
angles:

  - PR #2136 (HEAD): _stream_writeback_is_current(_cs, stream_id) — strictly
    dominates by checking the active_stream_id token equality.
  - PR #2151: 'worker won the race' check via (active_stream_id != stream_id
    and not pending_user_message), with _emit_cancel_event = False to suppress
    the terminal cancel event.

Resolution merges both: keep #2136's strictly-stronger condition for skip
detection, and adopt #2151's _emit_cancel_event = False semantic so the
cancel event isn't emitted in addition to skipping the writeback (when the
client may have already received the successful done payload).
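
The merged guard can be sketched as below. This is an assumption-laden illustration: the session is modelled as any object with an active_stream_id attribute, handle_cancel and its return dict are hypothetical, and only the token-equality check and the _emit_cancel_event = False semantic come from the message above.

```python
def stream_writeback_is_current(session, stream_id):
    """Token equality: the cancelled stream must still own the session."""
    return getattr(session, "active_stream_id", None) == stream_id


def handle_cancel(session, stream_id):
    """Decide whether to write back partial state and emit a cancel event."""
    emit_cancel_event = True
    write_back = True
    if not stream_writeback_is_current(session, stream_id):
        # Another stream owns the session now: skip the writeback and
        # suppress the terminal cancel event as well.
        write_back = False
        emit_cancel_event = False
    return {"write_back": write_back, "emit_cancel_event": emit_cancel_event}
```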

55/55 tests pass across cancelled-turn-status + stale-stream-writeback +
the four cancel/data-loss sibling test files.
2026-05-13 20:44:44 +00:00
Hermes Agent 7150e9fe70 Merge pull request #2202 into stage-349
feat: show early session titles on chat start (Jordan-SkyLF)
2026-05-13 19:03:03 +00:00