hermes-webui

mirror of https://github.com/nesquena/hermes-webui.git synced 2026-05-25 11:10:18 +00:00

Author	SHA1	Message	Date
Hermes Agent	62e4d9b2f5	Merge pull request #2327 into stage-362 fix: use assistant name in cancel copy (dotBeeps)	2026-05-15 22:55:35 +00:00
Michael Lam	6799ec56cf	test: retarget compression snapshot runtime regression	2026-05-15 15:29:28 -07:00
dot 🐶	3add6f450f	fix: use assistant name in cancel copy Replace the hardcoded Skyly cancellation wording with the configured bot_name from settings, falling back to Hermes when unset. Keep the client-side fallback in sync by using window._botName if the session refresh after cancellation fails. Co-authored-by: Obryn 🐉 <obryn-ai@dotbeeps.dev>	2026-05-15 16:00:30 -04:00
Hermes Agent	29d13953d6	stage-361: apply Opus SHOULD-FIX — allow _attachment_root() in _build_native_multimodal_message	2026-05-15 19:55:34 +00:00
Hermes Agent	a8a27eeb7d	stage-360: Opus follow-up — update _ENV_LOCK docstring to reflect narrow-lock semantics Opus stage-360 review caught that the docstring at api/streaming.py:40-43 said 'around the entire agent run' which is no longer accurate after the narrow-lock refactor. The lock is now held only briefly for the env-mutation critical section; the agent runs outside the lock and the finally block re-acquires to atomically restore env vars. Docstring now points to both narrow-lock implementations as references: - _run_agent_streaming at line ~2719 (the original pattern) - profile_env_for_background_worker at api/profiles.py:715 (added stage-360)	2026-05-15 19:05:37 +00:00
Hermes Agent	fb0e664a10	stage-360 maintainer fix: narrow _ENV_LOCK to env mutation only in profile_env_for_background_worker #2299 introduced profile_env_for_background_worker() in api/profiles.py and changed _ENV_LOCK from threading.Lock() to threading.RLock(). Both changes were incorrect: 1. RLock masked rather than fixed the underlying deadlock. The QA test_env_lock_is_non_reentrant test exists precisely to enforce non-reentrance — RLock would let a single thread hold _ENV_LOCK across nested critical sections, which hides bugs while still allowing different-thread races. 2. The original context manager held _ENV_LOCK for the ENTIRE 'yield' duration, meaning the lock was held for the full background worker's runtime (title generation, compression, update summary — possibly many seconds). That blocked ALL other sessions on _ENV_LOCK, which the QA test_third_message_completes runtime test caught as a timeout on the third sequential message. Fix: mirror the narrow-lock pattern from _run_agent_streaming: - Acquire _ENV_LOCK only for env mutation (set runtime_env + patch skill modules) - Release immediately, yield to worker (no lock held) - Reacquire in finally to restore env + skill modules Restored _ENV_LOCK back to threading.Lock(). All 20 QA tests now pass, including test_third_message_completes (was timing out, now 35s).	2026-05-15 17:11:45 +00:00
Hermes Agent	3b05929f1a	Merge pull request #2299 into stage-360 Fix profile-scoped auxiliary routing for background workers (starship-s)	2026-05-15 16:15:39 +00:00
Hermes Agent	b2ebbebf01	Merge pull request #2279 into stage-360 Fix WebUI stream completion recovery gaps (franksong2702, closes #2262 + #2168)	2026-05-15 16:15:38 +00:00
Hermes Agent	75a2464821	stage-359: apply Opus SHOULD-FIX — symmetric runtime-field clearing on snapshot load-and-mark path	2026-05-15 15:27:24 +00:00
Hermes Agent	fb8b91019e	Merge pull request #2295 into stage-359 fix: clear runtime fields on compression snapshots (ai-ag2026) # Conflicts: # CHANGELOG.md # api/streaming.py	2026-05-15 15:06:35 +00:00
Hermes Agent	4826a31fbc	Merge pull request #2285 into stage-359 fix: hide pre-compression snapshots from sidebar (dso2ng, refs #2230) # Conflicts: # CHANGELOG.md	2026-05-15 14:55:19 +00:00
starship-s	abb6057304	test(profiles): keep profile module reloads isolated	2026-05-15 04:14:09 -06:00
Frank Song	cadcf983d5	Tighten silent failure shrink detection	2026-05-15 18:04:53 +08:00
starship-s	4ffecdd7c9	refactor(profiles): consolidate background profile env	2026-05-15 03:58:40 -06:00
Dennis Soong	eb31b4ed1e	test: tighten compression snapshot preservation coverage	2026-05-15 17:31:37 +08:00
starship-s	aa1c7c24f4	fix(profiles): route background aux workers via session profile	2026-05-15 03:02:42 -06:00
ai-ag2026	3a4259476d	fix: clear runtime fields on compression snapshots	2026-05-15 09:20:19 +02:00
Dennis Soong	bfccdc5c94	fix: hide pre-compression snapshots from sidebar	2026-05-15 11:20:17 +08:00
Frank Song	5f9b9c02b2	Fix WebUI stream completion recovery gaps	2026-05-15 08:36:48 +08:00
fxd-jason	1e80b51560	fix: align usage-overwrite test FakeAgent with real agent message format The FakeAgent in test_issue1857_usage_overwrite returned only 2 messages (user + assistant) without the conversation history. The real agent always returns the full history plus new messages. This mismatch caused the new _has_new_assistant_reply helper (which checks only messages beyond the pre-turn offset) to see len(result)==len(prev) and incorrectly flag the turn as a silent failure. Fix: prepend conversation_history to the FakeAgent's response so the message list mirrors production behavior.	2026-05-14 14:48:08 +08:00
fxd-jason	120ec5eba2	fix: silent failure detection scans only new messages, not full history When a provider error (401/429/rate-limit) causes the agent to return without producing a new assistant reply, the WebUI should emit an apperror event so the user sees an inline error. However, the detection logic scanned ALL messages in result['messages'] — which includes the full conversation history. If any prior turn had an assistant response, _assistant_added would be True and the apperror would be silently skipped, leaving the user staring at a blank response. Extract a helper _has_new_assistant_reply(all_messages, prev_count) that only inspects messages beyond the pre-turn history offset. Apply it to both the main detection path and the self-heal/retry path. Tests: 15 new cases covering history masking, empty content, whitespace, edge-case shrinks, and multi-assistant scenarios.	2026-05-14 14:34:19 +08:00
Hermes Agent	3d34a72ee8	stage-353: apply Opus SHOULD-FIX — unconditional parent_session_id stamp on compression rotation Opus identified that PR #2227's preservation block had two related bugs in the parent_session_id handling: 1. During preservation save: code did _old_parent = s.parent_session_id s.parent_session_id = None s.save(touch_updated_at=False, skip_index=True) s.parent_session_id = _old_parent The save persisted parent=None to disk. The in-memory restoration didn't reach the disk copy. Result: a /branch fork session that subsequently compressed lost its 'Forked from X' badge on the preserved old snapshot. 2. Stamping the continuation: code did if not s.parent_session_id: s.parent_session_id = old_sid The 'if not' guard skipped the stamp when the session already had a parent_session_id from a prior fork. Result: fork-of-fork compression broke lineage — the continuation jumped back to the original fork parent instead of the just-preserved immediate predecessor snapshot. Fix (matches Opus's recommendation): - Remove the parent clearing during preservation save (preserve as-is) - Drop the 'if not' guard; always stamp continuation to old_sid This makes the lineage chain consistent: new → old → old.parent → ... root. Traversal from the continuation always walks through the just-preserved snapshot to get to its parent's parent, never jumping over the snapshot. Two new regression tests pin both invariants: - test_parent_session_id_stamped_unconditionally (no 'if not' guard) - test_old_session_parent_preserved_during_archive_save (no parent=None) Both pass against the fix. All 8 tests in the file pass.	2026-05-14 03:59:02 +00:00
RØG3R L!M4	5bbf18324c	fix: preserve session history during compression rotation (#2223 ) The previous implementation renamed old_sid.json → new_sid.json during context compression, destroying the only persistent copy of the full conversation history. If the summarisation LLM call also failed, the user was left with zero recoverable messages. Fix: - Remove the destructive old_path.rename(new_path) call - Preserve old_sid.json as an immutable pre-compression archive - Create new_sid.json as a fresh file via s.save() - Set parent_session_id on the continuation session for lineage - Save in-memory messages to old_sid.json if they're newer than disk Test: test_issue2223_compression_no_rename.py (6 tests, all passing)	2026-05-14 03:02:44 +00:00
Frank Song	28ec3af697	fix: strip only leading user-asking wrapper line Refs #2215 Fix B: remove the mid-response stripping hazard without losing leading multi-line wrapper cleanup. The pattern now strips only a leading 'the user is asking' wrapper line and preserves the visible answer that follows. Add regression coverage for both the leading-wrapper and mid-response prose cases.	2026-05-14 09:14:28 +08:00
Frank Song	dc213d47b8	fix: preserve literal thinking tags	2026-05-14 07:13:34 +08:00
Hermes Agent	7209e89ef4	stage-350: apply Opus SHOULD-FIX — tighten _partial_already_present dedup scope Opus flagged that PR #2151's cancel-handler partial-dedup loop used a substring check that was too broad: any short prior assistant reply ('OK', 'Here is the answer:') would dedup a longer new partial containing it, silently dropping the partial and resurrecting the #893 data-loss bug. Tightened to only dedup against actual prior _partial=True markers with exact (whitespace-stripped) content match. Three new regression tests added (short-non-partial-prefix-does-not-dedup, exact-partial-match-still- dedups, same-content-non-partial-does-not-dedup). 10/10 partial-cancel tests pass after the fix. Also updated CHANGELOG with the conflict-resolution notes for #2151 vs #2136 and the #2178 test-fix.	2026-05-13 21:11:01 +00:00
Hermes Agent	3f851051cf	Merge pull request #2151 into stage-350 fix: clarify cancelled chat turn status (Jordan-SkyLF) Conflict resolution on api/streaming.py:4549-4567 (the cancel-handler ownership guard). Both this PR and the already-shipped PR #2136 add a guard at the same site against stale stream writebacks, from different angles: - PR #2136 (HEAD): _stream_writeback_is_current(_cs, stream_id) — strictly dominates by checking the active_stream_id token equality. - PR #2151: 'worker won the race' check via (active_stream_id != stream_id and not pending_user_message), with _emit_cancel_event = False to suppress the terminal cancel event. Resolution merges both: keep #2136's strictly-stronger condition for skip detection, and adopt #2151's _emit_cancel_event = False semantic so the cancel event isn't emitted in addition to skipping the writeback (when client may have already received the successful done payload). 55/55 tests pass across cancelled-turn-status + stale-stream-writeback + the four cancel/data-loss sibling test files.	2026-05-13 20:44:44 +00:00
Hermes Agent	7150e9fe70	Merge pull request #2202 into stage-349 feat: show early session titles on chat start (Jordan-SkyLF)	2026-05-13 19:03:03 +00:00
Jordan SkyLF	0381294f1c	feat: add early session provisional titles	2026-05-13 11:37:11 -07:00
MrFant	520795fdd2	fix: preserve reasoning_content in API message whitelist Providers like Xiaomi MiMo, DeepSeek, and Kimi require reasoning_content to be echoed back on every assistant message in multi-turn conversations with tool calls. Omitting it causes HTTP 400: 'The reasoning_content in the thinking mode must be passed back to the API.' The WebUI's _sanitize_messages_for_api() strips all fields not in _API_SAFE_MSG_KEYS before sending conversation history to the LLM API. reasoning_content was not in this whitelist, so it was silently dropped. The CLI path (run_agent.py) is unaffected because it has its own _copy_reasoning_content_for_api() logic that operates on raw message dicts without going through this filter. This is why the same session works from CLI but fails from WebUI with HTTP 400. The fix adds 'reasoning_content' to _API_SAFE_MSG_KEYS so the field passes through sanitization intact.	2026-05-14 02:29:17 +08:00
Lumen Yang	3289c44fb6	fix: refresh context ring after compression	2026-05-13 14:02:28 +02:00
Frank Song	9ea4f1145d	Fix stale stream exception writeback guards	2026-05-13 10:23:03 +08:00
Hermes Agent	20717a0d0a	Merge pull request #2136 into stage-345 fix: guard stale stream writebacks (LumenYoung) Prevents stale WebUI stream workers from writing old results into a session after that session has already moved on to another stream. Adds new helper _stream_writeback_is_current() (a token equality check against the session's active_stream_id) and short-circuits the two finalize/cancel paths when the worker no longer owns the session writeback.	2026-05-12 23:11:48 +00:00
Jordan SkyLF	112eadc209	fix: address cancelled turn review feedback - classify string-only CancelledError payloads as cancelled - centralize cancel marker substring matching - add targeted regression coverage	2026-05-12 15:43:36 -07:00
Lumen Yang	4b57b202a0	fix: guard stale stream writebacks	2026-05-13 00:05:09 +02:00
Jordan SkyLF	e4d16e93c7	fix: clarify cancelled chat turn status	2026-05-12 13:26:49 -07:00
Hermes Agent	a06952ab00	Merge pull request #2140 into stage-344 Preserve fallback provider credential hints (closes #2133) # Conflicts: # CHANGELOG.md	2026-05-12 16:12:54 +00:00
Frank Song	76e611d49f	Preserve fallback provider credential hints	2026-05-12 20:42:55 +08:00
Michael Lam	265496782a	docs: clarify compression anchor helpers	2026-05-12 01:43:16 -07:00
nesquena-hermes	d75b59135a	stage-341: apply Opus SHOULD-FIX (it i18n + short-circuit logger.debug + docstring) Opus advisor pass on stage-341 found three surgical items: 1. static/i18n.js:it — PR #2064 branched before stage-340 landed the 'it' locale (#2067), missing 9 session_worktree keys. Mechanical mirror of en/ja position. Italian falls back to English silently without this fix. 2. api/streaming.py — PR #2107's new break short-circuit was silent in both the aux and agent title-generation paths. Added logger.debug calls before each break so production logs surface the exit shape. 3. api/streaming.py — Expanded _title_should_skip_remaining_attempts docstring to document the membership criterion explicitly (vs the implicit reasoning-only-burn case it ships with today). Future additions (llm_safety_blocked, llm_oauth_quota) have a clear inclusion test. CHANGELOG updated under the Stage-341 maintainer fixes section to mirror the stage-340 pattern. All targeted tests pass (57/57 in the affected modules).	2026-05-12 00:16:33 +00:00
nesquena-hermes	e20eb2c784	fix: skip budget-doubling title retry for reasoning-only responses (#2083 ) Reasoning models (Qwen3-thinking via LM Studio, DeepSeek-R1, Kimi-K2, etc.) can burn their entire output budget on hidden reasoning tokens and emit no visible content. The previous title-generation retry path classified that as llm_length and doubled the budget — but the second call produces the same shape, so the retry only doubled the GPU/credit burn. Repeated across the two prompts in _title_prompts() this came to ~3000 reasoning tokens of GPU work per new chat. On local LM Studio servers behind a custom: provider (where is_lmstudio=False means reasoning_effort: none never reaches the model) it manifested as the GPU never going idle after a prompt. Fix: - _extract_title_response: classify reasoning-bearing empty responses as llm_empty_reasoning regardless of finish_reason. The presence of reasoning_content is the diagnostic signal, not finish_reason. - _title_retry_status: drop llm_empty_reasoning from the retry set. Length-truncated responses WITHOUT reasoning still retry (those are legitimately recoverable by a larger budget). - Add _title_should_skip_remaining_attempts() and break out of the prompt-iteration loop on empty-reasoning. A second prompt against the same model would produce the same shape. - Falls through to _fallback_title_from_exchange for a local-summary title. Tests updated to invert the previous reasoning-retry assertions: - test_aux_short_circuits_on_empty_reasoning_without_retrying - test_aux_still_retries_finish_length_without_reasoning - test_agent_route_short_circuits_on_empty_reasoning_without_retrying - test_agent_route_still_retries_finish_length_without_reasoning Companion agent-side work (LM Studio classifier for custom: providers) is tracked separately on the hermes-agent side; this WebUI fix is the belt-and-braces guard so the loop stops regardless of agent classifier state. Reported by @darkopetrovic. Closes #2083. Co-authored-by: darkopetrovic <darkopetrovic@users.noreply.github.com> (cherry picked from commit `efeae4a86e`)	2026-05-12 00:04:11 +00:00
nesquena-hermes	fd069155af	Merge PR #2062 into stage-339 feat: record turn journal lifecycle events by @ai-ag2026	2026-05-11 17:43:58 +00:00
nesquena-hermes	6a016dae6c	Merge PR #2077 into stage-338 Refactor compression anchor visibility helpers by @franksong2702	2026-05-11 17:17:25 +00:00
ai-ag2026	c864ad47af	fix: address turn journal lifecycle review	2026-05-11 17:16:43 +02:00
Frank Song	18124ced62	Refactor compression anchor visibility helpers	2026-05-11 20:56:30 +08:00
Frank Song	a0e9c06102	Fix HERMES_HOME skill cache patching	2026-05-11 19:12:02 +08:00
ai-ag2026	4b486f2860	feat: record turn journal lifecycle events	2026-05-11 09:13:25 +02:00
Frank Song	5a445e7562	Fix duplicate assistant transcript merge	2026-05-11 13:09:16 +08:00
nesquena-hermes	97b283c5a4	Merge PR #2039 into stage-335	2026-05-11 00:25:07 +00:00
ai-ag2026	2ead7daa2f	fix: expose active run lifecycle in health	2026-05-11 02:15:00 +02:00

1 2 3 4

198 Commits