2703 Commits

Author SHA1 Message Date
nesquena-hermes e35c94bf55 Stage 393: PR #2615 2026-05-20 22:23:53 +00:00
nesquena-hermes f3b8d57c99 Merge pull request #2652 from nesquena/release/stage-392
Release v0.51.99 (Release BW / stage-392 / 5-PR batch)
v0.51.99
2026-05-20 15:07:40 -07:00
nesquena-hermes 7c7ae8ead2 Stamp CHANGELOG for v0.51.99 (Release BW / stage-392 / 5-PR batch) 2026-05-20 21:48:56 +00:00
nesquena-hermes aaf30b7b0a Stage 392: PR #2643 2026-05-20 21:48:04 +00:00
nesquena-hermes fa459aa01e Stage 392: PR #2651 2026-05-20 21:48:04 +00:00
nesquena-hermes b4a00b5aae Stage 392: PR #2650 2026-05-20 21:48:04 +00:00
nesquena-hermes dc0c833744 Stage 392: PR #2647 2026-05-20 21:48:04 +00:00
nesquena-hermes 6ed66daac2 Stage 392: PR #2638 2026-05-20 21:48:04 +00:00
Lumen Yang 71fbc796b2 fix: dedupe replayed context tail after compression 2026-05-20 23:15:54 +02:00
nesquena-hermes 329a7fa6f3 Merge pull request #2649 from nesquena/release/stage-391
Release v0.51.98 (Release BV / stage-391 / 1-PR follow-on)
v0.51.98
2026-05-20 13:43:53 -07:00
nesquena-hermes 1bf905a0cc Stamp CHANGELOG for v0.51.98 (Release BV / stage-391 / 1-PR follow-on) 2026-05-20 20:40:48 +00:00
nesquena-hermes 2403e7cd2b Stage 391: PR #2640 2026-05-20 20:40:30 +00:00
starship-s 153e035d12 fix: forward title generation api key 2026-05-20 14:39:38 -06:00
Colin Chang 9c3e37d2ee fix: custom_providers models allowlist takes priority over live /v1/models fetch
Custom providers that have a curated models: list in config.yaml
(e.g. ZenMux gateways) should show ONLY those configured models in
the picker dropdown, not the full /v1/models catalog.

Before this fix, _named_custom_groups unconditionally called
_read_custom_endpoint_models() which would pull hundreds of models
from aggregator gateways and overwrite the user's curated list.

Now the build checks if the custom_provider entry has a non-empty
models dict/list in config.yaml — if so, it skips the live fetch
and uses only the configured models (same behavior as hermes-agent
model_switch.py Section 4 patch).

Closes: configure-model-list-should-be-authoritative
2026-05-20 20:22:11 +00:00
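The allowlist rule described in commit 9c3e37d2ee can be sketched roughly as follows. This is a minimal illustration, not the project's code: `resolve_custom_provider_models` and the `fetch_live` callback are hypothetical names standing in for the logic inside `_named_custom_groups` and for `_read_custom_endpoint_models()`, and the shape of the config entry is assumed.

```python
from typing import Callable, List, Mapping


def resolve_custom_provider_models(
    entry: Mapping[str, object],
    fetch_live: Callable[[], List[str]],
) -> List[str]:
    """Return the curated models if any are configured, else the live catalog.

    `entry` is one custom_providers item from config.yaml (assumed shape).
    `fetch_live` is a hypothetical stand-in for the live /v1/models fetch.
    """
    configured = entry.get("models")
    if isinstance(configured, dict):
        names = [str(name) for name in configured]  # dict form: keys are model ids
    elif isinstance(configured, (list, tuple)):
        names = [str(name) for name in configured]  # list form: items are model ids
    else:
        names = []

    if names:
        # A non-empty curated list is authoritative: skip the live fetch entirely.
        return names
    return list(fetch_live())
```

With a non-empty `models:` entry the fetch callback is never invoked, so an aggregator gateway cannot overwrite the curated list; an empty or missing entry falls back to the live catalog.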
nesquena-hermes ba0b4c367f Merge pull request #2648 from nesquena/release/stage-390
Release v0.51.97 (Release BU / stage-390 / 3-PR batch)
v0.51.97
2026-05-20 13:20:32 -07:00
dobby-d-elf 87527ff4f6 Fix state db legacy dedup repeat preservation 2026-05-20 14:18:47 -06:00
nesquena-hermes 6301b0e87b Stamp CHANGELOG for v0.51.97 (Release BU / stage-390 / 2-PR batch) 2026-05-20 20:16:50 +00:00
nesquena-hermes 1e3ca07575 Stage 390: PR #2634
# Conflicts:
#	CHANGELOG.md
2026-05-20 20:16:30 +00:00
nesquena-hermes 495991c2db Stage 390: PR #2642 2026-05-20 20:16:30 +00:00
dobby-d-elf 7742b83062 Merge remote-tracking branch 'origin/master' into tool-tooltip-fix 2026-05-20 14:12:29 -06:00
Arsh Kumar Singh 2253cf5a32 chore: address review notes — dedup comment and 409-path clarification 2026-05-20 19:57:20 +00:00
Michael Lam 6e64068f0f fix: cap CLI session sidebar state scans 2026-05-20 12:47:03 -07:00
nesquena-hermes 6c60925a54 Merge pull request #2644 from nesquena/release/stage-389
Release v0.51.96 (Release BT / stage-389 / 8-PR batch)
v0.51.96
2026-05-20 11:25:15 -07:00
nesquena-hermes 7c2d56c920 Stage 389 follow-up: close TOCTOU race in pin-cap (Opus advisor #2614) 2026-05-20 18:12:38 +00:00
nesquena-hermes 2b5a960df2 Stamp CHANGELOG for v0.51.96 (Release BT / stage-389 / 8-PR batch) 2026-05-20 16:43:15 +00:00
nesquena-hermes 360a57164a Stage 389: PR #2627
# Conflicts:
#	CHANGELOG.md
2026-05-20 16:41:45 +00:00
nesquena-hermes dd36d09f89 Stage 389: PR #2626
# Conflicts:
#	CHANGELOG.md
2026-05-20 16:41:45 +00:00
nesquena-hermes 3d34eef02d Stage 389: PR #2620 2026-05-20 16:41:45 +00:00
nesquena-hermes 84f6bf5323 Stage 389: PR #2619
# Conflicts:
#	CHANGELOG.md
2026-05-20 16:41:45 +00:00
nesquena-hermes 4d8e1ccc10 Stage 389: PR #2618 2026-05-20 16:41:44 +00:00
nesquena-hermes eaff4d0b8e Stage 389: PR #2614
# Conflicts:
#	CHANGELOG.md
2026-05-20 16:41:44 +00:00
nesquena-hermes 3bcd81b79f Stage 389: PR #2612
# Conflicts:
#	CHANGELOG.md
2026-05-20 16:41:44 +00:00
nesquena-hermes 9c564ccc1b Stage 389: PR #2610 2026-05-20 16:40:42 +00:00
Arsh Kumar Singh d385db69d5 fix(clarify): require stable clarify_id and wait for backend ack so stale responses are rejected
The WebUI clarification popup had a response-delivery failure: users
submitted answers in the popup, but the agent still fell through to the
timeout fallback message.  Three bugs conspired:

1. No stable clarify_id — _ClarifyEntry had no unique identifier, so
   the frontend could not reference a specific pending prompt.  The
   backend used FIFO resolution which silently failed for stale/late
   responses.

2. Frontend hid the card before confirmation — respondClarify() called
   hideClarifyCard(true, 'sent') BEFORE the API call completed.  If the
   backend rejected the response, the card was already gone and the
   user's draft was discarded.

3. Backend lied about success — _resolve_clarify_legacy() returned
   bool(resolved) or not bool(clarify_id).  Since the frontend never
   sent clarify_id, the backend always reported ok:true even when
   nothing was resolved.

Changes:

api/clarify.py:
- _ClarifyEntry now auto-generates a stable clarify_id (uuid4.hex[:12])
- submit_pending() injects clarify_id into the data dict visible to the
  frontend via SSE and polling
- New resolve_clarify_by_id() for O(1) lookup by id instead of FIFO pop

api/routes.py:
- _resolve_clarify_legacy() uses resolve_clarify_by_id when clarify_id
  is provided; returns actual bool result (no more unconditional True)
- _handle_clarify_respond() returns HTTP 409 + {ok:false, stale:true}
  when resolution fails

static/messages.js:
- respondClarify() now sends clarify_id in the POST body
- Waits for a positive backend acknowledgement before hiding the card
- Saves a draft copy before POST and restores it on failure
- On 409/network error: re-enables controls, shows error toast
- Guards against parallel-SSE race where clearing the cache after a
  successful response could erase a newly queued next prompt (codex P1)

tests:
- Updated test_sprint30.py for new ack-before-hide behaviour
- Updated test_clarify_unblock.py for 409 on stale responses

Closes #2639.
2026-05-20 16:35:15 +00:00
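The backend half of commit d385db69d5 (stable id, lookup by id, honest result, 409 on stale) might look something like the sketch below. It is an assumption-laden illustration: `ClarifyEntry`, `ClarifyRegistry`, and `handle_respond` are hypothetical simplifications of `_ClarifyEntry`, `resolve_clarify_by_id()`, and `_handle_clarify_respond()`, and the legacy FIFO path for requests without a `clarify_id` is deliberately left out.

```python
import threading
import uuid
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class ClarifyEntry:
    question: str
    # Stable identifier generated once per prompt, as in the commit message.
    clarify_id: str = field(default_factory=lambda: uuid.uuid4().hex[:12])
    answer: Optional[str] = None


class ClarifyRegistry:
    """Hypothetical in-memory store of pending clarification prompts."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._pending: Dict[str, ClarifyEntry] = {}

    def submit(self, question: str) -> dict:
        entry = ClarifyEntry(question)
        with self._lock:
            self._pending[entry.clarify_id] = entry
        # The id travels with the payload so the frontend can echo it back.
        return {"question": question, "clarify_id": entry.clarify_id}

    def resolve_by_id(self, clarify_id: str, answer: str) -> bool:
        # Dict pop is an O(1) lookup by id rather than a FIFO pop.
        with self._lock:
            entry = self._pending.pop(clarify_id, None)
        if entry is None:
            return False  # stale or unknown prompt: report failure honestly
        entry.answer = answer
        return True


def handle_respond(registry: ClarifyRegistry, body: dict) -> Tuple[int, dict]:
    """Return (HTTP status, JSON body) for a clarify response."""
    resolved = registry.resolve_by_id(
        str(body.get("clarify_id", "")), str(body.get("response", ""))
    )
    if resolved:
        return 200, {"ok": True}
    return 409, {"ok": False, "stale": True}
```

Because a second submission of the same `clarify_id` gets a 409 rather than `ok: true`, a frontend that waits for the acknowledgement before hiding the card can keep the user's draft and re-enable the controls on failure.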
Michael Lam 6eb5d939d7 test: allow custom provider settings filter 2026-05-20 09:33:51 -07:00
Dennis Soong cec435a833 fix(session): rebuild missing startup index 2026-05-20 23:43:30 +08:00
dobby-d-elf 4c8914304b fix: keep compact tool activity grouped
Compact tool activity regressed into separate Activity rows and standalone Thinking blurbs when interim assistant text retired the current live activity group and Thinking rendered outside the disclosure.

Render Compact-mode Thinking inside the shared Activity body for live and settled turns, keep interim assistant text from splitting the current Activity group, and remove the now-unused stream-local activity-close path. This restores the intended single compact disclosure without adding new functionality.
2026-05-20 08:29:46 -06:00
Michael Lam 8ef8fae831 fix: show config-managed custom providers 2026-05-20 06:27:00 -07:00
Isla Liu 98106c809b docs(session): clarify lazy retry trigger for metadata-only polling 2026-05-20 20:55:08 +08:00
Isla Liu 37c3e84ad2 test(session): cover lazy journal retry give-up paths 2026-05-20 20:55:08 +08:00
Isla Liu 2a303de2a3 fix(session): preserve retry budget while journal is still arriving 2026-05-20 20:55:07 +08:00
Isla Liu d5a185d9c6 fix(session): serialize lazy journal retry per session 2026-05-20 20:48:38 +08:00
Michael Lam 680d0cbc92 docs(runtime): define runner backend harness gate 2026-05-20 04:05:36 -07:00
Michael Lam c3eafa34f8 fix: surface custom provider model endpoint errors 2026-05-20 03:12:33 -07:00
manji ff0aa69d5f fix(session): use second-level timestamp granularity in legacy dedup key
The _normalized_message_timestamp_for_key helper was preserving
microsecond precision (%.6f). When the same message is persisted by
both the WebUI sidecar JSON writer and the Hermes agent state.db
writer, their timestamps can differ by a few microseconds, causing
_session_message_merge_key to produce different keys for the same
logical message and letting both copies survive the dedup pass in
merge_session_messages_append_only.

Truncating to second-level granularity collapses sub-second drift to
the same key, so the duplicate is suppressed correctly.

Fixes #2616
2026-05-20 07:13:55 +00:00
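The effect of the granularity change in commit ff0aa69d5f can be shown with a small sketch. The key layout here (role, content, timestamp) and the function names are assumptions for illustration; the real `_session_message_merge_key` may include other fields.

```python
from typing import Dict, List, Tuple


def normalized_timestamp_for_key(ts: float) -> str:
    # Truncate to whole seconds so sub-second drift between two writers
    # collapses to the same key component (previously formatted with %.6f).
    return str(int(ts))


def merge_key(msg: Dict[str, object]) -> Tuple[str, str, str]:
    # Hypothetical key layout: role + content + normalized timestamp.
    return (
        str(msg["role"]),
        str(msg["content"]),
        normalized_timestamp_for_key(float(msg["timestamp"])),
    )


def dedupe_append_only(messages: List[Dict[str, object]]) -> List[Dict[str, object]]:
    """Keep the first message for each merge key, preserving order."""
    seen = set()
    kept = []
    for msg in messages:
        key = merge_key(msg)
        if key in seen:
            continue
        seen.add(key)
        kept.append(msg)
    return kept
```

One caveat of plain truncation: two writes that straddle a second boundary (for example x.999999 and x+1.000001) still produce different keys, so this suppresses drift within a second, not all drift.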
Michael Lam 471b75d762 docs: move Hermes overview out of agent context root 2026-05-19 23:55:58 -07:00
Lumen Yang b2c6af12f1 fix(webui): prefer sidecar counts over stale session index 2026-05-20 05:42:55 +00:00
Isla Liu 1957785332 fix(session): address Copilot round-2 review — correct stale comment and drop unused fixture arg
Two non-functional cleanups from the second Copilot pass:

1. The inline comment in `test_error_marker_no_preserved_as_draft`
   said the legacy "user message above was preserved" wording was used
   for the post-retry-give-up case.  The actual implementation demotes
   give-up markers to a different neutral wording ("Partial output may
   have been lost.").  Comment rewritten to match the contract.

2. The regression test `test_lost_response_recovered_on_second_read`
   declared a `monkeypatch` parameter it never used.  Dropped.
2026-05-20 13:08:08 +08:00
Isla Liu 9870e8f111 fix(session): address Copilot review — scope tool-card dedupe by stream id + tighten docs
Four code-review comments from the automated Copilot reviewer on this PR:

1. `_journal_tool_already_present` dedupe was session-wide, so a
   legitimately-repeated tool (e.g. a second `terminal: ls` in an
   earlier turn) could cause the retry path to falsely skip
   materializing the recovered tool card.  The helper now takes a
   keyword `stream_id` argument; when supplied, a tool card whose
   `_recovered_stream_id` is set AND differs from the candidate is no
   longer treated as a duplicate.  Untagged tool cards (live tools, or
   tool cards carried over from a pre-tagging core transcript) still
   match, preserving the existing 'core transcript already has this
   tool, don't duplicate' invariant.  Two new tests in
   `TestJournalToolDedupeScoping` cover both legs of the rule.

2./3. The troubleshooting FAQ pointed at `~/.hermes/webui/sessions/session_<sid>.json`
   and `~/.hermes/_run_journal/...`.  The actual sidecar filename has
   no `session_` prefix and the run-journal lives under the WebUI
   sessions dir (`~/.hermes/webui/sessions/_run_journal/<sid>/<stream>.jsonl`,
   default).  Both paths fixed and an explicit note added about
   `HERMES_WEBUI_STATE_DIR` overriding the state root.

4. Drop unused `json` / `queue` / `Path` imports from
   `tests/test_session_lost_response_regression.py` so the file stops
   carrying noise that future linting would flag.
2026-05-20 12:18:03 +08:00
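The scoping rule from item 1 of commit 9870e8f111 can be sketched as below. Only the keyword `stream_id` argument and the `_recovered_stream_id` tag come from the commit message; the message shape (`name`, `args`) and the exact signature are assumptions made for the example.

```python
from typing import Dict, List, Optional


def journal_tool_already_present(
    messages: List[Dict[str, object]],
    tool_name: str,
    tool_args: object,
    *,
    stream_id: Optional[str] = None,
) -> bool:
    """Return True if an equivalent tool card already exists in the transcript.

    A card tagged with a different `_recovered_stream_id` than the candidate
    is not counted as a duplicate; untagged cards always match.
    """
    for msg in messages:
        if msg.get("name") != tool_name or msg.get("args") != tool_args:
            continue
        tagged = msg.get("_recovered_stream_id")
        if stream_id is not None and tagged is not None and tagged != stream_id:
            # Same tool call, but recovered for another stream: a legitimate repeat.
            continue
        return True
    return False
```

Calling the helper without `stream_id` keeps the older session-wide behavior, which is why the keyword is optional in this sketch.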
Isla Liu 66b6d8f019 docs(session): CHANGELOG entry + troubleshooting FAQ for the lost-response self-heal
CHANGELOG: append an Unreleased / Fixed entry describing the user-visible
behaviour change (interrupted-turn marker now self-heals on the next
session read; gives up gracefully after 12 retries or 24h).

docs/troubleshooting.md: add a 'Symptom → Why → Diagnostic → Fix →
Caps → When to file a bug' entry for the
'no agent output was recovered' marker so users who hit the lost-response
shape on WSL2 / network FS can recognise it, verify the run-journal on
disk, and know that reloading the session is enough.
2026-05-20 11:59:06 +08:00
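The give-up caps mentioned in commit 66b6d8f019 (12 retries or 24h) amount to a simple budget check. The sketch below is hypothetical: the constant and function names are invented for illustration, and whether the real boundaries are inclusive is an assumption.

```python
MAX_RECOVERY_RETRIES = 12               # cap from the CHANGELOG entry
MAX_RECOVERY_AGE_SECONDS = 24 * 60 * 60  # 24h cap, expressed in seconds


def should_retry_recovery(attempts: int, marker_created_at: float, now: float) -> bool:
    """Return True while the interrupted-turn marker is still worth retrying.

    `attempts` counts retries already made; both timestamps are Unix seconds.
    """
    if attempts >= MAX_RECOVERY_RETRIES:
        return False  # retry budget exhausted: give up gracefully
    if now - marker_created_at >= MAX_RECOVERY_AGE_SECONDS:
        return False  # marker is too old for the journal to still arrive
    return True
```

A session read would call this before attempting another run-journal lookup, and switch the marker to its neutral give-up wording once it returns False.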