Per deep-review verdict SHIP-WITH-FIXES on PR #2636:
1. Profile-switch reconciliation: _refreshProfileSwitchBackground now re-fetches
/api/settings and re-applies hidden_tabs for the new profile. Without this,
Profile A's hidden-tabs choice stayed in effect under Profile B until the
user opened Settings → Appearance.
2. A11y: switched chips from role=button + aria-pressed to role=switch +
aria-checked. The pressed/not-pressed wording confused screen-reader users
because chip-off looks like the off state. Added role=group +
aria-labelledby on the container, and a :focus-visible style on the chips.
3. Server-side belt-and-suspenders: api/config.py now strips 'chat' and
'settings' from hidden_tabs at validation time, matching the client's apply-
time filter. A tampered POST can no longer persist the forbidden values.
3 new regression tests added (chat/settings rejection, profile-switch wiring,
chip a11y attributes).
Co-authored-by: FrancescoFarinola <francesco.farinola@example.com>
Custom providers that have a curated models: list in config.yaml
(e.g. ZenMux gateways) should show ONLY those configured models in
the picker dropdown, not the full /v1/models catalog.
Before this fix, _named_custom_groups unconditionally called
_read_custom_endpoint_models() which would pull hundreds of models
from aggregator gateways and overwrite the user's curated list.
Now the build checks if the custom_provider entry has a non-empty
models dict/list in config.yaml — if so, it skips the live fetch
and uses only the configured models (same behavior as hermes-agent
model_switch.py Section 4 patch).
Closes: configure-model-list-should-be-authoritative
The WebUI clarification popup had a response-delivery failure: users
submitted answers in the popup, but the agent still fell through to the
timeout fallback message. Three bugs conspired:
1. No stable clarify_id — _ClarifyEntry had no unique identifier, so
the frontend could not reference a specific pending prompt. The
backend used FIFO resolution which silently failed for stale/late
responses.
2. Frontend hid the card before confirmation — respondClarify() called
hideClarifyCard(true, 'sent') BEFORE the API call completed. If the
backend rejected the response, the card was already gone and the
user's draft was discarded.
3. Backend lied about success — _resolve_clarify_legacy() returned
bool(resolved) or not bool(clarify_id). Since the frontend never
sent clarify_id, the backend always reported ok:true even when
nothing was resolved.
Changes:
api/clarify.py:
- _ClarifyEntry now auto-generates a stable clarify_id (uuid4.hex[:12])
- submit_pending() injects clarify_id into the data dict visible to the
frontend via SSE and polling
- New resolve_clarify_by_id() for O(1) lookup by id instead of FIFO pop
api/routes.py:
- _resolve_clarify_legacy() uses resolve_clarify_by_id when clarify_id
is provided; returns actual bool result (no more unconditional True)
- _handle_clarify_respond() returns HTTP 409 + {ok:false, stale:true}
when resolution fails
static/messages.js:
- respondClarify() now sends clarify_id in the POST body
- Waits for a positive backend acknowledgement before hiding the card
- Saves a draft copy before POST and restores it on failure
- On 409/network error: re-enables controls, shows error toast
- Guards against parallel-SSE race where clearing the cache after a
successful response could erase a newly queued next prompt (codex P1)
tests:
- Updated test_sprint30.py for new ack-before-hide behaviour
- Updated test_clarify_unblock.py for 409 on stale responses
Closes#2639.
The _normalized_message_timestamp_for_key helper was preserving
microsecond precision (%.6f). When the same message is persisted by
both the WebUI sidecar JSON writer and the Hermes agent state.db
writer, their timestamps can differ by a few microseconds, causing
_session_message_merge_key to produce different keys for the same
logical message and letting both copies survive the dedup pass in
merge_session_messages_append_only.
Truncating to second-level granularity collapses sub-second drift to
the same key, so the duplicate is suppressed correctly.
Fixes#2616
Four code-review comments from the automated Copilot reviewer on this PR:
1. `_journal_tool_already_present` dedupe was session-wide, so a
legitimately-repeated tool (e.g. a second `terminal: ls` in an
earlier turn) could cause the retry path to falsely skip
materializing the recovered tool card. The helper now takes a
keyword `stream_id` argument; when supplied, a tool card whose
`_recovered_stream_id` is set AND differs from the candidate is no
longer treated as a duplicate. Untagged tool cards (live tools, or
tool cards carried over from a pre-tagging core transcript) still
match, preserving the existing 'core transcript already has this
tool, don't duplicate' invariant. Two new tests in
`TestJournalToolDedupeScoping` cover both legs of the rule.
2./3. The troubleshooting FAQ pointed at `~/.hermes/webui/sessions/session_<sid>.json`
and `~/.hermes/_run_journal/...`. The actual sidecar filename has
no `session_` prefix and the run-journal lives under the WebUI
sessions dir (`~/.hermes/webui/sessions/_run_journal/<sid>/<stream>.jsonl`,
default). Both paths fixed and an explicit note added about
`HERMES_WEBUI_STATE_DIR` overriding the state root.
4. Drop unused `json` / `queue` / `Path` imports from
`tests/test_session_lost_response_regression.py` so the file stops
carrying noise that future linting would flag.
When the WebUI process restarts mid-stream and sidecar repair runs while
the run-journal for the dead stream is not yet visible on disk (WSL2 9p
/ DrvFs page-cache loss, un-fsynced journal tail on network FS, …),
`_append_journaled_partial_output()` returns False and the marker is
permanently baked with the "no agent output was recovered" wording even
though the journaled tokens appear on disk shortly afterwards.
This commit reframes the recovery contract so the read side can
self-heal:
* `_interrupted_recovery_marker` gains a `pending_retry=True` mode
that produces a third wording ("Recovering the partial output …
reload this session to retry.") and stamps a
`_pending_journal_recovery` flag.
* `_apply_core_sync_or_error_marker` now writes that pending-retry
marker (with `_journal_retry_stream_id`,
`_journal_retry_attempts`, `_journal_retry_first_seen_ts` meta)
whenever it cannot recover visible output AND the stream id is
known. The legacy "no output" wording is reserved for the
no-stream-id case. The core-sync branch leaves marker emission to
the existing visible-output check (the core transcript itself is the
canonical history in that branch).
* A new `_retry_journal_recovery_in_place(session)` helper re-runs
`_append_journaled_partial_output(…, dedupe_existing=True)` for the
latest pending marker. On success the marker is promoted in place to
the recovered-output wording, the journaled rows are reordered to
sit above the marker (preserving chronological order), and all
retry meta is stripped. On failure attempts is incremented; after
_JOURNAL_RETRY_MAX_ATTEMPTS (12) or _JOURNAL_RETRY_GIVEUP_SECONDS
(24h) the marker is demoted to a neutral "Partial output may have
been lost." wording.
* `get_session()` cheaply short-circuits via
`_session_has_pending_journal_retry()` and invokes the helper on
both cache-hit and cold-load paths when a pending marker is found.
`metadata_only=True` skips the helper to keep sidebar refresh
cheap. The retry call runs OUTSIDE the SESSIONS LOCK to avoid a
deadlock with `session.save()` write paths.
No streaming write path or run_journal fsync behaviour is changed — the
fix is read-side only.