hermes-webui

gooneyryan/hermes-webui

Fork 0

mirror of https://github.com/nesquena/hermes-webui.git synced 2026-05-25 19:20:16 +00:00

Commit Graph

Author SHA1 Message Date

Author	SHA1	Message	Date
nesquena-hermes	58ad315dca	v0.50.216: compression chains, renderer fixes, HTML preview, approval z-index, /steer fix, reasoning chip (#1075 ) * fix(workspace): add .html/.htm to MIME_MAP so HTML preview renders correctly MIME_MAP was missing entries for .html and .htm. The server fell back to Content-Type: application/octet-stream, which browsers refuse to render as HTML in an iframe — causing a blank white preview. The rest of the pipeline was already correct: the iframe exists in static/index.html, openFile() in static/workspace.js routes .html to showPreview('html'), and _handle_file_raw() in api/routes.py sets the correct CSP sandbox header when ?inline=1 is present. The only missing piece was the MIME type. * test(workspace): lock in MIME_MAP entry for .html/.htm PR #1070 added .html/.htm → text/html to MIME_MAP in api/config.py to fix the blank workspace HTML preview iframe. Without a direct assertion on the MIME_MAP entries, the fix could silently regress (the existing test_779_html_preview.py tests cover the iframe wiring, the inline=1 query handling, and the CSP sandbox header — but none of them touch MIME_MAP itself). Add a single regression test that asserts MIME_MAP['.html'] and MIME_MAP['.htm'] are both 'text/html' so any future removal of those entries fails CI immediately. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(composer): raise .approval-card.visible z-index above .queue-card .queue-card has z-index:2. .approval-card.visible had no z-index, so the queue flyout would render on top of the approval card when both were visible simultaneously — obscuring the Allow/Deny buttons. Fix: add z-index:3 to .approval-card.visible so approvals always render above the queue flyout. Approval is a blocking, security-relevant interaction and must never be obscured by passive UI elements. * test(composer): pin approval-card z-index > queue-card invariant PR #1071 raises .approval-card.visible to z-index:3 so the security- relevant Allow / Deny buttons stay clickable when the queue flyout is also open. Without a regression test, a future CSS edit could silently drop the z-index back below queue-card (z-index:2) and reintroduce the bug — there is no automated UI test covering this stacking interaction. Add a focused regex check that pins the invariant: .approval-card.visible z-index must be strictly greater than .queue-card z-index. Modeled on the existing CSS-regex regression style in tests/test_mobile_layout.py (test_profile_dropdown_not_clipped_by_overflow). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix: intercept /steer /interrupt /queue before busy-mode routing in send() Root cause: slash commands entered while the agent is busy never reached the command dispatcher. send() enters the busy block and returns early at line ~50, so the slash-command intercept (~line 56) is never reached. The text was queued as a plain message. When it drained after the turn ended, cmdSteer / cmdInterrupt ran on an idle session, saw no active stream, and showed "No active task to stop." Fix: at the top of the busy block, before checking busyMode, check if the text starts with / and is one of the three control commands. If so, dispatch the handler immediately and return. This lets the user type /steer, /interrupt, or /queue at any time — including while the agent is mid-stream — and have them execute against the live session. Two new regression tests added: - test_slash_commands_intercepted_before_busymode_routing: verifies the intercept appears before the busyMode routing in the busy block - test_steer_intercept_calls_handler_directly: verifies the intercept calls _bc.fn(_pc.args) and returns, not queues * test(busy-intercept): pin sync input-clear before await in slash intercept PR #1072's intercept clears the msg input before awaiting the handler. Order matters: if the await happens first (or if the clear is moved inside the handler), the input still shows '/steer foo' for the duration of the await. A reflexive second Enter press during that window — common while waiting for the toast — re-runs send(): either re-fires the handler (double-steer) or, if the turn just ended, falls through to the non-busy slash dispatcher and drops a confusing "No active task to stop." Add test_steer_intercept_clears_input_before_await pinning the order so this UX invariant cannot silently regress. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix: update steer i18n and settings copy — steer no longer interrupts With the real /steer implementation (agent.steer() via /api/chat/steer), steer injects a correction mid-turn WITHOUT interrupting the current stream. The previous copy said "falls back to interrupt", "Steer (interrupt + send)", etc. — accurate only for the old placeholder, not the real implementation. Changes across all 6 locales (en/ru/es/de/zh/zh-Hant): cmd_steer: "falls back to interrupt" removed settings_busy_input_mode_steer: "interrupt + send" → "mid-turn correction" cmd_steer_fallback: "interrupted" → "queued for next turn" busy_steer_fallback: "interrupted instead" → "queued for next turn" settings_desc_busy_input_mode: "currently falls back to interrupt" removed Also: static/index.html: inline fallback text updated to match static/commands.js: internal comment clarified (fallback = queue+cancel, not "interrupt mode" which implies the primary action) * fix(renderer): group consecutive blockquote lines into single element Root cause: the old rule `s.replace(/^> (.+)$/gm, ...)` had three bugs: 1. `.+` required at least one character — bare `>` lines (blank continuation lines) did not match and passed through as literal `>` 2. Each matching line became its own `<blockquote>` element — a 10-line blockquote produced 10 stacked `<blockquote>` tags with no grouping 3. When a fenced code block sat inside a blockquote, the fence-stash pass consumed the code content and left orphaned `>` lines that the old `.+` pattern could not match Fix: replace the single-line regex with a group-based approach that matches one or more consecutive `>` lines as a single block, strips the `>` prefix from each line, passes each non-empty line through inlineMd(), turns blank `>` lines into `<br>`, and wraps the entire group in one `<blockquote>`. 14 regression tests added covering: - Single-line blockquotes (regression) - Multi-line grouping (2 and 10 lines) - Two separate blockquotes staying separate - Bare `>` and `>text` (no space) edge cases - Blank continuation lines → <br> - Bold / italic / inline-code inside blockquotes - Blockquote followed by normal paragraph * fix(renderer): drop empty trailing line from blockquote match The new group-based blockquote rule introduced in this PR captures the trailing newline in its (?:\n\|$) clause. After block.split('\n') that trailing newline produces an empty final element. The original filter only dropped lone bare '>' artifacts on the last line, so the empty final element survived, and the .map(blank → '<br>') step turned it into a phantom <br> immediately before </blockquote>. Visible symptom: any blockquote whose source ends with \n (the common case — a quote followed by another paragraph or end-of-message) renders with an extra blank line at the bottom of the quote. Reproducer: '> Hello\n\nThe rest of the message.' → '<blockquote>Hello\n<br></blockquote>\nThe rest of the message.' ^^^ phantom <br> Fix: replace the single-line filter with a while-loop that pops trailing lines while they are either empty OR a bare '>'. This matches the intent the Python test mirror in tests/test_blockquote_rendering.py already had (the mirror was correct; the JS was not — that's why the original tests passed despite the bug). Also add four new regression tests in TestNoPhantomTrailingBr that pin the no-trailing-<br> invariant for the common shapes: - input ending with \n - quote followed by paragraph (the real-world case) - multi-line quote ending with \n - quote with blank continuation + trailing \n (internal <br> stays, trailing <br> does not) Verified end-to-end with node against the actual JS regex. 244 renderer-adjacent tests pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(renderer): comprehensive markdown fixes — strikethrough, task lists, CRLF, nested blockquotes Five additional fixes on top of the blockquote grouping from the initial commit: 1. CRLF normalisation: strip \r\n → \n at start of renderMd so Windows line endings do not produce stray \r characters in rendered output 2. Strikethrough: ~~text~~ → <del>text</del> in both inlineMd() (for use inside blockquotes/lists) and the outer pass (for plain paragraphs). Added <del> to SAFE_TAGS and SAFE_INLINE so it is not HTML-escaped. 3. Task lists: - [x] / - [ ] items in unordered lists render as ✅/☐ via task-done/task-todo span wrappers. Checks [X] (uppercase) too. 4. Nested blockquotes: >> / >>> etc. now recurse so each level gets its own <blockquote> element rather than passing through as literal >. Implemented by extracting the blockquote rule into _applyBlockquotes() which calls itself recursively on the stripped inner content. 5. Lists inside blockquotes: > - item now renders <ul><li> inside the blockquote instead of a literal "- item" string. Task list items work inside blockquotes too (> - [x] done → ✅ inside <blockquote><ul>). Also fixed test_issue342.py search window (5000→10000 chars) — the CRLF strip at the top of renderMd pushed the autolink regex past the old limit. 68 new tests in test_renderer_comprehensive.py + test_blockquote_rendering.py covering all constructs, edge cases, and combinations. * fix(renderer): restore space in blockquote prefix-strip regex Commit `04e7b53` changed the blockquote prefix-strip regex from /^>[ \t]?/ (consume "> ", "\t>", or just ">") to /^>[\t]?/ (only consume "\t>" or just ">") The space character was dropped from the character class. Since practically every blockquote an LLM produces is "> " (greater-than followed by a space), this leaves a leading space artifact on every stripped blockquote line. Worse, the leading space breaks the list-detection regex `^(?: )?[-+] ` inside the new `_applyBlockquotes` helper — that regex requires either zero or two leading spaces, never one — so the new "list inside blockquote" feature never fired for the canonical input shape `> - item`. Reproducer (against the actual ui.js via node, before the fix): > Hello world → <blockquote> Hello world</blockquote> ^ phantom leading space > Steps: → <blockquote>Steps: > - one - one > - two - two</blockquote> ^ literal text, NOT a <ul>; lists-in-quote feature broken > - [x] done → blockquote with literal "[x] done", no checkbox span Tests passed despite the bug because tests/test_blockquote_rendering.py and tests/test_renderer_comprehensive.py validate against a Python mirror (`_apply_blockquotes`) whose strip regex is `^>[ \t]?` — i.e. the mirror is correct, the JS is not, and the static-mirror tests can't catch the divergence. Same shape of bug as commit `94d63d0` (phantom <br> in trailing line) where the mirror was right and the JS was wrong. Fix: restore the space character in the strip regex's character class. Add tests/test_renderer_js_behaviour.py — 11 tests that drive the ACTUAL renderMd via node and assert on rendered output for the most common LLM shapes (single-line quote, multi-line quote, list inside quote, task list inside quote, nested >>>, strikethrough inside and outside quote, top-level task list, quote followed by heading, multi-paragraph quote with list, CRLF normalisation). Verified: the buggy regex makes 6 of those 11 tests fail; the corrected regex makes all 11 pass. Suite: 2354 passed, 0 new failures. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Collapse agent session compression chains * Restore upstream changelog entries * fix(agent_sessions): bubble active compression chains to top by tip last_activity The original PR merge kept the chain head's id/title/started_at and overrode id/model/message_count/ended_at/end_reason from the tip — but did NOT override last_activity. Since the projected list is sorted by last_activity DESC and the WebUI sidebar surfaces updated_at = last_activity, an actively-used compression chain whose tip is being edited NOW would sort by the ROOT's old last_activity and fall below recently touched standalone sessions. Reproducer (with the harness against actual code, before the fix): - root: started 30 days ago, last msg 30 days ago - tip: started 28 days ago (parent_session_id=root), last msg 5 seconds ago - standalone: last msg 2 days ago Sidebar order with original PR: [0] standalone (48h ago) [1] active_tip (last_activity=root's 720h ago) ← wrong Sidebar order after fix: [0] active_tip (last_activity=tip's 0h ago) ← correct [1] standalone (48h ago) This matches Hermes Agent's own list_sessions_rich projection at hermes_state.py:903-909, which overrides "last_active" from the tip exactly so that the agent CLI's session list orders the same way. Add ``last_activity`` to the merge-from-tip key list, update the existing test_compression_chain_collapses_to_latest_tip_in_sidebar assertion to expect tip-derived updated_at, and add test_compression_chain_bubbles_to_top_by_tip_activity locking in the bubble-to-top invariant — without this regression test the previous behaviour passed CI because no test exercised the sort order against a mixed set of chains and standalone sessions. The chain head's started_at (created_at) and title remain preserved, so users can still find the conversation by its original date and name. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs: v0.50.216 release notes and version bump Compression chains, renderer fixes, HTML preview, approval z-index, /steer fix. * chore: gitignore local-only review harness directory Adds .local-review/ to .gitignore so renderer drivers, sample inputs, fixture builders, and other reviewer scratch files do not accidentally get committed. Nothing under that path is ever shared in the repo; keeping the entry tracked makes the boundary explicit for any future contributor who creates the directory locally. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Keep reasoning chip visible for None effort * test(reasoning): pin chip render output via node, not just source regex The PR's static checks in test_reasoning_chip_btw_fixes.py validate the shape of _applyReasoningChip (no display='none' literal, the right classList.toggle call exists, the right label literals are in the function body) but pass even if the runtime detail is wrong — for example if `inactive` were inverted, _normalizeReasoningEffort mishandled whitespace, or _formatReasoningEffortLabel returned the wrong literal for an unknown input. Add tests/test_reasoning_chip_js_behaviour.py — 11 tests that drive the actual _applyReasoningChip() via node and assert on the rendered DOM state for each effort value: TestChipAlwaysVisible - empty / null -> "Default" label, inactive=true - "none" -> "None" label, inactive=true - "low"/"high" -> verbatim label, inactive=false TestNormalizationEdgeCases - "NONE" -> normalises to "None" - " none " -> trims and normalises - unknown junk -> falls through visible, never hidden TestTitleAttributeAccessibility - title attribute carries the human-readable label for tooltip / screen-reader use Sanity-checked against master's pre-fix ui.js: 11/11 fail (bug caught). Against this PR's ui.js: 11/11 pass. This pattern (drive the actual JS via node) caught two regex-only regressions in PR #1073 where the Python mirror was correct while the JS was broken. Same protection added here so the chip-visibility contract can't silently break in a future refactor. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs: add #1074 to v0.50.216 changelog, bump test count to 2428 * fix(i18n): restore broken Unicode in Russian and Spanish steer strings Commit `56c7a14` (fix: update steer i18n and settings copy) accidentally stripped the `\u` prefix from Unicode escape sequences in two locales, producing garbled literal hex strings visible to users: Spanish (es): - cmd_steer: correcci00f3n → corrección - cmd_steer_fallback: 2014 en cola → — en cola - busy_steer_fallback: 2014 en cola → — en cola - settings_desc_busy_input_mode: qu00e9, est00e1, correcci00f3n → qué, está, corrección - settings_busy_input_mode_steer: correcci00f3n → corrección Russian (ru): - settings_desc_busy_input_mode: the entire Cyrillic string was replaced with raw 4-hex-char code-points without the \u prefix (041e043f... instead of actual Cyrillic). Decoded: "Определяет поведение при отправке сообщения во время работы агента. Очередь ждёт; Прерывание отменяет и начинает заново; Steer внедряет коррекцию без прерывания." Fix: write the correct characters directly (UTF-8 is the file encoding so embedding them literally is cleaner than \u escapes for long text). All other locales (en, de, zh, zh-Hant) were not affected — confirmed by grepping for bare hex run-ons in the updated file. Verified: node --check static/i18n.js passes; full pytest suite green (2365 passed, 47 skipped). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs: remove duplicate compression chain entry from [Unreleased] --------- Co-authored-by: nesquena-hermes <nesquena-hermes@users.noreply.github.com> Co-authored-by: Nathan Esquenazi <nesquena@gmail.com> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Co-authored-by: Frank Song <franksong2702@gmail.com>	2026-04-25 21:06:31 -07:00
nesquena-hermes	7d1aa2e261	v0.50.209: check-for-updates, workspace toggle, HTML preview, provider categories, queue flyout docs (#1042 ) * feat: add manual 'Check for Updates' button in System settings (#785) Add a 'Check now' button next to the version badge in the System settings section, allowing users to manually trigger an update check at any time without waiting for the automatic periodic check. Changes: - index.html: add button with spinner and status text inline with version badge - panels.js: add checkUpdatesNow() calling /api/updates/check?force=1 with immediate feedback (checking... / up to date / X updates available) - style.css: style the button block and spinner - i18n.js: add 5 new keys (settings_check_now, settings_checking, settings_up_to_date, settings_updates_available, settings_updates_disabled) in all 6 locales (en, ru, es, de, zh, zh-Hant) * fix: sanitize error message in checkUpdatesNow to avoid exposing paths Review feedback: strip filesystem paths from error messages and cap length to prevent internal details leaking into the UI. * fix: fully sanitize error in update check — never expose raw e.message in UI Previous partial fix (`80cdaee`) stripped filesystem paths from e.message but still displayed the JS exception message to users. Per reviewer feedback and project convention (NEVER expose raw e.message in UI), replace with: - A generic user-facing i18n key (settings_update_check_failed) as default - Fallback to API response body error if available (structured, not raw) - Full error logged via console.warn for debugging - Button disable-during-check already confirmed working (try/finally pattern) - settings_update_check_failed key added in all 6 locales * fix(#785): align HTML selectors with CSS and add regression tests - Wrap update button in div#checkUpdatesBlock so CSS selectors apply - Change button class from sm-btn to btn-tiny (matching stylesheet) - Remove inline styles now handled by CSS (#checkUpdatesBlock, .btn-tiny) - Move spinner sizing to CSS class .spinner-xs - Add 4 static tests in test_update_banner_fixes.py: checkUpdatesNow defined, btnCheckUpdatesNow in HTML, CSS selectors exist, i18n key in all locales * feat: 'Keep workspace panel open' toggle in Appearance settings (#999) * feat: categorize providers in setup wizard (#603) - Add 6 new providers: Google Gemini, DeepSeek, Mistral, xAI (Grok), Ollama, LM Studio to the onboarding quick-setup catalog - Group providers into 3 categories: Easy start, Open/self-hosted, Specialized — rendered as <optgroup> in the provider dropdown - Generic base_url save logic (requires_base_url + default_base_url) instead of hardcoded provider checks - i18n keys for category labels in en, ru, es, zh, zh-Hant * ci: re-run tests * fix(tests): prevent reload_config() from overwriting in-memory mock in test_issue644 The test helper _available_models_with_cfg patches cfg in-memory but get_available_models() calls reload_config() when the config file's mtime doesn't match _cfg_mtime. On CI, config.yaml exists so mtime > 0 and _cfg_mtime starts at 0.0, triggering a reload that overwrites the test's mock with on-disk content. Fix: freeze _cfg_mtime to the current config file mtime inside the helper, so reload_config() is not triggered during the test. * fix: correct default model IDs for gemini, xai, deepseek; add specialized provider tests - gemini: gemini-3.1-pro-preview → gemini-2.5-pro-preview - x-ai: grok-4.20 → grok-3 - deepseek: deepseek-chat-v3-0324 → deepseek-chat - Add TestApplyBaseURLSpecialized: 4 tests verifying base_url written for gemini, deepseek, mistral, and x-ai through apply_onboarding_setup * test: add TestApplyBaseURLSpecialized — verify base_url written for gemini, deepseek, mistralai, x-ai * fix(onboarding): correct stale model defaults for specialized providers Three issues in the new specialized provider catalog (#1027 hold reason): 1. gemini default_model was `gemini-2.5-pro-preview` — agent's catalog has the 3.1 family. Updated to `gemini-3.1-pro-preview`. 2. x-ai default_model was `grok-3` — agent's catalog has `grok-4.20`. Updated. 3. gemini `models` list was sourcing from `_PROVIDER_MODELS.get("gemini")` which returns []. The catalog in api/config.py is keyed under "google" (even though the agent's alias map normalizes google -> gemini). Switched to `_PROVIDER_MODELS.get("google")` so the wizard surfaces the actual 5-model list. Also forward-compatible lookup for x-ai (xai or x-ai key). Without these fixes, users picking gemini or x-ai in the wizard would see no model dropdown and the default_model written to config.yaml would 404 on first chat. deepseek default_model bumped from `deepseek-chat` to `deepseek-chat-v3-0324` to match the test fixture's expectation and the agent catalog's pinned version. Added two regression tests: - test_gemini_model_list_is_populated: pins the catalog-key correctness - test_specialized_default_models_match_catalog: pins the version prefixes (3.x for gemini, 4.x for grok) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat: inline HTML preview in workspace panel (#779) Render .html/.htm files as live previews in a sandboxed iframe instead of showing raw source code. Adds an 'Open in browser' button to open the file in a new tab. Changes: - workspace.js: add HTML_EXTS set, 'html' preview mode, iframe routing in openFile(), and openInBrowser() function - index.html: add sandboxed iframe element and 'Open in browser' button in preview toolbar (visible only for HTML files) - i18n.js: add 'open_in_browser' key in all 6 locales The iframe uses sandbox='allow-scripts' for security. Download button remains available alongside the new preview. * docs: document sandbox security tradeoff for HTML preview Review feedback: fileExt() already lowercases extensions so .HTML/.HTM work. Added code comment explaining the deliberate sandbox=allow-scripts choice: scripts are needed for most HTML documents but the iframe is still origin- isolated and cannot access parent cookies/data. * fix: pass ?inline=1 to file/raw so HTML preview iframe renders instead of downloading routes.py: add inline_preview param — bypasses Content-Disposition:attachment for text/html when ?inline=1 is set, serving the file inline for the sandboxed iframe. workspace.js: add &inline=1 to the iframe src URL. test: add 5 static regression tests for the inline HTML preview. * fix(security): CSP sandbox header for inline HTML preview The iframe sandbox="allow-scripts" attribute on previewHtmlIframe only applies when HTML is loaded INSIDE that iframe. A user tricked into opening /api/file/raw?path=evil.html&inline=1 directly in a top-level tab (e.g. via a chat link) would render the HTML in the WebUI's origin without any sandbox, giving the page full access to cookies and localStorage. Server-side Content-Security-Policy: sandbox allow-scripts mirrors the iframe sandbox exactly: scripts run, but the document is treated as a unique opaque origin (no allow-same-origin) and cannot read WebUI cookies, localStorage, or postMessage to the parent regardless of how the URL is accessed. Added test_inline_html_response_sets_csp_sandbox to pin the header. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs: v0.50.209 release notes — 4 PRs, 2212 tests (+43) * docs(changelog): document #1040 queue flyout and Cloudflare CSP in v0.50.209 The stage commit `ed2bd18` listed v0.50.209 as a 4-PR release but the stage actually bundles 5 PRs — #1040 (queue flyout) was cherry-picked in without a corresponding CHANGELOG entry. Without this fix, the queue feature ships silently and the bundled Cloudflare CSP relaxation in api/helpers.py is also undocumented. Adds two entries: - Added: queue flyout (#1040) under v0.50.209 - Changed: CSP allowlist for Cloudflare Access deployments Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: bergeouss <bergeouss@users.noreply.github.com> Co-authored-by: nesquena-hermes <nesquena-hermes@users.noreply.github.com> Co-authored-by: Nathan Esquenazi <nesquena@gmail.com> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-25 14:33:41 -07:00

nesquena-hermes

58ad315dca

v0.50.216: compression chains, renderer fixes, HTML preview, approval z-index, /steer fix, reasoning chip (#1075 )

* fix(workspace): add .html/.htm to MIME_MAP so HTML preview renders correctly

MIME_MAP was missing entries for .html and .htm. The server fell back to
Content-Type: application/octet-stream, which browsers refuse to render as
HTML in an iframe — causing a blank white preview.

The rest of the pipeline was already correct: the iframe exists in
static/index.html, openFile() in static/workspace.js routes .html to
showPreview('html'), and _handle_file_raw() in api/routes.py sets the
correct CSP sandbox header when ?inline=1 is present. The only missing
piece was the MIME type.

* test(workspace): lock in MIME_MAP entry for .html/.htm

PR #1070 added .html/.htm → text/html to MIME_MAP in api/config.py
to fix the blank workspace HTML preview iframe. Without a direct
assertion on the MIME_MAP entries, the fix could silently regress
(the existing test_779_html_preview.py tests cover the iframe wiring,
the inline=1 query handling, and the CSP sandbox header — but none of
them touch MIME_MAP itself).

Add a single regression test that asserts MIME_MAP['.html'] and
MIME_MAP['.htm'] are both 'text/html' so any future removal of those
entries fails CI immediately.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(composer): raise .approval-card.visible z-index above .queue-card

.queue-card has z-index:2. .approval-card.visible had no z-index, so the
queue flyout would render on top of the approval card when both were visible
simultaneously — obscuring the Allow/Deny buttons.

Fix: add z-index:3 to .approval-card.visible so approvals always render
above the queue flyout. Approval is a blocking, security-relevant interaction
and must never be obscured by passive UI elements.

* test(composer): pin approval-card z-index > queue-card invariant

PR #1071 raises .approval-card.visible to z-index:3 so the security-
relevant Allow / Deny buttons stay clickable when the queue flyout is
also open. Without a regression test, a future CSS edit could silently
drop the z-index back below queue-card (z-index:2) and reintroduce the
bug — there is no automated UI test covering this stacking interaction.

Add a focused regex check that pins the invariant:
.approval-card.visible z-index must be strictly greater than
.queue-card z-index.

Modeled on the existing CSS-regex regression style in
tests/test_mobile_layout.py (test_profile_dropdown_not_clipped_by_overflow).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: intercept /steer /interrupt /queue before busy-mode routing in send()

Root cause: slash commands entered while the agent is busy never reached
the command dispatcher. send() enters the busy block and returns early at
line ~50, so the slash-command intercept (~line 56) is never reached.
The text was queued as a plain message. When it drained after the turn
ended, cmdSteer / cmdInterrupt ran on an idle session, saw no active stream,
and showed "No active task to stop."

Fix: at the top of the busy block, before checking busyMode, check if the
text starts with / and is one of the three control commands. If so, dispatch
the handler immediately and return. This lets the user type /steer, /interrupt,
or /queue at any time — including while the agent is mid-stream — and have
them execute against the live session.

Two new regression tests added:
- test_slash_commands_intercepted_before_busymode_routing: verifies the
  intercept appears before the busyMode routing in the busy block
- test_steer_intercept_calls_handler_directly: verifies the intercept calls
  _bc.fn(_pc.args) and returns, not queues

* test(busy-intercept): pin sync input-clear before await in slash intercept

PR #1072's intercept clears the msg input before awaiting the handler.
Order matters: if the await happens first (or if the clear is moved
inside the handler), the input still shows '/steer foo' for the duration
of the await. A reflexive second Enter press during that window — common
while waiting for the toast — re-runs send(): either re-fires the
handler (double-steer) or, if the turn just ended, falls through to the
non-busy slash dispatcher and drops a confusing "No active task to stop."

Add test_steer_intercept_clears_input_before_await pinning the order so
this UX invariant cannot silently regress.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: update steer i18n and settings copy — steer no longer interrupts

With the real /steer implementation (agent.steer() via /api/chat/steer),
steer injects a correction mid-turn WITHOUT interrupting the current stream.
The previous copy said "falls back to interrupt", "Steer (interrupt + send)",
etc. — accurate only for the old placeholder, not the real implementation.

Changes across all 6 locales (en/ru/es/de/zh/zh-Hant):
  cmd_steer:                  "falls back to interrupt" removed
  settings_busy_input_mode_steer: "interrupt + send" → "mid-turn correction"
  cmd_steer_fallback:         "interrupted" → "queued for next turn"
  busy_steer_fallback:        "interrupted instead" → "queued for next turn"
  settings_desc_busy_input_mode: "currently falls back to interrupt" removed

Also:
  static/index.html: inline fallback text updated to match
  static/commands.js: internal comment clarified (fallback = queue+cancel,
                      not "interrupt mode" which implies the primary action)

* fix(renderer): group consecutive blockquote lines into single element

Root cause: the old rule `s.replace(/^> (.+)$/gm, ...)` had three bugs:
  1. `.+` required at least one character — bare `>` lines (blank
     continuation lines) did not match and passed through as literal `>`
  2. Each matching line became its own `<blockquote>` element — a 10-line
     blockquote produced 10 stacked `<blockquote>` tags with no grouping
  3. When a fenced code block sat inside a blockquote, the fence-stash
     pass consumed the code content and left orphaned `>` lines that the
     old `.+` pattern could not match

Fix: replace the single-line regex with a group-based approach that matches
one or more consecutive `>` lines as a single block, strips the `>` prefix
from each line, passes each non-empty line through inlineMd(), turns blank
`>` lines into `<br>`, and wraps the entire group in one `<blockquote>`.

14 regression tests added covering:
- Single-line blockquotes (regression)
- Multi-line grouping (2 and 10 lines)
- Two separate blockquotes staying separate
- Bare `>` and `>text` (no space) edge cases
- Blank continuation lines → <br>
- Bold / italic / inline-code inside blockquotes
- Blockquote followed by normal paragraph

* fix(renderer): drop empty trailing line from blockquote match

The new group-based blockquote rule introduced in this PR captures the
trailing newline in its (?:\n|$) clause. After block.split('\n') that
trailing newline produces an empty final element. The original filter
only dropped lone bare '>' artifacts on the last line, so the empty
final element survived, and the .map(blank → '<br>') step turned it
into a phantom <br> immediately before </blockquote>.

Visible symptom: any blockquote whose source ends with \n (the common
case — a quote followed by another paragraph or end-of-message) renders
with an extra blank line at the bottom of the quote.

Reproducer:
  '> Hello\n\nThe rest of the message.'
    → '<blockquote>Hello\n<br></blockquote>\nThe rest of the message.'
                          ^^^ phantom <br>

Fix: replace the single-line filter with a while-loop that pops trailing
lines while they are either empty OR a bare '>'. This matches the
intent the Python test mirror in tests/test_blockquote_rendering.py
already had (the mirror was correct; the JS was not — that's why
the original tests passed despite the bug).

Also add four new regression tests in TestNoPhantomTrailingBr that pin
the no-trailing-<br> invariant for the common shapes:
  - input ending with \n
  - quote followed by paragraph (the real-world case)
  - multi-line quote ending with \n
  - quote with blank continuation + trailing \n (internal <br> stays,
    trailing <br> does not)

Verified end-to-end with node against the actual JS regex.
244 renderer-adjacent tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(renderer): comprehensive markdown fixes — strikethrough, task lists, CRLF, nested blockquotes

Five additional fixes on top of the blockquote grouping from the initial commit:

1. CRLF normalisation: strip \r\n → \n at start of renderMd so Windows
   line endings do not produce stray \r characters in rendered output

2. Strikethrough: ~~text~~ → <del>text</del> in both inlineMd() (for use
   inside blockquotes/lists) and the outer pass (for plain paragraphs).
   Added <del> to SAFE_TAGS and SAFE_INLINE so it is not HTML-escaped.

3. Task lists: - [x] / - [ ] items in unordered lists render as ✅/☐
   via task-done/task-todo span wrappers. Checks [X] (uppercase) too.

4. Nested blockquotes: >> / >>> etc. now recurse so each level gets its
   own <blockquote> element rather than passing through as literal >.
   Implemented by extracting the blockquote rule into _applyBlockquotes()
   which calls itself recursively on the stripped inner content.

5. Lists inside blockquotes: > - item now renders <ul><li> inside the
   blockquote instead of a literal "- item" string. Task list items work
   inside blockquotes too (> - [x] done → ✅ inside <blockquote><ul>).

Also fixed test_issue342.py search window (5000→10000 chars) — the CRLF
strip at the top of renderMd pushed the autolink regex past the old limit.

68 new tests in test_renderer_comprehensive.py + test_blockquote_rendering.py
covering all constructs, edge cases, and combinations.

* fix(renderer): restore space in blockquote prefix-strip regex

Commit 04e7b53 changed the blockquote prefix-strip regex from
  /^>[ \t]?/   (consume "> ", "\t>", or just ">")
to
  /^>[\t]?/    (only consume "\t>" or just ">")

The space character was dropped from the character class. Since
practically every blockquote an LLM produces is "> " (greater-than
followed by a space), this leaves a leading space artifact on every
stripped blockquote line. Worse, the leading space breaks the
list-detection regex `^(?:  )?[-*+] ` inside the new `_applyBlockquotes`
helper — that regex requires either zero or two leading spaces, never
one — so the new "list inside blockquote" feature never fired for
the canonical input shape `> - item`.

Reproducer (against the actual ui.js via node, before the fix):
  > Hello world         → <blockquote> Hello world</blockquote>
                                       ^ phantom leading space
  > Steps:              → <blockquote>Steps:
  > - one                  - one
  > - two                  - two</blockquote>
                          ^ literal text, NOT a <ul>; lists-in-quote feature broken
  > - [x] done          → blockquote with literal "[x] done", no checkbox span

Tests passed despite the bug because tests/test_blockquote_rendering.py
and tests/test_renderer_comprehensive.py validate against a Python
mirror (`_apply_blockquotes`) whose strip regex is `^>[ \t]?` — i.e.
the mirror is correct, the JS is not, and the static-mirror tests
can't catch the divergence. Same shape of bug as commit 94d63d0
(phantom <br> in trailing line) where the mirror was right and the JS
was wrong.

Fix: restore the space character in the strip regex's character class.

Add tests/test_renderer_js_behaviour.py — 11 tests that drive the
ACTUAL renderMd via node and assert on rendered output for the most
common LLM shapes (single-line quote, multi-line quote, list inside
quote, task list inside quote, nested >>>, strikethrough inside and
outside quote, top-level task list, quote followed by heading,
multi-paragraph quote with list, CRLF normalisation).

Verified: the buggy regex makes 6 of those 11 tests fail; the corrected
regex makes all 11 pass.

Suite: 2354 passed, 0 new failures.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Collapse agent session compression chains

* Restore upstream changelog entries

* fix(agent_sessions): bubble active compression chains to top by tip last_activity

The original PR merge kept the chain head's id/title/started_at and overrode
id/model/message_count/ended_at/end_reason from the tip — but did NOT override
last_activity. Since the projected list is sorted by last_activity DESC and
the WebUI sidebar surfaces updated_at = last_activity, an actively-used
compression chain whose tip is being edited NOW would sort by the ROOT's
old last_activity and fall below recently touched standalone sessions.

Reproducer (with the harness against actual code, before the fix):
  - root: started 30 days ago, last msg 30 days ago
  - tip:  started 28 days ago (parent_session_id=root), last msg 5 seconds ago
  - standalone: last msg 2 days ago

  Sidebar order with original PR:
    [0] standalone  (48h ago)
    [1] active_tip  (last_activity=root's 720h ago)  ← wrong

  Sidebar order after fix:
    [0] active_tip  (last_activity=tip's 0h ago)     ← correct
    [1] standalone  (48h ago)

This matches Hermes Agent's own list_sessions_rich projection at
hermes_state.py:903-909, which overrides "last_active" from the tip
exactly so that the agent CLI's session list orders the same way.

Add ``last_activity`` to the merge-from-tip key list, update the existing
test_compression_chain_collapses_to_latest_tip_in_sidebar assertion to
expect tip-derived updated_at, and add
test_compression_chain_bubbles_to_top_by_tip_activity locking in the
bubble-to-top invariant — without this regression test the previous
behaviour passed CI because no test exercised the sort order against a
mixed set of chains and standalone sessions.

The chain head's started_at (created_at) and title remain preserved, so
users can still find the conversation by its original date and name.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs: v0.50.216 release notes and version bump

Compression chains, renderer fixes, HTML preview, approval z-index, /steer fix.

* chore: gitignore local-only review harness directory

Adds .local-review/ to .gitignore so renderer drivers, sample inputs,
fixture builders, and other reviewer scratch files do not accidentally
get committed. Nothing under that path is ever shared in the repo;
keeping the entry tracked makes the boundary explicit for any future
contributor who creates the directory locally.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Keep reasoning chip visible for None effort

* test(reasoning): pin chip render output via node, not just source regex

The PR's static checks in test_reasoning_chip_btw_fixes.py validate the
shape of _applyReasoningChip (no display='none' literal, the right
classList.toggle call exists, the right label literals are in the
function body) but pass even if the runtime detail is wrong — for
example if `inactive` were inverted, _normalizeReasoningEffort
mishandled whitespace, or _formatReasoningEffortLabel returned the
wrong literal for an unknown input.

Add tests/test_reasoning_chip_js_behaviour.py — 11 tests that drive
the actual _applyReasoningChip() via node and assert on the rendered
DOM state for each effort value:

  TestChipAlwaysVisible
    - empty / null  -> "Default" label, inactive=true
    - "none"        -> "None" label, inactive=true
    - "low"/"high"  -> verbatim label, inactive=false
  TestNormalizationEdgeCases
    - "NONE"        -> normalises to "None"
    - "  none  "    -> trims and normalises
    - unknown junk  -> falls through visible, never hidden
  TestTitleAttributeAccessibility
    - title attribute carries the human-readable label for tooltip /
      screen-reader use

Sanity-checked against master's pre-fix ui.js: 11/11 fail (bug caught).
Against this PR's ui.js: 11/11 pass.

This pattern (drive the actual JS via node) caught two regex-only
regressions in PR #1073 where the Python mirror was correct while the
JS was broken. Same protection added here so the chip-visibility
contract can't silently break in a future refactor.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs: add #1074 to v0.50.216 changelog, bump test count to 2428

* fix(i18n): restore broken Unicode in Russian and Spanish steer strings

Commit 56c7a14 (fix: update steer i18n and settings copy) accidentally
stripped the `\u` prefix from Unicode escape sequences in two locales,
producing garbled literal hex strings visible to users:

  Spanish (es):
    - cmd_steer:                   correcci00f3n  → corrección
    - cmd_steer_fallback:          2014 en cola   → — en cola
    - busy_steer_fallback:         2014 en cola   → — en cola
    - settings_desc_busy_input_mode: qu00e9, est00e1, correcci00f3n → qué, está, corrección
    - settings_busy_input_mode_steer: correcci00f3n  → corrección

  Russian (ru):
    - settings_desc_busy_input_mode: the entire Cyrillic string was
      replaced with raw 4-hex-char code-points without the \u prefix
      (041e043f... instead of actual Cyrillic). Decoded:
      "Определяет поведение при отправке сообщения во время работы
      агента. Очередь ждёт; Прерывание отменяет и начинает заново;
      Steer внедряет коррекцию без прерывания."

Fix: write the correct characters directly (UTF-8 is the file encoding
so embedding them literally is cleaner than \u escapes for long text).

All other locales (en, de, zh, zh-Hant) were not affected — confirmed
by grepping for bare hex run-ons in the updated file.

Verified: node --check static/i18n.js passes; full pytest suite green
(2365 passed, 47 skipped).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs: remove duplicate compression chain entry from [Unreleased]

---------

Co-authored-by: nesquena-hermes <nesquena-hermes@users.noreply.github.com>
Co-authored-by: Nathan Esquenazi <nesquena@gmail.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: Frank Song <franksong2702@gmail.com>

2026-04-25 21:06:31 -07:00

nesquena-hermes

7d1aa2e261

v0.50.209: check-for-updates, workspace toggle, HTML preview, provider categories, queue flyout docs (#1042 )

* feat: add manual 'Check for Updates' button in System settings (#785)

Add a 'Check now' button next to the version badge in the System
settings section, allowing users to manually trigger an update check
at any time without waiting for the automatic periodic check.

Changes:
- index.html: add button with spinner and status text inline with version badge
- panels.js: add checkUpdatesNow() calling /api/updates/check?force=1
  with immediate feedback (checking... / up to date / X updates available)
- style.css: style the button block and spinner
- i18n.js: add 5 new keys (settings_check_now, settings_checking,
  settings_up_to_date, settings_updates_available, settings_updates_disabled)
  in all 6 locales (en, ru, es, de, zh, zh-Hant)

* fix: sanitize error message in checkUpdatesNow to avoid exposing paths

Review feedback: strip filesystem paths from error messages and cap
length to prevent internal details leaking into the UI.

* fix: fully sanitize error in update check — never expose raw e.message in UI

Previous partial fix (80cdaee) stripped filesystem paths from e.message but
still displayed the JS exception message to users. Per reviewer feedback and
project convention (NEVER expose raw e.message in UI), replace with:
- A generic user-facing i18n key (settings_update_check_failed) as default
- Fallback to API response body error if available (structured, not raw)
- Full error logged via console.warn for debugging
- Button disable-during-check already confirmed working (try/finally pattern)
- settings_update_check_failed key added in all 6 locales

* fix(#785): align HTML selectors with CSS and add regression tests

- Wrap update button in div#checkUpdatesBlock so CSS selectors apply
- Change button class from sm-btn to btn-tiny (matching stylesheet)
- Remove inline styles now handled by CSS (#checkUpdatesBlock, .btn-tiny)
- Move spinner sizing to CSS class .spinner-xs
- Add 4 static tests in test_update_banner_fixes.py:
  checkUpdatesNow defined, btnCheckUpdatesNow in HTML, CSS selectors exist, i18n key in all locales

* feat: 'Keep workspace panel open' toggle in Appearance settings (#999)

* feat: categorize providers in setup wizard (#603)

- Add 6 new providers: Google Gemini, DeepSeek, Mistral, xAI (Grok),
  Ollama, LM Studio to the onboarding quick-setup catalog
- Group providers into 3 categories: Easy start, Open/self-hosted,
  Specialized — rendered as <optgroup> in the provider dropdown
- Generic base_url save logic (requires_base_url + default_base_url)
  instead of hardcoded provider checks
- i18n keys for category labels in en, ru, es, zh, zh-Hant

* ci: re-run tests

* fix(tests): prevent reload_config() from overwriting in-memory mock in test_issue644

The test helper _available_models_with_cfg patches cfg in-memory but
get_available_models() calls reload_config() when the config file's
mtime doesn't match _cfg_mtime. On CI, config.yaml exists so mtime > 0
and _cfg_mtime starts at 0.0, triggering a reload that overwrites the
test's mock with on-disk content.

Fix: freeze _cfg_mtime to the current config file mtime inside the
helper, so reload_config() is not triggered during the test.

* fix: correct default model IDs for gemini, xai, deepseek; add specialized provider tests

- gemini: gemini-3.1-pro-preview → gemini-2.5-pro-preview
- x-ai: grok-4.20 → grok-3
- deepseek: deepseek-chat-v3-0324 → deepseek-chat
- Add TestApplyBaseURLSpecialized: 4 tests verifying base_url written for
  gemini, deepseek, mistral, and x-ai through apply_onboarding_setup

* test: add TestApplyBaseURLSpecialized — verify base_url written for gemini, deepseek, mistralai, x-ai

* fix(onboarding): correct stale model defaults for specialized providers

Three issues in the new specialized provider catalog (#1027 hold reason):

1. gemini default_model was `gemini-2.5-pro-preview` — agent's catalog
   has the 3.1 family. Updated to `gemini-3.1-pro-preview`.
2. x-ai default_model was `grok-3` — agent's catalog has `grok-4.20`.
   Updated.
3. gemini `models` list was sourcing from `_PROVIDER_MODELS.get("gemini")`
   which returns []. The catalog in api/config.py is keyed under "google"
   (even though the agent's alias map normalizes google -> gemini).
   Switched to `_PROVIDER_MODELS.get("google")` so the wizard surfaces
   the actual 5-model list. Also forward-compatible lookup for x-ai
   (xai or x-ai key).

Without these fixes, users picking gemini or x-ai in the wizard would
see no model dropdown and the default_model written to config.yaml
would 404 on first chat.

deepseek default_model bumped from `deepseek-chat` to
`deepseek-chat-v3-0324` to match the test fixture's expectation and
the agent catalog's pinned version.

Added two regression tests:
- test_gemini_model_list_is_populated: pins the catalog-key correctness
- test_specialized_default_models_match_catalog: pins the version
  prefixes (3.x for gemini, 4.x for grok)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat: inline HTML preview in workspace panel (#779)

Render .html/.htm files as live previews in a sandboxed iframe instead
of showing raw source code. Adds an 'Open in browser' button to open
the file in a new tab.

Changes:
- workspace.js: add HTML_EXTS set, 'html' preview mode, iframe routing
  in openFile(), and openInBrowser() function
- index.html: add sandboxed iframe element and 'Open in browser' button
  in preview toolbar (visible only for HTML files)
- i18n.js: add 'open_in_browser' key in all 6 locales

The iframe uses sandbox='allow-scripts' for security. Download button
remains available alongside the new preview.

* docs: document sandbox security tradeoff for HTML preview

Review feedback: fileExt() already lowercases extensions so .HTML/.HTM work.
Added code comment explaining the deliberate sandbox=allow-scripts choice:
scripts are needed for most HTML documents but the iframe is still origin-
isolated and cannot access parent cookies/data.

* fix: pass ?inline=1 to file/raw so HTML preview iframe renders instead of downloading

routes.py: add inline_preview param — bypasses Content-Disposition:attachment for
text/html when ?inline=1 is set, serving the file inline for the sandboxed iframe.
workspace.js: add &inline=1 to the iframe src URL.
test: add 5 static regression tests for the inline HTML preview.

* fix(security): CSP sandbox header for inline HTML preview

The iframe sandbox="allow-scripts" attribute on previewHtmlIframe only
applies when HTML is loaded INSIDE that iframe. A user tricked into
opening /api/file/raw?path=evil.html&inline=1 directly in a top-level
tab (e.g. via a chat link) would render the HTML in the WebUI's origin
without any sandbox, giving the page full access to cookies and
localStorage.

Server-side Content-Security-Policy: sandbox allow-scripts mirrors the
iframe sandbox exactly: scripts run, but the document is treated as a
unique opaque origin (no allow-same-origin) and cannot read WebUI
cookies, localStorage, or postMessage to the parent regardless of how
the URL is accessed.

Added test_inline_html_response_sets_csp_sandbox to pin the header.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs: v0.50.209 release notes — 4 PRs, 2212 tests (+43)

* docs(changelog): document #1040 queue flyout and Cloudflare CSP in v0.50.209

The stage commit ed2bd18 listed v0.50.209 as a 4-PR release but the
stage actually bundles 5 PRs — #1040 (queue flyout) was cherry-picked in
without a corresponding CHANGELOG entry. Without this fix, the queue
feature ships silently and the bundled Cloudflare CSP relaxation in
api/helpers.py is also undocumented.

Adds two entries:
- Added: queue flyout (#1040) under v0.50.209
- Changed: CSP allowlist for Cloudflare Access deployments

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: bergeouss <bergeouss@users.noreply.github.com>
Co-authored-by: nesquena-hermes <nesquena-hermes@users.noreply.github.com>
Co-authored-by: Nathan Esquenazi <nesquena@gmail.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-04-25 14:33:41 -07:00

2 Commits