* fix(workspace): add .html/.htm to MIME_MAP so HTML preview renders correctly
MIME_MAP was missing entries for .html and .htm. The server fell back to
Content-Type: application/octet-stream, which browsers refuse to render as
HTML in an iframe — causing a blank white preview.
The rest of the pipeline was already correct: the iframe exists in
static/index.html, openFile() in static/workspace.js routes .html to
showPreview('html'), and _handle_file_raw() in api/routes.py sets the
correct CSP sandbox header when ?inline=1 is present. The only missing
piece was the MIME type.
* test(workspace): lock in MIME_MAP entry for .html/.htm
PR #1070 added .html/.htm → text/html to MIME_MAP in api/config.py
to fix the blank workspace HTML preview iframe. Without a direct
assertion on the MIME_MAP entries, the fix could silently regress
(the existing test_779_html_preview.py tests cover the iframe wiring,
the inline=1 query handling, and the CSP sandbox header — but none of
them touch MIME_MAP itself).
Add a single regression test that asserts MIME_MAP['.html'] and
MIME_MAP['.htm'] are both 'text/html' so any future removal of those
entries fails CI immediately.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(composer): raise .approval-card.visible z-index above .queue-card
.queue-card has z-index:2. .approval-card.visible had no z-index, so the
queue flyout would render on top of the approval card when both were visible
simultaneously — obscuring the Allow/Deny buttons.
Fix: add z-index:3 to .approval-card.visible so approvals always render
above the queue flyout. Approval is a blocking, security-relevant interaction
and must never be obscured by passive UI elements.
* test(composer): pin approval-card z-index > queue-card invariant
PR #1071 raises .approval-card.visible to z-index:3 so the security-
relevant Allow / Deny buttons stay clickable when the queue flyout is
also open. Without a regression test, a future CSS edit could silently
drop the z-index back below queue-card (z-index:2) and reintroduce the
bug — there is no automated UI test covering this stacking interaction.
Add a focused regex check that pins the invariant:
.approval-card.visible z-index must be strictly greater than
.queue-card z-index.
Modeled on the existing CSS-regex regression style in
tests/test_mobile_layout.py (test_profile_dropdown_not_clipped_by_overflow).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix: intercept /steer /interrupt /queue before busy-mode routing in send()
Root cause: slash commands entered while the agent is busy never reached
the command dispatcher. send() enters the busy block and returns early at
line ~50, so the slash-command intercept (~line 56) is never reached.
The text was queued as a plain message. When it drained after the turn
ended, cmdSteer / cmdInterrupt ran on an idle session, saw no active stream,
and showed "No active task to stop."
Fix: at the top of the busy block, before checking busyMode, check if the
text starts with / and is one of the three control commands. If so, dispatch
the handler immediately and return. This lets the user type /steer, /interrupt,
or /queue at any time — including while the agent is mid-stream — and have
them execute against the live session.
Two new regression tests added:
- test_slash_commands_intercepted_before_busymode_routing: verifies the
intercept appears before the busyMode routing in the busy block
- test_steer_intercept_calls_handler_directly: verifies the intercept calls
_bc.fn(_pc.args) and returns, not queues
* test(busy-intercept): pin sync input-clear before await in slash intercept
PR #1072's intercept clears the msg input before awaiting the handler.
Order matters: if the await happens first (or if the clear is moved
inside the handler), the input still shows '/steer foo' for the duration
of the await. A reflexive second Enter press during that window — common
while waiting for the toast — re-runs send(): either re-fires the
handler (double-steer) or, if the turn just ended, falls through to the
non-busy slash dispatcher and drops a confusing "No active task to stop."
Add test_steer_intercept_clears_input_before_await pinning the order so
this UX invariant cannot silently regress.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix: update steer i18n and settings copy — steer no longer interrupts
With the real /steer implementation (agent.steer() via /api/chat/steer),
steer injects a correction mid-turn WITHOUT interrupting the current stream.
The previous copy said "falls back to interrupt", "Steer (interrupt + send)",
etc. — accurate only for the old placeholder, not the real implementation.
Changes across all 6 locales (en/ru/es/de/zh/zh-Hant):
cmd_steer: "falls back to interrupt" removed
settings_busy_input_mode_steer: "interrupt + send" → "mid-turn correction"
cmd_steer_fallback: "interrupted" → "queued for next turn"
busy_steer_fallback: "interrupted instead" → "queued for next turn"
settings_desc_busy_input_mode: "currently falls back to interrupt" removed
Also:
static/index.html: inline fallback text updated to match
static/commands.js: internal comment clarified (fallback = queue+cancel,
not "interrupt mode" which implies the primary action)
* fix(renderer): group consecutive blockquote lines into single element
Root cause: the old rule `s.replace(/^> (.+)$/gm, ...)` had three bugs:
1. `.+` required at least one character — bare `>` lines (blank
continuation lines) did not match and passed through as literal `>`
2. Each matching line became its own `<blockquote>` element — a 10-line
blockquote produced 10 stacked `<blockquote>` tags with no grouping
3. When a fenced code block sat inside a blockquote, the fence-stash
pass consumed the code content and left orphaned `>` lines that the
old `.+` pattern could not match
Fix: replace the single-line regex with a group-based approach that matches
one or more consecutive `>` lines as a single block, strips the `>` prefix
from each line, passes each non-empty line through inlineMd(), turns blank
`>` lines into `<br>`, and wraps the entire group in one `<blockquote>`.
14 regression tests added covering:
- Single-line blockquotes (regression)
- Multi-line grouping (2 and 10 lines)
- Two separate blockquotes staying separate
- Bare `>` and `>text` (no space) edge cases
- Blank continuation lines → <br>
- Bold / italic / inline-code inside blockquotes
- Blockquote followed by normal paragraph
* fix(renderer): drop empty trailing line from blockquote match
The new group-based blockquote rule introduced in this PR captures the
trailing newline in its (?:\n|$) clause. After block.split('\n') that
trailing newline produces an empty final element. The original filter
only dropped lone bare '>' artifacts on the last line, so the empty
final element survived, and the .map(blank → '<br>') step turned it
into a phantom <br> immediately before </blockquote>.
Visible symptom: any blockquote whose source ends with \n (the common
case — a quote followed by another paragraph or end-of-message) renders
with an extra blank line at the bottom of the quote.
Reproducer:
'> Hello\n\nThe rest of the message.'
→ '<blockquote>Hello\n<br></blockquote>\nThe rest of the message.'
^^^ phantom <br>
Fix: replace the single-line filter with a while-loop that pops trailing
lines while they are either empty OR a bare '>'. This matches the
intent the Python test mirror in tests/test_blockquote_rendering.py
already had (the mirror was correct; the JS was not — that's why
the original tests passed despite the bug).
Also add four new regression tests in TestNoPhantomTrailingBr that pin
the no-trailing-<br> invariant for the common shapes:
- input ending with \n
- quote followed by paragraph (the real-world case)
- multi-line quote ending with \n
- quote with blank continuation + trailing \n (internal <br> stays,
trailing <br> does not)
Verified end-to-end with node against the actual JS regex.
244 renderer-adjacent tests pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(renderer): comprehensive markdown fixes — strikethrough, task lists, CRLF, nested blockquotes
Five additional fixes on top of the blockquote grouping from the initial commit:
1. CRLF normalisation: strip \r\n → \n at start of renderMd so Windows
line endings do not produce stray \r characters in rendered output
2. Strikethrough: ~~text~~ → <del>text</del> in both inlineMd() (for use
inside blockquotes/lists) and the outer pass (for plain paragraphs).
Added <del> to SAFE_TAGS and SAFE_INLINE so it is not HTML-escaped.
3. Task lists: - [x] / - [ ] items in unordered lists render as ✅/☐
via task-done/task-todo span wrappers. Checks [X] (uppercase) too.
4. Nested blockquotes: >> / >>> etc. now recurse so each level gets its
own <blockquote> element rather than passing through as literal >.
Implemented by extracting the blockquote rule into _applyBlockquotes()
which calls itself recursively on the stripped inner content.
5. Lists inside blockquotes: > - item now renders <ul><li> inside the
blockquote instead of a literal "- item" string. Task list items work
inside blockquotes too (> - [x] done → ✅ inside <blockquote><ul>).
Also fixed test_issue342.py search window (5000→10000 chars) — the CRLF
strip at the top of renderMd pushed the autolink regex past the old limit.
68 new tests in test_renderer_comprehensive.py + test_blockquote_rendering.py
covering all constructs, edge cases, and combinations.
* fix(renderer): restore space in blockquote prefix-strip regex
Commit 04e7b53 changed the blockquote prefix-strip regex from
/^>[ \t]?/ (consume "> ", "\t>", or just ">")
to
/^>[\t]?/ (only consume "\t>" or just ">")
The space character was dropped from the character class. Since
practically every blockquote an LLM produces is "> " (greater-than
followed by a space), this leaves a leading space artifact on every
stripped blockquote line. Worse, the leading space breaks the
list-detection regex `^(?: )?[-*+] ` inside the new `_applyBlockquotes`
helper — that regex requires either zero or two leading spaces, never
one — so the new "list inside blockquote" feature never fired for
the canonical input shape `> - item`.
Reproducer (against the actual ui.js via node, before the fix):
> Hello world → <blockquote> Hello world</blockquote>
^ phantom leading space
> Steps: → <blockquote>Steps:
> - one - one
> - two - two</blockquote>
^ literal text, NOT a <ul>; lists-in-quote feature broken
> - [x] done → blockquote with literal "[x] done", no checkbox span
Tests passed despite the bug because tests/test_blockquote_rendering.py
and tests/test_renderer_comprehensive.py validate against a Python
mirror (`_apply_blockquotes`) whose strip regex is `^>[ \t]?` — i.e.
the mirror is correct, the JS is not, and the static-mirror tests
can't catch the divergence. Same shape of bug as commit 94d63d0
(phantom <br> in trailing line) where the mirror was right and the JS
was wrong.
Fix: restore the space character in the strip regex's character class.
Add tests/test_renderer_js_behaviour.py — 11 tests that drive the
ACTUAL renderMd via node and assert on rendered output for the most
common LLM shapes (single-line quote, multi-line quote, list inside
quote, task list inside quote, nested >>>, strikethrough inside and
outside quote, top-level task list, quote followed by heading,
multi-paragraph quote with list, CRLF normalisation).
Verified: the buggy regex makes 6 of those 11 tests fail; the corrected
regex makes all 11 pass.
Suite: 2354 passed, 0 new failures.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Collapse agent session compression chains
* Restore upstream changelog entries
* fix(agent_sessions): bubble active compression chains to top by tip last_activity
The original PR merge kept the chain head's id/title/started_at and overrode
id/model/message_count/ended_at/end_reason from the tip — but did NOT override
last_activity. Since the projected list is sorted by last_activity DESC and
the WebUI sidebar surfaces updated_at = last_activity, an actively-used
compression chain whose tip is being edited NOW would sort by the ROOT's
old last_activity and fall below recently touched standalone sessions.
Reproducer (with the harness against actual code, before the fix):
- root: started 30 days ago, last msg 30 days ago
- tip: started 28 days ago (parent_session_id=root), last msg 5 seconds ago
- standalone: last msg 2 days ago
Sidebar order with original PR:
[0] standalone (48h ago)
[1] active_tip (last_activity=root's 720h ago) ← wrong
Sidebar order after fix:
[0] active_tip (last_activity=tip's 0h ago) ← correct
[1] standalone (48h ago)
This matches Hermes Agent's own list_sessions_rich projection at
hermes_state.py:903-909, which overrides "last_active" from the tip
exactly so that the agent CLI's session list orders the same way.
Add ``last_activity`` to the merge-from-tip key list, update the existing
test_compression_chain_collapses_to_latest_tip_in_sidebar assertion to
expect tip-derived updated_at, and add
test_compression_chain_bubbles_to_top_by_tip_activity locking in the
bubble-to-top invariant — without this regression test the previous
behaviour passed CI because no test exercised the sort order against a
mixed set of chains and standalone sessions.
The chain head's started_at (created_at) and title remain preserved, so
users can still find the conversation by its original date and name.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* docs: v0.50.216 release notes and version bump
Compression chains, renderer fixes, HTML preview, approval z-index, /steer fix.
* chore: gitignore local-only review harness directory
Adds .local-review/ to .gitignore so renderer drivers, sample inputs,
fixture builders, and other reviewer scratch files do not accidentally
get committed. Nothing under that path is ever shared in the repo;
keeping the entry tracked makes the boundary explicit for any future
contributor who creates the directory locally.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Keep reasoning chip visible for None effort
* test(reasoning): pin chip render output via node, not just source regex
The PR's static checks in test_reasoning_chip_btw_fixes.py validate the
shape of _applyReasoningChip (no display='none' literal, the right
classList.toggle call exists, the right label literals are in the
function body) but pass even if the runtime detail is wrong — for
example if `inactive` were inverted, _normalizeReasoningEffort
mishandled whitespace, or _formatReasoningEffortLabel returned the
wrong literal for an unknown input.
Add tests/test_reasoning_chip_js_behaviour.py — 11 tests that drive
the actual _applyReasoningChip() via node and assert on the rendered
DOM state for each effort value:
TestChipAlwaysVisible
- empty / null -> "Default" label, inactive=true
- "none" -> "None" label, inactive=true
- "low"/"high" -> verbatim label, inactive=false
TestNormalizationEdgeCases
- "NONE" -> normalises to "None"
- " none " -> trims and normalises
- unknown junk -> falls through visible, never hidden
TestTitleAttributeAccessibility
- title attribute carries the human-readable label for tooltip /
screen-reader use
Sanity-checked against master's pre-fix ui.js: 11/11 fail (bug caught).
Against this PR's ui.js: 11/11 pass.
This pattern (drive the actual JS via node) caught two regex-only
regressions in PR #1073 where the Python mirror was correct while the
JS was broken. Same protection added here so the chip-visibility
contract can't silently break in a future refactor.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* docs: add #1074 to v0.50.216 changelog, bump test count to 2428
* fix(i18n): restore broken Unicode in Russian and Spanish steer strings
Commit 56c7a14 (fix: update steer i18n and settings copy) accidentally
stripped the `\u` prefix from Unicode escape sequences in two locales,
producing garbled literal hex strings visible to users:
Spanish (es):
- cmd_steer: correcci00f3n → corrección
- cmd_steer_fallback: 2014 en cola → — en cola
- busy_steer_fallback: 2014 en cola → — en cola
- settings_desc_busy_input_mode: qu00e9, est00e1, correcci00f3n → qué, está, corrección
- settings_busy_input_mode_steer: correcci00f3n → corrección
Russian (ru):
- settings_desc_busy_input_mode: the entire Cyrillic string was
replaced with raw 4-hex-char code-points without the \u prefix
(041e043f... instead of actual Cyrillic). Decoded:
"Определяет поведение при отправке сообщения во время работы
агента. Очередь ждёт; Прерывание отменяет и начинает заново;
Steer внедряет коррекцию без прерывания."
Fix: write the correct characters directly (UTF-8 is the file encoding
so embedding them literally is cleaner than \u escapes for long text).
All other locales (en, de, zh, zh-Hant) were not affected — confirmed
by grepping for bare hex run-ons in the updated file.
Verified: node --check static/i18n.js passes; full pytest suite green
(2365 passed, 47 skipped).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* docs: remove duplicate compression chain entry from [Unreleased]
---------
Co-authored-by: nesquena-hermes <nesquena-hermes@users.noreply.github.com>
Co-authored-by: Nathan Esquenazi <nesquena@gmail.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: Frank Song <franksong2702@gmail.com>
* feat: add manual 'Check for Updates' button in System settings (#785)
Add a 'Check now' button next to the version badge in the System
settings section, allowing users to manually trigger an update check
at any time without waiting for the automatic periodic check.
Changes:
- index.html: add button with spinner and status text inline with version badge
- panels.js: add checkUpdatesNow() calling /api/updates/check?force=1
with immediate feedback (checking... / up to date / X updates available)
- style.css: style the button block and spinner
- i18n.js: add 5 new keys (settings_check_now, settings_checking,
settings_up_to_date, settings_updates_available, settings_updates_disabled)
in all 6 locales (en, ru, es, de, zh, zh-Hant)
* fix: sanitize error message in checkUpdatesNow to avoid exposing paths
Review feedback: strip filesystem paths from error messages and cap
length to prevent internal details leaking into the UI.
* fix: fully sanitize error in update check — never expose raw e.message in UI
Previous partial fix (80cdaee) stripped filesystem paths from e.message but
still displayed the JS exception message to users. Per reviewer feedback and
project convention (NEVER expose raw e.message in UI), replace with:
- A generic user-facing i18n key (settings_update_check_failed) as default
- Fallback to API response body error if available (structured, not raw)
- Full error logged via console.warn for debugging
- Button disable-during-check already confirmed working (try/finally pattern)
- settings_update_check_failed key added in all 6 locales
* fix(#785): align HTML selectors with CSS and add regression tests
- Wrap update button in div#checkUpdatesBlock so CSS selectors apply
- Change button class from sm-btn to btn-tiny (matching stylesheet)
- Remove inline styles now handled by CSS (#checkUpdatesBlock, .btn-tiny)
- Move spinner sizing to CSS class .spinner-xs
- Add 4 static tests in test_update_banner_fixes.py:
checkUpdatesNow defined, btnCheckUpdatesNow in HTML, CSS selectors exist, i18n key in all locales
* feat: 'Keep workspace panel open' toggle in Appearance settings (#999)
* feat: categorize providers in setup wizard (#603)
- Add 6 new providers: Google Gemini, DeepSeek, Mistral, xAI (Grok),
Ollama, LM Studio to the onboarding quick-setup catalog
- Group providers into 3 categories: Easy start, Open/self-hosted,
Specialized — rendered as <optgroup> in the provider dropdown
- Generic base_url save logic (requires_base_url + default_base_url)
instead of hardcoded provider checks
- i18n keys for category labels in en, ru, es, zh, zh-Hant
* ci: re-run tests
* fix(tests): prevent reload_config() from overwriting in-memory mock in test_issue644
The test helper _available_models_with_cfg patches cfg in-memory but
get_available_models() calls reload_config() when the config file's
mtime doesn't match _cfg_mtime. On CI, config.yaml exists so mtime > 0
and _cfg_mtime starts at 0.0, triggering a reload that overwrites the
test's mock with on-disk content.
Fix: freeze _cfg_mtime to the current config file mtime inside the
helper, so reload_config() is not triggered during the test.
* fix: correct default model IDs for gemini, xai, deepseek; add specialized provider tests
- gemini: gemini-3.1-pro-preview → gemini-2.5-pro-preview
- x-ai: grok-4.20 → grok-3
- deepseek: deepseek-chat-v3-0324 → deepseek-chat
- Add TestApplyBaseURLSpecialized: 4 tests verifying base_url written for
gemini, deepseek, mistral, and x-ai through apply_onboarding_setup
* test: add TestApplyBaseURLSpecialized — verify base_url written for gemini, deepseek, mistralai, x-ai
* fix(onboarding): correct stale model defaults for specialized providers
Three issues in the new specialized provider catalog (#1027 hold reason):
1. gemini default_model was `gemini-2.5-pro-preview` — agent's catalog
has the 3.1 family. Updated to `gemini-3.1-pro-preview`.
2. x-ai default_model was `grok-3` — agent's catalog has `grok-4.20`.
Updated.
3. gemini `models` list was sourcing from `_PROVIDER_MODELS.get("gemini")`
which returns []. The catalog in api/config.py is keyed under "google"
(even though the agent's alias map normalizes google -> gemini).
Switched to `_PROVIDER_MODELS.get("google")` so the wizard surfaces
the actual 5-model list. Also forward-compatible lookup for x-ai
(xai or x-ai key).
Without these fixes, users picking gemini or x-ai in the wizard would
see no model dropdown and the default_model written to config.yaml
would 404 on first chat.
deepseek default_model bumped from `deepseek-chat` to
`deepseek-chat-v3-0324` to match the test fixture's expectation and
the agent catalog's pinned version.
Added two regression tests:
- test_gemini_model_list_is_populated: pins the catalog-key correctness
- test_specialized_default_models_match_catalog: pins the version
prefixes (3.x for gemini, 4.x for grok)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat: inline HTML preview in workspace panel (#779)
Render .html/.htm files as live previews in a sandboxed iframe instead
of showing raw source code. Adds an 'Open in browser' button to open
the file in a new tab.
Changes:
- workspace.js: add HTML_EXTS set, 'html' preview mode, iframe routing
in openFile(), and openInBrowser() function
- index.html: add sandboxed iframe element and 'Open in browser' button
in preview toolbar (visible only for HTML files)
- i18n.js: add 'open_in_browser' key in all 6 locales
The iframe uses sandbox='allow-scripts' for security. Download button
remains available alongside the new preview.
* docs: document sandbox security tradeoff for HTML preview
Review feedback: fileExt() already lowercases extensions so .HTML/.HTM work.
Added code comment explaining the deliberate sandbox=allow-scripts choice:
scripts are needed for most HTML documents but the iframe is still origin-
isolated and cannot access parent cookies/data.
* fix: pass ?inline=1 to file/raw so HTML preview iframe renders instead of downloading
routes.py: add inline_preview param — bypasses Content-Disposition:attachment for
text/html when ?inline=1 is set, serving the file inline for the sandboxed iframe.
workspace.js: add &inline=1 to the iframe src URL.
test: add 5 static regression tests for the inline HTML preview.
* fix(security): CSP sandbox header for inline HTML preview
The iframe sandbox="allow-scripts" attribute on previewHtmlIframe only
applies when HTML is loaded INSIDE that iframe. A user tricked into
opening /api/file/raw?path=evil.html&inline=1 directly in a top-level
tab (e.g. via a chat link) would render the HTML in the WebUI's origin
without any sandbox, giving the page full access to cookies and
localStorage.
Server-side Content-Security-Policy: sandbox allow-scripts mirrors the
iframe sandbox exactly: scripts run, but the document is treated as a
unique opaque origin (no allow-same-origin) and cannot read WebUI
cookies, localStorage, or postMessage to the parent regardless of how
the URL is accessed.
Added test_inline_html_response_sets_csp_sandbox to pin the header.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* docs: v0.50.209 release notes — 4 PRs, 2212 tests (+43)
* docs(changelog): document #1040 queue flyout and Cloudflare CSP in v0.50.209
The stage commit ed2bd18 listed v0.50.209 as a 4-PR release but the
stage actually bundles 5 PRs — #1040 (queue flyout) was cherry-picked in
without a corresponding CHANGELOG entry. Without this fix, the queue
feature ships silently and the bundled Cloudflare CSP relaxation in
api/helpers.py is also undocumented.
Adds two entries:
- Added: queue flyout (#1040) under v0.50.209
- Changed: CSP allowlist for Cloudflare Access deployments
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: bergeouss <bergeouss@users.noreply.github.com>
Co-authored-by: nesquena-hermes <nesquena-hermes@users.noreply.github.com>
Co-authored-by: Nathan Esquenazi <nesquena@gmail.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>