v0.50.216: compression chains, renderer fixes, HTML preview, approval z-index, /steer fix, reasoning chip (#1075)

* fix(workspace): add .html/.htm to MIME_MAP so HTML preview renders correctly

MIME_MAP was missing entries for .html and .htm. The server fell back to
Content-Type: application/octet-stream, which browsers refuse to render as
HTML in an iframe — causing a blank white preview.

The rest of the pipeline was already correct: the iframe exists in
static/index.html, openFile() in static/workspace.js routes .html to
showPreview('html'), and _handle_file_raw() in api/routes.py sets the
correct CSP sandbox header when ?inline=1 is present. The only missing
piece was the MIME type.
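
For illustration, a minimal sketch of the lookup this fix relies on, assuming
the server maps the file extension through MIME_MAP and falls back to
application/octet-stream. The helper name guess_content_type and the
trimmed-down map are hypothetical, not the code in api/config.py or
api/routes.py.

```python
from pathlib import Path

# Trimmed-down stand-in for MIME_MAP in api/config.py (illustrative subset).
MIME_MAP = {
    ".json": "application/json",
    ".pdf": "application/pdf",
    ".html": "text/html",  # entry added by this fix
    ".htm": "text/html",   # entry added by this fix
}


def guess_content_type(filename, mime_map=MIME_MAP):
    """Hypothetical helper: map a file extension to a Content-Type."""
    ext = Path(filename).suffix.lower()
    return mime_map.get(ext, "application/octet-stream")


# Simulate the pre-fix map by dropping the two new entries.
old_map = {k: v for k, v in MIME_MAP.items() if k not in (".html", ".htm")}
print(guess_content_type("report.html", old_map))  # application/octet-stream
print(guess_content_type("report.html"))           # text/html
```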

* test(workspace): lock in MIME_MAP entry for .html/.htm

PR #1070 added .html/.htm → text/html to MIME_MAP in api/config.py
to fix the blank workspace HTML preview iframe. Without a direct
assertion on the MIME_MAP entries, the fix could silently regress
(the existing test_779_html_preview.py tests cover the iframe wiring,
the inline=1 query handling, and the CSP sandbox header — but none of
them touch MIME_MAP itself).

Add a single regression test that asserts MIME_MAP['.html'] and
MIME_MAP['.htm'] are both 'text/html' so any future removal of those
entries fails CI immediately.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(composer): raise .approval-card.visible z-index above .queue-card

.queue-card has z-index:2. .approval-card.visible had no z-index, so the
queue flyout would render on top of the approval card when both were visible
simultaneously — obscuring the Allow/Deny buttons.

Fix: add z-index:3 to .approval-card.visible so approvals always render
above the queue flyout. Approval is a blocking, security-relevant interaction
and must never be obscured by passive UI elements.

* test(composer): pin approval-card z-index > queue-card invariant

PR #1071 raises .approval-card.visible to z-index:3 so the security-
relevant Allow / Deny buttons stay clickable when the queue flyout is
also open. Without a regression test, a future CSS edit could silently
drop the z-index back below queue-card (z-index:2) and reintroduce the
bug — there is no automated UI test covering this stacking interaction.

Add a focused regex check that pins the invariant:
.approval-card.visible z-index must be strictly greater than
.queue-card z-index.

Modeled on the existing CSS-regex regression style in
tests/test_mobile_layout.py (test_profile_dropdown_not_clipped_by_overflow).
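
A sketch of what such a regex check could look like, assuming both rules are
declared as flat selectors in static/style.css. The CSS string and the
z_index_of helper are illustrative stand-ins, not the test added in this PR.

```python
import re

# Illustrative CSS; the real rules live in static/style.css.
CSS = """
.queue-card{position:relative;z-index:2;}
.approval-card.visible{display:block;z-index:3;}
"""


def z_index_of(css, selector):
    """Return the z-index declared in the first rule for `selector`."""
    rule = re.search(re.escape(selector) + r"\s*\{([^}]*)\}", css)
    assert rule, f"no rule found for {selector}"
    value = re.search(r"z-index\s*:\s*(-?\d+)", rule.group(1))
    assert value, f"{selector} declares no z-index"
    return int(value.group(1))


def test_approval_card_above_queue_card():
    # The invariant: approvals must stack strictly above the queue flyout.
    assert z_index_of(CSS, ".approval-card.visible") > z_index_of(CSS, ".queue-card")
```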

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: intercept /steer /interrupt /queue before busy-mode routing in send()

Root cause: slash commands entered while the agent is busy never reached
the command dispatcher. send() enters the busy block and returns early at
line ~50, so the slash-command intercept (~line 56) is never reached.
The text was queued as a plain message. When it drained after the turn
ended, cmdSteer / cmdInterrupt ran on an idle session, saw no active stream,
and showed "No active task to stop."

Fix: at the top of the busy block, before checking busyMode, check if the
text starts with / and is one of the three control commands. If so, dispatch
the handler immediately and return. This lets the user type /steer, /interrupt,
or /queue at any time — including while the agent is mid-stream — and have
them execute against the live session.

Two new regression tests added:
- test_slash_commands_intercepted_before_busymode_routing: verifies the
  intercept appears before the busyMode routing in the busy block
- test_steer_intercept_calls_handler_directly: verifies the intercept calls
  _bc.fn(_pc.args) and returns instead of queueing the text
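
The control flow can be modelled in a few lines of Python (the real code is
JavaScript in send()). route_while_busy, CONTROL_COMMANDS, and the plain
queue fallback are hypothetical names standing in for the busy block and its
busyMode routing.

```python
CONTROL_COMMANDS = {"steer", "interrupt", "queue"}


def route_while_busy(text, handlers, queue):
    """Model of the busy block: control commands run immediately."""
    stripped = text.strip()
    if stripped.startswith("/"):
        name, _, args = stripped[1:].partition(" ")
        if name in CONTROL_COMMANDS and name in handlers:
            handlers[name](args.strip())  # dispatch against the live session
            return "dispatched"
    queue.append(text)  # stand-in for the normal busyMode routing
    return "queued"


calls, queue = [], []
handlers = {"steer": lambda args: calls.append(("steer", args))}
print(route_while_busy("/steer use the staging DB", handlers, queue))  # dispatched
print(route_while_busy("plain follow-up", handlers, queue))            # queued
```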

* test(busy-intercept): pin sync input-clear before await in slash intercept

PR #1072's intercept clears the msg input before awaiting the handler.
Order matters: if the await happens first (or if the clear is moved
inside the handler), the input still shows '/steer foo' for the duration
of the await. A reflexive second Enter press during that window (common
while waiting for the toast) re-runs send(), which either re-fires the
handler (double-steer) or, if the turn just ended, falls through to the
non-busy slash dispatcher and shows a confusing "No active task to stop."

Add test_steer_intercept_clears_input_before_await pinning the order so
this UX invariant cannot silently regress.
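
Why the order matters, as a small asyncio model (the real intercept is
JavaScript; intercept, the box dict, and the handler are illustrative). The
handler records what the input holds while the await is pending, which is
what a second Enter press would re-send.

```python
import asyncio


async def intercept(input_box, handler):
    text = input_box["value"]
    input_box["value"] = ""  # clear synchronously, before any await
    await handler(text)


async def main():
    box = {"value": "/steer foo"}
    seen_during_await = []

    async def handler(text):
        await asyncio.sleep(0)                  # yield, as a network call would
        seen_during_await.append(box["value"])  # what a second Enter would send

    await intercept(box, handler)
    return seen_during_await


print(asyncio.run(main()))  # ['']
```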

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: update steer i18n and settings copy — steer no longer interrupts

With the real /steer implementation (agent.steer() via /api/chat/steer),
steer injects a correction mid-turn WITHOUT interrupting the current stream.
The previous copy said "falls back to interrupt", "Steer (interrupt + send)",
etc. — accurate only for the old placeholder, not the real implementation.

Changes across all 6 locales (en/ru/es/de/zh/zh-Hant):
  cmd_steer:                  "falls back to interrupt" removed
  settings_busy_input_mode_steer: "interrupt + send" → "mid-turn correction"
  cmd_steer_fallback:         "interrupted" → "queued for next turn"
  busy_steer_fallback:        "interrupted instead" → "queued for next turn"
  settings_desc_busy_input_mode: "currently falls back to interrupt" removed

Also:
  static/index.html: inline fallback text updated to match
  static/commands.js: internal comment clarified (fallback = queue+cancel,
                      not "interrupt mode" which implies the primary action)

* fix(renderer): group consecutive blockquote lines into single element

Root cause: the old rule `s.replace(/^> (.+)$/gm, ...)` had three bugs:
  1. `.+` required at least one character — bare `>` lines (blank
     continuation lines) did not match and passed through as literal `>`
  2. Each matching line became its own `<blockquote>` element — a 10-line
     blockquote produced 10 stacked `<blockquote>` tags with no grouping
  3. When a fenced code block sat inside a blockquote, the fence-stash
     pass consumed the code content and left orphaned `>` lines that the
     old `.+` pattern could not match

Fix: replace the single-line regex with a group-based approach that matches
one or more consecutive `>` lines as a single block, strips the `>` prefix
from each line, passes each non-empty line through inlineMd(), turns blank
`>` lines into `<br>`, and wraps the entire group in one `<blockquote>`.

14 regression tests added covering:
- Single-line blockquotes (regression)
- Multi-line grouping (2 and 10 lines)
- Two separate blockquotes staying separate
- Bare `>` and `>text` (no space) edge cases
- Blank continuation lines → <br>
- Bold / italic / inline-code inside blockquotes
- Blockquote followed by normal paragraph
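
A rough Python sketch of the group-based approach, in the spirit of the
Python mirror used by the tests but not copied from it. inline_md is a tiny
stand-in for inlineMd() that handles **bold** only, and the regex is an
assumption about the shape of the rule rather than the exact pattern in
static/ui.js. The sketch also trims empty trailing lines so a quote followed
by a paragraph gets no extra <br>.

```python
import html
import re

# One or more consecutive lines starting with ">" form a single block.
BLOCK = re.compile(r"(?:^>[^\n]*(?:\n|$))+", re.MULTILINE)


def inline_md(text):
    """Stand-in for inlineMd(): escapes HTML and handles **bold** only."""
    return re.sub(r"\*\*(.+?)\*\*", r"<strong>\1</strong>", html.escape(text))


def group_blockquotes(src):
    def render(match):
        lines = match.group(0).split("\n")
        while lines and lines[-1].strip() in ("", ">"):
            lines.pop()  # drop empty or bare ">" trailing lines
        inner = [re.sub(r"^>[ \t]?", "", line) for line in lines]
        body = "\n".join(
            inline_md(line) if line.strip() else "<br>" for line in inner
        )
        tail = "\n" if match.group(0).endswith("\n") else ""
        return f"<blockquote>{body}</blockquote>{tail}"

    return BLOCK.sub(render, src)


print(repr(group_blockquotes("> one\n>\n> **two**\n\nafter")))
```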

* fix(renderer): drop empty trailing line from blockquote match

The new group-based blockquote rule introduced in this PR captures the
trailing newline in its (?:\n|$) clause. After block.split('\n') that
trailing newline produces an empty final element. The original filter
only dropped lone bare '>' artifacts on the last line, so the empty
final element survived, and the .map(blank → '<br>') step turned it
into a phantom <br> immediately before </blockquote>.

Visible symptom: any blockquote whose source ends with \n (the common
case — a quote followed by another paragraph or end-of-message) renders
with an extra blank line at the bottom of the quote.

Reproducer:
  '> Hello\n\nThe rest of the message.'
    → '<blockquote>Hello\n<br></blockquote>\nThe rest of the message.'
                          ^^^ phantom <br>

Fix: replace the single-line filter with a while-loop that pops trailing
lines while they are either empty OR a bare '>'. This matches the
intent the Python test mirror in tests/test_blockquote_rendering.py
already had (the mirror was correct; the JS was not — that's why
the original tests passed despite the bug).

Also add four new regression tests in TestNoPhantomTrailingBr that pin
the no-trailing-<br> invariant for the common shapes:
  - input ending with \n
  - quote followed by paragraph (the real-world case)
  - multi-line quote ending with \n
  - quote with blank continuation + trailing \n (internal <br> stays,
    trailing <br> does not)
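
The difference between the two trimming strategies can be reproduced in
isolation. drop_last_bare_marker and pop_trailing_blank are hypothetical
models of the old filter and the new while-loop, and to_html skips inline
markdown for brevity.

```python
import re

block = "> Hello\n"        # what the group regex captures, newline included
lines = block.split("\n")  # ['> Hello', '']


def drop_last_bare_marker(lines):
    """Model of the original filter: only a bare '>' on the last line goes."""
    if lines and lines[-1].strip() == ">":
        return lines[:-1]
    return list(lines)


def pop_trailing_blank(lines):
    """Model of the fix: pop trailing lines while empty or a bare '>'."""
    lines = list(lines)
    while lines and lines[-1].strip() in ("", ">"):
        lines.pop()
    return lines


def to_html(lines):
    inner = [re.sub(r"^>[ \t]?", "", line) for line in lines]
    body = "\n".join(line if line.strip() else "<br>" for line in inner)
    return f"<blockquote>{body}</blockquote>"


print(repr(to_html(drop_last_bare_marker(lines))))  # phantom <br> at the end
print(repr(to_html(pop_trailing_blank(lines))))     # clean
```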

Verified end-to-end with node against the actual JS regex.
244 renderer-adjacent tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(renderer): comprehensive markdown fixes — strikethrough, task lists, CRLF, nested blockquotes

Five additional fixes on top of the blockquote grouping from the initial commit:

1. CRLF normalisation: convert \r\n → \n at the start of renderMd so Windows
   line endings do not produce stray \r characters in rendered output

2. Strikethrough: ~~text~~ → <del>text</del> in both inlineMd() (for use
   inside blockquotes/lists) and the outer pass (for plain paragraphs).
   Added <del> to SAFE_TAGS and SAFE_INLINE so it is not HTML-escaped.

3. Task lists: - [x] / - [ ] items in unordered lists render as ✅/☐
   via task-done/task-todo span wrappers. Uppercase [X] is recognised too.

4. Nested blockquotes: >> / >>> etc. now recurse so each level gets its
   own <blockquote> element rather than passing through as literal >.
   Implemented by extracting the blockquote rule into _applyBlockquotes()
   which calls itself recursively on the stripped inner content.

5. Lists inside blockquotes: > - item now renders <ul><li> inside the
   blockquote instead of a literal "- item" string. Task list items work
   inside blockquotes too (> - [x] done → ✅ inside <blockquote><ul>).

Also fixed test_issue342.py search window (5000→10000 chars) — the CRLF
strip at the top of renderMd pushed the autolink regex past the old limit.

68 new tests in test_renderer_comprehensive.py + test_blockquote_rendering.py
covering all constructs, edge cases, and combinations.
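
A compact Python sketch of three of the rules above (CRLF normalisation,
strikethrough, task-list items), assuming simple regex-based handling. The
function names and the exact span markup are illustrative; only the
task-done/task-todo class names, the <del> tag, and the ✅/☐ glyphs come
from this change.

```python
import re


def normalise_newlines(src):
    """Convert Windows line endings so no carriage return reaches the output."""
    return src.replace("\r\n", "\n")


def strikethrough(text):
    return re.sub(r"~~(.+?)~~", r"<del>\1</del>", text)


def task_item(item):
    """Render the text of one unordered-list item, handling [x] / [ ]."""
    m = re.match(r"\[([ xX])\]\s+(.*)", item)
    if not m:
        return f"<li>{item}</li>"
    done = m.group(1) in ("x", "X")  # uppercase counts as done too
    cls, box = ("task-done", "✅") if done else ("task-todo", "☐")
    return f'<li><span class="{cls}">{box}</span> {m.group(2)}</li>'


print(strikethrough(normalise_newlines("~~old~~ new\r\n")))
print(task_item("[X] ship it"))
print(task_item("[ ] write docs"))
```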

* fix(renderer): restore space in blockquote prefix-strip regex

Commit 04e7b53 changed the blockquote prefix-strip regex from
  /^>[ \t]?/   (consume "> ", ">\t", or just ">")
to
  /^>[\t]?/    (only consume ">\t" or just ">")

The space character was dropped from the character class. Since
practically every blockquote an LLM produces is "> " (greater-than
followed by a space), this leaves a leading space artifact on every
stripped blockquote line. Worse, the leading space breaks the
list-detection regex `^(?:  )?[-*+] ` inside the new `_applyBlockquotes`
helper — that regex requires either zero or two leading spaces, never
one — so the new "list inside blockquote" feature never fired for
the canonical input shape `> - item`.

Reproducer (against the actual ui.js via node, before the fix):
  > Hello world         → <blockquote> Hello world</blockquote>
                                       ^ phantom leading space
  > Steps:              → <blockquote>Steps:
  > - one                  - one
  > - two                  - two</blockquote>
                          ^ literal text, NOT a <ul>; lists-in-quote feature broken
  > - [x] done          → blockquote with literal "[x] done", no checkbox span

Tests passed despite the bug because tests/test_blockquote_rendering.py
and tests/test_renderer_comprehensive.py validate against a Python
mirror (`_apply_blockquotes`) whose strip regex is `^>[ \t]?` — i.e.
the mirror is correct, the JS is not, and the static-mirror tests
can't catch the divergence. Same shape of bug as commit 94d63d0
(phantom <br> in trailing line) where the mirror was right and the JS
was wrong.

Fix: restore the space character in the strip regex's character class.
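
The divergence is easy to reproduce with the two patterns side by side.
Python's re handles these particular patterns the same way as the JavaScript
regexes, so the sketch below uses Python; the variable names are
illustrative.

```python
import re

GOOD = re.compile(r"^>[ \t]?")  # strips ">" plus one optional space or tab
BAD = re.compile(r"^>[\t]?")    # space missing from the character class
LIST_ITEM = re.compile(r"^(?:  )?[-*+] ")  # zero or two leading spaces only

for line in ("> Hello world", "> - one"):
    good, bad = GOOD.sub("", line), BAD.sub("", line)
    print(
        repr(good), bool(LIST_ITEM.match(good)),
        "|",
        repr(bad), bool(LIST_ITEM.match(bad)),
    )
# 'Hello world' False | ' Hello world' False
# '- one' True | ' - one' False
```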

Add tests/test_renderer_js_behaviour.py — 11 tests that drive the
ACTUAL renderMd via node and assert on rendered output for the most
common LLM shapes (single-line quote, multi-line quote, list inside
quote, task list inside quote, nested >>>, strikethrough inside and
outside quote, top-level task list, quote followed by heading,
multi-paragraph quote with list, CRLF normalisation).

Verified: the buggy regex makes 6 of those 11 tests fail; the corrected
regex makes all 11 pass.

Suite: 2354 passed, 0 new failures.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Collapse agent session compression chains

* Restore upstream changelog entries

* fix(agent_sessions): bubble active compression chains to top by tip last_activity

The original PR merge kept the chain head's title/started_at and overrode
id/model/message_count/ended_at/end_reason from the tip, but did NOT override
last_activity. The projected list is sorted by last_activity DESC and the
WebUI sidebar surfaces updated_at = last_activity, so an actively used
compression chain whose tip is being edited NOW sorted by the ROOT's old
last_activity and fell below recently touched standalone sessions.

Reproducer (with the harness against actual code, before the fix):
  - root: started 30 days ago, last msg 30 days ago
  - tip:  started 28 days ago (parent_session_id=root), last msg 5 seconds ago
  - standalone: last msg 2 days ago

  Sidebar order with original PR:
    [0] standalone  (48h ago)
    [1] active_tip  (last_activity=root's 720h ago)  ← wrong

  Sidebar order after fix:
    [0] active_tip  (last_activity=tip's 0h ago)     ← correct
    [1] standalone  (48h ago)

This matches Hermes Agent's own list_sessions_rich projection at
hermes_state.py:903-909, which overrides "last_active" from the tip
exactly so that the agent CLI's session list orders the same way.

Add ``last_activity`` to the merge-from-tip key list, update the existing
test_compression_chain_collapses_to_latest_tip_in_sidebar assertion to
expect tip-derived updated_at, and add
test_compression_chain_bubbles_to_top_by_tip_activity locking in the
bubble-to-top invariant — without this regression test the previous
behaviour passed CI because no test exercised the sort order against a
mixed set of chains and standalone sessions.

The chain head's started_at (created_at) and title remain preserved, so
users can still find the conversation by its original date and name.
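
The reproducer can be modelled with plain dicts. merge and sidebar are
hypothetical helpers that mimic the merge-from-tip key list and the
last_activity sort; timestamps are seconds relative to an arbitrary NOW.

```python
DAY = 86_400
NOW = 100 * DAY  # arbitrary reference time, in seconds

root = {"id": "root", "title": "Refactor plan",
        "started_at": NOW - 30 * DAY, "last_activity": NOW - 30 * DAY}
tip = {"id": "tip", "title": None,
       "started_at": NOW - 28 * DAY, "last_activity": NOW - 5}
standalone = {"id": "solo", "title": "Quick question",
              "started_at": NOW - 3 * DAY, "last_activity": NOW - 2 * DAY}


def merge(root, tip, keys):
    """Keep the root's identity, copy only `keys` from the tip."""
    merged = dict(root)
    merged.update({k: tip[k] for k in keys if k in tip})
    return merged


def sidebar(rows):
    ordered = sorted(rows, key=lambda r: r.get("last_activity") or 0, reverse=True)
    return [r["id"] for r in ordered]


before = sidebar([merge(root, tip, ("id",)), standalone])
after = sidebar([merge(root, tip, ("id", "last_activity")), standalone])
print(before)  # ['solo', 'tip']
print(after)   # ['tip', 'solo']
```

In both cases the merged row keeps the root's title and started_at; only the
sort key changes.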

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs: v0.50.216 release notes and version bump

Compression chains, renderer fixes, HTML preview, approval z-index, /steer fix.

* chore: gitignore local-only review harness directory

Adds .local-review/ to .gitignore so renderer drivers, sample inputs,
fixture builders, and other reviewer scratch files do not accidentally
get committed. Nothing under that path is ever shared in the repo;
keeping the entry tracked makes the boundary explicit for any future
contributor who creates the directory locally.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Keep reasoning chip visible for None effort

* test(reasoning): pin chip render output via node, not just source regex

The PR's static checks in test_reasoning_chip_btw_fixes.py validate the
shape of _applyReasoningChip (no display='none' literal, the right
classList.toggle call exists, the right label literals are in the
function body) but pass even if the runtime detail is wrong — for
example if `inactive` were inverted, _normalizeReasoningEffort
mishandled whitespace, or _formatReasoningEffortLabel returned the
wrong literal for an unknown input.

Add tests/test_reasoning_chip_js_behaviour.py — 11 tests that drive
the actual _applyReasoningChip() via node and assert on the rendered
DOM state for each effort value:

  TestChipAlwaysVisible
    - empty / null  -> "Default" label, inactive=true
    - "none"        -> "None" label, inactive=true
    - "low"/"high"  -> verbatim label, inactive=false
  TestNormalizationEdgeCases
    - "NONE"        -> normalises to "None"
    - "  none  "    -> trims and normalises
    - unknown junk  -> falls through visible, never hidden
  TestTitleAttributeAccessibility
    - title attribute carries the human-readable label for tooltip /
      screen-reader use
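
The contract those tests pin can be summarised as a small Python model.
normalize_effort and chip_state are hypothetical stand-ins for
_normalizeReasoningEffort and _applyReasoningChip, and showing unknown
values verbatim is an assumption; the tests only require that the chip
stays visible.

```python
def normalize_effort(value):
    """Trim and lower-case the effort; None and blank strings become ''."""
    return (value or "").strip().lower()


def chip_state(value):
    """Return (label, inactive, visible) for the reasoning chip."""
    effort = normalize_effort(value)
    if not effort:
        return ("Default", True, True)
    if effort == "none":
        return ("None", True, True)
    # Known and unknown efforts alike are shown, never hidden.
    return (effort, False, True)


for raw in (None, "", "none", "  NONE  ", "low", "high", "???"):
    print(repr(raw), chip_state(raw))
```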

Sanity-checked against master's pre-fix ui.js: 11/11 fail (bug caught).
Against this PR's ui.js: 11/11 pass.

This pattern (drive the actual JS via node) caught two regex-only
regressions in PR #1073 where the Python mirror was correct while the
JS was broken. Same protection added here so the chip-visibility
contract can't silently break in a future refactor.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs: add #1074 to v0.50.216 changelog, bump test count to 2428

* fix(i18n): restore broken Unicode in Russian and Spanish steer strings

Commit 56c7a14 (fix: update steer i18n and settings copy) accidentally
stripped the `\u` prefix from Unicode escape sequences in two locales,
producing garbled literal hex strings visible to users:

  Spanish (es):
    - cmd_steer:                   correcci00f3n  → corrección
    - cmd_steer_fallback:          2014 en cola   → — en cola
    - busy_steer_fallback:         2014 en cola   → — en cola
    - settings_desc_busy_input_mode: qu00e9, est00e1, correcci00f3n → qué, está, corrección
    - settings_busy_input_mode_steer: correcci00f3n  → corrección

  Russian (ru):
    - settings_desc_busy_input_mode: the entire Cyrillic string was
      replaced with raw 4-hex-char code-points without the \u prefix
      (041e043f... instead of actual Cyrillic). Decoded:
      "Определяет поведение при отправке сообщения во время работы
      агента. Очередь ждёт; Прерывание отменяет и начинает заново;
      Steer внедряет коррекцию без прерывания."

Fix: write the correct characters directly (UTF-8 is the file encoding
so embedding them literally is cleaner than \u escapes for long text).

All other locales (en, de, zh, zh-Hant) were not affected — confirmed
by grepping for bare hex run-ons in the updated file.
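
What the stripped prefix does to a string, and one way to spot it, sketched
in Python. BARE_ESCAPE is a hypothetical pattern that only covers 00xx
escapes glued to a letter (it would not flag the "2014 en cola" case), and
the repair line is for illustration; the actual fix writes the correct
characters directly.

```python
import re

broken = "correcci00f3n"      # what users saw
intended = "correcci\u00f3n"  # what the escape should have produced

# A letter directly followed by a 00xx hex run: the shape left behind when
# "\u" is stripped from an escape in the Latin-1 range.
BARE_ESCAPE = re.compile(r"[A-Za-z]00[0-9a-f]{2}")

print(bool(BARE_ESCAPE.search(broken)))    # True
print(bool(BARE_ESCAPE.search(intended)))  # False

# Illustrative repair: decode each 00xx run back into its character.
repaired = re.sub(r"00([0-9a-f]{2})", lambda m: chr(int(m.group(1), 16)), broken)
print(repaired == intended)                # True
```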

Verified: node --check static/i18n.js passes; full pytest suite green
(2365 passed, 47 skipped).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs: remove duplicate compression chain entry from [Unreleased]

---------

Co-authored-by: nesquena-hermes <nesquena-hermes@users.noreply.github.com>
Co-authored-by: Nathan Esquenazi <nesquena@gmail.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: Frank Song <franksong2702@gmail.com>
nesquena-hermes
2026-04-25 21:06:31 -07:00
committed by GitHub
parent 3d96dc1498
commit 58ad315dca
22 changed files with 1806 additions and 48 deletions
+4
@@ -39,3 +39,7 @@ Thumbs.db
docs/*
!docs/ui-ux/
!docs/ui-ux/**
# Local-only PR review harness: rendering drivers, sample bank, fixtures.
# Used by Claude during deep reviews; never shared in the repo.
.local-review/
+18
@@ -2,6 +2,24 @@
## [Unreleased]
## v0.50.216 — 2026-04-26
### Added
- **Compression chain collapse** — `get_importable_agent_sessions()` now merges linear compression continuation chains into a single sidebar entry, showing the chain tip's activity time and model. The chain root's title and start time are preserved for display; the latest importable segment is used for import. Non-compression parent/child pairs are unchanged. (`api/agent_sessions.py`, `tests/test_gateway_sync.py`) Closes #1012 [#1012 @franksong2702]
- **Comprehensive markdown renderer improvements** — blockquote grouping, strikethrough, task lists, CRLF normalisation, nested blockquotes, lists inside blockquotes. See details below. (`static/ui.js`) [#1073]
### Fixed
- **Blockquote rendering** — consecutive `> lines` now group into one `<blockquote>`, blank `>` continuation lines become `<br>`, bare `>` (no space) handled, `>>` nested blockquotes recurse correctly, lists inside blockquotes render `<ul>`, inline markdown (bold/italic/code) works inside quotes. (`static/ui.js`) [#1073]
- **Strikethrough** — `~~text~~` now renders as `<del>text</del>` in all contexts (paragraphs, blockquotes, list items). (`static/ui.js`) [#1073]
- **Task lists** — `- [x]` renders as ✅, `- [ ]` renders as ☐ in all unordered list contexts including inside blockquotes. (`static/ui.js`) [#1073]
- **CRLF line endings** — Windows `\r\n` line endings are normalised at the start of `renderMd()` so `\r` never appears in rendered text. (`static/ui.js`) [#1073]
- **HTML/HTM preview in workspace** — `.html` and `.htm` files now render correctly in the workspace preview iframe. Root cause: `MIME_MAP` was missing these extensions; the fallback `application/octet-stream` caused browsers to refuse to render in the iframe. (`api/config.py`) [#1070]
- **Approval card obscured by queue flyout** — the approval card's "Allow once / Allow session / Always allow / Deny" buttons are no longer hidden behind the queue flyout when both are visible simultaneously. (`static/style.css` — one line: `z-index:3` on `.approval-card.visible`) [#1071]
- **`/steer`, `/interrupt`, `/queue` not working while agent is busy** — typing these commands while the agent is running now executes them immediately instead of queuing the raw text. Root cause: `send()` returned early inside the busy block before reaching the slash-command dispatcher. Fix: intercept the three control commands at the top of the busy block. (`static/messages.js`) [#1072]
- **Reasoning chip always visible** — the composer reasoning chip is now shown for all effort states. When effort is unset/default it shows a muted "Default" label; when explicitly set to `none` it shows "None". Previously both states hid the chip entirely, removing the affordance to inspect or change it. (`static/ui.js`, `static/style.css`) Closes #1068 [#1074 @franksong2702]
- **Steer settings copy updated** — removed "falls back to interrupt" / "interrupt + send" language across all 6 locales; steer mode now correctly described as "mid-turn correction without interrupting". (`static/i18n.js`, `static/index.html`) [#1072]
## v0.50.215 — 2026-04-26
### Added
+1 -1
@@ -3,7 +3,7 @@
> Goal: Full 1:1 parity with the Hermes CLI experience via a clean dark web UI.
> Everything you can do from the CLI terminal, you can do from this UI.
>
> Last updated: v0.50.215 (April 26, 2026) — 2319 tests collected
> Last updated: v0.50.216 (April 26, 2026) — 2428 tests collected
> Tests: 2107 collected (`pytest tests/ --collect-only -q`)
> Source: <repo>/
+1 -1
@@ -8,7 +8,7 @@
> Prerequisites: SSH tunnel is active on port 8787. Open http://localhost:8787 in browser.
> Server health check: curl http://127.0.0.1:8787/health should return {"status":"ok"}.
>
> Automated coverage: 2319 tests collected via `pytest tests/ --collect-only -q`. Includes onboarding coverage for bootstrap/static wizard presence, real provider config persistence (`config.yaml` + `.env`), the `/api/onboarding/*` backend, the onboarding skip/existing-config guard, and CSS regression coverage for smooth thinking/tool card disclosure animation.
> Automated coverage: 2428 tests collected via `pytest tests/ --collect-only -q`. Includes onboarding coverage for bootstrap/static wizard presence, real provider config persistence (`config.yaml` + `.env`), the `/api/onboarding/*` backend, the onboarding skip/existing-config guard, and CSS regression coverage for smooth thinking/tool card disclosure animation.
> Run: `pytest tests/ -v --timeout=60`
>
> Local regression focus: verify that a previously closed workspace panel stays visually closed from first paint through boot completion on desktop refresh; there should be no brief open-then-close flash.
+129 -9
@@ -6,13 +6,126 @@ from pathlib import Path
logger = logging.getLogger(__name__)
def _optional_col(name: str, columns: set[str], fallback: str = "NULL") -> str:
    return f"s.{name}" if name in columns else f"{fallback} AS {name}"


def _is_compression_continuation(parent: dict | None, child: dict) -> bool:
    """Mirror Hermes Agent's compression-child guard.

    A child is a continuation only when the parent ended because of compression
    and the child started after that compression boundary. Plain parent/child
    relationships are left alone for future subagent-tree work.
    """
    if not parent:
        return False
    if parent.get('end_reason') != 'compression':
        return False
    ended_at = parent.get('ended_at')
    if ended_at is None:
        return False
    try:
        return float(child.get('started_at') or 0) >= float(ended_at)
    except (TypeError, ValueError):
        return False
def _project_agent_session_rows(rows: list[dict]) -> list[dict]:
    """Collapse compression chains into one logical sidebar row.

    The visible conversation should still look like the original chain head
    (title and timestamps), while importing should use the latest importable
    segment so the user continues from the current compressed state.
    """
    rows_by_id = {row['id']: row for row in rows}
    children_by_parent: dict[str, list[dict]] = {}
    continuation_child_ids = set()
    for row in rows:
        parent_id = row.get('parent_session_id')
        if not parent_id:
            continue
        children_by_parent.setdefault(parent_id, []).append(row)
        if _is_compression_continuation(rows_by_id.get(parent_id), row):
            continuation_child_ids.add(row['id'])
    for children in children_by_parent.values():
        children.sort(key=lambda row: row.get('started_at') or 0, reverse=True)

    def compression_tip(row: dict) -> tuple[dict | None, int]:
        current = row
        seen = {row['id']}
        latest_importable = row if (row.get('actual_message_count') or 0) > 0 else None
        segment_count = 1
        for _ in range(len(rows_by_id) + 1):
            candidates = [
                child for child in children_by_parent.get(current['id'], [])
                if child['id'] not in seen and _is_compression_continuation(current, child)
            ]
            if not candidates:
                return latest_importable, segment_count
            current = candidates[0]
            seen.add(current['id'])
            segment_count += 1
            if (current.get('actual_message_count') or 0) > 0:
                latest_importable = current
        return latest_importable, segment_count

    projected = []
    for row in rows:
        if row['id'] in continuation_child_ids:
            continue
        segment_count = 1
        tip = row
        if row.get('end_reason') == 'compression':
            tip, segment_count = compression_tip(row)
        if not tip or (tip.get('actual_message_count') or 0) <= 0:
            continue
        if tip is row:
            projected.append(dict(row))
            continue
        merged = dict(row)
        # Keep the chain head's visible identity (title, started_at), but
        # point the row at the latest importable segment for navigation AND
        # surface the tip's recency so an actively-used chain bubbles to the
        # top of the sidebar by its true last activity. Without overriding
        # last_activity, a long-lived chain whose tip is being edited NOW
        # would sort by the root's old timestamp and fall below recently
        # touched standalone sessions, exactly the inverse of what a user
        # expects from "Show agent sessions" sorted by activity.
        for key in (
            'id', 'model', 'message_count', 'actual_message_count',
            'ended_at', 'end_reason', 'last_activity',
        ):
            if key in tip:
                merged[key] = tip[key]
        if not merged.get('title'):
            merged['title'] = tip.get('title')
        if not merged.get('source'):
            merged['source'] = tip.get('source')
        merged['_lineage_root_id'] = row['id']
        merged['_lineage_tip_id'] = tip['id']
        merged['_compression_segment_count'] = segment_count
        projected.append(merged)
    projected.sort(
        key=lambda row: row.get('last_activity') or row.get('started_at') or 0,
        reverse=True,
    )
    return projected
def read_importable_agent_session_rows(db_path: Path, limit: int = 200, log=None) -> list[dict]:
"""Return non-WebUI agent sessions that have readable message rows.
"""Return non-WebUI agent sessions projected as importable conversations.
Hermes Agent can create rows in ``state.db.sessions`` before a session has
any messages. WebUI cannot import those rows, so both the regular
``/api/sessions`` path and the gateway SSE watcher must filter them the
same way.
any messages, and long conversations can be split into compression-linked
rows. WebUI cannot import empty rows and should not show compression
segments as separate conversations, so both the regular ``/api/sessions``
path and the gateway SSE watcher use this shared projection.
"""
db_path = Path(db_path)
if not db_path.exists():
@@ -36,20 +149,27 @@ def read_importable_agent_session_rows(db_path: Path, limit: int = 200, log=None
)
return []
parent_expr = _optional_col('parent_session_id', session_cols)
ended_expr = _optional_col('ended_at', session_cols)
end_reason_expr = _optional_col('end_reason', session_cols)
cur.execute(
"""
f"""
SELECT s.id, s.title, s.model, s.message_count,
s.started_at, s.source,
{parent_expr},
{ended_expr},
{end_reason_expr},
COUNT(m.id) AS actual_message_count,
MAX(m.timestamp) AS last_activity
FROM sessions s
LEFT JOIN messages m ON m.session_id = s.id
WHERE s.source IS NOT NULL AND s.source != 'webui'
GROUP BY s.id
HAVING COUNT(m.id) > 0
ORDER BY COALESCE(MAX(m.timestamp), s.started_at) DESC
LIMIT ?
""",
(int(limit),),
)
return [dict(row) for row in cur.fetchall()]
projected = _project_agent_session_rows([dict(row) for row in cur.fetchall()])
if limit is None:
return projected
return projected[:max(0, int(limit))]
+2
@@ -442,6 +442,8 @@ MIME_MAP = {
".bmp": "image/bmp",
".pdf": "application/pdf",
".json": "application/json",
".html": "text/html",
".htm": "text/html",
".xls": "application/vnd.ms-excel",
".xlsx": "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
".doc": "application/msword",
+2 -2
@@ -595,8 +595,8 @@ async function cmdSteer(args){
* Shared implementation for /steer and the busy_input_mode='steer' path.
*
* Tries the real steer endpoint first. On any non-accept response (no cached
* agent, agent lacks steer, stream dead, etc.) falls back to interrupt mode:
* queue the message + cancel the stream so the existing drain re-sends.
* agent, agent lacks steer, stream dead, etc.) falls back to interrupt+queue:
* queues the message and cancels the stream so the drain re-sends it.
*
* @param {string} msg - The steer text.
* @param {boolean} explicitSteer - True if the user explicitly invoked /steer
+22 -22
@@ -101,23 +101,23 @@ const LOCALES = {
no_active_session: 'No active session',
cmd_queue: 'Queue a message for the next turn',
cmd_interrupt: 'Cancel current turn and send a new message',
cmd_steer: 'Steer the agent with a correction (falls back to interrupt)',
cmd_steer: 'Inject a mid-turn correction without interrupting the agent',
cmd_queue_no_msg: 'Usage: /queue <message>',
cmd_queue_not_busy: 'No active task — just send normally',
cmd_queue_confirm: 'Message queued',
cmd_interrupt_no_msg: 'Usage: /interrupt <message>',
cmd_interrupt_confirm: 'Interrupted — sending new message',
cmd_steer_no_msg: 'Usage: /steer <message>',
cmd_steer_fallback: 'Steer unavailable — interrupted and queued instead',
cmd_steer_fallback: 'Steer unavailable — queued for next turn instead',
cmd_steer_delivered: 'Steer delivered — agent will see it on its next tool result',
steer_leftover_queued: 'Steer queued for next turn',
busy_steer_fallback: 'Steer not available — interrupted instead',
busy_steer_fallback: 'Steer unavailable — queued for next turn',
busy_interrupt_confirm: 'Interrupted — sending new message',
settings_label_busy_input_mode: 'Busy input mode',
settings_desc_busy_input_mode: 'Controls what happens when you send a message while the agent is running. Queue waits; Interrupt cancels and starts fresh; Steer sends a correction (currently falls back to interrupt).',
settings_desc_busy_input_mode: 'Controls what happens when you send a message while the agent is running. Queue waits; Interrupt cancels and starts fresh; Steer injects a correction mid-turn without interrupting (falls back to queue when agent or stream unavailable).',
settings_busy_input_mode_queue: 'Queue follow-up',
settings_busy_input_mode_interrupt: 'Interrupt current turn',
settings_busy_input_mode_steer: 'Steer (interrupt + send)',
settings_busy_input_mode_steer: 'Steer (mid-turn correction)',
slash_skill_badge:'Skill',
slash_skill_desc:'Invoke this skill',
@@ -742,7 +742,7 @@ const LOCALES = {
busy_steer_fallback: 'Steer недоступен — прервано',
busy_interrupt_confirm: 'Прервано — отправка нового сообщения',
settings_label_busy_input_mode: 'Режим ввода при занятости',
settings_desc_busy_input_mode: 'Определяет поведение при отправке сообщения во время работы агента. Очередь ждёт; Прерывание отменяет и начинает заново; Steer отправляет исправление (сейчас как прерывание).',
settings_desc_busy_input_mode: 'Определяет поведение при отправке сообщения во время работы агента. Очередь ждёт; Прерывание отменяет и начинает заново; Steer внедряет коррекцию без прерывания.',
settings_busy_input_mode_queue: 'Поставить в очередь',
settings_busy_input_mode_interrupt: 'Прервать текущий оборот',
settings_busy_input_mode_steer: 'Steer (прерывание + отправка)',
@@ -1340,23 +1340,23 @@ const LOCALES = {
no_active_session: 'No hay ninguna sesión activa',
cmd_queue: 'Poner mensaje en cola para el siguiente turno',
cmd_interrupt: 'Cancelar turno actual y enviar nuevo mensaje',
cmd_steer: 'Redirigir al agente con una correcci\u00f3n (usa interrupci\u00f3n)',
cmd_steer: 'Inyectar una corrección a mitad del turno sin interrumpir al agente',
cmd_queue_no_msg: 'Uso: /queue <mensaje>',
cmd_queue_not_busy: 'Sin tarea activa \u2014 env\u00eda normalmente',
cmd_queue_confirm: 'Mensaje en cola',
cmd_interrupt_no_msg: 'Uso: /interrupt <mensaje>',
cmd_interrupt_confirm: 'Interrumpido \u2014 enviando nuevo mensaje',
cmd_steer_no_msg: 'Uso: /steer <mensaje>',
cmd_steer_fallback: 'Steer no disponible \u2014 interrumpido y encolado',
cmd_steer_fallback: 'Steer no disponible — en cola para el siguiente turno',
cmd_steer_delivered: 'Steer entregado \u2014 el agente lo ver\u00e1 en su pr\u00f3ximo resultado de herramienta',
steer_leftover_queued: 'Steer en cola para el pr\u00f3ximo turno',
busy_steer_fallback: 'Steer no disponible \u2014 interrumpido',
busy_steer_fallback: 'Steer no disponible — en cola para el siguiente turno',
busy_interrupt_confirm: 'Interrumpido \u2014 enviando nuevo mensaje',
settings_label_busy_input_mode: 'Modo de entrada ocupada',
settings_desc_busy_input_mode: 'Controla qu\u00e9 sucede al enviar un mensaje mientras el agente est\u00e1 activo. Cola espera; Interrumpir cancela y empieza de nuevo; Steer env\u00eda una correcci\u00f3n (actualmente usa interrupci\u00f3n).',
settings_desc_busy_input_mode: 'Controla qué sucede al enviar mensajes mientras el agente está activo. Cola espera; Interrumpir cancela y empieza de nuevo; Steer inyecta una corrección sin interrumpir (usa cola si el agente no está disponible).',
settings_busy_input_mode_queue: 'Poner en cola',
settings_busy_input_mode_interrupt: 'Interrumpir turno actual',
settings_busy_input_mode_steer: 'Steer (interrupci\u00f3n + env\u00edo)',
settings_busy_input_mode_steer: 'Steer (corrección a mitad de turno)',
no_personalities: 'No se encontraron personalidades (añádelas a ~/.hermes/personalities/)',
available_personalities: 'Personalidades disponibles:',
personality_switch_hint: '\n\nUsa `/personality <name>` para cambiar, o `/personality none` para limpiar.',
@@ -1920,23 +1920,23 @@ const LOCALES = {
no_active_session: 'Keine aktive Sitzung',
cmd_queue: 'Nachricht f\u00fcr den n\u00e4chsten Durchgang einreihen',
cmd_interrupt: 'Aktuellen Durchgang abbrechen und neue Nachricht senden',
cmd_steer: 'Agent mit Korrektur lenken (f\u00e4llt zur\u00fcck auf Unterbrechung)',
cmd_steer: 'Korrektursignal einf\u00fcgen ohne Unterbrechung',
cmd_queue_no_msg: 'Verwendung: /queue <Nachricht>',
cmd_queue_not_busy: 'Keine aktive Aufgabe \u2014 normal senden',
cmd_queue_confirm: 'Nachricht eingereiht',
cmd_interrupt_no_msg: 'Verwendung: /interrupt <Nachricht>',
cmd_interrupt_confirm: 'Unterbrochen \u2014 neue Nachricht wird gesendet',
cmd_steer_no_msg: 'Verwendung: /steer <Nachricht>',
cmd_steer_fallback: 'Steer nicht verf\u00fcgbar \u2014 unterbrochen und eingereiht',
cmd_steer_fallback: 'Steer nicht verf\u00fcgbar \u2014 f\u00fcr n\u00e4chsten Durchgang eingereiht',
cmd_steer_delivered: 'Steer geliefert \u2014 der Agent sieht es bei seinem n\u00e4chsten Tool-Ergebnis',
steer_leftover_queued: 'Steer f\u00fcr n\u00e4chsten Durchgang eingereiht',
busy_steer_fallback: 'Steer nicht verf\u00fcgbar \u2014 unterbrochen',
busy_steer_fallback: 'Steer nicht verf\u00fcgbar \u2014 f\u00fcr n\u00e4chsten Durchgang eingereiht',
busy_interrupt_confirm: 'Unterbrochen \u2014 neue Nachricht wird gesendet',
settings_label_busy_input_mode: 'Eingabemodus bei Besch\u00e4ftigung',
settings_desc_busy_input_mode: 'Steuert, was passiert, wenn Sie w\u00e4hrend der Agentenaktivit\u00e4t eine Nachricht senden. Warteschlange wartet; Unterbrechen bricht ab und startet neu; Steer sendet eine Korrektur (aktuell wie Unterbrechen).',
settings_desc_busy_input_mode: 'Steuert, was passiert, wenn Sie w\u00e4hrend der Agentenaktivit\u00e4t eine Nachricht senden. Warteschlange wartet; Unterbrechen bricht ab und startet neu; Steer f\u00fcgt eine Korrektur ein ohne zu unterbrechen.',
settings_busy_input_mode_queue: 'In Warteschlange einreihen',
settings_busy_input_mode_interrupt: 'Aktuellen Durchgang unterbrechen',
settings_busy_input_mode_steer: 'Steer (Unterbrechen + Senden)',
settings_busy_input_mode_steer: 'Steer (Korrektur ohne Unterbrechung)',
no_personalities: 'Keine Persönlichkeiten gefunden (füge sie in ~/.hermes/personalities/ hinzu)',
available_personalities: 'Verfügbare Persönlichkeiten:',
personality_switch_hint: '\n\nNutze `/personality <name>` zum Wechseln, oder `/personality none` zum Löschen.',
@@ -2303,7 +2303,7 @@ const LOCALES = {
busy_steer_fallback: 'Steer \u4e0d\u53ef\u7528 \u2014 \u5df2\u4e2d\u65ad',
busy_interrupt_confirm: '\u5df2\u4e2d\u65ad \u2014 \u6b63\u5728\u53d1\u9001\u65b0\u6d88\u606f',
settings_label_busy_input_mode: '\u5fd9\u788c\u8f93\u5165\u6a21\u5f0f',
settings_desc_busy_input_mode: '\u63a7\u5236\u5728\u4ee3\u7406\u8fd0\u884c\u65f6\u53d1\u9001\u6d88\u606f\u7684\u884c\u4e3a\u3002\u961f\u5217\u7b49\u5f85\uff1b\u4e2d\u65ad\u53d6\u6d88\u5e76\u91cd\u65b0\u5f00\u59cb\uff1bSteer \u53d1\u9001\u7ea0\u6b63\uff08\u76ee\u524d\u56de\u9000\u4e3a\u4e2d\u65ad\uff09\u3002',
settings_desc_busy_input_mode: '\u63a7\u5236\u5728\u4ee3\u7406\u8fd0\u884c\u65f6\u53d1\u9001\u6d88\u606f\u7684\u884c\u4e3a\u3002\u961f\u5217\u7b49\u5f85\uff1b\u4e2d\u65ad\u53d6\u6d88\u5e76\u91cd\u65b0\u5f00\u59cb\uff1bSteer\u4e2d\u9014\u6ce8\u5165\u7ea0\u6b63\uff0c\u4e0d\u4e2d\u65ad\u3002',
settings_busy_input_mode_queue: '\u52a0\u5165\u961f\u5217',
settings_busy_input_mode_interrupt: '\u4e2d\u65ad\u5f53\u524d\u56de\u5408',
settings_busy_input_mode_steer: 'Steer\uff08\u4e2d\u65ad + \u53d1\u9001\uff09',
@@ -3215,23 +3215,23 @@ const LOCALES = {
no_active_session: '\u7121\u6d3b\u8e8d\u6703\u8a71',
cmd_queue: '\u5c07\u8a0a\u606f\u52a0\u5165\u4e0b\u4e00\u8f2a\u7684\u4f47\u5217',
cmd_interrupt: '\u53d6\u6d88\u7576\u524d\u56de\u5408\u4e26\u767c\u9001\u65b0\u8a0a\u606f',
cmd_steer: '\u7528\u7d0a\u6b63\u8a0a\u606f\u5f15\u5c0e\u4ee3\u7406\uff08\u56de\u9000\u70ba\u4e2d\u65ad\uff09',
cmd_steer: '\u5728\u56de\u5408\u9032\u884c\u4e2d\u6ce8\u5165\u7d3a\u6b63\uff0c\u4e0d\u4e2d\u65b7\u4ee3\u7406',
cmd_queue_no_msg: '\u7528\u6cd5\uff1a/queue <\u8a0a\u606f>',
cmd_queue_not_busy: '\u6c92\u6709\u6d3b\u52d5\u4efb\u52d9 \u2014 \u76f4\u63a5\u767c\u9001\u5373\u53ef',
cmd_queue_confirm: '\u8a0a\u606f\u5df2\u52a0\u5165\u4f47\u5217',
cmd_interrupt_no_msg: '\u7528\u6cd5\uff1a/interrupt <\u8a0a\u606f>',
cmd_interrupt_confirm: '\u5df2\u4e2d\u65ad \u2014 \u6b63\u5728\u767c\u9001\u65b0\u8a0a\u606f',
cmd_steer_no_msg: '\u7528\u6cd5\uff1a/steer <\u8a0a\u606f>',
cmd_steer_fallback: 'Steer \u4e0d\u53ef\u7528 \u2014 \u5df2\u4e2d\u65ad\u4e26\u52a0\u5165\u4f47\u5217',
cmd_steer_fallback: 'Steer \u4e0d\u53ef\u7528 \u2014 \u5df2\u52a0\u5165\u4e0b\u4e00\u8f2a\u4f47\u5217',
cmd_steer_delivered: 'Steer \u5df2\u9001\u9054 \u2014 \u4ee3\u7406\u5c07\u5728\u4e0b\u4e00\u500b\u5de5\u5177\u7d50\u679c\u4e2d\u770b\u5230',
steer_leftover_queued: 'Steer \u5df2\u52a0\u5165\u4e0b\u4e00\u8f2a\u4f47\u5217',
busy_steer_fallback: 'Steer \u4e0d\u53ef\u7528 \u2014 \u5df2\u4e2d\u65ad',
busy_steer_fallback: 'Steer \u4e0d\u53ef\u7528 \u2014 \u5df2\u52a0\u5165\u4e0b\u4e00\u8f2a\u4f47\u5217',
busy_interrupt_confirm: '\u5df2\u4e2d\u65ad \u2014 \u6b63\u5728\u767c\u9001\u65b0\u8a0a\u606f',
settings_label_busy_input_mode: '\u5fd9\u788c\u8f38\u5165\u6a21\u5f0f',
settings_desc_busy_input_mode: '\u63a7\u5236\u5728\u4ee3\u7406\u904b\u884c\u6642\u767c\u9001\u8a0a\u606f\u7684\u884c\u70ba\u3002\u4f47\u5217\u7b49\u5f85\uff1b\u4e2d\u65ad\u53d6\u6d88\u4e26\u91cd\u65b0\u958b\u59cb\uff1bSteer \u767c\u9001\u7d0a\u6b63\uff08\u76ee\u524d\u56de\u9000\u70ba\u4e2d\u65ad\uff09\u3002',
settings_desc_busy_input_mode: '\u63a7\u5236\u5728\u4ee3\u7406\u904b\u884c\u6642\u767c\u9001\u8a0a\u606f\u7684\u884c\u70ba\u3002\u4f47\u5217\u7b49\u5f85\uff1b\u4e2d\u65b7\u53d6\u6d88\u4e26\u91cd\u65b0\u958b\u59cb\uff1bSteer\u4e2d\u9014\u6ce8\u5165\u7d3a\u6b63\uff0c\u4e0d\u4e2d\u65b7\u3002',
settings_busy_input_mode_queue: '\u52a0\u5165\u4f47\u5217',
settings_busy_input_mode_interrupt: '\u4e2d\u65ad\u7576\u524d\u56de\u5408',
settings_busy_input_mode_steer: 'Steer\uff08\u4e2d\u65ad + \u767c\u9001\uff09',
settings_busy_input_mode_steer: 'Steer\uff08\u4e2d\u9014\u7d3a\u6b63\uff09',
no_active_task: '\u7121\u57f7\u884c\u4e2d\u7684\u4efb\u52d9\u53ef\u505c\u6b62\u3002',
no_notes_yet: '\u5c1a\u7121\u5099\u8a3b\u3002',
no_profile_yet: '\u5c1a\u7121\u8a2d\u5b9a\u6a94\u3002',
+1 -1
@@ -667,7 +667,7 @@
<select id="settingsBusyInputMode" style="width:100%;padding:8px;background:var(--code-bg);color:var(--text);border:1px solid var(--border2);border-radius:6px">
<option value="queue" data-i18n="settings_busy_input_mode_queue">Queue follow-up</option>
<option value="interrupt" data-i18n="settings_busy_input_mode_interrupt">Interrupt current turn</option>
<option value="steer" data-i18n="settings_busy_input_mode_steer">Steer (interrupt + send)</option>
<option value="steer" data-i18n="settings_busy_input_mode_steer">Steer (mid-turn correction)</option>
</select>
<div style="font-size:11px;color:var(--muted);margin-top:4px" data-i18n="settings_desc_busy_input_mode">Controls what happens when you send a message while the agent is running. Queue waits for the current task; Interrupt cancels and starts fresh; Steer injects a mid-turn correction without interrupting (falls back to interrupt when the agent is not yet cached or the stream has ended).</div>
</div>
+17
@@ -14,6 +14,23 @@ async function send(){
if(S.busy||compressionRunning){
if(text){
if(!S.session){await newSession();await renderSessionList();}
// Busy-control slash commands must be intercepted HERE, before the
// busyMode routing block, so the user can always type /steer, /interrupt,
// or /queue while the agent is running and have them execute immediately.
// Without this intercept they fall through to the queue and execute after
// the current turn ends — by which point there is no active stream and
// cmdSteer / cmdInterrupt say "No active task to stop."
if(text.startsWith('/')){
const _pc=typeof parseCommand==='function'&&parseCommand(text);
if(_pc&&['steer','interrupt','queue'].includes(_pc.name)){
const _bc=COMMANDS.find(c=>c.name===_pc.name);
if(_bc){
$('msg').value='';autoResize();
await _bc.fn(_pc.args);
return;
}
}
}
const busyMode=window._busyInputMode||'queue';
if(busyMode==='steer'&&S.activeStreamId&&typeof _trySteer==='function'){
// Real steer: clear the input first so the user gets immediate
+2 -1
@@ -401,7 +401,7 @@
.composer-flyout{position:relative;height:0;z-index:1;}
/* ── Approval card ── */
.approval-card{position:absolute;left:0;right:0;bottom:-24px;max-width:var(--msg-max);margin:0 auto;padding:0 20px;box-sizing:border-box;width:100%;overflow:hidden;pointer-events:none;}
.approval-card.visible{pointer-events:auto;}
.approval-card.visible{pointer-events:auto;z-index:3;}
.approval-inner{background:var(--surface);backdrop-filter:blur(8px);border:1px solid var(--accent-bg-strong);border-radius:14px;padding:16px 18px 40px;transform:translateY(100%);opacity:0;transition:transform .4s cubic-bezier(.32,.72,.16,1),opacity .25s ease;}
.approval-card.visible .approval-inner{transform:translateY(0);opacity:1;}
.approval-header{display:flex;align-items:center;gap:8px;margin-bottom:10px;font-size:13px;font-weight:600;color:var(--error);}
@@ -655,6 +655,7 @@
.composer-workspace-label{min-width:0;overflow:hidden;text-overflow:ellipsis;white-space:nowrap;}
.composer-reasoning-wrap{position:relative;flex:0 1 auto;min-width:0;}
.composer-reasoning-chip{display:inline-flex;align-items:center;gap:8px;max-width:180px;padding:8px 10px 8px 12px;border-radius:999px;border:1px solid transparent;background-color:transparent;color:var(--muted);font-weight:500;cursor:pointer;transition:color .15s,background-color .15s,border-color .15s;}
.composer-reasoning-chip.inactive{opacity:.78;}
.composer-reasoning-chip:hover{color:var(--text);background-color:var(--hover-bg);}
.composer-reasoning-chip.active{color:var(--text);background:var(--accent-bg);border-color:var(--accent-bg);}
.composer-reasoning-icon,.composer-reasoning-chevron{display:inline-flex;align-items:center;justify-content:center;flex-shrink:0;line-height:1;}
+69 -10
@@ -432,15 +432,31 @@ window.addEventListener('resize',()=>{
// ── Reasoning effort chip ────────────────────────────────────────────────────
let _currentReasoningEffort=null;
function _normalizeReasoningEffort(eff){
return String(eff||'').trim().toLowerCase();
}
function _formatReasoningEffortLabel(effort){
if(effort==='none') return 'None';
if(!effort) return 'Default';
return effort;
}
function _applyReasoningChip(eff){
_currentReasoningEffort=eff;
const effort=_normalizeReasoningEffort(eff);
_currentReasoningEffort=effort;
const wrap=$('composerReasoningWrap');
const label=$('composerReasoningLabel');
const chip=$('composerReasoningChip');
if(!wrap||!label) return;
if(!eff||eff==='none'){wrap.style.display='none';return;}
wrap.style.display='';
label.textContent=eff;
_highlightReasoningOption(eff);
label.textContent=_formatReasoningEffortLabel(effort);
if(chip){
const inactive=!effort||effort==='none';
chip.classList.toggle('inactive',inactive);
chip.title='Reasoning effort: '+_formatReasoningEffortLabel(effort);
}
_highlightReasoningOption(effort);
}
function fetchReasoningChip(){
@@ -664,7 +680,7 @@ function _sanitizeThinkingDisplayText(text){
}
function renderMd(raw){
let s=raw||'';
let s=(raw||'').replace(/\r\n/g,'\n').replace(/\r/g,'\n');
// ── MEDIA: token stash (must run first, before any other processing) ───────
// Detect MEDIA:<path-or-url> tokens emitted by the agent (e.g. screenshots,
// generated images) and replace them with inline <img> or download links.
@@ -731,6 +747,8 @@ function renderMd(raw){
t=t.replace(/\*\*\*(.+?)\*\*\*/g,(_,x)=>`<strong><em>${esc(x)}</em></strong>`);
t=t.replace(/\*\*(.+?)\*\*/g,(_,x)=>`<strong>${esc(x)}</strong>`);
t=t.replace(/\*([^*\n]+)\*/g,(_,x)=>`<em>${esc(x)}</em>`);
// Strikethrough: ~~text~~ → <del>text</del>
t=t.replace(/~~(.+?)~~/g,(_,x)=>`<del>${esc(x)}</del>`);
// #487: Image pass — runs while code stash is active so ![x](url) inside
// backticks stays protected as a \x00C token and is never rendered as <img>.
// Must run before _code_stash restore and before _link_stash so the image
@@ -748,7 +766,7 @@ function renderMd(raw){
t=t.replace(/\x00G(\d+)\x00/g,(_,i)=>_img_stash[+i]);
// Escape any plain text that isn't already wrapped in a tag we produced
// by escaping bare < > that are not part of our own tags
const SAFE_INLINE=/^<\/?(strong|em|code|a|img)([\s>]|$)/i;
const SAFE_INLINE=/^<\/?(strong|em|del|code|a|img)([\s>]|$)/i;
t=t.replace(/<\/?[a-z][^>]*>/gi,tag=>SAFE_INLINE.test(tag)?tag:esc(tag));
return t;
}
@@ -759,10 +777,47 @@ function renderMd(raw){
s=s.replace(/\*\*\*(.+?)\*\*\*/g,(_,t)=>`<strong><em>${esc(t)}</em></strong>`);
s=s.replace(/\*\*(.+?)\*\*/g,(_,t)=>`<strong>${esc(t)}</strong>`);
s=s.replace(/\*([^*\n]+)\*/g,(_,t)=>`<em>${esc(t)}</em>`);
s=s.replace(/~~(.+?)~~/g,(_,t)=>`<del>${esc(t)}</del>`);
s=s.replace(/\x00O(\d+)\x00/g,(_,i)=>_ob_stash[+i]);
s=s.replace(/^### (.+)$/gm,(_,t)=>`<h3>${inlineMd(t)}</h3>`).replace(/^## (.+)$/gm,(_,t)=>`<h2>${inlineMd(t)}</h2>`).replace(/^# (.+)$/gm,(_,t)=>`<h1>${inlineMd(t)}</h1>`);
s=s.replace(/^---+$/gm,'<hr>');
s=s.replace(/^> (.+)$/gm,(_,t)=>`<blockquote>${inlineMd(t)}</blockquote>`);
// Group consecutive > lines into one <blockquote>.
// Handles: blank continuation lines (> alone), nested blockquotes (>>),
// lists inside blockquotes (> - item), and inline markdown in quoted text.
function _applyBlockquotes(src){
return src.replace(/((?:^>[^\n]*(?:\n|$))+)/gm,block=>{
const lines=block.split('\n');
// Drop trailing bare '>' artifact
while(lines.length&&(lines[lines.length-1].trim()==='>'||lines[lines.length-1]===''))
{if(lines[lines.length-1].trim()==='>'){lines.pop();break;}lines.pop();}
const stripped=lines.map(l=>l.replace(/^>[ \t]?/,''));
const innerRaw=stripped.join('\n');
let inner;
if(/^>/m.test(innerRaw)){
// Nested blockquote: recurse so >> → <blockquote><blockquote>
inner=_applyBlockquotes(innerRaw);
} else if(/(^(?: )?[-*+] .+)/m.test(innerRaw)){
// List inside blockquote: run list pass on stripped inner content
inner=innerRaw.replace(/((?:^(?: )?[-*+] .+\n?)+)/gm,lb=>{
const ll=lb.trimEnd().split('\n');let h='<ul>';
for(const li of ll){
const txt=li.replace(/^ {0,4}[-*+] /,'');
let ih;
if(/^\[x\] /i.test(txt)) ih='<span class="task-done">✅</span> '+inlineMd(txt.slice(4));
else if(/^\[ \] /.test(txt)) ih='<span class="task-todo">☐</span> '+inlineMd(txt.slice(4));
else ih=inlineMd(txt);
h+=`<li>${ih}</li>`;
}
return h+'</ul>';
});
} else {
// Plain lines: blank line → <br>, text → inlineMd
inner=stripped.map(l=>l.trim()===''?'<br>':inlineMd(l)).join('\n');
}
return `<blockquote>${inner}</blockquote>`;
});
}
s=_applyBlockquotes(s);
// B8: improved list handling supporting up to 2 levels of indentation
s=s.replace(/((?:^(?: )?[-*+] .+\n?)+)/gm,block=>{
const lines=block.trimEnd().split('\n');
@@ -770,8 +825,12 @@ function renderMd(raw){
for(const l of lines){
const indent=/^ {2,}/.test(l);
const text=l.replace(/^ {0,4}[-*+] /,'');
if(indent) html+=`<li style="margin-left:16px">${inlineMd(text)}</li>`;
else html+=`<li>${inlineMd(text)}</li>`;
let _ih;
if(/^\[x\] /i.test(text)) _ih='<span class="task-done">✅</span> '+inlineMd(text.slice(4));
else if(/^\[ \] /.test(text)) _ih='<span class="task-todo">☐</span> '+inlineMd(text.slice(4));
else _ih=inlineMd(text);
if(indent) html+=`<li style="margin-left:16px">${_ih}</li>`;
else html+=`<li>${_ih}</li>`;
}
return html+'</ul>';
});
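The task-list branch added to the list pass can be sketched outside the browser. This is a minimal Python illustration, assuming `html.escape` as a stand-in for the renderer's `inlineMd()`; `render_task_item` is a hypothetical name and is not part of ui.js.

```python
import html
import re


def render_task_item(text: str) -> str:
    """Render the inner HTML of one list item, mirroring the three branches above.

    html.escape stands in for inlineMd(), so inline markdown is NOT rendered here.
    """
    # "[x] " and "[ ] " are both four characters long, hence the slice at 4.
    if re.match(r"^\[x\] ", text, re.IGNORECASE):
        return '<span class="task-done">✅</span> ' + html.escape(text[4:])
    if re.match(r"^\[ \] ", text):
        return '<span class="task-todo">☐</span> ' + html.escape(text[4:])
    return html.escape(text)


done = render_task_item("[X] ship it")
todo = render_task_item("[ ] write docs")
plain = render_task_item("plain item")
```

As in the diff, the checked marker is matched case-insensitively and the unchecked marker requires a literal space between the brackets.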
@@ -820,7 +879,7 @@ function renderMd(raw){
// Our pipeline only emits: <strong>,<em>,<code>,<pre>,<h1-6>,<ul>,<ol>,<li>,
// <table>,<thead>,<tbody>,<tr>,<th>,<td>,<hr>,<blockquote>,<p>,<br>,<a>,
// <div class="..."> (mermaid/pre-header). Everything else is untrusted input.
const SAFE_TAGS=/^<\/?(strong|em|code|pre|h[1-6]|ul|ol|li|table|thead|tbody|tr|th|td|hr|blockquote|p|br|a|img|div|span)([\s>]|$)/i;
const SAFE_TAGS=/^<\/?(strong|em|del|code|pre|h[1-6]|ul|ol|li|table|thead|tbody|tr|th|td|hr|blockquote|p|br|a|img|div|span)([\s>]|$)/i;
s=s.replace(/<\/?[a-z][^>]*>/gi,tag=>SAFE_TAGS.test(tag)?tag:esc(tag));
// Autolink: convert plain URLs to clickable links.
// Stash <a>, <img> and <pre> blocks so autolink never runs inside them.
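The reason `del` has to be added to the allow-list as well as to the inline pass can be shown with a small Python sketch. It assumes `html.escape` in place of the renderer's `esc()` helper; `render_strike` and `sanitize_tags` are hypothetical names used only for illustration.

```python
import html
import re

# Same tag names as the updated SAFE_TAGS pattern in the diff above.
SAFE_TAGS = re.compile(
    r"^</?(strong|em|del|code|pre|h[1-6]|ul|ol|li|table|thead|tbody|tr|th|td"
    r"|hr|blockquote|p|br|a|img|div|span)([\s>]|$)",
    re.IGNORECASE,
)
ANY_TAG = re.compile(r"</?[a-z][^>]*>", re.IGNORECASE)


def render_strike(text: str) -> str:
    """Turn ~~text~~ into <del>text</del>, escaping the inner text."""
    return re.sub(r"~~(.+?)~~", lambda m: f"<del>{html.escape(m.group(1))}</del>", text)


def sanitize_tags(rendered: str) -> str:
    """Keep allow-listed tags; escape every other tag so it displays as text."""
    def keep_or_escape(m: re.Match) -> str:
        tag = m.group(0)
        return tag if SAFE_TAGS.match(tag) else html.escape(tag)

    return ANY_TAG.sub(keep_or_escape, rendered)


result = sanitize_tags(render_strike("~~old~~ <script>x</script>"))
```

If `del` were missing from the pattern, the sanitising pass would escape the `<del>` tags that the strikethrough pass had just produced, and the user would see literal markup.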
+68
@@ -176,6 +176,74 @@ class TestSendBusyBranchDispatch:
)
def test_slash_commands_intercepted_before_busymode_routing(self):
"""The three busy-control slash commands (/steer /interrupt /queue) must be
intercepted at the TOP of the busy block before the busyMode routing so
they execute immediately while the agent is running.
Without this intercept, typing /steer while busy queues the text as a plain
message. When it drains after the turn ends there is no active stream, so
cmdSteer says "No active task to stop." and the steer is lost entirely.
"""
send_idx = MESSAGES_JS.find("async function send(")
assert send_idx >= 0, "send() not found"
# Locate the busy block; the intercept is searched for from there onward
busy_start = MESSAGES_JS.find("S.busy||compressionRunning", send_idx)
assert busy_start >= 0, "busy block not found"
# The intercept must appear BEFORE the busyMode assignment
intercept_idx = MESSAGES_JS.find("'steer','interrupt','queue'", busy_start)
busymode_idx = MESSAGES_JS.find("_busyInputMode||'queue'", busy_start)
assert intercept_idx >= 0, (
"send() must intercept /steer /interrupt /queue before the busyMode "
"routing block — otherwise they queue instead of executing immediately"
)
assert intercept_idx < busymode_idx, (
"The slash-command intercept must come BEFORE the busyMode routing "
"so /steer executes while the agent is running, not after the turn ends"
)
def test_steer_intercept_calls_handler_directly(self):
"""The busy-intercept must dispatch via _bc.fn(_pc.args), not queue the text."""
send_idx = MESSAGES_JS.find("async function send(")
busy_start = MESSAGES_JS.find("S.busy||compressionRunning", send_idx)
intercept_idx = MESSAGES_JS.find("'steer','interrupt','queue'", busy_start)
assert intercept_idx >= 0
# Get the intercept block (up to the next busyMode assignment)
busymode_idx = MESSAGES_JS.find("_busyInputMode||'queue'", busy_start)
intercept_block = MESSAGES_JS[intercept_idx:busymode_idx]
assert "_bc.fn(_pc.args)" in intercept_block, (
"The intercept must call the command handler directly via _bc.fn(_pc.args)"
)
assert "return;" in intercept_block, (
"The intercept must return after dispatching so send() does not also queue"
)
def test_steer_intercept_clears_input_before_await(self):
"""The intercept must clear $('msg').value BEFORE awaiting the handler.
Without the sync clear, the input field still shows '/steer foo' after
the steer fires. If the user presses Enter again (a common reflex while
waiting for the toast), send() re-runs and either re-fires the command
or (once the turn has ended) drops a confusing 'No active task to stop.'
"""
send_idx = MESSAGES_JS.find("async function send(")
busy_start = MESSAGES_JS.find("S.busy||compressionRunning", send_idx)
intercept_idx = MESSAGES_JS.find("'steer','interrupt','queue'", busy_start)
busymode_idx = MESSAGES_JS.find("_busyInputMode||'queue'", busy_start)
intercept_block = MESSAGES_JS[intercept_idx:busymode_idx]
clear_idx = intercept_block.find("$('msg').value=''")
await_idx = intercept_block.find("await _bc.fn")
assert clear_idx >= 0, (
"The intercept must clear $('msg').value (so the field doesn't keep "
"showing /steer foo after the command fires)"
)
assert await_idx >= 0, "await _bc.fn(...) must be present in the intercept"
assert clear_idx < await_idx, (
"$('msg').value='' must be cleared BEFORE awaiting the handler — "
"otherwise a reflexive Enter press during the await re-fires the command"
)
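The routing order these tests pin can be modelled in a few lines. This is a rough Python sketch of the decision only, not the real `send()`; `parse_command` is a hypothetical stand-in for `parseCommand()`, and lower-casing the command name is an assumption.

```python
from typing import Optional, Tuple

BUSY_COMMANDS = {"steer", "interrupt", "queue"}


def parse_command(text: str) -> Optional[Tuple[str, str]]:
    """Hypothetical stand-in for parseCommand(): '/name args' -> (name, args)."""
    if not text.startswith("/"):
        return None
    name, _, args = text[1:].partition(" ")
    return name.lower(), args.strip()


def route_while_busy(text: str, busy_mode: str = "queue") -> Tuple[str, Optional[str], str]:
    """Decide what happens to input typed while the agent is running."""
    parsed = parse_command(text)
    # Busy-control commands are checked first, so they run immediately.
    if parsed and parsed[0] in BUSY_COMMANDS:
        return ("command", parsed[0], parsed[1])
    # Everything else falls through to the configured busy input mode.
    return (busy_mode, None, text)


steer = route_while_busy("/steer use tabs")
other = route_while_busy("/help")
plain = route_while_busy("hello", busy_mode="interrupt")
```

Only the three busy-control commands bypass the busy mode; any other slash command is still handled by the mode routing, as in the diff.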
# ── Boot init + settings panel wiring ───────────────────────────────────
class TestBootAndPanelsWiring:
+13
@@ -62,6 +62,19 @@ def test_sandbox_allows_scripts_only():
)
def test_mime_map_includes_html_and_htm():
"""MIME_MAP must map .html/.htm to text/html — without this, _handle_file_raw
falls back to application/octet-stream and browsers refuse to render the
response inside the preview iframe (issue #779 follow-up: PR #1070)."""
from api.config import MIME_MAP
assert MIME_MAP.get(".html") == "text/html", (
"MIME_MAP['.html'] must be 'text/html' for the workspace HTML preview iframe"
)
assert MIME_MAP.get(".htm") == "text/html", (
"MIME_MAP['.htm'] must be 'text/html' for the workspace HTML preview iframe"
)
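The fallback this test guards against looks roughly like the following. The lookup inside `_handle_file_raw` is not shown in this diff, so the sketch assumes a plain dictionary lookup with an `application/octet-stream` default; `content_type_for`, the lower-casing of the extension, and the `.txt` entry are illustrative assumptions.

```python
from pathlib import PurePosixPath

# Illustrative subset only; the real MIME_MAP lives in api/config.py.
MIME_MAP = {
    ".html": "text/html",
    ".htm": "text/html",
    ".txt": "text/plain",
}


def content_type_for(filename: str) -> str:
    """Return the Content-Type for a file, with a generic binary fallback."""
    ext = PurePosixPath(filename).suffix.lower()
    # A missing entry yields octet-stream, which an iframe will not render.
    return MIME_MAP.get(ext, "application/octet-stream")


html_type = content_type_for("report.HTML")
unknown_type = content_type_for("data.bin")
```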
def test_inline_html_response_sets_csp_sandbox():
"""Defense-in-depth: ?inline=1 HTML responses must set Content-Security-Policy:
sandbox so the same origin isolation applies even when the URL is opened
+40
@@ -0,0 +1,40 @@
"""
Regression test for PR #1071: approval card must render above the queue flyout.
Both `.approval-card` and `.queue-card` are siblings inside `.composer-flyout`
and share the same absolute positioning slot just above the composer. When
both are visible at the same time (queue flyout open + tool approval card
sliding up) the approval card MUST win the stacking order so its security-
relevant Allow / Deny buttons stay clickable.
The old CSS had `.queue-card { z-index: 2 }` and no z-index on
`.approval-card.visible`, so the queue card painted on top and blocked the
approval buttons. The fix raises `.approval-card.visible` to z-index 3.
This test pins the invariant: approval-card.visible z-index must be strictly
greater than queue-card z-index.
"""
import re
from pathlib import Path
CSS = Path("static/style.css").read_text(encoding="utf-8")
def _z_index_of(selector_regex: str) -> int | None:
m = re.search(selector_regex + r"\s*\{[^}]*z-index:(\d+)", CSS)
return int(m.group(1)) if m else None
def test_approval_card_visible_outranks_queue_card():
queue_z = _z_index_of(r"\.queue-card")
approval_visible_z = _z_index_of(r"\.approval-card\.visible")
assert queue_z is not None, ".queue-card must declare a z-index"
assert approval_visible_z is not None, (
".approval-card.visible must declare a z-index — without it, the approval "
"buttons get covered by the queue flyout (PR #1071)"
)
assert approval_visible_z > queue_z, (
f".approval-card.visible z-index ({approval_visible_z}) must be strictly "
f"greater than .queue-card z-index ({queue_z}) so approval buttons "
f"remain clickable when both flyouts are open."
)
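The helper's regex can be tried against a small inline stylesheet to see why it does not pick up the wrong rule. The sample CSS below is hypothetical (the real `.queue-card` rule is not shown in this diff), and `z_index_of` takes the CSS text as a parameter purely to keep the sketch self-contained.

```python
import re
from typing import Optional

# Hypothetical sample; only the z-index values mirror the commit message.
SAMPLE_CSS = (
    ".queue-card{position:absolute;z-index:2;}\n"
    ".approval-card{position:absolute;pointer-events:none;}\n"
    ".approval-card.visible{pointer-events:auto;z-index:3;}\n"
)


def z_index_of(css: str, selector_regex: str) -> Optional[int]:
    """Return the z-index declared in the first rule whose selector matches."""
    # [^}]* cannot cross a closing brace, so the match stays inside one rule.
    m = re.search(selector_regex + r"\s*\{[^}]*z-index:(\d+)", css)
    return int(m.group(1)) if m else None


queue_z = z_index_of(SAMPLE_CSS, r"\.queue-card")
visible_z = z_index_of(SAMPLE_CSS, r"\.approval-card\.visible")
base_z = z_index_of(SAMPLE_CSS, r"\.approval-card")
```

The bare `.approval-card` lookup returns `None` because that rule declares no z-index and the pattern cannot run on into the `.visible` rule.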
+215
@@ -0,0 +1,215 @@
"""Regression tests for the blockquote rendering fix (fix/blockquote-rendering).
Root cause: the old rule was `s.replace(/^> (.+)$/gm, ...)` which had three bugs:
1. `.+` required at least one character, so bare `>` lines passed through as literal `>`
2. Each line became its own `<blockquote>` with no grouping, so 10-line quotes became
10 stacked `<blockquote>` elements
3. Fenced code blocks inside blockquotes left orphaned `>` literals after the
fence-stash pass had consumed the code content
Fix: group consecutive `>` lines into a single `<blockquote>`, handle bare `>` lines
as `<br>`, and strip the `>` prefix before passing each line to `inlineMd()`.
"""
import re
import pathlib
UI_JS = (pathlib.Path(__file__).parent.parent / "static" / "ui.js").read_text(encoding="utf-8")
# ---------------------------------------------------------------------------
# Python mirror of the new blockquote rule + inlineMd (for behavioural tests)
# ---------------------------------------------------------------------------
import html as _html
def _esc(s):
return _html.escape(str(s), quote=True)
def _inline_md(t):
"""Minimal inlineMd mirror — bold, italic, inline-code only."""
t = re.sub(r"`([^`\n]+)`", lambda m: f"<code>{_esc(m.group(1))}</code>", t)
t = re.sub(r"\*\*\*(.+?)\*\*\*", lambda m: f"<strong><em>{_esc(m.group(1))}</em></strong>", t)
t = re.sub(r"\*\*(.+?)\*\*", lambda m: f"<strong>{_esc(m.group(1))}</strong>", t)
t = re.sub(r"\*([^*\n]+)\*", lambda m: f"<em>{_esc(m.group(1))}</em>", t)
return t
def _apply_blockquote(s):
"""Python mirror of the new group-based blockquote rule in ui.js."""
def replacer(m):
block = m.group(0)
lines = block.split("\n")
# Drop trailing empty lines, plus one lone trailing ">" artifact the regex can leave
while lines and lines[-1].strip() in (">", ""):
if lines[-1].strip() == ">":
lines.pop()
break
lines.pop()
processed = []
for l in lines:
stripped = re.sub(r"^>[ \t]?", "", l)
if stripped.strip() == "":
processed.append("<br>")
else:
processed.append(_inline_md(stripped))
inner = "\n".join(processed)
return f"<blockquote>{inner}</blockquote>"
return re.sub(r"((?:^>[^\n]*(?:\n|$))+)", replacer, s, flags=re.MULTILINE)
# ---------------------------------------------------------------------------
# Source-level structural tests
# ---------------------------------------------------------------------------
class TestBlockquoteSourceStructure:
"""The new rule must be present in ui.js and the old single-line rule must be gone."""
def test_old_single_line_rule_removed(self):
"""The old `.+` pattern that skipped blank lines must be gone."""
assert "replace(/^> (.+)$/gm" not in UI_JS, (
"Old single-line blockquote rule still present — it misses blank '>'"
" lines and creates one <blockquote> per line"
)
def test_new_group_rule_present(self):
"""The new grouping regex must be present."""
assert "(?:^>[^\\n]*(?:\\n|$))+" in UI_JS, (
"New group-based blockquote rule not found in ui.js"
)
def test_prefix_strip_present(self):
"""The fix must strip the '> ' prefix from each line."""
assert "replace(/^>[" in UI_JS, (
"Expected prefix-strip pattern not found in the blockquote block"
)
# ---------------------------------------------------------------------------
# Behavioural tests (using the Python mirror)
# ---------------------------------------------------------------------------
class TestMultiLineBlockquote:
"""Consecutive > lines must become ONE <blockquote>, not many."""
def test_single_line_still_works(self):
out = _apply_blockquote("> Hello world")
assert out.count("<blockquote>") == 1
assert "Hello world" in out
assert ">" not in out.replace("<blockquote>", "").replace("</blockquote>", "")
def test_two_consecutive_lines_grouped(self):
src = "> Line one\n> Line two"
out = _apply_blockquote(src)
assert out.count("<blockquote>") == 1, (
f"Expected 1 <blockquote>, got {out.count('<blockquote>')}: {out!r}"
)
def test_ten_lines_one_blockquote(self):
src = "\n".join(f"> Line {i}" for i in range(10))
out = _apply_blockquote(src)
assert out.count("<blockquote>") == 1
def test_two_separate_quotes_stay_separate(self):
src = "> First quote\n\n> Second quote"
out = _apply_blockquote(src)
# Each quote is its own group (separated by a blank line)
assert out.count("<blockquote>") == 2
class TestBlankContinuationLines:
"""Bare '>' lines (blank continuation) must not appear as literal '>'."""
def test_bare_gt_line_no_literal(self):
src = "> Para one\n>\n> Para two"
out = _apply_blockquote(src)
assert out.count("<blockquote>") == 1, f"Expected 1 blockquote: {out!r}"
# No stray '>' outside of HTML tags
text_only = re.sub(r"<[^>]+>", "", out)
assert ">" not in text_only, f"Literal '>' in text: {text_only!r}"
def test_bare_gt_no_space_handled(self):
"""'>' with no space at all should also be consumed, not rendered literally."""
src = ">no space after"
out = _apply_blockquote(src)
assert out.count("<blockquote>") == 1
text_only = re.sub(r"<[^>]+>", "", out)
assert ">" not in text_only
def test_blank_line_becomes_br(self):
src = "> First\n>\n> Second"
out = _apply_blockquote(src)
assert "<br>" in out, f"Expected <br> for blank > line: {out!r}"
class TestInlineMarkdownInsideBlockquote:
"""Bold, italic, and inline code must still render correctly inside a blockquote."""
def test_bold_inside_blockquote(self):
out = _apply_blockquote("> This is **important**")
assert "<strong>" in out
assert "<blockquote>" in out
def test_inline_code_inside_blockquote(self):
out = _apply_blockquote("> Run `git status` first")
assert "<code>" in out
assert "<blockquote>" in out
def test_italic_inside_blockquote(self):
out = _apply_blockquote("> *emphasis* here")
assert "<em>" in out
assert "<blockquote>" in out
class TestNoPhantomTrailingBr:
"""The fix must drop both empty trailing lines (from a trailing \\n in the
match) and bare '>' artifacts. Without this, the common case (a blockquote
followed by another paragraph) renders with a phantom <br> right before
</blockquote>, leaving a visible blank line at the bottom of the quote.
"""
def test_input_ending_with_newline_no_trailing_br(self):
"""`> Hello\\n` must NOT produce `<blockquote>Hello\\n<br></blockquote>`."""
out = _apply_blockquote("> Hello\n")
assert "<br></blockquote>" not in out, (
f"Trailing <br> leaked inside the blockquote (phantom blank line): {out!r}"
)
def test_blockquote_followed_by_paragraph_no_trailing_br(self):
"""The common real-world shape: quote + blank line + paragraph."""
src = "> Quoted text\n\nNormal paragraph"
out = _apply_blockquote(src)
assert "<br></blockquote>" not in out, (
f"Trailing <br> leaked inside blockquote when followed by paragraph: {out!r}"
)
def test_multiline_quote_ending_with_newline_no_trailing_br(self):
out = _apply_blockquote("> Line one\n> Line two\n")
assert "<br></blockquote>" not in out, (
f"Multi-line quote ending with \\n must not leave a trailing <br>: {out!r}"
)
def test_quote_with_blank_continuation_then_newline(self):
"""`> A\\n>\\n> B\\n` — the internal `<br>` for the blank line stays,
but the trailing newline must not add a second `<br>` at the end."""
out = _apply_blockquote("> A\n>\n> B\n")
# Internal <br> for the blank-line continuation is intentional
assert "<br>" in out
# But there must not be a <br> immediately before the closing tag
assert "<br></blockquote>" not in out, (
f"Trailing <br> leaked at end of blockquote: {out!r}"
)
class TestBlockquoteFollowedByParagraph:
"""A blockquote followed by a normal paragraph must not bleed into each other."""
def test_non_blockquote_paragraph_untouched(self):
src = "> Quoted text\n\nNormal paragraph"
out = _apply_blockquote(src)
assert out.count("<blockquote>") == 1
assert "Normal paragraph" in out
# Normal paragraph must be outside the blockquote
after_bq = out[out.index("</blockquote>"):]
assert "Normal paragraph" in after_bq
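The trailing-line handling pinned by the tests above can be sketched in isolation. This is a simplified, hypothetical stand-in: the name `quote_lines_to_html` is not from the codebase, inline markdown is skipped, and it assumes the matched block is split on newlines with trailing empty or bare `>` lines dropped before rendering.

```python
import re


def quote_lines_to_html(block: str) -> str:
    """Render one matched run of '>' lines without a phantom trailing <br>."""
    lines = block.split("\n")
    # Drop trailing artifacts: empty strings left by a final "\n" and bare ">" lines.
    while lines and lines[-1].strip() in ("", ">"):
        lines.pop()
    # Strip the leading ">" marker (and one optional space or tab) from each line.
    stripped = [re.sub(r"^>[ \t]?", "", line) for line in lines]
    # Interior blank lines are kept as explicit <br> separators.
    body = "\n".join("<br>" if s.strip() == "" else s for s in stripped)
    return f"<blockquote>{body}</blockquote>"


single = quote_lines_to_html("> Hello\n")
with_gap = quote_lines_to_html("> A\n>\n> B\n")
```

Trimming before rendering, rather than removing a `<br>` afterwards, keeps the intentional interior `<br>` for `>`-only continuation lines untouched.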
@@ -86,6 +86,14 @@ def _ensure_state_db():
timestamp REAL NOT NULL
);
""")
for column, ddl in (
('parent_session_id', 'ALTER TABLE sessions ADD COLUMN parent_session_id TEXT'),
('ended_at', 'ALTER TABLE sessions ADD COLUMN ended_at REAL'),
('end_reason', 'ALTER TABLE sessions ADD COLUMN end_reason TEXT'),
):
existing = {row[1] for row in conn.execute("PRAGMA table_info(sessions)").fetchall()}
if column not in existing:
conn.execute(ddl)
conn.commit()
return conn
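The column loop above is an idempotent schema migration: it checks `PRAGMA table_info` and only issues `ALTER TABLE ... ADD COLUMN` for columns that are missing. A minimal self-contained sketch of the same idea follows, using an in-memory database. The helper name `ensure_columns` is hypothetical, and the table and column names are interpolated into SQL, so they are assumed to come from trusted constants.

```python
import sqlite3


def ensure_columns(conn: sqlite3.Connection, table: str, columns: dict) -> list:
    """Add any missing columns to `table`; return the names that were added."""
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    added = []
    for name, sql_type in columns.items():
        if name not in existing:
            # Identifiers cannot be bound as parameters, hence the f-string.
            conn.execute(f"ALTER TABLE {table} ADD COLUMN {name} {sql_type}")
            added.append(name)
    conn.commit()
    return added


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sessions (id TEXT PRIMARY KEY, title TEXT)")
wanted = {"parent_session_id": "TEXT", "ended_at": "REAL", "end_reason": "TEXT"}
first_run = ensure_columns(conn, "sessions", wanted)
second_run = ensure_columns(conn, "sessions", wanted)
```

Running the helper twice is safe: the second call finds every column already present and changes nothing.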
@@ -113,6 +121,50 @@ def _insert_gateway_session(conn, session_id='20260401_120000_abcdefgh', source=
conn.commit()
def _insert_agent_session_row(
conn,
session_id,
source='weixin',
title='Agent Session',
model='openai/gpt-5',
started_at=None,
parent_session_id=None,
ended_at=None,
end_reason=None,
messages=1,
):
"""Insert an agent session row with optional compression lineage."""
started_at = started_at or time.time()
conn.execute(
"INSERT OR REPLACE INTO sessions "
"(id, source, title, model, started_at, message_count, parent_session_id, ended_at, end_reason) "
"VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)",
(
session_id,
source,
title,
model,
started_at,
messages,
parent_session_id,
ended_at,
end_reason,
),
)
conn.execute("DELETE FROM messages WHERE session_id = ?", (session_id,))
for i in range(messages):
conn.execute(
"INSERT INTO messages (session_id, role, content, timestamp) VALUES (?, ?, ?, ?)",
(
session_id,
'user' if i % 2 == 0 else 'assistant',
f'{title} message {i + 1}',
started_at + i,
),
)
conn.commit()
def _remove_test_sessions(conn, *session_ids):
"""Remove specific test sessions from state.db (parallel-safe cleanup)."""
for sid in session_ids:
@@ -229,6 +281,347 @@ def test_gateway_watcher_hides_sessions_without_messages(monkeypatch):
pass
def test_compression_chain_collapses_to_latest_tip_in_sidebar():
"""Show one logical agent conversation for a compression continuation chain."""
conn = _ensure_state_db()
ids_to_remove = ('chain_root_001', 'chain_empty_mid_001', 'chain_tip_001')
t0 = time.time() - 600
try:
_insert_agent_session_row(
conn,
'chain_root_001',
title='Magazine Style PPT Skill',
started_at=t0,
ended_at=t0 + 100,
end_reason='compression',
messages=3,
)
_insert_agent_session_row(
conn,
'chain_empty_mid_001',
title='Magazine Style PPT Skill #2',
started_at=t0 + 101,
parent_session_id='chain_root_001',
ended_at=t0 + 200,
end_reason='compression',
messages=0,
)
_insert_agent_session_row(
conn,
'chain_tip_001',
title='Magazine Style PPT Skill #3',
started_at=t0 + 201,
parent_session_id='chain_empty_mid_001',
messages=2,
)
post('/api/settings', {'show_cli_sessions': True})
data, status = get('/api/sessions')
assert status == 200
ids = {s.get('session_id') for s in data.get('sessions', [])}
tip = next((s for s in data.get('sessions', []) if s.get('session_id') == 'chain_tip_001'), None)
assert 'chain_tip_001' in ids
assert 'chain_root_001' not in ids
assert 'chain_empty_mid_001' not in ids
assert tip is not None
assert tip.get('title') == 'Magazine Style PPT Skill'
assert tip.get('message_count') == 2
# created_at = the chain head's started_at (preserves original conversation date)
assert abs(tip.get('created_at') - t0) < 0.01
# updated_at = the tip's last message timestamp so the sidebar entry
# bubbles to the top by true recency, not by the root's stale activity.
# tip messages are at t0+201 and t0+202, so last_activity = t0 + 202.
assert abs(tip.get('updated_at') - (t0 + 202)) < 0.01
from api.agent_sessions import read_importable_agent_session_rows
rows = read_importable_agent_session_rows(_get_state_db_path(), limit=None)
projected_tip = next((row for row in rows if row.get('id') == 'chain_tip_001'), None)
assert projected_tip is not None
assert projected_tip.get('title') == 'Magazine Style PPT Skill'
assert projected_tip.get('_lineage_root_id') == 'chain_root_001'
assert projected_tip.get('_lineage_tip_id') == 'chain_tip_001'
assert projected_tip.get('_compression_segment_count') == 3
finally:
try:
_remove_test_sessions(conn, *ids_to_remove)
conn.close()
except Exception:
pass
post('/api/settings', {'show_cli_sessions': False})
def test_compression_chain_with_empty_latest_tip_falls_back_to_latest_importable_segment():
"""Empty latest tips should not make the whole conversation disappear."""
conn = _ensure_state_db()
ids_to_remove = ('empty_tip_root_001', 'empty_tip_001')
t0 = time.time() - 500
try:
_insert_agent_session_row(
conn,
'empty_tip_root_001',
title='Long Conversation',
started_at=t0,
ended_at=t0 + 100,
end_reason='compression',
messages=2,
)
_insert_agent_session_row(
conn,
'empty_tip_001',
title='Long Conversation #2',
started_at=t0 + 101,
parent_session_id='empty_tip_root_001',
messages=0,
)
post('/api/settings', {'show_cli_sessions': True})
data, status = get('/api/sessions')
assert status == 200
ids = {s.get('session_id') for s in data.get('sessions', [])}
assert 'empty_tip_root_001' in ids
assert 'empty_tip_001' not in ids
root = next((s for s in data.get('sessions', []) if s.get('session_id') == 'empty_tip_root_001'), None)
assert root and root.get('title') == 'Long Conversation'
finally:
try:
_remove_test_sessions(conn, *ids_to_remove)
conn.close()
except Exception:
pass
post('/api/settings', {'show_cli_sessions': False})
def test_compression_chain_with_all_empty_segments_is_hidden():
"""A compression chain with no importable segment should not appear."""
conn = _ensure_state_db()
ids_to_remove = ('all_empty_root_001', 'all_empty_tip_001')
t0 = time.time() - 450
try:
_insert_agent_session_row(
conn,
'all_empty_root_001',
title='Empty Long Conversation',
started_at=t0,
ended_at=t0 + 100,
end_reason='compression',
messages=0,
)
_insert_agent_session_row(
conn,
'all_empty_tip_001',
title='Empty Long Conversation #2',
started_at=t0 + 101,
parent_session_id='all_empty_root_001',
messages=0,
)
post('/api/settings', {'show_cli_sessions': True})
data, status = get('/api/sessions')
assert status == 200
ids = {s.get('session_id') for s in data.get('sessions', [])}
assert 'all_empty_root_001' not in ids
assert 'all_empty_tip_001' not in ids
finally:
try:
_remove_test_sessions(conn, *ids_to_remove)
conn.close()
except Exception:
pass
post('/api/settings', {'show_cli_sessions': False})
def test_non_compression_child_is_not_collapsed_into_parent():
"""Parent/child relationships that are not compression continuations stay flat."""
conn = _ensure_state_db()
ids_to_remove = ('branch_parent_001', 'branch_child_001')
t0 = time.time() - 400
try:
_insert_agent_session_row(
conn,
'branch_parent_001',
title='Branch Parent',
started_at=t0,
ended_at=t0 + 100,
end_reason='branched',
messages=2,
)
_insert_agent_session_row(
conn,
'branch_child_001',
title='Branch Child',
started_at=t0 + 101,
parent_session_id='branch_parent_001',
messages=2,
)
from api.agent_sessions import read_importable_agent_session_rows
rows = read_importable_agent_session_rows(_get_state_db_path(), limit=None)
ids = {row.get('id') for row in rows}
assert 'branch_parent_001' in ids
assert 'branch_child_001' in ids
finally:
try:
_remove_test_sessions(conn, *ids_to_remove)
conn.close()
except Exception:
pass
def test_agent_session_limit_applies_after_compression_projection():
"""A long raw chain should count as one logical sidebar row before limiting."""
conn = _ensure_state_db()
chain_ids = [f'limit_chain_{i:03d}' for i in range(8)]
standalone_id = 'limit_standalone_001'
t0 = time.time() - 300
try:
for i, sid in enumerate(chain_ids):
_insert_agent_session_row(
conn,
sid,
title=f'Limit Chain #{i + 1}',
started_at=t0 + i,
parent_session_id=chain_ids[i - 1] if i else None,
ended_at=t0 + i + 0.5 if i < len(chain_ids) - 1 else None,
end_reason='compression' if i < len(chain_ids) - 1 else None,
messages=1,
)
_insert_agent_session_row(
conn,
standalone_id,
title='Limit Standalone',
started_at=t0 + 20,
messages=1,
)
from api.agent_sessions import read_importable_agent_session_rows
rows = read_importable_agent_session_rows(_get_state_db_path(), limit=2)
ids = [row.get('id') for row in rows]
assert len(rows) == 2
assert chain_ids[-1] in ids
assert standalone_id in ids
assert not any(sid in ids for sid in chain_ids[:-1])
chain = next(row for row in rows if row.get('id') == chain_ids[-1])
assert chain.get('title') == 'Limit Chain #1'
assert chain.get('_lineage_root_id') == chain_ids[0]
assert chain.get('_compression_segment_count') == len(chain_ids)
finally:
try:
_remove_test_sessions(conn, *(chain_ids + [standalone_id]))
conn.close()
except Exception:
pass
def test_compression_chain_bubbles_to_top_by_tip_activity():
"""An actively-used compression chain must surface in the sidebar by its
TIP's last activity, not by the (stale) root's last activity.
Without overriding ``last_activity`` from the tip, a long-running chain
whose tip is being actively edited NOW would sort by the root's old
timestamp and fall below recently touched standalone sessions, the
inverse of what users expect from "Show agent sessions" sorted by
recency. This regression test pins the override.
"""
conn = _ensure_state_db()
ids_to_remove = ('bubble_root_001', 'bubble_tip_001', 'bubble_standalone_001')
now = time.time()
# Root started long ago; tip is being edited "now" (very recent message)
root_started = now - 30 * 86400
root_ended = now - 28 * 86400
tip_started = root_ended + 1
tip_latest_msg = now - 5 # 5 seconds ago — most recent activity in the DB
# A standalone session active 2 days ago — older than tip, much newer
# than the root. Without the fix, the chain row sorts by ROOT's age and
# standalone wins; with the fix, the chain wins.
standalone_msg = now - 2 * 86400
try:
_insert_agent_session_row(
conn,
'bubble_root_001',
title='Bubble Root',
started_at=root_started,
ended_at=root_ended,
end_reason='compression',
messages=2,
)
# Override message timestamps so root's last_activity is genuinely old.
conn.execute("DELETE FROM messages WHERE session_id = 'bubble_root_001'")
conn.execute(
"INSERT INTO messages (session_id, role, content, timestamp) VALUES (?, ?, ?, ?)",
('bubble_root_001', 'user', 'old root msg', root_started + 60),
)
_insert_agent_session_row(
conn,
'bubble_tip_001',
title='Bubble Tip',
started_at=tip_started,
parent_session_id='bubble_root_001',
messages=1,
)
conn.execute("DELETE FROM messages WHERE session_id = 'bubble_tip_001'")
conn.execute(
"INSERT INTO messages (session_id, role, content, timestamp) VALUES (?, ?, ?, ?)",
('bubble_tip_001', 'user', 'fresh tip msg', tip_latest_msg),
)
_insert_agent_session_row(
conn,
'bubble_standalone_001',
title='Bubble Standalone',
started_at=now - 2 * 86400 - 60,
messages=1,
)
conn.execute("DELETE FROM messages WHERE session_id = 'bubble_standalone_001'")
conn.execute(
"INSERT INTO messages (session_id, role, content, timestamp) VALUES (?, ?, ?, ?)",
('bubble_standalone_001', 'user', 'standalone msg', standalone_msg),
)
conn.commit()
from api.agent_sessions import read_importable_agent_session_rows
rows = read_importable_agent_session_rows(_get_state_db_path(), limit=200)
ids = [row.get('id') for row in rows]
# Filter out unrelated rows from the shared DB
ids = [i for i in ids if i in ('bubble_root_001', 'bubble_tip_001', 'bubble_standalone_001')]
assert 'bubble_tip_001' in ids, (
f"Compression tip must appear in projected output. ids={ids}"
)
assert 'bubble_root_001' not in ids, (
"Compression root row must be hidden once the tip is the active row."
)
tip_pos = ids.index('bubble_tip_001')
standalone_pos = ids.index('bubble_standalone_001') if 'bubble_standalone_001' in ids else -1
assert standalone_pos == -1 or tip_pos < standalone_pos, (
f"Active compression tip (last msg 5s ago) must sort BEFORE standalone "
f"session (last msg 2d ago). Got order: {ids}. "
f"This indicates merged.last_activity is the root's stale value, "
f"not the tip's recent value."
)
tip_row = next(r for r in rows if r['id'] == 'bubble_tip_001')
assert abs(tip_row['last_activity'] - tip_latest_msg) < 0.01, (
f"Projected tip's last_activity must equal the tip's most recent "
f"message timestamp ({tip_latest_msg}), not the root's "
f"({root_started + 60}). Got: {tip_row['last_activity']}"
)
finally:
try:
_remove_test_sessions(conn, *ids_to_remove)
conn.close()
except Exception:
pass
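The behaviour these compression-chain tests describe can be summarised with a small in-memory sketch. It is not the implementation of `read_importable_agent_session_rows`; the function name `project_compression_chains` and the plain-dict row shape are assumptions made for illustration, and chains are assumed to be linear (one continuation per parent).

```python
def project_compression_chains(rows, limit=None):
    """Collapse compression chains into one logical row each, then sort and limit."""
    by_id = {r["id"]: r for r in rows}

    def is_continuation(row):
        # A child only continues a chain if its parent ended with 'compression'.
        parent = by_id.get(row.get("parent_session_id"))
        return parent is not None and parent.get("end_reason") == "compression"

    child_of = {r["parent_session_id"]: r for r in rows if is_continuation(r)}

    projected = []
    for row in rows:
        if is_continuation(row):
            continue  # reached by walking from its chain root
        chain = [row]
        while chain[-1]["id"] in child_of:
            chain.append(child_of[chain[-1]["id"]])
        # Only segments that actually hold messages can be shown.
        importable = [s for s in chain if s.get("message_count", 0) > 0]
        if not importable:
            continue  # a chain with no importable segment is hidden
        merged = dict(importable[-1])  # id and last_activity come from the tip
        if len(chain) > 1:
            merged["title"] = chain[0]["title"]            # title from the root
            merged["started_at"] = chain[0]["started_at"]  # original start date
            merged["_lineage_root_id"] = chain[0]["id"]
            merged["_lineage_tip_id"] = merged["id"]
            merged["_compression_segment_count"] = len(chain)
        projected.append(merged)

    # Sort by recency first, then limit, so a long chain counts as one row.
    projected.sort(key=lambda r: r.get("last_activity") or 0, reverse=True)
    return projected if limit is None else projected[:limit]


rows = [
    {"id": "root", "parent_session_id": None, "end_reason": "compression",
     "title": "Chat", "started_at": 0, "message_count": 3, "last_activity": 100},
    {"id": "mid", "parent_session_id": "root", "end_reason": "compression",
     "title": "Chat #2", "started_at": 101, "message_count": 0, "last_activity": 0},
    {"id": "tip", "parent_session_id": "mid", "end_reason": None,
     "title": "Chat #3", "started_at": 201, "message_count": 2, "last_activity": 900},
    {"id": "solo", "parent_session_id": None, "end_reason": None,
     "title": "Solo", "started_at": 50, "message_count": 1, "last_activity": 500},
]
out = project_compression_chains(rows, limit=2)
```

Applying the limit after projection is what lets a chain of many raw segments occupy a single slot, and taking `last_activity` from the tip is what lets an active chain sort above older standalone sessions.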
def test_gateway_sessions_excluded_when_disabled():
"""Gateway sessions are NOT returned when show_cli_sessions is off."""
conn = _ensure_state_db()
@@ -31,7 +31,7 @@ def test_autolink_regex_in_rendermd():
rendermd_start = content.find('function renderMd(raw){')
assert rendermd_start != -1, "renderMd function not found in ui.js"
# Find the closing brace after renderMd (look for the autolink pattern within it)
- rendermd_body = content[rendermd_start:rendermd_start + 5000]
+ rendermd_body = content[rendermd_start:rendermd_start + 10000]
assert 'https?:\\/\\/' in rendermd_body, (
"Autolink regex (https?:\\/\\/) not found inside renderMd() body."
)
@@ -28,6 +28,7 @@ INDEX = (REPO / "static" / "index.html").read_text(encoding="utf-8")
UI_JS = (REPO / "static" / "ui.js").read_text(encoding="utf-8")
COMMANDS_JS = (REPO / "static" / "commands.js").read_text(encoding="utf-8")
MESSAGES_JS = (REPO / "static" / "messages.js").read_text(encoding="utf-8")
STYLE_CSS = (REPO / "static" / "style.css").read_text(encoding="utf-8")
# ── #1 dropdown escapes composer-left ─────────────────────────────────────────
@@ -115,6 +116,57 @@ class TestReasoningChipIcon:
)
# ── #1068 None/default reasoning chip stays visible ──────────────────────────
class TestReasoningChipNoneState:
"""Reasoning effort is a current setting like model selection. Setting it
to None disables reasoning, but the chip must remain visible so users can
see and change the current level."""
def get_apply_reasoning_chip(self):
m = re.search(
r"function\s+_applyReasoningChip\b[\s\S]*?^}",
UI_JS,
re.MULTILINE,
)
assert m, "_applyReasoningChip not found in ui.js"
return m.group(0)
def test_none_and_default_do_not_hide_reasoning_chip(self):
fn = self.get_apply_reasoning_chip()
assert "wrap.style.display='';" in fn, (
"_applyReasoningChip must show the reasoning chip even for empty/"
"default or 'none' effort values"
)
assert "if(!eff" not in fn and "wrap.style.display='none'" not in fn, (
"_applyReasoningChip must not use a truthy guard that hides the "
"chip for the valid 'none' state"
)
assert "wrap.style.display='none'" not in fn, (
"the None/default reasoning state should be visible, not hidden"
)
def test_none_and_default_have_visible_labels(self):
assert "if(effort==='none') return 'None';" in UI_JS, (
"the disabled reasoning state must render a visible 'None' label"
)
assert "if(!effort) return 'Default';" in UI_JS, (
"the unset reasoning state must render a visible 'Default' label"
)
def test_none_and_default_are_visually_inactive_not_missing(self):
fn = self.get_apply_reasoning_chip()
assert "chip.classList.toggle('inactive',inactive)" in fn, (
"None/default should be shown with an inactive visual treatment "
"instead of removing the chip"
)
assert ".composer-reasoning-chip.inactive" in STYLE_CSS, (
"the inactive chip state needs a CSS rule so the visible None/"
"default state is intentionally muted"
)
# ── #3 /reasoning immediately updates chip ────────────────────────────────────
@@ -0,0 +1,199 @@
"""Behavioural tests that drive the actual `_applyReasoningChip()` from
static/ui.js via node, not just a regex over the source.
The static checks in test_reasoning_chip_btw_fixes.py confirm the *shape*
of the function (no `display='none'`, the right toggle call exists, etc.)
but they pass even if a runtime detail is wrong, e.g. if `inactive` were
inverted, or `_normalizeReasoningEffort` mishandled whitespace, or the
label fell through to a wrong value for an unknown input.
This file pins the actual rendered output for every effort state so the
chip's None/Default visibility cannot silently regress.
"""
import os
import shutil
import subprocess
from pathlib import Path
import pytest
REPO_ROOT = Path(__file__).parent.parent.resolve()
UI_JS_PATH = REPO_ROOT / "static" / "ui.js"
NODE = shutil.which("node")
pytestmark = pytest.mark.skipif(NODE is None, reason="node not on PATH")
_DRIVER_SRC = r"""
const fs = require('fs');
const src = fs.readFileSync(process.argv[2], 'utf8');
function makeEl() {
return {
style: {},
classList: {
_set: new Set(),
add(c){this._set.add(c)},
remove(c){this._set.delete(c)},
toggle(c, on){
const want = on === undefined ? !this._set.has(c) : Boolean(on);
if (want) this._set.add(c); else this._set.delete(c);
},
contains(c){return this._set.has(c)},
},
dataset: {},
title: '',
textContent: '',
querySelectorAll(){return []},
};
}
const els = {
composerReasoningWrap: makeEl(),
composerReasoningLabel: makeEl(),
composerReasoningChip: makeEl(),
composerReasoningDropdown: makeEl(),
};
els.composerReasoningWrap.style.display = 'none'; // mirrors the HTML default
global.window = {};
global.document = {
createElement: () => makeEl(),
addEventListener: () => {},
querySelectorAll: () => [],
querySelector: () => null,
};
global.$ = id => els[id] || null;
global.api = () => ({ then: () => ({ catch: () => {} }), catch: () => {} });
function extractFunc(name) {
const re = new RegExp('function\\s+' + name + '\\s*\\(');
const start = src.search(re);
if (start < 0) throw new Error(name + ' not found');
let i = src.indexOf('{', start);
let depth = 1; i++;
while (depth > 0 && i < src.length) {
if (src[i] === '{') depth++;
else if (src[i] === '}') depth--;
i++;
}
return src.slice(start, i);
}
eval(extractFunc('_normalizeReasoningEffort'));
eval(extractFunc('_formatReasoningEffortLabel'));
eval(extractFunc('_highlightReasoningOption'));
eval(extractFunc('_applyReasoningChip'));
const input = JSON.parse(process.argv[3]);
_applyReasoningChip(input);
const result = {
display: els.composerReasoningWrap.style.display,
label: els.composerReasoningLabel.textContent,
inactive: els.composerReasoningChip.classList.contains('inactive'),
title: els.composerReasoningChip.title,
};
process.stdout.write(JSON.stringify(result));
"""
@pytest.fixture(scope="module")
def driver_path(tmp_path_factory):
p = tmp_path_factory.mktemp("reasoning_driver") / "driver.js"
p.write_text(_DRIVER_SRC, encoding="utf-8")
return str(p)
def _apply(driver_path, value):
"""Run _applyReasoningChip(value) against the actual ui.js."""
import json as _json
result = subprocess.run(
[NODE, driver_path, str(UI_JS_PATH), _json.dumps(value)],
capture_output=True, text=True, timeout=10,
)
if result.returncode != 0:
raise RuntimeError(f"node driver failed: {result.stderr}")
return _json.loads(result.stdout)
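The node driver above pulls individual functions out of ui.js by counting braces from the function's opening `{`. The same technique is sketched here in Python for readers who want to see it in isolation. The helper name `extract_function` is hypothetical, and the sketch shares the driver's limitation: braces inside strings, comments, regex literals, or default parameter values are counted like any other brace.

```python
import re


def extract_function(src: str, name: str) -> str:
    """Return the source text of `function name(...) {...}` by brace matching."""
    match = re.search(r"function\s+" + re.escape(name) + r"\s*\(", src)
    if match is None:
        raise ValueError(f"{name} not found")
    open_idx = src.index("{", match.start())  # first brace opens the body
    depth = 0
    for pos in range(open_idx, len(src)):
        if src[pos] == "{":
            depth += 1
        elif src[pos] == "}":
            depth -= 1
            if depth == 0:
                # Matching close brace found: slice from 'function' to here.
                return src[match.start():pos + 1]
    raise ValueError(f"unbalanced braces in {name}")


sample = (
    "function a(x){ if(x){ return 1; } return 2; }\n"
    "function b(){ return 3; }\n"
)
body_a = extract_function(sample, "a")
body_b = extract_function(sample, "b")
```

Extracting only the functions under test keeps the harness small: the stubs for `document`, `window`, and `$` only need to satisfy those few functions rather than the whole file.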
# ─────────────────────────────────────────────────────────────────────────────
# The chip MUST stay visible for every effort state (issue #1068). This used
# to be hidden for !eff and 'none', and the source-regex tests in
# test_reasoning_chip_btw_fixes.py verify the literal `display='none'` is gone
# — but only a behavioural check confirms the wrap actually receives `''`.
# ─────────────────────────────────────────────────────────────────────────────
class TestChipAlwaysVisible:
def test_empty_string_shows_chip_with_default_label(self, driver_path):
out = _apply(driver_path, "")
assert out["display"] == "", f"empty effort must show the chip: {out}"
assert out["label"] == "Default"
assert out["inactive"] is True
def test_null_shows_chip_with_default_label(self, driver_path):
out = _apply(driver_path, None)
assert out["display"] == ""
assert out["label"] == "Default"
assert out["inactive"] is True
def test_none_shows_chip_with_none_label(self, driver_path):
"""The bug from #1068 — 'none' must NOT hide the chip."""
out = _apply(driver_path, "none")
assert out["display"] == "", (
f"'none' must show the chip (the regression that started #1068): {out}"
)
assert out["label"] == "None"
assert out["inactive"] is True
def test_low_shows_chip_active(self, driver_path):
out = _apply(driver_path, "low")
assert out["display"] == ""
assert out["label"] == "low"
assert out["inactive"] is False
def test_high_shows_chip_active(self, driver_path):
out = _apply(driver_path, "high")
assert out["display"] == ""
assert out["inactive"] is False
class TestNormalizationEdgeCases:
"""Pin the input-normalisation contract so it can't silently shift."""
def test_uppercase_normalises(self, driver_path):
# Even though the API and slash command use lowercase, defensive
# normalisation matters — copy/paste of an uppercase value or a
# mis-cased server response shouldn't break the chip.
out = _apply(driver_path, "NONE")
assert out["label"] == "None"
assert out["inactive"] is True
def test_whitespace_trimmed(self, driver_path):
out = _apply(driver_path, " none ")
assert out["label"] == "None"
assert out["inactive"] is True
def test_unknown_value_falls_through_visible(self, driver_path):
# Defensive: unknown effort still shows the chip rather than hiding.
out = _apply(driver_path, "banana")
assert out["display"] == ""
assert out["label"] == "banana"
assert out["inactive"] is False
class TestTitleAttributeAccessibility:
"""The chip's `title` is the hover tooltip and a screen-reader hint —
confirm it always carries the current state in human-readable form."""
def test_title_has_default_label_for_unset(self, driver_path):
out = _apply(driver_path, "")
assert out["title"] == "Reasoning effort: Default"
def test_title_has_none_label_for_none(self, driver_path):
out = _apply(driver_path, "none")
assert out["title"] == "Reasoning effort: None"
def test_title_has_active_label_for_high(self, driver_path):
out = _apply(driver_path, "high")
assert out["title"] == "Reasoning effort: high"
@@ -0,0 +1,366 @@
"""Comprehensive renderer audit tests for static/ui.js renderMd().
This file covers the full suite of markdown constructs an LLM might produce,
with a focus on edge cases and combinations. Tests are grouped by construct.
Python mirrors the renderMd/inlineMd pipeline at the level needed for each
test: either source-level assertions (checking the JS source directly) or
behavioural assertions (checking rendered HTML via a Python mirror).
"""
import re
import pathlib
UI_JS = (pathlib.Path(__file__).parent.parent / "static" / "ui.js").read_text(encoding="utf-8")
import html as _html
def _esc(s):
return _html.escape(str(s), quote=True)
def _inline_md(t):
"""Mirror of inlineMd() in ui.js — processes one line of text."""
_code_stash = []
t = re.sub(r"`([^`\n]+)`",
lambda m: (_code_stash.append(f"<code>{_esc(m.group(1))}</code>")
or f"\x00C{len(_code_stash)-1}\x00"), t)
t = re.sub(r"\*\*\*(.+?)\*\*\*", lambda m: f"<strong><em>{_esc(m.group(1))}</em></strong>", t)
t = re.sub(r"\*\*(.+?)\*\*", lambda m: f"<strong>{_esc(m.group(1))}</strong>", t)
t = re.sub(r"\*([^*\n]+)\*", lambda m: f"<em>{_esc(m.group(1))}</em>", t)
t = re.sub(r"~~(.+?)~~", lambda m: f"<del>{_esc(m.group(1))}</del>", t)
t = re.sub(r"\x00C(\d+)\x00", lambda m: _code_stash[int(m.group(1))], t)
return t
def _apply_blockquotes(src):
"""Mirror of _applyBlockquotes() — handles nested + lists + blank lines."""
def replacer(m):
block = m.group(0)
lines = block.split("\n")
while lines and (lines[-1].strip() in (">", "")):
if lines[-1].strip() == ">":
lines.pop(); break
lines.pop()
stripped = [re.sub(r"^>[ \t]?", "", l) for l in lines]
inner_raw = "\n".join(stripped)
if re.search(r"^>", inner_raw, re.MULTILINE):
inner = _apply_blockquotes(inner_raw)
elif re.search(r"^( )?[-*+] .+", inner_raw, re.MULTILINE):
def inner_list(lb):
ll = lb.strip().split("\n"); h = "<ul>"
for li in ll:
txt = re.sub(r"^ {0,4}[-*+] ", "", li)
if re.match(r"\[x\] ", txt, re.I): ih = f"✅ {_inline_md(txt[4:])}"
elif txt.startswith("[ ] "): ih = f"☐ {_inline_md(txt[4:])}"
else: ih = _inline_md(txt)
h += f"<li>{ih}</li>"
return h + "</ul>"
inner = re.sub(r"((?:^(?: )?[-*+] .+\n?)+)", lambda m2: inner_list(m2.group(0)),
inner_raw, flags=re.MULTILINE)
else:
inner = "\n".join("<br>" if l.strip() == "" else _inline_md(l) for l in stripped)
return f"<blockquote>{inner}</blockquote>"
return re.sub(r"((?:^>[^\n]*(?:\n|$))+)", replacer, src, flags=re.MULTILINE)
# ─────────────────────────────────────────────────────────────────────────────
# Source-level structural checks (JS must contain these patterns)
# ─────────────────────────────────────────────────────────────────────────────
class TestSourceStructure:
"""Verify key patterns are present in ui.js."""
def test_crlf_normalisation_present(self):
assert ".replace(/\\r\\n/g,'\\n').replace(/\\r/g,'\\n')" in UI_JS, (
"renderMd must normalise \\r\\n and bare \\r to \\n at the start"
)
def test_strikethrough_in_inline_md(self):
assert "~~(.+?)~~" in UI_JS and "<del>" in UI_JS, (
"inlineMd must handle ~~strikethrough~~ → <del>"
)
def test_del_in_safe_tags(self):
assert "del" in UI_JS and "SAFE_TAGS" in UI_JS, (
"<del> must be in SAFE_TAGS so it is not HTML-escaped"
)
def test_del_in_safe_inline(self):
# SAFE_INLINE is used inside inlineMd
safe_inline_idx = UI_JS.find("SAFE_INLINE")
assert safe_inline_idx >= 0
window = UI_JS[safe_inline_idx: safe_inline_idx + 100]
assert "del" in window, "<del> must be in SAFE_INLINE"
def test_task_list_checked_handled(self):
assert "task-done" in UI_JS or "\\u2705" in UI_JS or "✅" in UI_JS, (
"Checked task list items [x] must produce a ✅ or task-done class"
)
def test_task_list_unchecked_handled(self):
assert "task-todo" in UI_JS or "\\u2610" in UI_JS or "☐" in UI_JS, (
"Unchecked task list items [ ] must produce ☐ or task-todo class"
)
def test_nested_blockquote_recurse(self):
assert "_applyBlockquotes" in UI_JS, (
"Blockquote handler must use a named function for recursive nesting"
)
def test_blockquote_handler_is_function(self):
assert "function _applyBlockquotes" in UI_JS, (
"Must define _applyBlockquotes as a named inner function for recursion"
)
def test_old_single_line_blockquote_removed(self):
assert "replace(/^> (.+)$/gm" not in UI_JS, (
"Old single-line blockquote rule must be removed"
)
def test_h1_h2_h3_handled(self):
for h in ("h1", "h2", "h3"):
assert f"<{h}>" in UI_JS or f"`<{h}>" in UI_JS
def test_ordered_list_value_attr(self):
assert 'value=' in UI_JS, "Ordered list items must use value= to preserve numbering"
def test_table_handler_present(self):
assert "<table>" in UI_JS and "<thead>" in UI_JS
def test_fenced_code_lang_header(self):
assert "pre-header" in UI_JS
def test_autolink_present(self):
# JS stores regex slashes as \/ — search for both forms
assert ("https?:\\/\\/" in UI_JS or "https?://" in UI_JS) and "target=\"_blank\"" in UI_JS
# ─────────────────────────────────────────────────────────────────────────────
# Behavioural: inline formatting
# ─────────────────────────────────────────────────────────────────────────────
class TestInlineFormatting:
def test_bold(self):
assert _inline_md("**bold**") == "<strong>bold</strong>"
def test_italic(self):
assert _inline_md("*italic*") == "<em>italic</em>"
def test_bold_italic(self):
out = _inline_md("***bi***")
assert "<strong><em>" in out
def test_strikethrough(self):
out = _inline_md("~~deleted~~")
assert "<del>deleted</del>" == out
def test_strikethrough_inline(self):
out = _inline_md("keep ~~remove~~ keep")
assert "<del>remove</del>" in out
assert "keep" in out
def test_inline_code(self):
out = _inline_md("`git status`")
assert "<code>git status</code>" in out
def test_strikethrough_inside_code_not_processed(self):
out = _inline_md("`~~not deleted~~`")
assert "<del>" not in out
assert "~~not deleted~~" in out
def test_bold_with_inline_code(self):
# **`code`** → <strong><code>code</code></strong>
out = _inline_md("**`code`**")
# The code stash protects the backtick span from bold regex
assert "<code>" in out
def test_xss_in_bold(self):
out = _inline_md("**<script>alert(1)</script>**")
assert "<script>" not in out
def test_xss_in_strikethrough(self):
out = _inline_md("~~<img onerror=alert(1)>~~")
assert "onerror" not in out.lower() or "&lt;" in out
# ─────────────────────────────────────────────────────────────────────────────
# Behavioural: blockquotes
# ─────────────────────────────────────────────────────────────────────────────
class TestBlockquotes:
def test_single_line(self):
out = _apply_blockquotes("> Hello")
assert out.count("<blockquote>") == 1
assert "Hello" in out
def test_multi_line_grouped(self):
out = _apply_blockquotes("> Line one\n> Line two\n> Line three")
assert out.count("<blockquote>") == 1
def test_blank_continuation_no_literal_gt(self):
out = _apply_blockquotes("> Para one\n>\n> Para two")
assert out.count("<blockquote>") == 1
text = re.sub(r"<[^>]+>", "", out)
assert ">" not in text, f"Literal > in output: {text!r}"
def test_blank_continuation_becomes_br(self):
out = _apply_blockquotes("> Para one\n>\n> Para two")
assert "<br>" in out
def test_bare_gt_no_space(self):
out = _apply_blockquotes(">no space after")
assert out.count("<blockquote>") == 1
assert "no space after" in out
def test_two_separate_blockquotes(self):
out = _apply_blockquotes("> First\n\n> Second")
assert out.count("<blockquote>") == 2
def test_inline_markdown_in_blockquote(self):
out = _apply_blockquotes("> **bold** and *italic*")
assert "<strong>" in out and "<em>" in out and "<blockquote>" in out
def test_inline_code_in_blockquote(self):
out = _apply_blockquotes("> run `git status` first")
assert "<code>" in out and "<blockquote>" in out
def test_strikethrough_in_blockquote(self):
out = _apply_blockquotes("> ~~old~~ new")
assert "<del>" in out and "<blockquote>" in out
def test_nested_blockquote_double(self):
out = _apply_blockquotes(">> deeply nested")
assert out.count("<blockquote>") == 2
def test_nested_blockquote_outer_and_inner(self):
out = _apply_blockquotes("> outer\n>> inner line")
assert out.count("<blockquote>") == 2
def test_list_inside_blockquote(self):
out = _apply_blockquotes("> - item one\n> - item two")
assert "<ul>" in out and "<li>" in out and "<blockquote>" in out
def test_task_list_inside_blockquote(self):
out = _apply_blockquotes("> - [x] done\n> - [ ] todo")
assert "✅" in out or "task-done" in out
assert "☐" in out or "task-todo" in out
assert "<blockquote>" in out
def test_blockquote_followed_by_paragraph(self):
out = _apply_blockquotes("> Quoted\n\nNormal text")
assert out.count("<blockquote>") == 1
after = out[out.index("</blockquote>"):]
assert "Normal text" in after
# ─────────────────────────────────────────────────────────────────────────────
# Behavioural: task lists
# ─────────────────────────────────────────────────────────────────────────────
class TestTaskLists:
def _apply_list(self, block):
lines = block.strip().split("\n")
html = "<ul>"
for l in lines:
text = re.sub(r"^ {0,4}[-*+] ", "", l)
if re.match(r"\[x\] ", text, re.I):
html += f"<li>✅ {_inline_md(text[4:])}</li>"
elif text.startswith("[ ] "):
html += f"<li>☐ {_inline_md(text[4:])}</li>"
else:
html += f"<li>{_inline_md(text)}</li>"
return html + "</ul>"
def test_checked_item(self):
out = self._apply_list("- [x] done task")
assert "✅" in out and "done task" in out
def test_checked_uppercase_X(self):
out = self._apply_list("- [X] also done")
assert "✅" in out
def test_unchecked_item(self):
out = self._apply_list("- [ ] pending task")
assert "☐" in out and "pending task" in out
def test_mixed_task_and_normal(self):
out = self._apply_list("- [x] done\n- [ ] todo\n- normal")
assert "✅" in out and "☐" in out
assert "<li>" in out
def test_task_item_with_bold(self):
out = self._apply_list("- [x] **important** task")
assert "" in out and "<strong>" in out
def test_non_task_list_unaffected(self):
out = self._apply_list("- regular item\n- another item")
assert "" not in out and "" not in out
# ─────────────────────────────────────────────────────────────────────────────
# Behavioural: strikethrough edge cases
# ─────────────────────────────────────────────────────────────────────────────
class TestStrikethrough:
    def test_basic(self):
        assert _inline_md("~~text~~") == "<del>text</del>"

    def test_multiword(self):
        out = _inline_md("~~multiple words here~~")
        assert "<del>multiple words here</del>" == out

    def test_inside_bold(self):
        # In the JS, bold runs first and ~~ is handled afterwards, so
        # **~~text~~** relies on the strikethrough pattern still matching.
        # This test only verifies that the plain ~~...~~ pattern produces
        # <del>; the bold-wrapped form itself is not exercised here.
        out = _inline_md("~~inside strikethrough~~")
        assert "<del>" in out

    def test_xss_escaped(self):
        out = _inline_md("~~<b>bad</b>~~")
        assert "<b>" not in out or "&lt;b&gt;" in out

# ─────────────────────────────────────────────────────────────────────────────
# Edge-case combinations
# ─────────────────────────────────────────────────────────────────────────────
class TestEdgeCases:
    def test_empty_string(self):
        out = _apply_blockquotes("")
        assert out == ""

    def test_no_blockquote(self):
        s = "just normal text"
        assert _apply_blockquotes(s) == s

    def test_crlf_in_blockquote(self):
        # \r\n should not produce literal \r in output
        src = "> line one\r\n> line two"
        # First normalise \r\n (as renderMd does)
        src = src.replace("\r\n", "\n")
        out = _apply_blockquotes(src)
        assert "\r" not in out
        assert out.count("<blockquote>") == 1

    def test_blockquote_with_code_and_nested(self):
        src = "> `code`\n>> nested"
        out = _apply_blockquotes(src)
        # Outer blockquote wraps everything
        assert out.count("<blockquote>") >= 2

    def test_deeply_nested_blockquote(self):
        src = ">>> triple nested"
        out = _apply_blockquotes(src)
        assert out.count("<blockquote>") == 3

    def test_task_list_normal_list_mixed(self):
        src = "> - [x] done\n> - normal item\n> - [ ] todo"
        out = _apply_blockquotes(src)
        assert "<blockquote>" in out
        assert "<ul>" in out
@@ -0,0 +1,191 @@
"""Behavioural tests that drive the ACTUAL renderMd() in static/ui.js via node.

The Python mirrors in test_blockquote_rendering.py and
test_renderer_comprehensive.py validate intent, but they can drift from the
JS. Twice now (PR #1073 commit 94d63d0: phantom <br>; PR #1073 commit
04e7b53: leading-space-in-blockquote prefix-strip regex) the Python mirror
was correct while the JS was not, so the static-mirror tests passed even
though the live UI was broken.

This file closes that gap by spawning ``node`` on the real ui.js and
asserting the rendered HTML for the most common LLM-output shapes.

Add a case here whenever the renderer fix targets a class of input the
Python mirror cannot exercise faithfully.
"""
import os
import shutil
import subprocess
import sys
from pathlib import Path

import pytest

REPO_ROOT = Path(__file__).parent.parent.resolve()
UI_JS_PATH = REPO_ROOT / "static" / "ui.js"
NODE = shutil.which("node")

pytestmark = pytest.mark.skipif(NODE is None, reason="node not on PATH")

_DRIVER_SRC = r"""
const fs = require('fs');
const src = fs.readFileSync(process.argv[2], 'utf8');
global.window = {};
global.document = { createElement: () => ({ innerHTML: '', textContent: '' }) };
const esc = s => String(s ?? '').replace(/[&<>"']/g, c => (
  {'&':'&amp;','<':'&lt;','>':'&gt;','"':'&quot;',"'":'&#39;'}[c]));
function extractFunc(name) {
  const re = new RegExp('function\\s+' + name + '\\s*\\(');
  const start = src.search(re);
  if (start < 0) throw new Error(name + ' not found');
  let i = src.indexOf('{', start);
  let depth = 1; i++;
  while (depth > 0 && i < src.length) {
    if (src[i] === '{') depth++;
    else if (src[i] === '}') depth--;
    i++;
  }
  return src.slice(start, i);
}
eval(extractFunc('renderMd'));
let buf = '';
process.stdin.on('data', c => { buf += c; });
process.stdin.on('end', () => { process.stdout.write(renderMd(buf)); });
"""


@pytest.fixture(scope="module")
def driver_path(tmp_path_factory):
    """Write the node driver to a tmp file (works around `node -e` arg quirks)."""
    p = tmp_path_factory.mktemp("renderer_driver") / "driver.js"
    p.write_text(_DRIVER_SRC, encoding="utf-8")
    return str(p)


def _render(driver_path, markdown: str) -> str:
    """Run renderMd against the actual ui.js and return the rendered HTML."""
    result = subprocess.run(
        [NODE, driver_path, str(UI_JS_PATH)],
        input=markdown,
        capture_output=True,
        text=True,
        timeout=10,
    )
    if result.returncode != 0:
        raise RuntimeError(f"node driver failed: {result.stderr}")
    return result.stdout


# ─────────────────────────────────────────────────────────────────────────────
# Blockquote prefix strip — the bug commit 04e7b53 introduced was a one-char
# regex regression where `^>[\t]?` (only tab) replaced `^>[ \t]?` (space or
# tab), producing leading-space artifacts and breaking lists-in-quotes
# because the list-detection regex `^( )?[-*+]` couldn't match the
# space-prefixed lines. These tests exercise the actual JS so the regex
# can't silently regress to tab-only again.
# ─────────────────────────────────────────────────────────────────────────────
class TestBlockquotePrefixStrip:
    """Drive the actual renderMd to confirm `> ` is fully stripped."""

    def test_single_line_blockquote_no_leading_space(self, driver_path):
        out = _render(driver_path, "> Hello world").strip()
        assert "<blockquote>Hello world</blockquote>" in out, (
            f"`> Hello world` must render as <blockquote>Hello world</blockquote> "
            f"with no leading space. Got: {out!r}. Likely cause: prefix-strip "
            f"regex consumes only \\t, not space."
        )

    def test_multiline_blockquote_no_leading_space(self, driver_path):
        out = _render(driver_path, "> Line one\n> Line two").strip()
        assert ">Line one\nLine two<" in out, (
            f"Multi-line blockquote must strip the space after each `>`. "
            f"Got: {out!r}"
        )
        # Belt-and-braces: there must be no space-after-newline-in-content
        assert "\n " not in out.replace("</blockquote>", ""), (
            f"Inner content of blockquote should not contain leading-space "
            f"lines. Got: {out!r}"
        )

    def test_list_inside_blockquote_renders_as_ul(self, driver_path):
        """The PR explicitly added 'lists inside blockquotes' as a feature.

        With the prefix-strip bug, the list-detection regex can't match the
        space-prefixed lines, so the list never renders. This pins it."""
        out = _render(driver_path, "> Steps:\n> - one\n> - two")
        assert "<ul>" in out, (
            f"`> - item` lines inside a blockquote must render as a <ul>. "
            f"Got: {out!r}. Likely cause: prefix-strip leaves a leading "
            f"space, list regex `^( )?[-*+] ` can't match one-space prefix."
        )
        assert "<li>one</li>" in out
        assert "<li>two</li>" in out

    def test_task_list_inside_blockquote(self, driver_path):
        """Task lists inside blockquotes render checkbox spans, not literal [x]."""
        out = _render(driver_path, "> - [x] done\n> - [ ] todo")
        assert 'class="task-done"' in out, (
            f"`- [x]` inside a blockquote must produce a task-done span. "
            f"Got: {out!r}"
        )
        assert 'class="task-todo"' in out

# ─────────────────────────────────────────────────────────────────────────────
# Common LLM output shapes — sanity-check the most frequent constructs render
# the way a user would expect.
# ─────────────────────────────────────────────────────────────────────────────
class TestCommonLLMShapes:
    def test_strikethrough_outside_quote(self, driver_path):
        out = _render(driver_path, "This was ~~outdated~~ but is now fine.")
        assert "<del>outdated</del>" in out

    def test_strikethrough_inside_blockquote(self, driver_path):
        out = _render(driver_path, "> This is ~~wrong~~ actually")
        assert "<blockquote>" in out and "<del>wrong</del>" in out

    def test_top_level_task_list(self, driver_path):
        out = _render(driver_path, "- [x] done\n- [ ] todo\n- regular item")
        assert 'class="task-done"' in out
        assert 'class="task-todo"' in out
        assert "regular item" in out

    def test_nested_blockquote_recurses(self, driver_path):
        out = _render(driver_path, ">>> deeply nested")
        assert out.count("<blockquote>") == 3
        assert out.count("</blockquote>") == 3

    def test_quote_then_heading(self, driver_path):
        out = _render(driver_path, "> Note this.\n\n## Heading")
        assert "<blockquote>Note this.</blockquote>" in out
        assert "<h2>Heading</h2>" in out

    def test_crlf_does_not_leak_carriage_return(self, driver_path):
        out = _render(driver_path, "Line1\r\nLine2\r\nLine3")
        assert "\r" not in out, f"CRLF must be normalised; got {out!r}"

    def test_llm_multiparagraph_quote_with_list(self, driver_path):
        """The shape an LLM emits when summarising decisions inside a quote."""
        src = (
            "> Here are the key points:\n"
            ">\n"
            "> - Point one\n"
            "> - Point two\n"
            ">\n"
            "> And a closing remark."
        )
        out = _render(driver_path, src)
        assert "<blockquote>" in out
        assert "<ul>" in out
        assert "<li>Point one</li>" in out
        assert "<li>Point two</li>" in out
        assert "And a closing remark." in out
        # No leading-space artifacts in the quoted text
        assert "\n " not in out.replace("</blockquote>", "")
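
The prefix-strip regression described in the comment above TestBlockquotePrefixStrip can be reproduced in isolation. The sketch below is illustrative only: `GOOD_PREFIX`, `BAD_PREFIX`, `LIST_ITEM`, and `strip_prefix` are hypothetical names, the list pattern is a simplified stand-in, and the real regexes live in renderMd() in static/ui.js. It shows why a tab-only character class leaves a leading space that then defeats list detection.

```python
import re

# Hypothetical stand-ins for the two prefix-strip patterns named in the
# comment block; they are not copied from static/ui.js.
GOOD_PREFIX = re.compile(r"^>[ \t]?")  # '>' plus one optional space or tab
BAD_PREFIX = re.compile(r"^>[\t]?")    # regression: '>' plus optional tab only

# Simplified list detection (assumption): a bullet at column 0 followed by a space.
LIST_ITEM = re.compile(r"^[-*+] ")


def strip_prefix(pattern, line):
    """Remove a single leading blockquote marker from one line."""
    return pattern.sub("", line, count=1)


line = "> - one"
good = strip_prefix(GOOD_PREFIX, line)
bad = strip_prefix(BAD_PREFIX, line)

print(repr(good))  # '- one'
print(repr(bad))   # ' - one'
print(bool(LIST_ITEM.match(good)), bool(LIST_ITEM.match(bad)))  # True False
```

With the tab-only class, the space after `>` survives the strip, so the bullet no longer sits at the start of the line and the item is never recognised as a list entry.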