Merge pull request #1534 from nesquena/stage-279

v0.50.279 — 8-PR batch from full PR sweep + Opus MUST-FIX caught
nesquena-hermes
2026-05-03 09:26:03 -07:00
committed by GitHub
16 changed files with 287 additions and 25 deletions
+29
@@ -1,5 +1,34 @@
# Hermes Web UI -- Changelog
## [v0.50.279] — 2026-05-03
### Fixed (8-PR batch from full PR sweep — closes #1463, #1491, #1503, #1509, #1522)
- **Branch indicator codepoint corrected** (#1523, @franksong2702; closes #1522): the fork-indicator glyph in the sidebar was rendering `⒂ PARENTHESIZED NUMBER FIFTEEN` (`\u2482`) instead of the intended `⑂ OCR FORK` (`\u2442`). Forked sessions appeared with a mysterious "(15)" prefix that looked like a message count or unread badge, so users would click expecting something related to "15" and find nothing. The actual fork indicator was invisible. One-character fix in `static/sessions.js:1657` plus the matching test assertion update.
- **Onboarding API-key field stops losing focus during probe** (#1519, @franksong2702; closes #1503) — the wizard's API-key input had `oninput="_scheduleOnboardingProbe()"` firing a 400ms-debounced probe on every keystroke. When the probe completed, `_renderOnboardingBody()` rebuilt the entire form DOM, destroying the `<input>` element the user was typing into. On localhost the probe completes in ~5-50ms so the bug window was narrow; on slow networks (VPN, corporate proxy, cold-start vLLM) the re-render routinely landed between keystrokes. Especially painful on the password field where users paste long secrets. **Fix:** removed `_scheduleOnboardingProbe()` from the api-key input's `oninput` handler in `static/onboarding.js:200`; added `onblur="_runOnboardingProbe()"` so the probe still fires when the user tabs away. The probe also still fires via the "Test connection" button and `nextOnboardingStep()` before Continue — no flow breakage.
- **Voice-mode pref toggle-off now stops the recognizer** (#1518, @franksong2702; closes #1491) — if a user enabled the hands-free voice mode (PR #1489, v0.50.271), started a conversation, then opened Settings → Preferences and disabled the pref, the button disappeared but the SpeechRecognition kept running. The user had no way to stop it short of reloading the page — and it was consuming microphone access + battery the whole time. **Fix:** `_applyVoiceModePref()` in `static/boot.js` now reads the pref into a local `enabled` variable and calls `_deactivate()` (the standard cleanup path that stops recognition, clears timers, restores TTS, resets UI state) when `!enabled && _voiceModeActive`. Plus a TDZ-safety hoist: `let _voiceModeActive = false` moved above `_applyVoiceModePref()` (was previously declared after the function — Temporal Dead Zone risk if the function were ever called before init).
- **YAML code blocks render with newlines** (#1516, @franksong2702; closes #1463) — Prism's YAML grammar wraps tokens in `<span class="token …">` elements where `white-space` defaults to `normal`, collapsing `\n` characters into spaces even when the underlying `textContent` preserved them. Plain code blocks and `language-bash` rendered correctly; only `language-yaml` was affected. YAML is one of the most common LLM output formats (config files, docker-compose, CI pipelines, Kubernetes manifests) — flattened YAML in chat is unreadable. **Fix:** two CSS rules in `static/style.css` forcing `white-space: pre !important` on `.msg-body pre code.language-yaml .token` and `.preview-md pre code.language-yaml .token`. Scoped tightly to YAML — no impact on other languages. Verified via the reporter's two diagnostic probes (`textContent` had `\n`, only `language-yaml` was affected) that the renderer pipeline was correct and the fix needed to be at the CSS layer.
- **Service-worker placeholder consolidation** (#1517, @franksong2702; closes #1509): `__CACHE_VERSION__` (in `static/sw.js`) and `__WEBUI_VERSION__` (in `static/index.html`) were functionally identical, both substituted at request time via `quote(WEBUI_VERSION, safe="")`. Two names existed for historical reasons (different files added at different releases). Naming hygiene flagged by both the independent reviewer and the Opus advisor during the v0.50.276 release review. **Fix:** rename `__CACHE_VERSION__` → `__WEBUI_VERSION__` across `static/sw.js`, `api/routes.py`, `tests/test_pwa_manifest_sw.py`. Pure rename with no behavior change: same `?v=vX.Y.Z` query strings on the same URLs at the wire.
- **WebUI-origin state.db sessions recoverable when JSON sidecar missing** (#1532, @ai-ag2026; refs #1471) — when a WebUI-origin session existed in `state.db.sessions` / `state.db.messages` but the matching `~/.hermes/webui/sessions/<id>.json` sidecar was missing (possible after disk-write failures, partial restore, or interrupted writes), the session was invisible to `/api/sessions` even though the canonical SQLite messages were intact. Root cause: `read_importable_agent_session_rows()` had a hard-coded `s.source != 'webui'` predicate that re-applied the filter even when callers opted out via `exclude_sources=None`. Slice 1 of the #1471 session-recovery class. **Fix:** `api/agent_sessions.py` makes the default exclusion explicit (`("cron", "webui")`) and removes the hard-coded predicate so `exclude_sources=None` actually includes WebUI-origin rows. New regression test `test_webui_state_db_session_without_sidecar_appears_when_agent_sessions_enabled`.
- **Stale runtime stream state cleared proactively** (#1525, @ai-ag2026; refs #1471): session JSON could retain `active_stream_id` plus paired pending fields (`pending_user_message`, `pending_attachments`, `pending_started_at`) after a stream failure, provider exception, or server restart. `/health` would correctly report `active_streams: 0`, but `/sessions/<id>` would still claim `agent_running` (pure truthiness on `s.active_stream_id`) and the frontend's `INFLIGHT[sid]` would keep the UI busy on a dead stream. Slice 2 of the #1471 session-recovery class, distinct from #1532's "session in DB but no sidecar" path. **Fix:** new `_clear_stale_stream_state()` helper in `api/routes.py` runs proactively at the read boundary (`/sessions/<id>` GET) and before new turns start. It verifies that the stream is actually missing from `STREAMS` (the in-memory registry) before clearing, and never expires live streams by age. Frontend half: `static/sessions.js` clears `INFLIGHT[sid]` when the server reports no `active_stream_id`. **Maintainer merge-conflict resolution:** kept the rename-side `CACHE_NAME = 'hermes-shell-__WEBUI_VERSION__'` (post-#1517 rename) over the PR's manual `-stale-stream-cleanup1` suffix. The renamed placeholder still auto-bumps with each release through `quote(WEBUI_VERSION, safe="")`, so the manual suffix was redundant: the natural version bump (v0.50.278 → v0.50.279) already invalidates the old cache via `caches.delete(k)` for `k !== CACHE_NAME` in the SW activate handler. 5 new regression tests in `test_stale_stream_cleanup.py`.
- **WebUI max_tokens forwarded to agent + OpenRouter quota classifier** (#1526, @ai-ag2026; refs #1524) — WebUI agent initialization didn't pass the configured `max_tokens` to `AIAgent`, so provider-native output ceilings could be requested. On OpenRouter this could fail with quota-style HTTP 402 messages like `more credits`, `can only afford`, `fewer max_tokens`. Pre-fix, those phrases weren't classified as quota failures and didn't trigger the fallback chain — users saw raw 402 errors instead of automatic fallback to a less-expensive model. **Fix:** `api/streaming.py` reads configured `max_tokens` from top-level + `agent.max_tokens` fallback, parses positive integers, includes both `max_tokens` and the fallback state in the `SESSION_AGENT_CACHE` signature (so config changes don't reuse a stale cached agent), and passes `max_tokens` to `AIAgent` only when the constructor supports it (uses `inspect.signature(AIAgent.__init__)` rather than a try/except that would swallow real `TypeError`s). Quota classifier additions for the three OpenRouter phrases route to the same fallback chain as existing quota markers. New regression tests in `test_streaming_max_tokens_quota.py`.
### Notes
- 3936 → **3946** tests passing (+9 from constituent PRs + 1 conflict-marker regression guard added in-release per Opus MUST-FIX).
- Pre-release Opus advisor pass: **caught a MUST-FIX (sw.js merge-conflict markers still in tree despite earlier `git add`/`commit`)** that would have shipped a broken service worker. Resolution applied in stage and a `test_sw_js_has_no_merge_conflict_markers` regression guard added so this can't happen silently again. One SHOULD-FIX (race in `_clear_stale_stream_state` between registry-check and session-mutate) explicitly deferred to follow-up #1533 per Opus's "fine to defer given the narrow window" advice — bounded effect (orphaned stream requires retry, no data corruption).
- One merge conflict resolved during stage build (#1525 vs #1517 cache-name placeholder collision); resolution drops PR #1525's manual `-stale-stream-cleanup1` suffix in favor of the canonical `__WEBUI_VERSION__` token (natural release-bump preserves the cache-invalidation guarantee).
- 2 PRs closed as duplicates during sweep: #1528 (identical to #1517) and #1529 (superseded by #1516, `.preview-md` coverage missing).
- 5 PRs stay on hold: #1418 (hard prereq hermes-agent#18534 not yet merged), #1464 (blocker — `noResults` ternary inverted, awaiting JKJameson fix), #1404 (UX — aronprins width feedback unresolved), #1353 (already `ready-for-review` tagged, durability path needs independent review), #1311 (draft + CONFLICTING).
- 1 PR routed to maintainer-review: #1531 (Asunfly stowaway change in force-push to title aux generation that wasn't in PR description; awaiting scope decision).
## [v0.50.278] — 2026-05-03
### Added (1 PR — splices best of #1497 + #1513)
+1 -1
@@ -3,7 +3,7 @@
> Goal: Full 1:1 parity with the Hermes CLI experience via a clean dark web UI.
> Everything you can do from the CLI terminal, you can do from this UI.
>
-> Last updated: v0.50.278 (May 03, 2026) — 3936 tests collected
+> Last updated: v0.50.279 (May 03, 2026) — 3946 tests collected
> Tests: `pytest tests/ --collect-only -q`
> Source: <repo>/
+2 -2
@@ -1835,8 +1835,8 @@ Bridged CLI sessions:
---
-*Last updated: v0.50.278, May 03, 2026*
-*Total automated tests collected: 3936*
+*Last updated: v0.50.279, May 03, 2026*
+*Total automated tests collected: 3946*
*Regression gate: tests/test_regressions.py*
*Run: pytest tests/ -v --timeout=60*
*Source: <repo>/*
+3 -3
@@ -214,9 +214,9 @@ def read_importable_agent_session_rows(
db_path: Path,
limit: int = 200,
log=None,
-exclude_sources: tuple[str, ...] | None = ("cron",),
+exclude_sources: tuple[str, ...] | None = ("cron", "webui"),
) -> list[dict]:
-"""Return non-WebUI agent sessions projected as importable conversations.
+"""Return agent sessions projected as importable conversations.
Hermes Agent can create rows in ``state.db.sessions`` before a session has
any messages, and long conversations can be split into compression-linked
@@ -256,7 +256,7 @@ def read_importable_agent_session_rows(
ended_expr = _optional_col('ended_at', session_cols)
end_reason_expr = _optional_col('end_reason', session_cols)
-where_clauses = ["s.source IS NOT NULL", "s.source != 'webui'"]
+where_clauses = ["s.source IS NOT NULL"]
params: list[str] = []
if exclude_sources:
excluded = tuple(str(source) for source in exclude_sources if source)
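The effect of dropping the hard-coded `s.source != 'webui'` predicate can be sketched in isolation. This is a minimal illustration, not the project's code: the hunk above is cut off right after `excluded = ...`, so the `NOT IN` clause, the `build_source_filter` and `list_session_ids` helpers, and the two-column `sessions` table are all assumptions made for the example.

```python
import sqlite3


def build_source_filter(exclude_sources=("cron", "webui")):
    """Return (where_sql, params) for filtering session rows by source.

    Passing exclude_sources=None (or an empty tuple) disables the exclusion,
    so WebUI-origin rows are no longer filtered out.
    """
    where_clauses = ["s.source IS NOT NULL"]
    params = []
    if exclude_sources:
        excluded = tuple(str(source) for source in exclude_sources if source)
        if excluded:
            # Assumed continuation of the truncated hunk: one placeholder per source.
            placeholders = ", ".join("?" for _ in excluded)
            where_clauses.append(f"s.source NOT IN ({placeholders})")
            params.extend(excluded)
    return " AND ".join(where_clauses), params


def list_session_ids(conn, exclude_sources=("cron", "webui")):
    """Hypothetical query helper over a simplified sessions table."""
    where_sql, params = build_source_filter(exclude_sources)
    rows = conn.execute(
        f"SELECT s.id FROM sessions AS s WHERE {where_sql} ORDER BY s.id", params
    ).fetchall()
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sessions (id TEXT PRIMARY KEY, source TEXT)")
conn.executemany(
    "INSERT INTO sessions VALUES (?, ?)",
    [("a", "cli"), ("b", "webui"), ("c", "cron"), ("d", None)],
)
default_ids = list_session_ids(conn)
all_ids = list_session_ids(conn, exclude_sources=None)
```

With the default tuple only the `cli` row survives; with `exclude_sources=None` the `webui` and `cron` rows come back as well, which is the opt-out the old predicate silently defeated.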
+32 -2
@@ -233,6 +233,34 @@ from api.helpers import (
_redact_text,
)
def _clear_stale_stream_state(session) -> bool:
    """Clear persisted streaming flags when the in-memory stream no longer exists.

    A server restart or worker crash can leave active_stream_id/pending_* in the
    session JSON while STREAMS is empty. The frontend then keeps reconnecting to
    a dead stream and shows a permanent running/thinking state.
    """
    stream_id = getattr(session, "active_stream_id", None)
    if not stream_id:
        return False
    with STREAMS_LOCK:
        stream_alive = stream_id in STREAMS
    if stream_alive:
        return False
    session.active_stream_id = None
    if hasattr(session, "pending_user_message"):
        session.pending_user_message = None
    if hasattr(session, "pending_attachments"):
        session.pending_attachments = []
    if hasattr(session, "pending_started_at"):
        session.pending_started_at = None
    try:
        session.save()
    except Exception:
        pass
    return True
# ── CSRF: validate Origin/Referer on POST ────────────────────────────────────
import re as _re
@@ -1188,7 +1216,7 @@ def handle_get(handler, parsed) -> bool:
from api.updates import WEBUI_VERSION
version_token = quote(WEBUI_VERSION, safe="")
text = sw_path.read_text(encoding="utf-8").replace(
-"__CACHE_VERSION__", version_token
+"__WEBUI_VERSION__", version_token
)
data = text.encode("utf-8")
handler.send_response(200)
@@ -1309,6 +1337,7 @@ def handle_get(handler, parsed) -> bool:
try:
_t1 = _time.monotonic()
s = get_session(sid, metadata_only=(not load_messages))
_clear_stale_stream_state(s)
_t2 = _time.monotonic()
effective_model = (
_resolve_effective_session_model_for_display(s)
@@ -1435,6 +1464,7 @@ def handle_get(handler, parsed) -> bool:
return bad(handler, "Missing session_id")
try:
from api.session_ops import session_status
_clear_stale_stream_state(get_session(sid, metadata_only=True))
return j(handler, session_status(sid))
except KeyError:
return bad(handler, "Session not found", 404)
@@ -4265,7 +4295,7 @@ def _handle_chat_start(handler, body):
status=409,
)
# Stale stream id from a previous run; clear and continue.
-s.active_stream_id = None
+_clear_stale_stream_state(s)
stream_id = uuid.uuid4().hex
with _get_session_agent_lock(s.session_id):
s.workspace = workspace
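The behavior of the cleanup helper in this file can be exercised without the server. The sketch below is an approximation under stated assumptions: `FakeSession` is a hypothetical stand-in for the real session object, and `STREAMS` / `STREAMS_LOCK` are re-declared locally as a plain dict and lock rather than imported from the project.

```python
import threading
from dataclasses import dataclass, field
from typing import Optional

# Local stand-ins for the in-memory stream registry (assumed shape: id -> stream).
STREAMS = {}
STREAMS_LOCK = threading.Lock()


@dataclass
class FakeSession:
    """Hypothetical session with the persisted streaming fields."""
    active_stream_id: Optional[str] = None
    pending_user_message: Optional[str] = None
    pending_attachments: list = field(default_factory=list)
    pending_started_at: Optional[float] = None
    saves: int = 0

    def save(self):
        self.saves += 1  # count writes instead of touching disk


def clear_stale_stream_state(session) -> bool:
    """Reset streaming fields only when the stream id is absent from STREAMS."""
    stream_id = session.active_stream_id
    if not stream_id:
        return False
    with STREAMS_LOCK:
        stream_alive = stream_id in STREAMS
    if stream_alive:
        return False  # live streams are never expired, regardless of age
    session.active_stream_id = None
    session.pending_user_message = None
    session.pending_attachments = []
    session.pending_started_at = None
    session.save()
    return True


STREAMS["s1"] = object()
live = FakeSession(active_stream_id="s1", pending_user_message="hi")
stale = FakeSession(
    active_stream_id="gone",
    pending_user_message="hi",
    pending_attachments=["a.png"],
    pending_started_at=1.0,
)
live_cleared = clear_stale_stream_state(live)
stale_cleared = clear_stale_stream_state(stale)
```

As in the helper above, the registry check happens under the lock while the session mutation happens after it is released, which is the narrow window the release notes defer to #1533.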
+29
@@ -1792,6 +1792,25 @@ def _run_agent_streaming(
import inspect as _inspect
_agent_params = set(_inspect.signature(_AIAgent.__init__).parameters)
# CLI-parity max output cap: read config.yaml's max_tokens and pass
# it to AIAgent when supported. Without this WebUI-created agents use
# provider-native output ceilings (e.g. Claude via OpenRouter can
# request 64k), which may turn an otherwise usable fallback into a
# 402 "more credits / fewer max_tokens" failure.
_max_tokens_cfg = None
try:
_raw_max_tokens = _cfg.get('max_tokens')
if _raw_max_tokens is None:
_agent_cfg_for_tokens = _cfg.get('agent', {})
if isinstance(_agent_cfg_for_tokens, dict):
_raw_max_tokens = _agent_cfg_for_tokens.get('max_tokens')
if _raw_max_tokens is not None:
_parsed_max_tokens = int(_raw_max_tokens)
if _parsed_max_tokens > 0:
_max_tokens_cfg = _parsed_max_tokens
except Exception:
_max_tokens_cfg = None
# CLI-parity reasoning effort: read agent.reasoning_effort from the
# active profile's config.yaml (the same key the CLI writes via
# `/reasoning <level>`) and hand the parsed dict to AIAgent. When
@@ -1830,6 +1849,8 @@ def _run_agent_streaming(
# but guard defensively to avoid TypeError on an older agent build.
if 'reasoning_config' in _agent_params and _reasoning_config is not None:
_agent_kwargs['reasoning_config'] = _reasoning_config
if 'max_tokens' in _agent_params and _max_tokens_cfg is not None:
_agent_kwargs['max_tokens'] = _max_tokens_cfg
# Params added in newer hermes-agent — skip if not supported
if 'api_mode' in _agent_params:
_agent_kwargs['api_mode'] = _rt.get('api_mode')
@@ -1861,6 +1882,8 @@ def _run_agent_streaming(
_hashlib.sha256((resolved_api_key or '').encode()).hexdigest()[:16],
resolved_base_url or '',
resolved_provider or '',
_max_tokens_cfg or '',
_fallback_resolved or {},
sorted(_toolsets) if _toolsets else [],
], sort_keys=True)
_agent_sig = _hashlib.sha256(_sig_blob.encode()).hexdigest()[:16]
@@ -2098,6 +2121,9 @@ def _run_agent_streaming(
'insufficient credit' in _err_lower
or 'credit balance' in _err_lower
or 'credits exhausted' in _err_lower
or 'more credits' in _err_lower
or 'can only afford' in _err_lower
or 'fewer max_tokens' in _err_lower
or 'quota_exceeded' in _err_lower
or 'quota exceeded' in _err_lower
or 'exceeded your current quota' in _err_lower
@@ -2433,6 +2459,9 @@ def _run_agent_streaming(
'insufficient credit' in _exc_lower
or 'credit balance' in _exc_lower
or 'credits exhausted' in _exc_lower
or 'more credits' in _exc_lower
or 'can only afford' in _exc_lower
or 'fewer max_tokens' in _exc_lower
or 'quota_exceeded' in _exc_lower
or 'quota exceeded' in _exc_lower
or 'exceeded your current quota' in _exc_lower
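The `max_tokens` handling in this file combines two ideas: a tolerant config lookup (top-level key first, then `agent.max_tokens`) and a constructor check through `inspect.signature` so older agent builds are not handed an unknown keyword. A self-contained sketch follows. `parse_max_tokens`, `build_agent_kwargs`, `OldAgent`, and `NewAgent` are hypothetical names for illustration, and the sketch catches a narrower exception set than the broad `except Exception` in the diff.

```python
import inspect


def parse_max_tokens(cfg):
    """Return a positive int from cfg['max_tokens'] or cfg['agent']['max_tokens'], else None."""
    raw = cfg.get("max_tokens")
    if raw is None:
        agent_cfg = cfg.get("agent", {})
        if isinstance(agent_cfg, dict):
            raw = agent_cfg.get("max_tokens")
    if raw is None:
        return None
    try:
        parsed = int(raw)
    except (TypeError, ValueError, OverflowError):
        return None
    return parsed if parsed > 0 else None


def build_agent_kwargs(agent_cls, cfg, **base_kwargs):
    """Add max_tokens only when the constructor actually declares that parameter."""
    accepted = set(inspect.signature(agent_cls.__init__).parameters)
    kwargs = dict(base_kwargs)
    max_tokens = parse_max_tokens(cfg)
    if "max_tokens" in accepted and max_tokens is not None:
        kwargs["max_tokens"] = max_tokens
    return kwargs


class OldAgent:
    """Stand-in for an agent build that predates the max_tokens parameter."""
    def __init__(self, model):
        self.model = model


class NewAgent:
    """Stand-in for an agent build that accepts max_tokens."""
    def __init__(self, model, max_tokens=None):
        self.model = model
        self.max_tokens = max_tokens


cfg = {"agent": {"max_tokens": "4096"}}
old_kwargs = build_agent_kwargs(OldAgent, cfg, model="m")
new_kwargs = build_agent_kwargs(NewAgent, cfg, model="m")
old_agent = OldAgent(**old_kwargs)
new_agent = NewAgent(**new_kwargs)
```

Checking the signature up front, instead of wrapping the constructor call in `try/except TypeError`, keeps genuine `TypeError`s raised inside the constructor visible.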
+5 -2
@@ -470,14 +470,17 @@ window._micPendingSend=window._micPendingSend||false;
try{ return localStorage.getItem('hermes-voice-mode-button')==='true'; }
catch(_){ return false; }
}
+let _voiceModeActive=false;
function _applyVoiceModePref(){
-modeBtn.style.display = _voiceModePrefEnabled() ? '' : 'none';
+const enabled = _voiceModePrefEnabled();
+modeBtn.style.display = enabled ? '' : 'none';
+if(!enabled && _voiceModeActive) _deactivate();
}
_applyVoiceModePref();
// Expose so the settings pane can re-apply immediately on toggle.
window._applyVoiceModePref = _applyVoiceModePref;
-let _voiceModeActive=false;
let _voiceModeState='idle'; // idle | listening | thinking | speaking
let _recognition=null;
let _silenceTimer=null;
+1 -1
@@ -197,7 +197,7 @@ function _renderOnboardingApiKeyField(){
const labelKey=keyOptional?'onboarding_api_key_label_optional':'onboarding_api_key_label';
const placeholderKey=keyOptional?'onboarding_api_key_placeholder_optional':'onboarding_api_key_placeholder';
const helpHtml=keyOptional?`<p class="onboarding-copy onboarding-api-key-help">${esc(t('onboarding_api_key_help_keyless')||'')}</p>`:'';
-return `<label class="onboarding-field" id="onboardingApiKeyField"><span>${t(labelKey)}</span><input id="onboardingApiKeyInput" type="password" value="${esc(ONBOARDING.form.apiKey||'')}" placeholder="${t(placeholderKey)}" oninput="ONBOARDING.form.apiKey=this.value;_scheduleOnboardingProbe()"></label>${helpHtml}`;
+return `<label class="onboarding-field" id="onboardingApiKeyField"><span>${t(labelKey)}</span><input id="onboardingApiKeyInput" type="password" value="${esc(ONBOARDING.form.apiKey||'')}" placeholder="${t(placeholderKey)}" oninput="ONBOARDING.form.apiKey=this.value" onblur="_runOnboardingProbe()"></label>${helpHtml}`;
}
function _getOnboardingSelectedModel(){
+10 -1
@@ -387,6 +387,15 @@ async function loadSession(sid){
_setActiveSessionUrl(S.session.session_id);
const activeStreamId=S.session.active_stream_id||null;
// If the server says the session is idle, discard any browser-side inflight
// cache left behind by a crashed/restarted stream. Otherwise the UI can keep
// showing a permanent thinking/running state even though active_streams=0.
if(!activeStreamId&&INFLIGHT[sid]){
delete INFLIGHT[sid];
if(typeof clearInflightState==='function') clearInflightState(sid);
S.activeStreamId=null;
S.busy=false;
}
// Phase 2a: If session is streaming, restore from INFLIGHT cache before
// loading full messages (INFLIGHT state is self-contained and sufficient).
@@ -1654,7 +1663,7 @@ function renderSessionListFromCache(){
if(s.parent_session_id){
const branchInd=document.createElement('span');
branchInd.className='session-branch-indicator';
-branchInd.textContent='\u2482'; // ⑂
+branchInd.textContent='\u2442'; // ⑂
branchInd.title=(typeof t==='function'?t('forked_from'):'Forked from')+' '+s.parent_session_id;
branchInd.style.cursor='pointer';
branchInd.onclick=(e)=>{
+4
@@ -735,6 +735,8 @@
.msg-body pre code{background:none;padding:0;border-radius:0;color:var(--pre-text);font-size:13px;line-height:1.6;}
/* Keep original theme background — prevent prism-tomorrow from overriding --code-bg */
.msg-body pre[class*="language-"],.msg-body pre code[class*="language-"]{background:var(--code-bg) !important;}
/* Fix #1463: Prism YAML grammar collapses newlines inside token spans — force pre */
.msg-body pre code.language-yaml .token{white-space:pre !important;}
.pre-header{font-size:10px;font-weight:600;text-transform:uppercase;letter-spacing:.06em;color:var(--muted);padding:8px 16px 8px;background:var(--input-bg);border-radius:10px 10px 0 0;border:1px solid var(--border);border-bottom:1px solid var(--border);display:flex;align-items:center;gap:6px;}
.pre-header::before{content:'';width:8px;height:8px;border-radius:50%;background:var(--muted);opacity:.4;}
.pre-header+pre{border-radius:0 0 10px 10px;border-top:none;margin-top:0;}
@@ -1128,6 +1130,8 @@
.preview-md pre code{background:none;padding:0;color:var(--pre-text);font-size:11.5px;line-height:1.55;}
/* Keep original theme background — prevent prism-tomorrow from overriding --code-bg */
.preview-md pre[class*="language-"],.preview-md pre code[class*="language-"]{background:var(--code-bg) !important;}
/* Fix #1463: Prism YAML grammar collapses newlines inside token spans — force pre */
.preview-md pre code.language-yaml .token{white-space:pre !important;}
.preview-md blockquote{border-left:3px solid var(--blue);padding-left:12px;color:var(--muted);font-style:italic;margin:8px 0;}
.preview-md blockquote p{margin:0;}
.preview-md strong{color:var(--strong);font-weight:600;}.preview-md em{color:var(--em);}
+3 -3
@@ -7,18 +7,18 @@
// Cache version is injected by the server at request time (routes.py /sw.js handler).
// Bumps automatically whenever the git commit changes — no manual edits needed.
-const CACHE_NAME = 'hermes-shell-__CACHE_VERSION__';
+const CACHE_NAME = 'hermes-shell-__WEBUI_VERSION__';
// Static assets that form the app shell.
//
-// Versioned assets (CSS + JS) include `?v=__CACHE_VERSION__` to match the
+// Versioned assets (CSS + JS) include `?v=__WEBUI_VERSION__` to match the
// query string the page sends — see index.html. Without the version query
// here, every cache lookup against `?v=...` URLs would miss and fall through
// to network, defeating the pre-cache.
//
// Unversioned assets (`./`, manifest.json, favicons) are referenced from
// index.html without a cache-bust query, so they stay unversioned here too.
-const VQ = '?v=__CACHE_VERSION__';
+const VQ = '?v=__WEBUI_VERSION__';
const SHELL_ASSETS = [
'./',
'./static/style.css' + VQ,
+1 -1
@@ -225,7 +225,7 @@ def test_sidebar_parent_indicator():
"sessions.js should check parent_session_id"
assert 'session-branch-indicator' in src, \
"Should have session-branch-indicator class"
-assert '\\u2482' in src, \
+assert '\\u2442' in src, \
"Should use ⑂ character for parent indicator"
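The two codepoints differ by a single hex digit, which is easy to miss in review. One way to double-check an escape like this is to ask Python's `unicodedata` module for the character names. The `describe` helper below is a hypothetical convenience for this illustration, not part of the test suite.

```python
import unicodedata

WRONG = "\u2482"     # the codepoint that was shipped by mistake
INTENDED = "\u2442"  # the fork glyph the sidebar should show


def describe(ch):
    """Format a single character as 'U+XXXX NAME' using the Unicode database."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


wrong_description = describe(WRONG)
intended_description = describe(INTENDED)
```

A name-based assertion of this kind would fail loudly on a transposed digit, whereas a substring check on the escape sequence only confirms that the same escape appears in the source.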
+35
@@ -208,6 +208,41 @@ def test_gateway_sessions_appear_when_enabled():
post('/api/settings', {'show_cli_sessions': False})
def test_webui_state_db_session_without_sidecar_appears_when_agent_sessions_enabled():
"""Regression: WebUI-origin rows in state.db can recover missing JSON sidecars."""
conn = _ensure_state_db()
sid = 'webui_state_only_001'
try:
_insert_agent_session_row(
conn,
session_id=sid,
source='webui',
title='Recovered WebUI Session',
model='openai/gpt-5',
messages=2,
)
post('/api/settings', {'show_cli_sessions': True})
data, status = get('/api/sessions')
assert status == 200
sessions = data.get('sessions', [])
recovered = [s for s in sessions if s.get('session_id') == sid]
assert len(recovered) == 1, (
"WebUI-origin sessions that exist in state.db but have no JSON sidecar "
"should be surfaced through the agent-session bridge for recovery."
)
assert recovered[0].get('source_tag') == 'webui'
assert recovered[0].get('is_cli_session') is True
finally:
try:
_remove_test_sessions(conn, sid)
conn.close()
except Exception:
pass
post('/api/settings', {'show_cli_sessions': False})
def test_gateway_sessions_without_messages_are_hidden_from_sidebar():
"""Regression: empty agent session rows must not appear as broken sidebar entries."""
conn = _ensure_state_db()
+28 -9
@@ -2,7 +2,7 @@
Covers:
- manifest.json is valid JSON with required PWA fields
-- sw.js has the `__CACHE_VERSION__` placeholder the server replaces at request time
+- sw.js has the `__WEBUI_VERSION__` placeholder the server replaces at request time
- sw.js offline-fallback uses a resolved promise (not `caches.match() || fallback`
which is broken: Promise objects are always truthy in `||` checks, so the
fallback Response would never be used)
@@ -52,11 +52,30 @@ class TestManifest:
class TestServiceWorker:
def test_sw_has_cache_version_placeholder(self):
src = SW.read_text(encoding="utf-8")
-assert "__CACHE_VERSION__" in src, (
-"sw.js must contain __CACHE_VERSION__ placeholder for the server "
+assert "__WEBUI_VERSION__" in src, (
+"sw.js must contain __WEBUI_VERSION__ placeholder for the server "
"handler at /sw.js to replace with WEBUI_VERSION at request time"
)
def test_sw_js_has_no_merge_conflict_markers(self):
"""Regression guard for v0.50.279 stage build: a leftover git conflict
marker in static/sw.js made the file fail to parse as JavaScript even
though the substring-based source-string tests still passed (the
``__WEBUI_VERSION__`` token was present, just inside the conflict block).
A broken sw.js means the install handler throws on script load → SW
never reaches activated state → old SW keeps controlling the page →
every "old SW deletes other caches" guarantee is forfeited and frontend
cache-bust pathways silently break. Caught by Opus advisor pre-merge,
ship blocked. This test would have caught it too.
"""
src = SW.read_text(encoding="utf-8")
for marker in ("<<<<<<<", "=======\n", ">>>>>>>"):
assert marker not in src, (
f"static/sw.js contains conflict marker {marker!r}; "
"the merge resolution did not actually land. Reject ship."
)
def test_sw_bypasses_api_and_stream(self):
src = SW.read_text(encoding="utf-8")
assert "/api/" in src, "SW must bypass /api/* (no cached auth/session responses)"
@@ -117,8 +136,8 @@ class TestPWARoutes:
idx = src.find('"/sw.js"')
assert idx != -1, "routes.py must handle /sw.js"
block = src[idx:idx + 1000]
-assert "__CACHE_VERSION__" in block, (
-"sw.js route must replace __CACHE_VERSION__ with the current WEBUI_VERSION"
+assert "__WEBUI_VERSION__" in block, (
+"sw.js route must replace __WEBUI_VERSION__ with the current WEBUI_VERSION"
)
assert "WEBUI_VERSION" in block, (
"sw.js route must import and use WEBUI_VERSION for cache busting"
@@ -185,7 +204,7 @@ class TestIndexHtmlIntegration:
def test_sw_shell_assets_match_versioned_asset_urls(self):
"""The service worker's SHELL_ASSETS pre-cache list must use the same
-`?v=__CACHE_VERSION__` suffix on JS+CSS that index.html sends, so that
+`?v=__WEBUI_VERSION__` suffix on JS+CSS that index.html sends, so that
the pre-cached entries actually serve when the page requests them.
Without this, every `cache.match()` for a versioned asset URL (e.g.
@@ -208,13 +227,13 @@ class TestIndexHtmlIntegration:
"terminal.js",
"onboarding.js",
):
-# Either inline `?v=__CACHE_VERSION__` or via the VQ constant
+# Either inline `?v=__WEBUI_VERSION__` or via the VQ constant
# produces a URL string the cache lookup can match.
-has_inline = f"{asset}?v=__CACHE_VERSION__" in src
+has_inline = f"{asset}?v=__WEBUI_VERSION__" in src
has_concat = f"{asset}' + VQ" in src or f"{asset}\" + VQ" in src
assert has_inline or has_concat, (
f"sw.js SHELL_ASSETS entry for {asset} must carry "
-"?v=__CACHE_VERSION__ to match the URL the page requests"
+"?v=__WEBUI_VERSION__ to match the URL the page requests"
)
def test_index_route_url_encodes_asset_version(self):
+65
@@ -0,0 +1,65 @@
from pathlib import Path

REPO = Path(__file__).resolve().parents[1]
ROUTES_SRC = (REPO / "api" / "routes.py").read_text(encoding="utf-8")
SESSIONS_SRC = (REPO / "static" / "sessions.js").read_text(encoding="utf-8")
SW_SRC = (REPO / "static" / "sw.js").read_text(encoding="utf-8")


def test_stale_stream_cleanup_helper_exists():
    assert "def _clear_stale_stream_state(session)" in ROUTES_SRC
    assert "stream_id in STREAMS" in ROUTES_SRC
    assert "session.active_stream_id = None" in ROUTES_SRC
    assert "session.pending_user_message = None" in ROUTES_SRC
    assert "session.pending_attachments = []" in ROUTES_SRC
    assert "session.pending_started_at = None" in ROUTES_SRC
    assert "session.save()" in ROUTES_SRC


def test_session_load_clears_stale_stream_before_response():
    load_pos = ROUTES_SRC.index("s = get_session(sid, metadata_only=(not load_messages))")
    cleanup_pos = ROUTES_SRC.index("_clear_stale_stream_state(s)", load_pos)
    response_pos = ROUTES_SRC.index('"active_stream_id": getattr(s, "active_stream_id", None)', cleanup_pos)
    assert load_pos < cleanup_pos < response_pos


def test_chat_start_clears_stale_pending_state_not_only_active_id():
    stale_comment_pos = ROUTES_SRC.index("# Stale stream id from a previous run; clear and continue.")
    cleanup_pos = ROUTES_SRC.index("_clear_stale_stream_state(s)", stale_comment_pos)
    stream_id_pos = ROUTES_SRC.index("stream_id = uuid.uuid4().hex", cleanup_pos)
    assert stale_comment_pos < cleanup_pos < stream_id_pos


def test_frontend_drops_inflight_cache_when_server_session_is_idle():
    marker = "If the server says the session is idle, discard any browser-side inflight"
    marker_pos = SESSIONS_SRC.index(marker)
    window = SESSIONS_SRC[marker_pos:marker_pos + 500]
    assert "if(!activeStreamId&&INFLIGHT[sid])" in window
    assert "delete INFLIGHT[sid]" in window
    assert "clearInflightState" in window
    assert "S.busy=false" in window


def test_service_worker_cache_bumped_for_frontend_fix_delivery():
    """The SW CACHE_NAME must be keyed on the WEBUI_VERSION placeholder so
    every release naturally invalidates the previous shell cache and delivers
    the frontend half of the stale-stream cleanup fix to existing browsers.

    Originally pinned a manual `-stale-stream-cleanup1` suffix on
    `CACHE_NAME` (PR #1525 author shipped that to force-bump existing
    SWs). During the v0.50.279 stage build that suffix collided with the
    independent #1517 placeholder rename (`__CACHE_VERSION__` →
    `__WEBUI_VERSION__`), so the maintainer dropped the manual suffix in
    favor of the canonical version-token path. The natural bump still
    invalidates the old cache via `keys.filter((k) => k !== CACHE_NAME)`
    in the activate handler: same delivery guarantee, less churn.
    """
    # CACHE_NAME must include the WEBUI_VERSION placeholder so each release
    # produces a different cache name. The activate handler then deletes any
    # cache whose key != current CACHE_NAME, so the old shell is reaped on
    # every upgrade and the new sessions.js (with the INFLIGHT[sid] clear)
    # ships to existing browsers.
    assert "CACHE_NAME = 'hermes-shell-__WEBUI_VERSION__'" in SW_SRC, (
        "SW CACHE_NAME must include __WEBUI_VERSION__ so each release "
        "invalidates the previous cache and delivers frontend changes."
    )
+39
@@ -0,0 +1,39 @@
"""Regression coverage for WebUI streaming provider failure handling.

The incident this guards against: WebUI-created AIAgent instances did not pass
config.yaml's max_tokens, so a fallback Claude model via OpenRouter requested its
native 64k output ceiling and failed with HTTP 402 "more credits / fewer
max_tokens". The stream then looked like a stuck Thinking card instead of a
clear quota error.
"""

from pathlib import Path

STREAMING = Path(__file__).resolve().parents[1] / "api" / "streaming.py"


def _src() -> str:
    return STREAMING.read_text(encoding="utf-8")


def test_streaming_passes_configured_max_tokens_to_agent():
    src = _src()
    assert "_raw_max_tokens = _cfg.get('max_tokens')" in src
    assert "_agent_cfg_for_tokens.get('max_tokens')" in src
    assert "_agent_kwargs['max_tokens'] = _max_tokens_cfg" in src


def test_streaming_agent_cache_signature_includes_max_tokens_and_fallback():
    src = _src()
    assert "_max_tokens_cfg or ''" in src
    assert "_fallback_resolved or {}" in src


def test_openrouter_more_credits_error_is_classified_as_quota():
    src = _src()
    assert "'more credits' in _err_lower" in src
    assert "'can only afford' in _err_lower" in src
    assert "'fewer max_tokens' in _err_lower" in src
    assert "'more credits' in _exc_lower" in src
    assert "'can only afford' in _exc_lower" in src
    assert "'fewer max_tokens' in _exc_lower" in src
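The tests above only confirm that the marker strings appear in the source. The classification behavior itself can be shown with a small stand-alone function. Note the assumptions: the diff expresses the check as two separate `or` chains (over `_err_lower` and `_exc_lower`), whereas this sketch folds the phrases into one tuple; `is_quota_error` is a hypothetical name; and the marker list contains only the phrases visible in the hunks, so the real chains may include others.

```python
# Phrases visible in the diff hunks; the real classifier may check more.
QUOTA_MARKERS = (
    "insufficient credit",
    "credit balance",
    "credits exhausted",
    "more credits",
    "can only afford",
    "fewer max_tokens",
    "quota_exceeded",
    "quota exceeded",
    "exceeded your current quota",
)


def is_quota_error(message):
    """Return True when the lowercased message contains any quota marker."""
    lowered = str(message).lower()
    return any(marker in lowered for marker in QUOTA_MARKERS)


openrouter_402 = "HTTP 402: This request requires more credits, or fewer max_tokens."
afford_msg = "You can only afford 1200 tokens"
network_msg = "Connection reset by peer"

results = [is_quota_error(m) for m in (openrouter_402, afford_msg, network_msg)]
```

Keeping the phrases in one shared tuple would also avoid the two call sites drifting apart, although that is a design suggestion rather than something this release does.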