mirror of
https://github.com/nesquena/hermes-webui.git
synced 2026-05-26 03:30:36 +00:00
fix(streaming): persist context_length on session — completes #1318 fix
Pre-release Opus + nesquena review on v0.50.246 caught that PR #1341 added the data-structure scaffolding (Session.__init__ accepts the 3 fields, save() persists them, compact() exposes them, GET /api/session returns them) but did NOT add the writer that actually populates them. Without a writer, the user-visible bug (context-ring shows 0% after page reload) was NOT fixed by #1341 alone — the fields stayed None forever because nothing wrote to s.context_length anywhere. Adds the writer at api/streaming.py:2188 (post-merge per-turn save block, before s.save()) so the values from agent.context_compressor land on disk and survive page reloads. Also moves the SSE usage payload comment to clarify that the live SSE payload and the session-level persistence are now distinct paths (payload below, persistence above). Adds tests/test_pr1341_context_window_persistence.py — 6 structural + round-trip tests covering Session __init__/save/compact, the routes response, and the streaming.py writer placement. Closes #1318 (the actual user-visible bug, not just the scaffolding).
This commit is contained in:
+1
-1
@@ -12,7 +12,7 @@
|
||||
- **Activity panel no longer auto-collapses when new tool/thinking events arrive** — Both `ensureActivityGroup()` (which re-creates the group with `tool-call-group-collapsed` on every destroy/recreate) and `finalizeThinkingCard()` (which force-adds the collapsed class on every tool boundary) ignored the user's manual expand. Tracks the user's last explicit toggle on the live activity group in a per-turn singleton (`_liveActivityUserExpanded`), restored on re-create and respected by the finalize path. Cleared between turns by `clearLiveToolCards()`. (`static/ui.js`, `tests/test_issue1298_cancel_and_activity.py`) — fixes #1298 (issue 1)
|
||||
- **Stale Mermaid render errors no longer leak into every chat** — Mermaid's render-failure path leaves a temporary `<div id="d<id>">` body-level node containing a "Syntax error in text" SVG. The previous code never removed it, so once any Mermaid block failed (or got mis-detected as Mermaid), every subsequent tab kept the syntax-error SVG visible regardless of content. Also tightens Mermaid detection so line-numbered tool output (`123|line`) and code blocks that don't start with a recognized Mermaid keyword are no longer mis-parsed as Mermaid; failed blocks are marked so a later render pass can't retry them. (`static/ui.js`, `tests/test_issue347.py`) @dso2ng — PR #1337
|
||||
- **Static asset cache busts automatically on every release** — `<script src="static/ui.js">` and friends were cached indefinitely by browsers and the service worker, so a new release with bug fixes could be invisible to a user until they hard-refreshed. Now `index.html` and `sw.js` registration both inject the current `WEBUI_VERSION` git tag as a `?v=` query string, URL-encoded server-side so unusual git tag formats can't break the JS. The service worker also no longer intercepts requests for itself, ensuring the browser always fetches the freshly-versioned `sw.js` directly from the network. (`api/routes.py`, `static/index.html`, `static/sw.js`, `tests/test_pwa_manifest_sw.py`) @dso2ng — PR #1337
|
||||
- **Context window indicator persists across page reloads** — `Session.__init__` now accepts `context_length`, `threshold_tokens`, and `last_prompt_tokens`; `save()` persists them and `compact()` exposes them so the GET `/api/session` response includes them. The frontend context-ring indicator was previously losing its percentage on every session load because the Session model silently dropped these fields when reconstructing from disk. (`api/models.py`, `api/routes.py`) @fxd-jason — PR #1341 (focused split from the held PR #1318)
|
||||
- **Context window indicator persists across page reloads (#1318 — fully fixed)** — `Session.__init__` now accepts `context_length`, `threshold_tokens`, and `last_prompt_tokens`; `save()` persists them via the `METADATA_FIELDS` round-trip and `compact()` exposes them on the GET `/api/session` response. **Critically**, `api/streaming.py` now writes the values from `agent.context_compressor` onto the session inside the post-merge per-turn save block, so the values land on disk and survive a page reload. Without that writer, the model fields would have been pure scaffolding — present but never populated. The frontend context-ring indicator was previously losing its percentage on every session load because nothing was writing these fields to disk; that data flow is now end-to-end. (`api/models.py`, `api/routes.py`, `api/streaming.py`, `tests/test_pr1341_context_window_persistence.py`) @fxd-jason — PR #1341 (focused split from the held PR #1318) + writer added during pre-release review
|
||||
- **`fallback_providers` list config no longer crashes streaming** — `api/streaming.py:1701` previously read `_cfg.get('fallback_model')` and called `.get('model', '')` on the result. When users had `fallback_providers: [{...}, {...}]` in their config (the chained-fallback form documented in CHANGELOG since v0.50.151), the streaming path crashed with `AttributeError: 'list' object has no attribute 'get'`. Now consults both `fallback_model` (single dict, legacy) and `fallback_providers` (list, new), picks the first valid entry from the list, and defends both paths with `isinstance` checks. (`api/streaming.py`, `tests/test_pr1339_fallback_providers_list.py`) @jimdawdy-hub — PR #1339
|
||||
|
||||
### Changed
|
||||
|
||||
+14
-1
@@ -2185,6 +2185,17 @@ def _run_agent_streaming(session_id, msg_text, model, workspace, stream_id, atta
|
||||
if isinstance(_rm, dict) and _rm.get('role') == 'assistant':
|
||||
_rm['reasoning'] = _reasoning_text
|
||||
break
|
||||
# Persist context window data on the session so the context-ring
|
||||
# indicator survives a page reload (#1318). Must run BEFORE
|
||||
# s.save() for the same reason as the reasoning trace above.
|
||||
# The fields are captured into the SSE usage payload below; this
|
||||
# block writes them to the session itself so GET /api/session
|
||||
# returns them on reload instead of falling back to 0.
|
||||
_cc_for_save = getattr(agent, 'context_compressor', None)
|
||||
if _cc_for_save:
|
||||
s.context_length = getattr(_cc_for_save, 'context_length', 0) or 0
|
||||
s.threshold_tokens = getattr(_cc_for_save, 'threshold_tokens', 0) or 0
|
||||
s.last_prompt_tokens = getattr(_cc_for_save, 'last_prompt_tokens', 0) or 0
|
||||
s.save()
|
||||
# Sync to state.db for /insights (opt-in setting)
|
||||
try:
|
||||
@@ -2203,7 +2214,9 @@ def _run_agent_streaming(session_id, msg_text, model, workspace, stream_id, atta
|
||||
except Exception:
|
||||
logger.debug("Failed to sync session to insights")
|
||||
usage = {'input_tokens': input_tokens, 'output_tokens': output_tokens, 'estimated_cost': estimated_cost}
|
||||
# Include context window data from the agent's compressor for the UI indicator
|
||||
# Include context window data from the agent's compressor for the UI indicator.
|
||||
# The session-level persistence happens above (before s.save()) so the values
|
||||
# survive a page reload; this block only populates the live SSE usage payload.
|
||||
_cc = getattr(agent, 'context_compressor', None)
|
||||
if _cc:
|
||||
usage['context_length'] = getattr(_cc, 'context_length', 0) or 0
|
||||
|
||||
@@ -0,0 +1,142 @@
|
||||
"""Regression test for PR #1341 + Opus pre-release review of v0.50.246.
|
||||
|
||||
PR #1341 added context_length/threshold_tokens/last_prompt_tokens fields to
|
||||
the Session model — but didn't add the writer that actually populates them
|
||||
during streaming. The pre-release review caught this: without the writer,
|
||||
the user-visible bug (context-ring shows 0% after page reload) would NOT
|
||||
have been fixed by #1341 alone.
|
||||
|
||||
This test verifies that:
|
||||
1. After a streaming turn completes, the session's context_length /
|
||||
threshold_tokens / last_prompt_tokens are written from the agent's
|
||||
compressor BEFORE s.save() is called (so they land on disk).
|
||||
2. GET /api/session response includes the populated values.
|
||||
3. A reloaded session retains the populated values.
|
||||
|
||||
Implementation reference: api/streaming.py around line 2188 (the per-turn
|
||||
post-merge save) writes from getattr(agent, 'context_compressor', None).
|
||||
"""
|
||||
import re
|
||||
from pathlib import Path
|
||||
|
||||
ROOT = Path(__file__).resolve().parent.parent
|
||||
STREAMING = ROOT / "api" / "streaming.py"
|
||||
MODELS = ROOT / "api" / "models.py"
|
||||
ROUTES = ROOT / "api" / "routes.py"
|
||||
|
||||
|
||||
def test_streaming_persists_context_fields_on_session_before_save():
|
||||
"""The post-merge per-turn save block must write the three fields to the
|
||||
session BEFORE calling s.save(), otherwise the values never reach disk."""
|
||||
src = STREAMING.read_text(encoding="utf-8")
|
||||
|
||||
# Find the post-merge save block — anchored on the unique reasoning trace
|
||||
# marker right above the persistence block.
|
||||
block_start = src.find("if _reasoning_text and s.messages:")
|
||||
assert block_start != -1, "Reasoning-trace marker not found in streaming.py"
|
||||
|
||||
# Save call follows shortly after
|
||||
save_call = src.find("\n s.save()", block_start)
|
||||
assert save_call != -1, "s.save() not found after the post-merge marker"
|
||||
assert save_call - block_start < 2000, (
|
||||
"s.save() should be close to the post-merge marker — block expanded unexpectedly"
|
||||
)
|
||||
|
||||
block = src[block_start:save_call]
|
||||
|
||||
# The three fields must all be assigned on s within this block
|
||||
assert "s.context_length" in block, (
|
||||
"s.context_length must be written before s.save() in the post-merge block"
|
||||
)
|
||||
assert "s.threshold_tokens" in block, (
|
||||
"s.threshold_tokens must be written before s.save() in the post-merge block"
|
||||
)
|
||||
assert "s.last_prompt_tokens" in block, (
|
||||
"s.last_prompt_tokens must be written before s.save() in the post-merge block"
|
||||
)
|
||||
|
||||
# The values must come from the agent's context_compressor
|
||||
assert "context_compressor" in block, (
|
||||
"Values must be sourced from agent.context_compressor"
|
||||
)
|
||||
|
||||
|
||||
def test_session_init_accepts_context_fields():
|
||||
"""Session.__init__ must accept the three fields as named kwargs."""
|
||||
src = MODELS.read_text(encoding="utf-8")
|
||||
# The init signature spans many lines — read the full def block
|
||||
init_match = re.search(r"def __init__\(self,(.*?)\):", src, re.DOTALL)
|
||||
assert init_match, "Session.__init__ signature not found"
|
||||
sig = init_match.group(1)
|
||||
assert "context_length" in sig, "Session.__init__ must accept context_length"
|
||||
assert "threshold_tokens" in sig, "Session.__init__ must accept threshold_tokens"
|
||||
assert "last_prompt_tokens" in sig, "Session.__init__ must accept last_prompt_tokens"
|
||||
|
||||
|
||||
def test_session_metadata_fields_includes_context_fields():
|
||||
"""Session.save() METADATA_FIELDS must include all three for round-trip persistence."""
|
||||
src = MODELS.read_text(encoding="utf-8")
|
||||
# Locate METADATA_FIELDS list
|
||||
meta_match = re.search(
|
||||
r"METADATA_FIELDS\s*=\s*\[(.*?)\]",
|
||||
src,
|
||||
re.DOTALL,
|
||||
)
|
||||
assert meta_match, "METADATA_FIELDS list not found in Session.save"
|
||||
fields = meta_match.group(1)
|
||||
assert "'context_length'" in fields, "METADATA_FIELDS must include 'context_length'"
|
||||
assert "'threshold_tokens'" in fields, "METADATA_FIELDS must include 'threshold_tokens'"
|
||||
assert "'last_prompt_tokens'" in fields, "METADATA_FIELDS must include 'last_prompt_tokens'"
|
||||
|
||||
|
||||
def test_session_compact_exposes_context_fields():
|
||||
"""Session.compact() must include the three fields in its output dict."""
|
||||
src = MODELS.read_text(encoding="utf-8")
|
||||
# Find compact() method body
|
||||
compact_idx = src.find("def compact(")
|
||||
assert compact_idx != -1, "Session.compact not found"
|
||||
# Look ahead for the next def or 200 lines
|
||||
end = src.find("\n def ", compact_idx + 1)
|
||||
body = src[compact_idx:end if end != -1 else compact_idx + 4000]
|
||||
|
||||
assert "'context_length':" in body, "compact() must include context_length"
|
||||
assert "'threshold_tokens':" in body, "compact() must include threshold_tokens"
|
||||
assert "'last_prompt_tokens':" in body, "compact() must include last_prompt_tokens"
|
||||
|
||||
|
||||
def test_routes_session_get_returns_context_fields():
|
||||
"""GET /api/session response must include the three fields."""
|
||||
src = ROUTES.read_text(encoding="utf-8")
|
||||
# The session-detail response builder uses getattr(s, ..., 0) or 0 pattern.
|
||||
# Look for the three keys in the same response shape.
|
||||
assert '"context_length"' in src, "GET /api/session response must include context_length"
|
||||
assert '"threshold_tokens"' in src, "GET /api/session response must include threshold_tokens"
|
||||
assert '"last_prompt_tokens"' in src, "GET /api/session response must include last_prompt_tokens"
|
||||
|
||||
|
||||
def test_session_round_trip_persists_context_fields(tmp_path, monkeypatch):
|
||||
"""Real round-trip: save a Session with the fields set, reload, fields still there.
|
||||
|
||||
Patches SESSION_DIR on the live api.models module so we don't pollute
|
||||
sys.modules state and break test ordering for sibling tests that depend
|
||||
on a stable api.models import (e.g. test_session_sidecar_repair.py).
|
||||
"""
|
||||
from api import models
|
||||
|
||||
# Use tmp_path as the session dir for this test only
|
||||
sessions_dir = tmp_path / "sessions"
|
||||
sessions_dir.mkdir(parents=True, exist_ok=True)
|
||||
monkeypatch.setattr(models, "SESSION_DIR", sessions_dir)
|
||||
|
||||
s = models.Session(session_id="ctxtest1", title="Context test")
|
||||
s.context_length = 200000
|
||||
s.threshold_tokens = 180000
|
||||
s.last_prompt_tokens = 45123
|
||||
s.save()
|
||||
|
||||
# Reload from disk
|
||||
s2 = models.Session.load("ctxtest1")
|
||||
assert s2 is not None, "Session should reload"
|
||||
assert s2.context_length == 200000, f"context_length lost on reload: got {s2.context_length}"
|
||||
assert s2.threshold_tokens == 180000, f"threshold_tokens lost on reload: got {s2.threshold_tokens}"
|
||||
assert s2.last_prompt_tokens == 45123, f"last_prompt_tokens lost on reload: got {s2.last_prompt_tokens}"
|
||||
Reference in New Issue
Block a user