fix(streaming): persist context_length on session — completes #1318 fix

Pre-release Opus + nesquena review on v0.50.246 caught that PR #1341
added the data-structure scaffolding (Session.__init__ accepts the 3
fields, save() persists them, compact() exposes them, GET /api/session
returns them) but did NOT add the writer that actually populates them.

Without a writer, the user-visible bug (context-ring shows 0% after
page reload) was NOT fixed by #1341 alone — the fields stayed None
forever because nothing wrote to s.context_length anywhere.

Adds the writer at api/streaming.py:2188 (post-merge per-turn save block,
before s.save()) so the values from agent.context_compressor land on
disk and survive page reloads.

Also moves the SSE usage payload comment to clarify that the live SSE
payload and the session-level persistence are now distinct paths
(payload below, persistence above).

Adds tests/test_pr1341_context_window_persistence.py — 6 structural +
round-trip tests covering Session __init__/save/compact, the routes
response, and the streaming.py writer placement.

Closes #1318 (the actual user-visible bug, not just the scaffolding).
This commit is contained in:
nesquena-hermes
2026-04-30 16:38:55 +00:00
parent f328f3b843
commit a5c10d594d
3 changed files with 157 additions and 2 deletions
+1 -1
View File
@@ -12,7 +12,7 @@
- **Activity panel no longer auto-collapses when new tool/thinking events arrive** — Both `ensureActivityGroup()` (which re-creates the group with `tool-call-group-collapsed` on every destroy/recreate) and `finalizeThinkingCard()` (which force-adds the collapsed class on every tool boundary) ignored the user's manual expand. Tracks the user's last explicit toggle on the live activity group in a per-turn singleton (`_liveActivityUserExpanded`), restored on re-create and respected by the finalize path. Cleared between turns by `clearLiveToolCards()`. (`static/ui.js`, `tests/test_issue1298_cancel_and_activity.py`) — fixes #1298 (issue 1)
- **Stale Mermaid render errors no longer leak into every chat** — Mermaid's render-failure path leaves a temporary `<div id="d<id>">` body-level node containing a "Syntax error in text" SVG. The previous code never removed it, so once any Mermaid block failed (or got mis-detected as Mermaid), every subsequent tab kept the syntax-error SVG visible regardless of content. Also tightens Mermaid detection so line-numbered tool output (`123|line`) and code blocks that don't start with a recognized Mermaid keyword are no longer mis-parsed as Mermaid; failed blocks are marked so a later render pass can't retry them. (`static/ui.js`, `tests/test_issue347.py`) @dso2ng — PR #1337
- **Static asset cache busts automatically on every release**`<script src="static/ui.js">` and friends were cached indefinitely by browsers and the service worker, so a new release with bug fixes could be invisible to a user until they hard-refreshed. Now `index.html` and `sw.js` registration both inject the current `WEBUI_VERSION` git tag as a `?v=` query string, URL-encoded server-side so unusual git tag formats can't break the JS. The service worker also no longer intercepts requests for itself, ensuring the browser always fetches the freshly-versioned `sw.js` directly from the network. (`api/routes.py`, `static/index.html`, `static/sw.js`, `tests/test_pwa_manifest_sw.py`) @dso2ng — PR #1337
- **Context window indicator persists across page reloads**`Session.__init__` now accepts `context_length`, `threshold_tokens`, and `last_prompt_tokens`; `save()` persists them and `compact()` exposes them so the GET `/api/session` response includes them. The frontend context-ring indicator was previously losing its percentage on every session load because the Session model silently dropped these fields when reconstructing from disk. (`api/models.py`, `api/routes.py`) @fxd-jason — PR #1341 (focused split from the held PR #1318)
- **Context window indicator persists across page reloads (#1318 — fully fixed)**`Session.__init__` now accepts `context_length`, `threshold_tokens`, and `last_prompt_tokens`; `save()` persists them via the `METADATA_FIELDS` round-trip and `compact()` exposes them on the GET `/api/session` response. **Critically**, `api/streaming.py` now writes the values from `agent.context_compressor` onto the session inside the post-merge per-turn save block, so the values land on disk and survive a page reload. Without that writer, the model fields would have been pure scaffolding — present but never populated. The frontend context-ring indicator was previously losing its percentage on every session load because nothing was writing these fields to disk; that data flow is now end-to-end. (`api/models.py`, `api/routes.py`, `api/streaming.py`, `tests/test_pr1341_context_window_persistence.py`) @fxd-jason — PR #1341 (focused split from the held PR #1318) + writer added during pre-release review
- **`fallback_providers` list config no longer crashes streaming** — `api/streaming.py:1701` previously read `_cfg.get('fallback_model')` and called `.get('model', '')` on the result. When users had `fallback_providers: [{...}, {...}]` in their config (the chained-fallback form documented in CHANGELOG since v0.50.151), the streaming path crashed with `AttributeError: 'list' object has no attribute 'get'`. Now consults both `fallback_model` (single dict, legacy) and `fallback_providers` (list, new), picks the first valid entry from the list, and defends both paths with `isinstance` checks. (`api/streaming.py`, `tests/test_pr1339_fallback_providers_list.py`) @jimdawdy-hub — PR #1339
### Changed
+14 -1
View File
@@ -2185,6 +2185,17 @@ def _run_agent_streaming(session_id, msg_text, model, workspace, stream_id, atta
if isinstance(_rm, dict) and _rm.get('role') == 'assistant':
_rm['reasoning'] = _reasoning_text
break
# Persist context window data on the session so the context-ring
# indicator survives a page reload (#1318). Must run BEFORE
# s.save() for the same reason as the reasoning trace above.
# The fields are captured into the SSE usage payload below; this
# block writes them to the session itself so GET /api/session
# returns them on reload instead of falling back to 0.
_cc_for_save = getattr(agent, 'context_compressor', None)
if _cc_for_save:
s.context_length = getattr(_cc_for_save, 'context_length', 0) or 0
s.threshold_tokens = getattr(_cc_for_save, 'threshold_tokens', 0) or 0
s.last_prompt_tokens = getattr(_cc_for_save, 'last_prompt_tokens', 0) or 0
s.save()
# Sync to state.db for /insights (opt-in setting)
try:
@@ -2203,7 +2214,9 @@ def _run_agent_streaming(session_id, msg_text, model, workspace, stream_id, atta
except Exception:
logger.debug("Failed to sync session to insights")
usage = {'input_tokens': input_tokens, 'output_tokens': output_tokens, 'estimated_cost': estimated_cost}
# Include context window data from the agent's compressor for the UI indicator
# Include context window data from the agent's compressor for the UI indicator.
# The session-level persistence happens above (before s.save()) so the values
# survive a page reload; this block only populates the live SSE usage payload.
_cc = getattr(agent, 'context_compressor', None)
if _cc:
usage['context_length'] = getattr(_cc, 'context_length', 0) or 0
@@ -0,0 +1,142 @@
"""Regression test for PR #1341 + Opus pre-release review of v0.50.246.
PR #1341 added context_length/threshold_tokens/last_prompt_tokens fields to
the Session model but didn't add the writer that actually populates them
during streaming. The pre-release review caught this: without the writer,
the user-visible bug (context-ring shows 0% after page reload) would NOT
have been fixed by #1341 alone.
This test verifies that:
1. After a streaming turn completes, the session's context_length /
threshold_tokens / last_prompt_tokens are written from the agent's
compressor BEFORE s.save() is called (so they land on disk).
2. GET /api/session response includes the populated values.
3. A reloaded session retains the populated values.
Implementation reference: api/streaming.py around line 2188 (the per-turn
post-merge save) writes from getattr(agent, 'context_compressor', None).
"""
import re
from pathlib import Path
ROOT = Path(__file__).resolve().parent.parent
STREAMING = ROOT / "api" / "streaming.py"
MODELS = ROOT / "api" / "models.py"
ROUTES = ROOT / "api" / "routes.py"
def test_streaming_persists_context_fields_on_session_before_save():
"""The post-merge per-turn save block must write the three fields to the
session BEFORE calling s.save(), otherwise the values never reach disk."""
src = STREAMING.read_text(encoding="utf-8")
# Find the post-merge save block — anchored on the unique reasoning trace
# marker right above the persistence block.
block_start = src.find("if _reasoning_text and s.messages:")
assert block_start != -1, "Reasoning-trace marker not found in streaming.py"
# Save call follows shortly after
save_call = src.find("\n s.save()", block_start)
assert save_call != -1, "s.save() not found after the post-merge marker"
assert save_call - block_start < 2000, (
"s.save() should be close to the post-merge marker — block expanded unexpectedly"
)
block = src[block_start:save_call]
# The three fields must all be assigned on s within this block
assert "s.context_length" in block, (
"s.context_length must be written before s.save() in the post-merge block"
)
assert "s.threshold_tokens" in block, (
"s.threshold_tokens must be written before s.save() in the post-merge block"
)
assert "s.last_prompt_tokens" in block, (
"s.last_prompt_tokens must be written before s.save() in the post-merge block"
)
# The values must come from the agent's context_compressor
assert "context_compressor" in block, (
"Values must be sourced from agent.context_compressor"
)
def test_session_init_accepts_context_fields():
"""Session.__init__ must accept the three fields as named kwargs."""
src = MODELS.read_text(encoding="utf-8")
# The init signature spans many lines — read the full def block
init_match = re.search(r"def __init__\(self,(.*?)\):", src, re.DOTALL)
assert init_match, "Session.__init__ signature not found"
sig = init_match.group(1)
assert "context_length" in sig, "Session.__init__ must accept context_length"
assert "threshold_tokens" in sig, "Session.__init__ must accept threshold_tokens"
assert "last_prompt_tokens" in sig, "Session.__init__ must accept last_prompt_tokens"
def test_session_metadata_fields_includes_context_fields():
"""Session.save() METADATA_FIELDS must include all three for round-trip persistence."""
src = MODELS.read_text(encoding="utf-8")
# Locate METADATA_FIELDS list
meta_match = re.search(
r"METADATA_FIELDS\s*=\s*\[(.*?)\]",
src,
re.DOTALL,
)
assert meta_match, "METADATA_FIELDS list not found in Session.save"
fields = meta_match.group(1)
assert "'context_length'" in fields, "METADATA_FIELDS must include 'context_length'"
assert "'threshold_tokens'" in fields, "METADATA_FIELDS must include 'threshold_tokens'"
assert "'last_prompt_tokens'" in fields, "METADATA_FIELDS must include 'last_prompt_tokens'"
def test_session_compact_exposes_context_fields():
"""Session.compact() must include the three fields in its output dict."""
src = MODELS.read_text(encoding="utf-8")
# Find compact() method body
compact_idx = src.find("def compact(")
assert compact_idx != -1, "Session.compact not found"
# Look ahead for the next def or 200 lines
end = src.find("\n def ", compact_idx + 1)
body = src[compact_idx:end if end != -1 else compact_idx + 4000]
assert "'context_length':" in body, "compact() must include context_length"
assert "'threshold_tokens':" in body, "compact() must include threshold_tokens"
assert "'last_prompt_tokens':" in body, "compact() must include last_prompt_tokens"
def test_routes_session_get_returns_context_fields():
"""GET /api/session response must include the three fields."""
src = ROUTES.read_text(encoding="utf-8")
# The session-detail response builder uses getattr(s, ..., 0) or 0 pattern.
# Look for the three keys in the same response shape.
assert '"context_length"' in src, "GET /api/session response must include context_length"
assert '"threshold_tokens"' in src, "GET /api/session response must include threshold_tokens"
assert '"last_prompt_tokens"' in src, "GET /api/session response must include last_prompt_tokens"
def test_session_round_trip_persists_context_fields(tmp_path, monkeypatch):
"""Real round-trip: save a Session with the fields set, reload, fields still there.
Patches SESSION_DIR on the live api.models module so we don't pollute
sys.modules state and break test ordering for sibling tests that depend
on a stable api.models import (e.g. test_session_sidecar_repair.py).
"""
from api import models
# Use tmp_path as the session dir for this test only
sessions_dir = tmp_path / "sessions"
sessions_dir.mkdir(parents=True, exist_ok=True)
monkeypatch.setattr(models, "SESSION_DIR", sessions_dir)
s = models.Session(session_id="ctxtest1", title="Context test")
s.context_length = 200000
s.threshold_tokens = 180000
s.last_prompt_tokens = 45123
s.save()
# Reload from disk
s2 = models.Session.load("ctxtest1")
assert s2 is not None, "Session should reload"
assert s2.context_length == 200000, f"context_length lost on reload: got {s2.context_length}"
assert s2.threshold_tokens == 180000, f"threshold_tokens lost on reload: got {s2.threshold_tokens}"
assert s2.last_prompt_tokens == 45123, f"last_prompt_tokens lost on reload: got {s2.last_prompt_tokens}"