fix(streaming): persist context_length on session — completes #1318 fix

Pre-release Opus + nesquena review on v0.50.246 caught that PR #1341 added the data-structure scaffolding (Session.__init__ accepts the 3 fields, save() persists them, compact() exposes them, GET /api/session returns them) but did NOT add the writer that actually populates them. Without a writer, the user-visible bug (context-ring shows 0% after page reload) was NOT fixed by #1341 alone — the fields stayed None forever because nothing wrote to s.context_length anywhere. Adds the writer at api/streaming.py:2188 (post-merge per-turn save block, before s.save()) so the values from agent.context_compressor land on disk and survive page reloads. Also moves the SSE usage payload comment to clarify that the live SSE payload and the session-level persistence are now distinct paths (payload below, persistence above). Adds tests/test_pr1341_context_window_persistence.py — 6 structural + round-trip tests covering Session __init__/save/compact, the routes response, and the streaming.py writer placement. Closes #1318 (the actual user-visible bug, not just the scaffolding).
2026-05-26 03:30:36 +00:00 · 2026-04-30 16:38:55 +00:00
parent f328f3b843
commit a5c10d594d
3 changed files with 157 additions and 2 deletions
@@ -12,7 +12,7 @@
 - **Activity panel no longer auto-collapses when new tool/thinking events arrive** — Both `ensureActivityGroup()` (which re-creates the group with `tool-call-group-collapsed` on every destroy/recreate) and `finalizeThinkingCard()` (which force-adds the collapsed class on every tool boundary) ignored the user's manual expand. Tracks the user's last explicit toggle on the live activity group in a per-turn singleton (`_liveActivityUserExpanded`), restored on re-create and respected by the finalize path. Cleared between turns by `clearLiveToolCards()`. (`static/ui.js`, `tests/test_issue1298_cancel_and_activity.py`) — fixes #1298 (issue 1)
 - **Stale Mermaid render errors no longer leak into every chat** — Mermaid's render-failure path leaves a temporary `<div id="d<id>">` body-level node containing a "Syntax error in text" SVG. The previous code never removed it, so once any Mermaid block failed (or got mis-detected as Mermaid), every subsequent tab kept the syntax-error SVG visible regardless of content. Also tightens Mermaid detection so line-numbered tool output (`123|line`) and code blocks that don't start with a recognized Mermaid keyword are no longer mis-parsed as Mermaid; failed blocks are marked so a later render pass can't retry them. (`static/ui.js`, `tests/test_issue347.py`) @dso2ng — PR #1337
 - **Static asset cache busts automatically on every release** — `<script src="static/ui.js">` and friends were cached indefinitely by browsers and the service worker, so a new release with bug fixes could be invisible to a user until they hard-refreshed. Now `index.html` and `sw.js` registration both inject the current `WEBUI_VERSION` git tag as a `?v=` query string, URL-encoded server-side so unusual git tag formats can't break the JS. The service worker also no longer intercepts requests for itself, ensuring the browser always fetches the freshly-versioned `sw.js` directly from the network. (`api/routes.py`, `static/index.html`, `static/sw.js`, `tests/test_pwa_manifest_sw.py`) @dso2ng — PR #1337
- **Context window indicator persists across page reloads** — `Session.__init__` now accepts `context_length`, `threshold_tokens`, and `last_prompt_tokens`; `save()` persists them and `compact()` exposes them so the GET `/api/session` response includes them. The frontend context-ring indicator was previously losing its percentage on every session load because the Session model silently dropped these fields when reconstructing from disk. (`api/models.py`, `api/routes.py`) @fxd-jason — PR #1341 (focused split from the held PR #1318)
+- **Context window indicator persists across page reloads (#1318 — fully fixed)** — `Session.__init__` now accepts `context_length`, `threshold_tokens`, and `last_prompt_tokens`; `save()` persists them via the `METADATA_FIELDS` round-trip and `compact()` exposes them on the GET `/api/session` response. **Critically**, `api/streaming.py` now writes the values from `agent.context_compressor` onto the session inside the post-merge per-turn save block, so the values land on disk and survive a page reload. Without that writer, the model fields would have been pure scaffolding — present but never populated. The frontend context-ring indicator was previously losing its percentage on every session load because nothing was writing these fields to disk; that data flow is now end-to-end. (`api/models.py`, `api/routes.py`, `api/streaming.py`, `tests/test_pr1341_context_window_persistence.py`) @fxd-jason — PR #1341 (focused split from the held PR #1318) + writer added during pre-release review
 - **`fallback_providers` list config no longer crashes streaming** — `api/streaming.py:1701` previously read `_cfg.get('fallback_model')` and called `.get('model', '')` on the result. When users had `fallback_providers: [{...}, {...}]` in their config (the chained-fallback form documented in CHANGELOG since v0.50.151), the streaming path crashed with `AttributeError: 'list' object has no attribute 'get'`. Now consults both `fallback_model` (single dict, legacy) and `fallback_providers` (list, new), picks the first valid entry from the list, and defends both paths with `isinstance` checks. (`api/streaming.py`, `tests/test_pr1339_fallback_providers_list.py`) @jimdawdy-hub — PR #1339

 ### Changed
@@ -2185,6 +2185,17 @@ def _run_agent_streaming(session_id, msg_text, model, workspace, stream_id, atta
                        if isinstance(_rm, dict) and _rm.get('role') == 'assistant':
                            _rm['reasoning'] = _reasoning_text
                            break
+                # Persist context window data on the session so the context-ring
+                # indicator survives a page reload (#1318). Must run BEFORE
+                # s.save() for the same reason as the reasoning trace above.
+                # The fields are captured into the SSE usage payload below; this
+                # block writes them to the session itself so GET /api/session
+                # returns them on reload instead of falling back to 0.
+                _cc_for_save = getattr(agent, 'context_compressor', None)
+                if _cc_for_save:
+                    s.context_length = getattr(_cc_for_save, 'context_length', 0) or 0
+                    s.threshold_tokens = getattr(_cc_for_save, 'threshold_tokens', 0) or 0
+                    s.last_prompt_tokens = getattr(_cc_for_save, 'last_prompt_tokens', 0) or 0
                s.save()
            # Sync to state.db for /insights (opt-in setting)
            try:
@@ -2203,7 +2214,9 @@ def _run_agent_streaming(session_id, msg_text, model, workspace, stream_id, atta
            except Exception:
                logger.debug("Failed to sync session to insights")
            usage = {'input_tokens': input_tokens, 'output_tokens': output_tokens, 'estimated_cost': estimated_cost}
-            # Include context window data from the agent's compressor for the UI indicator
+            # Include context window data from the agent's compressor for the UI indicator.
+            # The session-level persistence happens above (before s.save()) so the values
+            # survive a page reload; this block only populates the live SSE usage payload.
            _cc = getattr(agent, 'context_compressor', None)
            if _cc:
                usage['context_length'] = getattr(_cc, 'context_length', 0) or 0
@@ -0,0 +1,142 @@
+"""Regression test for PR #1341 + Opus pre-release review of v0.50.246.
+
+PR #1341 added context_length/threshold_tokens/last_prompt_tokens fields to
+the Session model — but didn't add the writer that actually populates them
+during streaming. The pre-release review caught this: without the writer,
+the user-visible bug (context-ring shows 0% after page reload) would NOT
+have been fixed by #1341 alone.
+
+This test verifies that:
+1. After a streaming turn completes, the session's context_length /
+   threshold_tokens / last_prompt_tokens are written from the agent's
+   compressor BEFORE s.save() is called (so they land on disk).
+2. GET /api/session response includes the populated values.
+3. A reloaded session retains the populated values.
+
+Implementation reference: api/streaming.py around line 2188 (the per-turn
+post-merge save) writes from getattr(agent, 'context_compressor', None).
+"""
+import re
+from pathlib import Path
+
+ROOT = Path(__file__).resolve().parent.parent
+STREAMING = ROOT / "api" / "streaming.py"
+MODELS = ROOT / "api" / "models.py"
+ROUTES = ROOT / "api" / "routes.py"
+
+
+def test_streaming_persists_context_fields_on_session_before_save():
+    """The post-merge per-turn save block must write the three fields to the
+    session BEFORE calling s.save(), otherwise the values never reach disk."""
+    src = STREAMING.read_text(encoding="utf-8")
+
+    # Find the post-merge save block — anchored on the unique reasoning trace
+    # marker right above the persistence block.
+    block_start = src.find("if _reasoning_text and s.messages:")
+    assert block_start != -1, "Reasoning-trace marker not found in streaming.py"
+
+    # Save call follows shortly after
+    save_call = src.find("\n                s.save()", block_start)
+    assert save_call != -1, "s.save() not found after the post-merge marker"
+    assert save_call - block_start < 2000, (
+        "s.save() should be close to the post-merge marker — block expanded unexpectedly"
+    )
+
+    block = src[block_start:save_call]
+
+    # The three fields must all be assigned on s within this block
+    assert "s.context_length" in block, (
+        "s.context_length must be written before s.save() in the post-merge block"
+    )
+    assert "s.threshold_tokens" in block, (
+        "s.threshold_tokens must be written before s.save() in the post-merge block"
+    )
+    assert "s.last_prompt_tokens" in block, (
+        "s.last_prompt_tokens must be written before s.save() in the post-merge block"
+    )
+
+    # The values must come from the agent's context_compressor
+    assert "context_compressor" in block, (
+        "Values must be sourced from agent.context_compressor"
+    )
+
+
+def test_session_init_accepts_context_fields():
+    """Session.__init__ must accept the three fields as named kwargs."""
+    src = MODELS.read_text(encoding="utf-8")
+    # The init signature spans many lines — read the full def block
+    init_match = re.search(r"def __init__\(self,(.*?)\):", src, re.DOTALL)
+    assert init_match, "Session.__init__ signature not found"
+    sig = init_match.group(1)
+    assert "context_length" in sig, "Session.__init__ must accept context_length"
+    assert "threshold_tokens" in sig, "Session.__init__ must accept threshold_tokens"
+    assert "last_prompt_tokens" in sig, "Session.__init__ must accept last_prompt_tokens"
+
+
+def test_session_metadata_fields_includes_context_fields():
+    """Session.save() METADATA_FIELDS must include all three for round-trip persistence."""
+    src = MODELS.read_text(encoding="utf-8")
+    # Locate METADATA_FIELDS list
+    meta_match = re.search(
+        r"METADATA_FIELDS\s*=\s*\[(.*?)\]",
+        src,
+        re.DOTALL,
+    )
+    assert meta_match, "METADATA_FIELDS list not found in Session.save"
+    fields = meta_match.group(1)
+    assert "'context_length'" in fields, "METADATA_FIELDS must include 'context_length'"
+    assert "'threshold_tokens'" in fields, "METADATA_FIELDS must include 'threshold_tokens'"
+    assert "'last_prompt_tokens'" in fields, "METADATA_FIELDS must include 'last_prompt_tokens'"
+
+
+def test_session_compact_exposes_context_fields():
+    """Session.compact() must include the three fields in its output dict."""
+    src = MODELS.read_text(encoding="utf-8")
+    # Find compact() method body
+    compact_idx = src.find("def compact(")
+    assert compact_idx != -1, "Session.compact not found"
+    # Look ahead for the next def or 200 lines
+    end = src.find("\n    def ", compact_idx + 1)
+    body = src[compact_idx:end if end != -1 else compact_idx + 4000]
+
+    assert "'context_length':" in body, "compact() must include context_length"
+    assert "'threshold_tokens':" in body, "compact() must include threshold_tokens"
+    assert "'last_prompt_tokens':" in body, "compact() must include last_prompt_tokens"
+
+
+def test_routes_session_get_returns_context_fields():
+    """GET /api/session response must include the three fields."""
+    src = ROUTES.read_text(encoding="utf-8")
+    # The session-detail response builder uses getattr(s, ..., 0) or 0 pattern.
+    # Look for the three keys in the same response shape.
+    assert '"context_length"' in src, "GET /api/session response must include context_length"
+    assert '"threshold_tokens"' in src, "GET /api/session response must include threshold_tokens"
+    assert '"last_prompt_tokens"' in src, "GET /api/session response must include last_prompt_tokens"
+
+
+def test_session_round_trip_persists_context_fields(tmp_path, monkeypatch):
+    """Real round-trip: save a Session with the fields set, reload, fields still there.
+
+    Patches SESSION_DIR on the live api.models module so we don't pollute
+    sys.modules state and break test ordering for sibling tests that depend
+    on a stable api.models import (e.g. test_session_sidecar_repair.py).
+    """
+    from api import models
+
+    # Use tmp_path as the session dir for this test only
+    sessions_dir = tmp_path / "sessions"
+    sessions_dir.mkdir(parents=True, exist_ok=True)
+    monkeypatch.setattr(models, "SESSION_DIR", sessions_dir)
+
+    s = models.Session(session_id="ctxtest1", title="Context test")
+    s.context_length = 200000
+    s.threshold_tokens = 180000
+    s.last_prompt_tokens = 45123
+    s.save()
+
+    # Reload from disk
+    s2 = models.Session.load("ctxtest1")
+    assert s2 is not None, "Session should reload"
+    assert s2.context_length == 200000, f"context_length lost on reload: got {s2.context_length}"
+    assert s2.threshold_tokens == 180000, f"threshold_tokens lost on reload: got {s2.threshold_tokens}"
+    assert s2.last_prompt_tokens == 45123, f"last_prompt_tokens lost on reload: got {s2.last_prompt_tokens}"