diff --git a/CHANGELOG.md b/CHANGELOG.md index 083760a5..916b0d29 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -1,5 +1,41 @@ # Hermes Web UI -- Changelog +## [v0.51.6] — 2026-05-05 — 5-PR full-sweep batch + +### Added + +- **PR #1719** by @Michaelyklam — Show active elapsed time in compact activity (closes #1716). Adds an in-progress elapsed counter while the agent is still working, complementing the already-shipped post-completion duration. Backend `/api/chat/start` now returns `pending_started_at` timestamp; UI uses that as the durable source of truth (instead of a browser-local timer that resets on rerender/reconnect). The compact Activity-row timer settles back to the existing post-completion duration display when the turn finishes. Cleanup timer paths attached to `setBusy(false)`, `clearLiveToolCards()`, `removeThinking()` so the counter stops on every terminal path (turn ends, session switch, error). + +### Fixed + +- **PR #1717** by @ai-ag2026 — Preserve imported session lineage visibility. Three independent fixes for the CLI/messaging session import path: (a) preserve `parent_session_id` when importing CLI/messaging sessions into WebUI sidecars (lineage was being dropped); (b) avoid shrinking sidebar `message_count` when CLI metadata has fewer messages than a repaired/aggregate sidecar (the sidebar was reverting to the shorter count); (c) prefer the longer WebUI sidecar transcript for messaging `/api/session` responses when it contains recovered visible history. 4 new regression tests cover lineage import, read-only imports, sidebar counts, and the recovered-sidecar transcript-selection path. +- **PR #1718** by @Michaelyklam — Preserve Activity count across chat focus changes (closes #1715). Root cause: `loadSession()` restored `S.toolCalls` from the per-session `INFLIGHT` cache, then replayed those tools through `appendLiveToolCard()` BEFORE restoring `S.activeStreamId`. 
`appendLiveToolCard()` intentionally no-ops without `S.activeStreamId`, so the replayed tools were dropped from the compact Activity group after focus changed. Fix: restore `S.activeStreamId` BEFORE the tool replay loop. Source-level regression assertion pins the new ordering. +- **PR #1720** by @Michaelyklam — Fix backend tool snippet cap for "Show more" (closes #1714). Frontend already had logic to preview long tool snippets at ~800 chars and reveal the rest with "Show more", but the backend was truncating persisted tool snippets to 200 chars — so the frontend threshold could never be reached. Raises the persisted snippet cap from 200 → 4000 chars (conservative; medium tool outputs can use the existing affordance, huge outputs are still bounded so session JSON doesn't balloon). Per-issue maintainer-confirmed direction. +- **PR #1722** by @ai-ag2026 — Suppress stale preserved task lists. After context compaction or reload, the UI was re-rendering the most recent preserved compression task-list card from history even after the actual todo state had moved on (all items completed/cancelled). Stale tasks reappeared as if still pending. Fix: only treat `pending` and `in_progress` todos as "active" when deciding whether to keep the preserved task list visible. Regression test covers the stale-preserved-task-list suppression path. Handles the `latestTodos === null` fallback correctly (no fresh todo tool message found → keep showing the preserved card, original behavior). + +### Tests + +4527 → **4537 passing** (+10 regression tests across the 5 PRs). 0 regressions. Full suite ~149s. + +### Pre-release verification + +- Stage-303: 5 PRs merged with zero conflicts (each rebased clean against current master). Zero stage-applied edits. +- All JS files syntax-clean (`node -c static/{messages,sessions,ui}.js`). +- All Python files syntax-clean (py_compile on every changed file). 
+- Live browser walkthrough on port 8789: + - PR #1718 ordering fix: `S.activeStreamId` is set BEFORE `appendLiveToolCard()` replay (CORRECT-ORDER verified in source). + - PR #1719 `pending_started_at` flows through to messages/UI; elapsed timer code present. + - PR #1722 todo state filter present in source. + - PR #1717 sidebar module helpers present. + - Sidebar scroll holds at 200 (carry-over fix from v0.51.2 preserved). + - System health card from v0.51.5 still working in Insights (CPU 15%, RAM 48.3%, disk 33.9%). +- Opus advisor: SHIP, 6/6 verification clean, 0 MUST-FIX, 0 SHOULD-FIX. Two non-blocking observations: + - #1717 "longer sidecar wins" heuristic won't honor explicit CLI-side message deletions (low likelihood for messaging sessions; documented). + - #1719 elapsed timer is client-clock-relative; gross browser clock drift will distort live counter (cosmetic; follow-up could send server-clock anchor). + +Closes #1714, #1715, #1716. + + ## [v0.51.5] — 2026-05-05 — 4-PR full-sweep batch ### Added diff --git a/ROADMAP.md b/ROADMAP.md index a3b09304..82f00296 100644 --- a/ROADMAP.md +++ b/ROADMAP.md @@ -2,7 +2,7 @@ > Web companion to the Hermes Agent CLI. Same workflows, browser-native. 
> -> Last updated: v0.51.5 (May 5, 2026) — 4527 tests collected +> Last updated: v0.51.6 (May 5, 2026) — 4537 tests collected > Test source: `pytest tests/ --collect-only -q` > Per-version detail: see [CHANGELOG.md](./CHANGELOG.md) diff --git a/TESTING.md b/TESTING.md index ad3affc9..c13172d6 100644 --- a/TESTING.md +++ b/TESTING.md @@ -1835,8 +1835,8 @@ Bridged CLI sessions: --- -*Last updated: v0.51.5, May 5, 2026* -*Total automated tests collected: 4527* +*Last updated: v0.51.6, May 5, 2026* +*Total automated tests collected: 4537* *Regression gate: tests/test_regressions.py* *Run: pytest tests/ -v --timeout=60* *Source: /* diff --git a/api/models.py b/api/models.py index 85e8bd82..7d2cccdc 100644 --- a/api/models.py +++ b/api/models.py @@ -1229,9 +1229,13 @@ def import_cli_session( profile=None, created_at=None, updated_at=None, + parent_session_id=None, ): - """Create a new WebUI session populated with CLI messages. - Returns the Session object. + """Create a new WebUI session populated with CLI/agent messages. + + Preserve parent_session_id from state.db so imported continuation segments + keep their lineage in the WebUI store and sidebar instead of reappearing as + detached orphan chats. 
""" s = Session( session_id=session_id, @@ -1242,6 +1246,7 @@ def import_cli_session( profile=profile, created_at=created_at, updated_at=updated_at, + parent_session_id=parent_session_id, ) s.save(touch_updated_at=False) return s diff --git a/api/routes.py b/api/routes.py index e2139a0c..f9a038ab 100644 --- a/api/routes.py +++ b/api/routes.py @@ -1270,9 +1270,15 @@ def _merge_cli_sidebar_metadata(ui_session: dict, cli_meta: dict) -> dict: if cli_meta.get("last_message_at") is not None: merged["last_message_at"] = cli_meta["last_message_at"] if cli_meta.get("message_count") is not None: - merged["message_count"] = cli_meta["message_count"] + merged["message_count"] = max( + _numeric_count(merged.get("message_count")), + _numeric_count(cli_meta.get("message_count")), + ) elif cli_meta.get("actual_message_count") is not None: - merged["message_count"] = cli_meta["actual_message_count"] + merged["message_count"] = max( + _numeric_count(merged.get("message_count")), + _numeric_count(cli_meta.get("actual_message_count")), + ) if cli_meta.get("title"): current_title = merged.get("title") @@ -2622,7 +2628,13 @@ def handle_get(handler, parsed) -> bool: _t3 = _time.monotonic() if load_messages: if is_messaging_session and cli_messages: - _all_msgs = cli_messages + sidecar_messages = getattr(s, "messages", []) or [] + # Recovery/aggregate sidecars can intentionally contain a + # longer visible conversation than the single state.db + # segment for this messaging session id. Prefer the longer + # sidecar so repaired WebUI history is not hidden behind the + # canonical per-segment transcript. 
+ _all_msgs = sidecar_messages if len(sidecar_messages) > len(cli_messages) else cli_messages else: _all_msgs = s.messages else: @@ -5986,7 +5998,11 @@ def _handle_chat_start(handler, body): daemon=True, ) thr.start() - response = {"stream_id": stream_id, "session_id": s.session_id} + response = { + "stream_id": stream_id, + "session_id": s.session_id, + "pending_started_at": s.pending_started_at, + } if normalized_model: response["effective_model"] = model if model_provider: @@ -7661,6 +7677,7 @@ def _handle_session_import_cli(handler, body): "raw_source": existing.raw_source or cli_meta.get("raw_source") or cli_meta.get("source_tag"), "session_source": existing.session_source or cli_meta.get("session_source"), "source_label": existing.source_label or cli_meta.get("source_label"), + "parent_session_id": existing.parent_session_id or cli_meta.get("parent_session_id"), } for attr, value in updates.items(): if getattr(existing, attr, None) != value: @@ -7702,6 +7719,7 @@ def _handle_session_import_cli(handler, body): cli_thread_id = None cli_session_key = None cli_platform = None + cli_parent_session_id = None cli_read_only = False for cs in get_cli_sessions(): if cs["session_id"] == sid: @@ -7720,6 +7738,7 @@ def _handle_session_import_cli(handler, body): cli_thread_id = cs.get("thread_id") cli_session_key = cs.get("session_key") cli_platform = cs.get("platform") + cli_parent_session_id = cs.get("parent_session_id") cli_read_only = bool(cs.get("read_only")) break @@ -7750,6 +7769,7 @@ def _handle_session_import_cli(handler, body): "raw_source": cli_raw_source or cli_source_tag, "session_source": cli_session_source, "source_label": cli_source_label, + "parent_session_id": cli_parent_session_id, "read_only": True, "messages": msgs, "tool_calls": [], @@ -7764,6 +7784,7 @@ def _handle_session_import_cli(handler, body): profile=profile, created_at=created_at, updated_at=updated_at, + parent_session_id=cli_parent_session_id, ) if cron_project_id: s.project_id = 
cron_project_id diff --git a/api/streaming.py b/api/streaming.py index 6accd184..e129186b 100644 --- a/api/streaming.py +++ b/api/streaming.py @@ -1379,16 +1379,22 @@ def _merge_display_messages_after_agent_result(previous_display, previous_contex return merged -def _tool_result_snippet(raw) -> str: - """Extract a compact result preview from a stored tool message payload.""" +_TOOL_RESULT_SNIPPET_MAX = 4000 + + +def _tool_result_snippet(raw, limit: int = _TOOL_RESULT_SNIPPET_MAX) -> str: + """Extract a bounded result preview from a stored tool message payload.""" + if limit <= 0: + return '' text = str(raw or '') try: - data = json.loads(text) + data = raw if isinstance(raw, dict) else json.loads(text) if isinstance(data, dict): - return str(data.get('output') or data.get('result') or data.get('error') or text)[:200] + preview = data.get('output') or data.get('result') or data.get('error') or text + text = str(preview) except Exception: pass - return text[:200] + return text[:limit] def _truncate_tool_args(args, limit: int = 6) -> dict: diff --git a/docs/pr-media/1715/activity-focus-reload.png b/docs/pr-media/1715/activity-focus-reload.png new file mode 100644 index 00000000..5ca8f736 Binary files /dev/null and b/docs/pr-media/1715/activity-focus-reload.png differ diff --git a/docs/pr-media/1716/active-elapsed-timer.png b/docs/pr-media/1716/active-elapsed-timer.png new file mode 100644 index 00000000..59f468e9 Binary files /dev/null and b/docs/pr-media/1716/active-elapsed-timer.png differ diff --git a/static/messages.js b/static/messages.js index c32d8fdc..b54212ce 100644 --- a/static/messages.js +++ b/static/messages.js @@ -243,6 +243,9 @@ async function send(){ } streamId=startData.stream_id; S.activeStreamId = streamId; + if(S.session&&typeof startData.pending_started_at==='number'){ + S.session.pending_started_at=startData.pending_started_at; + } if(S.session&&S.session.session_id===activeSid){ S.session.active_stream_id = streamId; } diff --git 
a/static/sessions.js b/static/sessions.js index 713098c1..f40290d5 100644 --- a/static/sessions.js +++ b/static/sessions.js @@ -430,6 +430,10 @@ async function loadSession(sid){ S.messages=INFLIGHT[sid].messages; S.toolCalls=(INFLIGHT[sid].toolCalls||[]); S.busy=true; + // appendLiveToolCard() is guarded by S.activeStreamId; restore it before + // replaying persisted live tools so the compact Activity count survives + // switching away from and back to an active chat (#1715). + S.activeStreamId=activeStreamId; syncTopbar();renderMessages();appendThinking();loadDir('.'); clearLiveToolCards(); if(typeof placeLiveToolCardsHost==='function') placeLiveToolCardsHost(); @@ -440,7 +444,6 @@ async function loadSession(sid){ startApprovalPolling(sid); if(typeof startClarifyPolling==='function') startClarifyPolling(sid); if(typeof _fetchYoloState==='function') _fetchYoloState(sid); - S.activeStreamId=activeStreamId; if(INFLIGHT[sid].reattach&&activeStreamId&&typeof attachLiveStream==='function'){ INFLIGHT[sid].reattach=false; if (_loadingSessionId !== sid) return; diff --git a/static/ui.js b/static/ui.js index 72adb997..e5f74059 100644 --- a/static/ui.js +++ b/static/ui.js @@ -1420,6 +1420,72 @@ function _formatTurnDuration(seconds){ if(h)return`${h}h ${m}m`; return`${m}m ${s}s`; } +function _formatActiveElapsedTimer(seconds){ + const n=Number(seconds); + if(!Number.isFinite(n)||n<0)return''; + const total=Math.max(0,Math.floor(n)); + const m=Math.floor(total/60); + const s=total%60; + return`${String(m).padStart(2,'0')}:${String(s).padStart(2,'0')}`; +} +let _activityElapsedTimer=null; +let _activityElapsedTimerGroup=null; +function _activityElapsedStartedAt(group){ + if(!group)return null; + const raw=(group.dataset&&group.dataset.turnStartedAt!==undefined&&group.dataset.turnStartedAt!=='') + ?group.dataset.turnStartedAt + :(S.session&&S.session.pending_started_at); + const started=Number(raw); + return Number.isFinite(started)&&started>0?started:null; +} +function 
_activityElapsedLabel(group){ + const started=_activityElapsedStartedAt(group); + if(!started)return''; + return _formatActiveElapsedTimer((Date.now()/1000)-started); +} +function _setActivityElapsedStartedAt(group){ + if(!group||group.getAttribute('data-live-tool-call-group')!=='1')return; + const started=_activityElapsedStartedAt(group); + if(started)group.setAttribute('data-turn-started-at',String(started)); +} +function _updateActiveActivityElapsedTimer(){ + const group=_activityElapsedTimerGroup; + if(!group||!group.isConnected||group.getAttribute('data-live-tool-call-group')!=='1'){ + _clearActivityElapsedTimer(); + return; + } + const durationEl=group.querySelector('.tool-call-group-duration'); + const label=_activityElapsedLabel(group); + if(label){ + group.setAttribute('data-active-turn-elapsed',label); + }else{ + group.removeAttribute('data-active-turn-elapsed'); + } + if(durationEl){ + durationEl.textContent=label?`Working ${label}`:''; + durationEl.style.display=label?'':'none'; + } +} +function _startActivityElapsedTimer(group){ + if(!group||group.getAttribute('data-live-tool-call-group')!=='1')return; + _setActivityElapsedStartedAt(group); + if(_activityElapsedTimerGroup&&_activityElapsedTimerGroup!==group)_clearActivityElapsedTimer(); + _activityElapsedTimerGroup=group; + _updateActiveActivityElapsedTimer(); + if(!_activityElapsedTimer)_activityElapsedTimer=setInterval(_updateActiveActivityElapsedTimer,1000); +} +function _clearActivityElapsedTimer(){ + if(_activityElapsedTimer){ + clearInterval(_activityElapsedTimer); + _activityElapsedTimer=null; + } + if(_activityElapsedTimerGroup&&_activityElapsedTimerGroup.isConnected){ + _activityElapsedTimerGroup.removeAttribute('data-active-turn-elapsed'); + const durationEl=_activityElapsedTimerGroup.querySelector('.tool-call-group-duration'); + if(durationEl){durationEl.textContent='';durationEl.style.display='none';} + } + _activityElapsedTimerGroup=null; +} const _MOBILE_CONFIG_BASE_LABEL='Workspace, 
model, reasoning, and context settings'; @@ -2438,6 +2504,7 @@ function setBusy(v){ S.busy=v; updateSendBtn(); if(!v){ + if(typeof _clearActivityElapsedTimer==='function') _clearActivityElapsedTimer(); setStatus(''); setComposerStatus(''); const sid=_queueDrainSid||(S.session&&S.session.session_id); @@ -3727,7 +3794,9 @@ function ensureActivityGroup(inner, opts){ if(anchor&&anchor.parentElement===inner) anchor.insertAdjacentElement('afterend', group); else inner.appendChild(group); } + if(live) _setActivityElapsedStartedAt(group); _syncToolCallGroupSummary(group); + if(live) _startActivityElapsedTimer(group); return group; } function _compressionStateForCurrentSession(){ @@ -3966,9 +4035,29 @@ function _preservedCompressionTaskListCardHtml(m, open=false){ function _preservedCompressionTaskListCardsHtml(messages){ return (messages||[]).map(m=>_preservedCompressionTaskListCardHtml(m, false)).join(''); } +function _latestTodoToolItems(messages){ + for(let i=(messages||[]).length-1;i>=0;i--){ + const m=messages[i]; + if(!m||m.role!=='tool') continue; + try{ + const payload=typeof m.content==='string'?JSON.parse(m.content):m.content; + if(payload&&Array.isArray(payload.todos)) return payload.todos; + }catch(_){ } + } + return null; +} +function _hasActiveTodoItems(items){ + return Array.isArray(items) && items.some(item=>{ + const status=String(item&&item.status||'').trim().toLowerCase(); + return status==='pending'||status==='in_progress'; + }); +} function _latestPreservedCompressionTaskListMessages(messages){ const latest=[...(messages||[])].reverse().find(m=>_isPreservedCompressionTaskListMessage(m)); - return latest?[latest]:[]; + if(!latest) return []; + const latestTodos=_latestTodoToolItems(messages); + if(Array.isArray(latestTodos) && !_hasActiveTodoItems(latestTodos)) return []; + return [latest]; } function _isSameLocalDay(dateA, dateB){ return dateA.getFullYear()===dateB.getFullYear() @@ -4807,9 +4896,17 @@ function _syncToolCallGroupSummary(group){ } 
if(list) list.textContent=parts.join(' · ')||'tools / thinking'; if(durationEl){ - const durationText=_formatTurnDuration(group.dataset.turnDuration); - durationEl.textContent=durationText?`Done in ${durationText}`:''; - durationEl.style.display=durationText?'':'none'; + if(group.getAttribute('data-live-tool-call-group')==='1'){ + const activeText=_activityElapsedLabel(group); + if(activeText) group.setAttribute('data-active-turn-elapsed',activeText); + else group.removeAttribute('data-active-turn-elapsed'); + durationEl.textContent=activeText?`Working ${activeText}`:''; + durationEl.style.display=activeText?'':'none'; + }else{ + const durationText=_formatTurnDuration(group.dataset.turnDuration); + durationEl.textContent=durationText?`Done in ${durationText}`:''; + durationEl.style.display=durationText?'':'none'; + } } if(badge) badge.textContent=String(total); } @@ -4896,6 +4993,7 @@ function appendLiveToolCard(tc){ } function clearLiveToolCards(){ + if(typeof _clearActivityElapsedTimer==='function') _clearActivityElapsedTimer(); const inner=_assistantTurnBlocks($('liveAssistantTurn')); if(inner) inner.querySelectorAll('.tool-call-group[data-live-tool-call-group],.tool-card-row[data-live-tid]').forEach(el=>el.remove()); // Reset the per-turn user expand intent so the next turn starts at the @@ -5699,7 +5797,10 @@ function removeThinking(){ if(blocks) blocks.querySelectorAll('.agent-activity-thinking').forEach(el=>el.remove()); if(blocks) blocks.querySelectorAll('.tool-call-group[data-agent-activity-group="1"]').forEach(group=>{ _syncToolCallGroupSummary(group); - if(!group.querySelector('.tool-card-row,.agent-activity-thinking')) group.remove(); + if(!group.querySelector('.tool-card-row,.agent-activity-thinking')){ + if(typeof _clearActivityElapsedTimer==='function') _clearActivityElapsedTimer(); + group.remove(); + } }); if(turn&&blocks&&!blocks.children.length) turn.remove(); } diff --git a/tests/test_auto_compression_card.py 
b/tests/test_auto_compression_card.py index 9f5a8869..98108cde 100644 --- a/tests/test_auto_compression_card.py +++ b/tests/test_auto_compression_card.py @@ -155,6 +155,20 @@ def test_preserved_task_list_attaches_once_per_render(): assert "(!preservedCompressionTaskCardsAttached&&(!referenceMessage||compressionState)&&preservedCompressionTaskMessages.length)" in src +def test_preserved_task_list_is_suppressed_when_latest_todo_state_has_no_active_items(): + src = _read("static/ui.js") + start = src.find("function _latestTodoToolItems") + assert start != -1, "latest todo state helper not found" + end = src.find("function _isSameLocalDay", start) + assert end != -1, "preserved-task-list helper block end not found" + helpers = src[start:end] + + assert "if(payload&&Array.isArray(payload.todos)) return payload.todos;" in helpers + assert "function _hasActiveTodoItems" in helpers + assert "status==='pending'||status==='in_progress'" in helpers + assert "if(Array.isArray(latestTodos) && !_hasActiveTodoItems(latestTodos)) return [];" in helpers + + def test_preserved_task_list_rendering_does_not_mutate_history(): src = _read("static/ui.js") start = src.find("function _isPreservedCompressionTaskListMessage") diff --git a/tests/test_import_cli_session_lineage.py b/tests/test_import_cli_session_lineage.py new file mode 100644 index 00000000..e9165edc --- /dev/null +++ b/tests/test_import_cli_session_lineage.py @@ -0,0 +1,34 @@ +import json + + +def test_import_cli_session_preserves_parent_session_id(): + from api.models import import_cli_session, SESSION_DIR, Session + + parent_id = 'parent_lineage_001' + child_id = 'child_lineage_001' + + # Ensure clean fixture state for direct model-level import. 
+ for sid in (parent_id, child_id): + try: + (SESSION_DIR / f'{sid}.json').unlink(missing_ok=True) + except Exception: + pass + + session = import_cli_session( + child_id, + 'Child Session', + [{'role': 'user', 'content': 'hello', 'timestamp': 1.0}], + model='test-model', + parent_session_id=parent_id, + created_at=1.0, + updated_at=2.0, + ) + + assert session.parent_session_id == parent_id + + payload = json.loads((SESSION_DIR / f'{child_id}.json').read_text(encoding='utf-8')) + assert payload['parent_session_id'] == parent_id + + loaded = Session.load(child_id) + assert loaded.parent_session_id == parent_id + assert loaded.compact()['parent_session_id'] == parent_id diff --git a/tests/test_regressions.py b/tests/test_regressions.py index e2a6ff12..444ec66e 100644 --- a/tests/test_regressions.py +++ b/tests/test_regressions.py @@ -415,7 +415,7 @@ def test_loadSession_inflight_restores_live_tool_cards(cleanup_test_sessions): # INFLIGHT branch must call appendLiveToolCard inflight_idx = src.find("if(INFLIGHT[sid]){") assert inflight_idx >= 0, "INFLIGHT branch not found in loadSession" - inflight_block = src[inflight_idx:inflight_idx+500] + inflight_block = src[inflight_idx:inflight_idx+900] assert "appendLiveToolCard" in inflight_block, "loadSession INFLIGHT branch must restore live tool cards via appendLiveToolCard" assert "clearLiveToolCards" in inflight_block, "loadSession INFLIGHT branch must clear old live cards before restoring" @@ -601,6 +601,29 @@ def test_loadSession_inflight_sets_busy_before_renderMessages(cleanup_test_sessi "loadSession must set S.busy=true before renderMessages() to avoid duplicate tool cards" +def test_loadSession_inflight_sets_active_stream_before_replaying_live_tool_cards(cleanup_test_sessions): + """#1715: returning to an active chat must replay persisted tool cards. + + appendLiveToolCard() intentionally no-ops unless S.activeStreamId is already + set for the viewed streaming session. 
If loadSession() restores S.toolCalls + and replays them before assigning S.activeStreamId, the compact Activity + counter drops the previously-seen tools after a focus change. + """ + src = (REPO_ROOT / "static/sessions.js").read_text() + inflight_idx = src.find("if(INFLIGHT[sid]){") + assert inflight_idx >= 0, "INFLIGHT branch not found in loadSession" + inflight_block = src[inflight_idx:inflight_idx+1000] + active_pos = inflight_block.find("S.activeStreamId=activeStreamId;") + replay_pos = inflight_block.find("appendLiveToolCard(tc);") + attach_pos = inflight_block.find("attachLiveStream(sid, activeStreamId") + assert active_pos >= 0, "loadSession INFLIGHT branch must restore S.activeStreamId" + assert replay_pos >= 0, "loadSession INFLIGHT branch must replay persisted live tool cards" + assert active_pos < replay_pos, \ + "S.activeStreamId must be restored before appendLiveToolCard() replays persisted tools" + assert attach_pos < 0 or active_pos < attach_pos, \ + "S.activeStreamId should also be restored before SSE reattach can deliver more tool events" + + def test_streaming_bridge_accepts_current_tool_progress_callback_signature(cleanup_test_sessions): """R17: api/streaming.py must accept the current Hermes agent callback contract. The agent now calls tool_progress_callback(event_type, name, preview, args, **kwargs). 
diff --git a/tests/test_session_import_cli_fallback_model.py b/tests/test_session_import_cli_fallback_model.py index c8399033..f47adeb9 100644 --- a/tests/test_session_import_cli_fallback_model.py +++ b/tests/test_session_import_cli_fallback_model.py @@ -90,6 +90,7 @@ def test_session_import_cli_refresh_matches_messages_despite_timestamp_type_diff self.raw_source = "weixin" self.session_source = "messaging" self.source_label = "WeChat" + self.parent_session_id = None def compact(self): return {"session_id": session_id, "title": "Imported"} @@ -141,6 +142,7 @@ def test_session_import_cli_refresh_rejects_prefix_if_non_timing_content_diverge self.session_source = "messaging" self.source_label = "Telegram" self.is_cli_session = True + self.parent_session_id = None def compact(self): return {"session_id": session_id, "title": "Imported"} @@ -169,3 +171,113 @@ def test_session_import_cli_refresh_rejects_prefix_if_non_timing_content_diverge assert response["session"]["messages"] == existing.messages assert existing.messages[0]["content"] == "old-prefix" assert save_calls == [] + + +def test_session_import_cli_preserves_parent_metadata_on_existing_import(monkeypatch): + """Refreshing an already-imported CLI session must persist lineage metadata.""" + import api.routes as routes + + session_id = "existing_parent_lineage_001" + parent_id = "root_parent_lineage_001" + + class FakeSession: + def __init__(self): + self.messages = [{"role": "user", "content": "hello", "timestamp": 1.0}] + self.source_tag = "telegram" + self.raw_source = "telegram" + self.session_source = "messaging" + self.source_label = "Telegram" + self.parent_session_id = None + self.is_cli_session = True + + def compact(self): + return {"session_id": session_id, "title": "Imported", "parent_session_id": self.parent_session_id} + + def save(self, touch_updated_at=False): + save_calls.append(touch_updated_at) + + save_calls = [] + existing = FakeSession() + + monkeypatch.setattr(routes.Session, "load", 
classmethod(lambda _cls, sid: existing if sid == session_id else None)) + monkeypatch.setattr(routes, "require", lambda body, *keys: None) + monkeypatch.setattr(routes, "j", lambda _handler, payload, status=200, extra_headers=None: payload) + monkeypatch.setattr(routes, "get_cli_session_messages", lambda sid: existing.messages if sid == session_id else []) + monkeypatch.setattr( + routes, + "get_cli_sessions", + lambda: [{ + "session_id": session_id, + "source_tag": "telegram", + "raw_source": "telegram", + "session_source": "messaging", + "source_label": "Telegram", + "parent_session_id": parent_id, + }], + ) + + response = routes._handle_session_import_cli(object(), {"session_id": session_id}) + + assert response["imported"] is False + assert existing.parent_session_id == parent_id + assert response["session"]["parent_session_id"] == parent_id + assert save_calls == [False] + + +def test_read_only_import_payload_includes_parent_session_id(monkeypatch): + """Read-only CLI/session imports should also expose lineage in the payload.""" + import api.routes as routes + + session_id = "readonly_parent_lineage_001" + parent_id = "readonly_root_lineage_001" + messages = [{"role": "user", "content": "hello", "timestamp": 1.0}] + + monkeypatch.setattr(routes.Session, "load", classmethod(lambda _cls, sid: None)) + monkeypatch.setattr(routes, "require", lambda body, *keys: None) + monkeypatch.setattr(routes, "bad", lambda _handler, msg, status=400: {"ok": False, "error": msg, "status": status}) + monkeypatch.setattr(routes, "j", lambda _handler, payload, status=200, extra_headers=None: payload) + monkeypatch.setattr(routes, "get_cli_session_messages", lambda sid: messages if sid == session_id else []) + monkeypatch.setattr( + routes, + "get_cli_sessions", + lambda: [{ + "session_id": session_id, + "title": "Read-only child", + "model": "test-model", + "created_at": 1.0, + "updated_at": 2.0, + "source_tag": "discord", + "raw_source": "discord", + "session_source": "messaging", 
+ "source_label": "Discord", + "parent_session_id": parent_id, + "read_only": True, + }], + ) + + response = routes._handle_session_import_cli(object(), {"session_id": session_id}) + + assert response["imported"] is False + assert response["session"]["parent_session_id"] == parent_id + assert response["session"]["messages"] == messages + + +def test_merge_cli_sidebar_metadata_keeps_larger_sidecar_message_count(): + """Sidebar metadata merge should not shrink repaired aggregate sidecar counts.""" + import api.routes as routes + + merged = routes._merge_cli_sidebar_metadata( + {"session_id": "sid", "message_count": 535, "title": "Recovered"}, + {"session_id": "sid", "message_count": 407, "source_tag": "discord"}, + ) + + assert merged["message_count"] == 535 + + +def test_messaging_session_loader_prefers_longer_sidecar_transcript(): + """Pin the /api/session invariant that repaired sidecars can be longer than state.db segments.""" + handler = _extract_handler("handle_get") + old = "if is_messaging_session and cli_messages:\n _all_msgs = cli_messages" + assert old not in handler + assert "sidecar_messages = getattr(s, \"messages\", []) or []" in handler + assert "len(sidecar_messages) > len(cli_messages)" in handler diff --git a/tests/test_tool_call_persistence.py b/tests/test_tool_call_persistence.py index 22547914..050b2443 100644 --- a/tests/test_tool_call_persistence.py +++ b/tests/test_tool_call_persistence.py @@ -1,11 +1,12 @@ """Tests for backend tool-call summary extraction used by WebUI session persistence.""" +import json import pathlib import sys REPO_ROOT = pathlib.Path(__file__).parent.parent.resolve() sys.path.insert(0, str(REPO_ROOT)) -from api.streaming import _extract_tool_calls_from_messages +from api.streaming import _extract_tool_calls_from_messages, _tool_result_snippet def test_extract_tool_calls_from_openai_message_linkage(): @@ -32,6 +33,64 @@ def test_extract_tool_calls_from_openai_message_linkage(): assert result[0]["snippet"] == "file.txt" 
+def test_tool_result_snippet_allows_frontend_show_more_threshold_but_stays_bounded(): + """Persisted snippets should be long enough for frontend Show more but capped.""" + medium_output = "m" * 1200 + huge_output = "h" * 5000 + + medium_snippet = _tool_result_snippet(json.dumps({"output": medium_output})) + huge_snippet = _tool_result_snippet(json.dumps({"output": huge_output})) + + assert len(medium_snippet) == 1200 + assert len(medium_snippet) > 800 + assert len(huge_snippet) == 4000 + + +def test_extract_tool_calls_persists_show_more_sized_snippets_with_bounded_cap(): + """Tool-call summaries should store >800-char snippets without growing unbounded.""" + long_output = "x" * 1200 + huge_output = "y" * 5000 + messages = [ + { + "role": "assistant", + "content": "", + "tool_calls": [ + { + "id": "call-long", + "function": { + "name": "read_file", + "arguments": '{"path":"/tmp/medium.log"}', + }, + }, + { + "id": "call-huge", + "function": { + "name": "terminal", + "arguments": '{"command":"yes"}', + }, + }, + ], + }, + { + "role": "tool", + "tool_call_id": "call-long", + "content": json.dumps({"output": long_output}), + }, + { + "role": "tool", + "tool_call_id": "call-huge", + "content": json.dumps({"output": huge_output}), + }, + ] + + result = _extract_tool_calls_from_messages(messages) + + assert len(result) == 2 + assert len(result[0]["snippet"]) == 1200 + assert len(result[0]["snippet"]) > 800 + assert len(result[1]["snippet"]) == 4000 + + def test_extract_tool_calls_falls_back_to_live_progress_when_ids_missing(): messages = [ {"role": "user", "content": "write spec"}, diff --git a/tests/test_turn_duration_display.py b/tests/test_turn_duration_display.py index b87deb74..2bd1aef5 100644 --- a/tests/test_turn_duration_display.py +++ b/tests/test_turn_duration_display.py @@ -8,6 +8,7 @@ from pathlib import Path REPO = Path(__file__).resolve().parent.parent STREAMING_PY = (REPO / "api" / "streaming.py").read_text(encoding="utf-8") MESSAGES_JS = (REPO / "static" 
/ "messages.js").read_text(encoding="utf-8") +ROUTES_PY = (REPO / "api" / "routes.py").read_text(encoding="utf-8") UI_JS = (REPO / "static" / "ui.js").read_text(encoding="utf-8") CSS = (REPO / "static" / "style.css").read_text(encoding="utf-8") @@ -61,3 +62,29 @@ def test_ui_formats_and_renders_turn_duration_in_footer_and_activity_summary(): assert ".msg-duration-inline" in CSS and ".tool-call-group-duration" in CSS, ( "Duration UI should have explicit CSS hooks for the footer chip and compact activity summary." ) + + +def test_active_compact_activity_elapsed_timer_uses_persisted_start_time(): + assert '"pending_started_at": s.pending_started_at' in ROUTES_PY, ( + "/api/chat/start should return the persisted pending_started_at timestamp " + "so the live timer starts from backend/session truth." + ) + assert "startData.pending_started_at" in MESSAGES_JS, ( + "send() should copy chat-start pending_started_at into S.session before " + "attaching the live stream." + ) + assert "function _formatActiveElapsedTimer" in UI_JS and "padStart(2,'0')" in UI_JS, ( + "ui.js should format the running timer in MM:SS form." + ) + assert "data-turn-started-at" in UI_JS and "data-active-turn-elapsed" in UI_JS, ( + "Live compact Activity groups need stable start-time and active-elapsed " + "hooks for browser QA and reconnect/rerender safety." + ) + assert "Working " in UI_JS, ( + "The in-progress Activity summary should distinguish the live counter " + "from the settled 'Done in …' duration." + ) + assert "setInterval" in UI_JS and "_clearActivityElapsedTimer" in UI_JS, ( + "The active elapsed label should tick while running and clear its interval " + "on terminal/error/session-switch cleanup paths." + )
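
Reviewer note on the `api/streaming.py` hunk earlier in this diff (PR #1720): the following is a minimal standalone sketch of the bounded-snippet behavior, written to make the interaction between the 4000-char cap and the frontend's roughly 800-char "Show more" threshold easier to check by hand. It restates the patched `_tool_result_snippet` logic under the hypothetical names `tool_result_snippet` and `SNIPPET_MAX` so it can run outside the repository; it is an illustration, not the project's module.

```python
import json

# Cap value taken from the diff (_TOOL_RESULT_SNIPPET_MAX = 4000).
# SNIPPET_MAX and tool_result_snippet are hypothetical names for this sketch.
SNIPPET_MAX = 4000


def tool_result_snippet(raw, limit: int = SNIPPET_MAX) -> str:
    """Return a bounded preview of a stored tool message payload."""
    if limit <= 0:
        return ''
    text = str(raw or '')
    try:
        # Accept an already-parsed dict, otherwise try to parse JSON text.
        data = raw if isinstance(raw, dict) else json.loads(text)
        if isinstance(data, dict):
            preview = data.get('output') or data.get('result') or data.get('error') or text
            text = str(preview)
    except Exception:
        # Non-JSON payloads fall through and are truncated as plain text.
        pass
    return text[:limit]


# Medium output passes through whole, so an ~800-char preview can expand.
medium = tool_result_snippet(json.dumps({"output": "m" * 1200}))
# Huge output is still bounded by the cap.
huge = tool_result_snippet(json.dumps({"output": "h" * 5000}))
# A dict payload is read directly, without a JSON round trip.
from_dict = tool_result_snippet({"error": "boom"})
# Plain text that is not JSON is returned as-is (up to the cap).
plain = tool_result_snippet("not json")
# A non-positive limit disables the preview entirely.
disabled = tool_result_snippet("anything", limit=0)
```

With the previous 200-char cap, `medium` would have been cut to 200 characters, below the frontend threshold, which matches the root cause described in the changelog entry.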