Merge pull request #2110 from nesquena/stage-341

Release V0.51.48 — stage-341 (3-PR batch: title-retry fix + run-adapter RFC + worktree archive copy + 3 Opus SHOULD-FIX)
2026-05-25 11:10:18 +00:00 · 2026-05-11 17:19:17 -07:00
parent 27fff66e4c d75b59135a
commit 306dd2bf09
15 changed files with 782 additions and 26 deletions
@@ -4,7 +4,28 @@

 ### Added

- **PR #2100** by @ai-ag2026 — Per-cron toast notification toggle. New `toast_notifications` boolean on cron job payloads (default-true for legacy preservation) wired through `_renderCronForm`, `_renderCronDetail`, `openCronCreate`, `openCronEdit`, `duplicateCurrentCron`, and `saveCronForm`. The polling loop in `startCronPolling()` gates `showToast(...)` on `c.toast_notifications !== false` so muted jobs still update the Tasks badge and new-run marker but skip the toast. Full i18n parity (8 locales: en/it/ja/ru/es/de/zh/pt/ko after PR #2067 lands) and 158-line regression suite in `tests/test_cron_toast_notifications.py`.
+- **PR #2105** by @Michaelyklam — Hermes run adapter contract RFC at `docs/rfcs/hermes-run-adapter-contract.md` (refs #1925). 315-line spec/gap matrix that defines the event/control compatibility contract WebUI needs before browser-originated chat turns can be routed to Hermes-owned runtime execution. Documents the ownership boundary (Hermes Agent owns run creation, lifecycle, event ordering, replay, terminal state, approvals, clarify, cancellation; WebUI owns browser auth, transcript rendering, tool cards, approval/clarify widgets, workspace UX), the minimum `start_run`/`observe_run`/`get_run`/`cancel_run`/`queue_or_continue`/`respond_approval`/`respond_clarify` IPC surface, and a gap matrix mapping current `STREAMS`/`CANCEL_FLAGS`/`AGENT_INSTANCES`/callback queues to Hermes-owned targets with explicit "no private callback queue" / "no runtime surrogate" non-goals. First success criterion is restart/reattach (start a non-trivial run, restart hermes-webui, browser reconnects, replays from last cursor, cancels with Hermes-emitted terminal state) — not "basic chat streamed once." Status: Proposed.
+
+### Fixed
+
+- **PR #2107** (self-built, closes #2083) — Title-generation budget-doubling retry loop on reasoning-only model responses. Reporter @darkopetrovic on LM Studio with Qwen3.6-35B-A3B (and the broader class: DeepSeek-R1, Kimi-K2, other Qwen3-thinking variants) saw GPU never going idle after each prompt — the chat turn finished cleanly but the auto-title generation request burned its 500-token budget on hidden `reasoning_content`, emitted `content=""` with `finish_reason=length`, got classified as `llm_length`, retried at 1024 tokens, returned the same shape, then iterated through `_title_prompts()`'s two prompts for ~3000 reasoning tokens per new chat. The agent-side `is_lmstudio` classifier in `run_agent.py:9468` misses `custom:` providers pointing at LM Studio, so the `reasoning_effort: "none"` adapter never fires for that route. WebUI-side belt-and-braces fix: (1) `_extract_title_response()` reorders the empty-response classification to check `reasoning_content` first regardless of `finish_reason` — reasoning presence is the diagnostic signal, not finish_reason; (2) `_title_retry_status()` drops `llm_empty_reasoning{,_aux}` from the retry set (length-without-reasoning still retries — legitimate budget-truncation case); (3) new `_title_should_skip_remaining_attempts()` short-circuits the prompt-iteration loop, both aux and agent routes break to `_fallback_title_from_exchange` for a local-summary title. Net: 4 calls → 1 call per chat. `tests/test_title_aux_routing.py` inverts the old reasoning-retry assertions and adds two new tests for the legitimate length-without-reasoning retry path. nesquena APPROVED with 200-line end-to-end trace + behavioral harness confirming the 4→1 call reduction.
+
+- **PR #2064** by @franksong2702 — Worktree session archive/delete confirm copy now reassures users that the underlying worktree directory remains on disk (refs #2057). Before the fix, the confirm dialogs said only "Delete this conversation?" / "Archive this conversation?" without clarifying that worktree-backed conversations keep their worktree files even when the conversation row is removed, so users were reasonably afraid of losing local work. Adds an explicit `worktree_retained` boolean on the `/api/session` payload that the frontend reads to surface "The worktree at /path will remain on disk." (single) and "N worktree-backed conversation(s) will keep their worktree directories on disk." (bulk) variants in both archive and delete dialogs. 81-line i18n update across all 9 locales (en/it/ja/ru/es/de/zh/pt/ko) with an English-bundle locale-leak fix caught during screenshot capture (several worktree strings had landed under Russian in error). Regression coverage in `tests/test_issue2057_worktree_lifecycle.py` + `tests/test_issue2057_worktree_ui_static.py`. UX-gate cleared with 5 viewport captures (4×1280px desktop covering the single and bulk archive/delete confirms, plus 1×390px mobile of the single-delete confirm to verify the dialog fits without overflow).
+
+### Stage-341 maintainer fixes
+
+- **`docs/rfcs/README.md`** — Added a single bullet to the conventions block clarifying that RFCs are design directions, not invitations to file implementation PRs against fragments. Implementation slices need maintainer confirmation in the tracking issue first. Applied alongside PR #2105 to head off the speculative-fragment pattern we just had to put on hold with PR #2071 (well-written 651-LOC collector with no callers). ~6 LOC.
+
+- **`static/i18n.js:it` block** — Opus SHOULD-FIX from stage-341 review: PR #2064 was branched before stage-340 landed the `it` locale (#2067), so the 9 new `session_*worktree*` keys were missing for Italian users. Mechanical add inside the `it:` block at the parallel position to en/ja. Falls back to English silently without this fix; with this fix, Italian users see the worktree-retention reassurance copy in their locale. Parallels the stage-340 `cron_toast_notifications_*` fix exactly. ~9 LOC.
+
+- **`api/streaming.py` short-circuit observability** — Opus SHOULD-FIX from stage-341 review: PR #2107's new `_title_should_skip_remaining_attempts` short-circuit `break` was silent in both the aux and agent title-generation paths. Added a `logger.debug` call before each `break` so production logs surface why the prompt-iteration loop exited early (nesquena flagged this as non-blocking; landed as polish in the same release). Also expanded the function's docstring to document the membership criterion explicitly so future additions (`llm_safety_blocked`, `llm_oauth_quota`, etc.) have a clear inclusion test. ~16 LOC.
+
+
+## [v0.51.47] — 2026-05-11 — Release W (4-PR contributor batch — per-cron toast toggle + Italian locale + stale-gateway agent-health fix + CI/console hygiene)
+
+### Added
+
+- **PR #2100** by @ai-ag2026 — Per-cron toast notification toggle. New `toast_notifications` boolean on cron job payloads (default-true for legacy preservation) wired through `_renderCronForm`, `_renderCronDetail`, `openCronCreate`, `openCronEdit`, `duplicateCurrentCron`, and `saveCronForm`. The polling loop in `startCronPolling()` gates `showToast(...)` on `c.toast_notifications !== false` so muted jobs still update the Tasks badge and new-run marker but skip the toast. Full i18n parity (9 locales: en/it/ja/ru/es/de/zh/pt/ko after PR #2067 landed) and 158-line regression suite in `tests/test_cron_toast_notifications.py`.

 - **PR #2067** by @samuelgudi — Italian (`it`) locale. ~280 UI strings translated covering boot, messages, MCP, commands, goals, settings, sessions, kanban, panels, and the offline state. Inserted alphabetically (`en → it → ja`) in `static/i18n.js`'s `LOCALES` map and mirrored in the `LOGIN_LOCALES` server-rendered table in `api/routes.py`. Updated `TestComposerVoiceButtonI18n.LOCALES` to include `"it"`; sibling `TestVoiceModePreferenceGate` also gets the tuple so its newly-adaptive `len(self.LOCALES)` count assert resolves.

@@ -18,7 +39,7 @@

 - **`tests/test_issue1488_composer_voice_buttons.py:TestVoiceModePreferenceGate`** — Defined `LOCALES = ("en", "it", "ja", "ru", "es", "de", "zh", "zh-Hant", "pt", "ko")` on the class. PR #2067 made `test_settings_pane_has_voice_mode_i18n_keys` count adaptive via `len(self.LOCALES)` but only defined `LOCALES` on the sibling `TestComposerVoiceButtonI18n`, so CI failed with `AttributeError`. Mirroring the tuple is the surgical fix; the alternative (back to a hard-coded `9`) would have rotted next time someone adds a locale. ~2 LOC.

-## [v0.51.46] — 2026-05-11 — Release V (5-PR contributor batch — CSP report-only + logs panel polish + plugin slash commands + turn-journal crash-safe writer + lifecycle events)
+- **`static/i18n.js:it` block** — Opus SHOULD-FIX from stage-340 review: added the four `cron_toast_notifications_*` keys (label, hint, enabled, disabled) inside the `it:` block. PR #2067 inserted the `it` locale between `en` and `ja`; PR #2100 added those keys to the other 8 locales but missed `it`. ~4 LOC, mechanical add immediately after `cron_profile_server_default_hint` to mirror the en/ja position.

 ## [v0.51.46] — 2026-05-11 — Release V (5-PR contributor batch — CSP report-only + logs panel polish + plugin slash commands + turn-journal crash-safe writer + lifecycle events)

@@ -146,6 +146,34 @@ def _active_skill_search_dirs(skills_dir: Path) -> list[Path]:
    return [p for p in dirs if p.exists()]


+def _worktree_retained_payload(session) -> dict:
+    """Return explicit no-cleanup metadata for worktree-backed session actions."""
+    worktree_path = getattr(session, "worktree_path", None) if session else None
+    if not worktree_path:
+        return {}
+    payload = {
+        "worktree_retained": True,
+        "worktree_path": worktree_path,
+    }
+    worktree_branch = getattr(session, "worktree_branch", None)
+    worktree_repo_root = getattr(session, "worktree_repo_root", None)
+    if worktree_branch:
+        payload["worktree_branch"] = worktree_branch
+    if worktree_repo_root:
+        payload["worktree_repo_root"] = worktree_repo_root
+    return payload
+
+
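+def _worktree_retained_payload_for_session_id(sid: str) -> dict:
+    """Look up a session by id and return its worktree-retained metadata.
+
+    Returns an empty dict when the session is unknown or its metadata cannot
+    be read, so callers can always splat the result into a response payload.
+    """
+    try:
+        return _worktree_retained_payload(get_session(sid, metadata_only=True))
+    except KeyError:
+        return {}
+    except Exception:
+        logger.debug("Failed to read worktree metadata for session %s", sid)
+        return {}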
+
+
 def _skills_list_from_dir(skills_dir: Path, category: str | None = None) -> dict:
    """List skills using an explicit local skills directory.

@@ -4219,6 +4247,7 @@ def handle_post(handler, parsed) -> bool:
        if cli_meta_for_delete.get("read_only"):
            return bad(handler, "Read-only imported sessions cannot be deleted from WebUI", 400)
        is_messaging_session = _is_messaging_session_id(sid)
+        worktree_retained = _worktree_retained_payload_for_session_id(sid)
        # Delete from WebUI session store
        with LOCK:
            SESSIONS.pop(sid, None)
@@ -4257,7 +4286,7 @@ def handle_post(handler, parsed) -> bool:
                delete_cli_session(sid)
            except Exception:
                logger.debug("Failed to delete CLI session %s", sid)
-        return j(handler, {"ok": True})
+        return j(handler, {"ok": True, **worktree_retained})

    if parsed.path == "/api/session/clear":
        try:
@@ -4894,7 +4923,7 @@ def handle_post(handler, parsed) -> bool:
        with _get_session_agent_lock(sid):
            s.archived = bool(body.get("archived", True))
            s.save(touch_updated_at=False)
-        return j(handler, {"ok": True, "session": s.compact()})
+        return j(handler, {"ok": True, "session": s.compact(), **_worktree_retained_payload(s)})

    # ── Session move to project (POST) ──
    if parsed.path == "/api/session/move":
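As a rough illustration of how the two `routes.py` hunks above fit together, the sketch below restates `_worktree_retained_payload` and splats its result into a response dict. `SimpleNamespace` stands in for the real session object, and the path and branch name are made-up values; the actual session class and the `j(...)` response helper are not reproduced here.

```python
from types import SimpleNamespace


def worktree_retained_payload(session) -> dict:
    """Mirror of the helper in the diff: empty dict unless a worktree is attached."""
    worktree_path = getattr(session, "worktree_path", None) if session else None
    if not worktree_path:
        return {}
    payload = {"worktree_retained": True, "worktree_path": worktree_path}
    # Optional fields are only included when the session carries them.
    for key in ("worktree_branch", "worktree_repo_root"):
        value = getattr(session, key, None)
        if value:
            payload[key] = value
    return payload


# Hypothetical sessions: one backed by a worktree, one plain.
worktree_session = SimpleNamespace(
    worktree_path="/tmp/example-worktree",
    worktree_branch="feature/example",
)
plain_session = SimpleNamespace(worktree_path=None)

# Splatting an empty dict is a no-op, so plain sessions keep the old response shape.
delete_response = {"ok": True, **worktree_retained_payload(worktree_session)}
plain_response = {"ok": True, **worktree_retained_payload(plain_session)}
```

Because the helper returns `{}` for sessions without a worktree, existing clients that only read `ok` should see no change in the response.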
@@ -877,9 +877,41 @@ def _title_retry_completion_budget(provider: str = '', model: str = '', base_url


 def _title_retry_status(status: str) -> bool:
+    # Whether to grant a second budget attempt within the same prompt+model
+    # combination.  ``llm_length`` indicates the model would have produced
+    # content with more headroom, so doubling the budget can help.
+    #
+    # ``llm_empty_reasoning`` historically also triggered a retry, but for
+    # reasoning models (Qwen3-thinking, DeepSeek-R1, Kimi-K2, etc.) that
+    # status means the model burned its entire budget on hidden reasoning
+    # tokens and emitted nothing visible.  Doubling the budget in that case
+    # just doubles the GPU/credit cost without changing the outcome — the
+    # next attempt produces the same shape.  We skip the retry for empty-
+    # reasoning statuses and let the title path fall through to the local
+    # fallback summary.  See issue #2083 for the LM Studio + Qwen3 repro.
    return status in {
        'llm_length',
        'llm_length_aux',
+    }
+
+
+def _title_should_skip_remaining_attempts(status: str) -> bool:
+    """Statuses where re-issuing the next prompt against the same model
+    produces the same failing shape (model burned its budget on hidden
+    reasoning, hit a hard provider gate, etc.).
+
+    Short-circuit the prompt-iteration loop so we don't issue a second
+    full-budget LLM call (and twice the GPU/credit burn) only to land in
+    the same fallback path. See issue #2083.
+
+    Add a status here only when retrying the next prompt is provably
+    wasted work (single-call signal already establishes that the next
+    call will return the same shape). Length-truncation WITHOUT
+    reasoning is NOT in the set — that's legitimately recoverable by
+    a larger budget on a different prompt and stays in
+    :func:`_title_retry_status`.
+    """
+    return status in {
        'llm_empty_reasoning',
        'llm_empty_reasoning_aux',
    }
@@ -922,10 +954,16 @@ def _extract_title_response(resp, *, aux: bool = False) -> tuple[str, str]:
            or _safe_text_value(_safe_obj_value(message, 'reasoning_content'))
            or _safe_text_value(_safe_obj_value(message, 'thinking'))
        )
-        if finish_reason == 'length':
-            return '', f'llm_length{suffix}'
+        # When the model emitted reasoning tokens but no visible content, it
+        # burned its budget on hidden thinking — retrying with a larger budget
+        # almost never recovers a useful title (see issue #2083: Qwen3-thinking
+        # via LM Studio loops indefinitely on auto-title generation).  Report
+        # this case distinctly so callers can short-circuit instead of double-
+        # billing the GPU/credit on a near-certain repeat.
        if reasoning:
            return '', f'llm_empty_reasoning{suffix}'
+        if finish_reason == 'length':
+            return '', f'llm_length{suffix}'
        return '', f'llm_empty{suffix}'
    except Exception:
        return '', f'llm_empty{suffix}'
@@ -978,6 +1016,15 @@ def generate_title_raw_via_aux(
            except Exception as e:
                last_status = 'llm_error_aux'
                logger.debug("Aux title generation attempt %s failed: %s", idx + 1, e)
+            # If the model just burned its budget on hidden reasoning, retrying
+            # the next prompt against the same model produces the same shape.
+            # Short-circuit to the local fallback path (#2083).
+            if _title_should_skip_remaining_attempts(last_status):
+                logger.debug(
+                    "Aux title generation short-circuiting after %s (reasoning-only response).",
+                    last_status,
+                )
+                break
        return None, last_status
    except Exception as e:
        logger.debug("Aux title generation failed: %s", e)
@@ -1077,6 +1124,15 @@ def generate_title_raw_via_agent(agent, user_text: str, assistant_text: str) ->
                    getattr(agent, 'model', None),
                    e,
                )
+            # If the model just burned its budget on hidden reasoning, retrying
+            # the next prompt against the same model produces the same shape.
+            # Short-circuit to the local fallback path (#2083).
+            if _title_should_skip_remaining_attempts(last_status):
+                logger.debug(
+                    "Agent title generation short-circuiting after %s (reasoning-only response).",
+                    last_status,
+                )
+                break
        return None, last_status
    except Exception as e:
        logger.debug("Agent title generation failed: %s", e)
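The interaction between the reordered classification, the narrowed retry set, and the new short-circuit can be sketched in isolation. This is a simplified model, not the code in `api/streaming.py`: `classify_empty`, `run_title_attempts`, and the two fake models are hypothetical names, and the 500/1024 budgets are taken from the changelog entry purely for illustration.

```python
from typing import Callable, List, Optional, Tuple

RETRY_STATUSES = {"llm_length", "llm_length_aux"}
SKIP_STATUSES = {"llm_empty_reasoning", "llm_empty_reasoning_aux"}


def classify_empty(reasoning: str, finish_reason: str, suffix: str = "") -> str:
    """Post-fix ordering: reasoning presence wins over finish_reason."""
    if reasoning:
        return f"llm_empty_reasoning{suffix}"
    if finish_reason == "length":
        return f"llm_length{suffix}"
    return f"llm_empty{suffix}"


def run_title_attempts(
    prompts: List[str],
    call_model: Callable[[str, int], Tuple[str, str]],
    budget: int = 500,
    retry_budget: int = 1024,
) -> Tuple[Optional[str], str, int]:
    """Return (title, last_status, number_of_model_calls)."""
    calls = 0
    last_status = "llm_empty"
    for prompt in prompts:
        title, last_status = call_model(prompt, budget)
        calls += 1
        # One larger-budget retry, only for plain length truncation.
        if not title and last_status in RETRY_STATUSES:
            title, last_status = call_model(prompt, retry_budget)
            calls += 1
        if title:
            return title, last_status, calls
        # Reasoning-only responses: the next prompt would fail the same way.
        if last_status in SKIP_STATUSES:
            break
    return None, last_status, calls


def reasoning_only_model(prompt: str, budget: int) -> Tuple[str, str]:
    # Fake model that spends its whole budget on hidden reasoning.
    return "", classify_empty("hidden thinking", "length")


def truncated_model(prompt: str, budget: int) -> Tuple[str, str]:
    # Fake model that is truncated without emitting any reasoning.
    return "", classify_empty("", "length")


reasoning_result = run_title_attempts(["prompt A", "prompt B"], reasoning_only_model)
truncated_result = run_title_attempts(["prompt A", "prompt B"], truncated_model)
```

Under these assumptions the reasoning-only model is called once before the loop exits, while the truncated model still gets its retry on each prompt (four calls), which matches the "4 calls → 1 call" reduction described in the changelog for the reasoning-only case.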
@@ -18,6 +18,12 @@ cutting infrastructure.
  questions, Rollout plan. Skip what doesn't apply.
 - An RFC is a starting point for review. Comments and revisions land via PR
  edits, not separate discussion threads.
+- An RFC documents a design direction. It is **not** an invitation to file
+  implementation PRs against fragments of it. Before opening any PR that
+  implements an accepted RFC, confirm with a maintainer in the tracking
+  issue that the implementation slice is wanted and that no other
+  contributor is already building it. Speculative implementations of RFC
+  fragments without a confirmed integration site will be held.

 ## When to file an RFC

@@ -32,5 +38,8 @@ First-time contributor RFCs should be discussed in an issue before opening a PR.

 ## Current RFCs

+- [`hermes-run-adapter-contract.md`](hermes-run-adapter-contract.md) — Event/control
+  compatibility contract and gap matrix for moving WebUI chat runs to Hermes-owned
+  runtime execution.
 - [`turn-journal.md`](turn-journal.md) — Crash-safe WebUI turn journal for
  recovering interrupted chat submissions.
@@ -0,0 +1,315 @@
+# Hermes Run Adapter Compatibility Contract
+
+- **Status:** Proposed
+- **Author:** @Michaelyklam
+- **Created:** 2026-05-11
+- **Tracking issue:** [#1925](https://github.com/nesquena/hermes-webui/issues/1925)
+
+## Problem
+
+Hermes WebUI currently gives a rich workbench experience, but browser-originated
+chat turns are still executed inside the WebUI server process. The WebUI path
+creates process-local stream state, starts background agent threads, constructs or
+reuses `AIAgent`, and owns callback queues for token, tool, reasoning, approval,
+and clarify state.
+
+The target boundary from #1925 is:
+
+> WebUI should be thin in execution ownership, not thin in product scope.
+
+That means WebUI remains the full browser workbench for sessions, workspace
+files, chat rendering, tools, approvals, status, diagnostics, and controls. The
+change is that Hermes Agent must own run lifecycle, event ordering, replay,
+approvals, clarify, cancellation, and terminal state.
+
+This document defines the first reviewable contract for a Hermes-owned run
+adapter. It is intentionally a spec/gap matrix, not an implementation plan for a
+new WebUI runtime surrogate.
+
+## Goals
+
+- Keep the browser-facing WebUI workbench contract stable while execution moves
+  out of the WebUI process.
+- Define the minimum Hermes Runtime API / IPC v0 surface WebUI needs before it
+  can route new runs to Hermes-owned execution.
+- Map current WebUI-owned runtime primitives to Hermes-owned APIs, WebUI
+  presentation state, or explicit temporary compatibility shims.
+- Make restart/reattach the first meaningful success criterion, not merely
+  "basic chat streamed once."
+
+## Non-goals
+
+- Do not implement the adapter in this RFC.
+- Do not create a new run-manager sidecar or broker requirement.
+- Do not re-create `STREAMS`, cached `AIAgent` objects, approval queues, clarify
+  queues, or cancellation flags under new names inside WebUI.
+- Do not reduce WebUI product scope. The rich workbench UX remains in WebUI.
+- Do not require every event to be durably persisted on day one if the first
+  upstream runtime slice can still prove Hermes-owned execution and reconnect.
+
+## Ownership boundary
+
+### Hermes Agent owns
+
+- run creation and lifecycle
+- run ids and session-to-active-run mapping
+- ordered event stream and replay cursor
+- terminal run state, final result, and error metadata
+- model/provider/profile/toolset routing
+- agent execution and tool dispatch
+- command semantics and capability metadata
+- approval and clarify lifecycle
+- cancel, interrupt, queue, continue, steer, and goal control where supported
+- durable runtime/session state needed for reconnect
+
+### WebUI owns
+
+- browser authentication and presentation-specific session routing
+- chat layout, transcript rendering, tool cards, thinking/progress display
+- approval and clarify widgets
+- workspace/file-panel UX
+- settings/admin/diagnostics presentation
+- adapting Hermes runtime events into WebUI-compatible browser events
+- temporary compatibility shims explicitly listed in this RFC
+
+## WebUI event/control compatibility contract
+
+The browser-facing contract should remain stable enough that the current WebUI
+workbench can render either the legacy in-process runtime or the Hermes-owned run
+adapter during migration. These are presentation events over Hermes runtime
+truth, not a second source of truth.
+
+All events should include enough metadata for idempotent rendering and
+reconnect:
+
+```json
+{
+  "event_id": "run_123:42",
+  "seq": 42,
+  "run_id": "run_123",
+  "session_id": "20260511_...",
+  "type": "tool.updated",
+  "created_at": 1778540000.0,
+  "terminal": false,
+  "payload": {}
+}
+```
+
+`event_id` may be an SSE `id:` value or an equivalent cursor token. `seq` is a
+monotonic per-run cursor. Clients may send `Last-Event-ID` or `after_seq` on
+reconnect. The runtime should treat replay as at-least-once delivery; WebUI must
+deduplicate by `run_id` + `seq` / `event_id`.
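One way the client-side deduplication described above might look is sketched below. It assumes events for a given run arrive in increasing `seq` order apart from replayed duplicates, which is what `observe_run` is required to provide; `EventDeduper` is a hypothetical name, not an existing WebUI class.

```python
from typing import Dict, Optional


class EventDeduper:
    """Tracks the highest `seq` rendered per run and drops replayed events."""

    def __init__(self) -> None:
        self._last_seq: Dict[str, int] = {}

    def accept(self, event: dict) -> bool:
        """Return True if the event is new and should be rendered."""
        run_id = event["run_id"]
        seq = event["seq"]
        if seq <= self._last_seq.get(run_id, -1):
            return False  # already rendered; replay is at-least-once
        self._last_seq[run_id] = seq
        return True

    def cursor(self, run_id: str) -> Optional[int]:
        """Value to send as `after_seq` on the next reconnect, if any."""
        return self._last_seq.get(run_id)


# Simulated stream: seq 42 is delivered twice across a reconnect.
stream = [
    {"run_id": "run_123", "seq": 41, "type": "token.delta"},
    {"run_id": "run_123", "seq": 42, "type": "tool.updated"},
    {"run_id": "run_123", "seq": 42, "type": "tool.updated"},
    {"run_id": "run_123", "seq": 43, "type": "done"},
    {"run_id": "run_456", "seq": 1, "type": "run.started"},
]

deduper = EventDeduper()
rendered = [(e["run_id"], e["seq"]) for e in stream if deduper.accept(e)]
```

Tracking only the last `seq` per run keeps the presentation cache small; a set of seen `event_id` values would be needed instead if out-of-order delivery had to be tolerated.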
+
+### Event families
+
+| WebUI event family | Required payload | Runtime source of truth |
+|---|---|---|
+| `run.started` / `status` | lifecycle state, controls available, session id, workspace/profile/model/toolset summary | Hermes run state |
+| `token.delta` | assistant message id/segment id, delta text, optional content type | Hermes model output stream |
+| `reasoning.delta` / `reasoning.done` | reasoning text or structured reasoning block, visibility metadata | Hermes reasoning callback/event stream |
+| `progress` | concise status/progress text, optional phase/tool context | Hermes agent progress callbacks |
+| `tool.started` | tool call id, tool name, sanitized arguments, start time | Hermes tool dispatch lifecycle |
+| `tool.updated` | stdout/stderr/structured partial data, progress metadata | Hermes tool dispatch lifecycle |
+| `tool.done` | result, exit/status, duration, error flag | Hermes tool dispatch lifecycle |
+| `approval.requested` | approval id, command/action summary, risk metadata, available choices | Hermes approval queue/control plane |
+| `approval.resolved` | approval id, choice, resulting status | Hermes approval queue/control plane |
+| `clarify.requested` | clarify id, question, choices/input mode | Hermes clarify lifecycle |
+| `clarify.resolved` | clarify id, answer metadata/status | Hermes clarify lifecycle |
+| `title.updated` | title text, title source/confidence | Hermes session/title subsystem |
+| `usage.updated` / `usage.final` | tokens, cost, model/provider, duration where available | Hermes usage accounting |
+| `error` | stable error code, safe message, redacted diagnostic metadata, terminal flag | Hermes run terminal/error state |
+| `done` | final lifecycle state, usage, terminal result/error summary, last seq | Hermes run terminal state |
+
+### Reconnect metadata
+
+Every active or terminal run must expose:
+
+- `run_id`
+- `session_id`
+- current `status`: `queued`, `running`, `awaiting_approval`,
+  `awaiting_clarify`, `paused`, `cancelling`, `cancelled`, `failed`,
+  `completed`, or `expired`
+- last committed event cursor / `last_event_id`
+- terminal state and final result/error when finished
+- currently available controls
+- pending approval/clarify ids, if any
+- session-to-active-run mapping for the current WebUI session
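The status values listed above could be modelled as an enum on the WebUI side. In this sketch the choice of which statuses count as terminal (`cancelled`, `failed`, `completed`, `expired`) is an assumption, since the RFC does not enumerate them, and `should_reattach` is a hypothetical helper name.

```python
from enum import Enum


class RunStatus(str, Enum):
    QUEUED = "queued"
    RUNNING = "running"
    AWAITING_APPROVAL = "awaiting_approval"
    AWAITING_CLARIFY = "awaiting_clarify"
    PAUSED = "paused"
    CANCELLING = "cancelling"
    CANCELLED = "cancelled"
    FAILED = "failed"
    COMPLETED = "completed"
    EXPIRED = "expired"


# Assumption: these four are the terminal states.
TERMINAL_STATUSES = frozenset(
    {RunStatus.CANCELLED, RunStatus.FAILED, RunStatus.COMPLETED, RunStatus.EXPIRED}
)


def should_reattach(snapshot: dict) -> bool:
    """True when WebUI should re-open the event stream for this run."""
    # RunStatus(...) raises ValueError for statuses outside the contract.
    return RunStatus(snapshot["status"]) not in TERMINAL_STATUSES


# Hypothetical reconnect snapshots, shaped after the list above.
waiting = {"run_id": "run_123", "status": "awaiting_approval", "last_event_id": "run_123:42"}
finished = {"run_id": "run_124", "status": "completed", "last_event_id": "run_124:90"}
```

Parsing the status through the enum means an unknown value fails loudly instead of being silently treated as active.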
+
+### Controls
+
+| WebUI control | Required semantics | Runtime endpoint / IPC |
+|---|---|---|
+| cancel | Request graceful cancellation of the current run; terminal event must follow | `cancel_run` / `interrupt` |
+| queue / continue | Append follow-up work to a live, paused, or resumable run/session according to Hermes semantics | `queue_or_continue` |
+| approval | Resolve a pending approval request with `allow_once`, `allow_session`, `always`, or `deny` where supported | `respond_approval` |
+| clarify | Submit answer text or selected choice for a pending clarify request | `respond_clarify` |
+| goal | Set/status/pause/resume/clear goal where Hermes exposes goal capability for this surface | command/capability API |
+| observe | Attach to live events and replay from cursor | `observe_run` |
+| status | Poll lifecycle state when SSE/WebSocket is unavailable | `get_run` |
+
+WebUI may keep local UI state such as which disclosure rows are expanded, but it
+must not infer or privately mutate runtime state for these controls.
+
+## Hermes Runtime API / IPC v0 minimum
+
+The transport can be HTTP, stdio IPC, websocket, or another Hermes-owned local
+protocol. The key requirement is the semantic contract: Hermes owns the run id,
+lifecycle, event cursor, controls, pending human-interaction state, and terminal
+state.
+
+### `start_run`
+
+Creates a Hermes-owned run.
+
+Input fields:
+
+- `session_id` or instruction to create one
+- user message / queued input
+- workspace context and attachments metadata
+- profile/provider/model/toolset hints
+- source/surface metadata, e.g. `source=webui`
+- optional command intent, e.g. `/goal` if parsed by WebUI command UI
+- idempotency key for duplicate browser submissions
+
+Output fields:
+
+- `run_id`
+- `session_id`
+- initial `status`
+- `observe` cursor / first event id
+- supported controls for this run
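The idempotency-key input can be illustrated with a small in-memory stand-in. `FakeRuntime`, its field names, and the `run_N` id format are assumptions made for this sketch; the real Hermes runtime surface is still to be specified.

```python
import itertools
from typing import Dict


class FakeRuntime:
    """Hypothetical in-memory stand-in for a Hermes-owned runtime."""

    def __init__(self) -> None:
        self._ids = itertools.count(1)
        self._runs_by_key: Dict[str, dict] = {}

    def start_run(self, session_id: str, message: str, idempotency_key: str) -> dict:
        # A duplicate browser submission returns the run that already exists.
        existing = self._runs_by_key.get(idempotency_key)
        if existing is not None:
            return existing
        run = {
            "run_id": f"run_{next(self._ids)}",
            "session_id": session_id,
            "status": "queued",
            "cursor": 0,  # observe from the beginning
            "controls": ["cancel", "observe"],
        }
        self._runs_by_key[idempotency_key] = run
        return run


runtime = FakeRuntime()
first = runtime.start_run("sess_1", "hello", idempotency_key="submit-abc")
duplicate = runtime.start_run("sess_1", "hello", idempotency_key="submit-abc")
second = runtime.start_run("sess_1", "follow-up", idempotency_key="submit-def")
```

A double-clicked send button would then resolve to the same `run_id` rather than starting a second run.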
+
+### `observe_run`
+
+Streams ordered run events, with replay from a cursor.
+
+Required behavior:
+
+- support `after_seq` or `Last-Event-ID`
+- emit events in monotonically increasing per-run order
+- replay terminal `error` / `done` state for completed runs
+- make duplicate delivery safe for reconnecting clients
+- preserve enough history for short WebUI restarts and browser reloads
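A minimal in-memory model of the replay behavior above is sketched here. `InMemoryRunLog` is a hypothetical name, `seq` starting at 1 is an assumption, and the `run_id:seq` cursor shape is inferred from the example event earlier in this RFC rather than fixed by it.

```python
from typing import List, Optional


def seq_from_last_event_id(last_event_id: str) -> int:
    """Parse a cursor shaped like 'run_123:42' (format assumed, see lead-in)."""
    return int(last_event_id.rsplit(":", 1)[1])


class InMemoryRunLog:
    """Ordered per-run event log with replay from a cursor."""

    def __init__(self, run_id: str) -> None:
        self.run_id = run_id
        self._events: List[dict] = []

    def append(self, type_: str, payload: Optional[dict] = None, terminal: bool = False) -> dict:
        seq = len(self._events) + 1  # monotonically increasing per run
        event = {
            "event_id": f"{self.run_id}:{seq}",
            "seq": seq,
            "run_id": self.run_id,
            "type": type_,
            "terminal": terminal,
            "payload": payload or {},
        }
        self._events.append(event)
        return event

    def observe(self, after_seq: int = 0) -> List[dict]:
        """Return every event with seq greater than the cursor, in order."""
        return [e for e in self._events if e["seq"] > after_seq]


log = InMemoryRunLog("run_123")
log.append("run.started")
log.append("token.delta", {"text": "Hi"})
log.append("done", terminal=True)

# A client that last saw event 1 reconnects with Last-Event-ID.
replayed = log.observe(after_seq=seq_from_last_event_id("run_123:1"))
# A fresh client on a completed run still receives the terminal event.
full_replay = log.observe()
```

A production runtime would bound or persist this history; the retention window is one of the open questions listed at the end of the RFC.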
+
+### `get_run`
+
+Returns current lifecycle state without consuming the event stream.
+
+Required fields:
+
+- `run_id`, `session_id`, `status`
+- `created_at`, `updated_at`, optional `completed_at`
+- `last_seq` / `last_event_id`
+- active controls
+- pending approval/clarify summaries
+- terminal result/error summary
+- usage/model/provider/profile/toolset summary where available
+
+### `cancel_run` / interrupt
+
+Requests graceful run cancellation or interruption. Hermes owns the final state
+transition and emits a terminal event. WebUI should not directly toggle a local
+cancellation flag as the source of truth.
+
+### `queue_or_continue`
+
+Submits follow-up work for a live, paused, or resumable run/session. Semantics
+must match Hermes-native queue/continue behavior so WebUI does not create a
+parallel continuation model.
+
+### `respond_approval`
+
+Resolves a pending approval request by id.
+
+Required behavior:
+
+- validate the approval belongs to the run/session
+- accept only supported choices
+- emit `approval.resolved`
+- continue, pause, or fail the run according to Hermes approval semantics
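The validation steps listed above might be sketched as follows. The `pending` mapping (approval id to owning run id), the `ApprovalError` type, and the shape of the returned event are all assumptions for illustration; the choice names come from the controls table earlier in this RFC.

```python
from typing import Dict

SUPPORTED_CHOICES = {"allow_once", "allow_session", "always", "deny"}


class ApprovalError(ValueError):
    """Raised when an approval response is rejected."""


def respond_approval(pending: Dict[str, str], run_id: str, approval_id: str, choice: str) -> dict:
    """Validate and resolve a pending approval; returns an approval.resolved event."""
    # Unknown ids and ids owned by another run are rejected the same way.
    if pending.get(approval_id) != run_id:
        raise ApprovalError("approval does not belong to this run")
    if choice not in SUPPORTED_CHOICES:
        raise ApprovalError(f"unsupported choice: {choice}")
    del pending[approval_id]
    return {
        "type": "approval.resolved",
        "run_id": run_id,
        "payload": {"approval_id": approval_id, "choice": choice, "status": "resolved"},
    }


pending = {"ap_1": "run_123", "ap_2": "run_123"}
resolved = respond_approval(pending, "run_123", "ap_1", "allow_once")

# A response aimed at the wrong run is rejected and leaves state untouched.
try:
    respond_approval(pending, "run_999", "ap_2", "deny")
    wrong_run_rejected = False
except ApprovalError:
    wrong_run_rejected = True
```

Keeping the pending state on the runtime side, rather than in a WebUI callback queue, is what lets a restarted WebUI rediscover `ap_2` as still pending.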
+
+### `respond_clarify`
+
+Resolves a pending clarification request by id.
+
+Required behavior:
+
+- validate the clarify request belongs to the run/session
+- accept text or selected-choice payloads
+- emit `clarify.resolved`
+- continue or fail the run according to Hermes clarify semantics
+
+## Gap matrix
+
+| Current WebUI primitive | Current role | Hermes-owned target | Temporary shim allowed? | Notes / gap |
+|---|---|---|---|---|
+| `STREAMS` / `STREAMS_LOCK` | Process-local live stream registry and subscriber fan-out | Hermes run registry + `observe_run` replay/fan-out | Yes, adapter may keep per-browser SSE connections only | Shim must not be the run source of truth and must survive WebUI restart by re-observing Hermes. |
+| `CANCEL_FLAGS` | Local cancellation signal checked by WebUI-owned agent thread | `cancel_run` / interrupt control | No, except translating button clicks into runtime calls | Cancellation result must come back as Hermes status/events. |
+| `AGENT_INSTANCES` | Cached `AIAgent` objects inside WebUI process | Hermes Agent runtime owns agent construction/reuse | No | Keeping this in the adapter would recreate the runtime surrogate. |
+| Partial text buffers | Reconstruct live assistant deltas for browser reconnect/render | Hermes event log/cursor plus WebUI renderer cache | Short-lived presentation cache only | Source should be replayed token events or persisted transcript, not WebUI-only execution state. |
+| Reasoning buffers | Preserve streamed reasoning/thinking text | Hermes reasoning events + replay | Short-lived presentation cache only | Replay must rebuild the same thinking cards after refresh. |
+| Tool buffers / live tool calls | Render tool cards and updates | Hermes tool lifecycle events + replay | Short-lived presentation cache only | WebUI owns card rendering, not tool execution state. |
+| Approval callbacks and queues | Bridge WebUI buttons to a live Python callback | Hermes pending approval state + `respond_approval` | No private callback queue | Pending approval must be discoverable after WebUI restart. |
+| Clarify callbacks and queues | Bridge WebUI form to a live Python callback | Hermes pending clarify state + `respond_clarify` | No private callback queue | Pending clarify must be discoverable after WebUI restart. |
+| Command capability metadata | Decide which slash commands render/execute in WebUI | Hermes command registry/capability API with owner/surface metadata | WebUI may cache metadata | Unknown commands should not be reimplemented in WebUI by default. |
+| Session-to-active-run mapping | Stored implicitly in WebUI session JSON / active stream ids | Hermes session/run mapping API | WebUI may cache last seen run id | Reopen session must rediscover active/completed run from Hermes. |
+| Reconnect/replay behavior | Depends on WebUI process memory and session JSON | `observe_run(after_seq)` + `get_run` terminal state | Browser SSE adapter only | First milestone must prove WebUI restart does not orphan the run. |
+| Usage/title/status events | Produced by WebUI streaming callbacks | Hermes usage/title/status events and run state | WebUI formatting only | WebUI can display and persist presentation copies after events arrive. |
+| Goal / queue / continue hooks | Mixed WebUI command handling and streaming callbacks | Hermes command/control plane | Only UI affordance shim | Goal support should be driven by Hermes capabilities. |
+
+## Migration ladder
+
+1. **Inventory and contract**: keep this RFC current with the current WebUI-owned
+   runtime primitives and browser event/control contract.
+2. **Hermes Runtime API / IPC v0**: add or stabilize upstream Hermes primitives
+   for `start_run`, `observe_run`, `get_run`, `cancel_run`, and replayable event
+   cursors.
+3. **Read-only observation spike**: from WebUI, observe an existing Hermes-owned
+   run and adapt its events into WebUI-compatible event objects without starting
+   a WebUI-owned agent thread.
+4. **Feature-flagged new-run path**: route new WebUI runs to Hermes-owned
+   `start_run` behind a flag while preserving the legacy path as fallback.
+5. **Restart/reattach milestone**: prove a non-trivial WebUI-started run
+   survives a WebUI-only restart and browser reload with ordered replay.
+6. **Controls migration**: move cancel, queue/continue, approval, clarify, and
+   goal controls to Hermes-owned endpoints/capabilities.
+7. **Parity tests**: compare legacy and adapter event streams for synthetic
+   token, reasoning, tool, approval, clarify, error, and done scenarios.
+8. **Retire runtime surrogate state**: remove normal WebUI chat ownership of
+   `AIAgent`, cancellation flags, callback queues, and process-local run truth
+   once parity and fallback criteria are satisfied.
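Editor's sketch, not part of the commit: the parity comparison in step 7 could look roughly like the following. The event shape (plain dicts) and the set of volatile fields (`seq`, `ts`, `run_id`) are assumptions made for illustration; the RFC does not define them here.

```python
from typing import Any, Dict, Iterable, List, Optional

# Hypothetical: fields assumed to differ between the legacy and adapter paths
# without changing what the browser renders.
VOLATILE_FIELDS = {"seq", "ts", "run_id"}


def normalize(events: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Drop volatile fields so only render-relevant content is compared."""
    return [
        {key: value for key, value in event.items() if key not in VOLATILE_FIELDS}
        for event in events
    ]


def first_divergence(
    legacy: Iterable[Dict[str, Any]], adapter: Iterable[Dict[str, Any]]
) -> Optional[int]:
    """Return the index of the first differing event, or None if the streams match."""
    left, right = normalize(legacy), normalize(adapter)
    for index, (a, b) in enumerate(zip(left, right)):
        if a != b:
            return index
    if len(left) != len(right):
        # One stream is a strict prefix of the other.
        return min(len(left), len(right))
    return None
```

Reporting the first divergent index, rather than a bare pass/fail, makes it easier to see which scenario (token, tool, approval, and so on) drifted between the two paths.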
+
+## First success criterion
+
+The first implementation milestone is not "basic chat streams through a new
+endpoint." The first meaningful milestone is:
+
+1. Start a non-trivial chat run from WebUI through the Hermes-owned path.
+2. Restart only `hermes-webui` while the run is active.
+3. Reload or reopen the browser session.
+4. Rediscover the same `run_id` from Hermes using `session_id` or last known run
+   metadata.
+5. Replay events from the last cursor with no duplicate visible transcript
+   content.
+6. Render the same token/reasoning/tool/approval/clarify state the workbench
+   would have rendered without the restart.
+7. Cancel the run from WebUI and observe Hermes emit the terminal cancelled
+   state.
+
+If this works, WebUI is moving toward a protocol translator over Hermes-owned
+execution instead of becoming another runtime with different variable names.
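Editor's sketch, not part of the commit: steps 5 to 7 above depend on replaying from a cursor without rendering anything twice. A minimal illustration follows, assuming each event carries a monotonically increasing integer `seq` and a `type` field. Both field names, the `ReplayState` class, and the terminal type names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Iterable, List


@dataclass
class ReplayState:
    """Browser-facing view of one run; `last_seq` acts as the replay cursor."""

    last_seq: int = 0
    transcript: List[str] = field(default_factory=list)
    status: str = "running"


def apply_events(state: ReplayState, events: Iterable[Dict[str, Any]]) -> ReplayState:
    """Apply events in order, skipping any at or below the cursor."""
    for event in events:
        seq = event["seq"]
        if seq <= state.last_seq:
            continue  # already rendered before the restart
        if event["type"] == "token":
            state.transcript.append(event["text"])
        elif event["type"] in ("done", "cancelled", "error"):
            state.status = event["type"]
        state.last_seq = seq
    return state
```

Because the cursor only advances, a replay that overlaps with events already shown leaves the visible transcript unchanged, which is the property step 5 asks for.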
+
+## Open questions
+
+- Where should the normative Hermes Runtime API / IPC v0 spec live: in
+  `NousResearch/hermes-agent`, this WebUI RFC, or both with one designated
+  source of truth?
+- What retention window is enough for v0 event replay: active-run memory only,
+  SQLite-backed event log, or transcript-derived reconstruction plus terminal
+  state?
+- Should WebUI talk to Hermes over the existing API server, an embedded IPC
+  channel, or a profile-local runtime socket?
+- How should multiple clients observing the same run coordinate controls and
+  pending approval/clarify prompts?
+- Which slash commands need surface-specific capability metadata before WebUI
+  can safely delegate them to Hermes?
@@ -411,8 +411,10 @@ const LOCALES = {
    session_archive: 'Archive conversation',
    session_restore: 'Restore conversation',
    session_archive_desc: 'Hide this conversation until archived is shown',
+    session_archive_worktree_desc: 'Hide this conversation; keep its worktree on disk',
    session_restore_desc: 'Bring this conversation back into the main list',
    session_archived: 'Session archived',
+    session_archived_worktree: 'Session archived. Worktree remains on disk.',
    session_restored: 'Session restored',
    session_archive_failed: 'Archive failed: ',
    session_duplicate: 'Duplicate conversation',
@@ -423,6 +425,11 @@ const LOCALES = {
    session_stop_response_desc: 'Cancel the running response for this conversation',
    session_delete: 'Delete conversation',
    session_delete_desc: 'Permanently remove this conversation',
+    session_delete_confirm: 'Delete this conversation?',
+    session_delete_worktree_desc: 'Delete only the WebUI conversation; keep the worktree on disk',
+    session_delete_worktree_confirm: (path) => `Delete this conversation? The worktree at ${path} will remain on disk.`,
+    session_deleted: 'Conversation deleted',
+    session_deleted_worktree: 'Conversation deleted. Worktree remains on disk.',
    session_select_mode: 'Select',
    session_select_mode_desc: 'Select conversations to batch manage',
    session_select_all: 'Select all',
@@ -433,6 +440,8 @@ const LOCALES = {
    session_batch_move: 'Move to project',
    session_batch_delete_confirm: 'Delete {0} conversations?',
    session_batch_archive_confirm: 'Archive {0} conversations?',
+    session_batch_delete_worktree_confirm: 'Delete {0} conversations? {1} worktree-backed conversation(s) will leave their worktree directories on disk.',
+    session_batch_archive_worktree_confirm: 'Archive {0} conversations? {1} worktree-backed conversation(s) will keep their worktree directories on disk.',
    session_no_selection: 'No conversations selected',
    // settings panel
    settings_heading_title: 'Control Center',
@@ -1510,8 +1519,10 @@ const LOCALES = {
    session_archive: 'Archivia conversazione',
    session_restore: 'Ripristina conversazione',
    session_archive_desc: 'Nascondi questa conversazione fino a mostrare archiviate',
+    session_archive_worktree_desc: 'Nascondi questa conversazione; mantieni il suo worktree su disco',
    session_restore_desc: 'Riporta questa conversazione nella lista principale',
    session_archived: 'Sessione archiviata',
+    session_archived_worktree: 'Sessione archiviata. Il worktree rimane su disco.',
    session_restored: 'Sessione ripristinata',
    session_archive_failed: 'Archiviazione fallita: ',
    session_duplicate: 'Duplica conversazione',
@@ -1522,6 +1533,11 @@ const LOCALES = {
    session_stop_response_desc: 'Annulla la risposta in corso per questa conversazione',
    session_delete: 'Elimina conversazione',
    session_delete_desc: 'Rimuovi permanentemente questa conversazione',
+    session_delete_confirm: 'Eliminare questa conversazione?',
+    session_delete_worktree_desc: 'Elimina solo la conversazione di WebUI; mantieni il worktree su disco',
+    session_delete_worktree_confirm: (path) => `Eliminare questa conversazione? Il worktree in ${path} rimarrà su disco.`,
+    session_deleted: 'Conversazione eliminata',
+    session_deleted_worktree: 'Conversazione eliminata. Il worktree rimane su disco.',
    session_select_mode: 'Seleziona',
    session_select_mode_desc: 'Seleziona conversazioni per gestione in blocco',
    session_select_all: 'Seleziona tutto',
@@ -1532,6 +1548,8 @@ const LOCALES = {
    session_batch_move: 'Sposta nel progetto',
    session_batch_delete_confirm: 'Eliminare {0} conversazioni?',
    session_batch_archive_confirm: 'Archiviare {0} conversazioni?',
+    session_batch_delete_worktree_confirm: 'Eliminare {0} conversazioni? {1} conversazioni con worktree lasceranno le loro directory worktree su disco.',
+    session_batch_archive_worktree_confirm: 'Archiviare {0} conversazioni? {1} conversazioni con worktree manterranno le loro directory worktree su disco.',
    session_no_selection: 'Nessuna conversazione selezionata',
    // settings panel
    settings_heading_title: 'Pannello di Controllo',
@@ -2604,8 +2622,10 @@ const LOCALES = {
    session_archive: '会話をアーカイブ',
    session_restore: '会話を復元',
    session_archive_desc: 'アーカイブを表示するまでこの会話を非表示にする',
+    session_archive_worktree_desc: 'この会話を非表示にし、worktree はディスク上に残します',
    session_restore_desc: 'この会話をメイン一覧に戻す',
    session_archived: 'セッションをアーカイブしました',
+    session_archived_worktree: 'セッションをアーカイブしました。Worktree はディスク上に残ります。',
    session_restored: 'セッションを復元しました',
    session_archive_failed: 'アーカイブ失敗: ',
    session_duplicate: '会話を複製',
@@ -2616,6 +2636,11 @@ const LOCALES = {
    session_stop_response_desc: 'この会話の実行中の応答をキャンセルします',
    session_delete: '会話を削除',
    session_delete_desc: 'この会話を完全に削除',
+    session_delete_confirm: 'この会話を削除しますか?',
+    session_delete_worktree_desc: 'WebUI の会話だけを削除し、worktree はディスク上に残します',
+    session_delete_worktree_confirm: (path) => `この会話を削除しますか? ${path} の worktree はディスク上に残ります。`,
+    session_deleted: '会話を削除しました',
+    session_deleted_worktree: '会話を削除しました。Worktree はディスク上に残ります。',
    session_select_mode: '選択',
    session_select_mode_desc: '会話を選択して一括管理',
    session_select_all: 'すべて選択',
@@ -2626,6 +2651,8 @@ const LOCALES = {
    session_batch_move: 'プロジェクトへ移動',
    session_batch_delete_confirm: '{0} 件の会話を削除しますか?',
    session_batch_archive_confirm: '{0} 件の会話をアーカイブしますか?',
+    session_batch_delete_worktree_confirm: '{0} 件の会話を削除しますか? {1} 件の worktree 付き会話は、worktree ディレクトリをディスク上に残します。',
+    session_batch_archive_worktree_confirm: '{0} 件の会話をアーカイブしますか? {1} 件の worktree 付き会話は、worktree ディレクトリをディスク上に保持します。',
    session_no_selection: '会話が選択されていません',
    // settings panel
    settings_heading_title: 'コントロールセンター',
@@ -4149,10 +4176,17 @@ const LOCALES = {
    // Session management and settings keys (en fallback — pending translation)
    session_archive: 'Archive conversation',
    session_archive_desc: 'Hide this conversation until archived is shown',
+    session_archive_worktree_desc: 'Hide this conversation; keep its worktree on disk',
    session_archive_failed: 'Archive failed: ',
    session_archived: 'Session archived',
+    session_archived_worktree: 'Session archived. Worktree remains on disk.',
    session_delete: 'Delete conversation',
    session_delete_desc: 'Permanently remove this conversation',
+    session_delete_confirm: 'Delete this conversation?',
+    session_delete_worktree_desc: 'Delete only the WebUI conversation; keep the worktree on disk',
+    session_delete_worktree_confirm: (path) => `Delete this conversation? The worktree at ${path} will remain on disk.`,
+    session_deleted: 'Conversation deleted',
+    session_deleted_worktree: 'Conversation deleted. Worktree remains on disk.',
    session_duplicate: 'Duplicate conversation',
    session_duplicate_desc: 'Create a copy with the same workspace and model',
    session_duplicate_failed: 'Duplicate failed: ',
@@ -4180,6 +4214,8 @@ const LOCALES = {
    session_batch_move: 'Переместить в проект',
    session_batch_delete_confirm: 'Удалить {0} бесед(ы)?',
    session_batch_archive_confirm: 'Архивировать {0} бесед(ы)?',
+    session_batch_delete_worktree_confirm: 'Удалить {0} бесед(ы)? У {1} бесед с worktree каталоги worktree останутся на диске.',
+    session_batch_archive_worktree_confirm: 'Архивировать {0} бесед(ы)? У {1} бесед с worktree каталоги worktree останутся на диске.',
    session_no_selection: 'Ничего не выбрано',
    settings_dropdown_appearance: 'Appearance',
    settings_dropdown_conversation: 'Conversation',
@@ -5170,10 +5206,17 @@ const LOCALES = {
    // Session management and settings keys (en fallback — pending translation)
    session_archive: 'Archive conversation',
    session_archive_desc: 'Hide this conversation until archived is shown',
+    session_archive_worktree_desc: 'Ocultar esta conversación; conservar su worktree en disco',
    session_archive_failed: 'Archive failed: ',
    session_archived: 'Session archived',
+    session_archived_worktree: 'Sesión archivada. El worktree permanece en disco.',
    session_delete: 'Delete conversation',
    session_delete_desc: 'Permanently remove this conversation',
+    session_delete_confirm: '¿Eliminar esta conversación?',
+    session_delete_worktree_desc: 'Eliminar solo la conversación de WebUI; conservar el worktree en disco',
+    session_delete_worktree_confirm: (path) => `¿Eliminar esta conversación? El worktree en ${path} permanecerá en disco.`,
+    session_deleted: 'Conversación eliminada',
+    session_deleted_worktree: 'Conversación eliminada. El worktree permanece en disco.',
    session_duplicate: 'Duplicate conversation',
    session_duplicate_desc: 'Create a copy with the same workspace and model',
    session_duplicate_failed: 'Duplicate failed: ',
@@ -5201,6 +5244,8 @@ const LOCALES = {
    session_batch_move: 'Mover al proyecto',
    session_batch_delete_confirm: '¿Eliminar {0} conversaciones?',
    session_batch_archive_confirm: '¿Archivar {0} conversaciones?',
+    session_batch_delete_worktree_confirm: '¿Eliminar {0} conversaciones? {1} conversaciones con worktree dejarán sus directorios de worktree en disco.',
+    session_batch_archive_worktree_confirm: '¿Archivar {0} conversaciones? {1} conversaciones con worktree conservarán sus directorios de worktree en disco.',
    session_no_selection: 'Ninguna conversación seleccionada',
    settings_dropdown_appearance: 'Appearance',
    settings_dropdown_conversation: 'Conversation',
@@ -5931,10 +5976,17 @@ const LOCALES = {
    // Session management and settings keys (en fallback — pending translation)
    session_archive: 'Archive conversation',
    session_archive_desc: 'Hide this conversation until archived is shown',
+    session_archive_worktree_desc: 'Diese Konversation ausblenden; den Worktree auf der Festplatte behalten',
    session_archive_failed: 'Archive failed: ',
    session_archived: 'Session archived',
+    session_archived_worktree: 'Sitzung archiviert. Der Worktree bleibt auf der Festplatte.',
    session_delete: 'Delete conversation',
    session_delete_desc: 'Permanently remove this conversation',
+    session_delete_confirm: 'Diese Konversation löschen?',
+    session_delete_worktree_desc: 'Nur die WebUI-Konversation löschen; den Worktree auf der Festplatte behalten',
+    session_delete_worktree_confirm: (path) => `Diese Konversation löschen? Der Worktree unter ${path} bleibt auf der Festplatte.`,
+    session_deleted: 'Konversation gelöscht',
+    session_deleted_worktree: 'Konversation gelöscht. Der Worktree bleibt auf der Festplatte.',
    session_duplicate: 'Duplicate conversation',
    session_duplicate_desc: 'Create a copy with the same workspace and model',
    session_duplicate_failed: 'Duplicate failed: ',
@@ -5962,6 +6014,8 @@ const LOCALES = {
    session_batch_move: 'Zum Projekt verschieben',
    session_batch_delete_confirm: '{0} Konversationen löschen?',
    session_batch_archive_confirm: '{0} Konversationen archivieren?',
+    session_batch_delete_worktree_confirm: '{0} Konversationen löschen? {1} worktree-gestützte Konversation(en) behalten ihre Worktree-Verzeichnisse auf der Festplatte.',
+    session_batch_archive_worktree_confirm: '{0} Konversationen archivieren? {1} worktree-gestützte Konversation(en) behalten ihre Worktree-Verzeichnisse auf der Festplatte.',
    session_no_selection: 'Keine Konversationen ausgewählt',
    settings_dropdown_appearance: 'Appearance',
    settings_dropdown_conversation: 'Conversation',
@@ -7233,10 +7287,17 @@ const LOCALES = {
    // Session management and settings keys (en fallback — pending translation)
    session_archive: '归档会话',
    session_archive_desc: '隐藏此会话，直到显示归档',
+    session_archive_worktree_desc: '隐藏此会话；保留磁盘上的 worktree',
    session_archive_failed: '归档失败：',
    session_archived: '会话已归档',
+    session_archived_worktree: '会话已归档。Worktree 仍保留在磁盘上。',
    session_delete: '删除会话',
    session_delete_desc: '永久删除此会话',
+    session_delete_confirm: '删除此会话？',
+    session_delete_worktree_desc: '仅删除 WebUI 会话；保留磁盘上的 worktree',
+    session_delete_worktree_confirm: (path) => `删除此会话？位于 ${path} 的 worktree 将保留在磁盘上。`,
+    session_deleted: '会话已删除',
+    session_deleted_worktree: '会话已删除。Worktree 仍保留在磁盘上。',
    session_duplicate: '复制会话',
    session_duplicate_desc: '用相同工作区和模型创建副本',
    session_duplicate_failed: '复制失败：',
@@ -7264,6 +7325,8 @@ const LOCALES = {
    session_batch_move: '移动到项目',
    session_batch_delete_confirm: '删除 {0} 个会话？',
    session_batch_archive_confirm: '归档 {0} 个会话？',
+    session_batch_delete_worktree_confirm: '删除 {0} 个会话？其中 {1} 个 worktree 会话会把 worktree 目录保留在磁盘上。',
+    session_batch_archive_worktree_confirm: '归档 {0} 个会话？其中 {1} 个 worktree 会话会把 worktree 目录保留在磁盘上。',
    session_no_selection: '未选择任何会话',
    settings_dropdown_appearance: '外观',
    settings_dropdown_conversation: '对话',
@@ -7661,8 +7724,10 @@ const LOCALES = {
    session_archive: '封存對話',
    session_restore: '還原對話',
    session_archive_desc: '隱藏此對話，直到開啟顯示封存',
+    session_archive_worktree_desc: '隱藏此對話；保留磁碟上的 worktree',
    session_restore_desc: '將此對話移回主清單',
    session_archived: '對話已封存',
+    session_archived_worktree: '對話已封存。Worktree 仍保留在磁碟上。',
    session_restored: '對話已還原',
    session_archive_failed: '封存失敗：',
    session_duplicate: '複製對話',
@@ -7673,6 +7738,11 @@ const LOCALES = {
    session_stop_response_desc: 'Cancel the running response for this conversation',
    session_delete: '刪除對話',
    session_delete_desc: '永久移除這個對話',
+    session_delete_confirm: '刪除此對話？',
+    session_delete_worktree_desc: '只刪除 WebUI 對話；保留磁碟上的 worktree',
+    session_delete_worktree_confirm: (path) => `刪除此對話？位於 ${path} 的 worktree 將保留在磁碟上。`,
+    session_deleted: '對話已刪除',
+    session_deleted_worktree: '對話已刪除。Worktree 仍保留在磁碟上。',
    session_select_mode: '選取',
    session_select_mode_desc: '選取會話以批次管理',
    session_select_all: '全選',
@@ -7683,6 +7753,8 @@ const LOCALES = {
    session_batch_move: '移至專案',
    session_batch_delete_confirm: '刪除 {0} 個會話？',
    session_batch_archive_confirm: '封存 {0} 個會話？',
+    session_batch_delete_worktree_confirm: '刪除 {0} 個會話？其中 {1} 個 worktree 會話會把 worktree 目錄保留在磁碟上。',
+    session_batch_archive_worktree_confirm: '封存 {0} 個會話？其中 {1} 個 worktree 會話會把 worktree 目錄保留在磁碟上。',
    session_no_selection: '未選取任何會話',
    // settings panel
    settings_heading_title: '控制中心',
@@ -8854,8 +8926,10 @@ const LOCALES = {
    session_archive: 'Arquivar conversa',
    session_restore: 'Restaurar conversa',
    session_archive_desc: 'Esconder conversa até mostrar arquivados',
+    session_archive_worktree_desc: 'Esconder esta conversa; manter o worktree no disco',
    session_restore_desc: 'Trazer conversa de volta à lista principal',
    session_archived: 'Sessão arquivada',
+    session_archived_worktree: 'Sessão arquivada. O worktree permanece no disco.',
    session_restored: 'Sessão restaurada',
    session_archive_failed: 'Falha ao arquivar: ',
    session_duplicate: 'Duplicar conversa',
@@ -8866,6 +8940,13 @@ const LOCALES = {
    session_stop_response_desc: 'Cancel the running response for this conversation',
    session_delete: 'Excluir conversa',
    session_delete_desc: 'Remover permanentemente esta conversa',
+    session_delete_confirm: 'Excluir esta conversa?',
+    session_delete_worktree_desc: 'Excluir apenas a conversa no WebUI; manter o worktree no disco',
+    session_delete_worktree_confirm: (path) => `Excluir esta conversa? O worktree em ${path} permanecerá no disco.`,
+    session_deleted: 'Conversa excluída',
+    session_deleted_worktree: 'Conversa excluída. O worktree permanece no disco.',
+    session_batch_delete_worktree_confirm: 'Excluir {0} conversas? {1} conversa(s) com worktree manterão seus diretórios de worktree no disco.',
+    session_batch_archive_worktree_confirm: 'Arquivar {0} conversas? {1} conversa(s) com worktree manterão seus diretórios de worktree no disco.',
    // settings panel
    settings_heading_title: 'Control Center',
    settings_heading_subtitle: 'Preferências, ferramentas de conversa e controles do sistema.',
@@ -9840,8 +9921,10 @@ const LOCALES = {
    session_archive: 'Archive conversation',
    session_restore: 'Restore conversation',
    session_archive_desc: 'Hide this conversation until archived is shown',
+    session_archive_worktree_desc: '이 대화를 숨기고 worktree는 디스크에 유지합니다',
    session_restore_desc: 'Bring this conversation back into the main list',
    session_archived: 'Session archived',
+    session_archived_worktree: '세션이 보관되었습니다. Worktree는 디스크에 남아 있습니다.',
    session_restored: 'Session restored',
    session_archive_failed: 'Archive failed: ',
    session_duplicate: 'Duplicate conversation',
@@ -9852,6 +9935,11 @@ const LOCALES = {
    session_stop_response_desc: 'Cancel the running response for this conversation',
    session_delete: 'Delete conversation',
    session_delete_desc: 'Permanently remove this conversation',
+    session_delete_confirm: '이 대화를 삭제하시겠습니까?',
+    session_delete_worktree_desc: 'WebUI 대화만 삭제하고 worktree는 디스크에 유지합니다',
+    session_delete_worktree_confirm: (path) => `이 대화를 삭제하시겠습니까? ${path}의 worktree는 디스크에 남아 있습니다.`,
+    session_deleted: '대화가 삭제되었습니다',
+    session_deleted_worktree: '대화가 삭제되었습니다. Worktree는 디스크에 남아 있습니다.',
    session_select_mode: '선택',
    session_select_mode_desc: '일괄 관리할 대화를 선택하세요',
    session_select_all: '전체 선택',
@@ -9862,6 +9950,8 @@ const LOCALES = {
    session_batch_move: '프로젝트로 이동',
    session_batch_delete_confirm: '{0}개의 대화를 삭제하시겠습니까?',
    session_batch_archive_confirm: '{0}개의 대화를 보관하시겠습니까?',
+    session_batch_delete_worktree_confirm: '{0}개의 대화를 삭제하시겠습니까? worktree가 있는 대화 {1}개는 worktree 디렉터리를 디스크에 남깁니다.',
+    session_batch_archive_worktree_confirm: '{0}개의 대화를 보관하시겠습니까? worktree가 있는 대화 {1}개는 worktree 디렉터리를 디스크에 유지합니다.',
    session_no_selection: '선택된 대화가 없습니다',
    // settings panel
    settings_heading_title: '제어 센터',
@@ -1279,6 +1279,27 @@ const SESSION_VIRTUAL_THRESHOLD_ROWS = 80;
 let _sessionVirtualScrollList = null;
 let _sessionVirtualScrollRaf = 0;

+function _sessionSnapshotById(sid){
+  if(!sid)return null;
+  if(S.session&&S.session.session_id===sid) return S.session;
+  return (_allSessions||[]).find(s=>s&&s.session_id===sid)||null;
+}
+function _worktreeSessionCount(ids){
+  return (ids||[]).reduce((count,sid)=>{
+    const session=_sessionSnapshotById(sid);
+    return count+(session&&session.worktree_path?1:0);
+  },0);
+}
+function _sessionArchiveDescription(session){
+  return session&&session.worktree_path?t('session_archive_worktree_desc'):t('session_archive_desc');
+}
+function _sessionArchiveToast(session){
+  return session&&session.worktree_path?t('session_archived_worktree'):t('session_archived');
+}
+function _sessionDeleteDescription(session){
+  return session&&session.worktree_path?t('session_delete_worktree_desc'):t('session_delete_desc');
+}
+
 function _sessionIdFromLocation(){
  if(typeof window==='undefined'||!window.location) return null;
  const marker='/session/';
@@ -1376,10 +1397,15 @@ function _renderBatchActionBar(){
  archiveBtn.textContent=t('session_batch_archive');
  archiveBtn.onclick=async()=>{
    const ids=[..._selectedSessions];
-    const ok=await showConfirmDialog({message:t('session_batch_archive_confirm',ids.length),confirmLabel:t('session_batch_archive'),danger:true});
+    const wtCount=_worktreeSessionCount(ids);
+    const ok=await showConfirmDialog({
+      message:wtCount?t('session_batch_archive_worktree_confirm',ids.length,wtCount):t('session_batch_archive_confirm',ids.length),
+      confirmLabel:t('session_batch_archive'),
+      danger:true
+    });
    if(!ok)return;
    try{await Promise.all(ids.map(sid=>api('/api/session/archive',{method:'POST',body:JSON.stringify({session_id:sid,archived:true})})));
-      showToast(t('session_archived'));exitSessionSelectMode();await renderSessionList();
+      showToast(wtCount?t('session_archived_worktree'):t('session_archived'));exitSessionSelectMode();await renderSessionList();
    }catch(e){showToast('Archive failed: '+(e.message||e));}
  };bar.appendChild(archiveBtn);
  // Move
@@ -1391,7 +1417,12 @@ function _renderBatchActionBar(){
  deleteBtn.textContent=t('session_batch_delete');
  deleteBtn.onclick=async()=>{
    const ids=[..._selectedSessions];
-    const ok=await showConfirmDialog({message:t('session_batch_delete_confirm',ids.length),confirmLabel:t('delete_title'),danger:true});
+    const wtCount=_worktreeSessionCount(ids);
+    const ok=await showConfirmDialog({
+      message:wtCount?t('session_batch_delete_worktree_confirm',ids.length,wtCount):t('session_batch_delete_confirm',ids.length),
+      confirmLabel:t('delete_title'),
+      danger:true
+    });
    if(!ok)return;
    try{
      await Promise.all(ids.map(sid=>api('/api/session/delete',{method:'POST',body:JSON.stringify({session_id:sid})})));
@@ -1402,7 +1433,7 @@ function _renderBatchActionBar(){
        if(remaining.sessions&&remaining.sessions.length){await loadSession(remaining.sessions[0].session_id);}
        else{$('msgInner').innerHTML='';$('emptyState').style.display='';}
      }
-      showToast(t('session_delete')+' ('+ids.length+')');exitSessionSelectMode();await renderSessionList();
+      showToast((wtCount?t('session_deleted_worktree'):t('session_deleted'))+' ('+ids.length+')');exitSessionSelectMode();await renderSessionList();
    }catch(e){showToast('Delete failed: '+(e.message||e));}
  };bar.appendChild(deleteBtn);
 }
@@ -1570,7 +1601,7 @@ function _openSessionActionMenu(session, anchorEl){
  ));
  menu.appendChild(_buildSessionAction(
    session.archived?t('session_restore'):t('session_archive'),
-    session.archived?t('session_restore_desc'):t('session_archive_desc'),
+    session.archived?t('session_restore_desc'):_sessionArchiveDescription(session),
    session.archived?ICONS.unarchive:ICONS.archive,
    async()=>{
      closeSessionActionMenu();
@@ -1579,7 +1610,7 @@ function _openSessionActionMenu(session, anchorEl){
        session.archived=!session.archived;
        if(S.session&&S.session.session_id===session.session_id) S.session.archived=session.archived;
        await renderSessionList();
-        showToast(session.archived?t('session_archived'):t('session_restored'));
+        showToast(session.archived?_sessionArchiveToast(session):t('session_restored'));
      }catch(err){showToast(t('session_archive_failed')+err.message);}
    }
  ));
@@ -1601,7 +1632,7 @@ function _openSessionActionMenu(session, anchorEl){
  if(!isExternalSession){
    menu.appendChild(_buildSessionAction(
      t('session_delete'),
-      t('session_delete_desc'),
+      _sessionDeleteDescription(session),
      ICONS.trash,
      async()=>{
        closeSessionActionMenu();
@@ -3013,8 +3044,9 @@ if(typeof window!=='undefined'){
 }

 async function deleteSession(sid){
+  const session=_sessionSnapshotById(sid);
  const ok=await showConfirmDialog({
-    message:'Delete this conversation?',
+    message:session&&session.worktree_path?t('session_delete_worktree_confirm',session.worktree_path):t('session_delete_confirm'),
    confirmLabel:t('delete_title'),
    danger:true
  });
@@ -3040,7 +3072,7 @@ async function deleteSession(sid){
      if(typeof syncAppTitlebar==='function') syncAppTitlebar();
    }
  }
-  showToast('Conversation deleted');
+  showToast(session&&session.worktree_path?t('session_deleted_worktree'):t('session_deleted'));
  await renderSessionList();
 }

@@ -0,0 +1,86 @@
+from types import SimpleNamespace
+
+import api.models as models
+import api.routes as routes
+from api.models import SESSIONS, Session
+
+
+def _capture_post(monkeypatch, body):
+    captured = {}
+    monkeypatch.setattr(routes, "_check_csrf", lambda handler: True)
+    monkeypatch.setattr(routes, "read_body", lambda handler: body)
+    monkeypatch.setattr(
+        routes,
+        "j",
+        lambda handler, payload, status=200, extra_headers=None: captured.update(
+            payload=payload,
+            status=status,
+        )
+        or True,
+    )
+    return captured
+
+
+def _isolate_session_store(tmp_path, monkeypatch):
+    session_dir = tmp_path / "sessions"
+    session_dir.mkdir()
+    monkeypatch.setattr(models, "SESSION_DIR", session_dir)
+    monkeypatch.setattr(models, "SESSION_INDEX_FILE", session_dir / "_index.json")
+    monkeypatch.setattr(routes, "SESSION_DIR", session_dir)
+    monkeypatch.setattr(routes, "SESSION_INDEX_FILE", session_dir / "_index.json")
+    SESSIONS.clear()
+    return session_dir
+
+
+def _worktree_session(tmp_path, session_id):
+    repo = tmp_path / "repo"
+    worktree = repo / ".worktrees" / f"hermes-{session_id}"
+    worktree.mkdir(parents=True)
+    s = Session(
+        session_id=session_id,
+        title="Worktree session",
+        workspace=str(worktree),
+        worktree_path=str(worktree),
+        worktree_branch=f"hermes/{session_id}",
+        worktree_repo_root=str(repo),
+    )
+    s.save()
+    return s, worktree
+
+
+def test_delete_worktree_session_reports_retained_worktree_without_cleanup(tmp_path, monkeypatch):
+    session_dir = _isolate_session_store(tmp_path, monkeypatch)
+    session, worktree = _worktree_session(tmp_path, "wtdelete1")
+    captured = _capture_post(monkeypatch, {"session_id": session.session_id})
+    monkeypatch.setattr(routes, "_lookup_cli_session_metadata", lambda sid: {})
+    monkeypatch.setattr(routes, "_is_messaging_session_id", lambda sid: False)
+    monkeypatch.setattr(models, "delete_cli_session", lambda sid: None)
+
+    assert routes.handle_post(object(), SimpleNamespace(path="/api/session/delete")) is True
+
+    assert captured["status"] == 200
+    assert captured["payload"]["ok"] is True
+    assert captured["payload"]["worktree_retained"] is True
+    assert captured["payload"]["worktree_path"] == str(worktree.resolve())
+    assert captured["payload"]["worktree_branch"] == "hermes/wtdelete1"
+    assert not (session_dir / "wtdelete1.json").exists()
+    assert worktree.exists(), "session delete must not remove the git worktree directory"
+
+
+def test_archive_worktree_session_reports_retained_worktree_without_cleanup(tmp_path, monkeypatch):
+    _isolate_session_store(tmp_path, monkeypatch)
+    session, worktree = _worktree_session(tmp_path, "wtarchive1")
+    captured = _capture_post(
+        monkeypatch,
+        {"session_id": session.session_id, "archived": True},
+    )
+
+    assert routes.handle_post(object(), SimpleNamespace(path="/api/session/archive")) is True
+
+    assert captured["status"] == 200
+    assert captured["payload"]["ok"] is True
+    assert captured["payload"]["session"]["archived"] is True
+    assert captured["payload"]["worktree_retained"] is True
+    assert captured["payload"]["worktree_path"] == str(worktree.resolve())
+    assert worktree.exists(), "session archive must not remove the git worktree directory"
+    assert Session.load("wtarchive1").archived is True
@@ -0,0 +1,52 @@
+from pathlib import Path
+
+
+ROOT = Path(__file__).resolve().parents[1]
+
+
+def read(path):
+    return (ROOT / path).read_text(encoding="utf-8")
+
+
+def test_delete_confirmation_mentions_retained_worktree():
+    src = read("static/sessions.js")
+    i18n = read("static/i18n.js")
+    assert "function _sessionSnapshotById(sid)" in src
+    assert "session.worktree_path?t('session_delete_worktree_confirm',session.worktree_path)" in src
+    assert "session_delete_worktree_confirm" in i18n
+    assert "will remain on disk" in i18n
+    assert "session_delete_worktree_confirm: (path) => `Delete this conversation? The worktree at ${path} will remain on disk.`" in i18n
+    assert "session_delete_worktree_desc: 'Delete only the WebUI conversation; keep the worktree on disk'" in i18n
+    assert "session_deleted_worktree: 'Conversation deleted. Worktree remains on disk.'" in i18n
+
+
+def test_batch_archive_delete_confirmations_count_worktree_sessions():
+    src = read("static/sessions.js")
+    i18n = read("static/i18n.js")
+    assert "function _worktreeSessionCount(ids)" in src
+    assert "session_batch_delete_worktree_confirm" in src
+    assert "session_batch_archive_worktree_confirm" in src
+    assert "session_batch_delete_worktree_confirm" in i18n
+    assert "session_batch_archive_worktree_confirm" in i18n
+
+
+def test_archive_and_delete_action_descriptions_are_worktree_specific():
+    src = read("static/sessions.js")
+    i18n = read("static/i18n.js")
+    assert "function _sessionArchiveDescription(session)" in src
+    assert "function _sessionDeleteDescription(session)" in src
+    assert "session&&session.worktree_path?t('session_archive_worktree_desc')" in src
+    assert "session&&session.worktree_path?t('session_delete_worktree_desc')" in src
+    assert "session_archive_worktree_desc" in i18n
+    assert "session_delete_worktree_desc" in i18n
+    assert "session_archive_worktree_desc: 'Hide this conversation; keep its worktree on disk'" in i18n
+    assert "session_archived_worktree: 'Session archived. Worktree remains on disk.'" in i18n
+
+
+def test_worktree_archive_delete_api_responses_are_explicit():
+    src = read("api/routes.py")
+    assert "def _worktree_retained_payload(session)" in src
+    assert "def _worktree_retained_payload_for_session_id(sid: str)" in src
+    assert '"worktree_retained": True' in src
+    assert '{"ok": True, **worktree_retained}' in src
+    assert '{"ok": True, "session": s.compact(), **_worktree_retained_payload(s)}' in src
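Editor's sketch, not part of the commit: the assertions above pin down the name `_worktree_retained_payload` and the response keys, but the body of the helper in `api/routes.py` is not shown in this diff. One shape that would satisfy the tests is sketched below. The `getattr` lookups and the empty-dict fallback are assumptions, not the actual implementation.

```python
from pathlib import Path
from types import SimpleNamespace
from typing import Any, Dict


def _worktree_retained_payload(session: Any) -> Dict[str, Any]:
    """Return extra response fields for a worktree-backed session, else {}."""
    worktree_path = getattr(session, "worktree_path", None)
    if not worktree_path:
        return {}
    return {
        "worktree_retained": True,
        # Resolved so the value matches str(worktree.resolve()) in the tests.
        "worktree_path": str(Path(worktree_path).resolve()),
        "worktree_branch": getattr(session, "worktree_branch", None),
    }


# Usage with a stand-in object (the real Session class is not needed here).
example = SimpleNamespace(
    worktree_path="/tmp/repo/.worktrees/hermes-demo",
    worktree_branch="hermes/demo",
)
response = {"ok": True, **_worktree_retained_payload(example)}
plain_response = {"ok": True, **_worktree_retained_payload(SimpleNamespace())}
```

Returning an empty dict for sessions without a worktree keeps the `**` spread in the route handlers unconditional, so responses for ordinary sessions stay unchanged.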
@@ -133,19 +133,48 @@ class TestReasoningModelTitleGeneration(unittest.TestCase):
        self.assertEqual(_title_completion_budget(), 512)
        self.assertEqual(_title_retry_completion_budget(), 1024)

-    def test_aux_retries_empty_reasoning_length_response_with_larger_budget(self):
-        """If a reasoning model returns empty content at finish_reason=length, retry once."""
+    def test_aux_short_circuits_on_empty_reasoning_without_retrying(self):
+        """Regression for #2083: reasoning models that emit only hidden
+        reasoning tokens (no visible content) must NOT trigger a budget-doubling
+        retry — the second call invariably produces the same empty-reasoning
+        shape and just doubles the GPU/credit burn.  Short-circuit to the local
+        fallback path instead."""
        from api.streaming import generate_title_raw_via_aux

-        responses = [
-            {
+        call_count = [0]
+
+        def fake_call_llm(**kwargs):
+            call_count[0] += 1
+            return {
                'choices': [
                    {
                        'message': {'content': '', 'reasoning': 'long hidden reasoning'},
                        'finish_reason': 'length',
                    }
                ]
-            },
+            }
+
+        with _patch_tg_config({'provider': 'ollama', 'model': 'kimi-k2.6', 'base_url': 'https://ollama.com/v1'}):
+            with patch('agent.auxiliary_client.call_llm', side_effect=fake_call_llm, create=True):
+                result, status = generate_title_raw_via_aux(
+                    user_text='Hey nur ein kurzer Test',
+                    assistant_text='Alles klar, ich helfe dir dabei.',
+                )
+
+        self.assertIsNone(result)
+        self.assertEqual(status, 'llm_empty_reasoning_aux')
+        # Exactly one call in total, at the base budget: no retry on the first
+        # prompt and no second-prompt attempt (short-circuited).
+        self.assertEqual(call_count[0], 1)
+
+    def test_aux_still_retries_finish_length_without_reasoning(self):
+        """Length-truncated responses WITHOUT reasoning tokens still get the
+        budget-doubling retry — those are legitimately recoverable by giving
+        the model more headroom."""
+        from api.streaming import generate_title_raw_via_aux
+
+        responses = [
+            {'choices': [{'message': {'content': ''}, 'finish_reason': 'length'}]},
            {'choices': [{'message': {'content': 'Useful Session Title'}, 'finish_reason': 'stop'}]},
        ]
        captured_budgets = []
@@ -187,21 +216,58 @@ class TestReasoningModelTitleGeneration(unittest.TestCase):
                )

        self.assertIsNone(result)
-        self.assertEqual(status, 'llm_length_aux')
+        self.assertEqual(status, 'llm_empty_reasoning_aux')

-    def test_agent_route_retries_empty_reasoning_length_response(self):
-        """The active-agent route should get the same reasoning-model retry path as aux."""
+    def test_agent_route_short_circuits_on_empty_reasoning_without_retrying(self):
+        """Regression for #2083 on the active-agent route: empty-reasoning
+        responses must NOT trigger a budget-doubling retry."""
        from api.streaming import generate_title_raw_via_agent

-        responses = [
-            {
+        call_count = [0]
+
+        def fake_create(**kwargs):
+            call_count[0] += 1
+            return {
                'choices': [
                    {
                        'message': {'content': '', 'reasoning': 'long hidden reasoning'},
                        'finish_reason': 'length',
                    }
                ]
-            },
+            }
+
+        client = types.SimpleNamespace(
+            chat=types.SimpleNamespace(
+                completions=types.SimpleNamespace(create=fake_create)
+            )
+        )
+        agent = MagicMock()
+        agent.api_mode = 'openai'
+        agent.provider = 'ollama'
+        agent.model = 'kimi-k2.6'
+        agent.base_url = 'https://ollama.com/v1'
+        agent.reasoning_config = None
+        agent._build_api_kwargs.return_value = {}
+        agent._ensure_primary_openai_client.return_value = client
+
+        result, status = generate_title_raw_via_agent(
+            agent,
+            user_text='Hey nur ein kurzer Test',
+            assistant_text='Alles klar, ich helfe dir dabei.',
+        )
+
+        self.assertIsNone(result)
+        self.assertEqual(status, 'llm_empty_reasoning')
+        # Exactly one call in total, at the base budget: no retry, no second-prompt attempt.
+        self.assertEqual(call_count[0], 1)
+        self.assertIsNone(agent.reasoning_config)
+
+    def test_agent_route_still_retries_finish_length_without_reasoning(self):
+        """The active-agent route still retries length-truncated responses that carry no reasoning."""
+        from api.streaming import generate_title_raw_via_agent
+
+        responses = [
+            {'choices': [{'message': {'content': ''}, 'finish_reason': 'length'}]},
            {'choices': [{'message': {'content': 'Agent Session Title'}, 'finish_reason': 'stop'}]},
        ]
        captured_budgets = []