Release v0.51.28 — Release E2 (MCP server Option A rewrite + WebUI /goal command)

Merge stage-323 to master.
2026-05-25 19:20:16 +00:00 · 2026-05-08 13:28:13 -07:00
parent 891c09c2bc 0590d597a3
commit dec2d25fcc
19 changed files with 2747 additions and 110 deletions
@@ -24,7 +24,14 @@ jobs:
      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
-          pip install pyyaml>=6.0 pytest pytest-timeout
+          pip install pyyaml>=6.0 pytest pytest-timeout pytest-asyncio
+          # Install the `mcp` package so tests/test_mcp_server.py runs in CI.
+          # The package is an optional runtime dep of mcp_server.py — users
+          # who run the MCP integration install it themselves; CI installs
+          # it so test coverage exists. If mcp install fails (Python 3.13
+          # wheel not yet available, etc.), tests/test_mcp_server.py uses
+          # importorskip and the matrix stays green.
+          pip install mcp || echo "mcp install failed — test_mcp_server.py will importorskip"

      - name: Run tests
        run: pytest tests/ -v --timeout=60
@@ -1,6 +1,38 @@
 # Hermes Web UI -- Changelog

-## [v0.51.27] — 2026-05-08 — 4-PR contributor batch (Release E1: workspace-prefix sentinel hardening, custom named provider API key resolution, streaming chat scroll-pin retention, Kanban detail scrollable)
+## [v0.51.28] — 2026-05-08 — 2-PR contributor batch (Release E2: MCP server Option A rewrite + WebUI /goal command)
+
+### Added (2 PRs)
+
+- **PR #1895** by @samuelgudi — MCP server Option A rewrite (#1616). Replaces the fragile MCP integration with a clean `mcp_server.py` (567 LOC) implementing project CRUD, session listing, and session mutations (rename/move) over Hermes's HTTP API. Imports `api.models` / `api.profiles` canonically rather than carrying duplicate slug-matching helpers. Relocates `_profiles_match` from `api/routes.py` into `api/profiles.py` as the single source of truth (mcp_server.py and api/routes.py both now import the canonical helper — re-introducing a local copy in either module trips a parity test immediately). Adds env-aware WEBUI_URL (`HERMES_WEBUI_HOST` / `HERMES_WEBUI_PORT`). New behaviour: `delete_project` REFUSES to touch session JSONs when `HERMES_WEBUI_PASSWORD` is unset, returning `{ok:true, unassigned_sessions:0, warning:"…"}` instead — preventing data-loss when an MCP client tries to delete a project on an unauthenticated server. 53-test coverage in `tests/test_mcp_server.py` (914 LOC) including HTTP wire-format integration tests, profile-scoped isolation, legacy untagged row visibility, and `--profile foo` CLI ordering regression. Closes #1616.
+
+- **PR #1866** by @Michaelyklam — WebUI `/goal` command for goal-tracking with budget enforcement and continuation prompts. New `api/goals.py` (489 LOC) implements goal lifecycle (set / pause / resume / clear / status), per-profile SQLite `SessionDB` cache, and `evaluate_goal_after_turn()` SSE hook that emits `goal` and `goal_continue` events from `api/streaming.py` after assistant turns. Wire-up: `api/routes.py` adds `/api/goal` endpoint (POST set/pause/resume/clear, GET status) and `_start_chat_stream_for_session()` extraction so kickoff prompts can run through the canonical streaming path; `static/commands.js` adds `/goal` autocomplete (cmdGoal handler) with i18n description; `static/messages.js` handles new SSE event types with continuation-toast UI; `static/i18n.js` adds 9 new strings across all locales. 4 documentation screenshots added under `docs/pr-media/{1866,1808}/`. Closes #1808.
+
+### Mid-stage absorbed fixes (test isolation, per blocker investigation)
+
+- **#1857 polluter root-cause** — `tests/test_issue1857_usage_overwrite.py` was using `mock.patch.dict(sys.modules, {...})`, which DELETES any keys added during the patched scope on `__exit__`. That silently evicted lazily-imported pydantic submodules (e.g. `pydantic.root_model`), producing `KeyError: 'pydantic.root_model'` in `test_mcp_server.py` downstream when the full pytest suite ran. Fixed by replacing with manual save/restore using a `_MISSING` sentinel.
+- **#1895 module-attribute restoration** — `tests/test_mcp_server.py` mutates module-level constants on `api.config`/`api.models`/`mcp_server` (`STATE_DIR`, `SESSION_DIR`, `PROJECTS_FILE`, …) so the MCP server reads from a tmpdir. Without restoration, downstream tests (`test_pytest_state_isolation`, `test_provider_quota_status`, `test_provider_management`) read deleted tmpdirs from `api.config.STATE_DIR`. Fixed by snapshotting originals on first `_reimport_mcp()` call and restoring in `_cleanup_state_dir()`.
+- **#1895 `_profiles_match` parity test parent-attribute leak** — `test_profiles_match_single_source_of_truth` pops `api.routes`/`api.profiles` from `sys.modules` and re-imports for the canonical-helper identity check. When restoring `sys.modules` only, fresh modules still leaked through because `import api.routes as r` resolves via `sys.modules['api'].routes` (parent-package attribute), NOT via `sys.modules['api.routes']` directly. Fixed by ALSO restoring parent-package attributes — without this, sibling tests (`test_plugins_panel`, `test_pr1350_sse_notify_correctness`, `test_version_badge`) that patch `api.routes.j` and call handlers via `import api.routes as routes` would fail because the patch hits one module object and the handler reads from another.
+
+### Tests
+
+4898 → **4947 collected, 4947 passing, 0 regressions** (+49 net new). Full suite ~140s on Python 3.11 (HERMES_HOME isolated). JS syntax check (`node -c`) clean on `static/commands.js`, `static/i18n.js`, `static/messages.js`. Browser API sanity harness (port 8789): all 11 endpoints + 20 QA tests PASS. Opus advisor pass: SHIP-READY, no blockers (2 follow-up items filed: goal hook firing on unrelated turns; English-only runtime strings in goal UI).
+
+### Pre-release verification
+
+- Full pytest under `HERMES_HOME` isolation: **4947 passed, 8 skipped, 1 xfailed, 2 xpassed, 8 subtests passed** in 140.41s.
+- Browser API harness against stage-323 on port 8789: all 11 endpoints + 20 QA tests PASS (110.66s for QA phase).
+- `node -c` on all 3 modified `static/*.js` files: clean.
+- Stage diff: 16 files, +2692/-105.
+- Opus advisor pass on stage-323 brief: VERDICT=SHIP-READY. No coexistence bugs between #1895 and #1866 (disjoint hunks in routes.py, SSE event names align, `_profiles_match` resolution unambiguous either way, no path collisions).
+- v0.51.27 fixes verified preserved: `_strip_workspace_prefix` (callers at routes.py:1446/1485), `on_interim_assistant` (streaming.py:2120), `_max_iterations_cfg` (streaming.py:2331-2410), `if input_tokens > 0:` guard (streaming.py:2933).
+- Pre-stamp re-fetch of #1866 (sha f2aacf4) + #1895 (sha 766c91e): both MERGEABLE, no force-push during Opus window.
+
+### Follow-up items (filed for next sweep)
+
+- **Goal hook fires on unrelated turns** — while a goal is `active`, every completed assistant turn runs `evaluate_goal_after_turn` and ticks `state.turns_used += 1`, even on user messages unrelated to the goal. UX surprise but not bug-broken; consider gating on `user_initiated` or a goal-context flag.
+- **English-only runtime strings in goal UI** — `messages.js:889` ("Evaluating goal progress…"), `commands.js:651` ("Working toward goal…"), `messages.js:914` ("Continuing toward goal…" toast); also backend strings in `goals.py` (`status_line`, "⊙ Goal set …", "⏸ Goal paused …", "↻ Continuing …"). The `cmd_goal` autocomplete description IS localized across all 9 locales — only the runtime status strings are missed.
+

 ### Fixed (4 PRs)

@@ -2,7 +2,7 @@

 > Web companion to the Hermes Agent CLI. Same workflows, browser-native.
 >
-> Last updated: v0.51.27 (May 8, 2026) — 4898 tests collected — 4-PR Release E1 batch (workspace-prefix sentinel hardening, custom named provider API key resolution, streaming chat scroll-pin, Kanban detail scrollable)
+> Last updated: v0.51.28 (May 8, 2026) — 4947 tests collected — 2-PR Release E2 batch (MCP server Option A rewrite + WebUI /goal command)
 > Test source: `pytest tests/ --collect-only -q`
 > Per-version detail: see [CHANGELOG.md](./CHANGELOG.md)

@@ -168,7 +168,7 @@ Remaining gaps and forward work live in [Forward Work](#forward-work) below.

 ### Slash commands
 - [x] Command registry + autocomplete dropdown
- [x] Built-ins: `/help`, `/clear`, `/model`, `/workspace`, `/new`, `/usage`, `/theme`, `/compact`, `/queue`, `/interrupt`, `/steer`, `/btw`, `/reasoning`, `/skills`, `/toolsets`
+- [x] Built-ins: `/help`, `/clear`, `/model`, `/workspace`, `/new`, `/usage`, `/theme`, `/compact`, `/queue`, `/interrupt`, `/steer`, `/goal`, `/btw`, `/reasoning`, `/skills`, `/toolsets`
 - [x] Transparent pass-through for unrecognized commands

 ### Security
@@ -1835,8 +1835,8 @@ Bridged CLI sessions:

 ---

-*Last updated: v0.51.27, May 8, 2026*
-*Total automated tests collected: 4898*
+*Last updated: v0.51.28, May 8, 2026*
+*Total automated tests collected: 4947*
 *Regression gate: tests/test_regressions.py*
 *Run: pytest tests/ -v --timeout=60*
 *Source: <repo>/*
@@ -0,0 +1,489 @@
+"""WebUI bridge for Hermes persistent session goals."""
+
+from __future__ import annotations
+
+import copy
+import logging
+import time
+from pathlib import Path
+from typing import Any, Dict, Optional
+
+logger = logging.getLogger(__name__)
+
+try:  # Exposed as a module attribute so tests can monkeypatch it directly.
+    from hermes_cli.goals import (  # type: ignore
+        CONTINUATION_PROMPT_TEMPLATE,
+        DEFAULT_MAX_TURNS,
+        GoalManager as _NativeGoalManager,
+        GoalState,
+        judge_goal,
+    )
+except Exception:  # pragma: no cover - depends on installed hermes-agent
+    CONTINUATION_PROMPT_TEMPLATE = ""  # type: ignore
+    DEFAULT_MAX_TURNS = 20  # type: ignore
+    _NativeGoalManager = None  # type: ignore
+    GoalState = None  # type: ignore
+    judge_goal = None  # type: ignore
+
+GoalManager = _NativeGoalManager  # type: ignore
+
+_DB_CACHE: dict[str, Any] = {}
+
+
+def _default_max_turns() -> int:
+    """Return the configured /goal turn budget, defaulting to Hermes' 20 turns."""
+    try:
+        from api import config as _config
+
+        cfg = getattr(_config, "cfg", {}) or {}
+        goals_cfg = cfg.get("goals", {}) if isinstance(cfg, dict) else {}
+        if not isinstance(goals_cfg, dict):
+            return int(DEFAULT_MAX_TURNS or 20)
+        return max(1, int(goals_cfg.get("max_turns", DEFAULT_MAX_TURNS or 20) or 20))
+    except Exception:
+        return int(DEFAULT_MAX_TURNS or 20)
+
+
+def _meta_key(session_id: str) -> str:
+    return f"goal:{session_id}"
+
+
+def _profile_db(profile_home: str | Path):
+    """Return a SessionDB pinned to *profile_home*, without reading HERMES_HOME.
+
+    The upstream Hermes GoalManager persists through hermes_cli.goals.load_goal(),
+    which resolves SessionDB from process-global HERMES_HOME. WebUI sessions are
+    profile-scoped and can run concurrently, so the WebUI bridge uses an explicit
+    state.db path whenever the caller provides the session's profile home.
+    """
+    home = Path(profile_home).expanduser().resolve()
+    key = str(home)
+    cached = _DB_CACHE.get(key)
+    if cached is not None:
+        return cached
+    try:
+        from hermes_state import SessionDB  # type: ignore
+
+        db = SessionDB(db_path=home / "state.db")
+    except Exception as exc:  # pragma: no cover - import/env dependent
+        logger.debug("GoalManager profile DB unavailable for %s: %s", home, exc)
+        return None
+    _DB_CACHE[key] = db
+    return db
+
+
+class _ProfileGoalManager:
+    """Small WebUI-local GoalManager adapter with explicit profile persistence."""
+
+    def __init__(self, session_id: str, *, profile_home: str | Path, default_max_turns: int = 20):
+        if GoalState is None:
+            raise RuntimeError("Hermes goal state unavailable")
+        self.session_id = session_id
+        self.profile_home = Path(profile_home).expanduser().resolve()
+        self.default_max_turns = int(default_max_turns or DEFAULT_MAX_TURNS or 20)
+        self._state = self._load()
+
+    @property
+    def state(self):
+        return self._state
+
+    def _load(self):
+        db = _profile_db(self.profile_home)
+        if db is None or not self.session_id:
+            return None
+        try:
+            raw = db.get_meta(_meta_key(self.session_id))
+        except Exception as exc:
+            logger.debug("GoalManager profile get_meta failed: %s", exc)
+            return None
+        if not raw:
+            return None
+        try:
+            return GoalState.from_json(raw)  # type: ignore[union-attr]
+        except Exception as exc:
+            logger.warning("GoalManager profile state parse failed for %s: %s", self.session_id, exc)
+            return None
+
+    def _save(self, state) -> None:
+        db = _profile_db(self.profile_home)
+        if db is None or not self.session_id or state is None:
+            return
+        try:
+            db.set_meta(_meta_key(self.session_id), state.to_json())
+        except Exception as exc:
+            logger.debug("GoalManager profile set_meta failed: %s", exc)
+
+    def is_active(self) -> bool:
+        return self._state is not None and self._state.status == "active"
+
+    def has_goal(self) -> bool:
+        return self._state is not None and self._state.status in ("active", "paused")
+
+    def status_line(self) -> str:
+        s = self._state
+        if s is None or s.status in ("cleared",):
+            return "No active goal. Set one with /goal <text>."
+        turns = f"{s.turns_used}/{s.max_turns} turns"
+        if s.status == "active":
+            return f"⊙ Goal (active, {turns}): {s.goal}"
+        if s.status == "paused":
+            extra = f" — {s.paused_reason}" if s.paused_reason else ""
+            return f"⏸ Goal (paused, {turns}{extra}): {s.goal}"
+        if s.status == "done":
+            return f"✓ Goal done ({turns}): {s.goal}"
+        return f"Goal ({s.status}, {turns}): {s.goal}"
+
+    def set(self, goal: str, *, max_turns: Optional[int] = None):
+        goal = (goal or "").strip()
+        if not goal:
+            raise ValueError("goal text is empty")
+        state = GoalState(  # type: ignore[operator]
+            goal=goal,
+            status="active",
+            turns_used=0,
+            max_turns=int(max_turns) if max_turns else self.default_max_turns,
+            created_at=time.time(),
+            last_turn_at=0.0,
+        )
+        self._state = state
+        self._save(state)
+        return state
+
+    def pause(self, reason: str = "user-paused"):
+        if not self._state:
+            return None
+        self._state.status = "paused"
+        self._state.paused_reason = reason
+        self._save(self._state)
+        return self._state
+
+    def resume(self, *, reset_budget: bool = True):
+        if not self._state:
+            return None
+        self._state.status = "active"
+        self._state.paused_reason = None
+        if reset_budget:
+            self._state.turns_used = 0
+        self._save(self._state)
+        return self._state
+
+    def clear(self) -> None:
+        if self._state is None:
+            return
+        self._state.status = "cleared"
+        self._save(self._state)
+        self._state = None
+
+    def evaluate_after_turn(self, last_response: str, *, user_initiated: bool = True) -> Dict[str, Any]:
+        state = self._state
+        if state is None or state.status != "active":
+            return {
+                "status": state.status if state else None,
+                "should_continue": False,
+                "continuation_prompt": None,
+                "verdict": "inactive",
+                "reason": "no active goal",
+                "message": "",
+            }
+
+        state.turns_used += 1
+        state.last_turn_at = time.time()
+
+        if judge_goal is None:
+            verdict, reason = "continue", "goal judge unavailable"
+        else:
+            verdict, reason = judge_goal(state.goal, str(last_response or ""))
+        state.last_verdict = verdict
+        state.last_reason = reason
+
+        if verdict == "done":
+            state.status = "done"
+            self._save(state)
+            return {
+                "status": "done",
+                "should_continue": False,
+                "continuation_prompt": None,
+                "verdict": "done",
+                "reason": reason,
+                "message": f"✓ Goal achieved: {reason}",
+            }
+
+        if state.turns_used >= state.max_turns:
+            state.status = "paused"
+            state.paused_reason = f"turn budget exhausted ({state.turns_used}/{state.max_turns})"
+            self._save(state)
+            return {
+                "status": "paused",
+                "should_continue": False,
+                "continuation_prompt": None,
+                "verdict": "continue",
+                "reason": reason,
+                "message": (
+                    f"⏸ Goal paused — {state.turns_used}/{state.max_turns} turns used. "
+                    "Use /goal resume to keep going, or /goal clear to stop."
+                ),
+            }
+
+        self._save(state)
+        return {
+            "status": "active",
+            "should_continue": True,
+            "continuation_prompt": self.next_continuation_prompt(),
+            "verdict": "continue",
+            "reason": reason,
+            "message": f"↻ Continuing toward goal ({state.turns_used}/{state.max_turns}): {reason}",
+        }
+
+    def next_continuation_prompt(self) -> Optional[str]:
+        if not self._state or self._state.status != "active":
+            return None
+        return CONTINUATION_PROMPT_TEMPLATE.format(goal=self._state.goal)
+
+
+def _manager(session_id: str, *, profile_home: str | Path | None = None):
+    if GoalManager is None:
+        return None
+    if profile_home and GoalManager is _NativeGoalManager and GoalState is not None:
+        try:
+            return _ProfileGoalManager(
+                session_id=session_id,
+                profile_home=profile_home,
+                default_max_turns=_default_max_turns(),
+            )
+        except Exception as exc:
+            logger.debug("Profile-scoped GoalManager unavailable: %s", exc)
+            return None
+    return GoalManager(session_id=session_id, default_max_turns=_default_max_turns())
+
+
+def _state_payload(state: Any) -> Optional[Dict[str, Any]]:
+    if state is None:
+        return None
+    return {
+        "goal": getattr(state, "goal", "") or "",
+        "status": getattr(state, "status", "") or "",
+        "turns_used": int(getattr(state, "turns_used", 0) or 0),
+        "max_turns": int(getattr(state, "max_turns", 0) or 0),
+        "last_verdict": getattr(state, "last_verdict", None),
+        "last_reason": getattr(state, "last_reason", None),
+        "paused_reason": getattr(state, "paused_reason", None),
+    }
+
+
+def _payload(
+    *,
+    ok: bool = True,
+    action: str,
+    message: str,
+    state: Any = None,
+    error: str | None = None,
+    kickoff_prompt: str | None = None,
+    decision: Dict[str, Any] | None = None,
+) -> Dict[str, Any]:
+    body: Dict[str, Any] = {
+        "ok": bool(ok),
+        "action": action,
+        "message": message,
+        "goal": _state_payload(state),
+    }
+    if error:
+        body["error"] = error
+    if kickoff_prompt:
+        body["kickoff_prompt"] = kickoff_prompt
+    if decision is not None:
+        body["decision"] = decision
+    return body
+
+
+def goal_state_snapshot(session_id: str, *, profile_home: str | Path | None = None) -> Any:
+    """Return a deep copy of current goal state for rollback before kickoff."""
+    mgr = _manager(str(session_id or ""), profile_home=profile_home)
+    if mgr is None:
+        return None
+    return copy.deepcopy(getattr(mgr, "state", None))
+
+
+def restore_goal_state(session_id: str, snapshot: Any, *, profile_home: str | Path | None = None) -> None:
+    """Restore a prior goal state after kickoff stream creation fails."""
+    mgr = _manager(str(session_id or ""), profile_home=profile_home)
+    if mgr is None:
+        return
+    if snapshot is None:
+        try:
+            mgr.clear()
+        except Exception:
+            pass
+        return
+    if isinstance(mgr, _ProfileGoalManager):
+        mgr._state = snapshot
+        mgr._save(snapshot)
+        return
+    try:
+        from hermes_cli.goals import save_goal  # type: ignore
+
+        save_goal(str(session_id or ""), snapshot)
+    except Exception as exc:  # pragma: no cover - native fallback only
+        logger.debug("Goal state restore failed for %s: %s", session_id, exc)
+
+
+def goal_command_payload(
+    session_id: str,
+    args: str = "",
+    *,
+    stream_running: bool = False,
+    profile_home: str | Path | None = None,
+) -> Dict[str, Any]:
+    """Return the WebUI response payload for a /goal command.
+
+    Mirrors the gateway command semantics:
+    - /goal or /goal status shows status
+    - /goal pause pauses
+    - /goal resume resumes without auto-starting a turn
+    - /goal clear|stop|done clears
+    - /goal <text> sets a new active goal and returns kickoff_prompt so the
+      caller can start the first normal user-role turn immediately.
+    """
+    sid = str(session_id or "").strip()
+    if not sid:
+        return _payload(ok=False, action="error", error="missing_session", message="session_id required")
+
+    mgr = _manager(sid, profile_home=profile_home)
+    if mgr is None:
+        return _payload(ok=False, action="error", error="unavailable", message="Goals unavailable on this session.")
+
+    text = str(args or "").strip()
+    lower = text.lower()
+
+    if not text or lower == "status":
+        return _payload(action="status", message=mgr.status_line(), state=getattr(mgr, "state", None))
+
+    if lower == "pause":
+        state = mgr.pause(reason="user-paused")
+        if state is None:
+            return _payload(ok=False, action="pause", error="no_goal", message="No goal set.")
+        return _payload(action="pause", message=f"⏸ Goal paused: {state.goal}", state=state)
+
+    if lower == "resume":
+        state = mgr.resume()
+        if state is None:
+            return _payload(ok=False, action="resume", error="no_goal", message="No goal to resume.")
+        return _payload(
+            action="resume",
+            message=(
+                f"▶ Goal resumed: {state.goal}\n"
+                "Send a new message, or type continue, to kick it off."
+            ),
+            state=state,
+        )
+
+    if lower in ("clear", "stop", "done"):
+        had = bool(mgr.has_goal())
+        mgr.clear()
+        return _payload(
+            action="clear",
+            message="Goal cleared." if had else "No active goal.",
+            state=getattr(mgr, "state", None),
+        )
+
+    if stream_running:
+        return _payload(
+            ok=False,
+            action="set",
+            error="agent_running",
+            message=(
+                "Agent is running — use /goal status / pause / clear mid-run, "
+                "or /stop before setting a new goal."
+            ),
+        )
+
+    try:
+        state = mgr.set(text)
+    except ValueError as exc:
+        return _payload(ok=False, action="set", error="invalid_goal", message=f"Invalid goal: {exc}")
+
+    return _payload(
+        action="set",
+        message=(
+            f"⊙ Goal set ({state.max_turns}-turn budget): {state.goal}\n"
+            "I'll keep working until the goal is done, you pause/clear it, or the budget is exhausted.\n"
+            "Controls: /goal status · /goal pause · /goal resume · /goal clear"
+        ),
+        state=state,
+        kickoff_prompt=state.goal,
+    )
+
+
+def has_active_goal(
+    session_id: str,
+    *,
+    profile_home: str | Path | None = None,
+) -> bool:
+    """Return True when the session has an active standing goal to evaluate."""
+    sid = str(session_id or "").strip()
+    if not sid:
+        return False
+    mgr = _manager(sid, profile_home=profile_home)
+    if mgr is None:
+        return False
+    try:
+        return bool(mgr.is_active())
+    except Exception as exc:
+        logger.debug("goal active-state check failed for session=%s: %s", sid, exc)
+        return False
+
+
+def evaluate_goal_after_turn(
+    session_id: str,
+    last_response: str,
+    *,
+    user_initiated: bool = True,
+    profile_home: str | Path | None = None,
+) -> Dict[str, Any]:
+    """Evaluate a completed turn against the standing goal, if any."""
+    sid = str(session_id or "").strip()
+    if not sid:
+        return {
+            "status": None,
+            "should_continue": False,
+            "continuation_prompt": None,
+            "verdict": "inactive",
+            "reason": "missing session_id",
+            "message": "",
+        }
+    mgr = _manager(sid, profile_home=profile_home)
+    if mgr is None:
+        return {
+            "status": None,
+            "should_continue": False,
+            "continuation_prompt": None,
+            "verdict": "inactive",
+            "reason": "goals unavailable",
+            "message": "",
+        }
+    try:
+        if not mgr.is_active():
+            return {
+                "status": getattr(getattr(mgr, "state", None), "status", None),
+                "should_continue": False,
+                "continuation_prompt": None,
+                "verdict": "inactive",
+                "reason": "no active goal",
+                "message": "",
+            }
+        decision = mgr.evaluate_after_turn(str(last_response or ""), user_initiated=user_initiated)
+    except Exception as exc:
+        logger.debug("goal evaluation failed for session=%s: %s", sid, exc)
+        return {
+            "status": None,
+            "should_continue": False,
+            "continuation_prompt": None,
+            "verdict": "error",
+            "reason": f"goal evaluation failed: {type(exc).__name__}",
+            "message": "",
+        }
+    if not isinstance(decision, dict):
+        decision = {}
+    decision.setdefault("should_continue", False)
+    decision.setdefault("continuation_prompt", None)
+    decision.setdefault("message", "")
+    return decision
@@ -170,6 +170,33 @@ def _is_root_profile(name: str) -> bool:
        return name in _root_profile_name_cache


+def _profiles_match(row_profile, active_profile) -> bool:
+    """Return True if a session/project row's profile matches the active profile.
+
+    Treats both the literal alias 'default' and any renamed-root display name
+    (per _is_root_profile) as equivalent, so legacy rows tagged 'default'
+    still surface when the user has renamed the root profile to e.g. 'kinni',
+    and vice versa.
+
+    A row with no profile (`None` or empty string) is treated as belonging to
+    the root profile — that's the convention used by the legacy backfill at
+    api/models.py::all_sessions, and matches the default seen in
+    `static/sessions.js` (`S.activeProfile||'default'`).
+
+    Originally lived in api/routes.py; relocated here so both routes.py and
+    out-of-process consumers (mcp_server.py) can import the canonical helper
+    instead of duplicating the body. See #1614 for the visibility model.
+    """
+    row = row_profile or 'default'
+    active = active_profile or 'default'
+    if row == active:
+        return True
+    # Cross-alias the renamed root.
+    if _is_root_profile(row) and _is_root_profile(active):
+        return True
+    return False
+
+
 def get_active_profile_name() -> str:
    """Return the currently active profile name.

@@ -72,29 +72,11 @@ _STALE_MESSAGING_END_REASONS = {"session_reset", "session_switch"}
 # when the active profile is `'default'`. _is_root_profile() is the
 # canonical check.

-def _profiles_match(row_profile, active_profile) -> bool:
-    """Return True if a session/project row's profile matches the active profile.
-
-    Treats both the literal alias 'default' and any renamed-root display name
-    (per _is_root_profile) as equivalent, so legacy rows tagged 'default'
-    still surface when the user has renamed the root profile to e.g. 'kinni',
-    and vice versa.
-
-    A row with no profile (`None` or empty string) is treated as belonging to
-    the root profile — that's the convention used by the legacy backfill at
-    api/models.py::all_sessions, and matches the default seen in
-    `static/sessions.js` (`S.activeProfile||'default'`).
-    """
-    from api.profiles import _is_root_profile
-
-    row = row_profile or 'default'
-    active = active_profile or 'default'
-    if row == active:
-        return True
-    # Cross-alias the renamed root.
-    if _is_root_profile(row) and _is_root_profile(active):
-        return True
-    return False
+# Canonical helper now lives in api.profiles so out-of-process consumers
+# (mcp_server.py) can import it without duplicating the visibility model.
+# Re-exported here so existing `_profiles_match(...)` call sites in this
+# module keep resolving without per-call-site refactors.
+from api.profiles import _profiles_match  # noqa: F401, E402  (re-export)


 def _all_profiles_query_flag(parsed_url) -> bool:
@@ -4287,6 +4269,9 @@ def handle_post(handler, parsed) -> bool:
    if parsed.path == "/api/background":
        return _handle_background(handler, body)

+    if parsed.path == "/api/goal":
+        return _handle_goal_command(handler, body)
+
    if parsed.path == "/api/chat/start":
        return _handle_chat_start(handler, body, diag=diag)

@@ -6456,6 +6441,197 @@ def _prepare_chat_start_session_for_stream(
    s.save()


+def _start_chat_stream_for_session(
+    s,
+    *,
+    msg: str,
+    attachments=None,
+    workspace: str,
+    model: str,
+    model_provider=None,
+    normalized_model: bool = False,
+    diag=None,
+):
+    """Persist pending state, register an SSE channel, and start an agent turn."""
+    attachments = attachments or []
+    # Prevent duplicate runs in the same session while a stream is still active.
+    # This commonly happens after page refresh/reconnect races and can produce
+    # duplicated clarify cards for what appears to be a single user request.
+    diag.stage("active_stream_check") if diag else None
+    current_stream_id = getattr(s, "active_stream_id", None)
+    if current_stream_id:
+        diag.stage("active_stream_lock_wait") if diag else None
+        with STREAMS_LOCK:
+            current_active = current_stream_id in STREAMS
+        if current_active:
+            diag.stage("response_write") if diag else None
+            return {
+                "error": "session already has an active stream",
+                "active_stream_id": current_stream_id,
+                "_status": 409,
+            }
+        # Stale stream id from a previous run; clear and continue.
+        diag.stage("stale_stream_cleanup") if diag else None
+        _clear_stale_stream_state(s)
+    stream_id = uuid.uuid4().hex
+    session_lock = _get_session_agent_lock(s.session_id)
+    diag.stage("session_lock_wait") if diag else None
+    with session_lock:
+        diag.stage("save_pending_state") if diag else None
+        _prepare_chat_start_session_for_stream(
+            s,
+            msg=msg,
+            attachments=attachments,
+            workspace=workspace,
+            model=model,
+            model_provider=model_provider,
+            stream_id=stream_id,
+        )
+    diag.stage("set_last_workspace") if diag else None
+    set_last_workspace(workspace)
+    diag.stage("stream_registration") if diag else None
+    stream = create_stream_channel()
+    with STREAMS_LOCK:
+        STREAMS[stream_id] = stream
+    diag.stage("worker_thread_start") if diag else None
+    thr = threading.Thread(
+        target=_run_agent_streaming,
+        args=(s.session_id, msg, model, workspace, stream_id, attachments),
+        kwargs={"model_provider": model_provider},
+        daemon=True,
+    )
+    thr.start()
+    response = {
+        "stream_id": stream_id,
+        "session_id": s.session_id,
+        "pending_started_at": s.pending_started_at,
+    }
+    if normalized_model:
+        response["effective_model"] = model
+    if model_provider:
+        response["effective_model_provider"] = model_provider
+    return response
+
+
+def _handle_goal_command(handler, body):
+    """Handle WebUI /goal command controls and optional kickoff stream."""
+    try:
+        require(body, "session_id")
+    except ValueError as e:
+        return bad(handler, str(e))
+    try:
+        s = get_session(body["session_id"])
+    except KeyError:
+        return bad(handler, "Session not found", 404)
+
+    requested_profile = str(body.get("profile") or "").strip()
+    if requested_profile:
+        try:
+            from api.profiles import _PROFILE_ID_RE
+
+            if requested_profile != "default" and not _PROFILE_ID_RE.fullmatch(requested_profile):
+                return bad(handler, "invalid profile", 400)
+        except ImportError:
+            requested_profile = ""
+    if requested_profile and not _profiles_match(getattr(s, "profile", None), requested_profile):
+        has_persisted_turns = bool(
+            getattr(s, "messages", None)
+            or getattr(s, "context_messages", None)
+            or getattr(s, "pending_user_message", None)
+        )
+        if not has_persisted_turns:
+            s.profile = requested_profile
+
+    current_stream_id = getattr(s, "active_stream_id", None)
+    stream_running = False
+    if current_stream_id:
+        with STREAMS_LOCK:
+            stream_running = current_stream_id in STREAMS
+        if not stream_running:
+            _clear_stale_stream_state(s)
+
+    try:
+        from api.profiles import get_hermes_home_for_profile
+
+        profile_home = get_hermes_home_for_profile(getattr(s, "profile", None))
+    except Exception:
+        profile_home = None
+
+    from api.goals import goal_command_payload, goal_state_snapshot, restore_goal_state
+
+    goal_args = str(body.get("args", "") or body.get("text", "") or "")
+    goal_action = goal_args.strip().lower()
+    will_kickoff = bool(
+        goal_args.strip()
+        and goal_action not in ("status", "pause", "resume", "clear", "stop", "done")
+        and not stream_running
+    )
+    workspace = model = model_provider = normalized_model = None
+    previous_goal_state = None
+    if will_kickoff:
+        try:
+            workspace = str(resolve_trusted_workspace(body.get("workspace") or s.workspace))
+        except ValueError as e:
+            return bad(handler, str(e))
+        requested_model = body.get("model") or s.model
+        requested_provider = (
+            body.get("model_provider")
+            if "model_provider" in body
+            else getattr(s, "model_provider", None)
+        )
+        model, model_provider, normalized_model = _resolve_compatible_session_model_state(
+            requested_model,
+            requested_provider,
+        )
+        previous_goal_state = goal_state_snapshot(s.session_id, profile_home=profile_home)
+
+    payload = goal_command_payload(
+        s.session_id,
+        goal_args,
+        stream_running=stream_running,
+        profile_home=profile_home,
+    )
+    if not payload.get("ok", True):
+        status = 409 if payload.get("error") == "agent_running" else 400
+        return j(handler, payload, status=status)
+
+    kickoff_prompt = str(payload.get("kickoff_prompt") or "").strip()
+    if kickoff_prompt:
+        if workspace is None:
+            try:
+                workspace = str(resolve_trusted_workspace(body.get("workspace") or s.workspace))
+            except ValueError as e:
+                return bad(handler, str(e))
+        if model is None:
+            requested_model = body.get("model") or s.model
+            requested_provider = (
+                body.get("model_provider")
+                if "model_provider" in body
+                else getattr(s, "model_provider", None)
+            )
+            model, model_provider, normalized_model = _resolve_compatible_session_model_state(
+                requested_model,
+                requested_provider,
+            )
+        stream_response = _start_chat_stream_for_session(
+            s,
+            msg=kickoff_prompt,
+            attachments=[],
+            workspace=workspace,
+            model=model,
+            model_provider=model_provider,
+            normalized_model=normalized_model,
+        )
+        status = int(stream_response.pop("_status", 200) or 200)
+        payload.update(stream_response)
+        if status >= 400:
+            restore_goal_state(s.session_id, previous_goal_state, profile_home=profile_home)
+            payload["ok"] = False
+            return j(handler, payload, status=status)
+
+    return j(handler, payload)
+
+
 def _handle_chat_start(handler, body, diag=None):
    try:
        diag.stage("validate_session_id") if diag else None
@@ -6512,72 +6688,25 @@ def _handle_chat_start(handler, body, diag=None):
            requested_model,
            requested_provider,
        )
-        # Prevent duplicate runs in the same session while a stream is still active.
-        # This commonly happens after page refresh/reconnect races and can produce
-        # duplicated clarify cards for what appears to be a single user request.
-        diag.stage("active_stream_check") if diag else None
-        current_stream_id = getattr(s, "active_stream_id", None)
-        if current_stream_id:
-            diag.stage("active_stream_lock_wait") if diag else None
-            with STREAMS_LOCK:
-                current_active = current_stream_id in STREAMS
-            if current_active:
-                diag.stage("response_write") if diag else None
-                return j(
-                    handler,
-                    {
-                        "error": "session already has an active stream",
-                        "active_stream_id": current_stream_id,
-                    },
-                    status=409,
-                )
-            # Stale stream id from a previous run; clear and continue.
-            diag.stage("stale_stream_cleanup") if diag else None
-            _clear_stale_stream_state(s)
-        stream_id = uuid.uuid4().hex
-        session_lock = _get_session_agent_lock(s.session_id)
-        diag.stage("session_lock_wait") if diag else None
-        with session_lock:
-            diag.stage("save_pending_state") if diag else None
-            _prepare_chat_start_session_for_stream(
-                s,
-                msg=msg,
-                attachments=attachments,
-                workspace=workspace,
-                model=model,
-                model_provider=model_provider,
-                stream_id=stream_id,
-            )
-        diag.stage("set_last_workspace") if diag else None
-        set_last_workspace(workspace)
-        diag.stage("stream_registration") if diag else None
-        stream = create_stream_channel()
-        with STREAMS_LOCK:
-            STREAMS[stream_id] = stream
-        diag.stage("worker_thread_start") if diag else None
-        thr = threading.Thread(
-            target=_run_agent_streaming,
-            args=(s.session_id, msg, model, workspace, stream_id, attachments),
-            kwargs={"model_provider": model_provider},
-            daemon=True,
+        response = _start_chat_stream_for_session(
+            s,
+            msg=msg,
+            attachments=attachments,
+            workspace=workspace,
+            model=model,
+            model_provider=model_provider,
+            normalized_model=normalized_model,
+            diag=diag,
        )
-        thr.start()
-        response = {
-            "stream_id": stream_id,
-            "session_id": s.session_id,
-            "pending_started_at": s.pending_started_at,
-        }
-        if normalized_model:
-            response["effective_model"] = model
-        if model_provider:
-            response["effective_model_provider"] = model_provider
+        status = int(response.pop("_status", 200) or 200)
        diag.stage("response_write") if diag else None
-        return j(handler, response)
+        return j(handler, response, status=status)
    finally:
        if diag:
            diag.finish()


+
 def _normalize_chat_attachments(raw_attachments):
    """Normalize attachment payloads from the browser.

@@ -3227,6 +3227,64 @@ def _run_agent_streaming(
                    })
            except Exception:
                logger.debug("Failed to drain pending steer for session %s", session_id)
+            # /goal parity: after a successful assistant turn, run the Hermes
+            # GoalManager judge before terminal done/stream_end events. The
+            # frontend surfaces the status line and queues continuation_prompt as
+            # a normal next user message so /queue and user input keep priority.
+            try:
+                from api.goals import evaluate_goal_after_turn, has_active_goal
+
+                if not has_active_goal(session_id, profile_home=_profile_home):
+                    _goal_decision = {}
+                else:
+                    _last_goal_response = ''
+                    for _goal_msg in reversed(s.messages or []):
+                        if not isinstance(_goal_msg, dict) or _goal_msg.get('role') != 'assistant':
+                            continue
+                        _goal_content = _goal_msg.get('content', '')
+                        if isinstance(_goal_content, list):
+                            _goal_parts = []
+                            for _goal_part in _goal_content:
+                                if isinstance(_goal_part, dict):
+                                    _goal_text = _goal_part.get('text') or _goal_part.get('content')
+                                    if _goal_text:
+                                        _goal_parts.append(str(_goal_text))
+                            _last_goal_response = '\n'.join(_goal_parts)
+                        else:
+                            _last_goal_response = str(_goal_content or '')
+                        break
+                    put('goal', {
+                        'session_id': session_id,
+                        'state': 'evaluating',
+                        'message': 'Evaluating goal progress…',
+                    })
+                    _goal_decision = evaluate_goal_after_turn(
+                        session_id,
+                        _last_goal_response,
+                        user_initiated=True,
+                        profile_home=_profile_home,
+                    )
+                decision = _goal_decision or {}
+                _goal_message = str(decision.get('message') or '').strip()
+                if _goal_message:
+                    put('goal', {
+                        'session_id': session_id,
+                        'state': 'continuing' if decision.get('should_continue') else 'idle',
+                        'message': _goal_message,
+                        'decision': decision,
+                    })
+                if decision.get('should_continue'):
+                    continuation_prompt = str(decision.get('continuation_prompt') or '').strip()
+                    if continuation_prompt:
+                        put('goal_continue', {
+                            'session_id': session_id,
+                            'continuation_prompt': continuation_prompt,
+                            'text': continuation_prompt,
+                            'message': _goal_message,
+                            'decision': decision,
+                        })
+            except Exception as _goal_exc:
+                logger.debug("Goal continuation hook failed for session %s: %s", session_id, _goal_exc)
            raw_session = s.compact() | {'messages': s.messages, 'tool_calls': tool_calls}
            put('done', {'session': redact_session_data(raw_session), 'usage': usage})
            # Emit one last metering packet for the live message-header TPS label.
@@ -0,0 +1,567 @@
+#!/usr/bin/env python3
+"""
+Hermes WebUI MCP Server — exposes project and session management
+as MCP tools for any MCP-compatible agent.
+
+Option A rewrite (2026-05-08): imports api.models and api.profiles
+directly from the webui codebase, using canonical helpers for
+locking, profile scoping, index consistency, and validation.
+
+    pip install mcp       # one-time setup
+    python3 mcp_server.py # start via stdio
+
+MCP config for Hermes Agent (add to config.yaml):
+    mcp_servers:
+      hermes-webui:
+        command: /path/to/venv/bin/python3
+        args: [/path/to/hermes-webui/mcp_server.py]
+        env:
+          HERMES_WEBUI_PASSWORD: your_password
+
+Profile override (optional):
+        args: [/path/to/hermes-webui/mcp_server.py, --profile, myprofile]
+
+AI-authoring disclosure: this file was rewritten by MILO (Hermes Agent)
+under human direction, per maintainer guidelines for #1616.
+"""
+
+import argparse
+import json
+import os
+import re
+import sys
+import time
+import uuid
+from pathlib import Path
+
+from mcp.server import Server
+from mcp.server.stdio import stdio_server
+from mcp.types import Tool, TextContent
+
+# ── Ensure the repo root is on sys.path so api.* imports work ─────────────
+_REPO_ROOT = Path(__file__).parent.resolve()
+if str(_REPO_ROOT) not in sys.path:
+    sys.path.insert(0, str(_REPO_ROOT))
+
+# ── CLI: optional --profile override ──────────────────────────────────────
+_profile_arg: str | None = None
+_parser = argparse.ArgumentParser(add_help=False)
+_parser.add_argument("--profile", type=str, default=None)
+_args, _unknown = _parser.parse_known_args()
+_profile_arg = _args.profile
+
+# ── Import webui canonical modules (after path setup) ─────────────────────
+import api.config as _cfg
+from api.config import (
+    STATE_DIR, SESSION_DIR, SESSION_INDEX_FILE, PROJECTS_FILE, HOME,
+)
+from api.models import load_projects, save_projects
+from api.profiles import get_active_profile_name, _is_root_profile, _profiles_match
+
+# ── Apply --profile override before any module uses get_active_profile_name
+if _profile_arg is not None:
+    import api.profiles as _profiles
+    _profiles._active_profile = _profile_arg
+
+# ── API auth state ─────────────────────────────────────────────────────────
+# Mirror the env-var contract used by api/config.py:32-33 so a non-default
+# WebUI port/host (e.g. when 8787 is held by another service on the host)
+# Just Works without configuration drift between the WebUI process and MCP.
+WEBUI_HOST = os.environ.get("HERMES_WEBUI_HOST", "127.0.0.1")
+WEBUI_PORT = os.environ.get("HERMES_WEBUI_PORT", "8787")
+WEBUI_URL = f"http://{WEBUI_HOST}:{WEBUI_PORT}"
+_auth_cookie: str | None = None
+_auth_expires: float = 0  # unix timestamp after which we re-auth
+
+server = Server("hermes-webui")
+
+
+# ═══════════════════════════════════════════════════════════════════════════
+#  Helpers — filesystem (project CRUD via canonical api.models)
+# ═══════════════════════════════════════════════════════════════════════════
+
+def _active_profile() -> str:
+    """Shorthand for the current profile name (--profile or auto-detected)."""
+    return get_active_profile_name() or 'default'
+
+
+def _validate_color(color: str | None) -> str | None:
+    """Return an error string if color is invalid, else None."""
+    if color is not None and not re.match(r"^#[0-9a-fA-F]{3,8}$", color):
+        return "Invalid color format (use #RGB, #RRGGBB, or #RRGGBBAA)"
+    return None
+
+
+def _load_index() -> list:
+    """Read the session index. Falls back to empty list on failure."""
+    if not SESSION_INDEX_FILE.exists():
+        return []
+    try:
+        return json.loads(SESSION_INDEX_FILE.read_text(encoding="utf-8"))
+    except Exception:
+        return []
+
+
+def _session_compact(row: dict) -> dict:
+    """Lightweight compact representation of a session index entry."""
+    return {
+        "session_id": row.get("session_id"),
+        "title": row.get("title"),
+        "project_id": row.get("project_id"),
+        "workspace": row.get("workspace"),
+        "model": row.get("model"),
+        "message_count": row.get("message_count", 0),
+        "source_tag": row.get("source_tag"),
+        "is_cli_session": row.get("is_cli_session", False),
+        "profile": row.get("profile"),
+    }
+
+
+# ═══════════════════════════════════════════════════════════════════════════
+#  Helpers — HTTP API (for mutations that need cache sync)
+# ═══════════════════════════════════════════════════════════════════════════
+
+def _api_password() -> str | None:
+    """Return the plaintext webui password from HERMES_WEBUI_PASSWORD, or None.
+
+    settings.json stores only the bcrypt hash, which the login endpoint cannot
+    accept — it calls verify_password(plaintext) against the stored hash. So
+    there's no usable fallback when the env var is unset; the MCP simply runs
+    in unauthenticated mode and any auth-protected mutation will fail clearly
+    with the server's 401 instead of silently sending an unusable hash.
+    """
+    pw = os.environ.get("HERMES_WEBUI_PASSWORD", "").strip()
+    return pw or None
+
+
+def _api_auth() -> str | None:
+    """Authenticate and return cookie value, or None if auth disabled/fails."""
+    global _auth_cookie, _auth_expires
+
+    pw = _api_password()
+    if not pw:
+        return None  # auth not enabled — API calls will fail anyway
+
+    # Reuse cookie if still valid (25 days — server issues 30-day cookies)
+    if _auth_cookie and time.time() < _auth_expires:
+        return _auth_cookie
+
+    import urllib.request
+
+    try:
+        req = urllib.request.Request(
+            f"{WEBUI_URL}/api/auth/login",
+            data=json.dumps({"password": pw}).encode(),
+            headers={"Content-Type": "application/json"},
+            method="POST",
+        )
+        resp = urllib.request.urlopen(req, timeout=5)
+        cookie = resp.headers.get("Set-Cookie", "")
+        if cookie:
+            _auth_cookie = cookie.split(";")[0]  # "hermes_session=VALUE; ..."
+            _auth_expires = time.time() + 25 * 86400  # 25 days
+            return _auth_cookie
+    except Exception:
+        _auth_cookie = None
+    return None
+
+
+def _api_post(endpoint: str, body: dict) -> dict:
+    """POST to webui API with auth cookie. Returns parsed JSON response."""
+    import urllib.request
+    import urllib.error
+
+    cookie = _api_auth()
+    headers = {"Content-Type": "application/json"}
+    if cookie:
+        headers["Cookie"] = cookie
+
+    try:
+        req = urllib.request.Request(
+            f"{WEBUI_URL}{endpoint}",
+            data=json.dumps(body).encode(),
+            headers=headers,
+            method="POST",
+        )
+        resp = urllib.request.urlopen(req, timeout=5)
+        return json.loads(resp.read())
+    except urllib.error.HTTPError as e:
+        err_body = json.loads(e.read())
+        return {"error": f"API {e.code}: {err_body.get('error', 'unknown')}"}
+    except Exception as e:
+        return {"error": f"API unreachable: {e}"}
+
+
+# ═══════════════════════════════════════════════════════════════════════════
+#  Tool handlers — read-only (filesystem, profile-aware)
+# ═══════════════════════════════════════════════════════════════════════════
+
+async def handle_list_projects(_arguments: dict) -> list[TextContent]:
+    """List all projects with session counts, scoped to active profile."""
+    projects = load_projects()
+    active = _active_profile()
+    index = _load_index()
+
+    # Session counts per project (from index)
+    counts: dict[str, int] = {}
+    for s in index:
+        pid = s.get("project_id")
+        if pid:
+            counts[pid] = counts.get(pid, 0) + 1
+
+    result = []
+    for p in projects:
+        # Profile filter: legacy untagged rows are treated as 'default' by
+        # _profiles_match, so non-root profiles correctly hide them.
+        if not _profiles_match(p.get("profile"), active):
+            continue
+        entry = dict(p)
+        entry["session_count"] = counts.get(p["project_id"], 0)
+        result.append(entry)
+
+    return [TextContent(type="text", text=json.dumps(result, ensure_ascii=False, indent=2))]
+
+
+async def handle_list_sessions(arguments: dict) -> list[TextContent]:
+    """List sessions, optionally filtered by project or unassigned status."""
+    project_id = arguments.get("project_id")
+    unassigned = arguments.get("unassigned", False)
+    limit = max(1, min(500, arguments.get("limit", 50)))
+    active = _active_profile()
+
+    index = _load_index()
+    sessions = [_session_compact(s) for s in index if s.get("session_id")]
+
+    # Filter by profile: legacy untagged rows are treated as 'default' by
+    # _profiles_match (canonical convention), so non-root profiles hide them.
+    sessions = [s for s in sessions if _profiles_match(s.get("profile"), active)]
+
+    if unassigned:
+        sessions = [s for s in sessions if not s["project_id"]]
+    elif project_id:
+        sessions = [s for s in sessions if s["project_id"] == project_id]
+
+    sessions = sessions[:limit]
+    return [TextContent(type="text", text=json.dumps(sessions, ensure_ascii=False, indent=2))]
+
+
+# ═══════════════════════════════════════════════════════════════════════════
+#  Tool handlers — project CRUD (canonical helpers, profile-scoped)
+# ═══════════════════════════════════════════════════════════════════════════
+
+async def handle_create_project(arguments: dict) -> list[TextContent]:
+    """Create a new project (profile-scoped, exact-match title collision)."""
+    name = arguments.get("name", "").strip()[:128]
+    if not name:
+        return [TextContent(type="text", text=json.dumps(
+            {"error": "name is required"}, ensure_ascii=False))]
+
+    color = arguments.get("color")
+    color_err = _validate_color(color)
+    if color_err:
+        return [TextContent(type="text", text=json.dumps(
+            {"error": color_err}, ensure_ascii=False))]
+
+    active = _active_profile()
+    projects = load_projects()
+
+    # Title collision: exact match (consistent with ensure_cron_project)
+    if any(p.get("name") == name and _profiles_match(p.get("profile"), active)
+           for p in projects):
+        return [TextContent(type="text", text=json.dumps(
+            {"error": f"Project '{name}' already exists"}, ensure_ascii=False))]
+
+    proj = {
+        "project_id": uuid.uuid4().hex[:12],
+        "name": name,
+        "color": color,
+        "profile": active,
+        "created_at": time.time(),
+    }
+    projects.append(proj)
+    save_projects(projects)
+
+    proj["session_count"] = 0
+    return [TextContent(type="text", text=json.dumps(proj, ensure_ascii=False, indent=2))]
+
+
+async def handle_rename_project(arguments: dict) -> list[TextContent]:
+    """Rename a project and optionally change its color (profile-checked)."""
+    project_id = arguments.get("project_id")
+    name = arguments.get("name", "").strip()[:128]
+    if not project_id or not name:
+        return [TextContent(type="text", text=json.dumps(
+            {"error": "project_id and name are required"}, ensure_ascii=False))]
+
+    color = arguments.get("color")
+    color_err = _validate_color(color)
+    if color_err:
+        return [TextContent(type="text", text=json.dumps(
+            {"error": color_err}, ensure_ascii=False))]
+
+    active = _active_profile()
+    projects = load_projects()
+    proj = next((p for p in projects if p["project_id"] == project_id), None)
+    if not proj:
+        return [TextContent(type="text", text=json.dumps(
+            {"error": "Project not found"}, ensure_ascii=False))]
+
+    # #1614: profile ownership check
+    if not _profiles_match(proj.get("profile"), active):
+        return [TextContent(type="text", text=json.dumps(
+            {"error": "Project not found"}, ensure_ascii=False))]
+
+    proj["name"] = name
+    if color is not None:
+        proj["color"] = color
+    save_projects(projects)
+    return [TextContent(type="text", text=json.dumps(proj, ensure_ascii=False, indent=2))]
+
+
+async def handle_delete_project(arguments: dict) -> list[TextContent]:
+    """Delete a project and unassign all its sessions (profile-checked)."""
+    project_id = arguments.get("project_id")
+    if not project_id:
+        return [TextContent(type="text", text=json.dumps(
+            {"error": "project_id is required"}, ensure_ascii=False))]
+
+    active = _active_profile()
+    projects = load_projects()
+    proj = next((p for p in projects if p["project_id"] == project_id), None)
+    if not proj:
+        return [TextContent(type="text", text=json.dumps(
+            {"error": "Project not found"}, ensure_ascii=False))]
+
+    # #1614: profile ownership check
+    if not _profiles_match(proj.get("profile"), active):
+        return [TextContent(type="text", text=json.dumps(
+            {"error": "Project not found"}, ensure_ascii=False))]
+
+    projects = [p for p in projects if p["project_id"] != project_id]
+    save_projects(projects)
+
+    # Unassign sessions only when we can do it cache-safely via the HTTP API.
+    # The previous filesystem fallback wrote session_data directly with
+    # os.replace(), which bypassed _write_session_index() in api/models.py
+    # and left _index.json holding the stale project_id — a running WebUI
+    # would still group those sessions under the deleted project until a
+    # subsequent re-compact. Even calling Session.save() in-process would
+    # not help because the WebUI's SESSIONS dict cache (a separate process)
+    # still has the old project_id and overwrites our update on its next
+    # save. The HTTP API is the only cache-safe path; without auth we
+    # refuse and surface the limitation so the operator can act.
+    has_auth = bool(_api_password())
+    if not has_auth:
+        return [TextContent(type="text", text=json.dumps({
+            "ok": True,
+            "deleted": proj["name"],
+            "unassigned_sessions": 0,
+            "warning": "Set HERMES_WEBUI_PASSWORD to unassign sessions; "
+                       "without auth the session index cannot be safely "
+                       "updated and direct filesystem writes would cause "
+                       "index drift in a running WebUI.",
+        }, ensure_ascii=False))]
+
+    unassigned = 0
+    if SESSION_DIR.exists():
+        for p in SESSION_DIR.glob("*.json"):
+            if p.name.startswith("_"):
+                continue
+            try:
+                session_data = json.loads(p.read_text(encoding="utf-8"))
+                if session_data.get("project_id") == project_id:
+                    sid = p.stem
+                    result = _api_post("/api/session/move",
+                                       {"session_id": sid, "project_id": None})
+                    if "ok" in result or "session" in result:
+                        unassigned += 1
+            except Exception:
+                pass
+
+    return [TextContent(type="text", text=json.dumps({
+        "ok": True,
+        "deleted": proj["name"],
+        "unassigned_sessions": unassigned,
+    }, ensure_ascii=False))]
+
+
+# ═══════════════════════════════════════════════════════════════════════════
+#  Tool handlers — mutations (HTTP API with auth, cache-safe)
+# ═══════════════════════════════════════════════════════════════════════════
+
+async def handle_rename_session(arguments: dict) -> list[TextContent]:
+    """Rename a session via the authenticated webui API (cache-safe)."""
+    session_id = arguments.get("session_id")
+    title = arguments.get("title", "").strip()[:80]
+    if not session_id or not title:
+        return [TextContent(type="text", text=json.dumps(
+            {"error": "session_id and title are required"}, ensure_ascii=False))]
+
+    result = _api_post("/api/session/rename",
+                       {"session_id": session_id, "title": title})
+    if "error" in result:
+        return [TextContent(type="text", text=json.dumps(result, ensure_ascii=False))]
+
+    session = result.get("session", {})
+    return [TextContent(type="text", text=json.dumps({
+        "ok": True,
+        "session_id": session_id,
+        "title": session.get("title", title),
+        "method": "api",
+    }, ensure_ascii=False, indent=2))]
+
+
+async def handle_move_session(arguments: dict) -> list[TextContent]:
+    """Assign a session to a project via the authenticated webui API (cache-safe)."""
+    session_id = arguments.get("session_id")
+    project_id = arguments.get("project_id")  # None/null = unassign
+    if not session_id:
+        return [TextContent(type="text", text=json.dumps(
+            {"error": "session_id is required"}, ensure_ascii=False))]
+
+    # If project_id is provided, verify it exists and is profile-accessible
+    if project_id is not None:
+        projects = load_projects()
+        active = _active_profile()
+        target = next((p for p in projects if p["project_id"] == project_id), None)
+        if not target:
+            return [TextContent(type="text", text=json.dumps(
+                {"error": "Project not found"}, ensure_ascii=False))]
+        # #1614: refuse moves into projects owned by another profile
+        if not _profiles_match(target.get("profile"), active):
+            return [TextContent(type="text", text=json.dumps(
+                {"error": "Project not found"}, ensure_ascii=False))]
+
+    result = _api_post("/api/session/move",
+                       {"session_id": session_id, "project_id": project_id})
+    if "error" in result:
+        return [TextContent(type="text", text=json.dumps(result, ensure_ascii=False))]
+
+    session = result.get("session", {})
+    return [TextContent(type="text", text=json.dumps({
+        "ok": True,
+        "session_id": session_id,
+        "project_id": project_id,
+        "title": session.get("title"),
+        "method": "api",
+    }, ensure_ascii=False, indent=2))]
+
+
+# ═══════════════════════════════════════════════════════════════════════════
+#  MCP Server wiring
+# ═══════════════════════════════════════════════════════════════════════════
+
+TOOLS = [
+    Tool(
+        name="list_projects",
+        description="List all session projects with their IDs, names, colors, and session counts (scoped to active profile).",
+        inputSchema={"type": "object", "properties": {}, "required": []},
+    ),
+    Tool(
+        name="create_project",
+        description="Create a new project for organizing sessions (profile-scoped).",
+        inputSchema={
+            "type": "object",
+            "properties": {
+                "name": {"type": "string", "description": "Project name (max 128 chars)"},
+                "color": {"type": "string", "description": "Optional hex color (#RGB, #RRGGBB, or #RRGGBBAA)"},
+            },
+            "required": ["name"],
+        },
+    ),
+    Tool(
+        name="rename_project",
+        description="Rename a project and optionally change its color (profile-checked).",
+        inputSchema={
+            "type": "object",
+            "properties": {
+                "project_id": {"type": "string", "description": "12-char project ID"},
+                "name": {"type": "string", "description": "New name (max 128 chars)"},
+                "color": {"type": "string", "description": "Optional new hex color"},
+            },
+            "required": ["project_id", "name"],
+        },
+    ),
+    Tool(
+        name="delete_project",
+        description="Delete a project and unassign all its sessions (profile-checked).",
+        inputSchema={
+            "type": "object",
+            "properties": {
+                "project_id": {"type": "string", "description": "12-char project ID to delete"},
+            },
+            "required": ["project_id"],
+        },
+    ),
+    Tool(
+        name="rename_session",
+        description="Rename a session (updates sidebar via authenticated API, cache-safe).",
+        inputSchema={
+            "type": "object",
+            "properties": {
+                "session_id": {"type": "string", "description": "Session ID"},
+                "title": {"type": "string", "description": "New title (max 80 chars)"},
+            },
+            "required": ["session_id", "title"],
+        },
+    ),
+    Tool(
+        name="move_session",
+        description="Assign a session to a project. Pass project_id=null to unassign. Uses authenticated API for cache safety (profile-checked).",
+        inputSchema={
+            "type": "object",
+            "properties": {
+                "session_id": {"type": "string", "description": "Session ID"},
+                "project_id": {"type": ["string", "null"], "description": "Project ID (or null to unassign)"},
+            },
+            "required": ["session_id", "project_id"],
+        },
+    ),
+    Tool(
+        name="list_sessions",
+        description="List sessions, optionally filtered by project or unassigned status (profile-scoped).",
+        inputSchema={
+            "type": "object",
+            "properties": {
+                "project_id": {"type": "string", "description": "Filter sessions by project ID"},
+                "unassigned": {"type": "boolean", "description": "Show only sessions with no project"},
+                "limit": {"type": "integer", "description": "Max results (default: 50, max: 500)"},
+            },
+            "required": [],
+        },
+    ),
+]
+
+HANDLERS = {
+    "list_projects": handle_list_projects,
+    "create_project": handle_create_project,
+    "rename_project": handle_rename_project,
+    "delete_project": handle_delete_project,
+    "rename_session": handle_rename_session,
+    "move_session": handle_move_session,
+    "list_sessions": handle_list_sessions,
+}
+
+
+@server.list_tools()
+async def list_tools() -> list[Tool]:
+    return TOOLS
+
+
+@server.call_tool()
+async def call_tool(name: str, arguments: dict) -> list[TextContent]:
+    handler = HANDLERS.get(name)
+    if not handler:
+        return [TextContent(type="text", text=json.dumps(
+            {"error": f"Unknown tool: {name}"}, ensure_ascii=False))]
+    return await handler(arguments)
+
+
+async def main():
+    async with stdio_server() as (read, write):
+        await server.run(read, write, server.create_initialization_options())
+
+
+if __name__ == "__main__":
+    import asyncio
+    asyncio.run(main())
@@ -18,6 +18,7 @@ const COMMANDS=[
  {name:'personality', desc:t('cmd_personality'), fn:cmdPersonality, arg:'name', subArgs:'personalities'},
  {name:'skills',    desc:t('cmd_skills'),   fn:cmdSkills,   arg:'query'},
  {name:'stop',      desc:t('cmd_stop'),     fn:cmdStop,      noEcho:true},
+  {name:'goal',      desc:t('cmd_goal'),     fn:cmdGoal,      arg:'[status|pause|resume|clear|text]', subArgs:['status','pause','resume','clear']},
  {name:'queue',     desc:t('cmd_queue'),    fn:cmdQueue,     arg:'message', noEcho:true},
  {name:'interrupt', desc:t('cmd_interrupt'), fn:cmdInterrupt, arg:'message', noEcho:true},
  {name:'steer',     desc:t('cmd_steer'),    fn:cmdSteer,     arg:'message', noEcho:true},
@@ -625,6 +626,53 @@ async function cmdStop(){
  else showToast(t('cancel_unavailable'));
 }

+async function cmdGoal(args){
+  if(!S.session){await newSession();await renderSessionList();}
+  if(!S.session||!S.session.session_id){showToast(t('no_active_session'));return;}
+  const activeSid=S.session.session_id;
+  try{
+    const r=await api('/api/goal',{method:'POST',body:JSON.stringify({
+      session_id:activeSid,
+      args:args||'',
+      workspace:S.session.workspace,
+      model:S.session.model||($('modelSelect')&&$('modelSelect').value)||'',
+      model_provider:S.session.model_provider||null,
+      profile:S.activeProfile||S.session.profile||'default',
+    })});
+    const msg=String((r&&r.message)||'').trim();
+    if(msg){
+      S.messages.push({role:'assistant',content:msg,_ts:Date.now()/1000,_goalStatus:true,_transient:true});
+      renderMessages({preserveScroll:true});
+      showToast(msg.split('\n')[0],2600);
+    }
+    if(!r||!r.stream_id)return;
+    S.toolCalls=[];
+    if(typeof clearLiveToolCards==='function')clearLiveToolCards();
+    appendThinking();setBusy(true);
+    setComposerStatus('Working toward goal…');
+    S.activeStreamId=r.stream_id;
+    if(S.session&&S.session.session_id===activeSid){
+      S.session.active_stream_id=r.stream_id;
+      if(typeof r.pending_started_at==='number')S.session.pending_started_at=r.pending_started_at;
+      if(r.effective_model)S.session.model=r.effective_model;
+      if(r.effective_model_provider)S.session.model_provider=r.effective_model_provider;
+    }
+    INFLIGHT[activeSid]={messages:[...S.messages],uploaded:[],toolCalls:[]};
+    if(typeof markInflight==='function')markInflight(activeSid,r.stream_id);
+    if(typeof saveInflightState==='function')saveInflightState(activeSid,{streamId:r.stream_id,messages:INFLIGHT[activeSid].messages,uploaded:[],toolCalls:[]});
+    startApprovalPolling(activeSid);
+    startClarifyPolling(activeSid);
+    if(typeof _fetchYoloState==='function')_fetchYoloState(activeSid);
+    attachLiveStream(activeSid,r.stream_id,[]);
+    if(typeof renderSessionList==='function')void renderSessionList();
+  }catch(e){
+    const err=String((e&&e.message)||e||'Goal command failed');
+    S.messages.push({role:'assistant',content:`**Goal command failed:** ${err}`,_ts:Date.now()/1000,_error:true});
+    renderMessages({preserveScroll:true});
+    showToast(err,3000);
+  }
+}
+
 // ── Busy-input mode commands ──────────────────────────────────────────────
 // These commands let users override the default busy_input_mode setting for a
 // specific message.  They are only meaningful while the agent is running.
@@ -183,6 +183,7 @@ const LOCALES = {
    theme_set: 'Theme: ',
    no_active_session: 'No active session',
    cmd_queue: 'Queue a message for the next turn',
+    cmd_goal: 'Set or inspect a persistent goal',
    cmd_interrupt: 'Cancel current turn and send a new message',
    cmd_steer: 'Inject a mid-turn correction without interrupting the agent',
    cmd_queue_no_msg: 'Usage: /queue <message>',
@@ -1204,6 +1205,7 @@ const LOCALES = {
    theme_set: 'テーマ: ',
    no_active_session: 'アクティブなセッションがありません',
    cmd_queue: '次のターン用にメッセージをキュー',
+    cmd_goal: '永続ゴールを設定または確認',
    cmd_interrupt: '現在のターンをキャンセルして新規メッセージを送信',
    cmd_steer: 'エージェントを中断せずにターン中に修正を注入',
    cmd_queue_no_msg: '使い方: /queue <メッセージ>',
@@ -2186,6 +2188,7 @@ const LOCALES = {
    theme_set: 'Тема: ',
    no_active_session: 'Нет активной сессии',
    cmd_queue: 'Поставить сообщение в очередь на следующий оборот',
+    cmd_goal: 'Задать или проверить постоянную цель',
    cmd_interrupt: 'Прервать текущий оборот и отправить новое сообщение',
    cmd_steer: 'Направить агента исправлением (переходит к прерыванию)',
    cmd_queue_no_msg: 'Использование: /queue <сообщение>',
@@ -3173,6 +3176,7 @@ const LOCALES = {
    theme_set: 'Tema: ',
    no_active_session: 'No hay ninguna sesión activa',
    cmd_queue: 'Poner mensaje en cola para el siguiente turno',
+    cmd_goal: 'Definir o consultar un objetivo persistente',
    cmd_interrupt: 'Cancelar turno actual y enviar nuevo mensaje',
    cmd_steer: 'Inyectar una corrección a mitad del turno sin interrumpir al agente',
    cmd_queue_no_msg: 'Uso: /queue <mensaje>',
@@ -4109,6 +4113,7 @@ const LOCALES = {
    model_scope_advisory: 'Gilt für diesen Chat ab Ihrer nächsten Nachricht.',
    model_scope_toast: 'Gilt für diesen Chat ab Ihrer nächsten Nachricht.',
    cmd_queue: 'Nachricht f\u00fcr den n\u00e4chsten Durchgang einreihen',
+    cmd_goal: 'Ein dauerhaftes Ziel setzen oder prüfen',
    cmd_interrupt: 'Aktuellen Durchgang abbrechen und neue Nachricht senden',
    cmd_steer: 'Korrektursignal einf\u00fcgen ohne Unterbrechung',
    cmd_queue_no_msg: 'Verwendung: /queue <Nachricht>',
@@ -5082,6 +5087,7 @@ const LOCALES = {
    theme_set: '\u4e3b\u9898\uff1a',
    no_active_session: '\u5f53\u524d\u6ca1\u6709\u6d3b\u52a8\u4f1a\u8bdd',
    cmd_queue: '\u5c06\u6d88\u606f\u52a0\u5165\u4e0b\u4e00\u8f6e\u7684\u961f\u5217',
+    cmd_goal: '设置或查看持久目标',
    cmd_interrupt: '\u53d6\u6d88\u5f53\u524d\u56de\u5408\u5e76\u53d1\u9001\u65b0\u6d88\u606f',
    cmd_steer: '\u7528\u7ea0\u6b63\u4fe1\u606f\u5f15\u5bfc\u4ee3\u7406\uff08\u56de\u9000\u4e3a\u4e2d\u65ad\uff09',
    cmd_queue_no_msg: '\u7528\u6cd5\uff1a/queue <\u6d88\u606f>',
@@ -6459,6 +6465,7 @@ const LOCALES = {
    never: '\u5f9e\u4e0d',
    no_active_session: '\u7121\u6d3b\u8e8d\u6703\u8a71',
    cmd_queue: '\u5c07\u8a0a\u606f\u52a0\u5165\u4e0b\u4e00\u8f2a\u7684\u4f47\u5217',
+    cmd_goal: '設定或查看持久目標',
    cmd_interrupt: '\u53d6\u6d88\u7576\u524d\u56de\u5408\u4e26\u767c\u9001\u65b0\u8a0a\u606f',
    cmd_steer: '\u5728\u56de\u5408\u9032\u884c\u4e2d\u6ce8\u5165\u7d3a\u6b63\uff0c\u4e0d\u4e2d\u65b7\u4ee3\u7406',
    cmd_queue_no_msg: '\u7528\u6cd5\uff1a/queue <\u8a0a\u606f>',
@@ -6972,6 +6979,7 @@ const LOCALES = {
    theme_set: 'Tema: ',
    no_active_session: 'Nenhuma sessão ativa',
    cmd_queue: 'Enfileirar mensagem para o próximo turno',
+    cmd_goal: 'Definir ou consultar uma meta persistente',
    cmd_interrupt: 'Cancelar turno atual e enviar nova mensagem',
    cmd_steer: 'Injetar correção no meio do turno sem interromper',
    cmd_queue_no_msg: 'Uso: /queue <mensagem>',
@@ -7869,6 +7877,7 @@ const LOCALES = {
    theme_set: 'Theme: ',
    no_active_session: '활성 세션 없음',
    cmd_queue: 'Queue a message for the next turn',
+    cmd_goal: '지속 목표를 설정하거나 확인',
    cmd_interrupt: 'Cancel current turn and send a new message',
    cmd_steer: 'Inject a mid-turn correction without interrupting the agent',
    cmd_queue_no_msg: 'Usage: /queue <message>',
@@ -63,7 +63,7 @@ async function send(){
      // cmdSteer / cmdInterrupt say "No active task to stop."
      if(text.startsWith('/')){
        const _pc=typeof parseCommand==='function'&&parseCommand(text);
-        if(_pc&&['steer','interrupt','queue','terminal'].includes(_pc.name)){
+        if(_pc&&['steer','interrupt','queue','terminal','goal'].includes(_pc.name)){
          const _bc=COMMANDS.find(c=>c.name===_pc.name);
          if(_bc){
            $('msg').value='';autoResize();
@@ -336,6 +336,8 @@ function attachLiveStream(activeSid, streamId, uploaded=[], options={}){
  let assistantText='';
  let reasoningText='';
  let liveReasoningText='';
+  let _latestGoalStatus=null;
+  let _pendingGoalContinuation=null;
  let assistantRow=null;
  let assistantBody=null;
  let segmentStart=0;      // char offset in assistantText where current segment begins
@@ -879,6 +881,41 @@ function attachLiveStream(activeSid, streamId, uploaded=[], options={}){
      }catch(_){}
    });

+    source.addEventListener('goal',e=>{
+      try{
+        const d=JSON.parse(e.data||'{}');
+        if((d.session_id||activeSid)!==activeSid) return;
+        const goalState=String(d.state||'').trim();
+        const goalEvaluatingMessage='Evaluating goal progress…';
+        if(goalState==='evaluating'){
+          setComposerStatus(goalEvaluatingMessage);
+          return;
+        }
+        const msg=String(d.message||'').trim();
+        if(!msg)return;
+        _latestGoalStatus={message:msg,decision:d.decision||null,state:goalState||null};
+        setComposerStatus(msg);
+        showToast(msg.split('\n')[0],2600);
+      }catch(_){}
+    });
+
+    source.addEventListener('goal_continue',e=>{
+      try{
+        const d=JSON.parse(e.data||'{}');
+        const sid=d.session_id||activeSid;
+        const continuation_prompt=String(d.continuation_prompt||d.text||'').trim();
+        if(!continuation_prompt||sid!==activeSid)return;
+        _pendingGoalContinuation={
+          sid,
+          text:continuation_prompt,
+          model:S.session&&S.session.model||'',
+          model_provider:S.session&&S.session.model_provider||null,
+          profile:S.activeProfile||'default',
+        };
+        showToast('Continuing toward goal…',2200);
+      }catch(_){}
+    });
+
    source.addEventListener('done',e=>{
      _terminalStateReached=true;
      if(_persistTimer){clearTimeout(_persistTimer);_persistTimer=null;}
@@ -992,6 +1029,15 @@ function attachLiveStream(activeSid, streamId, uploaded=[], options={}){
          const lastUser=[...S.messages].reverse().find(m=>m.role==='user');
          if(lastUser)lastUser.attachments=uploaded;
        }
+        if(_latestGoalStatus&&_latestGoalStatus.message){
+          S.messages.push({
+            role:'assistant',
+            content:String(_latestGoalStatus.message),
+            _ts:Date.now()/1000,
+            _goalStatus:true,
+            _transient:true,
+          });
+        }
        clearLiveToolCards();
        S.busy=false;
        // No-reply guard (#373): if agent returned nothing, show inline error
@@ -1003,6 +1049,18 @@ function attachLiveStream(activeSid, streamId, uploaded=[], options={}){
        // TTS auto-read: speak the last assistant response if enabled (#499)
        if(typeof autoReadLastAssistant==='function') setTimeout(()=>autoReadLastAssistant(), 300);
      }
+      if(isActiveSession&&_pendingGoalContinuation&&typeof queueSessionMessage==='function'){
+        const _goalNext=_pendingGoalContinuation;
+        _pendingGoalContinuation=null;
+        queueSessionMessage(_goalNext.sid,{
+          text:_goalNext.text,
+          files:[],
+          model:_goalNext.model,
+          model_provider:_goalNext.model_provider,
+          profile:_goalNext.profile,
+        });
+        if(typeof updateQueueBadge==='function')updateQueueBadge(_goalNext.sid);
+      }
      if(isActiveSession) _queueDrainSid=activeSid;
      renderSessionList();
      _setActivePaneIdleIfOwner();
@@ -0,0 +1,272 @@
+"""Regression tests for first-class WebUI /goal command parity."""
+
+import io
+import json
+from pathlib import Path
+from types import SimpleNamespace
+
+import pytest
+
+REPO_ROOT = Path(__file__).resolve().parents[1]
+COMMANDS_JS = (REPO_ROOT / "static" / "commands.js").read_text(encoding="utf-8")
+MESSAGES_JS = (REPO_ROOT / "static" / "messages.js").read_text(encoding="utf-8")
+ROUTES_PY = (REPO_ROOT / "api" / "routes.py").read_text(encoding="utf-8")
+STREAMING_PY = (REPO_ROOT / "api" / "streaming.py").read_text(encoding="utf-8")
+
+
+def test_goal_command_payload_matches_gateway_controls(monkeypatch):
+    """The backend command helper mirrors gateway /goal status/pause/resume/clear/set."""
+    from api import goals as webui_goals
+
+    calls = []
+
+    class FakeState:
+        goal = "ship the feature"
+        status = "active"
+        turns_used = 0
+        max_turns = 20
+        last_verdict = None
+        last_reason = None
+        paused_reason = None
+
+    class FakeGoalManager:
+        def __init__(self, session_id, default_max_turns=20):
+            calls.append(("init", session_id, default_max_turns))
+            self.state = None
+
+        def status_line(self):
+            return "No active goal. Set one with /goal <text>."
+
+        def pause(self, reason="user-paused"):
+            calls.append(("pause", reason))
+            return FakeState()
+
+        def resume(self, reset_budget=True):
+            calls.append(("resume", reset_budget))
+            return FakeState()
+
+        def has_goal(self):
+            return True
+
+        def clear(self):
+            calls.append(("clear",))
+
+        def set(self, goal):
+            calls.append(("set", goal))
+            state = FakeState()
+            state.goal = goal
+            self.state = state
+            return state
+
+    monkeypatch.setattr(webui_goals, "GoalManager", FakeGoalManager)
+    monkeypatch.setattr(webui_goals, "_default_max_turns", lambda: 20)
+
+    status = webui_goals.goal_command_payload("sid-123", "status")
+    pause = webui_goals.goal_command_payload("sid-123", "pause")
+    resume = webui_goals.goal_command_payload("sid-123", "resume")
+    clear = webui_goals.goal_command_payload("sid-123", "clear")
+    set_goal = webui_goals.goal_command_payload("sid-123", "ship the feature")
+
+    assert status["message"] == "No active goal. Set one with /goal <text>."
+    assert pause["message"] == "⏸ Goal paused: ship the feature"
+    assert resume["message"].startswith("▶ Goal resumed: ship the feature")
+    assert clear["message"] == "Goal cleared."
+    assert set_goal["action"] == "set"
+    assert set_goal["kickoff_prompt"] == "ship the feature"
+    assert "⊙ Goal set (20-turn budget): ship the feature" in set_goal["message"]
+    assert ("set", "ship the feature") in calls
+
+
+def test_goal_command_payload_rejects_new_goal_while_stream_running(monkeypatch):
+    """Status/control subcommands are safe mid-run; replacing the goal is not."""
+    from api import goals as webui_goals
+
+    class FakeGoalManager:
+        def __init__(self, session_id, default_max_turns=20):
+            pass
+
+        def status_line(self):
+            return "⊙ Goal (active, 1/20 turns): existing"
+
+    monkeypatch.setattr(webui_goals, "GoalManager", FakeGoalManager)
+    monkeypatch.setattr(webui_goals, "_default_max_turns", lambda: 20)
+
+    status = webui_goals.goal_command_payload("sid-123", "status", stream_running=True)
+    rejected = webui_goals.goal_command_payload("sid-123", "replace it", stream_running=True)
+
+    assert status["ok"] is True
+    assert rejected["ok"] is False
+    assert rejected["error"] == "agent_running"
+    assert "use /goal status / pause / clear mid-run" in rejected["message"]
+
+
+def test_has_active_goal_reports_only_active_state(monkeypatch):
+    """Streaming can avoid showing an evaluating spinner when no standing goal is active."""
+    from api import goals as webui_goals
+
+    class FakeGoalManager:
+        def __init__(self, session_id, default_max_turns=20):
+            self.session_id = session_id
+
+        def is_active(self):
+            return self.session_id == "sid-active-goal"
+
+    monkeypatch.setattr(webui_goals, "GoalManager", FakeGoalManager)
+    monkeypatch.setattr(webui_goals, "_default_max_turns", lambda: 20)
+
+    assert webui_goals.has_active_goal("sid-active-goal") is True
+    assert webui_goals.has_active_goal("sid-idle-goal") is False
+    assert webui_goals.has_active_goal("") is False
+
+
+def test_goal_continuation_decision_emits_status_and_normal_user_prompt(monkeypatch):
+    """Post-turn hook returns the visible status event plus a normal continuation prompt."""
+    from api import goals as webui_goals
+
+    class FakeGoalManager:
+        def __init__(self, session_id, default_max_turns=20):
+            self.session_id = session_id
+
+        def is_active(self):
+            return True
+
+        def evaluate_after_turn(self, last_response, user_initiated=True):
+            return {
+                "status": "active",
+                "should_continue": True,
+                "continuation_prompt": "[Continuing toward your standing goal]\nGoal: ship it",
+                "verdict": "continue",
+                "reason": "one step remains",
+                "message": "↻ Continuing toward goal (1/20): one step remains",
+            }
+
+    monkeypatch.setattr(webui_goals, "GoalManager", FakeGoalManager)
+    monkeypatch.setattr(webui_goals, "_default_max_turns", lambda: 20)
+
+    decision = webui_goals.evaluate_goal_after_turn("sid-123", "not done yet", user_initiated=False)
+
+    assert decision["message"].startswith("↻ Continuing toward goal")
+    assert decision["should_continue"] is True
+    assert decision["continuation_prompt"].startswith("[Continuing toward your standing goal]")
+
+
+def test_goal_endpoint_sets_goal_and_starts_kickoff_stream(monkeypatch, tmp_path):
+    """POST /api/goal uses GoalManager state and launches the first goal turn."""
+    from api import goals as webui_goals
+    from api import routes
+
+    class FakeState:
+        goal = "ship the feature"
+        status = "active"
+        turns_used = 0
+        max_turns = 20
+        last_verdict = None
+        last_reason = None
+        paused_reason = None
+
+    class FakeGoalManager:
+        def __init__(self, session_id, default_max_turns=20):
+            self.session_id = session_id
+            self.default_max_turns = default_max_turns
+
+        def set(self, goal):
+            state = FakeState()
+            state.goal = goal
+            return state
+
+    class FakeSession:
+        session_id = "sid-goal-route"
+        profile = "default"
+        workspace = str(tmp_path)
+        model = "gpt-5.5"
+        model_provider = "openai-codex"
+        messages = []
+        context_messages = []
+        pending_user_message = None
+        active_stream_id = None
+
+    monkeypatch.setattr(webui_goals, "GoalManager", FakeGoalManager)
+    monkeypatch.setattr(routes, "get_session", lambda sid: FakeSession())
+    monkeypatch.setattr(routes, "resolve_trusted_workspace", lambda workspace: tmp_path)
+    monkeypatch.setattr(
+        routes,
+        "_resolve_compatible_session_model_state",
+        lambda model, provider: (model, provider, False),
+    )
+    started = []
+
+    def fake_start(session, **kwargs):
+        started.append(kwargs)
+        return {"stream_id": "goal-stream", "session_id": session.session_id, "pending_started_at": 123.0}
+
+    monkeypatch.setattr(routes, "_start_chat_stream_for_session", fake_start)
+    monkeypatch.setattr(routes, "j", lambda handler, payload, status=200, **kwargs: {"status": status, "payload": payload})
+
+    result = routes._handle_goal_command(
+        object(),
+        {
+            "session_id": "sid-goal-route",
+            "args": "ship the feature",
+            "workspace": str(tmp_path),
+            "model": "gpt-5.5",
+            "model_provider": "openai-codex",
+        },
+    )
+
+    assert result["status"] == 200
+    assert result["payload"]["action"] == "set"
+    assert result["payload"]["stream_id"] == "goal-stream"
+    assert started and started[0]["msg"] == "ship the feature"
+    assert started[0]["model_provider"] == "openai-codex"
+
+
+def test_routes_register_goal_endpoint_and_kickoff_stream():
+    assert 'if parsed.path == "/api/goal"' in ROUTES_PY
+    assert "return _handle_goal_command(handler, body)" in ROUTES_PY
+    assert "goal_command_payload" in ROUTES_PY
+    assert "kickoff_prompt" in ROUTES_PY
+    assert "_start_chat_stream_for_session" in ROUTES_PY
+
+
+def test_streaming_post_turn_goal_hook_surfaces_and_continues():
+    assert "evaluate_goal_after_turn" in STREAMING_PY
+    assert "put('goal'" in STREAMING_PY
+    assert "decision.get('should_continue')" in STREAMING_PY
+    assert "continuation_prompt" in STREAMING_PY
+    assert "put('goal_continue'" in STREAMING_PY
+    goal_idx = STREAMING_PY.find("evaluate_goal_after_turn")
+    done_idx = STREAMING_PY.find("put('done'", goal_idx)
+    assert goal_idx != -1 and done_idx != -1
+    assert goal_idx < done_idx, "goal status should be emitted before the terminal done payload"
+
+
+def test_streaming_goal_hook_emits_evaluating_state_before_judge():
+    evaluating_idx = STREAMING_PY.find("'state': 'evaluating'")
+    judge_idx = STREAMING_PY.find("_goal_decision = evaluate_goal_after_turn")
+    done_idx = STREAMING_PY.find("put('done'", judge_idx)
+    assert evaluating_idx != -1, "goal hook should emit an evaluating state before judge round-trip"
+    assert judge_idx != -1 and done_idx != -1
+    assert evaluating_idx < judge_idx < done_idx
+    assert "Evaluating goal progress…" in STREAMING_PY
+    assert "'state': 'continuing' if decision.get('should_continue') else 'idle'" in STREAMING_PY
+
+
+def test_frontend_has_goal_slash_command_and_status_event_handler():
+    assert "{name:'goal'" in COMMANDS_JS
+    assert "subArgs:['status','pause','resume','clear']" in COMMANDS_JS
+    assert "function cmdGoal" in COMMANDS_JS
+    assert "api('/api/goal'" in COMMANDS_JS
+    assert "stream_id" in COMMANDS_JS
+    assert "goal'" in MESSAGES_JS
+    assert "source.addEventListener('goal'" in MESSAGES_JS
+    assert "source.addEventListener('goal_continue'" in MESSAGES_JS
+    assert "['steer','interrupt','queue','terminal','goal'].includes(_pc.name)" in MESSAGES_JS
+    assert "queueSessionMessage" in MESSAGES_JS
+
+
+def test_frontend_goal_evaluating_state_uses_calm_composer_indicator():
+    assert "const goalState=String(d.state||'').trim();" in MESSAGES_JS
+    assert "const goalEvaluatingMessage='Evaluating goal progress…';" in MESSAGES_JS
+    assert "if(goalState==='evaluating')" in MESSAGES_JS
+    assert "setComposerStatus(goalEvaluatingMessage);" in MESSAGES_JS
+    assert "return;" in MESSAGES_JS
@@ -3,6 +3,9 @@ import sys
 import types
 from unittest import mock

+# Sentinel for sys.modules save/restore — distinguishes "key wasn't there" from None.
+_MISSING = object()
+

 def test_stream_completion_overwrites_session_usage_with_latest_turn(cleanup_test_sessions):
    """#1857: completed turns must not add prompt tokens to stale session totals."""
@@ -125,27 +128,40 @@ def test_stream_completion_overwrites_session_usage_with_latest_turn(cleanup_tes
    fake_hermes_state = types.ModuleType("hermes_state")
    fake_hermes_state.SessionDB = mock.Mock(return_value=None)

-    with mock.patch.object(streaming, "get_session", return_value=fake_session), \
-         mock.patch.object(streaming, "_get_ai_agent", return_value=UsageAgent), \
-         mock.patch.object(streaming, "resolve_model_provider", return_value=("gpt-5.4", "openai", None)), \
-         mock.patch("api.config.get_config", return_value={}), \
-         mock.patch("api.config._resolve_cli_toolsets", return_value=[]), \
-         mock.patch.dict(
-             sys.modules,
-             {
-                 "hermes_cli": fake_hermes_cli,
-                 "hermes_cli.runtime_provider": fake_runtime_module,
-                 "hermes_state": fake_hermes_state,
-             },
-         ):
-        streaming.STREAMS[fake_stream_id] = fake_queue
-        streaming._run_agent_streaming(
-            session_id=fake_session.session_id,
-            msg_text="new turn",
-            model="gpt-5.4",
-            workspace="/tmp",
-            stream_id=fake_stream_id,
-        )
+    # NOTE: We deliberately avoid mock.patch.dict(sys.modules, ...) here.
+    # patch.dict tracks original keys at __enter__ and on __exit__ DELETES any
+    # keys added during the patch that weren't in the original snapshot. That
+    # silently evicts lazily-imported submodules (e.g. pydantic.root_model)
+    # that other tests rely on, producing KeyError: 'pydantic.root_model' in
+    # downstream tests (notably tests/test_mcp_server.py via fastmcp imports).
+    # Manual save/restore only touches the three keys we explicitly inject.
+    _injected = {
+        "hermes_cli": fake_hermes_cli,
+        "hermes_cli.runtime_provider": fake_runtime_module,
+        "hermes_state": fake_hermes_state,
+    }
+    _saved = {k: sys.modules.get(k, _MISSING) for k in _injected}
+    sys.modules.update(_injected)
+    try:
+        with mock.patch.object(streaming, "get_session", return_value=fake_session), \
+             mock.patch.object(streaming, "_get_ai_agent", return_value=UsageAgent), \
+             mock.patch.object(streaming, "resolve_model_provider", return_value=("gpt-5.4", "openai", None)), \
+             mock.patch("api.config.get_config", return_value={}), \
+             mock.patch("api.config._resolve_cli_toolsets", return_value=[]):
+            streaming.STREAMS[fake_stream_id] = fake_queue
+            streaming._run_agent_streaming(
+                session_id=fake_session.session_id,
+                msg_text="new turn",
+                model="gpt-5.4",
+                workspace="/tmp",
+                stream_id=fake_stream_id,
+            )
+    finally:
+        for k, prev in _saved.items():
+            if prev is _MISSING:
+                sys.modules.pop(k, None)
+            else:
+                sys.modules[k] = prev

    assert fake_session.input_tokens == 123
    assert fake_session.output_tokens == 45
@@ -0,0 +1,925 @@
+"""Tests for mcp_server.py — Option A rewrite (Issue #1616).
+
+Covers: project CRUD, profile scoping, title collision, color validation,
+session listing, cross-profile isolation.
+
+Uses HERMES_WEBUI_STATE_DIR env var to point to a temp directory,
+so tests don't touch the real webui state. Module is re-imported
+per test class to ensure clean state.
+"""
+
+import json
+import os
+import sys
+import tempfile
+import uuid
+from pathlib import Path
+
+import pytest
+
+# Skip the entire module when the optional `mcp` package isn't installed.
+# CI runs with stdlib-only deps (pyyaml + pytest + pytest-timeout), and the
+# `mcp` package is only required for users who actually run the MCP server.
+# Locally with `pip install mcp pytest-asyncio` these tests run; on CI they
+# skip cleanly without breaking the matrix.
+pytest.importorskip("mcp", reason="mcp package not installed (optional MCP server dep)")
+
+# pytest-asyncio is also optional but always installed alongside mcp tests
+# in our local runs. If absent, importorskip the asyncio plugin gracefully.
+pytest.importorskip("pytest_asyncio", reason="pytest-asyncio required for MCP server tests")
+
+pytestmark = pytest.mark.asyncio
+
+# ── Ensure repo root on path ──────────────────────────────────────────────
+_REPO = Path(__file__).parent.parent.resolve()
+if str(_REPO) not in sys.path:
+    sys.path.insert(0, str(_REPO))
+
+
+# ═══════════════════════════════════════════════════════════════════════════
+#  State-restore bookkeeping
+# ═══════════════════════════════════════════════════════════════════════════
+#
+# These tests mutate module-level constants on api.config / mcp_server /
+# api.models (STATE_DIR, SESSION_DIR, PROJECTS_FILE, …) so the MCP server
+# reads from a tmpdir. Without restoration, downstream tests in the full
+# suite (test_pytest_state_isolation, test_provider_quota_status,
+# test_provider_management, etc.) read the now-deleted tmpdir from
+# api.config.STATE_DIR and fail.
+#
+# We snapshot the original values on first _reimport_mcp() call and restore
+# them in _cleanup_state_dir() so the post-test module state matches pre-test.
+
+_MISSING_ENV = object()
+_SAVED_CONSTANTS = {"captured": False}
+
+
+# ═══════════════════════════════════════════════════════════════════════════
+#  Helpers
+# ═══════════════════════════════════════════════════════════════════════════
+
+def _fresh_state_dir():
+    """Create a clean temp state dir and set HERMES_WEBUI_STATE_DIR."""
+    td = tempfile.mkdtemp()
+    state_dir = Path(td)
+    sessions_dir = state_dir / "sessions"
+    sessions_dir.mkdir(parents=True)
+    (state_dir / "projects.json").write_text("[]", encoding="utf-8")
+    (sessions_dir / "_index.json").write_text("[]", encoding="utf-8")
+    os.environ["HERMES_WEBUI_STATE_DIR"] = str(state_dir)
+    return state_dir
+
+
+
+def _cleanup_state_dir(state_dir: Path):
+    """Remove temp state dir, clear env var, and restore api.config/mcp_server
+    module constants to whatever they were before the fixture started.
+
+    Without restoring, subsequent tests (test_pytest_state_isolation,
+    test_provider_quota_status, test_provider_management, etc.) read the
+    fixture's tmpdir from `api.config.STATE_DIR` and fail because the path
+    no longer exists or doesn't match their pytest-managed state dir."""
+    import shutil
+    shutil.rmtree(state_dir, ignore_errors=True)
+    os.environ.pop("HERMES_WEBUI_STATE_DIR", None)
+
+    # Restore api.config / mcp_server / api.models module constants.
+    saved = _SAVED_CONSTANTS
+    if saved.get("captured"):
+        import api.config as _cfg
+        for attr, val in saved["api.config"].items():
+            setattr(_cfg, attr, val)
+        if "mcp_server" in sys.modules:
+            mcp_mod = sys.modules["mcp_server"]
+            for attr, val in saved["mcp_server"].items():
+                setattr(mcp_mod, attr, val)
+        if "api.models" in sys.modules:
+            models_mod = sys.modules["api.models"]
+            for attr, val in saved["api.models"].items():
+                setattr(models_mod, attr, val)
+        # Restore HERMES_BASE_HOME / HERMES_HOME if we changed them
+        for env_key, env_val in saved["env"].items():
+            if env_val is _MISSING_ENV:
+                os.environ.pop(env_key, None)
+            else:
+                os.environ[env_key] = env_val
+
+def _reimport_mcp():
+    """Re-point mcp_server's module-level STATE_DIR / SESSION_DIR /
+    SESSION_INDEX_FILE / PROJECTS_FILE constants at the current
+    HERMES_WEBUI_STATE_DIR.
+
+    Returns (mcp_module, profiles_module) — profiles_module is the
+    live api.profiles reference.
+
+    NOTE: Does NOT use `del sys.modules[...]` or `importlib.reload(...)`.
+    Both patterns trigger a chain re-import inside the FastMCP / pydantic
+    stack that corrupts pydantic's `_generics._GENERIC_TYPES_CACHE`
+    (manifests as `KeyError: 'pydantic.root_model'` in unrelated
+    downstream tests in the full suite). Instead, we mutate the
+    constants in-place after the first one-time import, which is
+    behaviorally equivalent for these tests since the constants are
+    module-level Path objects used only to compute STATE_DIR-rooted
+    paths at call time.
+
+    Also normalizes HERMES_BASE_HOME / HERMES_HOME to point at a
+    directory whose `profiles/` subdirectory we control. This isolates
+    us from sibling test files (e.g. test_profile_path_security.py)
+    that mutate those env vars during their own setup and don't restore
+    them in the strict sense the active-profile path resolution needs.
+    """
+    state_dir = Path(os.environ['HERMES_WEBUI_STATE_DIR'])
+
+    # Sibling test files (e.g. test_profile_path_security.py) mutate
+    # HERMES_BASE_HOME / HERMES_HOME but only restore sys.modules — the
+    # env vars stay pointing at their tmpdir, which then breaks our
+    # active-profile path resolution. Re-anchor at a local home dir
+    # under our state_dir so other-profile scoping works.
+    isolated_home = state_dir.parent / "hermes-home"
+    (isolated_home / "profiles").mkdir(parents=True, exist_ok=True)
+
+    # Snapshot env vars BEFORE we overwrite them, so _cleanup_state_dir
+    # can restore them at fixture exit.
+    if not _SAVED_CONSTANTS.get("captured"):
+        _SAVED_CONSTANTS["env"] = {
+            "HERMES_BASE_HOME": os.environ.get("HERMES_BASE_HOME", _MISSING_ENV),
+            "HERMES_HOME": os.environ.get("HERMES_HOME", _MISSING_ENV),
+        }
+
+    os.environ["HERMES_BASE_HOME"] = str(isolated_home)
+    os.environ["HERMES_HOME"] = str(isolated_home)
+
+    import api.config as cfg
+    import mcp_server as mod
+
+    # First-time snapshot of module constants — captured AFTER the imports
+    # land their original values but BEFORE we mutate them below.
+    if not _SAVED_CONSTANTS.get("captured"):
+        _SAVED_CONSTANTS["api.config"] = {
+            attr: getattr(cfg, attr)
+            for attr in ("STATE_DIR", "SESSION_DIR", "WORKSPACES_FILE",
+                         "SETTINGS_FILE", "LAST_WORKSPACE_FILE", "PROJECTS_FILE",
+                         "SESSION_INDEX_FILE")
+            if hasattr(cfg, attr)
+        }
+        _SAVED_CONSTANTS["mcp_server"] = {
+            attr: getattr(mod, attr)
+            for attr in ("STATE_DIR", "SESSION_DIR", "PROJECTS_FILE",
+                         "SESSION_INDEX_FILE", "WEBUI_HOST", "WEBUI_PORT",
+                         "WEBUI_URL")
+            if hasattr(mod, attr)
+        }
+        if "api.models" in sys.modules:
+            models_mod = sys.modules["api.models"]
+            _SAVED_CONSTANTS["api.models"] = {
+                attr: getattr(models_mod, attr)
+                for attr in ("STATE_DIR", "PROJECTS_FILE", "SESSION_DIR")
+                if hasattr(models_mod, attr)
+            }
+        else:
+            _SAVED_CONSTANTS["api.models"] = {}
+        _SAVED_CONSTANTS["captured"] = True
+
+    # Acquire the api.profiles module THAT mcp_server's bound functions read.
+    # Sibling tests (test_profile_path_security.py) deletes api.profiles from
+    # sys.modules during their setup, then restores the originally-saved
+    # module reference. The result is that `import api.profiles` returns
+    # whatever module is currently in sys.modules, which may NOT be the same
+    # object as `mcp_server.get_active_profile_name`'s closure reference.
+    # We need to mutate the closure-bound module so mcp_server sees our
+    # _active_profile assignment.
+    import api.profiles as fresh_profiles_via_import
+    # mcp_server.get_active_profile_name is bound at first-import time and
+    # reads `_active_profile` from its own module's globals via closure.
+    # That module is the function's __globals__["__name__"] entry in
+    # sys.modules at first-import time. The most reliable way to find it
+    # is to follow the bound function back to its module.
+    bound_get_active = mod.get_active_profile_name
+    bound_module_name = bound_get_active.__module__
+    # Grab whatever Python currently has registered for that name; it may
+    # or may not be the same object as fresh_profiles_via_import.
+    # Use the function's __globals__ directly — that's the actual closure
+    # the function uses for its module-level reads.
+    bound_globals = bound_get_active.__globals__
+    # bound_globals IS the dict from sys.modules[<api.profiles>].__dict__ at
+    # first-import time. Mutating it propagates to all bound functions.
+    fresh_profiles = sys.modules.get(bound_module_name)
+    if fresh_profiles is None or fresh_profiles.__dict__ is not bound_globals:
+        # Sibling tests left a different module in sys.modules. The bound
+        # functions still use the original globals dict, so we expose a
+        # ModuleType-like proxy that writes to the original dict.
+        class _ProxyModule:
+            def __init__(self, globs):
+                self.__dict__ = globs
+        fresh_profiles = _ProxyModule(bound_globals)
+
+    # Re-point api.config module-level constants
+    cfg.STATE_DIR = state_dir
+    cfg.SESSION_DIR = state_dir / "sessions"
+    cfg.WORKSPACES_FILE = state_dir / "workspaces.json"
+    cfg.SETTINGS_FILE = state_dir / "settings.json"
+    cfg.LAST_WORKSPACE_FILE = state_dir / "last_workspace.txt"
+    cfg.PROJECTS_FILE = state_dir / "projects.json"
+    if hasattr(cfg, 'SESSION_INDEX_FILE'):
+        cfg.SESSION_INDEX_FILE = state_dir / "sessions" / "_index.json"
+
+    # Re-point mcp_server's imported aliases (they were copied at first
+    # import and don't pick up cfg mutations automatically).
+    mod.STATE_DIR = cfg.STATE_DIR
+    mod.SESSION_DIR = cfg.SESSION_DIR
+    mod.PROJECTS_FILE = cfg.PROJECTS_FILE
+    if hasattr(mod, 'SESSION_INDEX_FILE'):
+        mod.SESSION_INDEX_FILE = cfg.SESSION_INDEX_FILE
+
+    # api.models also imports STATE_DIR / PROJECTS_FILE etc. as module
+    # constants — re-point those too so load_projects() / save_projects()
+    # see the fresh STATE_DIR.
+    if 'api.models' in sys.modules:
+        models_mod = sys.modules['api.models']
+        if hasattr(models_mod, 'STATE_DIR'):
+            models_mod.STATE_DIR = cfg.STATE_DIR
+        if hasattr(models_mod, 'PROJECTS_FILE'):
+            models_mod.PROJECTS_FILE = cfg.PROJECTS_FILE
+        if hasattr(models_mod, 'SESSION_DIR'):
+            models_mod.SESSION_DIR = cfg.SESSION_DIR
+
+    # Re-evaluate WEBUI_URL from current env (PR #1895 made it env-aware
+    # but the value is computed once at module load; tests need to see
+    # current env state).
+    mod.WEBUI_HOST = os.environ.get("HERMES_WEBUI_HOST", "127.0.0.1")
+    mod.WEBUI_PORT = os.environ.get("HERMES_WEBUI_PORT", "8787")
+    mod.WEBUI_URL = f"http://{mod.WEBUI_HOST}:{mod.WEBUI_PORT}"
+
+    fresh_profiles._active_profile = 'default'
+
+    # Invalidate the root-profile cache (set at module load to detect
+    # renamed-root profiles, but stale after sibling tests that called
+    # switch_profile / list_profiles_api in their own setup).
+    if hasattr(fresh_profiles, '_invalidate_root_profile_cache'):
+        fresh_profiles._invalidate_root_profile_cache()
+    elif hasattr(fresh_profiles, '_root_profile_name_cache'):
+        fresh_profiles._root_profile_name_cache.clear()
+        fresh_profiles._root_profile_name_cache.add('default')
+        if hasattr(fresh_profiles, '_root_profile_name_cache_loaded'):
+            fresh_profiles._root_profile_name_cache_loaded = False
+    return mod, fresh_profiles
+
+
+async def _call(mod, tool_name, **kwargs):
+    """Call a tool handler and return parsed JSON."""
+    handler = mod.HANDLERS[tool_name]
+    result = await handler(kwargs)
+    return json.loads(result[0].text)
+
+
+# ═══════════════════════════════════════════════════════════════════════════
+#  Project CRUD
+# ═══════════════════════════════════════════════════════════════════════════
+
+class TestCreateProject:
+    @pytest.fixture(autouse=True)
+    def setup(self):
+        self.state_dir = _fresh_state_dir()
+        self.mod, self.profiles = _reimport_mcp()
+        yield
+        _cleanup_state_dir(self.state_dir)
+
+    async def test_create_basic(self):
+        result = await _call(self.mod, "create_project", name="Test Project")
+        assert "project_id" in result
+        assert result["name"] == "Test Project"
+        assert result["profile"] == "default"
+        assert result["session_count"] == 0
+
+    async def test_create_with_color(self):
+        result = await _call(self.mod, "create_project",
+                             name="Colored", color="#ff6600")
+        assert result["color"] == "#ff6600"
+
+    async def test_create_duplicate_exact_match(self):
+        await _call(self.mod, "create_project", name="My Project")
+        result = await _call(self.mod, "create_project", name="My Project")
+        assert "error" in result
+        assert "already exists" in result["error"]
+
+    async def test_create_case_sensitive_no_collision(self):
+        """Exact match: 'MY project' and 'My Project' are different."""
+        await _call(self.mod, "create_project", name="My Project")
+        result = await _call(self.mod, "create_project", name="MY project")
+        assert "project_id" in result
+
+    async def test_create_empty_name(self):
+        result = await _call(self.mod, "create_project", name="")
+        assert "error" in result
+
+    async def test_create_invalid_color(self):
+        result = await _call(self.mod, "create_project",
+                             name="Bad", color="not-a-color")
+        assert "error" in result
+        assert "Invalid color" in result["error"]
+
+    async def test_create_valid_color_formats(self):
+        for color in ["#fff", "#ff6600", "#ff6600aa"]:
+            result = await _call(self.mod, "create_project",
+                                 name=f"Color-{color}", color=color)
+            assert result["color"] == color
+
+
+class TestRenameProject:
+    @pytest.fixture(autouse=True)
+    def setup(self):
+        self.state_dir = _fresh_state_dir()
+        self.mod, self.profiles = _reimport_mcp()
+        yield
+        _cleanup_state_dir(self.state_dir)
+
+    async def test_rename_basic(self):
+        created = await _call(self.mod, "create_project", name="Old")
+        pid = created["project_id"]
+        result = await _call(self.mod, "rename_project",
+                             project_id=pid, name="New")
+        assert result["name"] == "New"
+        assert result["project_id"] == pid
+
+    async def test_rename_with_color(self):
+        created = await _call(self.mod, "create_project", name="X")
+        result = await _call(self.mod, "rename_project",
+                             project_id=created["project_id"],
+                             name="X", color="#000")
+        assert result["color"] == "#000"
+
+    async def test_rename_not_found(self):
+        result = await _call(self.mod, "rename_project",
+                             project_id="nonexistent", name="Nope")
+        assert "error" in result
+
+    async def test_rename_wrong_profile(self):
+        created = await _call(self.mod, "create_project", name="DefaultOwned")
+        pid = created["project_id"]
+        self.profiles._active_profile = 'other'
+        result = await _call(self.mod, "rename_project",
+                             project_id=pid, name="Stolen")
+        assert "error" in result
+        assert "not found" in result["error"].lower()
+
+
+class TestDeleteProject:
+    @pytest.fixture(autouse=True)
+    def setup(self):
+        self.state_dir = _fresh_state_dir()
+        self.mod, self.profiles = _reimport_mcp()
+        yield
+        _cleanup_state_dir(self.state_dir)
+
+    async def test_delete_basic(self):
+        created = await _call(self.mod, "create_project", name="ToDelete")
+        pid = created["project_id"]
+        result = await _call(self.mod, "delete_project", project_id=pid)
+        assert result["ok"] is True
+        assert result["deleted"] == "ToDelete"
+
+    async def test_delete_not_found(self):
+        result = await _call(self.mod, "delete_project",
+                             project_id="nonexistent")
+        assert "error" in result
+
+    async def test_delete_wrong_profile(self):
+        created = await _call(self.mod, "create_project", name="Owned")
+        pid = created["project_id"]
+        self.profiles._active_profile = 'other'
+        result = await _call(self.mod, "delete_project", project_id=pid)
+        assert "error" in result
+
+    async def test_delete_no_auth_refuses_unassign(self):
+        """Without HERMES_WEBUI_PASSWORD, delete_project must NOT touch
+        session JSONs. Direct FS writes would bypass _write_session_index()
+        and leave _index.json holding the stale project_id, causing a
+        running WebUI to keep grouping sessions under the deleted project.
+
+        The handler should: delete the project from projects.json, leave
+        every session JSON untouched, leave the index untouched, and
+        surface a `warning` field telling the operator to set the env var.
+        """
+        from api.config import SESSION_DIR, SESSION_INDEX_FILE
+        os.environ.pop("HERMES_WEBUI_PASSWORD", None)
+
+        # Create project + a session JSON that points at it
+        created = await _call(self.mod, "create_project", name="ToDelete")
+        pid = created["project_id"]
+        sid = "test_sess_001"
+        session_path = SESSION_DIR / f"{sid}.json"
+        session_payload = {
+            "session_id": sid,
+            "title": "T",
+            "project_id": pid,
+            "messages": [],
+        }
+        session_path.write_text(json.dumps(session_payload), encoding="utf-8")
+        # Index references the session under the project
+        SESSION_INDEX_FILE.write_text(
+            json.dumps([{"session_id": sid, "project_id": pid, "title": "T"}]),
+            encoding="utf-8")
+        index_before = SESSION_INDEX_FILE.read_text(encoding="utf-8")
+        session_before = session_path.read_text(encoding="utf-8")
+
+        result = await _call(self.mod, "delete_project", project_id=pid)
+
+        assert result["ok"] is True
+        assert result["unassigned_sessions"] == 0
+        assert "warning" in result
+        assert "HERMES_WEBUI_PASSWORD" in result["warning"]
+        # Session JSON untouched
+        assert session_path.read_text(encoding="utf-8") == session_before
+        # Index untouched
+        assert SESSION_INDEX_FILE.read_text(encoding="utf-8") == index_before
+
+
+# ═══════════════════════════════════════════════════════════════════════════
+#  Profile Scoping
+# ═══════════════════════════════════════════════════════════════════════════
+
+class TestProfileScoping:
+    @pytest.fixture(autouse=True)
+    def setup(self):
+        self.state_dir = _fresh_state_dir()
+        self.mod, self.profiles = _reimport_mcp()
+        yield
+        _cleanup_state_dir(self.state_dir)
+
+    async def test_projects_tagged_with_profile(self):
+        result = await _call(self.mod, "create_project", name="Tagged")
+        assert result["profile"] == "default"
+
+    async def test_list_projects_respects_profile(self):
+        # Create under default
+        await _call(self.mod, "create_project", name="DefaultProject")
+
+        # Switch to other
+        self.profiles._active_profile = 'other'
+        await _call(self.mod, "create_project", name="OtherProject")
+
+        # List should only show current profile's projects
+        projects = await _call(self.mod, "list_projects")
+        names = [p["name"] for p in projects]
+        assert "OtherProject" in names
+        assert "DefaultProject" not in names
+
+        # Switch back
+        self.profiles._active_profile = 'default'
+        projects = await _call(self.mod, "list_projects")
+        names = [p["name"] for p in projects]
+        assert "DefaultProject" in names
+        assert "OtherProject" not in names
+
+    async def test_cross_profile_isolation_create(self):
+        """Same name in different profiles should be allowed."""
+        await _call(self.mod, "create_project", name="Shared")
+        self.profiles._active_profile = 'other'
+        result = await _call(self.mod, "create_project", name="Shared")
+        assert "project_id" in result
+
+    async def test_legacy_untagged_hidden_from_non_root_profile(self):
+        """Untagged projects (no `profile` field) belong to the root profile.
+
+        Mirrors api/routes.py:_profiles_match where a missing profile coerces
+        to 'default'. A non-root profile must NOT see legacy untagged rows.
+        """
+        # Manually write a legacy untagged project (pre-#1614 schema)
+        import api.config as _cfg_mod
+        PROJECTS_FILE = _cfg_mod.PROJECTS_FILE
+        legacy = [{
+            "project_id": "legacy000001",
+            "name": "LegacyUntagged",
+            "color": None,
+            "created_at": 1700000000.0,
+            # No "profile" field on purpose
+        }]
+        PROJECTS_FILE.write_text(json.dumps(legacy), encoding="utf-8")
+
+        # Non-root profile must NOT see it
+        self.profiles._active_profile = 'other'
+        projects = await _call(self.mod, "list_projects")
+        names = [p["name"] for p in projects]
+        assert "LegacyUntagged" not in names
+
+        # Root profile still sees it (load_projects backfills `profile`
+        # to 'default', so visibility is preserved for the root).
+        self.profiles._active_profile = 'default'
+        projects = await _call(self.mod, "list_projects")
+        names = [p["name"] for p in projects]
+        assert "LegacyUntagged" in names
+
+    async def test_legacy_untagged_rename_blocked_from_non_root(self):
+        """Non-root profile cannot rename a legacy untagged project."""
+        import api.config as _cfg_mod
+        PROJECTS_FILE = _cfg_mod.PROJECTS_FILE
+        legacy = [{
+            "project_id": "legacy000002",
+            "name": "Legacy",
+            "color": None,
+            "created_at": 1700000000.0,
+        }]
+        PROJECTS_FILE.write_text(json.dumps(legacy), encoding="utf-8")
+        self.profiles._active_profile = 'other'
+        result = await _call(self.mod, "rename_project",
+                             project_id="legacy000002", name="Stolen")
+        assert "error" in result
+
+
+# ═══════════════════════════════════════════════════════════════════════════
+#  Session listing
+# ═══════════════════════════════════════════════════════════════════════════
+
+class TestListSessions:
+    @pytest.fixture(autouse=True)
+    def setup(self):
+        self.state_dir = _fresh_state_dir()
+        self.mod, self.profiles = _reimport_mcp()
+        yield
+        _cleanup_state_dir(self.state_dir)
+
+    async def test_list_empty(self):
+        result = await _call(self.mod, "list_sessions")
+        assert result == []
+
+    async def test_list_with_limit(self):
+        result = await _call(self.mod, "list_sessions", limit=10)
+        assert isinstance(result, list)
+
+    async def test_list_unassigned(self):
+        result = await _call(self.mod, "list_sessions", unassigned=True)
+        assert isinstance(result, list)
+
+
+# ═══════════════════════════════════════════════════════════════════════════
+#  Session mutations (HTTP API — basic validation only)
+# ═══════════════════════════════════════════════════════════════════════════
+
+class TestSessionMutations:
+    @pytest.fixture(autouse=True)
+    def setup(self):
+        self.state_dir = _fresh_state_dir()
+        self.mod, self.profiles = _reimport_mcp()
+        yield
+        _cleanup_state_dir(self.state_dir)
+
+    async def test_rename_missing_args(self):
+        result = await _call(self.mod, "rename_session",
+                             session_id="", title="")
+        assert "error" in result
+
+    async def test_move_missing_args(self):
+        result = await _call(self.mod, "move_session",
+                             session_id="", project_id="x")
+        assert "error" in result
+
+    async def test_move_project_not_found(self):
+        result = await _call(self.mod, "move_session",
+                             session_id="s1", project_id="nonexistent")
+        assert "error" in result
+
+    async def test_move_target_owned_by_other_profile_rejected(self):
+        """A project owned by profile A is invisible to profile B (#1614)."""
+        created = await _call(self.mod, "create_project", name="ATarget")
+        pid = created["project_id"]
+        self.profiles._active_profile = 'other'
+        result = await _call(self.mod, "move_session",
+                             session_id="any", project_id=pid)
+        assert "error" in result
+        assert "not found" in result["error"].lower()
+
+
+# ═══════════════════════════════════════════════════════════════════════════
+#  Auth helper
+# ═══════════════════════════════════════════════════════════════════════════
+
+class TestApiPassword:
+    @pytest.fixture(autouse=True)
+    def setup(self):
+        self.state_dir = _fresh_state_dir()
+        # Ensure env var is unset for the test
+        os.environ.pop("HERMES_WEBUI_PASSWORD", None)
+        self.mod, self.profiles = _reimport_mcp()
+        yield
+        _cleanup_state_dir(self.state_dir)
+
+    async def test_no_env_no_settings_returns_none(self):
+        assert self.mod._api_password() is None
+
+    async def test_password_hash_in_settings_is_ignored(self):
+        """settings.json holds a hash, not a plaintext password — must NOT
+        be returned as if it were a usable password."""
+        from api.config import STATE_DIR as _SD
+        (_SD / "settings.json").write_text(
+            json.dumps({"password_hash": "$2b$12$abcdefghijk"}),
+            encoding="utf-8")
+        assert self.mod._api_password() is None
+
+    async def test_env_var_returned(self):
+        os.environ["HERMES_WEBUI_PASSWORD"] = "secret123"
+        try:
+            assert self.mod._api_password() == "secret123"
+        finally:
+            os.environ.pop("HERMES_WEBUI_PASSWORD", None)
+
+
+# ═══════════════════════════════════════════════════════════════════════════
+#  _profiles_match parity (mcp_server vs api.routes vs api.profiles)
+# ═══════════════════════════════════════════════════════════════════════════
+#
+# Locks the canonical-helper relocation: mcp_server.py and api/routes.py both
+# now import _profiles_match from api/profiles.py. If anyone re-introduces a
+# local copy in either module, both the identity check and the input-matrix
+# parametrize trip immediately.
+
+async def test_profiles_match_single_source_of_truth():
+    """All three module names resolve to the same canonical object.
+
+    This locks the relocation: mcp_server.py and api/routes.py both import
+    _profiles_match from api/profiles.py rather than carrying a local copy.
+    Re-introducing a local definition in either module trips this test
+    immediately.
+
+    Imported here in a clean module-import context (not via _reimport_mcp,
+    which would re-execute api/profiles.py and produce a distinct function
+    object that's behaviorally identical but fails the `is` check).
+
+    NOTE: We swap-in fresh modules but RESTORE the originals at exit so
+    sibling test files (test_provider_quota_status etc.) that imported
+    api.profiles at module-load time continue to see the same object
+    they already have monkeypatch handles into. Otherwise their
+    `monkeypatch.setattr(profiles, ...)` patches the wrong module object.
+    """
+    # Snapshot the originals; we'll put them back at the end.
+    saved_modules = {
+        k: sys.modules[k]
+        for k in ('mcp_server', 'api.routes', 'api.profiles')
+        if k in sys.modules
+    }
+    # Also snapshot the attributes on the parent `api` package, because
+    # `import api.routes as r` resolves via `sys.modules['api'].routes`,
+    # NOT directly via sys.modules['api.routes']. If we don't restore
+    # the parent attribute, subsequent `import api.routes as r` calls
+    # bind to the fresh re-imported module even though sys.modules
+    # holds the original.
+    import api as _api_parent
+    saved_api_attrs = {}
+    for sub in ('routes', 'profiles'):
+        if hasattr(_api_parent, sub):
+            saved_api_attrs[sub] = getattr(_api_parent, sub)
+
+    for k in ('mcp_server', 'api.routes', 'api.profiles'):
+        sys.modules.pop(k, None)
+    try:
+        import api.profiles as _profiles_mod
+        import api.routes as _routes_mod
+        import mcp_server as _mcp_mod
+        canonical = _profiles_mod._profiles_match
+        assert _routes_mod._profiles_match is canonical
+        assert _mcp_mod._profiles_match is canonical
+    finally:
+        # Restore so monkeypatch handles in sibling tests target the right module.
+        for k in ('mcp_server', 'api.routes', 'api.profiles'):
+            sys.modules.pop(k, None)
+        sys.modules.update(saved_modules)
+        # Restore parent-package attributes too (see above for why).
+        for sub, mod_obj in saved_api_attrs.items():
+            setattr(_api_parent, sub, mod_obj)
+
+
+@pytest.mark.parametrize("a, b", [
+    (None, None),
+    (None, ''),
+    ('', None),
+    ('', ''),
+    (None, 'default'),
+    ('default', None),
+    ('default', 'default'),
+    ('foo', 'foo'),
+    ('foo', 'bar'),
+    ('foo', None),
+    (None, 'foo'),
+    ('default', 'foo'),
+    ('foo', 'default'),
+])
+async def test_profiles_match_input_matrix(a, b):
+    """mcp_server._profiles_match agrees with api.routes._profiles_match
+    on every (row, active) pair across the visibility matrix.
+
+    Note: function-object identity is checked separately in
+    test_profiles_match_single_source_of_truth — here we only assert
+    behavioral parity, which is robust to test-fixture re-imports that
+    clear and re-execute api.profiles."""
+    from mcp_server import _profiles_match as mcp_match
+    from api.routes import _profiles_match as routes_match
+    assert mcp_match(a, b) == routes_match(a, b)
+
+
+# ═══════════════════════════════════════════════════════════════════════════
+#  --profile CLI ordering regression
+# ═══════════════════════════════════════════════════════════════════════════
+#
+# Maintainer ask: verify that --profile is applied to _active_profile *before*
+# any api.models / api.profiles consumer reads the active profile. The risk
+# is that if the canonical helpers cached the profile on first read at import
+# time, a --profile foo flag passed at startup would bind too late.
+#
+# Today the helpers read _active_profile lazily (api/profiles.py:173 reads
+# the module global at every call) so the override is safe. This test locks
+# the behaviour: setting _active_profile = 'foo' before the first list call
+# produces results filtered to 'foo', not the default.
+
+class TestProfileCliOrdering:
+    @pytest.fixture(autouse=True)
+    def setup(self):
+        self.state_dir = _fresh_state_dir()
+        self.mod, self.profiles = _reimport_mcp()
+        yield
+        _cleanup_state_dir(self.state_dir)
+
+    async def test_active_profile_override_takes_effect_before_first_read(self):
+        """--profile foo must filter list_projects to foo's rows immediately.
+
+        Simulates the CLI override path (mcp_server.py:62-64 sets
+        _profiles._active_profile = _profile_arg right after import). If a
+        helper had latched the profile at import time, the override here
+        would be too late and the test would see 'default'-tagged rows."""
+        import api.config as _cfg_mod
+        PROJECTS_FILE = _cfg_mod.PROJECTS_FILE
+        # Pre-seed two projects: one for default, one for foo.
+        seeded = [
+            {"project_id": "p_default_0001", "name": "DefaultRow",
+             "color": None, "profile": "default", "created_at": 1.0},
+            {"project_id": "p_foo_0001", "name": "FooRow",
+             "color": None, "profile": "foo", "created_at": 2.0},
+        ]
+        PROJECTS_FILE.write_text(json.dumps(seeded), encoding="utf-8")
+
+        # Apply the override BEFORE the first list call. This is what
+        # mcp_server.py:62-64 does after argparse.
+        self.profiles._active_profile = 'foo'
+
+        projects = await _call(self.mod, "list_projects")
+        names = [p["name"] for p in projects]
+        assert "FooRow" in names
+        assert "DefaultRow" not in names
+
+
+# ═══════════════════════════════════════════════════════════════════════════
+#  HTTP wire-format coverage for rename_session / move_session
+# ═══════════════════════════════════════════════════════════════════════════
+#
+# Maintainer ask: exercise the actual HTTP path so a typo in WEBUI_URL or in
+# the request body shape can't slip through validation-only tests. We stand
+# up a tiny http.server stub on a free localhost port, point WEBUI_URL at it,
+# and capture (path, body) from the requests our handlers issue. This is
+# the thing that would have caught the original 8788 vs 8787 mismatch.
+
+import http.server
+import socket
+import threading
+
+
+class _RecordingHandler(http.server.BaseHTTPRequestHandler):
+    """Captures POST path + body, returns canned JSON. Class-level state is
+    set by the fixture before each test so handlers can cross-reference."""
+    captured = None  # populated per-test as a list of (path, body, headers)
+    canned_response = None  # populated per-test: dict to be JSON-encoded
+
+    def log_message(self, *args, **kwargs):  # noqa: D401 — silence stderr
+        pass
+
+    def do_POST(self):
+        length = int(self.headers.get("Content-Length", "0"))
+        raw = self.rfile.read(length) if length else b""
+        try:
+            body = json.loads(raw.decode("utf-8")) if raw else {}
+        except Exception:
+            body = {"_raw": raw.decode("utf-8", errors="replace")}
+        type(self).captured.append({
+            "path": self.path,
+            "body": body,
+            "cookie": self.headers.get("Cookie"),
+            "content_type": self.headers.get("Content-Type"),
+        })
+        payload = json.dumps(type(self).canned_response or {}).encode("utf-8")
+        self.send_response(200)
+        self.send_header("Content-Type", "application/json")
+        self.send_header("Content-Length", str(len(payload)))
+        self.end_headers()
+        self.wfile.write(payload)
+
+
+def _free_port() -> int:
+    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
+    s.bind(("127.0.0.1", 0))
+    port = s.getsockname()[1]
+    s.close()
+    return port
+
+
+class TestApiWireFormat:
+    @pytest.fixture(autouse=True)
+    def setup(self):
+        self.state_dir = _fresh_state_dir()
+        # Stand up a recording HTTP server on a free port. We override
+        # WEBUI_URL on the imported mcp_server module to point at it.
+        self.port = _free_port()
+        _RecordingHandler.captured = []
+        _RecordingHandler.canned_response = {}
+        self.httpd = http.server.HTTPServer(("127.0.0.1", self.port),
+                                            _RecordingHandler)
+        self.thread = threading.Thread(target=self.httpd.serve_forever,
+                                       daemon=True)
+        self.thread.start()
+
+        # Disable auth so _api_post() does not attempt a real /api/auth/login.
+        os.environ.pop("HERMES_WEBUI_PASSWORD", None)
+
+        self.mod, self.profiles = _reimport_mcp()
+        # Override AFTER import so the value sticks in the loaded module.
+        self.mod.WEBUI_URL = f"http://127.0.0.1:{self.port}"
+        yield
+        self.httpd.shutdown()
+        self.httpd.server_close()
+        self.thread.join(timeout=2)
+        _cleanup_state_dir(self.state_dir)
+
+    async def test_rename_session_posts_to_canonical_path(self):
+        """rename_session must POST {session_id, title} to /api/session/rename."""
+        _RecordingHandler.canned_response = {
+            "session": {"session_id": "abc123", "title": "Renamed"}
+        }
+        result = await _call(self.mod, "rename_session",
+                             session_id="abc123", title="Renamed")
+        assert len(_RecordingHandler.captured) == 1
+        req = _RecordingHandler.captured[0]
+        assert req["path"] == "/api/session/rename"
+        assert req["body"] == {"session_id": "abc123", "title": "Renamed"}
+        assert req["content_type"] == "application/json"
+        # Handler returns success-shaped result on 200.
+        assert result["ok"] is True
+        assert result["session_id"] == "abc123"
+        assert result["title"] == "Renamed"
+        assert result["method"] == "api"
+
+    async def test_move_session_posts_to_canonical_path(self):
+        """move_session (with a project_id) POSTs to /api/session/move
+        after confirming the project exists locally."""
+        # Need a real project so the pre-flight profile check passes.
+        created = await _call(self.mod, "create_project", name="MoveTarget")
+        pid = created["project_id"]
+        _RecordingHandler.canned_response = {
+            "ok": True,
+            "session": {"session_id": "s1", "title": "T", "project_id": pid}
+        }
+        result = await _call(self.mod, "move_session",
+                             session_id="s1", project_id=pid)
+        assert len(_RecordingHandler.captured) == 1
+        req = _RecordingHandler.captured[0]
+        assert req["path"] == "/api/session/move"
+        assert req["body"] == {"session_id": "s1", "project_id": pid}
+        assert result["ok"] is True
+        assert result["session_id"] == "s1"
+        assert result["project_id"] == pid
+        assert result["method"] == "api"
+
+    async def test_move_session_unassign_sends_null_project_id(self):
+        """Passing project_id=None must serialize as JSON null (not omitted)."""
+        _RecordingHandler.canned_response = {
+            "ok": True, "session": {"session_id": "s1", "project_id": None}
+        }
+        result = await _call(self.mod, "move_session",
+                             session_id="s1", project_id=None)
+        assert len(_RecordingHandler.captured) == 1
+        req = _RecordingHandler.captured[0]
+        assert req["path"] == "/api/session/move"
+        assert req["body"] == {"session_id": "s1", "project_id": None}
+        assert result["ok"] is True
+
+    async def test_url_built_from_env_vars(self):
+        """HERMES_WEBUI_HOST / HERMES_WEBUI_PORT govern WEBUI_URL.
+
+        Locks the maintainer-suggested env-var contract from #1895 review:
+        the MCP must track the same env vars api/config.py:32-33 reads, so
+        a non-default WebUI port (e.g. 8788 when 8787 is held by another
+        service on the host) does not require a code edit."""
+        os.environ["HERMES_WEBUI_HOST"] = "10.0.0.42"
+        os.environ["HERMES_WEBUI_PORT"] = "9999"
+        try:
+            mod, _ = _reimport_mcp()
+            assert mod.WEBUI_HOST == "10.0.0.42"
+            assert mod.WEBUI_PORT == "9999"
+            assert mod.WEBUI_URL == "http://10.0.0.42:9999"
+        finally:
+            os.environ.pop("HERMES_WEBUI_HOST", None)
+            os.environ.pop("HERMES_WEBUI_PORT", None)
+
+    async def test_url_default_when_env_unset(self):
+        """Default upstream port is 8787, matching api/config.py:33."""
+        os.environ.pop("HERMES_WEBUI_HOST", None)
+        os.environ.pop("HERMES_WEBUI_PORT", None)
+        mod, _ = _reimport_mcp()
+        assert mod.WEBUI_HOST == "127.0.0.1"
+        assert mod.WEBUI_PORT == "8787"
+        assert mod.WEBUI_URL == "http://127.0.0.1:8787"