Release v0.51.28 — Release E2 (MCP server Option A rewrite + WebUI /goal command)

Merge stage-323 to master.
This commit is contained in:
nesquena-hermes
2026-05-08 13:28:13 -07:00
committed by GitHub
19 changed files with 2747 additions and 110 deletions
+8 -1
View File
@@ -24,7 +24,14 @@ jobs:
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install pyyaml>=6.0 pytest pytest-timeout
pip install pyyaml>=6.0 pytest pytest-timeout pytest-asyncio
# Install the `mcp` package so tests/test_mcp_server.py runs in CI.
# The package is an optional runtime dep of mcp_server.py — users
# who run the MCP integration install it themselves; CI installs
# it so test coverage exists. If mcp install fails (Python 3.13
# wheel not yet available, etc.), tests/test_mcp_server.py uses
# importorskip and the matrix stays green.
pip install mcp || echo "mcp install failed — test_mcp_server.py will importorskip"
- name: Run tests
run: pytest tests/ -v --timeout=60
+33 -1
View File
@@ -1,6 +1,38 @@
# Hermes Web UI -- Changelog
## [v0.51.27] — 2026-05-08 — 4-PR contributor batch (Release E1: workspace-prefix sentinel hardening, custom named provider API key resolution, streaming chat scroll-pin retention, Kanban detail scrollable)
## [v0.51.28] — 2026-05-08 — 2-PR contributor batch (Release E2: MCP server Option A rewrite + WebUI /goal command)
### Added (2 PRs)
- **PR #1895** by @samuelgudi — MCP server Option A rewrite (#1616). Replaces the fragile MCP integration with a clean `mcp_server.py` (567 LOC) implementing project CRUD, session listing, and session mutations (rename/move) over Hermes's HTTP API. Imports `api.models` / `api.profiles` canonically rather than carrying duplicate slug-matching helpers. Relocates `_profiles_match` from `api/routes.py` into `api/profiles.py` as the single source of truth (mcp_server.py and api/routes.py both now import the canonical helper — re-introducing a local copy in either module trips a parity test immediately). Adds env-aware WEBUI_URL (`HERMES_WEBUI_HOST` / `HERMES_WEBUI_PORT`). New behaviour: `delete_project` REFUSES to touch session JSONs when `HERMES_WEBUI_PASSWORD` is unset, returning `{ok:true, unassigned_sessions:0, warning:"…"}` instead — preventing data-loss when an MCP client tries to delete a project on an unauthenticated server. 53-test coverage in `tests/test_mcp_server.py` (914 LOC) including HTTP wire-format integration tests, profile-scoped isolation, legacy untagged row visibility, and `--profile foo` CLI ordering regression. Closes #1616.
- **PR #1866** by @Michaelyklam — WebUI `/goal` command for goal-tracking with budget enforcement and continuation prompts. New `api/goals.py` (489 LOC) implements goal lifecycle (set / pause / resume / clear / status), per-profile SQLite `SessionDB` cache, and `evaluate_goal_after_turn()` SSE hook that emits `goal` and `goal_continue` events from `api/streaming.py` after assistant turns. Wire-up: `api/routes.py` adds `/api/goal` endpoint (POST set/pause/resume/clear, GET status) and `_start_chat_stream_for_session()` extraction so kickoff prompts can run through the canonical streaming path; `static/commands.js` adds `/goal` autocomplete (cmdGoal handler) with i18n description; `static/messages.js` handles new SSE event types with continuation-toast UI; `static/i18n.js` adds 9 new strings across all locales. 4 documentation screenshots added under `docs/pr-media/{1866,1808}/`. Closes #1808.
### Mid-stage absorbed fixes (test isolation, per blocker investigation)
- **#1857 polluter root-cause** — `tests/test_issue1857_usage_overwrite.py` was using `mock.patch.dict(sys.modules, {...})`, which DELETES any keys added during the patched scope on `__exit__`. That silently evicted lazily-imported pydantic submodules (e.g. `pydantic.root_model`), producing `KeyError: 'pydantic.root_model'` in `test_mcp_server.py` downstream when the full pytest suite ran. Fixed by replacing with manual save/restore using a `_MISSING` sentinel.
- **#1895 module-attribute restoration** — `tests/test_mcp_server.py` mutates module-level constants on `api.config`/`api.models`/`mcp_server` (`STATE_DIR`, `SESSION_DIR`, `PROJECTS_FILE`, …) so the MCP server reads from a tmpdir. Without restoration, downstream tests (`test_pytest_state_isolation`, `test_provider_quota_status`, `test_provider_management`) read deleted tmpdirs from `api.config.STATE_DIR`. Fixed by snapshotting originals on first `_reimport_mcp()` call and restoring in `_cleanup_state_dir()`.
- **#1895 `_profiles_match` parity test parent-attribute leak** — `test_profiles_match_single_source_of_truth` pops `api.routes`/`api.profiles` from `sys.modules` and re-imports for the canonical-helper identity check. When restoring `sys.modules` only, fresh modules still leaked through because `import api.routes as r` resolves via `sys.modules['api'].routes` (parent-package attribute), NOT via `sys.modules['api.routes']` directly. Fixed by ALSO restoring parent-package attributes — without this, sibling tests (`test_plugins_panel`, `test_pr1350_sse_notify_correctness`, `test_version_badge`) that patch `api.routes.j` and call handlers via `import api.routes as routes` would fail because the patch hits one module object and the handler reads from another.
### Tests
4898 → **4947 collected, 4947 passing, 0 regressions** (+49 net new). Full suite ~140s on Python 3.11 (HERMES_HOME isolated). JS syntax check (`node -c`) clean on `static/commands.js`, `static/i18n.js`, `static/messages.js`. Browser API sanity harness (port 8789): all 11 endpoints + 20 QA tests PASS. Opus advisor pass: SHIP-READY, no blockers (2 follow-up items filed: goal hook firing on unrelated turns; English-only runtime strings in goal UI).
### Pre-release verification
- Full pytest under `HERMES_HOME` isolation: **4947 passed, 8 skipped, 1 xfailed, 2 xpassed, 8 subtests passed** in 140.41s.
- Browser API harness against stage-323 on port 8789: all 11 endpoints + 20 QA tests PASS (110.66s for QA phase).
- `node -c` on all 3 modified `static/*.js` files: clean.
- Stage diff: 16 files, +2692/-105.
- Opus advisor pass on stage-323 brief: VERDICT=SHIP-READY. No coexistence bugs between #1895 and #1866 (disjoint hunks in routes.py, SSE event names align, `_profiles_match` resolution unambiguous either way, no path collisions).
- v0.51.27 fixes verified preserved: `_strip_workspace_prefix` (callers at routes.py:1446/1485), `on_interim_assistant` (streaming.py:2120), `_max_iterations_cfg` (streaming.py:2331-2410), `if input_tokens > 0:` guard (streaming.py:2933).
- Pre-stamp re-fetch of #1866 (sha f2aacf4) + #1895 (sha 766c91e): both MERGEABLE, no force-push during Opus window.
### Follow-up items (filed for next sweep)
- **Goal hook fires on unrelated turns** — while a goal is `active`, every completed assistant turn runs `evaluate_goal_after_turn` and ticks `state.turns_used += 1`, even on user messages unrelated to the goal. UX surprise but not bug-broken; consider gating on `user_initiated` or a goal-context flag.
- **English-only runtime strings in goal UI**`messages.js:889` ("Evaluating goal progress…"), `commands.js:651` ("Working toward goal…"), `messages.js:914` ("Continuing toward goal…" toast); also backend strings in `goals.py` (`status_line`, "⊙ Goal set …", "⏸ Goal paused …", "↻ Continuing …"). The `cmd_goal` autocomplete description IS localized across all 9 locales — only the runtime status strings are missed.
### Fixed (4 PRs)
+2 -2
View File
@@ -2,7 +2,7 @@
> Web companion to the Hermes Agent CLI. Same workflows, browser-native.
>
> Last updated: v0.51.27 (May 8, 2026) — 4898 tests collected — 4-PR Release E1 batch (workspace-prefix sentinel hardening, custom named provider API key resolution, streaming chat scroll-pin, Kanban detail scrollable)
> Last updated: v0.51.28 (May 8, 2026) — 4947 tests collected — 2-PR Release E2 batch (MCP server Option A rewrite + WebUI /goal command)
> Test source: `pytest tests/ --collect-only -q`
> Per-version detail: see [CHANGELOG.md](./CHANGELOG.md)
@@ -168,7 +168,7 @@ Remaining gaps and forward work live in [Forward Work](#forward-work) below.
### Slash commands
- [x] Command registry + autocomplete dropdown
- [x] Built-ins: `/help`, `/clear`, `/model`, `/workspace`, `/new`, `/usage`, `/theme`, `/compact`, `/queue`, `/interrupt`, `/steer`, `/btw`, `/reasoning`, `/skills`, `/toolsets`
- [x] Built-ins: `/help`, `/clear`, `/model`, `/workspace`, `/new`, `/usage`, `/theme`, `/compact`, `/queue`, `/interrupt`, `/steer`, `/goal`, `/btw`, `/reasoning`, `/skills`, `/toolsets`
- [x] Transparent pass-through for unrecognized commands
### Security
+2 -2
View File
@@ -1835,8 +1835,8 @@ Bridged CLI sessions:
---
*Last updated: v0.51.27, May 8, 2026*
*Total automated tests collected: 4898*
*Last updated: v0.51.28, May 8, 2026*
*Total automated tests collected: 4947*
*Regression gate: tests/test_regressions.py*
*Run: pytest tests/ -v --timeout=60*
*Source: <repo>/*
+489
View File
@@ -0,0 +1,489 @@
"""WebUI bridge for Hermes persistent session goals."""
from __future__ import annotations
import copy
import logging
import time
from pathlib import Path
from typing import Any, Dict, Optional
logger = logging.getLogger(__name__)
try: # Exposed as a module attribute so tests can monkeypatch it directly.
from hermes_cli.goals import ( # type: ignore
CONTINUATION_PROMPT_TEMPLATE,
DEFAULT_MAX_TURNS,
GoalManager as _NativeGoalManager,
GoalState,
judge_goal,
)
except Exception: # pragma: no cover - depends on installed hermes-agent
CONTINUATION_PROMPT_TEMPLATE = "" # type: ignore
DEFAULT_MAX_TURNS = 20 # type: ignore
_NativeGoalManager = None # type: ignore
GoalState = None # type: ignore
judge_goal = None # type: ignore
GoalManager = _NativeGoalManager # type: ignore
_DB_CACHE: dict[str, Any] = {}
def _default_max_turns() -> int:
"""Return the configured /goal turn budget, defaulting to Hermes' 20 turns."""
try:
from api import config as _config
cfg = getattr(_config, "cfg", {}) or {}
goals_cfg = cfg.get("goals", {}) if isinstance(cfg, dict) else {}
if not isinstance(goals_cfg, dict):
return int(DEFAULT_MAX_TURNS or 20)
return max(1, int(goals_cfg.get("max_turns", DEFAULT_MAX_TURNS or 20) or 20))
except Exception:
return int(DEFAULT_MAX_TURNS or 20)
def _meta_key(session_id: str) -> str:
return f"goal:{session_id}"
def _profile_db(profile_home: str | Path):
"""Return a SessionDB pinned to *profile_home*, without reading HERMES_HOME.
The upstream Hermes GoalManager persists through hermes_cli.goals.load_goal(),
which resolves SessionDB from process-global HERMES_HOME. WebUI sessions are
profile-scoped and can run concurrently, so the WebUI bridge uses an explicit
state.db path whenever the caller provides the session's profile home.
"""
home = Path(profile_home).expanduser().resolve()
key = str(home)
cached = _DB_CACHE.get(key)
if cached is not None:
return cached
try:
from hermes_state import SessionDB # type: ignore
db = SessionDB(db_path=home / "state.db")
except Exception as exc: # pragma: no cover - import/env dependent
logger.debug("GoalManager profile DB unavailable for %s: %s", home, exc)
return None
_DB_CACHE[key] = db
return db
class _ProfileGoalManager:
"""Small WebUI-local GoalManager adapter with explicit profile persistence."""
def __init__(self, session_id: str, *, profile_home: str | Path, default_max_turns: int = 20):
if GoalState is None:
raise RuntimeError("Hermes goal state unavailable")
self.session_id = session_id
self.profile_home = Path(profile_home).expanduser().resolve()
self.default_max_turns = int(default_max_turns or DEFAULT_MAX_TURNS or 20)
self._state = self._load()
@property
def state(self):
return self._state
def _load(self):
db = _profile_db(self.profile_home)
if db is None or not self.session_id:
return None
try:
raw = db.get_meta(_meta_key(self.session_id))
except Exception as exc:
logger.debug("GoalManager profile get_meta failed: %s", exc)
return None
if not raw:
return None
try:
return GoalState.from_json(raw) # type: ignore[union-attr]
except Exception as exc:
logger.warning("GoalManager profile state parse failed for %s: %s", self.session_id, exc)
return None
def _save(self, state) -> None:
db = _profile_db(self.profile_home)
if db is None or not self.session_id or state is None:
return
try:
db.set_meta(_meta_key(self.session_id), state.to_json())
except Exception as exc:
logger.debug("GoalManager profile set_meta failed: %s", exc)
def is_active(self) -> bool:
return self._state is not None and self._state.status == "active"
def has_goal(self) -> bool:
return self._state is not None and self._state.status in ("active", "paused")
def status_line(self) -> str:
s = self._state
if s is None or s.status in ("cleared",):
return "No active goal. Set one with /goal <text>."
turns = f"{s.turns_used}/{s.max_turns} turns"
if s.status == "active":
return f"⊙ Goal (active, {turns}): {s.goal}"
if s.status == "paused":
extra = f"{s.paused_reason}" if s.paused_reason else ""
return f"⏸ Goal (paused, {turns}{extra}): {s.goal}"
if s.status == "done":
return f"✓ Goal done ({turns}): {s.goal}"
return f"Goal ({s.status}, {turns}): {s.goal}"
def set(self, goal: str, *, max_turns: Optional[int] = None):
goal = (goal or "").strip()
if not goal:
raise ValueError("goal text is empty")
state = GoalState( # type: ignore[operator]
goal=goal,
status="active",
turns_used=0,
max_turns=int(max_turns) if max_turns else self.default_max_turns,
created_at=time.time(),
last_turn_at=0.0,
)
self._state = state
self._save(state)
return state
def pause(self, reason: str = "user-paused"):
if not self._state:
return None
self._state.status = "paused"
self._state.paused_reason = reason
self._save(self._state)
return self._state
def resume(self, *, reset_budget: bool = True):
if not self._state:
return None
self._state.status = "active"
self._state.paused_reason = None
if reset_budget:
self._state.turns_used = 0
self._save(self._state)
return self._state
def clear(self) -> None:
if self._state is None:
return
self._state.status = "cleared"
self._save(self._state)
self._state = None
def evaluate_after_turn(self, last_response: str, *, user_initiated: bool = True) -> Dict[str, Any]:
state = self._state
if state is None or state.status != "active":
return {
"status": state.status if state else None,
"should_continue": False,
"continuation_prompt": None,
"verdict": "inactive",
"reason": "no active goal",
"message": "",
}
state.turns_used += 1
state.last_turn_at = time.time()
if judge_goal is None:
verdict, reason = "continue", "goal judge unavailable"
else:
verdict, reason = judge_goal(state.goal, str(last_response or ""))
state.last_verdict = verdict
state.last_reason = reason
if verdict == "done":
state.status = "done"
self._save(state)
return {
"status": "done",
"should_continue": False,
"continuation_prompt": None,
"verdict": "done",
"reason": reason,
"message": f"✓ Goal achieved: {reason}",
}
if state.turns_used >= state.max_turns:
state.status = "paused"
state.paused_reason = f"turn budget exhausted ({state.turns_used}/{state.max_turns})"
self._save(state)
return {
"status": "paused",
"should_continue": False,
"continuation_prompt": None,
"verdict": "continue",
"reason": reason,
"message": (
f"⏸ Goal paused — {state.turns_used}/{state.max_turns} turns used. "
"Use /goal resume to keep going, or /goal clear to stop."
),
}
self._save(state)
return {
"status": "active",
"should_continue": True,
"continuation_prompt": self.next_continuation_prompt(),
"verdict": "continue",
"reason": reason,
"message": f"↻ Continuing toward goal ({state.turns_used}/{state.max_turns}): {reason}",
}
def next_continuation_prompt(self) -> Optional[str]:
if not self._state or self._state.status != "active":
return None
return CONTINUATION_PROMPT_TEMPLATE.format(goal=self._state.goal)
def _manager(session_id: str, *, profile_home: str | Path | None = None):
if GoalManager is None:
return None
if profile_home and GoalManager is _NativeGoalManager and GoalState is not None:
try:
return _ProfileGoalManager(
session_id=session_id,
profile_home=profile_home,
default_max_turns=_default_max_turns(),
)
except Exception as exc:
logger.debug("Profile-scoped GoalManager unavailable: %s", exc)
return None
return GoalManager(session_id=session_id, default_max_turns=_default_max_turns())
def _state_payload(state: Any) -> Optional[Dict[str, Any]]:
if state is None:
return None
return {
"goal": getattr(state, "goal", "") or "",
"status": getattr(state, "status", "") or "",
"turns_used": int(getattr(state, "turns_used", 0) or 0),
"max_turns": int(getattr(state, "max_turns", 0) or 0),
"last_verdict": getattr(state, "last_verdict", None),
"last_reason": getattr(state, "last_reason", None),
"paused_reason": getattr(state, "paused_reason", None),
}
def _payload(
*,
ok: bool = True,
action: str,
message: str,
state: Any = None,
error: str | None = None,
kickoff_prompt: str | None = None,
decision: Dict[str, Any] | None = None,
) -> Dict[str, Any]:
body: Dict[str, Any] = {
"ok": bool(ok),
"action": action,
"message": message,
"goal": _state_payload(state),
}
if error:
body["error"] = error
if kickoff_prompt:
body["kickoff_prompt"] = kickoff_prompt
if decision is not None:
body["decision"] = decision
return body
def goal_state_snapshot(session_id: str, *, profile_home: str | Path | None = None) -> Any:
"""Return a deep copy of current goal state for rollback before kickoff."""
mgr = _manager(str(session_id or ""), profile_home=profile_home)
if mgr is None:
return None
return copy.deepcopy(getattr(mgr, "state", None))
def restore_goal_state(session_id: str, snapshot: Any, *, profile_home: str | Path | None = None) -> None:
"""Restore a prior goal state after kickoff stream creation fails."""
mgr = _manager(str(session_id or ""), profile_home=profile_home)
if mgr is None:
return
if snapshot is None:
try:
mgr.clear()
except Exception:
pass
return
if isinstance(mgr, _ProfileGoalManager):
mgr._state = snapshot
mgr._save(snapshot)
return
try:
from hermes_cli.goals import save_goal # type: ignore
save_goal(str(session_id or ""), snapshot)
except Exception as exc: # pragma: no cover - native fallback only
logger.debug("Goal state restore failed for %s: %s", session_id, exc)
def goal_command_payload(
session_id: str,
args: str = "",
*,
stream_running: bool = False,
profile_home: str | Path | None = None,
) -> Dict[str, Any]:
"""Return the WebUI response payload for a /goal command.
Mirrors the gateway command semantics:
- /goal or /goal status shows status
- /goal pause pauses
- /goal resume resumes without auto-starting a turn
- /goal clear|stop|done clears
- /goal <text> sets a new active goal and returns kickoff_prompt so the
caller can start the first normal user-role turn immediately.
"""
sid = str(session_id or "").strip()
if not sid:
return _payload(ok=False, action="error", error="missing_session", message="session_id required")
mgr = _manager(sid, profile_home=profile_home)
if mgr is None:
return _payload(ok=False, action="error", error="unavailable", message="Goals unavailable on this session.")
text = str(args or "").strip()
lower = text.lower()
if not text or lower == "status":
return _payload(action="status", message=mgr.status_line(), state=getattr(mgr, "state", None))
if lower == "pause":
state = mgr.pause(reason="user-paused")
if state is None:
return _payload(ok=False, action="pause", error="no_goal", message="No goal set.")
return _payload(action="pause", message=f"⏸ Goal paused: {state.goal}", state=state)
if lower == "resume":
state = mgr.resume()
if state is None:
return _payload(ok=False, action="resume", error="no_goal", message="No goal to resume.")
return _payload(
action="resume",
message=(
f"▶ Goal resumed: {state.goal}\n"
"Send a new message, or type continue, to kick it off."
),
state=state,
)
if lower in ("clear", "stop", "done"):
had = bool(mgr.has_goal())
mgr.clear()
return _payload(
action="clear",
message="Goal cleared." if had else "No active goal.",
state=getattr(mgr, "state", None),
)
if stream_running:
return _payload(
ok=False,
action="set",
error="agent_running",
message=(
"Agent is running — use /goal status / pause / clear mid-run, "
"or /stop before setting a new goal."
),
)
try:
state = mgr.set(text)
except ValueError as exc:
return _payload(ok=False, action="set", error="invalid_goal", message=f"Invalid goal: {exc}")
return _payload(
action="set",
message=(
f"⊙ Goal set ({state.max_turns}-turn budget): {state.goal}\n"
"I'll keep working until the goal is done, you pause/clear it, or the budget is exhausted.\n"
"Controls: /goal status · /goal pause · /goal resume · /goal clear"
),
state=state,
kickoff_prompt=state.goal,
)
def has_active_goal(
session_id: str,
*,
profile_home: str | Path | None = None,
) -> bool:
"""Return True when the session has an active standing goal to evaluate."""
sid = str(session_id or "").strip()
if not sid:
return False
mgr = _manager(sid, profile_home=profile_home)
if mgr is None:
return False
try:
return bool(mgr.is_active())
except Exception as exc:
logger.debug("goal active-state check failed for session=%s: %s", sid, exc)
return False
def evaluate_goal_after_turn(
session_id: str,
last_response: str,
*,
user_initiated: bool = True,
profile_home: str | Path | None = None,
) -> Dict[str, Any]:
"""Evaluate a completed turn against the standing goal, if any."""
sid = str(session_id or "").strip()
if not sid:
return {
"status": None,
"should_continue": False,
"continuation_prompt": None,
"verdict": "inactive",
"reason": "missing session_id",
"message": "",
}
mgr = _manager(sid, profile_home=profile_home)
if mgr is None:
return {
"status": None,
"should_continue": False,
"continuation_prompt": None,
"verdict": "inactive",
"reason": "goals unavailable",
"message": "",
}
try:
if not mgr.is_active():
return {
"status": getattr(getattr(mgr, "state", None), "status", None),
"should_continue": False,
"continuation_prompt": None,
"verdict": "inactive",
"reason": "no active goal",
"message": "",
}
decision = mgr.evaluate_after_turn(str(last_response or ""), user_initiated=user_initiated)
except Exception as exc:
logger.debug("goal evaluation failed for session=%s: %s", sid, exc)
return {
"status": None,
"should_continue": False,
"continuation_prompt": None,
"verdict": "error",
"reason": f"goal evaluation failed: {type(exc).__name__}",
"message": "",
}
if not isinstance(decision, dict):
decision = {}
decision.setdefault("should_continue", False)
decision.setdefault("continuation_prompt", None)
decision.setdefault("message", "")
return decision
+27
View File
@@ -170,6 +170,33 @@ def _is_root_profile(name: str) -> bool:
return name in _root_profile_name_cache
def _profiles_match(row_profile, active_profile) -> bool:
"""Return True if a session/project row's profile matches the active profile.
Treats both the literal alias 'default' and any renamed-root display name
(per _is_root_profile) as equivalent, so legacy rows tagged 'default'
still surface when the user has renamed the root profile to e.g. 'kinni',
and vice versa.
A row with no profile (`None` or empty string) is treated as belonging to
the root profile that's the convention used by the legacy backfill at
api/models.py::all_sessions, and matches the default seen in
`static/sessions.js` (`S.activeProfile||'default'`).
Originally lived in api/routes.py; relocated here so both routes.py and
out-of-process consumers (mcp_server.py) can import the canonical helper
instead of duplicating the body. See #1614 for the visibility model.
"""
row = row_profile or 'default'
active = active_profile or 'default'
if row == active:
return True
# Cross-alias the renamed root.
if _is_root_profile(row) and _is_root_profile(active):
return True
return False
def get_active_profile_name() -> str:
"""Return the currently active profile name.
+211 -82
View File
@@ -72,29 +72,11 @@ _STALE_MESSAGING_END_REASONS = {"session_reset", "session_switch"}
# when the active profile is `'default'`. _is_root_profile() is the
# canonical check.
def _profiles_match(row_profile, active_profile) -> bool:
"""Return True if a session/project row's profile matches the active profile.
Treats both the literal alias 'default' and any renamed-root display name
(per _is_root_profile) as equivalent, so legacy rows tagged 'default'
still surface when the user has renamed the root profile to e.g. 'kinni',
and vice versa.
A row with no profile (`None` or empty string) is treated as belonging to
the root profile that's the convention used by the legacy backfill at
api/models.py::all_sessions, and matches the default seen in
`static/sessions.js` (`S.activeProfile||'default'`).
"""
from api.profiles import _is_root_profile
row = row_profile or 'default'
active = active_profile or 'default'
if row == active:
return True
# Cross-alias the renamed root.
if _is_root_profile(row) and _is_root_profile(active):
return True
return False
# Canonical helper now lives in api.profiles so out-of-process consumers
# (mcp_server.py) can import it without duplicating the visibility model.
# Re-exported here so existing `_profiles_match(...)` call sites in this
# module keep resolving without per-call-site refactors.
from api.profiles import _profiles_match # noqa: F401, E402 (re-export)
def _all_profiles_query_flag(parsed_url) -> bool:
@@ -4287,6 +4269,9 @@ def handle_post(handler, parsed) -> bool:
if parsed.path == "/api/background":
return _handle_background(handler, body)
if parsed.path == "/api/goal":
return _handle_goal_command(handler, body)
if parsed.path == "/api/chat/start":
return _handle_chat_start(handler, body, diag=diag)
@@ -6456,6 +6441,197 @@ def _prepare_chat_start_session_for_stream(
s.save()
def _start_chat_stream_for_session(
s,
*,
msg: str,
attachments=None,
workspace: str,
model: str,
model_provider=None,
normalized_model: bool = False,
diag=None,
):
"""Persist pending state, register an SSE channel, and start an agent turn."""
attachments = attachments or []
# Prevent duplicate runs in the same session while a stream is still active.
# This commonly happens after page refresh/reconnect races and can produce
# duplicated clarify cards for what appears to be a single user request.
diag.stage("active_stream_check") if diag else None
current_stream_id = getattr(s, "active_stream_id", None)
if current_stream_id:
diag.stage("active_stream_lock_wait") if diag else None
with STREAMS_LOCK:
current_active = current_stream_id in STREAMS
if current_active:
diag.stage("response_write") if diag else None
return {
"error": "session already has an active stream",
"active_stream_id": current_stream_id,
"_status": 409,
}
# Stale stream id from a previous run; clear and continue.
diag.stage("stale_stream_cleanup") if diag else None
_clear_stale_stream_state(s)
stream_id = uuid.uuid4().hex
session_lock = _get_session_agent_lock(s.session_id)
diag.stage("session_lock_wait") if diag else None
with session_lock:
diag.stage("save_pending_state") if diag else None
_prepare_chat_start_session_for_stream(
s,
msg=msg,
attachments=attachments,
workspace=workspace,
model=model,
model_provider=model_provider,
stream_id=stream_id,
)
diag.stage("set_last_workspace") if diag else None
set_last_workspace(workspace)
diag.stage("stream_registration") if diag else None
stream = create_stream_channel()
with STREAMS_LOCK:
STREAMS[stream_id] = stream
diag.stage("worker_thread_start") if diag else None
thr = threading.Thread(
target=_run_agent_streaming,
args=(s.session_id, msg, model, workspace, stream_id, attachments),
kwargs={"model_provider": model_provider},
daemon=True,
)
thr.start()
response = {
"stream_id": stream_id,
"session_id": s.session_id,
"pending_started_at": s.pending_started_at,
}
if normalized_model:
response["effective_model"] = model
if model_provider:
response["effective_model_provider"] = model_provider
return response
def _handle_goal_command(handler, body):
"""Handle WebUI /goal command controls and optional kickoff stream."""
try:
require(body, "session_id")
except ValueError as e:
return bad(handler, str(e))
try:
s = get_session(body["session_id"])
except KeyError:
return bad(handler, "Session not found", 404)
requested_profile = str(body.get("profile") or "").strip()
if requested_profile:
try:
from api.profiles import _PROFILE_ID_RE
if requested_profile != "default" and not _PROFILE_ID_RE.fullmatch(requested_profile):
return bad(handler, "invalid profile", 400)
except ImportError:
requested_profile = ""
if requested_profile and not _profiles_match(getattr(s, "profile", None), requested_profile):
has_persisted_turns = bool(
getattr(s, "messages", None)
or getattr(s, "context_messages", None)
or getattr(s, "pending_user_message", None)
)
if not has_persisted_turns:
s.profile = requested_profile
current_stream_id = getattr(s, "active_stream_id", None)
stream_running = False
if current_stream_id:
with STREAMS_LOCK:
stream_running = current_stream_id in STREAMS
if not stream_running:
_clear_stale_stream_state(s)
try:
from api.profiles import get_hermes_home_for_profile
profile_home = get_hermes_home_for_profile(getattr(s, "profile", None))
except Exception:
profile_home = None
from api.goals import goal_command_payload, goal_state_snapshot, restore_goal_state
goal_args = str(body.get("args", "") or body.get("text", "") or "")
goal_action = goal_args.strip().lower()
will_kickoff = bool(
goal_args.strip()
and goal_action not in ("status", "pause", "resume", "clear", "stop", "done")
and not stream_running
)
workspace = model = model_provider = normalized_model = None
previous_goal_state = None
if will_kickoff:
try:
workspace = str(resolve_trusted_workspace(body.get("workspace") or s.workspace))
except ValueError as e:
return bad(handler, str(e))
requested_model = body.get("model") or s.model
requested_provider = (
body.get("model_provider")
if "model_provider" in body
else getattr(s, "model_provider", None)
)
model, model_provider, normalized_model = _resolve_compatible_session_model_state(
requested_model,
requested_provider,
)
previous_goal_state = goal_state_snapshot(s.session_id, profile_home=profile_home)
payload = goal_command_payload(
s.session_id,
goal_args,
stream_running=stream_running,
profile_home=profile_home,
)
if not payload.get("ok", True):
status = 409 if payload.get("error") == "agent_running" else 400
return j(handler, payload, status=status)
kickoff_prompt = str(payload.get("kickoff_prompt") or "").strip()
if kickoff_prompt:
if workspace is None:
try:
workspace = str(resolve_trusted_workspace(body.get("workspace") or s.workspace))
except ValueError as e:
return bad(handler, str(e))
if model is None:
requested_model = body.get("model") or s.model
requested_provider = (
body.get("model_provider")
if "model_provider" in body
else getattr(s, "model_provider", None)
)
model, model_provider, normalized_model = _resolve_compatible_session_model_state(
requested_model,
requested_provider,
)
stream_response = _start_chat_stream_for_session(
s,
msg=kickoff_prompt,
attachments=[],
workspace=workspace,
model=model,
model_provider=model_provider,
normalized_model=normalized_model,
)
status = int(stream_response.pop("_status", 200) or 200)
payload.update(stream_response)
if status >= 400:
restore_goal_state(s.session_id, previous_goal_state, profile_home=profile_home)
payload["ok"] = False
return j(handler, payload, status=status)
return j(handler, payload)
def _handle_chat_start(handler, body, diag=None):
try:
diag.stage("validate_session_id") if diag else None
@@ -6512,72 +6688,25 @@ def _handle_chat_start(handler, body, diag=None):
requested_model,
requested_provider,
)
# Prevent duplicate runs in the same session while a stream is still active.
# This commonly happens after page refresh/reconnect races and can produce
# duplicated clarify cards for what appears to be a single user request.
diag.stage("active_stream_check") if diag else None
current_stream_id = getattr(s, "active_stream_id", None)
if current_stream_id:
diag.stage("active_stream_lock_wait") if diag else None
with STREAMS_LOCK:
current_active = current_stream_id in STREAMS
if current_active:
diag.stage("response_write") if diag else None
return j(
handler,
{
"error": "session already has an active stream",
"active_stream_id": current_stream_id,
},
status=409,
)
# Stale stream id from a previous run; clear and continue.
diag.stage("stale_stream_cleanup") if diag else None
_clear_stale_stream_state(s)
stream_id = uuid.uuid4().hex
session_lock = _get_session_agent_lock(s.session_id)
diag.stage("session_lock_wait") if diag else None
with session_lock:
diag.stage("save_pending_state") if diag else None
_prepare_chat_start_session_for_stream(
s,
msg=msg,
attachments=attachments,
workspace=workspace,
model=model,
model_provider=model_provider,
stream_id=stream_id,
)
diag.stage("set_last_workspace") if diag else None
set_last_workspace(workspace)
diag.stage("stream_registration") if diag else None
stream = create_stream_channel()
with STREAMS_LOCK:
STREAMS[stream_id] = stream
diag.stage("worker_thread_start") if diag else None
thr = threading.Thread(
target=_run_agent_streaming,
args=(s.session_id, msg, model, workspace, stream_id, attachments),
kwargs={"model_provider": model_provider},
daemon=True,
response = _start_chat_stream_for_session(
s,
msg=msg,
attachments=attachments,
workspace=workspace,
model=model,
model_provider=model_provider,
normalized_model=normalized_model,
diag=diag,
)
thr.start()
response = {
"stream_id": stream_id,
"session_id": s.session_id,
"pending_started_at": s.pending_started_at,
}
if normalized_model:
response["effective_model"] = model
if model_provider:
response["effective_model_provider"] = model_provider
status = int(response.pop("_status", 200) or 200)
diag.stage("response_write") if diag else None
return j(handler, response)
return j(handler, response, status=status)
finally:
if diag:
diag.finish()
def _normalize_chat_attachments(raw_attachments):
"""Normalize attachment payloads from the browser.
+58
View File
@@ -3227,6 +3227,64 @@ def _run_agent_streaming(
})
except Exception:
logger.debug("Failed to drain pending steer for session %s", session_id)
# /goal parity: after a successful assistant turn, run the Hermes
# GoalManager judge before terminal done/stream_end events. The
# frontend surfaces the status line and queues continuation_prompt as
# a normal next user message so /queue and user input keep priority.
try:
from api.goals import evaluate_goal_after_turn, has_active_goal
if not has_active_goal(session_id, profile_home=_profile_home):
_goal_decision = {}
else:
_last_goal_response = ''
for _goal_msg in reversed(s.messages or []):
if not isinstance(_goal_msg, dict) or _goal_msg.get('role') != 'assistant':
continue
_goal_content = _goal_msg.get('content', '')
if isinstance(_goal_content, list):
_goal_parts = []
for _goal_part in _goal_content:
if isinstance(_goal_part, dict):
_goal_text = _goal_part.get('text') or _goal_part.get('content')
if _goal_text:
_goal_parts.append(str(_goal_text))
_last_goal_response = '\n'.join(_goal_parts)
else:
_last_goal_response = str(_goal_content or '')
break
put('goal', {
'session_id': session_id,
'state': 'evaluating',
'message': 'Evaluating goal progress…',
})
_goal_decision = evaluate_goal_after_turn(
session_id,
_last_goal_response,
user_initiated=True,
profile_home=_profile_home,
)
decision = _goal_decision or {}
_goal_message = str(decision.get('message') or '').strip()
if _goal_message:
put('goal', {
'session_id': session_id,
'state': 'continuing' if decision.get('should_continue') else 'idle',
'message': _goal_message,
'decision': decision,
})
if decision.get('should_continue'):
continuation_prompt = str(decision.get('continuation_prompt') or '').strip()
if continuation_prompt:
put('goal_continue', {
'session_id': session_id,
'continuation_prompt': continuation_prompt,
'text': continuation_prompt,
'message': _goal_message,
'decision': decision,
})
except Exception as _goal_exc:
logger.debug("Goal continuation hook failed for session %s: %s", session_id, _goal_exc)
raw_session = s.compact() | {'messages': s.messages, 'tool_calls': tool_calls}
put('done', {'session': redact_session_data(raw_session), 'usage': usage})
# Emit one last metering packet for the live message-header TPS label.
Binary file not shown.

After

Width:  |  Height:  |  Size: 52 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 64 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 42 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 57 KiB

+567
View File
@@ -0,0 +1,567 @@
#!/usr/bin/env python3
"""
Hermes WebUI MCP Server exposes project and session management
as MCP tools for any MCP-compatible agent.
Option A rewrite (2026-05-08): imports api.models and api.profiles
directly from the webui codebase, using canonical helpers for
locking, profile scoping, index consistency, and validation.
pip install mcp # one-time setup
python3 mcp_server.py # start via stdio
MCP config for Hermes Agent (add to config.yaml):
mcp_servers:
hermes-webui:
command: /path/to/venv/bin/python3
args: [/path/to/hermes-webui/mcp_server.py]
env:
HERMES_WEBUI_PASSWORD: your_password
Profile override (optional):
args: [/path/to/hermes-webui/mcp_server.py, --profile, myprofile]
AI-authoring disclosure: this file was rewritten by MILO (Hermes Agent)
under human direction, per maintainer guidelines for #1616.
"""
import argparse
import json
import os
import re
import sys
import time
import uuid
from pathlib import Path
from mcp.server import Server
from mcp.server.stdio import stdio_server
from mcp.types import Tool, TextContent
# ── Ensure the repo root is on sys.path so api.* imports work ─────────────
_REPO_ROOT = Path(__file__).parent.resolve()
if str(_REPO_ROOT) not in sys.path:
sys.path.insert(0, str(_REPO_ROOT))
# ── CLI: optional --profile override ──────────────────────────────────────
_profile_arg: str | None = None
_parser = argparse.ArgumentParser(add_help=False)
_parser.add_argument("--profile", type=str, default=None)
_args, _unknown = _parser.parse_known_args()
_profile_arg = _args.profile
# ── Import webui canonical modules (after path setup) ─────────────────────
import api.config as _cfg
from api.config import (
STATE_DIR, SESSION_DIR, SESSION_INDEX_FILE, PROJECTS_FILE, HOME,
)
from api.models import load_projects, save_projects
from api.profiles import get_active_profile_name, _is_root_profile, _profiles_match
# ── Apply --profile override before any module uses get_active_profile_name
if _profile_arg is not None:
import api.profiles as _profiles
_profiles._active_profile = _profile_arg
# ── API auth state ─────────────────────────────────────────────────────────
# Mirror the env-var contract used by api/config.py:32-33 so a non-default
# WebUI port/host (e.g. when 8787 is held by another service on the host)
# Just Works without configuration drift between the WebUI process and MCP.
WEBUI_HOST = os.environ.get("HERMES_WEBUI_HOST", "127.0.0.1")
WEBUI_PORT = os.environ.get("HERMES_WEBUI_PORT", "8787")
WEBUI_URL = f"http://{WEBUI_HOST}:{WEBUI_PORT}"
_auth_cookie: str | None = None
_auth_expires: float = 0 # unix timestamp after which we re-auth
server = Server("hermes-webui")
# ═══════════════════════════════════════════════════════════════════════════
# Helpers — filesystem (project CRUD via canonical api.models)
# ═══════════════════════════════════════════════════════════════════════════
def _active_profile() -> str:
"""Shorthand for the current profile name (--profile or auto-detected)."""
return get_active_profile_name() or 'default'
def _validate_color(color: str | None) -> str | None:
"""Return an error string if color is invalid, else None."""
if color is not None and not re.match(r"^#[0-9a-fA-F]{3,8}$", color):
return "Invalid color format (use #RGB, #RRGGBB, or #RRGGBBAA)"
return None
def _load_index() -> list:
"""Read the session index. Falls back to empty list on failure."""
if not SESSION_INDEX_FILE.exists():
return []
try:
return json.loads(SESSION_INDEX_FILE.read_text(encoding="utf-8"))
except Exception:
return []
def _session_compact(row: dict) -> dict:
"""Lightweight compact representation of a session index entry."""
return {
"session_id": row.get("session_id"),
"title": row.get("title"),
"project_id": row.get("project_id"),
"workspace": row.get("workspace"),
"model": row.get("model"),
"message_count": row.get("message_count", 0),
"source_tag": row.get("source_tag"),
"is_cli_session": row.get("is_cli_session", False),
"profile": row.get("profile"),
}
# ═══════════════════════════════════════════════════════════════════════════
# Helpers — HTTP API (for mutations that need cache sync)
# ═══════════════════════════════════════════════════════════════════════════
def _api_password() -> str | None:
"""Return the plaintext webui password from HERMES_WEBUI_PASSWORD, or None.
settings.json stores only the bcrypt hash, which the login endpoint cannot
accept it calls verify_password(plaintext) against the stored hash. So
there's no usable fallback when the env var is unset; the MCP simply runs
in unauthenticated mode and any auth-protected mutation will fail clearly
with the server's 401 instead of silently sending an unusable hash.
"""
pw = os.environ.get("HERMES_WEBUI_PASSWORD", "").strip()
return pw or None
def _api_auth() -> str | None:
"""Authenticate and return cookie value, or None if auth disabled/fails."""
global _auth_cookie, _auth_expires
pw = _api_password()
if not pw:
return None # auth not enabled — API calls will fail anyway
# Reuse cookie if still valid (25 days — server issues 30-day cookies)
if _auth_cookie and time.time() < _auth_expires:
return _auth_cookie
import urllib.request
try:
req = urllib.request.Request(
f"{WEBUI_URL}/api/auth/login",
data=json.dumps({"password": pw}).encode(),
headers={"Content-Type": "application/json"},
method="POST",
)
resp = urllib.request.urlopen(req, timeout=5)
cookie = resp.headers.get("Set-Cookie", "")
if cookie:
_auth_cookie = cookie.split(";")[0] # "hermes_session=VALUE; ..."
_auth_expires = time.time() + 25 * 86400 # 25 days
return _auth_cookie
except Exception:
_auth_cookie = None
return None
def _api_post(endpoint: str, body: dict) -> dict:
"""POST to webui API with auth cookie. Returns parsed JSON response."""
import urllib.request
import urllib.error
cookie = _api_auth()
headers = {"Content-Type": "application/json"}
if cookie:
headers["Cookie"] = cookie
try:
req = urllib.request.Request(
f"{WEBUI_URL}{endpoint}",
data=json.dumps(body).encode(),
headers=headers,
method="POST",
)
resp = urllib.request.urlopen(req, timeout=5)
return json.loads(resp.read())
except urllib.error.HTTPError as e:
err_body = json.loads(e.read())
return {"error": f"API {e.code}: {err_body.get('error', 'unknown')}"}
except Exception as e:
return {"error": f"API unreachable: {e}"}
# ═══════════════════════════════════════════════════════════════════════════
# Tool handlers — read-only (filesystem, profile-aware)
# ═══════════════════════════════════════════════════════════════════════════
async def handle_list_projects(_arguments: dict) -> list[TextContent]:
"""List all projects with session counts, scoped to active profile."""
projects = load_projects()
active = _active_profile()
index = _load_index()
# Session counts per project (from index)
counts: dict[str, int] = {}
for s in index:
pid = s.get("project_id")
if pid:
counts[pid] = counts.get(pid, 0) + 1
result = []
for p in projects:
# Profile filter: legacy untagged rows are treated as 'default' by
# _profiles_match, so non-root profiles correctly hide them.
if not _profiles_match(p.get("profile"), active):
continue
entry = dict(p)
entry["session_count"] = counts.get(p["project_id"], 0)
result.append(entry)
return [TextContent(type="text", text=json.dumps(result, ensure_ascii=False, indent=2))]
async def handle_list_sessions(arguments: dict) -> list[TextContent]:
"""List sessions, optionally filtered by project or unassigned status."""
project_id = arguments.get("project_id")
unassigned = arguments.get("unassigned", False)
limit = max(1, min(500, arguments.get("limit", 50)))
active = _active_profile()
index = _load_index()
sessions = [_session_compact(s) for s in index if s.get("session_id")]
# Filter by profile: legacy untagged rows are treated as 'default' by
# _profiles_match (canonical convention), so non-root profiles hide them.
sessions = [s for s in sessions if _profiles_match(s.get("profile"), active)]
if unassigned:
sessions = [s for s in sessions if not s["project_id"]]
elif project_id:
sessions = [s for s in sessions if s["project_id"] == project_id]
sessions = sessions[:limit]
return [TextContent(type="text", text=json.dumps(sessions, ensure_ascii=False, indent=2))]
# ═══════════════════════════════════════════════════════════════════════════
# Tool handlers — project CRUD (canonical helpers, profile-scoped)
# ═══════════════════════════════════════════════════════════════════════════
async def handle_create_project(arguments: dict) -> list[TextContent]:
"""Create a new project (profile-scoped, exact-match title collision)."""
name = arguments.get("name", "").strip()[:128]
if not name:
return [TextContent(type="text", text=json.dumps(
{"error": "name is required"}, ensure_ascii=False))]
color = arguments.get("color")
color_err = _validate_color(color)
if color_err:
return [TextContent(type="text", text=json.dumps(
{"error": color_err}, ensure_ascii=False))]
active = _active_profile()
projects = load_projects()
# Title collision: exact match (consistent with ensure_cron_project)
if any(p.get("name") == name and _profiles_match(p.get("profile"), active)
for p in projects):
return [TextContent(type="text", text=json.dumps(
{"error": f"Project '{name}' already exists"}, ensure_ascii=False))]
proj = {
"project_id": uuid.uuid4().hex[:12],
"name": name,
"color": color,
"profile": active,
"created_at": time.time(),
}
projects.append(proj)
save_projects(projects)
proj["session_count"] = 0
return [TextContent(type="text", text=json.dumps(proj, ensure_ascii=False, indent=2))]
async def handle_rename_project(arguments: dict) -> list[TextContent]:
"""Rename a project and optionally change its color (profile-checked)."""
project_id = arguments.get("project_id")
name = arguments.get("name", "").strip()[:128]
if not project_id or not name:
return [TextContent(type="text", text=json.dumps(
{"error": "project_id and name are required"}, ensure_ascii=False))]
color = arguments.get("color")
color_err = _validate_color(color)
if color_err:
return [TextContent(type="text", text=json.dumps(
{"error": color_err}, ensure_ascii=False))]
active = _active_profile()
projects = load_projects()
proj = next((p for p in projects if p["project_id"] == project_id), None)
if not proj:
return [TextContent(type="text", text=json.dumps(
{"error": "Project not found"}, ensure_ascii=False))]
# #1614: profile ownership check
if not _profiles_match(proj.get("profile"), active):
return [TextContent(type="text", text=json.dumps(
{"error": "Project not found"}, ensure_ascii=False))]
proj["name"] = name
if color is not None:
proj["color"] = color
save_projects(projects)
return [TextContent(type="text", text=json.dumps(proj, ensure_ascii=False, indent=2))]
async def handle_delete_project(arguments: dict) -> list[TextContent]:
"""Delete a project and unassign all its sessions (profile-checked)."""
project_id = arguments.get("project_id")
if not project_id:
return [TextContent(type="text", text=json.dumps(
{"error": "project_id is required"}, ensure_ascii=False))]
active = _active_profile()
projects = load_projects()
proj = next((p for p in projects if p["project_id"] == project_id), None)
if not proj:
return [TextContent(type="text", text=json.dumps(
{"error": "Project not found"}, ensure_ascii=False))]
# #1614: profile ownership check
if not _profiles_match(proj.get("profile"), active):
return [TextContent(type="text", text=json.dumps(
{"error": "Project not found"}, ensure_ascii=False))]
projects = [p for p in projects if p["project_id"] != project_id]
save_projects(projects)
# Unassign sessions only when we can do it cache-safely via the HTTP API.
# The previous filesystem fallback wrote session_data directly with
# os.replace(), which bypassed _write_session_index() in api/models.py
# and left _index.json holding the stale project_id — a running WebUI
# would still group those sessions under the deleted project until a
# subsequent re-compact. Even calling Session.save() in-process would
# not help because the WebUI's SESSIONS dict cache (a separate process)
# still has the old project_id and overwrites our update on its next
# save. The HTTP API is the only cache-safe path; without auth we
# refuse and surface the limitation so the operator can act.
has_auth = bool(_api_password())
if not has_auth:
return [TextContent(type="text", text=json.dumps({
"ok": True,
"deleted": proj["name"],
"unassigned_sessions": 0,
"warning": "Set HERMES_WEBUI_PASSWORD to unassign sessions; "
"without auth the session index cannot be safely "
"updated and direct filesystem writes would cause "
"index drift in a running WebUI.",
}, ensure_ascii=False))]
unassigned = 0
if SESSION_DIR.exists():
for p in SESSION_DIR.glob("*.json"):
if p.name.startswith("_"):
continue
try:
session_data = json.loads(p.read_text(encoding="utf-8"))
if session_data.get("project_id") == project_id:
sid = p.stem
result = _api_post("/api/session/move",
{"session_id": sid, "project_id": None})
if "ok" in result or "session" in result:
unassigned += 1
except Exception:
pass
return [TextContent(type="text", text=json.dumps({
"ok": True,
"deleted": proj["name"],
"unassigned_sessions": unassigned,
}, ensure_ascii=False))]
# ═══════════════════════════════════════════════════════════════════════════
# Tool handlers — mutations (HTTP API with auth, cache-safe)
# ═══════════════════════════════════════════════════════════════════════════
async def handle_rename_session(arguments: dict) -> list[TextContent]:
"""Rename a session via the authenticated webui API (cache-safe)."""
session_id = arguments.get("session_id")
title = arguments.get("title", "").strip()[:80]
if not session_id or not title:
return [TextContent(type="text", text=json.dumps(
{"error": "session_id and title are required"}, ensure_ascii=False))]
result = _api_post("/api/session/rename",
{"session_id": session_id, "title": title})
if "error" in result:
return [TextContent(type="text", text=json.dumps(result, ensure_ascii=False))]
session = result.get("session", {})
return [TextContent(type="text", text=json.dumps({
"ok": True,
"session_id": session_id,
"title": session.get("title", title),
"method": "api",
}, ensure_ascii=False, indent=2))]
async def handle_move_session(arguments: dict) -> list[TextContent]:
"""Assign a session to a project via the authenticated webui API (cache-safe)."""
session_id = arguments.get("session_id")
project_id = arguments.get("project_id") # None/null = unassign
if not session_id:
return [TextContent(type="text", text=json.dumps(
{"error": "session_id is required"}, ensure_ascii=False))]
# If project_id is provided, verify it exists and is profile-accessible
if project_id is not None:
projects = load_projects()
active = _active_profile()
target = next((p for p in projects if p["project_id"] == project_id), None)
if not target:
return [TextContent(type="text", text=json.dumps(
{"error": "Project not found"}, ensure_ascii=False))]
# #1614: refuse moves into projects owned by another profile
if not _profiles_match(target.get("profile"), active):
return [TextContent(type="text", text=json.dumps(
{"error": "Project not found"}, ensure_ascii=False))]
result = _api_post("/api/session/move",
{"session_id": session_id, "project_id": project_id})
if "error" in result:
return [TextContent(type="text", text=json.dumps(result, ensure_ascii=False))]
session = result.get("session", {})
return [TextContent(type="text", text=json.dumps({
"ok": True,
"session_id": session_id,
"project_id": project_id,
"title": session.get("title"),
"method": "api",
}, ensure_ascii=False, indent=2))]
# ═══════════════════════════════════════════════════════════════════════════
# MCP Server wiring
# ═══════════════════════════════════════════════════════════════════════════
TOOLS = [
Tool(
name="list_projects",
description="List all session projects with their IDs, names, colors, and session counts (scoped to active profile).",
inputSchema={"type": "object", "properties": {}, "required": []},
),
Tool(
name="create_project",
description="Create a new project for organizing sessions (profile-scoped).",
inputSchema={
"type": "object",
"properties": {
"name": {"type": "string", "description": "Project name (max 128 chars)"},
"color": {"type": "string", "description": "Optional hex color (#RGB, #RRGGBB, or #RRGGBBAA)"},
},
"required": ["name"],
},
),
Tool(
name="rename_project",
description="Rename a project and optionally change its color (profile-checked).",
inputSchema={
"type": "object",
"properties": {
"project_id": {"type": "string", "description": "12-char project ID"},
"name": {"type": "string", "description": "New name (max 128 chars)"},
"color": {"type": "string", "description": "Optional new hex color"},
},
"required": ["project_id", "name"],
},
),
Tool(
name="delete_project",
description="Delete a project and unassign all its sessions (profile-checked).",
inputSchema={
"type": "object",
"properties": {
"project_id": {"type": "string", "description": "12-char project ID to delete"},
},
"required": ["project_id"],
},
),
Tool(
name="rename_session",
description="Rename a session (updates sidebar via authenticated API, cache-safe).",
inputSchema={
"type": "object",
"properties": {
"session_id": {"type": "string", "description": "Session ID"},
"title": {"type": "string", "description": "New title (max 80 chars)"},
},
"required": ["session_id", "title"],
},
),
Tool(
name="move_session",
description="Assign a session to a project. Pass project_id=null to unassign. Uses authenticated API for cache safety (profile-checked).",
inputSchema={
"type": "object",
"properties": {
"session_id": {"type": "string", "description": "Session ID"},
"project_id": {"type": ["string", "null"], "description": "Project ID (or null to unassign)"},
},
"required": ["session_id", "project_id"],
},
),
Tool(
name="list_sessions",
description="List sessions, optionally filtered by project or unassigned status (profile-scoped).",
inputSchema={
"type": "object",
"properties": {
"project_id": {"type": "string", "description": "Filter sessions by project ID"},
"unassigned": {"type": "boolean", "description": "Show only sessions with no project"},
"limit": {"type": "integer", "description": "Max results (default: 50, max: 500)"},
},
"required": [],
},
),
]
HANDLERS = {
"list_projects": handle_list_projects,
"create_project": handle_create_project,
"rename_project": handle_rename_project,
"delete_project": handle_delete_project,
"rename_session": handle_rename_session,
"move_session": handle_move_session,
"list_sessions": handle_list_sessions,
}
@server.list_tools()
async def list_tools() -> list[Tool]:
return TOOLS
@server.call_tool()
async def call_tool(name: str, arguments: dict) -> list[TextContent]:
handler = HANDLERS.get(name)
if not handler:
return [TextContent(type="text", text=json.dumps(
{"error": f"Unknown tool: {name}"}, ensure_ascii=False))]
return await handler(arguments)
async def main():
async with stdio_server() as (read, write):
await server.run(read, write, server.create_initialization_options())
if __name__ == "__main__":
import asyncio
asyncio.run(main())
+48
View File
@@ -18,6 +18,7 @@ const COMMANDS=[
{name:'personality', desc:t('cmd_personality'), fn:cmdPersonality, arg:'name', subArgs:'personalities'},
{name:'skills', desc:t('cmd_skills'), fn:cmdSkills, arg:'query'},
{name:'stop', desc:t('cmd_stop'), fn:cmdStop, noEcho:true},
{name:'goal', desc:t('cmd_goal'), fn:cmdGoal, arg:'[status|pause|resume|clear|text]', subArgs:['status','pause','resume','clear']},
{name:'queue', desc:t('cmd_queue'), fn:cmdQueue, arg:'message', noEcho:true},
{name:'interrupt', desc:t('cmd_interrupt'), fn:cmdInterrupt, arg:'message', noEcho:true},
{name:'steer', desc:t('cmd_steer'), fn:cmdSteer, arg:'message', noEcho:true},
@@ -625,6 +626,53 @@ async function cmdStop(){
else showToast(t('cancel_unavailable'));
}
async function cmdGoal(args){
if(!S.session){await newSession();await renderSessionList();}
if(!S.session||!S.session.session_id){showToast(t('no_active_session'));return;}
const activeSid=S.session.session_id;
try{
const r=await api('/api/goal',{method:'POST',body:JSON.stringify({
session_id:activeSid,
args:args||'',
workspace:S.session.workspace,
model:S.session.model||($('modelSelect')&&$('modelSelect').value)||'',
model_provider:S.session.model_provider||null,
profile:S.activeProfile||S.session.profile||'default',
})});
const msg=String((r&&r.message)||'').trim();
if(msg){
S.messages.push({role:'assistant',content:msg,_ts:Date.now()/1000,_goalStatus:true,_transient:true});
renderMessages({preserveScroll:true});
showToast(msg.split('\n')[0],2600);
}
if(!r||!r.stream_id)return;
S.toolCalls=[];
if(typeof clearLiveToolCards==='function')clearLiveToolCards();
appendThinking();setBusy(true);
setComposerStatus('Working toward goal…');
S.activeStreamId=r.stream_id;
if(S.session&&S.session.session_id===activeSid){
S.session.active_stream_id=r.stream_id;
if(typeof r.pending_started_at==='number')S.session.pending_started_at=r.pending_started_at;
if(r.effective_model)S.session.model=r.effective_model;
if(r.effective_model_provider)S.session.model_provider=r.effective_model_provider;
}
INFLIGHT[activeSid]={messages:[...S.messages],uploaded:[],toolCalls:[]};
if(typeof markInflight==='function')markInflight(activeSid,r.stream_id);
if(typeof saveInflightState==='function')saveInflightState(activeSid,{streamId:r.stream_id,messages:INFLIGHT[activeSid].messages,uploaded:[],toolCalls:[]});
startApprovalPolling(activeSid);
startClarifyPolling(activeSid);
if(typeof _fetchYoloState==='function')_fetchYoloState(activeSid);
attachLiveStream(activeSid,r.stream_id,[]);
if(typeof renderSessionList==='function')void renderSessionList();
}catch(e){
const err=String((e&&e.message)||e||'Goal command failed');
S.messages.push({role:'assistant',content:`**Goal command failed:** ${err}`,_ts:Date.now()/1000,_error:true});
renderMessages({preserveScroll:true});
showToast(err,3000);
}
}
// ── Busy-input mode commands ──────────────────────────────────────────────
// These commands let users override the default busy_input_mode setting for a
// specific message. They are only meaningful while the agent is running.
+9
View File
@@ -183,6 +183,7 @@ const LOCALES = {
theme_set: 'Theme: ',
no_active_session: 'No active session',
cmd_queue: 'Queue a message for the next turn',
cmd_goal: 'Set or inspect a persistent goal',
cmd_interrupt: 'Cancel current turn and send a new message',
cmd_steer: 'Inject a mid-turn correction without interrupting the agent',
cmd_queue_no_msg: 'Usage: /queue <message>',
@@ -1204,6 +1205,7 @@ const LOCALES = {
theme_set: 'テーマ: ',
no_active_session: 'アクティブなセッションがありません',
cmd_queue: '次のターン用にメッセージをキュー',
cmd_goal: '永続ゴールを設定または確認',
cmd_interrupt: '現在のターンをキャンセルして新規メッセージを送信',
cmd_steer: 'エージェントを中断せずにターン中に修正を注入',
cmd_queue_no_msg: '使い方: /queue <メッセージ>',
@@ -2186,6 +2188,7 @@ const LOCALES = {
theme_set: 'Тема: ',
no_active_session: 'Нет активной сессии',
cmd_queue: 'Поставить сообщение в очередь на следующий оборот',
cmd_goal: 'Задать или проверить постоянную цель',
cmd_interrupt: 'Прервать текущий оборот и отправить новое сообщение',
cmd_steer: 'Направить агента исправлением (переходит к прерыванию)',
cmd_queue_no_msg: 'Использование: /queue <сообщение>',
@@ -3173,6 +3176,7 @@ const LOCALES = {
theme_set: 'Tema: ',
no_active_session: 'No hay ninguna sesión activa',
cmd_queue: 'Poner mensaje en cola para el siguiente turno',
cmd_goal: 'Definir o consultar un objetivo persistente',
cmd_interrupt: 'Cancelar turno actual y enviar nuevo mensaje',
cmd_steer: 'Inyectar una corrección a mitad del turno sin interrumpir al agente',
cmd_queue_no_msg: 'Uso: /queue <mensaje>',
@@ -4109,6 +4113,7 @@ const LOCALES = {
model_scope_advisory: 'Gilt für diesen Chat ab Ihrer nächsten Nachricht.',
model_scope_toast: 'Gilt für diesen Chat ab Ihrer nächsten Nachricht.',
cmd_queue: 'Nachricht f\u00fcr den n\u00e4chsten Durchgang einreihen',
cmd_goal: 'Ein dauerhaftes Ziel setzen oder prüfen',
cmd_interrupt: 'Aktuellen Durchgang abbrechen und neue Nachricht senden',
cmd_steer: 'Korrektursignal einf\u00fcgen ohne Unterbrechung',
cmd_queue_no_msg: 'Verwendung: /queue <Nachricht>',
@@ -5082,6 +5087,7 @@ const LOCALES = {
theme_set: '\u4e3b\u9898\uff1a',
no_active_session: '\u5f53\u524d\u6ca1\u6709\u6d3b\u52a8\u4f1a\u8bdd',
cmd_queue: '\u5c06\u6d88\u606f\u52a0\u5165\u4e0b\u4e00\u8f6e\u7684\u961f\u5217',
cmd_goal: '设置或查看持久目标',
cmd_interrupt: '\u53d6\u6d88\u5f53\u524d\u56de\u5408\u5e76\u53d1\u9001\u65b0\u6d88\u606f',
cmd_steer: '\u7528\u7ea0\u6b63\u4fe1\u606f\u5f15\u5bfc\u4ee3\u7406\uff08\u56de\u9000\u4e3a\u4e2d\u65ad\uff09',
cmd_queue_no_msg: '\u7528\u6cd5\uff1a/queue <\u6d88\u606f>',
@@ -6459,6 +6465,7 @@ const LOCALES = {
never: '\u5f9e\u4e0d',
no_active_session: '\u7121\u6d3b\u8e8d\u6703\u8a71',
cmd_queue: '\u5c07\u8a0a\u606f\u52a0\u5165\u4e0b\u4e00\u8f2a\u7684\u4f47\u5217',
cmd_goal: '設定或查看持久目標',
cmd_interrupt: '\u53d6\u6d88\u7576\u524d\u56de\u5408\u4e26\u767c\u9001\u65b0\u8a0a\u606f',
cmd_steer: '\u5728\u56de\u5408\u9032\u884c\u4e2d\u6ce8\u5165\u7d3a\u6b63\uff0c\u4e0d\u4e2d\u65b7\u4ee3\u7406',
cmd_queue_no_msg: '\u7528\u6cd5\uff1a/queue <\u8a0a\u606f>',
@@ -6972,6 +6979,7 @@ const LOCALES = {
theme_set: 'Tema: ',
no_active_session: 'Nenhuma sessão ativa',
cmd_queue: 'Enfileirar mensagem para o próximo turno',
cmd_goal: 'Definir ou consultar uma meta persistente',
cmd_interrupt: 'Cancelar turno atual e enviar nova mensagem',
cmd_steer: 'Injetar correção no meio do turno sem interromper',
cmd_queue_no_msg: 'Uso: /queue <mensagem>',
@@ -7869,6 +7877,7 @@ const LOCALES = {
theme_set: 'Theme: ',
no_active_session: '활성 세션 없음',
cmd_queue: 'Queue a message for the next turn',
cmd_goal: '지속 목표를 설정하거나 확인',
cmd_interrupt: 'Cancel current turn and send a new message',
cmd_steer: 'Inject a mid-turn correction without interrupting the agent',
cmd_queue_no_msg: 'Usage: /queue <message>',
+59 -1
View File
@@ -63,7 +63,7 @@ async function send(){
// cmdSteer / cmdInterrupt say "No active task to stop."
if(text.startsWith('/')){
const _pc=typeof parseCommand==='function'&&parseCommand(text);
if(_pc&&['steer','interrupt','queue','terminal'].includes(_pc.name)){
if(_pc&&['steer','interrupt','queue','terminal','goal'].includes(_pc.name)){
const _bc=COMMANDS.find(c=>c.name===_pc.name);
if(_bc){
$('msg').value='';autoResize();
@@ -336,6 +336,8 @@ function attachLiveStream(activeSid, streamId, uploaded=[], options={}){
let assistantText='';
let reasoningText='';
let liveReasoningText='';
let _latestGoalStatus=null;
let _pendingGoalContinuation=null;
let assistantRow=null;
let assistantBody=null;
let segmentStart=0; // char offset in assistantText where current segment begins
@@ -879,6 +881,41 @@ function attachLiveStream(activeSid, streamId, uploaded=[], options={}){
}catch(_){}
});
source.addEventListener('goal',e=>{
try{
const d=JSON.parse(e.data||'{}');
if((d.session_id||activeSid)!==activeSid) return;
const goalState=String(d.state||'').trim();
const goalEvaluatingMessage='Evaluating goal progress…';
if(goalState==='evaluating'){
setComposerStatus(goalEvaluatingMessage);
return;
}
const msg=String(d.message||'').trim();
if(!msg)return;
_latestGoalStatus={message:msg,decision:d.decision||null,state:goalState||null};
setComposerStatus(msg);
showToast(msg.split('\n')[0],2600);
}catch(_){}
});
source.addEventListener('goal_continue',e=>{
try{
const d=JSON.parse(e.data||'{}');
const sid=d.session_id||activeSid;
const continuation_prompt=String(d.continuation_prompt||d.text||'').trim();
if(!continuation_prompt||sid!==activeSid)return;
_pendingGoalContinuation={
sid,
text:continuation_prompt,
model:S.session&&S.session.model||'',
model_provider:S.session&&S.session.model_provider||null,
profile:S.activeProfile||'default',
};
showToast('Continuing toward goal…',2200);
}catch(_){}
});
source.addEventListener('done',e=>{
_terminalStateReached=true;
if(_persistTimer){clearTimeout(_persistTimer);_persistTimer=null;}
@@ -992,6 +1029,15 @@ function attachLiveStream(activeSid, streamId, uploaded=[], options={}){
const lastUser=[...S.messages].reverse().find(m=>m.role==='user');
if(lastUser)lastUser.attachments=uploaded;
}
if(_latestGoalStatus&&_latestGoalStatus.message){
S.messages.push({
role:'assistant',
content:String(_latestGoalStatus.message),
_ts:Date.now()/1000,
_goalStatus:true,
_transient:true,
});
}
clearLiveToolCards();
S.busy=false;
// No-reply guard (#373): if agent returned nothing, show inline error
@@ -1003,6 +1049,18 @@ function attachLiveStream(activeSid, streamId, uploaded=[], options={}){
// TTS auto-read: speak the last assistant response if enabled (#499)
if(typeof autoReadLastAssistant==='function') setTimeout(()=>autoReadLastAssistant(), 300);
}
if(isActiveSession&&_pendingGoalContinuation&&typeof queueSessionMessage==='function'){
const _goalNext=_pendingGoalContinuation;
_pendingGoalContinuation=null;
queueSessionMessage(_goalNext.sid,{
text:_goalNext.text,
files:[],
model:_goalNext.model,
model_provider:_goalNext.model_provider,
profile:_goalNext.profile,
});
if(typeof updateQueueBadge==='function')updateQueueBadge(_goalNext.sid);
}
if(isActiveSession) _queueDrainSid=activeSid;
renderSessionList();
_setActivePaneIdleIfOwner();
+272
View File
@@ -0,0 +1,272 @@
"""Regression tests for first-class WebUI /goal command parity."""
import io
import json
from pathlib import Path
from types import SimpleNamespace
import pytest
REPO_ROOT = Path(__file__).resolve().parents[1]
COMMANDS_JS = (REPO_ROOT / "static" / "commands.js").read_text(encoding="utf-8")
MESSAGES_JS = (REPO_ROOT / "static" / "messages.js").read_text(encoding="utf-8")
ROUTES_PY = (REPO_ROOT / "api" / "routes.py").read_text(encoding="utf-8")
STREAMING_PY = (REPO_ROOT / "api" / "streaming.py").read_text(encoding="utf-8")
def test_goal_command_payload_matches_gateway_controls(monkeypatch):
"""The backend command helper mirrors gateway /goal status/pause/resume/clear/set."""
from api import goals as webui_goals
calls = []
class FakeState:
goal = "ship the feature"
status = "active"
turns_used = 0
max_turns = 20
last_verdict = None
last_reason = None
paused_reason = None
class FakeGoalManager:
def __init__(self, session_id, default_max_turns=20):
calls.append(("init", session_id, default_max_turns))
self.state = None
def status_line(self):
return "No active goal. Set one with /goal <text>."
def pause(self, reason="user-paused"):
calls.append(("pause", reason))
return FakeState()
def resume(self, reset_budget=True):
calls.append(("resume", reset_budget))
return FakeState()
def has_goal(self):
return True
def clear(self):
calls.append(("clear",))
def set(self, goal):
calls.append(("set", goal))
state = FakeState()
state.goal = goal
self.state = state
return state
monkeypatch.setattr(webui_goals, "GoalManager", FakeGoalManager)
monkeypatch.setattr(webui_goals, "_default_max_turns", lambda: 20)
status = webui_goals.goal_command_payload("sid-123", "status")
pause = webui_goals.goal_command_payload("sid-123", "pause")
resume = webui_goals.goal_command_payload("sid-123", "resume")
clear = webui_goals.goal_command_payload("sid-123", "clear")
set_goal = webui_goals.goal_command_payload("sid-123", "ship the feature")
assert status["message"] == "No active goal. Set one with /goal <text>."
assert pause["message"] == "⏸ Goal paused: ship the feature"
assert resume["message"].startswith("▶ Goal resumed: ship the feature")
assert clear["message"] == "Goal cleared."
assert set_goal["action"] == "set"
assert set_goal["kickoff_prompt"] == "ship the feature"
assert "⊙ Goal set (20-turn budget): ship the feature" in set_goal["message"]
assert ("set", "ship the feature") in calls
def test_goal_command_payload_rejects_new_goal_while_stream_running(monkeypatch):
"""Status/control subcommands are safe mid-run; replacing the goal is not."""
from api import goals as webui_goals
class FakeGoalManager:
def __init__(self, session_id, default_max_turns=20):
pass
def status_line(self):
return "⊙ Goal (active, 1/20 turns): existing"
monkeypatch.setattr(webui_goals, "GoalManager", FakeGoalManager)
monkeypatch.setattr(webui_goals, "_default_max_turns", lambda: 20)
status = webui_goals.goal_command_payload("sid-123", "status", stream_running=True)
rejected = webui_goals.goal_command_payload("sid-123", "replace it", stream_running=True)
assert status["ok"] is True
assert rejected["ok"] is False
assert rejected["error"] == "agent_running"
assert "use /goal status / pause / clear mid-run" in rejected["message"]
def test_has_active_goal_reports_only_active_state(monkeypatch):
"""Streaming can avoid showing an evaluating spinner when no standing goal is active."""
from api import goals as webui_goals
class FakeGoalManager:
def __init__(self, session_id, default_max_turns=20):
self.session_id = session_id
def is_active(self):
return self.session_id == "sid-active-goal"
monkeypatch.setattr(webui_goals, "GoalManager", FakeGoalManager)
monkeypatch.setattr(webui_goals, "_default_max_turns", lambda: 20)
assert webui_goals.has_active_goal("sid-active-goal") is True
assert webui_goals.has_active_goal("sid-idle-goal") is False
assert webui_goals.has_active_goal("") is False
def test_goal_continuation_decision_emits_status_and_normal_user_prompt(monkeypatch):
"""Post-turn hook returns the visible status event plus a normal continuation prompt."""
from api import goals as webui_goals
class FakeGoalManager:
def __init__(self, session_id, default_max_turns=20):
self.session_id = session_id
def is_active(self):
return True
def evaluate_after_turn(self, last_response, user_initiated=True):
return {
"status": "active",
"should_continue": True,
"continuation_prompt": "[Continuing toward your standing goal]\nGoal: ship it",
"verdict": "continue",
"reason": "one step remains",
"message": "↻ Continuing toward goal (1/20): one step remains",
}
monkeypatch.setattr(webui_goals, "GoalManager", FakeGoalManager)
monkeypatch.setattr(webui_goals, "_default_max_turns", lambda: 20)
decision = webui_goals.evaluate_goal_after_turn("sid-123", "not done yet", user_initiated=False)
assert decision["message"].startswith("↻ Continuing toward goal")
assert decision["should_continue"] is True
assert decision["continuation_prompt"].startswith("[Continuing toward your standing goal]")
def test_goal_endpoint_sets_goal_and_starts_kickoff_stream(monkeypatch, tmp_path):
"""POST /api/goal uses GoalManager state and launches the first goal turn."""
from api import goals as webui_goals
from api import routes
class FakeState:
goal = "ship the feature"
status = "active"
turns_used = 0
max_turns = 20
last_verdict = None
last_reason = None
paused_reason = None
class FakeGoalManager:
def __init__(self, session_id, default_max_turns=20):
self.session_id = session_id
self.default_max_turns = default_max_turns
def set(self, goal):
state = FakeState()
state.goal = goal
return state
class FakeSession:
session_id = "sid-goal-route"
profile = "default"
workspace = str(tmp_path)
model = "gpt-5.5"
model_provider = "openai-codex"
messages = []
context_messages = []
pending_user_message = None
active_stream_id = None
monkeypatch.setattr(webui_goals, "GoalManager", FakeGoalManager)
monkeypatch.setattr(routes, "get_session", lambda sid: FakeSession())
monkeypatch.setattr(routes, "resolve_trusted_workspace", lambda workspace: tmp_path)
monkeypatch.setattr(
routes,
"_resolve_compatible_session_model_state",
lambda model, provider: (model, provider, False),
)
started = []
def fake_start(session, **kwargs):
started.append(kwargs)
return {"stream_id": "goal-stream", "session_id": session.session_id, "pending_started_at": 123.0}
monkeypatch.setattr(routes, "_start_chat_stream_for_session", fake_start)
monkeypatch.setattr(routes, "j", lambda handler, payload, status=200, **kwargs: {"status": status, "payload": payload})
result = routes._handle_goal_command(
object(),
{
"session_id": "sid-goal-route",
"args": "ship the feature",
"workspace": str(tmp_path),
"model": "gpt-5.5",
"model_provider": "openai-codex",
},
)
assert result["status"] == 200
assert result["payload"]["action"] == "set"
assert result["payload"]["stream_id"] == "goal-stream"
assert started and started[0]["msg"] == "ship the feature"
assert started[0]["model_provider"] == "openai-codex"
def test_routes_register_goal_endpoint_and_kickoff_stream():
assert 'if parsed.path == "/api/goal"' in ROUTES_PY
assert "return _handle_goal_command(handler, body)" in ROUTES_PY
assert "goal_command_payload" in ROUTES_PY
assert "kickoff_prompt" in ROUTES_PY
assert "_start_chat_stream_for_session" in ROUTES_PY
def test_streaming_post_turn_goal_hook_surfaces_and_continues():
assert "evaluate_goal_after_turn" in STREAMING_PY
assert "put('goal'" in STREAMING_PY
assert "decision.get('should_continue')" in STREAMING_PY
assert "continuation_prompt" in STREAMING_PY
assert "put('goal_continue'" in STREAMING_PY
goal_idx = STREAMING_PY.find("evaluate_goal_after_turn")
done_idx = STREAMING_PY.find("put('done'", goal_idx)
assert goal_idx != -1 and done_idx != -1
assert goal_idx < done_idx, "goal status should be emitted before the terminal done payload"
def test_streaming_goal_hook_emits_evaluating_state_before_judge():
evaluating_idx = STREAMING_PY.find("'state': 'evaluating'")
judge_idx = STREAMING_PY.find("_goal_decision = evaluate_goal_after_turn")
done_idx = STREAMING_PY.find("put('done'", judge_idx)
assert evaluating_idx != -1, "goal hook should emit an evaluating state before judge round-trip"
assert judge_idx != -1 and done_idx != -1
assert evaluating_idx < judge_idx < done_idx
assert "Evaluating goal progress…" in STREAMING_PY
assert "'state': 'continuing' if decision.get('should_continue') else 'idle'" in STREAMING_PY
def test_frontend_has_goal_slash_command_and_status_event_handler():
assert "{name:'goal'" in COMMANDS_JS
assert "subArgs:['status','pause','resume','clear']" in COMMANDS_JS
assert "function cmdGoal" in COMMANDS_JS
assert "api('/api/goal'" in COMMANDS_JS
assert "stream_id" in COMMANDS_JS
assert "goal'" in MESSAGES_JS
assert "source.addEventListener('goal'" in MESSAGES_JS
assert "source.addEventListener('goal_continue'" in MESSAGES_JS
assert "['steer','interrupt','queue','terminal','goal'].includes(_pc.name)" in MESSAGES_JS
assert "queueSessionMessage" in MESSAGES_JS
def test_frontend_goal_evaluating_state_uses_calm_composer_indicator():
assert "const goalState=String(d.state||'').trim();" in MESSAGES_JS
assert "const goalEvaluatingMessage='Evaluating goal progress…';" in MESSAGES_JS
assert "if(goalState==='evaluating')" in MESSAGES_JS
assert "setComposerStatus(goalEvaluatingMessage);" in MESSAGES_JS
assert "return;" in MESSAGES_JS
+37 -21
View File
@@ -3,6 +3,9 @@ import sys
import types
from unittest import mock
# Sentinel for sys.modules save/restore — distinguishes "key wasn't there" from None.
_MISSING = object()
def test_stream_completion_overwrites_session_usage_with_latest_turn(cleanup_test_sessions):
"""#1857: completed turns must not add prompt tokens to stale session totals."""
@@ -125,27 +128,40 @@ def test_stream_completion_overwrites_session_usage_with_latest_turn(cleanup_tes
fake_hermes_state = types.ModuleType("hermes_state")
fake_hermes_state.SessionDB = mock.Mock(return_value=None)
with mock.patch.object(streaming, "get_session", return_value=fake_session), \
mock.patch.object(streaming, "_get_ai_agent", return_value=UsageAgent), \
mock.patch.object(streaming, "resolve_model_provider", return_value=("gpt-5.4", "openai", None)), \
mock.patch("api.config.get_config", return_value={}), \
mock.patch("api.config._resolve_cli_toolsets", return_value=[]), \
mock.patch.dict(
sys.modules,
{
"hermes_cli": fake_hermes_cli,
"hermes_cli.runtime_provider": fake_runtime_module,
"hermes_state": fake_hermes_state,
},
):
streaming.STREAMS[fake_stream_id] = fake_queue
streaming._run_agent_streaming(
session_id=fake_session.session_id,
msg_text="new turn",
model="gpt-5.4",
workspace="/tmp",
stream_id=fake_stream_id,
)
# NOTE: We deliberately avoid mock.patch.dict(sys.modules, ...) here.
# patch.dict tracks original keys at __enter__ and on __exit__ DELETES any
# keys added during the patch that weren't in the original snapshot. That
# silently evicts lazily-imported submodules (e.g. pydantic.root_model)
# that other tests rely on, producing KeyError: 'pydantic.root_model' in
# downstream tests (notably tests/test_mcp_server.py via fastmcp imports).
# Manual save/restore only touches the three keys we explicitly inject.
_injected = {
"hermes_cli": fake_hermes_cli,
"hermes_cli.runtime_provider": fake_runtime_module,
"hermes_state": fake_hermes_state,
}
_saved = {k: sys.modules.get(k, _MISSING) for k in _injected}
sys.modules.update(_injected)
try:
with mock.patch.object(streaming, "get_session", return_value=fake_session), \
mock.patch.object(streaming, "_get_ai_agent", return_value=UsageAgent), \
mock.patch.object(streaming, "resolve_model_provider", return_value=("gpt-5.4", "openai", None)), \
mock.patch("api.config.get_config", return_value={}), \
mock.patch("api.config._resolve_cli_toolsets", return_value=[]):
streaming.STREAMS[fake_stream_id] = fake_queue
streaming._run_agent_streaming(
session_id=fake_session.session_id,
msg_text="new turn",
model="gpt-5.4",
workspace="/tmp",
stream_id=fake_stream_id,
)
finally:
for k, prev in _saved.items():
if prev is _MISSING:
sys.modules.pop(k, None)
else:
sys.modules[k] = prev
assert fake_session.input_tokens == 123
assert fake_session.output_tokens == 45
+925
View File
@@ -0,0 +1,925 @@
"""Tests for mcp_server.py — Option A rewrite (Issue #1616).
Covers: project CRUD, profile scoping, title collision, color validation,
session listing, cross-profile isolation.
Uses HERMES_WEBUI_STATE_DIR env var to point to a temp directory,
so tests don't touch the real webui state. Module is re-imported
per test class to ensure clean state.
"""
import json
import os
import sys
import tempfile
import uuid
from pathlib import Path
import pytest
# Skip the entire module when the optional `mcp` package isn't installed.
# CI runs with stdlib-only deps (pyyaml + pytest + pytest-timeout), and the
# `mcp` package is only required for users who actually run the MCP server.
# Locally with `pip install mcp pytest-asyncio` these tests run; on CI they
# skip cleanly without breaking the matrix.
pytest.importorskip("mcp", reason="mcp package not installed (optional MCP server dep)")
# pytest-asyncio is also optional but always installed alongside mcp tests
# in our local runs. If absent, importorskip the asyncio plugin gracefully.
pytest.importorskip("pytest_asyncio", reason="pytest-asyncio required for MCP server tests")
pytestmark = pytest.mark.asyncio
# ── Ensure repo root on path ──────────────────────────────────────────────
_REPO = Path(__file__).parent.parent.resolve()
if str(_REPO) not in sys.path:
sys.path.insert(0, str(_REPO))
# ═══════════════════════════════════════════════════════════════════════════
# State-restore bookkeeping
# ═══════════════════════════════════════════════════════════════════════════
#
# These tests mutate module-level constants on api.config / mcp_server /
# api.models (STATE_DIR, SESSION_DIR, PROJECTS_FILE, …) so the MCP server
# reads from a tmpdir. Without restoration, downstream tests in the full
# suite (test_pytest_state_isolation, test_provider_quota_status,
# test_provider_management, etc.) read the now-deleted tmpdir from
# api.config.STATE_DIR and fail.
#
# We snapshot the original values on first _reimport_mcp() call and restore
# them in _cleanup_state_dir() so the post-test module state matches pre-test.
_MISSING_ENV = object()
_SAVED_CONSTANTS = {"captured": False}
# ═══════════════════════════════════════════════════════════════════════════
# Helpers
# ═══════════════════════════════════════════════════════════════════════════
def _fresh_state_dir():
"""Create a clean temp state dir and set HERMES_WEBUI_STATE_DIR."""
td = tempfile.mkdtemp()
state_dir = Path(td)
sessions_dir = state_dir / "sessions"
sessions_dir.mkdir(parents=True)
(state_dir / "projects.json").write_text("[]", encoding="utf-8")
(sessions_dir / "_index.json").write_text("[]", encoding="utf-8")
os.environ["HERMES_WEBUI_STATE_DIR"] = str(state_dir)
return state_dir
def _cleanup_state_dir(state_dir: Path):
"""Remove temp state dir, clear env var, and restore api.config/mcp_server
module constants to whatever they were before the fixture started.
Without restoring, subsequent tests (test_pytest_state_isolation,
test_provider_quota_status, test_provider_management, etc.) read the
fixture's tmpdir from `api.config.STATE_DIR` and fail because the path
no longer exists or doesn't match their pytest-managed state dir."""
import shutil
shutil.rmtree(state_dir, ignore_errors=True)
os.environ.pop("HERMES_WEBUI_STATE_DIR", None)
# Restore api.config / mcp_server / api.models module constants.
saved = _SAVED_CONSTANTS
if saved.get("captured"):
import api.config as _cfg
for attr, val in saved["api.config"].items():
setattr(_cfg, attr, val)
if "mcp_server" in sys.modules:
mcp_mod = sys.modules["mcp_server"]
for attr, val in saved["mcp_server"].items():
setattr(mcp_mod, attr, val)
if "api.models" in sys.modules:
models_mod = sys.modules["api.models"]
for attr, val in saved["api.models"].items():
setattr(models_mod, attr, val)
# Restore HERMES_BASE_HOME / HERMES_HOME if we changed them
for env_key, env_val in saved["env"].items():
if env_val is _MISSING_ENV:
os.environ.pop(env_key, None)
else:
os.environ[env_key] = env_val
def _reimport_mcp():
"""Re-point mcp_server's module-level STATE_DIR / SESSION_DIR /
SESSION_INDEX_FILE / PROJECTS_FILE constants at the current
HERMES_WEBUI_STATE_DIR.
Returns (mcp_module, profiles_module) profiles_module is the
live api.profiles reference.
NOTE: Does NOT use `del sys.modules[...]` or `importlib.reload(...)`.
Both patterns trigger a chain re-import inside the FastMCP / pydantic
stack that corrupts pydantic's `_generics._GENERIC_TYPES_CACHE`
(manifests as `KeyError: 'pydantic.root_model'` in unrelated
downstream tests in the full suite). Instead, we mutate the
constants in-place after the first one-time import, which is
behaviorally equivalent for these tests since the constants are
module-level Path objects used only to compute STATE_DIR-rooted
paths at call time.
Also normalizes HERMES_BASE_HOME / HERMES_HOME to point at a
directory whose `profiles/` subdirectory we control. This isolates
us from sibling test files (e.g. test_profile_path_security.py)
that mutate those env vars during their own setup and don't restore
them in the strict sense the active-profile path resolution needs.
"""
state_dir = Path(os.environ['HERMES_WEBUI_STATE_DIR'])
# Sibling test files (e.g. test_profile_path_security.py) mutate
# HERMES_BASE_HOME / HERMES_HOME but only restore sys.modules — the
# env vars stay pointing at their tmpdir, which then breaks our
# active-profile path resolution. Re-anchor at a local home dir
# under our state_dir so other-profile scoping works.
isolated_home = state_dir.parent / "hermes-home"
(isolated_home / "profiles").mkdir(parents=True, exist_ok=True)
# Snapshot env vars BEFORE we overwrite them, so _cleanup_state_dir
# can restore them at fixture exit.
if not _SAVED_CONSTANTS.get("captured"):
_SAVED_CONSTANTS["env"] = {
"HERMES_BASE_HOME": os.environ.get("HERMES_BASE_HOME", _MISSING_ENV),
"HERMES_HOME": os.environ.get("HERMES_HOME", _MISSING_ENV),
}
os.environ["HERMES_BASE_HOME"] = str(isolated_home)
os.environ["HERMES_HOME"] = str(isolated_home)
import api.config as cfg
import mcp_server as mod
# First-time snapshot of module constants — captured AFTER the imports
# land their original values but BEFORE we mutate them below.
if not _SAVED_CONSTANTS.get("captured"):
_SAVED_CONSTANTS["api.config"] = {
attr: getattr(cfg, attr)
for attr in ("STATE_DIR", "SESSION_DIR", "WORKSPACES_FILE",
"SETTINGS_FILE", "LAST_WORKSPACE_FILE", "PROJECTS_FILE",
"SESSION_INDEX_FILE")
if hasattr(cfg, attr)
}
_SAVED_CONSTANTS["mcp_server"] = {
attr: getattr(mod, attr)
for attr in ("STATE_DIR", "SESSION_DIR", "PROJECTS_FILE",
"SESSION_INDEX_FILE", "WEBUI_HOST", "WEBUI_PORT",
"WEBUI_URL")
if hasattr(mod, attr)
}
if "api.models" in sys.modules:
models_mod = sys.modules["api.models"]
_SAVED_CONSTANTS["api.models"] = {
attr: getattr(models_mod, attr)
for attr in ("STATE_DIR", "PROJECTS_FILE", "SESSION_DIR")
if hasattr(models_mod, attr)
}
else:
_SAVED_CONSTANTS["api.models"] = {}
_SAVED_CONSTANTS["captured"] = True
# Acquire the api.profiles module THAT mcp_server's bound functions read.
# Sibling tests (test_profile_path_security.py) deletes api.profiles from
# sys.modules during their setup, then restores the originally-saved
# module reference. The result is that `import api.profiles` returns
# whatever module is currently in sys.modules, which may NOT be the same
# object as `mcp_server.get_active_profile_name`'s closure reference.
# We need to mutate the closure-bound module so mcp_server sees our
# _active_profile assignment.
import api.profiles as fresh_profiles_via_import
# mcp_server.get_active_profile_name is bound at first-import time and
# reads `_active_profile` from its own module's globals via closure.
# That module is the function's __globals__["__name__"] entry in
# sys.modules at first-import time. The most reliable way to find it
# is to follow the bound function back to its module.
bound_get_active = mod.get_active_profile_name
bound_module_name = bound_get_active.__module__
# Grab whatever Python currently has registered for that name; it may
# or may not be the same object as fresh_profiles_via_import.
# Use the function's __globals__ directly — that's the actual closure
# the function uses for its module-level reads.
bound_globals = bound_get_active.__globals__
# bound_globals IS the dict from sys.modules[<api.profiles>].__dict__ at
# first-import time. Mutating it propagates to all bound functions.
fresh_profiles = sys.modules.get(bound_module_name)
if fresh_profiles is None or fresh_profiles.__dict__ is not bound_globals:
# Sibling tests left a different module in sys.modules. The bound
# functions still use the original globals dict, so we expose a
# ModuleType-like proxy that writes to the original dict.
class _ProxyModule:
def __init__(self, globs):
self.__dict__ = globs
fresh_profiles = _ProxyModule(bound_globals)
# Re-point api.config module-level constants
cfg.STATE_DIR = state_dir
cfg.SESSION_DIR = state_dir / "sessions"
cfg.WORKSPACES_FILE = state_dir / "workspaces.json"
cfg.SETTINGS_FILE = state_dir / "settings.json"
cfg.LAST_WORKSPACE_FILE = state_dir / "last_workspace.txt"
cfg.PROJECTS_FILE = state_dir / "projects.json"
if hasattr(cfg, 'SESSION_INDEX_FILE'):
cfg.SESSION_INDEX_FILE = state_dir / "sessions" / "_index.json"
# Re-point mcp_server's imported aliases (they were copied at first
# import and don't pick up cfg mutations automatically).
mod.STATE_DIR = cfg.STATE_DIR
mod.SESSION_DIR = cfg.SESSION_DIR
mod.PROJECTS_FILE = cfg.PROJECTS_FILE
if hasattr(mod, 'SESSION_INDEX_FILE'):
mod.SESSION_INDEX_FILE = cfg.SESSION_INDEX_FILE
# api.models also imports STATE_DIR / PROJECTS_FILE etc. as module
# constants — re-point those too so load_projects() / save_projects()
# see the fresh STATE_DIR.
if 'api.models' in sys.modules:
models_mod = sys.modules['api.models']
if hasattr(models_mod, 'STATE_DIR'):
models_mod.STATE_DIR = cfg.STATE_DIR
if hasattr(models_mod, 'PROJECTS_FILE'):
models_mod.PROJECTS_FILE = cfg.PROJECTS_FILE
if hasattr(models_mod, 'SESSION_DIR'):
models_mod.SESSION_DIR = cfg.SESSION_DIR
# Re-evaluate WEBUI_URL from current env (PR #1895 made it env-aware
# but the value is computed once at module load; tests need to see
# current env state).
mod.WEBUI_HOST = os.environ.get("HERMES_WEBUI_HOST", "127.0.0.1")
mod.WEBUI_PORT = os.environ.get("HERMES_WEBUI_PORT", "8787")
mod.WEBUI_URL = f"http://{mod.WEBUI_HOST}:{mod.WEBUI_PORT}"
fresh_profiles._active_profile = 'default'
# Invalidate the root-profile cache (set at module load to detect
# renamed-root profiles, but stale after sibling tests that called
# switch_profile / list_profiles_api in their own setup).
if hasattr(fresh_profiles, '_invalidate_root_profile_cache'):
fresh_profiles._invalidate_root_profile_cache()
elif hasattr(fresh_profiles, '_root_profile_name_cache'):
fresh_profiles._root_profile_name_cache.clear()
fresh_profiles._root_profile_name_cache.add('default')
if hasattr(fresh_profiles, '_root_profile_name_cache_loaded'):
fresh_profiles._root_profile_name_cache_loaded = False
return mod, fresh_profiles
async def _call(mod, tool_name, **kwargs):
"""Call a tool handler and return parsed JSON."""
handler = mod.HANDLERS[tool_name]
result = await handler(kwargs)
return json.loads(result[0].text)
# ═══════════════════════════════════════════════════════════════════════════
# Project CRUD
# ═══════════════════════════════════════════════════════════════════════════
class TestCreateProject:
@pytest.fixture(autouse=True)
def setup(self):
self.state_dir = _fresh_state_dir()
self.mod, self.profiles = _reimport_mcp()
yield
_cleanup_state_dir(self.state_dir)
async def test_create_basic(self):
result = await _call(self.mod, "create_project", name="Test Project")
assert "project_id" in result
assert result["name"] == "Test Project"
assert result["profile"] == "default"
assert result["session_count"] == 0
async def test_create_with_color(self):
result = await _call(self.mod, "create_project",
name="Colored", color="#ff6600")
assert result["color"] == "#ff6600"
async def test_create_duplicate_exact_match(self):
await _call(self.mod, "create_project", name="My Project")
result = await _call(self.mod, "create_project", name="My Project")
assert "error" in result
assert "already exists" in result["error"]
async def test_create_case_sensitive_no_collision(self):
"""Exact match: 'MY project' and 'My Project' are different."""
await _call(self.mod, "create_project", name="My Project")
result = await _call(self.mod, "create_project", name="MY project")
assert "project_id" in result
async def test_create_empty_name(self):
result = await _call(self.mod, "create_project", name="")
assert "error" in result
async def test_create_invalid_color(self):
result = await _call(self.mod, "create_project",
name="Bad", color="not-a-color")
assert "error" in result
assert "Invalid color" in result["error"]
async def test_create_valid_color_formats(self):
for color in ["#fff", "#ff6600", "#ff6600aa"]:
result = await _call(self.mod, "create_project",
name=f"Color-{color}", color=color)
assert result["color"] == color
class TestRenameProject:
@pytest.fixture(autouse=True)
def setup(self):
self.state_dir = _fresh_state_dir()
self.mod, self.profiles = _reimport_mcp()
yield
_cleanup_state_dir(self.state_dir)
async def test_rename_basic(self):
created = await _call(self.mod, "create_project", name="Old")
pid = created["project_id"]
result = await _call(self.mod, "rename_project",
project_id=pid, name="New")
assert result["name"] == "New"
assert result["project_id"] == pid
async def test_rename_with_color(self):
created = await _call(self.mod, "create_project", name="X")
result = await _call(self.mod, "rename_project",
project_id=created["project_id"],
name="X", color="#000")
assert result["color"] == "#000"
async def test_rename_not_found(self):
result = await _call(self.mod, "rename_project",
project_id="nonexistent", name="Nope")
assert "error" in result
async def test_rename_wrong_profile(self):
created = await _call(self.mod, "create_project", name="DefaultOwned")
pid = created["project_id"]
self.profiles._active_profile = 'other'
result = await _call(self.mod, "rename_project",
project_id=pid, name="Stolen")
assert "error" in result
assert "not found" in result["error"].lower()
class TestDeleteProject:
@pytest.fixture(autouse=True)
def setup(self):
self.state_dir = _fresh_state_dir()
self.mod, self.profiles = _reimport_mcp()
yield
_cleanup_state_dir(self.state_dir)
async def test_delete_basic(self):
created = await _call(self.mod, "create_project", name="ToDelete")
pid = created["project_id"]
result = await _call(self.mod, "delete_project", project_id=pid)
assert result["ok"] is True
assert result["deleted"] == "ToDelete"
async def test_delete_not_found(self):
result = await _call(self.mod, "delete_project",
project_id="nonexistent")
assert "error" in result
async def test_delete_wrong_profile(self):
created = await _call(self.mod, "create_project", name="Owned")
pid = created["project_id"]
self.profiles._active_profile = 'other'
result = await _call(self.mod, "delete_project", project_id=pid)
assert "error" in result
async def test_delete_no_auth_refuses_unassign(self):
"""Without HERMES_WEBUI_PASSWORD, delete_project must NOT touch
session JSONs. Direct FS writes would bypass _write_session_index()
and leave _index.json holding the stale project_id, causing a
running WebUI to keep grouping sessions under the deleted project.
The handler should: delete the project from projects.json, leave
every session JSON untouched, leave the index untouched, and
surface a `warning` field telling the operator to set the env var.
"""
from api.config import SESSION_DIR, SESSION_INDEX_FILE
os.environ.pop("HERMES_WEBUI_PASSWORD", None)
# Create project + a session JSON that points at it
created = await _call(self.mod, "create_project", name="ToDelete")
pid = created["project_id"]
sid = "test_sess_001"
session_path = SESSION_DIR / f"{sid}.json"
session_payload = {
"session_id": sid,
"title": "T",
"project_id": pid,
"messages": [],
}
session_path.write_text(json.dumps(session_payload), encoding="utf-8")
# Index references the session under the project
SESSION_INDEX_FILE.write_text(
json.dumps([{"session_id": sid, "project_id": pid, "title": "T"}]),
encoding="utf-8")
index_before = SESSION_INDEX_FILE.read_text(encoding="utf-8")
session_before = session_path.read_text(encoding="utf-8")
result = await _call(self.mod, "delete_project", project_id=pid)
assert result["ok"] is True
assert result["unassigned_sessions"] == 0
assert "warning" in result
assert "HERMES_WEBUI_PASSWORD" in result["warning"]
# Session JSON untouched
assert session_path.read_text(encoding="utf-8") == session_before
# Index untouched
assert SESSION_INDEX_FILE.read_text(encoding="utf-8") == index_before
# ═══════════════════════════════════════════════════════════════════════════
# Profile Scoping
# ═══════════════════════════════════════════════════════════════════════════
class TestProfileScoping:
@pytest.fixture(autouse=True)
def setup(self):
self.state_dir = _fresh_state_dir()
self.mod, self.profiles = _reimport_mcp()
yield
_cleanup_state_dir(self.state_dir)
async def test_projects_tagged_with_profile(self):
result = await _call(self.mod, "create_project", name="Tagged")
assert result["profile"] == "default"
async def test_list_projects_respects_profile(self):
# Create under default
await _call(self.mod, "create_project", name="DefaultProject")
# Switch to other
self.profiles._active_profile = 'other'
await _call(self.mod, "create_project", name="OtherProject")
# List should only show current profile's projects
projects = await _call(self.mod, "list_projects")
names = [p["name"] for p in projects]
assert "OtherProject" in names
assert "DefaultProject" not in names
# Switch back
self.profiles._active_profile = 'default'
projects = await _call(self.mod, "list_projects")
names = [p["name"] for p in projects]
assert "DefaultProject" in names
assert "OtherProject" not in names
async def test_cross_profile_isolation_create(self):
"""Same name in different profiles should be allowed."""
await _call(self.mod, "create_project", name="Shared")
self.profiles._active_profile = 'other'
result = await _call(self.mod, "create_project", name="Shared")
assert "project_id" in result
async def test_legacy_untagged_hidden_from_non_root_profile(self):
"""Untagged projects (no `profile` field) belong to the root profile.
Mirrors api/routes.py:_profiles_match where a missing profile coerces
to 'default'. A non-root profile must NOT see legacy untagged rows.
"""
# Manually write a legacy untagged project (pre-#1614 schema)
import api.config as _cfg_mod
PROJECTS_FILE = _cfg_mod.PROJECTS_FILE
legacy = [{
"project_id": "legacy000001",
"name": "LegacyUntagged",
"color": None,
"created_at": 1700000000.0,
# No "profile" field on purpose
}]
PROJECTS_FILE.write_text(json.dumps(legacy), encoding="utf-8")
# Non-root profile must NOT see it
self.profiles._active_profile = 'other'
projects = await _call(self.mod, "list_projects")
names = [p["name"] for p in projects]
assert "LegacyUntagged" not in names
# Root profile still sees it (load_projects backfills `profile`
# to 'default', so visibility is preserved for the root).
self.profiles._active_profile = 'default'
projects = await _call(self.mod, "list_projects")
names = [p["name"] for p in projects]
assert "LegacyUntagged" in names
async def test_legacy_untagged_rename_blocked_from_non_root(self):
"""Non-root profile cannot rename a legacy untagged project."""
import api.config as _cfg_mod
PROJECTS_FILE = _cfg_mod.PROJECTS_FILE
legacy = [{
"project_id": "legacy000002",
"name": "Legacy",
"color": None,
"created_at": 1700000000.0,
}]
PROJECTS_FILE.write_text(json.dumps(legacy), encoding="utf-8")
self.profiles._active_profile = 'other'
result = await _call(self.mod, "rename_project",
project_id="legacy000002", name="Stolen")
assert "error" in result
# ═══════════════════════════════════════════════════════════════════════════
# Session listing
# ═══════════════════════════════════════════════════════════════════════════
class TestListSessions:
@pytest.fixture(autouse=True)
def setup(self):
self.state_dir = _fresh_state_dir()
self.mod, self.profiles = _reimport_mcp()
yield
_cleanup_state_dir(self.state_dir)
async def test_list_empty(self):
result = await _call(self.mod, "list_sessions")
assert result == []
async def test_list_with_limit(self):
result = await _call(self.mod, "list_sessions", limit=10)
assert isinstance(result, list)
async def test_list_unassigned(self):
result = await _call(self.mod, "list_sessions", unassigned=True)
assert isinstance(result, list)
# ═══════════════════════════════════════════════════════════════════════════
# Session mutations (HTTP API — basic validation only)
# ═══════════════════════════════════════════════════════════════════════════
class TestSessionMutations:
@pytest.fixture(autouse=True)
def setup(self):
self.state_dir = _fresh_state_dir()
self.mod, self.profiles = _reimport_mcp()
yield
_cleanup_state_dir(self.state_dir)
async def test_rename_missing_args(self):
result = await _call(self.mod, "rename_session",
session_id="", title="")
assert "error" in result
async def test_move_missing_args(self):
result = await _call(self.mod, "move_session",
session_id="", project_id="x")
assert "error" in result
async def test_move_project_not_found(self):
result = await _call(self.mod, "move_session",
session_id="s1", project_id="nonexistent")
assert "error" in result
async def test_move_target_owned_by_other_profile_rejected(self):
"""A project owned by profile A is invisible to profile B (#1614)."""
created = await _call(self.mod, "create_project", name="ATarget")
pid = created["project_id"]
self.profiles._active_profile = 'other'
result = await _call(self.mod, "move_session",
session_id="any", project_id=pid)
assert "error" in result
assert "not found" in result["error"].lower()
# ═══════════════════════════════════════════════════════════════════════════
# Auth helper
# ═══════════════════════════════════════════════════════════════════════════
class TestApiPassword:
@pytest.fixture(autouse=True)
def setup(self):
self.state_dir = _fresh_state_dir()
# Ensure env var is unset for the test
os.environ.pop("HERMES_WEBUI_PASSWORD", None)
self.mod, self.profiles = _reimport_mcp()
yield
_cleanup_state_dir(self.state_dir)
async def test_no_env_no_settings_returns_none(self):
assert self.mod._api_password() is None
async def test_password_hash_in_settings_is_ignored(self):
"""settings.json holds a hash, not a plaintext password — must NOT
be returned as if it were a usable password."""
from api.config import STATE_DIR as _SD
(_SD / "settings.json").write_text(
json.dumps({"password_hash": "$2b$12$abcdefghijk"}),
encoding="utf-8")
assert self.mod._api_password() is None
async def test_env_var_returned(self):
os.environ["HERMES_WEBUI_PASSWORD"] = "secret123"
try:
assert self.mod._api_password() == "secret123"
finally:
os.environ.pop("HERMES_WEBUI_PASSWORD", None)
# ═══════════════════════════════════════════════════════════════════════════
# _profiles_match parity (mcp_server vs api.routes vs api.profiles)
# ═══════════════════════════════════════════════════════════════════════════
#
# Locks the canonical-helper relocation: mcp_server.py and api/routes.py both
# now import _profiles_match from api/profiles.py. If anyone re-introduces a
# local copy in either module, both the identity check and the input-matrix
# parametrize trip immediately.
async def test_profiles_match_single_source_of_truth():
"""All three module names resolve to the same canonical object.
This locks the relocation: mcp_server.py and api/routes.py both import
_profiles_match from api/profiles.py rather than carrying a local copy.
Re-introducing a local definition in either module trips this test
immediately.
Imported here in a clean module-import context (not via _reimport_mcp,
which would re-execute api/profiles.py and produce a distinct function
object that's behaviorally identical but fails the `is` check).
NOTE: We swap-in fresh modules but RESTORE the originals at exit so
sibling test files (test_provider_quota_status etc.) that imported
api.profiles at module-load time continue to see the same object
they already have monkeypatch handles into. Otherwise their
`monkeypatch.setattr(profiles, ...)` patches the wrong module object.
"""
# Snapshot the originals; we'll put them back at the end.
saved_modules = {
k: sys.modules[k]
for k in ('mcp_server', 'api.routes', 'api.profiles')
if k in sys.modules
}
# Also snapshot the attributes on the parent `api` package, because
# `import api.routes as r` resolves via `sys.modules['api'].routes`,
# NOT directly via sys.modules['api.routes']. If we don't restore
# the parent attribute, subsequent `import api.routes as r` calls
# bind to the fresh re-imported module even though sys.modules
# holds the original.
import api as _api_parent
saved_api_attrs = {}
for sub in ('routes', 'profiles'):
if hasattr(_api_parent, sub):
saved_api_attrs[sub] = getattr(_api_parent, sub)
for k in ('mcp_server', 'api.routes', 'api.profiles'):
sys.modules.pop(k, None)
try:
import api.profiles as _profiles_mod
import api.routes as _routes_mod
import mcp_server as _mcp_mod
canonical = _profiles_mod._profiles_match
assert _routes_mod._profiles_match is canonical
assert _mcp_mod._profiles_match is canonical
finally:
# Restore so monkeypatch handles in sibling tests target the right module.
for k in ('mcp_server', 'api.routes', 'api.profiles'):
sys.modules.pop(k, None)
sys.modules.update(saved_modules)
# Restore parent-package attributes too (see above for why).
for sub, mod_obj in saved_api_attrs.items():
setattr(_api_parent, sub, mod_obj)
@pytest.mark.parametrize("a, b", [
(None, None),
(None, ''),
('', None),
('', ''),
(None, 'default'),
('default', None),
('default', 'default'),
('foo', 'foo'),
('foo', 'bar'),
('foo', None),
(None, 'foo'),
('default', 'foo'),
('foo', 'default'),
])
async def test_profiles_match_input_matrix(a, b):
"""mcp_server._profiles_match agrees with api.routes._profiles_match
on every (row, active) pair across the visibility matrix.
Note: function-object identity is checked separately in
test_profiles_match_single_source_of_truth here we only assert
behavioral parity, which is robust to test-fixture re-imports that
clear and re-execute api.profiles."""
from mcp_server import _profiles_match as mcp_match
from api.routes import _profiles_match as routes_match
assert mcp_match(a, b) == routes_match(a, b)
# ═══════════════════════════════════════════════════════════════════════════
# --profile CLI ordering regression
# ═══════════════════════════════════════════════════════════════════════════
#
# Maintainer ask: verify that --profile is applied to _active_profile *before*
# any api.models / api.profiles consumer reads the active profile. The risk
# is that if the canonical helpers cached the profile on first read at import
# time, a --profile foo flag passed at startup would bind too late.
#
# Today the helpers read _active_profile lazily (api/profiles.py:173 reads
# the module global at every call) so the override is safe. This test locks
# the behaviour: setting _active_profile = 'foo' before the first list call
# produces results filtered to 'foo', not the default.
class TestProfileCliOrdering:
@pytest.fixture(autouse=True)
def setup(self):
self.state_dir = _fresh_state_dir()
self.mod, self.profiles = _reimport_mcp()
yield
_cleanup_state_dir(self.state_dir)
async def test_active_profile_override_takes_effect_before_first_read(self):
"""--profile foo must filter list_projects to foo's rows immediately.
Simulates the CLI override path (mcp_server.py:62-64 sets
_profiles._active_profile = _profile_arg right after import). If a
helper had latched the profile at import time, the override here
would be too late and the test would see 'default'-tagged rows."""
import api.config as _cfg_mod
PROJECTS_FILE = _cfg_mod.PROJECTS_FILE
# Pre-seed two projects: one for default, one for foo.
seeded = [
{"project_id": "p_default_0001", "name": "DefaultRow",
"color": None, "profile": "default", "created_at": 1.0},
{"project_id": "p_foo_0001", "name": "FooRow",
"color": None, "profile": "foo", "created_at": 2.0},
]
PROJECTS_FILE.write_text(json.dumps(seeded), encoding="utf-8")
# Apply the override BEFORE the first list call. This is what
# mcp_server.py:62-64 does after argparse.
self.profiles._active_profile = 'foo'
projects = await _call(self.mod, "list_projects")
names = [p["name"] for p in projects]
assert "FooRow" in names
assert "DefaultRow" not in names
# ═══════════════════════════════════════════════════════════════════════════
# HTTP wire-format coverage for rename_session / move_session
# ═══════════════════════════════════════════════════════════════════════════
#
# Maintainer ask: exercise the actual HTTP path so a typo in WEBUI_URL or in
# the request body shape can't slip through validation-only tests. We stand
# up a tiny http.server stub on a free localhost port, point WEBUI_URL at it,
# and capture (path, body) from the requests our handlers issue. This is
# the thing that would have caught the original 8788 vs 8787 mismatch.
import http.server
import socket
import threading
class _RecordingHandler(http.server.BaseHTTPRequestHandler):
"""Captures POST path + body, returns canned JSON. Class-level state is
set by the fixture before each test so handlers can cross-reference."""
captured = None # populated per-test as a list of (path, body, headers)
canned_response = None # populated per-test: dict to be JSON-encoded
def log_message(self, *args, **kwargs): # noqa: D401 — silence stderr
pass
def do_POST(self):
length = int(self.headers.get("Content-Length", "0"))
raw = self.rfile.read(length) if length else b""
try:
body = json.loads(raw.decode("utf-8")) if raw else {}
except Exception:
body = {"_raw": raw.decode("utf-8", errors="replace")}
type(self).captured.append({
"path": self.path,
"body": body,
"cookie": self.headers.get("Cookie"),
"content_type": self.headers.get("Content-Type"),
})
payload = json.dumps(type(self).canned_response or {}).encode("utf-8")
self.send_response(200)
self.send_header("Content-Type", "application/json")
self.send_header("Content-Length", str(len(payload)))
self.end_headers()
self.wfile.write(payload)
def _free_port() -> int:
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.bind(("127.0.0.1", 0))
port = s.getsockname()[1]
s.close()
return port
class TestApiWireFormat:
@pytest.fixture(autouse=True)
def setup(self):
self.state_dir = _fresh_state_dir()
# Stand up a recording HTTP server on a free port. We override
# WEBUI_URL on the imported mcp_server module to point at it.
self.port = _free_port()
_RecordingHandler.captured = []
_RecordingHandler.canned_response = {}
self.httpd = http.server.HTTPServer(("127.0.0.1", self.port),
_RecordingHandler)
self.thread = threading.Thread(target=self.httpd.serve_forever,
daemon=True)
self.thread.start()
# Disable auth so _api_post() does not attempt a real /api/auth/login.
os.environ.pop("HERMES_WEBUI_PASSWORD", None)
self.mod, self.profiles = _reimport_mcp()
# Override AFTER import so the value sticks in the loaded module.
self.mod.WEBUI_URL = f"http://127.0.0.1:{self.port}"
yield
self.httpd.shutdown()
self.httpd.server_close()
self.thread.join(timeout=2)
_cleanup_state_dir(self.state_dir)
async def test_rename_session_posts_to_canonical_path(self):
"""rename_session must POST {session_id, title} to /api/session/rename."""
_RecordingHandler.canned_response = {
"session": {"session_id": "abc123", "title": "Renamed"}
}
result = await _call(self.mod, "rename_session",
session_id="abc123", title="Renamed")
assert len(_RecordingHandler.captured) == 1
req = _RecordingHandler.captured[0]
assert req["path"] == "/api/session/rename"
assert req["body"] == {"session_id": "abc123", "title": "Renamed"}
assert req["content_type"] == "application/json"
# Handler returns success-shaped result on 200.
assert result["ok"] is True
assert result["session_id"] == "abc123"
assert result["title"] == "Renamed"
assert result["method"] == "api"
async def test_move_session_posts_to_canonical_path(self):
"""move_session (with a project_id) POSTs to /api/session/move
after confirming the project exists locally."""
# Need a real project so the pre-flight profile check passes.
created = await _call(self.mod, "create_project", name="MoveTarget")
pid = created["project_id"]
_RecordingHandler.canned_response = {
"ok": True,
"session": {"session_id": "s1", "title": "T", "project_id": pid}
}
result = await _call(self.mod, "move_session",
session_id="s1", project_id=pid)
assert len(_RecordingHandler.captured) == 1
req = _RecordingHandler.captured[0]
assert req["path"] == "/api/session/move"
assert req["body"] == {"session_id": "s1", "project_id": pid}
assert result["ok"] is True
assert result["session_id"] == "s1"
assert result["project_id"] == pid
assert result["method"] == "api"
async def test_move_session_unassign_sends_null_project_id(self):
"""Passing project_id=None must serialize as JSON null (not omitted)."""
_RecordingHandler.canned_response = {
"ok": True, "session": {"session_id": "s1", "project_id": None}
}
result = await _call(self.mod, "move_session",
session_id="s1", project_id=None)
assert len(_RecordingHandler.captured) == 1
req = _RecordingHandler.captured[0]
assert req["path"] == "/api/session/move"
assert req["body"] == {"session_id": "s1", "project_id": None}
assert result["ok"] is True
async def test_url_built_from_env_vars(self):
"""HERMES_WEBUI_HOST / HERMES_WEBUI_PORT govern WEBUI_URL.
Locks the maintainer-suggested env-var contract from #1895 review:
the MCP must track the same env vars api/config.py:32-33 reads, so
a non-default WebUI port (e.g. 8788 when 8787 is held by another
service on the host) does not require a code edit."""
os.environ["HERMES_WEBUI_HOST"] = "10.0.0.42"
os.environ["HERMES_WEBUI_PORT"] = "9999"
try:
mod, _ = _reimport_mcp()
assert mod.WEBUI_HOST == "10.0.0.42"
assert mod.WEBUI_PORT == "9999"
assert mod.WEBUI_URL == "http://10.0.0.42:9999"
finally:
os.environ.pop("HERMES_WEBUI_HOST", None)
os.environ.pop("HERMES_WEBUI_PORT", None)
async def test_url_default_when_env_unset(self):
"""Default upstream port is 8787, matching api/config.py:33."""
os.environ.pop("HERMES_WEBUI_HOST", None)
os.environ.pop("HERMES_WEBUI_PORT", None)
mod, _ = _reimport_mcp()
assert mod.WEBUI_HOST == "127.0.0.1"
assert mod.WEBUI_PORT == "8787"
assert mod.WEBUI_URL == "http://127.0.0.1:8787"