From be32b90cea340ecc54dcd2f6e5989e1df351e5b0 Mon Sep 17 00:00:00 2001
From: Frank Song <franksong2702@gmail.com>
Date: Wed, 13 May 2026 10:57:17 +0800
Subject: [PATCH 1/3] docs: refresh current project snapshot

---
 .env.example      |   3 +
 ARCHITECTURE.md   | 177 +++++++++++++++++++++++-----------------------
 CHANGELOG.md      |   4 ++
 README.md         |  61 ++++++++--------
 TESTING.md        |  11 +--
 docker_init.bash  |   2 +-
 tests/conftest.py |   6 +-
 7 files changed, 137 insertions(+), 127 deletions(-)
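
Reviewer note (not part of the commit): the conftest.py and ARCHITECTURE.md
hunks in this patch say the test port is auto-derived from the repo path,
lands in 20000-29999, and can be pinned with HERMES_WEBUI_TEST_PORT. The
body of `_auto_test_port()` is not shown in this patch, so the sketch below
only illustrates one way such a derivation could work. The function name
`derive_test_port` and the SHA-256 hashing scheme are assumptions, not the
code in tests/conftest.py.

```python
import hashlib
import os
from pathlib import Path

# Port range quoted in the updated conftest.py comment.
PORT_MIN, PORT_MAX = 20000, 29999


def derive_test_port(repo_root) -> int:
    """Hypothetical stand-in for _auto_test_port(); the hashing is assumed."""
    # An explicit pin wins, as the docs describe for HERMES_WEBUI_TEST_PORT.
    pinned = os.getenv("HERMES_WEBUI_TEST_PORT")
    if pinned:
        return int(pinned)
    # Hash the resolved repo path so each worktree maps to a stable port.
    digest = hashlib.sha256(str(Path(repo_root).resolve()).encode("utf-8")).digest()
    span = PORT_MAX - PORT_MIN + 1
    return PORT_MIN + int.from_bytes(digest[:4], "big") % span
```

Under this scheme the same checkout always gets the same port. Two checkouts
in different directories usually get different ports, although hash
collisions within a 10,000-port range remain possible.
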
diff --git a/.env.example b/.env.example
index 768eca50..27cd9b3d 100644
--- a/.env.example
+++ b/.env.example
@@ -21,6 +21,9 @@
 # Default workspace directory shown on first launch
 # HERMES_WEBUI_DEFAULT_WORKSPACE=~/workspace
 
+# Optional model override. Leave unset to use the active Hermes provider default.
+# HERMES_WEBUI_DEFAULT_MODEL=
+
 # Base directory for all Hermes state (affects all paths above if set)
 # HERMES_HOME=~/.hermes
 
diff --git a/ARCHITECTURE.md b/ARCHITECTURE.md
index 1d8f5a71..c83c19e1 100644
--- a/ARCHITECTURE.md
+++ b/ARCHITECTURE.md
@@ -7,10 +7,10 @@
 >
 > Keep this document updated as architecture changes are made.
 
-> Current shipped build: `v0.50.245` (April 30, 2026).
-> Automated coverage: 3309 tests via `pytest tests/ --collect-only -q`. CI runs on Python 3.11, 3.12, and 3.13 against every PR.
+> Current shipped build: `v0.51.52` (May 12, 2026).
+> Automated coverage: 5271 tests via `pytest tests/ --collect-only -q`. CI runs on Python 3.11, 3.12, and 3.13 against every PR.
 >
-> Notable architecture state as of v0.50.245: workspace panel closed/open state is preloaded via a `documentElement` dataset marker before `style.css` paints to avoid first-load flash; transcript disclosure cards animate via transitionable `max-height`/`opacity` states; thinking cards share rounded bordered card chrome with tool cards (gold palette); incremental streaming-markdown via vendored `streaming-markdown@0.2.15` (no CDN); HTTP byte-range streaming for large media; SSE-driven session sidebar with `pending_user_message` + `active_stream_id` lifecycle tracking; configurable model badges (`primary` / `fallback N`) computed in `_build_configured_model_badges()` and provider-aware in the dropdown picker.
+> Notable architecture state as of v0.51.52: the bootstrap and first-run onboarding flow own setup discovery; the default WebUI state directory is `~/.hermes/webui`; `ctl.sh` provides a daemon wrapper for homelab installs; chat streaming is still WebUI-owned SSE with stream-ownership guards, cancellation, async manual compression, and turn-journal audit plumbing; provider/model discovery is profile-aware with live-model cache invalidation and custom-provider scoping.
 
 ---
 
@@ -43,42 +43,42 @@ actions. The topbar remains focused on conversation context and the workspace/fi
 ## 2. File Inventory
 
     <repo>/
-    server.py              Thin routing shell + HTTP Handler + auth middleware. ~81 lines.
+    server.py              Thin routing shell + HTTP Handler + auth middleware. ~435 lines.
                            Delegates all route handling to api/routes.py.
     bootstrap.py           One-shot launcher: optional agent install, deps, health wait, browser open.
     start.sh               Thin wrapper around bootstrap.py for shell-based startup.
-    Dockerfile             python:3.12-slim container image (~23 lines)
-    docker-compose.yml     Compose config with named volume and optional auth (~22 lines)
+    Dockerfile             python:3.12-slim container image (~89 lines)
+    docker-compose.yml     Compose config with named volume and optional auth (~57 lines)
     .dockerignore          Excludes .git, tests/, .env* from Docker builds
     api/
       __init__.py          Package marker
-      auth.py              Optional password authentication, signed cookies (~149 lines)
-      config.py            Discovery, globals, model detection, reloadable config (~701 lines)
-      helpers.py           HTTP helpers: j(), bad(), require(), safe_resolve(), security headers (~71 lines)
-      models.py            Session model + CRUD, per-session profile tracking (~137 lines)
-      profiles.py          Profile state management, hermes_cli wrapper (~246 lines)
-      onboarding.py        First-run onboarding status, real provider config writes, and readiness detection.
-      routes.py            All GET + POST route handlers (~1180 lines)
+      auth.py              Optional password authentication, signed cookies (~366 lines)
+      config.py            Discovery, globals, model detection, reloadable config (~4136 lines)
+      helpers.py           HTTP helpers: j(), bad(), require(), safe_resolve(), security headers (~302 lines)
+      models.py            Session model + CRUD, per-session profile tracking (~1927 lines)
+      profiles.py          Profile state management, hermes_cli wrapper (~1056 lines)
+      onboarding.py        First-run onboarding status, real provider config writes, OAuth linking, and readiness detection (~1002 lines)
+      routes.py            All GET + POST route handlers (~9630 lines)
       startup.py           Startup helpers: auto_install_agent_deps() (~50 lines)
-      streaming.py         SSE engine, run_agent, cancel, HERMES_HOME save/restore (~236 lines)
-      upload.py            Multipart parser, file upload handler (~78 lines)
-      workspace.py         File ops: list_dir, read_file_content, workspace helpers (~77 lines)
+      streaming.py         SSE engine, run_agent, cancel, HERMES_HOME save/restore (~4404 lines)
+      upload.py            Multipart parser, file upload handler (~284 lines)
+      workspace.py         File ops: list_dir, read_file_content, workspace helpers (~810 lines)
     static/
-      index.html           HTML template (~364 lines)
-      style.css            All CSS incl. mobile responsive (~670 lines)
-      ui.js                DOM helpers, renderMd, tool cards, model dropdown, file tree (~977 lines)
-      workspace.js         File preview, file ops, loadDir, clearPreview (~185 lines)
-      sessions.js          Session CRUD, list rendering, search, SVG icons, dropdown actions (~533 lines)
-      messages.js          send(), SSE event handlers, approval, transcript (~297 lines)
-      panels.js            Cron, skills, memory, workspace, profiles, todo, settings (~974 lines)
-      commands.js          Slash command registry, parser, autocomplete dropdown (~156 lines)
+      index.html           HTML template (~1323 lines)
+      style.css            All CSS incl. mobile responsive (~3767 lines)
+      ui.js                DOM helpers, renderMd, tool cards, model dropdown, file tree (~7197 lines)
+      workspace.js         File preview, file ops, loadDir, clearPreview (~369 lines)
+      sessions.js          Session CRUD, list rendering, search, SVG icons, dropdown actions (~3433 lines)
+      messages.js          send(), SSE event handlers, approval, transcript (~2301 lines)
+      panels.js            Cron, skills, memory, workspace, profiles, todo, settings (~6480 lines)
+      commands.js          Slash command registry, parser, autocomplete dropdown (~1302 lines)
       onboarding.js        First-run wizard overlay, provider setup flow, and settings/workspace orchestration.
-      boot.js              Event wiring, mobile sidebar/workspace nav, voice input, boot IIFE (~338 lines)
+      boot.js              Event wiring, mobile sidebar/workspace nav, voice input, boot IIFE (~1607 lines)
     tests/
-      conftest.py          Isolated test server (port 8788, separate HERMES_HOME) (~240 lines)
-      test_sprint{1-20b}.py Feature tests per sprint (21 files, 415 test functions)
-      test_regressions.py  Permanent regression gate (23 tests)
-    AGENTS.md              Instruction file for agents working in this directory.
+      conftest.py          Isolated test server/state fixtures (~630 lines)
+      483 test files       5271 tests collected via pytest
+      test_regressions.py  Permanent regression gate (~976 lines)
+    CONTRIBUTING.md        Contributor workflow and PR expectations.
     ROADMAP.md             Feature and product roadmap document.
     SPRINTS.md             Forward sprint plan with CLI + Claude parity targets.
     ARCHITECTURE.md        THIS FILE.
@@ -90,7 +90,7 @@ actions. The topbar remains focused on conversation context and the workspace/fi
 
 State directory (runtime data, separate from source):
 
-    ~/.hermes/webui-mvp/
+    ~/.hermes/webui/
     sessions/          One JSON file per session: {session_id}.json
     workspaces.json    Registered workspaces list
     last_workspace.txt Last-used workspace path
@@ -99,7 +99,8 @@ State directory (runtime data, separate from source):
 
 Log file:
 
-    /tmp/webui-mvp.log   stdout/stderr from the background server process
+    ~/.hermes/webui/bootstrap-8787.log   start.sh/bootstrap background server log
+    ~/.hermes/webui.log                  ctl.sh daemon log
 
 ---
 
@@ -118,15 +119,16 @@ Environment variables controlling behavior:
     HERMES_WEBUI_DEFAULT_WORKSPACE Default workspace path for new sessions
     HERMES_WEBUI_STATE_DIR         Where sessions/ folder lives
     HERMES_CONFIG_PATH             Path to ~/.hermes/config.yaml
-    HERMES_WEBUI_DEFAULT_MODEL     Default LLM model string
+    HERMES_WEBUI_DEFAULT_MODEL     Optional model override; unset means provider default
     HERMES_WEBUI_PASSWORD          Optional: enable password auth (off by default)
+    HERMES_WEBUI_SKIP_ONBOARDING   Optional: bypass the first-run onboarding wizard
     HERMES_HOME                    Base directory for Hermes state (~/.hermes by default)
 
 Test isolation environment variables (set by conftest.py):
 
-    HERMES_WEBUI_PORT=8788                           Isolated test port
-    HERMES_WEBUI_STATE_DIR=~/.hermes/webui-mvp-test  Isolated test state
-    HERMES_WEBUI_DEFAULT_WORKSPACE=.../test-workspace Isolated test workspace
+    HERMES_WEBUI_TEST_PORT=...                         Optional pinned test port
+    HERMES_WEBUI_TEST_STATE_DIR=~/.hermes/webui-test-* Optional pinned test state
+    HERMES_WEBUI_DEFAULT_WORKSPACE=.../test-workspace  Isolated test workspace
 
 Tests NEVER talk to the production server (port 8787).
 The test state dir is wiped before each test session and deleted after.
@@ -363,16 +365,18 @@ read_file_content(workspace, rel):
 ### 5.1 Structure
 
 The frontend is served from static/ as separate files: one HTML template, one CSS file,
-and six JavaScript modules (~2,786 lines total). External dependencies: Prism.js (syntax
-highlighting) and Mermaid.js (diagrams) from CDN, both loaded async/deferred with SRI hashes.
+and multiple JavaScript modules. External dependencies include Prism.js (syntax
+highlighting), Mermaid.js (diagrams), xterm.js, and KaTeX assets, all loaded
+under the integrity/CSP rules that the current static template assumes.
 
-Six JS modules loaded in order at end of <body>:
-  1. ui.js       (~846 lines) DOM helpers, renderMd, tool card rendering, global state
-  2. workspace.js (~169 lines) File tree, preview, file operations
-  3. sessions.js  (~532 lines) Session CRUD, list rendering, search, SVG icons, dropdown actions, project picker
-  4. messages.js  (~293 lines) send(), SSE event handlers, approval, transcript
-  5. panels.js    (~771 lines) Cron, skills, memory, workspace, todo, switchPanel
-  6. boot.js      (~175 lines) Event wiring + boot IIFE
+Core JS modules loaded by the app include:
+  1. ui.js         (~7197 lines) DOM helpers, renderMd, tool card rendering, global state
+  2. workspace.js  (~369 lines) File tree, preview, file operations
+  3. sessions.js   (~3433 lines) Session CRUD, list rendering, search, SVG icons, dropdown actions, project picker
+  4. messages.js   (~2301 lines) send(), SSE event handlers, approval, transcript
+  5. panels.js     (~6480 lines) Cron, skills, memory, workspace, profiles, todo, settings
+  6. commands.js   (~1302 lines) Slash command registry, parser, autocomplete dropdown
+  7. boot.js       (~1607 lines) Event wiring + boot IIFE
 
 sessions.js defines an `ICONS` constant at module level with hardcoded SVG strings for all
 session action buttons (pin, unpin, folder, archive, unarchive, duplicate, trash). All icons
@@ -680,27 +684,28 @@ Split server.py into a proper package. Completed across Sprints 4-10.
 Current structure:
 
     <repo>/
-      server.py               Entry point + HTTP Handler dispatch (~76 lines)
+      server.py               Entry point + HTTP Handler dispatch (~435 lines)
       api/
         __init__.py
-        routes.py             All GET + POST route handlers (~1016 lines)
-        config.py             Configuration, constants, global state, model discovery (~640 lines)
-        helpers.py            HTTP helpers: j(), bad(), require(), safe_resolve() (~57 lines)
-        models.py             Session model + CRUD (~132 lines)
-        workspace.py          File ops, workspace management (~77 lines)
-        upload.py             Multipart parser, file upload handler (~77 lines)
-        streaming.py          SSE engine, run_agent, cancel support (~222 lines)
+        routes.py             All GET + POST route handlers (~9630 lines)
+        config.py             Configuration, constants, global state, model discovery (~4136 lines)
+        helpers.py            HTTP helpers: j(), bad(), require(), safe_resolve() (~302 lines)
+        models.py             Session model + CRUD (~1927 lines)
+        workspace.py          File ops, workspace management (~810 lines)
+        upload.py             Multipart parser, file upload handler (~284 lines)
+        streaming.py          SSE engine, run_agent, cancel support (~4404 lines)
       static/
         index.html            HTML document (served from disk)
-        style.css             All CSS (~560 lines)
-        ui.js, workspace.js, sessions.js, messages.js, panels.js, boot.js
+        style.css             All CSS (~3767 lines)
+        ui.js, workspace.js, sessions.js, messages.js, panels.js, commands.js, boot.js
       tests/
-        conftest.py           Isolated test server on port 8788
-        test_sprint1-16.py    Feature tests per sprint (14 files)
+        conftest.py           Isolated test server/state fixtures
+        483 test files        5271 tests collected
         test_regressions.py   Permanent regression gate
 
-Route extraction to api/routes.py completed in Sprint 11. server.py is now a ~76-line
-thin shell: Handler class with structured logging, dispatch to routes, and main().
+Route extraction to api/routes.py completed in Sprint 11. server.py remains a
+thin shell relative to the rest of the app: Handler class with headers,
+structured logging, dispatch to routes, TLS wrapping, and main().
 
 ### Phase B: Thread-Safe Request Context (Priority: Critical, Effort: Medium)
 
@@ -779,7 +784,7 @@ Replacing with marked.js + DOMPurify is a future improvement (not blocking).
 
 ### Phase G: Observability -- MOSTLY COMPLETE
 
-1. Structured JSON logging: COMPLETE (Sprint 1). Per-request JSON to /tmp/webui-mvp.log.
+1. Structured JSON logging: COMPLETE (Sprint 1). Per-request JSON is printed to the active launcher log (`~/.hermes/webui/bootstrap-8787.log` for `start.sh`, `~/.hermes/webui.log` for `ctl.sh`).
 2. Enhanced /health: COMPLETE (Sprint 7). Returns `active_streams`, `uptime_seconds`.
 3. GET /api/debug/stats: NOT YET IMPLEMENTED. Low priority.
 
@@ -795,13 +800,13 @@ Optional password gate for non-SSH-tunnel deployments.
 
 ### Phase I: Test Infrastructure -- COMPLETE
 
-289 tests across 14 test files + regression gate. Isolated test server on port 8788
-with separate HERMES_HOME, wiped per run. Production data never touched.
+5271 tests across 483 test files + regression gates. The pytest fixture derives
+an isolated port and state directory from the repo path unless
+`HERMES_WEBUI_TEST_PORT` / `HERMES_WEBUI_TEST_STATE_DIR` pin them explicitly.
+Production data is never touched.
 
-Test files: `test_sprint1.py` through `test_sprint11.py`, `test_sprint16.py`, `test_regressions.py`.
-Fixtures in `conftest.py`: auto-cleanup, cron isolation, workspace reset.
-
-Remaining: no CI (GitHub Actions), no frontend tests (browser-based).
+Fixtures in `conftest.py`: auto-cleanup, profile/config isolation, cron
+isolation, workspace reset, and test-server lifecycle.
 
 ### Phase J: Performance (Priority: Low, Effort: High)
 
@@ -889,7 +894,8 @@ The api() helper:
     curl -s http://127.0.0.1:8787/health | python3 -m json.tool
 
     # Tail the server log live
-    tail -f /tmp/webui-mvp.log
+    tail -f ~/.hermes/webui/bootstrap-8787.log
+    tail -f ~/.hermes/webui.log  # when launched through ctl.sh
 
     # List all sessions (metadata only)
     curl -s http://127.0.0.1:8787/api/sessions | python3 -m json.tool
@@ -899,15 +905,15 @@ The api() helper:
     curl -s "http://127.0.0.1:8787/api/session?session_id=$SID" | python3 -m json.tool
 
     # Kill and restart server cleanly
-    pkill -f "python.*webui-mvp/server.py"
-    <agent-dir>/webui-mvp/start.sh
+    pkill -f "python.*server.py"
+    <repo>/start.sh
 
     # Check if server process is running
-    ps aux | grep "webui-mvp/server.py"
+    ps aux | grep "server.py"
 
     # Inspect session files on disk
-    ls -lt ~/.hermes/webui-mvp/sessions/
-    cat ~/.hermes/webui-mvp/sessions/SESSION_ID.json | python3 -m json.tool
+    ls -lt ~/.hermes/webui/sessions/
+    cat ~/.hermes/webui/sessions/SESSION_ID.json | python3 -m json.tool
 
     # Count messages in a session
     python3 -c "import json; d=json.load(open('sessions/SID.json')); print(len(d['messages']))"
@@ -920,9 +926,9 @@ The api() helper:
     curl -s http://127.0.0.1:8787/health  # streams not exposed yet, add in Phase G
 
     # Find all sessions with messages (not Untitled empty)
-    ls ~/.hermes/webui-mvp/sessions/ | xargs -I{} python3 -c "
+    ls ~/.hermes/webui/sessions/ | xargs -I{} python3 -c "
     import json, sys
-    d = json.load(open('~/.hermes/webui-mvp/sessions/{}'))
+    d = json.load(open('$HOME/.hermes/webui/sessions/{}'))
     if d['messages']: print('{}', d['title'][:50])
     " 2>/dev/null
 
@@ -1195,31 +1201,22 @@ will be working on this codebase. Read this before touching any file.
 ### Before Making Any Change
 
 1. Read this document (ARCHITECTURE.md) fully. Especially sections 4, 5, and the ADRs.
-2. Read the relevant section of server.py by searching for the SECTION header.
+2. Inspect the relevant module under `api/` or `static/`; `server.py` is only the routing shell.
 3. Check the Sprint Log (Section 15) to understand what was recently changed.
-4. Run the test suite first to confirm baseline: cd <agent-dir> &&
-   venv/bin/python -m pytest webui-mvp/tests/test_sprint1.py -v
+4. Run the relevant test slice first to confirm baseline, for example:
+   venv/bin/python -m pytest tests/test_regressions.py -q
 5. Check server health: curl -s http://127.0.0.1:8787/health
 
 ### Making Changes
 
-Always back up server.py before a non-trivial change:
-    cp server.py server.py.$(date +%Y%m%d_%H%M).bak
-
-Use exact string matching when patching. The pitfalls are documented in the
-hermes-webui-mvp skill. Key ones:
-- Never use sed on this file from the shell. Use execute_code with Python string replace.
-- Always assert the old string is found before replacing (prevents silent no-op patches).
-- Unicode escape sequences in JS (\u2026) exist as literal backslash-u in the file.
-  Match the file's raw content, not interpreted Python strings.
-- The HTML block is a Python raw string (r"""..."""). Standard triple-quote escaping
-  rules do not apply inside it, but Python escape sequences \n etc. work in JS strings
-  inside it as literal two-character sequences.
+Keep edits scoped to the module that owns the behavior. Use exact string
+matching when making mechanical patches and verify that the intended old string
+was found before replacing it.
 
 After any change:
-    venv/bin/python -m py_compile webui-mvp/server.py   # syntax check
+    venv/bin/python -m py_compile server.py             # syntax check
     curl -s http://127.0.0.1:8787/health                # server still alive
-    venv/bin/python -m pytest webui-mvp/tests/ -v       # tests still pass
+    venv/bin/python -m pytest tests/ -v                 # tests still pass
 
 ### Critical Rules (do NOT regress these)
 
diff --git a/CHANGELOG.md b/CHANGELOG.md
index 4a07d3e6..43e43a92 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -48,6 +48,10 @@
 
 - **PR #2150** by @Jordan-SkyLF — "Refresh usage" button on the Provider quota card in Settings → Providers. Calls `/api/provider/quota?refresh=1&ts=<now>` with `cache: 'no-store'` to bypass browser, service worker, and reverse-proxy caches that may have stamped a previous quota response, then re-renders just the quota card from the fresh response and shows a `Last checked ...` timestamp. Disabled `Refreshing…` state during the in-flight request; success toast on completion or failure toast if the refresh fails. Note: the `refresh=1` query param is a no-op at the server today (`get_provider_quota()` has no in-process cache layer), so the win is strictly browser-side cache-bust + the `no-store` fetch option. A future maintainer follow-up may add server-side TTL caching of OAuth account-limit fetches, at which point the `refresh=1` param becomes load-bearing on both sides.
 
+### Documentation
+
+- Refreshed the README / TESTING / ARCHITECTURE current-state snapshots for `v0.51.52`: default model override semantics, test collection counts, file inventory line counts, default state/log paths, and the top-level docs index now match the current code. Also corrected the Docker init banner for `HERMES_WEBUI_STATE_DIR`.
+
 ## [v0.51.51] — 2026-05-12 — Release AA (stage-344 — 16-PR contributor batch — i18n + insights bucketing/mobile + manual-compress async + workspace recovery + iOS PWA scroll + Cloudflare login health + fr locale)
 
 ### Added
diff --git a/README.md b/README.md
index f7563c6c..d6b0df80 100644
--- a/README.md
+++ b/README.md
@@ -267,7 +267,7 @@ Full list of environment variables:
 | `HERMES_WEBUI_PORT` | `8787` | Port |
 | `HERMES_WEBUI_STATE_DIR` | `~/.hermes/webui` | Where sessions and state are stored |
 | `HERMES_WEBUI_DEFAULT_WORKSPACE` | `~/workspace` | Default workspace |
-| `HERMES_WEBUI_DEFAULT_MODEL` | `openai/gpt-5.4-mini` | Default model |
+| `HERMES_WEBUI_DEFAULT_MODEL` | *(provider default)* | Optional model override; leave unset to use the active Hermes provider default |
 | `HERMES_WEBUI_PASSWORD` | *(unset)* | Set to enable password authentication |
 | `HERMES_WEBUI_EXTENSION_DIR` | *(unset)* | Optional local directory served at `/extensions/`; must point to an existing directory before extension injection is enabled |
 | `HERMES_WEBUI_EXTENSION_SCRIPT_URLS` | *(unset)* | Optional comma-separated same-origin script URLs to inject; see [WebUI Extensions](docs/EXTENSIONS.md) |
@@ -367,9 +367,9 @@ Or using the agent venv explicitly:
 /path/to/hermes-agent/venv/bin/python -m pytest tests/ -v
 ```
 
-Tests run against an isolated server on port 8788 with a separate state directory.
-Production data and real cron jobs are never touched. Current count: **3309 tests**
-across 100+ test files.
+Tests run against an isolated server with a separate state directory.
+Production data and real cron jobs are never touched. Current snapshot:
+**5271 tests collected** across **483 test files**.
 
 ---
 
@@ -491,33 +491,33 @@ across 100+ test files.
 ## Architecture
 
 ```
-server.py               HTTP routing shell + auth middleware (~154 lines)
+server.py               HTTP routing shell + auth middleware (~435 lines)
 api/
-  auth.py               Optional password authentication, signed cookies (~201 lines)
-  config.py             Discovery, globals, model detection, reloadable config (~1110 lines)
-  helpers.py            HTTP helpers, security headers (~175 lines)
-  models.py             Session model + CRUD + CLI bridge (~377 lines)
-  onboarding.py         First-run onboarding wizard, OAuth provider support (~507 lines)
-  profiles.py           Profile state management, hermes_cli wrapper (~411 lines)
-  routes.py             All GET + POST route handlers (~2250 lines)
-  state_sync.py         /insights sync — message_count to state.db (~113 lines)
-  streaming.py          SSE engine, run_agent, cancel support (~660 lines)
-  updates.py            Self-update check and release notes (~257 lines)
-  upload.py             Multipart parser, file upload handler (~82 lines)
-  workspace.py          File ops, workspace helpers, git detection (~288 lines)
+  auth.py               Optional password authentication, signed cookies (~366 lines)
+  config.py             Discovery, globals, model detection, reloadable config (~4136 lines)
+  helpers.py            HTTP helpers, security headers (~302 lines)
+  models.py             Session model + CRUD + CLI bridge (~1927 lines)
+  onboarding.py         First-run onboarding wizard, OAuth provider support (~1002 lines)
+  profiles.py           Profile state management, hermes_cli wrapper (~1056 lines)
+  routes.py             All GET + POST route handlers (~9630 lines)
+  state_sync.py         /insights sync — message_count to state.db (~118 lines)
+  streaming.py          SSE engine, run_agent, cancel support (~4404 lines)
+  updates.py            Self-update check and release notes (~545 lines)
+  upload.py             Multipart parser, file upload handler (~284 lines)
+  workspace.py          File ops, workspace helpers, git detection (~810 lines)
 static/
-  index.html            HTML template (~600 lines)
-  style.css             All CSS incl. mobile responsive, themes (~1050 lines)
-  ui.js                 DOM helpers, renderMd, tool cards, context indicator (~1740 lines)
-  workspace.js          File preview, file ops, git badge (~286 lines)
-  sessions.js           Session CRUD, collapsible groups, search, reload recovery (~800 lines)
-  messages.js           send(), SSE handlers, live streaming, session recovery (~655 lines)
-  panels.js             Cron, skills, memory, profiles, settings (~1438 lines)
-  commands.js           Slash command autocomplete (~267 lines)
-  boot.js               Mobile nav, voice input, boot IIFE (~524 lines)
+  index.html            HTML template (~1323 lines)
+  style.css             All CSS incl. mobile responsive, themes (~3767 lines)
+  ui.js                 DOM helpers, renderMd, tool cards, context indicator (~7197 lines)
+  workspace.js          File preview, file ops, git badge (~369 lines)
+  sessions.js           Session CRUD, collapsible groups, search, reload recovery (~3433 lines)
+  messages.js           send(), SSE handlers, live streaming, session recovery (~2301 lines)
+  panels.js             Cron, skills, memory, profiles, settings (~6480 lines)
+  commands.js           Slash command autocomplete (~1302 lines)
+  boot.js               Mobile nav, voice input, boot IIFE (~1607 lines)
 tests/
-  conftest.py           Isolated test server (port 8788)
-  61 test files          961 test functions
+  conftest.py           Isolated test server/state fixtures
+  483 test files         5271 tests collected
 Dockerfile              python:3.12-slim container image
 docker-compose.yml      Compose with named volume and optional auth
 .github/workflows/      CI: multi-arch Docker build + GitHub Release on tag
@@ -537,8 +537,13 @@ State lives outside the repo at `~/.hermes/webui/` by default
 - `CHANGELOG.md` -- release notes per sprint
 - `SPRINTS.md` -- forward sprint plan with CLI + Claude parity targets
 - `THEMES.md` -- theme system documentation, custom theme guide
+- `docs/docker.md` -- Docker compose setup, common failures, and bind-mount migration
+- `docs/supervisor.md` -- launchd, systemd, supervisord, runit, and s6 process-supervisor setup
 - `docs/onboarding.md` -- first-run wizard, provider setup, local model server Base URLs, and safe re-runs
 - `docs/troubleshooting.md` -- diagnostic flows for common failures (e.g. "AIAgent not available")
+- `docs/wsl-autostart.md` -- WSL2 auto-start at Windows login
+- `docs/EXTENSIONS.md` -- administrator-controlled WebUI extension injection
+- `docs/rfcs/README.md` -- RFC index for larger architecture and durability proposals
 
 ## Contributors
 
diff --git a/TESTING.md b/TESTING.md
index ee35af45..a6ea7f20 100644
--- a/TESTING.md
+++ b/TESTING.md
@@ -8,7 +8,7 @@
 > Prerequisites: SSH tunnel is active on port 8787. Open http://localhost:8787 in browser.
 > Server health check: curl http://127.0.0.1:8787/health should return {"status":"ok"}.
 >
-> Automated coverage: 3648 tests collected via `pytest tests/ --collect-only -q`. Tests run on every PR via GitHub Actions on Python 3.11, 3.12, and 3.13. The suite covers the bootstrap/static wizard, real provider config persistence (`config.yaml` + `.env`), the `/api/onboarding/*` backend, the onboarding skip/existing-config guard, CSS regression coverage for thinking/tool card animation, streaming session persistence, mobile layout breakpoints, locale parity across 9 languages, and ~700 issue/PR-pinned regression tests.
+> Automated coverage: 5271 tests collected via `pytest tests/ --collect-only -q`. Tests run on every PR via GitHub Actions on Python 3.11, 3.12, and 3.13. The suite covers the bootstrap/static wizard, real provider config persistence (`config.yaml` + `.env`), the `/api/onboarding/*` backend, the onboarding skip/existing-config guard, CSS regression coverage for thinking/tool card animation, streaming session persistence, mobile layout breakpoints, locale parity across 11 languages, and hundreds of issue/PR-pinned regression tests.
 > Run: `pytest tests/ -v --timeout=60`
 >
 > Local regression focus: verify that a previously closed workspace panel stays visually closed from first paint through boot completion on desktop refresh; there should be no brief open-then-close flash.
@@ -533,7 +533,8 @@ FAIL: Sidebar causes layout overflow or blocks chat.
 ### T11.3: Structured Log Output
 SETUP: SSH access to the server.
 STEPS:
-  1. In a terminal: tail -f /tmp/webui-mvp.log
+  1. In a terminal: tail -f ~/.hermes/webui/bootstrap-8787.log
+     (or tail -f ~/.hermes/webui.log when launched through `ctl.sh`)
   2. In browser: perform any action (load page, send message, click file)
 EXPECT:
   - Log entries appear in terminal as JSON: {"ts":"...","method":"GET","path":"/health","status":200,"ms":0.1}
@@ -577,7 +578,7 @@ FAIL: Browser freezes, crash, or security issue.
 
 ## Automated Test Coverage Reference
 
-These behaviors are verified by pytest (run: venv/bin/python -m pytest webui-mvp/tests/ -v):
+These behaviors are verified by pytest (run: venv/bin/python -m pytest tests/ -v):
 
 Sprint 1 tests (test_sprint1.py):
   - Server health, session CRUD (create/load/update/delete/sort)
@@ -1835,8 +1836,8 @@ Bridged CLI sessions:
 
 ---
 
-*Last updated: v0.51.31, May 9, 2026*
-*Total automated tests collected: 4977*
+*Last updated: v0.51.52, May 13, 2026*
+*Total automated tests collected: 5271*
 *Regression gate: tests/test_regressions.py*
 *Run: pytest tests/ -v --timeout=60*
 *Source: <repo>/*
diff --git a/docker_init.bash b/docker_init.bash
index fbe71780..b45eb0b6 100644
--- a/docker_init.bash
+++ b/docker_init.bash
@@ -277,7 +277,7 @@ rm -f $it || error_exit "Failed to delete test file in /app"
 
 echo ""; echo "== Checking required environment variables for hermes-webui"
 
-echo ""; echo "-- HERMES_WEBUI_VERSION: Where to store sessions, workspaces, and other state (default: ~/.hermes/webui-mvp)"
+echo ""; echo "-- HERMES_WEBUI_STATE_DIR: Where to store sessions, workspaces, and other state (default: ~/.hermes/webui)"
 if [ -z "${HERMES_WEBUI_STATE_DIR+x}" ]; then error_exit "HERMES_WEBUI_STATE_DIR not set"; fi; 
 echo "-- HERMES_WEBUI_STATE_DIR: $HERMES_WEBUI_STATE_DIR"
 if [ ! -d "$HERMES_WEBUI_STATE_DIR" ]; then mkdir -p $HERMES_WEBUI_STATE_DIR || error_exit "Failed to create state directory at $HERMES_WEBUI_STATE_DIR"; fi
diff --git a/tests/conftest.py b/tests/conftest.py
index 8b993538..6d4e7ecc 100644
--- a/tests/conftest.py
+++ b/tests/conftest.py
@@ -2,8 +2,8 @@
 Shared pytest fixtures for webui-mvp tests.
 
 TEST ISOLATION:
-  Tests run against a SEPARATE server instance on port 8788 with a
-  completely separate state directory. Production data is never touched.
+  Tests run against a SEPARATE server instance on an auto-derived test port
+  with a completely separate state directory. Production data is never touched.
   The test state dir is wiped before each full test run and again on teardown.
 
 PATH DISCOVERY:
@@ -32,7 +32,7 @@ HERMES_HOME = pathlib.Path(os.getenv('HERMES_HOME', str(HOME / '.hermes')))
 
 # ── Test server config ────────────────────────────────────────────────────
 # Port and state dir auto-derive from the repo path when no env var is set,
-# giving every worktree its own isolated port (8800-8899) and state directory.
+# giving every worktree its own isolated port (20000-29999) and state directory.
 # Override with HERMES_WEBUI_TEST_PORT / HERMES_WEBUI_TEST_STATE_DIR to pin.
 
 def _auto_test_port(repo_root) -> int:
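
Reviewer note (not part of the hunk above): the comment change says each worktree gets its own port in 20000-29999 derived from the repo path, with `HERMES_WEBUI_TEST_PORT` as an override. The body of `_auto_test_port` is not shown in this patch, so the following is only a minimal sketch of one way such a derivation could work. The name `derive_test_port` and the SHA-256 hashing choice are assumptions made for illustration, not the project's implementation.

```python
import hashlib
import os
import pathlib

PORT_BASE = 20000  # lower bound of the range named in the conftest comment
PORT_SPAN = 10000  # covers 20000-29999 inclusive


def derive_test_port(repo_root) -> int:
    """Hypothetical helper: honor the env override, else hash the repo path."""
    pinned = os.getenv("HERMES_WEBUI_TEST_PORT")
    if pinned:
        return int(pinned)
    # Resolve the path so the same worktree always maps to the same port.
    resolved = str(pathlib.Path(repo_root).resolve())
    digest = hashlib.sha256(resolved.encode("utf-8")).digest()
    # Fold the first four digest bytes into the port range.
    return PORT_BASE + int.from_bytes(digest[:4], "big") % PORT_SPAN
```

A stable hash keeps the port constant across runs for one worktree, while two worktrees can still collide on the same port; pinning `HERMES_WEBUI_TEST_PORT` is the way out in that case.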

From 65fa18c7d96026b9bc032dcd8d9cb63ecf582987 Mon Sep 17 00:00:00 2001
From: Frank Song <franksong2702@gmail.com>
Date: Wed, 13 May 2026 13:38:47 +0800
Subject: [PATCH 2/3] docs: add agent onboarding entrypoint

---
 .gitignore                          |   2 +-
 AGENTS.md                           |  68 +++++++++
 ARCHITECTURE.md                     |   8 +-
 CHANGELOG.md                        |   1 +
 README.md                           |   6 +-
 TESTING.md                          |   4 +-
 docs/onboarding-agent-checklist.md  | 207 ++++++++++++++++++++++++++++
 docs/onboarding.md                  |   9 ++
 tests/test_docs_gitignore_policy.py |   6 +
 9 files changed, 302 insertions(+), 9 deletions(-)
 create mode 100644 AGENTS.md
 create mode 100644 docs/onboarding-agent-checklist.md
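
Reviewer note: the checklist added in this patch asks assistants to summarize `/api/onboarding/status` by a fixed list of fields and to redact paths and provider details for public reports. A minimal sketch of that summarizing step is below. It assumes the payload nests the `system.*` fields under a `system` object, which the dotted names suggest but this patch does not confirm; `summarize_status`, `format_summary`, and the `<redacted>` placeholder are hypothetical names chosen for illustration.

```python
import json

# Field names copied from the checklist's "Non-secret evidence commands" section.
SUMMARY_KEYS = (
    "completed",
    "system.hermes_found",
    "system.imports_ok",
    "system.config_path",
    "system.config_exists",
    "system.setup_state",
    "system.provider_configured",
    "system.provider_ready",
    "system.chat_ready",
    "system.current_provider",
    "system.current_model",
    "system.current_base_url",
    "system.env_path",
)

# Assumption: these are the fields most likely to leak local paths or hosts.
REDACT_KEYS = {"system.config_path", "system.env_path", "system.current_base_url"}


def summarize_status(payload: dict, redact: bool = True) -> dict:
    """Keep only the checklist fields; anything else in the payload is dropped."""
    summary = {}
    for dotted in SUMMARY_KEYS:
        node = payload
        for part in dotted.split("."):
            node = node.get(part) if isinstance(node, dict) else None
        if redact and dotted in REDACT_KEYS and node is not None:
            node = "<redacted>"
        summary[dotted] = node
    return summary


def format_summary(payload: dict, redact: bool = True) -> str:
    """Render the summary as indented JSON for a support report."""
    return json.dumps(summarize_status(payload, redact), indent=2)
```

Because the function builds the summary from an allowlist, unexpected keys in the payload never reach the report, which matches the checklist's advice not to paste the full payload.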

diff --git a/.gitignore b/.gitignore
index 28316280..e0f68fbc 100644
--- a/.gitignore
+++ b/.gitignore
@@ -19,7 +19,7 @@ archive/
 !.env.docker.example
 .claude/
 CLAUDE.md
-AGENTS.md
+AGENTS.local.md
 .cursorrules
 .windsurfrules
 .aider*
diff --git a/AGENTS.md b/AGENTS.md
new file mode 100644
index 00000000..a916be04
--- /dev/null
+++ b/AGENTS.md
@@ -0,0 +1,68 @@
+# Agent instructions for Hermes WebUI
+
+This file is the shared entry point for AI assistants working in this
+repository. Keep it project-specific and safe to publish. Do not put personal
+machine setup, private network details, credentials, tokens, or local-only
+workflow notes here.
+
+## Read first
+
+Before making changes, read:
+
+1. `README.md`
+2. `CONTRIBUTING.md`
+3. `CHANGELOG.md`
+
+For architecture, testing, or setup work, also read the matching reference:
+
+- `ARCHITECTURE.md` for design constraints and current module layout
+- `TESTING.md` for local verification commands and manual test guidance
+- `docs/onboarding.md` for first-run onboarding behavior
+- `docs/troubleshooting.md` for diagnostic flows
+
+## Onboarding and reinstall support
+
+If the task involves install, reinstall, bootstrap, first-run onboarding,
+provider setup, local model server setup, Docker onboarding, WSL onboarding, or
+support for a failed first run, read `docs/onboarding-agent-checklist.md`
+before running commands or inspecting logs.
+
+Follow that checklist's safety rules:
+
+- use isolated `HERMES_HOME` and `HERMES_WEBUI_STATE_DIR` for trials unless the
+  human explicitly asks to use real state
+- do not delete or overwrite a real `~/.hermes` directory without explicit
+  approval
+- do not print API keys, OAuth tokens, cookies, full `.env` files, full
+  `auth.json` files, or password hashes
+- collect non-secret status and log evidence before recommending a fix
+
+## Contribution style
+
+- Keep changes focused on one logical problem.
+- Prefer the existing Python + vanilla JavaScript structure over new
+  dependencies or build steps.
+- Update docs when changing setup, onboarding, runtime behavior, architecture,
+  or testing guidance.
+- Update `CHANGELOG.md` for user-visible behavior, setup, workflow, or
+  documentation changes that should be release-note ready.
+- For UI or UX changes, follow `CONTRIBUTING.md`: include before/after evidence
+  and test relevant responsive states.
+
+## Local state and secrets
+
+Hermes WebUI can read and write real agent state, sessions, workspaces,
+credentials, and cron data. Treat local validation as potentially destructive
+unless you have confirmed the active state directories.
+
+Prefer isolated trial state for experiments:
+
+```bash
+HERMES_HOME=/tmp/hermes-webui-agent-home \
+HERMES_WEBUI_STATE_DIR=/tmp/hermes-webui-agent-state \
+HERMES_WEBUI_PORT=8789 \
+python3 bootstrap.py
+```
+
+Do not include private machine instructions in this tracked file. Use a
+git-ignored local note for personal workflow details.
diff --git a/ARCHITECTURE.md b/ARCHITECTURE.md
index c83c19e1..a6bbfc1d 100644
--- a/ARCHITECTURE.md
+++ b/ARCHITECTURE.md
@@ -8,7 +8,7 @@
 > Keep this document updated as architecture changes are made.
 
 > Current shipped build: `v0.51.52` (May 12, 2026).
-> Automated coverage: 5271 tests via `pytest tests/ --collect-only -q`. CI runs on Python 3.11, 3.12, and 3.13 against every PR.
+> Automated coverage: 5272 tests via `pytest tests/ --collect-only -q`. CI runs on Python 3.11, 3.12, and 3.13 against every PR.
 >
 > Notable architecture state as of v0.51.52: the bootstrap and first-run onboarding flow own setup discovery; the default WebUI state directory is `~/.hermes/webui`; `ctl.sh` provides a daemon wrapper for homelab installs; chat streaming is still WebUI-owned SSE with stream-ownership guards, cancellation, async manual compression, and turn-journal audit plumbing; provider/model discovery is profile-aware with live-model cache invalidation and custom-provider scoping.
 
@@ -76,7 +76,7 @@ actions. The topbar remains focused on conversation context and the workspace/fi
       boot.js              Event wiring, mobile sidebar/workspace nav, voice input, boot IIFE (~1607 lines)
     tests/
       conftest.py          Isolated test server/state fixtures (~630 lines)
-      483 test files       5271 tests collected via pytest
+      483 test files       5272 tests collected via pytest
       test_regressions.py  Permanent regression gate (~976 lines)
     CONTRIBUTING.md        Contributor workflow and PR expectations.
     ROADMAP.md             Feature and product roadmap document.
@@ -700,7 +700,7 @@ Current structure:
         ui.js, workspace.js, sessions.js, messages.js, panels.js, commands.js, boot.js
       tests/
         conftest.py           Isolated test server/state fixtures
-        483 test files        5271 tests collected
+        483 test files        5272 tests collected
         test_regressions.py   Permanent regression gate
 
 Route extraction to api/routes.py completed in Sprint 11. server.py remains a
@@ -800,7 +800,7 @@ Optional password gate for non-SSH-tunnel deployments.
 
 ### Phase I: Test Infrastructure -- COMPLETE
 
-5271 tests across 483 test files + regression gates. The pytest fixture derives
+5272 tests across 483 test files + regression gates. The pytest fixture derives
 an isolated port and state directory from the repo path unless
 `HERMES_WEBUI_TEST_PORT` / `HERMES_WEBUI_TEST_STATE_DIR` pin them explicitly.
 Production data never touched.
diff --git a/CHANGELOG.md b/CHANGELOG.md
index 43e43a92..b0758704 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -51,6 +51,7 @@
 ### Documentation
 
 - Refreshed the README / TESTING / ARCHITECTURE current-state snapshots for `v0.51.52`: default model override semantics, test collection counts, file inventory line counts, default state/log paths, and the top-level docs index now match the current code. Also corrected the Docker init banner for `HERMES_WEBUI_STATE_DIR`.
+- Added a tracked root `AGENTS.md` entry point plus `docs/onboarding-agent-checklist.md` for assistant-led install/reinstall support, with safety rules for real Hermes state, isolated trial commands, non-secret evidence collection, onboarding pass/fail criteria, and a redacted support-report format. Linked the checklist from the README and first-run onboarding guide so assistants helping with setup see it before running commands.
 
 ## [v0.51.51] — 2026-05-12 — Release AA (stage-344 — 16-PR contributor batch — i18n + insights bucketing/mobile + manual-compress async + workspace recovery + iOS PWA scroll + Cloudflare login health + fr locale)
 
diff --git a/README.md b/README.md
index d6b0df80..63c33e8a 100644
--- a/README.md
+++ b/README.md
@@ -135,6 +135,7 @@ The bootstrap will:
 
 If provider setup is still incomplete after install, the onboarding wizard will point you to finish it with `hermes model` instead of trying to replicate the full CLI setup in-browser.
 For a step-by-step walkthrough of the wizard, provider choices, local model server Base URLs, and safe re-runs, see [`docs/onboarding.md`](docs/onboarding.md).
+If an AI assistant is helping with install, reinstall, bootstrap, provider setup, or first-run support, have it read [`docs/onboarding-agent-checklist.md`](docs/onboarding-agent-checklist.md) before running commands or inspecting logs.
 
 ---
 
@@ -369,7 +370,7 @@ Or using the agent venv explicitly:
 
 Tests run against an isolated server with a separate state directory.
 Production data and real cron jobs are never touched. Current snapshot:
-**5271 tests collected** across **483 test files**.
+**5272 tests collected** across **483 test files**.
 
 ---
 
@@ -517,7 +518,7 @@ static/
   boot.js               Mobile nav, voice input, boot IIFE (~1607 lines)
 tests/
   conftest.py           Isolated test server/state fixtures
-  483 test files         5271 tests collected
+  483 test files         5272 tests collected
 Dockerfile              python:3.12-slim container image
 docker-compose.yml      Compose with named volume and optional auth
 .github/workflows/      CI: multi-arch Docker build + GitHub Release on tag
@@ -540,6 +541,7 @@ State lives outside the repo at `~/.hermes/webui/` by default
 - `docs/docker.md` -- Docker compose setup, common failures, and bind-mount migration
 - `docs/supervisor.md` -- launchd, systemd, supervisord, runit, and s6 process-supervisor setup
 - `docs/onboarding.md` -- first-run wizard, provider setup, local model server Base URLs, and safe re-runs
+- `docs/onboarding-agent-checklist.md` -- safety rules, evidence commands, and pass/fail checks for assistant-led install or reinstall support
 - `docs/troubleshooting.md` -- diagnostic flows for common failures (e.g. "AIAgent not available")
 - `docs/wsl-autostart.md` -- WSL2 auto-start at Windows login
 - `docs/EXTENSIONS.md` -- administrator-controlled WebUI extension injection
diff --git a/TESTING.md b/TESTING.md
index a6ea7f20..324f2420 100644
--- a/TESTING.md
+++ b/TESTING.md
@@ -8,7 +8,7 @@
 > Prerequisites: SSH tunnel is active on port 8787. Open http://localhost:8787 in browser.
 > Server health check: curl http://127.0.0.1:8787/health should return {"status":"ok"}.
 >
-> Automated coverage: 5271 tests collected via `pytest tests/ --collect-only -q`. Tests run on every PR via GitHub Actions on Python 3.11, 3.12, and 3.13. The suite covers the bootstrap/static wizard, real provider config persistence (`config.yaml` + `.env`), the `/api/onboarding/*` backend, the onboarding skip/existing-config guard, CSS regression coverage for thinking/tool card animation, streaming session persistence, mobile layout breakpoints, locale parity across 11 languages, and hundreds of issue/PR-pinned regression tests.
+> Automated coverage: 5272 tests collected via `pytest tests/ --collect-only -q`. Tests run on every PR via GitHub Actions on Python 3.11, 3.12, and 3.13. The suite covers the bootstrap/static wizard, real provider config persistence (`config.yaml` + `.env`), the `/api/onboarding/*` backend, the onboarding skip/existing-config guard, CSS regression coverage for thinking/tool card animation, streaming session persistence, mobile layout breakpoints, locale parity across 11 languages, and hundreds of issue/PR-pinned regression tests.
 > Run: `pytest tests/ -v --timeout=60`
 >
 > Local regression focus: verify that a previously closed workspace panel stays visually closed from first paint through boot completion on desktop refresh; there should be no brief open-then-close flash.
@@ -1837,7 +1837,7 @@ Bridged CLI sessions:
 ---
 
 *Last updated: v0.51.52, May 13, 2026*
-*Total automated tests collected: 5271*
+*Total automated tests collected: 5272*
 *Regression gate: tests/test_regressions.py*
 *Run: pytest tests/ -v --timeout=60*
 *Source: <repo>/*
diff --git a/docs/onboarding-agent-checklist.md b/docs/onboarding-agent-checklist.md
new file mode 100644
index 00000000..df62f90e
--- /dev/null
+++ b/docs/onboarding-agent-checklist.md
@@ -0,0 +1,207 @@
+# Agent-assisted onboarding checklist
+
+This checklist is for an AI assistant helping a human install, reinstall, or
+debug Hermes WebUI onboarding. It does not replace the human first-run wizard.
+Use it before running bootstrap commands, inspecting logs, or recommending a
+cleanup path.
+
+If you are an AI assistant, read this file before assisting with onboarding,
+bootstrap, provider setup, reinstall, or first-run support.
+
+## Role split
+
+The human operator owns:
+
+- choosing the install path
+- choosing the provider and model
+- entering API keys, OAuth codes, and passwords
+- approving any cleanup of a real Hermes home
+- approving any external exposure outside localhost
+
+The assistant owns:
+
+- using isolated trial directories unless the human explicitly says otherwise
+- checking non-secret status endpoints and logs
+- explaining which step passed or failed
+- collecting redacted evidence for Discord or GitHub support
+- stopping before destructive cleanup, credential handling, or public exposure
+
+## Hard safety rules
+
+- Do not delete, move, or overwrite the real `~/.hermes` directory unless the
+  human explicitly asks for that exact action.
+- Do not print API keys, OAuth tokens, cookies, full `.env` files, full
+  `auth.json` files, or password hashes.
+- Do not modify real cron jobs, real sessions, real profiles, or real memory
+  files during an onboarding trial.
+- Do not expose WebUI on a public interface without password protection and
+  explicit human approval.
+- Do not send checks of local services (`localhost`, `127.0.0.1`, private LAN
+  addresses, Docker container loopback paths) through a proxy or tunnel.
+
+## Pre-flight
+
+Confirm the basic context:
+
+```bash
+pwd
+git branch --show-current
+git rev-parse --short HEAD
+python3 --version
+```
+
+Check whether repo-local environment overrides will affect bootstrap:
+
+```bash
+test -f .env && grep -nE 'HERMES_HOME|HERMES_WEBUI_STATE_DIR|HERMES_WEBUI_PORT|HERMES_WEBUI_HOST' .env
+```
+
+If `.env` exists, do not print the full file. Inspect only the specific
+non-secret keys needed to understand the active Hermes home, WebUI state
+directory, port, or host.
+
+## Isolated local trial
+
+Use an isolated Hermes home and WebUI state directory for a reinstall or support
+trial. This keeps the test away from the operator's real memory, sessions,
+profiles, credentials, and cron state.
+
+```bash
+mkdir -p ~/hermes-onboarding-test
+HERMES_HOME=~/hermes-onboarding-test/.hermes \
+HERMES_WEBUI_STATE_DIR=~/hermes-onboarding-test/webui \
+HERMES_WEBUI_PORT=8789 \
+python3 bootstrap.py
+```
+
+Open:
+
+```text
+http://127.0.0.1:8789
+```
+
+The bootstrap writes a port-specific log under the selected WebUI state
+directory:
+
+```text
+~/hermes-onboarding-test/webui/bootstrap-8789.log
+```
+
+For daemon-style installs, `ctl.sh` writes the daemon log to the active
+`HERMES_HOME` by default:
+
+```text
+~/.hermes/webui.log
+```
+
+When using the isolated trial environment, prefer the bootstrap command above
+unless the human specifically wants to validate `ctl.sh`.
+
+## Non-secret evidence commands
+
+After the server starts, collect status without secrets:
+
+```bash
+curl -sS http://127.0.0.1:8789/health
+curl -sS http://127.0.0.1:8789/api/onboarding/status
+find ~/hermes-onboarding-test -maxdepth 3 -type f | sort
+tail -n 120 ~/hermes-onboarding-test/webui/bootstrap-8789.log
+```
+
+When summarizing `/api/onboarding/status`, focus on:
+
+- `completed`
+- `system.hermes_found`
+- `system.imports_ok`
+- `system.config_path`
+- `system.config_exists`
+- `system.setup_state`
+- `system.provider_configured`
+- `system.provider_ready`
+- `system.chat_ready`
+- `system.current_provider`
+- `system.current_model`
+- `system.current_base_url`
+- `system.env_path`
+
+Do not paste the full payload if it contains unexpected sensitive local paths
+or values. Redact paths and provider details when the human asks for a public
+GitHub or Discord support report.
+
+## Pass criteria
+
+A local onboarding trial passes when:
+
+- `/health` returns successfully.
+- `/api/onboarding/status` returns JSON.
+- The wizard appears when `completed` is false.
+- The wizard stays out of the way when `completed` is true or
+  `HERMES_WEBUI_SKIP_ONBOARDING=1` is intentionally set.
+- `system.hermes_found` and `system.imports_ok` match the expected bootstrap
+  state.
+- `system.provider_ready` and `system.chat_ready` become true after the human
+  completes a provider path that should support chat.
+- `system.config_path` and `system.env_path` point inside the intended isolated
+  `HERMES_HOME` during a trial.
+- WebUI files are written under the intended `HERMES_WEBUI_STATE_DIR`.
+
+If the human chooses a provider that must be completed in the CLI, passing can
+mean the wizard correctly points them to `hermes model` or `hermes auth` rather
+than trying to collect unsupported credentials in the browser.
+
+## Failure triage
+
+If the server does not start:
+
+- check the bootstrap log
+- check for a port conflict on `8789`
+- confirm Python can run `bootstrap.py`
+- confirm `.env` is not overriding the isolated directories or port
+
+If onboarding reports `agent_unavailable`:
+
+- confirm the bootstrap found or installed Hermes Agent
+- check whether the running Python can import `run_agent.AIAgent`
+- use `docs/troubleshooting.md`, especially the `AIAgent not available` flow
+
+If onboarding reports `provider_incomplete`:
+
+- confirm whether the provider is API-key based, OAuth based, or local
+- let the human enter credentials or run the CLI auth flow
+- do not ask the human to paste secrets into chat
+
+If a local model server does not probe successfully:
+
+- from native macOS/Linux, use `http://127.0.0.1:<port>/v1` when the server is
+  on the same host
+- from Docker Desktop, use `http://host.docker.internal:<port>/v1`
+- from another LAN machine, use the server's LAN IP and `/v1`
+- remember that `localhost` inside a container is the container itself
+
+If password or reverse-proxy behavior is confusing:
+
+- keep the first pass on `127.0.0.1`
+- require password protection before exposing WebUI beyond localhost
+- include the reverse proxy shape in the support report without pasting tokens
+  or cookies
+
+## Final support report
+
+Use this shape when reporting results to the human, Discord, or GitHub:
+
+```text
+Install path:
+OS / Python:
+Repo commit:
+Command used:
+WebUI URL:
+State isolation:
+Health result:
+Onboarding status summary:
+Files created or changed:
+Log excerpt:
+Pass/fail:
+Next recommended action:
+```
+
+Redact secrets and private paths before posting publicly.
diff --git a/docs/onboarding.md b/docs/onboarding.md
index f6409f96..b53b68fc 100644
--- a/docs/onboarding.md
+++ b/docs/onboarding.md
@@ -3,6 +3,11 @@
 This guide explains what happens the first time Hermes WebUI starts, which
 setup path to choose, and how to recover when the wizard cannot finish.
 
+If an AI assistant is helping with install, reinstall, bootstrap, provider
+setup, or first-run support, have it read
+[`docs/onboarding-agent-checklist.md`](onboarding-agent-checklist.md) before
+running commands or inspecting logs.
+
 The short version: run the bootstrap, open the WebUI, choose a provider, choose
 a workspace, optionally set a password, then start a chat. If you are using a
 local model server from Docker, pay special attention to the Base URL section
@@ -55,6 +60,10 @@ python3 bootstrap.py
 
 Then open `http://127.0.0.1:8789`.
 
+For an assistant-led trial run, follow the safety rules, evidence commands, and
+pass/fail criteria in
+[`docs/onboarding-agent-checklist.md`](onboarding-agent-checklist.md).
+
 If your repo has a `.env` file, remember that the bootstrap loads it. Remove or
 adjust any `HERMES_HOME`, `HERMES_WEBUI_STATE_DIR`, or `HERMES_WEBUI_PORT`
 entries there before using the isolated command above.
diff --git a/tests/test_docs_gitignore_policy.py b/tests/test_docs_gitignore_policy.py
index a2729fae..de0271db 100644
--- a/tests/test_docs_gitignore_policy.py
+++ b/tests/test_docs_gitignore_policy.py
@@ -20,6 +20,12 @@ def test_new_top_level_markdown_docs_are_trackable():
     assert _git_check_ignore("docs/example-new-guide.md").returncode == 1
 
 
+def test_root_agents_entrypoint_is_trackable():
+    """AGENTS.md is the shared repo entrypoint; local overrides stay ignored."""
+    assert _git_check_ignore("AGENTS.md").returncode == 1
+    assert _git_check_ignore("AGENTS.local.md").returncode == 0
+
+
 def test_docs_scratch_files_remain_ignored():
     """The broad docs/* ignore rule should still keep arbitrary scratch files out."""
     assert _git_check_ignore("docs/local-scratch.tmp").returncode == 0

From 155a727ec1882d6cc66cbb3fb07eae085f0ef48c Mon Sep 17 00:00:00 2001
From: Frank Song <franksong2702@gmail.com>
Date: Wed, 13 May 2026 16:56:21 +0800
Subject: [PATCH 3/3] docs: refresh current snapshot for v0.51.54

---
 ARCHITECTURE.md | 40 ++++++++++++++++++++--------------------
 CHANGELOG.md    |  2 +-
 README.md       | 16 ++++++++--------
 TESTING.md      |  6 +++---
 4 files changed, 32 insertions(+), 32 deletions(-)
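
Reviewer note: this patch refreshes the approximate `(~N lines)` figures in the file inventories by hand. A small helper along the following lines could recompute them before the next snapshot. It is only a sketch: `count_lines` and `inventory` are hypothetical names, and the way the existing figures were counted is not stated in the patch, so results may differ by one for files without a trailing newline.

```python
import pathlib


def count_lines(path) -> int:
    """Count lines by iterating the file in binary mode (no decoding needed)."""
    with open(path, "rb") as handle:
        return sum(1 for _ in handle)


def inventory(repo_root, relative_paths) -> dict:
    """Map each existing relative path to its line count; skip missing files."""
    root = pathlib.Path(repo_root)
    return {
        rel: count_lines(root / rel)
        for rel in relative_paths
        if (root / rel).is_file()
    }
```

For example, `inventory(".", ["server.py", "api/routes.py", "static/ui.js"])` returns a dictionary of counts that can be compared against the figures in `ARCHITECTURE.md` and `README.md`.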

diff --git a/ARCHITECTURE.md b/ARCHITECTURE.md
index a6bbfc1d..a0f92b9f 100644
--- a/ARCHITECTURE.md
+++ b/ARCHITECTURE.md
@@ -7,10 +7,10 @@
 >
 > Keep this document updated as architecture changes are made.
 
-> Current shipped build: `v0.51.52` (May 12, 2026).
-> Automated coverage: 5272 tests via `pytest tests/ --collect-only -q`. CI runs on Python 3.11, 3.12, and 3.13 against every PR.
+> Current shipped build: `v0.51.54` (May 13, 2026).
+> Automated coverage: 5303 tests via `pytest tests/ --collect-only -q`. CI runs on Python 3.11, 3.12, and 3.13 against every PR.
 >
-> Notable architecture state as of v0.51.52: the bootstrap and first-run onboarding flow own setup discovery; the default WebUI state directory is `~/.hermes/webui`; `ctl.sh` provides a daemon wrapper for homelab installs; chat streaming is still WebUI-owned SSE with stream-ownership guards, cancellation, async manual compression, and turn-journal audit plumbing; provider/model discovery is profile-aware with live-model cache invalidation and custom-provider scoping.
+> Notable architecture state as of v0.51.54: the bootstrap and first-run onboarding flow own setup discovery; the default WebUI state directory is `~/.hermes/webui`; `ctl.sh` provides a daemon wrapper for homelab installs; chat streaming is still WebUI-owned SSE with stream-ownership guards, cancellation, async manual compression, and turn-journal audit plumbing; provider/model discovery is profile-aware with live-model cache invalidation and custom-provider scoping.
 
 ---
 
@@ -43,7 +43,7 @@ actions. The topbar remains focused on conversation context and the workspace/fi
 ## 2. File Inventory
 
     <repo>/
-    server.py              Thin routing shell + HTTP Handler + auth middleware. ~435 lines.
+    server.py              Thin routing shell + HTTP Handler + auth middleware. ~446 lines.
                            Delegates all route handling to api/routes.py.
     bootstrap.py           One-shot launcher: optional agent install, deps, health wait, browser open.
     start.sh               Thin wrapper around bootstrap.py for shell-based startup.
@@ -53,30 +53,30 @@ actions. The topbar remains focused on conversation context and the workspace/fi
     api/
       __init__.py          Package marker
       auth.py              Optional password authentication, signed cookies (~366 lines)
-      config.py            Discovery, globals, model detection, reloadable config (~4136 lines)
+      config.py            Discovery, globals, model detection, reloadable config (~4139 lines)
       helpers.py           HTTP helpers: j(), bad(), require(), safe_resolve(), security headers (~302 lines)
       models.py            Session model + CRUD, per-session profile tracking (~1927 lines)
       profiles.py          Profile state management, hermes_cli wrapper (~1056 lines)
       onboarding.py        First-run onboarding status, real provider config writes, OAuth linking, and readiness detection (~1002 lines)
-      routes.py            All GET + POST route handlers (~9630 lines)
-      startup.py           Startup helpers: auto_install_agent_deps() (~50 lines)
-      streaming.py         SSE engine, run_agent, cancel, HERMES_HOME save/restore (~4404 lines)
+      routes.py            All GET + POST route handlers (~9772 lines)
+      startup.py           Startup helpers: auto_install_agent_deps() (~128 lines)
+      streaming.py         SSE engine, run_agent, cancel, HERMES_HOME save/restore (~4420 lines)
       upload.py            Multipart parser, file upload handler (~284 lines)
       workspace.py         File ops: list_dir, read_file_content, workspace helpers (~810 lines)
     static/
       index.html           HTML template (~1323 lines)
       style.css            All CSS incl. mobile responsive (~3767 lines)
-      ui.js                DOM helpers, renderMd, tool cards, model dropdown, file tree (~7197 lines)
+      ui.js                DOM helpers, renderMd, tool cards, model dropdown, file tree (~7216 lines)
       workspace.js         File preview, file ops, loadDir, clearPreview (~369 lines)
-      sessions.js          Session CRUD, list rendering, search, SVG icons, dropdown actions (~3433 lines)
+      sessions.js          Session CRUD, list rendering, search, SVG icons, dropdown actions (~3517 lines)
       messages.js          send(), SSE event handlers, approval, transcript (~2301 lines)
       panels.js            Cron, skills, memory, workspace, profiles, todo, settings (~6480 lines)
       commands.js          Slash command registry, parser, autocomplete dropdown (~1302 lines)
       onboarding.js        First-run wizard overlay, provider setup flow, and settings/workspace orchestration.
       boot.js              Event wiring, mobile sidebar/workspace nav, voice input, boot IIFE (~1607 lines)
     tests/
-      conftest.py          Isolated test server/state fixtures (~630 lines)
-      483 test files       5272 tests collected via pytest
+      conftest.py          Isolated test server/state fixtures (~644 lines)
+      488 test files       5303 tests collected via pytest
       test_regressions.py  Permanent regression gate (~976 lines)
     CONTRIBUTING.md        Contributor workflow and PR expectations.
     ROADMAP.md             Feature and product roadmap document.
@@ -370,9 +370,9 @@ highlighting), Mermaid.js (diagrams), xterm.js, and KaTeX assets loaded with the
 current static template's integrity/CSP assumptions.
 
 Core JS modules loaded by the app include:
-  1. ui.js         (~7197 lines) DOM helpers, renderMd, tool card rendering, global state
+  1. ui.js         (~7216 lines) DOM helpers, renderMd, tool card rendering, global state
   2. workspace.js  (~369 lines) File tree, preview, file operations
-  3. sessions.js  (~3433 lines) Session CRUD, list rendering, search, SVG icons, dropdown actions, project picker
+  3. sessions.js  (~3517 lines) Session CRUD, list rendering, search, SVG icons, dropdown actions, project picker
   4. messages.js  (~2301 lines) send(), SSE event handlers, approval, transcript
   5. panels.js    (~6480 lines) Cron, skills, memory, workspace, profiles, todo, settings
   6. commands.js  (~1302 lines) Slash command registry, parser, autocomplete dropdown
@@ -684,23 +684,23 @@ Split server.py into a proper package. Completed across Sprints 4-10.
 Current structure:
 
     <repo>/
-      server.py               Entry point + HTTP Handler dispatch (~435 lines)
+      server.py               Entry point + HTTP Handler dispatch (~446 lines)
       api/
         __init__.py
-        routes.py             All GET + POST route handlers (~9630 lines)
-        config.py             Configuration, constants, global state, model discovery (~4136 lines)
+        routes.py             All GET + POST route handlers (~9772 lines)
+        config.py             Configuration, constants, global state, model discovery (~4139 lines)
         helpers.py            HTTP helpers: j(), bad(), require(), safe_resolve() (~302 lines)
         models.py             Session model + CRUD (~1927 lines)
         workspace.py          File ops, workspace management (~810 lines)
         upload.py             Multipart parser, file upload handler (~284 lines)
-        streaming.py          SSE engine, run_agent, cancel support (~4404 lines)
+        streaming.py          SSE engine, run_agent, cancel support (~4420 lines)
       static/
         index.html            HTML document (served from disk)
         style.css             All CSS (~3767 lines)
         ui.js, workspace.js, sessions.js, messages.js, panels.js, commands.js, boot.js
       tests/
         conftest.py           Isolated test server/state fixtures
-        483 test files        5272 tests collected
+        488 test files        5303 tests collected
         test_regressions.py   Permanent regression gate
 
 Route extraction to api/routes.py completed in Sprint 11. server.py remains a
@@ -800,7 +800,7 @@ Optional password gate for non-SSH-tunnel deployments.
 
 ### Phase I: Test Infrastructure -- COMPLETE
 
-5272 tests across 483 test files + regression gates. The pytest fixture derives
+5303 tests across 488 test files + regression gates. The pytest fixture derives
 an isolated port and state directory from the repo path unless
 `HERMES_WEBUI_TEST_PORT` / `HERMES_WEBUI_TEST_STATE_DIR` pin them explicitly.
 Production data never touched.
diff --git a/CHANGELOG.md b/CHANGELOG.md
index b0758704..9edf4833 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -50,7 +50,7 @@
 
 ### Documentation
 
-- Refreshed the README / TESTING / ARCHITECTURE current-state snapshots for `v0.51.52`: default model override semantics, test collection counts, file inventory line counts, default state/log paths, and the top-level docs index now match the current code. Also corrected the Docker init banner for `HERMES_WEBUI_STATE_DIR`.
+- Refreshed the README / TESTING / ARCHITECTURE current-state snapshots for `v0.51.54`: default model override semantics, test collection counts, file inventory line counts, default state/log paths, and the top-level docs index now match the current code. Also corrected the Docker init banner for `HERMES_WEBUI_STATE_DIR`.
 - Added a tracked root `AGENTS.md` entry point plus `docs/onboarding-agent-checklist.md` for assistant-led install/reinstall support, with safety rules for real Hermes state, isolated trial commands, non-secret evidence collection, onboarding pass/fail criteria, and a redacted support-report format. Linked the checklist from the README and first-run onboarding guide so assistants helping with setup see it before running commands.
 
 ## [v0.51.51] — 2026-05-12 — Release AA (stage-344 — 16-PR contributor batch — i18n + insights bucketing/mobile + manual-compress async + workspace recovery + iOS PWA scroll + Cloudflare login health + fr locale)
diff --git a/README.md b/README.md
index 63c33e8a..2539daff 100644
--- a/README.md
+++ b/README.md
@@ -370,7 +370,7 @@ Or using the agent venv explicitly:
 
 Tests run against an isolated server with a separate state directory.
 Production data and real cron jobs are never touched. Current snapshot:
-**5272 tests collected** across **483 test files**.
+**5303 tests collected** across **488 test files**.
 
 ---
 
@@ -492,33 +492,33 @@ Production data and real cron jobs are never touched. Current snapshot:
 ## Architecture
 
 ```
-server.py               HTTP routing shell + auth middleware (~435 lines)
+server.py               HTTP routing shell + auth middleware (~446 lines)
 api/
   auth.py               Optional password authentication, signed cookies (~366 lines)
-  config.py             Discovery, globals, model detection, reloadable config (~4136 lines)
+  config.py             Discovery, globals, model detection, reloadable config (~4139 lines)
   helpers.py            HTTP helpers, security headers (~302 lines)
   models.py             Session model + CRUD + CLI bridge (~1927 lines)
   onboarding.py         First-run onboarding wizard, OAuth provider support (~1002 lines)
   profiles.py           Profile state management, hermes_cli wrapper (~1056 lines)
-  routes.py             All GET + POST route handlers (~9630 lines)
+  routes.py             All GET + POST route handlers (~9772 lines)
   state_sync.py         /insights sync — message_count to state.db (~118 lines)
-  streaming.py          SSE engine, run_agent, cancel support (~4404 lines)
+  streaming.py          SSE engine, run_agent, cancel support (~4420 lines)
   updates.py            Self-update check and release notes (~545 lines)
   upload.py             Multipart parser, file upload handler (~284 lines)
   workspace.py          File ops, workspace helpers, git detection (~810 lines)
 static/
   index.html            HTML template (~1323 lines)
   style.css             All CSS incl. mobile responsive, themes (~3767 lines)
-  ui.js                 DOM helpers, renderMd, tool cards, context indicator (~7197 lines)
+  ui.js                 DOM helpers, renderMd, tool cards, context indicator (~7216 lines)
   workspace.js          File preview, file ops, git badge (~369 lines)
-  sessions.js           Session CRUD, collapsible groups, search, reload recovery (~3433 lines)
+  sessions.js           Session CRUD, collapsible groups, search, reload recovery (~3517 lines)
   messages.js           send(), SSE handlers, live streaming, session recovery (~2301 lines)
   panels.js             Cron, skills, memory, profiles, settings (~6480 lines)
   commands.js           Slash command autocomplete (~1302 lines)
   boot.js               Mobile nav, voice input, boot IIFE (~1607 lines)
 tests/
   conftest.py           Isolated test server/state fixtures
-  483 test files         5272 tests collected
+  488 test files         5303 tests collected
 Dockerfile              python:3.12-slim container image
 docker-compose.yml      Compose with named volume and optional auth
 .github/workflows/      CI: multi-arch Docker build + GitHub Release on tag
diff --git a/TESTING.md b/TESTING.md
index 324f2420..b472570a 100644
--- a/TESTING.md
+++ b/TESTING.md
@@ -8,7 +8,7 @@
 > Prerequisites: SSH tunnel is active on port 8787. Open http://localhost:8787 in browser.
 > Server health check: curl http://127.0.0.1:8787/health should return {"status":"ok"}.
 >
-> Automated coverage: 5272 tests collected via `pytest tests/ --collect-only -q`. Tests run on every PR via GitHub Actions on Python 3.11, 3.12, and 3.13. The suite covers the bootstrap/static wizard, real provider config persistence (`config.yaml` + `.env`), the `/api/onboarding/*` backend, the onboarding skip/existing-config guard, CSS regression coverage for thinking/tool card animation, streaming session persistence, mobile layout breakpoints, locale parity across 11 languages, and hundreds of issue/PR-pinned regression tests.
+> Automated coverage: 5303 tests collected via `pytest tests/ --collect-only -q`. Tests run on every PR via GitHub Actions on Python 3.11, 3.12, and 3.13. The suite covers the bootstrap/static wizard, real provider config persistence (`config.yaml` + `.env`), the `/api/onboarding/*` backend, the onboarding skip/existing-config guard, CSS regression coverage for thinking/tool card animation, streaming session persistence, mobile layout breakpoints, locale parity across 11 languages, and hundreds of issue/PR-pinned regression tests.
 > Run: `pytest tests/ -v --timeout=60`
 >
 > Local regression focus: verify that a previously closed workspace panel stays visually closed from first paint through boot completion on desktop refresh; there should be no brief open-then-close flash.
@@ -1836,8 +1836,8 @@ Bridged CLI sessions:
 
 ---
 
-*Last updated: v0.51.52, May 13, 2026*
-*Total automated tests collected: 5272*
+*Last updated: v0.51.54, May 13, 2026*
+*Total automated tests collected: 5303*
 *Regression gate: tests/test_regressions.py*
 *Run: pytest tests/ -v --timeout=60*
 *Source: <repo>/*