Merge branch 'master' into fix/csp-report-only

This commit is contained in:
ai-ag2026
2026-05-11 17:26:45 +02:00
committed by GitHub
371 changed files with 60340 additions and 4016 deletions
+1 -1
View File
@@ -15,7 +15,7 @@
# Port to listen on (default: 8787)
# HERMES_WEBUI_PORT=8787
# Where to store sessions, workspaces, and other state (default: ~/.hermes/webui-mvp)
# Where to store sessions, workspaces, and other state (default: ~/.hermes/webui)
# HERMES_WEBUI_STATE_DIR=~/.hermes/webui
# Default workspace directory shown on first launch
+8 -1
View File
@@ -24,7 +24,14 @@ jobs:
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install pyyaml>=6.0 pytest pytest-timeout
pip install pyyaml>=6.0 pytest pytest-timeout pytest-asyncio
# Install the `mcp` package so tests/test_mcp_server.py runs in CI.
# The package is an optional runtime dep of mcp_server.py — users
# who run the MCP integration install it themselves; CI installs
# it so test coverage exists. If mcp install fails (Python 3.13
# wheel not yet available, etc.), tests/test_mcp_server.py uses
# importorskip and the matrix stays green.
pip install mcp || echo "mcp install failed — test_mcp_server.py will importorskip"
- name: Run tests
run: pytest tests/ -v --timeout=60
+5
View File
@@ -40,8 +40,11 @@ Thumbs.db
docs/*
!docs/ui-ux/
!docs/ui-ux/**
!docs/rfcs/
!docs/rfcs/**
!docs/docker.md
!docs/supervisor.md
!docs/troubleshooting.md
# Local-only PR review harness: rendering drivers, sample bank, fixtures.
# Used by Claude during deep reviews; never shared in the repo.
@@ -49,3 +52,5 @@ docs/*
graphify-out/
.graphify_cached.json
.graphify_uncached.txt
.venv/
+2075
View File
File diff suppressed because it is too large Load Diff
+65 -33
View File
@@ -1,61 +1,93 @@
# Contributors
Hermes WebUI is a community project. **66 people** have shipped code that landed in a release tag, including the long tail of folks whose work was salvaged into batch releases. This file is the canonical credit roll. Numbers are merged-PR count plus release-batch credit (a contributor whose patch was extracted into a clean PR or merged via squash gets the same credit as a standalone PR).
Hermes WebUI is a community project. **130 people** have shipped code that landed in a release tag including the long tail of folks whose work was salvaged into batch releases or absorbed via Co-authored-by trailers. This file is the canonical credit roll.
**Total contributors tracked:** 66
**Total PRs landed:** 142
**Last refreshed:** v0.50.245, 2026-04-30
A contributor's PR count is the number of distinct PRs they get credit for: PRs they authored that merged directly, PRs they authored that were closed-but-absorbed into a release commit (batch merges, salvage rewrites), and PRs where they were explicitly attributed in `CHANGELOG.md`. All three count the same.
Generated from `git log` + `gh api repos/.../pulls?state=closed` + the `CHANGELOG.md` attribution lines. If your name is missing or wrong, open a PR against `CONTRIBUTORS.md` — we cross-check against the changelog on each release.
**Total contributors tracked:** 130
**Total PR credits:** 568
**Last refreshed:** v0.51.44, 2026-05-11
Generated from `git log` + the GitHub PR list (merged and closed) + the `CHANGELOG.md` attribution lines (`PR #N by @user`, `(credit: @user)`, `@user — PR #N`). If your name is missing or wrong, open a PR against `CONTRIBUTORS.md` — we cross-check against the changelog on each release.
---
## Top contributors (5+ merged PRs)
## Top contributors (5+ PRs landed)
| # | Contributor | PRs | First release | Latest release |
|---|---|---:|---|---|
| 1 | [@franksong2702](https://github.com/franksong2702) | 22 | `v0.50.49` 2026-04-15 | `v0.50.245` 2026-04-30 |
| 2 | [@bergeouss](https://github.com/bergeouss) | 18 | `v0.50.49` 2026-04-15 | `v0.50.240` 2026-04-30 |
| 3 | [@aronprins](https://github.com/aronprins) | 8 | `v0.47.0` 2026-04-11 | `v0.50.77` 2026-04-17 |
| 4 | [@iRonin](https://github.com/iRonin) | 6 | `v0.41.0` 2026-04-10 | `v0.41.0` 2026-04-10 |
| 5 | [@24601](https://github.com/24601) | 6 | `v0.50.201` 2026-04-28 | `v0.50.201` 2026-04-28 |
| 1 | [@franksong2702](https://github.com/franksong2702) | 92 | `v0.49.3` | `v0.51.44` |
| 2 | [@Michaelyklam](https://github.com/Michaelyklam) | 81 | `v0.50.240` | `v0.51.40` |
| 3 | [@bergeouss](https://github.com/bergeouss) | 61 | `v0.48.0` | `v0.51.18` |
| 4 | [@ai-ag2026](https://github.com/ai-ag2026) | 49 | `v0.50.279` | `v0.51.44` |
| 5 | [@dso2ng](https://github.com/dso2ng) | 21 | `v0.50.227` | `v0.51.37` |
| 6 | [@jasonjcwu](https://github.com/jasonjcwu) | 13 | `v0.50.227` | `v0.51.43` |
| 7 | [@aronprins](https://github.com/aronprins) | 10 | `v0.44.0` | `v0.50.233` |
| 8 | [@JKJameson](https://github.com/JKJameson) | 10 | `v0.50.233` | `v0.51.31` |
| 9 | [@ccqqlo](https://github.com/ccqqlo) | 9 | `v0.44.0` | `v0.50.270` |
| 10 | [@24601](https://github.com/24601) | 8 | `v0.50.233` | `v0.51.5` |
| 11 | [@starship-s](https://github.com/starship-s) | 8 | `v0.50.128` | `v0.51.8` |
| 12 | [@armorbreak001](https://github.com/armorbreak001) | 7 | `v0.50.47` | `v0.50.50` |
| 13 | [@NocGeek](https://github.com/NocGeek) | 7 | `v0.50.251` | `v0.50.252` |
| 14 | [@Hinotoi-agent](https://github.com/Hinotoi-agent) | 6 | `v0.50.12` | `v0.51.44` |
| 15 | [@iRonin](https://github.com/iRonin) | 6 | `v0.41.0` | `v0.41.0` |
| 16 | [@Jordan-SkyLF](https://github.com/Jordan-SkyLF) | 6 | `v0.50.18` | `v0.50.27` |
| 17 | [@Sanjays2402](https://github.com/Sanjays2402) | 6 | `v0.50.292` | `v0.51.31` |
| 18 | [@cloudyun888](https://github.com/cloudyun888) | 5 | `v0.50.47` | `v0.50.140` |
| 19 | [@fxd-jason](https://github.com/fxd-jason) | 5 | `v0.50.245` | `v0.50.249` |
| 20 | [@happy5318](https://github.com/happy5318) | 5 | `v0.50.238` | `v0.51.31` |
## Sustained contributors (34 merged PRs)
## Sustained contributors (34 PRs landed)
| Contributor | PRs | Highlights |
|---|---:|---|
| [@renheqiang](https://github.com/renheqiang) | 4 | feat: add full Russian (ru-RU) localization — v0.50.93 |
| [@KingBoyAndGirl](https://github.com/KingBoyAndGirl) | 4 | fix: trust custom provider base_url in SSRF validation; fix: fetch live models for custom provider from model.base_u |
| [@ccqqlo](https://github.com/ccqqlo) | 3 | `v0.50.83` batch credit |
| [@deboste](https://github.com/deboste) | 3 | fix(frontend): use URL origin for fetch/EventSource to suppo; fix(api): resolve model provider from config to prevent misr |
| [@frap129](https://github.com/frap129) | 3 | fix(docker): Install Open SSH client; fix(docker): Install all dependencies for agent |
| Contributor | PRs | First release | Latest release |
|---|---:|---|---|
| [@bsgdigital](https://github.com/bsgdigital) | 4 | `v0.50.228` | `v0.50.258` |
| [@fecolinhares](https://github.com/fecolinhares) | 4 | `v0.50.238` | `v0.50.250` |
| [@frap129](https://github.com/frap129) | 4 | `v0.50.140` | `v0.50.233` |
| [@KingBoyAndGirl](https://github.com/KingBoyAndGirl) | 4 | `v0.50.238` | `v0.50.240` |
| [@qxxaa](https://github.com/qxxaa) | 4 | `v0.50.233` | `v0.51.37` |
| [@renheqiang](https://github.com/renheqiang) | 4 | `v0.50.61` | `v0.50.95` |
| [@Thanatos-Z](https://github.com/Thanatos-Z) | 4 | `v0.50.257` | `v0.50.278` |
| [@AlexeyDsov](https://github.com/AlexeyDsov) | 3 | `v0.50.267` | `v0.50.278` |
| [@deboste](https://github.com/deboste) | 3 | `v0.50.269` | `v0.50.297` |
| [@dutchaiagency](https://github.com/dutchaiagency) | 3 | `v0.50.281` | `v0.50.286` |
| [@pavolbiely](https://github.com/pavolbiely) | 3 | `v0.50.159` | `v0.50.233` |
## Two-PR contributors
## Two-PR contributors (14)
[@dso2ng](https://github.com/dso2ng), [@Michaelyklam](https://github.com/Michaelyklam), [@mmartial](https://github.com/mmartial), [@renatomott](https://github.com/renatomott), [@zichen0116](https://github.com/zichen0116), [@pavolbiely](https://github.com/pavolbiely), [@bsgdigital](https://github.com/bsgdigital), [@vansour](https://github.com/vansour), [@fecolinhares](https://github.com/fecolinhares).
[@ChaseFlorell](https://github.com/ChaseFlorell), [@dobby-d-elf](https://github.com/dobby-d-elf), [@gabogabucho](https://github.com/gabogabucho), [@hacker1e7](https://github.com/hacker1e7), [@lost9999](https://github.com/lost9999), [@mmartial](https://github.com/mmartial), [@nickgiulioni1](https://github.com/nickgiulioni1), [@renatomott](https://github.com/renatomott), [@ruxme](https://github.com/ruxme), [@Saik0s](https://github.com/Saik0s), [@shruggr](https://github.com/shruggr), [@TaraTheStar](https://github.com/TaraTheStar), [@vansour](https://github.com/vansour), [@zichen0116](https://github.com/zichen0116).
## Single-PR contributors
## Single-PR contributors (85)
Each of these folks landed exactly one merged change — bug fixes, locale work, doc improvements, infrastructure tweaks. Every one of them moved the project forward.
Each of these folks landed exactly one PR that shipped a bug fix, a locale, a security hardening, a doc improvement, an infrastructure tweak. Every one moved the project forward.
[@Argonaut790](https://github.com/Argonaut790), [@betamod](https://github.com/betamod), [@bschmidy10](https://github.com/bschmidy10), [@carlytwozero](https://github.com/carlytwozero), [@cloudyun888](https://github.com/cloudyun888), [@davidsben](https://github.com/davidsben), [@DavidSchuchert](https://github.com/DavidSchuchert), [@DrMaks22](https://github.com/DrMaks22), [@eba8](https://github.com/eba8), [@fxd-jason](https://github.com/fxd-jason), [@gabogabucho](https://github.com/gabogabucho), [@GiggleSamurai](https://github.com/GiggleSamurai), [@hacker2005](https://github.com/hacker2005), [@halmisen](https://github.com/halmisen), [@happy5318](https://github.com/happy5318), [@hi-friday](https://github.com/hi-friday), [@Hinotoi-agent](https://github.com/Hinotoi-agent), [@huangzt](https://github.com/huangzt), [@jeffscottward](https://github.com/jeffscottward), [@JKJameson](https://github.com/JKJameson), [@KayZz69](https://github.com/KayZz69), [@kcclaw001](https://github.com/kcclaw001), [@kevin-ho](https://github.com/kevin-ho), [@mangodxd](https://github.com/mangodxd), [@mariosam95](https://github.com/mariosam95), [@MatzAgent](https://github.com/MatzAgent), [@mbac](https://github.com/mbac), [@migueltavares](https://github.com/migueltavares), [@nickgiulioni1](https://github.com/nickgiulioni1), [@octo-patch](https://github.com/octo-patch), [@qxxaa](https://github.com/qxxaa), [@ruxme](https://github.com/ruxme), [@SaulgoodMan-C](https://github.com/SaulgoodMan-C), [@smurmann](https://github.com/smurmann), [@Stampede](https://github.com/Stampede), [@starship-s](https://github.com/starship-s), [@suinia](https://github.com/suinia), [@TaraTheStar](https://github.com/TaraTheStar), [@tgaalman](https://github.com/tgaalman), [@thadreber-web](https://github.com/thadreber-web), [@the-own-lab](https://github.com/the-own-lab), [@vcavichini](https://github.com/vcavichini), [@vCillusion](https://github.com/vCillusion), [@woaijiadanoo](https://github.com/woaijiadanoo), [@xingyue52077](https://github.com/xingyue52077), [@yunyunyunyun-yun](https://github.com/yunyunyunyun-yun), [@yzp12138](https://github.com/yzp12138).
[@29n](https://github.com/29n), [@amlyczz](https://github.com/amlyczz), [@andrewy-wizard](https://github.com/andrewy-wizard), [@Argonaut790](https://github.com/Argonaut790), [@Asunfly](https://github.com/Asunfly), [@betamod](https://github.com/betamod), [@Bobby9228](https://github.com/Bobby9228), [@bschmidy10](https://github.com/bschmidy10), [@carlytwozero](https://github.com/carlytwozero), [@davidsben](https://github.com/davidsben), [@DavidSchuchert](https://github.com/DavidSchuchert), [@DelightRun](https://github.com/DelightRun), [@DrMaks22](https://github.com/DrMaks22), [@eba8](https://github.com/eba8), [@eov128](https://github.com/eov128), [@galvani](https://github.com/galvani), [@GeoffBao](https://github.com/GeoffBao), [@georgebdavis](https://github.com/georgebdavis), [@GiggleSamurai](https://github.com/GiggleSamurai), [@hacker2005](https://github.com/hacker2005), [@halmisen](https://github.com/halmisen), [@hermes-gimmethebeans](https://github.com/hermes-gimmethebeans), [@hi-friday](https://github.com/hi-friday), [@hualong1009](https://github.com/hualong1009), [@huangzt](https://github.com/huangzt), [@indigokarasu](https://github.com/indigokarasu), [@insecurejezza](https://github.com/insecurejezza), [@jeffscottward](https://github.com/jeffscottward), [@Jellypowered](https://github.com/Jellypowered), [@jimdawdy-hub](https://github.com/jimdawdy-hub), [@JinYue-GitHub](https://github.com/JinYue-GitHub), [@joaompfp](https://github.com/joaompfp), [@jundev0001](https://github.com/jundev0001), [@KayZz69](https://github.com/KayZz69), [@kcclaw001](https://github.com/kcclaw001), [@kevin-ho](https://github.com/kevin-ho), [@koshikai](https://github.com/koshikai), [@kowenhaoai](https://github.com/kowenhaoai), [@lawrencel1ng](https://github.com/lawrencel1ng), [@likawa3b](https://github.com/likawa3b), [@lucky-yonug](https://github.com/lucky-yonug), [@lx3133584](https://github.com/lx3133584), [@MacLeodMike](https://github.com/MacLeodMike), [@mangodxd](https://github.com/mangodxd), [@mariosam95](https://github.com/mariosam95), [@MatzAgent](https://github.com/MatzAgent), [@mbac](https://github.com/mbac), [@michael-dg](https://github.com/michael-dg), [@migueltavares](https://github.com/migueltavares), [@mittyok](https://github.com/mittyok), [@ng-technology-llc](https://github.com/ng-technology-llc), [@octo-patch](https://github.com/octo-patch), [@rhelmer](https://github.com/rhelmer), [@rickchew](https://github.com/rickchew), [@ryan-remeo](https://github.com/ryan-remeo), [@ryansombraio](https://github.com/ryansombraio), [@s905060](https://github.com/s905060), [@samuelgudi](https://github.com/samuelgudi), [@SaulgoodMan-C](https://github.com/SaulgoodMan-C), [@sbe27](https://github.com/sbe27), [@shaoxianbilly](https://github.com/shaoxianbilly), [@sheng-di](https://github.com/sheng-di), [@sixianli](https://github.com/sixianli), [@skspade](https://github.com/skspade), [@smurmann](https://github.com/smurmann), [@snuffxxx](https://github.com/snuffxxx), [@spektro33](https://github.com/spektro33), [@Stampede](https://github.com/Stampede), [@suinia](https://github.com/suinia), [@sunnysktsang](https://github.com/sunnysktsang), [@tgaalman](https://github.com/tgaalman), [@thadreber-web](https://github.com/thadreber-web), [@the-own-lab](https://github.com/the-own-lab), [@tomaioo](https://github.com/tomaioo), [@trucuit](https://github.com/trucuit), [@vcavichini](https://github.com/vcavichini), [@vCillusion](https://github.com/vCillusion), [@vikarag](https://github.com/vikarag), [@wali-reheman](https://github.com/wali-reheman), [@watzon](https://github.com/watzon), [@woaijiadanoo](https://github.com/woaijiadanoo), [@xingyue52077](https://github.com/xingyue52077), [@yunyunyunyun-yun](https://github.com/yunyunyunyun-yun), [@yzp12138](https://github.com/yzp12138), [@zenc-cp](https://github.com/zenc-cp).
---
## How credit is tracked
Most PRs in this repo land via one of three paths:
Most PRs in this repo land via one of four paths:
1. **Direct merge** — your PR is reviewed and merged on its own. Author shows up directly in `git log`.
2. **Squash into a batch release** — your PR is merged together with several other contributor PRs into a single release commit (e.g. `release: v0.50.24510-PR batch`). The squashed commit carries a `Co-authored-by: <you>` trailer plus an entry in `CHANGELOG.md` crediting you by username and PR number.
3. **Salvaged from a larger PR** — when a PR mixes one good change with several unrelated or risky ones, we sometimes split it: the good parts ship in a clean follow-up PR, you get credit in the CHANGELOG entry, and the original PR is closed with a salvage map showing what went where.
1. **Direct merge** — your PR is reviewed and merged on its own. Author shows up directly in `git log` and on the PR's `merged_at` timestamp.
2. **Squash into a batch release** — your PR is merged together with several other contributor PRs into a single release commit (e.g. `release: v0.51.445-PR contributor batch`). The original PR closes (not merges) on GitHub but the squashed release commit carries a `Co-authored-by: <you>` trailer plus an entry in `CHANGELOG.md` crediting you by username and PR number.
3. **Salvaged from a larger PR** — when a PR mixes one good change with several unrelated or risky ones, we split it: the good parts ship in a clean follow-up PR, you get credit in the CHANGELOG entry, and the original PR is closed with a salvage map showing what went where.
4. **Auto-rebase + auto-fix** — for merge-ready contributor PRs with mechanical blockers (CHANGELOG conflicts, lint, drifted tests), a maintainer rebases the contributor's branch, fixes the blockers, and force-pushes back. The `Co-authored-by` trailer preserves your authorship.
All three paths count as a contribution. The number next to your name above is the total of merged PRs (path 1) plus PRs where you got attribution credit in CHANGELOG.md (paths 2 and 3).
All four paths count as a contribution. GitHub's `merged_at` field only catches path 1; paths 2-4 show as "closed" on the contributor's PR even though the work is live in master. That's why this file consults the CHANGELOG attribution lines, not just GitHub's merged-PR list.
## Special thanks
- **[@aronprins](https://github.com/aronprins)** — `v0.50.0` UI overhaul (PR #242). The CSS-only redesign that defined the design tokens, theme architecture, and three-panel layout that the rest of the app builds on. The PR didn't merge as-is — it was reshaped through `v0.50.0` but it is the design language of the app.
- **[@franksong2702](https://github.com/franksong2702)** — most prolific external contributor. Mobile/responsive layout, session sidebar polish, cron output preservation, streaming-session sidebar exemption, and a long tail of profile/workspace fixes.
- **[@bergeouss](https://github.com/bergeouss)** — provider-management UI, OAuth status, two-container Docker docs, profile isolation hardening. Most of what users see when they touch Settings → Providers is bergeouss's work.
- **[@aronprins](https://github.com/aronprins)** — `v0.50.0` UI overhaul (PR #242). The CSS-only redesign that defined the design tokens, theme architecture, and three-panel layout that the rest of the app builds on. PR #242 didn't merge as-is, but it is the design language of the app.
- **[@franksong2702](https://github.com/franksong2702)** — most prolific external contributor across the project's history. 92 PRs spanning the session sidebar, mobile/responsive layout, workspace state machine, profile context, slash autocomplete, breadcrumb navigation, streaming-session exemption, cron output preservation, embedded terminal, and a long tail of polish.
- **[@Michaelyklam](https://github.com/Michaelyklam)** — most prolific contributor of late-2025/early-2026. 81 PRs covering Docker hardening, profile-scoped skills, KaTeX delimiter parsing, Codex quota surfacing, Goal command, Kanban polish, auto-compression toast lifetime, and the localization parity backfills.
- **[@bergeouss](https://github.com/bergeouss)** — provider-management UI, OAuth status, two-container Docker docs, profile isolation hardening, Reveal-in-Finder, the OpenRouter free-tier live fetch, and most of Settings → Providers. 61 PRs.
- **[@ai-ag2026](https://github.com/ai-ag2026)** — autonomous-AI contributor (Hermes Agent-driven). 49 PRs focused on session recovery (state.db sidecar reconciliation, orphan `.bak` recovery, audit + safe-repair endpoints), workspace/run lifecycle health, and the crash-safe turn-journal RFC.
- **[@iRonin](https://github.com/iRonin)** — security hardening sprint (PRs #196#204): session memory leak fix, CSP + Permissions-Policy headers, slow-client connection timeout, optional HTTPS/TLS, upstream branch tracking, CLI session file-browser support. Six consecutive, focused, high-quality security PRs.
- **[@indigokarasu](https://github.com/indigokarasu)** — visual redesign proposal (PR #213). Icon rail sidebar, design token system, 7 themes. Didn't merge as-is but shaped the design language that landed in v0.50.0.
- **[@zenc-cp](https://github.com/zenc-cp)** — anti-hallucination guard for the ReAct loop (PR #133). Three-layer approach (ephemeral prompt, live token filtering, session-history cleanup) that the streaming pipeline still uses.
- **[@Jordan-SkyLF](https://github.com/Jordan-SkyLF)** — live streaming, session recovery, workspace fallback (PRs #366, #367, #394#397). Six interlocking improvements that landed across v0.50.18v0.50.27.
- **[@deboste](https://github.com/deboste)** — reverse-proxy auth, mobile responsive layout, model routing (PRs #3, #4, #5). Three of the very first community PRs. Early foundation work.
- **[@Hinotoi-agent](https://github.com/Hinotoi-agent)** — security fixes spanning profile `.env` isolation (PR #351), session-import workspace validation (PR #2048), and bandit B105 hardening. Subtle, high-leverage credential and path-traversal fixes.
If you've contributed and aren't here, **open a PR**. We cross-check the CHANGELOG, but if a credit fell through (a Co-authored-by trailer that didn't make it into the changelog entry, an attribution in a comment that should be on the PR), this list is the right place to fix it.
If you've contributed and aren't here, **open a PR**. We cross-check the CHANGELOG on every release, but if a credit fell through (a Co-authored-by trailer that didn't make it into the changelog entry, an attribution in a PR comment that should be in the release notes), this list is the right place to fix it.
+1 -1
View File
@@ -140,7 +140,7 @@ Use almost no shadows in the transcript. Shadows are reserved for popovers, drop
### Tool/thinking activity group
Collapsed by default in settled history and during live runs. Summary line uses one disclosure for internals, e.g. `Activity: thinking + 4 tools · read_file, patch, terminal`. Expanding reveals thinking and individual tool cards together. Thinking and tools should not create separate transcript rows unless there is an error or approval state that needs attention.
Collapsed by default in settled history and during live runs unless the user has explicitly opened that Activity row before. Persist open/closed disclosure state per chat and per turn, so switching away from a chat and coming back preserves the mode the user left it in. Summary line uses one disclosure for internals and stays intentionally terse, e.g. `Activity: 4 tools`. It should not repeat the always-present thinking area, list individual tool names, or add a second trailing count badge. Expanding reveals thinking and individual tool cards together. Thinking and tools should not create separate transcript rows unless there is an error or approval state that needs attention.
### Tool card
+12 -22
View File
@@ -21,10 +21,11 @@ RUN apt-get update -y --fix-missing --no-install-recommends \
apt-utils \
locales \
ca-certificates \
sudo \
curl \
rsync \
openssh-client \
git \
xz-utils \
&& apt-get upgrade -y \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/*
@@ -41,24 +42,12 @@ ENV PYTHONDONTWRITEBYTECODE=1 \
WORKDIR /apptoo
# Every sudo group user does not need a password
RUN echo '%sudo ALL=(ALL) NOPASSWD:ALL' >> /etc/sudoers
# Create a new group for the hermeswebui and hermeswebuitoo users
RUN groupadd -g 1024 hermeswebui \
&& groupadd -g 1025 hermeswebuitoo
# The hermeswebui (resp. hermeswebuitoo) user will have UID 1024 (resp. 1025),
# be part of the hermeswebui (resp. hermeswebuitoo) and users groups and be sudo capable (passwordless)
RUN useradd -u 1024 -d /home/hermeswebui -g hermeswebui -s /bin/bash -m hermeswebui \
&& usermod -G users hermeswebui \
&& adduser hermeswebui sudo
RUN useradd -u 1025 -d /home/hermeswebuitoo -g hermeswebuitoo -s /bin/bash -m hermeswebuitoo \
&& usermod -G users hermeswebuitoo \
&& adduser hermeswebuitoo sudo
RUN chown -R hermeswebuitoo:hermeswebuitoo /apptoo
USER root
# Create the unprivileged runtime user. The entrypoint starts as root only for
# UID/GID alignment and filesystem preparation, then execs the server as this user.
RUN groupadd -g 1024 hermeswebui \
&& useradd -u 1024 -d /home/hermeswebui -g hermeswebui -G users -s /bin/bash -m hermeswebui \
&& mkdir -p /app /uv_cache \
&& chown -R hermeswebui:hermeswebui /home/hermeswebui /app /uv_cache
COPY --chmod=555 docker_init.bash /hermeswebui_init.bash
@@ -75,9 +64,7 @@ USER root
# The init script will skip the download when uv is already on PATH.
RUN curl -LsSf https://astral.sh/uv/install.sh | env UV_INSTALL_DIR=/usr/local/bin sh
USER hermeswebuitoo
COPY --chown=hermeswebuitoo:hermeswebuitoo . /apptoo
COPY --chown=root:root . /apptoo
# Bake the git version tag into the image so the settings badge works even
# when .git is not present (it is excluded by .dockerignore).
@@ -95,5 +82,8 @@ EXPOSE 8787
HEALTHCHECK --interval=30s --timeout=5s --start-period=10s --retries=3 \
CMD curl -f http://localhost:8787/health || exit 1
# docker_init.bash performs root-only bind-mount setup, then drops to hermeswebui
# before starting the WebUI server. The production image does not ship sudo.
USER root
CMD ["/hermeswebui_init.bash"]
+107 -110
View File
@@ -109,6 +109,18 @@ Or keep using the shell launcher:
./start.sh
```
For self-hosted VM or homelab installs, `ctl.sh` wraps the common daemon lifecycle commands without requiring `fuser` or `pkill`:
```bash
./ctl.sh start # background daemon, PID at ~/.hermes/webui.pid
./ctl.sh status # PID, uptime, bound host/port, log path, /health
./ctl.sh logs --lines 100 # tail ~/.hermes/webui.log
./ctl.sh restart
./ctl.sh stop
```
`ctl.sh start` runs the bootstrap in foreground/no-browser mode behind the daemon wrapper, writes logs to `~/.hermes/webui.log`, and respects `.env` plus inline overrides such as `HERMES_WEBUI_HOST=0.0.0.0 ./ctl.sh start`.
The bootstrap will:
1. Detect Hermes Agent and, if missing, attempt the official installer (`curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash`).
@@ -118,8 +130,11 @@ The bootstrap will:
5. Drop you into a first-run onboarding wizard inside the WebUI.
> Native Windows is not supported for this bootstrap yet. Use Linux, macOS, or WSL2.
> For Windows / WSL auto-start at login, see [`docs/wsl-autostart.md`](docs/wsl-autostart.md).
> A community-maintained native Windows guide is tracked in [#1952](https://github.com/nesquena/hermes-webui/issues/1952).
If provider setup is still incomplete after install, the onboarding wizard will point you to finish it with `hermes model` instead of trying to replicate the full CLI setup in-browser.
For a step-by-step walkthrough of the wizard, provider choices, local model server Base URLs, and safe re-runs, see [`docs/onboarding.md`](docs/onboarding.md).
---
@@ -218,7 +233,7 @@ For the deep dive on each of these, see [`docs/docker.md`](docs/docker.md).
|---|---|
| Hermes agent dir | `HERMES_WEBUI_AGENT_DIR` env, then `~/.hermes/hermes-agent`, then sibling `../hermes-agent` |
| Python executable | Agent venv first, then `.venv` in this repo, then system `python3` |
| State directory | `HERMES_WEBUI_STATE_DIR` env, then `~/.hermes/webui-mvp` |
| State directory | `HERMES_WEBUI_STATE_DIR` env, then `~/.hermes/webui` |
| Default workspace | `HERMES_WEBUI_DEFAULT_WORKSPACE` env, then `~/workspace`, then state dir |
| Port | `HERMES_WEBUI_PORT` env or first argument, default `8787` |
@@ -248,9 +263,9 @@ Full list of environment variables:
|---|---|---|
| `HERMES_WEBUI_AGENT_DIR` | auto-discovered | Path to the hermes-agent checkout |
| `HERMES_WEBUI_PYTHON` | auto-discovered | Python executable |
| `HERMES_WEBUI_HOST` | `127.0.0.1` | Bind address |
| `HERMES_WEBUI_HOST` | `127.0.0.1` | Bind address (`0.0.0.0` for all IPv4, `::` for all IPv6, `::1` for IPv6 loopback) |
| `HERMES_WEBUI_PORT` | `8787` | Port |
| `HERMES_WEBUI_STATE_DIR` | `~/.hermes/webui-mvp` | Where sessions and state are stored |
| `HERMES_WEBUI_STATE_DIR` | `~/.hermes/webui` | Where sessions and state are stored |
| `HERMES_WEBUI_DEFAULT_WORKSPACE` | `~/workspace` | Default workspace |
| `HERMES_WEBUI_DEFAULT_MODEL` | `openai/gpt-5.4-mini` | Default model |
| `HERMES_WEBUI_PASSWORD` | *(unset)* | Set to enable password authentication |
@@ -362,7 +377,7 @@ across 100+ test files.
### Chat and agent
- Streaming responses via SSE (tokens appear as they are generated)
- Multi-provider model support -- any Hermes API provider (OpenAI, Anthropic, Google, DeepSeek, Nous Portal, OpenRouter, MiniMax, Z.AI); dynamic model dropdown populated from configured keys
- Multi-provider model support -- any Hermes API provider (OpenAI, Anthropic, Google, DeepSeek, Nous Portal, OpenRouter, MiniMax, Xiaomi MiMo, Z.AI); dynamic model dropdown populated from configured keys
- Send a message while one is processing -- it queues automatically
- Edit any past user message inline and regenerate from that point
- Retry the last assistant response with one click
@@ -508,7 +523,7 @@ docker-compose.yml Compose with named volume and optional auth
.github/workflows/ CI: multi-arch Docker build + GitHub Release on tag
```
State lives outside the repo at `~/.hermes/webui-mvp/` by default
State lives outside the repo at `~/.hermes/webui/` by default
(sessions, workspaces, settings, projects, last_workspace). Override with `HERMES_WEBUI_STATE_DIR`.
---
@@ -522,138 +537,120 @@ State lives outside the repo at `~/.hermes/webui-mvp/` by default
- `CHANGELOG.md` -- release notes per sprint
- `SPRINTS.md` -- forward sprint plan with CLI + Claude parity targets
- `THEMES.md` -- theme system documentation, custom theme guide
- `docs/onboarding.md` -- first-run wizard, provider setup, local model server Base URLs, and safe re-runs
- `docs/troubleshooting.md` -- diagnostic flows for common failures (e.g. "AIAgent not available")
## Contributors
Hermes WebUI is built with help from the open-source community. Every PR — whether merged directly or incorporated via batch release — shapes the project, and we're grateful to everyone who has taken the time to contribute.
Hermes WebUI is built with help from the open-source community. Every PR — whether merged directly, absorbed into a batch release, or salvaged from a larger proposal — shapes the project, and we're grateful to everyone who has taken the time to contribute.
**66 contributors have shipped code that landed in a release tag** as of v0.50.245. The full credit roll lives in [`CONTRIBUTORS.md`](CONTRIBUTORS.md). The highlights:
**130 contributors have shipped code that landed in a release tag** as of v0.51.44. The full credit roll lives in [`CONTRIBUTORS.md`](CONTRIBUTORS.md). The highlights:
### Top contributors (by merged-PR count)
### Top contributors (by PR count, including absorbed/batch-released work)
| # | Contributor | PRs | First → latest release |
|---|---|---:|---|
| 1 | [@franksong2702](https://github.com/franksong2702) | 22 | `v0.50.49``v0.50.245` |
| 2 | [@bergeouss](https://github.com/bergeouss) | 18 | `v0.50.49``v0.50.240` |
| 3 | [@aronprins](https://github.com/aronprins) | 8 | `v0.47.0``v0.50.77` |
| 4 | [@iRonin](https://github.com/iRonin) | 6 | `v0.41.0` |
| 5 | [@24601](https://github.com/24601) | 6 | `v0.50.201` |
| 6 | [@KingBoyAndGirl](https://github.com/KingBoyAndGirl) | 4 | `v0.50.232``v0.50.237` |
| 7 | [@renheqiang](https://github.com/renheqiang) | 4 | `v0.50.93` |
| 8 | [@ccqqlo](https://github.com/ccqqlo) | 3 | `v0.50.83``v0.50.207` |
| 9 | [@deboste](https://github.com/deboste) | 3 | `v0.16.1` |
| 10 | [@frap129](https://github.com/frap129) | 3 | `v0.50.157``v0.50.166` |
| 1 | [@franksong2702](https://github.com/franksong2702) | 92 | `v0.49.3``v0.51.44` |
| 2 | [@Michaelyklam](https://github.com/Michaelyklam) | 81 | `v0.50.240``v0.51.40` |
| 3 | [@bergeouss](https://github.com/bergeouss) | 61 | `v0.48.0``v0.51.18` |
| 4 | [@ai-ag2026](https://github.com/ai-ag2026) | 49 | `v0.50.279``v0.51.44` |
| 5 | [@dso2ng](https://github.com/dso2ng) | 21 | `v0.50.227``v0.51.37` |
| 6 | [@jasonjcwu](https://github.com/jasonjcwu) | 13 | `v0.50.227``v0.51.43` |
| 7 | [@aronprins](https://github.com/aronprins) | 10 | `v0.44.0` `v0.50.233` |
| 8 | [@JKJameson](https://github.com/JKJameson) | 10 | `v0.50.233``v0.51.31` |
| 9 | [@ccqqlo](https://github.com/ccqqlo) | 9 | `v0.44.0``v0.50.270` |
| 10 | [@24601](https://github.com/24601) | 8 | `v0.50.233``v0.51.5` |
See [`CONTRIBUTORS.md`](CONTRIBUTORS.md) for the full ranked list of all 66 contributors, including everyone with one or two merged PRs and the special-thanks roll for design and architectural contributions.
See [`CONTRIBUTORS.md`](CONTRIBUTORS.md) for the full ranked list of all 130 contributors, including everyone with one or two PRs and the special-thanks roll for design and architectural contributions.
### Notable contributions
**[@aronprins](https://github.com/aronprins)** — v0.50.0 UI overhaul (PR #242)
The biggest single contribution to the project: a complete UI redesign that moved model/profile/workspace controls into the composer footer, replaced the gear-icon settings panel with the Hermes Control Center (tabbed modal), removed the activity bar in favor of inline composer status, redesigned the session list with a `⋯` action dropdown, and added the workspace panel state machine. 26 commits, thoroughly designed and iterated through multiple review rounds.
**[@franksong2702](https://github.com/franksong2702)** — Most prolific external contributor (92 PRs, `v0.49.3``v0.51.44`)
Across the longest tenure of any external contributor: the session title guard (#301), breadcrumb workspace navigation (#302), embedded workspace terminal (#1099), worktree-backed session creation (#2053), onboarding documentation (#2052), composer footer container queries, streaming-session sidebar exemption (#1327), session sidecar repair, cron output preservation (#1295), profile default workspace persistence, and a long tail of polish across mobile/responsive, the session sidebar, and the workspace state machine.
**[@Michaelyklam](https://github.com/Michaelyklam)** — Most prolific contributor of recent releases (81 PRs, `v0.50.240``v0.51.40`)
Production Docker hardening (#1921, drops sudo-capable staging user), profile-scoped skills endpoints (#1903), gateway PID resolution under profile-scoped HERMES_HOME (#1901), profile-aware AIAgent cache (#1898/#1904), backslash LaTeX delimiters (#1848), Codex quota error surfacing (#1770), shell-route HTML 503 (#1836), stale Kanban client recovery (#1828), context auto-compression toast lifetime (#1988), `/goal` command (#1866), Kanban detail-view scrolling (#1916), CLI session tool metadata preservation (#1778), Traditional Chinese kanban locale backfill (#1979).
**[@bergeouss](https://github.com/bergeouss)** — Provider management UI + Docker hardening (61 PRs, `v0.48.0``v0.51.18`)
Provider management UI for adding/editing custom providers from Settings, OAuth provider status detection (#1552), two-container Docker setup, profile isolation hardening (per-profile `.env` secrets), the bulk of what users see when they touch Settings → Providers, Reveal-in-Finder context menu (#1551), gateway status card (#1552), auto-assign session to active project filter (#1550), "What's new?" link in update banner (#1549), OpenRouter free-tier live fetch (#1548), credential pool 401 self-heal (#1553), inline provider chip + group model count in model picker (#1644).
**[@ai-ag2026](https://github.com/ai-ag2026)** — Session recovery + audit infrastructure (49 PRs, `v0.50.279``v0.51.44`)
Autonomous-AI contributor (Hermes Agent-driven) focused on durability: `state.db`-backed sidecar reconciliation (#2041), orphan `.json.bak` recovery on startup (#2035), read-only session recovery audit endpoints (#2036, #2040), active run lifecycle in `/health` (#2039), crash-safe turn-journal RFC at `docs/rfcs/turn-journal.md` (#2042), fork-session compression lineage isolation (#2014).
**[@dso2ng](https://github.com/dso2ng)** — Session lineage + diagnostics (21 PRs, `v0.50.227``v0.51.37`)
`/api/session/lineage-report/<sid>` endpoint for bounded session graph diagnostics (#2012), stale Mermaid render error cleanup (#1337), and a long tail of frontend reliability fixes around session loading.
**[@jasonjcwu](https://github.com/jasonjcwu)** — Composer + transcript polish (13 PRs, `v0.50.227``v0.51.43`)
Sidebar collapse via active-rail click (#2054, fuses #1884 + #1924), composer chip lightbox (#1758), title fixes for tool-heavy first turns, and a string of frontend polish fixes.
**[@aronprins](https://github.com/aronprins)** — `v0.50.0` UI overhaul (PR #242, plus 9 follow-ups)
The biggest single contribution to the project: a complete UI redesign that moved model/profile/workspace controls into the composer footer, replaced the gear-icon settings panel with the Hermes Control Center (tabbed modal), removed the activity bar in favor of inline composer status, redesigned the session list with a `⋯` action dropdown, and added the workspace panel state machine. Plus chat transcript redesign (#587), sidebar declutter (#584), three-column layout refactor (#899), light/dark theme + accent skins (#627), and shared `confirm()`/`prompt()` dialog replacement (PR #251 extracted from #242).
**[@iRonin](https://github.com/iRonin)** — Security hardening sprint (PRs #196#204)
Six consecutive security and reliability PRs: session memory leak fix (expired token pruning), Content-Security-Policy + Permissions-Policy headers, 30-second slow-client connection timeout, optional HTTPS/TLS support via environment variables, upstream branch tracking fix for self-update, and CLI session support in the file browser API. This is the kind of focused, high-quality security work that makes a self-hosted tool trustworthy.
Six consecutive, focused security PRs: session memory leak fix (expired token pruning), CSP + Permissions-Policy headers, 30-second slow-client connection timeout, optional HTTPS/TLS support via environment variables, upstream branch tracking fix for self-update, and CLI session support in the file-browser API. The kind of focused, high-quality security work that makes a self-hosted tool trustworthy.
**[@Jordan-SkyLF](https://github.com/Jordan-SkyLF)** — Live streaming + session recovery (PRs #366, #367, #394#397)
Six interlocking improvements: workspace fallback resolution, live reasoning cards that upgrade the generic thinking spinner to a real-time reasoning display, durable session state recovery via `localStorage` so in-flight tool cards survive a page reload, plus relative time labels and imported-session timestamp preservation.
**[@JKJameson](https://github.com/JKJameson)** — Composer + session polish (10 PRs)
Persistent composer draft per session (#1956), and a long tail of polish across the composer and session sidebar.
**[@gabogabucho](https://github.com/gabogabucho)** — Spanish locale + onboarding wizard
Full Spanish (`es`) locale covering all UI strings, plus the one-shot bootstrap onboarding wizard that guides new users through provider setup on first launch.
**[@deboste](https://github.com/deboste)** — Reverse-proxy auth + mobile responsive layout (PRs #3, #4, #5)
Three of the very first community PRs: fixed EventSource/fetch to use URL origin for reverse-proxy setups, corrected model provider routing from config, and added mobile responsive layout with dvh viewport fix. Early foundation work.
**[@indigokarasu](https://github.com/indigokarasu)** — Visual redesign proposal (PR #213)
A CSS-only redesign of the full UI — proper design tokens, an icon rail sidebar replacing the emoji tab strip, consistent form cards, breadcrumb nav, and 7 built-in themes as custom properties. The PR didn't merge as-is but shaped the design language and theme architecture that shipped in v0.50.0.
**[@zenc-cp](https://github.com/zenc-cp)** — Anti-hallucination guard for the ReAct loop (PR #133)
A three-layer approach (ephemeral anti-hallucination prompt, live token filtering, session-history cleanup) that the streaming pipeline still uses.
**[@Hinotoi-agent](https://github.com/Hinotoi-agent)** — Profile + session security (PRs #351, #2048)
Profile `.env` secret isolation fix (PR #351) preventing API key leakage between profiles, and session-import workspace validation (PR #2048) blocking a crafted-JSON file-read against `/`.
**[@Sanjays2402](https://github.com/Sanjays2402)** — Endless-scroll + Start-jump race fix (PR #1949)
A generation-token + mutex pair fixing the v0.51.30 race between endless-scroll prefetch and Start-jump's `_ensureAllMessagesLoaded`. The naive same-flag-check approach (proposed in #1942 and #1962) was a no-op for the post-await race — Sanjays2402's fix was the correct shape.
**[@fxd-jason](https://github.com/fxd-jason)** — Real-time approval + clarify via SSE (PRs #1350, #1355)
Replaced 1.5s HTTP polling with SSE long-connections for both approval and clarify, cutting latency from up to 1.5s to near-instant. Got all the correctness details right (atomic subscribe + snapshot, notify-inside-lock, head-of-queue payload, trailing event re-emission).
**[@happy5318](https://github.com/happy5318)** — Custom provider model dedup (PR #1947)
Fixed the same model from different named custom providers being silently deduplicated in the picker, with Opus catching a race in the original tests that needed augmentation.
**[@NocGeek](https://github.com/NocGeek)** — Streaming scroll + manual cron output persistence (7 PRs)
Streaming scroll viewport stability when tool/queue cards insert (#1360), manual cron-run output and metadata persistence (#1372, split from held #1352).
**[@DavidSchuchert](https://github.com/DavidSchuchert)** — German translation (PR #190)
Complete German locale (`de`) covering all UI strings, settings labels, commands, and system messages — and in doing so, stress-tested the i18n system and exposed several elements that weren't yet translatable, which got fixed as part of the same PR.
Complete German locale (`de`) covering all UI strings, settings labels, commands, and system messages — and stress-tested the i18n system, exposing several elements that weren't yet translatable and getting them fixed as part of the same PR.
**[@Jordan-SkyLF](https://github.com/Jordan-SkyLF)** — Live streaming, session recovery, workspace fallback (PRs #366, #367)
Three interlocking improvements: workspace fallback resolution so the server recovers gracefully when the configured workspace is deleted or unavailable; live reasoning cards that upgrade the generic thinking spinner to a real-time reasoning display as the model thinks; and durable session state recovery via `localStorage` so in-flight tool cards, partial assistant output, and the live SSE stream all survive a full page reload or session switch.
### Feature contributions
**[@gabogabucho](https://github.com/gabogabucho)** — Spanish locale + onboarding wizard (PRs #275, #285)
Full Spanish (`es`) locale covering all 175 UI strings, plus the one-shot bootstrap onboarding wizard that guides new users through provider setup on first launch — the feature most responsible for new users actually getting started.
**[@bergeouss](https://github.com/bergeouss)** — Provider management UI + gateway sync + Docker hardening (18 PRs, `v0.50.49``v0.50.240`)
Real-time gateway session sync (Telegram/Discord/Slack into the WebUI sidebar via SSE), the provider management UI for adding/editing custom providers from Settings, the two-container Docker setup docs, OAuth provider status detection, profile isolation hardening (per-profile `.env` secrets), and the bulk of what users see when they touch Settings → Providers.
**[@ccqqlo](https://github.com/ccqqlo)** — Terminal approval UX + custom model discovery + mobile close button (PRs #224, #225, #238, #333)
A run of focused quality-of-life improvements: terminal tool approval prompts that stay visible long enough to actually be read, restored custom model API key discovery, and the redundant mobile close button fix that had been confusing users on narrow screens.
**[@Bobby9228](https://github.com/Bobby9228)** — Mobile Profiles button (PR #265)
Added the Profiles entry to the mobile navigation flow, making profile switching reachable on phones.
**[@kevin-ho](https://github.com/kevin-ho)** — OLED theme (PR #168)
Added the 7th built-in theme: pure black backgrounds with warm accents tuned to reduce burn-in risk. Small diff, big impact for anyone on an OLED display.
**[@Bobby9228](https://github.com/Bobby9228)** — Mobile Profiles button + Android Chrome fixes (PRs #253, #263, #265)
Added the Profiles entry to the mobile navigation flow, making profile switching reachable on phones, plus a set of Android Chrome-specific fixes for the profile dropdown.
**[@franksong2702](https://github.com/franksong2702)** — Most prolific external contributor (22 PRs, `v0.50.49``v0.50.245`)
The session title guard, breadcrumb workspace navigation, mobile workspace panel sliver fix (#1300), composer footer container queries, streaming session sidebar exemption (#1327), session sidecar repair, cron output preservation (#1295), profile default workspace persistence, and a long tail of polish across the session sidebar, mobile responsive layout, and workspace state machine.
**[@betamod](https://github.com/betamod)** — Security hardening (PR #171)
A comprehensive security audit PR covering CSRF protection, SSRF guards, XSS escaping improvements, and the env race condition between concurrent agent sessions — foundational security work that shipped in v0.39.0.
**[@TaraTheStar](https://github.com/TaraTheStar)** — Bot name + thinking blocks + login refactor (PRs #132, #176, #181)
Made the assistant display name configurable throughout the UI, added thinking/reasoning block display in chat, and refactored the login page to use template variables instead of inline string replacement.
**[@thadreber-web](https://github.com/thadreber-web)** — CLI session bridge (PR #56)
The original CLI session bridge: reads CLI sessions from the agent's SQLite state store and surfaces them in the WebUI sidebar. This was the first bridge between the CLI and WebUI session worlds.
**[@deboste](https://github.com/deboste)** — Reverse proxy auth + mobile responsive layout + model routing (PRs #3, #4, #5)
Three of the very first community PRs: fixed EventSource/fetch to use the URL origin for reverse proxy setups, corrected model provider routing from config, and added mobile responsive layout with dvh viewport fix. Early foundation work.
### Bug fix and security contributions
**[@Hinotoi-agent](https://github.com/Hinotoi-agent)** — Profile .env secret isolation (PR #351)
Fixed API key leakage between profiles on switch — switching from a profile with `OPENAI_API_KEY` to one without it left the key in the process environment for the duration of the session, effectively leaking credentials. A subtle and important security fix.
**[@lawrencel1ng](https://github.com/lawrencel1ng)** — Bandit security fixes B310/B324/B110 + QuietHTTPServer (PR #354)
Systematic bandit security scan fixes: URL scheme validation before `urlopen`, MD5 `usedforsecurity=False`, and 40+ bare `except: pass` blocks replaced with proper logging — plus `QuietHTTPServer` to stop client-disconnect log spam from SSE streams.
**[@lx3133584](https://github.com/lx3133584)** — CSRF fix for reverse proxy on non-standard ports (PR #360)
Fixed CSRF rejection for deployments behind Nginx Proxy Manager or similar on non-standard ports — a real-world blocker for anyone hosting on a port other than 80/443.
**[@DelightRun](https://github.com/DelightRun)** — session_search fix for WebUI sessions (PR #356)
The `session_search` tool silently returned "Session database not available" in every WebUI session. Tracked down the missing `SessionDB` injection in the streaming path and fixed it.
**[@shaoxianbilly](https://github.com/shaoxianbilly)** — Unicode filename downloads (PR #378)
Fixed `UnicodeEncodeError` crashes when downloading workspace files with Chinese, Japanese, or other non-ASCII names. Implemented proper `Content-Disposition` header with RFC 5987 `filename*=UTF-8''...` encoding.
**[@huangzt](https://github.com/huangzt)** — Cancel interrupts agent (PR #244)
Made the Cancel button actually interrupt the running agent and clean up UI state, rather than just hiding the button while the agent kept running.
**[@tgaalman](https://github.com/tgaalman)** — Thinking card fix (PR #169)
Fixed top-level reasoning fields being missed in the thinking card display — an edge case in how Claude's extended thinking blocks surface in the API response.
**[@smurmann](https://github.com/smurmann)** — Custom provider routing fix (PR #189)
Fixed model routing for slash-prefixed custom provider models, which were being misrouted in the model selector. A precise fix for a real edge case in multi-provider setups.
**[@jeffscottward](https://github.com/jeffscottward)** — Claude Haiku model ID fix (PR #145)
Caught and corrected the Claude Haiku model ID (`3-5``4-5`) immediately after the Anthropic release — the kind of quick community catch that keeps the model dropdown accurate.
**[@kcclaw001](https://github.com/kcclaw001)** — Credential redaction in API responses (PR #243)
Added credential redaction to all API response paths so API keys, tokens, and other secrets in session data or error messages are masked before reaching the browser.
**[@mbac](https://github.com/mbac)** — Phantom "Custom" provider group fix (PR #191)
Removed the phantom "Custom" optgroup that appeared in the model dropdown even when no custom provider was configured — a small but consistently confusing UI noise issue.
The 7th built-in theme: pure black backgrounds with warm accents tuned to reduce burn-in risk.
**[@andrewy-wizard](https://github.com/andrewy-wizard)** — Chinese localization (PR #177)
Added Simplified Chinese (`zh`) locale to the WebUI. One of the first non-English locales and the most-used non-English locale in the codebase.
Initial Simplified Chinese (`zh`) locale. One of the first non-English locales.
**[@mmartial](https://github.com/mmartial)** — Docker UID/GID matching (PR #237)
Added Docker support for running as an arbitrary UID/GID matching the host user, eliminating permission issues with bind-mounted volumes — essential for Docker deployments where the host user isn't UID 1000.
**[@DelightRun](https://github.com/DelightRun)** — `session_search` fix for WebUI sessions (PR #356)
Tracked down the missing `SessionDB` injection in the streaming path that was silently breaking the tool for every WebUI session.
**[@vCillusion](https://github.com/vCillusion)** — pip package resolution fix (PR #76)
Fixed agent dependency resolution to prefer packages from the venv's site-packages over the agent directory itself, preventing shadowing bugs when developing locally.
**[@lawrencel1ng](https://github.com/lawrencel1ng)** — Bandit security fixes (PR #354)
Systematic bandit-scan fixes: URL scheme validation before `urlopen`, MD5 `usedforsecurity=False`, and 40+ bare `except: pass` blocks replaced with proper logging.
**[@carlytwozero](https://github.com/carlytwozero)** — API key pass-through for non-Anthropic providers (PR #78)
Fixed `api_key` not being passed to `AIAgent` for non-Anthropic `/anthropic` providers — a quiet regression that silently broke any non-default provider.
**[@shaoxianbilly](https://github.com/shaoxianbilly)** — Unicode filename downloads (PR #378)
Proper `Content-Disposition` with RFC 5987 `filename*=UTF-8''...` encoding so non-ASCII filenames download without crashing.
**[@mangodxd](https://github.com/mangodxd)** — Type hints cleanup (PR #115)
Added missing type hints across 10 files and corrected 9 inaccurate existing ones — the kind of maintenance work that makes the codebase easier to reason about.
**[@lx3133584](https://github.com/lx3133584)** — CSRF fix for reverse proxy (PR #360)
A real-world blocker for anyone hosting behind Nginx Proxy Manager or similar on a port other than 80/443.
**[@Argonaut790](https://github.com/Argonaut790)** — HTML entity decode + Traditional Chinese locale (PR #239)
Fixed double-escaping of HTML entities in `renderMd()` — LLM output containing `&lt;code&gt;` was being escaped a second time, rendering as literal text instead of the intended markdown. The same PR also completed the Simplified Chinese translation (40+ missing keys) and added a full Traditional Chinese (`zh-Hant`) locale.
**[@betamod](https://github.com/betamod)** — Security audit (PR #171)
A comprehensive CSRF / SSRF / XSS / env-race-condition audit that shipped in v0.39.0.
**[@indigokarasu](https://github.com/indigokarasu)** — Visual redesign proposal: icon rail + design token system + 7 themes (PR #213)
A CSS-only redesign of the full UI — proper design tokens (`--bg-primary`, `--text-info`, spacing scale), an icon rail sidebar replacing the emoji tab strip, consistent form cards, breadcrumb nav, and 7 built-in themes as custom properties. The PR didn't merge as-is but directly shaped the design language and theme architecture that shipped in v0.50.0.
**[@zenc-cp](https://github.com/zenc-cp)** — Anti-hallucination guard for ReAct loop (PR #133)
Added a streaming token buffer and post-run message scrub to `streaming.py` to detect and strip fake tool execution JSON that weaker models write inline instead of calling tools properly. A three-layer approach: ephemeral anti-hallucination prompt, live token filtering, and session history cleanup. The pattern influenced later streaming.py improvements.
---
Want to contribute? See [ARCHITECTURE.md](ARCHITECTURE.md) for the codebase layout and [TESTING.md](TESTING.md) for how to run the test suite. The best contributions are focused, well-tested, and solve a real problem — exactly what every person on this list did.
**[@TaraTheStar](https://github.com/TaraTheStar)** — Bot name + thinking blocks + login refactor (PRs #132, #176, #181)
Configurable assistant display name, thinking/reasoning block display, and a login page refactor.
## Repo
+297 -311
View File
@@ -1,363 +1,349 @@
# Hermes Web UI: Full Parity Roadmap
# Hermes Web UI Roadmap
> Goal: Full 1:1 parity with the Hermes CLI experience via a clean dark web UI.
> Everything you can do from the CLI terminal, you can do from this UI.
> Web companion to the Hermes Agent CLI. Same workflows, browser-native.
>
> Last updated: v0.50.278 (May 03, 2026) — 3936 tests collected
> Tests: `pytest tests/ --collect-only -q`
> Source: <repo>/
> Last updated: v0.51.31 (May 9, 2026) — 5028 tests collected — Release H 12-PR contributor batch (image-mode fix + race fixes + composer drafts + locale parity + custom-provider dedup + TTL config + heartbeat polish)
> Test source: `pytest tests/ --collect-only -q`
> Per-version detail: see [CHANGELOG.md](./CHANGELOG.md)
---
## Sprint History (Completed)
## Status snapshot
| Sprint | Theme | Highlights | Tests |
|--------|-------|-----------|-------|
| Sprint 1 | Bug fixes + foundations | B1-B11 fixed, LOCK on SESSIONS, section headers, request logging | 19 |
| Sprint 2 | Rich file preview | Image preview, rendered markdown, table support, smart icons | 27 |
| Sprint 3 | Panel nav + viewers | Sidebar tabs, cron/skills/memory panels, B6/B10/B14, Phase D start | 48 |
| Sprint 4 | Relocation + power features | Source to <repo>/, CSS extracted, session rename/search, file ops | 68 |
| Sprint 5 | Phase A complete + workspace | JS extracted (server.py 1778->1042 lines), workspace management, copy message, file editor, session index | 86 |
| Test hardening | Isolated test environment | Port 8788 test server, conftest autouse, cleanup_zero_message, 5 test files rewritten | 90 |
| Sprint 6 | Polish + Phase E complete | HTML to static/, resizable panels, cron create, session JSON export, Escape from editor | 106 |
| Sprint 7 | Wave 2 Core: CRUD + Search | Cron edit/delete, skill create/edit/delete, memory write, session content search, health improvements, git init | 125 |
| Sprint 8 | Daily Driver Finish Line | Edit+regenerate user messages, regenerate last response, clear conversation, Prism.js syntax highlighting, reconnect banner fix, session list scroll fix | 139 |
| Sprint 8 hotfix | Message queue + INFLIGHT fix | Queue messages while busy (toast + badge + auto-drain), INFLIGHT-first loadSession (message stays on switch-away/back) | 139 |
| Sprint 9 | Codebase health + daily driver gaps | app.js deleted and replaced by 6 modules, tool call cards inline, attachment persistence on reload, todo list panel | 149 |
| Sprint 10 | Server health + operational polish | server.py split into api/ modules, background task cancel, cron run history viewer, tool card UX polish | 167 |
| Sprint 10 fixes | Import regressions + regression tests | uuid, AIAgent, has_pending, SSE cancel loop, Session.__init__ tool_calls; test_regressions.py | 177 |
| Concurrency sweeps | Multi-session correctness | Approval cross-session (R10), activity bar per-session (R11), live cards on switch-back (R12), tool cards after done (R13), session model authoritative (R14), newSession cards (R15) | 190 |
| Sprint 11 | Multi-provider models + streaming | Dynamic model dropdown (any Hermes provider), smooth scroll pinning, routes extracted to api/routes.py (server.py 704→76 lines) | 201 |
| Sprint 12 | Settings + reliability + session QoL | Settings panel (gear icon, settings.json), SSE auto-reconnect, pin sessions, import session from JSON | 211 |
| Sprint 13 | Alerts + polish | Cron completion alerts (polling + badge), background error banner, session duplicate, browser tab title | 221 |
| Sprint 14 | Visual polish + workspace ops | Mermaid diagrams, message timestamps, file rename, folder create, session tags, session archive | 233 |
| Sprint 15 | Session projects + code copy | Session projects/folders, code block copy button, tool card expand/collapse toggle | 237 |
| Sprint 16 | Session sidebar visual polish | SVG action icons, session action dropdown, pin indicator, project border, safe HTML rendering | 289 |
| Sprint 17 | Workspace polish + slash commands + settings | Breadcrumb navigation, slash command autocomplete, send key setting (#26) | 318 |
| Sprint 18 | Thinking display + workspace tree | File preview auto-close, thinking/reasoning cards, expandable directory tree (#22) | 318 |
| Sprint 19 | Auth + security hardening | Password auth (off by default), login page, security headers, 20MB body limit (#23) | 328 |
| Sprint 20 | Voice input + send button | Voice input (Web Speech API), send button icon-circle with pop-in animation | 415 |
| Sprint 21 | Mobile responsive + Docker | Hamburger sidebar, mobile nav, files slide-over, Docker support (#21, #7) | 415 |
| Sprint 22 | Multi-profile support | Profile picker, management panel, seamless switching, per-session tracking (#28) | 415 |
| Sprint 23 | Agentic transparency | Token/cost display, subagent cards, skill picker in cron, skill linked files, workspace tree persistence, timestamp fixes | 424 |
| v0.44.0 patch | Fix batch: approval card, login CSP, update diagnostics, Lucide icons | PRs #221 #225 #226 #227 #228 | 579 |
| v0.45.0 | Custom endpoint in new profile form | Base URL + API key fields; server-side URL validation; config.yaml merge; 9 new tests (PR #233, fixes #170) | 604 |
| v0.46.0 | Security, Docker UID/GID, model discovery, i18n, cancel fix | Credential redaction in API responses (PR #243); Docker UID/GID matching (PR #237); custom model API key discovery (PR #238); HTML entity decode + zh/zh-Hant i18n (PR #239); cancel interrupts agent (PR #244); +20 tests | 624 |
| v0.47.0 | Dialogs, session menu, skills command, mobile fixes, mobile QA | Shared app dialogs (#251); session ⋯ menu (#252); mobile QA suite (#254); custom provider slash routing fix (#255); Android Chrome mobile fixes (#256); /skills command (#257); +21 tests | 645 |
| v0.47.1 | Spanish locale | Full Spanish (es) locale, 175 keys, key-parity tests (#275 @gabogabucho); +3 tests | 648 |
| v0.48.0 | Gateway session sync | Real-time Telegram/Discord/Slack sessions in sidebar via SSE + DB polling (#274 @bergeouss); +10 tests | 658 |
| v0.48.1 | Table inline formatting | `inlineMd()` in table cells — **bold**, *italic*, `code`, links render correctly (PR #278); 0 new tests | 658 |
| v0.48.2 | Provider mismatch warning | Toast warning + auth_mismatch error type for provider/model mismatches (#283, fixes #266); +21 tests | 679 |
| v0.49.1 | Docker docs + mobile Profiles button | Two-container Docker compose (#291/#288); Profiles added to the mobile navigation flow with correct panel wiring and SVG sizing (#297/#265 @gabogabucho); +3 tests | 700 |
| v0.49.0 | First-run onboarding wizard + self-update hardening | One-shot bootstrap + guided setup wizard; provider config persisted to config.yaml + .env; OpenRouter/Anthropic/OpenAI/Custom; wizard hidden after completion (#285); self-update stderr/split-ref/conflict fixes (#287); skip flaky redaction test (#289); +18 tests | 697 |
| v0.32 | Auto-compaction handling | Compression detection, /compact command, real context window indicator | 424 |
| v0.33 | /insights sync | Opt-in state.db sync so `hermes /insights` includes WebUI sessions | 424 |
| v0.34 | Sprint 26 — Pluggable themes | Dark, Light, Slate, Solarized, Monokai, Nord; settings unsaved-changes guard; /theme command | 433 |
| v0.34.1 | Theme variable polish | 30+ hardcoded dark-navy colors replaced with theme-aware CSS variables | 433 |
| v0.34.2 | Theme text colors | 5 new per-theme typography variables (--strong, --em, --code-text, --code-inline-bg, --pre-text) | 433 |
| v0.34.3 | Light theme final polish | 46 light-scoped selector overrides for sidebar, roles, chips, interactive elements | 433 |
| v0.35 | Security hardening | Env race fix, random signing key, upload path traversal, PBKDF2 password hash | 433 |
| v0.36v0.37 | Model routing, personality config, tool card reload, duplicate model fixes | Model routing by provider prefix, personality via config.yaml, tool cards reload on page refresh | 466 |
| v0.38.0v0.38.6 | Model selector, custom endpoints, OLED theme, reasoning display, insights sync | Custom endpoint URL fix, OLED theme, top-level reasoning field fix, message_count sync to state.db | 466 |
| v0.39.0 | Security hardening (Sprint 29) | CSRF, PBKDF2, rate limiting, session ID validation, SSRF, ENV_LOCK, XSS, HMAC, skills traversal, secure cookie, error sanitization, startup warning | 499 |
| v0.40v0.44.2 | Approval card + Lucide icons + sprint auth | Approval prompt surfaced in UI, emoji icons → Lucide SVG, login CSP inline fix, update diagnostics | 579 |
| v0.45v0.46 | Custom endpoints + security + i18n + cancel | Custom endpoint Base URL + API key on profile create, credential redaction (PR #243), Docker UID/GID (PR #237), HTML entity decode + zh/zh-Hant i18n, cancel interrupts agent | 624 |
| v0.47v0.47.1 | Dialogs + session menu + skills + mobile QA + Spanish | Shared app dialogs, session ⋯ menu, /skills command, mobile QA suite, Android Chrome fixes, Spanish locale (@gabogabucho) | 648 |
| v0.48v0.48.2 | Gateway session sync + table formatting + provider warnings | Real-time Telegram/Discord/Slack sessions in sidebar (@bergeouss), inlineMd() in table cells, provider/model mismatch toast | 679 |
| v0.49v0.49.1 | Onboarding wizard + Docker two-container | One-shot bootstrap + guided setup wizard, OpenRouter/Anthropic/OpenAI/Custom provider config, two-container Docker compose, mobile Profiles button | 700 |
| v0.50.0 | v0.50.0 UI overhaul (Sprint 34) | Composer-centric controls, Hermes Control Center modal, workspace panel state machine, collapsible date groups, rAF streaming throttle, context ring indicator (@aronprins) | 742 |
| v0.50.5v0.50.10 | Think-tag edge cases + onboarding hardening + mobile fixes | MiniMax M2.5 leading-whitespace think-tag fix, skip-onboarding env var, OAuth provider path, Docker bridge networks fix, model dropdown dedup, title auto-generation fix, mobile close button | 802 |
| v0.50.11v0.50.12 | Chat table styles + URL autolink + profile env isolation | .msg-body table borders, plain URL auto-linking, profile .env secret isolation on switch (prevents API key leakage across profiles, @Hinotoi-agent) | 815 |
| v0.50.13v0.50.15 | session_search + security sweep + KaTeX math | SessionDB injection for session_search in WebUI (@DelightRun), bandit B310/B324/B110 + QuietHTTPServer (@lawrencel1ng), KaTeX math rendering with fence-before-math fix | 871 |
| v0.50.16v0.50.17 | CSRF reverse proxy + Docker uv pre-install | Scheme-aware CSRF port normalization for non-standard ports (@lx3133584), Docker uv pre-installed at build time as root (fixes air-gapped startup, @mmartial-pattern) | 900 |
| v0.50.18v0.50.19 | Workspace fallback + Unicode filenames | Cascading workspace path recovery (@Jordan-SkyLF), Unicode Content-Disposition headers with RFC 5987 filename* (@shaoxianbilly), silent auth error surfacing, stale model cleanup | 924 |
| v0.50.20v0.50.21 | Silent errors + live model fetching + durable streaming recovery | apperror on empty agent response, /api/models/live endpoint with SSRF guard, live reasoning cards, tool_complete SSE events, SESSION_QUEUES, localStorage reload recovery (@Jordan-SkyLF) | 961 |
| v0.50.22v0.50.36-local.1 | Upstream sync + minimal local patch retention | Synced to upstream `v0.50.36`; retained first-password session continuity in Settings/onboarding; removed local Assistant Reply Language enhancement; added legacy settings cleanup regression coverage | 1059 |
| v0.50.37v0.50.40 | Sprint 40 — rendering fixes + KaTeX CSP + MEDIA images | Think-tag edge cases, renderMd link double-linking fix, MEDIA: inline image rendering, KaTeX CSP font-src fix | 1117 |
| v0.50.41v0.50.43 | Sprint 41/42 — context ring, session polish, renderMd hardening | Context indicator live usage, session display fixes, renderMd bold+code stash, outer link pass ordering, _ob_stash, autolink double-link fixes (@multiple contributors) | 1150 |
| v0.50.44 | Renderer formatting bug fixes (#486, #487) | CSS: inline code sizing in table cells; JS: markdown image syntax ![alt](url) → <img> in renderMd + inlineMd; _img_stash for autolink protection | 1195 |
| v0.50.45v0.50.100 | Upstream sync + contributor sprint | Sidebar declutter, SKIP_ONBOARDING, runtime route details, subpath mount, bug batch (light theme/panel/model cache/Docker), Docker UID/GID auto-detect, chat transcript redesign, favicon SVG+PNG+ICO, Docker UID-mismatch crash fix, auto-title markdown strip | 1777 |
| v0.50.101v0.50.139 | Contributor sprint wave | Custom providers, Russian locale, collapsed timestamps, IME composition fixes, model-switch toast, approval queue multi-slot, live model fetching SSRF guard, orphaned tool-message sanitization, profile polish sprint (model routing, workspace cross-profile, legacy session backfill), font-size CSS fix | 1777 |
| v0.50.140v0.50.147 | Bug batch + appearance | Font size setting visibly scales UI text (#843), slash command echoed as user message (#840), scroll selected item into view (#838), tasks refresh button (#835), font size toggle (#833), stale model fix (#829), session search clear on boot (#822), gateway SSE polling fallback (#635) | 1858 |
| v0.50.148v0.50.150 | Session index + read-path + profile | Prune stale _index.json ghost rows after session-id rotation (#847 @franksong2702), GET /api/session side-effect-free model resolution (#848 @franksong2702), profile switching cookie persist + syncTopbar fix (#849 @migueltavares) | 1858 |
| v0.50.151 | credential_pool + Ollama Cloud | Providers added via auth store credential_pool now visible in model dropdown; Ollama Cloud support; ambient gh-cli token suppression; _apply_provider_prefix helper (#820 @starship-s) | 1898 |
| v0.50.152 | Image rendering + auto-title | image_generate MEDIA: token renders all https:// URLs as img regardless of extension (closes #853); auto-title strips Qwen3-style plain-text thinking preambles (closes #857) | 1898 |
| v0.50.153 | Portal model routing | Live-fetched models from portal providers (Nous, OpenCode) now get @provider: prefix so they route correctly instead of falling through to OpenRouter (closes #854) | 1898 |
| v0.50.154 | Thinking card mirror fix | _streamDisplay() early return removed — thinking card and main response now show distinct content when provider double-emits (closes #852) | 1898 |
| v0.50.155 | Honcho session stability | gateway_session_key=session_id passed to AIAgent so Honcho per-session strategy maintains one Honcho session per WebUI chat instead of one per turn (closes #855) | 1903 |
| v0.50.156 | Auto-install security gate | auto_install_agent_deps() is now opt-in; set HERMES_WEBUI_AUTO_INSTALL=1 to enable; _trusted_agent_dir() checks ownership/permission bits before running pip (⚠️ breaking: default changed) | 1903 |
| Surface | Status |
|---|---|
| **Hermes CLI parity** | ✅ Complete — every CLI workflow has a web equivalent |
| **Streaming + tool transparency** | ✅ Live tool cards, reasoning cards, approval prompts, cancel |
| **Multi-provider model support** | ✅ Any provider configured in `config.yaml` shows in the picker |
| **Sessions + projects + search** | ✅ CRUD, content search, projects, tags, archive, fork, import |
| **Mobile + Docker + auth** | ✅ Hamburger nav, slide-overs, password auth, GHCR images |
| **Auxiliary surfaces** | ✅ Workspace tree + edit, cron CRUD, skills CRUD, memory write, MCP server UI |
| **Visual polish** | ✅ 8 themes (incl. light/system/OLED/Sienna), Mermaid, KaTeX, syntax highlighting |
| **Native distribution** | ✅ macOS desktop app (universal arm64+x86_64 DMG, signed) — separate repo |
Remaining gaps and forward work live in [Forward Work](#forward-work) below.
---
## Current Architecture Status
## Architecture
| Layer | Location | Status |
|-------|----------|--------|
| Python server | <repo>/server.py (~165 lines) + api/ modules (~5000 lines) | Thin shell + QuietHTTPServer + auth middleware + business logic in api/ |
| HTML template | <repo>/static/index.html (~600 lines) | Served from disk |
| CSS | <repo>/static/style.css (~1050 lines) | Served from disk, incl. mobile responsive, KaTeX, table styles |
| JavaScript | <repo>/static/{ui,workspace,sessions,messages,panels,boot,commands,icons,i18n,login}.js | 10 modules, ~7100 lines total |
| Docker | Dockerfile, docker-compose.yml, .dockerignore | python:3.12-slim, multi-arch (amd64+arm64) |
| CI/CD | .github/workflows/release.yml | Auto-release + GHCR publish on tag push |
| Runtime state | ~/.hermes/webui-mvp/sessions/ | Session JSON files |
| Test server | Port 8788 (conftest.py), port 8789 (browser sanity) | Isolated, wiped per run |
| Production server | Port 8787 | SSH tunnel from Mac |
| Layer | Files | Status |
|---|---|---|
| Python server | `server.py` (~165 lines) + `api/` modules (~20k lines) | Thin shell + auth middleware + business logic |
| HTML template | `static/index.html` (~600 lines) | Served from disk |
| CSS | `static/style.css` (~3k lines) | Themes, mobile responsive, KaTeX, table styles |
| JavaScript | `static/{ui,sessions,messages,workspace,panels,boot,commands,icons,i18n,login,onboarding}.js` (~26k lines) | 11 modules served as static files |
| Service worker | `static/sw.js` | Offline shell cache, version-pinned assets |
| Docker | `Dockerfile`, `docker-compose.yml` | `python:3.12-slim`, multi-arch (amd64+arm64), HEALTHCHECK |
| CI/CD | `.github/workflows/release.yml` | Auto-release + GHCR publish on tag push |
| Test isolation | `tests/_pytest_port.py` | Per-worktree port + state-dir derivation, no collisions |
---
## Feature Parity Checklist
## Feature parity checklist
### Chat and Agent
### Chat and streaming
- [x] Send messages, get SSE-streaming responses
- [x] Switch models per session (10 models, grouped by provider)
- [x] Composer-scoped model picker in footer (moved from sidebar to align with per-conversation model selection)
- [x] Multi-provider API support: use any Hermes agent API provider (OpenAI, Anthropic, Google, etc.) directly, not just OpenRouter (Sprint 11)
- [x] Custom endpoint model discovery: auto-detect models from Ollama, LM Studio, and other local LLM servers via base_url (PR #18)
- [x] Upload files to workspace (drag-drop, click, clipboard paste)
- [x] File tray with remove button
- [x] Tool progress shown inline in the conversation via live tool cards
- [x] Approval card for dangerous commands (Allow once/session/always, Deny)
- [x] Composer-scoped model picker (per-conversation model selection)
- [x] Multi-provider API support — OpenAI, Anthropic, Google, OpenRouter, xAI, GLM, DeepSeek, Mistral, MiniMax, Kimi, OpenCode, Nous Portal, custom OpenAI-compatible endpoints
- [x] Live custom-endpoint model discovery (Ollama, LM Studio, vLLM via `/v1/models`)
- [x] Free-form OpenRouter model name (autocomplete + custom input)
- [x] Tool progress shown inline via live tool cards
- [x] Approval card for dangerous commands (Allow once / session / always, Deny)
- [x] Approval polling + SSE-pushed approval events
- [x] Clarify dialog — agent can ask blocking clarifying questions
- [x] Subagent delegation cards in tool view
- [x] INFLIGHT guard: switch sessions mid-request without losing response
- [x] Session restores from localStorage on page load
- [x] Reconnect banner if page reloaded mid-stream
- [x] SSE auto-reconnect with stream replay
- [x] Token / cost estimate per message and per session
- [x] Context usage indicator (compact ring badge in composer footer)
- [x] Auto-compaction handling + `/compact` command
- [x] rAF-throttled token rendering (smooth, no DOM thrash)
- [x] Cancel / stop button in composer footer
- [x] Reasoning effort selector (low / medium / high / xhigh) + `/reasoning`
- [x] Pure-text streaming with crash-recovery — partial messages restored from localStorage on reload
### Conversation controls
- [x] Copy message to clipboard (hover icon on each bubble)
- [x] Edit last user message and regenerate
- [ ] Branch/fork conversation (Wave 3)
- [x] Token/cost estimate per message (Sprint 23)
### Tool Visibility
- [x] Tool progress in live tool cards (kept out of the composer/footer chrome)
- [x] Approval card with all 4 choices
- [x] Tool call cards inline (collapsed, show name/args/result)
### Workspace / Files
- [x] Workspace panel defaults closed and opens only for active browsing or preview
- [x] Browse workspace directory tree with type icons
- [x] Preview text/code files (read-only)
- [x] Preview markdown files (rendered, tables supported)
- [x] Preview image files (PNG, JPG, GIF, SVG, WEBP inline)
- [x] Edit files inline (Edit button, Enter to save, Escape to cancel)
- [x] Create new file (+ button in panel header)
- [x] Delete file (hover trash, confirmation modal)
- [x] File name truncation with tooltip for long names
- [x] Right panel resizable (drag inner edge)
- [x] Syntax highlighted code preview (Prism.js)
- [x] Rename file (Sprint 14)
- [x] Create folder (Sprint 14)
- [x] Shared app modal for confirm/input flows (Sprint 33)
- [x] Regenerate last response
- [x] Clear conversation (wipe messages, keep session)
- [x] Branch / fork conversation from any message point (#465)
- [x] Pure-text + tool-call streams both recover
### Sessions
- [x] Create session (+ button or Cmd/Ctrl+K)
- [x] Load session (click in sidebar)
- [x] Delete session (hover trash, toast, correct fallback)
- [x] Auto-title from first user message
- [x] Rename session title (double-click in sidebar, Enter saves, Escape cancels)
- [x] Filter/search sessions by title (live filter box)
- [x] Date group headers (Today / Yesterday / Earlier)
- [x] Download session as Markdown transcript
- [x] Export session as JSON (full messages + metadata)
- [x] Session inherits last-used workspace on creation
- [x] Session content search (search message text across sessions)
- [x] Session tags / labels (Sprint 14)
- [x] Archive sessions (Sprint 14)
- [x] Clear conversation (wipe messages, keep session) (Wave 3)
- [x] Import session from JSON (Sprint 12)
- [x] Pin/star sessions to top of list (Sprint 12)
- [x] Duplicate session (Sprint 13)
- [x] Session projects / folders (Sprint 15)
- [x] Delete session (hover trash, toast undo, fallback)
- [x] Auto-title from first user message + adaptive title refresh (configurable cadence)
- [x] LLM-generated titles via auxiliary route (configurable model)
- [x] Rename session inline (double-click, Enter saves, Escape cancels)
- [x] Title search (live filter)
- [x] Content search (full-text across all sessions)
- [x] Date group headers (Today / Yesterday / Earlier) with collapsible groups
- [x] Pin / star sessions to top
- [x] Duplicate session
- [x] Import / Export session as JSON (full messages + metadata)
- [x] Download as Markdown transcript
- [x] Tags (`#tag` extraction + filter chips)
- [x] Archive sessions (hidden by default, "Show N archived" toggle)
- [x] Projects / folders (chip filter bar, "Unassigned" filter)
- [x] Per-session profile tracking
- [x] Per-session toolset override (`/toolsets`)
- [x] Batch select mode (multi-select, bulk delete / move / archive)
- [x] CLI session bridge — read CLI sessions from state.db, import as WebUI sessions
### Workspace Management
- [x] Add workspace with path validation (must be existing directory)
- [x] Remove workspace
- [x] Rename workspace display name
- [x] Quick-switch workspace from topbar dropdown
- [x] Sidebar live workspace display (name + path, updates in real time)
- [x] New sessions inherit last used workspace
- [x] Workspace list persists to workspaces.json
- [ ] Workspace reorder (drag) (Wave 2)
### Workspace and files
- [x] Add workspace with path validation (existing directory, follows symlinks)
- [x] Remove / rename workspace
- [x] Quick-switch from topbar dropdown
- [x] Sidebar live workspace display (name + path)
- [x] New sessions inherit last-used workspace
- [x] Browse workspace directory tree with type icons
- [x] Tree view with expand / collapse + lazy load (#22)
- [x] Breadcrumb navigation in subdirectories
- [x] Preview text / code (read-only)
- [x] Preview markdown (rendered + tables + Mermaid + KaTeX)
- [x] Preview images (PNG, JPG, GIF, SVG, WEBP, AVIF inline)
- [x] Preview PDF / SVG / audio / video / Excalidraw / CSV / JSON / YAML
- [x] Edit files inline (Edit button, Enter saves, Escape cancels)
- [x] Create / rename / delete files and folders (in current directory)
- [x] Drag-drop / click / clipboard paste upload
- [x] Archive upload (zip / tar) with extraction
- [x] Syntax highlighted code preview (Prism.js, language-aware)
- [x] File preview auto-close on directory navigation
- [x] Right panel resizable (drag inner edge)
- [x] Embedded workspace terminal (`/api/terminal/{start,input,output}`)
- [x] Git branch + dirty status badge in workspace header
### Scheduled Tasks (Cron)
- [x] View all cron jobs (Tasks sidebar tab)
- [x] View last run output per job (auto-loaded on expand)
- [x] Expand job to see prompt, schedule, last output
- [x] Run job manually (Run now button)
- [x] Pause / Resume job
- [x] Create cron job from UI (+ New job form with name, schedule, prompt, delivery)
- [x] Edit existing cron job
- [x] Delete cron job
- [x] View full cron run history (expandable per job)
- [x] Skill picker in cron create form (Sprint 23)
### Cron jobs
- [x] List all cron jobs (Tasks sidebar tab)
- [x] View job details (prompt, schedule, last run, output)
- [x] Run / pause / resume / delete
- [x] Create job from UI (name, schedule, prompt, delivery target)
- [x] Edit job inline (full create-form parity, including skills)
- [x] Skill picker in create + edit forms
- [x] Cron run history viewer (expandable per job)
- [x] Cron completion alerts (toast + badge)
- [x] Run-status tracking with live watch mode
### Skills
- [x] List all skills grouped by category (Skills sidebar tab)
- [x] Search/filter skills by name, description, category
- [x] View full SKILL.md content in right preview panel
- [x] Create skill
- [x] Edit skill
- [x] Delete skill
- [x] View skill linked files (Sprint 23)
- [x] List all skills grouped by category
- [x] Search / filter by name, description, category
- [x] View full SKILL.md content
- [x] View skill linked files
- [x] Create / edit / delete skill
- [x] `/skills` slash command
### Memory
- [x] View personal notes (MEMORY.md) rendered as markdown (Memory tab)
- [x] View user profile (USER.md) rendered as markdown (Memory tab)
- [x] Last-modified timestamp on each section
- [x] Add/edit memory entry inline
### Configuration
- [x] Settings panel (default model, default workspace) (Sprint 12)
- [x] Send key preference (Enter or Ctrl+Enter) (Sprint 17)
- [x] Password authentication (Sprint 19)
- [ ] Enable/disable toolsets per session (deferred)
### Notifications
- [x] Cron job completion alerts (Sprint 13)
- [x] Background agent error alerts (Sprint 13)
### Workspace
- [x] Breadcrumb navigation in subdirectories (Sprint 17)
- [x] Workspace tree view with expand/collapse (Sprint 18, Issue #22)
- [x] File preview auto-close on directory navigation (Sprint 18)
### Slash Commands
- [x] Command registry + autocomplete dropdown (Sprint 17)
- [x] Built-in: /help, /clear, /model, /workspace, /new (Sprint 17)
### Security
- [x] Password auth with signed cookies (Sprint 19, Issue #23)
- [x] Security headers (X-Content-Type-Options, X-Frame-Options) (Sprint 19)
- [x] POST body size limit (20MB) (Sprint 19)
### Thinking / Reasoning
- [x] Collapsible thinking cards for extended-thinking models (Sprint 18)
### Voice
- [x] Voice input via Web Speech API (Sprint 20)
### Mobile
- [x] Mobile responsive layout — hamburger sidebar, sidebar tabs on phones, files slide-over (Sprint 21 + later mobile nav simplification)
- [x] View personal notes (MEMORY.md) rendered as markdown
- [x] View user profile (USER.md) rendered as markdown
- [x] Last-modified timestamp per section
- [x] Add / edit memory entries inline
### Profiles
- [x] Multi-profile support — create, switch, delete profiles (Sprint 22, Issue #28)
- [x] Multi-profile support — create, switch, delete (#28)
- [x] Topbar profile picker with gateway-status dots
- [x] Profile management panel (full CRUD)
- [x] Seamless switching (no server restart, refreshes models / skills / memory / cron / workspace)
- [x] Profile-local workspace storage
- [x] First-run onboarding wizard with provider config (OpenRouter / Anthropic / OpenAI / Custom)
- [x] In-app OAuth for Codex and Claude
### Advanced / Future
- [ ] Subagent session tree -- show subagent hierarchy in sidebar with expand/collapse (PR #75)
- [ ] Specialized tool card renderers -- diff viewer, terminal output, todo checklist views (PR #75)
- [x] Streaming performance -- rAF-throttled token rendering (Sprint 24, PR #81)
- [x] Workspace git detection -- branch name and dirty status badge (Sprint 24, PR #82)
- [x] Collapsible date groups -- click group headers to collapse (Sprint 24, PR #80)
- [x] Context usage indicator -- compact circular badge in composer footer (Sprint 24, PR #83; refreshed April 10, 2026)
- [ ] LLM-generated session titles -- auto-title via small model instead of first-message substring (PR #75)
- [ ] Workspace git detection -- show branch name, dirty status in workspace header (PR #75)
- [ ] Clarify dialog -- agent can ask clarifying questions that block until user responds (PR #75)
- [ ] Gateway approval polling -- support blocking approvals from messaging gateway (PR #75)
- [ ] Unified session storage -- SessionDB shared between webui and CLI (PR #75)
- [ ] TTS playback of responses (deferred)
- [x] Background task cancel (composer footer stop button)
- [ ] Code execution cell (deferred)
- [ ] Desktop application (Sprint 25, PLANNED)
- [x] Pluggable UI themes -- Dark, Light, Slate, Solarized, Monokai, Nord (Sprint 26, v0.34)
- [ ] Extended slash command / skill integration (deferred)
- [ ] Virtual scroll for large lists (deferred)
### Configuration
- [x] Settings panel (default model, default workspace, send key, theme, voice, font size)
- [x] Send key preference (Enter or Ctrl+Enter)
- [x] Password authentication (off by default)
- [x] Per-session toolset override
- [x] Personality config via `config.yaml`
- [x] Reasoning effort persistence
### Notifications
- [x] Cron job completion alerts
- [x] Background agent error banner
- [x] Approval pending badge
- [x] Provider / model mismatch toast warning
### Slash commands
- [x] Command registry + autocomplete dropdown
- [x] Built-ins: `/help`, `/clear`, `/model`, `/workspace`, `/new`, `/usage`, `/theme`, `/compact`, `/queue`, `/interrupt`, `/steer`, `/goal`, `/btw`, `/reasoning`, `/skills`, `/toolsets`
- [x] Transparent pass-through for unrecognized commands
### Security
- [x] Password auth with signed HMAC HTTP-only cookies (24h TTL)
- [x] Security headers (X-Content-Type-Options, X-Frame-Options, Referrer-Policy)
- [x] CSRF protection (scheme-aware, port-normalized for reverse proxies)
- [x] PBKDF2 password hashing
- [x] Rate limiting on auth endpoints
- [x] Session ID validation
- [x] SSRF guard on `/api/models/live`, `cfg_base_url`, `custom_providers[]`
- [x] ENV_LOCK around env mutations
- [x] XSS sanitization on all rendered HTML
- [x] HMAC-signed signing keys (random per install)
- [x] Skills path-traversal guard
- [x] Secure cookie flags (HttpOnly, SameSite, Secure when HTTPS)
- [x] Error message sanitization (no stack traces in responses)
- [x] POST body size limit (20MB)
- [x] Upload path-traversal guard
- [x] Credential redaction in API responses
- [x] Profile `.env` secret isolation on switch
- [x] Auto-install gate (opt-in via `HERMES_WEBUI_AUTO_INSTALL=1`)
### Visual / UX
- [x] 8 themes — Dark, Light, System (auto-sync), Slate, Solarized, Monokai, Nord, OLED, Sienna
- [x] 2-axis appearance model (theme + skin) for community theme contributions
- [x] Mermaid diagram rendering
- [x] KaTeX math rendering with fence-before-math fix
- [x] Syntax highlighting (Prism.js, language-aware, YAML newline preservation)
- [x] Markdown image syntax `![alt](url)` and inline MEDIA: tokens render as `<img>`
- [x] Plain URL auto-linking
- [x] Inline markdown in table cells (bold, italic, code, links)
- [x] Code block copy button
- [x] Tool card expand / collapse toggle
- [x] Collapsible thinking / reasoning cards (Claude extended thinking, o3 reasoning tokens)
- [x] Message timestamps (subtle, full date on hover)
- [x] Empty composer hides send button (icon-circle with pop-in animation)
- [x] Pluggable Lucide SVG icons (no emoji rendering inconsistencies)
- [x] Composer-centric controls (v0.50.0 UI overhaul)
- [x] Hermes Control Center modal (centralized actions)
- [x] Workspace panel state machine (defaults closed, opens for browsing / preview)
- [x] PWA manifest + service worker (offline shell)
- [x] Favicon (SVG + PNG + ICO)
- [x] Branded onboarding wizard
### Voice
- [x] Voice input via Web Speech API (push-to-talk dictation)
- [x] Hands-free voice mode (turn-based conversation, opt-in via Settings → Preferences)
- [x] TTS playback of responses (configurable voice, rate, pitch)
### Mobile
- [x] Hamburger sidebar (slide-in overlay)
- [x] Bottom navigation bar (5-tab iOS-style)
- [x] Files slide-over (right panel as slide-over)
- [x] 44px minimum touch targets
- [x] Container queries on composer
- [x] Android Chrome compatibility fixes
- [x] PWA installation (manifest + icons + Android support)
### Internationalization
- [x] 9 locales — English, Japanese, Russian, Spanish, German, Chinese (zh + zh-Hant), Portuguese, Korean, French
- [x] Key-parity test ensures every locale has every key
- [x] Right-to-left and CJK input (IME composition fixes)
### Gateway integration
- [x] Real-time gateway sessions in sidebar (Telegram, Discord, Slack, Weixin) via SSE + DB polling
- [x] Cross-channel handoff dock — composer-docked flyout summarizing the live external session
- [x] Transcript-summary card at 10+ rounds
- [x] Sidebar dedup keying on per-conversation identity (distinct chats from same platform stay separate)
- [x] Gateway session sync skips dup / delete options for external sessions
- [x] LLM Gateway routing metadata display — assistant turns and session metadata show the served model/provider, failover path, and model-switch warnings when response metadata includes `used_provider`, `used_model`, or `routing` (#732)
### MCP integration
- [x] MCP server management UI (System Settings → MCP Servers)
- [x] Add / edit / delete MCP server entries
### Distribution
- [x] Docker support (multi-arch amd64 + arm64, HEALTHCHECK, UID/GID auto-detect)
- [x] Two-container Docker compose (webui + agent)
- [x] GHCR auto-publish on tag push
- [x] Subpath mount support (reverse proxy at `/hermes/`)
- [x] PWA installable from any browser
- [x] Native macOS app — universal Intel + Apple Silicon, signed + notarized DMG, Sparkle 2 auto-update — see `hermes-webui/hermes-swift-mac` repo
---
## Sprint 7: Wave 2 Core -- Cron/Skill/Memory CRUD + Session Content Search (COMPLETED)
## Forward work
**Theme:** "Wave 2 Core -- Cron/Skill/Memory CRUD + Session Content Search"
### Confirmed candidates (open feature requests with sprint-candidate or active interest)
### Track A: Bug Fixes
| Item | Description |
|------|-------------|
| Activity bar sizing | Activity bar sometimes overlaps first message on short viewports |
| Model dropdown sync | Model chip in topbar sometimes shows stale model after session switch |
| Cron output truncation | Long cron output in the tasks panel overflows its container |
| Theme | Tracking | Why |
|---|---|---|
| Persistent-host stability | #1458 | Bootstrap fork pattern crashes under launchd / systemd — partial fix shipped (foreground mode); state.db FD leak and HTTP-unhealthy wedge remain |
| Free-tier OpenRouter variants visible | #1426 | `:free` tool-support filter currently hides them from the picker |
| macOS scroll override regression | #1360 | Auto-scroll sometimes overrides user scroll on the desktop app |
| GLM dual-use (main + auxiliary) | #1291 | Currently mutually exclusive; same provider can't serve both surfaces |
| Auto-assign session to filtered project | #1468 | When user is filtering by project X, new session should default to project X |
| Update banner "What's new?" link | #1512 | Surface release highlights from the update banner |
| Sunset legacy `LMSTUDIO_API_KEY` env var | #1502 | Tracking issue — alias stays for one minor cycle, then removed |
| Hermes Agent dashboard cross-link | #1459 | Detect a running Hermes Agent and surface link in nav |
| Gateway status card in Settings | #1457 | Current gateway-status dots only on profile picker |
| Insights — daily token chart + per-model breakdown | #1456 | Existing usage badge is per-message; need rollup view |
| Logs tab — view agent / errors / gateway logs | #1455 | Currently requires terminal access to log files |
| Model picker collision handling | #1425 | Same-name models from different providers aren't disambiguated in dropdown |
| "Reveal in Finder" right-click on workspace | #1424 | macOS desktop app convenience |
| Configurable session persistence timing | #1406 | Currently every checkpoint, want operator control |
| Silent credential self-heal on 401 | #1401 | Gateway auth.json drift should resolve without user re-auth |
| LLM Wiki status panel | #1257 | On / off toggle for Wiki integration |
| Lightweight in-app Canvas editing | #1255 | Text canvas for prompt drafting / shared notes |
| Provider / Model source-of-truth alignment | #1240 | Reconcile WebUI vs CLI vs Gateway provider resolution |
| Built-in SearXNG web search | #1037 | Lightweight search tool with on / off toggle |
| Subagent session relationship view | #1004 | Show subagent hierarchy in sidebar with expand / collapse |
### Track B: Features
| Feature | What | Value |
|---------|------|-------|
| Session content search | Search message text across all sessions, not just titles. GET /api/sessions/search already does title search; extend to message content with a configurable depth limit | High: the single most-requested nav feature after rename |
| Cron edit + delete | Edit an existing cron job (name, schedule, prompt, delivery) inline in the tasks panel. Delete with confirm. POST /api/crons/update and /api/crons/delete | High: closes the cron CRUD gap (create was Sprint 6) |
| Skill create + edit | A "New skill" form in the Skills panel. Name, category, SKILL.md content in a textarea editor. Save calls POST /api/skills/save (writes to ~/.hermes/skills/). Edit opens existing skill in the same editor | High: biggest remaining CLI gap after cron |
### Backlog (deferred, listed for visibility)
### Track C: Architecture
| Item | What |
|------|------|
| Phase E: app.js module split (start) | Split app.js (1332 lines) into logical modules: sessions.js, chat.js, workspace.js, panels.js, ui.js. Serve via ES module imports in index.html. This is Phase E completion. |
| Health endpoint improvement | Add active_streams, uptime_seconds to /health response (Phase G) |
| Git init | git init <repo>, first commit, push to private GitHub repo |
- **Insights / monitoring suite** — agent heartbeat + alerts (#716), quota / rate-limit display (#706), data tabs (#722), monitor dashboard concepts (#766, #721)
- **Native MCP server expose** — Hermes WebUI as an MCP server for direct agent integration (#733)
- **Teams / agents management panel** — editable names, roles, assignments (#719)
- **Web UI profile model alignment with Hermes runtime** — design parity (#749)
- **DOM windowing / message virtualization** — for sessions with hundreds of messages (#734)
- **Searchable global tool list** (#697)
- **Add agent / replace model modals** (#698)
- **Code execution inline cells** — Jupyter-style cell rendering inside chat
- **Sharing / public conversation URLs** — requires hosted backend with access control (out of scope for self-host)
### Tests
- ~20 new pytest tests (cron update/delete, skill save, session content search)
- TESTING.md: Sections 29-31 (cron edit, skill edit, session search)
- Estimated total after Sprint 7: ~126
### Intentionally not planned
- Full SwiftUI rewrite of the frontend — the WKWebView shell already gets 95% of native benefit
- App Store distribution — sandboxing breaks the local server model
- Real-time multi-user collaboration — single-user assumption throughout
- Plugin marketplace — Hermes skills cover this surface
- Anthropic / Claude proprietary features — Projects AI memory, Claude artifacts sync (not reproducible)
---
## Wave 2: Full CRUD and Interaction Parity
## Sprint history
**Status:** In progress. Sprint 6 completed cron create and workspace management.
Remaining Wave 2 items targeted for Sprints 7-8.
Per-version detail lives in [CHANGELOG.md](./CHANGELOG.md). The table below is a high-level chronology of major sprint themes; individual PR / fix detail moved to CHANGELOG to keep this file readable.
### Sprint 2.0: Workspace Management (COMPLETE Sprint 5+6)
All workspace features delivered: add/validate/remove/rename workspaces, topbar quick-switch,
sidebar live display, new sessions inherit last workspace. See Sprint 5 completed section.
### Sprint 2.1: Cron Job Management (Partial -- Sprint 7 for remaining)
- [x] View all jobs (Sprint 3)
- [x] Run / pause / resume (Sprint 3)
- [x] Create job from UI (Sprint 6)
- [x] Edit job
- [x] Delete job
- [x] Full cron run history
### Sprint 2.2: Skill Management (Partial -- Sprint 7 for remaining)
- [x] List all skills with categories (Sprint 3)
- [x] View SKILL.md content (Sprint 3)
- [x] Create skill
- [x] Edit skill
- [x] Delete skill
### Sprint 2.3: Memory Write (Sprint 7)
- [x] View notes + profile (Sprint 3)
- [x] Edit notes inline
### Sprint 2.4: Todo Management (Wave 2)
- [x] View current todo list (sidebar Todo panel, parsed from session history)
### Sprint 2.5: Session Content Search (Sprint 7)
- [x] Session title search (Sprint 4)
- [x] Message content search across sessions
### Sprint 2.6: Session Rename (COMPLETE Sprint 4)
Double-click any session title in the left sidebar to edit inline.
Enter saves, Escape cancels. Topbar updates immediately.
| Range | Theme | Highlights |
|---|---|---|
| Sprints 16 | Foundations + workspace | server / static split, JS module split, workspace CRUD, file editor, message queue + INFLIGHT, isolated test environment |
| Sprint 7 | Wave 2 core | Cron / skill / memory CRUD, session content search, health endpoint, git init |
| Sprint 8 | Daily-driver finish line | Edit + regenerate, regenerate last response, clear conversation, Prism.js, queue + INFLIGHT polish |
| Sprints 910 | Codebase health + operational polish | `app.js` → 6 modules, server.py → `api/` modules, tool card UX, background task cancel, regression tests |
| Sprint 11 | Multi-provider models + streaming | Dynamic model dropdown, smooth scroll pinning, routes extracted to `api/routes.py` |
| Sprint 12 | Settings + reliability + session QoL | Settings panel, SSE auto-reconnect, pin sessions, JSON import |
| Sprint 13 | Alerts + polish | Cron alerts, background error banner, session duplicate, browser tab title |
| Sprint 14 | Visual polish + workspace ops | Mermaid, message timestamps, file rename, folder create, session tags, archive |
| Sprint 15 | Session projects + code copy | Projects / folders, code copy button, tool card expand / collapse |
| Sprint 16 | Sidebar visual polish | SVG icons, action dropdown, pin indicator, project border, safe HTML rendering |
| Sprint 17 | Workspace polish + slash commands | Breadcrumb nav, slash command autocomplete, send key setting (#26) |
| Sprint 18 | Thinking display + workspace tree | File preview auto-close, thinking / reasoning cards, expandable directory tree (#22) |
| Sprint 19 | Auth + security hardening | Password auth, login page, security headers, body limit (#23) |
| Sprint 20 | Voice input + send button | Web Speech API voice, send button polish |
| Sprint 21 | Mobile responsive + Docker | Hamburger sidebar, mobile nav, slide-over files, Docker support (#21, #7) |
| Sprint 22 | Multi-profile support | Profile picker, management panel, seamless switching, per-session tracking (#28) |
| Sprint 23 | Agentic transparency | Token / cost display, subagent cards, skill picker in cron, profile-local storage |
| Sprint 24 | Web polish | rAF streaming, git detection, collapsible date groups, context ring (#80, #81, #82, #83) |
| Sprint 25 | macOS desktop application | Native Swift + WKWebView shell, universal DMG, Sparkle 2 auto-update — separate repo |
| Sprint 26 | Pluggable themes | Light / Slate / Solarized / Monokai / Nord, settings unsaved-changes guard, `/theme` |
| Sprint 27 | Theme polish | 30+ hardcoded colors → CSS variables, light theme final polish |
| Sprint 28 | Security hardening | Env race fix, random signing key, upload traversal, PBKDF2 |
| Sprints 2932 | Model routing + custom endpoints + reasoning | Model routing by provider prefix, custom endpoint URL fix, OLED theme, top-level reasoning, message_count sync |
| Sprint 33 | Approval card + Lucide icons | Approval prompt surfaced, emoji → SVG, login CSP fix, update diagnostics |
| Sprint 34 | v0.50.0 UI overhaul | Composer-centric controls, Control Center modal, workspace state machine, collapsible date groups, rAF throttle, context ring |
| Sprints 3537 | Onboarding + i18n + Spanish | First-run wizard, OpenRouter / Anthropic / OpenAI / Custom config, Spanish locale, Docker two-container, mobile Profiles button |
| Sprints 3840 | Session + UI polish + Sprint 40 | Five-bug clean-up + sidebar timestamp + test port isolation |
| Sprints 4142 | Renderer hardening + KaTeX + handoff | Context ring live usage, renderMd link / image / code stash chain, MEDIA: image rendering, gateway handoff foundation |
| Sprints 43+ | Continuous contributor sprints | Custom providers, Russian locale, IME fixes, model-switch toast, approval queue multi-slot, profile polish, font-size CSS, contributor wave |
---
## Completed Waves (Summary)
## Versioning conventions
| Wave | Theme | Key Deliverables |
|------|-------|-----------------|
| Wave 2 | Full CRUD + Interaction | Cron/skill/memory CRUD, session search, workspace management, session rename |
| Wave 3 | Power Features | Tool call cards, multi-model dropdown, resizable panels, file actions, conversation controls |
| Wave 4 | Settings + Notifications | Settings panel, cron alerts, background error banner |
| Wave 5 | Session Continuity | Session tags, archive, projects/folders |
| Wave 6 | Agentic Features | Background task cancel, voice input (Web Speech API) |
| Wave 7 | Production Hardening | Password auth, security headers, mobile responsive, Docker + GHCR CI |
- **Patch** (`v0.50.X`) — small batches, contributor PR releases, hotfixes
- **Minor** (`v0.X.0`) — sprint completion, new feature surface, architecture milestone
- **Major** (`v1.0.0`) — declared when CLI parity + Claude parity reach steady state and the feature surface stabilizes
---
## User Requested Features
Community-requested enhancements tracked from GitHub issues. All shipped.
| Feature | Issue | Shipped | Sprint |
|---------|-------|---------|--------|
| Workspace tree view | #22 | Done | Sprint 18 |
| Docker container + GHCR images | #7 | Done | Sprint 21 + v0.28.1 CI |
| Authentication | #23 | Done | Sprint 19 |
| Send key / personalization | #26 | Done | Sprint 17 |
| Multi-profile support | #28 | Done | Sprint 22 |
| Mobile responsive UI | #21 | Done | Sprint 21 |
| Profile creation in Docker | #44 | Done | v0.27 |
Per-version detail and contributor attribution live in [CHANGELOG.md](./CHANGELOG.md).
+92 -1104
View File
File diff suppressed because it is too large Load Diff
+2 -2
View File
@@ -1835,8 +1835,8 @@ Bridged CLI sessions:
---
*Last updated: v0.50.278, May 03, 2026*
*Total automated tests collected: 3936*
*Last updated: v0.51.31, May 9, 2026*
*Total automated tests collected: 4977*
*Regression gate: tests/test_regressions.py*
*Run: pytest tests/ -v --timeout=60*
*Source: <repo>/*
+330
View File
@@ -0,0 +1,330 @@
"""Hermes agent/gateway heartbeat payload helpers (#716, #1879).
The WebUI process is not always paired with a long-running Hermes gateway. Some
setups use WebUI only, while self-hosted messaging deployments run a separate
Hermes gateway daemon that records runtime metadata in the Hermes Agent home.
This module turns those existing safe runtime signals into a small UI-facing
heartbeat without shelling out or adding psutil as a hard dependency.
Cross-container note (#1879): ``gateway.status.get_running_pid()`` uses
``fcntl.flock`` and ``os.kill(pid, 0)``, both of which require the caller to
share a PID namespace with the gateway process. In multi-container deployments
where the WebUI runs separately from ``hermes-agent`` and only a Hermes data
volume is shared, those checks always return ``None`` and the dashboard
incorrectly shows "Gateway not running". To stay accurate without forcing a
``pid: "service:hermes-agent"`` compose workaround, we accept a recent
``updated_at`` timestamp on ``gateway_state.json`` (combined with
``gateway_state == "running"``) as an equivalent live-process signal the
gateway already writes that file on every tick.
"""
from __future__ import annotations
import importlib
import json
from datetime import datetime, timezone
from pathlib import Path
from typing import Any
_GATEWAY_PID_FILE = "gateway.pid"
_GATEWAY_RUNTIME_STATUS_FILE = "gateway_state.json"
# Two cron ticks (~60s each). Chosen to avoid false negatives during brief
# gateway restarts while still surfacing a true outage within a couple of
# minutes. Override is intentionally not exposed: keep the check deterministic
# and identical across deployments so support diagnostics are reproducible.
GATEWAY_FRESHNESS_THRESHOLD_S: float = 120.0
def _checked_at() -> str:
return datetime.now(timezone.utc).isoformat()
def _runtime_status_is_fresh(
runtime_status: dict[str, Any] | None,
*,
now: datetime | None = None,
threshold_s: float = GATEWAY_FRESHNESS_THRESHOLD_S,
) -> bool:
"""Return ``True`` when ``gateway_state.json`` looks freshly written.
"Fresh" means the gateway self-reported ``running`` and the ``updated_at``
ISO-8601 timestamp is no older than ``threshold_s`` seconds. This is the
cross-container liveness signal used when ``get_running_pid()`` returns
``None`` purely because of PID-namespace isolation (#1879).
Any unparseable input is treated as "not fresh" a stale or missing
timestamp must never report alive.
"""
if not isinstance(runtime_status, dict):
return False
if runtime_status.get("gateway_state") != "running":
return False
raw_updated_at = runtime_status.get("updated_at")
if not isinstance(raw_updated_at, str) or not raw_updated_at:
return False
# ``datetime.fromisoformat`` accepts the exact format gateway/status.py
# writes (``datetime.now(timezone.utc).isoformat()``). We deliberately
# don't pull in dateutil — keeping this stdlib-only matches the rest of
# this module.
try:
updated_at = datetime.fromisoformat(raw_updated_at)
except (TypeError, ValueError):
return False
if updated_at.tzinfo is None:
# A naive timestamp could mean anything across containers / hosts.
# Refuse to interpret it rather than assume UTC.
return False
reference = now if now is not None else datetime.now(timezone.utc)
age_s = (reference - updated_at).total_seconds()
if age_s < 0:
# Clock skew between containers can produce small negatives. A future
# timestamp is still a "fresh" signal — the gateway clearly wrote it
# very recently — so accept it. A wildly-future timestamp (> threshold
# in the future) is rejected to avoid trusting a broken clock.
return -age_s <= threshold_s
return age_s <= threshold_s
def _runtime_status_is_stale_stopped(
runtime_status: dict[str, Any] | None,
*,
now: datetime | None = None,
threshold_s: float = GATEWAY_FRESHNESS_THRESHOLD_S,
) -> bool:
"""Return ``True`` for an old clean-stop root gateway state.
A user may run only profile-scoped gateways while a root
``gateway_state.json`` from an older, intentionally stopped gateway remains
on disk (#1944). Treat that stale stopped file like "no root gateway
configured" so the heartbeat banner does not keep warning about a service
the user is not running. Fresh stopped state still reports down.
"""
if not isinstance(runtime_status, dict):
return False
if runtime_status.get("gateway_state") != "stopped":
return False
raw_updated_at = runtime_status.get("updated_at")
if not isinstance(raw_updated_at, str) or not raw_updated_at:
return False
try:
updated_at = datetime.fromisoformat(raw_updated_at)
except (TypeError, ValueError):
return False
if updated_at.tzinfo is None:
return False
reference = now if now is not None else datetime.now(timezone.utc)
age_s = (reference - updated_at).total_seconds()
return age_s > threshold_s
def _gateway_status_module():
"""Load gateway.status lazily so tests and WebUI-only installs stay isolated."""
return importlib.import_module("gateway.status")
def _gateway_root_pid_path() -> Path | None:
"""Return the root Hermes gateway PID path.
Gateway runtime files are root-level singletons. A profile-scoped WebUI
process may have HERMES_HOME=<root>/profiles/<name>, but gateway.pid,
gateway.lock, and gateway_state.json still live under <root>.
"""
try:
from hermes_constants import get_default_hermes_root
return get_default_hermes_root() / _GATEWAY_PID_FILE
except Exception:
return None
def _read_runtime_status_path(path: Path) -> dict[str, Any] | None:
try:
payload = json.loads(path.read_text(encoding="utf-8"))
except (OSError, UnicodeDecodeError, json.JSONDecodeError):
return None
if isinstance(payload, dict):
return payload
return None
def _read_gateway_runtime_status(gateway_status: Any, pid_path: Path | None) -> dict[str, Any] | None:
read_runtime_status = gateway_status.read_runtime_status
if pid_path is not None:
try:
return read_runtime_status(pid_path=pid_path)
except TypeError:
try:
return read_runtime_status(pid_path)
except TypeError:
if getattr(gateway_status, "__name__", "") == "gateway.status" or hasattr(
gateway_status,
"_read_json_file",
):
runtime_status_file = str(
getattr(gateway_status, "_RUNTIME_STATUS_FILE", _GATEWAY_RUNTIME_STATUS_FILE)
)
runtime_status = _read_runtime_status_path(pid_path.with_name(runtime_status_file))
if runtime_status is not None:
return runtime_status
return read_runtime_status()
def _gateway_running_pid(gateway_status: Any, pid_path: Path | None) -> int | None:
get_running_pid = gateway_status.get_running_pid
if pid_path is not None:
try:
return get_running_pid(pid_path=pid_path, cleanup_stale=False)
except TypeError:
try:
return get_running_pid(pid_path, cleanup_stale=False)
except TypeError:
pass
try:
return get_running_pid(cleanup_stale=False)
except TypeError:
# Older agent versions may not expose cleanup_stale. Keep compatibility.
return get_running_pid()
def _runtime_detail_subset(runtime_status: dict[str, Any] | None) -> dict[str, Any]:
"""Return only non-sensitive runtime fields for the browser.
gateway.status records argv/PID metadata so the CLI can validate process
identity. The WebUI alert only needs health semantics, never raw command
lines, paths, environment, or tokens.
"""
if not isinstance(runtime_status, dict):
return {}
details: dict[str, Any] = {}
gateway_state = runtime_status.get("gateway_state")
if isinstance(gateway_state, str) and gateway_state:
details["gateway_state"] = gateway_state
updated_at = runtime_status.get("updated_at")
if isinstance(updated_at, str) and updated_at:
details["updated_at"] = updated_at
try:
details["active_agents"] = max(0, int(runtime_status.get("active_agents") or 0))
except (TypeError, ValueError):
pass
platforms = runtime_status.get("platforms")
if isinstance(platforms, dict):
details["platform_count"] = len(platforms)
states: dict[str, int] = {}
for payload in platforms.values():
if not isinstance(payload, dict):
continue
state = payload.get("state")
if isinstance(state, str) and state:
states[state] = states.get(state, 0) + 1
if states:
details["platform_states"] = states
return details
def build_agent_health_payload() -> dict[str, Any]:
"""Return `{alive, checked_at, details}` for the Hermes gateway/agent.
`alive` is intentionally tri-state:
* True: a gateway runtime signal says the process is alive.
* False: gateway metadata exists, but no live gateway process owns it.
* None: no gateway metadata/status is available, so this WebUI setup is
probably not configured with a separate gateway process.
"""
checked_at = _checked_at()
try:
gateway_status = _gateway_status_module()
except Exception as exc:
return {
"alive": None,
"checked_at": checked_at,
"details": {
"state": "unknown",
"reason": "gateway_status_unavailable",
"error": type(exc).__name__,
},
}
gateway_pid_path = _gateway_root_pid_path()
runtime_status = None
try:
runtime_status = _read_gateway_runtime_status(gateway_status, gateway_pid_path)
except Exception:
runtime_status = None
try:
running_pid = _gateway_running_pid(gateway_status, gateway_pid_path)
except Exception:
running_pid = None
safe_details = _runtime_detail_subset(runtime_status)
if running_pid is not None:
return {
"alive": True,
"checked_at": checked_at,
"details": {
"state": "alive",
**safe_details,
},
}
# Cross-container fallback (#1879): when ``get_running_pid()`` cannot see
# the gateway because we're in a different PID namespace, a recent
# ``updated_at`` on ``gateway_state.json`` is a reliable equivalent signal
# since the gateway writes it on every tick. We only trust this fallback
# when the gateway also self-reports ``gateway_state == "running"`` so
# crash-without-cleanup scenarios still surface as "down".
if _runtime_status_is_fresh(runtime_status):
return {
"alive": True,
"checked_at": checked_at,
"details": {
"state": "alive",
"reason": "cross_container_freshness",
**safe_details,
},
}
if _runtime_status_is_stale_stopped(runtime_status):
return {
"alive": None,
"checked_at": checked_at,
"details": {
"state": "unknown",
"reason": "gateway_stale_stopped_state",
**safe_details,
},
}
if isinstance(runtime_status, dict):
return {
"alive": False,
"checked_at": checked_at,
"details": {
"state": "down",
"reason": "gateway_not_running",
**safe_details,
},
}
return {
"alive": None,
"checked_at": checked_at,
"details": {
"state": "unknown",
"reason": "gateway_not_configured",
},
}
+315 -5
View File
@@ -14,6 +14,9 @@ MESSAGING_SOURCES = {
'weixin',
}
CLI_MIN_UNTITLED_MESSAGE_COUNT = 6
CLI_MIN_UNTITLED_USER_MESSAGE_COUNT = 2
SOURCE_LABELS = {
'api_server': 'API',
'cli': 'CLI',
@@ -71,6 +74,115 @@ def _optional_col(name: str, columns: set[str], fallback: str = "NULL") -> str:
return f"s.{name}" if name in columns else f"{fallback} AS {name}"
def _safe_lower(value) -> str:
return str(value or "").strip().lower()
def _normalize_source_name(value: object) -> str:
source = _safe_lower(value)
if not source:
return ""
if source.endswith(" session"):
source = source[:-len(" session")].strip()
return source
def _looks_like_default_cli_title(row: dict) -> bool:
"""Return True when a CLI row looks like framework-generated metadata."""
title = _safe_lower(row.get("title"))
if not title or title == "untitled":
return True
if title in {"cli", "cli session"}:
return True
source_candidates = {
_normalize_source_name(row.get("source")),
_normalize_source_name(row.get("session_source")),
_normalize_source_name(row.get("source_tag")),
_normalize_source_name(row.get("raw_source")),
_normalize_source_name(row.get("source_label")),
}
source_candidates.discard("")
source_candidates.add("cli")
return any(title == f"{candidate} session" for candidate in source_candidates)
def _as_positive_int(value) -> int:
try:
return max(0, int(float(value)))
except (TypeError, ValueError):
return 0
def _count_user_turns(row: dict) -> int:
user_turns = row.get("actual_user_message_count")
if user_turns is None:
user_turns = row.get("user_message_count")
if user_turns is None:
messages = row.get("messages") or []
if isinstance(messages, list):
return sum(
1
for msg in messages
if _safe_lower(msg.get("role") if isinstance(msg, dict) else msg) == "user"
)
return 0
return _as_positive_int(user_turns)
def _has_cli_lineage(row: dict) -> bool:
segment_count = _as_positive_int(row.get("_compression_segment_count"))
return segment_count > 1 or bool(row.get("_lineage_root_id"))
def is_cli_session_row(row: dict) -> bool:
"""Return True for rows that should be treated as CLI-imported sessions."""
if not isinstance(row, dict):
return False
source = _safe_lower(row.get("session_source"))
if source == "messaging":
return False
if source == "cli":
return True
source_tag = _safe_lower(row.get("source_tag"))
raw_source = _safe_lower(row.get("raw_source"))
source_name = _safe_lower(row.get("source"))
source_label = _safe_lower(row.get("source_label"))
if source_tag == "cli" or raw_source == "cli" or source_name == "cli" or source_label == "cli":
return True
# Legacy imported CLI rows may only be marked as CLI in sidebar metadata.
# Keep this conservative to avoid treating messaging sessions as CLI.
return bool(
row.get("is_cli_session")
and source not in MESSAGING_SOURCES
and source_tag not in MESSAGING_SOURCES
and raw_source not in MESSAGING_SOURCES
and source_name not in MESSAGING_SOURCES
and _looks_like_default_cli_title(row)
)
def is_cli_session_row_visible(row: dict) -> bool:
"""Return whether a CLI-related row should remain visible in the sidebar."""
if not isinstance(row, dict):
return False
if not is_cli_session_row(row):
return True
message_count = _as_positive_int(row.get("actual_message_count") or row.get("message_count"))
if message_count <= 0:
return False
if _has_cli_lineage(row):
return True
if not _looks_like_default_cli_title(row):
return True
return _count_user_turns(row) >= CLI_MIN_UNTITLED_USER_MESSAGE_COUNT
def _is_continuation_session(parent: dict | None, child: dict | None) -> bool:
"""Return True when ``child`` is the next segment of the same conversation.
@@ -79,9 +191,18 @@ def _is_continuation_session(parent: dict | None, child: dict | None) -> bool:
should continue the same visible conversation rather than becoming a
separate child-session row. Plain parent/child links that started before the
parent's ended boundary remain child sessions.
Do not collapse lineage across raw sources. A WebUI session that continues
from a Telegram/CLI/etc. parent must remain visible as its own surface-owned
conversation; otherwise the tip inherits the root's title/source metadata and
can disappear under messaging/sidebar policies.
"""
if not parent or not child:
return False
parent_source = str(parent.get('source') or '').strip().lower()
child_source = str(child.get('source') or '').strip().lower()
if parent_source and child_source and parent_source != child_source:
return False
if parent.get('end_reason') not in {'compression', 'cli_close'}:
return False
ended_at = parent.get('ended_at')
@@ -133,10 +254,13 @@ def _project_agent_session_rows(rows: list[dict]) -> list[dict]:
if not parent_id:
continue
children_by_parent.setdefault(parent_id, []).append(row)
if _is_continuation_session(rows_by_id.get(parent_id), row):
parent = rows_by_id.get(parent_id)
if _is_continuation_session(parent, row):
continuation_child_ids.add(row['id'])
else:
row['relationship_type'] = 'child_session'
row['parent_title'] = parent.get('title') if parent else None
row['parent_source'] = parent.get('source') if parent else None
parent_root = _continuation_root_id(rows_by_id, parent_id)
if parent_root:
row['_parent_lineage_root_id'] = parent_root
@@ -189,7 +313,7 @@ def _project_agent_session_rows(rows: list[dict]) -> list[dict]:
# touched standalone sessions — exactly the inverse of what a user
# expects from "Show agent sessions" sorted by activity.
for key in (
'id', 'model', 'message_count', 'actual_message_count',
'id', 'model', 'message_count', 'actual_message_count', 'actual_user_message_count',
'ended_at', 'end_reason', 'last_activity',
):
if key in tip:
@@ -214,9 +338,9 @@ def read_importable_agent_session_rows(
db_path: Path,
limit: int = 200,
log=None,
exclude_sources: tuple[str, ...] | None = ("cron",),
exclude_sources: tuple[str, ...] | None = ("cron", "webui"),
) -> list[dict]:
"""Return non-WebUI agent sessions projected as importable conversations.
"""Return agent sessions projected as importable conversations.
Hermes Agent can create rows in ``state.db.sessions`` before a session has
any messages, and long conversations can be split into compression-linked
@@ -243,6 +367,8 @@ def read_importable_agent_session_rows(
# source column we cannot safely distinguish WebUI rows from agent rows.
cur.execute("PRAGMA table_info(sessions)")
session_cols = {row[1] for row in cur.fetchall()}
cur.execute("PRAGMA table_info(messages)")
message_cols = {row[1] for row in cur.fetchall()}
if 'source' not in session_cols:
log.warning(
"agent session listing skipped: state.db at %s has no 'source' column "
@@ -255,8 +381,21 @@ def read_importable_agent_session_rows(
parent_expr = _optional_col('parent_session_id', session_cols)
ended_expr = _optional_col('ended_at', session_cols)
end_reason_expr = _optional_col('end_reason', session_cols)
user_id_expr = _optional_col('user_id', session_cols)
chat_id_expr = _optional_col('chat_id', session_cols)
chat_type_expr = _optional_col('chat_type', session_cols)
thread_id_expr = _optional_col('thread_id', session_cols)
session_key_expr = _optional_col('session_key', session_cols)
origin_chat_id_expr = _optional_col('origin_chat_id', session_cols)
origin_user_id_expr = _optional_col('origin_user_id', session_cols)
platform_expr = _optional_col('platform', session_cols)
user_message_count_expr = (
"COUNT(CASE WHEN LOWER(m.role) = 'user' THEN 1 END)"
if 'role' in message_cols
else "COUNT(m.id)"
)
where_clauses = ["s.source IS NOT NULL", "s.source != 'webui'"]
where_clauses = ["s.source IS NOT NULL"]
params: list[str] = []
if exclude_sources:
excluded = tuple(str(source) for source in exclude_sources if source)
@@ -269,10 +408,19 @@ def read_importable_agent_session_rows(
f"""
SELECT s.id, s.title, s.model, s.message_count,
s.started_at, s.source,
{user_id_expr},
{chat_id_expr},
{chat_type_expr},
{thread_id_expr},
{session_key_expr},
{origin_chat_id_expr},
{origin_user_id_expr},
{platform_expr},
{parent_expr},
{ended_expr},
{end_reason_expr},
COUNT(m.id) AS actual_message_count,
{user_message_count_expr} AS actual_user_message_count,
MAX(m.timestamp) AS last_activity
FROM sessions s
LEFT JOIN messages m ON m.session_id = s.id
@@ -284,12 +432,170 @@ def read_importable_agent_session_rows(
)
projected = _project_agent_session_rows([dict(row) for row in cur.fetchall()])
projected = [_with_normalized_source(row) for row in projected]
projected = [row for row in projected if is_cli_session_row_visible(row)]
if limit is None:
return projected
return projected[:max(0, int(limit))]
def _lineage_report_row(row: dict, role: str) -> dict:
updated_at = row.get('ended_at') if row.get('ended_at') is not None else row.get('started_at')
return {
'session_id': row.get('id'),
'role': role,
'title': row.get('title'),
'source': row.get('source'),
'started_at': row.get('started_at'),
'updated_at': updated_at,
'end_reason': row.get('end_reason'),
'active': row.get('ended_at') is None,
'archived': False,
}
def _empty_lineage_report(session_id: str, *, found: bool = False) -> dict:
return {
'mutation': False,
'found': found,
'session_id': session_id,
'lineage_key': session_id,
'tip_session_id': session_id,
'total_segments': 0,
'materialized_segments': 0,
'segments': [],
'children': [],
'manual_review': False,
}
def read_session_lineage_report(db_path: Path, session_id: str | None, max_hops: int = 20) -> dict:
"""Return a bounded, read-only lifecycle report for a session lineage.
This helper intentionally reports only facts that can be derived from
``state.db.sessions`` without mutating WebUI JSON, archiving rows, or
deleting historical segments. It mirrors the sidebar continuation rules so
a future UI/PR can explain which rows are hidden compression/cli-close
segments and which child-session branches remain distinct.
"""
sid = str(session_id or '').strip()
if not sid:
return _empty_lineage_report('')
db_path = Path(db_path)
if not db_path.exists():
return _empty_lineage_report(sid)
try:
with closing(sqlite3.connect(str(db_path))) as conn:
conn.row_factory = sqlite3.Row
cur = conn.cursor()
cur.execute("PRAGMA table_info(sessions)")
session_cols = {row[1] for row in cur.fetchall()}
required = {'id', 'parent_session_id', 'end_reason'}
if not required.issubset(session_cols):
return _empty_lineage_report(sid)
source_expr = _optional_col('source', session_cols)
title_expr = _optional_col('title', session_cols)
started_expr = _optional_col('started_at', session_cols, '0')
ended_expr = _optional_col('ended_at', session_cols)
end_reason_expr = _optional_col('end_reason', session_cols)
parent_expr = _optional_col('parent_session_id', session_cols)
def fetch_one(row_id: str | None) -> dict | None:
if not row_id:
return None
cur.execute(
f"""
SELECT s.id,
{source_expr},
{title_expr},
{started_expr},
{parent_expr},
{ended_expr},
{end_reason_expr}
FROM sessions s
WHERE s.id = ?
""",
(row_id,),
)
row = cur.fetchone()
return dict(row) if row else None
target = fetch_one(sid)
if not target:
return _empty_lineage_report(sid)
segments = [target]
current = target
seen = {sid}
manual_review = False
for _hop in range(max(0, int(max_hops))):
parent_id = current.get('parent_session_id')
parent = fetch_one(parent_id)
if not parent or parent_id in seen:
manual_review = bool(parent_id and parent_id in seen)
break
if not _is_continuation_session(parent, current):
break
segments.append(parent)
seen.add(parent_id)
current = parent
else:
manual_review = True
segment_ids = {row['id'] for row in segments}
child_rows: list[dict] = []
for parent in segments:
cur.execute(
f"""
SELECT s.id,
{source_expr},
{title_expr},
{started_expr},
{parent_expr},
{ended_expr},
{end_reason_expr}
FROM sessions s
WHERE s.parent_session_id = ?
ORDER BY s.started_at DESC
""",
(parent['id'],),
)
for child_row in cur.fetchall():
child = dict(child_row)
if child['id'] in segment_ids:
continue
if _is_continuation_session(parent, child):
# A continuation outside the selected path means the
# lineage is branched or the caller selected an older
# segment. Report manual review rather than proposing
# destructive cleanup candidates.
manual_review = True
continue
child_rows.append(child)
except Exception:
return _empty_lineage_report(sid)
root_id = segments[-1]['id'] if segments else sid
tip_id = segments[0]['id'] if segments else sid
return {
'mutation': False,
'found': True,
'session_id': sid,
'lineage_key': root_id,
'tip_session_id': tip_id,
'total_segments': len(segments),
'materialized_segments': len(segments),
'segments': [
_lineage_report_row(row, 'tip' if idx == 0 else 'hidden_segment')
for idx, row in enumerate(segments)
],
'children': [_lineage_report_row(row, 'child_session') for row in child_rows],
'manual_review': manual_review,
}
def read_session_lineage_metadata(db_path: Path, session_ids: list[str] | set[str]) -> dict[str, dict]:
"""Return compression-lineage metadata for known WebUI sidebar sessions.
@@ -378,6 +684,10 @@ def read_session_lineage_metadata(db_path: Path, session_ids: list[str] | set[st
entry['relationship_type'] = 'child_session'
entry['parent_title'] = parent_row.get('title')
entry['parent_source'] = parent_row.get('source')
parent_source = str(parent_row.get('source') or '').strip().lower()
child_source = str(row.get('source') or '').strip().lower()
if parent_source and child_source and parent_source != child_source:
entry['_cross_surface_child_session'] = True
parent_root = _continuation_root_id(rows, parent_id)
if parent_root:
entry['_parent_lineage_root_id'] = parent_root
+88 -8
View File
@@ -17,16 +17,41 @@ from api.config import STATE_DIR, load_settings
logger = logging.getLogger(__name__)
# Default session TTL — 30 days. Kept as a module-level constant for backwards
# compatibility with downstream code and regression tests that import it.
# At runtime, prefer ``_resolve_session_ttl()`` which honours the env var and
# settings.json overrides; this constant is the floor / fallback.
SESSION_TTL = 86400 * 30 # 30 days
def _resolve_session_ttl() -> int:
"""Resolve session TTL from env > settings > default.
Priority mirrors get_password_hash(): HERMES_WEBUI_SESSION_TTL env var
first, then settings.json, falling back to ``SESSION_TTL`` (30 days).
Clamped to [60s, 1 year] to prevent runaway cookies or self-lockout.
"""
env_v = os.getenv('HERMES_WEBUI_SESSION_TTL', '').strip()
if env_v.isdigit():
val = int(env_v)
if 60 <= val <= 86400 * 365:
return val
s = load_settings()
v = s.get('session_ttl_seconds')
if isinstance(v, int) and 60 <= v <= 86400 * 365:
return v
return SESSION_TTL
# ── Public paths (no auth required) ─────────────────────────────────────────
PUBLIC_PATHS = frozenset({
'/login', '/health', '/favicon.ico',
'/login', '/health', '/favicon.ico', '/sw.js',
'/api/auth/login', '/api/auth/status',
'/manifest.json', '/manifest.webmanifest',
'/sw.js',
})
COOKIE_NAME = 'hermes_session'
SESSION_TTL = 86400 * 30 # 30 days
_SESSIONS_FILE = STATE_DIR / '.sessions.json'
@@ -78,24 +103,79 @@ def _save_sessions(sessions: dict[str, float]) -> None:
_sessions = _load_sessions()
# ── Login rate limiter ──────────────────────────────────────────────────────
_login_attempts = {} # ip -> [timestamp, ...]
_LOGIN_ATTEMPTS_FILE = STATE_DIR / '.login_attempts.json'
_LOGIN_MAX_ATTEMPTS = 5
_LOGIN_WINDOW = 60 # seconds
def _load_login_attempts() -> dict[str, list[float]]:
"""Load persisted login attempts from STATE_DIR, pruning expired entries."""
try:
if _LOGIN_ATTEMPTS_FILE.exists():
data = json.loads(_LOGIN_ATTEMPTS_FILE.read_text(encoding='utf-8'))
if not isinstance(data, dict):
raise ValueError('malformed login-attempts file — expected dict')
now = time.time()
attempts: dict[str, list[float]] = {}
for ip, raw_times in data.items():
if not isinstance(ip, str) or not isinstance(raw_times, list):
continue
fresh = [
float(t)
for t in raw_times
if isinstance(t, (int, float)) and now - float(t) < _LOGIN_WINDOW
]
if fresh:
attempts[ip] = fresh
return attempts
except Exception as e:
logger.debug("Failed to load login attempts file, starting fresh: %s", e)
return {}
def _save_login_attempts(attempts: dict[str, list[float]]) -> None:
"""Atomically persist login attempts to STATE_DIR/.login_attempts.json (0600)."""
try:
_LOGIN_ATTEMPTS_FILE.parent.mkdir(parents=True, exist_ok=True)
fd, tmp = tempfile.mkstemp(dir=_LOGIN_ATTEMPTS_FILE.parent, suffix='.login_attempts.tmp')
try:
with os.fdopen(fd, 'w', encoding='utf-8') as f:
json.dump(attempts, f)
os.chmod(tmp, 0o600)
os.replace(tmp, _LOGIN_ATTEMPTS_FILE)
except Exception:
try:
os.unlink(tmp)
except OSError:
pass
raise
except Exception as e:
logger.debug("Failed to persist login attempts: %s", e)
_login_attempts = _load_login_attempts() # ip -> [timestamp, ...]
def _check_login_rate(ip: str) -> bool:
"""Return True if the IP is allowed to attempt login."""
now = time.time()
attempts = _login_attempts.get(ip, [])
# Prune old attempts
attempts = [t for t in attempts if now - t < _LOGIN_WINDOW]
_login_attempts[ip] = attempts
if attempts:
_login_attempts[ip] = attempts
else:
_login_attempts.pop(ip, None)
_save_login_attempts(_login_attempts)
return len(attempts) < _LOGIN_MAX_ATTEMPTS
def _record_login_attempt(ip: str) -> None:
now = time.time()
attempts = _login_attempts.get(ip, [])
attempts.append(now)
_login_attempts[ip] = attempts
_save_login_attempts(_login_attempts)
def _signing_key():
@@ -156,7 +236,7 @@ def verify_password(plain) -> bool:
def create_session() -> str:
"""Create a new auth session. Returns signed cookie value."""
token = secrets.token_hex(32)
_sessions[token] = time.time() + SESSION_TTL
_sessions[token] = time.time() + _resolve_session_ttl()
_save_sessions(_sessions)
sig = hmac.new(_signing_key(), token.encode(), hashlib.sha256).hexdigest()[:32]
return f"{token}.{sig}"
@@ -257,7 +337,7 @@ def check_auth(handler, parsed) -> bool:
# safe='/' keeps path separators readable; everything else (including
# `?`, `&`, `=`) gets percent-encoded.
_next = _urlparse.quote(_path_with_query, safe='/')
handler.send_header('Location', '/login?next=' + _next)
handler.send_header('Location', 'login?next=' + _next)
handler.end_headers()
return False
@@ -269,7 +349,7 @@ def set_auth_cookie(handler, cookie_value) -> None:
cookie[COOKIE_NAME]['httponly'] = True
cookie[COOKIE_NAME]['samesite'] = 'Lax'
cookie[COOKIE_NAME]['path'] = '/'
cookie[COOKIE_NAME]['max-age'] = str(SESSION_TTL)
cookie[COOKIE_NAME]['max-age'] = str(_resolve_session_ttl())
# Set Secure flag when connection is HTTPS
if getattr(handler.request, 'getpeercert', None) is not None or handler.headers.get('X-Forwarded-Proto', '') == 'https':
cookie[COOKIE_NAME]['secure'] = True
+1685 -86
View File
File diff suppressed because it is too large Load Diff
+211
View File
@@ -0,0 +1,211 @@
"""Safe server-side probe for the official Hermes Agent dashboard.
The official `hermes dashboard` binds to 127.0.0.1:9119 by default and exposes
GET /api/status as a public, read-only identity/status endpoint. Keep all
probing server-side to avoid browser CORS/mixed-content failures, and only allow
loopback targets so a user-controlled setting cannot become an SSRF primitive.
"""
from __future__ import annotations
import json
import logging
import os
import urllib.request
from urllib.parse import urlparse
logger = logging.getLogger(__name__)
DEFAULT_DASHBOARD_PORT = 9119
DEFAULT_DASHBOARD_TIMEOUT = 0.5
DEFAULT_DASHBOARD_TARGETS = (("127.0.0.1", DEFAULT_DASHBOARD_PORT), ("localhost", DEFAULT_DASHBOARD_PORT))
_DASHBOARD_ENABLED_VALUES = {"auto", "always", "never"}
_LOOPBACK_HOSTS = {"127.0.0.1", "localhost", "::1"}
def _base_url(host: str, port: int, scheme: str = "http") -> str:
display_host = f"[{host}]" if ":" in host and not host.startswith("[") else host
return f"{scheme}://{display_host}:{port}"
def normalize_dashboard_url(raw_url: str | None) -> tuple[str, int, str, str] | None:
"""Return (host, port, scheme, base_url) for a safe loopback dashboard URL.
Overrides intentionally accept only scheme + loopback host + explicit port.
Paths, query strings, fragments, and credentials are rejected: the probe
appends the official `/api/status` fingerprint itself and must not become an
arbitrary local URL fetcher.
"""
raw = str(raw_url or "").strip()
if not raw:
return None
parsed = urlparse(raw)
if parsed.scheme not in {"http", "https"}:
raise ValueError("invalid dashboard URL scheme")
if parsed.username or parsed.password:
raise ValueError("invalid dashboard URL credentials")
host = parsed.hostname or ""
normalized_host = host.strip().lower()
if normalized_host not in _LOOPBACK_HOSTS:
raise ValueError("invalid dashboard URL host")
try:
port = parsed.port
except ValueError as exc:
raise ValueError("invalid dashboard URL port") from exc
if not isinstance(port, int) or not (1 <= port <= 65535):
raise ValueError("invalid dashboard URL port")
path = parsed.path or ""
if path not in ("", "/") or parsed.params or parsed.query or parsed.fragment:
raise ValueError("invalid dashboard URL path")
base = _base_url(normalized_host, port, parsed.scheme)
return normalized_host, port, parsed.scheme, base
def _looks_like_official_dashboard(payload: object) -> bool:
if not isinstance(payload, dict):
return False
version = payload.get("version")
if not isinstance(version, str) or not version.strip():
return False
# Verified against current Hermes Agent `hermes_cli.web_server.get_status()`:
# /api/status returns version plus these Hermes-specific fields. Requiring at
# least one avoids treating any generic {version: ...} local service as the
# official dashboard.
return any(key in payload for key in ("release_date", "hermes_home", "config_path", "gateway_running"))
def probe_official_dashboard(
host: str,
port: int,
timeout: float = DEFAULT_DASHBOARD_TIMEOUT,
scheme: str = "http",
) -> dict:
"""Best-effort check that `hermes dashboard` is running on host:port."""
try:
normalized_host = str(host or "").strip().lower()
if normalized_host not in _LOOPBACK_HOSTS:
raise ValueError("dashboard probe host must be loopback")
port = int(port)
if not (1 <= port <= 65535):
raise ValueError("dashboard probe port out of range")
if scheme not in {"http", "https"}:
raise ValueError("dashboard probe scheme must be http or https")
base = _base_url(normalized_host, port, scheme)
request = urllib.request.Request(
f"{base}/api/status",
headers={"Accept": "application/json", "User-Agent": "hermes-webui-dashboard-probe"},
)
with urllib.request.urlopen(request, timeout=timeout) as response:
if getattr(response, "status", None) != 200:
return {"running": False}
payload = json.loads(response.read().decode("utf-8"))
if not _looks_like_official_dashboard(payload):
return {"running": False}
result = {"running": True, "host": normalized_host, "port": port, "url": base}
version = payload.get("version")
if isinstance(version, str) and version.strip():
result["version"] = version.strip()
return result
except Exception:
logger.debug("official Hermes dashboard probe failed", exc_info=True)
return {"running": False}
def _dashboard_config(config_data: dict | None = None) -> dict:
if config_data is None:
try:
from api.config import get_config
config_data = get_config()
except Exception:
config_data = {}
webui_cfg = config_data.get("webui", {}) if isinstance(config_data, dict) else {}
dashboard_cfg = webui_cfg.get("dashboard", {}) if isinstance(webui_cfg, dict) else {}
return dashboard_cfg if isinstance(dashboard_cfg, dict) else {}
def get_dashboard_config(config_data: dict | None = None) -> dict:
"""Return normalized profile config for the Settings → System controls."""
dashboard_cfg = _dashboard_config(config_data)
enabled = str(dashboard_cfg.get("enabled", "auto") or "auto").strip().lower()
if enabled not in _DASHBOARD_ENABLED_VALUES:
enabled = "auto"
raw_url = str(dashboard_cfg.get("url") or "").strip()
if raw_url:
# Normalize before echoing so the UI never displays unsafe/stale values.
_host, _port, _scheme, raw_url = normalize_dashboard_url(raw_url)
return {"enabled": enabled, "url": raw_url}
def save_dashboard_config(payload: dict) -> dict:
"""Persist dashboard link settings under webui.dashboard in config.yaml."""
enabled = str((payload or {}).get("enabled", "auto") or "auto").strip().lower()
if enabled not in _DASHBOARD_ENABLED_VALUES:
raise ValueError("invalid dashboard enabled mode")
raw_url = str((payload or {}).get("url", "") or "").strip()
normalized_url = ""
if raw_url:
_host, _port, _scheme, normalized_url = normalize_dashboard_url(raw_url)
from api import config as webui_config
config_path = webui_config._get_config_path()
config_data = webui_config._load_yaml_config_file(config_path)
webui_section = config_data.get("webui")
if not isinstance(webui_section, dict):
webui_section = {}
config_data["webui"] = webui_section
dashboard_section = webui_section.get("dashboard")
if not isinstance(dashboard_section, dict):
dashboard_section = {}
webui_section["dashboard"] = dashboard_section
dashboard_section["enabled"] = enabled
if normalized_url:
dashboard_section["url"] = normalized_url
else:
dashboard_section.pop("url", None)
webui_config._save_yaml_config_file(config_path, config_data)
webui_config.reload_config()
return {"enabled": enabled, "url": normalized_url}
def _webui_bind_host_allows_auto_probe() -> bool:
raw_host = str(os.environ.get("HERMES_WEBUI_HOST") or "127.0.0.1").strip().lower()
host = raw_host.replace("[", "").replace("]", "")
return host in _LOOPBACK_HOSTS
def get_dashboard_status(config_data: dict | None = None) -> dict:
"""Return the safe status payload consumed by GET /api/dashboard/status."""
dashboard_cfg = _dashboard_config(config_data)
enabled = str(dashboard_cfg.get("enabled", "auto") or "auto").strip().lower()
if enabled not in _DASHBOARD_ENABLED_VALUES:
enabled = "auto"
if enabled == "never":
return {"running": False, "enabled": "never"}
raw_url = dashboard_cfg.get("url") or dashboard_cfg.get("target") or ""
try:
override = normalize_dashboard_url(raw_url)
except ValueError:
return {"running": False, "enabled": enabled, "error": "invalid dashboard url"}
targets: list[tuple[str, int, str, str]]
if override:
targets = [override]
else:
targets = [(host, port, "http", _base_url(host, port)) for host, port in DEFAULT_DASHBOARD_TARGETS]
if enabled == "always":
host, port, scheme, base = targets[0]
return {"running": True, "enabled": enabled, "host": host, "port": port, "url": base}
if not _webui_bind_host_allows_auto_probe():
return {"running": False, "enabled": enabled}
for host, port, scheme, _base in targets:
result = probe_official_dashboard(host, port, timeout=DEFAULT_DASHBOARD_TIMEOUT, scheme=scheme)
if result.get("running"):
result["enabled"] = enabled
return result
return {"running": False, "enabled": enabled}
+608
View File
@@ -0,0 +1,608 @@
"""WebUI bridge for Hermes persistent session goals."""
from __future__ import annotations
import copy
import logging
import re
import time
from pathlib import Path
from typing import Any, Dict, Optional
logger = logging.getLogger(__name__)
try: # Exposed as a module attribute so tests can monkeypatch it directly.
from hermes_cli.goals import ( # type: ignore
CONTINUATION_PROMPT_TEMPLATE,
DEFAULT_MAX_TURNS,
GoalManager as _NativeGoalManager,
GoalState,
judge_goal,
)
except Exception: # pragma: no cover - depends on installed hermes-agent
CONTINUATION_PROMPT_TEMPLATE = "" # type: ignore
DEFAULT_MAX_TURNS = 20 # type: ignore
_NativeGoalManager = None # type: ignore
GoalState = None # type: ignore
judge_goal = None # type: ignore
GoalManager = _NativeGoalManager # type: ignore
_DB_CACHE: dict[str, Any] = {}
def _default_max_turns() -> int:
"""Return the configured /goal turn budget, defaulting to Hermes' 20 turns."""
try:
from api import config as _config
cfg = getattr(_config, "cfg", {}) or {}
goals_cfg = cfg.get("goals", {}) if isinstance(cfg, dict) else {}
if not isinstance(goals_cfg, dict):
return int(DEFAULT_MAX_TURNS or 20)
return max(1, int(goals_cfg.get("max_turns", DEFAULT_MAX_TURNS or 20) or 20))
except Exception:
return int(DEFAULT_MAX_TURNS or 20)
def _meta_key(session_id: str) -> str:
return f"goal:{session_id}"
def _profile_db(profile_home: str | Path):
"""Return a SessionDB pinned to *profile_home*, without reading HERMES_HOME.
The upstream Hermes GoalManager persists through hermes_cli.goals.load_goal(),
which resolves SessionDB from process-global HERMES_HOME. WebUI sessions are
profile-scoped and can run concurrently, so the WebUI bridge uses an explicit
state.db path whenever the caller provides the session's profile home.
"""
home = Path(profile_home).expanduser().resolve()
key = str(home)
cached = _DB_CACHE.get(key)
if cached is not None:
return cached
try:
from hermes_state import SessionDB # type: ignore
db = SessionDB(db_path=home / "state.db")
except Exception as exc: # pragma: no cover - import/env dependent
logger.debug("GoalManager profile DB unavailable for %s: %s", home, exc)
return None
_DB_CACHE[key] = db
return db
class _ProfileGoalManager:
"""Small WebUI-local GoalManager adapter with explicit profile persistence."""
def __init__(self, session_id: str, *, profile_home: str | Path, default_max_turns: int = 20):
if GoalState is None:
raise RuntimeError("Hermes goal state unavailable")
self.session_id = session_id
self.profile_home = Path(profile_home).expanduser().resolve()
self.default_max_turns = int(default_max_turns or DEFAULT_MAX_TURNS or 20)
self._state = self._load()
@property
def state(self):
return self._state
def _load(self):
db = _profile_db(self.profile_home)
if db is None or not self.session_id:
return None
try:
raw = db.get_meta(_meta_key(self.session_id))
except Exception as exc:
logger.debug("GoalManager profile get_meta failed: %s", exc)
return None
if not raw:
return None
try:
return GoalState.from_json(raw) # type: ignore[union-attr]
except Exception as exc:
logger.warning("GoalManager profile state parse failed for %s: %s", self.session_id, exc)
return None
def _save(self, state) -> None:
db = _profile_db(self.profile_home)
if db is None or not self.session_id or state is None:
return
try:
db.set_meta(_meta_key(self.session_id), state.to_json())
except Exception as exc:
logger.debug("GoalManager profile set_meta failed: %s", exc)
def is_active(self) -> bool:
return self._state is not None and self._state.status == "active"
def has_goal(self) -> bool:
return self._state is not None and self._state.status in ("active", "paused")
def status_line(self) -> str:
s = self._state
if s is None or s.status in ("cleared",):
return "No active goal. Set one with /goal <text>."
turns = f"{s.turns_used}/{s.max_turns} turns"
if s.status == "active":
return f"⊙ Goal (active, {turns}): {s.goal}"
if s.status == "paused":
extra = f"{s.paused_reason}" if s.paused_reason else ""
return f"⏸ Goal (paused, {turns}{extra}): {s.goal}"
if s.status == "done":
return f"✓ Goal done ({turns}): {s.goal}"
return f"Goal ({s.status}, {turns}): {s.goal}"
def set(self, goal: str, *, max_turns: Optional[int] = None):
goal = (goal or "").strip()
if not goal:
raise ValueError("goal text is empty")
state = GoalState( # type: ignore[operator]
goal=goal,
status="active",
turns_used=0,
max_turns=int(max_turns) if max_turns else self.default_max_turns,
created_at=time.time(),
last_turn_at=0.0,
)
self._state = state
self._save(state)
return state
def pause(self, reason: str = "user-paused"):
if not self._state:
return None
self._state.status = "paused"
self._state.paused_reason = reason
self._save(self._state)
return self._state
def resume(self, *, reset_budget: bool = True):
if not self._state:
return None
self._state.status = "active"
self._state.paused_reason = None
if reset_budget:
self._state.turns_used = 0
self._save(self._state)
return self._state
def clear(self) -> None:
if self._state is None:
return
self._state.status = "cleared"
self._save(self._state)
self._state = None
def evaluate_after_turn(self, last_response: str, *, user_initiated: bool = True) -> Dict[str, Any]:
state = self._state
if state is None or state.status != "active":
return {
"status": state.status if state else None,
"should_continue": False,
"continuation_prompt": None,
"verdict": "inactive",
"reason": "no active goal",
"message": "",
}
state.turns_used += 1
state.last_turn_at = time.time()
if judge_goal is None:
verdict, reason = "continue", "goal judge unavailable"
else:
verdict, reason = judge_goal(state.goal, str(last_response or ""))
state.last_verdict = verdict
state.last_reason = reason
if verdict == "done":
state.status = "done"
self._save(state)
return {
"status": "done",
"should_continue": False,
"continuation_prompt": None,
"verdict": "done",
"reason": reason,
"message": f"✓ Goal achieved: {reason}",
}
if state.turns_used >= state.max_turns:
state.status = "paused"
state.paused_reason = f"turn budget exhausted ({state.turns_used}/{state.max_turns})"
self._save(state)
return {
"status": "paused",
"should_continue": False,
"continuation_prompt": None,
"verdict": "continue",
"reason": reason,
"message": (
f"⏸ Goal paused — {state.turns_used}/{state.max_turns} turns used. "
"Use /goal resume to keep going, or /goal clear to stop."
),
}
self._save(state)
return {
"status": "active",
"should_continue": True,
"continuation_prompt": self.next_continuation_prompt(),
"verdict": "continue",
"reason": reason,
"message": f"↻ Continuing toward goal ({state.turns_used}/{state.max_turns}): {reason}",
}
def next_continuation_prompt(self) -> Optional[str]:
if not self._state or self._state.status != "active":
return None
return CONTINUATION_PROMPT_TEMPLATE.format(goal=self._state.goal)
def _manager(session_id: str, *, profile_home: str | Path | None = None):
if GoalManager is None:
return None
if profile_home and GoalManager is _NativeGoalManager and GoalState is not None:
try:
return _ProfileGoalManager(
session_id=session_id,
profile_home=profile_home,
default_max_turns=_default_max_turns(),
)
except Exception as exc:
logger.debug("Profile-scoped GoalManager unavailable: %s", exc)
return None
return GoalManager(session_id=session_id, default_max_turns=_default_max_turns())
def _state_payload(state: Any) -> Optional[Dict[str, Any]]:
if state is None:
return None
return {
"goal": getattr(state, "goal", "") or "",
"status": getattr(state, "status", "") or "",
"turns_used": int(getattr(state, "turns_used", 0) or 0),
"max_turns": int(getattr(state, "max_turns", 0) or 0),
"last_verdict": getattr(state, "last_verdict", None),
"last_reason": getattr(state, "last_reason", None),
"paused_reason": getattr(state, "paused_reason", None),
}
def _payload(
*,
ok: bool = True,
action: str,
message: str,
state: Any = None,
error: str | None = None,
kickoff_prompt: str | None = None,
decision: Dict[str, Any] | None = None,
message_key: str | None = None,
message_args: list[Any] | None = None,
) -> Dict[str, Any]:
body: Dict[str, Any] = {
"ok": bool(ok),
"action": action,
"message": message,
"goal": _state_payload(state),
}
if error:
body["error"] = error
if kickoff_prompt:
body["kickoff_prompt"] = kickoff_prompt
if decision is not None:
body["decision"] = decision
if message_key:
body["message_key"] = message_key
if message_args is not None:
body["message_args"] = [a for a in message_args if a is not None]
return body
def _goal_status_payload(state: Any, *, default_message: str | None = None) -> Dict[str, Any]:
"""Build localized-status style payload fields from a goal state."""
if default_message is None:
default_message = "No active goal. Set one with /goal <text>."
if state is None:
return {"message": default_message, "message_key": "goal_status_none"}
status = str(getattr(state, "status", "") or "").strip()
if status in ("cleared",):
return {"message": default_message, "message_key": "goal_status_none"}
turns_used = int(getattr(state, "turns_used", 0) or 0)
max_turns = int(getattr(state, "max_turns", 0) or 0)
goal = str(getattr(state, "goal", "") or "")
if status == "active":
return {
"message": f"⊙ Goal (active, {turns_used}/{max_turns} turns): {goal}",
"message_key": "goal_status_active",
"message_args": [turns_used, max_turns, goal],
}
if status == "paused":
reason = str(getattr(state, "paused_reason", "") or "")
return {
"message": f"⏸ Goal (paused, {turns_used}/{max_turns}{'' + reason if reason else ''}): {goal}",
"message_key": "goal_status_paused",
"message_args": [turns_used, max_turns, reason, goal],
}
if status == "done":
return {
"message": f"✓ Goal done ({turns_used}/{max_turns}): {goal}",
"message_key": "goal_status_done",
"message_args": [turns_used, max_turns, goal],
}
return {
"message": f"Goal ({status}, {turns_used}/{max_turns}): {goal}",
"message_args": [status, turns_used, max_turns, goal],
}
def _extract_goal_turns_from_message(message: str) -> tuple[int, int]:
"""Best-effort extraction for continuation messages like '(1/20)'."""
if not message:
return 0, 0
match = re.search(r"\((\d+)\s*/\s*(\d+)\)", message)
if not match:
return 0, 0
try:
return int(match.group(1)), int(match.group(2))
except Exception:
return 0, 0
def _goal_decision_payload(
decision: Dict[str, Any],
state: Any,
) -> Dict[str, Any]:
"""Attach goal message i18n key/args to an evaluation decision."""
if not isinstance(decision, dict):
return decision
status = str(decision.get("status") or "").strip()
reason = str(decision.get("reason") or "").strip()
turns_used = int(getattr(state, "turns_used", 0) or 0)
max_turns = int(getattr(state, "max_turns", 0) or 0)
if (turns_used, max_turns) == (0, 0):
turns_used, max_turns = _extract_goal_turns_from_message(str(decision.get("message") or ""))
if status == "done":
return {
**decision,
"message_key": "goal_achieved",
"message_args": [reason],
}
if status == "paused":
return {
**decision,
"message_key": "goal_paused_budget_exhausted",
"message_args": [turns_used, max_turns],
}
if decision.get("should_continue"):
return {
**decision,
"message_key": "goal_continuing",
"message_args": [turns_used, max_turns, reason],
}
return decision
def goal_state_snapshot(session_id: str, *, profile_home: str | Path | None = None) -> Any:
"""Return a deep copy of current goal state for rollback before kickoff."""
mgr = _manager(str(session_id or ""), profile_home=profile_home)
if mgr is None:
return None
return copy.deepcopy(getattr(mgr, "state", None))
def restore_goal_state(session_id: str, snapshot: Any, *, profile_home: str | Path | None = None) -> None:
"""Restore a prior goal state after kickoff stream creation fails."""
mgr = _manager(str(session_id or ""), profile_home=profile_home)
if mgr is None:
return
if snapshot is None:
try:
mgr.clear()
except Exception:
pass
return
if isinstance(mgr, _ProfileGoalManager):
mgr._state = snapshot
mgr._save(snapshot)
return
try:
from hermes_cli.goals import save_goal # type: ignore
save_goal(str(session_id or ""), snapshot)
except Exception as exc: # pragma: no cover - native fallback only
logger.debug("Goal state restore failed for %s: %s", session_id, exc)
def goal_command_payload(
session_id: str,
args: str = "",
*,
stream_running: bool = False,
profile_home: str | Path | None = None,
) -> Dict[str, Any]:
"""Return the WebUI response payload for a /goal command.
Mirrors the gateway command semantics:
- /goal or /goal status shows status
- /goal pause pauses
- /goal resume resumes without auto-starting a turn
- /goal clear|stop|done clears
- /goal <text> sets a new active goal and returns kickoff_prompt so the
caller can start the first normal user-role turn immediately.
"""
sid = str(session_id or "").strip()
if not sid:
return _payload(ok=False, action="error", error="missing_session", message="session_id required")
mgr = _manager(sid, profile_home=profile_home)
if mgr is None:
return _payload(ok=False, action="error", error="unavailable", message="Goals unavailable on this session.")
text = str(args or "").strip()
lower = text.lower()
if not text or lower == "status":
state = getattr(mgr, "state", None)
status_payload = _goal_status_payload(state)
return _payload(action="status", state=state, **status_payload)
if lower == "pause":
state = mgr.pause(reason="user-paused")
if state is None:
return _payload(
ok=False,
action="pause",
error="no_goal",
message="No goal set.",
message_key="goal_no_goal",
)
return _payload(
action="pause",
message=f"⏸ Goal paused: {state.goal}",
message_key="goal_paused",
message_args=[str(state.goal)],
state=state,
)
if lower == "resume":
state = mgr.resume()
if state is None:
return _payload(
ok=False,
action="resume",
error="no_goal",
message="No goal to resume.",
message_key="goal_no_goal",
)
return _payload(
action="resume",
message=(
f"▶ Goal resumed: {state.goal}\n"
"Send a new message, or type continue, to kick it off."
),
message_key="goal_resumed",
message_args=[str(state.goal)],
state=state,
)
if lower in ("clear", "stop", "done"):
had = bool(mgr.has_goal())
mgr.clear()
return _payload(
action="clear",
message="Goal cleared." if had else "No active goal.",
message_key="goal_cleared" if had else "goal_no_goal",
state=getattr(mgr, "state", None),
)
if stream_running:
return _payload(
ok=False,
action="set",
error="agent_running",
message=(
"Agent is running — use /goal status / pause / clear mid-run, "
"or /stop before setting a new goal."
),
)
try:
state = mgr.set(text)
except ValueError as exc:
return _payload(ok=False, action="set", error="invalid_goal", message=f"Invalid goal: {exc}")
return _payload(
action="set",
message=(
f"⊙ Goal set ({state.max_turns}-turn budget): {state.goal}\n"
"I'll keep working until the goal is done, you pause/clear it, or the budget is exhausted.\n"
"Controls: /goal status · /goal pause · /goal resume · /goal clear"
),
message_key="goal_set",
message_args=[state.max_turns, state.goal],
state=state,
kickoff_prompt=state.goal,
)
def has_active_goal(
session_id: str,
*,
profile_home: str | Path | None = None,
) -> bool:
"""Return True when the session has an active standing goal to evaluate."""
sid = str(session_id or "").strip()
if not sid:
return False
mgr = _manager(sid, profile_home=profile_home)
if mgr is None:
return False
try:
return bool(mgr.is_active())
except Exception as exc:
logger.debug("goal active-state check failed for session=%s: %s", sid, exc)
return False
def evaluate_goal_after_turn(
session_id: str,
last_response: str,
*,
user_initiated: bool = True,
profile_home: str | Path | None = None,
) -> Dict[str, Any]:
"""Evaluate a completed turn against the standing goal, if any."""
sid = str(session_id or "").strip()
if not sid:
return {
"status": None,
"should_continue": False,
"continuation_prompt": None,
"verdict": "inactive",
"reason": "missing session_id",
"message": "",
}
mgr = _manager(sid, profile_home=profile_home)
if mgr is None:
return {
"status": None,
"should_continue": False,
"continuation_prompt": None,
"verdict": "inactive",
"reason": "goals unavailable",
"message": "",
}
try:
if not mgr.is_active():
return {
"status": getattr(getattr(mgr, "state", None), "status", None),
"should_continue": False,
"continuation_prompt": None,
"verdict": "inactive",
"reason": "no active goal",
"message": "",
}
decision = mgr.evaluate_after_turn(str(last_response or ""), user_initiated=user_initiated)
except Exception as exc:
logger.debug("goal evaluation failed for session=%s: %s", sid, exc)
return {
"status": None,
"should_continue": False,
"continuation_prompt": None,
"verdict": "error",
"reason": f"goal evaluation failed: {type(exc).__name__}",
"message": "",
}
if not isinstance(decision, dict):
decision = {}
decision.setdefault("should_continue", False)
decision.setdefault("continuation_prompt", None)
decision.setdefault("message", "")
decision = dict(decision)
decision = _goal_decision_payload(decision, getattr(mgr, "state", None))
return decision
+17 -9
View File
@@ -2,6 +2,7 @@
Hermes Web UI -- HTTP helper functions.
"""
import json as _json
import os
import re as _re
from pathlib import Path
from api.config import IMAGE_EXTS, MD_EXTS
@@ -45,7 +46,7 @@ def _security_headers(handler):
"default-src 'self' https://*.cloudflareaccess.com; "
"script-src 'self' 'unsafe-inline' https://cdn.jsdelivr.net https://static.cloudflareinsights.com; "
"style-src 'self' 'unsafe-inline' https://cdn.jsdelivr.net https://fonts.googleapis.com; "
"img-src 'self' data: https: blob:; font-src 'self' data: https://cdn.jsdelivr.net https://fonts.gstatic.com; connect-src 'self'; "
"img-src 'self' data: https: blob:; font-src 'self' data: https://cdn.jsdelivr.net https://fonts.gstatic.com; connect-src 'self' https://cdn.jsdelivr.net; "
"manifest-src 'self' https://*.cloudflareaccess.com; "
"base-uri 'self'; form-action 'self'"
)
@@ -252,8 +253,13 @@ def read_body(handler) -> dict:
PROFILE_COOKIE_NAME = 'hermes_profile'
def get_profile_cookie_name() -> str:
"""Return the cookie name used to persist the active WebUI profile."""
return os.getenv('WEBUI_PROFILE_COOKIE_NAME', PROFILE_COOKIE_NAME)
def get_profile_cookie(handler) -> str | None:
"""Extract the hermes_profile cookie value from the request, or None."""
"""Extract the active-profile cookie value from the request, or None."""
cookie_header = handler.headers.get('Cookie', '')
if not cookie_header:
return None
@@ -263,7 +269,8 @@ def get_profile_cookie(handler) -> str | None:
cookie.load(cookie_header)
except _hc.CookieError:
return None
morsel = cookie.get(PROFILE_COOKIE_NAME)
cookie_name = get_profile_cookie_name()
morsel = cookie.get(cookie_name)
if morsel and morsel.value:
# Validate against profile-name pattern before trusting
from api.profiles import _PROFILE_ID_RE
@@ -274,7 +281,7 @@ def get_profile_cookie(handler) -> str | None:
def build_profile_cookie(name: str) -> str:
"""Build a Set-Cookie header value for the hermes_profile cookie.
"""Build a Set-Cookie header value for the active-profile cookie.
Always persist the selected profile in the cookie, including 'default'.
Clearing the cookie causes the backend to fall back to process-global
@@ -287,8 +294,9 @@ def build_profile_cookie(name: str) -> str:
"""
import http.cookies as _hc
cookie = _hc.SimpleCookie()
cookie[PROFILE_COOKIE_NAME] = name
cookie[PROFILE_COOKIE_NAME]['path'] = '/'
cookie[PROFILE_COOKIE_NAME]['httponly'] = True
cookie[PROFILE_COOKIE_NAME]['samesite'] = 'Lax'
return cookie[PROFILE_COOKIE_NAME].OutputString()
cookie_name = get_profile_cookie_name()
cookie[cookie_name] = name
cookie[cookie_name]['path'] = '/'
cookie[cookie_name]['httponly'] = True
cookie[cookie_name]['samesite'] = 'Lax'
return cookie[cookie_name].OutputString()
+1255
View File
File diff suppressed because it is too large Load Diff
+31 -24
View File
@@ -1,17 +1,17 @@
"""
Hermes Web UI -- Streaming performance metering.
Tracks Tokens Per Second (TPS) across all active WebUI sessions, and the
HIGH/LOW TPS values observed over the past 60 minutes. Metering data is
emitted via SSE events so the header label can update live during a stream.
Tracks Tokens Per Second (TPS) across active WebUI streams. Metering data is
emitted via SSE events so a streaming assistant message can update its own
header while the turn is running.
Architecture
Each streaming session is tracked independently. TPS per session is:
Each streaming session is tracked independently. TPS per stream is:
session_tps = total_tokens / (last_token_ts - first_token_ts)
stream_tps = total_stream_deltas / (last_delta_ts - first_delta_ts)
The global tps is the average of all currently active sessions' TPS values.
The global tps is the average of all currently active streams' TPS values.
This correctly represents the system's real-time capacity regardless of how
many sessions are running or how long each has been streaming.
@@ -19,8 +19,8 @@ For HIGH/LOW tracking, every stats snapshot records the current global tps
(only when > 0 idle periods are skipped) into a rolling 60-minute history.
The max/min of that history gives the peak throughput observed over the past hour.
The ticker in streaming.py calls get_interval() it returns 1.0 when sessions
are actively receiving tokens so the header updates at 1 Hz, and 10.0 when idle
The ticker in streaming.py calls get_interval() it returns 1.0 when streams
are actively receiving output deltas so message headers update at 1 Hz, and 10.0 when idle
so the ticker exits and no idle readings are emitted.
Usage from api/streaming.py
@@ -28,15 +28,17 @@ Usage from api/streaming.py
from api.metering import meter
meter().begin_session(stream_id) # stream starts
meter().record_token(stream_id, running_output) # per output token
meter().record_reasoning(stream_id, running_reasoning_len) # per reasoning token
meter().record_token(stream_id, running_output_deltas)
meter().record_reasoning(stream_id, running_reasoning_deltas)
The SSE `metering` event payload:
{
"tps": 47.3, # average TPS across active sessions (real-time)
"high": 52.1, # highest average TPS observed in the past 60 minutes
"low": 31.4, # lowest average TPS (excl. readings < 1 tps, to ignore idle)
"active": 1, # sessions currently streaming
"tps": 47.3, # omitted/null until a real reading exists
"tps_available": true, # frontend must hide TPS when false
"estimated": false, # never show byte/character-size estimates
"high": 52.1,
"low": 31.4,
"active": 1,
}
"""
@@ -60,9 +62,9 @@ class _SessionMeter:
def total_tokens(self) -> int:
return self.output_tokens + self.reasoning_tokens
def tps(self) -> float:
def tps(self) -> float | None:
if self.first_token_ts == 0.0 or self.last_token_ts <= self.first_token_ts:
return 0.0
return None
return self.total_tokens() / (self.last_token_ts - self.first_token_ts)
@@ -148,12 +150,15 @@ class GlobalMeter:
if not self._sessions:
self._window_start = now
# Compute global tps: average of per-session TPS values
# Compute global tps: average only streams with a real reading. The
# UI hides TPS entirely when this is unavailable instead of showing
# placeholder/estimated values.
active = [s for s in self._sessions.values() if s.first_token_ts > 0]
if active:
global_tps = sum(s.tps() for s in active) / len(active)
active_tps = [v for s in active for v in [s.tps()] if v is not None and v > 0]
if active_tps:
global_tps = sum(active_tps) / len(active_tps)
else:
global_tps = 0.0
global_tps = None
# Prune readings older than 1 hour
cutoff = now - _HOUR_SECS
@@ -162,7 +167,7 @@ class GlobalMeter:
# Only record this snapshot for HIGH/LOW if there is active work.
# This prevents idle periods from flooding the history and keeps
# HIGH/LOW meaningful for the past hour of actual throughput.
if global_tps > 0:
if global_tps is not None and global_tps > 0:
self._readings.append((now, global_tps))
# HIGH/LOW from the past hour (skip near-zero idle readings)
@@ -171,9 +176,11 @@ class GlobalMeter:
low = min(active_readings) if active_readings else 0.0
return {
'tps': round(global_tps, 1),
'high': round(high, 1),
'low': round(low, 1),
'tps': round(global_tps, 1) if global_tps is not None else None,
'tps_available': global_tps is not None,
'estimated': False,
'high': round(high, 1) if high else None,
'low': round(low, 1) if low else None,
'active': len(self._sessions),
}
+805 -41
View File
File diff suppressed because it is too large Load Diff
+734 -151
View File
@@ -1,187 +1,770 @@
"""In-app OAuth flow implementations for providers like OpenAI Codex.
"""In-app OAuth flow implementations for onboarding.
Uses only stdlib (urllib.request, json, time) no external dependencies.
Credentials are stored in ~/.hermes/auth.json under the credential_pool.
The browser receives only WebUI-local flow metadata (flow_id, user_code,
verification_uri, high-level status). Provider device/auth codes and OAuth
tokens stay server-side and are persisted to the active Hermes profile's
``auth.json`` credential_pool.
"""
from __future__ import annotations
import json
import logging
import os
import stat
import threading
import time
import uuid
import urllib.request
import urllib.parse
import urllib.error
import urllib.parse
import urllib.request
from datetime import datetime, timezone
from pathlib import Path
from typing import Any
logger = logging.getLogger(__name__)
# Compatibility for older helper tests and self-heal code that import these.
AUTH_JSON_PATH = Path.home() / ".hermes" / "auth.json"
# ── Codex OAuth constants (from hermes_cli/auth.py) ──
CODEX_CLIENT_ID = "pdlLIX2Y72MIl2rhLhTE9VV9bN905kBh"
CODEX_AUTH_URL = "https://auth.openai.com/oauth/device/authorize"
CODEX_TOKEN_URL = "https://auth.openai.com/oauth/token"
CODEX_SCOPE = "openid profile email offline_access"
CODEX_GRANT_TYPE_DEVICE = "urn:ietf:params:oauth:grant-type:device_code"
CODEX_ISSUER = "https://auth.openai.com"
CODEX_CLIENT_ID = "app_EMoamEEZ73f0CkXaXp7hrann"
CODEX_VERIFICATION_URI = f"{CODEX_ISSUER}/codex/device"
CODEX_USER_CODE_URL = f"{CODEX_ISSUER}/api/accounts/deviceauth/usercode"
CODEX_DEVICE_TOKEN_URL = f"{CODEX_ISSUER}/api/accounts/deviceauth/token"
CODEX_TOKEN_URL = f"{CODEX_ISSUER}/oauth/token"
CODEX_REDIRECT_URI = f"{CODEX_ISSUER}/deviceauth/callback"
CODEX_BASE_URL = "https://chatgpt.com/backend-api/codex"
CODEX_FLOW_MAX_WAIT_SECONDS = 15 * 60
_ALLOWED_ONBOARDING_OAUTH_PROVIDERS = {"openai-codex", "anthropic", "claude", "claude-code"}
_ANTHROPIC_PROVIDER_ALIASES = {"anthropic", "claude", "claude-code"}
_REJECTED_ONBOARDING_OAUTH_PROVIDERS = {
"nous",
"qwen-oauth",
"gemini-cli",
"google-gemini-cli",
"minimax",
"minimax-oauth",
"copilot",
"copilot-acp",
}
ANTHROPIC_CREDENTIAL_POLL_SECONDS = 5
ANTHROPIC_FLOW_MAX_WAIT_SECONDS = 15 * 60
ANTHROPIC_PUBLIC_LINK_ERROR = "Claude Code credential linking failed. Check server logs."
_OAUTH_FLOWS: dict[str, dict[str, Any]] = {}
_OAUTH_FLOWS_LOCK = threading.Lock()
_ANTHROPIC_ENV_KEYS = ("ANTHROPIC_TOKEN", "ANTHROPIC_API_KEY")
# ── auth.json helpers ──
def _clear_process_anthropic_env_values() -> None:
"""Clear Anthropic process env fallbacks under the streaming env lock."""
from api.streaming import _ENV_LOCK
def _read_auth_json():
"""Read auth.json and return parsed dict, or empty dict."""
if AUTH_JSON_PATH.exists():
with _ENV_LOCK:
for key in _ANTHROPIC_ENV_KEYS:
os.environ.pop(key, None)
def resolve_runtime_provider_with_anthropic_env_lock(resolver, *args, **kwargs):
"""Resolve runtime credentials under the Anthropic onboarding env lock.
Request paths must resolve Anthropic env fallbacks per outbound request,
not cache ANTHROPIC_TOKEN or ANTHROPIC_API_KEY across onboarding. Sharing
the process-env lock prevents a chat stream from observing one stale
Anthropic env value while onboarding has already cleared the other.
"""
from api.streaming import _ENV_LOCK
with _ENV_LOCK:
return resolver(*args, **kwargs)
def _normalize_onboarding_oauth_provider(provider: str) -> str:
provider = str(provider or "").strip().lower()
if provider in _ANTHROPIC_PROVIDER_ALIASES:
return "anthropic"
return provider or "openai-codex"
def _get_active_hermes_home() -> Path:
try:
from api.profiles import get_active_hermes_home
return Path(get_active_hermes_home())
except Exception as exc:
# Per Opus advisor on stage-296: log the silent fallback so a corrupt
# profile state ending up writing tokens to ~/.hermes (instead of the
# active profile) is observable in logs rather than failing silently.
logger.warning(
"Falling back to ~/.hermes for OAuth credential storage: "
"active-profile resolution failed: %s",
exc,
)
return Path.home() / ".hermes"
# ── legacy auth.json helpers ────────────────────────────────────────────────
def _read_auth_json(auth_path: Path | None = None) -> dict[str, Any]:
"""Read auth.json and return parsed dict, or an empty compatible store."""
path = auth_path or AUTH_JSON_PATH
if path.exists():
try:
return json.loads(AUTH_JSON_PATH.read_text())
loaded = json.loads(path.read_text(encoding="utf-8"))
return loaded if isinstance(loaded, dict) else {}
except json.JSONDecodeError as exc:
logger.warning("Failed to parse %s: %s", AUTH_JSON_PATH, exc)
logger.warning("Failed to parse %s: %s", path, exc)
return {}
return {}
def _write_auth_json(data):
"""Atomically write auth.json via temp-file rename.
def read_auth_json():
"""Public wrapper for streaming credential self-heal code."""
return _read_auth_json()
SECURITY: auth.json contains OAuth access/refresh tokens. ``tmp.replace()``
preserves the temp file's mode (created with the process umask, typically
0644 or 0664), NOT the prior auth.json mode. Without an explicit chmod,
tokens land world-readable on shared systems. Set 0600 BEFORE the rename
so there is no window where the final file is world-readable.
(Opus pre-release advisor finding.)
def _write_auth_json(data: dict[str, Any], auth_path: Path | None = None) -> Path:
"""Atomically write auth.json with owner-only permissions.
OAuth access/refresh tokens live in this file. The temp file is chmod 0600
before rename so the final path never inherits a permissive process umask.
"""
import os, stat
AUTH_JSON_PATH.parent.mkdir(parents=True, exist_ok=True)
tmp = AUTH_JSON_PATH.with_suffix('.tmp')
tmp.write_text(json.dumps(data, indent=2, ensure_ascii=False))
path = auth_path or AUTH_JSON_PATH
path.parent.mkdir(parents=True, exist_ok=True)
tmp = path.with_name(f"{path.name}.tmp.{os.getpid()}.{uuid.uuid4().hex}")
try:
tmp.chmod(0o600)
except OSError as e:
# Best-effort: if chmod fails (e.g. on a filesystem that doesn't
# support POSIX modes), don't abort. The startup permission fixer
# in api.startup will sweep auth.json on the next process start.
logger.warning("Failed to chmod 0600 on %s: %s", tmp, e)
tmp.replace(AUTH_JSON_PATH)
tmp.write_text(json.dumps(data, indent=2, ensure_ascii=False) + "\n", encoding="utf-8")
try:
tmp.chmod(0o600)
except OSError as exc:
logger.warning("Failed to chmod 0600 on %s: %s", tmp, exc)
tmp.replace(path)
try:
path.chmod(stat.S_IRUSR | stat.S_IWUSR)
except OSError:
pass
return path
finally:
try:
if tmp.exists():
tmp.unlink()
except OSError:
pass
# ── Codex device-code flow ──
def _now_iso() -> str:
return datetime.now(timezone.utc).isoformat().replace("+00:00", "Z")
def start_codex_device_code():
"""Start Codex OAuth device-code flow.
Returns dict: { device_code, user_code, verification_uri, expires_in, interval }
Raises RuntimeError on network error.
def _persist_codex_credentials(hermes_home: Path, token_data: dict[str, Any]) -> Path:
"""Persist Codex OAuth credentials to active-profile auth.json."""
access_token = str(token_data.get("access_token") or "").strip()
refresh_token = str(token_data.get("refresh_token") or "").strip()
if not access_token:
raise RuntimeError("Codex token exchange did not return an access_token")
auth_path = Path(hermes_home) / "auth.json"
auth = _read_auth_json(auth_path)
auth.setdefault("version", 1)
pool = auth.setdefault("credential_pool", {})
if not isinstance(pool, dict):
pool = {}
auth["credential_pool"] = pool
entries = pool.setdefault("openai-codex", [])
if not isinstance(entries, list):
entries = []
pool["openai-codex"] = entries
now = _now_iso()
entry = None
# Per Opus advisor on stage-296: also accept the legacy `source ==
# "oauth_device"` value so users with prior Codex OAuth credentials
# (written by older WebUI versions before this PR's source-key change)
# get their existing entry updated in-place rather than accumulating a
# stale duplicate pool entry.
_accept_sources = {"manual:device_code", "oauth_device"}
for candidate in entries:
if isinstance(candidate, dict) and candidate.get("source") in _accept_sources:
entry = candidate
break
if entry is None:
entry = {
"id": "codex-oauth-" + uuid.uuid4().hex[:12],
"label": "Codex OAuth",
"auth_type": "oauth",
"priority": 0,
"source": "manual:device_code",
"base_url": CODEX_BASE_URL,
"created_at": now,
}
entries.insert(0, entry)
entry.update(
{
"label": "Codex OAuth",
"auth_type": "oauth",
"priority": 0,
"source": "manual:device_code",
"access_token": access_token,
"refresh_token": refresh_token,
"base_url": CODEX_BASE_URL,
"last_refresh": now,
"updated_at": now,
}
)
auth["updated_at"] = now
path = _write_auth_json(auth, auth_path)
try:
from api.config import invalidate_credential_pool_cache
invalidate_credential_pool_cache("openai-codex")
except Exception:
logger.debug("Failed to invalidate openai-codex credential cache", exc_info=True)
return path
# Backward-compatible wrapper used by older code/tests.
def _save_codex_credentials(token_data):
return _persist_codex_credentials(_get_active_hermes_home(), token_data)
# ── Anthropic / Claude Code credential linking ─────────────────────────────
def _read_claude_code_credentials() -> dict[str, Any] | None:
"""Read Claude Code OAuth credentials from the host without exposing them.
Delegates to the agent adapter which knows about ~/.claude/.credentials.json
and macOS Keychain. Returns the credential dict or None.
"""
params = {
"client_id": CODEX_CLIENT_ID,
"scope": CODEX_SCOPE,
try:
from agent.anthropic_adapter import (
is_claude_code_token_valid,
read_claude_code_credentials,
)
creds = read_claude_code_credentials()
if creds and (
is_claude_code_token_valid(creds) or bool(creds.get("refreshToken"))
):
return creds
except Exception as exc:
logger.debug("Could not read Claude Code credentials: %s", exc)
return None
def _clear_anthropic_env_values(hermes_home: Path) -> None:
"""Clear Anthropic API/setup-token env values in the active profile only.
The .env write path already clears os.environ while holding the streaming
env lock. Keep a locked process-env clear here too so import/write failures
cannot leave or partially clear stale Anthropic fallbacks.
"""
try:
from api.providers import _write_env_file
_write_env_file(
Path(hermes_home) / ".env",
{key: None for key in _ANTHROPIC_ENV_KEYS},
)
except Exception as exc:
logger.warning("Failed to clear Anthropic env values: %s", exc)
_clear_process_anthropic_env_values()
def _link_anthropic_credentials(hermes_home: Path) -> None:
"""Link Hermes to use Claude Code's credential store.
Clears ANTHROPIC_TOKEN and ANTHROPIC_API_KEY from the Hermes .env so
that resolve_anthropic_token() falls through to reading Claude Code's
~/.claude/.credentials.json directly the same thing the CLI's
``use_anthropic_claude_code_credentials()`` does.
Also writes a marker entry in auth.json credential_pool so that
``_provider_oauth_authenticated("anthropic", ...)`` can detect the
linked state without touching the actual credential files.
"""
_clear_anthropic_env_values(hermes_home)
# Write a pool marker (no secrets) so onboarding status can detect linkage.
auth_path = Path(hermes_home) / "auth.json"
auth = _read_auth_json(auth_path)
auth.setdefault("version", 1)
pool = auth.setdefault("credential_pool", {})
if not isinstance(pool, dict):
pool = {}
auth["credential_pool"] = pool
entries = pool.setdefault("anthropic", [])
if not isinstance(entries, list):
entries = []
pool["anthropic"] = entries
now = _now_iso()
entry = None
for candidate in entries:
if isinstance(candidate, dict) and candidate.get("source") == "claude_code_linked":
entry = candidate
break
if entry is None:
entry = {
"id": "anthropic-claude-code-" + uuid.uuid4().hex[:12],
"label": "Claude Code (linked)",
"auth_type": "oauth",
"priority": 0,
"source": "claude_code_linked",
"created_at": now,
}
entries.insert(0, entry)
entry.update({
"label": "Claude Code (linked)",
"auth_type": "oauth",
"priority": 0,
"source": "claude_code_linked",
"updated_at": now,
})
auth["updated_at"] = now
_write_auth_json(auth, auth_path)
try:
from api.config import invalidate_credential_pool_cache
invalidate_credential_pool_cache("anthropic")
except Exception:
logger.debug("Failed to invalidate anthropic credential cache", exc_info=True)
def _anthropic_public_start_payload(flow_id: str, flow: dict[str, Any]) -> dict[str, Any]:
payload: dict[str, Any] = {
"ok": True,
"provider": "anthropic",
"flow_id": flow_id,
"status": flow.get("status", "pending"),
"poll_interval_seconds": flow.get("poll_interval_seconds", ANTHROPIC_CREDENTIAL_POLL_SECONDS),
}
data = urllib.parse.urlencode(params).encode()
req = urllib.request.Request(CODEX_AUTH_URL, data=data, method="POST")
req.add_header("Content-Type", "application/x-www-form-urlencoded")
if flow.get("status") == "pending":
payload["action_required"] = (
"Claude Code credentials were not found on this server. "
"Please run 'claude login' or 'claude setup-token' in a terminal "
"on the host, then return here — this page will detect the credentials automatically."
)
if flow.get("expires_at"):
payload["expires_at"] = flow["expires_at"]
return payload
def _anthropic_public_status_payload(flow_id: str, flow: dict[str, Any]) -> dict[str, Any]:
payload: dict[str, Any] = {
"ok": True,
"provider": "anthropic",
"flow_id": flow_id,
"status": flow.get("status", "error"),
}
if flow.get("status") == "error" and flow.get("error"):
payload["error"] = ANTHROPIC_PUBLIC_LINK_ERROR
return payload
def _spawn_anthropic_credential_worker(flow_id: str) -> None:
worker = threading.Thread(
target=_run_anthropic_credential_worker, args=(flow_id,), daemon=True,
)
worker.start()
def _run_anthropic_credential_worker(flow_id: str) -> None:
"""Poll for Claude Code credential appearance until found, cancelled, or expired."""
while True:
with _OAUTH_FLOWS_LOCK:
flow = dict(_OAUTH_FLOWS.get(flow_id) or {})
if not flow:
return
if flow.get("status") != "pending":
return
if float(flow.get("expires_at") or 0) <= time.time():
_set_flow_status(flow_id, "expired")
return
time.sleep(max(1, int(flow.get("poll_interval_seconds") or ANTHROPIC_CREDENTIAL_POLL_SECONDS)))
# Re-check status under lock (cancel may have arrived during sleep)
with _OAUTH_FLOWS_LOCK:
live = _OAUTH_FLOWS.get(flow_id)
if not live or live.get("status") != "pending":
return
try:
creds = _read_claude_code_credentials()
if creds is None:
continue
# Re-check status under lock before linking — cancel must win
with _OAUTH_FLOWS_LOCK:
current = _OAUTH_FLOWS.get(flow_id)
if not current or current.get("status") != "pending":
return
hermes_home = Path(flow["hermes_home"])
_link_anthropic_credentials(hermes_home)
with _OAUTH_FLOWS_LOCK:
current = _OAUTH_FLOWS.get(flow_id)
if not current or current.get("status") != "pending":
cancelled = bool(current and current.get("status") == "cancelled")
else:
current["status"] = "success"
current["updated_at"] = time.time()
_drop_sensitive_flow_fields(current)
cancelled = False
if cancelled:
_remove_anthropic_link_marker(hermes_home)
return
except Exception as exc:
logger.warning("Anthropic credential polling failed: %s", exc)
with _OAUTH_FLOWS_LOCK:
current = _OAUTH_FLOWS.get(flow_id)
if current and current.get("status") == "pending":
current["status"] = "error"
current["updated_at"] = time.time()
current["error"] = str(exc)
_drop_sensitive_flow_fields(current)
return
def _remove_anthropic_link_marker(hermes_home: Path) -> None:
"""Remove the secret-free Claude Code linked marker after a cancelled race."""
auth_path = Path(hermes_home) / "auth.json"
auth = _read_auth_json(auth_path)
pool = auth.get("credential_pool")
if not isinstance(pool, dict):
return
entries = pool.get("anthropic")
if not isinstance(entries, list):
return
kept = [entry for entry in entries if not (isinstance(entry, dict) and entry.get("source") == "claude_code_linked")]
if len(kept) == len(entries):
return
if kept:
pool["anthropic"] = kept
else:
pool.pop("anthropic", None)
auth["updated_at"] = _now_iso()
_write_auth_json(auth, auth_path)
try:
with urllib.request.urlopen(req, timeout=15) as resp:
return json.loads(resp.read().decode())
except Exception as e:
raise RuntimeError(f"Failed to start Codex OAuth: {e}") from e
from api.config import invalidate_credential_pool_cache
invalidate_credential_pool_cache("anthropic")
except Exception:
logger.debug("Failed to invalidate anthropic credential cache", exc_info=True)
# ── Codex protocol ──────────────────────────────────────────────────────────
def _json_request(url: str, payload: dict[str, Any], *, form: bool = False) -> dict[str, Any]:
if form:
data = urllib.parse.urlencode(payload).encode("utf-8")
content_type = "application/x-www-form-urlencoded"
else:
data = json.dumps(payload).encode("utf-8")
content_type = "application/json"
req = urllib.request.Request(
url,
data=data,
method="POST",
headers={"Content-Type": content_type, "Accept": "application/json"},
)
with urllib.request.urlopen(req, timeout=15) as resp:
return json.loads(resp.read().decode("utf-8"))
def _request_codex_user_code() -> dict[str, Any]:
return _json_request(CODEX_USER_CODE_URL, {"client_id": CODEX_CLIENT_ID})
def _poll_codex_authorization(device_auth_id: str, user_code: str) -> dict[str, Any] | None:
try:
return _json_request(
CODEX_DEVICE_TOKEN_URL,
{"device_auth_id": device_auth_id, "user_code": user_code},
)
except urllib.error.HTTPError as exc:
if exc.code in (403, 404):
return None
raise
def _exchange_codex_authorization(authorization_code: str, code_verifier: str) -> dict[str, Any]:
return _json_request(
CODEX_TOKEN_URL,
{
"grant_type": "authorization_code",
"code": authorization_code,
"redirect_uri": CODEX_REDIRECT_URI,
"client_id": CODEX_CLIENT_ID,
"code_verifier": code_verifier,
},
form=True,
)
def _codex_public_start_payload(flow_id: str, flow: dict[str, Any]) -> dict[str, Any]:
return {
"ok": True,
"provider": "openai-codex",
"flow_id": flow_id,
"status": flow.get("status", "pending"),
"verification_uri": CODEX_VERIFICATION_URI,
"user_code": flow.get("user_code", ""),
"expires_at": flow.get("expires_at"),
"poll_interval_seconds": flow.get("poll_interval_seconds", 5),
}
def _codex_public_status_payload(flow_id: str, flow: dict[str, Any]) -> dict[str, Any]:
payload = {
"ok": True,
"provider": "openai-codex",
"flow_id": flow_id,
"status": flow.get("status", "error"),
}
if flow.get("status") == "error" and flow.get("error"):
payload["error"] = str(flow.get("error"))[:200]
return payload
def _public_start_payload(flow_id: str, flow: dict[str, Any]) -> dict[str, Any]:
provider = flow.get("provider", "openai-codex")
if provider == "anthropic":
return _anthropic_public_start_payload(flow_id, flow)
return _codex_public_start_payload(flow_id, flow)
def _public_status_payload(flow_id: str, flow: dict[str, Any]) -> dict[str, Any]:
provider = flow.get("provider", "openai-codex")
if provider == "anthropic":
return _anthropic_public_status_payload(flow_id, flow)
return _codex_public_status_payload(flow_id, flow)
def _drop_sensitive_flow_fields(flow: dict[str, Any]) -> None:
for key in (
"device_auth_id",
"authorization_code",
"code_verifier",
"access_token",
"refresh_token",
"token_data",
):
flow.pop(key, None)
def _cleanup_oauth_flows(now: float | None = None) -> None:
now = now or time.time()
cutoff = now - 300
with _OAUTH_FLOWS_LOCK:
for fid, flow in list(_OAUTH_FLOWS.items()):
status = flow.get("status")
if status == "pending" and float(flow.get("expires_at") or 0) <= now:
flow["status"] = "expired"
_drop_sensitive_flow_fields(flow)
if status in {"success", "expired", "cancelled", "error"} and float(flow.get("updated_at") or 0) < cutoff:
_OAUTH_FLOWS.pop(fid, None)
def _spawn_codex_oauth_worker(flow_id: str) -> None:
worker = threading.Thread(target=_run_codex_oauth_worker, args=(flow_id,), daemon=True)
worker.start()
def _set_flow_status(flow_id: str, status: str, **fields: Any) -> None:
with _OAUTH_FLOWS_LOCK:
flow = _OAUTH_FLOWS.get(flow_id)
if not flow:
return
flow["status"] = status
flow["updated_at"] = time.time()
flow.update(fields)
if status in {"success", "expired", "cancelled", "error"}:
_drop_sensitive_flow_fields(flow)
def _run_codex_oauth_worker(flow_id: str) -> None:
while True:
with _OAUTH_FLOWS_LOCK:
flow = dict(_OAUTH_FLOWS.get(flow_id) or {})
if not flow:
return
status = flow.get("status")
if status != "pending":
return
if float(flow.get("expires_at") or 0) <= time.time():
_set_flow_status(flow_id, "expired")
return
time.sleep(max(1, int(flow.get("poll_interval_seconds") or 5)))
with _OAUTH_FLOWS_LOCK:
live = dict(_OAUTH_FLOWS.get(flow_id) or {})
if live.get("status") != "pending":
return
try:
code_resp = _poll_codex_authorization(
str(live.get("device_auth_id") or ""),
str(live.get("user_code") or ""),
)
if code_resp is None:
continue
authorization_code = str(code_resp.get("authorization_code") or "").strip()
code_verifier = str(code_resp.get("code_verifier") or "").strip()
if not authorization_code or not code_verifier:
raise RuntimeError("Device auth response missing authorization_code or code_verifier")
tokens = _exchange_codex_authorization(authorization_code, code_verifier)
# Re-check status under lock before persisting: a cancel/expire that
# raced with the device-token + token-exchange network calls must
# win, so we don't persist credentials the user explicitly aborted.
with _OAUTH_FLOWS_LOCK:
current = _OAUTH_FLOWS.get(flow_id)
if not current or current.get("status") != "pending":
return
_persist_codex_credentials(Path(live["hermes_home"]), tokens)
_set_flow_status(flow_id, "success")
return
except Exception as exc:
logger.warning("Codex OAuth onboarding flow failed: %s", exc)
_set_flow_status(flow_id, "error", error=str(exc))
return
def _start_anthropic_flow(hermes_home: Path) -> dict[str, Any]:
"""Start or immediately complete the Anthropic credential-linking flow."""
creds = _read_claude_code_credentials()
flow_id = uuid.uuid4().hex
if creds:
# Credentials already exist — link and return success immediately.
_link_anthropic_credentials(hermes_home)
flow = {
"provider": "anthropic",
"status": "success",
"hermes_home": str(hermes_home),
"created_at": time.time(),
"updated_at": time.time(),
}
with _OAUTH_FLOWS_LOCK:
_OAUTH_FLOWS[flow_id] = flow
return _public_start_payload(flow_id, flow)
# No credentials found — create a pending flow that polls for them.
expires_at = time.time() + ANTHROPIC_FLOW_MAX_WAIT_SECONDS
flow = {
"provider": "anthropic",
"status": "pending",
"expires_at": expires_at,
"poll_interval_seconds": ANTHROPIC_CREDENTIAL_POLL_SECONDS,
"hermes_home": str(hermes_home),
"created_at": time.time(),
"updated_at": time.time(),
}
with _OAUTH_FLOWS_LOCK:
_OAUTH_FLOWS[flow_id] = flow
_spawn_anthropic_credential_worker(flow_id)
return _public_start_payload(flow_id, flow)
def start_onboarding_oauth_flow(body: dict[str, Any] | None) -> dict[str, Any]:
"""Start the supported onboarding OAuth flow.
Supports OpenAI Codex (device-code flow) and Anthropic/Claude Code
(credential-linking flow). Other providers are rejected.
"""
_cleanup_oauth_flows()
provider = str((body or {}).get("provider") or "").strip().lower()
if provider not in _ALLOWED_ONBOARDING_OAUTH_PROVIDERS:
if provider in _REJECTED_ONBOARDING_OAUTH_PROVIDERS or provider:
raise ValueError(
"Only OpenAI Codex and Anthropic/Claude OAuth are supported "
"in WebUI onboarding right now"
)
raise ValueError("provider is required")
# Normalize Claude aliases to canonical "anthropic"
if provider in _ANTHROPIC_PROVIDER_ALIASES:
return _start_anthropic_flow(_get_active_hermes_home())
# Codex flow
hermes_home = _get_active_hermes_home()
try:
device = _request_codex_user_code()
except Exception as exc:
raise RuntimeError(f"Failed to start Codex OAuth: {exc}") from exc
user_code = str(device.get("user_code") or "").strip()
device_auth_id = str(device.get("device_auth_id") or "").strip()
if not user_code or not device_auth_id:
raise RuntimeError("Device code response missing required fields")
interval = max(3, int(device.get("interval") or 5))
expires_in = int(device.get("expires_in") or CODEX_FLOW_MAX_WAIT_SECONDS)
expires_at = time.time() + min(max(expires_in, 60), CODEX_FLOW_MAX_WAIT_SECONDS)
flow_id = uuid.uuid4().hex
flow = {
"provider": "openai-codex",
"status": "pending",
"device_auth_id": device_auth_id,
"user_code": user_code,
"expires_at": expires_at,
"poll_interval_seconds": interval,
"hermes_home": str(hermes_home),
"created_at": time.time(),
"updated_at": time.time(),
}
with _OAUTH_FLOWS_LOCK:
_OAUTH_FLOWS[flow_id] = flow
_spawn_codex_oauth_worker(flow_id)
return _public_start_payload(flow_id, flow)
def poll_onboarding_oauth_flow(flow_id: str) -> dict[str, Any]:
_cleanup_oauth_flows()
fid = str(flow_id or "").strip()
if not fid:
raise ValueError("flow_id is required")
with _OAUTH_FLOWS_LOCK:
flow = _OAUTH_FLOWS.get(fid)
if not flow:
raise KeyError("OAuth flow not found")
if flow.get("status") == "pending" and float(flow.get("expires_at") or 0) <= time.time():
flow["status"] = "expired"
flow["updated_at"] = time.time()
_drop_sensitive_flow_fields(flow)
return _public_status_payload(fid, dict(flow))
def cancel_onboarding_oauth_flow(body: dict[str, Any] | None) -> dict[str, Any]:
fid = str((body or {}).get("flow_id") or "").strip()
if not fid:
raise ValueError("flow_id is required")
requested_provider = _normalize_onboarding_oauth_provider(str((body or {}).get("provider") or ""))
if requested_provider not in {"openai-codex", "anthropic"}:
requested_provider = "openai-codex"
with _OAUTH_FLOWS_LOCK:
flow = _OAUTH_FLOWS.get(fid)
if not flow:
return {"ok": True, "provider": requested_provider, "flow_id": fid, "status": "cancelled"}
if flow.get("status") == "pending":
flow["status"] = "cancelled"
flow["updated_at"] = time.time()
_drop_sensitive_flow_fields(flow)
result = _public_status_payload(fid, dict(flow))
return result
# Backward-compatible names from the abandoned spike. They intentionally do not
# expose provider device secrets to callers anymore.
def start_codex_device_code():
return start_onboarding_oauth_flow({"provider": "openai-codex"})
def poll_codex_token(device_code, interval=5):
"""Poll for Codex OAuth token. Generator that yields status dicts.
Yields:
{"status": "polling", "attempt": N, "max_attempts": 40}
{"status": "success", "credentials": {...}}
{"status": "error", "error": "..."}
"""
params = {
"grant_type": CODEX_GRANT_TYPE_DEVICE,
"device_code": device_code,
"client_id": CODEX_CLIENT_ID,
}
data = urllib.parse.urlencode(params).encode()
max_attempts = 40 # 40 * 5 = 200s max
for attempt in range(max_attempts):
yield {"status": "polling", "attempt": attempt + 1, "max_attempts": max_attempts}
req = urllib.request.Request(CODEX_TOKEN_URL, data=data, method="POST")
req.add_header("Content-Type", "application/x-www-form-urlencoded")
try:
with urllib.request.urlopen(req, timeout=15) as resp:
token_data = json.loads(resp.read().decode())
# Save to auth.json credential_pool
_save_codex_credentials(token_data)
yield {"status": "success", "credentials": {
"access_token": "***",
"refresh_token": "***",
"token_type": token_data.get("token_type"),
"expires_in": token_data.get("expires_in"),
}}
return
except urllib.error.HTTPError as e:
body = e.read().decode()
try:
err_data = json.loads(body)
error = err_data.get("error", "")
if error == "authorization_pending":
time.sleep(interval)
continue
elif error == "slow_down":
time.sleep(interval + 5)
continue
elif error == "expired_token":
yield {"status": "error", "error": "Device code expired. Please try again."}
return
else:
yield {"status": "error", "error": err_data.get("error_description", error)}
return
except Exception:
yield {"status": "error", "error": body[:200]}
return
except Exception as e:
yield {"status": "error", "error": str(e)}
return
yield {"status": "error", "error": "OAuth flow timed out. Please try again."}
def _save_codex_credentials(token_data):
"""Save Codex OAuth credentials to auth.json credential_pool."""
auth = _read_auth_json()
if "credential_pool" not in auth:
auth["credential_pool"] = {}
pool = auth["credential_pool"]
if "openai-codex" not in pool:
pool["openai-codex"] = []
# Check if an oauth_device entry already exists (update in place)
updated = False
_now_iso = time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime())
for entry in pool["openai-codex"]:
if entry.get("source") == "oauth_device":
entry["access_token"] = token_data.get("access_token", "")
entry["refresh_token"] = token_data.get("refresh_token", "")
entry["auth_type"] = "oauth"
entry["updated_at"] = _now_iso
updated = True
break
if not updated:
existing_ids = {e["id"] for e in pool.get("openai-codex", [])}
for _ in range(3): # retry on collision
cred_id = "codex-oauth-" + uuid.uuid4().hex[:8]
if cred_id not in existing_ids:
break
pool["openai-codex"].append({
"id": cred_id,
"label": "Codex OAuth",
"auth_type": "oauth",
"source": "oauth_device",
"access_token": token_data.get("access_token", ""),
"refresh_token": token_data.get("refresh_token", ""),
"priority": 1,
"created_at": _now_iso,
})
auth["updated_at"] = _now_iso
_write_auth_json(auth)
yield {"status": "error", "error": "Use /api/onboarding/oauth/poll with flow_id"}
+43 -13
View File
@@ -53,6 +53,8 @@ _SUPPORTED_PROVIDER_SETUPS = {
"requires_base_url": False,
"models": list(_PROVIDER_MODELS.get("anthropic", [])),
"category": "easy_start",
"oauth_provider": "anthropic",
"oauth_label": "Claude Code OAuth",
},
"openai": {
"label": "OpenAI",
@@ -137,6 +139,15 @@ _SUPPORTED_PROVIDER_SETUPS = {
"models": list(_PROVIDER_MODELS.get("deepseek", [])),
"category": "specialized",
},
"xiaomi": {
"label": "Xiaomi MiMo",
"env_var": "XIAOMI_API_KEY",
"default_model": "mimo-v2.5-pro",
"default_base_url": "https://api.xiaomimimo.com/v1",
"requires_base_url": False,
"models": list(_PROVIDER_MODELS.get("xiaomi", [])),
"category": "specialized",
},
"zai": {
"label": "Z.AI / GLM (智谱)",
"env_var": "GLM_API_KEY",
@@ -185,8 +196,9 @@ _PROVIDER_CATEGORIES = [
]
_UNSUPPORTED_PROVIDER_NOTE = (
"OAuth and advanced provider flows such as Nous Portal, OpenAI Codex, and GitHub "
"Copilot are still terminal-first. Use `hermes model` for those flows."
"Advanced provider flows such as Nous Portal and GitHub Copilot are still "
"terminal-first. OpenAI Codex and Anthropic Claude Code can be authenticated in this onboarding flow "
"when your Hermes config selects the corresponding provider."
)
@@ -537,7 +549,7 @@ def _provider_api_key_present(
# var names and can check os.environ for a valid key.
# Exclude known OAuth/token-flow providers — those are handled separately by
# _provider_oauth_authenticated() and should not be short-circuited here.
_known_oauth = {"openai-codex", "copilot", "copilot-acp", "qwen-oauth", "nous"}
_known_oauth = {"openai-codex", "copilot", "copilot-acp", "qwen-oauth", "nous", "anthropic"}
if provider not in _SUPPORTED_PROVIDER_SETUPS and provider not in _known_oauth:
try:
from hermes_cli.auth import get_auth_status as _gas
@@ -581,10 +593,11 @@ def _provider_oauth_authenticated(provider: str, hermes_home: "Path") -> bool:
used by current Hermes runtime auth resolution.
"""
provider = (provider or "").strip().lower()
provider = {"claude": "anthropic", "claude-code": "anthropic"}.get(provider, provider)
if not provider:
return False
_known_oauth_providers = {"openai-codex", "copilot", "copilot-acp", "qwen-oauth", "nous"}
_known_oauth_providers = {"openai-codex", "copilot", "copilot-acp", "qwen-oauth", "nous", "anthropic"}
if provider not in _known_oauth_providers:
return False
@@ -606,7 +619,16 @@ def _provider_oauth_authenticated(provider: str, hermes_home: "Path") -> bool:
if isinstance(pool_store, dict):
entries = pool_store.get(provider)
if isinstance(entries, list):
return any(_oauth_payload_has_token(entry) for entry in entries)
for entry in entries:
if _oauth_payload_has_token(entry):
return True
if (
provider == "anthropic"
and isinstance(entry, dict)
and entry.get("auth_type") == "oauth"
and entry.get("source") == "claude_code_linked"
):
return True
return False
except Exception:
@@ -647,6 +669,10 @@ def _status_from_runtime(cfg: dict, imports_ok: bool) -> dict:
)
else:
provider_ready = _provider_api_key_present(provider, cfg, env_values)
if not provider_ready and meta.get("oauth_provider"):
provider_ready = _provider_oauth_authenticated(
str(meta.get("oauth_provider")), _get_active_hermes_home()
)
else:
# Unknown provider — may be an OAuth flow (openai-codex, copilot, etc.)
# OR an API-key provider not in the quick-setup list (minimax-cn, deepseek,
@@ -729,6 +755,8 @@ def _build_setup_catalog(cfg: dict) -> dict:
"models": list(meta.get("models", [])),
"category": meta.get("category", "easy_start"),
"quick": meta.get("quick", False),
"oauth_provider": meta.get("oauth_provider") or "",
"oauth_label": meta.get("oauth_label") or "",
}
)
@@ -748,9 +776,9 @@ def _build_setup_catalog(cfg: dict) -> dict:
# Flag whether the currently-configured provider is OAuth-based (not in the
# API-key flow). The frontend uses this to show a confirmation card instead
# of a key input when the user has already authenticated via 'hermes auth'.
current_is_oauth = current_provider not in _SUPPORTED_PROVIDER_SETUPS and bool(
current_provider
)
current_is_oauth = (
current_provider not in _SUPPORTED_PROVIDER_SETUPS and bool(current_provider)
) or _provider_oauth_authenticated(current_provider, _get_active_hermes_home())
return {
"providers": providers,
@@ -915,11 +943,13 @@ def apply_onboarding_setup(body: dict) -> dict:
if not api_key and not _provider_api_key_present(provider, cfg, env_values):
# Providers that may run keyless (lmstudio, ollama, custom — gated by
# `key_optional` in _SUPPORTED_PROVIDER_SETUPS) are allowed to onboard
# with no api_key. The agent runtime substitutes a placeholder
# (LMSTUDIO_NOAUTH_PLACEHOLDER) for those, and the probe (#1499) gives
# the user immediate feedback if their server actually does require
# auth (http_4xx with status 401). See #1499 third sub-bug from #1420.
if not provider_meta.get("key_optional"):
# with no api_key. OAuth-capable wizard providers (currently Anthropic
# via Claude Code) are also allowed once their server-side OAuth/link
# marker is present.
oauth_ready = bool(provider_meta.get("oauth_provider")) and _provider_oauth_authenticated(
str(provider_meta.get("oauth_provider")), _get_active_hermes_home()
)
if not provider_meta.get("key_optional") and not oauth_ready:
raise ValueError(f"{provider_meta['env_var']} is required")
model_cfg = cfg.get("model", {})
+416 -28
View File
@@ -37,6 +37,13 @@ _loaded_profile_env_keys: set[str] = set()
# process-global _active_profile.
_tls = threading.local()
def _unwrap_profile_home_to_base(home: Path) -> Path:
"""Return the base Hermes home when *home* is already a named profile dir."""
if home.parent.name == 'profiles':
return home.parent.parent
return home
def _resolve_base_hermes_home() -> Path:
"""Return the BASE ~/.hermes directory — the root that contains profiles/.
@@ -56,20 +63,22 @@ def _resolve_base_hermes_home() -> Path:
reading it here would make _DEFAULT_HERMES_HOME point to that subdir,
causing switch_profile('webui') to look for
/home/user/.hermes/profiles/webui/profiles/webui which doesn't exist.
HERMES_BASE_HOME normally points at the base home already, but isolated
single-profile WebUI deployments can provide /base/profiles/<name> there as
well. Normalize both env vars through the same helper so active-profile
and per-request resolution share one base-root contract (#749).
"""
# Explicit override for tests or unusual setups
base_override = os.getenv('HERMES_BASE_HOME', '').strip()
if base_override:
return Path(base_override).expanduser()
return _unwrap_profile_home_to_base(Path(base_override).expanduser())
hermes_home = os.getenv('HERMES_HOME', '').strip()
if hermes_home:
p = Path(hermes_home).expanduser()
# If HERMES_HOME points to a profiles/ subdir, walk up two levels to the base
if p.parent.name == 'profiles':
return p.parent.parent
# Otherwise trust it (e.g. test isolation sets HERMES_HOME to TEST_STATE_DIR)
return p
return _unwrap_profile_home_to_base(p)
return Path.home() / '.hermes'
@@ -91,6 +100,103 @@ def _read_active_profile_file() -> str:
# ── Public API ──────────────────────────────────────────────────────────────
# ── Root-profile resolution (#1612) ────────────────────────────────────────
#
# Hermes Agent allows the root/default profile (~/.hermes itself) to have a
# display name other than the legacy literal 'default'. When that happens,
# WebUI must NOT resolve the display name as ~/.hermes/profiles/<name> — that
# directory doesn't exist, and every site that does `if name == 'default':`
# will fall through to the wrong filesystem path.
#
# `_is_root_profile(name)` answers "does this name resolve to ~/.hermes?" and
# is the canonical replacement for scattered `if name == 'default':` checks
# in switch_profile, get_active_hermes_home, _validate_profile_name, etc.
#
# Cost note: list_profiles_api() shells out via hermes_cli (non-trivial), so
# we memoize the lookup. The cache is invalidated whenever profiles are
# created, deleted, renamed, or cloned — i.e. on every mutation site we
# control.
_root_profile_name_cache: set[str] = {'default'}
_root_profile_name_cache_lock = threading.Lock()
_root_profile_name_cache_loaded = False
def _invalidate_root_profile_cache() -> None:
"""Drop the memoized root-profile-name set.
Called whenever profile metadata might have changed: create, clone,
delete, rename. The next _is_root_profile() call repopulates from
list_profiles_api().
"""
global _root_profile_name_cache_loaded
with _root_profile_name_cache_lock:
_root_profile_name_cache.clear()
_root_profile_name_cache.add('default')
_root_profile_name_cache_loaded = False
def _is_root_profile(name: str) -> bool:
"""True if *name* resolves to the Hermes Agent root profile (~/.hermes).
Matches the legacy 'default' alias plus any name where list_profiles_api()
reports is_default=True. Memoized; call _invalidate_root_profile_cache()
after mutating profile metadata.
"""
global _root_profile_name_cache_loaded
if not name:
return False
if name == 'default':
return True
with _root_profile_name_cache_lock:
if _root_profile_name_cache_loaded:
return name in _root_profile_name_cache
# Cache miss — populate from list_profiles_api(). Done outside the lock to
# avoid holding it across a hermes_cli subprocess call.
try:
infos = list_profiles_api()
except Exception:
logger.debug("Failed to list profiles for root-profile lookup", exc_info=True)
return False
with _root_profile_name_cache_lock:
_root_profile_name_cache.clear()
_root_profile_name_cache.add('default')
for p in infos:
try:
if p.get('is_default') and p.get('name'):
_root_profile_name_cache.add(p['name'])
except (AttributeError, TypeError):
continue
_root_profile_name_cache_loaded = True
return name in _root_profile_name_cache
def _profiles_match(row_profile, active_profile) -> bool:
"""Return True if a session/project row's profile matches the active profile.
Treats both the literal alias 'default' and any renamed-root display name
(per _is_root_profile) as equivalent, so legacy rows tagged 'default'
still surface when the user has renamed the root profile to e.g. 'kinni',
and vice versa.
A row with no profile (`None` or empty string) is treated as belonging to
the root profile that's the convention used by the legacy backfill at
api/models.py::all_sessions, and matches the default seen in
`static/sessions.js` (`S.activeProfile||'default'`).
Originally lived in api/routes.py; relocated here so both routes.py and
out-of-process consumers (mcp_server.py) can import the canonical helper
instead of duplicating the body. See #1614 for the visibility model.
"""
row = row_profile or 'default'
active = active_profile or 'default'
if row == active:
return True
# Cross-alias the renamed root.
if _is_root_profile(row) and _is_root_profile(active):
return True
return False
def get_active_profile_name() -> str:
"""Return the currently active profile name.
@@ -123,22 +229,287 @@ def clear_request_profile() -> None:
_tls.profile = None
def _resolve_profile_home_for_name(name: str) -> Path:
"""Resolve a logical profile name to its Hermes home path.
Root/default aliases resolve to _DEFAULT_HERMES_HOME. Valid named profiles
resolve to _DEFAULT_HERMES_HOME/profiles/<name> even when the directory has
not been created yet; the agent layer may create it on first use. Invalid
names fall back to the base home so traversal-shaped cookie values cannot
influence filesystem paths.
"""
if not name or _is_root_profile(name):
return _DEFAULT_HERMES_HOME
if not _PROFILE_ID_RE.fullmatch(name):
return _DEFAULT_HERMES_HOME
return _resolve_named_profile_home(name)
def get_active_hermes_home() -> Path:
"""Return the HERMES_HOME path for the currently active profile.
Uses get_active_profile_name() so per-request TLS context (issue #798)
is respected, not just the process-level global.
"""
name = get_active_profile_name()
if name == 'default':
return _DEFAULT_HERMES_HOME
profile_dir = _DEFAULT_HERMES_HOME / 'profiles' / name
if profile_dir.is_dir():
return profile_dir
return _DEFAULT_HERMES_HOME
return _resolve_profile_home_for_name(get_active_profile_name())
# ── Cron-call profile isolation (issue: Scheduled jobs ignored active profile) ─
# `cron.jobs` reads HERMES_HOME from os.environ (process-global) at function-
# call time. That bypasses our per-request thread-local profile, so the
# `/api/crons*` endpoints always returned the process-default profile's jobs.
# This context manager swaps HERMES_HOME (and the cached module-level constants
# in cron.jobs) for the duration of a cron call, serialized by a lock so
# concurrent requests from different profiles don't race on the global env var.
#
# Thread-safety note on os.environ mutation:
# CPython's os.environ assignment is GIL-protected at the bytecode level, but
# multi-step read-modify-write sequences (snapshot prev → assign new → restore
# on exit) are NOT atomic without explicit serialization. The _cron_env_lock
# below makes the entire context-manager body run-to-completion serially, so
# all webui access to HERMES_HOME goes through one thread at a time. Any
# subprocess.Popen() call inside `run_job` inherits the env at fork time,
# which is also under the lock — so child processes always see a consistent
# (own-profile) HERMES_HOME, never a half-swapped state.
_cron_env_lock = threading.Lock()
def _cron_profile_context_depth() -> int:
return int(getattr(_tls, 'cron_profile_depth', 0) or 0)
def _push_cron_profile_context_depth() -> None:
_tls.cron_profile_depth = _cron_profile_context_depth() + 1
def _pop_cron_profile_context_depth() -> None:
depth = _cron_profile_context_depth()
_tls.cron_profile_depth = max(0, depth - 1)
def _home_for_scheduled_cron_job(job: dict) -> Path:
"""Resolve the profile home an auto-fired scheduler job should execute in.
Legacy jobs with no profile keep the scheduler's server-default profile.
Jobs pinned to a named profile execute under that profile's HERMES_HOME, so
an in-process WebUI scheduler thread does not leak process-global config or
.env into the agent run. If a profile was deleted after the job was saved,
fall back to the server default rather than crashing every scheduler tick.
"""
raw = str((job or {}).get('profile') or '').strip()
if not raw:
return get_active_hermes_home()
if _is_root_profile(raw):
return _DEFAULT_HERMES_HOME
if not _PROFILE_ID_RE.fullmatch(raw):
logger.warning(
"Cron job %s has invalid profile %r; falling back to server default",
(job or {}).get('id', '?'), raw,
)
return get_active_hermes_home()
home = _resolve_named_profile_home(raw)
if not home.is_dir():
logger.warning(
"Cron job %s references missing profile %r; falling back to server default",
(job or {}).get('id', '?'), raw,
)
return get_active_hermes_home()
return home
def install_cron_scheduler_profile_isolation() -> None:
"""Patch cron.scheduler.run_job for WebUI in-process scheduler safety.
Standard WebUI deployments do not start the scheduler thread in-process, but
if a future/single-process deployment calls cron.scheduler.tick() from the
WebUI worker, tick's background job path has no request TLS context. Wrap
run_job so each auto-fired job's persisted ``profile`` field gets the same
HERMES_HOME isolation as the manual /api/crons/run path.
"""
try:
import cron.scheduler as _cs
except ImportError:
logger.debug("install_cron_scheduler_profile_isolation: cron.scheduler unavailable")
return
original = getattr(_cs, 'run_job', None)
if original is None or getattr(original, '_webui_profile_isolated', False):
return
def _webui_profile_isolated_run_job(job, *args, **kwargs):
# Manual WebUI runs already enter cron_profile_context_for_home before
# calling run_job. Avoid nesting the non-reentrant env lock or changing
# the explicitly selected manual execution profile.
if _cron_profile_context_depth() > 0:
return original(job, *args, **kwargs)
with cron_profile_context_for_home(_home_for_scheduled_cron_job(job)):
return original(job, *args, **kwargs)
_webui_profile_isolated_run_job._webui_profile_isolated = True
_webui_profile_isolated_run_job._webui_original_run_job = original
_cs.run_job = _webui_profile_isolated_run_job
class cron_profile_context_for_home:
"""Context manager that pins HERMES_HOME to an explicit profile home path.
Use this variant from worker threads that don't have TLS context (e.g. the
background thread started by /api/crons/run). The HTTP-side variant below
resolves the home via TLS.
"""
def __init__(self, home: Path):
self._home = Path(home)
def __enter__(self):
_cron_env_lock.acquire()
_push_cron_profile_context_depth()
try:
self._prev_env = os.environ.get('HERMES_HOME')
os.environ['HERMES_HOME'] = str(self._home)
# Re-patch cron.jobs module-level constants (see main context manager
# below for the rationale).
self._prev_cj = None
try:
import cron.jobs as _cj
self._prev_cj = (_cj.HERMES_DIR, _cj.CRON_DIR, _cj.JOBS_FILE, _cj.OUTPUT_DIR)
_cj.HERMES_DIR = self._home
_cj.CRON_DIR = self._home / 'cron'
_cj.JOBS_FILE = _cj.CRON_DIR / 'jobs.json'
_cj.OUTPUT_DIR = _cj.CRON_DIR / 'output'
except (ImportError, AttributeError):
logger.debug("cron_profile_context_for_home: cron.jobs unavailable")
# cron.scheduler snapshots _hermes_home at import time and run_job()
# reads config/.env from that module global. Patch it alongside
# cron.jobs so manual WebUI runs actually execute under the selected
# profile, not merely write output metadata there (#617).
self._prev_cs = None
try:
import cron.scheduler as _cs
self._prev_cs = (
getattr(_cs, '_hermes_home', None),
getattr(_cs, '_LOCK_DIR', None),
getattr(_cs, '_LOCK_FILE', None),
)
_cs._hermes_home = self._home
_cs._LOCK_DIR = self._home / 'cron'
_cs._LOCK_FILE = _cs._LOCK_DIR / '.tick.lock'
except (ImportError, AttributeError):
logger.debug("cron_profile_context_for_home: cron.scheduler unavailable")
except Exception:
_pop_cron_profile_context_depth()
_cron_env_lock.release()
raise
return self
def __exit__(self, exc_type, exc_val, exc_tb):
try:
if self._prev_env is None:
os.environ.pop('HERMES_HOME', None)
else:
os.environ['HERMES_HOME'] = self._prev_env
if self._prev_cj is not None:
try:
import cron.jobs as _cj
_cj.HERMES_DIR, _cj.CRON_DIR, _cj.JOBS_FILE, _cj.OUTPUT_DIR = self._prev_cj
except (ImportError, AttributeError):
pass
if getattr(self, '_prev_cs', None) is not None:
try:
import cron.scheduler as _cs
_cs._hermes_home, _cs._LOCK_DIR, _cs._LOCK_FILE = self._prev_cs
except (ImportError, AttributeError):
pass
finally:
_pop_cron_profile_context_depth()
_cron_env_lock.release()
return False
class cron_profile_context:
"""Context manager that pins HERMES_HOME to the TLS-active profile.
Usage:
with cron_profile_context():
from cron.jobs import list_jobs
jobs = list_jobs(include_disabled=True)
Serializes cron API calls across profiles (cron API is low-frequency;
serialization cost is negligible compared to correctness).
"""
def __enter__(self):
_cron_env_lock.acquire()
_push_cron_profile_context_depth()
try:
self._prev_env = os.environ.get('HERMES_HOME')
home = get_active_hermes_home()
os.environ['HERMES_HOME'] = str(home)
# Re-patch cron.jobs module-level constants. They are snapshot at
# import time (line 68-71 of cron/jobs.py) and don't participate in
# the module's __getattr__ lazy path, so env-var alone is not enough
# for callers that reference the module constants directly.
self._prev_cj = None
try:
import cron.jobs as _cj
self._prev_cj = (_cj.HERMES_DIR, _cj.CRON_DIR, _cj.JOBS_FILE, _cj.OUTPUT_DIR)
_cj.HERMES_DIR = home
_cj.CRON_DIR = home / 'cron'
_cj.JOBS_FILE = _cj.CRON_DIR / 'jobs.json'
_cj.OUTPUT_DIR = _cj.CRON_DIR / 'output'
except (ImportError, AttributeError):
logger.debug("cron_profile_context: cron.jobs unavailable; env-var only")
self._prev_cs = None
try:
import cron.scheduler as _cs
self._prev_cs = (
getattr(_cs, '_hermes_home', None),
getattr(_cs, '_LOCK_DIR', None),
getattr(_cs, '_LOCK_FILE', None),
)
_cs._hermes_home = home
_cs._LOCK_DIR = home / 'cron'
_cs._LOCK_FILE = _cs._LOCK_DIR / '.tick.lock'
except (ImportError, AttributeError):
logger.debug("cron_profile_context: cron.scheduler unavailable; env-var only")
except Exception:
_pop_cron_profile_context_depth()
_cron_env_lock.release()
raise
return self
def __exit__(self, exc_type, exc_val, exc_tb):
try:
# Restore env var
if self._prev_env is None:
os.environ.pop('HERMES_HOME', None)
else:
os.environ['HERMES_HOME'] = self._prev_env
# Restore cron.jobs module constants
if self._prev_cj is not None:
try:
import cron.jobs as _cj
_cj.HERMES_DIR, _cj.CRON_DIR, _cj.JOBS_FILE, _cj.OUTPUT_DIR = self._prev_cj
except (ImportError, AttributeError):
pass
if getattr(self, '_prev_cs', None) is not None:
try:
import cron.scheduler as _cs
_cs._hermes_home, _cs._LOCK_DIR, _cs._LOCK_FILE = self._prev_cs
except (ImportError, AttributeError):
pass
finally:
_pop_cron_profile_context_depth()
_cron_env_lock.release()
return False
def get_hermes_home_for_profile(name: str) -> Path:
"""Return the HERMES_HOME Path for *name* without mutating any process state.
@@ -150,10 +521,7 @@ def get_hermes_home_for_profile(name: str) -> Path:
empty, 'default', or does not match the profile-name format (rejects path
traversal such as '../../etc').
"""
if not name or name == 'default' or not _PROFILE_ID_RE.fullmatch(name):
return _DEFAULT_HERMES_HOME
profile_dir = _DEFAULT_HERMES_HOME / 'profiles' / name
return profile_dir
return _resolve_profile_home_for_name(name)
_TERMINAL_ENV_MAPPINGS = {
@@ -261,6 +629,14 @@ def _set_hermes_home(home: Path):
except (ImportError, AttributeError):
logger.debug("Failed to patch cron.jobs module")
try:
import cron.scheduler as _cs
_cs._hermes_home = home
_cs._LOCK_DIR = home / 'cron'
_cs._LOCK_FILE = _cs._LOCK_DIR / '.tick.lock'
except (ImportError, AttributeError):
logger.debug("Failed to patch cron.scheduler module")
def _reload_dotenv(home: Path):
"""Load .env from the profile dir into os.environ with profile isolation.
@@ -306,6 +682,7 @@ def init_profile_state() -> None:
_active_profile = _read_active_profile_file()
home = get_active_hermes_home()
_set_hermes_home(home)
install_cron_scheduler_profile_isolation()
_reload_dotenv(home)
@@ -329,16 +706,21 @@ def switch_profile(name: str, *, process_wide: bool = True) -> dict:
# Import here to avoid circular import at module load
from api.config import STREAMS, STREAMS_LOCK, reload_config
# Block if agent is running
with STREAMS_LOCK:
if len(STREAMS) > 0:
raise RuntimeError(
'Cannot switch profiles while an agent is running. '
'Cancel or wait for it to finish.'
)
# Process-wide profile switches mutate HERMES_HOME, module-level path caches,
# os.environ-backed .env keys, and the global config cache. Keep those blocked
# while any agent stream is active. Per-client WebUI switches are cookie/TLS
# scoped (process_wide=False) and do not mutate those globals, so users can
# leave a running session in one profile and start work in another (#1700).
if process_wide:
with STREAMS_LOCK:
if len(STREAMS) > 0:
raise RuntimeError(
'Cannot switch profiles while an agent is running. '
'Cancel or wait for it to finish.'
)
# Resolve profile directory
if name == 'default':
if _is_root_profile(name):
home = _DEFAULT_HERMES_HOME
else:
home = _resolve_named_profile_home(name)
@@ -356,7 +738,7 @@ def switch_profile(name: str, *, process_wide: bool = True) -> dict:
# Write sticky default for CLI consistency
try:
ap_file = _DEFAULT_HERMES_HOME / 'active_profile'
ap_file.write_text(name if name != 'default' else '', encoding='utf-8')
ap_file.write_text('' if _is_root_profile(name) else name, encoding='utf-8')
except Exception:
logger.debug("Failed to write active profile file")
@@ -526,7 +908,7 @@ def _create_profile_fallback(name: str, clone_from: str = None,
# Clone config files from source profile if requested
if clone_config and clone_from:
if clone_from == 'default':
if _is_root_profile(clone_from):
source_dir = _DEFAULT_HERMES_HOME
else:
source_dir = _DEFAULT_HERMES_HOME / 'profiles' / clone_from
@@ -575,7 +957,7 @@ def create_profile_api(name: str, clone_from: str = None,
_validate_profile_name(name)
# Defense-in-depth: validate clone_from here too, even though routes.py
# also validates it. Any caller that bypasses the HTTP layer gets protection.
if clone_from is not None and clone_from != 'default':
if clone_from is not None and not _is_root_profile(clone_from):
_validate_profile_name(clone_from)
try:
@@ -606,6 +988,10 @@ def create_profile_api(name: str, clone_from: str = None,
profile_path.mkdir(parents=True, exist_ok=True)
_write_endpoint_to_config(profile_path, base_url=base_url, api_key=api_key)
# Invalidate cached root-profile-name lookup; create_profile may have added
# a new profile that flips is_default semantics on the agent side (#1612).
_invalidate_root_profile_cache()
# Find and return the newly created profile info.
# When hermes_cli is not importable, list_profiles_api() also falls back
# to the stub default-only list and won't find the new profile by name.
@@ -628,7 +1014,7 @@ def create_profile_api(name: str, clone_from: str = None,
def delete_profile_api(name: str) -> dict:
"""Delete a profile. Switches to default first if it's the active one."""
if name == 'default':
if _is_root_profile(name):
raise ValueError("Cannot delete the default profile.")
_validate_profile_name(name)
@@ -654,4 +1040,6 @@ def delete_profile_api(name: str) -> dict:
else:
raise ValueError(f"Profile '{name}' does not exist.")
# Drop cached root-profile-name lookup — list_profiles_api() shape changed.
_invalidate_root_profile_cache()
return {'ok': True, 'name': name}
+622 -3
View File
@@ -7,15 +7,27 @@ multi-provider support).
from __future__ import annotations
import json
import logging
import os
import signal
import subprocess
import sys
import threading
import urllib.error
import urllib.request
from datetime import datetime, timezone
from pathlib import Path
from types import SimpleNamespace
from typing import Any
from api.config import (
_PROVIDER_DISPLAY,
_PROVIDER_MODELS,
_get_config_path,
_get_label_for_model,
_models_from_live_provider_ids,
_read_live_provider_model_ids,
_read_visible_codex_cache_model_ids,
_save_yaml_config_file,
get_config,
invalidate_models_cache,
@@ -24,6 +36,126 @@ from api.config import (
logger = logging.getLogger(__name__)
_OPENROUTER_KEY_URL = "https://openrouter.ai/api/v1/key"
_PROVIDER_QUOTA_TIMEOUT_SECONDS = 3.0
_ACCOUNT_USAGE_SUBPROCESS_TIMEOUT_SECONDS = 35.0
_ACCOUNT_USAGE_PROVIDERS = frozenset({"openai-codex", "anthropic"})
# Upper bound on simultaneous profile-isolated quota probe subprocesses.
# Each probe runs a Python child for up to 35 s; capping concurrency prevents
# resource exhaustion when the UI polls all providers rapidly. The limit is
# deliberately low (2) since _ACCOUNT_USAGE_SUBPROCESS_TIMEOUT_SECONDS is
# already 35 s and probe I/O is lightweight HTTP calls.
_MAX_CONCURRENT_ACCOUNT_USAGE_PROBES = 2
# Parent-death-signal setup: on Linux, arrange for the quota-probe child to
# receive SIGTERM when the WebUI parent dies (e.g. systemctl restart, OOM kill).
# This prevents probe children from becoming orphaned zombies that continue
# calling the provider API indefinitely after the WebUI process is gone.
# We use prctl(PR_SET_DEATHSIG, SIGTERM) which is standard on modern Linux
# kernels and available via ctypes (no external C extension needed).
# If prctl is unavailable (non-Linux, or Linux without prctl support), the
# probe child exits normally when its parent (WebUI) terminates -- on macOS/
# Windows this is handled by OS-level process tree cleanup.
# Portable parent-death-signal bootstrap. On Linux this arranges for the
# probe child to receive SIGTERM when the WebUI parent dies (systemctl
# restart, OOM kill, etc.), preventing orphaned zombie probes from continuing
# to call the provider API indefinitely. Non-Linux platforms (macOS, Windows)
# rely on OS-level process-tree cleanup instead; this variable is then unused.
# prctl(PR_SET_DEATHSIG, SIGTERM) is available via ctypes without any C
# extension — the same technique used throughout the Hermes codebase.
_ACCOUNT_USAGE_PARENT_DEATHSIG_BOOTSTRAP = (
# fmt: off
# Lines are written as string literals so this block passes
# `python3 -m py_compile` cleanly and is safe to include verbatim
# inside the single argument string passed to `python -c ...`.
'import sys\n'
'try:\n'
' import ctypes, signal\n'
' libc = ctypes.CDLL(None)\n'
' libc.prctl(1, signal.SIGTERM) # PR_SET_DEATHSIG=1, SIGTERM=15\n'
'except Exception:\n'
' pass\n'
# fmt: on
)
# Module-level cap on concurrent quota-probe subprocesses.
# Lazily created so this module compiles even when threading isn't ready.
_account_usage_probe_semaphore: threading.BoundedSemaphore | None = None
def _get_account_usage_probe_semaphore() -> threading.BoundedSemaphore:
global _account_usage_probe_semaphore
if _account_usage_probe_semaphore is None:
_account_usage_probe_semaphore = threading.BoundedSemaphore(
_MAX_CONCURRENT_ACCOUNT_USAGE_PROBES
)
return _account_usage_probe_semaphore
# ── preexec_fn: parent-death signal for the probe subprocess ─────────────────
# On POSIX/Linux, arrange for the child to receive SIGTERM when the WebUI
# parent dies (systemctl restart, OOM kill, etc.). The parent's bootstrap
# code (_ACCOUNT_USAGE_PARENT_DEATHSIG_BOOTSTRAP) also covers the grandchild
# fork inside the child, but this preexec_fn handles the direct child-process
# case. Returns None on non-POSIX or when prctl is unavailable so that
# subprocess.run() works on Windows/macOS without changes.
def _account_usage_preexec_fn() -> None:
try:
import ctypes
libc = ctypes.CDLL(None)
libc.prctl(1, signal.SIGTERM) # PR_SET_PDEATHSIG=1, SIGTERM=15
except Exception:
pass
_ACCOUNT_USAGE_SUBPROCESS_CODE = r"""
import json
import sys
from agent.account_usage import fetch_account_usage
def _iso(value):
if value in (None, ""):
return None
if hasattr(value, "isoformat"):
text = value.isoformat()
return text.replace("+00:00", "Z")
text = str(value).strip()
return text or None
def _snapshot_payload(snapshot):
if snapshot is None:
return None
windows = []
for window in getattr(snapshot, "windows", ()) or ():
windows.append({
"label": str(getattr(window, "label", "") or ""),
"used_percent": getattr(window, "used_percent", None),
"reset_at": _iso(getattr(window, "reset_at", None)),
"detail": getattr(window, "detail", None),
})
return {
"provider": str(getattr(snapshot, "provider", "") or ""),
"source": str(getattr(snapshot, "source", "") or ""),
"title": str(getattr(snapshot, "title", "") or ""),
"plan": getattr(snapshot, "plan", None),
"windows": windows,
"details": list(getattr(snapshot, "details", ()) or ()),
"available": bool(getattr(snapshot, "available", bool(windows))),
"unavailable_reason": getattr(snapshot, "unavailable_reason", None),
"fetched_at": _iso(getattr(snapshot, "fetched_at", None)),
}
provider = sys.argv[1]
api_key = sys.argv[2] or None
print(json.dumps(_snapshot_payload(fetch_account_usage(provider, api_key=api_key))))
"""
# SECTION: Provider ↔ env var mapping
# Maps canonical provider slug → env var name for API key.
@@ -42,6 +174,7 @@ _PROVIDER_ENV_VAR: dict[str, str] = {
"minimax-cn": "MINIMAX_CN_API_KEY",
"mistralai": "MISTRAL_API_KEY",
"x-ai": "XAI_API_KEY",
"xiaomi": "XIAOMI_API_KEY",
"opencode-zen": "OPENCODE_ZEN_API_KEY",
"opencode-go": "OPENCODE_GO_API_KEY",
# NOTE: bare "ollama" (local) deliberately omitted — local Ollama is keyless
@@ -269,6 +402,411 @@ def _provider_has_key(provider_id: str) -> bool:
return False
def _get_provider_api_key(provider_id: str) -> str | None:
"""Return a configured provider API key without exposing it to callers."""
provider_id = (provider_id or "").strip().lower()
env_var = _PROVIDER_ENV_VAR.get(provider_id)
if env_var:
env_path = _get_hermes_home() / ".env"
env_values = _load_env_file(env_path)
if env_values.get(env_var):
return str(env_values[env_var]).strip() or None
if os.getenv(env_var):
return os.getenv(env_var, "").strip() or None
for alias in _PROVIDER_ENV_VAR_ALIASES.get(provider_id, ()) or ():
if env_values.get(alias):
return str(env_values[alias]).strip() or None
if os.getenv(alias):
return os.getenv(alias, "").strip() or None
cfg = get_config()
model_cfg = cfg.get("model", {})
if isinstance(model_cfg, dict):
active_provider = str(model_cfg.get("provider") or "").strip().lower()
model_key = str(model_cfg.get("api_key") or "").strip()
if model_key and active_provider == provider_id:
return model_key
providers_cfg = cfg.get("providers", {})
if isinstance(providers_cfg, dict):
provider_cfg = providers_cfg.get(provider_id, {})
if isinstance(provider_cfg, dict):
provider_key = str(provider_cfg.get("api_key") or "").strip()
if provider_key:
return provider_key
custom_providers = cfg.get("custom_providers", [])
if isinstance(custom_providers, list):
for cp in custom_providers:
if not isinstance(cp, dict):
continue
cp_name = str(cp.get("name") or "").strip().lower().replace(" ", "-")
if f"custom:{cp_name}" == provider_id or str(cp.get("name", "")).strip().lower() == provider_id:
cp_key = str(cp.get("api_key") or "").strip()
if cp_key.startswith("${") and cp_key.endswith("}"):
return os.getenv(cp_key[2:-1], "").strip() or None
if cp_key:
return cp_key
return None
def _active_provider_id() -> str | None:
cfg = get_config()
model_cfg = cfg.get("model", {})
if not isinstance(model_cfg, dict):
return None
provider = str(model_cfg.get("provider") or "").strip().lower()
return provider or None
def _quota_number(value: Any) -> int | float | None:
if isinstance(value, bool) or value is None:
return None
if isinstance(value, (int, float)):
return value
try:
text = str(value).strip()
if not text:
return None
number = float(text)
return int(number) if number.is_integer() else number
except (TypeError, ValueError):
return None
def _sanitize_openrouter_quota(payload: Any) -> dict[str, int | float | None]:
if isinstance(payload, dict) and isinstance(payload.get("data"), dict):
payload = payload["data"]
if not isinstance(payload, dict):
payload = {}
return {
"limit_remaining": _quota_number(payload.get("limit_remaining")),
"usage": _quota_number(payload.get("usage")),
"limit": _quota_number(payload.get("limit")),
}
def _isoformat_utc(value: Any) -> str | None:
if value in (None, ""):
return None
if isinstance(value, datetime):
dt = value if value.tzinfo else value.replace(tzinfo=timezone.utc)
return dt.astimezone(timezone.utc).isoformat().replace("+00:00", "Z")
text = str(value).strip()
return text or None
def _serialize_account_usage_snapshot(snapshot: Any) -> dict[str, Any] | None:
if snapshot is None:
return None
windows: list[dict[str, Any]] = []
for window in getattr(snapshot, "windows", ()) or ():
label = str(getattr(window, "label", "") or "").strip()
if not label:
continue
used_percent = _quota_number(getattr(window, "used_percent", None))
remaining_percent = None
if used_percent is not None:
remaining_percent = max(0.0, min(100.0, 100.0 - float(used_percent)))
windows.append({
"label": label,
"used_percent": used_percent,
"remaining_percent": remaining_percent,
"reset_at": _isoformat_utc(getattr(window, "reset_at", None)),
"detail": str(getattr(window, "detail", "") or "").strip() or None,
})
details = [
str(detail).strip()
for detail in (getattr(snapshot, "details", ()) or ())
if str(detail).strip()
]
plan = str(getattr(snapshot, "plan", "") or "").strip() or None
unavailable_reason = str(getattr(snapshot, "unavailable_reason", "") or "").strip() or None
return {
"provider": str(getattr(snapshot, "provider", "") or "").strip() or None,
"source": str(getattr(snapshot, "source", "") or "").strip() or None,
"title": str(getattr(snapshot, "title", "") or "").strip() or "Account limits",
"plan": plan,
"windows": windows,
"details": details,
"available": bool(getattr(snapshot, "available", bool(windows or details))) and not unavailable_reason,
"unavailable_reason": unavailable_reason,
"fetched_at": _isoformat_utc(getattr(snapshot, "fetched_at", None)),
}
def _agent_fetch_account_usage(provider: str, *, base_url: str | None = None, api_key: str | None = None) -> Any:
from agent.account_usage import fetch_account_usage
return fetch_account_usage(provider, base_url=base_url, api_key=api_key)
def _account_usage_subprocess_env(home: Path, provider: str, api_key: str | None) -> dict[str, str]:
env = dict(os.environ)
env["HERMES_HOME"] = str(Path(home))
# Profile .env values should affect only the child quota probe, not the
# WebUI process-global environment. This is especially important for
# Anthropic account usage, where the agent resolver reads OAuth/API tokens
# from environment variables.
for key, value in _load_env_file(Path(home) / ".env").items():
if value:
env[key] = value
env_var = _PROVIDER_ENV_VAR.get((provider or "").strip().lower())
if env_var and api_key:
env[env_var] = api_key
try:
from api.config import _AGENT_DIR
except Exception:
_AGENT_DIR = None
pythonpath_parts: list[str] = []
if _AGENT_DIR:
pythonpath_parts.append(str(_AGENT_DIR))
existing_pythonpath = env.get("PYTHONPATH", "")
if existing_pythonpath:
pythonpath_parts.append(existing_pythonpath)
if pythonpath_parts:
env["PYTHONPATH"] = os.pathsep.join(pythonpath_parts)
return env
def _account_usage_payload_to_snapshot(payload: Any) -> Any:
if not isinstance(payload, dict):
return None
windows = tuple(
SimpleNamespace(
label=window.get("label"),
used_percent=window.get("used_percent"),
reset_at=window.get("reset_at"),
detail=window.get("detail"),
)
for window in (payload.get("windows") or ())
if isinstance(window, dict)
)
return SimpleNamespace(
provider=payload.get("provider"),
source=payload.get("source"),
title=payload.get("title"),
plan=payload.get("plan"),
windows=windows,
details=tuple(payload.get("details") or ()),
available=bool(payload.get("available")),
unavailable_reason=payload.get("unavailable_reason"),
fetched_at=payload.get("fetched_at"),
)
def _agent_fetch_account_usage_for_home(provider: str, home: Path, *, api_key: str | None = None) -> Any:
try:
from api.config import PYTHON_EXE
except Exception:
PYTHON_EXE = sys.executable or "python3"
try:
# On POSIX (Linux/macOS), wire parent-death signal so the child dies
# cleanly if the WebUI parent terminates. preexec_fn is not safe on
# Windows, where OS-level process-tree cleanup handles child orphans.
kwargs: dict[str, Any] = {
"stdin": subprocess.DEVNULL,
"stdout": subprocess.PIPE,
"stderr": subprocess.PIPE,
"text": True,
"timeout": _ACCOUNT_USAGE_SUBPROCESS_TIMEOUT_SECONDS,
"check": False,
}
if hasattr(os, "fork"): # POSIX
kwargs["preexec_fn"] = _account_usage_preexec_fn
proc = subprocess.run(
[
PYTHON_EXE, "-c",
_ACCOUNT_USAGE_PARENT_DEATHSIG_BOOTSTRAP + _ACCOUNT_USAGE_SUBPROCESS_CODE,
provider,
api_key or "",
],
env=_account_usage_subprocess_env(home, provider, api_key),
**kwargs,
)
except subprocess.TimeoutExpired:
logger.debug("Account usage probe for %s timed out", provider)
return None
except Exception:
logger.debug("Account usage probe for %s failed to launch", provider, exc_info=True)
return None
if proc.returncode != 0:
logger.debug("Account usage probe for %s exited with status %s", provider, proc.returncode)
return None
try:
payload = json.loads((proc.stdout or "").strip() or "null")
except json.JSONDecodeError:
logger.debug("Account usage probe for %s returned invalid JSON", provider)
return None
return _account_usage_payload_to_snapshot(payload)
def _fetch_account_usage_with_profile_context(provider: str) -> Any:
"""Fetch account usage for a provider within the active profile context.
Concurrency is capped by the module-level BoundedSemaphore so that rapid
UI polls (e.g. Settings page refresh) cannot exhaust file-descriptors or
memory by spawning more than _MAX_CONCURRENT_ACCOUNT_USAGE_PROBES probe
subprocesses simultaneously. Each probe runs up to 35 s.
A warm worker-pool (reuse of persistent subprocess handles) is a natural
follow-up if this first slice proves insufficient in production.
"""
home = _get_hermes_home()
api_key = _get_provider_api_key(provider)
sem = _get_account_usage_probe_semaphore()
try:
with sem:
return _agent_fetch_account_usage_for_home(
provider,
home,
api_key=api_key,
)
except Exception:
logger.debug("Failed to fetch account usage for %s", provider, exc_info=True)
return None
def _provider_account_usage_status(provider: str, display_name: str) -> dict[str, Any]:
snapshot = _fetch_account_usage_with_profile_context(provider)
account_limits = _serialize_account_usage_snapshot(snapshot)
if account_limits and account_limits.get("available"):
return {
"ok": True,
"provider": provider,
"display_name": display_name,
"supported": True,
"status": "available",
"label": account_limits.get("title") or "Account limits",
"quota": None,
"account_limits": account_limits,
"message": f"{display_name} account limits loaded.",
}
reason = ""
if account_limits:
reason = str(account_limits.get("unavailable_reason") or "").strip()
message = (
f"{display_name} account limits are unavailable. {reason}"
if reason
else f"{display_name} account limits are unavailable. Confirm provider authentication and try again."
)
return {
"ok": False,
"provider": provider,
"display_name": display_name,
"supported": True,
"status": "unavailable",
"quota": None,
"account_limits": account_limits,
"message": message,
}
def get_provider_quota(provider_id: str | None = None) -> dict[str, Any]:
"""Return sanitized quota/rate-limit status for the active provider.
OpenRouter keeps its documented key endpoint. OAuth-backed account usage
providers reuse Hermes Agent's /usage account-limits abstraction so WebUI
stays aligned with CLI/Gateway provider semantics.
"""
provider = (provider_id or _active_provider_id() or "").strip().lower()
if not provider:
return {
"ok": False,
"provider": None,
"display_name": None,
"supported": False,
"status": "unavailable",
"quota": None,
"message": "No active provider is configured.",
}
display_name = _PROVIDER_DISPLAY.get(provider, provider.replace("-", " ").title())
if provider in _ACCOUNT_USAGE_PROVIDERS:
return _provider_account_usage_status(provider, display_name)
if provider != "openrouter":
detail = "OpenAI/Anthropic rate-limit headers are a follow-up once WebUI captures provider response metadata."
return {
"ok": False,
"provider": provider,
"display_name": display_name,
"supported": False,
"status": "unsupported",
"quota": None,
"message": f"Quota status is not available for {display_name}. {detail}",
}
api_key = _get_provider_api_key("openrouter")
if not api_key:
return {
"ok": False,
"provider": "openrouter",
"display_name": display_name,
"supported": True,
"status": "no_key",
"quota": None,
"message": "OpenRouter quota status needs an OPENROUTER_API_KEY configured on the server.",
}
req = urllib.request.Request(
_OPENROUTER_KEY_URL,
headers={
"Authorization": f"Bearer {api_key}",
"Accept": "application/json",
},
)
try:
with urllib.request.urlopen(req, timeout=_PROVIDER_QUOTA_TIMEOUT_SECONDS) as resp:
raw = resp.read()
payload = json.loads(raw.decode("utf-8")) if isinstance(raw, (bytes, bytearray)) else json.loads(raw)
quota = _sanitize_openrouter_quota(payload)
return {
"ok": True,
"provider": "openrouter",
"display_name": display_name,
"supported": True,
"status": "available",
"label": "OpenRouter credits",
"quota": quota,
"message": "OpenRouter quota status loaded.",
}
except urllib.error.HTTPError as exc:
status = "invalid_key" if exc.code in (401, 403) else "unavailable"
message = (
"OpenRouter rejected the configured API key."
if status == "invalid_key"
else "OpenRouter quota status is temporarily unavailable."
)
return {
"ok": False,
"provider": "openrouter",
"display_name": display_name,
"supported": True,
"status": status,
"quota": None,
"message": message,
}
except (TimeoutError, urllib.error.URLError, json.JSONDecodeError, OSError, ValueError):
return {
"ok": False,
"provider": "openrouter",
"display_name": display_name,
"supported": True,
"status": "unavailable",
"quota": None,
"message": "OpenRouter quota status is temporarily unavailable.",
}
def _provider_is_oauth(provider_id: str) -> bool:
"""Check whether a provider uses OAuth/token flows (managed by CLI)."""
return provider_id in _OAUTH_PROVIDERS
@@ -391,7 +929,67 @@ def get_providers() -> dict[str, Any]:
except Exception:
pass
models = _PROVIDER_MODELS.get(pid, [])
models = list(_PROVIDER_MODELS.get(pid, []))
models_total = len(models)
# OpenAI Codex account catalogs drift independently from WebUI releases.
# The model picker already prefers hermes_cli + Codex local cache for
# this provider (the agent's `provider_model_ids("openai-codex")` filters
# IDs with `supported_in_api: false`, but Codex CLI still surfaces some
# of those — notably `gpt-5.3-codex-spark` from #1680 — in its picker).
# Merge both sources here so the providers card matches the picker
# exactly. Static entries remain the offline fallback when live
# discovery and the local Codex cache are both unavailable. (#1807
# follow-up to v0.51.19 #1812.)
if pid == "openai-codex":
live_ids = _read_live_provider_model_ids("openai-codex")
for mid in _read_visible_codex_cache_model_ids():
if mid not in live_ids:
live_ids.append(mid)
live_models = _models_from_live_provider_ids(pid, live_ids)
if live_models:
models = live_models
models_total = len(models)
# Nous Portal: prefer the live catalog so the providers card matches
# the dropdown picker (#1538). Same fallback shape as the static-only
# case below — when hermes_cli is unavailable or its lookup raises,
# we keep the four-entry curated list.
#
# On large-tier accounts (#1567 reporter Deor saw 396 entries), we
# render the same featured subset the picker uses so the providers
# card body doesn't become a 396-pill wall. The full count is still
# reported via models_total — surfaced in the header line as
# "396 models · OAuth" by static/panels.js — so the user knows the
# complete catalog is reachable (via /model autocomplete or a future
# "show all" disclosure if added).
if pid == "nous":
try:
from hermes_cli.models import provider_model_ids as _provider_model_ids
live_ids = _provider_model_ids("nous") or []
if live_ids:
# Lazy-import to avoid circular dep with api.config.
from api.config import _format_nous_label, _build_nous_featured_set
featured_ids, _extras = _build_nous_featured_set(live_ids)
models = [
{"id": f"@nous:{mid}", "label": _format_nous_label(mid)}
for mid in featured_ids
]
models_total = len(live_ids)
except Exception:
logger.debug("Failed to load Nous Portal models from hermes_cli")
# LM Studio: fetch live locally-loaded models so the providers card
# matches what's actually available on the user's server (#WebUI).
if pid == "lmstudio":
try:
from hermes_cli.models import provider_model_ids as _pmi
lm_live = _pmi("lmstudio") or []
if lm_live:
models = [{"id": mid, "label": mid} for mid in lm_live]
models_total = len(models)
except Exception:
logger.debug("Failed to load LM Studio models from hermes_cli")
# Also include models from config.yaml providers section
if isinstance(providers_cfg, dict):
provider_cfg = providers_cfg.get(pid, {})
@@ -401,6 +999,13 @@ def get_providers() -> dict[str, Any]:
models = models + [{"id": k, "label": k} for k in cfg_models.keys()]
elif isinstance(cfg_models, list):
models = models + [{"id": k, "label": k} for k in cfg_models]
# Recompute models_total when config.yaml contributes additional
# entries on top of the live/static catalog. For non-Nous
# providers models_total still equals len(models); for Nous
# we keep the live count (which already includes any models
# surfaced in the curated featured slice).
if pid != "nous":
models_total = len(models)
providers.append({
"id": pid,
@@ -411,6 +1016,14 @@ def get_providers() -> dict[str, Any]:
"key_source": key_source,
"auth_error": auth_error,
"models": models,
# models_total reflects the complete catalog size (e.g. 396 for
# an enterprise Nous Portal account), even when "models" is
# trimmed to a featured subset for UI scannability. The frontend
# uses this for the header text "396 models · OAuth" so users
# know the full catalog exists and is reachable via the slash
# command. For providers that don't trim, models_total ==
# len(models) and the frontend behaves identically to before.
"models_total": models_total,
})
# Scan custom_providers from config.yaml (e.g. glmcode, timicc)
@@ -548,7 +1161,13 @@ def _clean_provider_key_from_config(provider_id: str) -> None:
from api.config import _cfg_lock
try:
config_path = _get_config_path()
# Resolve through api.config at call time instead of the function imported
# at module load. Several tests (and some profile flows) monkeypatch the
# config module's path resolver after api.providers has already been
# imported; using the stale imported reference can clean the wrong
# config.yaml.
import api.config as _config
config_path = _config._get_config_path()
except Exception:
return
+160
View File
@@ -0,0 +1,160 @@
"""Slow request diagnostics for latency-sensitive browser API paths."""
from __future__ import annotations
import json
import logging
import os
import sys
import threading
import time
import traceback
import uuid
from typing import Any
DEFAULT_SLOW_REQUEST_SECONDS = 5.0
MAX_STACK_FRAMES_PER_THREAD = 40
def _slow_request_seconds() -> float:
raw = os.getenv("HERMES_WEBUI_SLOW_REQUEST_SECONDS", "").strip()
if not raw:
return DEFAULT_SLOW_REQUEST_SECONDS
try:
value = float(raw)
except ValueError:
return DEFAULT_SLOW_REQUEST_SECONDS
return max(0.0, value)
class RequestDiagnostics:
"""Track request stages and emit a watchdog record if a request wedges."""
def __init__(
self,
method: str,
path: str,
*,
logger: logging.Logger | None = None,
timeout_seconds: float | None = None,
auto_start: bool = True,
) -> None:
self.request_id = uuid.uuid4().hex[:10]
self.method = str(method or "-")
self.path = str(path or "-").split("?", 1)[0]
self.logger = logger or logging.getLogger(__name__)
self.timeout_seconds = _slow_request_seconds() if timeout_seconds is None else max(0.0, float(timeout_seconds))
self.started_monotonic = time.monotonic()
self.started_wall = time.time()
self._lock = threading.Lock()
self._stages: list[dict[str, Any]] = []
self._current_stage = "start"
self._current_stage_started = self.started_monotonic
self._finished = False
self._watchdog_logged = False
self._timer: threading.Timer | None = None
if auto_start and self.timeout_seconds > 0:
self._timer = threading.Timer(self.timeout_seconds, self._on_timeout)
self._timer.daemon = True
self._timer.start()
@classmethod
def maybe_start(
cls,
method: str,
path: str,
*,
logger: logging.Logger | None = None,
) -> "RequestDiagnostics | None":
clean_path = str(path or "").split("?", 1)[0]
if (method.upper(), clean_path) not in {
("GET", "/api/sessions"),
("POST", "/api/chat/start"),
}:
return None
return cls(method, clean_path, logger=logger)
def stage(self, name: str) -> None:
now = time.monotonic()
clean = str(name or "unknown").strip() or "unknown"
with self._lock:
if self._finished:
return
self._stages.append(
{
"name": self._current_stage,
"ms": round((now - self._current_stage_started) * 1000, 1),
}
)
self._current_stage = clean
self._current_stage_started = now
def finish(self) -> None:
timer = None
record = None
with self._lock:
if self._finished:
return
self._finished = True
timer = self._timer
record = self._build_record_locked(include_stacks=False)
if timer is not None:
timer.cancel()
if record and self.timeout_seconds > 0 and record["elapsed_ms"] >= self.timeout_seconds * 1000:
self.logger.warning(
"Slow WebUI request completed: %s",
json.dumps(record, sort_keys=True),
)
def _on_timeout(self) -> None:
with self._lock:
if self._finished or self._watchdog_logged:
return
self._watchdog_logged = True
record = self._build_record_locked(include_stacks=True)
self.logger.warning(
"Slow WebUI request still running: %s",
json.dumps(record, sort_keys=True),
)
def _build_record_locked(self, *, include_stacks: bool) -> dict[str, Any]:
now = time.monotonic()
stages = list(self._stages)
stages.append(
{
"name": self._current_stage,
"ms": round((now - self._current_stage_started) * 1000, 1),
}
)
record: dict[str, Any] = {
"request_id": self.request_id,
"method": self.method,
"path": self.path,
"started_at": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime(self.started_wall)),
"elapsed_ms": round((now - self.started_monotonic) * 1000, 1),
"current_stage": self._current_stage,
"stages": stages,
}
if include_stacks:
record["thread_stacks"] = _thread_stack_snapshot()
return record
def _thread_stack_snapshot() -> list[dict[str, Any]]:
frames = sys._current_frames()
threads = {thread.ident: thread for thread in threading.enumerate()}
snapshot: list[dict[str, Any]] = []
for ident, frame in frames.items():
thread = threads.get(ident)
stack = traceback.format_stack(frame, limit=MAX_STACK_FRAMES_PER_THREAD)
snapshot.append(
{
"thread_id": ident,
"thread_name": thread.name if thread else "",
"daemon": bool(thread.daemon) if thread else None,
"stack": [line.rstrip() for line in stack],
}
)
snapshot.sort(key=lambda item: str(item.get("thread_name") or ""))
return snapshot
+4144 -288
View File
File diff suppressed because it is too large Load Diff
+593
View File
@@ -0,0 +1,593 @@
"""
Session recovery from .bak snapshots last line of defense against
data-loss bugs like #1558.
``Session.save()`` writes a ``<sid>.json.bak`` snapshot of the previous
state whenever an incoming save would shrink the messages array. This
module reads those snapshots back and restores any session whose live
file has fewer messages than its backup, or whose live file is missing
while a valid backup remains.
Three integration points:
1. ``recover_all_sessions_on_startup()`` called from server.py at boot,
scans the session dir, restores any session whose JSON has fewer
messages than its .bak, and recreates a missing ``<sid>.json`` from an
orphaned ``<sid>.json.bak`` when the canonical state DB still has that
session. Idempotent: a clean run is a no-op.
2. ``recover_session(sid)`` single-session helper backing the
``POST /api/session/recover`` endpoint, so users can re-run recovery
manually if their session was open through a server restart.
3. ``inspect_session_recovery_status(sid)`` read-only audit returning
message counts for the live JSON, the .bak, and a recommendation.
"""
from __future__ import annotations
import argparse
import json
import logging
import os
import shutil
import sqlite3
import threading
from pathlib import Path
logger = logging.getLogger(__name__)
def _msg_count(p: Path) -> int:
"""Return the number of messages in a session JSON file, or -1 on read/parse error.
Returns -1 for any non-session-shape file:
- File can't be read (OSError)
- Top-level isn't valid JSON or is invalid (JSONDecodeError, ValueError)
- Top-level isn't a dict (AttributeError on .get) — e.g. ``_index.json``
which is a top-level list of session metadata, not a session itself.
The startup recovery scanner globs ``*.json`` and would otherwise
crash on the first non-dict file it encounters.
"""
try:
data = json.loads(p.read_text(encoding='utf-8'))
except (OSError, json.JSONDecodeError, ValueError):
return -1
if not isinstance(data, dict):
return -1
msgs = data.get('messages')
return len(msgs) if isinstance(msgs, list) else -1
def inspect_session_recovery_status(session_path: Path) -> dict:
"""Return a status dict describing whether recovery is recommended.
{
"session_id": "...",
"live_messages": int, # -1 if live file unreadable
"bak_messages": int, # -1 if no .bak or unreadable
"recommend": "restore" | "no_action" | "no_backup",
}
"""
bak_path = session_path.with_suffix('.json.bak')
live_count = _msg_count(session_path)
if not bak_path.exists():
return {
"session_id": session_path.stem,
"live_messages": live_count,
"bak_messages": -1,
"recommend": "no_backup",
}
bak_count = _msg_count(bak_path)
if bak_count > live_count:
return {
"session_id": session_path.stem,
"live_messages": live_count,
"bak_messages": bak_count,
"recommend": "restore",
}
return {
"session_id": session_path.stem,
"live_messages": live_count,
"bak_messages": bak_count,
"recommend": "no_action",
}
def recover_session(session_path: Path) -> dict:
"""Restore session_path from its .bak when the bak has more messages.
Returns a status dict identical to ``inspect_session_recovery_status``
plus a "restored" boolean.
"""
status = inspect_session_recovery_status(session_path)
if status["recommend"] != "restore":
return {**status, "restored": False}
bak_path = session_path.with_suffix('.json.bak')
# Stage the recovery via a tmp copy + atomic replace so a crash mid-restore
# cannot leave a half-written session.json.
tmp_path = session_path.with_suffix('.json.recover.tmp')
try:
shutil.copyfile(bak_path, tmp_path)
tmp_path.replace(session_path)
except OSError as exc:
logger.warning("recover_session: copy failed for %s: %s", session_path, exc)
try:
tmp_path.unlink(missing_ok=True)
except OSError:
pass
return {**status, "restored": False, "error": str(exc)}
logger.warning(
"recover_session: restored %s from .bak (live=%d → bak=%d messages). "
"See #1558 for the data-loss class this guards against.",
session_path.name, status["live_messages"], status["bak_messages"],
)
return {**status, "restored": True}
def _state_db_has_session(session_id: str, state_db_path: Path | None) -> bool:
"""Return whether state.db still knows this session.
The check is deliberately fail-open: recovery must not be prevented by a
locked, absent, or older-schema state DB. When a DB is readable and has no
row, treat the orphan backup as a tombstoned/deleted session and skip it.
"""
if state_db_path is None or not state_db_path.exists():
return True
try:
with sqlite3.connect(f"file:{state_db_path}?mode=ro", uri=True) as conn:
cur = conn.execute(
"select 1 from sqlite_master where type='table' and name='sessions'"
)
if cur.fetchone() is None:
return True
cur = conn.execute("select 1 from sessions where id = ? limit 1", (session_id,))
return cur.fetchone() is not None
except Exception as exc:
logger.debug("state_db session tombstone check failed for %s: %s", session_id, exc)
return True
def _orphaned_backup_live_paths(
session_dir: Path,
state_db_path: Path | None = None,
) -> list[Path]:
"""Return live ``<sid>.json`` paths whose ``<sid>.json.bak`` exists.
``Path.glob('*.json')`` does not see orphan backups because their suffix is
``.bak``. Existing startup recovery only handled shrunken live files; this
helper covers the crash shape where the live sidecar is gone but the rescue
copy remains.
"""
paths: list[Path] = []
for bak_path in sorted(session_dir.glob('*.json.bak')):
live_path = bak_path.with_suffix('')
if live_path.name.startswith('_') or live_path.exists():
continue
if _msg_count(bak_path) < 0:
continue
session_id = live_path.stem
if not _state_db_has_session(session_id, state_db_path):
logger.info(
"recover_all_sessions_on_startup: skipped orphan backup %s; "
"state.db has no live session row",
bak_path.name,
)
continue
paths.append(live_path)
return paths
def _read_state_db_missing_sidecar_rows(session_dir: Path, state_db_path: Path | None) -> list[dict]:
"""Return WebUI-origin state.db rows whose JSON sidecar is missing."""
if state_db_path is None or not state_db_path.exists():
return []
try:
with sqlite3.connect(f"file:{state_db_path}?mode=ro", uri=True) as conn:
conn.row_factory = sqlite3.Row
session_cols = {row[1] for row in conn.execute("PRAGMA table_info(sessions)").fetchall()}
message_cols = {row[1] for row in conn.execute("PRAGMA table_info(messages)").fetchall()}
if not {'id', 'source'}.issubset(session_cols):
return []
title_expr = _sql_optional_col('title', session_cols)
model_expr = _sql_optional_col('model', session_cols)
started_expr = _sql_optional_col('started_at', session_cols, '0')
parent_expr = _sql_optional_col('parent_session_id', session_cols)
msg_count_expr = _sql_optional_col('message_count', session_cols, '0')
workspace_expr = _sql_optional_col('workspace', session_cols)
worktree_path_expr = _sql_optional_col('worktree_path', session_cols)
worktree_branch_expr = _sql_optional_col('worktree_branch', session_cols)
worktree_repo_root_expr = _sql_optional_col('worktree_repo_root', session_cols)
worktree_created_at_expr = _sql_optional_col('worktree_created_at', session_cols)
rows = []
for row in conn.execute(
f"""
SELECT id, source, {title_expr}, {model_expr}, {started_expr},
{parent_expr}, {msg_count_expr}, {workspace_expr},
{worktree_path_expr}, {worktree_branch_expr},
{worktree_repo_root_expr}, {worktree_created_at_expr}
FROM sessions
WHERE source = 'webui'
ORDER BY COALESCE(started_at, 0) DESC
"""
).fetchall():
data = dict(row)
sid = str(data.get('id') or '').strip()
if not sid or (session_dir / f"{sid}.json").exists():
continue
message_rows: list[dict] = []
if {'session_id', 'role', 'content'}.issubset(message_cols):
order = "timestamp, id" if 'timestamp' in message_cols and 'id' in message_cols else "rowid"
ts_expr = 'timestamp' if 'timestamp' in message_cols else 'NULL AS timestamp'
for msg in conn.execute(
f"SELECT role, content, {ts_expr} FROM messages WHERE session_id = ? ORDER BY {order}",
(sid,),
).fetchall():
message = {
'role': msg['role'],
'content': msg['content'] or '',
}
if msg['timestamp'] is not None:
message['timestamp'] = msg['timestamp']
message_rows.append(message)
if not message_rows:
continue
data['messages'] = message_rows
rows.append(data)
return rows
except Exception as exc:
logger.debug("state_db sidecar reconciliation scan failed for %s: %s", state_db_path, exc)
return []
def _sql_optional_col(name: str, columns: set[str], fallback: str = "NULL") -> str:
return name if name in columns else f"{fallback} AS {name}"
def _state_db_row_to_sidecar(row: dict) -> dict:
try:
from api.agent_sessions import normalize_agent_session_source
except Exception:
normalize_agent_session_source = None
source = str(row.get('source') or '').strip().lower()
source_meta = normalize_agent_session_source(source) if normalize_agent_session_source else {
'raw_source': source or None,
'session_source': source or None,
'source_label': source.title() if source else None,
}
started_at = row.get('started_at') or 0
messages = row.get('messages') if isinstance(row.get('messages'), list) else []
last_ts = messages[-1].get('timestamp') if messages and isinstance(messages[-1], dict) else started_at
workspace_value = row.get('workspace') or ''
return {
'session_id': row.get('id'),
'title': row.get('title') or 'Recovered WebUI Session',
'workspace': workspace_value if isinstance(workspace_value, str) else '',
'message_count': row.get('message_count') if isinstance(row.get('message_count'), int) else len(messages),
'worktree_path': row.get('worktree_path') or None,
'worktree_branch': row.get('worktree_branch') or None,
'worktree_repo_root': row.get('worktree_repo_root') or None,
'worktree_created_at': row.get('worktree_created_at') or None,
'model': row.get('model') or 'unknown',
'model_provider': None,
'created_at': started_at,
'updated_at': last_ts or started_at,
'pinned': False,
'archived': False,
'project_id': None,
'profile': None,
'input_tokens': 0,
'output_tokens': 0,
'estimated_cost': None,
'personality': None,
'active_stream_id': None,
'pending_user_message': None,
'pending_attachments': [],
'pending_started_at': None,
'compression_anchor_visible_idx': None,
'compression_anchor_message_key': None,
'compression_anchor_summary': None,
'context_length': None,
'threshold_tokens': None,
'last_prompt_tokens': None,
'gateway_routing': None,
'gateway_routing_history': [],
'llm_title_generated': False,
'parent_session_id': row.get('parent_session_id'),
'is_cli_session': False,
'source_tag': source or None,
**source_meta,
'enabled_toolsets': None,
'composer_draft': {},
'messages': messages,
'tool_calls': [],
'_recovered_from_state_db': True,
}
def recover_missing_sidecars_from_state_db(session_dir: Path, state_db_path: Path | None) -> dict:
"""Materialize missing WebUI JSON sidecars from canonical state.db rows."""
rows = _read_state_db_missing_sidecar_rows(session_dir, state_db_path)
materialized = 0
details: list[dict] = []
session_dir.mkdir(parents=True, exist_ok=True)
for row in rows:
sid = str(row.get('id') or '').strip()
if not sid:
continue
target = session_dir / f"{sid}.json"
if target.exists():
continue
payload = _state_db_row_to_sidecar(row)
# Per-process/per-thread tmp suffix to avoid corruption under
# concurrent reconciliation calls (matches api/models.py:484
# Session.save() convention).
tmp_suffix = f".json.reconcile.tmp.{os.getpid()}.{threading.current_thread().ident}"
tmp = target.with_suffix(tmp_suffix)
try:
tmp.write_text(json.dumps(payload, ensure_ascii=False, indent=2), encoding='utf-8')
except OSError as exc:
try:
tmp.unlink(missing_ok=True)
except OSError:
pass
details.append({'session_id': sid, 'materialized': False, 'error': str(exc)})
continue
# Atomic create-or-fail: os.link() refuses to overwrite an existing
# target. Closes the TOCTOU window between the target.exists() check
# above and the rename — a concurrent Session.save() for the same SID
# will win and we silently skip rather than overwrite a live sidecar.
materialized_now = False
try:
os.link(str(tmp), str(target))
materialized_now = True
except FileExistsError:
# Live sidecar appeared between the check and the link — keep it.
pass
except OSError as exc:
details.append({'session_id': sid, 'materialized': False, 'error': str(exc)})
finally:
try:
tmp.unlink(missing_ok=True)
except OSError:
pass
if materialized_now:
materialized += 1
details.append({'session_id': sid, 'materialized': True, 'messages': len(payload.get('messages') or [])})
elif not any(d.get('session_id') == sid for d in details[-1:]):
details.append({'session_id': sid, 'materialized': False, 'skipped': 'sidecar_appeared_during_reconcile'})
return {'scanned': len(rows), 'materialized': materialized, 'details': details}
def _new_audit_item(
session_id: str,
kind: str,
category: str,
recommendation: str,
live_messages: int = -1,
bak_messages: int = -1,
) -> dict:
return {
"session_id": session_id,
"kind": kind,
"category": category,
"recommendation": recommendation,
"live_messages": live_messages,
"bak_messages": bak_messages,
}
def _read_index_session_ids(index_path: Path) -> set[str]:
try:
data = json.loads(index_path.read_text(encoding='utf-8'))
except (OSError, json.JSONDecodeError, ValueError):
return set()
if not isinstance(data, list):
return set()
ids: set[str] = set()
for entry in data:
if isinstance(entry, dict) and isinstance(entry.get('session_id'), str):
ids.add(entry['session_id'])
return ids
def audit_session_recovery(session_dir: Path, state_db_path: Path | None = None) -> dict:
"""Read-only audit of session recovery state.
The audit intentionally does not mutate files. It classifies only the safe
recovery primitives this module knows how to perform: backup restores and
derived index rebuilds. Call ``recover_all_sessions_on_startup`` separately
for safe repairs.
"""
if not session_dir.exists():
return {
"status": "ok",
"summary": {"ok": 0, "repairable": 0, "unsafe_to_repair": 0},
"items": [],
}
items: list[dict] = []
live_paths = sorted(p for p in session_dir.glob('*.json') if not p.name.startswith('_'))
live_ids = {p.stem for p in live_paths}
for live_path in live_paths:
status = inspect_session_recovery_status(live_path)
if status.get('recommend') == 'restore':
items.append(_new_audit_item(
status['session_id'],
"shrunken_live",
"repairable",
"restore_from_bak",
status.get('live_messages', -1),
status.get('bak_messages', -1),
))
for bak_path in sorted(session_dir.glob('*.json.bak')):
live_path = bak_path.with_suffix('')
if live_path.exists() or live_path.name.startswith('_'):
continue
bak_messages = _msg_count(bak_path)
session_id = live_path.stem
if bak_messages < 0:
items.append(_new_audit_item(
session_id, "malformed_orphan_backup", "unsafe_to_repair", "manual_review", -1, bak_messages
))
elif _state_db_has_session(session_id, state_db_path):
items.append(_new_audit_item(
session_id, "orphan_backup", "repairable", "restore_from_bak", -1, bak_messages
))
else:
items.append(_new_audit_item(
session_id,
"orphan_backup_without_state_row",
"unsafe_to_repair",
"manual_review",
-1,
bak_messages,
))
index_path = session_dir / '_index.json'
if index_path.exists():
index_ids = _read_index_session_ids(index_path)
for session_id in sorted(index_ids - live_ids):
items.append(_new_audit_item(
session_id, "index_missing_file", "repairable", "rebuild_index"
))
for session_id in sorted(live_ids - index_ids):
items.append(_new_audit_item(
session_id, "index_missing_entry", "repairable", "rebuild_index",
_msg_count(session_dir / f"{session_id}.json"), -1,
))
for row in _read_state_db_missing_sidecar_rows(session_dir, state_db_path):
sid = str(row.get('id') or '')
items.append(_new_audit_item(
sid,
"state_db_missing_sidecar",
"repairable",
"materialize_from_state_db",
-1,
-1,
))
summary = {"ok": len(live_paths), "repairable": 0, "unsafe_to_repair": 0}
for item in items:
category = item.get('category')
if category in summary:
summary[category] += 1
if summary["unsafe_to_repair"]:
overall = "needs_manual_review"
elif summary["repairable"]:
overall = "warn"
else:
overall = "ok"
return {"status": overall, "summary": summary, "items": items}
def repair_safe_session_recovery(session_dir: Path, state_db_path: Path | None = None) -> dict:
"""Run safe, deterministic session recovery repairs.
This mutates only repairable classes already handled by startup recovery:
shrunken live sidecars and orphan backups that are not tombstoned by a
readable state.db. Unsafe audit findings remain for manual review.
"""
before = audit_session_recovery(session_dir, state_db_path=state_db_path)
backup_repair = recover_all_sessions_on_startup(
session_dir,
rebuild_index=True,
state_db_path=state_db_path,
)
sidecar_repair = recover_missing_sidecars_from_state_db(session_dir, state_db_path)
if sidecar_repair.get('materialized'):
try:
from api.models import _write_session_index
_write_session_index(updates=None)
except Exception as exc:
logger.warning("repair_safe_session_recovery: index rebuild after state.db reconciliation failed: %s", exc)
after = audit_session_recovery(session_dir, state_db_path=state_db_path)
unsafe_remaining = int((after.get("summary") or {}).get("unsafe_to_repair") or 0)
repairable_remaining = int((after.get("summary") or {}).get("repairable") or 0)
return {
"ok": unsafe_remaining == 0 and repairable_remaining == 0,
"repaired": int(backup_repair.get("restored") or 0) + int(sidecar_repair.get("materialized") or 0),
"before": before,
"backup_repair": backup_repair,
"sidecar_repair": sidecar_repair,
"after": after,
}
def recover_all_sessions_on_startup(
session_dir: Path,
rebuild_index: bool = False,
state_db_path: Path | None = None,
) -> dict:
"""Scan session_dir for shrunken/orphaned sessions and restore from .bak.
Returns {"scanned": N, "restored": M, "orphaned_backups": K, "details": [...]}.
"""
if not session_dir.exists():
return {"scanned": 0, "restored": 0, "orphaned_backups": 0, "details": []}
scanned = 0
restored = 0
details: list[dict] = []
live_paths = [path for path in sorted(session_dir.glob('*.json')) if not path.name.startswith('_')]
orphan_paths = _orphaned_backup_live_paths(session_dir, state_db_path=state_db_path)
for path in [*live_paths, *orphan_paths]:
# Skip non-session JSON files in the same dir:
# - ``_index.json`` is a top-level list of session metadata
# - any future non-session JSON marked with the ``_`` convention is
# skipped automatically (project convention for system files in
# directories that otherwise hold user data)
scanned += 1
try:
result = recover_session(path)
except Exception as exc:
# Defensive: a malformed session file shouldn't break recovery
# for the rest. Log and continue.
logger.warning(
"recover_all_sessions_on_startup: skipped %s due to %s: %s",
path.name, type(exc).__name__, exc,
)
continue
if result.get("restored"):
restored += 1
details.append(result)
if restored:
logger.warning(
"recover_all_sessions_on_startup: restored %d/%d sessions from .bak. "
"If you weren't expecting this, check the session list for missing "
"messages — see #1558.", restored, scanned,
)
if rebuild_index:
try:
from api.models import _write_session_index
_write_session_index(updates=None)
except Exception as exc:
logger.warning("recover_all_sessions_on_startup: index rebuild failed: %s", exc)
return {
"scanned": scanned,
"restored": restored,
"orphaned_backups": len(orphan_paths),
"details": details,
}
def _main() -> int:
parser = argparse.ArgumentParser(description="Audit Hermes WebUI session recovery state")
parser.add_argument("--audit", action="store_true", help="run a read-only recovery audit")
parser.add_argument("--session-dir", type=Path, required=True, help="path to WebUI sessions directory")
parser.add_argument("--state-db", type=Path, default=None, help="optional Hermes state.db path")
parser.add_argument("--repair-safe", action="store_true", help="run safe deterministic repairs after auditing")
args = parser.parse_args()
if args.repair_safe:
report = repair_safe_session_recovery(args.session_dir, state_db_path=args.state_db)
elif args.audit:
report = audit_session_recovery(args.session_dir, state_db_path=args.state_db)
else:
parser.error("choose --audit or --repair-safe")
print(json.dumps(report, sort_keys=True))
return 0
if __name__ == "__main__":
raise SystemExit(_main())
+1577 -145
View File
File diff suppressed because it is too large Load Diff
+167
View File
@@ -0,0 +1,167 @@
"""Safe aggregate host resource metrics for the WebUI VPS panel (#693).
The browser only needs coarse CPU/RAM/disk usage. Keep this module intentionally
small and dependency-free: no process lists, command strings, user identities,
environment variables, or filesystem topology leave the server.
"""
from __future__ import annotations
import shutil
import time
from datetime import datetime, timezone
from pathlib import Path
from typing import Any
_PROC_STAT = Path("/proc/stat")
_PROC_MEMINFO = Path("/proc/meminfo")
_CPU_SAMPLE_SECONDS = 0.05
def _checked_at() -> str:
return datetime.now(timezone.utc).isoformat()
def _clamp_percent(value: Any) -> float:
try:
numeric = float(value)
except (TypeError, ValueError):
return 0.0
if numeric < 0:
numeric = 0.0
if numeric > 100:
numeric = 100.0
return round(numeric, 1)
def _read_proc_stat_cpu() -> tuple[int, int]:
"""Return (idle_ticks, total_ticks) from Linux /proc/stat."""
with _PROC_STAT.open("r", encoding="utf-8") as handle:
first = handle.readline().strip().split()
if not first or first[0] != "cpu":
raise RuntimeError("proc_stat_unavailable")
values = [int(part) for part in first[1:]]
if len(values) < 4:
raise RuntimeError("proc_stat_unavailable")
idle = values[3] + (values[4] if len(values) > 4 else 0)
total = sum(values)
if total <= 0:
raise RuntimeError("proc_stat_unavailable")
return idle, total
def _cpu_delta_percent(start: tuple[int, int], end: tuple[int, int]) -> float:
idle_delta = end[0] - start[0]
total_delta = end[1] - start[1]
if total_delta <= 0:
return 0.0
busy_delta = max(0, total_delta - max(0, idle_delta))
return _clamp_percent((busy_delta / total_delta) * 100.0)
def _cpu_percent() -> float:
"""Sample aggregate CPU usage without psutil.
A short local sample avoids storing cross-request state and returns a stable
percentage on the first poll. Unsupported platforms raise a safe error code.
"""
start = _read_proc_stat_cpu()
time.sleep(_CPU_SAMPLE_SECONDS)
end = _read_proc_stat_cpu()
return _cpu_delta_percent(start, end)
def _read_meminfo_kib() -> dict[str, int]:
data: dict[str, int] = {}
with _PROC_MEMINFO.open("r", encoding="utf-8") as handle:
for line in handle:
key, _, rest = line.partition(":")
if not key or not rest:
continue
parts = rest.strip().split()
if not parts:
continue
try:
data[key] = int(parts[0])
except ValueError:
continue
return data
def _memory_usage() -> dict[str, int | float]:
meminfo = _read_meminfo_kib()
total = int(meminfo.get("MemTotal") or 0) * 1024
if total <= 0:
raise RuntimeError("meminfo_unavailable")
available_kib = meminfo.get("MemAvailable")
if available_kib is None:
available_kib = (
meminfo.get("MemFree", 0)
+ meminfo.get("Buffers", 0)
+ meminfo.get("Cached", 0)
+ meminfo.get("SReclaimable", 0)
- meminfo.get("Shmem", 0)
)
available = max(0, int(available_kib) * 1024)
used = max(0, min(total, total - available))
return {
"used_bytes": used,
"total_bytes": total,
"percent": _clamp_percent((used / total) * 100.0),
}
def _disk_usage() -> dict[str, int | float]:
usage = shutil.disk_usage("/")
total = int(usage.total)
if total <= 0:
raise RuntimeError("disk_unavailable")
used = int(usage.used)
return {
"used_bytes": used,
"total_bytes": total,
"percent": _clamp_percent((used / total) * 100.0),
}
def _safe_error(metric: str, exc: Exception) -> dict[str, str]:
# Keep this intentionally coarse. Exception messages can contain local paths
# on unusual platforms; the browser only needs a safe unavailable reason.
return {"metric": metric, "code": type(exc).__name__}
def build_system_health_payload() -> dict[str, Any]:
metrics: dict[str, Any] = {"cpu": None, "memory": None, "disk": None}
errors: list[dict[str, str]] = []
collectors = {
"cpu": _cpu_percent,
"memory": _memory_usage,
"disk": _disk_usage,
}
for name, collect in collectors.items():
try:
value = collect()
if name == "cpu":
metrics[name] = {"percent": _clamp_percent(value)}
else:
metrics[name] = {
"used_bytes": max(0, int(value["used_bytes"])),
"total_bytes": max(0, int(value["total_bytes"])),
"percent": _clamp_percent(value["percent"]),
}
except Exception as exc:
errors.append(_safe_error(name, exc))
available = any(metrics[name] is not None for name in metrics)
status = "ok" if available and not errors else "partial" if available else "unavailable"
return {
"status": status,
"available": available,
"checked_at": _checked_at(),
"cpu": metrics["cpu"],
"memory": metrics["memory"],
"disk": metrics["disk"],
"errors": errors,
}
+117 -3
View File
@@ -13,7 +13,7 @@ import threading
import time
from pathlib import Path
from api.config import REPO_ROOT
from api.config import REPO_ROOT, STREAMS, STREAMS_LOCK
# Lazy -- may be None if agent not found
try:
@@ -28,6 +28,32 @@ _apply_lock = threading.Lock() # prevents concurrent stash/pull/pop on same re
CACHE_TTL = 1800 # 30 minutes
def _active_stream_count() -> int:
"""Return the current in-memory chat stream count.
Self-update schedules an in-process re-exec after git pull/reset. That is
restart-equivalent for live streams, even when systemd does not see a unit
restart. Refuse update/force-update while a stream exists so a browser
update click cannot recreate the pending-message loss class fixed in #1543.
"""
with STREAMS_LOCK:
return len(STREAMS)
def _restart_blocked_response(target: str, active_streams: int) -> dict:
plural = "s" if active_streams != 1 else ""
return {
'ok': False,
'message': (
f'Cannot update {target} while {active_streams} active chat stream{plural} '
'is running. Wait for the response to finish, then retry the update.'
),
'target': target,
'restart_blocked': True,
'active_streams': active_streams,
}
def _run_git(args, cwd, timeout=10):
"""Run a git command and return (useful output, ok).
@@ -91,8 +117,56 @@ def _detect_webui_version() -> str:
return 'unknown'
def _detect_agent_version() -> str:
"""Detect the running Hermes Agent version for UI display."""
if _AGENT_DIR is None:
return 'not detected'
version_file = Path(_AGENT_DIR) / "VERSION"
try:
if version_file.exists():
text = version_file.read_text(encoding='utf-8').strip()
if text:
return text
except Exception:
pass
# Fallback: infer from git describe when the checkout exists but no VERSION
# file is available (common in source checkouts and developer environments).
if not Path(_AGENT_DIR).exists():
return 'not detected'
# Symmetric with _detect_webui_version() above — `--dirty` flags a
# locally-modified checkout so operators can see when their agent has
# uncommitted changes vs a clean tag. Per Opus advisor on stage-293.
out, ok = _run_git(['describe', '--tags', '--always', '--dirty'], _AGENT_DIR, timeout=3)
if ok and out:
return out
return 'not detected'
# Resolved once at import time — tags cannot change without a process restart.
WEBUI_VERSION: str = _detect_webui_version()
AGENT_VERSION: str = _detect_agent_version()
def _normalize_remote_url(remote_url):
"""Return the browser-facing repository URL for update compare links.
Git remotes may be HTTPS or SSH and may include a literal ``.git`` suffix.
Strip only that literal suffix never use ``str.rstrip('.git')`` because it
treats the argument as a character set and can truncate ``hermes-webui`` to
``hermes-webu``.
"""
if not remote_url:
return remote_url
remote_url = remote_url.strip()
if remote_url.startswith('git@'):
remote_url = remote_url.replace(':', '/', 1).replace('git@', 'https://', 1)
remote_url = remote_url.rstrip('/')
if remote_url.endswith('.git'):
remote_url = remote_url[:-4]
return remote_url.rstrip('/')
def _split_remote_ref(ref):
@@ -146,16 +220,48 @@ def _check_repo(path, name):
out, ok = _run_git(['rev-list', '--count', f'HEAD..{compare_ref}'], path)
behind = int(out) if ok and out.isdigit() else 0
# Get short SHAs for display
current, _ = _run_git(['rev-parse', '--short', 'HEAD'], path)
# Get short SHAs for display.
#
# latest_sha = upstream tip (compare_ref). Always exists on github.com
# because it is literally the commit `git fetch` just pulled.
#
# current_sha is trickier. The intuitive choice — local HEAD — breaks
# the "What's new?" compare URL whenever HEAD is not a public commit:
# unpushed work, dirty stage branches, forks, in-flight rebases, or
# release-time merge commits whose SHA only lives in the maintainer's
# checkout. We saw exactly this in #1579: a banner reporting "17 updates"
# linked to /compare/<localHEAD>...<upstream> and 404'd because <localHEAD>
# was never pushed to the canonical repo.
#
# The right base is the merge-base between HEAD and the upstream ref —
# that's the most recent commit both sides agree on, and (because
# `git fetch` succeeded above) it is guaranteed to be present upstream.
# If a user is 17 commits behind with no local-only commits, merge-base
# equals local HEAD and the URL is identical to what we shipped before;
# if they ARE ahead with local-only commits, the URL still resolves to
# the public history they share with upstream. If merge-base fails for
# any reason (e.g. shallow clone where the bases diverge before the
# cutoff), fall back to None so the JS link guard suppresses the link
# rather than emitting a known-broken URL.
mb_full, mb_ok = _run_git(['merge-base', 'HEAD', compare_ref], path)
if mb_ok and mb_full:
short, ok = _run_git(['rev-parse', '--short', mb_full], path)
current = short if (ok and short) else None
else:
current = None
latest, _ = _run_git(['rev-parse', '--short', compare_ref], path)
# Get repo URL for "What's new?" link
remote_url, _ = _run_git(['remote', 'get-url', 'origin'], path)
remote_url = _normalize_remote_url(remote_url)
return {
'name': name,
'behind': behind,
'current_sha': current,
'latest_sha': latest,
'branch': compare_ref,
'repo_url': remote_url,
}
@@ -240,6 +346,10 @@ def apply_force_update(target: str) -> dict:
response with ``conflict: True`` or ``diverged: True`` and the user
has confirmed they want to discard local changes.
"""
active_streams = _active_stream_count()
if active_streams:
return _restart_blocked_response(target, active_streams)
if not _apply_lock.acquire(blocking=False):
return {'ok': False, 'message': 'Update already in progress'}
try:
@@ -290,6 +400,10 @@ def apply_force_update(target: str) -> dict:
def apply_update(target):
"""Stash, pull --ff-only, pop for the given target repo."""
active_streams = _active_stream_count()
if active_streams:
return _restart_blocked_response(target, active_streams)
if not _apply_lock.acquire(blocking=False):
return {'ok': False, 'message': 'Update already in progress'}
try:
+61 -12
View File
@@ -10,6 +10,7 @@ paths are used as fallback when no profile module is available.
import json
import logging
import os
import stat
import subprocess
import concurrent.futures
from pathlib import Path
@@ -92,7 +93,8 @@ def _profile_default_workspace() -> str:
def _clean_workspace_list(workspaces: list) -> list:
"""Sanitize a workspace list:
- Remove entries whose paths no longer exist on disk.
- Preserve saved paths even when they are currently missing or inaccessible;
picker state must not be destroyed by a transient stat/permission failure.
- Remove entries whose paths live inside another profile's directory
(e.g. ~/.hermes/profiles/X/... should not appear on a different profile).
- Rename any entry whose name is literally 'default' to 'Home' (avoids
@@ -104,10 +106,9 @@ def _clean_workspace_list(workspaces: list) -> list:
for w in workspaces:
path = w.get('path', '')
name = w.get('name', '')
p = Path(path).resolve() if path else Path('/')
# Skip paths that no longer exist
if not p.is_dir():
if not path:
continue
p = _safe_resolve(Path(path).expanduser())
# Skip paths inside a DIFFERENT profile's directory (cross-profile leak).
# Allow paths inside the CURRENT profile's own directory (e.g. test workspaces
# created under ~/.hermes/profiles/webui/webui-mvp-test/).
@@ -130,6 +131,32 @@ def _clean_workspace_list(workspaces: list) -> list:
return result
def _workspace_access_error(candidate: Path, *, missing_label: str = "Path does not exist") -> str | None:
"""Return a user-facing validation error for an unusable workspace path.
``Path.exists()`` can collapse permission/stat failures into a generic falsey
result on some Python/OS combinations, which produced misleading "does not
exist" messages for macOS/TCC-denied directories. Probe with ``stat()`` so
missing paths, non-directories, and permission-denied paths can be reported
separately.
"""
try:
st = candidate.stat()
except FileNotFoundError:
return f"{missing_label}: {candidate}"
except PermissionError as exc:
return (
f"Cannot access path: {candidate}. The server process could not inspect "
f"this directory ({exc}). On macOS, grant Full Disk Access or Files and "
f"Folders permission to the Hermes/WebUI app or server process, then try again."
)
except OSError as exc:
return f"Cannot access path: {candidate}. The server process could not inspect this path ({exc})."
if not stat.S_ISDIR(st.st_mode):
return f"Path is not a directory: {candidate}"
return None
def _migrate_global_workspaces() -> list:
"""Read the legacy global workspaces.json, clean it, and return the result.
@@ -517,10 +544,9 @@ def resolve_trusted_workspace(path: str | Path | None = None) -> Path:
candidate = Path(path).expanduser().resolve()
if not candidate.exists():
raise ValueError(f"Path does not exist: {candidate}")
if not candidate.is_dir():
raise ValueError(f"Path is not a directory: {candidate}")
access_error = _workspace_access_error(candidate)
if access_error:
raise ValueError(access_error)
# (A) Trusted if under the user's home directory — cross-platform via Path.home()
# Must be checked before system roots to allow symlinks like /var/home.
@@ -566,6 +592,25 @@ def resolve_trusted_workspace(path: str | Path | None = None) -> Path:
def _strip_surrounding_quotes(path: str) -> str:
"""Strip a single pair of surrounding single or double quotes from a path string.
macOS Finder's "Copy as Pathname" (Cmd+Option+C) returns paths wrapped in
single quotes, e.g. ``'/Users/x/Documents/foo'``. Other shells and OS file
managers do similar things with double quotes. Users routinely paste these
quoted strings into the Add Space input expecting them to "just work"
the only reason they didn't was a missing strip.
Only paired quotes are stripped (matching opener and closer). One-sided quotes
are preserved on the slim chance a path legitimately contains a literal quote
character.
"""
s = path.strip()
if len(s) >= 2 and s[0] == s[-1] and s[0] in ("'", '"'):
return s[1:-1]
return s
def validate_workspace_to_add(path: str) -> Path:
"""Validate a path for *adding* to the workspace list (less restrictive than resolve_trusted_workspace).
@@ -575,13 +620,17 @@ def validate_workspace_to_add(path: str) -> Path:
The stricter ``resolve_trusted_workspace`` is used when *using* an existing workspace
(file reads/writes) to prevent path traversal after the list is built.
Surrounding quotes (single or double) are stripped before validation
macOS Finder's "Copy as Pathname" wraps paths in single quotes by default,
and users routinely paste those into the Add Space input.
"""
path = _strip_surrounding_quotes(path)
candidate = Path(path).expanduser().resolve()
if not candidate.exists():
raise ValueError(f"Path does not exist: {candidate}")
if not candidate.is_dir():
raise ValueError(f"Path is not a directory: {candidate}")
access_error = _workspace_access_error(candidate)
if access_error:
raise ValueError(access_error)
# Home directory is always trusted regardless of where it lives on disk
# (e.g. /var/home/... on systemd-homed Fedora/RHEL).
+73
View File
@@ -0,0 +1,73 @@
"""Helpers for WebUI-managed Hermes Agent git worktrees."""
from __future__ import annotations
import subprocess
import time
from contextlib import redirect_stderr, redirect_stdout
from io import StringIO
from pathlib import Path
import logging
logger = logging.getLogger(__name__)
def find_git_repo_root(workspace: str | Path) -> Path:
"""Return the enclosing git repo root for *workspace*.
Use git itself instead of checking ``workspace/.git`` so nested workspaces
and linked git worktrees are both handled correctly.
"""
ws = Path(workspace).expanduser().resolve()
if not ws.is_dir():
raise ValueError("Workspace path does not exist or is not a directory")
try:
result = subprocess.run(
["git", "rev-parse", "--show-toplevel"],
cwd=ws,
text=True,
capture_output=True,
timeout=5,
check=False,
)
except (OSError, subprocess.TimeoutExpired) as exc:
raise ValueError("Workspace is not inside a git repository") from exc
if result.returncode != 0:
raise ValueError("Workspace is not inside a git repository")
root = result.stdout.strip()
if not root:
raise ValueError("Workspace is not inside a git repository")
return Path(root).expanduser().resolve()
def _setup_agent_worktree(repo_root: str) -> dict:
try:
import api.config # noqa: F401 # ensure Hermes Agent dir is on sys.path
from cli import _setup_worktree
except Exception as exc:
raise RuntimeError("Hermes Agent worktree helper is unavailable") from exc
output = StringIO()
with redirect_stdout(output), redirect_stderr(output):
info = _setup_worktree(repo_root)
emitted = output.getvalue().strip()
if emitted:
logger.debug("Hermes Agent worktree helper output: %s", emitted)
if not info:
raise RuntimeError("Hermes Agent failed to create a git worktree")
return info
def create_worktree_for_workspace(workspace: str | Path) -> dict:
repo_root = find_git_repo_root(workspace)
info = _setup_agent_worktree(str(repo_root))
path = info.get("path")
branch = info.get("branch")
if not path or not branch:
raise RuntimeError("Hermes Agent returned incomplete worktree metadata")
return {
"path": str(Path(path).expanduser().resolve()),
"branch": str(branch),
"repo_root": str(Path(info.get("repo_root") or repo_root).expanduser().resolve()),
"created_at": time.time(),
}
+52 -2
View File
@@ -90,6 +90,47 @@ def ensure_supported_platform() -> None:
)
def _agent_dir_from_hermes_cli() -> Path | None:
"""Resolve the agent install root by inspecting the `hermes` CLI shebang.
The Hermes Agent installer drops a `hermes` console-script in the user's
PATH whose shebang points at the agent's bundled venv:
#!/path/to/hermes-agent/venv/bin/python3
Walking up the parents until we find a directory that contains
`run_agent.py` recovers the install root regardless of where the user
chose to clone the agent (e.g. ~/Projects/GitHub/hermes-agent), which
the hard-coded candidate list in :func:`discover_agent_dir` cannot.
Last-resort only: this is invoked after every explicit candidate
(`HERMES_WEBUI_AGENT_DIR`, `$HERMES_HOME/hermes-agent`, etc.) has missed.
A stale clone in a known location still wins over the live `hermes` CLI
that's intentional, since the candidate list is treated as
authoritative when present, and matches existing behavior.
"""
hermes_path = shutil.which("hermes")
if not hermes_path:
return None
try:
with open(hermes_path, "r", encoding="utf-8", errors="replace") as f:
first_line = f.readline().strip()
except OSError:
return None
if not first_line.startswith("#!"):
return None
interp_field = first_line[2:].strip().split(None, 1)
if not interp_field:
return None
interp = Path(interp_field[0])
if not interp.is_absolute():
return None
for parent in interp.parents:
if (parent / "run_agent.py").exists():
return parent.resolve()
return None
def discover_agent_dir() -> Path | None:
home = Path(os.getenv("HERMES_HOME", str(Path.home() / ".hermes"))).expanduser()
candidates = [
@@ -105,7 +146,7 @@ def discover_agent_dir() -> Path | None:
candidate = Path(raw).expanduser().resolve()
if candidate.exists() and (candidate / "run_agent.py").exists():
return candidate
return None
return _agent_dir_from_hermes_cli()
def discover_launcher_python(agent_dir: Path | None) -> str:
@@ -179,7 +220,16 @@ def ensure_python_has_webui_deps(python_exe: str, agent_dir: Path | None = None)
)
if not venv_python.exists():
info(f"Creating local virtualenv at {venv_dir}")
venv.EnvBuilder(with_pip=True).create(venv_dir)
# symlinks=True: some Python builds (notably mise/asdf shared-library
# installs on macOS) default venv to copy mode. The copied binary still
# uses @executable_path/../lib/libpython3.X.dylib for its load command,
# so the venv binary aborts with SIGABRT on first import because the
# dylib never gets copied into .venv/lib. Symlinking the interpreter
# keeps @executable_path resolving back to the original install.
# CPython's venv falls back to copy mode automatically when symlink
# creation fails (e.g. older Windows without SeCreateSymbolicLinkPrivilege),
# so this is safe to set unconditionally.
venv.EnvBuilder(with_pip=True, symlinks=True).create(venv_dir)
info("Installing WebUI dependencies into local virtualenv")
subprocess.run(
Executable
+367
View File
@@ -0,0 +1,367 @@
#!/usr/bin/env bash
set -euo pipefail
REPO_ROOT="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
HERMES_HOME="${HERMES_HOME:-${HOME}/.hermes}"
PID_FILE="${HERMES_WEBUI_PID_FILE:-${HERMES_HOME}/webui.pid}"
LOG_FILE="${HERMES_WEBUI_LOG_FILE:-${HERMES_HOME}/webui.log}"
STATE_FILE="${HERMES_WEBUI_CTL_STATE_FILE:-${HERMES_HOME}/webui.ctl.env}"
DEFAULT_STATE_DIR="${HERMES_WEBUI_STATE_DIR:-${HERMES_HOME}/webui}"
usage() {
cat <<'EOF'
Usage: ./ctl.sh <command> [args]
Commands:
start [bootstrap args...] Start Hermes WebUI as a background daemon
stop Stop the daemon started by ctl.sh
restart [bootstrap args...] Stop, then start again
status Show daemon, host/port, log, and health status
logs [--lines N] [--follow|--no-follow]
Show the daemon log (defaults to tail -n 100 -f)
EOF
}
ensure_home() {
mkdir -p "${HERMES_HOME}" "${DEFAULT_STATE_DIR}"
}
_load_repo_dotenv_preserving_env() {
local env_file="${REPO_ROOT}/.env"
[[ -f "${env_file}" ]] || return 0
local -a preserved=()
local line key value
while IFS= read -r line || [[ -n "${line}" ]]; do
line="${line#${line%%[![:space:]]*}}"
[[ -z "${line}" || "${line}" == \#* || "${line}" != *=* ]] && continue
key="${line%%=*}"
key="${key#export }"
key="${key//[[:space:]]/}"
[[ "${key}" =~ ^[A-Za-z_][A-Za-z0-9_]*$ ]] || continue
if [[ -n "${!key+x}" ]]; then
value="${!key}"
preserved+=("${key}=${value}")
fi
done < "${env_file}"
set -a
# shellcheck source=/dev/null
source "${env_file}"
set +a
local assignment
for assignment in "${preserved[@]}"; do
export "${assignment}"
done
}
_find_python() {
if [[ -n "${HERMES_WEBUI_PYTHON:-}" ]]; then
printf '%s\n' "${HERMES_WEBUI_PYTHON}"
elif command -v python3 >/dev/null 2>&1; then
command -v python3
elif command -v python >/dev/null 2>&1; then
command -v python
else
echo "[ctl] Python 3 is required to run bootstrap.py" >&2
return 1
fi
}
_parse_launch_binding() {
CTL_HOST="${HERMES_WEBUI_HOST:-127.0.0.1}"
CTL_PORT="${HERMES_WEBUI_PORT:-8787}"
local arg next_is_host=0 saw_port=0
for arg in "$@"; do
if (( next_is_host )); then
CTL_HOST="${arg}"
next_is_host=0
continue
fi
case "${arg}" in
--host)
next_is_host=1
;;
--host=*)
CTL_HOST="${arg#--host=}"
;;
--*)
;;
*)
if (( ! saw_port )) && [[ "${arg}" =~ ^[0-9]+$ ]]; then
CTL_PORT="${arg}"
saw_port=1
fi
;;
esac
done
}
_build_bootstrap_args() {
CTL_BOOTSTRAP_ARGS=()
local arg next_is_host=0 saw_port=0
for arg in "$@"; do
if (( next_is_host )); then
next_is_host=0
continue
fi
case "${arg}" in
--host)
next_is_host=1
;;
--host=*)
;;
--*)
CTL_BOOTSTRAP_ARGS+=("${arg}")
;;
*)
if (( ! saw_port )) && [[ "${arg}" =~ ^[0-9]+$ ]]; then
saw_port=1
else
CTL_BOOTSTRAP_ARGS+=("${arg}")
fi
;;
esac
done
}
_write_state() {
local pid="$1" host="$2" port="$3"
local state_dir="${HERMES_WEBUI_STATE_DIR:-${DEFAULT_STATE_DIR}}"
{
printf 'PID=%q\n' "${pid}"
printf 'REPO_ROOT=%q\n' "${REPO_ROOT}"
printf 'HOST=%q\n' "${host}"
printf 'PORT=%q\n' "${port}"
printf 'LOG_FILE=%q\n' "${LOG_FILE}"
printf 'STATE_DIR=%q\n' "${state_dir}"
printf 'STARTED_AT=%q\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)"
} > "${STATE_FILE}"
}
_load_state_if_present() {
if [[ -f "${STATE_FILE}" ]]; then
# shellcheck source=/dev/null
source "${STATE_FILE}"
fi
}
_pid_from_file() {
[[ -f "${PID_FILE}" ]] || return 1
local pid
pid="$(tr -d '[:space:]' < "${PID_FILE}")"
[[ "${pid}" =~ ^[0-9]+$ ]] || return 1
printf '%s\n' "${pid}"
}
_is_alive() {
local pid="$1"
kill -0 "${pid}" >/dev/null 2>&1
}
_proc_args() {
local pid="$1"
ps -p "${pid}" -o args= 2>/dev/null || true
}
_is_owned_webui_pid() {
local pid="$1" args state_repo=""
[[ -f "${STATE_FILE}" ]] || return 1
_load_state_if_present
state_repo="${REPO_ROOT:-}"
[[ "${state_repo}" == "$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" ]] || return 1
args="$(_proc_args "${pid}")"
[[ -n "${args}" ]] || return 1
[[ "${args}" == *"${state_repo}/bootstrap.py"* || "${args}" == *"${state_repo}/server.py"* || "${args}" == *"${state_repo}/start.sh"* ]]
}
_current_pid() {
local pid
pid="$(_pid_from_file)" || return 1
if _is_alive "${pid}" && _is_owned_webui_pid "${pid}"; then
printf '%s\n' "${pid}"
return 0
fi
return 1
}
_clear_stale_pid() {
if [[ -f "${PID_FILE}" ]]; then
rm -f "${PID_FILE}" "${STATE_FILE}"
echo "[ctl] Removed stale PID file: ${PID_FILE}"
fi
}
start_cmd() {
ensure_home
_load_repo_dotenv_preserving_env
export HERMES_WEBUI_STATE_DIR="${HERMES_WEBUI_STATE_DIR:-${DEFAULT_STATE_DIR}}"
mkdir -p "${HERMES_WEBUI_STATE_DIR}"
_parse_launch_binding "$@"
_build_bootstrap_args "$@"
export HERMES_WEBUI_HOST="${CTL_HOST}"
export HERMES_WEBUI_PORT="${CTL_PORT}"
local existing_pid
if existing_pid="$(_current_pid 2>/dev/null)"; then
echo "[ctl] Hermes WebUI is already running (PID ${existing_pid})"
return 0
fi
_clear_stale_pid >/dev/null 2>&1 || true
local python_exe pid
python_exe="$(_find_python)"
: >> "${LOG_FILE}"
(
cd "${REPO_ROOT}"
exec "${python_exe}" "${REPO_ROOT}/bootstrap.py" --no-browser --foreground --host "${CTL_HOST}" "${CTL_PORT}" ${CTL_BOOTSTRAP_ARGS[@]+"${CTL_BOOTSTRAP_ARGS[@]}"}
) >> "${LOG_FILE}" 2>&1 &
pid=$!
printf '%s\n' "${pid}" > "${PID_FILE}"
_write_state "${pid}" "${CTL_HOST}" "${CTL_PORT}"
sleep 0.15
if ! _is_alive "${pid}"; then
echo "[ctl] Hermes WebUI failed to stay running. Log: ${LOG_FILE}" >&2
rm -f "${PID_FILE}" "${STATE_FILE}"
return 1
fi
echo "[ctl] Started Hermes WebUI (PID ${pid})"
echo "[ctl] Bound: ${CTL_HOST}:${CTL_PORT}"
echo "[ctl] Log: ${LOG_FILE}"
}
stop_cmd() {
ensure_home
local pid
if ! pid="$(_pid_from_file 2>/dev/null)"; then
echo "[ctl] Hermes WebUI is stopped"
rm -f "${PID_FILE}" "${STATE_FILE}"
return 0
fi
if ! _is_alive "${pid}" || ! _is_owned_webui_pid "${pid}"; then
_clear_stale_pid
return 0
fi
echo "[ctl] Stopping Hermes WebUI (PID ${pid})"
kill "${pid}" >/dev/null 2>&1 || true
local i
for i in {1..50}; do
if ! _is_alive "${pid}"; then
rm -f "${PID_FILE}" "${STATE_FILE}"
echo "[ctl] Stopped"
return 0
fi
sleep 0.1
done
echo "[ctl] Process did not exit after SIGTERM; sending SIGKILL" >&2
kill -KILL "${pid}" >/dev/null 2>&1 || true
rm -f "${PID_FILE}" "${STATE_FILE}"
}
_health_line() {
local host="$1" port="$2" url result
url="http://${host}:${port}/health"
if command -v curl >/dev/null 2>&1; then
if result="$(curl -fsS --max-time 2 "${url}" 2>/dev/null)"; then
if command -v python3 >/dev/null 2>&1; then
printf '%s' "${result}" | python3 -c 'import json,sys
try:
data=json.load(sys.stdin)
sessions=data.get("sessions", data.get("session_count", "?"))
active=data.get("active_streams", "?")
status=data.get("status", "ok")
print(f"ok ({sessions} sessions, {active} active streams)" if status == "ok" else status)
except Exception:
print("ok")'
else
echo "ok"
fi
else
echo "unreachable (${url})"
fi
else
echo "unknown (curl not found; ${url})"
fi
}
status_cmd() {
ensure_home
_load_state_if_present
local host="${HOST:-${HERMES_WEBUI_HOST:-127.0.0.1}}"
local port="${PORT:-${HERMES_WEBUI_PORT:-8787}}"
local log_path="${LOG_FILE}"
local pid uptime health
if pid="$(_current_pid 2>/dev/null)"; then
uptime="$(ps -p "${pid}" -o etime= 2>/dev/null | sed 's/^ *//' || true)"
health="$(_health_line "${host}" "${port}")"
echo "● hermes-webui — running"
echo " PID: ${pid}"
echo " Uptime: ${uptime:-unknown}"
echo " Bound: ${host}:${port}"
echo " Log: ${log_path}"
echo " Health: ${health}"
else
[[ -f "${PID_FILE}" ]] && _clear_stale_pid >/dev/null 2>&1 || true
echo "● hermes-webui — stopped"
echo " PID: -"
echo " Bound: ${host}:${port}"
echo " Log: ${log_path}"
echo " Health: not checked"
fi
}
logs_cmd() {
ensure_home
local lines=100 follow=1
while [[ $# -gt 0 ]]; do
case "$1" in
--lines)
shift
lines="${1:-}"
[[ "${lines}" =~ ^[0-9]+$ ]] || { echo "[ctl] --lines requires a number" >&2; return 2; }
;;
--lines=*)
lines="${1#--lines=}"
[[ "${lines}" =~ ^[0-9]+$ ]] || { echo "[ctl] --lines requires a number" >&2; return 2; }
;;
--follow|-f)
follow=1
;;
--no-follow)
follow=0
;;
*)
echo "[ctl] Unknown logs option: $1" >&2
return 2
;;
esac
shift
done
touch "${LOG_FILE}"
if (( follow )); then
tail -n "${lines}" -f "${LOG_FILE}"
else
tail -n "${lines}" "${LOG_FILE}"
fi
}
cmd="${1:-}"
if [[ $# -gt 0 ]]; then
shift
fi
case "${cmd}" in
start) start_cmd "$@" ;;
stop) stop_cmd ;;
restart) stop_cmd; start_cmd "$@" ;;
status) status_cmd ;;
logs) logs_cmd "$@" ;;
-h|--help|help|"") usage ;;
*) echo "[ctl] Unknown command: ${cmd}" >&2; usage >&2; exit 2 ;;
esac
+94 -46
View File
@@ -36,25 +36,25 @@ script_fullname=$0
echo " - script_fullname: ${script_fullname}"
ignore_value="VALUE_TO_IGNORE"
# everyone can read our files by default
umask 0022
# Keep init scratch files private to the container user that owns them.
umask 0077
# Write a world-writeable file (preferably inside /tmp -- ie within the container)
write_worldtmpfile() {
write_privtmpfile() {
tmpfile=$1
if [ -z "${tmpfile}" ]; then error_exit "write_worldfile: missing argument"; fi
if [ -f $tmpfile ]; then rm -f $tmpfile; fi
echo -n $2 > ${tmpfile}
chmod 777 ${tmpfile}
if [ -z "${tmpfile}" ]; then error_exit "write_privtmpfile: missing argument"; fi
if [ -f "$tmpfile" ]; then rm -f "$tmpfile"; fi
printf '%s' "$2" > "$tmpfile"
chmod 600 "$tmpfile"
}
itdir=/tmp/hermeswebui_init
if [ ! -d $itdir ]; then mkdir $itdir; chmod 777 $itdir; fi
if [ ! -d $itdir ]; then error_exit "Failed to create $itdir"; fi
if [ ! -d "$itdir" ]; then mkdir -p "$itdir"; fi
chmod 700 "$itdir" || error_exit "Failed to secure $itdir"
if [ ! -d "$itdir" ]; then error_exit "Failed to create $itdir"; fi
# Set user and group id
# logic: if not set and file exists, use file value, else use default. Create file for persistence when the container is re-run
# reasoning: needed when using docker compose as the file will exist in the stopped container, and changing the value from environment variables or configuration file must be propagated from hermeswebuitoo to hermeswebuitoo transition (those values are the only ones loaded before the environment variables dump file are loaded)
# reasoning: needed when using docker compose as the file will exist in the stopped container, and changing the value from environment variables or configuration file must be propagated from the root init phase to the hermeswebui runtime phase
it=$itdir/hermeswebui_user_uid
if [ -z "${WANTED_UID+x}" ]; then
if [ -f $it ]; then WANTED_UID=$(cat $it); fi
@@ -88,7 +88,7 @@ if [ -z "${WANTED_UID+x}" ] || [ "${WANTED_UID}" = "1024" ]; then
fi
fi
WANTED_UID=${WANTED_UID:-1024}
write_worldtmpfile $it "$WANTED_UID"
write_privtmpfile $it "$WANTED_UID"
echo "-- WANTED_UID: \"${WANTED_UID}\""
it=$itdir/hermeswebui_user_gid
@@ -120,7 +120,7 @@ if [ -z "${WANTED_GID+x}" ] || [ "${WANTED_GID}" = "1024" ]; then
fi
fi
WANTED_GID=${WANTED_GID:-1024}
write_worldtmpfile $it "$WANTED_GID"
write_privtmpfile $it "$WANTED_GID"
echo "-- WANTED_GID: \"${WANTED_GID}\""
echo "== Most Environment variables set"
@@ -180,27 +180,78 @@ load_env() {
fi
}
# hermeswebuitoo is a specfiic user not existing by default on ubuntu, we can check its whomai
if [ "A${whoami}" == "Ahermeswebuitoo" ]; then
echo "-- Running as hermeswebuitoo, will switch hermeswebui to the desired UID/GID"
# The script is started as hermeswebuitoo -- UID/GID 1025/1025
# The production image does not ship sudo. The entrypoint starts as root only
# long enough to align the hermeswebui UID/GID with mounted volumes, prepare
# root-owned paths, and then drop privileges for the server process.
if [ "A${whoami}" == "Aroot" ]; then
echo "-- Running as root for one-time container init; will switch to hermeswebui"
# We are altering the UID/GID of the hermeswebui user to the desired ones and restarting as that user
# using usermod for the already create hermeswebui user, knowing it is not already in use
# using usermod for the already created hermeswebui user, knowing it is not already in use
# per usermod manual: "You must make certain that the named user is not executing any processes when this command is being executed"
sudo groupmod -o -g ${WANTED_GID} hermeswebui || error_exit "Failed to set GID of hermeswebui user"
sudo usermod -o -u ${WANTED_UID} hermeswebui || error_exit "Failed to set UID of hermeswebui user"
sudo chown -R ${WANTED_UID}:${WANTED_GID} /home/hermeswebui || error_exit "Failed to set owner of /home/hermeswebui"
save_env /tmp/hermeswebuitoo_env.txt
# Guard for read-only root filesystem (podman with read_only=true, issue #1470).
_readonly_root=false
if ! sh -c 'test -w /etc/group && test -w /etc/passwd' 2>/dev/null; then
_readonly_root=true
echo " !! Detected read-only root filesystem — /etc/group or /etc/passwd is not writable"
fi
if [ "A${_readonly_root}" == "Atrue" ]; then
_current_hermeswebui_gid=$(id -g hermeswebui 2>/dev/null || echo "")
_current_hermeswebui_uid=$(id -u hermeswebui 2>/dev/null || echo "")
if [ "A${_current_hermeswebui_gid}" == "A${WANTED_GID}" ] && [ "A${_current_hermeswebui_uid}" == "A${WANTED_UID}" ]; then
echo " -- Skipping groupmod/usermod — hermeswebui already has UID ${WANTED_UID} GID ${WANTED_GID} and root fs is read-only"
else
error_exit "Cannot modify /etc/group or /etc/passwd (read-only root fs). Set UID=${_current_hermeswebui_uid} and GID=${_current_hermeswebui_gid} to match, or run without read_only=true. See issue #1470."
fi
else
groupmod -o -g "${WANTED_GID}" hermeswebui || error_exit "Failed to set GID of hermeswebui user"
usermod -o -u "${WANTED_UID}" hermeswebui || error_exit "Failed to set UID of hermeswebui user"
fi
chown -R "${WANTED_UID}:${WANTED_GID}" /home/hermeswebui || error_exit "Failed to set owner of /home/hermeswebui"
echo ""; echo "-- Preparing /app for the hermeswebui runtime user"
mkdir -p /app || error_exit "Failed to create /app directory"
chown hermeswebui:hermeswebui /app || error_exit "Failed to set owner of /app to hermeswebui user"
rsync -av --chown=hermeswebui:hermeswebui /apptoo/ /app/ || error_exit "Failed to sync /apptoo to /app with correct ownership"
if [ -z "${HERMES_WEBUI_DEFAULT_WORKSPACE+x}" ]; then export HERMES_WEBUI_DEFAULT_WORKSPACE="/workspace"; fi
if [ ! -d "$HERMES_WEBUI_DEFAULT_WORKSPACE" ]; then
mkdir -p "$HERMES_WEBUI_DEFAULT_WORKSPACE" || error_exit "Failed to create default workspace at $HERMES_WEBUI_DEFAULT_WORKSPACE"
fi
if [ ! -d "$HERMES_WEBUI_DEFAULT_WORKSPACE" ]; then error_exit "HERMES_WEBUI_DEFAULT_WORKSPACE directory does not exist at $HERMES_WEBUI_DEFAULT_WORKSPACE"; fi
chown hermeswebui:hermeswebui "$HERMES_WEBUI_DEFAULT_WORKSPACE" 2>/dev/null || echo "!! WARNING: Could not chown $HERMES_WEBUI_DEFAULT_WORKSPACE (continuing)"
export UV_CACHE_DIR=${UV_CACHE_DIR:-/uv_cache}
mkdir -p "${UV_CACHE_DIR}" || error_exit "Failed to create ${UV_CACHE_DIR} directory"
chown hermeswebui:hermeswebui "${UV_CACHE_DIR}" || error_exit "Failed to set owner of ${UV_CACHE_DIR} to hermeswebui user"
chown -R "${WANTED_UID}:${WANTED_GID}" "$itdir" || error_exit "Failed to set owner of $itdir"
# Issue #2010 — Railway / user-namespaced runtimes: in-container UID 0 may map
# to a host UID outside the writable subuid range, so /tmp writes fail despite
# id -u == 0. Probe writability and fall back through $itdir → /app.
ENV_FILE="/tmp/hermeswebui_root_env.txt"
if ! ( : > "$ENV_FILE" ) 2>/dev/null; then
ENV_FILE="${itdir:-/tmp/hermeswebui_init}/hermeswebui_root_env.txt"
mkdir -p "$(dirname "$ENV_FILE")" 2>/dev/null
if ! ( : > "$ENV_FILE" ) 2>/dev/null; then
ENV_FILE="/app/.hermeswebui_root_env"
fi
echo " !! /tmp not writable by root — falling back to $ENV_FILE (user-namespaced runtime?)"
fi
save_env "$ENV_FILE"
chown "${WANTED_UID}:${WANTED_GID}" "$ENV_FILE" || error_exit "Failed to set owner of $ENV_FILE"
chmod 600 "$ENV_FILE" || error_exit "Failed to secure $ENV_FILE"
export _HW_ROOT_ENV_PATH="$ENV_FILE"
# restart the script as hermeswebui set with the correct UID/GID this time
echo "-- Restarting as hermeswebui user with UID ${WANTED_UID} GID ${WANTED_GID}"
sudo su hermeswebui $script_fullname || error_exit "subscript failed"
ok_exit "Clean exit"
exec su -s /bin/bash -c "exec \"${script_fullname}\"" hermeswebui || error_exit "subscript failed"
fi
# If we are here, the script is started as another user than hermeswebuitoo
# because the whoami value for the hermeswebui user can be any existing user, we can not check against it
# instead we check if the UID/GID are the expected ones
# If we are here, the script is started as an unprivileged runtime user.
# Because the whoami value for the hermeswebui user can be any existing user, we cannot check against it;
# instead we check if the UID/GID are the expected ones.
if [ "$WANTED_GID" != "$new_gid" ]; then error_exit "hermeswebui MUST be running as UID ${WANTED_UID} GID ${WANTED_GID}, current UID ${new_uid} GID ${new_gid}"; fi
if [ "$WANTED_UID" != "$new_uid" ]; then error_exit "hermeswebui MUST be running as UID ${WANTED_UID} GID ${WANTED_GID}, current UID ${new_uid} GID ${new_gid}"; fi
@@ -209,18 +260,16 @@ if [ "$WANTED_UID" != "$new_uid" ]; then error_exit "hermeswebui MUST be running
# We are therefore running as hermeswebui
echo ""; echo "== Running as hermeswebui"
# Load environment variables one by one if they do not exist from /tmp/hermeswebuitoo_env.txt
it=/tmp/hermeswebuitoo_env.txt
if [ -f $it ]; then
echo "-- Loading not already set environment variables from $it"
load_env $it true
# Load environment variables one by one if they do not exist from the root init phase
tmp_root_env="${_HW_ROOT_ENV_PATH:-/tmp/hermeswebui_root_env.txt}"
if [ -f $tmp_root_env ]; then
echo "-- Loading not already set environment variables from $tmp_root_env"
load_env $tmp_root_env true
fi
##
echo ""; echo "-- Making sure /app is owned by the hermeswebui user to avoid permission issues when running the server "
sudo mkdir -p /app || error_exit "Failed to create /app directory"
sudo chown hermeswebui:hermeswebui /app || error_exit "Failed to set owner of /app to hermeswebui user"
sudo rsync -av --chown=hermeswebui:hermeswebui /apptoo/ /app/ || error_exit "Failed to sync /apptoo to /app with correct ownership"
echo ""; echo "-- Verifying /app is writable by the hermeswebui runtime user"
if [ ! -d /app ]; then error_exit "/app directory does not exist"; fi
it=/app/.testfile; touch $it || error_exit "Failed to verify /app directory"
rm -f $it || error_exit "Failed to delete test file in /app"
@@ -239,19 +288,18 @@ rm -f $it || error_exit "Failed to delete test file in $HERMES_WEBUI_STATE_DIR"
echo ""; echo "-- HERMES_WEBUI_DEFAULT_WORKSPACE: Default workspace directory shown on first launch"
if [ -z "${HERMES_WEBUI_DEFAULT_WORKSPACE+x}" ]; then echo "HERMES_WEBUI_DEFAULT_WORKSPACE not set, setting to /workspace"; export HERMES_WEBUI_DEFAULT_WORKSPACE="/workspace"; fi;
echo "-- HERMES_WEBUI_DEFAULT_WORKSPACE: $HERMES_WEBUI_DEFAULT_WORKSPACE"
# Use sudo for mkdir — Docker may auto-create bind-mount directories as root (#357).
# Skip mkdir if the directory already exists (e.g. a read-only mount — #670).
# The root init phase creates/chowns missing bind-mount directories before
# dropping privileges. After that, the runtime user only verifies access.
if [ ! -d "$HERMES_WEBUI_DEFAULT_WORKSPACE" ]; then
sudo mkdir -p "$HERMES_WEBUI_DEFAULT_WORKSPACE" || error_exit "Failed to create default workspace at $HERMES_WEBUI_DEFAULT_WORKSPACE"
mkdir -p "$HERMES_WEBUI_DEFAULT_WORKSPACE" || error_exit "Failed to create default workspace at $HERMES_WEBUI_DEFAULT_WORKSPACE"
fi
if [ ! -d "$HERMES_WEBUI_DEFAULT_WORKSPACE" ]; then error_exit "HERMES_WEBUI_DEFAULT_WORKSPACE directory does not exist at $HERMES_WEBUI_DEFAULT_WORKSPACE"; fi
# Only chown and write-test if the workspace is writable. Read-only bind-mounts
# (:ro) are valid — the workspace is used for browsing, not writing by the server.
# Only write-test if the workspace is writable. Read-only bind-mounts (:ro)
# are valid — the workspace is used for browsing, not writing by the server.
if [ -w "$HERMES_WEBUI_DEFAULT_WORKSPACE" ]; then
sudo chown hermeswebui:hermeswebui "$HERMES_WEBUI_DEFAULT_WORKSPACE" || echo "!! WARNING: Could not chown $HERMES_WEBUI_DEFAULT_WORKSPACE (continuing)"
it="$HERMES_WEBUI_DEFAULT_WORKSPACE/.testfile"; touch $it && rm -f $it || echo "!! WARNING: Could not write to $HERMES_WEBUI_DEFAULT_WORKSPACE (continuing)"
else
echo "-- HERMES_WEBUI_DEFAULT_WORKSPACE is read-only — skipping chown/write check (read-only workspace is supported)"
echo "-- HERMES_WEBUI_DEFAULT_WORKSPACE is read-only — skipping write check (read-only workspace is supported)"
fi
echo ""; echo "==================="
@@ -266,9 +314,9 @@ else
fi
export UV_PROJECT_ENVIRONMENT=venv
export UV_CACHE_DIR=/uv_cache
sudo mkdir -p ${UV_CACHE_DIR} || error_exit "Failed to create /uv_cache directory"
sudo chown hermeswebui:hermeswebui ${UV_CACHE_DIR} || error_exit "Failed to set owner of ${UV_CACHE_DIR} to hermeswebui user"
export UV_CACHE_DIR=${UV_CACHE_DIR:-/uv_cache}
mkdir -p "${UV_CACHE_DIR}" || error_exit "Failed to create ${UV_CACHE_DIR} directory"
test -w "${UV_CACHE_DIR}" || error_exit "${UV_CACHE_DIR} is not writable by hermeswebui"
cd /app
if [ -f /app/venv/bin/python3 ]; then
+18
View File
@@ -13,6 +13,24 @@ This is the comprehensive Docker reference. For a 5-minute quickstart, see the [
If something stops working, **start with the single-container setup** — it's the simplest path and fixes most permission/UID/path-mismatch issues by construction.
## Production image security model
The production Docker image is hardened for the normal single-tenant container threat model:
Hermes WebUI assumes one operator controls the container, mounted Hermes home, and workspace.
The image does **not** install `sudo`, does not add runtime users to a sudo group, and does not
grant `NOPASSWD` escalation. If an agent/tool process gains a shell as `hermeswebui`, it should
not be able to become root with a passwordless sudo command.
The entrypoint still starts as `root` for a narrow init phase because Docker bind mounts often need
UID/GID alignment and ownership preparation before the app can read `~/.hermes`, `/workspace`,
`/app`, and `/uv_cache`. After that setup, `docker_init.bash` re-execs itself as the unprivileged
`hermeswebui` user and starts the server there. Init scratch files under `/tmp/hermeswebui_init`
are owner-only (`0700` directory, `0600` files), not world-writable.
For multi-tenant or hostile-container environments, rebuild with your own runtime user, mount policy,
and supervisor assumptions. Development images that need package-manager convenience should add
those tools in a dev-only Dockerfile instead of reintroducing passwordless sudo to production.
## 5-minute quickstart (single container)
```bash
+181
View File
@@ -0,0 +1,181 @@
# First-run onboarding guide
This guide explains what happens the first time Hermes WebUI starts, which
setup path to choose, and how to recover when the wizard cannot finish.
The short version: run the bootstrap, open the WebUI, choose a provider, choose
a workspace, optionally set a password, then start a chat. If you are using a
local model server from Docker, pay special attention to the Base URL section
below.
## Before you start
Hermes WebUI is only the browser interface. The actual agent runtime, memory,
skills, config, cron jobs, and provider credentials belong to Hermes Agent.
The bootstrap supports Linux, macOS, and WSL2. Native Windows is not supported
by the bootstrap yet. A community native Windows setup is being tracked in
[#1952](https://github.com/nesquena/hermes-webui/issues/1952), including:
- [Native Windows guide](https://github.com/markwang2658/hermes-windows-native-guide)
- [Native Windows setup scripts](https://github.com/markwang2658/hermes-windows-native)
For Windows users who want the supported path today, use WSL2 and see
[Windows / WSL auto-start](wsl-autostart.md).
## Install path choices
| Path | Use it when | Notes |
|---|---|---|
| Local bootstrap | You run WebUI directly on Linux, macOS, or WSL2 | Best for a personal server, Mac mini, VPS, or homelab host. |
| Docker single-container | You want the simplest container setup | Recommended first Docker path. WebUI runs the agent in-process. |
| Docker two-container | You already run the agent gateway separately | More isolated, but tools launched from WebUI run in the WebUI container. |
| Docker three-container | You want agent gateway plus dashboard plus WebUI | Same caveats as two-container, plus the dashboard service. |
| Native Windows community path | You are intentionally testing unsupported native Windows | Community-maintained for now, not the official bootstrap path. |
If a Docker install gets confusing, start again with the single-container setup.
It avoids most UID/GID, source-volume, and tool-location surprises. See
[Docker setup guide](docker.md) for the full container reference.
## Re-running onboarding safely
Do not delete `~/.hermes` just to see the wizard again. That directory can hold
your real Hermes config, credentials, memory, skills, profiles, sessions, and
cron state.
For a clean local trial, use an isolated Hermes home and WebUI state directory:
```bash
mkdir -p ~/hermes-onboarding-test
HERMES_HOME=~/hermes-onboarding-test/.hermes \
HERMES_WEBUI_STATE_DIR=~/hermes-onboarding-test/webui \
HERMES_WEBUI_PORT=8789 \
python3 bootstrap.py
```
Then open `http://127.0.0.1:8789`.
If your repo has a `.env` file, remember that the bootstrap loads it. Remove or
adjust any `HERMES_HOME`, `HERMES_WEBUI_STATE_DIR`, or `HERMES_WEBUI_PORT`
entries there before using the isolated command above.
For managed hosting or fully preconfigured images, set
`HERMES_WEBUI_SKIP_ONBOARDING=1` to bypass the wizard.
## What the wizard checks
The first screen reports the runtime state WebUI can see:
- Hermes Agent importability: whether WebUI can import and run `AIAgent`.
- Provider status: whether `config.yaml` and credential state are enough for a
chat request.
- Password status: whether WebUI password protection is enabled.
- Config paths: the active `config.yaml` and `.env` locations for this profile.
If the agent check fails, use [Troubleshooting](troubleshooting.md), especially
the `AIAgent not available` section. If provider setup is incomplete, continue
through the wizard or run `hermes model` in the same machine environment that
will run WebUI.
## Choosing a provider
The setup step groups providers by how much information they usually need.
| Group | Examples | What you usually enter |
|---|---|---|
| Easy start | OpenRouter, Anthropic, OpenAI | API key and model. |
| Open / self-hosted | Ollama, LM Studio, custom OpenAI-compatible | Base URL, model, optional API key. |
| Specialized | Gemini, DeepSeek, Xiaomi MiMo, Z.AI / GLM, NVIDIA NIM, Mistral, xAI | Provider API key and default model. |
For API-key providers, the wizard writes the key to the active Hermes `.env`
file and writes the default model/provider to `config.yaml`.
For local providers, the API key field can be blank when the server is keyless.
Most LM Studio, Ollama, vLLM, llama-server, and TabbyAPI installs run this way.
Use **Test connection** to verify the Base URL and populate the model list
before continuing.
Advanced provider flows such as Nous Portal and GitHub Copilot are still
terminal-first. OpenAI Codex and Anthropic Claude Code OAuth can be started in
the onboarding flow when your Hermes config selects the corresponding provider.
If the wizard points you back to `hermes model`, use that CLI flow first, then
refresh WebUI.
## Base URL rules for local model servers
For self-hosted providers, the Base URL should point to the OpenAI-compatible
API root. Common examples:
| Server | Typical Base URL |
|---|---|
| LM Studio on the same non-Docker host | `http://127.0.0.1:1234/v1` |
| Ollama on the same non-Docker host | `http://127.0.0.1:11434/v1` |
| LM Studio from Docker Desktop | `http://host.docker.internal:1234/v1` |
| Ollama from Docker Desktop | `http://host.docker.internal:11434/v1` |
| Local server on another LAN machine | `http://<lan-ip>:<port>/v1` |
Inside Docker, `localhost` means the WebUI container itself, not your Mac,
Windows host, or another machine on your LAN. If LM Studio or Ollama is running
outside the container, use `host.docker.internal` on Docker Desktop or the
server's LAN IP address.
The wizard probes `<base-url>/models` before saving. A successful probe fills
the model dropdown. A failed probe blocks the setup step and shows an inline
error such as DNS failure, connection refused, timeout, HTTP error, or
unexpected response shape.
## Workspace step
The workspace is the filesystem location Hermes should use for new sessions.
It can be a source checkout, a project directory, or a general workspace folder.
In Docker, the default browsable path is `/workspace`, which maps to the host
directory mounted by the compose file. If the workspace appears empty, check the
Docker UID/GID and mount guidance in [Docker setup guide](docker.md).
## Password step
Password protection is optional for localhost-only installs. Enable it if you
expose WebUI outside `127.0.0.1`, behind a reverse proxy, or on a LAN.
The password is stored through the normal WebUI settings path and hashed
server-side. You can change it later from Settings.
## What gets written
The wizard uses the same files and APIs as the normal app:
- Active Hermes `config.yaml`: provider, default model, and Base URL when
relevant.
- Active Hermes `.env`: provider API keys when you entered one.
- WebUI `settings.json`: onboarding completion, workspace, password state, and
other WebUI preferences.
State normally lives outside the repository. By default:
- Hermes Agent state: `~/.hermes`
- WebUI state: `~/.hermes/webui`
Override these with `HERMES_HOME` and `HERMES_WEBUI_STATE_DIR` when you need an
isolated test install.
## When to file an issue
File an issue when the diagnostics point to WebUI rather than local
configuration. Include:
1. Install path: local bootstrap, Docker single-container, Docker
two-container, Docker three-container, WSL2, or community native Windows.
2. Output from `/health`, or the startup banner if the server never starts.
3. The provider selected in onboarding and the Base URL shape, with secrets
redacted.
4. For Docker provider problems, the result of probing from inside the
container, for example:
```bash
docker exec hermes-webui sh -c 'curl -sS -w "\nHTTP %{http_code}\n" http://host.docker.internal:1234/v1/models | head -50'
```
5. Any inline wizard error text and relevant logs.
Never paste API keys, OAuth tokens, or full `.env` contents into an issue.
Binary file not shown.

After

Width:  |  Height:  |  Size: 55 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 148 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 158 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 151 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 138 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 104 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 54 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 44 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 45 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 138 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 47 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 50 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 66 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 132 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 63 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 58 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 104 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 147 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 118 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 48 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 43 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 50 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 52 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 60 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 68 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 142 KiB

@@ -0,0 +1,25 @@
{
"issue": 1772,
"check": "api.models.get_cli_session_messages preserves CLI tool metadata for WebUI rendering",
"session_id": "cli_issue_1772_demo",
"message_count": 2,
"assistant_tool_calls": [
{
"id": "call_1772_demo",
"type": "function",
"function": {
"name": "terminal",
"arguments": "{\"command\": \"printf ok\"}"
}
}
],
"tool_result": {
"role": "tool",
"tool_call_id": "call_1772_demo",
"tool_name": "terminal",
"name": "terminal",
"content": {
"output": "ok"
}
}
}
Binary file not shown.

After

Width:  |  Height:  |  Size: 58 KiB

+25
View File
@@ -0,0 +1,25 @@
{
"issue": 1784,
"commit_under_test": "9875967",
"fixture": "Synthetic 180-row session sidebar with active sid_0 streaming and long chat pane content.",
"pre_fix_observation": {
"steps": [
"Set _scrollPinned=true with #messages at scrollTop 0 in a long chat fixture.",
"Dispatch a wheel gesture on the active sidebar session row.",
"Call scrollIfPinned() to mimic the next streaming token render."
],
"result": "#messages jumped from scrollTop 0 to 3073 immediately after the sidebar wheel gesture, showing the chat auto-scroll path fought non-chat scroll intent."
},
"post_fix_observation": {
"steps": [
"Repeat the same fixture and sidebar wheel gesture after the fix.",
"Call scrollIfPinned() immediately, then again after the 350ms non-chat intent guard expires."
],
"result": {
"afterSidebarWheel": 0,
"afterIntentExpires": 2992,
"sessionListCss": "overscroll-behavior-y: contain; touch-action: pan-y"
},
"meaning": "A sidebar wheel/touch scroll intent now suppresses only the immediate chat-pane auto-scroll write, leaving the sidebar gesture free while streaming continues."
}
}
Binary file not shown.

After

Width:  |  Height:  |  Size: 131 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 125 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 47 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 140 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 134 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 134 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 134 KiB

@@ -0,0 +1,35 @@
{
"id": "openai-codex",
"display_name": "OpenAI Codex",
"has_key": true,
"configurable": false,
"is_oauth": true,
"key_source": "oauth",
"models": [
{
"id": "gpt-5.5",
"label": "GPT 5.5"
},
{
"id": "gpt-5.4",
"label": "GPT 5.4"
},
{
"id": "gpt-5.4-mini",
"label": "GPT 5.4 Mini"
},
{
"id": "gpt-5.3-codex",
"label": "GPT 5.3 Codex"
},
{
"id": "gpt-5.2",
"label": "GPT 5.2"
},
{
"id": "gpt-5.3-codex-spark",
"label": "GPT 5.3 Codex Spark"
}
],
"models_total": 6
}
Binary file not shown.

After

Width:  |  Height:  |  Size: 57 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 52 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 64 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 42 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 42 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 58 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 50 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 141 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 136 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 16 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 17 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 57 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 28 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 180 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 176 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 58 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 171 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 85 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 38 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 46 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 59 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 143 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 71 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 64 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 59 KiB

Some files were not shown because too many files have changed in this diff Show More