Commit Graph

1889 Commits

Author SHA1 Message Date
nesquena-hermes 85ae0acbdc docs: CHANGELOG v0.51.45 Release U (9-PR batch + Opus SHOULD-FIX) 2026-05-11 17:31:18 +00:00
nesquena-hermes 83de9d0cf0 fix(providers): log warning when custom provider entry yields empty slug
Opus stage-338 review SHOULD-FIX: silent drop at api/providers.py:1049
was diagnostically opaque. logger.warning() now surfaces the bad
config entry so operators can spot misconfigurations.

Co-authored-by: Opus advisor <opus-advisor@hermes.local>
2026-05-11 17:30:56 +00:00
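A minimal sketch of the behaviour this commit describes: warn on a custom provider entry whose name produces an empty slug, instead of dropping it silently. The `slugify` rule, the function names, and the logger name are assumptions for illustration; the actual code at api/providers.py is not shown in this log.

```python
import logging
import re

logger = logging.getLogger("providers_sketch")  # hypothetical logger name


def slugify(name: str) -> str:
    """Lowercase the name and collapse non-alphanumeric runs into hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")


def collect_custom_providers(entries: list[dict]) -> dict[str, dict]:
    """Index entries by slug; warn about entries whose slug comes out empty."""
    providers: dict[str, dict] = {}
    for entry in entries:
        name = entry.get("name", "")
        slug = slugify(str(name))
        if not slug:
            # Log only the name, not the whole entry, so API keys never reach the log.
            logger.warning("custom provider entry yields empty slug, skipping: name=%r", name)
            continue
        providers[slug] = entry
    return providers
```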
nesquena-hermes 87bd9ea372 docs: CHANGELOG Unreleased — stage-338 (9 PRs) 2026-05-11 17:18:16 +00:00
nesquena-hermes 6a016dae6c Merge PR #2077 into stage-338
Refactor compression anchor visibility helpers
by @franksong2702
2026-05-11 17:17:25 +00:00
nesquena-hermes 98b6925333 Merge PR #2065 into stage-338
Fix session recovery polish
by @franksong2702

# Conflicts:
#	CHANGELOG.md
2026-05-11 17:17:24 +00:00
nesquena-hermes 0662f0986f Merge PR #2056 into stage-338
Fix custom provider name slugs with ports
by @franksong2702

# Conflicts:
#	CHANGELOG.md
2026-05-11 17:17:19 +00:00
nesquena-hermes 4388cb1a10 Merge PR #2068 into stage-338
fix(ui): prevent stuck sidebar spinner on completed sessions (closes #2066)
by @franksong2702
2026-05-11 17:17:05 +00:00
nesquena-hermes 2bfd538714 Merge PR #2063 into stage-338
fix: keep explicit forks out of lineage report
by @dso2ng
2026-05-11 17:17:05 +00:00
nesquena-hermes ee6c67f30c Merge PR #2074 into stage-338
Fix HERMES_HOME skill cache patching
by @franksong2702
2026-05-11 17:17:04 +00:00
nesquena-hermes da6b897e54 Merge PR #2076 into stage-338
test: add kanban locale parity check (refs #1973)
by @bergeouss
2026-05-11 17:17:03 +00:00
nesquena-hermes d87b23e76f Merge PR #2073 into stage-338
test: allow top-level markdown docs
by @ai-ag2026
2026-05-11 17:17:01 +00:00
nesquena-hermes 7037b084de Merge PR #2088 into stage-338
docs(themes): align THEMES.md with Theme × Skin architecture
by @michael-dg
2026-05-11 17:17:00 +00:00
Michael De Gols 0f8ba4d8d3 docs(themes): align THEMES.md with Theme × Skin architecture
THEMES.md still described the pre-#627 model where each theme was a
monolithic palette name (Dark, Light, Slate, Solarized Dark, Monokai,
Nord, OLED). The current architecture splits appearance into two
orthogonal pickers:

- Theme (System / Dark / Light) — applied as `.dark` class on <html>
- Skin (8 named accent palettes) — applied as `data-skin` attribute

Rewrite the doc to:
- Open with the Theme × Skin separation and how they combine
- List the 3 themes and 8 actual skins shipped in static/style.css
  (default, ares, mono, slate, poseidon, sisyphus, charizard, sienna),
  with the same descriptive tone as the original
- Replace "Creating a Custom Theme" with "Creating a Custom Skin" as
  the primary extension point, with paired light + dark CSS variants
- Note the WebUI extensions surface (docs/EXTENSIONS.md) as a
  no-fork path for self-hosted custom skins
- Update internals to reflect classList.toggle('dark') + dataset.skin
  + dataset.fontSize instead of the old data-theme-only model
- Add a brief Font Size section since it sits in the same picker
- Keep a smaller Custom Theme section for the rare case someone wants
  to override the core palette, redirecting most users to skins

Docs-only change; no code touched.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-11 18:35:12 +02:00
ai-ag2026 c4b7a65356 test: keep local context docs ignored 2026-05-11 17:09:19 +02:00
Frank Song 6a52edf2ab Fix stale inflight purge runtime lookup 2026-05-11 21:53:43 +08:00
Frank Song 18124ced62 Refactor compression anchor visibility helpers 2026-05-11 20:56:30 +08:00
bergeouss c0ccefd322 test: add kanban locale parity check (refs #1973)
Add test_kanban_locale_parity to test_kanban_ui_static.py that asserts
every kanban_* i18n key in the English locale exists in all non-English
locale blocks. Pattern follows test_lineage_segment_locale_keys_are_defined_for_sidebar_locales.
2026-05-11 12:38:48 +00:00
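The parity check described above can be sketched as follows. It assumes the locale tables have already been parsed into plain dicts; the real test reads them from the static UI sources, and that parsing step is not reproduced here. The function name and the `base` parameter are hypothetical.

```python
def missing_kanban_keys(locales: dict[str, dict[str, str]], base: str = "en") -> dict[str, list[str]]:
    """Return, per non-base locale, the kanban_* keys present in the base locale but absent there."""
    expected = {key for key in locales[base] if key.startswith("kanban_")}
    missing: dict[str, list[str]] = {}
    for code, table in locales.items():
        if code == base:
            continue
        gap = sorted(expected - set(table))
        if gap:
            missing[code] = gap
    return missing
```

A parity test would then assert that the returned dict is empty.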
Frank Song c8d110a7f0 test: align sidebar spinner state assertions 2026-05-11 20:31:00 +08:00
Frank Song a0e9c06102 Fix HERMES_HOME skill cache patching 2026-05-11 19:12:02 +08:00
ai-ag2026 d30263bcf1 test: allow top-level markdown docs 2026-05-11 12:36:35 +02:00
Frank Song c60078b356 fix(ui): prevent stuck sidebar spinner on completed sessions (closes #2066)
The spinner (.session-state-indicator.is-streaming) can remain spinning
indefinitely on completed sessions when the INFLIGHT in-memory cache is
not cleaned up due to abnormal stream termination (page refresh, network
disconnect, gateway restart).

Add a staleness guard in _isSessionLocallyStreaming: if the server
reports is_streaming=false and last_message_at is older than 5 minutes,
force the streaming state to false regardless of stale INFLIGHT entries.
2026-05-11 17:54:14 +08:00
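The staleness guard lives in frontend JavaScript; the Python sketch below only illustrates the decision rule from the commit message. Returning the raw in-flight flag in the non-stale case is an assumption, as are the function and parameter names.

```python
from datetime import datetime, timedelta, timezone

STALE_AFTER = timedelta(minutes=5)  # threshold stated in the commit message


def is_locally_streaming(inflight: bool, server_is_streaming: bool,
                         last_message_at: datetime, now: datetime) -> bool:
    """Trust the in-flight cache unless the server says idle and the session is stale."""
    if not server_is_streaming and now - last_message_at > STALE_AFTER:
        return False  # stale INFLIGHT entry: force the spinner off
    return inflight
```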
Frank Song f6115b78c6 Fix custom provider name slugs with ports 2026-05-11 17:24:53 +08:00
Dennis Soong 5efd287264 fix: align fork lineage projection paths 2026-05-11 17:15:22 +08:00
Frank Song 2cd10868aa Fix session recovery polish 2026-05-11 16:30:25 +08:00
Dennis Soong 1e8d65ea01 fix: keep explicit forks out of lineage report 2026-05-11 15:23:52 +08:00
Nathan Esquenazi b766b7f759 Merge pull request #2060 from nesquena/contributors-refresh-v0.51.44
docs(contributors): refresh contributor stats to v0.51.44
2026-05-11 00:03:19 -07:00
nesquena-hermes b34643b92c docs(contributors): refresh contributor stats to v0.51.44
Update CONTRIBUTORS.md and the README contributors section to reflect
130 contributors and 568 PR credits as of v0.51.44 (was 66/142 at
v0.50.245). The numbers grew because:

- The previous refresh was one release cycle ago, a gap that spans 50+ tags
  and 8 batch releases of contributor PRs.
- The new counting rule explicitly includes closed-but-absorbed PRs:
  PRs whose original branch shows "closed" on GitHub but whose content
  shipped via batch-release squash with a Co-authored-by trailer, or
  via salvage rewrite with CHANGELOG attribution. This better reflects
  what users actually contributed.

The compilation pipeline:

1. Pull every closed PR from gh api (state=closed, both merged and
   unmerged on GitHub) — 1421 PRs.
2. Walk CHANGELOG.md release-by-release and extract:
   - `PR #N by @user` (canonical bullet form)
   - `(#N by @user`, `(PR #N by @user`, `(#N, @user;`
   - `PRs #A, #B by @user` (plural)
   - `@user — PR #N`, `@user — N PR (#A, #B)`
   - `(credit: @user)` and `(credit: @userA and @userB)`
3. For every PR# mentioned in CHANGELOG, union the explicit @-attributed
   users with the gh PR author (when external). Maintainer accounts
   (@nesquena, @nesquena-hermes) are excluded.
4. For PRs merged on GitHub but not mentioned in CHANGELOG (very early
   PRs, non-noteworthy direct merges), credit the gh author.
5. Three salvaged-design contributors not directly in CHANGELOG are
   credited in the special-thanks roll: @indigokarasu (#213 →
   v0.50.0 design language), @andrewy-wizard (#177 → initial Chinese
   locale absorbed into v0.42.0), @zenc-cp (#133 → anti-hallucination
   guard absorbed into streaming.py).

Pre-cleaning step strips HTML entities (`&#10;` etc.) before PR# scan
to avoid false matches. PR# regex requires a whitespace/paren/bracket
preceder so identifiers like `--key=123` and `(##10`-style headings
don't pollute the count.
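A rough sketch of the pre-cleaning and PR-number scan described above. Both regular expressions are assumptions reconstructed from the prose (the commit does not quote the actual patterns), and replacing entities with a space rather than an empty string is an illustrative choice.

```python
import re

ENTITY_RE = re.compile(r"&#?\w+;")           # HTML entities such as &#10; or &amp;
PR_RE = re.compile(r"(?<![^\s(\[])#(\d+)\b")  # '#N' only at start or after whitespace, '(' or '['


def pr_numbers(text: str) -> list[int]:
    """Strip HTML entities, then collect PR numbers with a valid preceding character."""
    cleaned = ENTITY_RE.sub(" ", text)
    return sorted({int(n) for n in PR_RE.findall(cleaned)})
```

With these patterns, `--key=123` contains no `#` and is ignored, and in `(##10` the first `#` is not followed by a digit while the second is preceded by `#`, so neither counts.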

Per-user first/last release computed from:
- For merged-on-GH PRs: the smallest tag whose creator-date is >= the
  PR's merged_at timestamp.
- For absorbed PRs: the release section in CHANGELOG that explicitly
  attributes to the user (or the earliest release that mentions the
  PR# if no explicit attribution exists for that user).
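For the merged-on-GitHub case, the tag lookup can be sketched with a binary search. This reads "smallest tag" as the earliest tag by creation date, which is an interpretation; the function name and the `(created_at, name)` tuple layout are hypothetical.

```python
from bisect import bisect_left
from datetime import datetime
from typing import Optional


def first_release_containing(merged_at: datetime,
                             tags: list[tuple[datetime, str]]) -> Optional[str]:
    """tags must be sorted by creation date; return the first tag created at or after merged_at."""
    dates = [created for created, _ in tags]
    index = bisect_left(dates, merged_at)
    return tags[index][1] if index < len(tags) else None
```

A PR merged after the newest tag returns `None`, meaning it has not shipped in a tagged release yet.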

CONTRIBUTORS.md sections:
- Top contributors (5+ PRs) — 20 people, ranked
- Sustained contributors (3–4 PRs) — 11 people
- Two-PR contributors — 14 people, flat list
- Single-PR contributors — 85 people, flat list
- How credit is tracked — four paths described
- Special thanks — 11 highlight blurbs

README contributors section trimmed to top-10 table + notable-
contribution blurbs (29 distinct contributors mentioned with concrete
PR numbers). Same data, condensed for the README.

No code changes. Docs only.
2026-05-11 06:59:42 +00:00
nesquena-hermes f00cb74f77 Merge pull request #2058 from nesquena/stage-337
Release T (v0.51.44): 5-PR batch (#2048 + #2052 + #2053 + #2055 + #1970) + test-suite network isolation
v0.51.44
2026-05-10 23:20:29 -07:00
nesquena-hermes cd7107cefb test(infra): identity check by qname (CI re-imports conftest under multiple roots)
CI's pytest invocation imports conftest twice (once via the standard
tests/ discovery, once via repo-root rootdir discovery), producing two
distinct function objects with the same __qualname__ but different `is`
identity. The strict identity assertion failed because each import
created a fresh closure. Switch to __qualname__ substring check — same
guarantee (default-on state has the wrapper installed; fixture restores
the real one) without the multi-import sensitivity.
2026-05-11 06:18:13 +00:00
nesquena-hermes d9bc8360a4 test(infra): fixture swaps real functions via monkeypatch (CI-robust)
CI on Python 3.11 still failed test_allow_outbound_network_fixture_*
because the previous module-global toggle (_ALLOW_OUTBOUND=True/False)
was unreliable on the runner — the wrapper's global lookup at call time
sometimes saw False even after the fixture's True assignment.

Switch to monkeypatch-based fixture: instead of toggling a global that
the wrapper checks, restore socket.create_connection and
socket.socket.connect to their REAL captured implementations for the
duration of the test. Pytest's monkeypatch fixture handles teardown so
the wrappers are reinstalled automatically.

Rewrote the two paired tests to check function identity
(socket.create_connection is _hermes_blocked_create_connection vs. is
_REAL_CREATE_CONNECTION) instead of attempting a live outbound to
8.8.8.8:53 — direct identity check is hermetic and doesn't depend on
whether the CI runner has any outbound network access at all.
2026-05-11 06:15:46 +00:00
nesquena-hermes 6d83d16016 test(infra): tighten IPv6 unique-local check + replace self-passing fixture test
Two low-severity follow-ups from Opus regrounding review:

1. The IPv6 unique-local fc00::/7 check was `h.startswith('fc') or
   h.startswith('fd')` — too loose. It would also classify hostnames
   like 'fcdn.example.com' or 'fdsa.test' as 'local' and silently let
   them through the block. Tightened to a regex match for canonical
   IPv6 syntax (`f[cd][0-9a-f]{0,2}:`) so only actual IPv6 addresses
   match. Same fix in both tests/conftest.py and server.py.

2. test_allow_outbound_network_fixture_unblocks was technically
   self-passing: it tried to connect to a *.invalid hostname, which is
   in the allow-list, so the real socket.create_connection would run
   regardless of whether the fixture toggled the block. Replaced with
   a public-IP-based test that actually proves the toggle works, plus
   a paired test_block_is_active_outside_the_fixture sanity test that
   proves the block is on without the fixture.
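The difference between the loose prefix check and the tightened pattern from item 1 can be seen in a small comparison. The regex body is the one quoted in the commit; anchoring it at the start of the string and ignoring case are assumptions, as are the function names.

```python
import re

ULA_RE = re.compile(r"f[cd][0-9a-f]{0,2}:", re.IGNORECASE)


def loose_is_unique_local(host: str) -> bool:
    """Old check: any string starting with 'fc' or 'fd' passes."""
    h = host.lower()
    return h.startswith("fc") or h.startswith("fd")


def strict_is_unique_local(host: str) -> bool:
    """Tightened check: requires IPv6 syntax, i.e. a colon after the first hextet."""
    return ULA_RE.match(host) is not None
```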

Both follow-ups were flagged by the Opus advisor as 'defer-OK', but the fixes
are trivial, so they land in this batch.
2026-05-11 06:12:07 +00:00
nesquena-hermes 23cfc99738 fix(config): split hermes_cli and urlopen fallback in lmstudio branch (CI fix)
CI on Python 3.13 (clean editable install, no hermes_cli package) was still
failing the 3 lmstudio tests after the first fix attempt. Root cause: the
outer try/except in the lmstudio branch was catching ImportError from
`from hermes_cli.models import provider_model_ids`, hijacking the whole
branch and silently skipping the urlopen fallback.

Restructured into two independent tiers:
  1. hermes_cli lookup in its own try/except — ImportError logs at DEBUG
     and continues with lm_ids=[].
  2. urlopen fallback runs unconditionally when lm_ids is empty, including
     after hermes_cli import failure.
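The two independent tiers might look roughly like this. The call signature `provider_model_ids("lmstudio")`, the shape of the /v1/models payload, and the injectable `fetch` parameter are all assumptions made for the sketch.

```python
import json
import logging
from urllib.request import urlopen

logger = logging.getLogger("lmstudio_sketch")  # hypothetical logger name


def _fetch_models(base_url: str) -> list[str]:
    """Tier 2 helper: read model ids from <base_url>/models (payload shape assumed)."""
    with urlopen(f"{base_url.rstrip('/')}/models", timeout=5) as resp:
        payload = json.load(resp)
    return [item["id"] for item in payload.get("data", [])]


def lmstudio_model_ids(base_url: str, fetch=_fetch_models) -> list[str]:
    lm_ids: list[str] = []
    # Tier 1: optional helper in its own try/except, so ImportError cannot skip tier 2.
    try:
        from hermes_cli.models import provider_model_ids
        lm_ids = list(provider_model_ids("lmstudio"))  # assumed signature
    except ImportError:
        logger.debug("hermes_cli unavailable; continuing with lm_ids=[]")
    # Tier 2: runs whenever tier 1 produced nothing, including after import failure.
    if not lm_ids:
        try:
            lm_ids = list(fetch(base_url))
        except OSError as exc:  # URLError is a subclass of OSError
            logger.debug("live model discovery failed: %s", exc)
    return lm_ids
```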

New regression test `test_lmstudio_fallback_works_when_hermes_cli_unavailable`
explicitly blocks hermes_cli via sys.meta_path and verifies the lmstudio
group still populates from the urlopen fallback. Without this test, the
CI-vs-local divergence (local env had hermes_cli installed, CI didn't)
would keep slipping through.

All 12 lmstudio-related tests pass, including the 3 #1527 tests that
broke on stage-337.
2026-05-11 06:06:58 +00:00
nesquena-hermes 1819ead93d docs: CHANGELOG v0.51.44 Release T (5-PR batch + test network isolation) 2026-05-11 06:03:12 +00:00
nesquena-hermes 12cef733e3 fix(recovery): preserve worktree metadata + workspace + message_count on state.db sidecar rebuild
PR #2053 added worktree-backed session creation. PR #2041 (shipped in
v0.51.42) added state.db sidecar reconciliation that rebuilds a missing
<sid>.json sidecar from the canonical state.db row when the JSON file is
gone (failed save, manual rm, restore-from-backup with mismatched dirs).

The two interact silently. `_state_db_row_to_sidecar()` was hard-coding
`'workspace': ''` and never propagating the four worktree_* fields from
the row to the rebuilt sidecar dict. So a worktree-backed session that
loses its sidecar and gets rebuilt from state.db:

- loses `worktree_path` → matches the empty-session sidebar filter at
  `api/models.py:1067/1107` (which spares worktree-backed empty sessions
  via `not s.get('worktree_path')`) → session disappears from the
  sidebar even though the worktree directory still exists on disk.

- loses `workspace` → downstream tools (terminal panels, file pickers
  that use `s.workspace`) operate on empty string instead of the original
  worktree path.

- always reports `message_count == 0` → contributes to the empty-session
  filter even for sessions that have messages in `state.db.messages`.

Fix:

1. `_read_state_db_missing_sidecar_rows()` SELECT now includes
   `workspace, worktree_path, worktree_branch, worktree_repo_root,
   worktree_created_at, message_count` (each gated by
   `_sql_optional_col()` so older state.db schemas without those columns
   continue to work — recovery degrades gracefully rather than 500ing).

2. `_state_db_row_to_sidecar()` propagates each field. workspace comes
   from the row if it's a string, otherwise '' (matching pre-fix behavior
   for non-worktree sessions). message_count comes from the row if
   it's an int, otherwise falls back to `len(messages)` so the rebuilt
   sidecar always has a coherent count.
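The propagation rules in the two fix steps can be sketched as below. The row is modelled as a plain dict, and `session_id` and the `id` column are hypothetical; only the workspace, worktree_* and message_count handling follows the commit message.

```python
WORKTREE_FIELDS = (
    "worktree_path",
    "worktree_branch",
    "worktree_repo_root",
    "worktree_created_at",
)


def row_to_sidecar(row: dict, messages: list) -> dict:
    """Rebuild a sidecar dict from a state.db row, tolerating missing columns."""
    workspace = row.get("workspace")
    count = row.get("message_count")
    sidecar = {
        "session_id": row.get("id"),  # hypothetical key names
        "workspace": workspace if isinstance(workspace, str) else "",
        # Fall back to the actual message list so the count stays coherent.
        "message_count": count if isinstance(count, int) else len(messages),
        "messages": messages,
    }
    for field in WORKTREE_FIELDS:
        sidecar[field] = row.get(field)  # None for non-worktree sessions
    return sidecar
```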

3 new regression tests in tests/test_state_db_worktree_recovery.py
exercise:
- worktree session with messages → all four worktree_* fields preserved.
- non-worktree session → worktree_* fields all None (no spurious
  propagation), workspace=''.
- empty worktree session (the worst case) → confirms the rebuilt sidecar
  does NOT match the empty-session-exempt filter, so it stays visible
  in the sidebar.

Caught by Opus advisor during stage-337 review (the cross-PR interaction
between #2053 and the previously-shipped #2041 wasn't exercised by either
PR's individual test suite).
2026-05-11 06:00:13 +00:00
nesquena-hermes 2ca220eec0 fix(config): PR #1970 lmstudio branch must honor cfg.model.base_url fallback
PR #1970 added a dedicated `elif pid == "lmstudio":` branch in
`get_available_models()` that fetches the live /v1/models list when the
hermes_cli helper doesn't have ids cached. The fallback path inside that
branch only looked at `cfg["providers"]["lmstudio"]["base_url"]`, missing
the historical config shape where the URL lives under `cfg["model"]`:

  model:
    provider: lmstudio
    base_url: http://192.168.1.22:1234/v1   # here, not under providers.lmstudio
  providers:
    lmstudio:
      api_key: local-key

3 pre-existing tests in tests/test_issue1527_lmstudio_base_url_classification
broke on stage-337 because of this — they passed on master, failed after
the PR #1970 merge.

The simpler fix is to enhance the already-introduced `_get_provider_base_url()`
helper so it falls back to `cfg["model"]["base_url"]` when
`cfg["model"]["provider"] == provider_id`, then use the helper inside the
lmstudio branch instead of a direct lookup. This keeps the previous
behaviour (where the generic configured-provider branch handled lmstudio
via the model block) while preserving PR #1970's live-discovery additions.

Belt-and-suspenders: `_get_provider_base_url()` explicitly does NOT inherit
model.base_url for providers other than the active one — if a user's config
says `model.provider: anthropic` and they have `providers.openai` configured
without a base_url, openai must still resolve to None (use SDK default),
not to the anthropic proxy URL.
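A sketch of the two-location lookup with the precedence and non-leak rules described above. The function mirrors the helper's name without the leading underscore, and the config is assumed to be a plain nested dict; the real helper may differ in details.

```python
from typing import Optional


def get_provider_base_url(cfg: dict, provider_id: str) -> Optional[str]:
    """Explicit providers entry wins; model.base_url applies only to the active provider."""
    entry = (cfg.get("providers") or {}).get(provider_id) or {}
    url = entry.get("base_url")
    if not url:
        model = cfg.get("model") or {}
        if model.get("provider") == provider_id:
            url = model.get("base_url")
    # Strip trailing slashes; None means "use the SDK default".
    return url.rstrip("/") if isinstance(url, str) and url else None
```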

6 new regression tests in tests/test_pr1970_lmstudio_base_url_fallback.py
lock the two-location lookup, the precedence rule (explicit providers entry
wins over model fallback), trailing-slash stripping, and the negative case
(model.base_url MUST NOT leak to non-active providers).

All 51 tests in the existing model-resolver + custom-provider banks still
pass.

Caught by maintainer review on stage-337 (full pytest with the new network
isolation in place surfaced the regression that the fork-CI mock-server path
would have hidden).
2026-05-11 05:59:59 +00:00
nesquena-hermes a6174d08db test(infra): hermetic network isolation — block all outbound from tests
Tests should not reach the public internet. Before this commit, an
accidentally-leaking outbound socket from the test_server fixture (real
TLS handshakes to Anthropic / Amazon / OpenRouter, sometimes triggered
by SDK-init paths that found a credential the credential-strip allowlist
missed) was adding 60+s of wall-time to a 100s test run and creating a
class of flaky failures.

This installs a default-deny socket-block at two layers:

1. Pytest process, via tests/conftest.py module-level monkey-patch on
   socket.create_connection + socket.socket.connect. Loopback / RFC1918
   private / link-local / RFC2606 reserved-TLD destinations pass through;
   anything else raises OSError("hermes test network isolation: outbound
   to ... blocked"). Tests that legitimately need real outbound opt back
   in via the new `allow_outbound_network` fixture (no current callers).

2. Test_server subprocess (server.py), via a HERMES_WEBUI_TEST_NETWORK_BLOCK=1
   environment-variable-gated guard at the top of server.py. tests/conftest.py
   sets the env var on every test_server spawn. Without this, the subprocess
   could make outbound that the pytest-side block can't see (which is exactly
   what was happening — verified via `ss -tnp` showing the server.py child
   with ESTAB (established) sockets to [2607:6bc0::10]:443).

In production the env var is unset, so the guard is a no-op.
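The default-deny rule could be sketched as an allow-list check plus a thin wrapper. The network list, the reserved-TLD tuple, the error text, and both function names are assumptions; the real conftest.py may classify destinations differently.

```python
import ipaddress

# Loopback, RFC 1918 private ranges, and link-local (IPv4 and IPv6).
ALLOWED_NETS = [ipaddress.ip_network(n) for n in (
    "127.0.0.0/8", "::1/128",
    "10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16",
    "169.254.0.0/16", "fe80::/10",
)]
RESERVED_TLDS = (".invalid", ".test", ".example", ".localhost")  # RFC 2606


def is_allowed_destination(host: str) -> bool:
    h = host.strip("[]").lower()
    if h == "localhost" or h.endswith(RESERVED_TLDS):
        return True
    try:
        ip = ipaddress.ip_address(h)
    except ValueError:
        return False  # ordinary hostnames are denied by default
    return any(ip in net for net in ALLOWED_NETS)


def guarded_connect(real_connect, address, *args, **kwargs):
    """Call the real connect only for allowed destinations."""
    host = address[0]
    if not is_allowed_destination(host):
        raise OSError(f"test network isolation sketch: outbound to {host} blocked")
    return real_connect(address, *args, **kwargs)
```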

Companion changes:

- test_dns_resolution_failure refactored to mock socket.getaddrinfo
  raising gaierror, instead of relying on a real DNS lookup of a
  *.invalid hostname. The test was the one outlier that genuinely
  exercised real DNS; mocking matches what every other probe-error test
  in the same file already does.

- New tests/test_conftest_network_isolation.py with 9 adversarial
  tests proving the block fires for public IPs (including the exact
  Anthropic IPv6 and Amazon IPv4 destinations we observed leaking),
  the allow-list passes loopback / RFC1918 / link-local / reserved-TLDs,
  and the opt-in fixture re-enables real outbound when needed.
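Mocking the resolver rather than performing a real lookup, as in the first companion change, can be sketched like this. `probe_host` is a hypothetical stand-in for the code under test, not the project's actual probe.

```python
import socket
from unittest import mock


def probe_host(host: str, port: int = 443) -> str:
    """Hypothetical probe that reports DNS failure as a label instead of raising."""
    try:
        socket.getaddrinfo(host, port)
    except socket.gaierror:
        return "dns_error"
    return "ok"


def check_dns_failure_is_reported() -> str:
    """Force getaddrinfo to fail so no real DNS query is ever made."""
    failure = socket.gaierror("Name or service not known")
    with mock.patch("socket.getaddrinfo", side_effect=failure):
        return probe_host("unresolvable.invalid")
```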

Test suite: 5,120 → 5,192 (+72 net new from this commit + the regression
tests in the companion commits). Wall time: 161s → 95s on the same
hardware. No remaining outbound from any test path.
2026-05-11 05:59:42 +00:00
nesquena-hermes d86dcc12c6 Merge PR #2055: fix: duplicate assistant transcript merge 2026-05-11 05:12:05 +00:00
nesquena-hermes e0ecf2a035 Merge PR #1970: feat: LM Studio provider with live model discovery 2026-05-11 05:12:04 +00:00
nesquena-hermes 44e7378be8 Merge PR #2053: feat: worktree-backed session creation
# Conflicts:
#	CHANGELOG.md
2026-05-11 05:12:00 +00:00
nesquena-hermes 48cccbcd2e Merge PR #2052: docs: add first-run onboarding guide 2026-05-11 05:11:23 +00:00
nesquena-hermes e3001d16fc Merge PR #2048: [security] validate workspace on import 2026-05-11 05:11:21 +00:00
Frank Song 5a445e7562 Fix duplicate assistant transcript merge 2026-05-11 13:09:16 +08:00
nesquena-hermes 640cf6e6a9 Merge pull request #2054 from nesquena/feat/sidebar-collapse-fused
feat(ux): collapse sidebar by clicking the active rail icon (fuses #1884 + #1924)
v0.51.43
2026-05-10 22:04:29 -07:00
nesquena-hermes b13bc9619c docs: CHANGELOG v0.51.43 Release S 2026-05-11 05:02:13 +00:00
Nathan Esquenazi ba66872f70 fix(sidebar): align collapse CSS breakpoint with JS _isDesktopWidth (641px)
`_isDesktopWidth()` in boot.js gates every collapse path on
`matchMedia('(min-width:641px)')` — matching where the rail itself becomes
visible. The CSS rules driving the actual visual collapse were nested inside
the workspace-panel block at `@media(min-width:901px)` — a threshold copied
from the right-panel collapse but with no functional reason to apply here.

Behavioural consequence in the 641–900 px band (tablet portrait + small
laptop windows):

  - Rail is visible, user clicks the active icon
  - JS adds `.layout.sidebar-collapsed` and writes localStorage='1'
  - JS sets aria-expanded='false' on the active rail button
  - CSS at min-width:901px does NOT apply → sidebar stays at 300 px width
  - User sees no visual change; screen reader announces collapsed state for
    a sidebar that is still visible; localStorage silently persists
  - Resize to ≥901 px later → sidebar suddenly collapses (surprise state)

Fix: hoist the three `.sidebar-collapsed` / flash-prevention rules out of
the workspace-panel @media block and into their own `@media(min-width:641px)`
block. The rail visibility breakpoint, the JS gate, and the CSS gate now
all agree.

`:not(.mobile-open)` is preserved on both selectors so the mobile slide-in
overlay (handled in the `max-width:640px` block) is never targeted — the
new @641 boundary doesn't change that contract.

Verified breakpoint matrix end-to-end (Node harness over real boot.js +
style.css):

  Width | JS desktop | CSS applies | Effect
  ------|------------|-------------|------------
   640  | no         | no          | no-op (mobile overlay)
   641  | yes        | yes         | collapses ✓
   700  | yes        | yes         | collapses ✓
   768  | yes        | yes         | collapses ✓
   900  | yes        | yes         | collapses ✓
   1024 | yes        | yes         | collapses ✓

Regression test added: `test_css_breakpoint_matches_js_isdesktopwidth`
parses boot.js for the `_isDesktopWidth` matchMedia query, walks CSS to
find the @media block enclosing `.layout.sidebar-collapsed`, and asserts
the thresholds match. Locks the invariant so a future refactor can't
re-introduce the asymmetric-band silent-state-leak.
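One way such a threshold-matching test might work is sketched below. Both regular expressions and the brace-depth walk are assumptions; the real test parses the actual boot.js and style.css, which are not shown here.

```python
import re
from typing import Optional


def js_min_width(js_source: str) -> Optional[int]:
    """Extract N from matchMedia('(min-width:Npx)') following _isDesktopWidth."""
    m = re.search(
        r"_isDesktopWidth[\s\S]*?matchMedia\(\s*['\"]\(min-width:\s*(\d+)px\)['\"]",
        js_source,
    )
    return int(m.group(1)) if m else None


def css_min_width_for(css: str, selector: str) -> Optional[int]:
    """Return the min-width of the first @media block that contains the selector."""
    for m in re.finditer(r"@media\s*\(\s*min-width:\s*(\d+)px\s*\)\s*\{", css):
        depth, i = 1, m.end()
        while i < len(css) and depth:  # walk to the brace that closes this block
            depth += {"{": 1, "}": -1}.get(css[i], 0)
            i += 1
        if selector in css[m.end():i]:
            return int(m.group(1))
    return None
```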

Test counts:
  - tests/test_sidebar_collapse_toggle.py: 35/35 pass (was 34, +1 regression)
  - Full suite (Python 3.14, local): 5040 passed, 0 failed

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 21:57:47 -07:00
Frank Song db6857ba86 Address worktree session review notes 2026-05-11 12:51:57 +08:00
nesquena-hermes 1a2cf2812c test(conftest): block AWS IMDS probing + expand credential-strip allowlist
Two test-infrastructure fixes surfaced while running the full suite on
this branch. Both prevent accidental outbound network calls from the
pytest process — a class of bug that doesn't show up as test failures
but corrupts timing, leaks credentials, and was responsible for a recent
10× slowdown observation.

## 1. AWS_EC2_METADATA_DISABLED for the whole pytest session

When hermes-agent's bedrock_adapter / botocore credential chain is
imported during tests (e.g. via api/config.py provider-catalog imports),
botocore probes the EC2 Instance Metadata Service at 169.254.169.254
looking for an instance role. On VPS hosts where IMDS is reachable but
rate-limited (HTTP 429) or non-responsive, those probes dominate wall
time — a 161s test run was observed extending to 600+s.

Set `AWS_EC2_METADATA_DISABLED=true` at module load (before any test-file
imports trigger botocore initialisation). This is the documented AWS-
supported way to silence the probe and matches the guard the agent's own
`hermes_cli/doctor.py` already uses inside its parallel-probe block.

Also explicitly re-set the var on the spawned test-server env so it
can't be accidentally cleared by a later `env.update(...)`.

## 2. Expanded credential-strip allowlist

The original strip list covered 6 providers (OpenRouter, OpenAI,
Anthropic, Google, DeepSeek, Xiaomi). Several others leaked through
into the test server subprocess:

- `MEM0_API_KEY`, `XAI_API_KEY`, `MISTRAL_API_KEY`, `OLLAMA_API_KEY`,
  `GROQ_API_KEY`, `TOGETHER_API_KEY`, …
- AWS credentials (`AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`,
  `AWS_SESSION_TOKEN`, `AWS_PROFILE`, `AWS_BEARER_TOKEN_BEDROCK`)
- Messaging bot tokens (`TELEGRAM_BOT_TOKEN`, `DISCORD_BOT_TOKEN`,
  `SLACK_BOT_TOKEN`, `SIGNAL_API_TOKEN`, `WHATSAPP_API_TOKEN`)
- Memory providers (`HONCHO_API_KEY`, `SUPERMEMORY_API_KEY`)
- Search / browser / image-gen (`FIRECRAWL_API_KEY`, `FAL_KEY`,
  `TAVILY_API_KEY`, `SERPER_API_KEY`, `BRAVE_API_KEY`)
- GitHub tokens (`GH_TOKEN`, `GITHUB_TOKEN`)
- Azure OpenAI (`AZURE_OPENAI_API_KEY`, `AZURE_OPENAI_ENDPOINT`)

A real outbound TLS connection to a provider's IPv6 endpoint was
observed during a test run on this host before the strip was expanded.
The test server uses a mock config and has no business making real API
calls.
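Both fixes amount to shaping the environment handed to the test server. In the sketch below, suffix matching is an illustrative shortcut (the commit lists explicit variable names), and `build_server_env` is a hypothetical helper name.

```python
import os
from typing import Optional

STRIP_EXACT = {
    "GH_TOKEN", "GITHUB_TOKEN", "FAL_KEY", "AZURE_OPENAI_ENDPOINT",
    "AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_SESSION_TOKEN",
    "AWS_PROFILE", "AWS_BEARER_TOKEN_BEDROCK",
}
STRIP_SUFFIXES = ("_API_KEY", "_BOT_TOKEN", "_API_TOKEN")  # assumed shortcut


def build_server_env(base: Optional[dict] = None) -> dict:
    """Copy the environment, drop credentials, and disable the EC2 metadata probe."""
    env = dict(os.environ if base is None else base)
    for name in list(env):
        if name in STRIP_EXACT or name.endswith(STRIP_SUFFIXES):
            del env[name]
    # Set last so a later merge of other variables cannot clear it.
    env["AWS_EC2_METADATA_DISABLED"] = "true"
    return env
```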

## Test status

5,151 passed / 11 skipped / 1 xfailed / 2 xpassed / 0 regressions in
139s on Python 3.11. Down from 147s before the fixes (and from
intermittent 10×-slowdowns on IMDS-rate-limited hosts). All API/feature
contracts unchanged.

## Security audit of remaining test-suite host references

Every IP / URL / hostname referenced in `tests/**.py` was classified:
- Loopback (127.0.0.1, localhost, ::1, 0.0.0.0)
- RFC1918 private (10.*, 172.16-31.*, 192.168.*)
- RFC 5737 TEST-NET-3 documentation (203.0.113.*)
- RFC 2606 reserved docs domains (*.example.com, *.example.local,
  *.example.test)
- Security-attack input strings used only as parser/validator input
  (evil.com, attacker, evil.example.com — never resolved or contacted)
- Real provider/CDN endpoints used only as `base_url` config strings
  or CSP-allowlist assertions — never actually fetched
- 8.8.8.8 used only as a "non-loopback example" in `_is_local_from_handler()`
  unit tests

No suspicious egress destinations.
2026-05-11 04:49:46 +00:00
nesquena-hermes 2dbee503c2 feat(ux): collapse sidebar by clicking the active rail icon (fuses #1884 + #1924)
Lets desktop users collapse the session-list sidebar to maximise the chat
area, without adding any visible UI affordance. Default appearance is
identical to master — only users who actively try to toggle (or know the
keyboard shortcut) ever see a difference.

## Behaviour (desktop only, ≥641px)

| State                              | Action                | Result                                  |
|------------------------------------|-----------------------|-----------------------------------------|
| Sidebar open, click active rail    | Toggle                | Sidebar collapses to width:0            |
| Sidebar open, click different rail | Normal switch         | **Sidebar stays open** (no surprise)    |
| Sidebar collapsed, click any rail  | Expand + switch       | Sidebar expands, then panel switches    |
| Anywhere, Cmd/Ctrl+B               | Toggle                | Same as clicking the active rail        |
| Mobile (<641px), any of the above  | No-op                 | Mobile overlay behaviour unchanged      |

Two discoverability paths, both opt-in. **No new visible buttons.** Users
who never click the active rail icon see zero UI change vs. master.

## Surface-minimal design

The behaviour is contained behind one extra arg on the rail/sidebar-nav
onclick: `switchPanel('chat',{fromRailClick:true})`. Without that flag the
function preserves master's behaviour exactly — every programmatic
`switchPanel(name)` callsite (commands, deeplinks, internal state changes)
is unaffected. The guard chain inside `switchPanel`:

  opts.fromRailClick && _isDesktopWidth() && (
      _isSidebarCollapsed() ? expandSidebar() :
      prevPanel === nextPanel ? (toggleSidebar(true); return false))

is the ONLY new code path that can cause a collapse. Cross-panel clicks
fall through to the existing switch logic untouched.
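The guard chain is JavaScript in the real code; the Python sketch below only restates its decision logic so the three outcomes are explicit. The function name and the returned labels are hypothetical.

```python
def rail_click_action(from_rail_click: bool, is_desktop: bool, collapsed: bool,
                      prev_panel: str, next_panel: str) -> str:
    """Decide what a panel-switch request should do to the sidebar."""
    if from_rail_click and is_desktop:
        if collapsed:
            return "expand_then_switch"  # any rail click reopens the sidebar
        if prev_panel == next_panel:
            return "collapse"            # clicking the active rail toggles it shut
    return "switch"                      # programmatic calls and mobile: legacy path
```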

## Polish from both source PRs

- **Click-active gesture** as the primary toggle (#1884 @jasonjcwu — the
  genuine UX innovation; no extra button needed)
- **Cmd/Ctrl+B keyboard shortcut** (#1924 @spektro33; VS Code convention).
  Guarded against firing when typing in INPUT / TEXTAREA / contenteditable
  so the shortcut never steals from in-progress text editing.
- **Inline flash-prevention `<script>`** in `<head>` (#1924) sets
  `data-sidebar-collapsed='1'` on `<html>` BEFORE the stylesheet loads,
  so cold loads with a persisted-collapsed state paint correctly from
  frame 0 with no flicker. Cleared by JS once the class system takes over.
- **Smooth slide animation** via `.24s cubic-bezier(.22,1,.36,1)`
  (#1924, mirrors the existing workspace-panel collapse on the right)
- **`aria-expanded` mirrored** on the active rail button (#1884) so
  screen readers announce open/collapsed transitions.
- **`body.resizing` transition-suppression** (#1884) keeps the drag-resize
  cursor instant — no animation during a width-resize gesture.
- **bfcache `pageshow` re-sync** (#1884) — if another tab toggled the
  sidebar while this page was frozen, bring it in line on restore.

## Drops vs. #1924

- No persistent rail "toggle sidebar" button (Nathan: keep the UI stealth)
- No close-X button in chat panel head (same reason)
- No i18n keys for the dropped buttons

## What changed in the markup (none of it visible by default)

- 22 rail/sidebar-nav `onclick` handlers gained the `{fromRailClick:true}`
  arg — function-call shape, invisible to users
- 1 inline `<script>` in `<head>` (flash prevention) — invisible
- 5 lines of CSS — invisible unless someone collapses

That's the entire visible-UI delta. **23 ins / 22 del on `index.html`,
all string-replace.**

## Verification

- 5,151 pytest passing including a new 34-test structural suite covering
  every contract (CSS rules, JS functions, fromRailClick guard, legacy
  proxy forwarding, flash-prevention `<script>` ordering, mobile
  exclusion via :not(.mobile-open) selector, aria-expanded sync).

- Live browser walkthrough at 1280px verified:
  - Default boot state identical to master (sidebar open, width 300px)
  - Click active rail → collapse (width 1, opacity 0, translateX -14px,
    localStorage='1', aria-expanded=false). Panel unchanged.
  - Click active rail again → expand back to width 300, aria=true
  - Click DIFFERENT rail → normal switch, sidebar stays open (legacy-
    preserving case, verified explicitly)
  - Click rail while collapsed → expand + switch in one gesture
  - Cmd+B toggles correctly
  - Cmd+B inside `<textarea>` → suppressed (defaultPrevented=false)
  - Reload with collapsed state persisted → restores without flash
  - Mobile simulation (matchMedia returns false for min-width:641px):
    same-active-rail click is no-op, Cmd+B is no-op, sidebar stays at 300px

Co-authored-by: jasonjcwu <jasonjcwu@users.noreply.github.com>
Co-authored-by: spektro33 <spektro33@users.noreply.github.com>
Closes #1884
Closes #1924
2026-05-11 04:49:18 +00:00
Frank Song 186453ea0e Add worktree-backed session creation 2026-05-11 12:12:40 +08:00
Frank Song 7aa1a5f42c docs: add first-run onboarding guide 2026-05-11 11:47:26 +08:00