Commit Graph

1129 Commits

Author SHA1 Message Date
Dutch AI Agency e4d2704ce8 fix: resolve local models from configured base url 2026-05-03 17:04:46 +00:00
nesquena-hermes 3964339a58 Merge pull request #1540 from nesquena/stage-280
v0.50.280 — Cross-channel messaging handoff (#1404) + reasoning-effort salvage (#1535)
v0.50.280
2026-05-03 09:58:43 -07:00
Hermes Bot b931875b7d release: stamp v0.50.280 — #1535 reasoning-config salvage + #1404 cross-channel handoff (3946 → 3985 tests) 2026-05-03 16:56:44 +00:00
Hermes Bot 0cbada7228 Stage 280: PR #1404 — cross-channel messaging handoff (Frank Song, rebased onto master) 2026-05-03 16:51:34 +00:00
Hermes Bot 1d6a89f753 Stage 280: PR #1535 — pass agent.reasoning_effort into WebUI agents (salvages #1531) 2026-05-03 16:51:34 +00:00
Frank Song 7689046305 Polish handoff flyout alignment 2026-05-03 16:35:50 +00:00
Frank Song c7e52084ba Harden messaging channel handoff 2026-05-03 16:35:50 +00:00
Frank Song 20ef643bb8 Add messaging session handoff summary 2026-05-03 16:35:22 +00:00
nesquena df0d904d87 fix(streaming): pass agent.reasoning_effort into WebUI agents (salvages #1531)
Spliced from #1531 by @Asunfly: take Change-1 only (the actual bug fix +
cache signature inclusion) and skip Change-2 (auxiliary title-route
extra_body change) which is a separate scope concern.

## What

Two surgical fixes in api/streaming.py:

1. Line 1820 — `_cfg.cfg.get(...)` → `_cfg.get(...)`. `get_config()` returns
   a plain dict (not a wrapper exposing `.cfg`).  The buggy line raised
   AttributeError that the surrounding try/except swallowed, so
   `_reasoning_config` was always None regardless of what `/reasoning
   <level>` had been set to.  Verified locally — `api/streaming.py:1959`
   already correctly used `_cfg.get(...)` in the same function, so the
   same `_cfg` was being read two different ways in one file.

2. Line 1888 — added `_reasoning_config or {}` to `_sig_blob`.  Without
   this, switching effort mid-session would fail to take effect because
   the per-session agent cache key would still match the old entry.
   Mirrors how `resolved_provider` / `resolved_base_url` already
   participate in the signature.
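
A minimal sketch of both fixes, not the code in api/streaming.py. It assumes the config is a nested dict with an `agent.reasoning_effort` key; `cache_signature` is a hypothetical stand-in for `_sig_blob`, and the function names are illustrative only.

```python
import hashlib
import json


def read_reasoning_effort_buggy(cfg: dict):
    """Pre-fix shape: treats the plain dict as if it exposed a `.cfg` attribute."""
    try:
        # A dict has no `.cfg`, so this raises AttributeError every time.
        return cfg.cfg.get("agent", {}).get("reasoning_effort")
    except Exception:
        # The broad except swallows the error, so the setting silently vanishes.
        return None


def read_reasoning_effort_fixed(cfg: dict):
    """Post-fix shape: read straight from the dict."""
    try:
        return cfg.get("agent", {}).get("reasoning_effort")
    except Exception:
        return None


def cache_signature(provider, base_url, reasoning_config):
    """Hypothetical stand-in for `_sig_blob`: every input that should
    invalidate a cached agent has to be part of the hashed blob."""
    blob = json.dumps([provider, base_url, reasoning_config or {}], sort_keys=True)
    return hashlib.sha256(blob.encode("utf-8")).hexdigest()


cfg = {"agent": {"reasoning_effort": "high"}}  # assumed config layout
print(read_reasoning_effort_buggy(cfg))  # None
print(read_reasoning_effort_fixed(cfg))  # high
```

With the reasoning config in the signature, changing the effort level yields a different cache key, so a cached agent built with the old setting is not reused.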

## Why splice instead of merge #1531 directly

@Asunfly force-pushed a Change-2 onto #1531 after the original review
that removes `extra_body={"reasoning": {"enabled": False}}` from
`generate_title_raw_via_aux` (the auxiliary title-generation route).
That intent is reasonable (let operator-configured `extra_body.reasoning`
flow through to the title route) but it touches a different surface and
deserves its own PR.

The narrow concern is operators who selected a reasoning-capable
auxiliary title model without explicitly setting
`reasoning.enabled=False` in the task config — pre-Change-2 the WebUI
defended against accidental reasoning on the title hot path; post-Change-2
those configs would reason on every new conversation's title, with cost
and latency implications.

## What is NOT in this PR

- The `generate_title_raw_via_aux` extra_body refactor (Change-2 from #1531).
- The `test_does_not_override_configured_reasoning_extra_body` test (guards
  Change-2). Asunfly can re-open that as its own focused PR.

## Tests

Two new R17b/R17c regression assertions in tests/test_regressions.py:

- `test_streaming_reads_reasoning_effort_from_config_dict` — static-source
  guard: `_cfg.cfg` must not return to streaming.py
- `test_streaming_agent_cache_signature_includes_reasoning_config` —
  catches removal of `_reasoning_config` from `_sig_blob`

## Closes

- Closes #1531 (the Change-1 portion ships here; Asunfly can re-open
  Change-2 as a separate PR if desired)

Co-authored-by: Asunfly <[email protected]>
2026-05-03 16:34:25 +00:00
nesquena-hermes f8ed6dac05 Merge pull request #1534 from nesquena/stage-279
v0.50.279 — 8-PR batch from full PR sweep + Opus MUST-FIX caught
v0.50.279
2026-05-03 09:26:03 -07:00
Hermes Bot 11cc493806 release: stamp v0.50.279 — 8-PR batch (sweep) + Opus MUST-FIX absorbed
CHANGELOG, ROADMAP, TESTING bumped (3936 → 3946).

8 constituent PRs:
- #1523 (@franksong2702) branch indicator codepoint fix
- #1519 (@franksong2702) onboarding API-key focus loss fix
- #1518 (@franksong2702) voice-mode toggle-off recognizer stop
- #1516 (@franksong2702) YAML newline CSS rules
- #1517 (@franksong2702) __CACHE_VERSION__ → __WEBUI_VERSION__ rename
- #1532 (@ai-ag2026) state.db WebUI session recovery
- #1525 (@ai-ag2026) stale stream state proactive cleanup
- #1526 (@ai-ag2026) max_tokens forwarding + OpenRouter quota classifier

Opus MUST-FIX absorbed: sw.js conflict-marker cleanup + regression guard.
Opus SHOULD-FIX deferred to follow-up #1533 (race in _clear_stale_stream_state).

2 closed as duplicates: #1528 (identical to #1517), #1529 (superseded by #1516).
1 maintainer-review label: #1531 (Asunfly stowaway change in force-push).
5 stay on hold: #1418 #1464 #1404 #1353 #1311.
2026-05-03 16:23:30 +00:00
Hermes Bot 2856ee6637 fix(stage-279): absorb Opus MUST-FIX — sw.js conflict-marker resolution
Opus advisor flagged that the conflict-marker resolution from PR #1525's
merge had not actually landed — static/sw.js still contained the literal
<<<<<<< HEAD / ======= / >>>>>>> pr-1525 markers, which made the file
fail to parse as JavaScript even though the substring-based source-string
tests still passed (the __WEBUI_VERSION__ token was present, just inside
the conflict block).

Concrete impact pre-fix when shipped:
- Service worker install handler would throw on script load
- SW would never reach activated state
- Old SW (from v0.50.278) would keep controlling the page indefinitely
- Frontend cache-bust pathway silently broken
- The INFLIGHT[sid] clear in static/sessions.js (the frontend half of
  PR #1525's stale-stream cleanup) would never deliver to existing
  browsers because the new SW would never activate

Fix:
- Resolve sw.js conflict to keep CACHE_NAME = 'hermes-shell-__WEBUI_VERSION__'
  (the post-#1517 rename, with the manual -stale-stream-cleanup1 suffix
  dropped as redundant — natural version-token bump invalidates old caches).
- Add tests/test_pwa_manifest_sw.py::test_sw_js_has_no_merge_conflict_markers
  regression guard that scans for <<<<<<<, =======, >>>>>>> in sw.js source.
- Update tests/test_stale_stream_cleanup.py::test_service_worker_cache_
  bumped_for_frontend_fix_delivery to assert the canonical version-token
  CACHE_NAME pattern instead of the (now-removed) -stale-stream-cleanup1
  manual suffix.
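
A minimal sketch of a conflict-marker guard of the kind described above. It is not the test in tests/test_pwa_manifest_sw.py; the helper name and the regex are assumptions. It also shows why a substring check alone could not catch the problem.

```python
import re

# Lines that begin with exactly seven of git's conflict-marker characters.
_CONFLICT_MARKER = re.compile(r"^(<{7}|={7}|>{7})(\s|$)", re.MULTILINE)


def find_conflict_markers(source: str) -> list:
    """Return the conflict markers found at the start of any line."""
    return [match.group(1) for match in _CONFLICT_MARKER.finditer(source)]


conflicted = (
    "<<<<<<< HEAD\n"
    "const CACHE_NAME = 'hermes-shell-__WEBUI_VERSION__';\n"
    "=======\n"
    "const CACHE_NAME = 'hermes-shell-__CACHE_VERSION__-stale-stream-cleanup1';\n"
    ">>>>>>> pr-1525\n"
)

# The substring check still passes on the broken file...
print("__WEBUI_VERSION__" in conflicted)
# ...while the marker scan reports the unresolved conflict.
print(find_conflict_markers(conflicted))
```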

3945 → 3946 tests passing (+1 from the new conflict-marker guard).

This issue would have shipped a broken service worker if Opus hadn't
caught it. The new test_sw_js_has_no_merge_conflict_markers test would
have flagged it earlier in the pipeline.

Caught-by: Opus advisor pass on stage-279 brief
Co-authored-by: ai-ag2026 <ai-ag2026@users.noreply.github.com>
2026-05-03 16:21:42 +00:00
Hermes Bot a5e6b9dc8b Merge PR #1526 by @ai-ag2026: pass WebUI max_tokens into agent + classify OpenRouter quota phrases (refs #1524) 2026-05-03 16:06:55 +00:00
Hermes Bot 1148656370 Merge PR #1525 by @ai-ag2026: clear stale WebUI stream state proactively (refs #1471)
Merge conflict resolution: kept HEAD's `CACHE_NAME = 'hermes-shell-__WEBUI_VERSION__'` (post-#1517 rename) over PR #1525's `'hermes-shell-__CACHE_VERSION__-stale-stream-cleanup1'` manual suffix. The renamed placeholder still auto-bumps with each release through the `quote(WEBUI_VERSION, safe="")` substitution, so the manual `-stale-stream-cleanup1` suffix is no longer needed to force-update existing service workers — the natural version bump (v0.50.278 → v0.50.279) already invalidates the old cache via `caches.delete(k)` for `k !== CACHE_NAME` in the SW activate handler. No behavioral regression: the SW cache still bumps on this release, just via the canonical version-token path.

Co-authored-by: ai-ag2026 <ai-ag2026@users.noreply.github.com>
2026-05-03 16:06:42 +00:00
Hermes Bot 437eae00be Merge PR #1532 by @ai-ag2026: recover WebUI-origin state.db sessions when JSON sidecar missing (refs #1471) 2026-05-03 16:06:04 +00:00
Hermes Bot c8c9acbefb Merge PR #1517 by @franksong2702: consolidate __CACHE_VERSION__ into __WEBUI_VERSION__ — closes #1509 2026-05-03 16:05:56 +00:00
Hermes Bot 6755b1eab5 Merge PR #1516 by @franksong2702: YAML code blocks render with newlines (Prism token white-space) — closes #1463 2026-05-03 16:05:56 +00:00
Hermes Bot 6967965782 Merge PR #1518 by @franksong2702: voice-mode pref toggle-off stops the recognizer — closes #1491 2026-05-03 16:05:56 +00:00
Hermes Bot 8080e9885a Merge PR #1519 by @franksong2702: onboarding API-key field stops losing focus during probe — closes #1503 2026-05-03 16:05:56 +00:00
Hermes Bot f06f3cd5e7 Merge PR #1523 by @franksong2702: fix branch indicator codepoint (\u2482 → \u2442) — closes #1522 2026-05-03 16:05:56 +00:00
Manfred 9c0a16fdd6 fix: recover WebUI-origin state.db sessions 2026-05-03 15:41:56 +02:00
Manfred dbb0879956 fix: pass WebUI max_tokens to agents
Read configured max_tokens from config.yaml, pass it into WebUI-created AIAgent instances when supported, and include it in the agent cache signature. Also classify OpenRouter quota phrasing such as more credits, can only afford, and fewer max_tokens.

Adds regression coverage for max_tokens propagation, cache signature isolation, and quota error classification.
2026-05-03 11:46:42 +02:00
Manfred 6bce34c27e fix: clear stale WebUI stream state
Clear persisted active_stream_id and pending runtime fields when the server no longer has the referenced live stream. Also drop browser-side INFLIGHT state when the server reports a session idle and bump the service-worker cache so the frontend fix is delivered.

Adds regression coverage for backend stale-stream cleanup, frontend inflight invalidation, and cache busting.
2026-05-03 11:46:42 +02:00
Frank Song 57eb2fbf56 fix: update test assertion to match corrected Unicode codepoint (\u2442) 2026-05-03 15:36:05 +08:00
Frank Song dc7b142bb5 fix: use correct Unicode codepoint for branch indicator (⑂ not ⒂)
\u2482 (PARENTHESIZED NUMBER FIFTEEN, displayed as ⒂) → \u2442 (OCR FORK, displayed as ⑂)

Fixes #1522
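
A quick way to check codepoints like these before committing is to ask Python's standard `unicodedata` module for their official names. This is only a verification sketch, not part of the fix.

```python
import unicodedata

# Print the official Unicode name for the old and the new codepoint.
for ch in ("\u2482", "\u2442"):
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
```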
2026-05-03 15:31:15 +08:00
nesquena-hermes 9e31a2ac65 Merge pull request #1521 from nesquena/stage-278
v0.50.278 — sidebar Unassigned filter chip (splices #1497 + #1513)
v0.50.278
2026-05-03 00:17:17 -07:00
Hermes Bot 0413ee4fc0 release: stamp v0.50.278 (PR #1520 — sidebar Unassigned filter chip)
CHANGELOG, ROADMAP, TESTING bumped (3929 → 3936).

Pre-release Opus advisor pass: SHIP AS-IS. Sentinel collision impossible
(UUID hex — no underscores), stale-active-filter on project delete safe,
CSS specificity clean. One non-blocking edge case (stuck filter at zero
projects + zero unassigned) explicitly deferred per Opus advice
(recoverable via reload, too narrow to justify pre-merge work).

Both contributors (Thanatos-Z and AlexeyDsov) credited via Co-authored-by
trailers preserved from the synthesis commit.
2026-05-03 07:15:01 +00:00
Hermes Bot 6a75907802 feat(sidebar): add "Unassigned" project-filter chip for sessions without a project
Spliced from contributor PRs #1497 (Thanatos-Z) and #1513 (AlexeyDsov), which
both added the ability to filter the sidebar to sessions with no project_id
assigned. Lands here as a focused PR with the best of both:

## Synthesis decisions

- **Sentinel constant approach** (from #1497, Thanatos-Z): single state
  variable (`_activeProject` set to `NO_PROJECT_FILTER` sentinel) instead
  of a parallel `_showNoneProject` boolean. No two-state-machine ambiguity,
  no risk of "All" + "Unassigned" both reading active. Clicking "All"
  automatically clears the unassigned filter because there is only one
  variable to reset.

- **Conditional rendering** (from #1497): the chip only appears when
  there are actually unassigned sessions to filter to (`hasUnprojected`).
  Common case where every session is organized → chip stays hidden,
  uncluttered chip bar. The project-bar itself also renders when there
  are unassigned sessions (was previously gated on `_allProjects.length`).

- **Dashed-border visual treatment** (from #1497): `.project-chip.no-project
  {border-style:dashed;}` distinguishes the chip from real project chips
  so it reads as a meta-filter ("things without a project") rather than
  another project. Subtle but present.

- **"Unassigned" label** (new): clearer than #1497's "No project" (which
  reads like a status filter) or #1513's "None" (which is ambiguous —
  none of what?). Matches the conventional file-manager / task-tracker
  mental model: "things not yet assigned to a category." Tooltip elaborates:
  "Show conversations not yet assigned to a project."

- **Branched empty-state copy**: when the Unassigned filter is active
  and the result is empty, show "No unassigned sessions." instead of
  the generic "No sessions in this project yet."

## Tests

7 new tests in tests/test_sidebar_unassigned_filter.py pin every contract:
sentinel constant declared; filter logic uses !s.project_id when sentinel
is active; chip only renders when hasUnprojected; chip label and click
handler; visual treatment (dashed border + .no-project class); empty-state
copy branches on the active filter; All chip handler clears _activeProject
to null (would catch a regression if a parallel _showNoneProject boolean
is ever reintroduced).

Local full suite: 3929 → 3936 passing (+7).

Live verified at port 8789 with seeded data (5 projects + 73 unassigned
sessions in active profile): chip appears between "All" and project chips
when unassigned sessions exist; click cycles correctly; clicking a real
project hides the Unassigned chip from active state; clicking "All"
deactivates everything; dashed border present per getComputedStyle.

Co-authored-by: Thanatos-Z <thanatos-z@users.noreply.github.com>
Co-authored-by: Alexey Denisov <AlexeyDsov@users.noreply.github.com>
2026-05-03 07:08:08 +00:00
Frank Song ac3d336875 fix: onboarding API-key input loses focus when probe completes (#1503)
The onboarding wizard's API-key input calls _scheduleOnboardingProbe()
on every keystroke (oninput). When the 400ms-debounced probe completes,
_setOnboardingProbeState() calls _renderOnboardingBody() which rebuilds
the entire form — destroying and recreating the <input> element. The
user's focus and cursor position are lost.

On fast connections (localhost) the probe completes between keystrokes
so the bug window is narrow. On slow networks (VPN, corporate proxy,
cold-start vLLM) the re-render routinely lands mid-typing.

Fix: remove _scheduleOnboardingProbe() from the api-key input's
oninput handler. The probe still fires on:
- baseUrl input change (oninput + debounce, unchanged)
- api-key field blur (onblur, added)
- 'Test connection' button click (unchanged)
- nextOnboardingStep() before Continue (unchanged)

The baseUrl input retains the oninput probe because the UX trade-off
is acceptable there (text input preserves visible content on re-render).
2026-05-03 15:05:40 +08:00
Frank Song f32989d5bb fix: voice-mode pref toggle-off now stops the recognizer (#1491)
When a user disables 'Hands-free voice mode' in Settings while voice
mode is active, the button hides but the SpeechRecognition keeps
running — the user can't stop it because the button is invisible.

Fix: _applyVoiceModePref() now checks if voice mode is active and
calls _deactivate() when the pref is toggled off. Move
_voiceModeActive declaration above the function to avoid TDZ.

Also removes a duplicate window._applyVoiceModePref assignment.
2026-05-03 15:03:17 +08:00
Frank Song 8f3dbe185d fix: consolidate __CACHE_VERSION__ → __WEBUI_VERSION__ (#1509)
__CACHE_VERSION__ (sw.js) and __WEBUI_VERSION__ (index.html) are
functionally identical — both resolve to quote(WEBUI_VERSION, safe='')
at request time. Two names exist for historical reasons (different files
added at different times).

Rename __CACHE_VERSION__ → __WEBUI_VERSION__ in:
- static/sw.js (CACHE_NAME + VQ constant + comment)
- api/routes.py (substitution string)
- tests/test_pwa_manifest_sw.py (all assertions)

Single canonical name. No behavior change — same ?v=vX.Y.Z query strings
on the same URLs.
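
A minimal sketch of the request-time substitution described above. The `quote(WEBUI_VERSION, safe='')` call is from the commit message; `render_asset` and the example version value are assumptions for illustration, not the code in api/routes.py.

```python
from urllib.parse import quote

WEBUI_VERSION = "v0.50.279"  # example value only


def render_asset(source: str, version: str = WEBUI_VERSION) -> str:
    """Replace the single canonical placeholder with the URL-quoted version."""
    return source.replace("__WEBUI_VERSION__", quote(version, safe=""))


sw_source = "const CACHE_NAME = 'hermes-shell-__WEBUI_VERSION__';"
print(render_asset(sw_source))
```

Passing `safe=""` means characters such as `/` or `+` in a version string are percent-encoded, so the value can be dropped into a `?v=` query string as-is.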
2026-05-03 14:59:37 +08:00
Frank Song b57e80f706 fix: YAML code blocks collapse newlines due to Prism token white-space (#1463)
Prism's YAML grammar wraps tokens in <span> elements where white-space
defaults to normal, collapsing \n characters into spaces. The DOM
textContent is correct (confirmed by reporter's probe), so the bug is
purely CSS.

Force white-space:pre on .token elements inside language-yaml code
blocks for both .msg-body and .preview-md contexts.
2026-05-03 14:54:34 +08:00
nesquena-hermes 7921a47f9d Merge pull request #1515 from nesquena/stage-277
v0.50.277 — model-picker shared-reference fix (supersedes #1511)
v0.50.277
2026-05-02 23:50:17 -07:00
Hermes Bot afa7223c1a release: stamp v0.50.277 + Opus SHOULD-FIX (production-path regression guard)
CHANGELOG, ROADMAP, TESTING bumped (3925 → 3929 tests collected).

Opus SHOULD-FIX absorbed in-release: tests #1-3 documented the dedup
contract via direct construction but did not invoke get_models_grouped().
Test #4 (test_get_models_grouped_unconfigured_providers_get_independent_dicts)
inspects the live source for the literal copy.deepcopy(auto_detected_models)
call AND runs an end-to-end smoke of the fixed assignment loop.

A future refactor that removes the deepcopy at api/config.py:2078 will
fail this test immediately.
2026-05-03 06:47:52 +00:00
Hermes Bot 6381ab1b8a fix(model-picker): deepcopy auto_detected_models per group to stop dedup bleed-across (#1511 root cause)
Supersedes contributor PR #1511 (lost9999), which removed the label-suffix
logic in _deduplicate_model_ids() but left the underlying shared-reference
bug intact — IDs would still be silently corrupted across provider groups,
just with cleaner-looking labels.

## Bug shape

When multiple unconfigured providers (Ollama / HuggingFace / custom
endpoints / Google Gemini CLI / Xiaomi / etc.) all fell through to the
'else' branch in api/config.py:get_models_grouped() that ends with:

    groups.append({..., "models": auto_detected_models})

every group ended up sharing the SAME list reference AND the SAME dicts
inside. When _deduplicate_model_ids() then mutated those dicts to add
@provider_id: prefixes and provider-name parentheticals, the changes were
applied to every group that referenced the same dict.

Visible symptom: user 'vishnu' reported the dropdown showing
'Deepseek V4 Flash (Xiaomi) (Ollama) (HuggingFace) (Google-Gemini-Cli)'
on every group. Hidden symptom (worse): the 'id' field collapsed to
'@xiaomi:deepseek-v4-flash' on every group too, so clicking the entry
under any group routed the request to Xiaomi.

## Fix

api/config.py:2078 — wrap auto_detected_models in copy.deepcopy() at the
groups.append site so each group gets its own independent dicts. The
existing _deduplicate_model_ids() logic is correct and unchanged; the
bug was in the assignment site, not the dedup function.

The single-parenthetical disambiguation in labels is retained because
the composer chip (composer-model-label) shows the model label without
the optgroup header context — 'Deepseek V4 Flash (Ollama)' is more
useful than ambiguous 'Deepseek V4 Flash' there.
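
The shared-reference bug and the deepcopy fix can be reproduced in a few lines. This is a simplified sketch: `build_groups` and `prefix_ids` are hypothetical stand-ins for the assignment site in get_models_grouped() and for _deduplicate_model_ids(), not the real functions.

```python
import copy


def build_groups(provider_ids, auto_detected_models, isolate):
    """Attach the auto-detected model list to every provider group."""
    groups = []
    for pid in provider_ids:
        # isolate=False mirrors the bug: every group points at the SAME dicts.
        models = copy.deepcopy(auto_detected_models) if isolate else auto_detected_models
        groups.append({"provider_id": pid, "models": models})
    return groups


def prefix_ids(groups):
    """Simplified dedup step: mutates each model dict in place."""
    for group in groups:
        for model in group["models"]:
            bare = model["id"].split(":", 1)[-1]
            model["id"] = f"@{group['provider_id']}:{bare}"
    return groups


providers = ["ollama", "xiaomi"]

shared = prefix_ids(build_groups(providers, [{"id": "deepseek-v4-flash"}], isolate=False))
print([g["models"][0]["id"] for g in shared])    # both entries end up routed to xiaomi

isolated = prefix_ids(build_groups(providers, [{"id": "deepseek-v4-flash"}], isolate=True))
print([g["models"][0]["id"] for g in isolated])  # one distinct id per provider
```

In the shared case the last provider processed wins for every group, which matches the hidden symptom described above.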

## Tests

tests/test_issue1511_dedup_shared_reference.py — 3 new tests:
- test_groups_have_independent_model_lists: structural invariant pin
- test_unconfigured_providers_no_shared_dedup_bleed: end-to-end against
  the corrected code path; verifies each group gets its own @provider_id:
  prefix and exactly ONE provider parenthetical per disambiguated label
- test_shared_reference_pre_fix_demonstrates_corruption: documents the
  broken state that motivated the fix

Full suite: 3925 → 3928 passing (+3 new, 0 regressions).

Co-authored-by: lost9999 <56498264+lost9999@users.noreply.github.com>
2026-05-03 06:41:11 +00:00
nesquena-hermes 8ef58cad27 Merge pull request #1510 from nesquena/stage-276
v0.50.276 — SW stale-CSS fix (PR #1508, closes #1507)
v0.50.276
2026-05-02 23:28:34 -07:00
Hermes Bot 2420c6bda3 release: stamp v0.50.276 (PR #1508 — SW stale-CSS fix, closes #1507)
CHANGELOG, ROADMAP, TESTING all updated.
3923 → 3925 tests collected (+2 regression tests).

Pre-release Opus advisor pass: SHIP AS-IS.
Independent review: nesquena APPROVED with end-to-end trace.

Migration note: existing v0.50.275 users will see one more round of
broken styling on first reload after upgrade (old SW serves old
index.html). Subsequent reloads clean. Future upgrades will not
recur because SW pre-cache is now keyed on versioned URL.

Filed follow-up #1509 for __CACHE_VERSION__/__WEBUI_VERSION__
placeholder consolidation (low-priority cleanup, no functional impact).
2026-05-03 06:26:41 +00:00
Hermes Bot d7b34a740e Merge PR #1508: version style.css link so old SW cannot return stale CSS (closes #1507) 2026-05-03 06:20:43 +00:00
nesquena-hermes 4fea813adc fix(sw-cache): version style.css link so old SW cannot return stale CSS (#1507)
Container restart / in-place upgrade left the previous service worker still
controlling open tabs. Its fetch handler intercepted 'static/style.css',
matched the unversioned URL exactly against its old shell cache, and returned
the OLD CSS — while the JS files (which already carry ?v=__WEBUI_VERSION__)
hit the cache as misses and loaded fresh from network. New JS + old CSS
broke the layout until a force refresh bypassed the SW.

Fix is a 1-line attribute change plus aligning the SW pre-cache list:

* static/index.html: add ?v=__WEBUI_VERSION__ to the style.css link, matching
  the pattern already in use for every JS file in the page.
* static/sw.js: add the same ?v=__CACHE_VERSION__ suffix to every versioned
  entry in SHELL_ASSETS so that pre-cache URLs match what the page actually
  requests. Unversioned entries (root, manifest, favicons) stay unversioned.

Tests:

* New regression test_index_versions_stylesheet (lock the href) and
  test_sw_shell_assets_match_versioned_asset_urls in test_pwa_manifest_sw.py.
* test_workspace_panel_preload_marker_restored_in_head in test_sprint37.py
  loosened to match the css link prefix (preserves the ordering invariant).

Verified live on port 8789: served HTML carries
'static/style.css?v=v0.50.275-dirty' and SW SHELL_ASSETS receive the
matching VQ at request time.

Closes #1507.
2026-05-03 06:09:47 +00:00
nesquena-hermes 52226bcdd7 Merge pull request #1506 from nesquena/stage-275
v0.50.275 — /session/static/* MIME-type fix (PR #1505 by @rickchew)
v0.50.275
2026-05-02 22:29:52 -07:00
Hermes Bot 995822ac0d release: stamp v0.50.275 (PR #1505 — /session/static/* MIME-type fix)
CHANGELOG, ROADMAP, TESTING all updated.
3918 → 3923 tests collected (+5 regression tests).

Pre-release Opus advisor pass: SHIP. Path-traversal sandbox confirmed
for literal .. and URL-encoded %2e%2e variants. Auth-exemption benign
(404s any sandbox escape before bytes leak).
2026-05-03 05:25:58 +00:00
Hermes Bot 8f58688b66 test: lock /session/static MIME-type + auth fix; drop unused import
- Add tests/test_session_static_assets.py (5 tests):
  * /session/static/style.css must return text/css (not text/html)
  * /session/static/ui.js must return application/javascript
  * /session/<id> still serves the HTML index (catch-all not weakened)
  * Path-traversal still sandboxed after prefix strip
  * /session/static/* matches /static/* auth-exemption policy
- Drop unused 'from urllib.parse import urlparse as _up' import from
  PR #1505's added block (parsed._replace already gives a usable result).

Co-authored-by: Rick Chew <rickchew@users.noreply.github.com>
2026-05-03 05:20:19 +00:00
Hermes Bot a60273b852 Merge PR #1505: serve static assets correctly under /session/* routes 2026-05-03 05:12:24 +00:00
Rick Chew 7cf2150b94 fix: serve static assets correctly under /session/* routes
When the browser loads a session page at /session/<id>, it requests
static assets relative to that path — e.g. /session/static/style.css.
The /session/* catch-all in handle_get() intercepted those requests and
returned the HTML index page (text/html), causing browsers to refuse the
stylesheet with a MIME-type mismatch error.

Two-part fix:
- routes.py: add a guard before the /session/ catch-all that strips the
  /session prefix from /session/static/* paths and delegates to
  _serve_static(), so the correct Content-Type is returned.
- auth.py: whitelist /session/static/* in check_auth() alongside
  /static/, so static assets on session pages are served without
  requiring an authenticated session (same policy as /static/).
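
A minimal sketch of the prefix-stripping guard, assuming a helper named `rewrite_session_static` (hypothetical). The normalization check is an extra assumption added here for illustration; in the real code the sandboxing belongs to `_serve_static()`.

```python
import posixpath
from typing import Optional
from urllib.parse import urlparse


def rewrite_session_static(url: str) -> Optional[str]:
    """Map /session/static/<asset> to /static/<asset>; return None otherwise."""
    parsed = urlparse(url)
    if not parsed.path.startswith("/session/static/"):
        return None  # let the /session/<id> catch-all handle it
    stripped = parsed.path[len("/session"):]  # "/session/static/x" -> "/static/x"
    normalized = posixpath.normpath(stripped)
    if not normalized.startswith("/static/"):
        return None  # "../" segments escaped the static root
    # Keep the query string (e.g. ?v=...) and swap only the path.
    return parsed._replace(path=normalized).geturl()


print(rewrite_session_static("/session/static/style.css"))
print(rewrite_session_static("/session/abc123"))
```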

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-03 13:05:15 +08:00
nesquena-hermes 539a72b9e6 Merge pull request #1504 from nesquena/stage-274
Stage 274: PR #1501 — LM Studio onboarding fully fixed (probe + keyless + LM_API_KEY alignment) (closes #1499 #1500)
v0.50.274
2026-05-02 20:34:56 -07:00
Hermes Bot 3837ed8bf1 chore(release): stamp v0.50.274 — LM Studio onboarding fully fixed (#1499 #1500)
PR #1501 closes all three sub-bugs from #1420:
- #1499 (a): probe <base_url>/models before persisting
- #1499 (third sub-bug): keyless setup is a first-class state for self-hosted providers
- #1500: webui env var aligned with agent CLI's canonical LM_API_KEY

Backed by 60+ regression tests (38 new + 22 updated). Pre-release Opus
advisor pass: ship-ready. Independent review by nesquena: APPROVED with
4 non-blocking observations (1 fixed in-release, 3 deferred to follow-ups
#1502 + #1503 + future helper extraction).

Closes #1499, closes #1500.
Refs #1502 (legacy alias sunset tracking), #1503 (probe re-render UX papercut).
2026-05-03 03:33:07 +00:00
Hermes Bot e7a19d2754 Stage 274: PR #1501 — onboarding probe + keyless setup + env-var alignment (#1499 #1500) 2026-05-03 03:24:00 +00:00
Hermes Bot ba6f34488e fix(onboarding,probe): refuse HTTP redirects on probe path (reviewer-flagged on PR #1501)
SSRF defense-in-depth: `urllib.request.urlopen` follows redirects by default,
so a probe at `http://example.com/v1/models` could be redirected to
`http://internal-service:8080/admin` — surfacing internal HTTP services to
the authenticated user. The probe is already gated behind WebUI auth and the
local-network check, so the practical attack surface is 'authenticated user
enumerating internal services' (same as `curl` from their browser DevTools).
Tightening the redirect default is cheap insurance.

Implementation:

- New module-level `_NoRedirectHandler` (subclasses `urllib.request.HTTPRedirectHandler`,
  overrides `redirect_request` to return None — urllib then raises `HTTPError(3xx)`
  rather than following).
- New module-level `_PROBE_OPENER = urllib.request.build_opener(_NoRedirectHandler())`.
- `probe_provider_endpoint` switches from `urlopen(req, …)` to `_PROBE_OPENER.open(req, …)`.
- The existing `HTTPError` handler now categorizes 3xx as `unreachable` with a
  detail string mentioning 'redirect' so the user understands what happened.
  3xx does NOT get its own error code in `PROBE_ERROR_CODES` — the error
  taxonomy contract stays the same shape (frontend i18n unchanged).
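
The handler and opener described above can be sketched with the standard library alone. Names are simplified (no leading underscores), and `probe` is a reduced, hypothetical stand-in for `probe_provider_endpoint`, so treat it as an illustration of the technique rather than the shipped code.

```python
import urllib.error
import urllib.request


class NoRedirectHandler(urllib.request.HTTPRedirectHandler):
    """Refuse every redirect: returning None makes urllib raise HTTPError(3xx)."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None


# An opener built with the subclass replaces urllib's default redirect handler.
PROBE_OPENER = urllib.request.build_opener(NoRedirectHandler())


def probe(url: str, timeout: float = 5.0) -> dict:
    """Fetch `url` without following redirects and report a coarse result."""
    req = urllib.request.Request(url, headers={"Accept": "application/json"})
    try:
        with PROBE_OPENER.open(req, timeout=timeout) as resp:
            return {"ok": True, "status": resp.status}
    except urllib.error.HTTPError as exc:
        if 300 <= exc.code < 400:
            # No dedicated error code for 3xx: reuse 'unreachable' and explain why.
            return {"ok": False, "error": "unreachable", "status": exc.code,
                    "detail": "endpoint answered with a redirect, which the probe refuses to follow"}
        code = "http_4xx" if exc.code < 500 else "http_5xx"
        return {"ok": False, "error": code, "status": exc.code}
    except (urllib.error.URLError, OSError) as exc:
        return {"ok": False, "error": "unreachable", "detail": str(exc)}
```

Calling `probe()` against a server that answers `/v1/models` with a 302 should then yield `ok: False` with `status: 302` instead of the body of the redirect target.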

Added regression test `test_probe_does_not_follow_redirects` in
`tests/test_issue1499_onboarding_probe.py`. Spins up a tiny HTTP server that
302-redirects `/v1/models` to `/different-endpoint` (which would return
`{'data': [{'id': 'should-not-see'}]}` if followed). Asserts the probe
returns `{ok: False, error: 'unreachable', status: 302, detail: …'redirect'…}`
and that the 'should-not-see' string never appears in the result.

Mutation-verified: reverting `_PROBE_OPENER.open` back to `urlopen` causes
the test to fail with "Probe followed a redirect — should have refused".

Suite delta: 3917 → 3918 passing (+1).

Reviewer-flagged in PR #1501. Per the
'reviewer-flagged-fix-in-release-not-followup' policy: <20 LOC defensive
fix, regression test path obvious, ship in this release rather than punting.
2026-05-03 03:21:22 +00:00
Hermes Bot 8f4692b8cf fix(onboarding): allow keyless setup for self-hosted providers (#1499 third sub-bug)
Pre-fix, the wizard rejected an empty api_key for every provider in
_SUPPORTED_PROVIDER_SETUPS — including lmstudio, ollama, and custom,
which run keyless on the vast majority of local installs. The agent's
LMSTUDIO_NOAUTH_PLACEHOLDER substitution at chat-time was the workaround
for the no-auth case, but the wizard side rejected the empty input first.
Users had to type random gibberish into the API key field to clear the
form — the third sub-bug from #1420 that the prior commit's PR description
explicitly punted to a follow-up.

Surfaced by Nathan during PR review: "I think it's too weird for users
to have to type a string into the API key field, right?"  Yes — and the
probe (#1499) makes the cleanest fix strictly better: we accept empty
keys, and the probe gives instant feedback ("Connected. 2 model(s)
available." for keyless servers, "401" for auth-required servers).

Backend changes
---------------

* `api/onboarding.py` — `_SUPPORTED_PROVIDER_SETUPS` gains
  `key_optional: True` for `lmstudio`, `ollama`, `custom`. Cloud
  providers (openrouter, anthropic, openai, gemini, deepseek, …)
  remain key_required.

* `apply_onboarding_setup` skips the "{env_var} is required" check
  when `key_optional` is set AND no key is supplied. No write to .env
  for the empty-key case (no `LM_API_KEY=*** placeholder lying in the
  user's .env`).

* `_status_from_runtime` reports `provider_ready=True` for key_optional
  providers based on `requires_base_url` alone, so the wizard doesn't
  refire on the next page load just because there's no api_key. Cloud
  providers still need a key for provider_ready=True.

* `_build_setup_catalog` exposes the `key_optional` flag to the frontend.
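
A minimal sketch of the backend rules above. The table is a cut-down, hypothetical version of `_SUPPORTED_PROVIDER_SETUPS`: the `OPENROUTER_API_KEY` name and the `requires_base_url` values are illustrative assumptions, as are the two function names.

```python
PROVIDER_SETUPS = {
    # Self-hosted provider: the key may be left blank.
    "lmstudio": {"env_var": "LM_API_KEY", "requires_base_url": True, "key_optional": True},
    # Cloud provider: a key is still mandatory (env var name is illustrative).
    "openrouter": {"env_var": "OPENROUTER_API_KEY", "requires_base_url": False},
}


def env_writes_for(provider: str, api_key: str) -> dict:
    """Return the .env entries to persist; empty for a keyless setup."""
    setup = PROVIDER_SETUPS[provider]
    key = (api_key or "").strip()
    if not key:
        if setup.get("key_optional"):
            return {}  # keyless: nothing is written, no placeholder string
        raise ValueError(f"{setup['env_var']} is required")
    return {setup["env_var"]: key}


def provider_ready(provider: str, api_key: str, base_url: str) -> bool:
    """Key-optional providers are ready on base_url alone; others need a key."""
    setup = PROVIDER_SETUPS[provider]
    has_url = bool(base_url) or not setup.get("requires_base_url")
    if setup.get("key_optional"):
        return has_url
    return has_url and bool(api_key)


print(env_writes_for("lmstudio", ""))  # {}
print(provider_ready("lmstudio", "", "http://127.0.0.1:11234/v1"))  # True
```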

Frontend changes
----------------

* `static/onboarding.js` — new `_renderOnboardingApiKeyField()` helper.
  For key_optional providers:
    - Label: "API key (optional)"
    - Placeholder: "Leave blank for keyless servers"
    - Inline italic muted help: "Most LM Studio / Ollama / vLLM installs
      run keyless — leave this blank if your server doesn't require
      authentication. Use the Test connection button to verify."
  For cloud providers: unchanged (label "API key", standard placeholder,
  no help block).

* The api-key input also now triggers `_scheduleOnboardingProbe()` on
  oninput, so changing the key re-runs the probe — handles "the server
  rejected my empty key with 401, let me add one and retry."

* `static/i18n.js` — 3 new keys × 9 locales (canonical English in `en`,
  English fallback with `// TODO: translate` markers in the other 8).

* `static/style.css` — `.onboarding-api-key-help` rule for the muted
  italic helper paragraph.

Verified end-to-end on port 8789
--------------------------------

Spun up an isolated test server + a mock LM Studio at
`127.0.0.1:11234/v1/models`. Stepped through the wizard:

* Picked LM Studio → field label flipped to "API key (optional)",
  placeholder showed "Leave blank for keyless servers", help text
  rendered in italic muted gray below.
* Switched to Anthropic → label reverted to "API key", help text
  disappeared. Visual hierarchy correct.
* Left api_key blank, set base_url to the mock, clicked Test connection
  → green "Connected. 2 model(s) available." banner. Probe-discovered
  models populated the workspace-step dropdown.
* Continued through to the finish step. config.yaml written with
  provider/model/base_url. **`.env` does NOT exist** — no placeholder
  string written. `chat_ready: true`, `state: ready`.
* Vision tool confirmed the visual hierarchy: subtle italic help
  reads as documentation, prominent green banner pops as status.

Tests
-----

`tests/test_issue1499_keyless_onboarding.py` — 16 tests in 3 classes:

  TestKeyOptionalProviderSchema (5)
    - lmstudio / ollama / custom declare key_optional=True
    - openrouter / anthropic / openai do NOT (regression defense)
    - setup catalog exposes the flag

  TestKeylessOnboarding (6)
    - lmstudio / ollama / custom: empty api_key accepted, no .env write
    - openrouter / anthropic: empty api_key still rejected
    - lmstudio with explicit key still writes .env (regression defense)

  TestKeylessChatReady (5)
    - lmstudio / ollama: provider_ready=True with no key
    - custom: provider_ready=True with key+base_url, False without base_url
    - openrouter: provider_ready=False with no key (regression defense)
    - End-to-end get_onboarding_status reports chat_ready=True

Full suite: 3901 → 3917 passing (+16 from this commit, on top of the +22
added by the PR's earlier commit). 0 failures.
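
The keyless rule these tests pin down can be sketched roughly as follows.
This is an illustration only: `KEY_OPTIONAL_PROVIDERS` and
`env_updates_for` are hypothetical names, not the identifiers used in
`api/onboarding.py`, and the real code reads the flag from the provider
schema rather than a hard-coded set.

```python
# Assumption: mirrors the key_optional flag the tests describe; the set and
# function names here are illustrative, not the repository's.
KEY_OPTIONAL_PROVIDERS = {"lmstudio", "ollama", "custom"}


def env_updates_for(provider, api_key, env_var):
    """Return the .env entries to write, or raise if a key is required."""
    key = (api_key or "").strip()
    if key:
        # An explicit key is always persisted, even for keyless providers.
        return {env_var: key}
    if provider in KEY_OPTIONAL_PROVIDERS:
        # Keyless server: write nothing, so no placeholder string lands in .env.
        return {}
    raise ValueError(f"{provider} requires an API key")
```

Returning an empty mapping for the keyless case is what keeps `.env` from
being created at all, matching the "no .env write" assertions above.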

Closes #1499 (all three sub-bugs from #1420 now addressed)
2026-05-03 03:07:07 +00:00
Hermes Bot 8616033605 fix(onboarding,providers): probe LM Studio /models + align env var with agent CLI (#1499 #1500)
Addresses both #1499 (onboarding wizard never probes the configured base URL)
and #1500 (cross-tool env-var name divergence between webui and agent CLI).
Surfaced together because both are LM Studio onboarding bugs that compound
each other; fixing only one still leaves the UX broken.

#1499 — Onboarding wizard probes <base_url>/models before persisting

Pre-fix, `apply_onboarding_setup` accepted whatever `base_url` the user typed
without ever fetching `<base_url>/models`. @chwps's log timeline in #1420
showed the wizard finishing in 239ms with zero outbound HTTP — onboarding
silently persisted unreachable URLs and left users with empty model
dropdowns they had to populate by hand-editing config.yaml.

Backend:
* New `probe_provider_endpoint(provider, base_url, api_key, timeout=5.0)`
  in `api/onboarding.py`. Stdlib-only (urllib + socket — no httpx dep).
  Returns `{ok, models}` on success; `{ok: False, error: <code>, detail}`
  on failure with stable error codes the frontend can switch on:
  invalid_url, dns, connect_refused, timeout, http_4xx, http_5xx, parse,
  unreachable. A 256 KB response cap and a 5s timeout keep a hostile or
  mis-pointed endpoint from blocking the wizard.
* New `POST /api/onboarding/probe` route — thin JSON wrapper around the
  function above. Same local-network gate as `/api/onboarding/setup`
  because the body carries an `api_key` the user typed.
* The probe response is NEVER persisted. Only the user's typed selection
  ends up in config.yaml; the probed model list just populates the
  wizard's dropdown.
* SSRF: deliberately does NOT block private-IP ranges. The wizard is
  gated behind WebUI auth and the legitimate target IS a local LM Studio
  / Ollama / vLLM server. A "block private IPs" SSRF defense would make
  the feature useless for its primary use case.
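
A minimal stdlib-only sketch of a probe along these lines, for
illustration. It is not the code in `api/onboarding.py`: the helper names
(`classify_probe_error`, `parse_models`, `probe`) are hypothetical, the
`Bearer` authorization scheme is an assumption, and treating an oversized
body as a `parse` error is a choice made for this sketch.

```python
import http.client
import json
import socket
import urllib.error
import urllib.request
from urllib.parse import urlparse

MAX_BODY = 256 * 1024  # response cap mentioned in the commit message


def classify_probe_error(exc):
    """Map a network exception to one of the stable error codes."""
    if isinstance(exc, urllib.error.HTTPError):  # check before URLError.reason
        if 400 <= exc.code < 500:
            return "http_4xx"
        return "http_5xx" if exc.code >= 500 else "unreachable"
    reason = getattr(exc, "reason", exc)  # URLError wraps the socket error
    if isinstance(reason, socket.gaierror):
        return "dns"
    if isinstance(reason, ConnectionRefusedError):
        return "connect_refused"
    if isinstance(reason, (socket.timeout, TimeoutError)):
        return "timeout"
    return "unreachable"


def parse_models(body):
    """Accept the OpenAI shape {"data": [{"id": ...}]} or a bare list."""
    payload = json.loads(body)
    items = payload.get("data") if isinstance(payload, dict) else payload
    if not isinstance(items, list):
        raise ValueError("unexpected /models shape")
    return [m["id"] if isinstance(m, dict) else str(m) for m in items]


def probe(base_url, api_key="", timeout=5.0):
    """Fetch <base_url>/models and return {ok, models} or {ok, error, detail}."""
    parsed = urlparse(base_url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        return {"ok": False, "error": "invalid_url", "detail": base_url}
    req = urllib.request.Request(base_url.rstrip("/") + "/models")
    if api_key:
        req.add_header("Authorization", f"Bearer {api_key}")  # assumed scheme
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            body = resp.read(MAX_BODY + 1)  # read one byte past the cap
        if len(body) > MAX_BODY:
            return {"ok": False, "error": "parse", "detail": "response too large"}
        return {"ok": True, "models": parse_models(body)}
    except (ValueError, KeyError, TypeError) as exc:  # bad JSON or wrong shape
        return {"ok": False, "error": "parse", "detail": str(exc)}
    except (OSError, http.client.HTTPException) as exc:
        return {"ok": False, "error": classify_probe_error(exc), "detail": str(exc)}
```

Nothing in this sketch touches disk, in keeping with the rule that the
probe response is never persisted.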

Frontend:
* `static/onboarding.js`:
  - New `ONBOARDING.probe` state ({status, error, detail, models, probedKey}).
  - `_runOnboardingProbe()` — POSTs to /api/onboarding/probe, idempotent
    & cached on (provider, baseUrl, apiKey).
  - Debounced (400ms) on `oninput` of the base URL field.
  - Explicit "Test connection" button.
  - `nextOnboardingStep` blocks Continue at the setup step for any
    provider with `requires_base_url=True` until the probe succeeds.
    The same localized probe error renders inline.
* `static/i18n.js`: 13 new keys × 9 locales (canonical English in `en`,
  English fallback with `// TODO: translate` markers in the other 8 —
  same convention as v0.50.271 #1488 voice-buttons).
* `static/style.css`: probe banner + Test button styling (red-tinted
  error variant, green-tinted success variant, neutral probing state).

Verified via manual repro on port 8789:
* connect_refused → red banner, helpful "from Docker, try the host IP"
  hint, blocks Continue.
* DNS failure → red banner, "could not resolve host '...'", blocks Continue.
* Success against a mock /v1/models server → green banner, model dropdown
  populates from the probed list, Continue advances normally.

#1500 — webui env var aligned with agent CLI (LM_API_KEY)

The webui has long used `LMSTUDIO_API_KEY` for LM Studio's API key in
both onboarding and Settings detection, while the agent CLI runtime
(hermes_cli/auth.py:177-183) reads `LM_API_KEY`. A user who configured
auth on their LM Studio instance therefore saw Settings → Providers
report has_key=True (the webui found its own LMSTUDIO_API_KEY), but the
agent runtime ignored that key, fell back to LMSTUDIO_NOAUTH_PLACEHOLDER,
and got a 401 from the auth-enabled LM Studio server. The bug was masked
in practice for the no-auth majority.

Picked Option B from the issue (defer to the agent — single source of
truth) but mitigated the migration cliff by reading the legacy name as
a fallback:

* `api/onboarding.py:_SUPPORTED_PROVIDER_SETUPS["lmstudio"]`:
  - `env_var: "LM_API_KEY"` (canonical, what onboarding writes going forward).
  - `env_var_aliases: ["LMSTUDIO_API_KEY"]` (read-only fallback for
    pre-#1500 users so detection keeps working without forcing an
    .env rewrite).
* `api/onboarding.py:_provider_api_key_present` reads aliases too.
* `api/providers.py:_PROVIDER_ENV_VAR["lmstudio"] = "LM_API_KEY"`.
* `api/providers.py:_PROVIDER_ENV_VAR_ALIASES["lmstudio"] = ("LMSTUDIO_API_KEY",)`
  — new dict, used by `_provider_has_key` and `get_providers`'s
  key_source resolution. Drops in cleanly when other providers later
  rename their env vars too.
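
The canonical-then-alias lookup can be sketched as below. The two tables
only mirror the names mentioned above, and `resolve_api_key` is a
hypothetical helper written for this illustration; the real
`_provider_has_key` and `get_providers` logic may differ.

```python
import os

# Hypothetical mirrors of the tables named above, not the real module contents.
PROVIDER_ENV_VAR = {"lmstudio": "LM_API_KEY"}
PROVIDER_ENV_VAR_ALIASES = {"lmstudio": ("LMSTUDIO_API_KEY",)}


def resolve_api_key(provider, env=None):
    """Return (value, var_name) for the first non-empty variable, canonical first."""
    env = os.environ if env is None else env
    canonical = PROVIDER_ENV_VAR.get(provider)
    if canonical is None:
        return None, None
    # Canonical name first, then read-only legacy aliases in declared order.
    for name in (canonical, *PROVIDER_ENV_VAR_ALIASES.get(provider, ())):
        value = env.get(name, "").strip()
        if value:
            return value, name
    return None, None
```

Checking the canonical name first means a pre-#1500 `.env` keeps working,
while a user who sets both names gets the value the agent CLI also reads.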

Verified:

```
before fix:  webui writes LMSTUDIO_API_KEY → agent ignores it → 401 on chat
 after fix:  webui writes LM_API_KEY → agent picks it up → chat works
             pre-#1500 .env with LMSTUDIO_API_KEY → still has_key=True in Settings
                                                  → key_source='env_file'
```

Tests

* `tests/test_issue1499_onboarding_probe.py` — 17 tests:
  3 invalid_url variants, dns, connect_refused, success (OpenAI shape),
  success (bare-list shape), http_4xx, http_5xx, parse non-JSON, parse
  wrong-shape, api_key authorization header passthrough, "probe must
  not write to config.yaml or .env", PROBE_ERROR_CODES contract pin,
  3 end-to-end route-level smoke tests against the live server fixture.
* `tests/test_issue1500_lmstudio_env_var_alignment.py` — 5 tests:
  onboarding declares LM_API_KEY canonical with LMSTUDIO_API_KEY alias,
  onboarding writes ONLY the canonical name, legacy env var still
  detected post-migration, canonical takes precedence when both are
  set, _provider_api_key_present reads aliases.
* `tests/test_issue1420_lmstudio_provider_env_var.py` — updated:
  the original 5-test #1420 suite now pins LM_API_KEY as canonical
  and LMSTUDIO_API_KEY as alias.

Full suite: 3879 → 3901 passing (+22), 0 failures.

Out of scope (explicitly NOT addressed here)

The third LM Studio onboarding sub-bug from #1420's thread — that
`apply_onboarding_setup` requires a non-empty api_key for lmstudio
even though most LM Studio installs run keyless — remains. The agent's
`LMSTUDIO_NOAUTH_PLACEHOLDER` substitution kicks in at runtime, but
the onboarding wizard rejects the empty-key case at submit. Fixing
this requires a UX decision (auto-write a sentinel? loosen the
required-key check for self-hosted providers?) and is left as a
separate follow-up.

Closes #1499
Closes #1500

Co-authored-by: chwps <106549456+chwps@users.noreply.github.com>
Co-authored-by: AdoneyGalvan <25235323+AdoneyGalvan@users.noreply.github.com>
2026-05-03 02:46:24 +00:00