Worst case 4×5s=20s per polling request on ThreadingHTTPServer pool is risky
given today's _cron_env_lock near-miss on production 8787. Status probes
should fail fast; client can retry. All four call sites use default timeout.
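
A minimal sketch of the fail-fast idea, not the actual handler code. It assumes the status probe is a plain TCP connect; probe_status, worst_case_seconds, and the 1-second PROBE_TIMEOUT_S are hypothetical names and values, since the commit does not state the new default.

```python
import socket

# Hypothetical value for illustration; the commit does not state the new default.
PROBE_TIMEOUT_S = 1.0


def probe_status(host, port, timeout=PROBE_TIMEOUT_S, connect=socket.create_connection):
    """Return "up" if a TCP connect succeeds within `timeout`, else "down".

    `connect` is injectable so the function can be exercised without a network.
    """
    try:
        conn = connect((host, port), timeout=timeout)
    except OSError:  # covers timeouts and refused connections
        return "down"
    conn.close()
    return "up"


def worst_case_seconds(call_sites, timeout):
    """Upper bound on how long one polling request blocks if probes run serially."""
    return call_sites * timeout
```

With four serial call sites, worst_case_seconds(4, 5) gives the 20 s figure above; a shorter timeout lowers that bound proportionally and leaves the retry decision to the client.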

The _load_repo_dotenv_preserving_env() function iterates over
${preserved[@]} with set -euo pipefail. On bash 3.2 (macOS default),
an empty array triggers 'unbound variable' under set -u, crashing
ctl.sh start. Bash 4+ handles this fine, but macOS ships 3.2.
Wraps the for loop in a length check, [[ ${#preserved[@]} -gt 0 ]], so the
loop is skipped when the array is empty.

Opus advisor pass on stage-341 found three surgical items:
1. static/i18n.js:it — PR #2064 branched before stage-340 landed the 'it'
locale (#2067), missing 9 session_*worktree* keys. Mechanical mirror of
en/ja position. Italian falls back to English silently without this fix.
2. api/streaming.py — PR #2107's new break short-circuit was silent in both
the aux and agent title-generation paths. Added logger.debug calls before
each break so production logs surface the exit shape.
3. api/streaming.py — Expanded _title_should_skip_remaining_attempts docstring
to document the membership criterion explicitly (vs the implicit
reasoning-only-burn case it ships with today). Future additions
(llm_safety_blocked, llm_oauth_quota) have a clear inclusion test.
CHANGELOG updated under the Stage-341 maintainer fixes section to mirror
the stage-340 pattern. All targeted tests pass (57/57 in the affected
modules).

Renames the [Unreleased] section to [v0.51.47] (Release W, shipped today
via stage-340) and folds in the stage-341 batch — PR #2105 RFC, PR #2107
title-retry fix, PR #2064 worktree archive copy, plus the stage-341
maintainer fix (RFC conventions guidance).
Also removes the duplicate v0.51.46 heading line that landed in v0.51.47's
stage-340 merge. The duplicate was a no-op (an empty body line under the
extra heading), but it is tidied up here.

When merging PR #2105 (Hermes Run Adapter RFC) the standing concern was
that landing the RFC unconfirmed would invite the speculative-fragment
implementation pattern we just had to put on hold with PR #2071 — well-
written 651-LOC standalone scripts with no callers.
Add a single bullet to the conventions block so the contract is explicit:
an RFC is a design direction, not an invitation to PR fragments against
it. Implementation slices need maintainer confirmation first.
Applied during stage-341 build, not requested from @Michaelyklam — the
guardrail belongs in the conventions doc itself rather than as a one-off
ask on this PR.

Reasoning models (Qwen3-thinking via LM Studio, DeepSeek-R1, Kimi-K2,
etc.) can burn their entire output budget on hidden reasoning tokens and
emit no visible content. The previous title-generation retry path
classified that as llm_length and doubled the budget — but the second
call produces the same shape, so the retry only doubled the GPU/credit
burn. Repeated across the two prompts in _title_prompts() this came to
~3000 reasoning tokens of GPU work per new chat. On local LM Studio
servers behind a custom: provider (where is_lmstudio=False means
reasoning_effort: none never reaches the model) it manifested as the GPU
never going idle after a prompt.
Fix:
- _extract_title_response: classify reasoning-bearing empty responses
as llm_empty_reasoning regardless of finish_reason. The presence of
reasoning_content is the diagnostic signal, not finish_reason.
- _title_retry_status: drop llm_empty_reasoning from the retry set.
Length-truncated responses WITHOUT reasoning still retry (those are
legitimately recoverable by a larger budget).
- Add _title_should_skip_remaining_attempts() and break out of the
prompt-iteration loop on empty-reasoning. A second prompt against
the same model would produce the same shape.
- Falls through to _fallback_title_from_exchange for a local-summary
title.
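
The control flow described in the bullets above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in api/streaming.py: classify_title_response, generate_title, call_model, the "ok" and "llm_empty" statuses, and the 256-token base budget are all hypothetical; only llm_length and llm_empty_reasoning come from the commit.

```python
# Statuses named in the commit: llm_length, llm_empty_reasoning.
# "ok" and "llm_empty" are placeholders invented for this sketch.
RETRYABLE_STATUSES = {"llm_length"}            # llm_empty_reasoning deliberately absent
SKIP_REMAINING_STATUSES = {"llm_empty_reasoning"}


def classify_title_response(content, reasoning_content, finish_reason):
    """Classify one model response; reasoning presence wins over finish_reason."""
    if content and content.strip():
        return "ok"
    if reasoning_content:
        return "llm_empty_reasoning"
    if finish_reason == "length":
        return "llm_length"
    return "llm_empty"


def generate_title(prompts, call_model, fallback, base_budget=256):
    """Try each prompt, retrying once with a doubled budget only when retryable."""
    for prompt in prompts:
        budget = base_budget
        for _attempt in range(2):
            content, reasoning, finish = call_model(prompt, budget)
            status = classify_title_response(content, reasoning, finish)
            if status == "ok":
                return content.strip()
            if status not in RETRYABLE_STATUSES:
                break  # a larger budget would not help
            budget *= 2
        if status in SKIP_REMAINING_STATUSES:
            break  # another prompt against the same model would look the same
    return fallback()
```

In this sketch a reasoning-only response costs exactly one model call before the fallback title is used, while a plain length truncation still gets its doubled-budget retry.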
Tests updated to invert the previous reasoning-retry assertions:
- test_aux_short_circuits_on_empty_reasoning_without_retrying
- test_aux_still_retries_finish_length_without_reasoning
- test_agent_route_short_circuits_on_empty_reasoning_without_retrying
- test_agent_route_still_retries_finish_length_without_reasoning
Companion agent-side work (LM Studio classifier for custom: providers)
is tracked separately on the hermes-agent side; this WebUI fix is the
belt-and-braces guard so the loop stops regardless of agent classifier
state.
Reported by @darkopetrovic. Closes #2083.
Co-authored-by: darkopetrovic <darkopetrovic@users.noreply.github.com>
(cherry picked from commit efeae4a86e)

Opus SHOULD-FIX from stage-340 review. PR #2067 added the 'it' locale
between en and ja; PR #2100 added 4 toast keys to 8 other locales but
missed 'it'. The keys fall back to English via t() defaults, so there is no
user-visible break, but it is an i18n parity hole.
4 LOC, mechanical add inside the it: block at the canonical position
(immediately after cron_profile_server_default_hint, mirroring en/ja).
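
A parity hole like this one could be caught mechanically. The sketch below is a hypothetical check, not part of this change: it assumes the locale tables from static/i18n.js have already been loaded into plain dictionaries, and missing_locale_keys is an invented name.

```python
def missing_locale_keys(locales, reference="en"):
    """Map each non-reference locale to the sorted keys it lacks.

    `locales` is {locale_name: {key: translation}}; locales with full
    parity are omitted from the result.
    """
    ref_keys = set(locales[reference])
    report = {}
    for name, table in locales.items():
        if name == reference:
            continue
        missing = sorted(ref_keys - set(table))
        if missing:
            report[name] = missing
    return report
```

An empty result means every locale carries every key of the reference locale; it says nothing about key ordering inside the file.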
Co-authored-by: ai-ag2026 <261867348+ai-ag2026@users.noreply.github.com>
Co-authored-by: Samuel Gudi <samuel.gudi.official@gmail.com>