Commit Graph

6 Commits

Author SHA1 Message Date
ai-ag2026 4b486f2860 feat: record turn journal lifecycle events 2026-05-11 09:13:25 +02:00
nesquena-hermes 0efa75827a fix(streaming): pass config overrides into context-length fallback (#1896)
The two get_model_context_length() fallback callsites in api/streaming.py
(session save + SSE usage payload) were calling the resolver with only
model + base_url. When the agent's compressor reported 0 (fresh/cached/
transitioning agent), resolution fell through to the 256K DEFAULT_FALLBACK
even when users had set model.context_length: 1048576 in config.yaml.

For LCM users on 1M-context models, the wrong window cascaded into a
session-killing failure: auto-compression triggered at ~25% of the wrong
value, floods of compress requests, 429s, credential pool exhaustion,
fallback 429s, then 'API call failed after 3 retries'.

Reported by @AvidFuturist on Discord with deepseek-v4-flash. Reproduced 5x.

Both callsites now pass config_context_length, provider, and
custom_providers. The resolver consults these BEFORE probing, so the
config override wins. Both calls are wrapped in try/except TypeError
blocks that retry with the legacy 2-arg form on older hermes-agent builds
whose get_model_context_length signature predates these kwargs.
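
The retry shape described above can be sketched roughly as follows. This is a minimal illustration, not the code in api/streaming.py: `resolve_context_length`, `new_style_resolver`, `legacy_resolver`, and the `DEFAULT_FALLBACK` value of 256_000 are hypothetical stand-ins, and only the kwarg names come from this commit message.

```python
DEFAULT_FALLBACK = 256_000  # assumed stand-in for the "256K" default


def new_style_resolver(model, base_url, config_context_length=None,
                       provider=None, custom_providers=None):
    # Hypothetical newer signature: the config override is consulted first.
    if config_context_length:
        return config_context_length
    return DEFAULT_FALLBACK


def legacy_resolver(model, base_url):
    # Hypothetical older signature that pre-dates the extra kwargs.
    return DEFAULT_FALLBACK


def resolve_context_length(resolver, model, base_url, **overrides):
    """Try the keyword-rich call first, then retry with the 2-arg form."""
    try:
        return resolver(model, base_url, **overrides)
    except TypeError:
        # Older builds reject the unknown kwargs; fall back to the legacy call.
        return resolver(model, base_url)


new_result = resolve_context_length(
    new_style_resolver, "some-model", "https://example.invalid",
    config_context_length=1_048_576, provider=None, custom_providers=None,
)
old_result = resolve_context_length(
    legacy_resolver, "some-model", "https://example.invalid",
    config_context_length=1_048_576, provider=None, custom_providers=None,
)
```

One trade-off of this pattern: a TypeError raised inside a newer resolver for an unrelated reason would also trigger the legacy retry, so the fallback favors compatibility over strict error reporting.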

Tests: 7 source-string regressions guarding both call shapes, the safe
config parse, the legacy fallback, and the per-profile config source.
Also bumped the line-distance assertion in test_pr1341 (the test
explicitly invites bumping when a new pre-save mutation block is added).

Closes #1896

Co-authored-by: Hermes Agent <agent@hermes.local>
2026-05-08 16:08:42 +00:00
Michael Lam c94ec31dec feat: show LLM Gateway routing metadata 2026-05-05 02:26:55 +00:00
Michael Lam 3ad8846a27 fix: show TPS in assistant message headers 2026-05-04 21:26:43 +00:00
nesquena-hermes 880350312a fix(streaming): fallback to model_metadata for context_length when compressor missing (#1318 follow-up) (#1348)
* fix(streaming): fallback to model_metadata for context_length when compressor missing (#1318 follow-up)

PR #1318 (shipped in v0.50.246 via PR #1341 + commit a5c10d5) persisted
context_length on the session so the context-ring indicator survives
page reloads. But the writer only fired when agent.context_compressor
was present and reported a non-zero value. Fresh agents, interrupted
streams, or compressors without the attribute would still leave
s.context_length=0 — and the indicator would still show 0% on reload.

This follow-up adds a fallback that calls
agent.model_metadata.get_model_context_length(model, base_url) when the
compressor didn't populate the value. The function returns a sensible
static context window for any known model (with a 256K default for
unknown models). Wrapped in a broad try/except because older
hermes-agent builds may not expose the helper.
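
A rough sketch of the fallback gate described above, under stated assumptions: the resolver is passed in as a plain callable rather than imported from agent.model_metadata, the compressor attribute name `context_length` is a guess, and `context_length_for_session` is a hypothetical helper name.

```python
from types import SimpleNamespace


def context_length_for_session(agent, resolver):
    """Prefer the compressor's value; otherwise ask the resolver; never raise."""
    compressor = getattr(agent, "context_compressor", None)
    # Attribute name on the compressor is an assumption for this sketch.
    value = getattr(compressor, "context_length", 0) or 0
    if value:
        return value
    try:
        # Falsy gate: only reached when the compressor gave nothing usable.
        return resolver(agent.model, agent.base_url) or 0
    except Exception:
        # Broad on purpose: a missing or failing helper must not break the save.
        return 0


def failing_resolver(model, base_url):
    raise RuntimeError("helper unavailable")


with_compressor = SimpleNamespace(
    model="m", base_url="u",
    context_compressor=SimpleNamespace(context_length=200_000),
)
fresh_agent = SimpleNamespace(model="m", base_url="u")

from_compressor = context_length_for_session(with_compressor, lambda m, u: 131_072)
from_resolver = context_length_for_session(fresh_agent, lambda m, u: 131_072)
from_failure = context_length_for_session(fresh_agent, failing_resolver)
from_missing = context_length_for_session(fresh_agent, None)
```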

Sourced from PR #1344 (@jasonjcwu) — extracted into this focused
follow-up after #1344 was closed as superseded by #1341.

Adds 6 structural tests covering: import + call presence, falsy-gate,
agent.model/base_url passing, exception swallowing, save() ordering,
result assignment.
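
For readers unfamiliar with structural (source-string) tests, the idea might look roughly like this. The `SOURCE` snippet and both helper functions are hypothetical illustrations, not the contents of the real test file or of api/streaming.py.

```python
# Hypothetical excerpt standing in for the real source under test.
SOURCE = '''
def finish_turn(s, agent):
    if not s.context_length:
        s.context_length = resolve(agent.model, agent.base_url)
    s.save()
'''


def assigned_before_save(source, assignment="s.context_length =",
                         save_call="s.save()"):
    """True when the assignment text appears before the save call."""
    a = source.find(assignment)
    b = source.find(save_call)
    return a != -1 and b != -1 and a < b


def lines_between(source, assignment="s.context_length =",
                  save_call="s.save()"):
    """Count newlines separating the assignment from the save call."""
    a = source.find(assignment)
    b = source.find(save_call)
    return source.count("\n", a, b)


ordering_ok = assigned_before_save(SOURCE)
distance = lines_between(SOURCE)
reordered_ok = assigned_before_save("s.save()\n    s.context_length = 1\n")
```

Tests of this kind check the shape of the source rather than its behavior, which is why a line-distance or block-size assertion needs relaxing whenever a new block is inserted between the two markers.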

Closes the data-flow gap in #1318 for the compressor-missing case.

* test: relax pr1341 block-size assertion to accommodate the new fallback

---------

Co-authored-by: nesquena-hermes <nesquena-hermes@users.noreply.github.com>
2026-04-30 10:27:56 -07:00
nesquena-hermes a5c10d594d fix(streaming): persist context_length on session — completes #1318 fix
Pre-release Opus + nesquena review on v0.50.246 caught that PR #1341
added the data-structure scaffolding (Session.__init__ accepts the 3
fields, save() persists them, compact() exposes them, GET /api/session
returns them) but did NOT add the writer that actually populates them.

Without a writer, the user-visible bug (context-ring shows 0% after
page reload) was NOT fixed by #1341 alone — the fields stayed None
forever because nothing wrote to s.context_length anywhere.

Adds the writer at api/streaming.py:2188 (post-merge per-turn save block,
before s.save()) so the values from agent.context_compressor land on
disk and survive page reloads.
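
Why the writer has to run before s.save() can be shown with a toy round trip. `MiniSession` and `persist_turn` are hypothetical and far simpler than the real Session class; JSON on disk is an assumption made only for this sketch.

```python
import json
import os
import tempfile


class MiniSession:
    """Toy stand-in: one persisted field, stored as JSON."""

    def __init__(self, path, context_length=None):
        self.path = path
        self.context_length = context_length

    def save(self):
        with open(self.path, "w", encoding="utf-8") as fh:
            json.dump({"context_length": self.context_length}, fh)

    @classmethod
    def load(cls, path):
        with open(path, encoding="utf-8") as fh:
            data = json.load(fh)
        return cls(path, context_length=data.get("context_length"))


def persist_turn(session, compressor_context_length):
    # The writer: copy the live value onto the session, then save.
    if compressor_context_length:
        session.context_length = compressor_context_length
    session.save()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "session.json")

    scaffolding_only = MiniSession(path)
    scaffolding_only.save()  # no writer ran, so the field stays None
    before = MiniSession.load(path).context_length

    persist_turn(scaffolding_only, 131_072)
    after = MiniSession.load(path).context_length
```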

Also moves the SSE usage payload comment to clarify that the live SSE
payload and the session-level persistence are now distinct paths
(payload below, persistence above).

Adds tests/test_pr1341_context_window_persistence.py — 6 structural +
round-trip tests covering Session __init__/save/compact, the routes
response, and the streaming.py writer placement.

Closes #1318 (the actual user-visible bug, not just the scaffolding).
2026-04-30 16:42:32 +00:00