feat(llm): agent runtimes as primary LLM — OpenAICompatibleLLM/HermesLLM/OpenClawLLM + spoken error fallback by nicolotognoni · Pull Request #157 · PatterAI/Patter

nicolotognoni · 2026-06-05T11:12:43Z

Summary

Patter becomes the voice shell in front of an OpenAI-compatible agent runtime: it owns the carrier + STT + turn-taking + TTS, while each conversation turn is answered by the agent at POST {base_url}/chat/completions. This is the "custom-LLM voice layer" model (Patter plays the role ElevenLabs Agents play in the ElevenLabs↔Hermes setup), but self-hosted — if Patter and the agent run on the same box, the agent gateway never needs to be tunnelled/exposed; only Patter faces the carrier.
Three new pipeline-mode LLM providers with full Python/TypeScript parity: a generic OpenAICompatibleLLM plus thin HermesLLM and OpenClawLLM presets.
Opt-in spoken fallback so a gateway-down / long-tool-call timeout speaks a configurable line instead of dead air.

Implementation

OpenAICompatibleLLM (libraries/python/getpatter/llm/openai_compatible.py, libraries/typescript/src/llm/openai-compatible.ts) — drives any OpenAI-compatible endpoint (Hermes, OpenClaw, Ollama, vLLM, LM Studio). Modelled 1:1 on the existing GroqLLM/CerebrasLLM presets. Adds a configurable long timeout (default 60s; the base provider sets none) and opt-in session continuity. Keyless local gateways supported via the conventional EMPTY sentinel; warmup() omits the Authorization header when no key is set.
HermesLLM — preset: 127.0.0.1:8642/v1, model hermes-agent (API_SERVER_MODEL_NAME fallback), key from API_SERVER_KEY, 120s timeout.
OpenClawLLM — preset: 127.0.0.1:18789/v1, model="openclaw/<agent>" pass-through, key from OPENCLAW_API_KEY, x-openclaw-session-key header. Reuses the agent-target/charset rules of the shipped consult target so the two paths can't drift.
Session continuity — opt-in user=patter-call-<call_id> (+ optional session header) so the runtime keys one session per call. call_id is threaded through the LLM loop additively (existing providers gain an optional param; no behaviour change).
agent.llm_error_message / llmErrorMessage — opt-in spoken fallback wired into the pipeline stream-handler's existing LLM-error branch, reusing the same TTS-speak primitive as first_message. Trigger is gated on emitted audio (first_tts_chunk/ttsFirstByteSent), not on tokens received — so a provider that streams partial tokens ("Let me check…") and then times out before a sentence boundary still triggers the line, while an already-spoken sentence never double-speaks. Default None/undefined preserves today's silence-on-error behaviour.
Roots re-export OpenAICompatibleLLM, OpenAICompatibleLLMProvider, HermesLLM, OpenClawLLM (Python and TypeScript symmetric).

Breaking change?

No. Every new field is optional with a default that preserves current behaviour (None/undefined). No existing constructor path requires a new key. The call_id threading on existing providers is additive.

Test plan

Python: pytest tests/ — 2168 passed (incl. new test_llm_openai_compatible.py, test_llm_hermes_openclaw_presets.py, test_llm_loop_call_id_threading.py, tests/unit/test_llm_error_fallback.py)
TypeScript: npm test (1729 passed) + npm run lint (tsc clean) + npm run build
New tests are authentic — they exercise real provider construction / header assembly / model routing and the real stream-handler error path, mocking only the HTTP and TTS byte boundaries
/parity-check (providers + new field audited in-PR by the parity reviewer; defaults byte-identical across SDKs)

Docs updates

docs/integrations/hermes.mdx — "Call your Hermes Agent over the phone using Patter" article (architecture, gateway setup, Python/TS examples, the localhost-not-ngrok security note, dead-air caveat).
docs/integrations/openclaw.mdx — OpenClawLLM-as-primary section + generic OpenAI-compatible runtime note (Ollama/vLLM/LM Studio).

Follow-ups (not in this PR)

Wire hermes/openclaw/openai_compatible as --llm values into the getpatter init wizard + a doctor command — blueprint prepared; lands on the init-wizard branch where the wizard lives.
Feature-inventory row (tracked in the private assets repo).
Pre-existing low/medium parity-harness and cache_read_tokens naming items flagged by review — separate maintenance PR.

…LLM/OpenClawLLM + spoken error fallback Let Patter act as the voice shell in front of an OpenAI-compatible agent runtime: carrier + STT + turn-taking + TTS stay in Patter while each turn is answered by POST {base_url}/chat/completions. Adds three pipeline-mode LLM providers (full Python/TypeScript parity) plus an opt-in dead-air fallback. - OpenAICompatibleLLM: generic provider for any OpenAI-compatible endpoint (Hermes, OpenClaw, Ollama, vLLM, LM Studio). Thin subclass of OpenAILLMProvider with a configurable long timeout (default 60s) and opt-in session continuity. Keyless local gateways supported via the conventional EMPTY sentinel; warmup omits the Authorization header when no key is set. - HermesLLM: preset for the Hermes gateway (127.0.0.1:8642, model hermes-agent, API_SERVER_KEY, 120s timeout). - OpenClawLLM: preset for the OpenClaw gateway (127.0.0.1:18789, openclaw/<agent>, OPENCLAW_API_KEY, x-openclaw-session-key), aligned with the shipped consult target. - Session continuity: opt-in user=patter-call-<call_id> (+ optional session header) so the runtime keys one session per call. call_id threaded through the LLM loop additively (backward compatible). - agent.llm_error_message / llmErrorMessage: opt-in spoken fallback when a turn's LLM stream raises (gateway down / timeout) before any audio reached the caller. Gated on emitted audio (not tokens) so a partial-token timeout still triggers it and an already-spoken sentence never double-speaks. Default None/undefined preserves today's silence-on-error behaviour. Docs: hermes.mdx article ("Call your Hermes Agent over the phone using Patter") + openclaw.mdx section. Tests authentic — mock only the HTTP / TTS boundary.

… custom-provider call_id back-compat Hermes is stateless and keys session continuity off request HEADERS, not the OpenAI user field. HermesLLM now sends X-Hermes-Session-Id: patter-call-<call_id> per call (primary mechanism) plus an optional static X-Hermes-Session-Key for long-term memory scoping (new opt-in session_key / sessionKey param, default off). - OpenAICompatibleLLM: decouple session-header emission from the user-field gating; split into session_id_header/session_id_prefix (per-call) and session_key_header/session_key (static memory scope). Each emitted independently; pre-existing extra_headers preserved. An empty-string session key is treated as unset (no empty header on the wire). - OpenClaw: byte-identical on the wire (user=patter-call-<id> + x-openclaw-session-key=<id>) — its gateway derives the session from the user field, so its behaviour is intentionally unchanged. - Backward compat: LLMLoop passes call_id only when the provider's stream() accepts it (inspect.signature guard, cached per provider type), so a custom provider with stream(messages, tools, *, cancel_event) no longer raises TypeError. TS already tolerates the extra options field; added a regression test. - Docs: hermes.mdx session-continuity prose corrected to the header mechanism. Python 2183 / TypeScript 1738 tests pass; tsc + build clean.

nicolotognoni · 2026-06-05T13:14:50Z

Bumped to 0.6.5 in 36433c0 (all three version files + lockfile; CHANGELOG ## Unreleased rolled into ## 0.6.5 (2026-06-05)).

This PR is now the 0.6.5 release. After CI is green and you merge, tag v0.6.5 on main to trigger the PyPI + npm publish (per the release-via-PR process — the tag is pushed only after merge, never before).

nicolotognoni added 2 commits June 5, 2026 13:11

nicolotognoni merged commit 248b1f9 into main Jun 5, 2026
14 checks passed

nicolotognoni mentioned this pull request Jun 5, 2026

chore(release): 0.6.5 #158

Merged

1 task

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(llm): agent runtimes as primary LLM — OpenAICompatibleLLM/HermesLLM/OpenClawLLM + spoken error fallback#157

feat(llm): agent runtimes as primary LLM — OpenAICompatibleLLM/HermesLLM/OpenClawLLM + spoken error fallback#157
nicolotognoni merged 2 commits into
mainfrom
feat/agent-llm-providers

nicolotognoni commented Jun 5, 2026

Uh oh!

Uh oh!

nicolotognoni commented Jun 5, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

nicolotognoni commented Jun 5, 2026

Summary

Implementation

Breaking change?

Test plan

Docs updates

Follow-ups (not in this PR)

Uh oh!

Uh oh!

nicolotognoni commented Jun 5, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant