PatterAI · nicolotognoni · Jun 5, 2026 · Jun 5, 2026 · Jun 5, 2026
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -1,5 +1,33 @@
 ## Unreleased
 
+### Added
+
+- **`llm_error_message` / `llmErrorMessage` — opt-in spoken fallback when the
+  pipeline-mode LLM stream raises before any text is spoken.** Agent-runtime
+  providers (HermesLLM, OpenClawLLM, OpenAICompatibleLLM) may take 30-90 s to
+  complete a turn (tools, memory, skills); on gateway-down or timeout the caller
+  previously heard silence then a silent turn-end. Set
+  `llm_error_message="Sorry, I'm having trouble right now."` (Python) /
+  `llmErrorMessage: "Sorry, I'm having trouble right now."` (TypeScript) on the
+  agent to speak that line through the normal TTS path (subject to barge-in).
+  Four-part trigger condition: (1) a real LLM error (not a clean barge-in
+  abort), (2) zero assistant *audio* emitted this turn — gated on whether a PCM
+  chunk actually reached the carrier (`first_tts_chunk` / `ttsFirstByteSent`),
+  not on whether tokens were received, so a provider that streams partial tokens
+  ("Let me check…") and then times out before a sentence boundary (the chunker
+  buffered them, TTS never ran, the caller heard silence) still triggers the
+  fallback, while a turn that already spoke a full sentence does not double-speak,
+  (3) agent still owns the floor, (4) the field is set to a non-empty string.
+  `undefined` / `None` (default) preserves today's silence-on-error behaviour —
+  fully backward compatible. Pipeline mode only; Realtime and ConvAI surface
+  provider errors on their own audio path.
+  `libraries/python/getpatter/models.py` (`Agent.llm_error_message`),
+  `libraries/python/getpatter/client.py` (`Patter.agent(llm_error_message=...)`),
+  `libraries/python/getpatter/stream_handler.py`
+  (`PipelineStreamHandler._process_streaming_response` error branch),
+  `libraries/typescript/src/types.ts` (`AgentOptions.llmErrorMessage`),
+  `libraries/typescript/src/stream-handler.ts` (`runPipelineLlm` catch branch).
+
 ## 0.6.4 (2026-06-05)
 
 ### Security
@@ -28,6 +56,87 @@
 
 ### Added
 
+- **Patter as a voice shell in front of an agent runtime — three new
+  pipeline-mode LLM providers (`OpenAICompatibleLLM`, `HermesLLM`,
+  `OpenClawLLM`).** Drive a phone call where the LLM is an external,
+  OpenAI-compatible agent runtime reached at `POST {base_url}/chat/completions`:
+  Patter owns the carrier leg, STT, turn-taking / VAD / barge-in, and TTS, and
+  each conversation turn is answered by the runtime (which can run its own
+  tools, memory, and skills before replying). Used like any other pipeline LLM
+  — `phone.agent(llm=HermesLLM())` (Python) / `phone.agent({ llm: new
+  HermesLLM() })` (TypeScript).
+  - **`OpenAICompatibleLLM`** (Python `getpatter/llm/openai_compatible.py`,
+    TypeScript `src/llm/openai-compatible.ts`) — the generic provider for *any*
+    OpenAI-compatible chat endpoint: Hermes, OpenClaw, Ollama, vLLM, LM Studio,
+    or a custom gateway. `base_url` and `model` are required; `timeout` defaults
+    to **60 s** (configurable) so a runtime that runs tools mid-turn isn't cut
+    off — the base OpenAI provider's shorter ceiling is unchanged for raw
+    inference. Keyless local gateways (Ollama / vLLM / LM Studio) are supported:
+    pass no key and the request goes out without an `Authorization` header.
+    `extra_headers` / `extraHeaders` merge after the `getpatter/<version>`
+    User-Agent so it can't be silently clobbered.
+  - **`HermesLLM`** (Python `getpatter/llm/hermes.py`, TypeScript
+    `src/llm/hermes.ts`) — thin preset over `OpenAICompatibleLLM` for the Hermes
+    agent runtime: `base_url` defaults to `http://127.0.0.1:8642/v1`, `model` to
+    `hermes-agent` (env `API_SERVER_MODEL_NAME` fallback), api key from env
+    `API_SERVER_KEY`, `timeout` **120 s**.
+  - **`OpenClawLLM`** (Python `getpatter/llm/openclaw.py`, TypeScript
+    `src/llm/openclaw.ts`) — thin preset over `OpenAICompatibleLLM` for the
+    OpenClaw agent runtime: takes an `agent` id (e.g. `receptionist`), validated
+    and mapped to `model="openclaw/<agent>"` using the **same** charset rule and
+    namespaced pass-through as the shipped `consult` OpenClaw preset; `base_url`
+    defaults to `http://127.0.0.1:18789/v1`, api key from env `OPENCLAW_API_KEY`,
+    `timeout` **120 s**.
+- **Per-call session continuity for the agent-runtime providers (opt-in), now
+  with three decoupled signals.** Each runtime keys session continuity
+  differently, so `OpenAICompatibleLLM` exposes three independent, optional
+  signals — emit any subset:
+  - **`session_user_prefix` / `sessionUserPrefix`** → OpenAI `user` field as a
+    stable `{prefix}{call_id}` (value `patter-call-<call_id>` for the presets).
+    OpenClaw's gateway derives its session from `user`; Hermes uses it only for
+    upstream-log correlation.
+  - **`session_id_header` + `session_id_prefix` / `sessionIdHeader` +
+    `sessionIdPrefix`** → a per-call request header carrying
+    `{session_id_prefix}{call_id}`, for runtimes that key session/transcript
+    continuity off a header rather than `user`.
+  - **`session_key_header` + `session_key` / `sessionKeyHeader` +
+    `sessionKey`** → a *static* request header for long-term memory scoping
+    (value is the configured key, not the call id). Omitted unless the value is
+    set.
+
+  Each signal is gated independently and merged into the per-request headers;
+  when none are configured the request is byte-identical to the base provider
+  (no `user`, no extra headers). Preset wiring:
+  - **`HermesLLM`** — Hermes is stateless and keys continuity off **headers**:
+    `session_id_header="X-Hermes-Session-Id"`, `session_id_prefix="patter-call-"`
+    (so each call sends `X-Hermes-Session-Id: patter-call-<call_id>` by default).
+    A new optional `session_key` (Python) / `sessionKey` (TypeScript) constructor
+    arg, default `None` / `undefined`, opts into long-term memory scoping by
+    emitting `X-Hermes-Session-Key: <value>`. `session_user_prefix` is kept at
+    `patter-call-` for upstream-log correlation but does **not** drive the Hermes
+    session.
+  - **`OpenClawLLM`** — wire-identical to before: `session_user_prefix=
+    "patter-call-"` (OpenClaw's gateway keys off `user`) plus
+    `session_id_header="x-openclaw-session-key"` with `session_id_prefix=""` so
+    the header still carries the raw `call_id`. No memory-scope header.
+
+  The presets enable continuity by default; the generic provider leaves it
+  **off** unless opted in. To make `call_id` reach the provider, `LLMLoop.run`
+  threads it through to `provider.stream()` — Python via an optional `call_id`
+  keyword on the `LLMProvider.stream` protocol (the loop now introspects each
+  provider's `stream` signature and only passes `call_id` to providers that
+  accept it or take `**kwargs`, so a minimal custom provider that declares
+  neither is **not** broken), TypeScript via an optional `callId` field on
+  `LLMStreamOptions` (an extra options-object property a provider ignores is a
+  no-op). Both are additive and optional, so every existing provider is
+  unaffected. `libraries/python/getpatter/llm/openai_compatible.py`,
+  `libraries/python/getpatter/llm/hermes.py`,
+  `libraries/python/getpatter/llm/openclaw.py`,
+  `libraries/python/getpatter/services/llm_loop.py`,
+  `libraries/typescript/src/llm/openai-compatible.ts`,
+  `libraries/typescript/src/llm/hermes.ts`,
+  `libraries/typescript/src/llm/openclaw.ts`,
+  `libraries/typescript/src/llm-loop.ts`.
 - **`allow_insecure_dashboard` / `allowInsecureDashboard` escape hatch (opt-in,
   default off).** New optional config on `Patter(...)` (Python) and `serve(...)`
   `ServeOptions` (TypeScript), defaulting to `False` / `false`. When the