mirror of
https://github.com/nesquena/hermes-webui.git
synced 2026-05-25 03:00:23 +00:00
docs: add Hermes run adapter RFC
@@ -32,5 +32,8 @@ First-time contributor RFCs should be discussed in an issue before opening a PR.

## Current RFCs

- [`hermes-run-adapter-contract.md`](hermes-run-adapter-contract.md): Event/control
  compatibility contract and gap matrix for moving WebUI chat runs to Hermes-owned
  runtime execution.
- [`turn-journal.md`](turn-journal.md): Crash-safe WebUI turn journal for
  recovering interrupted chat submissions.

@@ -0,0 +1,315 @@

# Hermes Run Adapter Compatibility Contract

- **Status:** Proposed
- **Author:** @Michaelyklam
- **Created:** 2026-05-11
- **Tracking issue:** [#1925](https://github.com/nesquena/hermes-webui/issues/1925)

## Problem

Hermes WebUI currently gives a rich workbench experience, but browser-originated
chat turns are still executed inside the WebUI server process. The WebUI path
creates process-local stream state, starts background agent threads, constructs or
reuses `AIAgent`, and owns callback queues for token, tool, reasoning, approval,
and clarify state.

The target boundary from #1925 is:

> WebUI should be thin in execution ownership, not thin in product scope.

That means WebUI remains the full browser workbench for sessions, workspace
files, chat rendering, tools, approvals, status, diagnostics, and controls. The
change is that Hermes Agent must own run lifecycle, event ordering, replay,
approvals, clarify, cancellation, and terminal state.

This document defines the first reviewable contract for a Hermes-owned run
adapter. It is intentionally a spec/gap matrix, not an implementation plan for a
new WebUI runtime surrogate.

## Goals

- Keep the browser-facing WebUI workbench contract stable while execution moves
  out of the WebUI process.
- Define the minimum Hermes Runtime API / IPC v0 surface WebUI needs before it
  can route new runs to Hermes-owned execution.
- Map current WebUI-owned runtime primitives to Hermes-owned APIs, WebUI
  presentation state, or explicit temporary compatibility shims.
- Make restart/reattach the first meaningful success criterion, not merely
  "basic chat streamed once."

## Non-goals

- Do not implement the adapter in this RFC.
- Do not create a new run-manager sidecar or broker requirement.
- Do not re-create `STREAMS`, cached `AIAgent` objects, approval queues, clarify
  queues, or cancellation flags under new names inside WebUI.
- Do not reduce WebUI product scope. The rich workbench UX remains in WebUI.
- Do not require every event to be durably persisted on day one if the first
  upstream runtime slice can still prove Hermes-owned execution and reconnect.

## Ownership boundary

### Hermes Agent owns

- run creation and lifecycle
- run ids and session-to-active-run mapping
- ordered event stream and replay cursor
- terminal run state, final result, and error metadata
- model/provider/profile/toolset routing
- agent execution and tool dispatch
- command semantics and capability metadata
- approval and clarify lifecycle
- cancel, interrupt, queue, continue, steer, and goal control where supported
- durable runtime/session state needed for reconnect

### WebUI owns

- browser authentication and presentation-specific session routing
- chat layout, transcript rendering, tool cards, thinking/progress display
- approval and clarify widgets
- workspace/file-panel UX
- settings/admin/diagnostics presentation
- adapting Hermes runtime events into WebUI-compatible browser events
- temporary compatibility shims explicitly listed in this RFC
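
The event-adaptation responsibility in the list above can be illustrated with a small Server-Sent Events framing helper. This is only a sketch under assumptions: `to_sse_frame` is a hypothetical name, the field names follow the event envelope defined later in this RFC, and the real adapter may frame events differently.

```python
import json


def to_sse_frame(event: dict) -> str:
    """Render one runtime event as a Server-Sent Events frame.

    `to_sse_frame` is a hypothetical helper, not part of WebUI or Hermes.
    """
    # Compact JSON never contains raw newlines, so one `data:` line suffices.
    data = json.dumps(event, separators=(",", ":"), sort_keys=True)
    return (
        f"id: {event['event_id']}\n"
        f"event: {event['type']}\n"
        f"data: {data}\n"
        "\n"  # a blank line terminates the frame
    )


frame = to_sse_frame(
    {
        "event_id": "run_123:42",
        "seq": 42,
        "run_id": "run_123",
        "type": "tool.updated",
        "terminal": False,
        "payload": {},
    }
)
```

Using the runtime's `event_id` as the SSE `id:` value lets the browser send it back as `Last-Event-ID` on reconnect without any WebUI-side bookkeeping.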

## WebUI event/control compatibility contract

The browser-facing contract should remain stable enough that the current WebUI
workbench can render either the legacy in-process runtime or the Hermes-owned run
adapter during migration. These are presentation events over Hermes runtime
truth, not a second source of truth.

All events should include enough metadata for idempotent rendering and
reconnect:

```json
{
  "event_id": "run_123:42",
  "seq": 42,
  "run_id": "run_123",
  "session_id": "20260511_...",
  "type": "tool.updated",
  "created_at": 1778540000.0,
  "terminal": false,
  "payload": {}
}
```

`event_id` may be an SSE `id:` value or an equivalent cursor token. `seq` is a
monotonically increasing per-run cursor. Clients may send `Last-Event-ID` or
`after_seq` on reconnect. The runtime should treat replay as at-least-once
delivery; WebUI must deduplicate by `run_id` plus `seq`, or by `event_id`.
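
Because replay is at-least-once, the WebUI side needs a cheap way to drop events it has already rendered. The following is a minimal sketch, assuming events arrive in per-run `seq` order; `RunCursor` and `accept` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class RunCursor:
    """Tracks the highest seq rendered per run (hypothetical WebUI helper)."""

    last_seq: dict[str, int] = field(default_factory=dict)

    def accept(self, event: dict) -> bool:
        """Return True if the event is new and should be rendered."""
        run_id, seq = event["run_id"], event["seq"]
        if seq <= self.last_seq.get(run_id, -1):
            return False  # duplicate delivered again during replay
        self.last_seq[run_id] = seq
        return True


cursor = RunCursor()
replayed = [
    {"run_id": "run_123", "seq": 41, "type": "token.delta"},
    {"run_id": "run_123", "seq": 42, "type": "tool.updated"},
    {"run_id": "run_123", "seq": 42, "type": "tool.updated"},  # redelivered
    {"run_id": "run_123", "seq": 43, "type": "done"},
]
rendered = [e["seq"] for e in replayed if cursor.accept(e)]
```

Keeping only the last `seq` per run is enough here because the contract promises ordered delivery; out-of-order transports would need a set of seen ids instead.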

### Event families

| WebUI event family | Required payload | Runtime source of truth |
|---|---|---|
| `run.started` / `status` | lifecycle state, controls available, session id, workspace/profile/model/toolset summary | Hermes run state |
| `token.delta` | assistant message id/segment id, delta text, optional content type | Hermes model output stream |
| `reasoning.delta` / `reasoning.done` | reasoning text or structured reasoning block, visibility metadata | Hermes reasoning callback/event stream |
| `progress` | concise status/progress text, optional phase/tool context | Hermes agent progress callbacks |
| `tool.started` | tool call id, tool name, sanitized arguments, start time | Hermes tool dispatch lifecycle |
| `tool.updated` | stdout/stderr/structured partial data, progress metadata | Hermes tool dispatch lifecycle |
| `tool.done` | result, exit/status, duration, error flag | Hermes tool dispatch lifecycle |
| `approval.requested` | approval id, command/action summary, risk metadata, available choices | Hermes approval queue/control plane |
| `approval.resolved` | approval id, choice, resulting status | Hermes approval queue/control plane |
| `clarify.requested` | clarify id, question, choices/input mode | Hermes clarify lifecycle |
| `clarify.resolved` | clarify id, answer metadata/status | Hermes clarify lifecycle |
| `title.updated` | title text, title source/confidence | Hermes session/title subsystem |
| `usage.updated` / `usage.final` | tokens, cost, model/provider, duration where available | Hermes usage accounting |
| `error` | stable error code, safe message, redacted diagnostic metadata, terminal flag | Hermes run terminal/error state |
| `done` | final lifecycle state, usage, terminal result/error summary, last seq | Hermes run terminal state |

### Reconnect metadata

Every active or terminal run must expose:

- `run_id`
- `session_id`
- current `status`: `queued`, `running`, `awaiting_approval`,
  `awaiting_clarify`, `paused`, `cancelling`, `cancelled`, `failed`,
  `completed`, or `expired`
- last committed event cursor / `last_event_id`
- terminal state and final result/error when finished
- currently available controls
- pending approval/clarify ids, if any
- session-to-active-run mapping for the current WebUI session
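
The status values listed above could be modelled as an enumeration so that the adapter rejects unknown states instead of guessing. This is a sketch only: the RFC does not say which statuses are terminal, so the `TERMINAL_STATUSES` set below is an assumption, and `should_reattach` is a hypothetical helper.

```python
from enum import Enum


class RunStatus(str, Enum):
    QUEUED = "queued"
    RUNNING = "running"
    AWAITING_APPROVAL = "awaiting_approval"
    AWAITING_CLARIFY = "awaiting_clarify"
    PAUSED = "paused"
    CANCELLING = "cancelling"
    CANCELLED = "cancelled"
    FAILED = "failed"
    COMPLETED = "completed"
    EXPIRED = "expired"


# Assumption: these four are the terminal states; the RFC does not define this.
TERMINAL_STATUSES = frozenset(
    {RunStatus.CANCELLED, RunStatus.FAILED, RunStatus.COMPLETED, RunStatus.EXPIRED}
)


def should_reattach(snapshot: dict) -> bool:
    """True when the run is still live and the UI should resume observing it."""
    return RunStatus(snapshot["status"]) not in TERMINAL_STATUSES
```

For example, `should_reattach({"status": "awaiting_approval"})` is true, while a `cancelled` run only needs its terminal state rendered.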

### Controls

| WebUI control | Required semantics | Runtime endpoint / IPC |
|---|---|---|
| cancel | Request graceful cancellation of the current run; terminal event must follow | `cancel_run` / `interrupt` |
| queue / continue | Append follow-up work to a live, paused, or resumable run/session according to Hermes semantics | `queue_or_continue` |
| approval | Resolve a pending approval request with `allow_once`, `allow_session`, `always`, or `deny` where supported | `respond_approval` |
| clarify | Submit answer text or selected choice for a pending clarify request | `respond_clarify` |
| goal | Set, report status of, pause, resume, or clear a goal where Hermes exposes goal capability for this surface | command/capability API |
| observe | Attach to live events and replay from cursor | `observe_run` |
| status | Poll lifecycle state when SSE/WebSocket is unavailable | `get_run` |

WebUI may keep local UI state such as which disclosure rows are expanded, but it
must not infer or privately mutate runtime state for these controls.

## Hermes Runtime API / IPC v0 minimum

The transport can be HTTP, stdio IPC, WebSocket, or another Hermes-owned local
protocol. The key requirement is the semantic contract: Hermes owns the run id,
lifecycle, event cursor, controls, pending human-interaction state, and terminal
state.

### `start_run`

Creates a Hermes-owned run.

Input fields:

- `session_id` or instruction to create one
- user message / queued input
- workspace context and attachments metadata
- profile/provider/model/toolset hints
- source/surface metadata, e.g. `source=webui`
- optional command intent, e.g. `/goal` if parsed by WebUI command UI
- idempotency key for duplicate browser submissions

Output fields:

- `run_id`
- `session_id`
- initial `status`
- `observe` cursor / first event id
- supported controls for this run
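
The idempotency key in the input fields is what keeps a double-clicked send button from creating two runs. Below is a toy sketch of that behaviour; `InMemoryRuntime`, its parameters, and the returned field names such as `first_seq` are illustrative assumptions, not the Hermes API.

```python
import itertools


class InMemoryRuntime:
    """Toy stand-in for a Hermes runtime; not the real Hermes API."""

    def __init__(self) -> None:
        self._run_ids = itertools.count(1)
        self._runs_by_key: dict[tuple[str, str], dict] = {}

    def start_run(self, session_id: str, message: str, idempotency_key: str) -> dict:
        key = (session_id, idempotency_key)
        if key in self._runs_by_key:
            # Duplicate browser submission: return the run created earlier.
            return self._runs_by_key[key]
        run = {
            "run_id": f"run_{next(self._run_ids)}",
            "session_id": session_id,
            "status": "queued",
            "first_seq": 1,
            "controls": ["cancel"],
        }
        self._runs_by_key[key] = run
        return run


runtime = InMemoryRuntime()
first = runtime.start_run("sess_a", "hello", idempotency_key="submit-1")
retry = runtime.start_run("sess_a", "hello", idempotency_key="submit-1")
fresh = runtime.start_run("sess_a", "hello again", idempotency_key="submit-2")
```

Here `first` and `retry` refer to the same run, while `fresh` gets a new `run_id`. Scoping the key by session is one possible choice; the RFC leaves the scope unspecified.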

### `observe_run`

Streams ordered run events, with replay from a cursor.

Required behavior:

- support `after_seq` or `Last-Event-ID`
- emit events in monotonically increasing per-run order
- replay terminal `error` / `done` state for completed runs
- make duplicate delivery safe for reconnecting clients
- preserve enough history for short WebUI restarts and browser reloads
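
The required behavior above can be sketched with an in-memory, append-only log per run. Everything here is illustrative: `RunEventLog` and `seq_from_event_id` are hypothetical names, and the `run_id:seq` shape of `event_id` is assumed from the example envelope rather than mandated by this RFC.

```python
from typing import Iterator


def seq_from_event_id(event_id: str) -> int:
    """Parse the cursor from an id shaped like "run_123:42" (assumed format)."""
    return int(event_id.rsplit(":", 1)[1])


class RunEventLog:
    """In-memory, append-only event log for one run (illustrative only)."""

    def __init__(self, run_id: str) -> None:
        self.run_id = run_id
        self._events: list[dict] = []

    def append(self, type_: str, terminal: bool = False) -> dict:
        seq = len(self._events) + 1  # strictly increasing per run
        event = {
            "event_id": f"{self.run_id}:{seq}",
            "seq": seq,
            "run_id": self.run_id,
            "type": type_,
            "terminal": terminal,
        }
        self._events.append(event)
        return event

    def observe(self, after_seq: int = 0) -> Iterator[dict]:
        """Yield retained events with seq greater than the cursor, in order."""
        for event in self._events:
            if event["seq"] > after_seq:
                yield event


log = RunEventLog("run_123")
for type_ in ("run.started", "token.delta", "tool.started"):
    log.append(type_)
log.append("done", terminal=True)

# A client that last saw "run_123:2" resumes from seq 3.
resumed = [e["type"] for e in log.observe(after_seq=seq_from_event_id("run_123:2"))]
late_joiner = list(log.observe(after_seq=4))
```

A client reconnecting to the completed run with an older cursor still receives the terminal `done` event, while one that already holds the last cursor receives nothing new.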

### `get_run`

Returns current lifecycle state without consuming the event stream.

Required fields:

- `run_id`, `session_id`, `status`
- `created_at`, `updated_at`, optional `completed_at`
- `last_seq` / `last_event_id`
- active controls
- pending approval/clarify summaries
- terminal result/error summary
- usage/model/provider/profile/toolset summary where available

### `cancel_run` / interrupt

Requests graceful run cancellation or interruption. Hermes owns the final state
transition and emits a terminal event. WebUI should not directly toggle a local
cancellation flag as the source of truth.

### `queue_or_continue`

Submits follow-up work for a live, paused, or resumable run/session. Semantics
must match Hermes-native queue/continue behavior so WebUI does not create a
parallel continuation model.

### `respond_approval`

Resolves a pending approval request by id.

Required behavior:

- validate the approval belongs to the run/session
- accept only supported choices
- emit `approval.resolved`
- continue, pause, or fail the run according to Hermes approval semantics
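
The validation steps above might look like the following on the runtime side. This is a hedged sketch: the function signature, the `pending` mapping, and the payload shape are assumptions for illustration, and the choice set is taken from the Controls table.

```python
SUPPORTED_CHOICES = {"allow_once", "allow_session", "always", "deny"}


def respond_approval(pending: dict, run_id: str, approval_id: str, choice: str) -> dict:
    """Resolve one pending approval and return the event to emit.

    `pending` maps approval ids to the run that owns them (hypothetical store).
    """
    if pending.get(approval_id) != run_id:
        raise LookupError(f"approval {approval_id!r} is not pending for {run_id!r}")
    if choice not in SUPPORTED_CHOICES:
        raise ValueError(f"unsupported choice: {choice!r}")
    del pending[approval_id]  # a resolved approval cannot be answered twice
    return {
        "type": "approval.resolved",
        "run_id": run_id,
        "payload": {"approval_id": approval_id, "choice": choice},
    }


pending = {"appr_1": "run_123"}
event = respond_approval(pending, "run_123", "appr_1", "allow_once")
```

Removing the entry before returning means a second response from another browser tab fails with `LookupError` instead of silently overriding the first answer.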

### `respond_clarify`

Resolves a pending clarification request by id.

Required behavior:

- validate the clarify request belongs to the run/session
- accept text or selected-choice payloads
- emit `clarify.resolved`
- continue or fail the run according to Hermes clarify semantics

## Gap matrix

| Current WebUI primitive | Current role | Hermes-owned target | Temporary shim allowed? | Notes / gap |
|---|---|---|---|---|
| `STREAMS` / `STREAMS_LOCK` | Process-local live stream registry and subscriber fan-out | Hermes run registry + `observe_run` replay/fan-out | Yes, adapter may keep per-browser SSE connections only | Shim must not be the run source of truth and must recover after a WebUI restart by re-observing Hermes. |
| `CANCEL_FLAGS` | Local cancellation signal checked by WebUI-owned agent thread | `cancel_run` / interrupt control | No, except translating button clicks into runtime calls | Cancellation result must come back as Hermes status/events. |
| `AGENT_INSTANCES` | Cached `AIAgent` objects inside WebUI process | Hermes Agent runtime owns agent construction/reuse | No | Keeping this in the adapter would recreate the runtime surrogate. |
| Partial text buffers | Reconstruct live assistant deltas for browser reconnect/render | Hermes event log/cursor plus WebUI renderer cache | Short-lived presentation cache only | Source should be replayed token events or persisted transcript, not WebUI-only execution state. |
| Reasoning buffers | Preserve streamed reasoning/thinking text | Hermes reasoning events + replay | Short-lived presentation cache only | Replay must rebuild the same thinking cards after refresh. |
| Tool buffers / live tool calls | Render tool cards and updates | Hermes tool lifecycle events + replay | Short-lived presentation cache only | WebUI owns card rendering, not tool execution state. |
| Approval callbacks and queues | Bridge WebUI buttons to a live Python callback | Hermes pending approval state + `respond_approval` | No private callback queue | Pending approval must be discoverable after WebUI restart. |
| Clarify callbacks and queues | Bridge WebUI form to a live Python callback | Hermes pending clarify state + `respond_clarify` | No private callback queue | Pending clarify must be discoverable after WebUI restart. |
| Command capability metadata | Decide which slash commands render/execute in WebUI | Hermes command registry/capability API with owner/surface metadata | WebUI may cache metadata | Unknown commands should not be reimplemented in WebUI by default. |
| Session-to-active-run mapping | Stored implicitly in WebUI session JSON / active stream ids | Hermes session/run mapping API | WebUI may cache last seen run id | Reopening a session must rediscover the active or completed run from Hermes. |
| Reconnect/replay behavior | Depends on WebUI process memory and session JSON | `observe_run(after_seq)` + `get_run` terminal state | Browser SSE adapter only | First milestone must prove WebUI restart does not orphan the run. |
| Usage/title/status events | Produced by WebUI streaming callbacks | Hermes usage/title/status events and run state | WebUI formatting only | WebUI can display and persist presentation copies after events arrive. |
| Goal / queue / continue hooks | Mixed WebUI command handling and streaming callbacks | Hermes command/control plane | Only UI affordance shim | Goal support should be driven by Hermes capabilities. |

## Migration ladder

1. **Inventory and contract**: keep this RFC in sync with the current WebUI-owned
   runtime primitives and browser event/control contract.
2. **Hermes Runtime API / IPC v0**: add or stabilize upstream Hermes primitives
   for `start_run`, `observe_run`, `get_run`, `cancel_run`, and replayable event
   cursors.
3. **Read-only observation spike**: from WebUI, observe an existing Hermes-owned
   run and adapt its events into WebUI-compatible event objects without starting
   a WebUI-owned agent thread.
4. **Feature-flagged new-run path**: route new WebUI runs to Hermes-owned
   `start_run` behind a flag while preserving the legacy path as fallback.
5. **Restart/reattach milestone**: prove a non-trivial WebUI-started run
   survives a WebUI-only restart and browser reload with ordered replay.
6. **Controls migration**: move cancel, queue/continue, approval, clarify, and
   goal controls to Hermes-owned endpoints/capabilities.
7. **Parity tests**: compare legacy and adapter event streams for synthetic
   token, reasoning, tool, approval, clarify, error, and done scenarios.
8. **Retire runtime surrogate state**: remove normal WebUI chat ownership of
   `AIAgent`, cancellation flags, callback queues, and process-local run truth
   once parity and fallback criteria are satisfied.

## First success criterion

The first implementation milestone is not "basic chat streams through a new
endpoint." The first meaningful milestone is:

1. Start a non-trivial chat run from WebUI through the Hermes-owned path.
2. Restart only `hermes-webui` while the run is active.
3. Reload or reopen the browser session.
4. Rediscover the same `run_id` from Hermes using `session_id` or last known run
   metadata.
5. Replay events from the last cursor with no duplicate visible transcript
   content.
6. Render the same token/reasoning/tool/approval/clarify state the workbench
   would have rendered without the restart.
7. Cancel the run from WebUI and observe Hermes emit the terminal cancelled
   state.

If this works, WebUI is moving toward a protocol translator over Hermes-owned
execution instead of becoming another runtime with different variable names.
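
Steps 2 through 5 of the milestone can be simulated in a few lines to show what "restart does not orphan the run" means in practice. This is a toy model under stated assumptions: `FakeHermes`, `emit`, and `reattach` are hypothetical names, and the only WebUI-side state that survives the restart is the browser's last seen `seq`.

```python
class FakeHermes:
    """Toy runtime that outlives the WebUI process (not the real Hermes API)."""

    def __init__(self) -> None:
        self.active_run_by_session: dict[str, str] = {}
        self.events_by_run: dict[str, list[dict]] = {}

    def emit(self, session_id: str, run_id: str, type_: str) -> None:
        self.active_run_by_session[session_id] = run_id
        events = self.events_by_run.setdefault(run_id, [])
        events.append({"run_id": run_id, "seq": len(events) + 1, "type": type_})

    def observe_run(self, run_id: str, after_seq: int) -> list[dict]:
        return [e for e in self.events_by_run[run_id] if e["seq"] > after_seq]


def reattach(hermes: FakeHermes, session_id: str, last_seen_seq: int) -> list[dict]:
    """Rediscover the session's run and replay only events the browser lacks."""
    run_id = hermes.active_run_by_session[session_id]
    return hermes.observe_run(run_id, after_seq=last_seen_seq)


hermes = FakeHermes()
for type_ in ("run.started", "token.delta", "tool.started"):
    hermes.emit("sess_a", "run_123", type_)

# The browser rendered seq 1-2, then WebUI restarted; Hermes kept running.
hermes.emit("sess_a", "run_123", "tool.done")
missed = [(e["seq"], e["type"]) for e in reattach(hermes, "sess_a", last_seen_seq=2)]
```

The replay contains only events 3 and 4, so the transcript picks up where the browser left off without repeating content already rendered.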

## Open questions

- Where should the normative Hermes Runtime API / IPC v0 spec live: in
  `NousResearch/hermes-agent`, this WebUI RFC, or both with one designated
  source of truth?
- What retention window is enough for v0 event replay: active-run memory only,
  SQLite-backed event log, or transcript-derived reconstruction plus terminal
  state?
- Should WebUI talk to Hermes over the existing API server, an embedded IPC
  channel, or a profile-local runtime socket?
- How should multiple clients observing the same run coordinate controls and
  pending approval/clarify prompts?
- Which slash commands need surface-specific capability metadata before WebUI
  can safely delegate them to Hermes?