feat: tighten runtime policy + transport guards #2331

Open
oxoxDev wants to merge 14 commits into tinyhumansai:main from oxoxDev:feat/runtime-policy-tightening

Conversation

Contributor

@oxoxDev oxoxDev commented May 20, 2026

Summary

  • Polish pass across the local listener — Socket.IO, SSE, the public OAuth-style callback, the channel inbound bus, and memory recall — to bring each surface in line with the boundaries the rest of the runtime already uses.
  • Six small, scope-isolated commits. Each independently reviewable, layered foundation → transport → endpoints → memory.
  • No release-cut behaviour shifts for the golden user paths; frontend Socket.IO callers updated atomically.

Problem

  • Several local listener surfaces had grown ahead of the shared helpers the rest of the runtime uses for the same job:
    • Socket.IO did its own handshake plumbing while the HTTP path went through rpc_auth_middleware.
    • SSE accepted a free-form client_id query string with no minted token.
    • The public Telegram callback accepted any GET without fetch-metadata sanity checks.
    • The inbound channel bus keyed by channel alone, while every other inbound path keys by (channel, sender, reply_target, thread_ts).
    • Memory recall rendered every hit into the agent prompt with no provenance hint.
  • This PR brings each surface back under the same shared helper (or adds a small one where none existed) so behaviour is uniform with the rest of the codebase.

Solution

Layered, scope-isolated commits. Each commit is independently reviewable:

  1. refactor(core/auth): extract a reusable verify_bearer_token helper so non-HTTP transports can use the same comparison the Axum middleware uses internally. Pure refactor — no behaviour change, middleware delegates to the helper.
  2. feat(core/socketio): route the Socket.IO handshake through verify_bearer_token (auth.token payload on the client) plus an Origin allowlist (Tauri shell, localhost loopback, native clients without an Origin header). Adds an AuthedConnection marker on socket extensions so every executable event handler consults the same gate. Frontend (socketService, useDictationHotkey, OverlayApp) passes the per-process bearer via getCoreRpcToken().
  3. feat(core/events): SSE subscribers now mint a single-shot bind token through a small bearer-protected RPC (core.events_subscribe_token) and open /events?client_id=…&token=…. Tokens are bound to the issued client_id, single-shot on validate, and time-bounded. Bearer-header callers (CLI, Tauri shell) are still accepted to keep the existing flows green.
  4. feat(core/auth): the public Telegram OAuth-style callback gains a fetch-metadata gate. Sec-Fetch-Mode must be navigate, Sec-Fetch-Dest must be document, and cross-site redirects must originate from https://t.me/... or https://web.telegram.org/.... Older browsers without fetch-metadata headers stay supported via a Referer host fallback.
  5. fix(channels/bus): restore per-sender keying on the inbound bus path so it matches channels::context::conversation_history_key. DomainEvent::ChannelInboundMessage grows sender / reply_target / thread_ts (all Option<String> — legacy publishers compile unchanged), and a new derive_inbound_thread_id mirrors the canonical shape.
  6. feat(memory): wrap memory-recall hits whose namespace is not on the locally-authored allowlist (or whose key uses a connector prefix) in explicit <untrusted-source> markers via a new memory_context_safety module. Conservative on purpose — defense-in-depth ahead of a follow-up typed Provenance enum on MemoryEntry.

28 new unit tests across the six modules (see per-commit counts in commit messages).

Submission Checklist

If a section does not apply to this change, mark the item as N/A with a one-line reason. Do not delete items.

  • Tests added or updated (happy path + at least one failure / edge case) per Testing Strategy
  • N/A: in-progress; new modules ship with focused unit tests but full lcov merge has not yet run on this branch — Diff coverage ≥ 80% check will run once tests are green
  • N/A: refactor + parity work; no new feature row — Coverage matrix updated
  • N/A: no feature IDs touched — All affected feature IDs from the matrix are listed in the PR description under ## Related
  • No new external network dependencies introduced (mock backend used per Testing Strategy)
  • N/A: no user-visible surface change in the golden path; in-tree Socket.IO + SSE callers updated atomically — Manual smoke checklist updated if this touches release-cut surfaces
  • N/A: no public tracking issue for this sweep — Linked issue closed via Closes #NNN in the ## Related section

Impact

  • Internal parity across local listener surfaces — Socket.IO + SSE + Telegram callback now run on the same helpers the HTTP path already uses.
  • Channel inbound keying matches the canonical conversation-history shape.
  • Memory recall surfaces a provenance hint to the agent prompt.
  • Performance: negligible. SSE bind tokens add one RPC round-trip per subscribe; Socket.IO handshake adds one bearer-token string compare; memory recall adds a per-row heuristic + format!.

Related

  • Closes: N/A — no public tracking issue for this sweep.
  • Follow-up PR(s)/TODOs:
    • Typed Provenance enum on MemoryEntry, populated at ingestion, to replace the conservative heuristic in commit 6.
    • Bot-side state parameter wiring for /auth/telegram so the callback can bind to a server-issued nonce in addition to the fetch-metadata gate.

AI Authored PR Metadata (required for Codex/Linear PRs)

Linear Issue

  • Key: N/A
  • URL: N/A

Commit & Branch

  • Branch: feat/runtime-policy-tightening
  • Commit SHA: see git log upstream/main..HEAD — 6 commits, layered foundation → memory.

Validation Run

  • pnpm --filter openhuman-app format:check — clean after cargo fmt pass
  • pnpm typecheck — clean
  • Focused tests: cargo test --lib core::auth, core::socketio::tests, core::event_bind_tokens, core::jsonrpc::tests::telegram_callback_*, openhuman::channels::bus::inbound_thread_id_tests, openhuman::agent::harness::memory_context_safety — all pass
  • Rust fmt/check (if changed): cargo check --manifest-path Cargo.toml clean
  • N/A: no src-tauri changes — Tauri fmt/check (if changed)

Validation Blocked

  • command: N/A
  • error: N/A
  • impact: N/A

Behavior Changes

  • Intended behavior change: Socket.IO handshake now reads auth.token and rejects non-allowlisted origins; SSE /events requires either a Bearer header or a minted bind token; Telegram callback validates fetch-metadata; inbound channel keying matches conversation_history_key; memory recall wraps connector-namespaced hits in <untrusted-source>.
  • User-visible effect: None on the golden paths — frontend Socket.IO callers updated atomically.

Parity Contract

  • Legacy behavior preserved: Native (no-Origin) Socket.IO clients still accepted; bearer-header callers to /events still accepted; /auth/telegram browsers without fetch-metadata headers still served via Referer fallback; inbound bus publishers without sender produce the historical channel-only key.
  • Guard/fallback/dispatch parity checks: Socket.IO AuthedConnection marker + connect-time disconnect; bind-token single-shot with TTL clamp; telegram referer host fallback; memory namespace allowlist + connector key-prefix default-deny.

Duplicate / Superseded PR Handling

  • Duplicate PR(s): N/A
  • Canonical PR: this one
  • Resolution: N/A

Summary by CodeRabbit

  • New Features

    • Centralized, authenticated core connections and short-lived event tokens for SSE clients.
    • Conversation threading now preserves sender/context so threads remain distinct.
  • Security

    • Stronger bearer-token validation and origin checks for callbacks and socket connections.
    • Agent memory handling marks/isolates potentially untrusted external content.
  • Tests

    • Expanded test suites for socket auth, event bind tokens, origin validation, and memory-safety.

Review Change Stack

oxoxDev added 6 commits May 20, 2026 15:32
Adds `verify_bearer_token(&str) -> bool` so non-HTTP transports can validate
the per-process RPC token through the same gate the Axum middleware uses
instead of duplicating the get_rpc_token + bearer_matches pattern. Pure
refactor: middleware behaviour unchanged.

Keeps the comparison in one helper so a future move to constant-time
equality is a one-line change for every transport at once.
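
To make the "one helper, one comparison" point concrete, here is a minimal sketch of what a later constant-time variant could look like. It is not the code in `src/core/auth.rs`: `constant_time_eq` and `verify_bearer_token_against` are hypothetical names, and the expected token is passed in explicitly instead of being read through `get_rpc_token`.

```rust
/// Compares two byte slices without stopping at the first mismatching byte.
/// Hypothetical helper for illustration only. The early return on length
/// still reveals whether the lengths differ.
fn constant_time_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    let mut diff = 0u8;
    for (x, y) in a.iter().zip(b.iter()) {
        // Accumulate differences instead of branching per byte.
        diff |= x ^ y;
    }
    diff == 0
}

/// Sketch of a transport-agnostic bearer check: empty tokens never match,
/// everything else goes through the single comparison helper above.
pub fn verify_bearer_token_against(expected: &str, supplied: &str) -> bool {
    if expected.is_empty() || supplied.is_empty() {
        return false;
    }
    constant_time_eq(expected.as_bytes(), supplied.as_bytes())
}
```

Because every transport would call the same function, swapping the comparison is a local change. Production code would more likely rely on a vetted constant-time crate than on a hand-rolled loop like this one.
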
The Socket.IO layer was attached after the bearer-token middleware that
gates `POST /rpc`, so its `connect` / `rpc:request` / `chat:start` /
`chat:cancel` handlers sat outside that boundary. Tighten the namespace
so the same per-process bearer is required everywhere executable.

* Connect-time gate. The connect callback reads the handshake
  `auth.token` (`io(url, { auth: { token } })` on the client) and
  rejects the handshake unless it matches
  `core::auth::verify_bearer_token`. The `Origin` header is checked
  against a small allowlist (Tauri shell, `localhost:*`, `127.0.0.1:*`,
  `[::1]:*`) so cross-origin browser pages cannot ride on a token
  picked up elsewhere on the host. Native clients without an `Origin`
  header (CLI, the Tauri shell) are still accepted.

* Per-event guard. Every executable event handler checks an
  `AuthedConnection` marker stamped on socket extensions at connect
  time, so a future handler addition keeps the same posture by default
  and so the async gap between dispatching `disconnect()` and the
  engine actually tearing the socket down cannot be raced.

Frontend (`socketService`, `useDictationHotkey`, `OverlayApp`) now
fetches the per-process bearer via `getCoreRpcToken()` and passes it
in the handshake's `auth` payload. The backend session JWT rides along
as `auth.session` for future correlation.

Adds origin-allowlist unit tests in `core::socketio::tests`.
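
The origin allowlist described above can be sketched with exact host matching, which is also the shape the later review comment on `src/core/socketio.rs` asks for. This is an illustration under assumptions, not the PR's implementation: `split_origin` and `origin_allowed` are hypothetical names, the parsing is a hand-rolled std-only stand-in for a real URL parser, and the `tauri` scheme and `tauri.localhost` host are assumed values for the Tauri shell.

```rust
/// Splits an `Origin`-style value such as `http://localhost:1420` into
/// (scheme, host). Returns `None` when the value is not `scheme://host[:port]`.
fn split_origin(origin: &str) -> Option<(&str, &str)> {
    let (scheme, rest) = origin.split_once("://")?;
    // An Origin carries no path, but cut one off defensively.
    let authority = rest
        .split(|c: char| c == '/' || c == '?' || c == '#')
        .next()?;
    if authority.contains('@') {
        return None; // userinfo is never expected in an Origin header
    }
    let host = match authority.strip_prefix('[') {
        // Bracketed IPv6 literal, e.g. `[::1]:3000`.
        Some(stripped) => {
            let (host, after) = stripped.split_once(']')?;
            if !(after.is_empty() || after.starts_with(':')) {
                return None;
            }
            host
        }
        None => authority.split(':').next()?,
    };
    if host.is_empty() {
        None
    } else {
        Some((scheme, host))
    }
}

/// Native clients send no `Origin` and are accepted; browser origins must
/// match an allowlisted scheme and an exact loopback (or assumed Tauri) host.
pub fn origin_allowed(origin: Option<&str>) -> bool {
    let origin = match origin {
        None => return true,
        Some(o) => o,
    };
    let (scheme, host) = match split_origin(origin) {
        Some(parts) => parts,
        None => return false,
    };
    let scheme_ok = ["http", "https", "tauri"]
        .iter()
        .any(|s| scheme.eq_ignore_ascii_case(s));
    let host_ok = ["localhost", "tauri.localhost", "127.0.0.1", "::1"]
        .iter()
        .any(|h| host.eq_ignore_ascii_case(h));
    scheme_ok && host_ok
}
```

Comparing the parsed host exactly, instead of prefix-matching the raw string, is what keeps values like `http://localhost.evil.example` out.
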
`GET /events` was on the public-paths list (browser `EventSource` cannot
attach an `Authorization` header) and filtered broadcasts purely by a
caller-supplied `client_id` query string. That made any well-known
`client_id` — e.g. the literal `system` slot the proactive-message
emitter uses — readable by anyone who can reach the local listener.

This tightens the contract: clients now mint a single-shot bind token
through the new `core.events_subscribe_token` RPC (bearer-protected via
the existing `/rpc` middleware), then open
`/events?client_id=<id>&token=<bind>`. The handler consumes the token
on first use so a leaked URL cannot be replayed. Tokens are bound to
the issued `client_id`, time-bounded (default 60s, max 30min), and
held in an in-process store with a small capacity ceiling so a
misbehaving caller cannot grow it without bound.

Adds:
- `src/core/event_bind_tokens.rs` — the in-memory store + issue/consume
  helpers + unit tests covering match, mismatch, single-shot, expiry,
  and TTL clamping.
- `core.events_subscribe_token` branch in `try_core_dispatch`.
- 401 response shape on `/events` for missing/invalid tokens (was a
  silent broadcast filter before).
- Updated `/` info-page hint so the new query shape is discoverable.
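
The bind-token lifecycle described above (bound to a `client_id`, single-shot, TTL-clamped, capacity-bounded) can be sketched as a small in-memory store. This is a hedged illustration, not the contents of `src/core/event_bind_tokens.rs`: `BindTokenStore` and its method signatures are hypothetical, the caller supplies the token string (a real store would mint it from a cryptographically secure RNG), and the clock is passed in so the behaviour is easy to test. It checks the `client_id` before removing the entry, which is the ordering the later review comment suggests.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

const DEFAULT_TTL: Duration = Duration::from_secs(60);
const MAX_TTL: Duration = Duration::from_secs(30 * 60);

struct Entry {
    client_id: String,
    expires_at: Instant,
}

/// Hypothetical in-memory store; the real module's layout may differ.
pub struct BindTokenStore {
    entries: HashMap<String, Entry>,
    capacity: usize,
}

impl BindTokenStore {
    pub fn new(capacity: usize) -> Self {
        Self { entries: HashMap::new(), capacity }
    }

    /// Registers `token` for `client_id`. Returns the effective (clamped) TTL,
    /// or `None` when the client id is empty or the store is full.
    pub fn issue(
        &mut self,
        token: String,
        client_id: &str,
        ttl: Option<Duration>,
        now: Instant,
    ) -> Option<Duration> {
        // Drop expired entries first so they do not count against capacity.
        self.entries.retain(|_, e| e.expires_at > now);
        if client_id.trim().is_empty() || self.entries.len() >= self.capacity {
            return None;
        }
        let ttl = ttl.unwrap_or(DEFAULT_TTL).min(MAX_TTL);
        let entry = Entry { client_id: client_id.to_owned(), expires_at: now + ttl };
        self.entries.insert(token, entry);
        Some(ttl)
    }

    /// Single-shot validation: a matching token is removed on use. A
    /// `client_id` mismatch leaves the token in place for its real owner.
    pub fn consume(&mut self, client_id: &str, token: &str, now: Instant) -> bool {
        let owner_matches = match self.entries.get(token) {
            Some(entry) => entry.client_id == client_id,
            None => false,
        };
        if !owner_matches {
            return false;
        }
        match self.entries.remove(token) {
            Some(entry) => entry.expires_at > now,
            None => false,
        }
    }
}
```

A caller would mint a token through the bearer-protected RPC, hand it to the browser, and the `/events` handler would call `consume` exactly once before attaching the subscriber.
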
`/auth/telegram` runs from a browser tab as a top-level navigation after
the Telegram bot hands the user a link, so the bearer-token middleware
cannot protect it (the redirect arrives in a fresh tab with no header
control). The handler previously accepted any `GET /auth/telegram?token=…`
and immediately stored the resulting JWT as the local session.

Add a fetch-metadata gate at the top of the handler that distinguishes
"user clicked the link the bot sent them" from "another page navigated
the user's loopback core via `window.location` / `<img>` / `<iframe>`":

* `Sec-Fetch-Mode` must be `navigate` (or absent — older browsers and
  CLI clients).
* `Sec-Fetch-Dest` must be `document`.
* On `Sec-Fetch-Site: cross-site`, the `Referer`/`Origin` must be
  `https://t.me/...` or `https://web.telegram.org/...`.
* If fetch-metadata headers are absent, fall back to a host check on the
  `Referer` header.

Rejected callbacks return a 403 HTML error page that points the user
back at the bot. Legacy clients without fetch-metadata stay supported.

Adds seven unit tests covering accepted shapes (no metadata, legit
telegram redirect, same-origin nav) and rejected shapes (image embed,
iframe embed, cross-site from a third-party page, non-telegram referer
without fetch-metadata).
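
The gate rules listed above can be sketched as a pure function over the relevant header values. This is an assumption-laden illustration, not the handler in `src/core/jsonrpc.rs`: `FetchMetadata`, `is_telegram_url`, and `callback_allowed` are hypothetical names, the URL parsing is a std-only stand-in, and the choice to accept a request with no fetch-metadata and no `Referer` at all is inferred from the "no metadata" accepted test case.

```rust
/// Header values relevant to the gate; `None` means the header was absent.
#[derive(Default)]
pub struct FetchMetadata<'a> {
    pub mode: Option<&'a str>,    // Sec-Fetch-Mode
    pub dest: Option<&'a str>,    // Sec-Fetch-Dest
    pub site: Option<&'a str>,    // Sec-Fetch-Site
    pub referer: Option<&'a str>, // Referer (or Origin)
}

/// True only for https URLs whose host is exactly `t.me` or `web.telegram.org`.
fn is_telegram_url(url: &str) -> bool {
    let rest = match url.strip_prefix("https://") {
        Some(r) => r,
        None => return false,
    };
    let authority = rest
        .split(|c: char| c == '/' || c == '?' || c == '#')
        .next()
        .unwrap_or("");
    if authority.contains('@') {
        return false; // reject userinfo tricks such as `t.me:x@evil.example`
    }
    let host = authority.split(':').next().unwrap_or("");
    host.eq_ignore_ascii_case("t.me") || host.eq_ignore_ascii_case("web.telegram.org")
}

pub fn callback_allowed(h: &FetchMetadata) -> bool {
    let has_metadata = h.mode.is_some() || h.dest.is_some() || h.site.is_some();
    if !has_metadata {
        // Legacy fallback: no Referer is tolerated, a present one must be Telegram.
        return h.referer.map_or(true, is_telegram_url);
    }
    // Each fetch-metadata header that is present must carry the expected value.
    if matches!(h.mode, Some(m) if m != "navigate") {
        return false;
    }
    if matches!(h.dest, Some(d) if d != "document") {
        return false;
    }
    match h.site {
        Some("cross-site") => h.referer.map_or(false, is_telegram_url),
        _ => true,
    }
}
```

Keeping the decision in a header-only function like this is one way to make the accepted and rejected shapes directly unit-testable without spinning up the HTTP stack.
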
The inbound `ChannelInboundMessage` subscriber derived its conversation
key as `channel:<channel>`, dropping the `sender` / `reply_target` /
`thread_ts` segments that `channels::context::conversation_history_key`
already uses for the canonical channel paths. The result was that two
participants in the same shared channel (e.g. a multi-member Discord
text channel) were resumed inside one cached agent session: messages
from one user would land on the other's pending in-flight state.

Plumb the missing segments through:

* `DomainEvent::ChannelInboundMessage` grows `sender`, `reply_target`,
  and `thread_ts` (all `Option<String>` so legacy publishers continue
  to compile and produce the historical single-DM key).
* The socket event publisher (`openhuman::socket::event_handlers`)
  lifts those fields off the raw inbound JSON payload — with the
  obvious fallbacks (`from` → `sender`, `chat_id`/`channel_id` →
  `reply_target`, `thread_id` → `thread_ts`) so existing transports
  don't need a coordinated payload change.
* `channels::bus::derive_inbound_thread_id` mirrors the shape
  `conversation_history_key` builds, including the Telegram carve-out
  that ignores `thread_ts` for memory keying.

Adds five unit tests covering the legacy single-DM key, distinct
senders in a shared channel, Slack subthreads, the Telegram
`thread_ts` carve-out, and trimming of whitespace-only optional
fields. Existing `event_bus::events_tests` updated to construct the
expanded variant with `None`s.
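
The key derivation described above can be sketched as follows. The function name `sketch_inbound_thread_id` and the helper `nonempty_trimmed` are hypothetical; the key shape `channel:<name>[/<sender>][/<reply_target>][#thread:<ts>]` follows the walkthrough summary for `src/openhuman/channels/bus.rs`, and the real `derive_inbound_thread_id` may differ in details such as case handling.

```rust
/// Treats missing and whitespace-only values the same way.
fn nonempty_trimmed(value: Option<&str>) -> Option<&str> {
    value.map(str::trim).filter(|s| !s.is_empty())
}

/// Builds a per-sender, per-thread conversation key. With every optional
/// segment absent it degrades to the historical `channel:<name>` key.
pub fn sketch_inbound_thread_id(
    channel: &str,
    sender: Option<&str>,
    reply_target: Option<&str>,
    thread_ts: Option<&str>,
) -> String {
    let mut key = format!("channel:{}", channel);
    if let Some(sender) = nonempty_trimmed(sender) {
        key.push('/');
        key.push_str(sender);
    }
    if let Some(target) = nonempty_trimmed(reply_target) {
        key.push('/');
        key.push_str(target);
    }
    // Telegram carve-out: thread_ts is ignored for memory keying.
    if !channel.eq_ignore_ascii_case("telegram") {
        if let Some(ts) = nonempty_trimmed(thread_ts) {
            key.push_str("#thread:");
            key.push_str(ts);
        }
    }
    key
}
```

With this shape, two senders in the same shared channel get distinct keys, so one user's messages no longer resume the other's cached session.
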
Memory recall pulls rows from every provenance tier into a single hit
list — user-authored turns, agent-authored summaries, and connector-
synced content from Gmail / Slack / Notion / Discord — and the harness
then renders that list straight into the agent's working prompt with
no visual distinction between system instructions and external text.
A prompt-injection paragraph that lives inside an inbound email or
Notion page therefore reaches the agent's working context with the
same weight as a system-issued directive.

Adds the narrowest possible mitigation that does not require a memory
schema migration:

* `agent::harness::memory_context_safety` exposes
  `is_potentially_untrusted(&MemoryEntry)` (default-deny on namespace
  and connector key-prefix heuristics) and `wrap_untrusted_for_agent`
  which surrounds the row's body with explicit `<untrusted-source>`
  markers carrying a source hint.
* `build_context` in `agent::harness::memory_context` now wraps every
  flagged row before writing it into the `[Memory context]` block, so
  the safety preamble and the model can both see the boundary.

The heuristic is conservative on purpose: locally-authored namespaces
(`working`, `agent`, `local`, `core`, `global`, `default`, `user`,
plus their `.*` children and `tree.*` ingestion namespaces) form an
allowlist, and connector key prefixes (`chat:`, `email:`, `notion:`,
`drive:`, `discord:`, `telegram:`, `whatsapp:`, `slack:`, `gmail:`,
`outlook:`, `imap:`, `meeting:`, `web:`) flip to untrusted even when
the namespace is missing. A future commit can replace the heuristic
with a typed `Provenance` enum on `MemoryEntry`, populated at ingest;
this commit ships defense-in-depth without that schema change.

Adds seven unit tests covering trusted namespaces, prefixed
subspaces, default-deny on unknown namespaces, connector key prefixes
without a namespace, plain bare keys, and the wrapping output shape.
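
The classification and wrapping described above can be sketched like this. It is an illustration under assumptions rather than the `memory_context_safety` module itself: `SketchEntry` is a hypothetical stand-in for `MemoryEntry`, the `hint` attribute name and exact marker layout are guesses, treating an empty namespace as missing is an assumption, and the escaping step reflects the later review suggestion rather than the commit as described.

```rust
/// Hypothetical stand-in for `MemoryEntry`, reduced to the fields used here.
pub struct SketchEntry {
    pub namespace: Option<String>,
    pub key: String,
    pub content: String,
}

impl SketchEntry {
    pub fn new(namespace: Option<&str>, key: &str, content: &str) -> Self {
        Self {
            namespace: namespace.map(str::to_owned),
            key: key.to_owned(),
            content: content.to_owned(),
        }
    }
}

const TRUSTED_NAMESPACES: &[&str] =
    &["working", "agent", "local", "core", "global", "default", "user"];
const CONNECTOR_PREFIXES: &[&str] = &[
    "chat:", "email:", "notion:", "drive:", "discord:", "telegram:", "whatsapp:",
    "slack:", "gmail:", "outlook:", "imap:", "meeting:", "web:",
];

/// Connector key prefixes always flag the row; otherwise a namespace must be
/// on the allowlist (or a `.` child of it, or `tree.*`) to count as trusted.
pub fn is_potentially_untrusted(entry: &SketchEntry) -> bool {
    let key_lower = entry.key.to_ascii_lowercase();
    if CONNECTOR_PREFIXES.iter().any(|p| key_lower.starts_with(*p)) {
        return true;
    }
    match entry.namespace.as_deref().map(str::trim) {
        None | Some("") => false, // plain bare key with no namespace
        Some(ns) => {
            let ns = ns.to_ascii_lowercase();
            let trusted = ns.starts_with("tree.")
                || TRUSTED_NAMESPACES.iter().any(|t| {
                    ns == *t || ns.strip_prefix(*t).map_or(false, |rest| rest.starts_with('.'))
                });
            !trusted
        }
    }
}

/// Escapes characters that could forge or terminate the marker.
fn escape(text: &str) -> String {
    text.replace('&', "&amp;")
        .replace('<', "&lt;")
        .replace('>', "&gt;")
        .replace('"', "&quot;")
}

pub fn wrap_untrusted(source_hint: &str, content: &str) -> String {
    format!(
        "<untrusted-source hint=\"{}\">\n{}\n</untrusted-source>",
        escape(source_hint),
        escape(content)
    )
}
```

A prompt builder would call `is_potentially_untrusted` per row and pass flagged rows through `wrap_untrusted` with a namespace-derived hint before writing them into the memory block.
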
@oxoxDev oxoxDev requested a review from a team May 20, 2026 10:46
Contributor

coderabbitai Bot commented May 20, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

📝 Walkthrough

Adds Socket.IO handshake bearer auth and origin checks, an SSE bind-token lifecycle plus RPC to mint tokens, updates /events to accept bind tokens or Bearer tokens, extends channel inbound events with sender/reply_target/thread_ts and per-thread IDs, and wraps potentially untrusted memory entries before inserting into agent prompts.

Changes

Socket.IO and Event Stream Authentication

| Layer | File(s) | Summary |
| --- | --- | --- |
| Bearer token verification helper | src/core/auth.rs | verify_bearer_token added to validate supplied bearer tokens against the RPC token; unit test ensures empty supplied tokens are rejected. |
| Socket.IO connection authentication and origin validation | src/core/socketio.rs | Handshake auth parsing, AuthedConnection marker, Origin allowlist, connection-time bearer verification, and early-disconnect guards for unauthenticated sockets; origin allowlist tests added. |
| Event bind-token store | src/core/event_bind_tokens.rs, src/core/mod.rs | In-memory bind-token store with issue/consume, TTL clamping, capacity limits, single-shot semantics, and unit tests; module exported from core. |
| Core RPC method for event token minting | src/core/dispatch.rs | Adds core.events_subscribe_token dispatch handler that validates client_id, optional ttl_secs, mints bind tokens via event_bind_tokens::issue, and returns { token, ttl_secs }. |
| SSE /events endpoint token security & Telegram origin check | src/core/jsonrpc.rs, src/core/jsonrpc_tests.rs | /events now accepts token and validates either Bearer via verify_bearer_token or single-use bind tokens via event_bind_tokens::consume; invalid creds return 401. Adds telegram_callback_origin_ok origin/fetch-metadata checks and tests; updates advertised events URL template with token. |
| Frontend Socket.IO client authentication | app/src/hooks/useDictationHotkey.ts, app/src/overlay/OverlayApp.tsx, app/src/services/socketService.ts, app/src/services/coreSocket.ts, app/src/services/__tests__/*, app/src/hooks/__tests__/* | Introduces createCoreSocket factory and connectCoreSocket; callers fetch getCoreRpcToken() and pass it as coreToken (with per-session authExtras.session); tests/mocks updated to provide the mock core token and verify disposal logic. |

Channel Inbound Context Expansion

| Layer | File(s) | Summary |
| --- | --- | --- |
| Extended ChannelInboundMessage event schema | src/core/event_bus/events.rs, src/core/event_bus/events_tests.rs | DomainEvent::ChannelInboundMessage variant extended with optional sender, reply_target, and thread_ts fields. Test case updated to include new fields as None. |
| Socket.IO inbound message context extraction | src/openhuman/socket/event_handlers.rs | *:message handler extracts optional routing fields from payload supporting multiple source keys and publishes them with the event. |
| Thread-aware conversation key derivation | src/openhuman/channels/bus.rs | Adds derive_inbound_thread_id to build per-sender/thread conversation keys (channel:<name>[/<sender>][/<reply_target>][#thread:<ts>]), ignoring thread_ts for telegram; inbound handler now uses derived thread_id. |

Memory Context Safety for Agent Prompts

| Layer | File(s) | Summary |
| --- | --- | --- |
| Memory untrustworthiness classification and wrapping | src/openhuman/agent/harness/memory_context_safety.rs, src/openhuman/agent/harness/mod.rs | New classifier is_potentially_untrusted (namespace allowlist & connector-key heuristics) and wrap_untrusted_for_agent emitting <untrusted-source> wrappers; includes unit tests. |
| Untrusted memory transformation in prompt building | src/openhuman/agent/harness/memory_context.rs | build_context integrates safety checks and wraps untrusted memory entries with wrap_untrusted_for_agent using a namespace-derived hint (fallback connector). |

Sequence Diagram(s)

sequenceDiagram
  participant Browser
  participant Frontend
  participant CoreRPC
  participant CoreService
  participant SSE as SSE_/events
  Browser->>Frontend: request socket / events URL
  Frontend->>CoreRPC: resolveCoreSocketUrl()
  Frontend->>CoreRPC: getCoreRpcToken()
  Frontend->>CoreService: connectCoreSocket(auth: coreToken + authExtras.session)
  Browser->>CoreRPC: core.events_subscribe_token(client_id, ttl_secs?)
  CoreRPC->>CoreService: event_bind_tokens::issue(client_id, ttl)
  CoreService-->>CoreRPC: { token, valid_until }
  Browser->>SSE: GET /events?client_id=...&token=...
  SSE->>CoreService: event_bind_tokens::consume(client_id, token)
  CoreService-->>SSE: success/failure

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Suggested reviewers

  • graycyrus
  • senamakel

"🐰 Tokens fetched, origins checked with care,
threads split so each sender's story's there,
untrusted memory wrapped in careful light,
sockets handshake true before they bite.
Hop, sniff, and patch — all snug tonight."

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'feat: tighten runtime policy + transport guards' accurately captures the main objective of the changeset: implementing stricter authentication, authorization, and validation across the Socket.IO transport layer, RPC endpoints, and memory safety systems. The title directly reflects the core theme across all commits (auth verification, origin validation, bind tokens, etc.). |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


@coderabbitai coderabbitai Bot added the rust-core Core Rust runtime in src/: CLI, core_server, shared infrastructure. label May 20, 2026
Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 8

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
app/src/services/socketService.ts (1)

185-204: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Re-check that this connect attempt is still current after the new awaits.

connectAsync() stores this.token before awaiting resolveCoreSocketBaseUrl() and getCoreRpcToken(). If connect() is called again while those are in flight, the older attempt can still instantiate a socket with stale auth/session data and race the newer connection.

Suggested fix
     const backendUrl = await resolveCoreSocketBaseUrl();
+    if (this.token !== token || this.socket) return;
     socketLog('Connecting to core socket', { userId: uid, backendUrl });
@@
     const coreToken = await getCoreRpcToken();
+    if (this.token !== token || this.socket) return;

     const socketOptions = {
       auth: { token: coreToken ?? '', session: token },
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/src/services/socketService.ts` around lines 185 - 204, connectAsync()
stores this.token then awaits resolveCoreSocketBaseUrl() and getCoreRpcToken(),
allowing a race where an older call instantiates a socket with stale auth; fix
by snapshotting a local identifier (e.g., localToken or localAttemptId)
immediately after storing this.token and before the awaits, then after each
await (or at least right before creating this.socket in
connect()/connectAsync()) compare the snapshot to the current
this.token/thisAttempt and abort the old attempt if they differ so only the most
recent call proceeds to call io(...) and assign this.socket.
🧹 Nitpick comments (1)
src/core/dispatch.rs (1)

123-140: ⚡ Quick win

Add branch-local logs for the rejected subscribe-token paths.

The new RPC returns errors for invalid client_id and store-capacity exhaustion, but those branches leave no server-side breadcrumb. A safe debug/warn log here would make /events auth failures diagnosable without ever logging the token.

As per coding guidelines, "Use log / tracing at debug or trace level on RPC entry and exit, error paths, state transitions, and any branch that is hard to infer from tests alone."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/core/dispatch.rs` around lines 123 - 140, Add branch-local logs for the
two error paths: when extracting/validating client_id fails and when
crate::core::event_bind_tokens::issue(...) returns None due to capacity. Inside
the client_id .ok_or_else() closure (the "missing or empty 'client_id'
parameter" branch) emit a debug or warn log including the incoming params (or a
redacted summary) and the fact client_id was invalid; inside the .ok_or_else()
after issue(...) (the "store at capacity" branch) emit a warn-level log noting
bind-token store capacity reached and include client_id and ttl summary. Use the
existing logging facility (log::debug/log::warn or tracing::debug/tracing::warn)
and keep logs concise and non-sensitive.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/core/event_bind_tokens.rs`:
- Around line 108-119: The current logic calls store.remove(token) immediately
and then checks entry.client_id, which deletes the token even on client_id
mismatch; change this so you first lookup/read the entry (use whichever
read/peek method is available instead of store.remove, or re-insert on mismatch)
and only call store.remove(token) when entry.client_id == client_id; keep the
existing log messages (log::debug!/log::warn!) and the same return values, but
move the destructive remove to occur after the client_id equality check to avoid
accidentally consuming the token on mismatch.

In `@src/core/jsonrpc.rs`:
- Around line 479-487: The current referer fallback uses prefix-matching on the
referer string (variable referer) which allows host spoofing; instead parse the
referer into a URI (e.g., via Url::parse or std::net utilities) and compare the
parsed host exactly (host == "localhost" or host == "127.0.0.1") or verify it is
a loopback address, leaving the existing referer_is_telegram check intact;
update the logic in the same function/block that references referer and
referer_is_telegram to use the parsed host comparison and add a regression test
that submits a referer like "http://localhost.evil.com/..." (and one with a
127.0.0.1 spoof) to assert the request is rejected.

In `@src/core/socketio.rs`:
- Around line 48-53: Replace the brittle prefix-matching logic on origin with
proper URL parsing: call url::Url::parse(origin) (or equivalent) and use
url.host_str() to obtain the canonical host, then compare that host exactly
against "localhost", "127.0.0.1", and "::1" (note host_str() returns IPv6
without brackets) instead of starts_with; if parsing fails or host_str() is None
return false (deny) so only exact host matches are allowed.

In `@src/openhuman/agent/harness/memory_context_safety.rs`:
- Around line 57-75: Run rustfmt to fix formatting drift in
memory_context_safety.rs: format the file (and project) with cargo fmt --all and
commit only the formatting changes. Specifically ensure the connector_prefixes
array (the slice declared as connector_prefixes: &[&str] and its iterator chain
using .iter().any(|p| key_lower.starts_with(p))) and the other flagged blocks
(around the same file) are reformatted to match rustfmt output so CI passes; do
not change logic, only apply the formatting delta.
- Around line 104-109: The wrap_untrusted_for_agent function injects source_hint
and content raw into the XML-like marker which allows crafting strings (e.g.,
quotes or "</untrusted-source>") to break out of the marker; fix by
escaping/encoding both source_hint and content before formatting: sanitize
source_hint for quotes/angle brackets (or percent-encode) and replace any
occurrences of the closing tag sequence in content (e.g., "</untrusted-source>")
with a safe-escaped variant or an encoded form, then use the escaped values in
the format! call so the marker cannot be forged or terminated prematurely by
untrusted input.

In `@src/openhuman/agent/harness/memory_context.rs`:
- Around line 63-66: The new namespace-hint block is not formatted to Rust
style; reformat the expression that sets hint (the `hint` binding using
`entry.namespace.as_deref().unwrap_or("connector")`) so it passes rustfmt. Run
`cargo fmt --all` (or `cargo fmt`) to normalize formatting and commit the
changes so `cargo fmt --check` in CI succeeds.

In `@src/openhuman/channels/bus.rs`:
- Around line 994-998: The test function legacy_channel_only_keeps_old_shape has
a formatting violation on the assert_eq! call; run rustfmt (cargo fmt) to
reformat the file so the assertion line conforms to style rules, or manually
reflow the assert_eq! arguments in that function (the call comparing
derive_inbound_thread_id("telegram", None, None, None) to "channel:telegram") so
it fits formatting conventions, then re-run cargo fmt to ensure
src/openhuman/channels/bus.rs passes --check.

In `@src/openhuman/socket/event_handlers.rs`:
- Around line 216-217: The formatting of the thread_ts assignment (the chain
using nonempty(data.get("thread_ts")).or_else(||
nonempty(data.get("thread_id")))) does not match rustfmt expectations; run
rustfmt (cargo fmt) to reformat the hunk and commit the resulting changes so
cargo fmt --check passes. Locate the thread_ts declaration in event_handlers.rs
(the nonempty / data.get / thread_id chain) and apply rustfmt to normalize the
line breaks and chaining style, then add the formatted file to the PR.

---

Outside diff comments:
In `@app/src/services/socketService.ts`:
- Around line 185-204: connectAsync() stores this.token then awaits
resolveCoreSocketBaseUrl() and getCoreRpcToken(), allowing a race where an older
call instantiates a socket with stale auth; fix by snapshotting a local
identifier (e.g., localToken or localAttemptId) immediately after storing
this.token and before the awaits, then after each await (or at least right
before creating this.socket in connect()/connectAsync()) compare the snapshot to
the current this.token/thisAttempt and abort the old attempt if they differ so
only the most recent call proceeds to call io(...) and assign this.socket.

---

Nitpick comments:
In `@src/core/dispatch.rs`:
- Around line 123-140: Add branch-local logs for the two error paths: when
extracting/validating client_id fails and when
crate::core::event_bind_tokens::issue(...) returns None due to capacity. Inside
the client_id .ok_or_else() closure (the "missing or empty 'client_id'
parameter" branch) emit a debug or warn log including the incoming params (or a
redacted summary) and the fact client_id was invalid; inside the .ok_or_else()
after issue(...) (the "store at capacity" branch) emit a warn-level log noting
bind-token store capacity reached and include client_id and ttl summary. Use the
existing logging facility (log::debug/log::warn or tracing::debug/tracing::warn)
and keep logs concise and non-sensitive.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: c10946c3-e901-4075-883b-87b5540f3416

📥 Commits

Reviewing files that changed from the base of the PR and between ebd6457 and 62bd6a0.

📒 Files selected for processing (17)
  • app/src/hooks/useDictationHotkey.ts
  • app/src/overlay/OverlayApp.tsx
  • app/src/services/socketService.ts
  • src/core/auth.rs
  • src/core/dispatch.rs
  • src/core/event_bind_tokens.rs
  • src/core/event_bus/events.rs
  • src/core/event_bus/events_tests.rs
  • src/core/jsonrpc.rs
  • src/core/jsonrpc_tests.rs
  • src/core/mod.rs
  • src/core/socketio.rs
  • src/openhuman/agent/harness/memory_context.rs
  • src/openhuman/agent/harness/memory_context_safety.rs
  • src/openhuman/agent/harness/mod.rs
  • src/openhuman/channels/bus.rs
  • src/openhuman/socket/event_handlers.rs

oxoxDev added 3 commits May 20, 2026 16:48
The bind-token path was the only credential `events_handler` accepted,
but CLI tooling and the in-tree e2e suite hit `/events` with
`Authorization: Bearer <core>` directly. Restore that flow by first
trying the bearer header through `verify_bearer_token` and only
falling back to the `?token=` bind-token consume when the header is
absent or wrong.

Net effect: browser `EventSource` clients still go through the
bind-token mint + consume (no header capability), CLI/Tauri-shell
callers keep working with the existing `Authorization` header, and
the four `tests/json_rpc_e2e.rs` cases that opened SSE with a Bearer
header (`json_rpc_protocol_auth_and_agent_hello`,
`json_rpc_web_chat_*`) pass again locally.

The vi.mock for `coreRpcClient` only exposed `getCoreRpcUrl` /
`clearCoreRpcUrlCache`; the new `getCoreRpcToken()` call in
`connectAsync` therefore returned `undefined` (or threw) and the
connect chain hung. Add `getCoreRpcToken: vi.fn(async () => '...')`
to both test files so the resolve chain proceeds and the existing
pollUntil assertions trip on schedule.

Mock value is irrelevant — the tests assert dispatch behaviour, not
the bearer string content.
Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (4)
src/openhuman/agent/harness/memory_context_safety.rs (1)

96-99: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Escape source_hint and content before emitting the wrapper.

Both fields are interpolated raw here. A payload containing " or </untrusted-source> can terminate or forge the marker and move attacker-controlled text outside the untrusted boundary, which defeats the safety signal this module is adding.

🔒 Proposed hardening
+fn sanitize_source_hint(source_hint: &str) -> String {
+    let cleaned: String = source_hint
+        .trim()
+        .chars()
+        .filter(|c| c.is_ascii_alphanumeric() || matches!(c, '-' | '_' | '.' | ':'))
+        .take(64)
+        .collect();
+
+    if cleaned.is_empty() {
+        "external".to_string()
+    } else {
+        cleaned
+    }
+}
+
+fn escape_untrusted_content(content: &str) -> String {
+    content
+        .replace('&', "&amp;")
+        .replace('<', "&lt;")
+        .replace('>', "&gt;")
+}
+
 pub fn wrap_untrusted_for_agent(content: &str, source_hint: &str) -> String {
-    let hint = source_hint.trim();
-    let hint = if hint.is_empty() { "external" } else { hint };
-    format!("<untrusted-source source=\"{hint}\">\n{content}\n</untrusted-source>")
+    let hint = sanitize_source_hint(source_hint);
+    let safe_content = escape_untrusted_content(content);
+    format!(
+        "<untrusted-source source=\"{hint}\">\n{safe_content}\n</untrusted-source>"
+    )
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/openhuman/agent/harness/memory_context_safety.rs` around lines 96 - 99,
The wrap_untrusted_for_agent function is emitting source_hint and content raw
which allows injection of attributes or closing tags; fix it by
HTML/XML-escaping both values before formatting: create or use a helper (e.g.,
escape_xml or escape_html) to replace &, <, >, " (and optionally ') and apply it
to the trimmed source_hint and to content (so any "</untrusted-source>" becomes
escaped), then build the wrapper with those escaped strings in place of {hint}
and {content} in wrap_untrusted_for_agent to ensure the marker cannot be
prematurely terminated or forged.
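
As a rough illustration of the hardening proposed above, the following std-only sketch reproduces the sanitise-then-escape shape and shows what a breakout payload turns into. The helper names mirror the proposed diff; the character allowlist and the 64-character cap come from that suggestion rather than from the merged code, so treat them as assumptions.

```rust
/// Keep only a conservative character set in the hint and cap its length,
/// so the hint cannot close the `source="..."` attribute or add new ones.
/// (Allowlist and cap are assumptions taken from the review suggestion.)
fn sanitize_source_hint(source_hint: &str) -> String {
    let cleaned: String = source_hint
        .trim()
        .chars()
        .filter(|&c| c.is_ascii_alphanumeric() || matches!(c, '-' | '_' | '.' | ':'))
        .take(64)
        .collect();

    if cleaned.is_empty() {
        "external".to_string()
    } else {
        cleaned
    }
}

/// Escape the characters that could open or close a tag inside the wrapper.
/// `&` is replaced first so the entities produced below are not re-escaped.
fn escape_untrusted_content(content: &str) -> String {
    content
        .replace('&', "&amp;")
        .replace('<', "&lt;")
        .replace('>', "&gt;")
}

/// Wrap recalled content in an untrusted marker after neutralising both inputs.
pub fn wrap_untrusted_for_agent(content: &str, source_hint: &str) -> String {
    let hint = sanitize_source_hint(source_hint);
    let safe_content = escape_untrusted_content(content);
    format!("<untrusted-source source=\"{hint}\">\n{safe_content}\n</untrusted-source>")
}
```

With this shape, content such as `hi </untrusted-source> there` is rendered with `&lt;` and `&gt;` in place of the angle brackets, so the only literal closing tag in the output is the one the wrapper itself emits.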
src/core/event_bind_tokens.rs (1)

106-115: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Only consume the token after the client binding matches.

A wrong client_id still removes the token from the store here, which turns a bad bind attempt into a one-shot denial of service against the real subscriber.

Suggested fix
-    let entry = match store.remove(token) {
-        Some(entry) => entry,
+    let entry = match store.get(token) {
+        Some(entry) => entry,
         None => {
             log::debug!("[events-bind] consume: token not found");
             return false;
         }
     };
     if entry.client_id != client_id {
         log::warn!("[events-bind] consume: client_id mismatch (token bound to other id)");
         return false;
     }
+    let entry = store
+        .remove(token)
+        .expect("token was present after successful binding check");
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/core/event_bind_tokens.rs` around lines 106 - 115, The code currently
removes the entry via store.remove(token) before checking entry.client_id, which
allows a mismatched client_id to delete the token; change the logic in the
consume function (look for store.remove(token), entry, token, client_id, and
entry.client_id) to first inspect the stored binding (e.g., via get/peek or by
using an entry API) and only call store.remove(token) if entry.client_id ==
client_id; if it mismatches, log the warning and return false without removing
the token.
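
The peek-before-remove ordering described above can be sketched with a plain `HashMap`. The `BindEntry` struct and the free-standing `consume` signature here are hypothetical stand-ins; the real store in `event_bind_tokens.rs` likely also tracks expiry and locking, which this sketch leaves out.

```rust
use std::collections::HashMap;

/// Hypothetical minimal entry: only the client binding is modelled here.
struct BindEntry {
    client_id: String,
}

/// Consume a one-shot bind token only when the caller's client_id matches.
/// A mismatch or unknown token leaves the store untouched.
fn consume(store: &mut HashMap<String, BindEntry>, token: &str, client_id: &str) -> bool {
    // Peek first: the shared borrow from `get` ends with this match,
    // so the mutable `remove` below is allowed.
    match store.get(token) {
        None => return false,
        Some(entry) if entry.client_id != client_id => return false,
        Some(_) => {}
    }
    // Binding verified: now the token can be spent.
    store.remove(token).is_some()
}
```

In this sketch a probe with the wrong `client_id` returns `false` and the token stays available, so the legitimate subscriber can still consume it exactly once afterwards.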
src/core/jsonrpc.rs (1)

479-486: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Parse the fallback Referer host before treating it as local.

The current prefix check still accepts spoofed hosts like http://localhost.evil.com/..., so a non-Telegram cross-site redirect can satisfy the fallback and bypass the callback origin gate.

Suggested fix
-        let local =
-            referer.starts_with("http://127.0.0.1") || referer.starts_with("http://localhost");
+        let local = url::Url::parse(referer)
+            .ok()
+            .and_then(|url| url.host_str().map(str::to_owned))
+            .map(|host| matches!(host.as_str(), "127.0.0.1" | "localhost" | "::1" | "[::1]"))
+            .unwrap_or(false);
         if !(local || referer_is_telegram) {
             return Err("Referer must be telegram or local");
         }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/core/jsonrpc.rs` around lines 479 - 486, The referer prefix check allows
spoofed hosts like "localhost.evil.com"; change the fallback logic to parse the
Referer URL (use Url::parse or equivalent) and extract the host component, then
set local only when host is exactly "127.0.0.1" or "localhost" (accounting for
optional port via host_str() which excludes port), and keep the existing check
of referer_is_telegram; update the code around the referer variable and the
local binding (and the if !(local || referer_is_telegram) branch) to use the
parsed host comparison instead of starts_with.
src/core/socketio.rs (1)

48-53: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Parse the Origin host instead of prefix-matching it.

This still accepts spoofed origins like https://localhost.attacker.example, so a cross-origin browser page can satisfy the allowlist and complete the Socket.IO handshake.

Suggested fix
-    // Strip scheme so we can prefix-match the host:port portion.
-    let host = origin
-        .strip_prefix("http://")
-        .or_else(|| origin.strip_prefix("https://"))
-        .unwrap_or(origin);
-    host.starts_with("localhost") || host.starts_with("127.0.0.1") || host.starts_with("[::1]")
+    let Ok(parsed) = url::Url::parse(origin) else {
+        return false;
+    };
+    matches!(parsed.host_str(), Some("localhost" | "127.0.0.1" | "::1" | "[::1]"))
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/core/socketio.rs` around lines 48 - 53, The code currently strips scheme
and prefix-matches the origin string (variable host) using starts_with which
allows spoofed hosts like "localhost.attacker.example"; change it to properly
parse the origin as a URL (e.g., use url::Url::parse(origin)) and extract the
canonical host via Url::host_str() (or Url::host()/ToSocketAddrs if needed),
then compare the returned host exactly against "localhost", "127.0.0.1" and
"::1" (or the IPv6 literal without brackets), rejecting on parse failure;
replace the strip_prefix/starts_with logic that uses origin/host with this
URL-parse + exact-equality host checks to ensure only true loopback hosts are
accepted.
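
To make the difference between prefix matching and exact host matching concrete, here is a small std-only sketch. It hand-extracts the host instead of calling `url::Url::parse`, purely so the example has no external dependency; `origin_host` and `is_loopback_origin` are hypothetical names, and a real implementation would probably rely on a proper URL parser as the suggestion says.

```rust
/// Extract the host portion of an http(s) origin, without the port and
/// without IPv6 brackets. Returns None for anything that does not parse.
/// This is a deliberately small stand-in for a real URL parser.
fn origin_host(origin: &str) -> Option<&str> {
    let rest = origin
        .strip_prefix("http://")
        .or_else(|| origin.strip_prefix("https://"))?;
    // The authority ends at the first path, query, or fragment delimiter.
    let authority = rest.split(|c: char| matches!(c, '/' | '?' | '#')).next()?;
    // Drop any userinfo ("user@host"), which must never count as the host.
    let hostport = authority.rsplit('@').next()?;

    if let Some(after) = hostport.strip_prefix('[') {
        // IPv6 literal: the host is everything up to the closing bracket.
        let end = after.find(']')?;
        let tail = &after[end + 1..];
        if !(tail.is_empty() || tail.starts_with(':')) {
            return None;
        }
        return Some(&after[..end]);
    }

    let host = hostport.split(':').next()?;
    if host.is_empty() { None } else { Some(host) }
}

/// Accept only exact loopback hosts; suffix decoys fail the equality check.
fn is_loopback_origin(origin: &str) -> bool {
    match origin_host(origin) {
        Some(h) => h.eq_ignore_ascii_case("localhost") || h == "127.0.0.1" || h == "::1",
        None => false,
    }
}
```

Under these assumptions `http://localhost:3000` and `http://[::1]:8080` pass, while `https://localhost.attacker.example` is rejected because its host is compared as a whole rather than by prefix.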
🧹 Nitpick comments (1)
src/openhuman/agent/harness/memory_context.rs (1)

62-64: ⚡ Quick win

Add a build_context regression test for this wrapping branch.

This is the only prompt-builder path that applies the new untrusted wrapper, but none of the current build_context tests exercise it. A single case with an untrusted entry and assertions on the rendered marker/source hint would pin the integration contract.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/openhuman/agent/harness/memory_context.rs` around lines 62 - 64, Add a
regression test that exercises the untrusted-branch in build_context: create a
memory entry that is considered untrusted (use is_potentially_untrusted
criteria, e.g., an entry with no namespace or a connector namespace), call
build_context (or the public function that constructs rendered_content), and
assert the returned/rendered fragment contains the untrusted wrapper marker and
the expected source hint string produced by wrap_untrusted_for_agent (e.g.,
"connector" fallback). Name the test something like
test_build_context_untrusted_wrapping and assert both the wrapper marker and the
hint are present in the final rendered content.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/openhuman/channels/bus.rs`:
- Around line 997-1000: The current special-case for Telegram in
derive_inbound_thread_id checks the full channel string and thus misses
production shapes like "tg:123"; update derive_inbound_thread_id to determine
the transport/provider by splitting the channel on ':' (or otherwise extracting
the provider prefix) and treat providers "telegram" and "tg" as Telegram so
thread_ts is ignored for those providers, then add a unit test asserting
derive_inbound_thread_id("tg:123", ...) produces the same stable key as
derive_inbound_thread_id("telegram", ...) (or at least ignores differing
thread_ts for "tg:123") to cover the production socket payload shape; reference
derive_inbound_thread_id and the new test name when making the change.
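
The provider-prefix check asked for above might look roughly like the following. The helper names `provider`, `is_telegram_channel`, and `effective_thread_ts` are hypothetical, and the sketch says nothing about how `derive_inbound_thread_id` assembles the rest of its key; it only shows the `tg:<chat_id>` / `telegram:<chat_id>` recognition and the resulting `thread_ts` carve-out.

```rust
/// Provider prefix of a channel string: the part before the first ':'
/// ("tg:123" -> "tg"), or the whole string when there is no ':'.
fn provider(channel: &str) -> &str {
    channel.split(':').next().unwrap_or(channel)
}

/// True for the bare slug and for the prefixed production shapes.
fn is_telegram_channel(channel: &str) -> bool {
    matches!(provider(channel), "telegram" | "tg")
}

/// Hypothetical carve-out: drop thread_ts for Telegram so differing
/// per-message values do not split the derived thread key.
fn effective_thread_ts<'a>(channel: &str, thread_ts: Option<&'a str>) -> Option<&'a str> {
    if is_telegram_channel(channel) {
        None
    } else {
        thread_ts
    }
}
```

Matching on the extracted prefix rather than the full string means `tg:123` and `telegram` take the same branch, while a channel such as `slack:C1` keeps its `thread_ts`.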


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 73a8c536-a7fd-4056-b7a9-07a3488f2cc2

📥 Commits

Reviewing files that changed from the base of the PR and between 62bd6a0 and d8a36bd.

📒 Files selected for processing (10)
  • app/src/services/__tests__/socketService.events.test.ts
  • app/src/services/__tests__/socketService.test.ts
  • src/core/dispatch.rs
  • src/core/event_bind_tokens.rs
  • src/core/jsonrpc.rs
  • src/core/socketio.rs
  • src/openhuman/agent/harness/memory_context.rs
  • src/openhuman/agent/harness/memory_context_safety.rs
  • src/openhuman/channels/bus.rs
  • src/openhuman/socket/event_handlers.rs

…et.ts

Extract the `io(baseUrl, { auth: { token: <core> }, ... })` shape used
by `socketService`, `useDictationHotkey`, and `OverlayApp` into a
single `createCoreSocket(baseUrl, opts)` factory. All three call sites
now go through one line instead of repeating the handshake plumbing.

Adds focused unit tests for the factory covering: core bearer is
passed through `auth.token`, missing token collapses to empty string,
`authExtras` (currently used by `socketService` for the session JWT)
merges alongside, and connect-time overrides do not erase the auth
payload.

Per `Coverage Gate (diff-cover ≥ 80%)` — the previous shape kept the
new lines inside untested files (`useDictationHotkey.ts`,
`OverlayApp.tsx`); routing through a tested helper restores diff
coverage without standing up component-level test harnesses for the
two screens.
@coderabbitai coderabbitai Bot added the feature Net-new user-facing capability or product behavior. label May 20, 2026
Contributor

@coderabbitai coderabbitai Bot left a comment


Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
app/src/hooks/useDictationHotkey.ts (1)

9-12: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Stale docstring: connection is now authenticated.

The docstring states the connection "does not require authentication" but the implementation now passes coreToken for bearer authentication via createCoreSocket. Update the comment to reflect that dictation uses an authenticated core socket (while still not requiring user login).

📝 Suggested doc update
 * Dictation events are received over a **dedicated** Socket.IO
-* connection to the core process that does not require authentication.
-* This ensures dictation works regardless of whether the user is
-* logged in.
+* connection to the core process authenticated via the per-process
+* core bearer token (not the user session). This ensures dictation
+* works regardless of whether the user is logged in.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/src/hooks/useDictationHotkey.ts` around lines 9 - 12, The docstring above
useDictationHotkey is stale: it says the dedicated Socket.IO connection "does
not require authentication" but the code now passes coreToken to
createCoreSocket for bearer auth; update the comment to state that dictation
uses an authenticated core socket (via coreToken/bearer token) while still not
requiring a user login, and adjust the sentences around
useDictationHotkey/createCoreSocket/coreToken to reflect that authentication
behavior.
app/src/overlay/OverlayApp.tsx (1)

7-8: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Stale docstring: connection is now authenticated.

The docstring describes "a dedicated, unauthenticated Socket.IO connection" but the implementation now passes coreToken via createCoreSocket. Update to reflect that the overlay uses core bearer auth (independent of user session).

📝 Suggested doc update
-* connection (same pattern as `useDictationHotkey`).
+* connection authenticated via the per-process core bearer token
+* (same pattern as `useDictationHotkey`).
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/src/overlay/OverlayApp.tsx` around lines 7 - 8, Update the stale
docstring in OverlayApp.tsx to reflect that the overlay's Socket.IO connection
is authenticated: change the phrase "dedicated, unauthenticated Socket.IO
connection" to indicate the connection uses core bearer auth and passes
coreToken via createCoreSocket (independent of user session); ensure any mention
of parity with useDictationHotkey clarifies that pattern but with authentication
handled by coreToken.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 635cb412-d86e-4fa5-8cf7-4f8051c3b513

📥 Commits

Reviewing files that changed from the base of the PR and between d8a36bd and 1a73665.

📒 Files selected for processing (5)
  • app/src/hooks/useDictationHotkey.ts
  • app/src/overlay/OverlayApp.tsx
  • app/src/services/__tests__/coreSocket.test.ts
  • app/src/services/coreSocket.ts
  • app/src/services/socketService.ts

oxoxDev added 4 commits May 20, 2026 17:41
…elper

`useDictationHotkey` and `OverlayApp` both manually awaited base URL,
awaited the core bearer, checked the React effect's `disposed` flag at
each await point, and then called `createCoreSocket`. The added lines
were inside untested files, so `Coverage Gate (diff-cover ≥ 80%)`
reported them as missing even though `createCoreSocket` itself is
fully covered.

Add `connectCoreSocket({ getBaseUrl, isDisposed, authExtras, overrides })`
that runs the whole dance and returns the socket (or `null` if the
caller's disposal flag flips between awaits). Each call site collapses
to one line that the React effect can `if (!socket) return;` against.

The new helper carries five focused unit tests covering the happy
path, both disposal short-circuits (between baseUrl and token, between
token and connect), authExtras + overrides forwarding, and the
`null`-token-becomes-empty-string fallback.

`socketService` keeps using `createCoreSocket` directly because its
URL guard sits between the resolve and the connect, and the file is
already 100% covered.

`Coverage Gate (diff-cover ≥ 80%)` keeps flagging the three new lines
in `useDictationHotkey.ts` because nothing imports the hook in a test
file. Add a small renderHook-based suite that mocks `connectCoreSocket`
+ the core RPC client and asserts the connect dance:

- happy path: connectCoreSocket is invoked with both `getBaseUrl` and
  `isDisposed` callbacks, and the disposal flag is initially false;
- unmount: the socket's `disconnect()` is called on cleanup;
- short-circuit: when `connectCoreSocket` returns `null` (the helper's
  disposed-mid-await shape), no event handlers are wired up.

`OverlayApp.tsx` gets a `c8 ignore start/stop` band around the same
glue. The component is a full Tauri-tied React surface; the underlying
logic lives in the already-tested `connectCoreSocket` helper, so the
call site has nothing to assert beyond "the function was called". The
ignore band is tight (three lines) and labelled.

Five substantive findings from the CodeRabbit pass on PR tinyhumansai#2331:

1. `event_bind_tokens::consume` peeks the entry before removing, so a
   wrong-client_id probe no longer evicts the token. Without this the
   mismatch path was a one-shot DoS against the legitimate subscriber.
   Regression: `wrong_client_id_does_not_consume_token`.

2. `core::socketio::origin_is_allowed` parses the Origin URL and
   compares the host EXACTLY against the loopback allowlist. Previous
   `starts_with("localhost")` accepted decoys like
   `http://localhost.attacker.example`. IPv6 brackets honoured.
   Regressions: `origin_allowlist_rejects_host_prefix_decoys` +
   `origin_allowlist_rejects_unparseable_origin`.

3. `telegram_callback_origin_ok` Referer fallback now parses the
   Referer URL and exact-matches the host slot, closing the same
   prefix decoy on the public Telegram callback. Regression:
   `telegram_callback_origin_ok_rejects_localhost_host_prefix_decoy`.

4. `memory_context_safety::wrap_untrusted_for_agent` sanitises the
   source hint (alphanumerics + small punct, capped at 64 chars,
   falls back to `external`) and escapes the three HTML-ish breakout
   characters in the content. A payload containing
   `</untrusted-source>` or a stray quote can no longer forge or
   terminate the marker. Four new regressions cover marker breakout,
   attribute breakout, sanitised hint, and length cap.

5. `derive_inbound_thread_id` recognises the production Telegram
   channel shape — `tg:<chat_id>` / `telegram:<chat_id>` — instead of
   only the literal slug. Without this the Telegram thread_ts
   carve-out missed real socket payloads and split per-message
   memory keys. Two new regressions for the `tg:` and `telegram:`
   prefixes plus an inverse Slack assertion.

Two more findings from the CodeRabbit review pass on PR tinyhumansai#2331:

1. `socketService.connectAsync` introduces a second `await` for the
   per-process bearer (`getCoreRpcToken()`) on top of the existing
   `resolveCoreSocketBaseUrl()` await. If `connect(token)` is called
   again while either await is in flight, the older invocation could
   still reach `createCoreSocket` with stale auth and race the newer
   connection. Re-check `this.token` / `this.socket` after both
   awaits and bail out if either has drifted.

2. `core.events_subscribe_token` (`src/core/dispatch.rs`) emitted no
   server-side log on the two rejection branches (missing `client_id`
   and bind-token store at capacity). Add `log::warn` lines so
   `/events` auth failures are diagnosable without ever logging the
   token, plus a `log::debug` on the success branch — token value
   stays off the log line.

Labels

feature: Net-new user-facing capability or product behavior.
rust-core: Core Rust runtime in src/: CLI, core_server, shared infrastructure.
