Skip to content

fix(openresponses): bound resume-stream buffer and enforce response ownership#10569

Merged
mudler merged 1 commit into
masterfrom
fix/openresponses-stream-buffer-and-ownership
Jun 28, 2026
Merged

fix(openresponses): bound resume-stream buffer and enforce response ownership#10569
mudler merged 1 commit into
masterfrom
fix/openresponses-stream-buffer-and-ownership

Conversation

@localai-bot

Copy link
Copy Markdown
Collaborator

Description

Two latent issues in the OpenResponses background=true resumable-stream path, found during a design review.

1. Unbounded resume-stream buffer (memory risk)

AppendEvent appended to StoredResponse.StreamEvents without any limit. A long-running or abandoned background generation (especially one no client ever resumes) grew this buffer indefinitely and could exhaust process memory.

The store now caps the per-response resume buffer by event count (default 8192) and total serialized bytes (default 64 MiB), mirroring llama.cpp's byte-capped slot ring. When a cap is exceeded the oldest events are evicted from the front (their payloads nil'd so they can be GC'd) and a droppedThrough watermark advances.

Offset-lost handling: GetEventsAfter now returns the sentinel ErrOffsetLost when the requested starting_after is below the dropped watermark, i.e. the events the client expects next were evicted. handleStreamResume checks this before writing SSE headers and returns HTTP 409 with a clear message, so a resuming client gets an explicit error instead of a stream that silently skips the gap.

Caps live on the store (defaults from package constants, lowerable in tests). A package constant is used for v1 rather than new config plumbing, per the small-PR scope.

2. Response-ownership check (latent IDOR)

GET /responses/:id, its ?stream=true resume, and /cancel looked up responses purely by ID with no ownership check, so any caller who knew or guessed an ID could read or cancel another caller's response/stream.

Responses now carry the creating caller's identity (auth.GetUser(c).ID), stamped at creation via store.SetOwner before the ID is handed back to the client, and compared on read / cancel / resume. On mismatch the handlers return 404 (not 403) so the existence of another caller's response is not leaked.

Backward compatible: the check is gated on a non-empty owner. In single-key / no-auth deployments auth.GetUser returns nil, the owner is empty, and existing behavior is preserved (the response stays accessible).

Notes for Reviewers

Both fixes are included. The ownership fix was assessed as small and clean before implementing:

  • Per-request identity is reliably available via auth.GetUser(c) (the same source the usage/billing middleware uses).
  • Every store call site (StoreBackground + the three synchronous Store calls) already has the echo context, so owner stamping needed no new plumbing.
  • The store stays auth-agnostic: it only holds an opaque Owner string and a SetOwner setter; the identity comparison (accessAllowed) and the auth dependency live in the handler layer.

TDD: store_test.go gains two Ginkgo specs - one that appends past a lowered cap and asserts the buffer stays bounded, oldest events are evicted, and GetEventsAfter below the watermark returns ErrOffsetLost; and one that asserts owner stamping + the allow/deny decision (including the empty-owner backward-compat path). Both fail before the change.

gofmt, go vet, golangci-lint (new-from-merge-base), and go test ./core/http/endpoints/openresponses/... all pass.

Signed commits

  • Yes, I signed my commits.

…wnership

The background=true resumable-stream path had two latent issues.

1. Unbounded resume buffer. AppendEvent grew StreamEvents without limit, so
   a long-running or abandoned background generation could consume process
   memory without bound. The store now caps the buffer (event count and total
   bytes, mirroring llama.cpp's byte-capped slot ring), evicting oldest events
   from the front and advancing a droppedThrough watermark. GetEventsAfter
   returns ErrOffsetLost when the requested starting_after is below the
   watermark, and handleStreamResume surfaces that as HTTP 409 before
   committing to the SSE response, so a resuming client gets a clear error
   instead of a silently truncated stream.

2. Missing ownership check (IDOR). GET /responses/:id, its stream resume, and
   /cancel looked up responses purely by ID, letting any caller who knows or
   guesses an ID read or cancel another caller's response. Responses now carry
   the creating caller's identity (auth.GetUser), stamped at creation and
   compared on read/cancel/resume; a mismatch returns 404 (not 403) so
   existence is not leaked. Backward compatible: responses with no owner
   (single-key / no-auth deployments) remain accessible.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Assisted-by: Claude:claude-opus-4-8 [Claude Code]
@mudler mudler merged commit ade9cc9 into master Jun 28, 2026
61 checks passed
@mudler mudler deleted the fix/openresponses-stream-buffer-and-ownership branch June 28, 2026 00:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants