Skip to content

fix(mls/api): XIP-83 Subscribe — flush send-error, history_only overlap, reusable send timer#563

Open
tylerhawkes wants to merge 1 commit into
mainfrom
tyler/xip-83-subscribe-fixes
Open

fix(mls/api): XIP-83 Subscribe — flush send-error, history_only overlap, reusable send timer#563
tylerhawkes wants to merge 1 commit into
mainfrom
tyler/xip-83-subscribe-fixes

Conversation

@tylerhawkes

@tylerhawkes tylerhawkes commented Jun 22, 2026

Copy link
Copy Markdown
Contributor

XIP-83 Subscribe — three correctness/resource fixes

Follow-up to #562. While building the decentralized (d14n) binding of this same handler on xmtpd (xmtp/xmtpd#2020) and putting it through an independent multi-agent review, three issues surfaced that also exist in this v3 handler. (The d14n review caught them precisely because it was a fresh look at a sibling of code #562's own review had already passed — e.g. #562 fixed the flush timeout false-OK but not the send-error one below.)

1. flush() reported a false OK after a sender send-error

flush() returned nil on senderDone without checking sendErrCh. If the sender goroutine had already stopped on a stream.Send error, the graceful-completion path returned a clean OK while the wave's terminal frames (history tail + CatchupComplete) were never written — silent truncation with no reconnect signal. Now drains sendErrCh after senderDone and propagates the error (mirrors the existing timeout case, which already fails rather than false-OKs).

2. A second add overlapping an in-flight history_only wave

A history_only add never registers in subscribed/catchingUp, and there was no in-flight guard, so a second add for the same topic while its history_only catch-up was still running started a competing wave and reset the high-water floor — re-delivering history the first wave already sent (and orphaning it). Now rejected with InvalidArgument. (Covers the common no-remove overlap; a remove+re-add of such a topic is a narrower sequence dropTopic partially settles — noted in a comment.)

3. Per-send time.After allocation

send() allocated a time.After timer on every call (live for the whole pong deadline), accumulating under high throughput. Replaced with a reused per-session timer (send runs only on the writer goroutine).

Not changed (deliberately)

A cursor above MaxInt64 is left as-is#562 intentionally clamps it to "no history" with a clean close (TestSubscribe_CursorAboveInt64ReturnsNoHistory), and that decision is respected here. (The d14n binding rejects it instead, because its per-originator cursor path silently drops the entry and replays from 0 rather than clamping — a different, genuinely-buggy code path. Flagging the cross-binding inconsistency in case you want to align them.)

Tests

  • TestSubscribe_AddOverInflightHistoryOnlyRejected (new; gates the wave mid-fetch to hold it in flight).
  • Full TestSubscribe* suite green under -race; go vet + gofmt + golangci-lint clean.

The flush send-error and the timer reuse are exercised functionally by the existing suite (closures, not unit-testable in isolation); both are structurally identical to the equivalent fixes already tested on the d14n side.

🤖 Generated with Claude Code

Note

Fix flush send-error propagation, history_only overlap rejection, and reusable send timer in Subscribe

  • In flush() in subscribe.go, reads from sendErrCh after senderDone signals so the actual stream.Send error is returned instead of a false success.
  • Replaces per-call time.After in send() with a reusable sendTimer (reset each call, drained via a new stopTimer helper) to avoid stale timer events on subsequent resets.
  • Rejects an add request with InvalidArgument when the topic already has an in-flight history_only catch-up wave and is neither subscribed nor catching up.
  • Adds TestSubscribe_AddOverInflightHistoryOnlyRejected to verify the overlap rejection under a gated read store.

Macroscope summarized 4c511c7.

@tylerhawkes tylerhawkes requested a review from a team as a code owner June 22, 2026 21:02
@macroscopeapp

macroscopeapp Bot commented Jun 22, 2026

Copy link
Copy Markdown

Approvability

Verdict: Needs human review

This PR introduces a new validation rule that rejects requests targeting topics with in-flight history_only catch-ups - changing API behavior by failing requests that previously succeeded. Combined with the author not owning these files (owned by @xmtp/backend), human review is warranted.

You can customize Macroscope's approvability policy. Learn more.

@tylerhawkes tylerhawkes force-pushed the tyler/xip-83-subscribe-fixes branch from ced549f to b6d8022 Compare June 23, 2026 18:19
…ap guard, reusable send timer

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@tylerhawkes tylerhawkes force-pushed the tyler/xip-83-subscribe-fixes branch from b6d8022 to 4c511c7 Compare June 23, 2026 18:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants