fix(playlist-proxy): clear pending reconnect timer on stop (#1132) #1407

Open
jakebromberg wants to merge 1 commit into main from fix/1132-playlist-proxy-stop
@jakebromberg
Member

Closes #1132.

Problem

apps/backend/services/playlist-proxy.service.ts's EventSource 'error' handler unconditionally schedules a reconnect timer with no guard against stopPlaylistProxy having been called. Two failure modes:

  1. Cancellation race: an 'error' event queued in the event loop (or emitted by close() itself in some EventSource implementations) fires after stopPlaylistProxy(). The handler schedules a fresh reconnect that wakes up and reopens the upstream connection the operator just asked to tear down.

  2. Stacked timers: if multiple 'error' events fire in rapid succession before any reconnect runs (for example, a cascading TCP failure while a deploy drops connections), each handler reassigns reconnectTimer to a fresh setTimeout without clearing the prior handle. stopPlaylistProxy's clearTimeout(reconnectTimer) only cancels the most recent handle, so the earlier ones still fire and stack parallel SSE connections.

The second pattern also unnecessarily escalates reconnectDelay past MAX_RECONNECT_DELAY (the * 2 ran on every error, even ones whose timers were immediately cancelled).
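The stacked-timer failure mode can be reproduced in isolation. The sketch below is only an illustration, not the service code: `FakeTimers` and `simulateUnguardedHandler` are hypothetical helpers that stand in for `setTimeout`/`clearTimeout` so the behavior is deterministic, and the handler body mirrors the pre-fix pattern as described above (overwrite the handle without clearing it).

```typescript
// Hypothetical deterministic stand-in for setTimeout/clearTimeout.
class FakeTimers {
  private nextId = 1;
  private pending = new Map<number, () => void>();

  set(fn: () => void): number {
    const id = this.nextId++;
    this.pending.set(id, fn);
    return id;
  }

  clear(id: number | null): void {
    if (id !== null) this.pending.delete(id);
  }

  // Runs every callback that is still pending, as if all delays had elapsed.
  runAll(): void {
    const callbacks = Array.from(this.pending.values());
    this.pending.clear();
    for (const fn of callbacks) fn();
  }
}

// Returns how many reconnects still run after "stop" when the error
// handler overwrites reconnectTimer without clearing the previous handle.
function simulateUnguardedHandler(errorCount: number): number {
  const timers = new FakeTimers();
  let reconnectTimer: number | null = null;
  let connections = 0;

  const onError = (): void => {
    // Pre-fix pattern: the earlier handle is lost, but its timer stays scheduled.
    reconnectTimer = timers.set(() => {
      connections += 1;
    });
  };

  for (let i = 0; i < errorCount; i++) onError();

  // What stop does in the pre-fix code: cancel only the latest handle.
  timers.clear(reconnectTimer);
  timers.runAll();
  return connections;
}
```

With three cascading errors, `simulateUnguardedHandler(3)` returns 2: the two orphaned timers still fire even though stop cancelled the latest one. With a single error it returns 0, which is why the single-error path looked correct.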

Fix

  • Module-level stopped flag, set by stopPlaylistProxy and cleared by startPlaylistProxy / connectSSE. The 'error' handler returns early when stopped is true.
  • Before assigning a new reconnectTimer, clearTimeout the prior value — prevents the stacked-timer case directly even if the flag check is bypassed.
  • reconnectDelay = Math.min(reconnectDelay * 2, MAX_RECONNECT_DELAY) moved inside the guard so backoff only escalates when a reconnect is actually scheduled.
  • stopPlaylistProxy is now explicitly idempotent (documented; behavior unchanged for the single-call case).
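A minimal sketch of how the guards listed above might fit together. This is not the code from the PR: the `connectSSE` body is a placeholder for opening the EventSource, and `handleError`, `hasPendingReconnect`, `connectionAttempts`, and the numeric delay values are assumptions introduced for illustration. Only `stopped`, `reconnectTimer`, `reconnectDelay`, `MAX_RECONNECT_DELAY`, `startPlaylistProxy`, `stopPlaylistProxy`, and `connectSSE` are names taken from the description.

```typescript
const MAX_RECONNECT_DELAY = 30000; // assumed value, in milliseconds

let stopped = true;
let reconnectTimer: ReturnType<typeof setTimeout> | null = null;
let reconnectDelay = 1000; // assumed initial delay, in milliseconds
let attempts = 0;

// Placeholder: the real service would open the EventSource here and
// register handleError as its 'error' listener.
function connectSSE(): void {
  attempts += 1;
}

// Stand-in for the EventSource 'error' handler.
function handleError(): void {
  // Guard 1: never schedule a reconnect once stop has been requested.
  if (stopped) return;

  // Guard 2: cancel any earlier handle so timers cannot stack.
  if (reconnectTimer !== null) clearTimeout(reconnectTimer);

  reconnectTimer = setTimeout(() => {
    reconnectTimer = null;
    connectSSE();
  }, reconnectDelay);

  // Backoff escalates only on the path that actually scheduled a reconnect.
  reconnectDelay = Math.min(reconnectDelay * 2, MAX_RECONNECT_DELAY);
}

function startPlaylistProxy(): void {
  stopped = false;
  connectSSE();
}

// Idempotent: a second call finds no timer and leaves the flag set.
function stopPlaylistProxy(): void {
  stopped = true;
  if (reconnectTimer !== null) {
    clearTimeout(reconnectTimer);
    reconnectTimer = null;
  }
}

// Hypothetical inspection helpers, useful in tests.
function hasPendingReconnect(): boolean {
  return reconnectTimer !== null;
}

function connectionAttempts(): number {
  return attempts;
}
```

The flag and the `clearTimeout` before reassignment overlap on purpose: the flag covers errors that arrive after stop, while clearing the prior handle keeps at most one timer alive even if the flag check were bypassed.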

Test plan

New unit suite tests/unit/services/playlist-proxy.stop.test.ts covers:

  • Error fired after stopPlaylistProxy does not schedule a reconnect (failure mode 1)
  • Queued reconnect timer is cleared by stopPlaylistProxy
  • Cascading errors followed by stop produce no parallel reconnects (failure mode 2)
  • Multiple stopPlaylistProxy calls in a row are idempotent
  • startPlaylistProxy after stop clears the stopped flag and re-arms reconnect behavior

Local CI:

  • npm run format:check — clean
  • npm run lint — 0 errors
  • npm run typecheck — clean
  • npm run test:unit — 3156/3156 passing

The EventSource 'error' handler unconditionally schedules a reconnect timer with no guard against `stopPlaylistProxy` having been called. Two failure modes:

1. If an 'error' event fires *after* `stopPlaylistProxy()` (queued in the event loop, or emitted by `close()` itself in some EventSource implementations), the handler schedules a fresh reconnect that wakes up and reopens the upstream connection the operator just asked to tear down.

2. If multiple 'error' events fire in rapid succession before any reconnect runs (cascading TCP failure during a deploy), each handler reassigns `reconnectTimer` to a fresh `setTimeout` — but the prior handles are still scheduled. `stopPlaylistProxy`'s `clearTimeout(reconnectTimer)` only clears the most recent one, so the earlier timers still fire and stack parallel SSE connections.

Fix:
- Module-level `stopped` flag, set by `stopPlaylistProxy` and cleared by `startPlaylistProxy`/`connectSSE`. The 'error' handler returns early when `stopped` is true.
- Before assigning a new `reconnectTimer`, `clearTimeout` any prior value (prevents stacking).
- Only escalate `reconnectDelay` when a reconnect is actually scheduled (moved inside the guard).
- `stopPlaylistProxy` is now explicitly idempotent.

Unit tests cover: error fired after stop (cancellation race), cascading errors then stop (stacked timers), repeated stop calls (idempotence), restart-after-stop (flag is cleared).
Development

Successfully merging this pull request may close these issues.

playlist-proxy stopPlaylistProxy doesn't prevent queued reconnects