fix(playlist-proxy): clear pending reconnect timer on stop (#1132)#1407
Open
jakebromberg wants to merge 1 commit into
Open
fix(playlist-proxy): clear pending reconnect timer on stop (#1132)#1407jakebromberg wants to merge 1 commit into
jakebromberg wants to merge 1 commit into
Conversation
The EventSource 'error' handler unconditionally schedules a reconnect timer with no guard against `stopPlaylistProxy` having been called. Two failure modes: 1. If an 'error' event fires *after* `stopPlaylistProxy()` (queued in the EventLoop, or emitted by `close()` itself in some EventSource implementations), the handler schedules a fresh reconnect that wakes up and reopens the upstream connection the operator just asked to tear down. 2. If multiple 'error' events fire in rapid succession before any reconnect runs (cascading TCP failure during a deploy), each handler reassigns `reconnectTimer` to a fresh `setTimeout` — but the prior handles are still scheduled. `stopPlaylistProxy`'s `clearTimeout(reconnectTimer)` only clears the most recent one, so the earlier timers still fire and stack parallel SSE connections. Fix: - Module-level `stopped` flag, set by `stopPlaylistProxy` and cleared by `startPlaylistProxy`/`connectSSE`. The 'error' handler returns early when `stopped` is true. - Before assigning a new `reconnectTimer`, `clearTimeout` any prior value (prevents stacking). - Only escalate `reconnectDelay` when a reconnect is actually scheduled (moved inside the guard). - `stopPlaylistProxy` is now explicitly idempotent. Unit tests cover: error fired after stop (cancellation race), cascading errors then stop (stacked timers), repeated stop calls (idempotence), restart-after-stop (flag is cleared).
4 tasks
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Closes #1132.
Problem
apps/backend/services/playlist-proxy.service.ts's EventSource'error'handler unconditionally schedules a reconnect timer with no guard againststopPlaylistProxyhaving been called. Two failure modes:Cancellation race — an
'error'event queued in the EventLoop (or emitted byclose()itself in some EventSource implementations) fires afterstopPlaylistProxy(). The handler schedules a fresh reconnect that wakes up and reopens the upstream connection the operator just asked to tear down.Stacked timers — if multiple
'error'events fire in rapid succession before any reconnect runs (cascading TCP failure during a deploy disconnect), each handler reassignsreconnectTimerto a freshsetTimeoutwithout clearing the prior handle.stopPlaylistProxy'sclearTimeout(reconnectTimer)only cancels the most recent handle, so the earlier ones still fire and stack parallel SSE connections.The second pattern also unnecessarily escalates
reconnectDelaypastMAX_RECONNECT_DELAY(the* 2ran on every error, even ones whose timers were immediately cancelled).Fix
stoppedflag, set bystopPlaylistProxyand cleared bystartPlaylistProxy/connectSSE. The'error'handler returns early whenstoppedis true.reconnectTimer,clearTimeoutthe prior value — prevents the stacked-timer case directly even if the flag check is bypassed.reconnectDelay = Math.min(reconnectDelay * 2, MAX_RECONNECT_DELAY)moved inside the guard so backoff only escalates when a reconnect is actually scheduled.stopPlaylistProxyis now explicitly idempotent (documented; behavior unchanged for the single-call case).Test plan
New unit suite
tests/unit/services/playlist-proxy.stop.test.tscovers:stopPlaylistProxydoes not schedule a reconnect (failure mode Flowsheet #1)stopPlaylistProxystopPlaylistProxycalls in a row are idempotentstartPlaylistProxyafter stop clears thestoppedflag and re-arms reconnect behaviorLocal CI:
npm run format:check— cleannpm run lint— 0 errorsnpm run typecheck— cleannpm run test:unit— 3156/3156 passing