Problem
The EventSource 'error' handler in playlist-proxy.service.ts unconditionally schedules a reconnect timer with no guard against stopPlaylistProxy having been called. Two failure modes:
- If stopPlaylistProxy() runs before a queued 'error' event is delivered, the error handler still schedules reconnectTimer = setTimeout(() => connectSSE(), reconnectDelay). stopPlaylistProxy has already run its clearTimeout(reconnectTimer) by then, so nothing cancels the new timer; when it fires, a fresh SSE connection opens with no operator instruction to do so.
- If multiple 'error' events fire before any reconnect runs (rare but possible during cascading TCP failure), each handler reassigns reconnectTimer to a fresh setTimeout, but the prior setTimeout handle is still scheduled. Each prior setTimeout invokes connectSSE() independently, and the clearTimeout(reconnectTimer) in stopPlaylistProxy only clears the most recent handle. Stacked parallel SSE connections result.
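The second pattern also escalates reconnectDelay to MAX_RECONNECT_DELAY faster than intended: Math.min(reconnectDelay * 2, MAX_RECONNECT_DELAY) runs on every error, even ones that lead to immediately-cancelled timers. The Math.min cap keeps the delay from exceeding MAX_RECONNECT_DELAY, but a burst of errors can reach the cap before a single reconnect has been attempted.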
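To illustrate the backoff escalation, here is a small sketch that applies the handler's doubling step once per 'error' event. The constant values (1000 ms initial, 30000 ms max) and the helper name delayAfterErrors are assumptions for illustration; the real values live in playlist-proxy.service.ts.

```typescript
// Assumed values for illustration only; not taken from the service file.
const INITIAL_RECONNECT_DELAY = 1000;
const MAX_RECONNECT_DELAY = 30000;

// Hypothetical helper: applies the handler's doubling step once per 'error' event.
function delayAfterErrors(errorCount: number): number {
  let reconnectDelay = INITIAL_RECONNECT_DELAY;
  for (let i = 0; i < errorCount; i++) {
    reconnectDelay = Math.min(reconnectDelay * 2, MAX_RECONNECT_DELAY);
  }
  return reconnectDelay;
}

const afterOneError = delayAfterErrors(1);    // 2000
const afterBurstOfFour = delayAfterErrors(4); // 16000
const afterLongBurst = delayAfterErrors(10);  // 30000 (capped by Math.min)
```

Under these assumed values, a burst of four errors leaves the next reconnect waiting 16 seconds even though only one outage occurred.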
Evidence
apps/backend/services/playlist-proxy.service.ts:240-246:
es.addEventListener('error', () => {
  console.error(`[playlist-proxy] SSE error, reconnecting in ${reconnectDelay}ms`);
  es.close();
  if (currentEventSource === es) currentEventSource = null;
  reconnectTimer = setTimeout(() => connectSSE(), reconnectDelay);
  reconnectDelay = Math.min(reconnectDelay * 2, MAX_RECONNECT_DELAY);
});
apps/backend/services/playlist-proxy.service.ts:168-178 (stop path only clears the latest handle):
export function stopPlaylistProxy(): void {
  if (currentEventSource) { currentEventSource.close(); currentEventSource = null; }
  if (reconnectTimer) { clearTimeout(reconnectTimer); reconnectTimer = null; }
  connected = false;
}
Reproduction
Pattern 1: call stopPlaylistProxy() while an 'error' event is queued in the event loop. Pattern 2: trigger a pre-deploy disconnect cascade in which multiple 'error' events fire in rapid succession before any reconnect runs.
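Both patterns can be reproduced deterministically in a stripped-down model, without a real EventSource. The sketch below is not the service code: it copies only the reconnectTimer handling from the two snippets under Evidence, and the fake timer queue (fakeSetTimeout, fakeClearTimeout, runPendingTimers) and the onError wrapper are hypothetical helpers introduced for illustration.

```typescript
// Hypothetical fake timer queue so the sketch runs synchronously.
type Timer = { id: number; fn: () => void; cancelled: boolean };
const timers: Timer[] = [];
let nextId = 1;

function fakeSetTimeout(fn: () => void): Timer {
  const t: Timer = { id: nextId++, fn, cancelled: false };
  timers.push(t);
  return t;
}
function fakeClearTimeout(t: Timer): void {
  t.cancelled = true;
}
function runPendingTimers(): void {
  // Fire every timer that has not been cancelled.
  for (const t of timers.splice(0)) {
    if (!t.cancelled) t.fn();
  }
}

// Model of the module state and the two code paths quoted under Evidence.
let reconnectTimer: Timer | null = null;
let connectCalls = 0;

function connectSSE(): void {
  connectCalls++;
}
function onError(): void {
  // Overwrites the previous handle without cancelling it.
  reconnectTimer = fakeSetTimeout(() => connectSSE());
}
function stopPlaylistProxy(): void {
  if (reconnectTimer) { fakeClearTimeout(reconnectTimer); reconnectTimer = null; }
}

// Pattern 1: stop first, then a queued 'error' is delivered.
stopPlaylistProxy();
onError();
runPendingTimers();
const reconnectsAfterStop = connectCalls; // 1: a connection opened after stop

// Pattern 2: two errors before any reconnect runs.
connectCalls = 0;
reconnectTimer = null;
onError();
onError();
runPendingTimers();
const stackedReconnects = connectCalls; // 2: both timers fired

// Pattern 2 followed by stop: only the latest handle is cancelled.
connectCalls = 0;
reconnectTimer = null;
onError();
onError();
stopPlaylistProxy();
runPendingTimers();
const leakedAfterStop = connectCalls; // 1: the first timer still fired
```

In the model, every nonzero count after stopPlaylistProxy() corresponds to an SSE connection the operator did not ask for.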
Acceptance criteria
- Add a stopped flag; set in stopPlaylistProxy, cleared in startPlaylistProxy / connectSSE. The 'error' handler must check it before scheduling.
- Before assigning reconnectTimer, clearTimeout any prior value (prevents stacking).
- Only escalate reconnectDelay when a reconnect is actually scheduled (move the line inside the guard).
Related
- The file's existing self-documenting comment block (lines 189-200) about why there is no app-level heartbeat; relevant context for whoever fixes this.
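One possible shape for a fix that meets the three acceptance criteria, as a sketch rather than a patch against the real file. The factory createReconnectGuard, its method names, and the constant values are all hypothetical; in the service itself the same logic would presumably live in the existing module-level variables and the 'error' listener.

```typescript
// Assumed values for illustration only; not taken from the service file.
const INITIAL_RECONNECT_DELAY = 1000;
const MAX_RECONNECT_DELAY = 30000;

// Hypothetical factory wrapping the reconnect state; connectSSE is injected.
function createReconnectGuard(connectSSE: () => void) {
  let stopped = false;
  let reconnectDelay = INITIAL_RECONNECT_DELAY;
  let reconnectTimer: ReturnType<typeof setTimeout> | null = null;

  return {
    // Would be called from startPlaylistProxy / connectSSE.
    start(): void {
      stopped = false;
    },
    // Would be called from the 'error' listener after es.close().
    onError(): void {
      if (stopped) return; // criterion 1: no scheduling after stop
      if (reconnectTimer) clearTimeout(reconnectTimer); // criterion 2: no stacking
      reconnectTimer = setTimeout(() => {
        reconnectTimer = null;
        if (!stopped) connectSSE();
      }, reconnectDelay);
      // criterion 3: escalate only when a reconnect was actually scheduled
      reconnectDelay = Math.min(reconnectDelay * 2, MAX_RECONNECT_DELAY);
    },
    // Would be called from stopPlaylistProxy.
    stop(): void {
      stopped = true;
      if (reconnectTimer) { clearTimeout(reconnectTimer); reconnectTimer = null; }
    },
    // Exposed only so the sketch can be inspected.
    pendingDelay(): number {
      return reconnectDelay;
    },
  };
}
```

With this shape, an 'error' delivered after stop() is a no-op, and a burst of errors leaves at most one pending timer. The delay still doubles once per scheduled reconnect in a burst; whether that is acceptable, or the doubling should happen only when the timer fires, is a design choice left to whoever picks this up.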