listener: only ack data events, never lifecycle frames — fixes the 1006 disconnect loop#7
Merged
Merged
Conversation
…isconnect loop
THE BUG (not environmental — deterministic, code-side):
readLoop acked every event with a non-nil Request:
if evt.Request != nil { l.sm.Ack(*evt.Request) }
But slack-go creates the 'hello' lifecycle frame WITH a Request
(socket_mode_managed_conn.go: newEvent(EventTypeHello, nil, req)). So on
every connect, loom acked the hello — sending a Socket Mode response with
an empty envelope ID ({"envelope_id":""}, visible in debug logs).
Slack rejects that and closes the socket: websocket close 1006 (abnormal
closure). Reconnect → hello → ack → 1006, forever. No app_mention is ever
delivered because the socket never survives past the hello.
This reproduced identically on two different networks and two different
Slack apps precisely because it is deterministic protocol misuse, not a
middlebox. (Earlier diagnosis as 'environmental WebSocket flap' was
wrong; the cross-network identical failure was the tell.)
THE FIX:
Gate the ack on the event TYPE, not merely Request != nil. Only data
events are acknowledgeable — EventsAPI, SlashCommand, Interactive.
Lifecycle frames (hello, connecting, connected, disconnect, errors) are
never acked and never dispatched as user events. Extracted the decision
into a pure ackable(EventType) bool so the rule is explicit and
regression-tested. This matches slack-go's own handler, which only
dispatches those three types.
Tests (listener_ack_test.go):
- ackable() true ONLY for EventsAPI/SlashCommand/Interactive; false for
all lifecycle frames.
- explicit guard: ackable(Hello) must be false (acking hello is what
killed the socket).
CI: go vet, gofmt, staticcheck, go test -race ./..., go build ./... pass.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
The bug (deterministic, code-side — NOT environmental)
readLoopacked every event with a non-nil Request:But slack-go creates the
hellolifecycle frame WITH a Request (newEvent(EventTypeHello, nil, req)). So on every connect, loom acked the hello — sending a Socket Mode response with an empty envelope ID ({"envelope_id":""}, visible in debug logs). Slack rejects that and closes the socket:websocket close 1006 (abnormal closure). Then: reconnect → hello → ack → 1006 → forever. Noapp_mentionis ever delivered because the socket never survives past the hello.This reproduced identically on two different networks and two different Slack apps precisely because it's deterministic protocol misuse, not a middlebox. The earlier 'environmental WebSocket flap' diagnosis was wrong — the cross-network identical failure was the tell.
The fix
Gate the ack on the event type, not merely
Request != nil. Only data events are acknowledgeable —EventsAPI,SlashCommand,Interactive. Lifecycle frames (hello, connecting, connected, disconnect, errors) are never acked and never dispatched. Extracted into a pureackable(EventType) boolso the rule is explicit and regression-tested. Matches slack-go's own handler, which dispatches only those three types.Tests (
listener_ack_test.go)ackable()true only for EventsAPI/SlashCommand/Interactive; false for all lifecycle framesackable(Hello)must be false — acking hello is what killed the socketgo vet/gofmt/staticcheckgo test -race ./...go build ./...Note on branch contents
This branch is stacked on
claude/loom-claude-code-llm(PR #6, Claude Code default LLM), so it also contains that change. Both are wanted loom changes; merge #6 first if you want this to reduce to just the ack fix, or merge this and it carries both.This unblocks the live Slack demo
With the socket no longer self-terminating on every hello,
app_mentionevents are finally delivered.@loom say hello from slackround-trips: 👀 + 'Working on it…' → echoed result in the thread.