Skip to content

listener: only ack data events, never lifecycle frames — fixes the 1006 disconnect loop#7

Merged
vinodhalaharvi merged 1 commit into
mainfrom
claude/fix-ack-disconnect
May 25, 2026
Merged

listener: only ack data events, never lifecycle frames — fixes the 1006 disconnect loop#7
vinodhalaharvi merged 1 commit into
mainfrom
claude/fix-ack-disconnect

Conversation

@vinodhalaharvi-claude

Copy link
Copy Markdown
Collaborator

The bug (deterministic, code-side — NOT environmental)

readLoop acked every event with a non-nil Request:

if evt.Request != nil { l.sm.Ack(*evt.Request) }

But slack-go creates the hello lifecycle frame WITH a Request (newEvent(EventTypeHello, nil, req)). So on every connect, loom acked the hello — sending a Socket Mode response with an empty envelope ID ({"envelope_id":""}, visible in debug logs). Slack rejects that and closes the socket: websocket close 1006 (abnormal closure). Then: reconnect → hello → ack → 1006 → forever. No app_mention is ever delivered because the socket never survives past the hello.

This reproduced identically on two different networks and two different Slack apps precisely because it's deterministic protocol misuse, not a middlebox. The earlier 'environmental WebSocket flap' diagnosis was wrong — the cross-network identical failure was the tell.

The fix

Gate the ack on the event type, not merely Request != nil. Only data events are acknowledgeable — EventsAPI, SlashCommand, Interactive. Lifecycle frames (hello, connecting, connected, disconnect, errors) are never acked and never dispatched. Extracted into a pure ackable(EventType) bool so the rule is explicit and regression-tested. Matches slack-go's own handler, which dispatches only those three types.

Tests (listener_ack_test.go)

  • ackable() true only for EventsAPI/SlashCommand/Interactive; false for all lifecycle frames
  • explicit guard: ackable(Hello) must be false — acking hello is what killed the socket
Step Result
go vet / gofmt / staticcheck
go test -race ./...
go build ./...

Note on branch contents

This branch is stacked on claude/loom-claude-code-llm (PR #6, Claude Code default LLM), so it also contains that change. Both are wanted loom changes; merge #6 first if you want this to reduce to just the ack fix, or merge this and it carries both.

This unblocks the live Slack demo

With the socket no longer self-terminating on every hello, app_mention events are finally delivered. @loom say hello from slack round-trips: 👀 + 'Working on it…' → echoed result in the thread.

…isconnect loop

THE BUG (not environmental — deterministic, code-side):
readLoop acked every event with a non-nil Request:

    if evt.Request != nil { l.sm.Ack(*evt.Request) }

But slack-go creates the 'hello' lifecycle frame WITH a Request
(socket_mode_managed_conn.go: newEvent(EventTypeHello, nil, req)). So on
every connect, loom acked the hello — sending a Socket Mode response with
an empty envelope ID ({"envelope_id":""}, visible in debug logs).
Slack rejects that and closes the socket: websocket close 1006 (abnormal
closure). Reconnect → hello → ack → 1006, forever. No app_mention is ever
delivered because the socket never survives past the hello.

This reproduced identically on two different networks and two different
Slack apps precisely because it is deterministic protocol misuse, not a
middlebox. (Earlier diagnosis as 'environmental WebSocket flap' was
wrong; the cross-network identical failure was the tell.)

THE FIX:
Gate the ack on the event TYPE, not merely Request != nil. Only data
events are acknowledgeable — EventsAPI, SlashCommand, Interactive.
Lifecycle frames (hello, connecting, connected, disconnect, errors) are
never acked and never dispatched as user events. Extracted the decision
into a pure ackable(EventType) bool so the rule is explicit and
regression-tested. This matches slack-go's own handler, which only
dispatches those three types.

Tests (listener_ack_test.go):
  - ackable() true ONLY for EventsAPI/SlashCommand/Interactive; false for
    all lifecycle frames.
  - explicit guard: ackable(Hello) must be false (acking hello is what
    killed the socket).

CI: go vet, gofmt, staticcheck, go test -race ./..., go build ./... pass.
@vinodhalaharvi vinodhalaharvi merged commit 012e2a7 into main May 25, 2026
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants