Refactor UDP server batching behind adapters#277
Conversation
📝 WalkthroughWalkthroughThis PR refactors UDP networking abstractions by migrating from a single-packet read model to batch-oriented I/O. It relocates UDP listener interfaces from the Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant Handler as TransportHandler
participant Ingress as Batch Ingress
participant UDPConn as UDP Connection
participant TUN as TUN Device
loop Batch Processing (every event loop iteration)
Handler->>Ingress: ReadBatch(32 packets buffer)
Ingress->>UDPConn: ReadBatch via IPv4/IPv6 PacketConn
UDPConn-->>Ingress: [packet count, addrs, payloads, flags]
Ingress-->>Handler: populated Packet array
loop Per Packet in Batch
Handler->>Handler: handlePacket(addr, data)
Handler->>TUN: write encrypted response
end
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a420e84da7
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| n, err := read(*cache) | ||
| if err != nil { | ||
| return 0, err |
There was a problem hiding this comment.
Handle partial batches when
ReadBatch also returns an error
readBatch drops the entire batch whenever read(*cache) returns a non-nil error, but ipv4/ipv6.PacketConn.ReadBatch on Linux can return n > 0 together with an error from recvmmsg. Since createUDPWorker now routes all server UDP traffic through NewBatchingServerAdapter, any interrupted or partially failing batched read will silently discard up to 32 datagrams that were already dequeued from the socket, which can break a handshake burst or lose in-flight tunnel traffic. This path should preserve the first n messages before surfacing the error.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (3)
src/infrastructure/network/udp/adapters/client_adapter_test.go (2)
91-106: Minor: Same pattern – server socket leaks.Same issue as
TestWriteAfterClose: the server socket is discarded without closing.Suggested fix
func TestReadTimeout(t *testing.T) { - ad, _ := newPair(t) + ad, srv := newPair(t) + defer func() { _ = srv.Close() }() defer func(ad *ClientAdapter) { _ = ad.Close() }(ad)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/infrastructure/network/udp/adapters/client_adapter_test.go` around lines 91 - 106, TestReadTimeout leaks the server socket by discarding it from newPair; update the test to close the server side just like TestWriteAfterClose: have newPair return both client and server adapters (or expose the server adapter) and add a defer to call server.Close() (or call Close() on the returned server adapter) before exiting the test so the server socket is properly closed; reference newPair, TestReadTimeout and ClientAdapter to locate the code to change.
82-89: Minor: Server socket not closed in test.The server socket returned by
newPairis discarded without closing. While Go's test runner will clean up eventually, explicitly closing resources is good test hygiene and avoids resource exhaustion in larger test suites.Suggested fix
func TestWriteAfterClose(t *testing.T) { - ad, _ := newPair(t) + ad, srv := newPair(t) + defer func() { _ = srv.Close() }() _ = ad.Close() if _, err := ad.Write([]byte("x")); err == nil {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/infrastructure/network/udp/adapters/client_adapter_test.go` around lines 82 - 89, TestWriteAfterClose currently discards the server socket returned by newPair; capture the second return value (e.g., srv) when calling newPair in TestWriteAfterClose and explicitly close it (either with defer srv.Close() immediately after creation or an explicit srv.Close() before test end) to ensure the server socket is released; update the call sites in TestWriteAfterClose to use the two-value return from newPair and close the server resource accordingly.src/infrastructure/network/udp/adapters/server_adapter.go (1)
15-15: Consider removing or reducing theoobbuffer allocation.The 8KB
oobbuffer is allocated in everyServerAdapterinstance but the OOB data is never used - bothReadpaths discard theoobnreturn value. Since ancillary data isn't needed, you could either remove this field entirely and passnilfor the OOB parameter, or significantly reduce its size.♻️ Suggested simplification
type ServerAdapter struct { conn listeners.UdpListener addrPort netip.AddrPort readBuffer [settings.DefaultEthernetMTU + settings.UDPChacha20Overhead]byte - oob [8 * 1024]byte }Then update both
ReadMsgUDPAddrPortcalls to passnilfor the OOB parameter:- n, _, _, _, err := a.conn.ReadMsgUDPAddrPort(buffer[:len(a.readBuffer)], a.oob[:]) + n, _, _, _, err := a.conn.ReadMsgUDPAddrPort(buffer[:len(a.readBuffer)], nil)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/infrastructure/network/udp/adapters/server_adapter.go` at line 15, The oob [8 * 1024]byte field on ServerAdapter is unused (oobn is discarded), so remove the field (or shrink it) and update both calls to conn.ReadMsgUDPAddrPort to pass nil for the OOB parameter instead of the oob buffer; modify the ServerAdapter struct and any constructor/initialization that allocates oob and remove related unused variables (oob/oobn) so the code no longer allocates the 8KB per instance.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/infrastructure/network/udp/adapters/batching_server_adapter.go`:
- Around line 76-78: The branch that converts a zero-message ReadBatch result
into io.ErrNoProgress should be changed: when count == 0 return 0, 0, 0,
netip.AddrPort{}, nil (i.e., treat zero reads as non-error) or implement a short
internal retry loop before returning an error; update the code path that
currently returns io.ErrNoProgress (the count==0 branch handling the ReadBatch
result) so callers receive nil for "no data" instead of io.ErrNoProgress, and
add a comment clarifying this non-blocking/no-data behavior in the same
function.
---
Nitpick comments:
In `@src/infrastructure/network/udp/adapters/client_adapter_test.go`:
- Around line 91-106: TestReadTimeout leaks the server socket by discarding it
from newPair; update the test to close the server side just like
TestWriteAfterClose: have newPair return both client and server adapters (or
expose the server adapter) and add a defer to call server.Close() (or call
Close() on the returned server adapter) before exiting the test so the server
socket is properly closed; reference newPair, TestReadTimeout and ClientAdapter
to locate the code to change.
- Around line 82-89: TestWriteAfterClose currently discards the server socket
returned by newPair; capture the second return value (e.g., srv) when calling
newPair in TestWriteAfterClose and explicitly close it (either with defer
srv.Close() immediately after creation or an explicit srv.Close() before test
end) to ensure the server socket is released; update the call sites in
TestWriteAfterClose to use the two-value return from newPair and close the
server resource accordingly.
In `@src/infrastructure/network/udp/adapters/server_adapter.go`:
- Line 15: The oob [8 * 1024]byte field on ServerAdapter is unused (oobn is
discarded), so remove the field (or shrink it) and update both calls to
conn.ReadMsgUDPAddrPort to pass nil for the OOB parameter instead of the oob
buffer; modify the ServerAdapter struct and any constructor/initialization that
allocates oob and remove related unused variables (oob/oobn) so the code no
longer allocates the 8KB per instance.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: afce0675-b1c9-4998-ac1b-cf48053c99e7
📒 Files selected for processing (8)
src/infrastructure/PAL/tunnel/server/server_worker_factory_linux.gosrc/infrastructure/network/udp/adapters/batching_server_adapter.gosrc/infrastructure/network/udp/adapters/client_adapter.gosrc/infrastructure/network/udp/adapters/client_adapter_test.gosrc/infrastructure/network/udp/adapters/server_adapter.gosrc/infrastructure/network/udp/adapters/server_adapter_test.gosrc/infrastructure/network/udp/adapters/server_udp_adapter.gosrc/infrastructure/tunnel/sessionplane/client_factory/worker_factory.go
💤 Files with no reviewable changes (1)
- src/infrastructure/network/udp/adapters/server_udp_adapter.go
| if count == 0 { | ||
| return 0, 0, 0, netip.AddrPort{}, io.ErrNoProgress | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
golang.org/x/net ipv4 ReadBatch return 0 messages when
💡 Result:
golang.org/x/net/ipv4 ReadBatch returns 0 messages (nread == 0, with nil error) in these main cases: 1. No packets available on the socket and the call is non-blocking (O_NONBLOCK set on fd, or deadline exceeded). recvmsg/recvmmsg return 0. 2. Invalid connection (!c.ok), but this returns errInvalidConn error, not 0. 3. Buffers in Message slices too small to hold incoming packets (e.g., smaller than IP header + payload). Underlying recvmsg may partially fill or drop data, potentially leading to 0 valid messages. 4. On non-Linux platforms, it degenerates to single-message read; if that fails with no data or buffer issue, n=0. 5. Socket filters (ICMPFilter) dropping packets before they reach the application. 6. Wrong socket configuration (e.g., wrong protocol like ip4:icmp vs ip4:udp, interface binding, permissions). To debug: check if blocking/non-blocking via deadlines, ensure each Message.Buffers[0] >= 1500+ bytes (Ethernet MTU), verify socket receives traffic (tcpdump), confirm c.LocalAddr.Network == "ip4:". On Linux, uses recvmmsg; elsewhere, single recvmsg.
Citations:
- 1: https://docs-go.hexacode.org/src/golang.org/x/net/ipv4/batch.go?s=241%3A342
- 2: https://pkg.go.dev/golang.org/x/net/ipv4
- 3: https://pkg.go.dev/golang.org/x/net/internal/socket
- 4: https://www.unix.com/man-page/opensolaris/3XNET/recv?os=opensolaris§ion=3XNET&query=recv
- 5: https://man.netbsd.org/NetBSD-11.x-BRANCH/recvmsg.2
- 6: https://illumos.org/man/3XNET/recvmsg
- 7: https://forum.golangbridge.org/t/unable-to-read-from-unix-socket-using-net-conn-read/22120
Reconsider returning io.ErrNoProgress for zero-count reads from non-blocking sockets.
The underlying ReadBatch returns 0 messages with a nil error when no packets are available on non-blocking sockets (normal transient condition), not as an exceptional case. Converting this to io.ErrNoProgress misrepresents the condition—it signals a resource constraint error rather than temporary unavailability. While the transport handler gracefully handles this, other callers may not expect an error for normal non-blocking I/O behavior.
Either:
- Retry internally before returning an error, or
- Return nil when count==0 (allowing callers to distinguish "no data" from true errors)
- Document this behavior clearly if the error return is intentional
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/infrastructure/network/udp/adapters/batching_server_adapter.go` around
lines 76 - 78, The branch that converts a zero-message ReadBatch result into
io.ErrNoProgress should be changed: when count == 0 return 0, 0, 0,
netip.AddrPort{}, nil (i.e., treat zero reads as non-error) or implement a short
internal retry loop before returning an error; update the code path that
currently returns io.ErrNoProgress (the count==0 branch handling the ReadBatch
result) so callers receive nil for "no data" instead of io.ErrNoProgress, and
add a comment clarifying this non-blocking/no-data behavior in the same
function.
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (2)
src/infrastructure/network/udp/adapters/batching_server_ingress.go (2)
58-67: Consider defensive handling for nilDataslices.If a caller passes a
Packetwith a nilDataslice, lines 60-62 will result inbufferremaining nil (sincecap(nil) == 0), which would then be passed to the batch source. While the current usage intransport_handler.goalways pre-allocates buffers, adding a defensive check or documenting this precondition would improve robustness.📝 Optional: Add nil check or document precondition
for i := range packets { buffer := packets[i].Data + if buffer == nil { + // Caller must provide pre-allocated buffers + continue + } if cap(buffer) > 0 { buffer = buffer[:cap(buffer)] } r.batch[i].Buffer = bufferAlternatively, document the precondition in the interface contract.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/infrastructure/network/udp/adapters/batching_server_ingress.go` around lines 58 - 67, The loop assumes Packet.Data is non-nil; defend against nil by checking Packet.Data before slicing: for each packets[i], if packets[i].Data != nil and cap(packets[i].Data) > 0 then set buffer = packets[i].Data[:cap(packets[i].Data)] else set buffer to an explicit empty slice (e.g., make([]byte,0)) and assign that to r.batch[i].Buffer; update usage around packets, Packet.Data and r.batch[i].Buffer so nil Data never propagates into the batch.
36-41: The dual-stack socket behavior is intentional and correct, but consider adding a clarifying comment.The code correctly handles dual-stack sockets: the system binds to
::on dual-stack-capable systems (as determined bylistenFallbackIP()), andipv6.PacketConnon a dual-stack socket properly accepts both IPv4 and IPv6 traffic. No bug exists, but the intentional dual-stack routing toipv6BatchSourcedeserves a brief code comment to prevent future confusion.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/infrastructure/network/udp/adapters/batching_server_ingress.go` around lines 36 - 41, The newBatchSource function's selection of ipv6.PacketConn for dual-stack sockets is intentional; add a concise comment above or inside newBatchSource explaining that on dual-stack systems (see listenFallbackIP behavior) binding to :: yields a socket that accepts both IPv4 and IPv6 and therefore using ipv6BatchSource is correct, and note why ipv4BatchSource is only chosen when LocalAddr().(*net.UDPAddr).IP.To4() != nil; reference newBatchSource, ipv6BatchSource and ipv4BatchSource in the comment to make the rationale explicit for future maintainers.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/infrastructure/network/udp/adapters/server_adapter.go`:
- Around line 43-46: The code silently truncates UDP datagrams when len(buffer)
< n by copying only len(buffer) bytes and returning len(buffer); instead, detect
this case in the function containing readBuffer/n (server_adapter.go, e.g., the
adapter's Read/ReadFrom method), and return an explicit error indicating
truncation (for example a package-level ErrDatagramTruncated or ErrShortBuffer)
instead of a successful byte count; alternatively, if truncation is
intentionally supported, add a clear comment at the top of the method explaining
why truncation is safe for this use case and keep the behavior. Ensure the
change references a clear error return path when a.readBuffer contains more
bytes than the destination buffer.
---
Nitpick comments:
In `@src/infrastructure/network/udp/adapters/batching_server_ingress.go`:
- Around line 58-67: The loop assumes Packet.Data is non-nil; defend against nil
by checking Packet.Data before slicing: for each packets[i], if packets[i].Data
!= nil and cap(packets[i].Data) > 0 then set buffer =
packets[i].Data[:cap(packets[i].Data)] else set buffer to an explicit empty
slice (e.g., make([]byte,0)) and assign that to r.batch[i].Buffer; update usage
around packets, Packet.Data and r.batch[i].Buffer so nil Data never propagates
into the batch.
- Around line 36-41: The newBatchSource function's selection of ipv6.PacketConn
for dual-stack sockets is intentional; add a concise comment above or inside
newBatchSource explaining that on dual-stack systems (see listenFallbackIP
behavior) binding to :: yields a socket that accepts both IPv4 and IPv6 and
therefore using ipv6BatchSource is correct, and note why ipv4BatchSource is only
chosen when LocalAddr().(*net.UDPAddr).IP.To4() != nil; reference
newBatchSource, ipv6BatchSource and ipv4BatchSource in the comment to make the
rationale explicit for future maintainers.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 21c3f971-5259-4e7d-9241-28e5e15abe77
📒 Files selected for processing (11)
src/application/listeners/udp_listener.gosrc/application/network/udp/contracts.gosrc/application/network/udp/packet.gosrc/infrastructure/PAL/tunnel/server/server_worker_factory_linux.gosrc/infrastructure/network/udp/adapters/batching_server_ingress.gosrc/infrastructure/network/udp/adapters/batching_server_ingress_test.gosrc/infrastructure/network/udp/adapters/registration_adapter.gosrc/infrastructure/network/udp/adapters/server_adapter.gosrc/infrastructure/tunnel/dataplane/server/udp_chacha20/transport_handler.gosrc/infrastructure/tunnel/dataplane/server/udp_chacha20/transport_handler_test.gosrc/infrastructure/tunnel/sessionplane/server/udp_registration/registration.go
💤 Files with no reviewable changes (1)
- src/application/listeners/udp_listener.go
✅ Files skipped from review due to trivial changes (1)
- src/application/network/udp/packet.go
🚧 Files skipped from review as they are similar to previous changes (1)
- src/infrastructure/PAL/tunnel/server/server_worker_factory_linux.go
| if len(buffer) < n { | ||
| copy(buffer, a.readBuffer[:len(buffer)]) | ||
| return len(buffer), nil | ||
| } |
There was a problem hiding this comment.
Silent data truncation may cause subtle bugs.
When the received datagram is larger than the caller's buffer (len(buffer) < n), data is silently truncated. The caller receives len(buffer) bytes with no indication that data was lost. UDP datagrams are atomic - partial reads typically indicate a programming error or should return an error.
🛠️ Suggested fix: return error on truncation
if len(buffer) < n {
- copy(buffer, a.readBuffer[:len(buffer)])
- return len(buffer), nil
+ return 0, fmt.Errorf("buffer too small: need %d bytes, have %d", n, len(buffer))
}Alternatively, if truncation is intentional for this use case, add a comment documenting why.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if len(buffer) < n { | |
| copy(buffer, a.readBuffer[:len(buffer)]) | |
| return len(buffer), nil | |
| } | |
| if len(buffer) < n { | |
| return 0, fmt.Errorf("buffer too small: need %d bytes, have %d", n, len(buffer)) | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/infrastructure/network/udp/adapters/server_adapter.go` around lines 43 -
46, The code silently truncates UDP datagrams when len(buffer) < n by copying
only len(buffer) bytes and returning len(buffer); instead, detect this case in
the function containing readBuffer/n (server_adapter.go, e.g., the adapter's
Read/ReadFrom method), and return an explicit error indicating truncation (for
example a package-level ErrDatagramTruncated or ErrShortBuffer) instead of a
successful byte count; alternatively, if truncation is intentionally supported,
add a clear comment at the top of the method explaining why truncation is safe
for this use case and keep the behavior. Ensure the change references a clear
error return path when a.readBuffer contains more bytes than the destination
buffer.
Summary
listeners.UdpListenerinfrastructure/network/udp/adaptersand remove redundantUDPprefixesVerification
GOCACHE=/tmp/tungo-gocache go test ./infrastructure/network/udp/adapters ./infrastructure/tunnel/sessionplane/client_factory ./infrastructure/tunnel/sessionplane/server/udp_registrationGOCACHE=/tmp/tungo-gocache GOOS=linux GOARCH=amd64 go test -c -o /tmp/tungo_server_linux.test ./infrastructure/PAL/tunnel/serverSummary by CodeRabbit
Release Notes