Add primitive flow control (v0.33)#5

Merged: danielfoehrKn merged 4 commits into coreweave from dfoehr/coreweave-flow-control on Oct 15, 2025
@danielfoehrKn

This adds primitive application-layer flow control in the server -> agent direction.

The goal of this implementation is to tell the server to stop sending packets for a specific overlay connection when the agent's per-connection receive buffer is full.

Fundamentally, this implementation relies on the guarantees of gRPC (HTTP/2 over TCP): TCP guarantees in-order delivery of the application-layer data without loss or corruption.

  • As a result, we don't need to implement packet sequencing etc.

The agent already implements a kind of receive buffer today: a buffered channel whose capacity is the xfr channel size.
Because we get guaranteed packet ordering and delivery, we can implement a simple per-packet ACK protocol that keeps the server informed about the free space in the agent's receive buffer. Note that a sender does not have to wait for the ACK for packet t before it can send packet t+1; it can keep sending as long as the agent's receive buffer is not full.

Flow

  1. Agent tells the server its receive window size (currently set to the xfr channel size).
  2. Server initializes a semaphore with the size of the agent's receive window.
  3. Server acquires the semaphore with weight 1 and then sends a packet to the agent.
     • If the semaphore has no capacity left, this blocks until the server has received a DATA_ACK for this connection.
  4. Agent sends a DATA_ACK packet back to the server just before it sends the packet to the endpoint in the tenant.
  5. Server receives the DATA_ACK for this connection and releases the semaphore with weight 1.
     • This adds 1 to the semaphore and could unblock a waiting goroutine.
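
The flow above can be sketched roughly as follows. This is not the code from this PR: it is a minimal, standard-library-only illustration that uses a buffered channel as a counting semaphore in place of the weighted semaphore described above. The names `sendWindow`, `newSendWindow`, `acquire`, `tryAcquire`, `release`, and `available` are hypothetical, and the window size of 2 is an arbitrary stand-in for the xfr channel size.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// sendWindow is a hypothetical per-connection send window on the server side.
// Each token in the channel is one free slot in the agent's receive buffer.
type sendWindow struct {
	permits chan struct{}
}

// newSendWindow creates a window with the size the agent advertised.
func newSendWindow(size int) *sendWindow {
	w := &sendWindow{permits: make(chan struct{}, size)}
	for i := 0; i < size; i++ {
		w.permits <- struct{}{}
	}
	return w
}

// acquire takes one slot before a packet is sent to the agent. It blocks
// while the window is exhausted, until a DATA_ACK frees a slot or ctx ends.
func (w *sendWindow) acquire(ctx context.Context) error {
	select {
	case <-w.permits:
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// tryAcquire is a non-blocking variant that reports whether a slot was free.
func (w *sendWindow) tryAcquire() bool {
	select {
	case <-w.permits:
		return true
	default:
		return false
	}
}

// release returns one slot when a DATA_ACK arrives for this connection.
// It never blocks, so a spurious ACK cannot grow the window past its size.
func (w *sendWindow) release() {
	select {
	case w.permits <- struct{}{}:
	default:
	}
}

// available reports how many packets can currently be sent without blocking.
func (w *sendWindow) available() int {
	return len(w.permits)
}

func main() {
	ctx := context.Background()
	w := newSendWindow(2) // assumed window size, for illustration only

	// Two packets can be sent back to back without waiting for any ACK.
	_ = w.acquire(ctx)
	_ = w.acquire(ctx)
	fmt.Println("free slots after two sends:", w.available())

	// A third send blocks because no DATA_ACK has arrived yet.
	short, cancel := context.WithTimeout(ctx, 20*time.Millisecond)
	defer cancel()
	err := w.acquire(short)
	fmt.Println("third send timed out:", errors.Is(err, context.DeadlineExceeded))

	// A DATA_ACK from the agent frees one slot again.
	w.release()
	fmt.Println("free slots after one DATA_ACK:", w.available())
}
```

In a real server there would be one such window per overlay connection, with `release` called from the code path that handles incoming DATA_ACK packets. Capping `release` at the original window size is a design choice of this sketch, not something the description above specifies.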

NOTE: there are obvious performance improvements, such as batching ACKs, but this is a PoC, and the konnectivity proxy is not designed for high-throughput scenarios anyway (it is used more for log streaming, execs, and webhook deliveries in K8s).

NOTE 2: on the server side serving the API server, this implements flow control for both the HTTPConnect and gRPC frontends.

NOTE 3: there are a bunch of comments in the code marking where we could insert flow control for the opposite direction, but those can be ignored for now.

@danielfoehrKn danielfoehrKn force-pushed the dfoehr/coreweave-flow-control branch from e23e91f to 826f734 Compare October 15, 2025 21:08
@danielfoehrKn danielfoehrKn changed the title from "Add primitive flow control control (v0.33)" to "Add primitive flow control (v0.33)" Oct 15, 2025
@danielfoehrKn danielfoehrKn merged commit 1abdcde into coreweave Oct 15, 2025
17 checks passed