Skip to content

Retire pruned leaked connections so they can't be reused (fixes #9240)#9508

Open
nickjn92 wants to merge 1 commit into
square:masterfrom
nickjn92:leaked-connection-reuse-fix
Open

Retire pruned leaked connections so they can't be reused (fixes #9240)#9508
nickjn92 wants to merge 1 commit into
square:masterfrom
nickjn92:leaked-connection-reuse-fix

Conversation

@nickjn92

@nickjn92 nickjn92 commented Jun 24, 2026

Copy link
Copy Markdown

Fixes #9240.

If you leak a response body (execute a call and never close the body), a later unrelated request can come back corrupted. It gets the previous response's bytes stapled onto the front of its own, which surfaces as ProtocolException: Unexpected status line: .... On 4.x this same mistake was a harmless leak, the connection got retired and was never reused. On 5.x it can silently corrupt a different call.

What's happening

RealConnectionPool.pruneAndGetAllocationCount notices a leaked allocation (its CallReference was garbage collected) and drops the reference. On 4.12.0 it also set connection.noNewExchanges = true so that connection could never be handed out again. #7456 removed that line from the leak path:

       references.removeAt(i)
-      connection.noNewExchanges = true
       if (references.isEmpty()) {

It only bites under concurrency. closeConnections prunes every connection in a single pass but only ever evicts one of them. If there is just one leaked connection it becomes the last allocation, gets marked idle, and the eviction path closes it (and sets noNewExchanges), which is why the existing leakedAllocation test still passes. But as soon as you have more than one leaked connection, the ones that did not get evicted are left with noNewExchanges == false, sitting idle in the pool, still holding the abandoned response's unread bytes on the socket. isEligible() says yes and isHealthy() says yes (leftover readable bytes are not EOF), so the next call grabs that dirty connection and reads the stale bytes as its status line.

Here is the lifecycle of the connection that got stapled, from a quick instrumented run:

ACQ  x9    noNewExchanges=false   x9 reads a little, then abandons without closing
ACQ  c114  noNewExchanges=false   an unrelated call reuses the same connection
NONEW                             it finally gets marked dead, but c114 already has it

The fix

Put the line back, so any leaked connection that gets pruned is retired right away:

       references.removeAt(i)
+      connection.noNewExchanges = true
       if (references.isEmpty()) {

That matches 4.12.0 and the other noNewExchanges = true spots in connectionBecameIdle and evictAll. I left out the connectionListener.noNewExchanges(connection) call on purpose. pruneAndGetAllocationCount runs while the connection lock is held (it is called from inside closeConnections), and everywhere else in this file the listener events are fired outside the lock. If you would rather have the leak path show up on the listener too, I can work the event out under the lock and fire it from closeConnections after the prune. Happy to go either way.

Test

I added ConnectionPoolTest.prunedLeakedConnectionIsRetiredEvenWhenNotEvicted. It follows the existing leakedAllocation test (same allocateAndLeakAllocation helper and awaitGarbageCollection()), but leaks two connections so a single cleanup pass leaves a survivor that was not evicted yet still has to be retired.

It fails reliably without the fix (ran it four times, failed every time) and passes with it. The rest of :okhttp:jvmTest is green. The only failures I saw were the FastFallbackTest cases that need IPv6 dual stack, and those fail the same way on a clean checkout.

When a response body is abandoned (never closed), the GC-based leak detector in
RealConnectionPool.pruneAndGetAllocationCount removes the leaked call's reference
but, since square#7456, no longer marks the connection noNewExchanges. A cleanup pass
prunes every connection yet evicts at most one, so a leaked connection that isn't
the evicted one stays idle and eligible for reuse while its socket still holds the
abandoned response's unread bytes. The next call to reuse it reads those bytes as
its status line, throwing "ProtocolException: Unexpected status line" and stapling
one response's body onto another. 4.x marked the connection noNewExchanges here;
this restores that line so any pruned leaked connection is retired.

Fixes square#9240
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Race condition causing reuse of leaked connections

1 participant