test: remove timing race in serialization timeout test #283
Merged
Use a test-controlled CountDownLatch instead of Thread.sleep so the 100 ms timeout always fires before the serializer finishes, even on slow or loaded runners. Previously, a main-thread pause between trySerialize() and waitForCompletion() could let the background path complete first, hiding the expected SERIALIZATION_TIMEOUT outcome.
Use a test-controlled CountDownLatch to block session1's serialization until session2 has completed, replacing the Thread.sleep+delay pattern that assumed a specific timing relationship between two concurrent serializations. On loaded CI runners the timing could drift enough that session1 finished first, flipping the expected ordering. Also wait explicitly for session1 to reach the started state before scheduling session2, so the 'started' order is deterministic.
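The ordering approach in the commit message above can be sketched with two latches. This is a minimal illustration, not the project's test code: the class name, the event strings, and the use of a plain ExecutorService are assumptions made for the example. One latch signals that session1 has started, and a second one holds session1 open until session2 has finished.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch: names and structure are illustrative only.
public class OrderedSessionsSketch {

    /** Runs two tasks so that start and completion order do not depend on timing. */
    static List<String> runInOrder() {
        List<String> events = new CopyOnWriteArrayList<>();
        CountDownLatch session1Started = new CountDownLatch(1);
        CountDownLatch session1Release = new CountDownLatch(1);
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            Future<?> session1 = pool.submit(() -> {
                events.add("session1 started");
                session1Started.countDown();
                try {
                    // Held here until the test releases it, instead of sleeping.
                    session1Release.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                events.add("session1 completed");
            });

            // Schedule session2 only once session1 is known to have started.
            if (!session1Started.await(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("session1 never started");
            }
            Future<?> session2 = pool.submit(() -> {
                events.add("session2 started");
                events.add("session2 completed");
            });

            // session2 must be done before session1 is allowed to finish.
            session2.get(5, TimeUnit.SECONDS);
            session1Release.countDown();
            session1.get(5, TimeUnit.SECONDS);
            return new ArrayList<>(events);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException | TimeoutException e) {
            throw new IllegalStateException("session did not finish cleanly", e);
        } finally {
            session1Release.countDown(); // never leave the worker blocked
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // prints [session1 started, session2 started, session2 completed, session1 completed]
        System.out.println(runInOrder());
    }
}
```

The generous 5-second waits are only safety nets against a hung test. The ordering itself comes from the latches, so it should hold regardless of how slow the runner is.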
mcollovati approved these changes on Apr 23, 2026
Summary
Fixes a rare flake in SerializationDebugRequestHandlerTest.handleRequest_serializationTimeout_timeoutReported observed on CI (run 24824699112), where the test reported [SERIALIZATION_FAILED, NOT_SERIALIZABLE_CLASSES] instead of the expected [NOT_SERIALIZABLE_CLASSES, SERIALIZATION_TIMEOUT].
Root cause
The test had SlowSerialization.writeObject sleep 2000 ms and set the timeout to 100 ms. The expectation is that the 100 ms timeout fires first, producing SERIALIZATION_TIMEOUT.
The flow is:
1. trySerialize() schedules async serialization and returns
2. waitForCompletion(100ms) awaits serializationCompletedLatch
3. Meanwhile, the background task runs Thread.sleep(2000), then fails on non-serializable fields
If the main thread pauses (GC, JIT, loaded CI) for a couple of seconds between steps 1 and 2, the background task can reach its failure path and count the latch down via whenSerialized.accept(null) before the main thread enters waitForCompletion. The 100 ms await then returns immediately with completed=true, timeout() is never called, and only SERIALIZATION_FAILED ends up in the outcomes.
Fix
Replace Thread.sleep(2000) with a test-controlled CountDownLatch that writeObject awaits without a timeout. The test releases the latch in a finally block after asserting. This guarantees the latch stays uncounted until the assertion is done, so the 100 ms timeout always fires regardless of main-thread pauses.
Test plan
- mvn test -Dtest=SerializationDebugRequestHandlerTest#handleRequest_serializationTimeout_timeoutReported passes 5/5 runs locally
- SerializationDebugRequestHandlerTest class (13 tests) passes
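For readers unfamiliar with the pattern, the latch-gated approach described in this PR can be sketched in isolation. This is a simplified stand-in under stated assumptions, not the actual handler or test: the class and method names are hypothetical, a plain daemon thread replaces the real async serialization, and the "completed" latch plays the role of serializationCompletedLatch.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: names and structure are illustrative only.
public class LatchGatedTimeoutSketch {

    /** Runs blockingStep on a daemon thread, then counts down the returned latch. */
    static CountDownLatch startTask(Runnable blockingStep) {
        CountDownLatch completed = new CountDownLatch(1);
        Thread worker = new Thread(() -> {
            try {
                blockingStep.run();
            } finally {
                completed.countDown();
            }
        }, "background-serializer");
        worker.setDaemon(true);
        worker.start();
        return completed;
    }

    /** Returns true if the timeout fired even though the main thread stalled first. */
    static boolean timeoutFires(long stallMs, long timeoutMs) {
        CountDownLatch gate = new CountDownLatch(1);
        // The background step waits on the gate with no time limit,
        // standing in for the gated writeObject.
        CountDownLatch completed = startTask(() -> {
            try {
                gate.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        try {
            Thread.sleep(stallMs); // simulated main-thread pause (GC, JIT, loaded CI)
            // The gate is still closed, so the task cannot have completed.
            return !completed.await(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } finally {
            gate.countDown(); // release the worker only after the check
        }
    }

    public static void main(String[] args) {
        // prints true
        System.out.println(timeoutFires(300, 100));
    }
}
```

Because the gate is released only in the finally block, the length of the simulated stall does not matter. With a sleep-based background step, a stall longer than the sleep would let the completion latch reach zero first, and the timed await would return true immediately.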