Fixed #535: ThreadSensitiveContext.__aexit__ blocking the event loop.#563
Fixed #535: ThreadSensitiveContext.__aexit__ blocking the event loop.#563helgefmi wants to merge 1 commit into
Conversation
… loop. executor.shutdown() is a blocking join, and the executor's worker thread may itself be waiting on the event loop, deadlocking it. Join in a separate thread instead, as asyncio's shutdown_default_executor() does.
|
Thanks for this @helgefmi. Just about to disappear on holiday, so will resolve this when I return 🏖️ |
| pass | ||
|
|
||
|
|
||
| def cancel_inside_thread_sensitive_context(): |
There was a problem hiding this comment.
Nice, the new unit test makes the error obvious 👍
|
Still working on the code review. Investigating a cancellation problem with nested async_to_sync |
|
Finished my investigation 😅 First, thanks again for the test case, it really helped. Running We run sync code through Now the parent reaches
So the real issue: we "cancel" the sync function, but it doesn't actually stop, it runs on into async_to_sync and parks. I want to float a different angle: fix the cause instead. We can't truly cancel running sync code anyway, so instead of cancelling mid-flight, we let it run until it finishes or reaches the I don't have a strong opinion on which approach is better. Thoughts? |

Fixes #535.
ThreadSensitiveContext.__aexit__is async, but it callsexecutor.shutdown(), which blocks the thread (it's aThread.join). Running a blocking call straight on the event loop can deadlock the whole loop, in this case when the executor's worker thread is itself waiting on that loop. When it happens the server stops responding to everything.The fix runs the shutdown off the loop with
await loop.run_in_executor(None, executor.shutdown), so the loop stays free. This is what asyncio itself does inloop.shutdown_default_executor(). I also moved the contextvar reset to before the await.My test deadlocks on current main and passes with the change. It runs in a separate process because the hung thread would otherwise hang the test suite.
AI disclosure: I used Claude Opus 4.8 to help track down the deadlock, write the fix and the test.