Fix polling (neon) driver reentrancy panic on in-read filter writes #906

Open. Meemaw wants to merge 1 commit into ntex-rs:main from Meemaw:fix/polling-write-reentrancy
Meemaw (Contributor) commented Jun 13, 2026
The polling driver's `StreamOpsInner::with` is `Cell::take().unwrap()` with
no reentrancy guard. The driver calls `StreamItem::read` (-> filter read
processing -> `update_read_status`) *inside* a `with` scope. If a filter
produces >= `write_buf_threshold` (8192) bytes of write data during read
processing while writes are paused, the io stream initiates a direct write
(`Handle::write` -> `WeakStreamCtl::with`), which re-enters `with` on the
same already-taken cell -> `None.unwrap()` panic, killing the arbiter
thread and the connection.
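
A minimal sketch of the failure pattern, assuming a simplified stand-in type: `Ops` and `nested_with_panics` are hypothetical names for illustration and are not the ntex types. It only shows why a nested `with` on a `Cell<Option<_>>` hits `None.unwrap()`.

```rust
use std::cell::Cell;
use std::panic::{catch_unwind, AssertUnwindSafe};

/// Hypothetical stand-in for the driver's stream storage (not the ntex type).
struct Ops {
    streams: Cell<Option<Vec<u32>>>,
}

impl Ops {
    /// Take the value out of the cell, run `f`, then put the value back.
    /// While `f` runs, the cell holds `None`.
    fn with<R>(&self, f: impl FnOnce(&mut Vec<u32>) -> R) -> R {
        let mut streams = self.streams.take().unwrap(); // panics if already taken
        let result = f(&mut streams);
        self.streams.set(Some(streams));
        result
    }
}

/// Returns true if calling `with` from inside another `with` panics.
fn nested_with_panics() -> bool {
    let ops = Ops {
        streams: Cell::new(Some(vec![1, 2, 3])),
    };
    catch_unwind(AssertUnwindSafe(|| {
        ops.with(|_outer| {
            // Simulates a direct write issued while the read handler
            // still holds the taken value.
            ops.with(|inner| inner.len())
        })
    }))
    .is_err()
}
```

A single, non-nested `with` call works as expected; only the nested call observes the empty cell.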

This is the exact failure mode of a TLS-style filter that emits a large
handshake/control burst in response to incoming records.

Fix: route direct writes through a new `write_stream` that, when the
streams slab is already borrowed (we are inside event handling), defers
the write to `check_delayed_writes`, draining it once the slab is released,
mirroring the existing `delayed_drop` mechanism. No more reentrant take.
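
The deferral idea can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: the `Ops` struct, its fields, the "slab is taken" check, and the point at which the queue is drained are all hypothetical; only the names `write_stream` and `check_delayed_writes` come from the description above.

```rust
use std::cell::{Cell, RefCell};

/// Hypothetical driver state: per-stream output buffers plus a deferred queue.
struct Ops {
    streams: Cell<Option<Vec<Vec<u8>>>>,
    delayed_writes: RefCell<Vec<(usize, Vec<u8>)>>,
}

impl Ops {
    fn new(stream_count: usize) -> Self {
        Ops {
            streams: Cell::new(Some(vec![Vec::new(); stream_count])),
            delayed_writes: RefCell::new(Vec::new()),
        }
    }

    /// Event-handling scope: take the slab, run `f`, restore the slab,
    /// then drain any writes that were deferred while it was taken.
    fn with<R>(&self, f: impl FnOnce(&mut Vec<Vec<u8>>) -> R) -> R {
        let mut streams = self.streams.take().expect("slab already taken");
        let result = f(&mut streams);
        self.streams.set(Some(streams));
        self.check_delayed_writes();
        result
    }

    /// Write immediately if the slab is available, otherwise queue the write.
    fn write_stream(&self, id: usize, data: &[u8]) {
        match self.streams.take() {
            Some(mut streams) => {
                streams[id].extend_from_slice(data);
                self.streams.set(Some(streams));
            }
            // Slab is taken: we are inside `with`, so defer instead of panicking.
            None => self.delayed_writes.borrow_mut().push((id, data.to_vec())),
        }
    }

    /// Apply queued writes; assumed to run only after the slab is restored.
    fn check_delayed_writes(&self) {
        let pending = std::mem::take(&mut *self.delayed_writes.borrow_mut());
        for (id, data) in pending {
            self.write_stream(id, &data);
        }
    }
}

/// Issues an 8192-byte write from inside `with`; returns the buffer length
/// seen inside the scope and the length seen after the scope ends.
fn deferred_write_demo() -> (usize, usize) {
    let ops = Ops::new(1);
    let seen_inside = ops.with(|streams| {
        ops.write_stream(0, &[0u8; 8192]);
        streams[0].len()
    });
    let seen_after = ops.with(|streams| streams[0].len());
    (seen_inside, seen_after)
}
```

In this sketch the write issued during event handling is not visible inside the scope and is applied once the scope ends, so the nested call never touches the taken cell.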

Regression test `test_filter_large_write_during_read_processing` exercises
the polling backend (the default driver on non-Linux hosts): it panics with
"called `Option::unwrap()` on a `None` value" at polling/stream.rs without
this fix and passes with it.
Meemaw force-pushed the `fix/polling-write-reentrancy` branch from 2d7da74 to 55afd4a (June 13, 2026 10:52)
codecov Bot commented Jun 13, 2026

Codecov Report

❌ Patch coverage is 97.50000% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 90.27%. Comparing base (61295aa) to head (55afd4a).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ntex-net/src/polling/stream.rs | 97.43% | 1 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #906      +/-   ##
==========================================
+ Coverage   90.25%   90.27%   +0.01%     
==========================================
  Files         240      240              
  Lines       34992    35024      +32     
==========================================
+ Hits        31582    31617      +35     
+ Misses       3410     3407       -3     
