man: document that send/recv can see -EAGAIN without MSG_DONTWAIT#1601

Open
tchaikov wants to merge 1 commit into axboe:master from tchaikov:man-document-eagain-without-dontwait

Conversation

tchaikov commented Jun 15, 2026

Contributor

The send/recv prep man pages imply -EAGAIN is only expected when the application sets MSG_DONTWAIT. In practice io_uring can also complete a request that did not set MSG_DONTWAIT with -EAGAIN: when it cannot make progress, it stops retrying rather than waiting indefinitely (see kernel commit c16bda37594f "io_uring/poll: allow some retries for poll triggering spuriously", which bounds the retry attempts).

Applications submitting blocking-style socket sends/recvs (no MSG_DONTWAIT) can therefore observe -EAGAIN under load and must treat it as transient and reissue. Note this in the ERRORS section of the send, sendmsg, recv and recvmsg prep pages.
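As a rough illustration of "treat it as transient and reissue", the check on the completion side might look like the sketch below. The enum and function names are hypothetical and not part of liburing; the only input is the `res` value an application reads from a completion queue entry, and what "reissue" means (immediate resubmit, or waiting for readiness first) is left to the application.

```c
#include <errno.h>

/* Hypothetical action names for illustration; not a liburing API. */
enum cqe_action {
	CQE_DONE,	/* request finished, res is a byte count */
	CQE_REISSUE,	/* transient: submit the same request again */
	CQE_FAIL,	/* real error, report it to the caller */
};

/*
 * Classify the res field of a send/recv completion.
 * -EAGAIN is treated as transient even if the request did not
 * set MSG_DONTWAIT, as described above.
 */
static enum cqe_action classify_sock_cqe(int res)
{
	if (res >= 0)
		return CQE_DONE;
	if (res == -EAGAIN)
		return CQE_REISSUE;
	return CQE_FAIL;
}
```

A caller would switch on the returned action right after reading the completion and before surfacing any error to higher layers.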


git request-pull output:

The following changes since commit 2750b94a77ea9e4e156fd0caa5c14552e7d1397f:

  Merge branch 'poll-update-trigger' of https://github.com/schlad/liburing (2026-06-10 06:57:31 -0600)

are available in the Git repository at:

  git@github.com:tchaikov/liburing.git man-document-eagain-without-dontwait

for you to fetch changes up to f4cdd7d3fe004d1315bd367fefbc189aa92511de:

  man: document that send/recv can see -EAGAIN without MSG_DONTWAIT (2026-06-15 20:06:25 +0800)

----------------------------------------------------------------
Kefu Chai (1):
      man: document that send/recv can see -EAGAIN without MSG_DONTWAIT

 man/io_uring_prep_recv.3    | 17 ++++++++++++++++-
 man/io_uring_prep_recvmsg.3 | 19 ++++++++++++++++++-
 man/io_uring_prep_send.3    | 17 ++++++++++++++++-
 man/io_uring_prep_sendmsg.3 | 16 ++++++++++++++++
 4 files changed, 66 insertions(+), 3 deletions(-)

Pull Request Guidelines

  1. To make everyone easily filter pull request from the email
    notification, use [GIT PULL] as a prefix in your PR title.
[GIT PULL] Your Pull Request Title
  2. Follow the commit message format rules below.
  3. Follow the Linux kernel coding style (see: https://github.com/torvalds/linux/blob/master/Documentation/process/coding-style.rst).

Commit message format rules:

  1. The first line is title (don't be more than 72 chars if possible).
  2. Then an empty line.
  3. Then a description (may be omitted for truly trivial changes).
  4. Then an empty line again (if it has a description).
  5. Then a Signed-off-by tag with your real name and email. For example:
Signed-off-by: Foo Bar <foo.bar@gmail.com>

The description should be word-wrapped at 72 chars. Some things should
not be word-wrapped. They may be some kind of quoted text - long
compiler error messages, oops reports, Link, etc. (things that have a
certain specific format).

Note that all of this goes in the commit message, not in the pull
request text. The pull request text should introduce what this pull
request does, and each commit message should explain the rationale for
why that particular change was made. The git tree is canonical source
of truth, not github.

Each patch should do one thing, and one thing only. If you find yourself
writing an explanation for why a patch is fixing multiple issues, that's
a good indication that the change should be split into separate patches.

If the commit is a fix for an issue, add a Fixes tag with the issue
URL.

Don't use GitHub anonymous email like this as the commit author:

123456789+username@users.noreply.github.com

Use a real email address!

Commit message example:

src/queue: don't flush SQ ring for new wait interface

If we have IORING_FEAT_EXT_ARG, then timeouts are done through the
syscall instead of by posting an internal timeout. This was done
to be both more efficient, but also to enable multi-threaded use
the wait side. If we touch the SQ state by flushing it, that isn't
safe without synchronization.

Fixes: https://github.com/axboe/liburing/issues/402
Signed-off-by: Jens Axboe <axboe@kernel.dk>

By submitting this pull request, I acknowledge that:

  1. I have followed the above pull request guidelines.
  2. I have the rights to submit this work under the same license.
  3. I agree to a Developer Certificate of Origin (see https://developercertificate.org for more information).

The send/recv prep man pages imply -EAGAIN is only expected when the
application sets MSG_DONTWAIT. In practice io_uring can also complete a
request that did not set MSG_DONTWAIT with -EAGAIN: when it cannot make
progress, it stops retrying rather than waiting indefinitely (see kernel
commit c16bda37594f "io_uring/poll: allow some retries for poll triggering
spuriously", which bounds the retry attempts).

Applications submitting blocking-style socket sends/recvs (no MSG_DONTWAIT)
can therefore observe -EAGAIN under load and must treat it as transient and
reissue. Note this in the ERRORS section of the send, sendmsg, recv and
recvmsg prep pages.

Signed-off-by: Kefu Chai <k.chai@proxmox.com>

axboe commented Jun 15, 2026

Owner

This is essentially a "will never happen in an app" case, but added so that malicious/odd reproducers don't get stuck forever. Curious if this is something you actually ran into, or whether you just skimmed the source and spotted this?

sunyuechi added a commit to sunyuechi/seastar that referenced this pull request Jun 15, 2026
The io_uring backend can complete a socket send with -EAGAIN and leak it
to the caller: when arming a poll fails the request goes to io-wq, which
does not retry EAGAIN for SOCK_NONBLOCK sockets. A send can complete with
-EAGAIN even without MSG_DONTWAIT; this is expected io_uring behaviour and
must be retried (documentation requested in axboe/liburing#1601).

Intercept the -EAGAIN completion in complete_with() and re-issue through
the poll-based do_sendmsg()/do_send(), like the aio/epoll backends. The
handling lives in the shared completion base classes so both io_uring
backends recover from it.

Observed under load on riscv64 (io_uring is the auto-selected backend):
unittest-seastar-socket's test_preemptive_down() aborts with EAGAIN.

Signed-off-by: Sun Yuechi <sunyuechi@iscas.ac.cn>
@tchaikov

Contributor Author

@axboe hi Jens,

it's real, not from skimming the source. actually, it showed up when testing Ceph's crimson test suite, which runs on Seastar's io_uring reactor backend, on a riscv64 box. the socket test test_preemptive_down started aborting intermittently under load. the app only expects broken_pipe/connection_reset when a connection goes away, so EAGAIN tripped an abort().

it only shows up under heavy CPU contention. the test runs Seastar on 2 cores (--smp 2), and we pin everything to those same 2 cores while running busy background processes there too, so the reactor is heavily oversubscribed:

for i in $(seq 12); do 
  taskset -c 0-1 yes >/dev/null & 
done
taskset -c 0-1 ./unittest-seastar-socket --reactor-backend io_uring --smp 2

With the cores that contended, 2 of 3 runs abort. ftrace on the io_uring events caught the path:

submit           SENDMSG
poll_arm         SENDMSG                 # buffer full, poll armed
complete         SENDMSG result 34199    # partial send
submit           SENDMSG                 # retry remainder, REQ_F_POLLED
queue_async_work SENDMSG
iou-wrk complete SENDMSG result -11

a partial send that can't make further progress while the reactor is starved, hitting the retry cap from kernel commit c16bda37594f. So yes, pathological scheduling, but a real workload rather than a crafted reproducer.

we fixed it in Seastar by treating EAGAIN as transient and retrying through poll (20 passes out of 20 runs after that). the man page note just gives the next person who hits a stray EAGAIN under load somewhere to land, instead of assuming io_uring is broken. if you'd rather not document a case that won't occur under normal scheduling, i am happy to drop or reword it.
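for illustration only, the bookkeeping for the traced sequence (partial send, then -EAGAIN on the remainder) could be sketched as below. this is not the Seastar change; `struct send_state`, `on_send_cqe()` and the return codes are hypothetical names, and the sketch only decides what to do next, leaving the actual resubmission (directly or via poll) to the caller.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical next-step codes; not a liburing or Seastar API. */
enum send_next { SEND_COMPLETE, SEND_RESUBMIT, SEND_ERROR };

/* Tracks how much of one logical send has been accepted so far. */
struct send_state {
	size_t total;	/* bytes the caller asked to send */
	size_t sent;	/* bytes acknowledged by completions */
};

/*
 * Feed one completion result into the state.
 * A short count or -EAGAIN both mean "resubmit the remainder";
 * -EAGAIN consumes no bytes, so the offset stays where it was.
 */
static enum send_next on_send_cqe(struct send_state *st, int res)
{
	if (res == -EAGAIN)
		return SEND_RESUBMIT;
	if (res < 0)
		return SEND_ERROR;

	/* A completion can never report more than was outstanding. */
	assert((size_t)res <= st->total - st->sent);
	st->sent += (size_t)res;

	return st->sent == st->total ? SEND_COMPLETE : SEND_RESUBMIT;
}
```

on SEND_RESUBMIT the caller would submit a new send starting at offset `st->sent` for `st->total - st->sent` bytes.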
