Skip to content

Conversation

@aduermael
Copy link

Motivation

When using vsock (virtual sockets) through Apple's Containerization framework, the app crashes with:

assert(readResult == .some)

in NIOPosix/BaseSocketChannel.swift:1205 (in readable0()).

Root Cause

The assertion assumes kqueue/epoll always have implicit synchronization such that POLLIN notifications are only received when data is actually available. However, vsock (virtio-vsock) used for host-guest communication goes through a hypervisor layer and may not provide the same kqueue synchronization guarantees as regular TCP/Unix sockets.

This creates a race condition similar to io_uring where POLLIN fires but no data is ready.

Solution

Instead of asserting on readResult == .none, handle it gracefully by:

  1. Calling readIfNeeded0() to re-register for the next read notification
  2. Returning .normal(.none) to indicate no data was read

This matches the existing pattern for handling io_uring spurious wakeups.

Related Issues

Testing

  • All existing SwiftNIO tests pass
  • This change has been tested with the Containerization framework using vsock connections

vsock (virtual sockets) used by Apple's Containerization framework may not
provide the same kqueue synchronization guarantees as regular TCP/Unix sockets.
This causes spurious POLLIN notifications where no data is actually available.

Instead of asserting on readResult == .none, handle it gracefully by calling
readIfNeeded0() and returning .normal(.none).

Refs: apple#3500
Refs: apple/containerization#503
EBADF (bad file descriptor) was already tolerated on iOS because file
descriptors can be reaped in the background. The same issue occurs on
macOS when using vsock through Apple's Containerization framework.

The vsock hypervisor layer may invalidate file descriptors during normal
operation, causing EBADF in fcntl and other syscalls. This change extends
the iOS workaround to all Darwin platforms.

Refs: apple#3500
Refs: apple/containerization#503
@aduermael aduermael marked this pull request as draft January 31, 2026 00:12
Copy link
Contributor

@simonjbeaumont simonjbeaumont left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for taking the time to file the patch. I left some comments on the patch itself, but I think we also need to answer some higher-level questions first:

  • Is this hypervisor behaviour something we should tolerate in this way?
  • Have you tested this patch resolves the issue?
  • Could we expand this PR to include tests?

Comment on lines +1210 to +1216
if readResult == .none {
// Spurious wakeup - no data available despite POLLIN notification.
// This can happen with vsock or other non-standard socket types that
// don't have the same kqueue synchronization guarantees.
self.readIfNeeded0()
return .normal(.none)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This branch and early-return has actually changed the semantics for io_uring -- when readResult == .none, we'll no longer conditionally fireChannelReadComplete down the pipeline.

Comment on lines +1205 to +1209
// NOTE: The original assert assumed kqueue/epoll always have implicit sync,
// but vsock (virtual sockets) may not provide the same guarantees as regular
// TCP/Unix sockets. Handle .none gracefully instead of crashing.
// See: https://github.com/apple/swift-nio/issues/3500
// https://github.com/apple/containerization/issues/503
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is referring to an assert that will not exist once this patch is merged, so will be confusing to maintainers. I appreciate you added a link to some GitHub issues, but it would be better if the comments in the code explain the code that's actually here.

Relatedly, the lengthy commentary above, about skipping the assertion for io_uring, no longer makes sense.

Comment on lines +202 to +203
// issues. The vsock hypervisor layer may close descriptors in ways that cause EBADF
// during normal operation.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we confirm this statement is true? Specifically "during normal operation" -- should we really expect the hypervisor to be closing the FD and have we ruled out this being a bug in Container?

IIUC a file descriptor that is closed out-of-band, under NIO's feat, violates NIOs model, and it would be problematic to just gloss over it for everything.

The commentary above about tolerating for iOS is very specific, but this patch is opening it up to everything on macOS, not just VSOCK, even if that is something we wanted to do.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fwiw, I (containerization maintainer) cannot repro the issue Adrian is seeing so far. I can dig in on the hypervisor end, but I'm mostly in Simon's camp here in terms of his comment left. I don't think we should ignore EBADF either, or at the very least offer some escape hatch for folks who don't want the library to assert on EBADFs. On the hypervisor/VMM end, Virtualization.framework does seem to close all vsock FDs it has vended when the VM is stopped, but I imagine this is intended and I view this as a good thing. We had quite a few bugs originally for our container stop logic that would trigger the ebadf assert in NIO, but never on startup that I've seen yet apple/containerization#503.

aduermael and others added 2 commits February 2, 2026 13:15
The previous commit tolerated EBADF in isUnacceptableErrno() but the
close() path uses a separate function isUnacceptableErrnoOnClose() which
still treated EBADF as unacceptable.

When vsock connections are closed by the hypervisor layer (e.g., when a
container stops), NIO may attempt to close an already-invalid file
descriptor, resulting in EBADF. This is normal behavior for vsock and
should not crash.

Refs: apple#3500
Refs: apple/containerization#503

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
When vsock connections are closed by the hypervisor (e.g., container
shutdown), the event loop may exit abnormally without going through
the proper shutdown state machine. This would cause an assertion
failure because the state is still runningAndAcceptingNewRegistrations
instead of noLongerRunning.

On Darwin, we now gracefully handle this case by forcing the state
transition instead of asserting. This allows the event loop to clean
up properly even after abnormal vsock disconnections.

Refs: apple#3500
Refs: apple/containerization#503

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants