Use `FdLock` on all targets by ysbaddaden · Pull Request #16569 · crystal-lang/crystal

ysbaddaden · 2026-01-15T11:01:45Z

Moves Crystal::FdLock out of the internal Crystal::System types right into the stdlib types: IO::Descriptor, File and Socket.

The change is mostly to bring the single reader/writer behavior to Windows. Still, making sure to clean everything before we close a file isn't a bad idea.

Contrary to what I thought, we don't need the single reader/writer lock to implement Crystal::EventLoop::IOCP#shutdown(IO::FileDescriptor) because Windows provides CancelIoEx(handle, NULL), which cancels all pending IO operations on a file handle. This may be extracted into its own PR.

Also refactors #system_bind and #system_listen, which currently take a block to yield errors... while the block usually just returns the error. It's simpler to return a nilable error.
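
To illustrate the shape of that refactor, here is a minimal sketch. `FakeSocket`, `bind_with_block` and `bind_returning_error` are hypothetical names made up for this example; they are not the real #system_bind signature, only a stand-in for the "yield the error" and "return a nilable error" styles.

```crystal
# Hypothetical stand-in: NOT the stdlib's real #system_bind signature.
class FakeSocket
  def initialize(@fail : Bool)
  end

  # Block style: the error is yielded and the caller's block decides
  # what to do with it (usually it just hands the error back).
  def bind_with_block(& : Exception -> _) : Nil
    yield IO::Error.new("bind failed") if @fail
  end

  # Nilable style: return the error, or nil on success.
  def bind_returning_error : Exception?
    @fail ? IO::Error.new("bind failed") : nil
  end
end

socket = FakeSocket.new(fail: true)

# The caller chooses between raising and carrying on, without a block.
if error = socket.bind_returning_error
  puts "bind error: #{error.message}"
end
```

With the nilable return, a caller that wants an exception can write `if error = ...; raise error; end`, and a caller that tries several addresses can keep the last error around and move on.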

follow-up to #16209

straight-shoota

It makes so much more sense to have the locks at this level 👍

straight-shoota · 2026-01-21T13:26:03Z

I was wondering whether we could safely guarantee that all file operations are protected by a lock and maintain that in the future.

And there could be a simple way to provide a bit more confidence: We could combine access to the file descriptor handle with acquiring the lock. The locking methods yield the file descriptor. It's a bit more inconvenient because we have to pass the fd handle down into the system implementations, but I don't think that should be a stopper.
We could implement that directly in FdLock or add wrappers in the FileDescriptor and Socket.

  private def unbuffered_read(slice : Bytes) : Int32
    # FdLock#read yields the fd handle, and we pass it explicitly to `system_read`
    @fd_lock.read { |fd| system_read(fd, slice) }
  end

The public #fd method still needs to exist and provide lock-free access to the handle for backwards compatibility, so we cannot make this 100% safe. In the end, nothing prevents system_read from still calling #fd instead of using the argument (or perhaps we could turn them into class methods so they have no access to the object?). But it would still be worth it to have a bit more safety, even if it's not 100% watertight.

I figure this should probably be addressed in a follow-up though.
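
A rough sketch of what "the locking methods yield the file descriptor" could look like. `HandleLock` and `fake_system_read` are hypothetical names for this example, and plain `Mutex` objects stand in for the lock: this is not Crystal::FdLock's actual API, which (as noted in the reply below) doesn't know the fd.

```crystal
# Hypothetical wrapper: NOT Crystal::FdLock's real API. It only shows the
# idea of handing out the handle together with the lock.
class HandleLock
  def initialize(@fd : Int32)
    @read_mutex = Mutex.new
    @write_mutex = Mutex.new
  end

  # Serializes readers and yields the handle to the block.
  def read(& : Int32 -> _)
    @read_mutex.synchronize { yield @fd }
  end

  # Serializes writers and yields the handle to the block.
  def write(& : Int32 -> _)
    @write_mutex.synchronize { yield @fd }
  end
end

# Hypothetical system-level read: it receives the handle as an argument,
# so it has no reason to reach for an unlocked accessor.
def fake_system_read(fd : Int32, slice : Bytes) : Int32
  slice.size # pretend the whole slice was filled from `fd`
end

lock = HandleLock.new(3)
buffer = Bytes.new(4)
bytes_read = lock.read { |fd| fake_system_read(fd, buffer) }
puts bytes_read
```

Because the handle only reaches the system method through the block argument, a code path that forgets to take the lock has no handle to work with, which is the extra confidence the proposal is after.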

ysbaddaden · 2026-01-22T11:04:13Z

@straight-shoota Interesting. Though FdLock doesn't know the actual fd (or the IO object) for now.

ysbaddaden · 2026-01-29T12:45:20Z

Damn. I didn't notice the CI failures on Windows.

straight-shoota · 2026-01-29T12:47:14Z

The icon for failed checks is way too similar to skipped ones.

src/crystal/event_loop/iocp.cr

ysbaddaden · 2026-01-30T13:43:05Z

There's an odd behavior on Windows where a fiber keeps calling Crystal::FdLock.read without actually locking, and hangs the process.

It feels like a recursive call, but a stack overflow is never reached, so it's not.

Actually, the fdlock has been closed and so #lock_slow raises, so there must be a loop that rescues the exception and keeps trying to read.

ysbaddaden · 2026-01-30T14:00:55Z

The issue is that HTTP::Server#listen rescues the exception and retries without checking if the IO is closed, which creates the infinite loop.

It's working on UNIX, so I thought there might be something different in the UNIX vs Win32 system implementations of #accept? but it doesn't look like it.

I think #accept? should return nil early when the socket is closed.
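
A simplified model of that loop, assuming a hypothetical `FakeListener` in place of a real server socket and a hypothetical `listen` helper in place of HTTP::Server#listen. It only shows why an early nil from #accept? lets a rescue-and-retry loop terminate.

```crystal
# Hypothetical listener standing in for a server socket.
class FakeListener
  getter? closed = false
  getter attempts = 0

  def close : Nil
    @closed = true
  end

  def accept? : String?
    # Early check: once closed, report nil instead of raising.
    return nil if closed?

    @attempts += 1
    raise IO::Error.new("transient accept failure") if @attempts == 1
    close if @attempts == 3 # simulate another fiber closing the listener
    "client-#{@attempts}"
  end
end

# Simplified accept loop that rescues errors and retries.
def listen(listener : FakeListener) : Array(String)
  clients = [] of String
  loop do
    client = begin
      listener.accept?
    rescue IO::Error
      next # retry on error
    end

    if client
      clients << client
    else
      break # nil means the listener is closed: stop instead of retrying
    end
  end
  clients
end

p listen(FakeListener.new)
```

If `accept?` raised on a closed listener instead of returning nil, the `rescue` branch would retry forever, which matches the hang described above.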

ysbaddaden · 2026-01-30T17:11:21Z

Fixing the #accept? methods fixes the hang 👍

Now, I get a "Process terminated abnormally, the cause is unknown" when running HTTP::Client specs (will retry a broken socket) 😓

If I mark the above spec as pending, I'm down to a couple failures:

UNIXServer accept raises when server is closed
Failure/Error: exception.try(&.message).should eq("Closed stream")

  Expected: "Closed stream"
       got: "AcceptEx timed out"

# spec\std\socket\unix_server_spec.cr:119

UNIXServer accept? returns nil when server is closed
Failure/Error: ch.receive.should eq SpecChannelStatus::End

  Expected: SpecChannelStatus::End
       got: SpecChannelStatus::Timeout

# spec\std\socket\unix_server_spec.cr:157

There might be something about shutdown related to UNIX sockets on Windows?

ysbaddaden · 2026-01-30T17:45:36Z

The two failures are caused by Crystal::System::Socket#overlapped_accept, which raises an IO::TimeoutError when the operation has been aborted. That used to happen only after a manual cancel following a timeout, but now we also explicitly cancel with a shutdown before we close.

Let's check if the IO has been closed before raising, and... fixed 🎉
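
The check can be sketched roughly like this. `Completion` and `check_accept` are hypothetical names for illustration, not the actual #overlapped_accept code; the two messages are the ones from the failing specs above.

```crystal
# Hypothetical outcome of an overlapped accept.
enum Completion
  Success
  Aborted
end

# An aborted operation means "closed" if the IO was closed in the meantime,
# and "timed out" otherwise.
def check_accept(result : Completion, closed : Bool) : Bool
  case result
  in .success?
    true
  in .aborted?
    raise IO::Error.new("Closed stream") if closed
    raise IO::TimeoutError.new("AcceptEx timed out")
  end
end

begin
  check_accept(Completion::Aborted, closed: true)
rescue ex : IO::Error
  puts ex.message
end
```

Note that IO::TimeoutError is a subclass of IO::Error, so callers that need to tell the two apart have to rescue the timeout first.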

And it seems to have fixed the "Process terminated abnormally" of the HTTP::Client spec... which still fails, but with an explicit exception:

WSARecv (#<TCPSocket:0x2b448b7a8c0>): An established connection was aborted by the software in your host machine. (IO::Error)

Which sounds just about right. We did cancel it.

ysbaddaden · 2026-01-30T17:58:18Z

All right, I think I got most of the issues cornered. I'm not so sure about the last commits for IOCP. They look okayish but I'd like another pair of 👀

ysbaddaden · 2026-02-02T10:32:24Z

Sigh, one last failure apparently:

  1) UDPSocket using IPv4 joins and transmits to multicast groups
     Failure/Error: expect_raises(IO::Error) { udp.receive }

       Expected IO::Error but nothing was raised

     # spec\std\socket\udp_socket_spec.cr:201

ysbaddaden · 2026-02-02T11:26:49Z

This is caused by 4b286bb#diff-d5a00d86dac51ee0e517a4d7b20d9f87168c36220dc1fc4cb23360b424108e61R440

ysbaddaden · 2026-02-02T12:42:35Z

The issue is caused by 4b286bb#diff-d5a00d86dac51ee0e517a4d7b20d9f87168c36220dc1fc4cb23360b424108e61R440

It makes sense: both cases shall raise an IO::Error exception, not report an EOF.

But then the HTTP::Client spec failure from a few comments above comes back. We must fix it some other way. The spec shouldn't pass because we returned an EOF when the connection has been explicitly closed, especially when the spec is "HTTP::Client will retry a broken socket".


ysbaddaden · 2026-02-02T12:56:55Z

Rebased to bring changes from master (especially the timer fixes from 1.19.1), and added a commit to properly fix/workaround spec failures related to WSA.

ysbaddaden · 2026-02-02T14:03:14Z

MinGW eventually fails on x86_64 with the unhelpful:

make: *** [Makefile:142: std_spec] Error 67
Error: Process completed with exit code 2.

While the spec output looks fine (no failures, no errors).

ysbaddaden self-assigned this Jan 15, 2026

ysbaddaden added the kind:feature label Jan 15, 2026

straight-shoota approved these changes Jan 21, 2026


straight-shoota added the topic:stdlib:system label Jan 21, 2026

straight-shoota changed the title ~~Use FdLock on every targets~~ Use FdLock on all targets Jan 21, 2026

ysbaddaden added this to the 1.20.0 milestone Jan 29, 2026

straight-shoota removed this from the 1.20.0 milestone Jan 29, 2026


ysbaddaden added 10 commits February 2, 2026 13:55

Implement Crystal::EventLoop::IOCP#shutdown

0ca0bfc

Implement Crystal::EventLoop::IOCP#close(Socket)

c810c4d

Move FdLock to File, IO::FileDescriptor and Socket

577c900

Instead of only protecting file descriptors on UNIX, we can take advantage of Crystal::FdLock to serialize reads and writes of files, pipes and sockets on all targets.

Fix: #system_bind and #system_listen

9b20a1c

fixup! Fix: #system_bind and #system_listen

2c0ebb4

fixup! Implement Crystal::EventLoop::IOCP#shutdown

44ae15e

Fix: don't check errors in IOCP#shutdown(socket)

61084c6

Fix: Socket#accept? shall do an early check for #closed?

0d51a1f

Fix: IOCP must distinguish shutdown from timeout

77c9b2c

Fix: WSA issues on win32

e03b7b0

ysbaddaden force-pushed the feature/move-fd-lock-out-of-crystal-system branch from 4b286bb to e03b7b0 Compare February 2, 2026 12:55
