@guyco3 guyco3 commented Jan 18, 2026

Fixes #4828

Problem

ZMQ_STREAM server cannot disconnect a client if the SNDHWM has been reached. This happens because the stream_t::xsend logic returns EAGAIN on the first frame (the Routing ID) when check_write() fails, preventing the socket from ever receiving the second frame (the 0-byte disconnect signal).

Solution

Defer the check_write() call from the Routing ID frame to the payload frame. The socket can then identify the peer and process a termination signal even under HWM pressure. Data payloads still trigger EAGAIN if the pipe is full, preserving the existing backpressure behavior.

  • If the subsequent payload frame is 0 bytes (a disconnect signal), it is processed immediately via terminate(), bypassing the HWM.
  • If the subsequent payload frame contains data (size > 0), the HWM check is enforced at that stage to maintain standard flow control.
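
To make the two cases above concrete, here is a minimal toy model of the deferred check. It is only a sketch, not the code in stream.cpp: ToyPipe, ToyStream and demo_disconnect_under_hwm are hypothetical names, the pipe is just a bounded queue, and only a single peer is modeled.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <deque>
#include <string>

// Hypothetical stand-in for one outbound pipe with a send high-water mark.
struct ToyPipe {
    std::size_t hwm;
    std::deque<std::string> queue;
    bool terminated = false;

    explicit ToyPipe(std::size_t h) : hwm(h) {}
    bool check_write() const { return queue.size() < hwm; }
};

// Hypothetical model of a two-frame STREAM send: routing ID, then payload.
struct ToyStream {
    ToyPipe pipe;
    bool more_out = false;  // true once the routing ID frame was accepted

    explicit ToyStream(std::size_t hwm) : pipe(hwm) {}

    // Returns 0 on success, or -1 with errno set to EAGAIN.
    int send(const std::string &frame) {
        if (!more_out) {
            // Routing ID frame: only selects the peer, no HWM check here.
            more_out = true;
            return 0;
        }
        if (frame.empty()) {
            // Empty payload is the disconnect signal: bypass the HWM.
            pipe.terminated = true;
            more_out = false;
            return 0;
        }
        if (!pipe.check_write()) {
            // Data payload on a full pipe: normal backpressure.
            // This sketch leaves more_out set; it does not model recovery.
            errno = EAGAIN;
            return -1;
        }
        pipe.queue.push_back(frame);
        more_out = false;
        return 0;
    }
};

// Usage: a disconnect goes through even though the pipe is at its HWM.
bool demo_disconnect_under_hwm() {
    ToyStream s(1);
    s.send("ID");
    s.send("data");  // the pipe is now full
    assert(!s.pipe.check_write());
    s.send("ID");
    return s.send("") == 0 && s.pipe.terminated;
}
```

In this model the routing ID frame never fails, so the only place EAGAIN can surface is the data-bearing payload frame.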

Verification

  • Added a new test case: tests/test_stream_hwm_disconnect.cpp.
  • Before fix: Disconnect failed with EAGAIN.
  • After fix: Disconnect succeeds even when the sending queue is full.

Solution: defer HWM check to the second frame in stream_t::xsend
I hereby agree to license my contributions to libzmq under the terms of the MPLv2.
In ZMQ_STREAM, sending the routing ID frame followed by an empty payload
signals a disconnect. If the SNDHWM is reached, the routing ID frame
is blocked by check_write(), returning EAGAIN and preventing the
disconnect signal from being processed.

Solution: defer HWM check to the payload frame in stream_t::xsend.
Also added a regression test in tests/test_stream_hwm_disconnect.cpp.

Fixes zeromq#4828
@guyco3 guyco3 force-pushed the fix/stream-hwm-disconnect-4828 branch 2 times, most recently from b508a91 to 16c7c2a Compare January 18, 2026 05:00
I hereby agree to license my contributions to libzmq under the terms
of the MPLv2.
@guyco3 guyco3 force-pushed the fix/stream-hwm-disconnect-4828 branch from 16c7c2a to 13e5c52 Compare January 18, 2026 06:20
@guyco3 guyco3 marked this pull request as ready for review January 18, 2026 06:25
Contributor

pijyoi commented Jan 18, 2026

In the test, after the sndhwm is reached, you have a comment that at that point, it is unknown whether the socket is in the "more" state. The subsequent code then deals with both cases.

It seems to me that there is a race? Let's assume that the socket is in the "more" state. Before you send the routing_id frame, is it possible for the sndhwm to transition to no longer full? If yes, then that would result in the routing id frame being sent as a payload to the client.

Perhaps the "more" state should be reset if the sndhwm is reached?

I see that #4829 added an API, zmq_disconnect_peer, but implemented it only for SERVER sockets. If support for STREAM sockets were added, that would also solve the issue, i.e. sending an empty frame to disconnect would be deprecated in favor of zmq_disconnect_peer.
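
The race described in this comment can be reproduced in a small toy model. This is a sketch under assumptions, not libzmq code: RacyStream and demo_race are hypothetical names, the pipe is a bounded queue, empty-frame handling is left out, and the "more" flag is assumed to stay set after an EAGAIN on the payload frame.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <deque>
#include <string>

// Hypothetical model: the routing ID frame is always accepted, and the
// "more" flag is NOT cleared when the payload frame hits the HWM.
struct RacyStream {
    std::size_t hwm;
    std::deque<std::string> queue;  // frames that will reach the client
    bool more_out = false;

    explicit RacyStream(std::size_t h) : hwm(h) {}

    int send(const std::string &frame) {
        if (!more_out) {
            more_out = true;  // routing ID frame
            return 0;
        }
        if (queue.size() >= hwm) {
            errno = EAGAIN;   // more_out stays true here
            return -1;
        }
        queue.push_back(frame);
        more_out = false;
        return 0;
    }

    // Stand-in for the client consuming one queued frame.
    void peer_reads_one() {
        if (!queue.empty())
            queue.pop_front();
    }
};

// Returns the frame the client ends up receiving after the retry.
std::string demo_race() {
    RacyStream s(1);
    s.send("ID");
    s.send("first");            // the pipe is now full
    s.send("ID");               // accepted, socket is in the "more" state
    int rc = s.send("second");  // EAGAIN, still in the "more" state
    assert(rc == -1 && errno == EAGAIN);
    (void) rc;
    s.peer_reads_one();         // the HWM clears before the caller retries
    s.send("ID");               // retry: this frame is taken as the payload
    return s.queue.back();
}
```

In this model demo_race() returns "ID": the routing ID from the retry is queued for the client as if it were data, which is the outcome the comment warns about.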

@guyco3 guyco3 force-pushed the fix/stream-hwm-disconnect-4828 branch 2 times, most recently from 5ebe188 to d05be44 Compare January 19, 2026 05:12
…wm is at limit

Solution: Implement zmq_disconnect_peer support for ZMQ_STREAM
@guyco3 guyco3 force-pushed the fix/stream-hwm-disconnect-4828 branch from d05be44 to c5f0e8c Compare January 19, 2026 05:15
When a ZMQ_STREAM socket reaches its SNDHWM, it becomes impossible to
disconnect a peer. Disconnecting requires sending the routing ID
followed by a 0-byte frame. Currently, xsend returns EAGAIN on the
first frame (the ID) if the pipe is full, preventing the second
disconnect frame from ever being processed.

Solution:
Modified xsend in stream.cpp to allow the routing ID frame to pass
even when the pipe is full. If the subsequent payload frame is
0-bytes, the connection is terminated immediately via terminate().

To prevent state-machine desync, _more_out is reset to false if a
data-bearing payload frame is sent on a full pipe, forcing an EAGAIN
and requiring a clean retry from the user.
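
A rough sketch of what the reset and the "clean retry" could look like, using a toy model rather than the actual stream.cpp code. ResettingStream and send_message are hypothetical names, the pipe is a bounded queue, and peer_reads_one() stands in for waiting until the peer drains data.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <deque>
#include <string>

// Hypothetical model: every payload frame ends the two-frame message, so the
// "more" flag is cleared even when the payload is rejected with EAGAIN.
struct ResettingStream {
    std::size_t hwm;
    std::deque<std::string> queue;
    bool more_out = false;
    bool terminated = false;

    explicit ResettingStream(std::size_t h) : hwm(h) {}

    int send(const std::string &frame) {
        if (!more_out) {
            more_out = true;  // routing ID frame passes even on a full pipe
            return 0;
        }
        more_out = false;     // reset before any early return
        if (frame.empty()) {
            terminated = true;  // disconnect bypasses the HWM
            return 0;
        }
        if (queue.size() >= hwm) {
            errno = EAGAIN;
            return -1;
        }
        queue.push_back(frame);
        return 0;
    }

    void peer_reads_one() {
        if (!queue.empty())
            queue.pop_front();
    }
};

// Caller-side clean retry: always resend the routing ID with the payload.
bool send_message(ResettingStream &s, const std::string &id,
                  const std::string &payload, int max_attempts) {
    assert(max_attempts > 0);
    for (int i = 0; i < max_attempts; ++i) {
        if (s.send(id) != 0)
            return false;
        if (s.send(payload) == 0)
            return true;
        if (errno != EAGAIN)
            return false;
        s.peer_reads_one();  // stand-in for waiting on the peer
    }
    return false;
}
```

Because the flag is cleared on EAGAIN, the next frame is always interpreted as a routing ID, so a retry cannot leak the ID to the client as payload in this model.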
@guyco3 guyco3 force-pushed the fix/stream-hwm-disconnect-4828 branch from 705f411 to 8600568 Compare January 19, 2026 11:35
Author

guyco3 commented Jan 19, 2026

Thanks for the feedback. I looked into the zmq_disconnect_peer approach as you suggested. While that API is a great addition, I think fixing the xsend logic is the right path for ZMQ_STREAM for the following reason (open to discussion though):

API Incompatibility: zmq_disconnect_peer currently takes a uint32_t routing ID. ZMQ_STREAM identities are 5-byte frames (0x00 + 4-byte ID). Forcing STREAM users to use this API would require them to manually parse the identity frame to extract the last 4 bytes in network order (e.g., ntohl(*(uint32_t*)(data+1))). Additionally, STREAM supports custom identities that will probably not map cleanly to uint32_t since they might be longer.
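
For illustration, this is roughly the parsing a STREAM user would have to do. It is a sketch only: stream_identity_to_u32 is a hypothetical helper, not a libzmq function, and it assumes the default identity layout described above (0x00 followed by a 4-byte ID in network order). It decodes byte by byte, which gives the same value as the ntohl expression but avoids the unaligned pointer cast.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: decode a default 5-byte STREAM identity into a
// uint32_t. Returns false for any other shape, e.g. a custom identity.
bool stream_identity_to_u32(const unsigned char *data, std::size_t size,
                            std::uint32_t *out) {
    assert(out != nullptr);
    if (data == nullptr || size != 5 || data[0] != 0x00)
        return false;
    // Big-endian (network order) decode of bytes 1..4.
    *out = (static_cast<std::uint32_t>(data[1]) << 24)
         | (static_cast<std::uint32_t>(data[2]) << 16)
         | (static_cast<std::uint32_t>(data[3]) << 8)
         | static_cast<std::uint32_t>(data[4]);
    return true;
}
```

Any identity that is not exactly 5 bytes with a leading 0x00 is rejected here, which is the case where a custom identity has no uint32_t equivalent.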

Current Progress:
I am still working on fixing the race you mentioned and on adding a solid test to the libzmq test suite to verify this behavior. I will update the PR once the test is finalized and verified.

@guyco3 guyco3 marked this pull request as draft January 19, 2026 18:33
