DLX message loss on restart due to multi-threaded race in definition loading
Problem
When LavinMQ restarts with multi-threading (4 threads), messages in TTL queues might be lost. The server correctly persists and loads messages from disk, but during load_definitions!, queues are created (spawning TTL expire fibers) before exchange bindings are loaded. With Crystal's multi-threaded runtime, another thread picks up a TTL fiber immediately, finds expired messages (downtime > TTL), attempts DLX routing, but the exchange has no bindings yet — find_queues returns empty and messages are silently dropped via delete_message.
Observed this on a QA instance that had a bunch of queues with data, so startup was fairly busy; I haven't been able to reproduce it with a naive spec.
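To make the silent-drop path concrete, here is a minimal standalone sketch (simplified stand-in types, not LavinMQ source) of how an empty binding table turns dead-lettering into a plain delete:

```crystal
# Illustrative sketch only: Exchange/expire_msg here are simplified stand-ins
# for the real classes, just to show the failure shape.
class Exchange
  @bindings = Hash(String, Array(String)).new

  def bind(routing_key : String, queue : String)
    queues = @bindings[routing_key]? || [] of String
    queues << queue
    @bindings[routing_key] = queues
  end

  # Empty right after a restart if bindings haven't been loaded yet.
  def find_queues(routing_key : String) : Array(String)
    @bindings[routing_key]? || [] of String
  end
end

def expire_msg(ex : Exchange, routing_key : String, msg : String)
  queues = ex.find_queues(routing_key)
  return if queues.empty?                  # no DLX target -> fall through...
  queues.each { |q| puts "dead-lettered #{msg} to #{q}" }
ensure
  puts "delete_message #{msg}"             # ...the message is removed either way
end

ex = Exchange.new
expire_msg(ex, "dlx-key", "m1") # bindings not loaded yet: only "delete_message m1"
ex.bind("dlx-key", "dlq")
expire_msg(ex, "dlx-key", "m2") # routed to dlq, then deleted from the source queue
```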
Root Cause
In src/lavinmq/vhost.cr, load_definitions! applies definitions in this order:
- Line 549: exchanges created
- Line 551: queues created → spawns TTL expire fibers
- Line 553: exchange bindings ← not reached yet when TTL fires
- Line 555: queue bindings ← not reached yet when TTL fires
Queue#initialize → start → spawn message_expire_loop. With 4 threads, the spawned fiber runs on another thread immediately, before the main thread finishes loading bindings.
The expire path: message_expire_loop → expire_messages → expire_msg → @dead_letter.route → ex.find_queues(...) → empty (no bindings) → return → delete_message (message gone).
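The same ordering hazard in a minimal standalone form (illustrative only, not LavinMQ source; the names are stand-ins). Compiled with -Dpreview_mt and run with CRYSTAL_WORKERS=4 to mirror the 4-thread setup:

```crystal
# Stand-ins: `bindings` plays the exchange's binding table, the spawned fiber
# plays the message_expire_loop started from Queue#initialize.
# (Synchronization is deliberately omitted to keep the sketch short.)
bindings = [] of String
done = Channel(Nil).new

spawn do
  # With the multi-threaded runtime this fiber can be picked up by another
  # worker thread immediately, before the "loader" below has added bindings.
  if bindings.empty?
    puts "expired message dropped: DLX exchange has no bindings yet"
  else
    puts "expired message dead-lettered to #{bindings.first}"
  end
  done.send(nil)
end

# Stand-in for the rest of load_definitions!: bindings are only applied after
# every queue (and its expire fiber) already exists.
sleep 10.milliseconds
bindings << "dlq"

done.receive
```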
Proposed Fix
Add a @ready gate (BoolChannel) to VHost. Queue expire loops wait on this gate before processing. The gate opens after all definitions (including bindings) are loaded.
- BoolChannel is already used extensively in queue code (@consumers_empty, @msg_store.empty)
- When @ready is false (during loading), when_true.receive blocks the fiber
- When @ready is set to true (after bindings are loaded), all waiting fibers proceed
- During normal operation (a queue created at runtime), @ready is already true, so when_true.receive returns immediately with zero overhead
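A rough sketch of the gate idea, assuming a BoolChannel-like API along the lines described above (a wait that blocks while the value is false and returns immediately once it is true). The BoolGate/wait_true/set_true names are made up for the sketch; this is an approximation for illustration, not LavinMQ's actual BoolChannel:

```crystal
# Minimal gate with the same shape: wait_true parks fibers until set_true
# opens the gate; once open, waiting costs nothing.
class BoolGate
  def initialize(@value : Bool = false)
    @waiters = [] of Channel(Nil)
    @lock = Mutex.new
  end

  # Blocks the calling fiber until the gate is true; no-op if already true.
  def wait_true : Nil
    ch = @lock.synchronize do
      return if @value
      Channel(Nil).new(1).tap { |c| @waiters << c }
    end
    ch.receive
  end

  # Opens the gate and wakes every parked fiber.
  def set_true : Nil
    waiters = @lock.synchronize do
      @value = true
      @waiters.tap { @waiters = [] of Channel(Nil) }
    end
    waiters.each(&.send(nil))
  end
end

# In VHost terms: each message_expire_loop would call wait_true before its
# first pass, and load_definitions! would call set_true after the binding
# steps (lines 553/555 above) have been applied.
ready = BoolGate.new
spawn do
  ready.wait_true              # expire loop parked here during loading
  puts "expire loop running, bindings are in place"
end
sleep 10.milliseconds          # stand-in for loading exchanges/queues/bindings
ready.set_true
sleep 10.milliseconds          # give the woken fiber a chance to print
```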
Known limitation
If DLX is configured via policy (not queue argument), policies are deferred by 10ms in a separate fiber.
TTL fibers would then run after @ready opens but before policies are applied. I haven't looked into this yet, but we should probably look at it closely as well.
Reproduction
I was able to trigger this with the following setup: https://gist.github.com/jage/68883dd9fc737cd113074f81552dcade - It sets up two vhosts, and the "qa-ordering" vhost lost all its messages on a restart (it is supposed to just loop messages through a lot of DLX/DLQ hops).