
DLX message loss on restart due to multi-threaded race in definition loading #1697

@jage

Description


Problem

When LavinMQ restarts with multi-threading (4 threads), messages in TTL queues might be lost. The server correctly persists and loads messages from disk, but during load_definitions!, queues are created (spawning TTL expire fibers) before exchange bindings are loaded. With Crystal's multi-threaded runtime, another thread picks up a TTL fiber immediately, finds expired messages (downtime > TTL), attempts DLX routing, but the exchange has no bindings yet — find_queues returns empty and messages are silently dropped via delete_message.

Observed this on a QA instance that had a bunch of queues with data, so it got a bit busy when starting up. I haven't been able to reproduce it with a naive spec.

Root Cause

In src/lavinmq/vhost.cr, load_definitions! applies definitions in this order:

  1. Line 549: exchanges created
  2. Line 551: queues created → spawns TTL expire fibers
  3. Line 553: exchange bindings ← not reached yet when TTL fires
  4. Line 555: queue bindings ← not reached yet when TTL fires

Queue#initialize calls spawn message_expire_loop. With 4 threads, the spawned fiber runs on another thread immediately, before the main thread finishes loading bindings.

The expire path: message_expire_loop → expire_messages → expire_msg → @dead_letter.route → ex.find_queues(...) → empty (no bindings) → return → delete_message (message gone).
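
To make the race concrete, here is a minimal, self-contained Crystal sketch (not LavinMQ code; Exchange, Queue, expire_loop and the rest are made up for illustration). With the default single-threaded build the spawned fiber only runs once the main fiber yields, but compiled with -Dpreview_mt and run with CRYSTAL_WORKERS=4 it can start on another worker thread right away, before the "bindings" are added:

```crystal
# Illustrative only: a fiber spawned from an initializer can observe the
# world before the caller has finished wiring it up.
class Exchange
  getter bindings = [] of String

  def find_queues
    bindings # empty until the "definition loader" adds bindings
  end
end

class Queue
  def initialize(@dlx : Exchange)
    spawn expire_loop # mirrors Queue#initialize spawning message_expire_loop
  end

  def expire_loop
    # On a multi-threaded build this may run before any bindings exist.
    targets = @dlx.find_queues
    if targets.empty?
      puts "message dropped: no DLX bindings yet"
    else
      puts "dead-lettered to #{targets}"
    end
  end
end

ex = Exchange.new
Queue.new(ex)          # "queues created" step: expire fiber is spawned here
ex.bindings << "dlq"   # "bindings loaded" step: may already be too late
sleep 100.milliseconds # let the spawned fiber finish before the program exits
```

The outcome under -Dpreview_mt depends on scheduling, which matches the "busy instance, hard to reproduce with a naive spec" behaviour above.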

Proposed Fix

Add a @ready gate (BoolChannel) to VHost. Queue expire loops wait on this gate before processing. The gate opens after all definitions (including bindings) are loaded.

  • BoolChannel is already used extensively in queue code (@consumers_empty, @msg_store.empty)
  • When @ready is false (during loading), when_true.receive blocks the fiber
  • When @ready is set to true (after bindings loaded), all waiting fibers proceed
  • During normal operation (queue created at runtime), @ready is already true — when_true.receive returns immediately, zero overhead
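
A rough, self-contained sketch of the gate idea (the real change would use the existing BoolChannel; ReadyGate, VHost#ready and the call sites below are illustrative, not the actual patch):

```crystal
# Stand-in for BoolChannel: waiters block until the gate is opened once.
class ReadyGate
  @chan = Channel(Nil).new

  # Blocks until #open has been called; returns immediately afterwards.
  def wait : Nil
    @chan.receive?
  end

  # Opens the gate and releases every fiber blocked in #wait.
  def open : Nil
    @chan.close
  end
end

class VHost
  getter ready = ReadyGate.new

  def load_definitions!
    # 1. exchanges  2. queues (expire fibers spawned)  3./4. bindings ...
    ready.open # only now is it safe for expire fibers to dead-letter
  end
end

class Queue
  def initialize(@vhost : VHost)
    spawn message_expire_loop
  end

  private def message_expire_loop
    @vhost.ready.wait # returns immediately for queues created after startup
    # ... existing TTL/DLX expiry logic would go here ...
    puts "expiry running with bindings in place"
  end
end

vhost = VHost.new
Queue.new(vhost)        # during loading: the expire fiber blocks on the gate
vhost.load_definitions! # bindings loaded, gate opens
sleep 100.milliseconds
```

Once the gate is open, wait is just a receive on a closed channel and returns immediately, so queues created during normal operation are unaffected.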

Known limitation

If DLX is configured via a policy (not a queue argument), policies are deferred by 10 ms in a separate fiber.
TTL fibers could then run after @ready opens but before policies are applied. I haven't looked into this yet, but we should probably look at it closely as well.

Reproduction

I was able to trigger this with the following setup: https://gist.github.com/jage/68883dd9fc737cd113074f81552dcade. It sets up two vhosts, and the "qa-ordering" vhost lost all its messages on a restart (it's supposed to just loop messages through a lot of DLX/DLQ hops).
