Discord messages not received when multiple gateway instances running #73

@drpedapati

Problem

When multiple gateway instances are running simultaneously with the same Discord bot token, only ONE instance receives messages from Discord. The other instances appear connected but receive no messages, causing confusing behavior where:

  • The gateway shows "Discord bot connected" successfully
  • No errors are logged
  • Messages sent in Discord never appear in the logs
  • The bot appears completely unresponsive

Root Cause

Discord's WebSocket gateway appears to deliver each message to only one connected client per bot token. When multiple sciclaw gateway processes connect with the same token, Discord seems to pick one arbitrarily to receive messages. The others are effectively "deaf" while appearing healthy.

How This Happens

  1. Service restarts during debugging - Running sciclaw gateway in foreground while launchd service is also running
  2. Failed service stops - sciclaw service stop times out but process keeps running
  3. Multiple machines - Running gateway on both local machine and remote server with same config
  4. Stale PID files - Gateway doesn't fully clean up on crash
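The stale PID file case (item 4) can be checked by hand. The following is a minimal sketch, assuming the gateway records its PID in a file. This issue does not say where that file lives, so the path in the usage comment and the helper name pidfile_status are both hypothetical.

```shell
#!/bin/sh
# pidfile_status FILE
# Prints "missing", "stale", or "live" for a PID file.
# Hypothetical helper: sciclaw's real PID file location is not given here.
pidfile_status() {
  file=$1
  if [ ! -f "$file" ]; then
    echo "missing"
    return 0
  fi
  pid=$(tr -d '[:space:]' < "$file")
  # Reject empty, zero, or non-numeric contents before handing them to kill.
  case $pid in
    ''|0|*[!0-9]*) echo "stale"; return 0 ;;
  esac
  # kill -0 sends no signal; it only tests whether the PID can be signalled.
  # A process owned by another user also fails this test and is reported
  # as "stale".
  if kill -0 "$pid" 2>/dev/null; then
    echo "live"
  else
    echo "stale"
  fi
}

# Example (the path is an assumption):
#   pidfile_status "$HOME/.sciclaw/gateway.pid"
```

A "live" result only says that some process owns that PID. Confirming that it is really a gateway still needs the pgrep check from the Diagnosis section.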

Symptoms

  • Gateway logs show successful connection: Discord bot connected {username=sciclaw-app}
  • No "Processing message" or route_* log lines appear when messages are sent
  • Messages with @mentions are ignored
  • Other channels (if configured) may work fine

Diagnosis

# Check for multiple gateway processes
pgrep -fl sciclaw

# If you see multiple PIDs, that's the problem
# Example bad output:
# 12387 /opt/homebrew/bin/sciclaw gateway
# 13526 /tmp/sciclaw-debug gateway
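The manual check above can be wrapped in a small function for scripts or cron jobs. This is a sketch only: check_gateway_count is a hypothetical name, and the match pattern is borrowed from the pkill command in the Fix section.

```shell
#!/bin/sh
# check_gateway_count: reads "PID command" lines on stdin (the format shown
# in the example output above) and reports how many look like gateway
# processes.
# Exit status: 0 = exactly one, 1 = none, 2 = more than one.
check_gateway_count() {
  # grep -c prints 0 and exits non-zero when nothing matches, hence "|| true".
  count=$(grep -c 'sciclaw.*gateway' || true)
  case $count in
    0) echo "no gateway process found"; return 1 ;;
    1) echo "OK: exactly one gateway process"; return 0 ;;
    *) echo "WARNING: $count gateway processes are running"; return 2 ;;
  esac
}

# Usage:
#   pgrep -fl sciclaw | check_gateway_count
```

On Linux systems with a recent procps-ng, pgrep -l may print only the process name, not the full command line. In that case pgrep -fa sciclaw should produce the format this function expects.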

Fix

1. Kill all gateway processes

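# Force-kill every process whose full command line matches "sciclaw...gateway"
pkill -9 -f "sciclaw.*gateway"
# or force-kill every process whose name contains "sciclaw"
pkill -9 sciclaw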

2. Verify all are stopped

pgrep -fl sciclaw
# Should return nothing

3. Restart single instance

sciclaw service start
# or, on macOS, load the launchd agent directly
launchctl bootstrap gui/$(id -u) ~/Library/LaunchAgents/io.sciclaw.gateway.plist

4. Verify single process

pgrep -fl sciclaw
# Should show exactly ONE process

Prevention Suggestions

  • Add startup check that kills any existing gateway processes before starting
  • Add periodic heartbeat log showing "still receiving messages"
  • Add warning when another gateway instance is detected (if possible to detect)
  • Improve sciclaw service stop to force-kill if graceful stop times out
  • Add sciclaw doctor check for multiple running instances
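The suggestion to force-kill when a graceful stop times out could look roughly like the sketch below. It is not sciclaw's actual implementation: the function name stop_with_timeout and the 10-second default are assumptions made for illustration.

```shell
#!/bin/sh
# stop_with_timeout PID [SECONDS]
# Sends SIGTERM, polls once per second, and escalates to SIGKILL if the
# process is still present after SECONDS (default 10).
stop_with_timeout() {
  pid=$1
  timeout=${2:-10}
  # If the process is already gone there is nothing to stop.
  kill -TERM "$pid" 2>/dev/null || return 0
  waited=0
  # kill -0 only tests for existence; no signal is delivered.
  while kill -0 "$pid" 2>/dev/null; do
    if [ "$waited" -ge "$timeout" ]; then
      echo "PID $pid ignored SIGTERM for ${timeout}s, sending SIGKILL" >&2
      kill -KILL "$pid" 2>/dev/null || true
      return 0
    fi
    sleep 1
    waited=$((waited + 1))
  done
  return 0
}

# Usage (the PID would come from pgrep or a PID file):
#   stop_with_timeout 12387 10
```

Trying SIGTERM first gives the gateway a chance to close its Discord connection cleanly. SIGKILL is the fallback only for a process that does not exit in time.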

Environment

  • sciclaw v0.1.64
  • macOS (launchd) / Linux (systemd)
  • Discord channel
