Discord Gemini Live (Native Audio) 🎙️✨

The first open-source implementation of a Discord Bot utilizing the Google Gemini Multimodal Live API for native Speech-to-Speech interaction.

🚀 Why is this special?

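Most "voice" bots on Discord today use a slow, chained pipeline: Speech-to-Text (e.g. Whisper) ➔ LLM (e.g. GPT) ➔ Text-to-Speech (e.g. ElevenLabs). Each hop adds latency and discards everything about the speaker's voice except the words.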

This bot is different. It establishes a direct, bi-directional WebSocket connection with Google's Gemini 2.0 model.

- No Transcription: The model "hears" the raw audio bytes (tone, emotion, pace).
- No TTS Engine: The model generates raw audio bytes directly.
- Sub-Second Latency: Responses feel almost instantaneous.
- Barge-In Capable: You can interrupt the bot, and it will stop talking and listen (Echo Cancellation).

🛠️ The Architecture (The "Secret Sauce")

Connecting Discord's UDP audio stream to Gemini's WebSocket required solving several complex synchronization issues. This repo implements three critical fixes:

1. "Silence Injection" (Keep-Alive) 🤫

Gemini's WebSocket closes the connection with close code 1011 if the client stops sending data. However, while the bot is speaking, we must cut the microphone stream so that the bot does not hear itself (echo).

Solution: When the bot speaks, we inject digital silence (`b'\x00'` bytes) into the upload stream. This "mutes" the mic but keeps the WebSocket heartbeat alive.
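The idea can be sketched in a few lines. This is a minimal illustration, not the code in `bot.py`: the helper names `silence` and `outgoing_chunk` are hypothetical, and the 16 kHz, 16-bit mono upload format is an assumption.

```python
SAMPLE_RATE_HZ = 16_000  # assumption: upload stream is 16 kHz, 16-bit mono PCM
BYTES_PER_SAMPLE = 2


def silence(duration_ms: int) -> bytes:
    """Return `duration_ms` of digital silence as zeroed 16-bit PCM."""
    n_samples = SAMPLE_RATE_HZ * duration_ms // 1000
    return b"\x00" * (n_samples * BYTES_PER_SAMPLE)


def outgoing_chunk(mic_chunk: bytes, bot_is_speaking: bool) -> bytes:
    """Choose what to upload for this chunk.

    While the bot is speaking, replace the mic audio with zero bytes of the
    same length so the stream keeps flowing but carries no echo.
    """
    if bot_is_speaking:
        return b"\x00" * len(mic_chunk)
    return mic_chunk
```

Replacing the audio with same-length silence, rather than pausing the upload, keeps the send cadence unchanged, so the server keeps receiving data.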

2. Accumulation Buffer (Jitter Fix) 🌊

Discord sends audio in tiny 20ms chunks. Sending these individually to Google causes network congestion and "choppy" audio.

Solution: We implement an Accumulation Buffer that collects ~150ms of audio (4800 bytes, which matches 150ms of 16 kHz, 16-bit mono PCM) before sending a single, stable chunk to the API.
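A minimal sketch of such a buffer is shown below. The class name `AccumulationBuffer` and its methods are hypothetical and may differ from the repository's implementation. Only the 4800-byte threshold comes from the description above.

```python
class AccumulationBuffer:
    """Collect small PCM frames and release them in fixed-size chunks."""

    def __init__(self, threshold_bytes: int = 4800) -> None:
        self._threshold = threshold_bytes
        self._buf = bytearray()

    def push(self, pcm: bytes) -> list[bytes]:
        """Add PCM bytes and return zero or more full chunks ready to send."""
        self._buf.extend(pcm)
        ready = []
        while len(self._buf) >= self._threshold:
            ready.append(bytes(self._buf[: self._threshold]))
            del self._buf[: self._threshold]
        return ready

    def flush(self) -> bytes:
        """Return whatever is left (possibly empty) and reset the buffer."""
        rest = bytes(self._buf)
        self._buf.clear()
        return rest
```

Assuming 16 kHz, 16-bit mono PCM, a 20ms frame is 640 bytes, so `push` returns a chunk on roughly every eighth frame and keeps the remainder for the next one.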

3. Opus Error Patching 🩹

Discord occasionally sends empty or malformed Opus packets, which causes standard decoders to crash.

Solution: A monkey-patch for `discord.opus.Decoder` that safely returns silence instead of raising an exception.
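The pattern might look like the sketch below. The wrapper `make_safe` is a hypothetical helper, not the repository's patch. It is written against a generic `decode` method so it can be read and run without discord.py installed, and it catches any exception, where a real patch would probably catch only the decoder's own error type.

```python
import functools

# 20 ms of 48 kHz, 16-bit stereo PCM: 960 samples * 2 channels * 2 bytes.
SILENT_FRAME = b"\x00" * (960 * 2 * 2)


def make_safe(decode, fallback: bytes = SILENT_FRAME):
    """Wrap a decode method so empty input or a decoder error yields silence."""

    @functools.wraps(decode)
    def safe_decode(self, data, *args, **kwargs):
        if not data:
            # Empty packet: skip the decoder entirely.
            return fallback
        try:
            return decode(self, data, *args, **kwargs)
        except Exception:  # assumption: a real patch would narrow this
            return fallback

    return safe_decode


# Hypothetical application to discord.py (not executed here):
#   import discord.opus
#   discord.opus.Decoder.decode = make_safe(discord.opus.Decoder.decode)
```

Returning a full frame of silence rather than an empty byte string keeps downstream consumers that expect fixed-size 20ms frames in step.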

⚙️ Prerequisites

- Python 3.10+
- FFmpeg (required for Discord audio processing)
  - Linux: `sudo apt install ffmpeg`
  - Windows: Download and add to PATH
  - Mac: `brew install ffmpeg`
- Google Gemini API key (access to `gemini-2.0-flash-exp` or newer)
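A quick way to confirm the first two prerequisites before installing anything is a small preflight check. The function `missing_prerequisites` below is a hypothetical helper, not part of the repository. It uses only the standard library.

```python
import shutil
import sys


def missing_prerequisites(which=shutil.which, version=sys.version_info) -> list[str]:
    """Return a list of human-readable problems; empty means all good."""
    problems = []
    if tuple(version[:2]) < (3, 10):
        problems.append("Python 3.10+ is required")
    if which("ffmpeg") is None:
        problems.append("ffmpeg was not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in missing_prerequisites():
        print("Missing:", problem)
```

The `which` and `version` parameters exist only so the check can be exercised without touching the real environment.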

📦 Installation

Clone the repository:

git clone https://github.com/yourusername/discord-gemini-live.git
cd discord-gemini-live

Create a Virtual Environment:

python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

Install Dependencies:
```
pip install -r requirements.txt
```
Create your .env file: Copy .env.example to .env and fill in your details.
```
cp .env.example .env
```

📝 Configuration (`.env`)

| Variable | Description |
| --- | --- |
| `DISCORD_TOKEN` | Your Discord bot token (from the Discord Developer Portal). |
| `GEMINI_API_KEY` | Your Google AI Studio API key. |
| `GEMINI_MODEL_ID` | Default: `gemini-2.5-flash-native-audio-preview-12-2025` |
| `GEMINI_VOICE_NAME` | Voices: `Aoede`, `Puck`, `Charon`, `Kore`, `Fenrir`. |
| `BOT_PERSONALITY` | The system instruction (prompt) for the bot. |

Example Personality:

You are Skippy, a grumpy otter wizard who hates technology but loves fish.
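For orientation, the settings above could be read at startup roughly as follows. This is a sketch, not the loader in `bot.py`: `BotConfig` and `load_config` are hypothetical names, and the fallback voice is an assumption, since the source documents a default only for the model ID. It reads from the process environment, so the `.env` file must already have been loaded by whatever mechanism the bot uses.

```python
import os
from dataclasses import dataclass
from typing import Mapping

DEFAULT_MODEL_ID = "gemini-2.5-flash-native-audio-preview-12-2025"


@dataclass(frozen=True)
class BotConfig:
    discord_token: str
    gemini_api_key: str
    model_id: str
    voice_name: str
    personality: str


def load_config(env: Mapping[str, str] = os.environ) -> BotConfig:
    """Build a BotConfig from environment-style settings, failing fast on gaps."""
    missing = [k for k in ("DISCORD_TOKEN", "GEMINI_API_KEY") if not env.get(k)]
    if missing:
        raise RuntimeError("Missing required settings: " + ", ".join(missing))
    return BotConfig(
        discord_token=env["DISCORD_TOKEN"],
        gemini_api_key=env["GEMINI_API_KEY"],
        model_id=env.get("GEMINI_MODEL_ID") or DEFAULT_MODEL_ID,
        voice_name=env.get("GEMINI_VOICE_NAME") or "Puck",  # assumed fallback
        personality=env.get("BOT_PERSONALITY", ""),
    )
```

Failing at startup when a token is missing gives a clearer error than a rejected login or a closed WebSocket later on.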

malvinarum/Discord-Gemini-Live
