Skip to content

mandarinoazul/discord-ai-bot

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

6 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation


🎙️ ENE-SYSTEM: Local AI Voice Companion for Discord

ENE-SYSTEM is an open-source "Voice-to-Voice" Discord bot that runs entirely on local hardware (CPU/GPU). It combines advanced Local LLMs, Neural Voice Cloning, and Speech Recognition to create immersive roleplay experiences with persistent memory and reactive sound effects.

🚀 Key Features

  • 🧠 Local Intelligence: Powered by Ollama (Hermes 3 / Llama 3.1 / Qwen 2.5) for uncensored, smart, and context-aware roleplay.
  • 🗣️ Voice Cloning (XTTS): Uses Coqui XTTS v2 to clone specific character voices (Rick Sanchez, Anime characters, etc.) with high fidelity and emotion.
  • 👂 Speech Recognition: Transcribes voice chat in real-time using OpenAI Whisper.
  • 💾 Memento Protocol (Long-Term Memory): The bot remembers user details (names, facts, hobbies) across different sessions using a JSON database.
  • 🔊 Reactive Soundboard: Automatically injects SFX (burps, glitches, slams) based on the context of the conversation using FFmpeg mixing.
  • 🎭 Multi-Personality Engine: Dynamic switching between different character profiles (prompts + voices) on the fly.

📂 Included Personalities

The system comes with pre-configured profiles (JSON/TXT):

  • Rick (C-137): Nihilistic, cynical, speaks English (auto-generates burps).
  • Shiro (NGNL): Logical, gamer, emotionless tone.
  • The Commander: Historical parody, paranoid, screams orders.
  • The Shadow: "Yandere" entity living in the code (creepy/whispery).
  • Anime Girl: Energetic and cheerful assistant.
  • Picara: Flirty/Sarcastic Latina personality.

🛠️ Tech Stack

  • Language: Python 3.10+
  • Discord: discord.py (with experimental voice recv support).
  • LLM: Ollama (Server-Client architecture).
  • TTS: Coqui XTTS v2 (PyTorch).
  • STT: OpenAI Whisper (Tiny/Base).
  • Audio Processing: FFmpeg & NumPy.

📦 Installation Guide

Prerequisites

Before starting, ensure you have the following installed on your system:

  1. Python 3.10: This version is crucial for XTTS compatibility.
  2. FFmpeg: Must be installed and added to your system PATH (or placed in the project root folder).
  3. Ollama: Installed and running locally.
  4. Hardware: A dedicated GPU is recommended (NVIDIA for CUDA, or AMD via Vulkan for Ollama inference).

Step 1: Download and Setup

Clone the repository to your local machine:

git clone https://github.com/mandarinoazul/discord-ai-bot.git
cd discord-ai-bot

Step 2: Virtual Environment (Recommended)

Using a virtual environment is highly recommended to avoid conflicts with system libraries.

Create a virtual environment with Python 3.10:

py -3.10 -m venv venv_tts

Activate the environment:

  • Windows:
.\venv_tts\Scripts\activate
  • Linux/Mac:
source venv_tts/bin/activate

Step 3: Install Dependencies

With the virtual environment active, install the required packages:

pip install -r requirements.txt

If the requirements.txt file is missing or you encounter issues, you can install dependencies manually:

pip install discord.py ollama openai-whisper scipy discord-ext-voice-recv numpy TTS transformers==4.36.2

Step 4: Download LLM Model

Open a new terminal window and pull the brain model via Ollama:

ollama pull hermes3

Alternatively, you can use ollama pull qwen2.5.


⚙️ Configuration

Discord Token

You need a Bot Token from the Discord Developer Portal.

  • Set it as an environment variable named DISCORD_TOKEN.
  • Alternatively, edit ENEScript.py directly (not recommended for public repos):
DISCORD_TOKEN = "YOUR_TOKEN_HERE"

Base Directory

Edit the BASE_DIR variable in ENEScript.py to match your local folder path:

BASE_DIR = r"C:\Path\To\Your\Project\Folder"

Audio Files

Ensure you have your .wav voice samples (mono, 22050Hz) in the root folder. The filenames must match the names defined in the PERFILES dictionary within the script (e.g., voz_rick.wav, voz_shiro.wav).


🎮 Usage Commands

To start the bot, run:

python ENEScript.py

In Discord

  • !voice [name]: Switch active personality.

  • Example: !voice rick, !voice shadow, !voice anime.

  • !listen: Bot joins your Voice Channel and listens to you for 5 seconds.

  • Usage: Say "Hello Rick", wait for processing, and hear the response.

  • !shh: Emergency Silence. Stops the bot from speaking immediately.

  • !forget: Wipes the bot's long-term memory about the current user.

  • !leave: Disconnects the bot from the voice channel.

Natural Chat

You can also chat via text. The bot responds if you:

  • Mention it (@Bot).
  • Say its name ("Ene", "Rick", "Commander").
  • Reply to its messages.

🔧 Troubleshooting & Optimization

For AMD GPU Users (RX 6000/7000 Series)

To force Ollama to use your GPU instead of CPU/RAM:

  1. Open Windows Environment Variables.
  2. Add a new System Variable: OLLAMA_VULKAN with value 1.
  3. Add another System Variable: HSA_OVERRIDE_GFX_VERSION with value 10.3.0.
  4. Restart Ollama completely.

Latency Tips

  • XTTS runs on CPU by default for stability on Windows AMD systems. It takes approximately 2-3 seconds of processing per second of audio.
  • To improve speed, keep the system prompts asking for short responses (max 20 words).

📜 License

This project is open-source. Feel free to fork, modify, and distribute.

Built with ❤️, Python, and a lot of VRAM.

About

This is my project creating ai bots for Discord.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages