A high-quality Text-to-Speech system using the Kokoro-82M model with NVIDIA GPU acceleration.
- ⚡ Smart Chunking - Intelligently segments text for optimal processing
- 📝 Unlimited Length - Generate speech from texts of any length
- 🎙️ High-quality speech synthesis with 82M parameters
- 🚀 NVIDIA GPU acceleration support
- 🌍 Multi-language support
- 👥 Multiple voice options
- 🎨 Gradio web interface
- 📦 Pre-trained models included
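
The "Smart Chunking" and "Unlimited Length" features suggest that long input is split into smaller pieces before synthesis. The following is a minimal sketch of one way this could work, not the project's actual implementation: `chunk_text` and its `max_chars` limit are hypothetical names, and the splitting rule (break at sentence ends, fall back to word boundaries) is an assumption.

```python
import re


def chunk_text(text: str, max_chars: int = 300) -> list[str]:
    """Split text into chunks of at most max_chars characters.

    Hypothetical helper: breaks at sentence ends where possible and
    falls back to word boundaries for overlong sentences.
    """
    # Split after ., ! or ? when followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s.strip()]
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        # A sentence longer than the limit is cut at the last space that fits.
        while len(sentence) > max_chars:
            cut = sentence.rfind(" ", 0, max_chars)
            if cut <= 0:
                cut = max_chars  # no space available: cut mid-word
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:cut].strip())
            sentence = sentence[cut:].strip()
        if not sentence:
            continue
        # Grow the current chunk while the next sentence still fits.
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


print(chunk_text("One. Two. Three.", max_chars=9))
```

Each chunk could then be synthesized separately and the resulting audio concatenated, which is what would make the input length effectively unbounded.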
Subscribe to my YouTube channel for more releases like this: https://www.youtube.com/@TheOracleGuy_AI
📹 Watch the YouTube Tutorial and download the tool:
- Python 3.11
- NVIDIA GPU (recommended for faster inference)
- CUDA toolkit (if using GPU acceleration)
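
Before installing, it can help to confirm whether a GPU is actually visible to Python. The sketch below assumes the project runs on PyTorch (check requirements.txt to confirm); `pick_device` is a hypothetical helper, not part of this repository.

```python
import importlib.util


def pick_device() -> str:
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    # PyTorch may not be installed yet; fall back to CPU instead of failing.
    if importlib.util.find_spec("torch") is None:
        return "cpu"
    import torch

    return "cuda" if torch.cuda.is_available() else "cpu"


print(f"Inference device: {pick_device()}")
```

If this prints `cpu` on a machine with an NVIDIA GPU, the usual causes are a CPU-only PyTorch build or a missing or mismatched CUDA driver.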
- Clone the repository:
git clone https://github.com/niteshsharmacodes/kokoro-tts-ultimate.git
cd kokoro-tts-ultimate
- Install dependencies:
pip install -r requirements.txt
- Launch the Gradio v5 web interface:
python scripts/gradio_v5/gradio_inf.py
- Or launch the Gradio v4 web interface:
python scripts/gradio_v4/gradio_inf.py
- Generate speech from the command line:
python scripts/generate_speech.py --text "Your text here" --voice am_adam

Use the tinker scripts for batch processing:
python scripts/tinker_v2/b.py

The model includes voices in multiple languages and genders:
- English: Male (Adam, Echo, Eric, etc.), Female (Alloy, Bella, Jessica, etc.)
- Chinese: Male (Yunjian, Yunxi, etc.), Female (Xiaobei, Xiaoni, etc.)
- Japanese: Male (Kumo), Female (Alpha, Gongitsune, etc.)
- Hindi: Male (Omega, Psi), Female (Alpha, Beta)
- French: Male, Female
- Portuguese: Male, Female
- Spanish: Male, Female
- Italian: Male, Female
- And more...
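
The `--voice am_adam` argument suggests that voice IDs encode language and gender in a two-letter prefix. The sketch below decodes such an ID. The prefix table is an assumption based on the naming commonly used for Kokoro voices, and `describe_voice` is a hypothetical helper; check the voice files shipped in `model/` for the authoritative list.

```python
# Assumed prefix conventions; verify against the voice files in model/.
LANGUAGES = {
    "a": "American English",
    "b": "British English",
    "z": "Mandarin Chinese",
    "j": "Japanese",
    "h": "Hindi",
    "f": "French",
    "p": "Brazilian Portuguese",
    "e": "Spanish",
    "i": "Italian",
}
GENDERS = {"m": "male", "f": "female"}


def describe_voice(voice_id: str) -> dict[str, str]:
    """Decode an ID such as "am_adam" into language, gender, and name."""
    prefix, sep, name = voice_id.partition("_")
    if not sep or len(prefix) != 2 or not name:
        raise ValueError(f"Unexpected voice ID format: {voice_id!r}")
    lang_code, gender_code = prefix
    return {
        "language": LANGUAGES.get(lang_code, "unknown"),
        "gender": GENDERS.get(gender_code, "unknown"),
        "name": name.capitalize(),
    }


print(describe_voice("am_adam"))
```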
├── model/ # Pre-trained model files
│ ├── config.json # Model configuration
│ └── audio/ # Voice model weights
├── scripts/ # Python scripts
│ ├── generate_speech.py # Speech generation script
│ ├── gradio_v4/ # Gradio v4 interface
│ ├── gradio_v5/ # Gradio v5 interface
│ └── tinker/ # Experimental scripts
├── python/ # Python environment
└── hub/ # Hugging Face model cache
See requirements.txt for a complete list of dependencies.
- Optimized for NVIDIA GPUs
- Supports batch processing
- Real-time inference capable
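
One simple way to process several texts is to call `scripts/generate_speech.py` once per input. The sketch below only uses the `--text` and `--voice` flags shown in this README; `build_command` and `run_batch` are hypothetical helpers, and how the script names or stores its output is not covered here.

```python
import subprocess
import sys

SCRIPT = "scripts/generate_speech.py"  # path taken from this README


def build_command(text: str, voice: str = "am_adam") -> list[str]:
    """Build the argument list for one generate_speech.py call."""
    return [sys.executable, SCRIPT, "--text", text, "--voice", voice]


def run_batch(texts: list[str], voice: str = "am_adam", dry_run: bool = True) -> list[list[str]]:
    """Build one command per non-empty text; execute them unless dry_run is set."""
    commands = [build_command(t, voice) for t in texts if t.strip()]
    if not dry_run:
        for cmd in commands:
            # check=True stops the batch at the first failing call.
            subprocess.run(cmd, check=True)
    return commands


for cmd in run_batch(["First paragraph.", "Second paragraph."]):
    print(" ".join(cmd))
```

Each call starts a fresh Python process, so this approach trades speed for simplicity. Run it from the repository root with `dry_run=False` to actually generate audio.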
See LICENSE.tpp for details.
Contributions are welcome! Please feel free to submit a Pull Request.
For issues and questions, please open an issue on GitHub.
Created with ❤️ by The Oracle Guy for high-quality speech synthesis