Kokoro TTS - Ultimate Edition: A professional-grade Text-to-Speech system featuring the Kokoro-82M model with NVIDIA GPU acceleration. Generate unlimited-length, multi-language speech with 50+ voices. Includes smart text chunking, batch processing, and an interactive Gradio web interface.

Kokoro TTS - Ultimate Edition

A high-quality Text-to-Speech system using the Kokoro-82M model with NVIDIA GPU acceleration.

Features

  • Smart Chunking - Intelligently segments text for optimal processing
  • 📝 Unlimited Length - Generate speech from texts of any length
  • 🎙️ High-quality speech synthesis with 82M parameters
  • 🚀 NVIDIA GPU acceleration support
  • 🌍 Multi-language support
  • 👥 Multiple voice options
  • 🎨 Gradio web interface
  • 📦 Pre-trained models included
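The README does not describe how Smart Chunking works internally. As a rough illustration of the general idea only, here is a minimal sketch that packs whole sentences into chunks under a character limit. The function name `chunk_text` and the `max_chars` parameter are hypothetical and are not taken from this repository.

```python
import re


def chunk_text(text: str, max_chars: int = 300) -> list[str]:
    """Split text into chunks of at most max_chars, preferring sentence boundaries.

    Illustrative only: this is not the repository's chunking code.
    """
    # Split after '.', '!' or '?' when followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        # A sentence longer than the limit is hard-split, on a space if possible.
        while len(sentence) > max_chars:
            cut = sentence.rfind(" ", 0, max_chars)
            if cut <= 0:
                cut = max_chars
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:cut].strip())
            sentence = sentence[cut:].strip()
        if not sentence:
            continue
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            current = candidate  # the sentence still fits in the open chunk
        else:
            chunks.append(current)  # close the open chunk and start a new one
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk could then be synthesized separately and the audio concatenated, which is one common way to handle long input text.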

Subscribe to my YouTube channel for more releases like this: https://www.youtube.com/@TheOracleGuy_AI

Tutorial

📹 Watch the YouTube Tutorial and download the tool:

Kokoro TTS Ultimate - AI Voice Generation

Installation

Prerequisites

  • Python 3.11
  • NVIDIA GPU (recommended for faster inference)
  • CUDA toolkit (if using GPU acceleration)
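Before launching the tool, it can help to confirm whether a CUDA GPU is visible. The sketch below assumes the project runs on PyTorch, which this README does not state explicitly; `pick_device` is a hypothetical helper name. It falls back to CPU when PyTorch is missing or no GPU is found.

```python
import importlib.util


def pick_device() -> str:
    """Return "cuda" when PyTorch can see a CUDA GPU, otherwise "cpu"."""
    # Assumption: the project's dependencies include PyTorch.
    if importlib.util.find_spec("torch") is None:
        return "cpu"  # PyTorch is not installed in this environment
    import torch

    return "cuda" if torch.cuda.is_available() else "cpu"


print(f"Inference device: {pick_device()}")
```

If this prints `cpu` on a machine with an NVIDIA GPU, the usual suspects are a CPU-only PyTorch build or a driver/CUDA mismatch.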

Setup

  1. Clone the repository:
git clone https://github.com/niteshsharmacodes/kokoro-tts-ultimate.git
cd kokoro-tts-ultimate
  2. Install dependencies:
pip install -r requirements.txt

Usage

Web Interface (Gradio v5)

python scripts/gradio_v5/gradio_inf.py

Web Interface (Gradio v4)

python scripts/gradio_v4/gradio_inf.py

Command Line

python scripts/generate_speech.py --text "Your text here" --voice am_adam
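To call the same command from another Python program, a thin wrapper along these lines may be convenient. It uses only the `--text` and `--voice` flags shown above; the helper names `build_command` and `synthesize` are hypothetical, and where the script writes its output is not documented here, so the sketch does not assume an output path.

```python
import subprocess
import sys

# Script path taken from the command-line example in this README.
SCRIPT = "scripts/generate_speech.py"


def build_command(text: str, voice: str = "am_adam") -> list[str]:
    """Build the argument list for one generate_speech.py invocation."""
    return [sys.executable, SCRIPT, "--text", text, "--voice", voice]


def synthesize(text: str, voice: str = "am_adam") -> None:
    """Run the script once; raises CalledProcessError if it exits non-zero."""
    # Must be run from the repository root so the relative script path resolves.
    subprocess.run(build_command(text, voice), check=True)
```

Passing the arguments as a list avoids shell quoting problems when the text contains quotes or other special characters.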

Batch Processing

Use the tinker scripts for batch processing:

python scripts/tinker_v2/b.py

Available Voices

The model includes voices in multiple languages and genders:

  • English: Male (Adam, Echo, Eric, etc.), Female (Alloy, Bella, Jessica, etc.)
  • Chinese: Male (Yunjian, Yunxi, etc.), Female (Xiaobei, Xiaoni, etc.)
  • Japanese: Male (Kumo), Female (Alpha, Gongitsune, etc.)
  • Hindi: Male (Omega, Psi), Female (Alpha, Beta)
  • French: Male, Female
  • Portuguese: Male, Female
  • Spanish: Male, Female
  • Italian: Male, Female
  • And more...
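The `am_adam` ID used in the command-line example appears to follow the upstream Kokoro-82M naming pattern: a language letter, a gender letter, an underscore, and a name. The sketch below decodes IDs on that assumption. The prefix table reflects the upstream convention as best understood and may not match every voice file shipped in this repository; `describe_voice` is a hypothetical helper.

```python
# Assumed prefix letters, based on the upstream Kokoro-82M voice naming.
LANGUAGES = {
    "a": "American English",
    "b": "British English",
    "e": "Spanish",
    "f": "French",
    "h": "Hindi",
    "i": "Italian",
    "j": "Japanese",
    "p": "Portuguese",
    "z": "Chinese",
}
GENDERS = {"f": "Female", "m": "Male"}


def describe_voice(voice_id: str) -> dict[str, str]:
    """Decode a voice ID such as "am_adam" into language, gender, and name."""
    prefix, sep, name = voice_id.partition("_")
    if not sep or len(prefix) != 2 or not name:
        raise ValueError(f"unexpected voice ID format: {voice_id!r}")
    lang, gender = prefix
    return {
        "language": LANGUAGES.get(lang, "unknown"),
        "gender": GENDERS.get(gender, "unknown"),
        "name": name.capitalize(),
    }
```

For example, `describe_voice("am_adam")` yields American English, Male, Adam under these assumptions.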

Project Structure

├── model/                  # Pre-trained model files
│   ├── config.json        # Model configuration
│   └── audio/             # Voice model weights
├── scripts/               # Python scripts
│   ├── generate_speech.py # Speech generation script
│   ├── gradio_v4/         # Gradio v4 interface
│   ├── gradio_v5/         # Gradio v5 interface
│   └── tinker/            # Experimental scripts
├── python/                # Python environment
└── hub/                   # Hugging Face model cache

Requirements

See requirements.txt for a complete list of dependencies.

Performance

  • Optimized for NVIDIA GPUs
  • Supports batch processing
  • Real-time inference capable

License

See the LICENSE file in the repository for details.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Support

For issues and questions, please open an issue on GitHub.


Created with ❤️ by The Oracle Guy for high-quality speech synthesis
