Kokoro TTS - Ultimate Edition

A high-quality Text-to-Speech system using the Kokoro-82M model with NVIDIA GPU acceleration.

Features

⚡ Smart Chunking - Intelligently segments text for optimal processing
📝 Unlimited Length - Generate speech from texts of any length
🎙️ High-quality speech synthesis with 82M parameters
🚀 NVIDIA GPU acceleration support
🌍 Multi-language support
👥 Multiple voice options
🎨 Gradio web interface
📦 Pre-trained models included
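
As an illustration of what sentence-aware chunking can look like, here is a minimal sketch. It is not the code this repository uses: the function name `chunk_text`, the `max_chars` limit, and the split rule are all assumptions made for the example.

```python
import re


def chunk_text(text: str, max_chars: int = 300) -> list[str]:
    """Split text into chunks of at most max_chars, preferring sentence boundaries.

    Hypothetical helper for illustration; the project's own chunking may differ.
    """
    # Split after sentence-ending punctuation that is followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        # A single sentence longer than the limit is cut into fixed-size pieces.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        # Otherwise, pack sentences together until the limit would be exceeded.
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be synthesized separately and the audio segments joined in order, which is how long inputs can be handled without feeding the whole text to the model at once.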

Subscribe to my YouTube channel for more releases like this: https://www.youtube.com/@TheOracleGuy_AI

Tutorial

📹 Watch the YouTube Tutorial and download the tool:

Installation

Prerequisites

Python 3.11
NVIDIA GPU (recommended for faster inference)
CUDA toolkit (if using GPU acceleration)
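
A quick way to check these prerequisites before installing is sketched below. It uses only the standard library; the function name `check_environment` is hypothetical, and finding `nvidia-smi` on the PATH is only a rough hint that an NVIDIA driver is installed, not proof that CUDA works.

```python
import shutil
import sys


def check_environment(required=(3, 11)) -> dict:
    """Report whether the interpreter matches the required Python version
    and whether the nvidia-smi tool is on the PATH (a heuristic only)."""
    return {
        "python_ok": tuple(sys.version_info[:2]) == tuple(required),
        "python_version": f"{sys.version_info.major}.{sys.version_info.minor}",
        "nvidia_smi_found": shutil.which("nvidia-smi") is not None,
    }


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```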

Setup

Clone the repository:

git clone https://github.com/niteshsharmacodes/kokoro-tts-ultimate.git
cd kokoro-tts-ultimate

Install dependencies:

pip install -r requirements.txt

Usage

Web Interface (Gradio v5)

python scripts/gradio_v5/gradio_inf.py

Web Interface (Gradio v4)

python scripts/gradio_v4/gradio_inf.py

Command Line

python scripts/generate_speech.py --text "Your text here" --voice am_adam
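
To call the same command from another Python program, a thin wrapper around `subprocess` is one option. This is a sketch: `build_command` and `synthesize` are hypothetical names, only the `--text` and `--voice` flags shown above are used, and it assumes the script is run from the repository root.

```python
import subprocess
import sys


def build_command(text: str, voice: str = "am_adam") -> list[str]:
    """Build the argument list for scripts/generate_speech.py.

    Passing arguments as a list avoids shell quoting problems with the text.
    """
    return [
        sys.executable,
        "scripts/generate_speech.py",
        "--text", text,
        "--voice", voice,
    ]


def synthesize(text: str, voice: str = "am_adam") -> None:
    """Run the script and raise CalledProcessError if it exits with an error."""
    subprocess.run(build_command(text, voice), check=True)
```

Using `sys.executable` keeps the call on the same interpreter (and therefore the same installed dependencies) as the calling program.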

Batch Processing

Use the tinker scripts for batch processing:

python scripts/tinker_v2/b.py

Available Voices

The model includes voices in multiple languages and genders:

English: Male (Adam, Echo, Eric, etc.), Female (Alloy, Bella, Jessica, etc.)
Chinese: Male (Yunjian, Yunxi, etc.), Female (Xiaobei, Xiaoni, etc.)
Japanese: Male (Kumo), Female (Alpha, Gongitsune, etc.)
Hindi: Male (Omega, Psi), Female (Alpha, Beta)
French: Male, Female
Portuguese: Male, Female
Spanish: Male, Female
Italian: Male, Female
And more...
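
The voice ID passed to `--voice` (for example `am_adam`) appears to follow the upstream Kokoro naming pattern: one letter for the language, one for the gender, an underscore, then the voice name. The sketch below decodes an ID on that assumption. The prefix table is an assumption based on the upstream model's voice list, and `parse_voice_id` is a hypothetical helper, so check the files under `model/` for the IDs actually shipped.

```python
# Assumed prefix tables (from the upstream Kokoro naming pattern, not from this repo).
LANGUAGES = {
    "a": "American English",
    "b": "British English",
    "e": "Spanish",
    "f": "French",
    "h": "Hindi",
    "i": "Italian",
    "j": "Japanese",
    "p": "Brazilian Portuguese",
    "z": "Mandarin Chinese",
}
GENDERS = {"f": "female", "m": "male"}


def parse_voice_id(voice_id: str) -> dict:
    """Decode an ID such as 'am_adam' into language, gender, and name."""
    prefix, sep, name = voice_id.partition("_")
    if not sep or len(prefix) != 2 or not name:
        raise ValueError(f"unexpected voice ID format: {voice_id!r}")
    lang_code, gender_code = prefix
    return {
        "language": LANGUAGES.get(lang_code, "unknown"),
        "gender": GENDERS.get(gender_code, "unknown"),
        "name": name,
    }
```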

Project Structure

├── model/                  # Pre-trained model files
│   ├── config.json        # Model configuration
│   └── audio/             # Voice model weights
├── scripts/               # Python scripts
│   ├── generate_speech.py # Speech generation script
│   ├── gradio_v4/         # Gradio v4 interface
│   ├── gradio_v5/         # Gradio v5 interface
│   └── tinker/            # Experimental scripts
├── python/                # Python environment
└── hub/                   # Hugging Face model cache

Requirements

See requirements.txt for a complete list of dependencies.

Performance

Optimized for NVIDIA GPUs
Supports batch processing
Real-time inference capable
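
"Real-time" is usually judged by the real-time factor (RTF): synthesis time divided by the duration of the audio produced. An RTF below 1.0 means speech is generated faster than it plays back. The helper below is a sketch for measuring this on your own hardware; the name `real_time_factor` is hypothetical, and no measured figures for this project are implied.

```python
def real_time_factor(synthesis_seconds: float, audio_seconds: float) -> float:
    """Return synthesis time divided by audio duration (lower is faster)."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return synthesis_seconds / audio_seconds


def is_real_time(synthesis_seconds: float, audio_seconds: float) -> bool:
    """True when audio is generated at least as fast as it plays back."""
    return real_time_factor(synthesis_seconds, audio_seconds) <= 1.0
```

For instance, if generating a 10-second clip takes 2 seconds of wall-clock time, the RTF is 0.2.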

License

See LICENSE.tpp for details.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Support

For issues and questions, please open an issue on GitHub.

Created with ❤️ by The Oracle Guy for high-quality speech synthesis
