Custom nodes for ComfyUI that integrate the Resemble AI Chatterbox library for Text-to-Speech (TTS) and Voice Conversion (VC).
- Chatterbox TTS Node:
  - Synthesize speech from text.
  - Optional voice cloning using an audio prompt.
  - Adjustable parameters: exaggeration, temperature, CFG weight, seed.
- Chatterbox Voice Conversion Node:
  - Convert the voice in a source audio file to sound like a target voice.
  - Uses a target audio file for voice characteristics, or defaults to a built-in voice if no target is provided.
- Automatic Model Downloading: Necessary model files are automatically downloaded from Hugging Face (ResembleAI/chatterbox) on first use if not found locally.
Check the official demo
- Install Dependencies: Navigate to the custom node's directory and install the required packages:

      cd ComfyUI/custom_nodes/ComfyUI-Chatterbox
      pip install -r requirements.txt

- Model Pack Directory (Automatic Setup): The node will automatically attempt to download the default model pack (resembleai_default_voice) into ComfyUI/models/chatterbox_tts/ when you first use a node that requires it. You can also manually create subdirectories in ComfyUI/models/chatterbox_tts/ and place other Chatterbox model packs there. Each pack should contain:
  - ve.pt
  - t3_cfg.pt
  - s3gen.pt
  - tokenizer.json
  - conds.pt (for default voice capabilities)
- Restart ComfyUI.
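
If you place model packs manually, it can help to confirm that each pack directory holds the files listed above before restarting. The following is a minimal sketch, not part of this custom node: the helper names `missing_pack_files` and `check_model_packs` are hypothetical, and the script assumes it is run from the directory that contains `ComfyUI/`.

```python
from pathlib import Path

# File names taken from the model pack list above. Treating all five as
# required is an assumption made for this sketch.
REQUIRED_FILES = ("ve.pt", "t3_cfg.pt", "s3gen.pt", "tokenizer.json", "conds.pt")


def missing_pack_files(pack_dir):
    """Return the required file names that are absent from pack_dir."""
    pack = Path(pack_dir)
    return [name for name in REQUIRED_FILES if not (pack / name).is_file()]


def check_model_packs(models_root):
    """Map each pack subdirectory under models_root to its missing files."""
    root = Path(models_root)
    if not root.is_dir():
        # Nothing to report if the models directory does not exist yet.
        return {}
    return {
        sub.name: missing_pack_files(sub)
        for sub in sorted(root.iterdir())
        if sub.is_dir()
    }


if __name__ == "__main__":
    # Path from the installation notes, relative to the current directory.
    report = check_model_packs("ComfyUI/models/chatterbox_tts")
    for pack, missing in report.items():
        status = "complete" if not missing else "missing " + ", ".join(missing)
        print(f"{pack}: {status}")
```

An empty list for a pack means every listed file was found. The check only looks at file names and does not verify that the files load correctly.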
After installation and restarting ComfyUI:
- The "Chatterbox TTS 📢" node will be available under the audio/generation category.
- The "Chatterbox Voice Conversion 🗣️" node will be available under the audio/conversion category.
Load example workflows from the workflow-examples/ directory in this repository to get started.
- The Chatterbox library is included within this custom node's src/ directory.
- Tested with PyTorch 2.7 + CUDA 12.6.
- This node relies on the Chatterbox library by Resemble AI.
