diff --git a/docs.json b/docs.json
index d58e04c7..f3fe2ecf 100644
--- a/docs.json
+++ b/docs.json
@@ -196,6 +196,7 @@
 "server/services/tts/openai",
 "server/services/tts/piper",
 "server/services/tts/playht",
+ "server/services/tts/resemble",
 "server/services/tts/rime",
 "server/services/tts/sarvam",
 "server/services/tts/speechmatics",
diff --git a/guides/features/gemini-live.mdx b/guides/features/gemini-live.mdx
index 8385f7a3..3f3c7919 100644
--- a/guides/features/gemini-live.mdx
+++ b/guides/features/gemini-live.mdx
@@ -3,12 +3,6 @@ title: "Building with Gemini Live"
 description: "Create real-time voice AI agents using Google's Gemini Live API and Pipecat"
 ---
-
- Gemini 3 SuperHack 🏈
-
- Check out our [Gemini Live demo](https://github.com/pipecat-ai/gemini-live-web-starter) showing how to build with a web client using vision capabilities.
-
-
 Gemini Live is Google's speech-to-speech API that enables natural, real-time voice conversations with AI. With Pipecat, you can build production-ready voice agents that leverage Gemini Live for telephony, web, and mobile applications.
diff --git a/server/services/supported-services.mdx b/server/services/supported-services.mdx
index 3e391d33..3dd83033 100644
--- a/server/services/supported-services.mdx
+++ b/server/services/supported-services.mdx
@@ -109,6 +109,7 @@ Text-to-Speech services receive text input and output audio streams or chunks.
 | [OpenAI](/server/services/tts/openai) | `pip install "pipecat-ai[openai]"` |
 | [Piper](/server/services/tts/piper) | No dependencies required |
 | [PlayHT](/server/services/tts/playht) | `pip install "pipecat-ai[playht]"` |
+| [Resemble](/server/services/tts/resemble) | `pip install "pipecat-ai[resemble]"` |
 | [Rime](/server/services/tts/rime) | `pip install "pipecat-ai[rime]"` |
 | [Sarvam](/server/services/tts/sarvam) | No dependencies required |
 | [Speechmatics](/server/services/tts/speechmatics) | `pip install "pipecat-ai[speechmatics]"` |
diff --git a/server/services/tts/resemble.mdx b/server/services/tts/resemble.mdx
new file mode 100644
index 00000000..58c6f9f4
--- /dev/null
+++ b/server/services/tts/resemble.mdx
@@ -0,0 +1,61 @@
+---
+title: "Resemble AI"
+description: "Text-to-speech service using Resemble AI's WebSocket streaming API with word-level timing"
+---
+
+## Overview
+
+`ResembleAITTSService` provides high-quality text-to-speech synthesis using Resemble AI's streaming WebSocket API with word-level timestamps and audio context management for handling multiple simultaneous synthesis requests with proper interruption support.
+
+
+
+ Pipecat's API methods for Resemble AI TTS integration
+
+
+ Complete example with interruption handling
+
+
+ Official Resemble AI API documentation
+
+
+ Sign up for a Resemble AI account
+
+
+
+## Installation
+
+To use Resemble AI services, install the required dependencies:
+
+```bash
+pip install "pipecat-ai[resemble]"
+```
+
+## Prerequisites
+
+### Resemble AI Account Setup
+
+Before using Resemble AI TTS services, you need:
+
+1. **Resemble AI Account**: Sign up at [Resemble AI](https://app.resemble.ai)
+2. **API Key**: Generate an API key from your [account settings](https://app.resemble.ai/account/api)
+3. **Voice Selection**: Choose or create voice UUIDs from your [voice library](https://app.resemble.ai/hub/voices)
+
+### Required Environment Variables
+
+- `RESEMBLE_API_KEY`: Your Resemble AI API key for authentication
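
Reviewer note: the new page names `RESEMBLE_API_KEY` but the portion of the diff shown here does not show how an application would read it. A minimal sketch of one way to do that, assuming plain Python and nothing from Pipecat itself; `load_resemble_api_key` is a hypothetical helper introduced only for illustration and is not part of the Pipecat or Resemble AI APIs:

```python
import os


def load_resemble_api_key(env=None):
    """Return the Resemble AI API key, or raise if it is missing.

    `env` defaults to the process environment; passing a plain dict
    makes the helper easy to exercise without touching os.environ.
    """
    env = os.environ if env is None else env
    # Treat an unset variable and a whitespace-only value the same way.
    key = env.get("RESEMBLE_API_KEY", "").strip()
    if not key:
        raise RuntimeError("RESEMBLE_API_KEY is not set")
    return key
```

Failing at startup when the variable is missing tends to give a clearer error than a rejected WebSocket connection later on. How the key is then passed to `ResembleAITTSService` is not shown in this part of the diff, so that step is deliberately left out.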