Agent Voice Response with ElevenLabs TTS Integration

This repository demonstrates how to integrate Agent Voice Response (AVR) with the ElevenLabs Text-to-Speech (TTS) API to synthesize speech in real time, in an audio format suitable for telephony applications. The project is built with Node.js and uses ElevenLabs for high-quality voice generation.

Features

- Real-time Text-to-Speech (TTS): Convert text to natural-sounding speech using the ElevenLabs API.
- Streaming Audio: The audio response is streamed back to the client as it is generated, using Node.js streams, which keeps voice-response latency low.

Prerequisites

Before you begin, ensure you have the following:

- Node.js and npm installed.
- An ElevenLabs API key and a voice ID.

Installation

Clone the repository:

git clone https://github.com/agentvoiceresponse/avr-tts-elevenlabs.git
cd avr-tts-elevenlabs

Install dependencies:
```
npm install
```

Create a .env file in the root directory and add your ElevenLabs API key and voice ID:

ELEVENLABS_API_KEY=your_elevenlabs_api_key
ELEVENLABS_VOICE_ID=your_elevenlabs_voice_id
PORT=6007
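As a rough illustration of how these three settings might be consumed, the sketch below reads them from an environment-like object and falls back to port 6007. The function name `loadConfig` is hypothetical and not taken from the repository; how the `.env` file is actually loaded (for example with a package such as dotenv) is an assumption and is not shown here.

```javascript
// Hypothetical helper (not from the repository): reads the settings
// described above from an environment-like object.
function loadConfig(env = process.env) {
  const apiKey = env.ELEVENLABS_API_KEY;
  const voiceId = env.ELEVENLABS_VOICE_ID;
  if (!apiKey || !voiceId) {
    throw new Error('ELEVENLABS_API_KEY and ELEVENLABS_VOICE_ID must be set');
  }
  // Fall back to the documented default when PORT is absent or not a number.
  const port = Number.parseInt(env.PORT ?? '', 10);
  return { apiKey, voiceId, port: Number.isInteger(port) ? port : 6007 };
}
```

Failing fast on a missing key or voice ID surfaces configuration mistakes at startup rather than on the first request.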

Usage

To start the application:

npm start

The application will listen on the port specified in the .env file (default is 6007).

API Endpoint

`POST /text-to-speech-stream`

This endpoint accepts a JSON payload containing the text to be converted into speech. The audio is streamed back in WAV format.

Request Body:

{
  "text": "Hello, how can I assist you today?"
}

Response: The server streams the audio as audio/wav with the following characteristics:
- Mono channel
- 8kHz sample rate
- 16-bit linear PCM
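
To check that a saved response really has these characteristics, one option is to inspect the WAV header. The following is a minimal sketch that assumes the canonical 44-byte layout (the `fmt ` chunk immediately after the RIFF header); files with extra chunks would need a proper chunk walk. The function names are hypothetical.

```javascript
// Hypothetical helper: parses the canonical 44-byte PCM WAV header.
// Assumes the "fmt " chunk starts at byte 12, which is common but not guaranteed.
function parseWavHeader(buf) {
  if (buf.length < 44) throw new Error('header too short');
  if (buf.toString('ascii', 0, 4) !== 'RIFF' || buf.toString('ascii', 8, 12) !== 'WAVE') {
    throw new Error('not a RIFF/WAVE file');
  }
  return {
    audioFormat: buf.readUInt16LE(20), // 1 means linear PCM
    channels: buf.readUInt16LE(22),
    sampleRate: buf.readUInt32LE(24),
    bitsPerSample: buf.readUInt16LE(34),
  };
}

// True when the header describes mono, 8 kHz, 16-bit linear PCM.
function isTelephonyPcm(h) {
  return h.audioFormat === 1 && h.channels === 1 &&
    h.sampleRate === 8000 && h.bitsPerSample === 16;
}
```

For example, `isTelephonyPcm(parseWavHeader(require('node:fs').readFileSync('response.wav')))` should return `true` for a response matching the list above. Because the audio is streamed, the size fields in the header may be placeholders, so this sketch deliberately ignores them.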

Example Request

curl -X POST http://localhost:6007/text-to-speech-stream \
     -H "Content-Type: application/json" \
     -d '{"text":"Hello, this is a real-time voice response!"}' \
     --output response.wav
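
The same request can be made from Node.js. This is a sketch rather than code from the repository: the names `buildSpeechRequest` and `saveSpeech` are hypothetical, the base URL assumes the default port 6007, and the global `fetch` requires Node.js 18 or later.

```javascript
const fs = require('node:fs');
const { Readable } = require('node:stream');
const { pipeline } = require('node:stream/promises');

// Hypothetical helper: builds the same request the curl command sends.
function buildSpeechRequest(text, baseUrl = 'http://localhost:6007') {
  return {
    url: new URL('/text-to-speech-stream', baseUrl).href,
    init: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ text }),
    },
  };
}

// Writes the audio to disk chunk by chunk as it arrives,
// instead of buffering the whole response in memory.
async function saveSpeech(text, outPath, baseUrl) {
  const { url, init } = buildSpeechRequest(text, baseUrl);
  const res = await fetch(url, init);
  if (!res.ok) throw new Error(`request failed with status ${res.status}`);
  await pipeline(Readable.fromWeb(res.body), fs.createWriteStream(outPath));
}
```

With the server running, `saveSpeech('Hello, this is a real-time voice response!', 'response.wav')` should produce the same file as the curl command.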

How It Works

The application receives a text string through an HTTP POST request.
It sends this text to ElevenLabs' API to synthesize the voice.
The audio response is streamed back to the client.

Code Breakdown

ElevenLabs API Call: The text is sent to the ElevenLabs API to generate speech using the provided voice ID. The request includes parameters like voice settings (stability, similarity boost, etc.).
Real-time Streaming: The audio is streamed back to the client in real-time.
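
For orientation, here is one way the upstream request could be assembled against the ElevenLabs streaming endpoint. It is a sketch under stated assumptions, not the repository's implementation, which may use the official SDK instead. The helper name, the model ID, and the voice-setting values are illustrative, and the parameter that selects 8 kHz PCM output is omitted.

```javascript
// Hypothetical helper: assembles a streaming text-to-speech request.
// Model ID and voice settings are illustrative assumptions.
function buildElevenLabsRequest({ apiKey, voiceId, text, modelId = 'eleven_multilingual_v2' }) {
  const url =
    `https://api.elevenlabs.io/v1/text-to-speech/${encodeURIComponent(voiceId)}/stream`;
  return {
    url,
    init: {
      method: 'POST',
      headers: { 'xi-api-key': apiKey, 'Content-Type': 'application/json' },
      body: JSON.stringify({
        text,
        model_id: modelId,
        voice_settings: { stability: 0.5, similarity_boost: 0.75 },
      }),
    },
  };
}
```

Passing the result to `fetch(url, init)` yields a response whose body is a stream, which can be piped to the HTTP client as chunks arrive rather than after synthesis completes.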

Error Handling

The application includes basic error handling:

- Missing text in the request body results in a 400 Bad Request response.
- Issues with the ElevenLabs API result in a 500 Internal Server Error response.
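
The two cases above can be sketched as a small, framework-independent function. The names `validateBody` and `handleSpeech`, the error messages, and the injected `synthesize` callback are all hypothetical; the actual server code may structure this differently.

```javascript
// Hypothetical helper: returns an error descriptor, or null when the body is valid.
function validateBody(body) {
  const text = body && body.text;
  if (typeof text !== 'string' || text.trim() === '') {
    return { status: 400, message: 'Text is required' };
  }
  return null;
}

// Maps the outcome of a request to an HTTP status.
// `synthesize` stands in for the call to ElevenLabs and is injected by the caller.
async function handleSpeech(body, synthesize) {
  const invalid = validateBody(body);
  if (invalid) return invalid;
  try {
    return { status: 200, audio: await synthesize(body.text) };
  } catch {
    // Any upstream failure is reported as a generic server error.
    return { status: 500, message: 'Error communicating with ElevenLabs' };
  }
}
```

Keeping validation separate from the upstream call means a malformed request is rejected without spending an ElevenLabs API call.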

Support & Community

GitHub: https://github.com/agentvoiceresponse - Report issues, contribute code.
Discord: https://discord.gg/DFTU69Hg74 - Join the community discussion.
Docker Hub: https://hub.docker.com/u/agentvoiceresponse - Find Docker images.
NPM: https://www.npmjs.com/~agentvoiceresponse - Browse our packages.
Wiki: https://wiki.agentvoiceresponse.com/en/home - Project documentation and guides.

Support AVR

AVR is free and open-source. Any support is entirely voluntary and intended as a personal gesture of appreciation. Donations do not provide access to features, services, or special benefits, and the project remains fully available regardless of donations.

License

MIT License - see the LICENSE file for details.
