AI Voice Agent - proPAL AI - Assignment

A real-time voice interaction system built with LiveKit that combines Speech-to-Text, Large Language Model, and Text-to-Speech capabilities to create an interactive voice agent.

Features

Speech-to-Text (STT) using OpenAI's Whisper
Large Language Model (LLM) integration with Groq (Llama3-70B model)
Text-to-Speech (TTS) using ElevenLabs
Real-time streaming support via LiveKit
Comprehensive metrics tracking and logging to Excel
Multi-language support

Quick Start

1. Prerequisites

Python 3.8 or higher
Virtual environment (recommended)

2. Installation

# Clone or download the project files
git clone https://github.com/allwin107/AI-Voice-Agent.git
cd AI-Voice-Agent

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

Requirements

Install the required dependencies:

pip install -r requirements.txt

Configuration

Create a .env file with your API keys:

GROQ_API_KEY=your_groq_api_key
ELEVENLABS_API_KEY=your_elevenlabs_api_key

# LiveKit Configuration
LIVEKIT_URL=wss://your-project.livekit.cloud
LIVEKIT_API_KEY=your_api_key
LIVEKIT_API_SECRET=your_api_secret
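
As a quick sanity check before starting the agent, a small script can confirm that the variables listed above are set. This is a minimal sketch using only the standard library. The helper name `missing_keys` is hypothetical and not part of the project, and the sketch assumes the values have already been loaded into the process environment (for example by a .env loader).

```python
import os

# Variable names copied from the .env example in this section.
REQUIRED_KEYS = [
    "GROQ_API_KEY",
    "ELEVENLABS_API_KEY",
    "LIVEKIT_URL",
    "LIVEKIT_API_KEY",
    "LIVEKIT_API_SECRET",
]


def missing_keys(env=None):
    """Return the required variable names that are unset or blank.

    Hypothetical helper for illustration; `env` defaults to os.environ.
    """
    env = os.environ if env is None else env
    return [key for key in REQUIRED_KEYS if not env.get(key, "").strip()]


if __name__ == "__main__":
    missing = missing_keys()
    if missing:
        print("Missing configuration:", ", ".join(missing))
    else:
        print("All required variables are set.")
```

Passing a plain dictionary instead of the real environment makes the check easy to exercise without touching actual credentials.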

Getting API Keys

Groq (free LLM): console.groq.com - Fast LLM inference
ElevenLabs: elevenlabs.io - Text-to-Speech
LiveKit: LiveKit Cloud - Real-time communication

Project Structure

app/pipeline/ - Core pipeline components
- stt.py - Speech-to-Text using Whisper
- llm.py - Language model integration using Groq
- tts.py - Text-to-Speech using ElevenLabs
- voice_agent.py - Main voice agent pipeline
- livekit_backend.py - LiveKit integration
app/test/ - Test scripts
- test_stt.py - Tests Speech-to-Text (STT) transcription
- test_llm.py - Tests LLM response generation
- test_tts.py - Tests Text-to-Speech synthesis
- test_agent.py - Tests the full voice agent pipeline
- test_audio - Test .wav audio file
app/config.py - Configuration settings for the application
.env - Environment Variables
README.md
requirements.txt

Usage

Running Tests

Test individual components:

python app/test/test_stt.py
python app/test/test_llm.py
python app/test/test_tts.py
python app/test/test_agent.py

Running the Voice Agent

python app/pipeline/voice_agent.py

Test Your Agent with LiveKit

1. Prerequisites Check

Make sure you’ve done this:

Activated a LiveKit Cloud instance
Added the following values to .env (the Configuration section names the first one LIVEKIT_URL; use whichever name the code reads):

LIVEKIT_WS_URL=wss://.livekit.cloud
LIVEKIT_API_KEY=...
LIVEKIT_API_SECRET=...

2. Start Your Voice Agent Locally

Run your livekit_backend.py script from terminal:

python app/pipeline/livekit_backend.py

If working correctly, logs will say:

Connected to room your-livekit-room as your-participant-name

This means the agent is live and ready to receive audio.

3. Join the Same Room as a Human User

Use the LiveKit Agent Playground: https://agent.livekit.io

The Playground lets you join the room as the other participant, which is required for testing.

Steps:

Go to the Playground URL
Input the same Room Name (your-livekit-room)
Enter your LiveKit credentials (API Key and API Secret)
Click Join Room

🎙️ Now when you speak, your local VoiceAgentBot should:

Detect your voice
Transcribe it
Send it to the LLM
Reply with audio in real time
Log metrics
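
The transcribe, LLM, and reply steps above can be sketched as a single conversational turn. This is an illustrative outline only, not the project's actual code: `run_turn` and its parameter names are hypothetical, the stages are passed in as plain callables so the example runs without any API keys, and it handles one complete utterance at a time rather than streaming audio through LiveKit. Voice detection is left out.

```python
import time
from typing import Callable, Dict


def run_turn(
    audio: bytes,
    transcribe: Callable[[bytes], str],
    generate: Callable[[str], str],
    synthesize: Callable[[str], bytes],
) -> Dict[str, object]:
    """Run one STT -> LLM -> TTS turn and record how long each stage took.

    Hypothetical helper for illustration; the real pipeline may differ.
    """
    timings: Dict[str, float] = {}

    start = time.perf_counter()
    transcript = transcribe(audio)          # Speech-to-Text
    timings["stt_seconds"] = time.perf_counter() - start

    start = time.perf_counter()
    reply = generate(transcript)            # LLM response
    timings["llm_seconds"] = time.perf_counter() - start

    start = time.perf_counter()
    speech = synthesize(reply)              # Text-to-Speech
    timings["tts_seconds"] = time.perf_counter() - start

    return {
        "transcript": transcript,
        "reply": reply,
        "audio": speech,
        "timings": timings,
    }


if __name__ == "__main__":
    # Stand-in stages; a real agent would call Whisper, Groq, and ElevenLabs.
    result = run_turn(
        b"\x00\x01",
        transcribe=lambda audio: "hello",
        generate=lambda text: f"You said: {text}",
        synthesize=lambda text: text.encode("utf-8"),
    )
    print(result["reply"])
    print(result["timings"])
```

Injecting the stages as callables keeps the turn logic testable with stubs, and the recorded timings are the kind of values the metrics step would log.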

Performance Metrics

The system tracks several key metrics:

EOU (End of Utterance) Delay
TTFT (Time to First Token)
TTFB (Time to First Byte)
Total Latency
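
This README does not state how Total Latency is derived. The sketch below assumes it is the sum of the other three delays, and that TTFT refers to the LLM and TTFB to the first audio byte from TTS. Both are assumptions about this project, and `TurnMetrics` is a hypothetical container used only for illustration.

```python
from dataclasses import dataclass


@dataclass
class TurnMetrics:
    """Per-turn delays in seconds (hypothetical, not from the project)."""

    eou_delay: float  # user stops speaking -> turn treated as finished
    ttft: float       # assumed: LLM request -> first token
    ttfb: float       # assumed: TTS request -> first audio byte

    @property
    def total_latency(self) -> float:
        # Assumption: total latency is the sum of the three stage delays.
        return self.eou_delay + self.ttft + self.ttfb


if __name__ == "__main__":
    metrics = TurnMetrics(eou_delay=0.5, ttft=0.25, ttfb=0.125)
    print(f"Total latency: {metrics.total_latency:.3f} s")
```

Keeping the three delays separate makes it easier to see which stage dominates the response time before trying to tune any one of them.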

Future Improvements

Smarter language detection
Improved end-of-utterance (EOU) timing
Web or mobile interface integration

📜 License

This project was created for the proPAL AI Backend Engineering Internship assignment.

Built with ❤️ for proPAL
