A sophisticated AI-powered voice assistant that handles incoming phone calls and provides a web-based chat interface. Built with TypeScript, Fastify, and OpenAI's Realtime API, this system integrates with Twilio for telephony services and includes advanced features like memory management, calendar integration, and webhook-based data processing.
- Real-time Voice Processing: Handles incoming Twilio calls with OpenAI's Realtime API
- Intelligent Chat Interface: Web-based chat system for testing and demonstration
- Multi-modal Communication: Supports both voice and text interactions
- Memory Management: Persistent memory system for customer context and preferences
- Calendar Integration: Automated appointment scheduling and availability checking
- Webhook Integration: Extensible data processing through external webhook services
- Speech-to-Text: Real-time audio transcription using Whisper
- Natural Language Processing: GPT-4-powered conversation handling
- Tool-based Architecture: Modular function calling system for extensibility
- Session Management: Comprehensive call session tracking and logging
- Error Handling: Robust error management with graceful fallbacks
- Rate Limiting: Built-in protection against abuse
src/
├── agent/ # AI agent configuration and tools
├── call/ # Twilio call handling and OpenAI integration
├── config/ # Application configuration
├── data-sources/ # Database and data layer
├── providers/ # External service providers (Twilio, OpenAI)
├── services/ # Business logic services
├── utils/ # Utility functions and helpers
└── server.ts # Main application entry point
- Agent System: Configurable AI agent with tool-based architecture
- Call Management: Twilio WebSocket integration for real-time audio
- Memory System: Persistent storage for customer data and preferences
- Webhook Service: External API integration for data processing
- Chat Service: Web-based interface for testing and demonstration
- Runtime: Node.js 20+ with TypeScript
- Framework: Fastify with WebSocket support
- AI/ML: OpenAI GPT-4 and Realtime API
- Telephony: Twilio Voice API
- Database: TypeORM with PostgreSQL support
- Validation: Zod for schema validation
- Styling: Tailwind CSS for web interface
- Development: tsx, Prettier, ESLint
- Node.js 20 or higher
- PostgreSQL database (optional, for data persistence)
- Twilio account with Voice API access
- OpenAI API key with Realtime API access
- Clone the repository:
  git clone <repository-url>
  cd AI-voice-assistant
- Install dependencies:
  npm install
- Environment configuration: create a .env file in the root directory:
  # OpenAI Configuration
  OPENAI_API_KEY=your_openai_api_key
  # Twilio Configuration
  TWILIO_ACCOUNT_SID=your_twilio_account_sid
  TWILIO_AUTH_TOKEN=your_twilio_auth_token
  # Webhook Configuration
  WEBHOOK_URL=your_webhook_endpoint_url
  WEBHOOK_TOKEN=your_webhook_authentication_token
  # Database Configuration (optional)
  DATABASE_URL=postgresql://username:password@localhost:5432/database_name
  # Application Configuration
  PORT=3000
  NODE_ENV=development
- Start the development server:
  npm run dev
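The environment variables listed for the .env file can be checked once at startup so that a missing key fails fast instead of surfacing mid-call. The sketch below is dependency-free and hypothetical: `AppConfig`, `requireVar`, and `loadConfig` are illustrative names, not the project's own code, which lists Zod for validation and may do this differently. Treating every variable except DATABASE_URL as required is also an assumption.

```typescript
// Hypothetical startup check for the .env variables shown in the setup steps.
// `AppConfig` and `loadConfig` are illustrative names, not the project's API.
interface AppConfig {
  openaiApiKey: string;
  twilioAccountSid: string;
  twilioAuthToken: string;
  webhookUrl: string;
  webhookToken: string;
  databaseUrl: string | undefined; // optional, per the prerequisites
  port: number;
  nodeEnv: string;
}

type Env = Record<string, string | undefined>;

// Return the variable's value, or throw if it is unset or blank.
function requireVar(env: Env, name: string): string {
  const value = env[name];
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

function loadConfig(env: Env): AppConfig {
  // PORT falls back to 3000, the value used in the sample .env file.
  const port = Number(env["PORT"] ?? "3000");
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`PORT must be an integer between 1 and 65535, got: ${env["PORT"]}`);
  }
  return {
    openaiApiKey: requireVar(env, "OPENAI_API_KEY"),
    twilioAccountSid: requireVar(env, "TWILIO_ACCOUNT_SID"),
    twilioAuthToken: requireVar(env, "TWILIO_AUTH_TOKEN"),
    webhookUrl: requireVar(env, "WEBHOOK_URL"),
    webhookToken: requireVar(env, "WEBHOOK_TOKEN"),
    databaseUrl: env["DATABASE_URL"],
    port,
    nodeEnv: env["NODE_ENV"] ?? "development",
  };
}
```

At startup this would be called as `loadConfig(process.env)`; passing the environment in as an argument keeps the function easy to test.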
- Configure your Twilio phone number to point to your server's /incoming-call endpoint
- The AI agent will automatically handle incoming calls
- Customers can speak naturally, and the system will process their requests
- The agent can schedule appointments, check availability, and manage customer data
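When Twilio requests /incoming-call, it expects a TwiML document in response. One common way to hand the call audio to a WebSocket is TwiML's `<Connect><Stream>` verb, pointed here at the /media-stream endpoint. The sketch below only builds that XML; `buildIncomingCallTwiml` and the host name are hypothetical, and whether the project responds with exactly this document is an assumption.

```typescript
// Escape characters that are not allowed inside an XML attribute value.
function escapeXml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Build TwiML that asks Twilio to open a bidirectional media stream
// to this server's /media-stream WebSocket endpoint.
// `publicHost` is the externally reachable host name (an assumption:
// for local development this is typically a tunnel address).
function buildIncomingCallTwiml(publicHost: string): string {
  const streamUrl = `wss://${publicHost}/media-stream`;
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    "<Response>",
    "  <Connect>",
    `    <Stream url="${escapeXml(streamUrl)}" />`,
    "  </Connect>",
    "</Response>",
  ].join("\n");
}
```

A route handler would return this string with a `text/xml` content type.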
- Navigate to http://localhost:3000/chat in your web browser
- Use the web interface to test the AI agent's capabilities
- Try suggested prompts or type your own messages
- The interface shows tool calls, system messages, and conversation history
- Memory Management: Store and retrieve customer information
- Calendar Operations: Check availability and schedule appointments
- Web Scraping: Extract information from websites
- Call Management: End calls and manage call sessions
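As an illustration of what a calendar tool has to decide, an availability check reduces to an interval-overlap test against existing bookings. This is a minimal sketch under stated assumptions: `TimeSlot`, `overlaps`, and `isSlotAvailable` are hypothetical names, and the document does not describe the real calendar backend.

```typescript
// A booking or a requested appointment, treated as the half-open interval [start, end).
interface TimeSlot {
  start: Date;
  end: Date;
}

// Two half-open intervals overlap when each one starts before the other ends.
function overlaps(a: TimeSlot, b: TimeSlot): boolean {
  return a.start.getTime() < b.end.getTime() && b.start.getTime() < a.end.getTime();
}

// A requested slot is available when it collides with no existing booking.
function isSlotAvailable(requested: TimeSlot, booked: TimeSlot[]): boolean {
  if (requested.end.getTime() <= requested.start.getTime()) {
    throw new RangeError("Requested slot must end after it starts");
  }
  return !booked.some((slot) => overlaps(requested, slot));
}
```

Because the intervals are half-open, an appointment that starts exactly when another ends does not count as a conflict.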
npm run dev # Start development server with hot reload
npm run start # Start production server
npm run build # Compile TypeScript (if needed)
npm run format # Format code with Prettier
npm run reset # Clean install dependencies
- Agent Configuration: Define AI behavior and available tools in src/agent/
- Call Handling: Implement call logic in src/call/
- Service Layer: Add business logic in src/services/
- Utilities: Common functions in src/utils/
- Define tool schema in src/agent/tools.ts
- Implement tool logic in appropriate service
- Register tool in the agent configuration
- Test using the chat interface
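The four steps above can be sketched as a small registry: a tool carries a schema and a handler, is registered once, and is dispatched by name when the model requests it. Everything here is hypothetical, including `ToolDefinition`, `ToolRegistry`, and the `remember_customer_note` tool; the actual types in src/agent/tools.ts may look quite different.

```typescript
// Hypothetical shapes; the project's real tool types may differ.
type ToolArgs = Record<string, unknown>;

interface ToolDefinition {
  name: string;
  description: string;
  // JSON-Schema-like description of the arguments, in the style used for function calling.
  parameters: {
    type: "object";
    properties: Record<string, { type: string; description?: string }>;
    required: string[];
  };
  handler: (args: ToolArgs) => Promise<string>;
}

class ToolRegistry {
  private readonly tools = new Map<string, ToolDefinition>();

  register(tool: ToolDefinition): void {
    if (this.tools.has(tool.name)) {
      throw new Error(`Tool already registered: ${tool.name}`);
    }
    this.tools.set(tool.name, tool);
  }

  // Look up the requested tool, check required arguments, and run its handler.
  async dispatch(name: string, rawArgs: string): Promise<string> {
    const tool = this.tools.get(name);
    if (!tool) throw new Error(`Unknown tool: ${name}`);
    const parsed: unknown = JSON.parse(rawArgs);
    if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
      throw new Error(`Arguments for ${name} must be a JSON object`);
    }
    const args = parsed as ToolArgs;
    for (const key of tool.parameters.required) {
      if (!(key in args)) throw new Error(`Missing argument "${key}" for tool ${name}`);
    }
    return tool.handler(args);
  }
}

// Example tool: keep a note about a caller in an in-memory map
// (a stand-in for whatever persistent memory service the project uses).
const customerNotes = new Map<string, string>();

const rememberCustomerNote: ToolDefinition = {
  name: "remember_customer_note",
  description: "Store a short note about the current customer",
  parameters: {
    type: "object",
    properties: {
      phone: { type: "string", description: "Customer phone number" },
      note: { type: "string", description: "Text to remember" },
    },
    required: ["phone", "note"],
  },
  handler: async (args) => {
    const phone = String(args["phone"]);
    customerNotes.set(phone, String(args["note"]));
    return `Saved note for ${phone}`;
  },
};

const registry = new ToolRegistry();
registry.register(rememberCustomerNote);
```

Keeping the schema and the handler in one object means a tool cannot be advertised to the model without an implementation behind it.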
- GET / - Health check endpoint
- POST /incoming-call - Twilio webhook for incoming calls
- GET /media-stream - WebSocket endpoint for real-time audio
- GET /chat - Serve chat interface
- POST /chat - Handle chat messages
- Environment variables for sensitive configuration
- Rate limiting on API endpoints
- Input validation using Zod schemas
- Secure webhook authentication
- Error handling without information leakage
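One way to approach secure webhook authentication is to compare a presented token against WEBHOOK_TOKEN in constant time, so that response timing does not reveal how much of the token matched. The sketch below uses Node's built-in `crypto.timingSafeEqual`. How the project actually transmits the token is not stated here, so the `Authorization: Bearer` header and the function names are assumptions.

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

// Compare a presented token with the expected one in constant time.
// Hashing both values first yields equal-length buffers, which timingSafeEqual requires.
function isValidWebhookToken(presented: string | undefined, expected: string): boolean {
  if (!presented) return false;
  const presentedDigest = createHash("sha256").update(presented).digest();
  const expectedDigest = createHash("sha256").update(expected).digest();
  return timingSafeEqual(presentedDigest, expectedDigest);
}

// Extract the token from an "Authorization: Bearer <token>" header value, if present.
// (Assumption: the token travels as a bearer token.)
function extractBearerToken(header: string | undefined): string | undefined {
  const match = /^Bearer\s+(.+)$/i.exec(header ?? "");
  return match?.[1];
}
```

A failed check should produce a generic 401 response with no detail about why the token was rejected, in line with the point about avoiding information leakage.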
This project is licensed under the ISC License.
Contributions are welcome! Please feel free to submit a Pull Request.
This project is designed for demonstration and development purposes. For production deployment, consider:
- Implementing proper authentication and authorization
- Adding comprehensive logging and monitoring
- Setting up proper error tracking and alerting
- Ensuring compliance with telephony regulations
- Implementing proper data privacy measures
- Adding comprehensive testing coverage