🤖 VISU-Reborn

Python 3.8+ LiveKit License: MIT PRs Welcome

An emotionally intelligent voice assistant with real-time emotion detection and visual feedback

VISU is an advanced voice assistant that combines conversational AI with emotional intelligence. It detects user emotions, responds empathetically, and provides real-time visual feedback through a modern web interface.

✨ Features

🎭 Emotional Intelligence

  • Real-time emotion detection - Analyzes user sentiment and responds appropriately
  • Empathetic responses - Matches user's emotional state (sad → empathetic, happy → excited)
  • 8 emotion types - Happy, curious, empathetic, neutral, excited, concerned, supportive, playful
  • Visual feedback - Live emotion display with colors and animations
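The eight emotion states and their visual feedback can be modeled as a simple lookup table. The emotion names below come from the list above; the colors and emoji are illustrative assumptions, not the project's actual values:

```python
# Illustrative mapping of VISU's eight emotion states to display styling.
# Emotion names match the feature list; hex colors and emoji are examples only.
EMOTION_STYLES = {
    "happy":      {"color": "#FFD700", "emoji": "😊"},
    "curious":    {"color": "#1E90FF", "emoji": "🤔"},
    "empathetic": {"color": "#9370DB", "emoji": "🤗"},
    "neutral":    {"color": "#D3D3D3", "emoji": "😐"},
    "excited":    {"color": "#FF4500", "emoji": "🤩"},
    "concerned":  {"color": "#FFA500", "emoji": "😟"},
    "supportive": {"color": "#2E8B57", "emoji": "🫂"},
    "playful":    {"color": "#FF69B4", "emoji": "😜"},
}

def style_for(emotion: str) -> dict:
    """Return display styling, falling back to neutral for unknown emotions."""
    return EMOTION_STYLES.get(emotion, EMOTION_STYLES["neutral"])
```

Falling back to neutral keeps the display stable if the detector ever emits an unexpected label.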

πŸ—£οΈ Voice Capabilities

  • Natural conversation - Powered by advanced language models (GPT/Cerebras)
  • High-quality TTS - Cartesia voice synthesis
  • Accurate STT - Deepgram speech recognition
  • Voice activity detection - Silero VAD for seamless interaction
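Conceptually, each conversational turn flows STT → LLM → TTS, with VAD deciding when the user has finished speaking. The sketch below stubs each stage with a placeholder function purely to show the composition; the real agent wires Deepgram, GPT/Cerebras, Cartesia, and Silero together through LiveKit rather than these hypothetical stubs:

```python
# Hypothetical stand-ins for the real providers, to illustrate one voice turn.
def transcribe(audio: bytes) -> str:         # Deepgram STT in the real agent
    return audio.decode("utf-8")             # pretend the audio is already text

def generate_reply(transcript: str) -> str:  # GPT/Cerebras LLM in the real agent
    return f"You said: {transcript}"

def synthesize(text: str) -> bytes:          # Cartesia TTS in the real agent
    return text.encode("utf-8")

def voice_turn(audio: bytes) -> bytes:
    """One user turn: speech in, spoken reply out."""
    transcript = transcribe(audio)
    reply = generate_reply(transcript)
    return synthesize(reply)
```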

🌐 Real-time Frontend

  • Live emotion display - WebSocket-powered real-time updates
  • Modern UI - Glassmorphism design with smooth animations
  • Responsive design - Works on desktop and mobile
  • Connection resilience - Auto-reconnection and error handling
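Auto-reconnection usually means retrying with exponential backoff so a flapping server isn't hammered. The actual reconnect logic lives in the shipped frontend code; this standalone helper just illustrates the schedule:

```python
def backoff_delays(base: float = 1.0, cap: float = 30.0, attempts: int = 6) -> list:
    """Exponential backoff schedule: base * 2^n seconds, clamped to a cap."""
    return [min(base * (2 ** n), cap) for n in range(attempts)]
```

For the defaults this yields delays of 1, 2, 4, 8, 16, then 30 seconds, after which a client would typically keep retrying at the cap.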

🔧 Architecture

  • Modular design - Separate agent, context, and frontend components
  • Configuration management - Environment-based settings
  • Error handling - Robust fallback mechanisms
  • Extensible - Easy to add new emotions, tools, and features

🚀 Quick Start

Prerequisites

  • Python 3.8+
  • Node.js (optional, for advanced frontend development)
  • API keys for:
    • OpenAI/Cerebras (LLM)
    • Deepgram (Speech-to-Text)
    • Cartesia (Text-to-Speech)
    • LiveKit (Real-time communication)

Installation

  1. Clone the repository

    git clone https://github.com/AbhiramVSA/VISU-Reborn.git
    cd VISU-Reborn
  2. Set up Python environment

    # Using uv (recommended)
    uv sync
    
    # Or using pip
    pip install -r requirements.txt
  3. Configure environment variables

    cp .env.example .env
    # Edit .env with your API keys

    Required environment variables:

    OPENAI_API_KEY=your_openai_key
    DEEPGRAM_API_KEY=your_deepgram_key
    CARTESIA_API_KEY=your_cartesia_key
    LIVEKIT_API_KEY=your_livekit_key
    LIVEKIT_API_SECRET=your_livekit_secret
    LIVEKIT_URL=your_livekit_url
  4. Start the emotion frontend

    cd frontend
    python server.py
  5. Run the voice assistant

    # In a new terminal
    uv run main.py
  6. Open the emotion display

    • Visit http://localhost:8000 in your browser
    • You'll see real-time emotion updates as you interact with VISU
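Since all six environment variables are required, a startup check can fail fast with a clear message instead of a cryptic provider error mid-session. This helper is a sketch, not part of the shipped config module:

```python
import os

# The six variables listed in the installation step above.
REQUIRED_VARS = [
    "OPENAI_API_KEY", "DEEPGRAM_API_KEY", "CARTESIA_API_KEY",
    "LIVEKIT_API_KEY", "LIVEKIT_API_SECRET", "LIVEKIT_URL",
]

def check_env(env=os.environ) -> list:
    """Return the names of required variables that are missing or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

Calling `check_env()` at the top of `main.py` and aborting if the result is non-empty would surface configuration mistakes before any provider is contacted.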

πŸ—οΈ Architecture

┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│   Voice Input   │───▶│   VISU Agent     │───▶│  Emotion API    │
│   (Microphone)  │    │  - LLM Processing│    │  (HTTP POST)    │
└─────────────────┘    │  - Emotion Det.  │    └─────────────────┘
                       │  - Response Gen. │              │
┌─────────────────┐    │  - TTS Output    │              ▼
│  Voice Output   │◀───│                  │    ┌─────────────────┐
│   (Speakers)    │    └──────────────────┘    │  Web Frontend   │
└─────────────────┘                            │  - Live Display │
                                               │  - WebSocket    │
                                               │  - Animations   │
                                               └─────────────────┘

Core Components

  • agent/visu.py - Main voice assistant with emotion detection
  • frontend/server.py - FastAPI server for emotion visualization
  • context/ - Context loading and management
  • config/ - Configuration and environment management
  • prompts/ - AI personality and behavior rules
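The Emotion API leg of the diagram is a plain HTTP POST from the agent to `frontend/server.py`. A minimal sketch of that client side, assuming a hypothetical `/emotion` route and illustrative JSON field names (the shipped server may use different ones):

```python
import json
import urllib.request

def build_emotion_payload(emotion: str, confidence: float) -> bytes:
    """Serialize an emotion update; field names here are illustrative."""
    return json.dumps({"emotion": emotion, "confidence": confidence}).encode("utf-8")

def post_emotion(emotion: str, confidence: float,
                 url: str = "http://localhost:8000/emotion") -> None:
    """POST one emotion update to the frontend server (route name assumed)."""
    req = urllib.request.Request(
        url,
        data=build_emotion_payload(emotion, confidence),
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req, timeout=2)  # caller is expected to handle errors
```

The frontend then fans each update out to connected browsers over its WebSocket.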

🎭 Emotion System

VISU recognizes and responds to user emotions with appropriate emotional states:

| User Emotion | VISU Response         | Visual Color   | Emoji |
|--------------|-----------------------|----------------|-------|
| Sad          | Empathetic, Concerned | Purple, Orange | 🤗 😟 |
| Happy        | Excited, Joyful       | Gold, Red      | 😊 🤩 |
| Angry        | Calming, Supportive   | Green          | 🫂    |
| Confused     | Patient, Helpful      | Blue           | 🤔    |
| Neutral      | Friendly              | Light Gray     | 😐    |
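This is essentially a lookup from the detected user emotion to VISU's response state. A direct transcription (values copied from the table; the dictionary structure and fallback are assumptions):

```python
# Detected user emotion -> (VISU response states, display colors).
RESPONSE_MAP = {
    "sad":      (["empathetic", "concerned"], ["purple", "orange"]),
    "happy":    (["excited", "joyful"],       ["gold", "red"]),
    "angry":    (["calming", "supportive"],   ["green"]),
    "confused": (["patient", "helpful"],      ["blue"]),
    "neutral":  (["friendly"],                ["light gray"]),
}

def respond_to(user_emotion: str) -> str:
    """Pick VISU's primary response state, defaulting to friendly/neutral."""
    states, _colors = RESPONSE_MAP.get(user_emotion, RESPONSE_MAP["neutral"])
    return states[0]
```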

🔧 Configuration

Environment Setup

Create a .env file in the root directory:

# Required API Keys
OPENAI_API_KEY=sk-...
DEEPGRAM_API_KEY=...
CARTESIA_API_KEY=sk_car_...
LIVEKIT_API_KEY=...
LIVEKIT_API_SECRET=...
LIVEKIT_URL=wss://...

Customizing VISU's Personality

Edit the prompt files to customize VISU's behavior:

  • prompts/prompt.txt - Core personality and speaking style
  • prompts/rules.txt - Operational rules and constraints

Adding Context

Place text files in the context/ directory to give VISU additional knowledge.
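A loader of this kind typically just concatenates every `.txt` file in the directory into the model's context. The real implementation lives in `context/context.py`; this standalone sketch shows the general idea:

```python
from pathlib import Path

def load_context(directory: str = "context") -> str:
    """Concatenate all .txt files in the context directory, sorted by name."""
    parts = []
    for path in sorted(Path(directory).glob("*.txt")):
        parts.append(f"# {path.name}\n{path.read_text(encoding='utf-8').strip()}")
    return "\n\n".join(parts)
```

Sorting by filename makes the assembled context deterministic across runs; an empty or missing directory simply yields an empty string.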

🧪 Development

Project Structure

VISU-Reborn/
├── agent/              # Voice assistant core
│   ├── visu.py         # Main agent implementation
│   └── __init__.py
├── frontend/           # Emotion visualization
│   ├── server.py       # FastAPI backend
│   ├── requirements.txt
│   └── README.md
├── context/            # Knowledge base
│   ├── context.py      # Context loader
│   └── *.txt           # Context files
├── config/             # Configuration
│   └── settings.py     # Environment management
├── prompts/            # AI personality
│   ├── prompt.txt      # Core personality
│   └── rules.txt       # Behavioral rules
├── main.py             # Entry point
├── pyproject.toml      # Python dependencies
└── .env                # Environment variables

Running Tests

# Run frontend tests
cd frontend
python -m pytest

# Run agent tests
python -m pytest tests/

Code Quality

# Format code
black .
isort .

# Lint code
flake8 .
pylint agent/ frontend/

🤝 Contributing

We welcome contributions! Here's how to get started:

🚀 Quick Contribution Guide

  1. Fork the repository on GitHub, then clone your fork

    git clone https://github.com/<your-username>/VISU-Reborn.git
    cd VISU-Reborn
  2. Create a feature branch

    git checkout -b feature/amazing-feature
  3. Make your changes

    • Follow the existing code style
    • Add tests for new features
    • Update documentation as needed
  4. Test your changes

    # Test the agent
    uv run main.py
    
    # Test the frontend
    cd frontend && python server.py
  5. Commit and push

    git commit -m "feat: add amazing feature"
    git push origin feature/amazing-feature
  6. Create a Pull Request

    • Use the PR template
    • Describe your changes clearly
    • Link related issues

🎯 Contribution Areas

🎭 Emotion System

  • Add new emotion types
  • Improve emotion detection accuracy
  • Create emotion transition animations

πŸ—£οΈ Voice Capabilities

  • Add support for different languages
  • Improve voice recognition accuracy
  • Add voice cloning features

🌐 Frontend Enhancements

  • Create mobile app version
  • Add emotion history graphs
  • Implement theme customization

🔧 Technical Improvements

  • Add comprehensive testing
  • Improve error handling
  • Optimize performance
  • Add Docker support

📚 Documentation

  • Write tutorials
  • Create API documentation
  • Add video demos

📋 Development Guidelines

Code Style

  • Use Black for Python formatting
  • Follow PEP 8 conventions
  • Write descriptive commit messages
  • Add docstrings to all functions

Testing

  • Write unit tests for new features
  • Test voice interaction manually
  • Verify frontend functionality
  • Check emotion detection accuracy
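For emotion detection accuracy, unit tests can pin down expected classifications for known phrases. The `detect_emotion` below is a trivial keyword stand-in (the real detector is LLM-driven), included only so the test shape is runnable:

```python
# Toy keyword classifier standing in for the real LLM-based detector.
def detect_emotion(text: str) -> str:
    lowered = text.lower()
    if any(w in lowered for w in ("sad", "upset", "miserable")):
        return "sad"
    if any(w in lowered for w in ("great", "awesome", "happy")):
        return "happy"
    return "neutral"

def test_detect_emotion():
    """Pytest-style checks: each phrase maps to the expected emotion label."""
    assert detect_emotion("I'm so sad today") == "sad"
    assert detect_emotion("This is awesome!") == "happy"
    assert detect_emotion("The sky is blue") == "neutral"
```

With the real detector substituted in, the same assertions become a regression suite for classification behavior.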

Documentation

  • Update README for new features
  • Add inline code comments
  • Write clear function docstrings
  • Update API documentation

πŸ› Bug Reports

Found a bug? Please create an issue with:

  1. Clear title describing the problem
  2. Steps to reproduce the issue
  3. Expected vs actual behavior
  4. System information (OS, Python version, etc.)
  5. Logs or error messages

💡 Feature Requests

Have an idea? Create an issue with:

  1. Clear description of the feature
  2. Use case explaining why it's needed
  3. Proposed implementation (if you have ideas)
  4. Examples of similar features

πŸ† Recognition

Contributors will be:

  • Listed in the contributors section
  • Mentioned in release notes
  • Given credit in documentation
  • Invited to the contributors team

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

πŸ™ Acknowledgments

  • LiveKit for real-time communication infrastructure
  • OpenAI for advanced language model capabilities
  • Deepgram for accurate speech recognition
  • Cartesia for high-quality text-to-speech
  • FastAPI for the modern web framework
  • Contributors who make this project better

📞 Support

⭐ Show Your Support

If you find VISU helpful, please consider:

  • ⭐ Starring the repository
  • 🍴 Forking to contribute
  • 📢 Sharing with others
  • 💖 Sponsoring the project

Built with ❤️ by the VISU Community
Making AI more emotionally intelligent, one conversation at a time
