Skip to content

Minimal real-time transcription demo built with Flask and JavaScript, showcasing integration with Gladia’s Real-Time API V2. Includes a simple web interface and CLI for streaming audio, managing sessions, and displaying live transcripts in real time.

Notifications You must be signed in to change notification settings

Nelson-PROIA/gladia-realtime-demo

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

5 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Gladia Realtime Demo

Live speech-to-text demo using Gladia's Realtime API V2 with speaker diarization.

🌐 Live Demo: https://gladia-realtime-demo.onrender.com/
Note: Access key required - contact the owner for access

Quick Start

  1. Get API key: app.gladia.io
  2. Clone & setup:
git clone https://github.com/Nelson-PROIA/gladia-realtime-demo
cd gladia-realtime-demo
pip install -r requirements.txt
  1. Configure:
cp .env.example .env
# Edit .env with your settings:
# GLADIA_API_KEY=your_gladia_api_key
# SECRET_KEY=your_flask_secret_key
# ACCESS_KEY=your_demo_access_key (optional - for public deployment security)
  1. Run:
python app.py
# Open http://localhost:8000

Code Structure

├── app.py                    # Flask web app (WebSocket + Gladia API)
├── cli.py                    # Minimal CLI demo (~200 lines)
├── static/
│   ├── index.html           # Main UI
│   ├── login.html           # Access control page
│   ├── app.js               # Frontend (audio capture + Socket.IO)
│   ├── style.css            # Main styling
│   └── login.css            # Login page styling
├── assets/
│   ├── gladia-logo.png      # Logo image
│   └── gladia-logo.ico      # Favicon
├── requirements.txt          # Web app dependencies
├── requirements-cli.txt      # CLI dependencies (includes PyAudio)
├── .env.example             # Environment variables template
└── README.md                # This file

How It Works

  1. Browser captures microphone → Web Audio API (16kHz)
  2. Frontend converts audio → WAV → base64 → Socket.IO
  3. Flask receives audio → forwards to Gladia WebSocket
  4. Gladia transcribes → returns JSON → displayed in UI

CLI Demo

Quick test without web interface:

pip install -r requirements-cli.txt  # Adds PyAudio
python cli.py

Configuration

Web App: Use the UI settings panel

  • Languages: Auto-detect or select specific languages
  • Code switching: Only available in auto-detect mode
  • Custom vocabulary: Boost recognition for specific terms

CLI: Edit create_session() in cli.py:

payload = {
    "language_config": {
        "languages": [],  # Auto-detect
        "code_switching": False
    },
    "realtime_processing": {
        "custom_vocabulary": True,
        "custom_vocabulary_config": {
            "vocabulary": ["Gladia", "API"]
        }
    }
}

Deployment

Render (recommended):

  1. Push to GitHub
  2. Connect repo to Render
  3. Set environment variables: GLADIA_API_KEY
  4. Deploy

API Reference

  • Gladia Docs: docs.gladia.io
  • Endpoint: POST https://api.gladia.io/v2/live
  • Audio format: WAV/PCM, 16kHz, mono
  • Languages: Multiple languages supported (auto-detect or specific)

Troubleshooting

  • No audio: Check microphone permissions
  • No transcription: Verify GLADIA_API_KEY in .env
  • Poor quality: Add custom vocabulary, reduce noise
  • CLI issues: Install PyAudio system dependencies

Built by: Nelson PROIA Powered by: Gladia

About

Minimal real-time transcription demo built with Flask and JavaScript, showcasing integration with Gladia’s Real-Time API V2. Includes a simple web interface and CLI for streaming audio, managing sessions, and displaying live transcripts in real time.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published