| 1 | + |
| 2 | +# 🎬 Real-Time Audio + Subtitle Translation Engine for Global Content |
| 3 | + |
| 4 | +## Problem Statement |
| 5 | + |
| 6 | +Global companies rely heavily on video content for marketing, product education, training, and international outreach. However, most video assets are created in a single language, typically English. Translating these videos manually into multiple languages requires human translators, voiceover artists, subtitle file creation and formatting, and repeated engineering work for UI and SEO localization. |
| 7 | + |
| 8 | +The challenge becomes far more complex when translation must happen in real time, for example during live product demos, training sessions, user education websites, and continuous content creation. Traditional translation workflows cannot meet real-time requirements. |
| 9 | +This project solves that problem with a Real-Time Multilingual Video & Audio Translation System that automatically runs the following pipeline in real time: Speech → Text → Translated Text → Translated Audio → Subtitles. |
| 10 | + |
| 11 | + |
| 12 | + |
| 13 | +## Table of Contents |
| 14 | +- [lingo.video website](https://lingo-video.vercel.app/) |
| 15 | +- [YouTube video](https://youtu.be/AUdZw9KzZzw) |
| 16 | +- [Installing](#getting-started) |
| 17 | +- [Real-Time Video Subtitles Translation architecture and tech stack](./docs/live-translation-architecture.md) |
| 18 | +- [Impact & Benefits for Global Companies](#impact--benefits-for-global-companies) |
| 19 | +- [Features](#features) |
| 20 | +- [Challenges with Real-Time Translation & How We Solve Them](#challenges-with-real-time-translation--how-we-solve-them) |
| 21 | +- [What is next?](./docs/what-is-next.md) |
| 22 | +- [Author](#author) |
| 23 | +- [License](#license) |
| 24 | + |
| 25 | +## Getting Started |
| 26 | +1. Clone the repository |
| 27 | +``` |
| 28 | +git clone https://github.com/ShubhamOulkar/lingo.video.git |
| 29 | +cd lingo.video |
| 30 | +``` |
| 31 | +2. Install dependencies |
| 32 | +``` |
| 33 | +pnpm install |
| 34 | +``` |
| 35 | +3. Get a lingo.dev API key from [`lingo.dev`](https://lingo.dev/) |
| 36 | +4. Create a `.env` file and set `LINGODOTDEV_API_KEY` in it |
| 37 | +5. Run the frontend and the WebSocket server concurrently |
| 38 | +``` |
| 39 | +pnpm dev |
| 40 | +``` |
| 41 | + |
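If the key from step 4 is missing, failures tend to show up late, at the first translation request. A minimal sketch of a fail-fast check is shown below. The helper name `requireEnv` and the error message are hypothetical and not part of this repository; only the variable name `LINGODOTDEV_API_KEY` comes from the steps above.

```typescript
// Hypothetical helper (not from this repository): read a required variable
// from an environment-like object and fail early if it is missing or blank.
type Env = Record<string, string | undefined>;

function requireEnv(env: Env, name: string): string {
  const value = env[name]?.trim();
  if (!value) {
    throw new Error(`Missing ${name}. Add it to your .env file before running pnpm dev.`);
  }
  return value;
}

// Example with an inline object standing in for the real environment
// (in Node.js this would typically be process.env).
const apiKey = requireEnv({ LINGODOTDEV_API_KEY: "example-key" }, "LINGODOTDEV_API_KEY");
console.log(`API key loaded (${apiKey.length} characters)`);
```

Calling a check like this once at server startup surfaces a misconfigured `.env` file immediately instead of during playback.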
| 42 | + |
| 43 | +## Impact & Benefits for Global Companies |
| 44 | +This system offers tangible benefits for organizations, especially global food and delivery companies: |
| 45 | + |
| 46 | +- `Eliminates VTT and audio file maintenance`: No need to manually create or store .vtt subtitle files for each language. |
| 47 | + |
| 48 | +- `Reduces database and storage costs`: Subtitles are generated and translated on the fly, so companies don’t pay for storing multiple language files. |
| 49 | + |
| 50 | +- `Minimizes developer workload`: No extra development effort is required to maintain multilingual video content. |
| 51 | + |
| 52 | +- `Reach markets early`: Videos can be shipped in days instead of months, accelerating global reach. |
| 53 | + |
| 54 | +- `Unlimited language support`: AI-driven translation opens the door to reaching any country in the world. |
| 55 | + |
| 56 | +- `Focus on product, not translation`: Teams can concentrate on improving the core product while the system handles multilingual content automatically. |
| 57 | + |
| 58 | +## Features |
| 59 | + |
| 60 | +- **Real-Time Subtitle Translation** |
| 61 | + - Translates video subtitles on the fly using [`lingo.dev`](https://lingo.dev/en/sdk) SDK and a WebSocket server. |
| 62 | + - No need to maintain `.vtt` files for multiple languages. |
| 63 | + > Note: This repository includes [.vtt files](./apps/next-app/public/subtitles/emotions.hi.vtt) for manual accuracy testing. To test, click `CC` in the player and compare the stored subtitles with the live translation. |
| 64 | + |
| 65 | +- **UI Translation in React** |
| 66 | + - React UI automatically updates using [`Lingo Compiler`](https://lingo.dev/en/compiler) ⚡🤖. |
| 67 | + - Dynamic language compilation without hardcoding translations. |
| 68 | + |
| 69 | +- **SEO-Friendly Multilingual Content** |
| 70 | + - Automatically generates meta tags and Open Graph (OG) tags using [`Lingo CLI`](https://lingo.dev/en/cli). |
| 71 | + - Fully automatable via CI/CD pipelines. |
| 72 | + > Note: Verify the OG cards for Hindi [here](https://opengraph.dev/panel?url=https%3A%2F%2Flingo-video.vercel.app%2Fhi) |
| 73 | + |
| 74 | +- **Time and Cost Efficiency** |
| 75 | + - Reduces developer effort and eliminates third-party translators. |
| 76 | + - Ship multilingual content in **days instead of months**. |
| 77 | + |
| 78 | +- **Unlimited Language Support** |
| 79 | + - AI-driven translation allows reaching any country worldwide. |
| 80 | + - Easily add new languages without manual work. |
| 81 | + |
| 82 | +- **Focus on Product, Not Translation** |
| 83 | + - Teams can concentrate on improving the core product while translations happen automatically. |
| 84 | + |
| 85 | +- **Scales with Video Volume** |
| 86 | + - Can handle large numbers of videos without extra infrastructure or maintenance. |
| 87 | + |
| 88 | +- **Adapts to the User's Preferred System Theme** |
| 89 | + - The website automatically adapts to the user's preferred light or dark theme. |
| 90 | + |
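The real-time subtitle feature above sends subtitle text to a WebSocket server and receives translations back. The sketch below illustrates one possible message format for that exchange. All names here (`TranslateRequest`, `TranslateResponse`, `cueId`, `encodeRequest`, `decodeResponse`, `isStillRelevant`) are assumptions made for illustration; the project's actual protocol may differ.

```typescript
// Hypothetical message shapes; the project's real WebSocket protocol may differ.
interface TranslateRequest {
  type: "translate";
  cueId: string;        // identifies the subtitle cue being translated
  text: string;         // source-language subtitle text
  targetLocale: string; // for example "hi"
}

interface TranslateResponse {
  type: "translated";
  cueId: string;
  text: string; // translated subtitle text
}

// Serialize a request before sending it over the socket.
function encodeRequest(cueId: string, text: string, targetLocale: string): string {
  const message: TranslateRequest = { type: "translate", cueId, text, targetLocale };
  return JSON.stringify(message);
}

// Parse and validate an incoming message; returns null for anything unexpected.
function decodeResponse(raw: string): TranslateResponse | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof data !== "object" || data === null) return null;
  const { type, cueId, text } = data as Record<string, unknown>;
  if (type !== "translated" || typeof cueId !== "string" || typeof text !== "string") {
    return null;
  }
  return { type: "translated", cueId, text };
}

// A late response for a cue that is no longer on screen should be dropped.
function isStillRelevant(response: TranslateResponse, activeCueId: string | null): boolean {
  return response.cueId === activeCueId;
}

// Usage with an in-memory round trip instead of a real socket.
const outgoing = encodeRequest("cue-12", "Welcome to the demo", "hi");
const incoming = decodeResponse(
  JSON.stringify({ type: "translated", cueId: "cue-12", text: "(translated text)" })
);
console.log(outgoing, incoming !== null && isStillRelevant(incoming, "cue-12"));
```

Tagging each message with a cue identifier lets the client ignore translations that arrive after the video has moved on to the next subtitle.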
| 91 | +## Challenges with Real-Time Translation & How We Solve Them |
| 92 | +Real-time translation systems face several technical and operational challenges. This project is designed with production-grade solutions to minimize latency, reduce translation costs, and ensure consistent accuracy across high-volume video content. |
| 93 | + |
| 94 | +### ⚠️ Core Challenges |
| 95 | + |
| 96 | +1. **Network Latency**: Real-time translation requires fast WebSocket communication. Any network instability can delay subtitle updates. |
| 97 | + |
| 98 | +2. **LLM Token Generation Delay**: Subtitle latency depends on how quickly the LLM generates tokens. High load or long subtitles can increase response time. The Lingo SDK does not support streaming. |
| 99 | + |
| 100 | +3. **Redundant Translation Costs**: Many subtitles repeat the same text across videos. Without optimization, the same translation is generated and billed multiple times. |
| 101 | + |
| 102 | +4. **Cold Start Issues**: Serverless deployments can experience slow startup times, affecting real-time subtitle delivery. |
| 103 | + |
| 104 | +5. **Scaling with High Traffic**: Multiple users watching videos simultaneously can overload translation or socket servers if not optimized. |
| 105 | + |
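One common way to address redundant translation costs (challenge 3) and reduce load under high traffic (challenge 5) is to cache translations and share in-flight requests. The sketch below assumes a generic `Translate` function type that would wrap the actual SDK call; `withCache`, `maxEntries`, and `fakeTranslate` are hypothetical names, and this is not necessarily how the project implements it.

```typescript
// Hypothetical translator signature; in practice it would wrap the real translation call.
type Translate = (text: string, targetLocale: string) => Promise<string>;

// Wrap a translator so that identical (locale, text) pairs are translated only once.
// Promises are cached, so concurrent requests for the same text share one call.
function withCache(translate: Translate, maxEntries = 1000): Translate {
  const cache = new Map<string, Promise<string>>();

  return (text, targetLocale) => {
    const normalized = text.trim();
    const key = `${targetLocale}\u0000${normalized}`;

    const hit = cache.get(key);
    if (hit) {
      // Re-insert to mark the entry as recently used (Map keeps insertion order).
      cache.delete(key);
      cache.set(key, hit);
      return hit;
    }

    const pending = translate(normalized, targetLocale).catch((err) => {
      cache.delete(key); // do not keep failed translations in the cache
      throw err;
    });
    cache.set(key, pending);

    // Evict the least recently used entry once the cache is full.
    if (cache.size > maxEntries) {
      const oldest = cache.keys().next().value;
      if (oldest !== undefined) cache.delete(oldest);
    }
    return pending;
  };
}

// Usage with a fake translator that counts how often it is called.
let calls = 0;
const fakeTranslate: Translate = async (text, locale) => {
  calls += 1;
  return `[${locale}] ${text}`;
};
const cached = withCache(fakeTranslate);
const first = cached("Hello", "hi");
const second = cached("Hello", "hi"); // served from the cache, no second call
console.log(first === second, calls);
```

Caching the promise rather than the resolved string means that viewers requesting the same subtitle at the same moment trigger a single upstream request.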
| 106 | +## Author |
| 107 | +- [LinkedIn](https://www.linkedin.com/in/shubham-oulkar) |
| 108 | +- [Frontend Mentor](https://www.frontendmentor.io/profile/ShubhamOulkar) |
| 109 | +- [X](https://x.com/shubhuoulkar) |
| 110 | + |
| 111 | +## License |
| 112 | +Content submitted by [Shubham Oulkar](https://github.com/ShubhamOulkar) is licensed under Creative Commons Attribution 4.0 International, as found in the [LICENSE](/LICENSE) file. |
| 113 | + |
| 114 | +## 🌐 Readme in other languages |
| 115 | +[हिंदी](./docs/readmes/hi.md) • [日本語](./docs/readmes/ja.md) • [Français](./docs/readmes/fr.md) • [Deutsch](./docs/readmes/de.md) • [Español](./docs/readmes/es.md) |