A lightweight JavaScript API for retrieving transcripts and subtitles from YouTube videos. Works with both manual and auto-generated captions without requiring API keys or browser automation tools like Selenium.

rajat-mehra05/youtube-transcript-api-js

YouTube Transcript API

A lightweight JavaScript/TypeScript library for retrieving transcripts and subtitles from YouTube videos. Supports manual and auto-generated captions, multiple languages, translation, and various output formats — no API keys or browser automation required.

Common use cases: content analysis, accessibility tools, search indexing, language learning apps, video summarization, subtitle generation, and AI/NLP training data collection.

Installation

npm install youtube-transcript-api-js

Or with yarn:

yarn add youtube-transcript-api-js

Quick Start

import { YouTubeTranscriptApi } from 'youtube-transcript-api-js';

const api = new YouTubeTranscriptApi();

// Fetch transcript for a video
const transcript = await api.fetch('dQw4w9WgXcQ');

console.log(transcript.snippets);
// Output: [{ text: '...', start: 0.0, duration: 1.5 }, ...]

API Reference

YouTubeTranscriptApi

The main class for fetching transcripts.

import { YouTubeTranscriptApi } from 'youtube-transcript-api-js';

const api = new YouTubeTranscriptApi();

fetch(videoId, languages?, preserveFormatting?)

Fetches the transcript for a video.

// Fetch with default language (English)
const defaultTranscript = await api.fetch('VIDEO_ID');

// Fetch with specific languages (priority order)
const germanOrEnglish = await api.fetch('VIDEO_ID', ['de', 'en']);

// Preserve formatting (keeps HTML tags like <i>, <b>)
const formatted = await api.fetch('VIDEO_ID', ['en'], true);

Parameters:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| videoId | string | none (required) | The YouTube video ID |
| languages | string[] | ['en'] | Language codes in priority order |
| preserveFormatting | boolean | false | Keep HTML formatting tags |

Returns: Promise<FetchedTranscript>
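The `videoId` parameter expects a bare ID, and the error table in this README lists `InvalidVideoId` for the case where a URL is passed instead. The sketch below shows one way to normalize user input before calling `fetch`. `extractVideoId` is a hypothetical helper, not part of this library, and it assumes video IDs are 11 characters from `[A-Za-z0-9_-]`, which matches current YouTube IDs but is not a documented guarantee.

```typescript
// Hypothetical helper, not part of youtube-transcript-api-js.
// Assumption: video IDs are 11 characters from [A-Za-z0-9_-].
const VIDEO_ID_PATTERN = /^[A-Za-z0-9_-]{11}$/;

function extractVideoId(input: string): string | null {
  const trimmed = input.trim();
  if (VIDEO_ID_PATTERN.test(trimmed)) {
    return trimmed; // already a bare ID
  }

  let url: URL;
  try {
    url = new URL(trimmed);
  } catch {
    return null; // neither a bare ID nor a parseable URL
  }

  const host = url.hostname.replace(/^www\./, '');
  let candidate: string | null = null;

  if (host === 'youtu.be') {
    // https://youtu.be/<id>
    candidate = url.pathname.split('/')[1] ?? null;
  } else if (host === 'youtube.com' || host === 'm.youtube.com') {
    if (url.pathname === '/watch') {
      // https://www.youtube.com/watch?v=<id>
      candidate = url.searchParams.get('v');
    } else {
      // https://www.youtube.com/embed/<id> or /shorts/<id>
      const [, kind, id] = url.pathname.split('/');
      if (kind === 'embed' || kind === 'shorts') {
        candidate = id ?? null;
      }
    }
  }

  // Reject anything that does not look like an ID after extraction.
  return candidate !== null && VIDEO_ID_PATTERN.test(candidate) ? candidate : null;
}
```

A `null` result means the input should be rejected before any request is made, which gives a clearer message to the user than a failed fetch.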

list(videoId)

Lists all available transcripts for a video.

const transcriptList = await api.list('VIDEO_ID');

// Find a specific transcript
const transcript = transcriptList.findTranscript(['en', 'de']);

// Get only auto-generated transcripts
const generated = transcriptList.findGeneratedTranscript(['en']);

// Get only manually created transcripts
const manual = transcriptList.findManuallyCreatedTranscript(['en']);

// Iterate over all transcripts
for (const transcript of transcriptList) {
  console.log(`${transcript.languageCode}: ${transcript.language}`);
}
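"Priority order" can be read as: the first requested code that has a matching transcript wins. The sketch below illustrates that reading on plain objects. `pickByPriority` and `TranscriptInfo` are hypothetical names for illustration only, and the library's own `findTranscript` may differ in details such as how it weighs manual against auto-generated transcripts.

```typescript
// Hypothetical illustration of priority-order matching; not library code.
interface TranscriptInfo {
  languageCode: string;
  language: string;
}

function pickByPriority<T extends TranscriptInfo>(
  available: T[],
  requested: string[],
): T | undefined {
  for (const code of requested) {
    const match = available.find((t) => t.languageCode === code);
    if (match) {
      return match; // the earliest requested code with a match wins
    }
  }
  return undefined; // no requested code is available
}

const availableTranscripts = [
  { languageCode: 'en', language: 'English' },
  { languageCode: 'es', language: 'Spanish' },
];

// 'de' is not available, so the next code in the list is used.
console.log(pickByPriority(availableTranscripts, ['de', 'en'])?.language); // prints "English"
```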

Translating Transcripts

const transcriptList = await api.list('VIDEO_ID');
const transcript = transcriptList.findTranscript(['en']);

if (transcript.isTranslatable) {
  const translated = transcript.translate('de');
  const fetched = await translated.fetch();
  console.log(fetched.snippets);
}

Data Types

interface FetchedTranscript {
  snippets: FetchedTranscriptSnippet[];
  videoId: string;
  language: string;
  languageCode: string;
  isGenerated: boolean;
  toRawData(): Array<{ text: string; start: number; duration: number }>;
}

interface FetchedTranscriptSnippet {
  text: string;
  start: number;
  duration: number;
}
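Because snippets are plain objects, common post-processing needs no library support. The sketch below joins snippet text into one string and looks up the snippet covering a given time. The `Snippet` interface is a local copy of the shape above, the helper names are hypothetical, and `start` and `duration` are assumed to be in seconds.

```typescript
// Local copy of the snippet shape so the sketch is self-contained.
// Assumption: start and duration are expressed in seconds.
interface Snippet {
  text: string;
  start: number;
  duration: number;
}

// Join all snippet text into a single space-separated string.
function toPlainText(snippets: Snippet[]): string {
  return snippets
    .map((s) => s.text.trim())
    .filter((t) => t.length > 0)
    .join(' ');
}

// Find the snippet whose [start, start + duration) interval contains `seconds`.
function snippetAt(snippets: Snippet[], seconds: number): Snippet | undefined {
  return snippets.find((s) => seconds >= s.start && seconds < s.start + s.duration);
}

const sampleSnippets: Snippet[] = [
  { text: 'Hello', start: 0.0, duration: 1.5 },
  { text: 'world', start: 1.5, duration: 2.0 },
];

console.log(toPlainText(sampleSnippets)); // prints "Hello world"
console.log(snippetAt(sampleSnippets, 2.0)?.text); // prints "world"
```

With a fetched transcript, the same helpers could be applied to `transcript.snippets`.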

CLI Usage

Basic Commands

# Fetch transcript for a video
youtube-transcript-api VIDEO_ID

# Fetch transcripts for multiple videos
youtube-transcript-api VIDEO_ID1 VIDEO_ID2 VIDEO_ID3

# List available transcripts
youtube-transcript-api VIDEO_ID --list-transcripts

Options

| Option | Description | Example |
| --- | --- | --- |
| --list-transcripts | List available transcript languages | --list-transcripts |
| --languages <codes...> | Language codes in priority order | --languages de en |
| --translate <code> | Translate to specified language | --translate es |
| --format <format> | Output format: json, pretty, text, srt, webvtt | --format srt |
| --exclude-generated | Exclude auto-generated transcripts | --exclude-generated |
| --exclude-manually-created | Exclude manually created transcripts | --exclude-manually-created |
| --http-proxy <url> | Use HTTP proxy | --http-proxy http://proxy:8080 |
| --https-proxy <url> | Use HTTPS proxy | --https-proxy http://proxy:8080 |
| --webshare-proxy-username <user> | Webshare proxy username | --webshare-proxy-username myuser |
| --webshare-proxy-password <pass> | Webshare proxy password | --webshare-proxy-password mypass |

Examples

# Get German transcript, fallback to English
youtube-transcript-api dQw4w9WgXcQ --languages de en

# Export as SRT subtitle file
youtube-transcript-api dQw4w9WgXcQ --format srt > subtitles.srt

# Translate English transcript to Spanish
youtube-transcript-api dQw4w9WgXcQ --languages en --translate es

# Use proxy for requests
youtube-transcript-api dQw4w9WgXcQ --http-proxy http://user:pass@proxy.com:8080

# Get only manually created transcripts in JSON
youtube-transcript-api dQw4w9WgXcQ --exclude-generated --format json

Output Formatters

Format transcripts in different output formats.

import { FormatterLoader, SRTFormatter } from 'youtube-transcript-api-js';

const transcript = await api.fetch('VIDEO_ID');

// Using FormatterLoader
const loader = new FormatterLoader();
const formatter = loader.load('srt');
console.log(formatter.formatTranscript(transcript));

// Or use formatters directly
const srtFormatter = new SRTFormatter();
console.log(srtFormatter.formatTranscript(transcript));

| Format | Class | Description |
| --- | --- | --- |
| json | JSONFormatter | Compact JSON output |
| pretty | PrettyPrintFormatter | Pretty-printed JSON |
| text | TextFormatter | Plain text (transcript only) |
| srt | SRTFormatter | SubRip subtitle format |
| webvtt | WebVTTFormatter | WebVTT subtitle format |
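To make the SubRip format concrete, the sketch below converts snippet-shaped objects into SRT text by hand. It is an illustration of the file format only, not the library's `SRTFormatter`, whose exact output may differ. `Cue`, `srtTimestamp`, and `toSrt` are hypothetical names, and `start` and `duration` are assumed to be in seconds.

```typescript
// Hypothetical sketch of SRT output; not the library's SRTFormatter.
// Assumption: start and duration are expressed in seconds.
interface Cue {
  text: string;
  start: number;
  duration: number;
}

// SRT timestamps use the form HH:MM:SS,mmm (comma before milliseconds).
function srtTimestamp(seconds: number): string {
  const totalMs = Math.round(seconds * 1000);
  const ms = totalMs % 1000;
  const s = Math.floor(totalMs / 1000) % 60;
  const m = Math.floor(totalMs / 60000) % 60;
  const h = Math.floor(totalMs / 3600000);
  const pad = (n: number, width: number) => String(n).padStart(width, '0');
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)},${pad(ms, 3)}`;
}

// Each cue is a 1-based index, a time range line, and the text,
// with a blank line between cues.
function toSrt(cues: Cue[]): string {
  const blocks = cues.map((cue, i) => {
    const end = cue.start + cue.duration;
    return `${i + 1}\n${srtTimestamp(cue.start)} --> ${srtTimestamp(end)}\n${cue.text}`;
  });
  return blocks.join('\n\n') + '\n';
}

console.log(toSrt([{ text: 'Hello', start: 0, duration: 1.5 }]));
```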

Proxy Support

Generic Proxy

import { YouTubeTranscriptApi, GenericProxyConfig } from 'youtube-transcript-api-js';

const proxyConfig = new GenericProxyConfig(
  'http://user:pass@proxy.example.com:8080',  // HTTP
  'http://user:pass@proxy.example.com:8080'   // HTTPS
);

const api = new YouTubeTranscriptApi(proxyConfig);
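Credentials embedded in a proxy URL break parsing if they contain characters such as `@` or `:`. The sketch below percent-encodes them before building the URL string. `buildProxyUrl` and `ProxyParts` are hypothetical names, not part of this library, and whether the underlying proxy agent decodes percent-encoded credentials is an assumption worth verifying against your proxy.

```typescript
// Hypothetical helper; only the resulting string would be passed to the library.
interface ProxyParts {
  host: string;
  port: number;
  username: string;
  password: string;
}

function buildProxyUrl({ host, port, username, password }: ProxyParts): string {
  // encodeURIComponent escapes '@', ':' and '/' so they are not read
  // as URL delimiters inside the credentials.
  const user = encodeURIComponent(username);
  const pass = encodeURIComponent(password);
  return `http://${user}:${pass}@${host}:${port}`;
}

console.log(
  buildProxyUrl({
    host: 'proxy.example.com',
    port: 8080,
    username: 'user',
    password: 'p@ss:word',
  }),
);
// prints "http://user:p%40ss%3Aword@proxy.example.com:8080"
```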

Webshare Rotating Proxy

For rotating residential proxies via Webshare.

import { YouTubeTranscriptApi, WebshareProxyConfig } from 'youtube-transcript-api-js';

const proxyConfig = new WebshareProxyConfig(
  'your-username',
  'your-password',
  ['US', 'GB'],  // Filter by IP locations (optional)
  10             // Retries when blocked (default: 10)
);

const api = new YouTubeTranscriptApi(proxyConfig);

Enhanced API with Proxy

For advanced use cases with Invidious fallback support.

import { EnhancedYouTubeTranscriptApi } from 'youtube-transcript-api-js';

const api = new EnhancedYouTubeTranscriptApi(
  {
    enabled: true,
    http: 'http://user:pass@proxy.example.com:8080',
    https: 'http://user:pass@proxy.example.com:8080'
  },
  {
    enabled: true,
    instanceUrls: ['https://invidious.example.com'],
    timeout: 10000
  }
);

const transcript = await api.fetch('VIDEO_ID');

Error Handling

import {
  VideoUnavailable,
  TranscriptsDisabled,
  NoTranscriptFound,
  RateLimitExceeded,
  TimeoutError,
  ConnectionError
} from 'youtube-transcript-api-js';

try {
  const transcript = await api.fetch('VIDEO_ID');
} catch (error) {
  if (error instanceof RateLimitExceeded) {
    console.error(`Rate limited. Retry after ${error.retryAfter} seconds`);
  } else if (error instanceof TimeoutError) {
    console.error(`Request timed out after ${error.timeoutMs}ms`);
  } else {
    throw error; // re-throw errors this handler does not cover
  }
}

| Error | Description |
| --- | --- |
| VideoUnavailable | Video does not exist or is private |
| TranscriptsDisabled | Subtitles are disabled for this video |
| NoTranscriptFound | No transcript for requested languages |
| InvalidVideoId | Invalid video ID (URL passed instead of ID) |
| AgeRestricted | Video is age-restricted |
| VideoUnplayable | Video cannot be played |
| RequestBlocked | YouTube is blocking requests from your IP |
| IpBlocked | IP address has been blocked |
| RateLimitExceeded | Too many requests (HTTP 429) |
| NetworkError | General network error |
| TimeoutError | Request timed out |
| ConnectionError | Failed to connect to server |
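Transient failures such as rate limits and timeouts are usually worth retrying with a growing delay. The sketch below is a generic retry wrapper, not a feature of this library. `backoffDelayMs`, `withRetry`, and `RetryDecision` are hypothetical names, and the retry hint is assumed to be in seconds, matching the `retryAfter` message in the example above.

```typescript
// Hypothetical retry helpers; not part of youtube-transcript-api-js.
// Assumption: a retry hint, when present, is expressed in seconds.
function backoffDelayMs(attempt: number, retryAfterSeconds?: number): number {
  if (retryAfterSeconds !== undefined && retryAfterSeconds > 0) {
    return retryAfterSeconds * 1000; // honor the server's hint
  }
  const baseMs = 1000;
  const capMs = 30000;
  return Math.min(capMs, baseMs * 2 ** attempt); // 1s, 2s, 4s, ... capped at 30s
}

interface RetryDecision {
  retry: boolean;
  retryAfterSeconds?: number;
}

async function withRetry<T>(
  operation: () => Promise<T>,
  decide: (error: unknown) => RetryDecision,
  maxAttempts = 3,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await operation();
    } catch (error) {
      const decision = decide(error);
      // Give up on non-retryable errors or once attempts are exhausted.
      if (!decision.retry || attempt + 1 >= maxAttempts) {
        throw error;
      }
      const delay = backoffDelayMs(attempt, decision.retryAfterSeconds);
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
}
```

With this library, `decide` could return `{ retry: true, retryAfterSeconds: error.retryAfter }` when `error instanceof RateLimitExceeded`, `{ retry: true }` for `TimeoutError`, and `{ retry: false }` for errors such as `TranscriptsDisabled` that will not succeed on a second attempt.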

Tech Stack

| Technology | Purpose |
| --- | --- |
| TypeScript | Type-safe JavaScript |
| Axios | HTTP client |
| Commander.js | CLI framework |
| http-proxy-agent | Proxy support |
| Jest | Testing |

License

MIT
