smile7up/xiaohongshu-downloader

Xiaohongshu Video Downloader


A Claude Code skill for downloading and summarizing videos from Xiaohongshu (小红书/RedNote) using yt-dlp.

Features

  • Download Xiaohongshu videos in best available quality (up to 1080p)
  • Support multiple URL formats (explore links, discovery links, short links)
  • Automatic browser cookie extraction for authentication
  • Configurable video quality (best / 1080p / 720p / 480p)
  • Audio-only download mode (MP3)
  • Format listing without downloading
  • Full resource pack mode — video + audio + subtitles + transcript in one folder
  • 3-tier subtitle acquisition — manual subs → auto subs → Whisper transcription
  • Parallel Whisper transcription — silence-based splitting + multi-core processing
  • AI-powered summary — structured summary generated by Claude

Prerequisites

  • yt-dlp installed (brew install yt-dlp on macOS or pip install yt-dlp)
  • ffmpeg installed (brew install ffmpeg on macOS)
  • Python 3.8+
  • A browser logged into xiaohongshu.com
  • (Optional) uv for automatic Whisper dependency management (brew install uv)

Installation

As a Claude Code Skill (Recommended)

Copy the skill to your Claude Code skills directory:

cp -r xiaohongshu-downloader ~/.claude/skills/

Then simply ask Claude: "download this xiaohongshu video: <URL>"

Standalone Usage

python scripts/download_xiaohongshu.py "https://www.xiaohongshu.com/explore/VIDEO_ID"

Supported URL Formats

| Format | Example |
| --- | --- |
| Explore link | https://www.xiaohongshu.com/explore/<id> |
| Discovery link | https://www.xiaohongshu.com/discovery/item/<id>?xsec_token=... |
| Short link | http://xhslink.com/a/<id> |

Tip: Always copy the full share URL (including xsec_token parameters) from Xiaohongshu's share button for best results.
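
As an illustration of the three URL shapes, the sketch below classifies a link and checks whether it still carries its xsec_token. It is not taken from the script: the names `URL_PATTERNS`, `classify_url`, and `has_xsec_token` are hypothetical, and the assumed ID alphabet (letters and digits) may be narrower than what the platform actually issues.

```python
import re
from typing import Optional, Tuple
from urllib.parse import parse_qs, urlparse

# Patterns mirror the three URL shapes listed above. The ID alphabet
# ([0-9a-zA-Z]+) is an assumption made for this sketch.
URL_PATTERNS = {
    "explore": re.compile(r"^https?://(?:www\.)?xiaohongshu\.com/explore/(?P<id>[0-9a-zA-Z]+)"),
    "discovery": re.compile(r"^https?://(?:www\.)?xiaohongshu\.com/discovery/item/(?P<id>[0-9a-zA-Z]+)"),
    "short": re.compile(r"^https?://xhslink\.com/a/(?P<id>[0-9a-zA-Z]+)"),
}


def classify_url(url: str) -> Optional[Tuple[str, str]]:
    """Return (kind, id) for a recognized Xiaohongshu URL, else None."""
    for kind, pattern in URL_PATTERNS.items():
        match = pattern.match(url.strip())
        if match:
            return kind, match.group("id")
    return None


def has_xsec_token(url: str) -> bool:
    """True when the share URL still carries an xsec_token query parameter."""
    return "xsec_token" in parse_qs(urlparse(url).query)
```

A caller could use `has_xsec_token` to warn when a link was copied from the address bar rather than from the share button.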

Usage

Basic Download (v1.0 compatible)

python scripts/download_xiaohongshu.py "URL"

Output: ~/Downloads/<title> [<id>].mp4

Full Resource Pack

python scripts/download_xiaohongshu.py "URL" --full

Output:

~/Downloads/<video title>/
├── video.mp4          # Original video
├── audio.mp3          # Extracted audio (ffmpeg)
├── subtitle.vtt       # WebVTT subtitles
└── transcript.txt     # Plain text transcript

AI Summary (via Claude Skill)

Ask Claude: "Download and summarize this Xiaohongshu video: <URL>"

Output adds:

~/Downloads/<video title>/
├── ...
├── .meta.json         # Video metadata
└── summary.md         # AI-generated structured summary

Options

| Option | Description | Default |
| --- | --- | --- |
| -o, --output | Output directory | ~/Downloads |
| -q, --quality | Video quality (best, 1080p, 720p, 480p) | best |
| --browser | Browser for cookie extraction (chrome, firefox, safari, none) | chrome |
| -a, --audio-only | Download audio only as MP3 | false |
| --list-formats | List available formats without downloading | false |
| --full | Full resource pack mode | false |
| --summary | AI summary mode (implies --full) | false |
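
For readers who want to see how such an option set could be declared, here is a minimal argparse sketch that mirrors the options and defaults above. It is an illustration only, not the contents of download_xiaohongshu.py; `build_parser` and `parse_args` are hypothetical names.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the documented options with their documented defaults."""
    parser = argparse.ArgumentParser(description="Download a Xiaohongshu video")
    parser.add_argument("url", help="Xiaohongshu video URL")
    parser.add_argument("-o", "--output", default="~/Downloads", help="Output directory")
    parser.add_argument("-q", "--quality", default="best",
                        choices=["best", "1080p", "720p", "480p"])
    parser.add_argument("--browser", default="chrome",
                        choices=["chrome", "firefox", "safari", "none"])
    parser.add_argument("-a", "--audio-only", action="store_true")
    parser.add_argument("--list-formats", action="store_true")
    parser.add_argument("--full", action="store_true")
    parser.add_argument("--summary", action="store_true")
    return parser


def parse_args(argv=None) -> argparse.Namespace:
    args = build_parser().parse_args(argv)
    if args.summary:
        # The option list states that --summary implies --full.
        args.full = True
    return args
```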

Examples

# Download with default settings (best quality, Chrome cookies)
python scripts/download_xiaohongshu.py "https://www.xiaohongshu.com/explore/69821980000000000e03c95f"

# Download in 720p
python scripts/download_xiaohongshu.py "URL" -q 720p

# Download to a specific directory
python scripts/download_xiaohongshu.py "URL" -o ~/Videos/

# Download audio only
python scripts/download_xiaohongshu.py "URL" -a

# Full resource pack
python scripts/download_xiaohongshu.py "URL" --full

# Full resource pack with AI summary metadata
python scripts/download_xiaohongshu.py "URL" --summary

# List available formats
python scripts/download_xiaohongshu.py "URL" --list-formats

# Use Firefox cookies instead of Chrome
python scripts/download_xiaohongshu.py "URL" --browser firefox
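
The same choices can be expressed through yt-dlp's Python API. The sketch below builds an options dictionary from a quality level, an output directory, and a browser name. The mapping from quality levels to format selectors is an assumption (the script may choose differently), and `QUALITY_FORMATS` and `build_ydl_opts` are hypothetical names; the dictionary keys themselves (`format`, `outtmpl`, `merge_output_format`, `cookiesfrombrowser`) are standard yt-dlp options.

```python
import os

# Format selectors use yt-dlp's standard selection syntax. Which selector the
# script really uses for each quality level is an assumption.
QUALITY_FORMATS = {
    "best": "bestvideo+bestaudio/best",
    "1080p": "bestvideo[height<=1080]+bestaudio/best[height<=1080]",
    "720p": "bestvideo[height<=720]+bestaudio/best[height<=720]",
    "480p": "bestvideo[height<=480]+bestaudio/best[height<=480]",
}


def build_ydl_opts(output_dir="~/Downloads", quality="best", browser="chrome"):
    """Build a yt-dlp options dict matching the documented CLI defaults."""
    opts = {
        "format": QUALITY_FORMATS[quality],
        # Matches the documented output name: <title> [<id>].mp4
        "outtmpl": os.path.join(os.path.expanduser(output_dir),
                                "%(title)s [%(id)s].%(ext)s"),
        "merge_output_format": "mp4",
    }
    if browser != "none":
        # yt-dlp expects a tuple whose first element is the browser name.
        opts["cookiesfrombrowser"] = (browser,)
    return opts
```

With yt-dlp installed, the result could be passed along as `yt_dlp.YoutubeDL(build_ydl_opts()).download([url])`.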

Subtitle Acquisition Strategy

The full resource pack mode uses a 3-tier strategy to obtain subtitles:

  1. Manual subtitles — Tries to download creator-uploaded subtitles via yt-dlp --write-subs
  2. Auto-generated subtitles — Tries platform auto-generated subtitles via yt-dlp --write-auto-subs
  3. Whisper transcription — Falls back to local speech-to-text using faster-whisper
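
The three tiers amount to a fallback chain: try each source in order and stop at the first one that produces a subtitle file. A minimal sketch under that reading follows. `acquire_subtitles` and `TIER_FLAGS` are hypothetical names, and the extra flags `--skip-download` and `--sub-format` are real yt-dlp flags that this sketch assumes would be useful here, not flags the script is known to pass.

```python
from typing import Callable, Iterable, Optional, Tuple

# yt-dlp flags for the first two tiers. --write-subs and --write-auto-subs come
# from the list above; the remaining flags are assumptions of this sketch.
TIER_FLAGS = {
    "manual": ["--write-subs", "--skip-download", "--sub-format", "vtt"],
    "auto": ["--write-auto-subs", "--skip-download", "--sub-format", "vtt"],
}


def acquire_subtitles(
    tiers: Iterable[Tuple[str, Callable[[], Optional[str]]]],
) -> Optional[Tuple[str, str]]:
    """Try each (name, fetch) tier in order; return the first (name, path) hit."""
    for name, fetch in tiers:
        try:
            path = fetch()
        except Exception:
            # A failing tier is treated the same as a tier with no subtitles.
            path = None
        if path:
            return name, path
    return None
```

Each `fetch` callable would wrap one tier (a yt-dlp call with the flags above, or a Whisper run) and return the subtitle path or `None`.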

The Whisper fallback splits the audio at silent points and transcribes the resulting chunks in parallel across CPU cores, which shortens processing time.
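
One way to find silent split points is ffmpeg's `silencedetect` filter, which logs `silence_start` and `silence_end` timestamps. The sketch below parses such a log and picks split points at silence midpoints. Whether parallel_transcribe.py works this way is an assumption; the function names and the 60-second target chunk length are hypothetical.

```python
import re

# silencedetect logs lines such as "silence_start: 64" and
# "silence_end: 66 | silence_duration: 2".
SILENCE_START = re.compile(r"silence_start:\s*(-?\d+(?:\.\d+)?)")
SILENCE_END = re.compile(r"silence_end:\s*(-?\d+(?:\.\d+)?)")


def parse_silences(log: str):
    """Pair up silence_start / silence_end values found in the log text."""
    starts = [float(x) for x in SILENCE_START.findall(log)]
    ends = [float(x) for x in SILENCE_END.findall(log)]
    return list(zip(starts, ends))


def choose_split_points(silences, duration, target_chunk=60.0):
    """Pick silence midpoints that are at least target_chunk seconds apart."""
    points = []
    last = 0.0
    for start, end in silences:
        mid = (start + end) / 2
        if mid - last >= target_chunk and mid < duration:
            points.append(mid)
            last = mid
    return points


def chunk_ranges(points, duration):
    """Turn split points into (start, end) ranges covering the whole audio."""
    edges = [0.0, *points, duration]
    return list(zip(edges[:-1], edges[1:]))
```

The log could come from a command such as `ffmpeg -i audio.mp3 -af silencedetect=noise=-30dB:d=0.5 -f null -`, and each resulting range could then be handed to a worker process, for example through `concurrent.futures.ProcessPoolExecutor`.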

Troubleshooting

| Problem | Solution |
| --- | --- |
| No video formats found | Log into xiaohongshu.com in your browser first, then retry with --browser chrome |
| Unable to extract initial state | CAPTCHA triggered: open the URL in your browser, solve it, then retry |
| Link expired | Copy a fresh share link from Xiaohongshu (tokens expire) |
| Low quality only | Maximum is 1080p (platform limitation). Use -q best |
| No subtitles found | The script automatically falls back to Whisper transcription |
| Whisper fails | Install uv (brew install uv) or manually install faster-whisper |

How It Works

This tool relies on yt-dlp's built-in XiaoHongShu extractor, which:

  1. Downloads the Xiaohongshu webpage and extracts window.__INITIAL_STATE__ JSON
  2. Parses video metadata including multiple codec formats (H.264, H.265/HEVC, AV1)
  3. Uses browser cookies (web_session) to authenticate with the platform
  4. Downloads the video stream and merges audio/video if needed via ffmpeg
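
Step 1 can be pictured with a simplified sketch: locate the `window.__INITIAL_STATE__` assignment in the page HTML and parse it as JSON. This is not yt-dlp's extractor code. `extract_initial_state` is a hypothetical name, the `undefined` handling is deliberately naive, and the shape of the sample object in any test data is invented for illustration.

```python
import json
import re

# Capture the object literal assigned to window.__INITIAL_STATE__ up to the
# closing </script> tag.
STATE_RE = re.compile(
    r"window\.__INITIAL_STATE__\s*=\s*(\{.*?\})\s*</script>", re.DOTALL
)


def extract_initial_state(html: str) -> dict:
    """Return the page's embedded initial state as a Python dict."""
    match = STATE_RE.search(html)
    if match is None:
        raise ValueError("Unable to extract initial state")
    # The embedded object is JavaScript rather than strict JSON, so bare
    # `undefined` values are mapped to null. This naive substitution would
    # also alter the word "undefined" inside string values.
    raw = re.sub(r"\bundefined\b", "null", match.group(1))
    return json.loads(raw)
```

When the pattern is absent, for instance because a CAPTCHA page was served instead, the function raises an error similar to the one listed under Troubleshooting.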

In full resource pack mode, the tool additionally:

  1. Extracts audio to MP3 using ffmpeg
  2. Acquires subtitles via the 3-tier strategy
  3. Generates a plain-text transcript from the subtitles
  4. (Optional) Prepares metadata for AI summary generation
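
Step 3, turning subtitles into a plain-text transcript, can be sketched as follows: drop the WebVTT header and timing lines, strip inline tags, and skip consecutive repeated lines. This is a minimal illustration rather than the script's implementation; `vtt_to_text` is a hypothetical name, and real subtitle files may need more careful handling.

```python
import re

# A cue timing line looks like "00:00:01.000 --> 00:00:03.000" (hours optional).
TIMESTAMP_LINE = re.compile(r"^\d{2,}:\d{2}(?::\d{2})?\.\d{3}\s+-->\s+")
TAG = re.compile(r"<[^>]+>")


def vtt_to_text(vtt: str) -> str:
    """Convert WebVTT content to plain text, one cue line per output line."""
    lines = []
    for block in re.split(r"\n\s*\n", vtt.strip()):
        block_lines = block.splitlines()
        if not block_lines:
            continue
        if block_lines[0].strip().startswith(("WEBVTT", "NOTE", "STYLE")):
            continue  # header, comment, or style block
        for i, line in enumerate(block_lines):
            if TIMESTAMP_LINE.match(line.strip()):
                # Everything after the timing line is cue text.
                for text in block_lines[i + 1:]:
                    text = TAG.sub("", text).strip()
                    # Skip consecutive repeats of the same line.
                    if text and (not lines or lines[-1] != text):
                        lines.append(text)
                break
    return "\n".join(lines)
```

The returned string could be written to transcript.txt next to subtitle.vtt.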

Project Structure

xiaohongshu-downloader/
├── SKILL.md                              # Skill definition & workflow
├── README.md                             # This file
├── LICENSE                               # MIT License
├── .gitignore
├── scripts/
│   ├── download_xiaohongshu.py           # Main downloader script
│   └── parallel_transcribe.py            # Parallel Whisper transcription
└── reference/
    └── summary-prompt.md                 # AI summary prompt template

License

MIT

Disclaimer

This tool is for personal and educational use only. Please respect copyright laws and Xiaohongshu's terms of service. Always ensure you have the right to download content before using this tool.



