Wenbi converts media and text into structured Markdown, then rewrites or translates it.
It supports:
- Video/audio/URL transcription to VTT/Markdown
- Text rewriting (rewrite, academic style)
- Translation (translate) with DeepL first, then LLM fallback
- PPT-style slide + speech combination (ppt)
- Batch directory processing (wenbi-batch)
Prerequisites:
- Python 3.10+
- ffmpeg in PATH
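Both prerequisites can be checked from Python before installing. The following is a minimal sketch using only the standard library; the helper name check_prerequisites is hypothetical and is not part of wenbi.

```python
import shutil
import sys


def check_prerequisites(min_python=(3, 10), tools=("ffmpeg",)):
    """Return a list of human-readable problems; an empty list means all good."""
    problems = []
    if sys.version_info < min_python:
        found = ".".join(str(part) for part in sys.version_info[:3])
        wanted = ".".join(str(part) for part in min_python)
        problems.append(f"Python {wanted}+ required, found {found}")
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH.
        if shutil.which(tool) is None:
            problems.append(f"{tool} not found in PATH")
    return problems
```

Calling check_prerequisites() with no arguments checks for Python 3.10+ and ffmpeg; an empty list means the environment looks ready.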
Install:
    pip install wenbi

Or from a local checkout:

    # from the project directory
    pip install -e .

Rewrite:

    wenbi rewrite input.mp4 --lang Chinese --llm ollama/qwen3

Translate (DeepL first):

    wenbi translate input.md --lang Chinese --deepl-key "$DEEPL_API_KEY"

PPT workflow:

    wenbi ppt lecture.mp4 --lang English

Rewrite spoken/transcribed text into written style.
    wenbi rewrite <input> [options]

Key options:
- --style rewrite|academic
- --lang
- --llm
- --cite-timestamps
- --start-time, --end-time (media/URL)
Translate content to a target language.
    wenbi translate <input> --lang <target> [options]

Key options:
- --deepl-key (or DEEPL_API_KEY env var)
- --llm (fallback model)
- --keep-original-lang
- --cite-timestamps
translate uses this order:
- Try DeepL API first (when key is available).
- If DeepL is unavailable or a chunk fails to translate, fall back to the LLM.
If both DeepL and LLM are unavailable, translation cannot complete successfully.
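The order above can be sketched as a per-chunk fallback. This is an illustration of the pattern only, not wenbi's actual implementation: translate_chunks and the two callables passed to it are hypothetical stand-ins for the DeepL and LLM backends.

```python
from typing import Callable, Iterable, Optional

Translator = Callable[[str], str]


def translate_chunks(
    chunks: Iterable[str],
    deepl: Optional[Translator],
    llm: Optional[Translator],
) -> list[str]:
    """Translate each chunk with DeepL first, falling back to the LLM per chunk."""
    results = []
    for chunk in chunks:
        translated = None
        if deepl is not None:
            try:
                translated = deepl(chunk)
            except Exception:
                # Chunk-level DeepL failure: fall through to the LLM.
                translated = None
        if translated is None:
            if llm is None:
                # Neither backend is usable, so translation cannot complete.
                raise RuntimeError("no translation backend available")
            translated = llm(chunk)
        results.append(translated)
    return results
```

Passing deepl=None models the case where no DeepL key is configured; passing both as None reproduces the failure described above.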
Extract slides from video, align with speech, and export combined markdown.
    wenbi ppt <video_or_url> [options]

Key options:
- --frame-interval
- --cropped-slide [auto|x0,y0,x1,y1]
- --ppt <ppt/pdf/image/odp>
- --no-ocr
- --no-clean
- --ssim-threshold, --hist-threshold, --dedup-method
- Media: .mp4 .avi .mov .mkv .flv .wmv .m4v .webm .mp3 .flac .aac .ogg .m4a .opus
- Text/subtitles: .vtt .srt .ass .ssa .sub .smi .txt .md .markdown .docx
- URL inputs are supported for media flows.
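When scripting around wenbi, it can help to sort inputs by type before choosing a subcommand. The sketch below uses the extension lists above; classify_input is a hypothetical helper, and treating any http(s) string as a URL input is an assumption, not wenbi's own detection logic.

```python
from pathlib import Path

MEDIA_EXTENSIONS = {
    ".mp4", ".avi", ".mov", ".mkv", ".flv", ".wmv", ".m4v", ".webm",
    ".mp3", ".flac", ".aac", ".ogg", ".m4a", ".opus",
}
TEXT_EXTENSIONS = {
    ".vtt", ".srt", ".ass", ".ssa", ".sub", ".smi",
    ".txt", ".md", ".markdown", ".docx",
}


def classify_input(source: str) -> str:
    """Return "url", "media", "text", or "unsupported" for an input string."""
    # Assumption: URL inputs are recognized by their scheme prefix.
    if source.startswith(("http://", "https://")):
        return "url"
    suffix = Path(source).suffix.lower()  # case-insensitive extension match
    if suffix in MEDIA_EXTENSIONS:
        return "media"
    if suffix in TEXT_EXTENSIONS:
        return "text"
    return "unsupported"
```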
Used by subcommands:
- --output-dir
- --lang
- --llm
- --chunk-length
- --max-tokens
- --timeout
- --temperature
- --transcribe-model
- --transcribe-lang
- --multi-language
- --verbose
Typical outputs:
- *_rewritten.md
- *_translated.md
- *_academic.md
- *_combine.md / *_combine_clean.md (PPT workflows)
- *.vtt, *.csv (depending on flow)
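After a run, the patterns above can be used to see which outputs were produced. This is a minimal sketch with a hypothetical helper, collect_outputs; it only globs an output directory and makes no claim about which flow writes which file.

```python
from pathlib import Path

# Patterns taken from the list of typical outputs.
OUTPUT_PATTERNS = (
    "*_rewritten.md",
    "*_translated.md",
    "*_academic.md",
    "*_combine.md",
    "*_combine_clean.md",
    "*.vtt",
    "*.csv",
)


def collect_outputs(output_dir):
    """Map each output pattern to the sorted file names matching it."""
    found = {}
    for pattern in OUTPUT_PATTERNS:
        matches = sorted(path.name for path in Path(output_dir).glob(pattern))
        if matches:  # skip patterns with no matching files
            found[pattern] = matches
    return found
```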
Process a directory of media files:
    wenbi-batch <input_dir> --output-dir <dir> --md

Optional config:

    wenbi-batch <input_dir> --config config.yaml

wenbi supports YAML via --config.
Example:
    input: lecture.mp4
    output_dir: ./out
    llm: ollama/qwen3
    lang: Chinese
    chunk_length: 20

A multi-input format is also supported through the inputs: key.
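The keys in the example config look like the shared CLI flags with underscores in place of hyphens. Assuming that correspondence holds (it is inferred from the names, not confirmed here), a config can be turned into an equivalent command line. The helper config_to_args is hypothetical, and a plain dict stands in for the parsed YAML.

```python
def config_to_args(subcommand, config):
    """Build a wenbi argument list from a single-input config mapping."""
    # Assumption: "input" is the positional argument and every other key
    # maps to a flag named after it, with "_" replaced by "-".
    args = ["wenbi", subcommand, str(config["input"])]
    for key, value in config.items():
        if key == "input":
            continue
        args += ["--" + key.replace("_", "-"), str(value)]
    return args


example = {
    "input": "lecture.mp4",
    "output_dir": "./out",
    "llm": "ollama/qwen3",
    "lang": "Chinese",
    "chunk_length": 20,
}
command = config_to_args("rewrite", example)
```

The resulting list can be passed to subprocess.run, which avoids shell quoting problems with paths that contain spaces.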
    from wenbi.main import process_input

    # Translate input.mp4 into Chinese: DeepL first, with the LLM as fallback.
    text, md_file, csv_file, base_name = process_input(
        file_path="input.mp4",
        subcommand="translate",
        lang="Chinese",
        use_deepl=True,
        deepl_key="<DEEPL_KEY>",
        llm="ollama/qwen3",
    )

- No DeepL translation output:
  - Set DEEPL_API_KEY or --deepl-key
  - Run with --verbose to confirm DeepL connectivity
- Fallback LLM not working:
  - Ensure your provider is reachable (for example, Ollama running locally for ollama/...)
- PPT OCR issues:
  - Ensure marker_single and OCR dependencies are installed correctly
Apache-2.0