Wenbi

Wenbi converts media and text into structured Markdown, then rewrites or translates it.

It supports:

Video/audio/URL transcription to VTT/Markdown
Text rewriting (rewrite), in either rewrite or academic style
Translation (translate), trying DeepL first and falling back to an LLM
PPT-style slide + speech combination (ppt)
Batch directory processing (wenbi-batch)

Install

Prerequisites:

Python 3.10+
ffmpeg in PATH

Install:

pip install wenbi

Or from a local checkout:

# from the project directory
pip install -e .

Quick Start

Rewrite:

wenbi rewrite input.mp4 --lang Chinese --llm ollama/qwen3

Translate (DeepL first):

wenbi translate input.md --lang Chinese --deepl-key "$DEEPL_API_KEY"

PPT workflow:

wenbi ppt lecture.mp4 --lang English

Commands

`rewrite` (`rw`)

Rewrite spoken or transcribed text in a written style.

wenbi rewrite <input> [options]

Key options:

--style rewrite|academic
--lang
--llm
--cite-timestamps
--start-time, --end-time (media/URL)

`translate` (`tr`)

Translate content to a target language.

wenbi translate <input> --lang <target> [options]

Key options:

--deepl-key (or DEEPL_API_KEY env var)
--llm (fallback model)
--keep-original-lang
--cite-timestamps

Translation behavior

translate uses this order:

1. Try the DeepL API first (when a key is available).
2. If DeepL is unavailable or a chunk fails to translate, fall back to the LLM.

If neither DeepL nor an LLM is available, the translation cannot complete.
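The order described above can be sketched in a few lines of Python. This is a minimal illustration of the DeepL-first pattern and not Wenbi's actual implementation: `translate_chunks`, `deepl_translate`, and `llm_translate` are hypothetical names, and the two backends are passed in as plain callables.

```python
from typing import Callable, Optional


def translate_chunks(
    chunks: list[str],
    deepl_translate: Optional[Callable[[str], str]],
    llm_translate: Optional[Callable[[str], str]],
) -> list[str]:
    """Translate each chunk, trying DeepL first and the LLM second.

    Both backends are hypothetical callables; pass None for one that is
    not configured (for example, when no DeepL key is set).
    """
    if deepl_translate is None and llm_translate is None:
        raise RuntimeError("no translation backend is available")

    results = []
    for chunk in chunks:
        translated = None
        if deepl_translate is not None:
            try:
                translated = deepl_translate(chunk)
            except Exception:
                # DeepL failed for this chunk; fall through to the LLM.
                translated = None
        if translated is None:
            if llm_translate is None:
                raise RuntimeError("DeepL failed and no LLM fallback is configured")
            translated = llm_translate(chunk)
        results.append(translated)
    return results
```

The fallback is decided per chunk in this sketch, so one failed DeepL request does not force the whole document through the LLM.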

`ppt` (`p`)

Extract slides from a video, align them with the speech, and export combined Markdown.

wenbi ppt <video_or_url> [options]

Key options:

--frame-interval
--cropped-slide [auto|x0,y0,x1,y1]
--ppt <ppt/pdf/image/odp>
--no-ocr
--no-clean
--ssim-threshold, --hist-threshold, --dedup-method

Supported Inputs

Media: .mp4 .avi .mov .mkv .flv .wmv .m4v .webm .mp3 .flac .aac .ogg .m4a .opus
Text/subtitles: .vtt .srt .ass .ssa .sub .smi .txt .md .markdown .docx
URL inputs are supported for media flows.
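When scripting around Wenbi, it can help to check an input before passing it to a command. The sketch below sorts a path or URL into the categories listed above. `classify_input` is a hypothetical helper and not part of the wenbi package; the extension sets are copied from this section, and the URL check (an `http://` or `https://` prefix) is an assumption.

```python
from pathlib import Path

# Extension lists copied from the "Supported Inputs" section.
MEDIA_EXTS = {
    ".mp4", ".avi", ".mov", ".mkv", ".flv", ".wmv", ".m4v", ".webm",
    ".mp3", ".flac", ".aac", ".ogg", ".m4a", ".opus",
}
TEXT_EXTS = {
    ".vtt", ".srt", ".ass", ".ssa", ".sub", ".smi",
    ".txt", ".md", ".markdown", ".docx",
}


def classify_input(source: str) -> str:
    """Return "url", "media", "text", or "unsupported" for an input string."""
    # Assumption: anything starting with an HTTP(S) scheme is a URL input.
    if source.startswith(("http://", "https://")):
        return "url"
    ext = Path(source).suffix.lower()  # case-insensitive extension match
    if ext in MEDIA_EXTS:
        return "media"
    if ext in TEXT_EXTS:
        return "text"
    return "unsupported"
```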

Common Global Options

These options are shared by the subcommands:

--output-dir
--lang
--llm
--chunk-length
--max-tokens
--timeout
--temperature
--transcribe-model
--transcribe-lang
--multi-language
--verbose

Output Files

Typical outputs:

*_rewritten.md
*_translated.md
*_academic.md
*_combine.md / *_combine_clean.md (PPT workflows)
*.vtt, *.csv (depending on the workflow)
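To gather results after a run, the naming patterns above can be matched with `pathlib`. This is a sketch under the assumption that all outputs land directly in the directory given to --output-dir; `collect_outputs` and the dictionary keys are hypothetical names chosen for illustration.

```python
from pathlib import Path

# Glob patterns taken from the list of typical outputs.
OUTPUT_PATTERNS = {
    "rewritten": "*_rewritten.md",
    "translated": "*_translated.md",
    "academic": "*_academic.md",
    "combined": "*_combine.md",
    "combined_clean": "*_combine_clean.md",
    "vtt": "*.vtt",
    "csv": "*.csv",
}


def collect_outputs(output_dir: str) -> dict[str, list[Path]]:
    """Group files in output_dir by the output pattern they match."""
    root = Path(output_dir)
    return {
        kind: sorted(root.glob(pattern))
        for kind, pattern in OUTPUT_PATTERNS.items()
    }
```

Kinds with no matching files map to an empty list, so callers can iterate without checking for missing keys.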

Batch Processing

Process a directory of media files:

wenbi-batch <input_dir> --output-dir <dir> --md

Optional config:

wenbi-batch <input_dir> --config config.yaml

YAML Config (CLI)

wenbi reads YAML configuration files passed with --config.

Example:

input: lecture.mp4
output_dir: ./out
llm: ollama/qwen3
lang: Chinese
chunk_length: 20

A multi-input format is also supported through the `inputs:` key.

Python API

from wenbi.main import process_input

text, md_file, csv_file, base_name = process_input(
    file_path="input.mp4",
    subcommand="translate",
    lang="Chinese",
    use_deepl=True,
    deepl_key="<DEEPL_KEY>",
    llm="ollama/qwen3",
)
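The call above can be wrapped to translate several files in one go. The sketch below is an illustration only: `translate_files` is a hypothetical helper, and it assumes that `process_input` accepts the keyword arguments shown above, returns the same four-item tuple, and tolerates `use_deepl=False` with an empty key. The processing function is passed in as a parameter so the wrapper can be exercised without wenbi installed.

```python
import os
from typing import Callable, Iterable


def translate_files(
    paths: Iterable[str],
    process: Callable[..., tuple],
    lang: str = "Chinese",
    llm: str = "ollama/qwen3",
) -> dict[str, str]:
    """Translate each path and map it to the Markdown file that was produced.

    `process` is expected to behave like wenbi.main.process_input
    (an assumption based on the example call in this README).
    """
    # Enable DeepL only when a key is present in the environment.
    deepl_key = os.environ.get("DEEPL_API_KEY", "")
    md_files = {}
    for path in paths:
        text, md_file, csv_file, base_name = process(
            file_path=path,
            subcommand="translate",
            lang=lang,
            use_deepl=bool(deepl_key),
            deepl_key=deepl_key,
            llm=llm,
        )
        md_files[path] = md_file
    return md_files


# Usage, assuming wenbi is installed:
#   from wenbi.main import process_input
#   results = translate_files(["a.md", "b.md"], process_input)
```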

Troubleshooting

No DeepL translation output:
- Set DEEPL_API_KEY or --deepl-key
- Run with --verbose to confirm DeepL connectivity

Fallback LLM not working:
- Ensure your provider is reachable (for example, Ollama running locally for ollama/...)

PPT OCR issues:
- Ensure marker_single and OCR dependencies are installed correctly

License

Apache-2.0
