A small suite of Python scripts to convert Markdown -> DOCX -> PDF.
This repository contains three tools:
- md2docx.py: Converts Markdown (.md) to Microsoft Word (.docx). Supports LaTeX math rendering, images, code blocks, tables, and batch conversion.
- docx2pdf.py: Converts Word (.docx/.doc) to PDF. Tries multiple backends: docx2pdf (Windows), Word COM via comtypes (Windows), or LibreOffice on other platforms.
- md2pdf_pipeline.py: Runs md2docx.py and then docx2pdf.py to produce PDF output from Markdown, with an option to keep the intermediate DOCX files.
Features:
- Markdown to DOCX with math rendering (matplotlib-generated images).
- DOCX to PDF conversion with fallbacks across multiple backends.
- Batch processing and recursive directory scanning.
- Verbose logging mode for debugging.
The scripts rely on a few Python packages. Install them with pip:
pip install -r requirements.txt
If you prefer to install individual packages (used across the scripts):
pip install python-docx markdown2 pillow requests beautifulsoup4 matplotlib
pip install docx2pdf comtypes # optional; Windows-only backends
Notes:
- On Linux/macOS, LibreOffice is used for DOCX → PDF conversion if other backends are unavailable. Install LibreOffice from your platform package manager.
- On Windows, installing docx2pdf or comtypes (and having MS Word installed) enables higher-fidelity conversions.
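To see which backends a machine is likely to support before converting, a check along these lines may help. It is a minimal sketch, not code from this repository: the function name available_backends is hypothetical, and an importable docx2pdf or comtypes package does not prove that MS Word itself is installed.

```python
import importlib.util
import platform
import shutil


def available_backends():
    """Return the names of DOCX -> PDF backends that look usable here.

    Hypothetical helper for illustration; the scripts may detect
    backends differently.
    """
    found = []
    if platform.system() == "Windows":
        # Importable packages only; MS Word must also be installed.
        for module in ("docx2pdf", "comtypes"):
            if importlib.util.find_spec(module) is not None:
                found.append(module)
    # LibreOffice exposes its CLI as `soffice`; some packages also
    # install a `libreoffice` launcher.
    if shutil.which("soffice") or shutil.which("libreoffice"):
        found.append("libreoffice")
    return found
```

Calling print(available_backends()) lists the candidates found on the current machine; an empty list suggests that LibreOffice or one of the Windows packages still needs to be installed.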
All scripts are CLI tools. Run them with Python 3.
Markdown -> DOCX (single file):
python md2docx.py input.md
# specify output
python md2docx.py input.md output.docx
# batch
python md2docx.py --batch path\to\dir
DOCX -> PDF (single file):
python docx2pdf.py input.docx
# specify output
python docx2pdf.py input.docx output.pdf
# force backend
python docx2pdf.py input.docx --backend libreoffice
# batch
python docx2pdf.py --batch path\to\dir
Markdown -> PDF pipeline:
python md2pdf_pipeline.py input.md
# specify output
python md2pdf_pipeline.py input.md output.pdf
# keep intermediate DOCX
python md2pdf_pipeline.py input.md --keep-docx
# batch
python md2pdf_pipeline.py --batch path\to\dir
Add --verbose or -v to any command to enable more detailed logging.
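The pipeline amounts to running the two single-file commands shown above in sequence. The sketch below illustrates that idea using only the documented command-line forms; the helper names build_commands and run_pipeline are hypothetical, and placing the intermediate DOCX next to the PDF is an assumption rather than the confirmed behavior of md2pdf_pipeline.py.

```python
import subprocess
import sys
from pathlib import Path


def build_commands(md_path, pdf_path=None):
    """Return the intermediate DOCX path and the two commands to run."""
    md = Path(md_path)
    pdf = Path(pdf_path) if pdf_path else md.with_suffix(".pdf")
    # Assumption: the intermediate DOCX is written next to the PDF.
    docx = pdf.with_suffix(".docx")
    commands = [
        [sys.executable, "md2docx.py", str(md), str(docx)],
        [sys.executable, "docx2pdf.py", str(docx), str(pdf)],
    ]
    return docx, commands


def run_pipeline(md_path, pdf_path=None, keep_docx=False):
    """Run Markdown -> DOCX -> PDF, optionally keeping the DOCX."""
    docx, commands = build_commands(md_path, pdf_path)
    for cmd in commands:
        # check=True raises CalledProcessError on a non-zero exit code.
        subprocess.run(cmd, check=True)
    if not keep_docx:
        docx.unlink(missing_ok=True)  # requires Python 3.8+
```

Separating command construction from execution keeps the first step easy to inspect or log before anything is run.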
- docx2pdf: Python package that wraps Microsoft Word on Windows. Requires MS Office. If the package or Word is missing, conversion falls back to another backend.
- comtypes: Uses the Word COM interface directly on Windows. Requires MS Word.
- LibreOffice: Cross-platform soffice/libreoffice command-line conversion. Must be installed and available on PATH. On Windows, the scripts try common installation paths.
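The fallback behavior across these backends can be pictured as trying each one in order until a conversion succeeds. The following is a rough sketch under stated assumptions, not the code in docx2pdf.py: all function names are hypothetical, the comtypes backend is omitted for brevity, and the backend order is left to the caller.

```python
import shutil
import subprocess
from pathlib import Path


def convert_with_libreoffice(docx_path, out_dir):
    """Convert via the LibreOffice CLI in headless mode."""
    exe = shutil.which("soffice") or shutil.which("libreoffice")
    if exe is None:
        raise RuntimeError("LibreOffice not found on PATH")
    subprocess.run(
        [exe, "--headless", "--convert-to", "pdf",
         "--outdir", str(out_dir), str(docx_path)],
        check=True,
    )
    return Path(out_dir) / (Path(docx_path).stem + ".pdf")


def convert_with_docx2pdf(docx_path, out_dir):
    """Convert via the docx2pdf package (needs MS Word)."""
    # Imported lazily so a missing package triggers the fallback.
    from docx2pdf import convert
    out = Path(out_dir) / (Path(docx_path).stem + ".pdf")
    convert(str(docx_path), str(out))
    return out


def convert_with_fallback(docx_path, out_dir, backends):
    """Try each (name, function) pair in order; return the first success."""
    errors = []
    for name, func in backends:
        try:
            return name, func(docx_path, out_dir)
        except Exception as exc:  # any failure moves on to the next backend
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all backends failed: " + "; ".join(errors))
```

A caller might pass [("docx2pdf", convert_with_docx2pdf), ("libreoffice", convert_with_libreoffice)] on Windows and only the LibreOffice entry elsewhere. Collecting the error messages makes it easier to see why every backend failed when running with verbose logging.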
Common issues:
- Missing Python packages: Run pip install -r requirements.txt.
- LibreOffice not found: Install LibreOffice or choose a different backend on Windows.
- Permission errors saving output: Ensure the output directory exists and you have write permissions.
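For the permission case, one way to rule out a missing or read-only output directory before converting is sketched below. The helper name prepare_output_path is hypothetical and is not part of the scripts; os.access is only a best-effort check and may not reflect every platform's permission model.

```python
import os
from pathlib import Path


def prepare_output_path(output_path):
    """Create the output directory if needed and check it is writable."""
    out = Path(output_path)
    # Create missing parent directories; no error if they already exist.
    out.parent.mkdir(parents=True, exist_ok=True)
    if not os.access(out.parent, os.W_OK):
        raise PermissionError(f"cannot write to {out.parent}")
    return out
```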
Convert a single markdown file to PDF in one step:
python md2pdf_pipeline.py report.md report.pdf
Batch convert all markdown files under a directory (recursively):
python md2pdf_pipeline.py --batch C:\projects\notes --recursive
Convert a directory of DOCX files to PDF using LibreOffice explicitly:
python docx2pdf.py --batch C:\docs --backend libreoffice
This is a small utility repository. To run basic smoke tests, try converting a minimal markdown file and inspect the output.
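Such a smoke test could be automated roughly as follows. This is a sketch under assumptions: the names MINIMAL_MD, looks_like_pdf, and smoke_test are hypothetical, the script is expected to be run from the repository root, and checking for the %PDF- header only confirms that a PDF was produced, not that it renders correctly.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

MINIMAL_MD = "# Smoke test\n\nHello, *world*.\n"


def looks_like_pdf(path):
    """Return True if the file starts with the PDF magic bytes."""
    with open(path, "rb") as fh:
        return fh.read(5) == b"%PDF-"


def smoke_test(pipeline="md2pdf_pipeline.py"):
    """Convert a minimal Markdown file and check that a PDF comes out."""
    with tempfile.TemporaryDirectory() as tmp:
        md = Path(tmp) / "smoke.md"
        pdf = Path(tmp) / "smoke.pdf"
        md.write_text(MINIMAL_MD, encoding="utf-8")
        # Uses the documented form: md2pdf_pipeline.py input.md output.pdf
        subprocess.run([sys.executable, pipeline, str(md), str(pdf)], check=True)
        return pdf.exists() and looks_like_pdf(pdf)
```

The header check does not replace opening the PDF and inspecting the output by eye.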
This project includes a LICENSE file in the repository root. Review it for terms.
Contributions welcome — please open issues or pull requests. When adding features, include tests and update this README with new usage notes.