0xSA7/md2docx2pdf

md2docx2pdf

A small suite of Python scripts to convert Markdown -> DOCX -> PDF.

This repository contains three tools:

  • md2docx.py: Converts Markdown (.md) to Microsoft Word (.docx). Supports math (LaTeX) rendering, images, code blocks, tables, and batch conversion.
  • docx2pdf.py: Converts Word (.docx/.doc) to PDF. Tries several backends: the docx2pdf package or the Word COM interface (via comtypes) on Windows, and LibreOffice on any platform.
  • md2pdf_pipeline.py: Runs md2docx.py and then docx2pdf.py to produce PDF output from Markdown, with an option to keep the intermediate DOCX files.
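
The two-step pipeline can be pictured as two command invocations run in sequence. The following is a minimal sketch, not the actual code of md2pdf_pipeline.py: the helper names `build_pipeline_commands` and `run_pipeline` are hypothetical, and placing the intermediate DOCX next to the output PDF is an assumption. Only the `script input output` command forms are taken from the Usage section.

```python
import subprocess
import sys
from pathlib import Path


def build_pipeline_commands(md_path, pdf_path=None):
    """Return the two argv lists to run, in order (hypothetical helper)."""
    md = Path(md_path)
    pdf = Path(pdf_path) if pdf_path else md.with_suffix(".pdf")
    # Assumption: the intermediate DOCX sits next to the final PDF.
    docx = pdf.with_suffix(".docx")
    return [
        [sys.executable, "md2docx.py", str(md), str(docx)],
        [sys.executable, "docx2pdf.py", str(docx), str(pdf)],
    ]


def run_pipeline(md_path, pdf_path=None, keep_docx=False):
    """Run both steps; stop at the first failure (check=True raises)."""
    commands = build_pipeline_commands(md_path, pdf_path)
    for argv in commands:
        subprocess.run(argv, check=True)
    docx = Path(commands[0][-1])
    if not keep_docx:
        # Mirrors the documented option to keep or discard the DOCX.
        docx.unlink(missing_ok=True)
    return Path(commands[1][-1])
```

Calling `run_pipeline("report.md", keep_docx=True)` from the repository root would run both scripts and leave `report.docx` in place.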

Features

  • Markdown to DOCX with math rendering (matplotlib-generated images).
  • DOCX to PDF conversion with fallbacks across multiple backends.
  • Batch processing and recursive directory scanning.
  • Verbose logging mode for debugging.
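
Recursive directory scanning of the kind listed above can be sketched with `pathlib`. This is an illustration only; `find_markdown_files` is a hypothetical name, and the scripts may collect files differently (for example, matching other extensions).

```python
from pathlib import Path


def find_markdown_files(root, recursive=False):
    """Return .md files under root, sorted for a stable processing order."""
    root = Path(root)
    # "**/" makes glob descend into subdirectories; without it only the
    # top level of root is scanned.
    pattern = "**/*.md" if recursive else "*.md"
    return sorted(p for p in root.glob(pattern) if p.is_file())
```

Sorting the result keeps batch runs reproducible, which helps when comparing verbose logs between runs.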

Requirements

The scripts rely on a few Python packages. Install them with pip:

pip install -r requirements.txt

If you prefer to install individual packages (used across the scripts):

pip install python-docx markdown2 pillow requests beautifulsoup4 matplotlib
pip install docx2pdf comtypes    # optional; Windows-only backends

Notes:

  • On Linux/macOS, LibreOffice is used for DOCX → PDF conversion if other backends are unavailable. Install LibreOffice with your platform's package manager.
  • On Windows, installing docx2pdf or comtypes (and having MS Word installed) enables higher-fidelity conversions.
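
To check whether LibreOffice is usable before converting, something like the sketch below may help. The `--headless`, `--convert-to`, and `--outdir` options are standard LibreOffice command-line switches; the helper names `find_soffice` and `build_soffice_command` are hypothetical and are not taken from docx2pdf.py.

```python
import shutil


def find_soffice():
    """Return the first LibreOffice executable found on PATH, or None."""
    for name in ("soffice", "libreoffice"):
        path = shutil.which(name)
        if path:
            return path
    return None


def build_soffice_command(docx_path, out_dir, soffice="soffice"):
    """Build a headless LibreOffice command that writes a PDF into out_dir."""
    return [
        soffice,
        "--headless",            # no GUI
        "--convert-to", "pdf",   # target format
        "--outdir", str(out_dir),
        str(docx_path),
    ]
```

If `find_soffice()` returns `None`, LibreOffice is either not installed or not on PATH.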

Usage

All scripts are CLI tools. Run them with Python 3.

Markdown -> DOCX (single file or batch):

python md2docx.py input.md
# specify output
python md2docx.py input.md output.docx
# batch
python md2docx.py --batch path\to\dir

DOCX -> PDF (single file or batch):

python docx2pdf.py input.docx
# specify output
python docx2pdf.py input.docx output.pdf
# force backend
python docx2pdf.py input.docx --backend libreoffice
# batch
python docx2pdf.py --batch path\to\dir

Markdown -> PDF pipeline:

python md2pdf_pipeline.py input.md
# specify output
python md2pdf_pipeline.py input.md output.pdf
# keep intermediate DOCX
python md2pdf_pipeline.py input.md --keep-docx
# batch
python md2pdf_pipeline.py --batch path\to\dir

Add --verbose or -v to any command to enable more detailed logging.

Backend details & troubleshooting

  • docx2pdf: Python package that wraps Microsoft Word on Windows. Requires MS Office. If the package or Word is missing, conversion falls back to another backend.
  • comtypes: Uses the Word COM interface directly on Windows. Requires MS Word.
  • LibreOffice: Cross-platform soffice/libreoffice command-line conversion. Must be installed and available on PATH. On Windows, the scripts try common installation paths.
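
The fallback behaviour described above could look roughly like the following. This is a sketch under assumptions: the preference order (docx2pdf, then comtypes, then LibreOffice) and the backend names other than `libreoffice` are guesses, and `choose_backends` and `detect_backends` are hypothetical helpers rather than functions from docx2pdf.py.

```python
import importlib.util
import shutil
import sys


def choose_backends(platform, has_docx2pdf, has_comtypes, has_libreoffice):
    """Return usable backends in an assumed order of preference."""
    order = []
    if platform.startswith("win"):
        # Word-based backends are only considered on Windows here.
        if has_docx2pdf:
            order.append("docx2pdf")
        if has_comtypes:
            order.append("comtypes")
    if has_libreoffice:
        order.append("libreoffice")
    return order


def detect_backends():
    """Probe the current machine and return the candidate backends."""
    # Caveat: when run from the repository directory, find_spec("docx2pdf")
    # may match the local docx2pdf.py script instead of the pip package.
    return choose_backends(
        sys.platform,
        importlib.util.find_spec("docx2pdf") is not None,
        importlib.util.find_spec("comtypes") is not None,
        any(shutil.which(name) for name in ("soffice", "libreoffice")),
    )
```

An empty list from `detect_backends()` would mean no conversion backend is available. Note that this probe only checks for installed packages and executables, not whether MS Word itself is present.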

Common issues:

  • Missing Python packages: Run pip install -r requirements.txt.
  • LibreOffice not found: Install LibreOffice and make sure it is on PATH, or, on Windows, choose a different backend with --backend.
  • Permission errors saving output: Ensure the output directory exists and you have write permissions.

Examples

Convert a single markdown file to PDF in one step:

python md2pdf_pipeline.py report.md report.pdf

Batch convert all markdown files under a directory (recursively):

python md2pdf_pipeline.py --batch C:\projects\notes --recursive

Convert a directory of DOCX files to PDF using LibreOffice explicitly:

python docx2pdf.py --batch C:\docs --backend libreoffice

Development & Tests

This is a small utility repository. To run basic smoke tests, try converting a minimal markdown file and inspect the output.
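
A smoke test along these lines might be scripted as below. It is a sketch, not part of the repository: `write_minimal_markdown` and `smoke_test` are hypothetical names, and the `$...$` math delimiter in the sample document is an assumption about what md2docx.py accepts.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Assumption: inline math uses $...$; adjust if md2docx.py expects otherwise.
MINIMAL_MD = (
    "# Smoke test\n\n"
    "A paragraph with inline math $a^2 + b^2 = c^2$.\n\n"
    "| a | b |\n|---|---|\n| 1 | 2 |\n"
)


def write_minimal_markdown(directory):
    """Write a tiny Markdown file with a heading, math, and a table."""
    path = Path(directory) / "smoke.md"
    path.write_text(MINIMAL_MD, encoding="utf-8")
    return path


def smoke_test(script="md2pdf_pipeline.py"):
    """Run the pipeline on the sample; True if a non-empty PDF appears."""
    with tempfile.TemporaryDirectory() as tmp:
        md = write_minimal_markdown(tmp)
        pdf = md.with_suffix(".pdf")
        result = subprocess.run([sys.executable, script, str(md), str(pdf)])
        return result.returncode == 0 and pdf.is_file() and pdf.stat().st_size > 0
```

Run `smoke_test()` from the repository root so the script path resolves. The check only confirms that a non-empty PDF was produced, so the rendered output still needs a visual inspection.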

License

This project includes a LICENSE file in the repository root. Review it for terms.

Contributing

Contributions are welcome. Please open issues or pull requests. When adding features, include tests and update this README with new usage notes.

About

This tool is AI-generated.