OCRFlux Pipeline #31

@dantetemplar

Description

Pipeline Name

OCRFlux

URL

https://github.com/chatdoc-com/OCRFlux

GitHub URL

https://github.com/chatdoc-com/OCRFlux

License

Apache-2.0

Custom License

No response

Pipeline Description

OCRFlux is a toolkit based on a multimodal large language model that converts PDFs and images into clean, readable, plain Markdown text. It handles complex layouts well, including multi-column pages, figures, insets, complicated tables, and equations. It also removes headers and footers automatically and natively supports merging tables and paragraphs that span pages, a pioneering feature among open-source OCR tools. Built on a 3-billion-parameter vision-language model, it can run efficiently on GPUs such as the RTX 3090. OCRFlux supports batch inference over whole documents, and its benchmarks report significant improvements in parsing quality over several leading OCR models.

Primary Language

No response

Demo (if available)

https://ocrflux.pdfparser.io/

Has the pipeline been benchmarked? If yes, provide benchmark results or a link to evaluation metrics.

No response

Does it have an API?

No

API URL (if applicable)

No response

API Pricing Page (if applicable)

No response

API Average Price per 1000 Pages (if applicable)

No response

Additional Notes

  • Recommended GPU: 24 GB of VRAM or more for best performance; tensor parallelism is supported to split the workload across multiple smaller GPUs
  • Includes Docker container support for easy deployment
  • Provides command-line options for customizing inference, GPU memory utilization, page-merging behavior, and data type selection
  • Writes results as JSONL files that can be converted into Markdown documents
  • Developed and maintained by the ChatDOC team
  • Has 2.3k stars on GitHub
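
For anyone evaluating the JSONL output mentioned above, here is a minimal sketch of turning such a file into one Markdown file per document. It is not OCRFlux's own conversion tool. The field names `document_text` and `orig_path` are assumptions about the record schema and should be checked against the actual output; the function name `jsonl_to_markdown` is hypothetical.

```python
import json
from pathlib import Path


def jsonl_to_markdown(jsonl_path, out_dir, text_key="document_text", name_key="orig_path"):
    """Write one Markdown file per JSONL record and return the written paths.

    text_key and name_key are ASSUMED field names, not confirmed by this issue;
    pass different keys if the real records use another schema.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    with open(jsonl_path, encoding="utf-8") as handle:
        for index, line in enumerate(handle):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines between records
            record = json.loads(line)
            text = record.get(text_key)
            if text is None:
                continue  # record lacks the assumed text field, skip it
            # Name the output after the source file, or fall back to the line index.
            stem = Path(str(record.get(name_key, f"document_{index}"))).stem
            target = out_dir / f"{stem}.md"
            target.write_text(text, encoding="utf-8")
            written.append(target)
    return written
```

Calling `jsonl_to_markdown("results.jsonl", "markdown_out")` would write a `.md` file for each record that carries the assumed text field. Records without it are skipped rather than raising, so a schema mismatch shows up as an empty result list instead of a crash.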
