PP-StructureV3 Pipeline

### Pipeline Name

PP-StructureV3

### URL

https://github.com/PaddlePaddle/PaddleOCR

### GitHub URL

https://github.com/PaddlePaddle/PaddleOCR

### License

Apache-2.0

### Custom License

_No response_

### Pipeline Description

PP-StructureV3 is a multi-model pipeline for document image parsing that converts document images or PDFs into structured JSON and Markdown files. It integrates five key modules: preprocessing to improve image quality, an OCR engine (PP-OCRv5), layout detection with PP-DocLayout-plus, recognition of document items (tables, formulas, charts, seals), and post-processing that reconstructs element relationships and reading order. The pipeline targets high accuracy on complex layouts, including multi-column text, magazines, handwritten documents, and vertically typeset languages.

Specialized models handle tables (PP-TableMagic), formulas (PP-FormulaNet_plus), charts (PP-Chart2Table), and seals (PP-OCRv4_seal). The pipeline is reported to achieve state-of-the-art results on benchmarks such as OmniDocBench, especially for Chinese and English documents, and to be competitive with both expert models and general vision-language models.
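The image/PDF-to-JSON/Markdown flow described above can be sketched as a small helper. This is a minimal sketch, not the project's reference usage: `parse_document` is a hypothetical name, and it assumes the pipeline object exposes `predict(path)` returning per-page results with `save_to_json(save_path=...)` and `save_to_markdown(save_path=...)`, which is how the PaddleOCR 3.x Python interface appears to work. Check the PaddleOCR documentation for the exact signatures in your installed version.

```python
from pathlib import Path


def parse_document(pipeline, input_path, out_dir):
    """Run a PP-StructureV3-style pipeline on one file and save the results.

    Assumption: `pipeline.predict(path)` yields one result per page, and each
    result offers `save_to_json` and `save_to_markdown` taking `save_path`.
    Returns the number of page results that were written.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    pages = 0
    for page_result in pipeline.predict(str(input_path)):
        # Each page result writes its own files into the output directory.
        page_result.save_to_json(save_path=str(out))
        page_result.save_to_markdown(save_path=str(out))
        pages += 1
    return pages
```

With PaddleOCR installed, the call would look like `parse_document(PPStructureV3(), "sample.pdf", "output")`, where `PPStructureV3` is imported from `paddleocr` and `sample.pdf` is a placeholder file name. Passing the pipeline in as an argument keeps the helper usable with any object that follows the same interface.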

### Primary Language

_No response_

### Demo (if available)

https://huggingface.co/spaces/PaddlePaddle/PP-StructureV3_Online_Demo

### Has the pipeline been benchmarked? If yes, provide benchmark results or a link to evaluation metrics.

_No response_

### Does it have an API?

No

### API URL (if applicable)

_No response_

### API Pricing Page (if applicable)

_No response_

### API Average Price per 1000 Page (if applicable)

_No response_

### Additional Notes

-   PP-StructureV3 uses PP-OCRv5 as its OCR backbone. PP-OCRv5 brings improvements in network architecture and training, and supports vertical text, handwriting, and rare Chinese characters.
-   Preprocessing includes document orientation classification and text unwarping.
-   Layout analysis uses PP-DocLayout-plus and a region detection model to handle multiple articles per page.
-   Table recognition with PP-TableMagic outputs table structures as HTML.
-   Formula recognition with PP-FormulaNet_plus outputs LaTeX.
-   Chart parsing with PP-Chart2Table converts charts into Markdown tables.
-   Seal recognition handles curved text and round/oval seals.
-   Post-processing improves reading-order reconstruction, especially for complex document layouts (e.g., multi-column magazines, vertical typesetting).
-   Performance has been tested on NVIDIA V100 and A100 GPUs, and detailed resource usage statistics are available.
-   The pipeline accepts both PDFs and images as input and can save results in JSON and Markdown formats.
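Since table recognition emits HTML while the final output is often consumed as Markdown, a downstream step may want to convert one into the other. The following is a rough sketch using only the Python standard library; `html_table_to_markdown` and `_TableRows` are hypothetical names and not part of PaddleOCR. It assumes a simple table: `rowspan`/`colspan` attributes are ignored, nested tables are not handled, and `|` characters inside cells are not escaped.

```python
from html.parser import HTMLParser


class _TableRows(HTMLParser):
    """Collect the text of each <td>/<th> cell, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None and self._row is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def html_table_to_markdown(html):
    """Convert a simple HTML table to a Markdown table (first row = header)."""
    parser = _TableRows()
    parser.feed(html)
    parser.close()
    rows = parser.rows
    if not rows:
        return ""

    # Pad short rows so every row has the same number of columns.
    width = max(len(r) for r in rows)
    padded = [r + [""] * (width - len(r)) for r in rows]

    lines = [
        "| " + " | ".join(padded[0]) + " |",
        "| " + " | ".join(["---"] * width) + " |",
    ]
    lines += ["| " + " | ".join(r) + " |" for r in padded[1:]]
    return "\n".join(lines)
```

For example, `html_table_to_markdown("<table><tr><th>A</th><th>B</th></tr><tr><td>1</td><td>2</td></tr></table>")` yields a three-line Markdown table with header `| A | B |`. Tables with merged cells would need a fuller converter.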
