A high-performance multi-format document parsing service supporting PDF, Word, Excel, and PowerPoint with GPU acceleration capabilities.
- 🚀 High-Performance Parsing: MinerU and MarkItDown dual-engine support
- 🎯 GPU Acceleration: CUDA/sglang support for GPU acceleration (optional)
- 🔧 Zero-Configuration Deployment: Automatic environment detection and dependency installation
- 📚 Multi-Format Support: PDF, Word, Excel, PowerPoint, Markdown, and more
- 🌐 HTTP API: RESTful API interface for easy integration
- 📊 Real-time Monitoring: Built-in performance monitoring and health checks
- ☁️ OSS Integration: Alibaba Cloud OSS support for cloud storage
cd document-parser
# Initialize uv virtual environment and dependencies (first time)
document-parser uv-init
# Check environment status
document-parser check

# Start document parsing service
document-parser server
# Or specify custom port
document-parser server --port 8088

The service will start at http://localhost:8087 (default) and automatically activate the virtual environment.
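
To confirm from a script that the service has come up, a plain TCP probe is enough. This is a minimal sketch using only the Python standard library; `is_port_open` is a hypothetical helper, not part of `document-parser`, and it only verifies that something is listening on the port, not that the service is healthy.

```python
import socket


def is_port_open(host: str = "localhost", port: int = 8087, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

Calling `is_port_open()` with no arguments checks the default address above; pass `port=8088` if the service was started with `--port 8088`.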
- Rust: 1.70+
- Python: 3.8+
- uv: Python package manager
- NVIDIA GPU: CUDA-compatible
- CUDA Toolkit: 11.8+
- GPU Memory: At least 8GB recommended
| Format | Parsing Engine | Features |
|---|---|---|
| PDF | MinerU | Professional PDF parsing, image extraction, table recognition |
| Word | MarkItDown | Document structure preservation, format conversion |
| Excel | MarkItDown | Table data extraction, format preservation |
| PowerPoint | MarkItDown | Slide content extraction, image saving |
| Markdown | Built-in | Real-time parsing, table of contents generation |
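
The format-to-engine mapping in the table can be pictured as a simple lookup. The sketch below is illustrative only: the file extensions are an assumption (the table names formats, not extensions), and `engine_for` is a hypothetical helper rather than the service's actual dispatch logic.

```python
from pathlib import Path

# Engine names are taken from the table above.
# The extensions chosen for each format are an assumption.
ENGINE_BY_EXTENSION = {
    ".pdf": "MinerU",
    ".docx": "MarkItDown",
    ".xlsx": "MarkItDown",
    ".pptx": "MarkItDown",
    ".md": "Built-in",
}


def engine_for(path: str) -> str:
    """Return the parsing engine the table associates with this file's extension."""
    suffix = Path(path).suffix.lower()
    if suffix not in ENGINE_BY_EXTENSION:
        raise ValueError(f"unsupported format: {path}")
    return ENGINE_BY_EXTENSION[suffix]
```

For example, `engine_for("report.pdf")` yields `"MinerU"`, while an unlisted extension raises `ValueError`.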
# Server configuration
server:
  port: 8087
  host: "0.0.0.0"

# MinerU configuration
mineru:
  backend: "vlm-sglang-engine" # Enable GPU acceleration
  max_concurrent: 3
  quality_level: "Balanced"

mineru:
  backend: "vlm-sglang-engine" # Use sglang backend
  max_concurrent: 2 # Lower concurrency for GPU
  batch_size: 1

# Environment management
document-parser check # Check environment status
document-parser uv-init # Initialize environment
document-parser troubleshoot # Troubleshooting guide
# Service management
document-parser server # Start service
document-parser server --port 8088 # Specify port
# File parsing (CLI)
document-parser parse --input file.pdf --output result.md --parser mineru

curl -X POST "http://localhost:8087/api/v1/documents/parse" \
  -H "Content-Type: multipart/form-data" \
  -F "file=@document.pdf" \
  -F "format=pdf"

curl "http://localhost:8087/api/v1/documents/{task_id}/status"

Once the service is running, visit:
- OpenAPI Swagger UI: http://localhost:8087/swagger-ui/
- OpenAPI JSON: http://localhost:8087/api-docs/openapi.json
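
The two curl calls shown earlier (submit a document, then query a task's status) could be mirrored from Python roughly as follows. This is a sketch using only the standard library: the endpoint paths and the `file` and `format` form fields come from the curl examples, while the helper names are hypothetical. The response schema is not documented here, so `submit` returns the raw response text rather than assuming any field names; consult the OpenAPI JSON for the actual shape.

```python
import os
import uuid
import urllib.request

BASE_URL = "http://localhost:8087"  # default address from this README


def encode_multipart(fields, file_field, filename, file_bytes):
    """Build a multipart/form-data body; returns (body_bytes, content_type)."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n".encode()
        )
    parts.append(
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{file_field}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n".encode()
        + file_bytes
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def build_parse_request(path, fmt, base_url=BASE_URL):
    """Prepare the POST request for /api/v1/documents/parse without sending it."""
    with open(path, "rb") as fh:
        data = fh.read()
    body, content_type = encode_multipart(
        {"format": fmt}, "file", os.path.basename(path), data
    )
    return urllib.request.Request(
        f"{base_url}/api/v1/documents/parse",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )


def status_url(task_id, base_url=BASE_URL):
    """URL of the status endpoint for a given task id."""
    return f"{base_url}/api/v1/documents/{task_id}/status"


def submit(path, fmt, base_url=BASE_URL):
    """Send the parse request and return the raw response text."""
    with urllib.request.urlopen(build_parse_request(path, fmt, base_url)) as resp:
        return resp.read().decode("utf-8")
```

With the service running, `submit("document.pdf", "pdf")` sends the upload, and `urllib.request.urlopen(status_url(task_id))` queries progress once a task id is known.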
- Ensure `sglang[all]` is installed
- Configure `backend: "vlm-sglang-engine"`
- Adjust concurrency parameters based on GPU memory
- Monitor GPU usage
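
To make concrete what a concurrency cap such as `max_concurrent` does, here is a generic sketch of semaphore-limited execution. It illustrates the general technique only and is not how the service implements its limit; `run_limited` and `fake_parse` are hypothetical names, and the sleep stands in for real parsing work.

```python
import asyncio


async def run_limited(jobs, max_concurrent):
    """Run zero-argument coroutine factories with at most `max_concurrent` in flight.

    Returns (results_in_submission_order, peak_concurrency_observed).
    """
    sem = asyncio.Semaphore(max_concurrent)
    in_flight = 0
    peak = 0

    async def guarded(job):
        nonlocal in_flight, peak
        async with sem:  # blocks while max_concurrent jobs are already running
            in_flight += 1
            peak = max(peak, in_flight)
            try:
                return await job()
            finally:
                in_flight -= 1

    results = await asyncio.gather(*(guarded(job) for job in jobs))
    return results, peak


async def fake_parse(i):
    """Stand-in for a parse call; sleeps briefly instead of doing real work."""
    await asyncio.sleep(0.01)
    return f"doc-{i}"
```

Running six jobs through `run_limited(..., 2)` never has more than two executing at once, which is why lowering the cap reduces peak GPU memory pressure at the cost of throughput.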
mineru:
  max_concurrent: 2 # Adjust based on system performance
  batch_size: 1 # Process in small batches
  queue_size: 100 # Queue buffer size

- Virtual environment not activated: Run `source ./venv/bin/activate`
- Dependency installation failed: Run `document-parser uv-init`
- GPU acceleration not working: Refer to the CUDA Environment Setup Guide
- Permission issues: Check directory and user permissions
# Detailed troubleshooting guide
document-parser troubleshoot
# Environment status check
document-parser check
# View logs
tail -f logs/log.$(date +%Y-%m-%d)

cargo build --release
cargo test
cargo fmt
cargo clippy

This project is licensed under the MIT License.
Issues and Pull Requests are welcome!