OmniMRZ is an open-source Python library that extracts, parses, and validates Machine Readable Zones (MRZ) from passport and ID images against ICAO 9303.
It is designed for production OCR, KYC, identity verification, and document intelligence pipelines.
Unlike simple MRZ readers, OmniMRZ evaluates whether an MRZ is structurally correct, passes its ICAO 9303 check digits, and is logically plausible.
🛂 Passport and ID card OCR pipelines
🏦 KYC / AML identity verification systems
📄 Document digitization and archiving
🔐 Authentication and onboarding workflows
⭐ Show Your Support
If OmniMRZ helped you or saved development time, 👉 please consider starring the repository. It helps visibility and motivates continued development.
Unlike basic MRZ readers, OmniMRZ provides end-to-end MRZ quality assurance:
- Combines OCR, structural validation, checksum verification, and logical consistency checks
- Fully compliant with ICAO 9303
- Designed for production KYC and identity verification systems
- Robust against OCR noise and partially corrupted MRZ lines
- MRZ detection and extraction from images
- Supports TD3 (passport) format
- Checksum validation (ICAO 9303)
- Logical and structural validation
- Clean Python API
- PaddleOCR-based MRZ text extraction (robust on mobile & noisy images)
- Intelligent MRZ line clustering & reconstruction
- Automatic MRZ type detection (TD1 / TD2 / TD3)
- OCR noise filtering & MRZ-safe character normalization
- Works even with partially corrupted or misaligned MRZs
- Exact line-length enforcement
- Strict MRZ format verification
- Field-level structural checks
- Early-exit gating for invalid layouts
- Fully ICAO-9303 compliant checksum algorithm
- Field-level validation:
  - Document number
  - Date of birth
  - Expiry date
  - Composite checksum
- OCR-error tolerant digit correction (O→0, S→5, B→8, etc.)
- Detailed checksum failure diagnostics
- Expired document detection
- Future date-of-birth detection
- Implausible age detection
- DOB ≥ expiry detection
- Gender value validation (M, F, X, <)
- Cross-field consistency signals (issuer vs nationality)
- Clean MRZ text
- Structured JSON
- Deterministic pass / fail / warning signals
- Human-readable error messages
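
The checksum validation listed above relies on the ICAO 9303 check digit: each character is mapped to a number (digits keep their value, A to Z map to 10 to 35, the filler `<` counts as 0), multiplied by the repeating weights 7, 3, 1, and the sum is taken modulo 10. The following is a minimal standalone sketch of that calculation for the second line of a TD3 MRZ, not OmniMRZ's internal code; the function names `char_value`, `check_digit`, and `td3_line2_checks` are hypothetical and exist only for illustration. The sample values come from the example output further down in this README.

```python
def char_value(ch: str) -> int:
    """Map one MRZ character to its numeric value (ICAO 9303 scheme)."""
    if "0" <= ch <= "9":
        return int(ch)
    if "A" <= ch <= "Z":
        return ord(ch) - ord("A") + 10  # A=10 ... Z=35
    if ch == "<":
        return 0  # filler character
    raise ValueError(f"invalid MRZ character: {ch!r}")


def check_digit(field: str) -> int:
    """Weighted sum with repeating weights 7, 3, 1, taken modulo 10."""
    weights = (7, 3, 1)
    return sum(char_value(c) * weights[i % 3] for i, c in enumerate(field)) % 10


def td3_line2_checks(line2: str) -> dict:
    """Verify the field-level and composite check digits of a TD3 line 2."""
    if len(line2) != 44:
        raise ValueError("a TD3 line must be exactly 44 characters")
    # Composite covers positions 1-10, 14-20 and 22-43 (1-indexed).
    composite = line2[0:10] + line2[13:20] + line2[21:43]
    return {
        "document_number": check_digit(line2[0:9]) == char_value(line2[9]),
        "date_of_birth": check_digit(line2[13:19]) == char_value(line2[19]),
        "expiry_date": check_digit(line2[21:27]) == char_value(line2[27]),
        "composite": check_digit(composite) == char_value(line2[43]),
    }


# Second MRZ line from the sample output in this README (14 filler characters).
line2 = "7077979792GBR9505209M1704224" + "<" * 14 + "00"
print(check_digit("707797979"))  # 2, the 10th character of line2
print(td3_line2_checks(line2))
```

A mismatch in any single entry points at the field that was misread, which is the kind of signal a per-field diagnostic needs.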
pip install omnimrz
Note: PaddleOCR requires additional system dependencies. Please ensure PaddlePaddle installs correctly on your platform.
pip install paddleocr
pip install paddlepaddle
or, if that fails, run:
python -m pip install paddlepaddle==3.0.0 -i https://www.paddlepaddle.org.cn/packages/stable/cpu/

from omnimrz import OmniMRZ
omni = OmniMRZ()
result = omni.process("ukpassport.jpg")
print(result)
{
  "extraction": {
    "status": "SUCCESS(extraction of mrz)",
    "line1": "P<GBRPUDARSAN<<HENERT<<<<<<<<<<<<<<<<<<<<<<<",
    "line2": "7077979792GBR9505209M1704224<<<<<<<<<<<<<<00"
  },
  "structural_validation": {
    "status": "PASS",
    "mrz_type": "TD3",
    "errors": []
  },
  "checksum_validation": {
    "status": "PASS",
    "errors": []
  },
  "parsed_data": {
    "status": "PARSED",
    "data": {
      "document_type": "P",
      "issuing_country": "GBR",
      "surname": "PUDARSAN",
      "given_names": "HENERT",
      "document_number": "707797979",
      "nationality": "GBR",
      "date_of_birth": "1995-05-20",
      "gender": "M",
      "expiry_date": "2017-04-22",
      "personal_number": ""
    }
  },
  "logical_validation": {
    "status": "FAIL",
    "errors": [
      "DOCUMENT_EXPIRED"
    ]
  },
  "screenshot_detection": {
    "status": "PASS",
    "is_screenshot": false,
    "score": 3,
    "confidence": 30.0,
    "reasons": [
      "Low ELA: 0.38",
      "High horizontal edges: 0.51",
      "High sharpness: 2029.58"
    ]
  }
}
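
One way to consume a result shaped like the output above is to walk each stage and collect anything that did not pass. This is a sketch under assumptions: it treats the result as a plain Python `dict` with the keys shown above (if `process` returns a JSON string instead, decode it with `json.loads` first), and `collect_failures` is a hypothetical helper, not part of the OmniMRZ API.

```python
def collect_failures(result: dict) -> list:
    """Return 'stage: error' strings for every stage that did not pass."""
    ok_statuses = {"PASS", "PARSED"}
    failures = []
    for stage, payload in result.items():
        status = str(payload.get("status", ""))
        # The extraction stage reports "SUCCESS(...)" in the sample output.
        if status in ok_statuses or status.startswith("SUCCESS"):
            continue
        # Fall back to the raw status when a stage lists no explicit errors.
        errors = payload.get("errors") or [status or "UNKNOWN"]
        failures.extend(f"{stage}: {err}" for err in errors)
    return failures


# Trimmed copy of the sample output above, written as a Python dict.
sample = {
    "extraction": {"status": "SUCCESS(extraction of mrz)"},
    "structural_validation": {"status": "PASS", "mrz_type": "TD3", "errors": []},
    "checksum_validation": {"status": "PASS", "errors": []},
    "parsed_data": {"status": "PARSED", "data": {"expiry_date": "2017-04-22"}},
    "logical_validation": {"status": "FAIL", "errors": ["DOCUMENT_EXPIRED"]},
    "screenshot_detection": {"status": "PASS", "is_screenshot": False},
}

print(collect_failures(sample))  # ['logical_validation: DOCUMENT_EXPIRED']
```

An empty list means every stage passed; a KYC pipeline could accept the document in that case and route anything else to manual review.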
If you use OmniMRZ in academic research or publications, please consider citing this repository:
Contributions are welcome!🤝
- Fork the repository
- Create your feature branch
git checkout -b feature/amazing-feature
- Commit your changes
- Push to your branch
- Open a Pull Request
MRZ extraction, passport OCR, machine readable zone, ICAO 9303, MRZ parser, Python OCR, identity verification, KYC automation, document intelligence, ID card scanning, border control OCR