📝 Word Recognition System – v0.0.4
Fourth release of the Word Recognition (OCR) System, introducing a confidence-driven ensemble inference engine that dynamically selects the most reliable prediction across multiple CRNN architectures. This release significantly improves real-world accuracy, robustness, and interpretability.
Highlights:
- Confidence-based multi-model selection
  - Both the Baseline CRNN and the Transfer-Learning CRNN run on every input
  - The final prediction is the one with the lower CTC negative log-likelihood, i.e. the higher confidence
  - Eliminates heuristic thresholds and fallback logic
- Likelihood-aware ensemble inference
  - Prediction confidence is derived directly from ctc_decode log probabilities
  - Grounds model arbitration in sequence likelihood rather than hand-tuned rules
  - Higher confidence consistently correlates with higher accuracy
- Model-specific preprocessing pipelines
  - Dynamic preprocessing based on model architecture
  - Grayscale, variable-width handling for the Baseline CRNN
  - Fixed-width RGB pipeline for the Transfer-Learning CRNN
- Improved recognition accuracy on:
  - Challenging real-world word images
  - Mixed-case and stylized fonts
  - Low-contrast and noisy inputs
  - Short and long word sequences
- Deployment-ready inference layer
  - Unified decoding interface
  - Clean separation of preprocessing, inference, and confidence evaluation
  - Stable Flask deployment with confidence visualization support
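As an illustration of the model-specific preprocessing listed above, here is a minimal Python sketch of one way to describe each model's input contract and derive the target tensor shape from it. The names `InputSpec`, `PIPELINES`, and `target_shape`, along with the 32-pixel height and 128-pixel width, are assumptions made for this example and are not taken from the release.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class InputSpec:
    """Input contract for one model (hypothetical structure)."""
    channels: int          # 1 = grayscale, 3 = RGB
    height: int            # fixed input height in pixels
    width: Optional[int]   # None = variable width (aspect ratio preserved)


# Assumed values, for illustration only.
PIPELINES: Dict[str, InputSpec] = {
    "baseline_crnn": InputSpec(channels=1, height=32, width=None),
    "transfer_crnn": InputSpec(channels=3, height=32, width=128),
}


def target_shape(spec: InputSpec, src_height: int, src_width: int) -> Tuple[int, int, int]:
    """Return the (height, width, channels) an image should be resized to."""
    if src_height <= 0 or src_width <= 0:
        raise ValueError("source dimensions must be positive")
    if spec.width is None:
        # Variable width: scale the width by the same factor as the height.
        width = max(1, round(src_width * spec.height / src_height))
    else:
        # Fixed width: every image is resized to the same width.
        width = spec.width
    return (spec.height, width, spec.channels)
```

With these assumed specs, a 64x200 crop would map to a 32x100 single-channel input for the baseline model and a 32x128 three-channel input for the transfer-learning model. Keeping the contract in data rather than in branching code is one way to keep preprocessing separate from inference.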
This release upgrades the system from a confidence-aware single-model OCR pipeline to a likelihood-based ensemble recognition system. By using CTC log-probability scores as the decision criterion, the system makes more reliable predictions without making the individual models more complex or the inference path more fragile.
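The likelihood-based arbitration can be sketched as follows. This is a minimal, dependency-free illustration and not the release's actual code: it uses a greedy best-path decode as a stand-in for ctc_decode, assumes the blank label is the last class index, and the names `Candidate`, `greedy_ctc_decode`, and `select_prediction` are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Candidate:
    model_name: str
    text: str
    neg_log_likelihood: float  # lower means more confident


def greedy_ctc_decode(probs: Sequence[Sequence[float]], alphabet: str) -> Tuple[str, float]:
    """Best-path CTC decode over per-timestep class probabilities.

    Returns the decoded text and the negative log-likelihood of the
    best path. The blank label is assumed to be index len(alphabet).
    """
    blank = len(alphabet)
    chars: List[str] = []
    nll = 0.0
    prev = blank
    for row in probs:
        idx = max(range(len(row)), key=row.__getitem__)
        nll -= math.log(row[idx])
        # Collapse repeated labels and drop blanks.
        if idx != blank and idx != prev:
            chars.append(alphabet[idx])
        prev = idx
    return "".join(chars), nll


def select_prediction(candidates: Sequence[Candidate]) -> Candidate:
    """Pick the candidate with the lowest negative log-likelihood."""
    if not candidates:
        raise ValueError("at least one candidate is required")
    return min(candidates, key=lambda c: c.neg_log_likelihood)


# Toy softmax outputs for a two-letter alphabet plus blank (made-up numbers).
baseline_probs = [[0.9, 0.05, 0.05], [0.1, 0.1, 0.8], [0.1, 0.8, 0.1]]
transfer_probs = [[0.2, 0.5, 0.3], [0.2, 0.2, 0.6], [0.3, 0.4, 0.3]]

candidates = [
    Candidate("baseline_crnn", *greedy_ctc_decode(baseline_probs, "ab")),
    Candidate("transfer_crnn", *greedy_ctc_decode(transfer_probs, "ab")),
]
best = select_prediction(candidates)
print(best.model_name, best.text, round(best.neg_log_likelihood, 3))
```

Because the rule is a plain minimum over scores, no threshold needs tuning, and adding a third model would only mean appending another candidate.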