v0.0.4

@Kalana-S Kalana-S released this 29 Jan 14:03

📝 Word Recognition System – v0.0.4

Fourth release of the Word Recognition (OCR) System. It introduces a confidence-driven ensemble inference engine that runs multiple CRNN architectures on each input and selects the most reliable prediction. This release improves real-world accuracy, robustness, and interpretability.

Highlights:

  • Confidence-based multi-model selection

    • Both Baseline CRNN and Transfer-Learning CRNN are executed per input
    • The final prediction is the one with the lower CTC negative log-likelihood, i.e. the higher confidence
    • Eliminates heuristic thresholds and fallback logic
  • Likelihood-aware ensemble inference

    • Prediction confidence derived directly from ctc_decode log probabilities
    • Ensures statistically grounded model arbitration
    • Higher prediction confidence consistently correlates with higher accuracy
  • Model-specific preprocessing pipelines

    • Dynamic preprocessing based on model architecture
    • Grayscale variable-width handling for baseline CRNN
    • Fixed-width RGB pipeline for transfer-learning CRNN
  • Improved recognition accuracy on:

    • Challenging real-world word images
    • Mixed-case and stylized fonts
    • Low-contrast and noisy inputs
    • Short and long word sequences
  • Deployment-ready inference layer

    • Unified decoding interface
    • Clean separation of preprocessing, inference, and confidence evaluation
    • Stable Flask deployment with confidence visualization support
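
The model-specific preprocessing listed in the highlights could be organized as a small dispatch table. The following is only a sketch: the names `PreprocessSpec`, `SPECS`, and `target_shape`, the model keys, and the concrete sizes (height 32, width 128) are illustrative assumptions, not values taken from the project.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class PreprocessSpec:
    """Hypothetical description of what one model expects as input."""
    height: int
    width: Optional[int]  # None means: keep aspect ratio (variable width)
    channels: int         # 1 = grayscale, 3 = RGB


# Assumed sizes for illustration only; the real values are not stated
# in the release notes.
SPECS = {
    "baseline": PreprocessSpec(height=32, width=None, channels=1),
    "transfer": PreprocessSpec(height=32, width=128, channels=3),
}


def target_shape(model_kind: str, src_height: int, src_width: int) -> Tuple[int, int, int]:
    """Return the (height, width, channels) an image should be resized to."""
    spec = SPECS[model_kind]
    if spec.width is None:
        # Variable-width path: scale the width by the same factor as the height.
        scale = spec.height / src_height
        width = max(1, round(src_width * scale))
    else:
        # Fixed-width path: every image is resized to the same width.
        width = spec.width
    return (spec.height, width, spec.channels)
```

For a 64x200 source image, this sketch yields a variable-width grayscale target for the baseline model and a fixed-size RGB target for the transfer-learning model, so each model receives the layout it expects.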

This release upgrades the system from a confidence-aware single-model OCR pipeline to a likelihood-based ensemble recognition system. By using CTC log-probability scores as the decision criterion, the system produces more reliable predictions without increasing model complexity or making inference more fragile.
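
The arbitration rule can be illustrated with a minimal, dependency-free sketch. It is not the project's implementation: it assumes each model emits per-timestep class probabilities, that index 0 is the CTC blank, and that confidence is the negative log-likelihood of the greedy (best) path. The names `Candidate`, `greedy_ctc_decode`, and `select_prediction` are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence, Tuple

BLANK = 0  # assumption: index 0 is the CTC blank symbol


@dataclass
class Candidate:
    model: str   # which model produced the prediction
    text: str    # decoded string
    nll: float   # negative log-likelihood of the best path (lower = more confident)


def greedy_ctc_decode(probs: Sequence[Sequence[float]], alphabet: str,
                      blank: int = BLANK) -> Tuple[str, float]:
    """Best-path CTC decoding: take the argmax per timestep, collapse
    repeated symbols, drop blanks, and accumulate -log(p) of each argmax."""
    chars: List[str] = []
    nll = 0.0
    prev = None
    for step in probs:
        best = max(range(len(step)), key=step.__getitem__)
        nll -= math.log(max(step[best], 1e-12))  # guard against log(0)
        if best != blank and best != prev:
            chars.append(alphabet[best])
        prev = best
    return "".join(chars), nll


def select_prediction(candidates: Sequence[Candidate]) -> Candidate:
    """Pick the candidate with the lowest negative log-likelihood."""
    return min(candidates, key=lambda c: c.nll)


# Toy example: alphabet[0] is a placeholder for the blank.
alphabet = "-ab"
probs_baseline = [[0.1, 0.8, 0.1], [0.1, 0.8, 0.1], [0.8, 0.1, 0.1], [0.1, 0.1, 0.8]]
probs_transfer = [[0.3, 0.4, 0.3], [0.4, 0.3, 0.3], [0.3, 0.4, 0.3], [0.4, 0.3, 0.3]]

candidates = [
    Candidate("baseline", *greedy_ctc_decode(probs_baseline, alphabet)),
    Candidate("transfer", *greedy_ctc_decode(probs_transfer, alphabet)),
]
winner = select_prediction(candidates)
```

In this toy example the baseline output is sharply peaked, so its best path has the lower negative log-likelihood and its prediction is selected. One design point to keep in mind: a raw path score grows with the number of timesteps, so when models emit sequences of different lengths, dividing by the number of timesteps may give a fairer comparison. The release notes do not say whether the project normalizes this way.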