Releases: Kalana-S/Word-Recognition-System

v0.0.4

29 Jan 14:03

📝 Word Recognition System – v0.0.4

Fourth release of the Word Recognition (OCR) System, introducing a confidence-driven ensemble inference engine that dynamically selects the most reliable prediction across multiple CRNN architectures. This release significantly improves real-world accuracy, robustness, and interpretability.

Highlights:

  • Confidence-based multi-model selection

    • Both Baseline CRNN and Transfer-Learning CRNN are executed per input
    • Final prediction is selected using CTC negative log-likelihood confidence
    • Eliminates heuristic thresholds and fallback logic
  • Likelihood-aware ensemble inference

    • Prediction confidence derived directly from ctc_decode log probabilities
    • Ensures statistically grounded model arbitration
    • The higher-confidence prediction tends to be the more accurate one
  • Model-specific preprocessing pipelines

    • Dynamic preprocessing based on model architecture
    • Grayscale variable-width handling for baseline CRNN
    • Fixed-width RGB pipeline for transfer-learning CRNN
  • Improved recognition accuracy on:

    • Challenging real-world word images
    • Mixed-case and stylized fonts
    • Low-contrast and noisy inputs
    • Short and long word sequences
  • Deployment-ready inference layer

    • Unified decoding interface
    • Clean separation of preprocessing, inference, and confidence evaluation
    • Stable Flask deployment with confidence visualization support

This release upgrades the system from a confidence-aware single-model OCR pipeline to a likelihood-based ensemble recognition system. By leveraging CTC log-probability scores as a decision criterion, the system achieves more reliable predictions without increasing model complexity or inference fragility.
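
The confidence-based selection described above can be sketched in a few lines. This is a minimal illustration, not the project's actual code: the `Candidate` type, the `select_prediction` helper, the model labels, and the numeric scores are all hypothetical. It assumes each model returns a decoded word plus the negative log-likelihood (NLL) of that decoding, where lower means more confident.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    model: str      # hypothetical label, e.g. "baseline" or "transfer"
    text: str       # decoded word
    nll: float      # negative log-likelihood of the decoded path (lower is better)
    timesteps: int  # number of CTC timesteps the model emitted


def select_prediction(candidates, normalize=False):
    """Return the candidate with the lowest NLL.

    With normalize=True the NLL is divided by the number of timesteps,
    so models with longer output sequences are not penalized for length.
    """
    if not candidates:
        raise ValueError("need at least one candidate")

    def score(c):
        return c.nll / c.timesteps if normalize else c.nll

    return min(candidates, key=score)


# Illustrative numbers only; they are not measurements from the project.
baseline = Candidate("baseline", "hello", nll=2.4, timesteps=24)
transfer = Candidate("transfer", "hallo", nll=3.2, timesteps=64)

best = select_prediction([baseline, transfer])
best_normalized = select_prediction([baseline, transfer], normalize=True)
print(best.model, best_normalized.model)
```

The two calls can disagree because a summed NLL grows with sequence length. Since the baseline model handles variable-width input and the transfer-learning model uses a fixed width, their timestep counts may differ; whether the release normalizes for this is not stated in these notes.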

v0.0.3

26 Jan 00:58
552cea1

📝 Word Recognition System – v0.0.3

Third release of the Word Recognition (OCR) System, focusing on prediction quality, model selection intelligence, and deployment-grade inference enhancements. This version introduces multi-model evaluation, confidence-aware decoding, and improved validation behavior for real-world text images.

Highlights:

  • Dual-model inference pipeline
    • Baseline CRNN and transfer-learning CRNN evaluated per input
    • Automatic selection of the best prediction based on confidence scoring
  • Confidence-aware CTC decoding
    • Token-level probability aggregation for reliable confidence estimation
    • Enables confidence visualization in UI and downstream decision logic
  • Improved sequence decoding stability
  • Inference-only optimized backbone
  • Enhanced real-world robustness, with improved handling of:
    • Mixed-case text
    • Low-contrast word images
    • Short tokens and acronyms
  • Confidence-enabled web interface

This release elevates the system from a single-model OCR pipeline to a confidence-driven, multi-model recognition engine. The improvements significantly enhance prediction reliability and transparency, making the system more suitable for real-world deployment, user-facing applications, and academic demonstrations.
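
One way to picture the token-level probability aggregation mentioned above is a geometric mean over the per-timestep winning probabilities. The release notes do not state which aggregation rule is used, so the `aggregate_confidence` helper below is an assumption for illustration only.

```python
import math


def aggregate_confidence(step_probs):
    """Geometric mean of per-timestep top probabilities, in the range [0, 1].

    step_probs: the probability of the winning class at each timestep.
    Working in log space avoids underflow on long sequences.
    """
    if not step_probs:
        raise ValueError("need at least one timestep")
    # Clamp to a tiny positive value so log(0) cannot occur.
    log_sum = sum(math.log(max(p, 1e-12)) for p in step_probs)
    return math.exp(log_sum / len(step_probs))


confidence = aggregate_confidence([0.25, 1.0])
print(round(confidence, 3))
```

A geometric mean keeps the score comparable across words of different lengths, while a single uncertain timestep still pulls the overall confidence down.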

v0.0.2

25 Jan 17:08
8b90332

📝 Word Recognition System – v0.0.2

Second release of the Word Recognition (OCR) System, featuring major architectural and training improvements through transfer learning, data augmentation, and enhanced sequence modeling for higher accuracy and robustness.

Highlights:

  • Pretrained VGG16 backbone for feature extraction (New)
  • Transfer learning–based CRNN architecture with deeper feature representations
  • Improved CTC-based sequence modeling and decoding
  • Fixed-size RGB image pipeline for stable inference (32 × 256 × 3)
  • Advanced data augmentation:
    • Random brightness and contrast
    • Geometric rotation using KerasCV
  • Improved generalization on challenging fonts and casing
  • Higher recognition accuracy on unseen word images
  • Clean separation of training-only CTC logic and inference backbone
  • Optimized inference pipeline for Flask deployment
  • Production-ready TensorFlow .keras model

This release significantly improves recognition quality and model robustness by leveraging pretrained CNN features and modern augmentation strategies, while maintaining a lightweight and deployable Flask-based inference system.
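
The fixed 32 × 256 × 3 input shape listed above implies that every word image is brought to the same height and width before inference. The notes do not say whether images are stretched or padded, so the sketch below assumes an aspect-preserving resize followed by right padding; the `fit_to_canvas` helper is hypothetical.

```python
TARGET_H, TARGET_W = 32, 256  # height and width from the release notes


def fit_to_canvas(height, width, target_h=TARGET_H, target_w=TARGET_W):
    """Return (resized_width, right_padding) for an aspect-preserving resize.

    The image is scaled so its height equals target_h. If the scaled width
    exceeds target_w it is capped (the image would be squeezed horizontally);
    otherwise the remaining columns are filled with padding.
    """
    if height <= 0 or width <= 0:
        raise ValueError("image dimensions must be positive")
    scaled_w = max(1, round(width * target_h / height))
    resized_w = min(scaled_w, target_w)
    return resized_w, target_w - resized_w


print(fit_to_canvas(64, 200))   # short word: padded on the right
print(fit_to_canvas(32, 1000))  # very long word: capped at the canvas width
```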

v0.0.1

17 Jan 16:01
4de7f20

📝 Word Recognition System – v0.0.1

Initial release of the deep learning–based Word Recognition (OCR) System, capable of recognizing single-word images using a CRNN architecture with CTC decoding, deployed via a Flask web application.

Highlights:

  • Image-based single-word text recognition
  • CNN + Bidirectional LSTM (CRNN) architecture
  • CTC (Connectionist Temporal Classification) decoding for sequence prediction
  • Variable-width image handling (no fixed padding required)
  • Trained on 100k synthetic word images from Synth90k
  • TensorFlow .keras production model
  • Lightweight Flask web application
  • Simple HTML + CSS user interface
  • Real-time inference on uploaded images

This release provides a complete end-to-end OCR pipeline, including model training, decoding logic, and a web-based inference interface, designed for educational and experimental purposes.
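
For readers new to CTC, the decoding step listed above can be illustrated with a greedy decoder: pick the most likely class at each timestep, collapse consecutive repeats, then drop the blank symbol. This is a generic sketch rather than the project's decoding code; the `ctc_greedy_decode` name, the toy alphabet, and the probabilities are made up, and the blank is assumed to be the last class index.

```python
def ctc_greedy_decode(step_probs, alphabet, blank=None):
    """Greedy CTC decoding over a list of per-timestep probability lists.

    alphabet: string whose i-th character corresponds to class index i.
    blank: index of the CTC blank class (assumed to be the last class).
    """
    if blank is None:
        blank = len(alphabet)
    decoded = []
    previous = None
    for probs in step_probs:
        # Index of the most likely class at this timestep.
        idx = max(range(len(probs)), key=probs.__getitem__)
        # Emit only when the class changes and is not the blank.
        if idx != previous and idx != blank:
            decoded.append(alphabet[idx])
        previous = idx
    return "".join(decoded)


# Classes: 0 = "a", 1 = "b", 2 = blank. Best path: a, a, blank, a, b, b.
toy_probs = [
    [0.8, 0.1, 0.1],
    [0.7, 0.2, 0.1],
    [0.1, 0.1, 0.8],
    [0.6, 0.3, 0.1],
    [0.1, 0.8, 0.1],
    [0.2, 0.7, 0.1],
]
word = ctc_greedy_decode(toy_probs, "ab")
print(word)
```

The blank between the second and third "a" frames is what lets the decoder output a doubled letter; without it, the repeated frames would collapse into a single character.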