Releases: Kalana-S/Word-Recognition-System
v0.0.4
📝 Word Recognition System – v0.0.4
Fourth release of the Word Recognition (OCR) System, introducing a confidence-driven ensemble inference engine that dynamically selects the most reliable prediction across multiple CRNN architectures. This release significantly improves real-world accuracy, robustness, and interpretability.
Highlights:
- Confidence-based multi-model selection
  - Both Baseline CRNN and Transfer-Learning CRNN are executed per input
  - Final prediction is selected using CTC negative log-likelihood confidence
  - Eliminates heuristic thresholds and fallback logic
- Likelihood-aware ensemble inference
  - Prediction confidence derived directly from ctc_decode log probabilities
  - Ensures statistically grounded model arbitration
  - Higher-confidence prediction consistently correlates with higher accuracy
- Model-specific preprocessing pipelines
  - Dynamic preprocessing based on model architecture
  - Grayscale variable-width handling for baseline CRNN
  - Fixed-width RGB pipeline for transfer-learning CRNN
- Improved recognition accuracy
  - Challenging real-world word images
  - Mixed-case and stylized fonts
  - Low-contrast and noisy inputs
  - Short and long word sequences
- Deployment-ready inference layer
  - Unified decoding interface
  - Clean separation of preprocessing, inference, and confidence evaluation
  - Stable Flask deployment with confidence visualization support
This release upgrades the system from the confidence-scored dual-model pipeline of v0.0.3 to a likelihood-based ensemble recognition system. By using CTC log-probability scores as the decision criterion, the system achieves more reliable predictions without increasing model complexity or inference fragility.
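The selection rule described in this release can be sketched roughly as follows. This is a minimal illustration rather than the project's code: the `Candidate` class, the model labels, and the example scores are hypothetical, and it assumes each model already reports a CTC negative log-likelihood for its decoded word, where a lower value means higher confidence.

```python
import math
from dataclasses import dataclass


@dataclass
class Candidate:
    model: str   # hypothetical label for the model that produced the text
    text: str    # decoded word
    nll: float   # CTC negative log-likelihood (lower = more confident)


def nll_to_confidence(nll: float) -> float:
    """Map a negative log-likelihood to a likelihood in (0, 1] for display."""
    return math.exp(-nll)


def select_prediction(candidates):
    """Return the candidate with the lowest negative log-likelihood."""
    if not candidates:
        raise ValueError("at least one candidate is required")
    return min(candidates, key=lambda c: c.nll)


# Made-up scores, for illustration only.
baseline = Candidate("baseline_crnn", "hello", 0.42)
transfer = Candidate("transfer_crnn", "hallo", 1.87)

best = select_prediction([baseline, transfer])
print(best.model, best.text, round(nll_to_confidence(best.nll), 3))
```

Raw negative log-likelihood tends to grow with the number of time steps, so comparing a variable-width model against a fixed-width one may call for length normalization. The release notes do not say whether the project applies any.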
v0.0.3
📝 Word Recognition System – v0.0.3
Third release of the Word Recognition (OCR) System, focusing on prediction quality, model selection intelligence, and deployment-grade inference enhancements. This version introduces multi-model evaluation, confidence-aware decoding, and improved validation behavior for real-world text images.
Highlights:
- Dual-model inference pipeline
  - Baseline CRNN and transfer-learning CRNN evaluated per input
  - Automatic selection of the best prediction based on confidence scoring
- Confidence-aware CTC decoding
  - Token-level probability aggregation for reliable confidence estimation
  - Enables confidence visualization in UI and downstream decision logic
- Improved sequence decoding stability
- Inference-only optimized backbone
- Enhanced real-world robustness
  - Improved handling of:
    - Mixed-case text
    - Low-contrast word images
    - Short tokens and acronyms
- Confidence-enabled web interface
This release elevates the system from a single-model OCR pipeline to a confidence-driven, multi-model recognition engine. The improvements significantly enhance prediction reliability and transparency, making the system more suitable for real-world deployment, user-facing applications, and academic demonstrations.
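One way to read "token-level probability aggregation" is sketched below. It is an assumption-laden illustration, not the project's implementation: the blank index of 0, the index-to-character mapping, and the choice of a geometric mean over per-token peak probabilities are all hypothetical. The input is a list of per-time-step probability distributions, as a CRNN's softmax output would provide.

```python
import math

BLANK = 0  # assumed index of the CTC blank class


def greedy_decode_with_confidence(step_probs, charset, blank=BLANK):
    """Best-path CTC decode plus a token-level confidence score.

    step_probs: list over time steps of per-class probability lists.
    charset:    characters for class indices 1..N (index 0 is the blank).
    Returns (text, confidence), where confidence is the geometric mean
    of each emitted token's peak probability.
    """
    tokens = []  # one [class_index, peak_probability] pair per emitted token
    prev = None
    for probs in step_probs:
        idx = max(range(len(probs)), key=probs.__getitem__)
        p = probs[idx]
        if idx != prev:
            if idx != blank:
                tokens.append([idx, p])  # a new token starts here
        elif idx != blank:
            # Same class repeated: still the same token, keep its peak.
            tokens[-1][1] = max(tokens[-1][1], p)
        prev = idx

    if not tokens:
        return "", 0.0
    text = "".join(charset[i - 1] for i, _ in tokens)
    mean_log = sum(math.log(p) for _, p in tokens) / len(tokens)
    return text, math.exp(mean_log)


# Four time steps over the classes [blank, 'a', 'b', 'c'].
frames = [
    [0.10, 0.80, 0.05, 0.05],  # 'a'
    [0.10, 0.70, 0.10, 0.10],  # 'a' again, collapsed into the same token
    [0.90, 0.04, 0.03, 0.03],  # blank
    [0.20, 0.10, 0.50, 0.20],  # 'b'
]
text, confidence = greedy_decode_with_confidence(frames, "abc")
print(text, round(confidence, 3))
```

A geometric mean keeps the score comparable between short and long words, which is one reason to prefer it over a raw product of probabilities.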
v0.0.2
📝 Word Recognition System – v0.0.2
Second release of the Word Recognition (OCR) System, featuring major architectural and training improvements through transfer learning, data augmentation, and enhanced sequence modeling for higher accuracy and robustness.
Highlights:
- Pretrained VGG16 backbone for feature extraction (New)
- Transfer learning–based CRNN architecture with deeper representation
- Improved CTC-based sequence modeling and decoding
- Fixed-size RGB image pipeline for stable inference (32 × 256 × 3)
- Advanced data augmentation:
  - Random brightness and contrast
  - Geometric rotation using KerasCV
- Improved generalization on challenging fonts and casing
- Higher recognition accuracy on unseen word images
- Clean separation of training-only CTC logic and inference backbone
- Optimized inference pipeline for Flask deployment
- Production-ready TensorFlow .keras model
This release significantly improves recognition quality and model robustness by leveraging pretrained CNN features and modern augmentation strategies, while maintaining a lightweight and deployable Flask-based inference system.
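The random brightness and contrast augmentation listed above can be illustrated with a framework-free sketch. The function name, parameter ranges, and the use of a single global mean for contrast are assumptions made for this example. The release does not state the actual settings, and the training code presumably relies on TensorFlow or KerasCV layers rather than plain Python lists.

```python
import random


def random_brightness_contrast(image, rng, max_delta=0.2,
                               contrast_range=(0.8, 1.2)):
    """Apply one random brightness shift and contrast scale to an image.

    image: H x W x C nested lists of floats in [0, 1].
    rng:   a random.Random instance, passed in for reproducibility.
    """
    delta = rng.uniform(-max_delta, max_delta)   # brightness offset
    factor = rng.uniform(*contrast_range)        # contrast multiplier
    values = [v for row in image for px in row for v in px]
    mean = sum(values) / len(values)

    def adjust(v):
        # Scale the distance from the mean, shift, then clip to [0, 1].
        v = (v - mean) * factor + mean + delta
        return min(1.0, max(0.0, v))

    return [[[adjust(v) for v in px] for px in row] for row in image]


# A tiny 2 x 4 RGB image filled with mid-gray.
img = [[[0.5, 0.5, 0.5] for _ in range(4)] for _ in range(2)]
augmented = random_brightness_contrast(img, random.Random(0))
print(len(augmented), len(augmented[0]), len(augmented[0][0]))
```

Passing the random generator in explicitly keeps the augmentation reproducible during debugging while still varying from sample to sample in training.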
v0.0.1
📝 Word Recognition System – v0.0.1
Initial release of the deep learning–based Word Recognition (OCR) System, capable of recognizing single-word images using a CRNN architecture with CTC decoding, deployed via a Flask web application.
Highlights:
- Image-based single-word text recognition
- CNN + Bidirectional LSTM (CRNN) architecture
- CTC (Connectionist Temporal Classification) decoding for sequence prediction
- Variable-width image handling (no fixed padding required)
- Trained on a 100k-image subset of the Synth90k synthetic word dataset
- TensorFlow .keras production model
- Lightweight Flask web application
- Simple HTML + CSS user interface
- Real-time inference on uploaded images
This release provides a complete end-to-end OCR pipeline, including model training, decoding logic, and a web-based inference interface, designed for educational and experimental purposes.
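Variable-width handling usually means resizing each word image to a fixed height while letting the width follow the aspect ratio. The sketch below shows that idea with nearest-neighbour sampling on a grayscale image stored as nested lists. The function name and the target height of 32 are assumptions, and the project itself presumably uses TensorFlow image operations instead.

```python
def resize_to_height(image, target_h=32):
    """Resize a grayscale image to target_h rows, preserving aspect ratio.

    image: H x W nested lists of pixel values.
    Returns a target_h x W' image, with W' scaled by target_h / H.
    """
    src_h = len(image)
    src_w = len(image[0])
    target_w = max(1, round(src_w * target_h / src_h))

    out = []
    for y in range(target_h):
        # Nearest-neighbour: pick the source row this output row maps to.
        src_row = image[min(src_h - 1, y * src_h // target_h)]
        out.append([
            src_row[min(src_w - 1, x * src_w // target_w)]
            for x in range(target_w)
        ])
    return out


# A 64 x 200 dummy image becomes 32 x 100; the width is not padded.
wide = [[0.0] * 200 for _ in range(64)]
resized = resize_to_height(wide)
print(len(resized), len(resized[0]))
```

Because the width is left free, longer words produce more time steps for the recurrent layers, which is what removes the need for fixed padding at inference time.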