faezeam/OCR-Labeler

OCR Annotation Tool

A fast, efficient web-based tool for manually labeling text in image crops for OCR training datasets.

Features

  • Fast Image Loading: Optimized batch loading with preloading for smooth workflow
  • Random Image Order: Images load in random order to prevent annotation bias
  • Keyboard Shortcuts: Quick navigation with Enter to save, Ctrl+Enter to skip
  • RTL Text Support: Proper support for Persian, Arabic, and other RTL languages
  • Progress Tracking: Real-time statistics on completed, skipped, and remaining images
  • Undo Functionality: Go back to previous images and modify annotations
  • Default Text Integration: Support for OCR model predictions as starting points

Requirements

  • Python 3.6 or higher
  • Flask: pip install flask

Installation

  1. Clone the repository:
     git clone https://github.com/faezeam/ocr-annotation-tool
     cd ocr-annotation-tool
  2. Install dependencies:
     pip install flask
  3. Run the application:
     python app.py
  4. Open your browser and go to: http://127.0.0.1:5000

Usage

Setup

  1. Place your image crops in the crops/ folder
  2. (Optional) Create labels/default.txt with OCR predictions, one image per line in tab-separated format: filename.jpg<TAB>predicted_text
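
As a rough illustration of the optional step above, the following sketch writes a set of OCR predictions to labels/default.txt in the tab-separated layout described. The function name write_default_labels and the predictions dictionary are hypothetical and not part of this tool; how the predictions are produced is left to whatever OCR model you use.

```python
from pathlib import Path


def write_default_labels(predictions, path="labels/default.txt"):
    """Write {filename: predicted_text} pairs as tab-separated lines.

    Hypothetical helper for illustration; not part of the tool itself.
    """
    out = Path(path)
    out.parent.mkdir(parents=True, exist_ok=True)
    with out.open("w", encoding="utf-8", newline="\n") as f:
        for filename, text in predictions.items():
            # Collapse tabs/newlines inside the prediction so that each
            # image stays on one line with exactly one separating tab.
            clean = " ".join(text.split())
            f.write(f"{filename}\t{clean}\n")
    return out


if __name__ == "__main__":
    # Example predictions; the filenames are placeholders.
    write_default_labels({"crop_001.jpg": "سلام", "crop_002.jpg": "hello"})
```

Writing the file as UTF-8 matters for Persian and Arabic predictions, since a platform default encoding may not represent them.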

Annotation Workflow

  1. View Image: Each image is displayed in the center, with its filename shown above
  2. Enter Text: Type the text you see into the input field (supports RTL languages)
  3. Save: Press Enter to save the annotation and move to the next image
  4. Skip: Press Ctrl+Enter or Alt+S to skip difficult images
  5. Undo: Click the "Previous" button to go back and modify previous annotations

Keyboard Shortcuts

  • Enter - Save annotation and go to next image
  • Ctrl+Enter or Alt+S - Skip current image
  • Esc - Clear input field

File Organization

  • crops/ - Place your image files here (.jpg, .png, .bmp, .tiff, .gif)
  • completed/ - Successfully annotated images are moved here
  • skipped/ - Skipped images are moved here
  • labels/labels.txt - All annotations saved in tab-separated format
  • labels/default.txt - Optional OCR predictions (tab-separated: filename→text)
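
The folder layout above implies a simple file lifecycle: an image leaves crops/ when it is saved or skipped. The sketch below shows one way that could work; it is not taken from app.py. The function names are hypothetical, and the filename<TAB>text column order for labels/labels.txt is an assumption borrowed from the default.txt format.

```python
import shutil
from pathlib import Path


def record_annotation(filename, text, root="."):
    """Append a label line and move the image to completed/ (illustrative)."""
    root = Path(root)
    (root / "completed").mkdir(exist_ok=True)
    (root / "labels").mkdir(exist_ok=True)
    # Assumed line format: filename<TAB>text
    with (root / "labels" / "labels.txt").open("a", encoding="utf-8") as f:
        f.write(f"{filename}\t{text}\n")
    shutil.move(str(root / "crops" / filename), str(root / "completed" / filename))


def skip_image(filename, root="."):
    """Move the image to skipped/ without writing a label (illustrative)."""
    root = Path(root)
    (root / "skipped").mkdir(exist_ok=True)
    shutil.move(str(root / "crops" / filename), str(root / "skipped" / filename))
```

In this sketch the label is written before the file is moved, so an interrupted save leaves the image in crops/ rather than losing its annotation.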

Output Format

Annotations are saved in labels/labels.txt in tab-separated format.
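
For downstream OCR training, the label file can be loaded with a few lines of Python. This is a minimal sketch that assumes each line is filename<TAB>text, mirroring the default.txt layout; the README does not spell out the columns, so check your own labels.txt first. The name read_labels is hypothetical.

```python
def read_labels(path="labels/labels.txt"):
    """Return {filename: text} from a tab-separated label file (illustrative)."""
    labels = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.rstrip("\n")
            if not line.strip():
                continue  # ignore blank lines
            # Split on the first tab only, so the text may contain spaces.
            filename, _, text = line.partition("\t")
            # If a filename appears more than once, the last line wins here.
            labels[filename] = text
    return labels
```

A typical use would be `pairs = read_labels()` followed by iterating over `pairs.items()` to build (image path, text) training samples from the completed/ folder.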
