OCR Annotation Tool

A fast, efficient web-based tool for manually labeling text in image crops for OCR training datasets.

Features

Fast Image Loading: Optimized batch loading with preloading for a smooth workflow
Random Image Order: Images load in random order to prevent annotation bias
Keyboard Shortcuts: Quick navigation with Enter to save, Ctrl+Enter to skip
RTL Text Support: Proper support for Persian, Arabic, and other RTL languages
Progress Tracking: Real-time statistics on completed, skipped, and remaining images
Undo Functionality: Go back to previous images and modify annotations
Default Text Integration: Support for OCR model predictions as starting points

Requirements

Python 3.6 or higher
Flask: pip install flask

Installation

Clone the repository:

git clone https://github.com/faezeam/ocr-annotation-tool
cd ocr-annotation-tool

Install dependencies:

pip install flask

Run the application:

python app.py

Open your browser and go to: http://127.0.0.1:5000

Usage

Setup

Place your image crops in the crops/ folder
(Optional) Create labels/default.txt with OCR predictions in tab-separated format: filename.jpg predicted_text (a single tab separates the two fields)
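
As a rough illustration of the optional predictions file, the sketch below reads labels/default.txt into a dictionary keyed by filename. The function name load_default_predictions is hypothetical and is not taken from app.py; the UTF-8 encoding and the choice to skip lines without a tab are assumptions.

```python
from pathlib import Path


def load_default_predictions(path="labels/default.txt"):
    """Read tab-separated `filename<TAB>predicted_text` lines into a dict.

    Hypothetical helper for illustration; app.py may do this differently.
    """
    predictions = {}
    file = Path(path)
    if not file.exists():
        # The file is optional, so a missing file simply means no predictions.
        return predictions
    with file.open(encoding="utf-8") as handle:  # UTF-8 is an assumption
        for line in handle:
            line = line.rstrip("\r\n")
            if not line:
                continue  # ignore blank lines
            # Split on the first tab only, so the text may itself contain tabs.
            filename, sep, text = line.partition("\t")
            if not sep:
                continue  # assumption: lines without a tab are skipped
            predictions[filename] = text
    return predictions
```

A lookup such as load_default_predictions().get("word_001.jpg", "") would then give the starting text for an image, or an empty string when no prediction exists (word_001.jpg is a made-up filename).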

Annotation Workflow

View Image: Each image displays in the center with filename shown above
Enter Text: Type the text you see in the input field (supports RTL languages)
Save: Press Enter to save annotation and move to next image
Skip: Press Ctrl+Enter or Alt+S to skip difficult images
Undo: Click the "Previous" button to go back and modify earlier annotations

Keyboard Shortcuts

Enter - Save annotation and go to next image
Ctrl+Enter or Alt+S - Skip current image
Esc - Clear input field

File Organization

crops/ - Place your image files here (.jpg, .png, .bmp, .tiff, .gif)
completed/ - Successfully annotated images are moved here
skipped/ - Skipped images are moved here
labels/labels.txt - All annotations saved in tab-separated format
labels/default.txt - Optional OCR predictions (tab-separated: filename→text)
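
The folder layout above is enough to derive both the random image order and the progress statistics. The sketch below shows one way to do that from the file system alone; the function names, the seed parameter, and the case-insensitive extension check are assumptions for illustration, not a description of how app.py works.

```python
import random
from pathlib import Path

# Extensions listed for crops/ in the file organization above.
IMAGE_EXTENSIONS = {".jpg", ".png", ".bmp", ".tiff", ".gif"}


def list_images(folder):
    """Return the names of supported image files in a folder (empty if missing)."""
    folder = Path(folder)
    if not folder.is_dir():
        return []
    return sorted(
        p.name
        for p in folder.iterdir()
        # Assumption: extensions are matched case-insensitively.
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )


def pending_in_random_order(root=".", seed=None):
    """Shuffle the images still waiting in crops/ (seed is for repeatability)."""
    names = list_images(Path(root) / "crops")
    random.Random(seed).shuffle(names)
    return names


def progress(root="."):
    """Count images per folder to mirror the completed/skipped/remaining stats."""
    root = Path(root)
    return {
        "completed": len(list_images(root / "completed")),
        "skipped": len(list_images(root / "skipped")),
        "remaining": len(list_images(root / "crops")),
    }
```

Because annotated and skipped images are moved out of crops/, counting files per folder stays accurate across restarts without any extra state.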

Output Format

Annotations are saved in labels/labels.txt in tab-separated format.
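
A minimal sketch of the save step, assuming each record in labels/labels.txt is a single filename, tab, text line (the same layout as labels/default.txt; the exact columns are not confirmed here). The function name save_annotation and the whitespace cleanup are hypothetical choices, not the tool's actual code.

```python
import shutil
from pathlib import Path


def save_annotation(filename, text, root="."):
    """Append one `filename<TAB>text` line, then move the image to completed/.

    Hypothetical helper; the column layout of labels.txt is an assumption.
    """
    root = Path(root)
    labels_dir = root / "labels"
    completed_dir = root / "completed"
    labels_dir.mkdir(parents=True, exist_ok=True)
    completed_dir.mkdir(parents=True, exist_ok=True)

    # Assumption: collapse tabs, newlines, and repeated spaces into single
    # spaces so that every record stays on one line with exactly one tab.
    clean_text = " ".join(text.split())

    # Append mode keeps earlier annotations; UTF-8 preserves RTL text.
    with (labels_dir / "labels.txt").open("a", encoding="utf-8") as handle:
        handle.write(f"{filename}\t{clean_text}\n")

    shutil.move(str(root / "crops" / filename), str(completed_dir / filename))
```

Writing the label before moving the file means an interrupted save leaves the image in crops/, where it can be annotated again, rather than in completed/ without a label.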
