A deep learning project for classifying Nepali Sign Language characters using TensorFlow. This project processes sign language images and converts them into TFRecord format for efficient training of neural networks.
- For data collection information, see DATA_COLLECTION.md
- For data processing information, see DATA_PROCESSING.md
- For model training information, see MODEL_TRAINING.md
- For model training strategies, see TRAINING_STRATEGIES.md
Download Link: Nepali Sign Language Character Dataset
The dataset contains images of Nepali Sign Language characters (0-35) with two background types:
- Plain Background: Clean images with uniform backgrounds
- Random Background: Images with varied, realistic backgrounds
data/
├── Plain Background/
│ ├── 0/ (Character '0' images)
│ ├── 1/ (Character '1' images)
│ ├── ...
│ └── 35/ (Character '35' images)
└── Random Background/
  ├── 0/ (Character '0' images)
  ├── 1/ (Character '1' images)
  ├── ...
  └── 35/ (Character '35' images)
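As a rough illustration of how this layout maps to labels, the sketch below walks both background folders and pairs each image path with the integer name of its class folder. The function name `list_images`, the accepted file extensions, and the choice to skip non-numeric folders are assumptions made for this example; they are not taken from `tfrecord.py`.

```python
from pathlib import Path

# Folder names come from the layout above; everything else here is an
# assumption for illustration, not the project's actual implementation.
BACKGROUNDS = ("Plain Background", "Random Background")
IMAGE_SUFFIXES = {".jpg", ".jpeg"}


def list_images(data_root):
    """Return sorted (path, label) pairs; label is the class folder name as an int."""
    samples = []
    for background in BACKGROUNDS:
        background_dir = Path(data_root) / background
        if not background_dir.is_dir():
            continue  # tolerate a copy of the dataset with only one background type
        for class_dir in background_dir.iterdir():
            # Class folders are named 0..35; skip anything else (e.g. stray files).
            if not (class_dir.is_dir() and class_dir.name.isdigit()):
                continue
            label = int(class_dir.name)
            for image_path in class_dir.iterdir():
                if image_path.suffix.lower() in IMAGE_SUFFIXES:
                    samples.append((str(image_path), label))
    return sorted(samples)
```

Calling `list_images("data")` from the repository root would then yield one `(path, label)` pair per image across both background types.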
nsl-classification/
├── data/ # Raw dataset (Plain & Random Background)
├── tfrecords/ # Processed TFRecord files
│ ├── train.tfrecord # Training data (70%)
│ ├── val.tfrecord # Validation data (15%)
│ └── test.tfrecord # Test data (15%)
├── tfrecord.py # Data preprocessing script
├── nsl.ipynb # Main training notebook
├── DATA_PREPARATION.md # Detailed preprocessing documentation
├── pyproject.toml # Project dependencies
└── README.md # README
- 36 Classes: Nepali Sign Language characters (0-35)
- Dual Background Types: Plain and random backgrounds for robustness
- TFRecord Format: Optimized binary format for fast training
- Stratified Splitting: Balanced train/validation/test splits
- Image Preprocessing: Standardized 256×256 pixel images
- Progress Tracking: Visual progress bars during data processing
- Python ≥ 3.12
- TensorFlow ≥ 2.20.0
- scikit-learn ≥ 1.7.2
- tqdm ≥ 4.67.1
- Clone the repository:
  git clone <repository-url>
  cd nsl
- Install dependencies:
  uv sync
  Or install manually:
  uv add tensorflow scikit-learn tqdm
- Download the dataset from Kaggle
- Extract the dataset to the data/ directory
- Run the preprocessing script:
python3 tfrecord.py
This will:
- Process all images from both background types
- Resize images to 256×256 pixels
- Create stratified train/validation/test splits (70%/15%/15%)
- Generate optimized TFRecord files in the tfrecords/ directory
- Display progress bars for each processing step
Open and run the Jupyter notebook:
jupyter notebook nsl.ipynb
The notebook includes:
- TFRecord loading and parsing
- Data augmentation techniques
- Model architecture definition
- Training loop with validation
- Performance evaluation
- Input: JPEG images of varying sizes
- Output: 256×256 RGB images normalized to [0,1]
- Format: TFRecord with image and label features
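To make the record format concrete, here is a minimal sketch of reading such a file back with `tf.data`. It assumes each record stores a JPEG-encoded image under the key `"image"` and an integer class under `"label"`; those key names, the JPEG encoding inside the record, and the helper names `parse_example` and `load_split` are assumptions, so adjust them to whatever `tfrecord.py` actually writes.

```python
IMAGE_SIZE = 256  # target height and width stated above
# Feature keys are assumptions; use the names that tfrecord.py actually writes.
IMAGE_KEY, LABEL_KEY = "image", "label"


def parse_example(serialized):
    """Decode one serialized tf.train.Example into (float32 image in [0, 1], int64 label)."""
    import tensorflow as tf  # imported lazily so the module loads without TensorFlow

    feature_spec = {
        IMAGE_KEY: tf.io.FixedLenFeature([], tf.string),
        LABEL_KEY: tf.io.FixedLenFeature([], tf.int64),
    }
    parsed = tf.io.parse_single_example(serialized, feature_spec)
    image = tf.io.decode_jpeg(parsed[IMAGE_KEY], channels=3)  # uint8, H x W x 3
    image = tf.image.convert_image_dtype(image, tf.float32)   # rescales to [0, 1]
    image = tf.image.resize(image, [IMAGE_SIZE, IMAGE_SIZE])  # no-op if already 256x256
    return image, parsed[LABEL_KEY]


def load_split(path, batch_size=32):
    """Build a batched, prefetched dataset from one TFRecord file."""
    import tensorflow as tf

    dataset = tf.data.TFRecordDataset(path)
    dataset = dataset.map(parse_example, num_parallel_calls=tf.data.AUTOTUNE)
    return dataset.batch(batch_size).prefetch(tf.data.AUTOTUNE)
```

With these assumptions, `load_split("tfrecords/train.tfrecord")` would yield batches of `(image, label)` tensors ready to pass to `model.fit`.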
- Training: 70% of data for model training
- Validation: 15% for hyperparameter tuning
- Test: 15% for final model evaluation
- Stratification: Maintains class balance across all splits
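The idea behind a stratified 70/15/15 split can be sketched in a few lines. The project lists scikit-learn as a dependency, and how `tfrecord.py` performs the split is not shown here; the version below deliberately uses only the standard library to show the principle, and the function name `stratified_split` and the seed value are assumptions.

```python
import random
from collections import defaultdict


def stratified_split(labels, fractions=(0.70, 0.15, 0.15), seed=42):
    """Split sample indices into train/val/test, applying the fractions per class."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)  # a fixed seed keeps the splits reproducible
    train, val, test = [], [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        n_train = round(len(indices) * fractions[0])
        n_val = round(len(indices) * fractions[1])
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        test += indices[n_train + n_val:]  # the remainder goes to the test split
    return train, val, test


# Example: 36 classes with 100 images each.
labels = [c for c in range(36) for _ in range(100)]
train, val, test = stratified_split(labels)
print(len(train), len(val), len(test))  # 2520 540 540
```

Because the fractions are applied within each class rather than to the dataset as a whole, every class contributes 70, 15, and 15 samples in this example, which is what keeps the class balance identical across splits.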
- Performance: 5-10x faster loading compared to individual image files
- Storage: Binary format with optional compression (GZIP or ZLIB) to reduce disk usage
- Memory: Efficient batch processing for large datasets
- Reproducibility: Consistent data splits across experiments
The project uses TensorFlow/Keras for building convolutional neural networks suitable for image classification tasks. The notebook explores various architectures optimized for sign language recognition.
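For orientation, a small convolutional baseline for 256×256 RGB inputs and 36 output classes might look like the sketch below. This is not the architecture from `nsl.ipynb`; the layer sizes, the optimizer, and the function name `build_baseline_cnn` are all assumptions chosen only to show the overall shape of such a model.

```python
NUM_CLASSES = 36             # characters 0-35
INPUT_SHAPE = (256, 256, 3)  # 256x256 RGB images


def build_baseline_cnn():
    """Return a small compiled Keras CNN; a hypothetical baseline, not the notebook's model."""
    import tensorflow as tf  # imported lazily so the module loads without TensorFlow

    layers = tf.keras.layers
    model = tf.keras.Sequential([
        tf.keras.Input(shape=INPUT_SHAPE),
        layers.Conv2D(32, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(128, 3, activation="relu"),
        layers.GlobalAveragePooling2D(),
        layers.Dense(NUM_CLASSES, activation="softmax"),
    ])
    # Integer class labels pair with the sparse variant of cross-entropy.
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model
```

Calling `build_baseline_cnn().summary()` prints the layer shapes, which is a quick way to confirm the input and output dimensions before training.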
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
This project is open source. Please check the dataset license on Kaggle for data usage terms.