Recognizing American Sign Language gestures from images using Convolutional Neural Networks (CNNs), data augmentation, and real-time webcam prediction.
## Table of Contents
- Overview
- Business Problem
- Dataset
- Tools & Technologies
- Project Structure
- Data Cleaning & Preparation
- Exploratory Data Analysis (EDA)
- Research Questions & Key Findings
- Dashboard
- How to Run This Project
- Final Recommendations
- Author & Contact
## Overview
This project detects ASL gestures from image inputs using a deep learning pipeline built in Google Colab. It leverages CNNs and real-time webcam prediction to classify hand gestures with high accuracy. The workflow includes image preprocessing, model training, evaluation, and live prediction — all within a Colab notebook.
## Business Problem
ASL recognition can support inclusive communication for the deaf and hard-of-hearing community. This project aims to:
- Automate ASL gesture detection from webcam feeds
- Reduce reliance on manual interpretation
- Enable real-time prediction in browser-based environments
- Support accessibility in digital platforms
## Dataset
- Image dataset uploaded to `/content/data/` in Colab, with one folder per gesture (A–Z)
- Includes training and validation sets
- Augmented using rotation, flip, zoom, and brightness tuning
- Stored temporarily in Colab runtime or mounted from Google Drive
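The rotation, flip, zoom, and brightness augmentations above are typically wired up through `tf.keras`' `ImageDataGenerator` in the training notebook; as a minimal, dependency-light sketch, the two numpy-friendly ops (flip and brightness) look like this. The `augment` function and the random sample image are illustrative, not taken from the project code:

```python
import numpy as np

def augment(image, rng):
    """Randomly flip and brightness-scale one image.

    `image` is a float32 array in [0, 1] with shape (H, W, 3).
    Rotation and zoom (also used in this project) are easiest to add
    via Keras' ImageDataGenerator and are omitted from this sketch.
    """
    out = image.copy()
    if rng.random() < 0.5:
        out = out[:, ::-1, :]           # horizontal flip
    factor = rng.uniform(0.8, 1.2)      # brightness tuning
    return np.clip(out * factor, 0.0, 1.0)

rng = np.random.default_rng(42)
sample = rng.random((64, 64, 3), dtype=np.float32)
augmented = augment(sample, rng)
print(augmented.shape)  # (64, 64, 3)
```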
## Tools & Technologies
- Python (TensorFlow, OpenCV, NumPy, Matplotlib)
- Google Colab (GPU acceleration, webcam access)
- GitHub (for version control and notebook hosting)
- Google Drive (optional dataset mounting)
## Project Structure

```
asl_detection/
│
├── README.md
├── train_model.ipynb            # Main training notebook
├── realTime_prediction.ipynb    # Webcam prediction notebook
│
├── asl_dataset/                 # Uploaded image dataset
│   ├── A/
│   ├── B/
│   └── ... Z/
│
├── models/                      # Saved models (.h5)
│   └── asl_model.h5
├── requirements.txt
└── asl_detection.pdf            # Project summary PDF
```
## Data Cleaning & Preparation
- Removed blurry and mislabeled gesture images
- Resized all images to 64x64 pixels
- Applied augmentation to balance gesture classes
- Encoded labels and split into train/val sets
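The label-encoding and train/validation split can be sketched in plain numpy as below. The function name `encode_and_split`, the 80/20 ratio, and the toy two-class data are illustrative assumptions, not the notebook's actual code:

```python
import string
import numpy as np

def encode_and_split(images, labels, val_frac=0.2, seed=0):
    """Encode A–Z labels as integers 0–25 and split into train/val sets.

    `images`: array of shape (N, 64, 64, 3); `labels`: list of letters.
    """
    letter_to_idx = {c: i for i, c in enumerate(string.ascii_uppercase)}
    y = np.array([letter_to_idx[c] for c in labels])
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(y))          # shuffle before splitting
    n_val = int(len(y) * val_frac)
    val_idx, train_idx = order[:n_val], order[n_val:]
    return (images[train_idx], y[train_idx]), (images[val_idx], y[val_idx])

# Tiny demo: 10 fake 64x64 images across two classes
imgs = np.zeros((10, 64, 64, 3), dtype=np.float32)
labels = list("AAAAABBBBB")
(train_x, train_y), (val_x, val_y) = encode_and_split(imgs, labels)
print(train_x.shape, val_x.shape)  # (8, 64, 64, 3) (2, 64, 64, 3)
```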
## Exploratory Data Analysis (EDA)
**Class Distribution:**
- 26 ASL gesture classes (A–Z)
- Balanced using augmentation for underrepresented gestures
**Image Quality Checks:**
- Verified resolution and gesture clarity
- Removed grayscale and low-contrast images
**Sample Visuals:**
- Grid plots of gesture samples
- Bar chart of image count per class
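The per-class bar chart can be produced with Matplotlib from a simple label count. This is a generic sketch (the helper name, output filename, and toy labels are assumptions), not the notebook's plotting code:

```python
from collections import Counter
import matplotlib
matplotlib.use("Agg")            # headless backend; also fine in Colab
import matplotlib.pyplot as plt

def plot_class_counts(labels, out_path="class_counts.png"):
    """Save a bar chart of image count per gesture class."""
    counts = Counter(labels)
    classes = sorted(counts)
    plt.figure(figsize=(10, 3))
    plt.bar(classes, [counts[c] for c in classes])
    plt.xlabel("Gesture class")
    plt.ylabel("Image count")
    plt.title("Images per ASL class")
    plt.tight_layout()
    plt.savefig(out_path)
    plt.close()
    return counts

counts = plot_class_counts(list("AAABBC"))
print(dict(counts))  # {'A': 3, 'B': 2, 'C': 1}
```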
- **Model Accuracy:** Achieved 92.8% accuracy on the validation set
- **Misclassifications:** Most confusion between M vs. N and U vs. V
- **Augmentation Impact:** Boosted accuracy by 6.3% on minority gestures
- **Confidence Scores:** Average prediction confidence = 0.89
- **Real-Time Prediction:** Webcam latency < 1.2 s per frame
## Dashboard
- Real-time webcam interface via `cv2.VideoCapture()` in Colab
- Displays:
- Live gesture prediction
- Confidence score overlay
- Frame-by-frame classification
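Per frame, the overlay reduces to "take the model's output, pick the top class, format label plus confidence"; in the notebook that string would be drawn onto the frame with OpenCV's `cv2.putText()`. The sketch below stands in for the model with raw logits; `overlay_text` and the fake logits are assumptions for illustration:

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over one set of class logits."""
    z = logits - np.max(logits)
    e = np.exp(z)
    return e / e.sum()

def overlay_text(logits):
    """Format the text shown on a webcam frame: top class + confidence."""
    probs = softmax(np.asarray(logits, dtype=float))
    idx = int(np.argmax(probs))
    letter = chr(ord('A') + idx)   # classes 0-25 map to A-Z
    return f"{letter} ({probs[idx]:.2f})"

fake_logits = np.zeros(26)
fake_logits[0] = 5.0               # pretend the model strongly predicts 'A'
print(overlay_text(fake_logits))   # A (0.86)
```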
## How to Run This Project
1. Open the training notebook in Colab:
   🔗 ASL_Detection_Training.ipynb
2. Upload your dataset to `/content/data/` or mount Google Drive:
   ```python
   from google.colab import drive
   drive.mount('/content/drive')
   ```
3. Train the model by running all cells in `ASL_Detection_Training.ipynb`.
4. Run real-time webcam prediction:
   🔗 ASL_RealTime_Prediction.ipynb
## Final Recommendations
- Expand dataset with dynamic hand gestures and varied backgrounds
- Integrate hand tracking for better localization
- Use Grad-CAM for gesture explainability
- Collaborate with accessibility-focused organizations for real-world testing
## Author & Contact
**Sushree Bandita Das**
📧 Email: sushreebanditadas01@gmail.com
🔗 LinkedIn
🔗 Portfolio