An intelligent web accessibility auditor that finds WCAG violations and provides AI-powered suggestions to help developers fix them.
Automatic subtitle generation, content summarization, and chapter segmentation | AI-driven education video analysis using Whisper, BLIP-2, and DeepSeek.
A full-stack AI-powered image captioning app built with ReactJS (using Vite) and Flask. Users can upload images, and the app generates descriptive captions using Hugging Face’s BLIP model. Perfect for showcasing AI integration and web development skills in a mini-project.
This repository contains a small set of Jupyter notebooks demonstrating key computer vision and vision–language tasks using pretrained models. The final notebook integrates these tasks into a real-time webcam application that performs captioning and classification concurrently.
Emotica AI is a compassionate and therapeutic virtual assistant designed to provide empathetic and supportive conversations. It integrates a local LLaMA model for text generation, a vision model for image captioning, a RAG system for information retrieval, and emotion detection to tailor its responses.
Fine-tuned BLIP model on Flickr8k for multimodal image captioning (vision + language).
Fine-tuned the BLIP model to accurately caption images of Tom and Jerry.
A Flask-based API that generates captions for images using a custom deep learning model (BLIP). The API accepts an image or image frames and returns the caption generated by the BLIP model.
LUME is an AI-powered app that turns your images into viral memes. Upload a photo, add an optional trending topic, and let Lume use BLIP and Groq AI to craft witty, high-quality captions with stylish overlays—ready to download and share instantly.
Drone-based Image Descriptor - Toyota Hackathon 2025.
A Visual Question Answering (VQA) Application.
An AI-powered image captioning web app using BLIP model from Hugging Face and Gradio.
A simple web application that generates captions for images using the BLIP model from Hugging Face Transformers and a user-friendly interface created with Gradio.
AI StoryTeller is a multimodal AI application that converts images into creative short stories by combining computer vision and natural language generation. The system uses a pretrained image captioning model to understand visual content and Google Gemini to generate context-aware narratives grounded in the image.
Welcome to the AI-Powered Interactive Learning Assistant! 🚀 An open-source, free project with modest hardware requirements, designed especially for students and educators. Our goal is to bring the power of AI right into your classroom, making learning more interactive, engaging, and accessible for everyone.
This project generates behavioral descriptions from images by combining computer vision and natural language processing. It goes beyond basic scene descriptions to infer human behaviors, intentions, and social contexts.
🎥 Enable real-time image captioning and classification with this Jupyter notebook collection, featuring pretrained models and live webcam applications.