A curated list of awesome AI tools and repositories on GitHub
- Awesome AI Tools
- Contents
- Large Language Models
- AI Agents & Autonomous Systems
- AI Assistants & Chatbots
- RAG & Document AI
- AI Workflow Platforms
- Voice & Speech
- Face & Video Manipulation
- Video Generation
- Image Generation
- Image Enhancement
- Music & Audio
- Code Assistants
- Computer Vision
- AI Infrastructure
- Learning Resources
- Observability & Evaluation
- Specialized AI Tools
- Archived Projects
- Contributing
Run and deploy large language models locally or in the cloud.
- ollama - Get up and running with Llama 3.3, Mistral, Gemma 2, and other large language models.
- DeepSeek-V3 - Next generation language model from DeepSeek AI.
- DeepSeek-R1 - Advanced reasoning model from DeepSeek AI.
- gpt4all - Run Local LLMs on Any Device. Open-source and available for commercial use.
- llamafile - Distribute and run LLMs with a single file.
- mlc-llm - Universal LLM Deployment Engine with ML Compilation.
- koboldcpp - Run GGUF models easily with a KoboldAI UI. One File. Zero Install.
- llama-gpt - A self-hosted, offline, ChatGPT-like chatbot powered by Llama 2. 100% private.
- FreedomGPT - Run the FreedomGPT LLM locally (offline and private) on Mac and Windows.
- gemma.cpp - Lightweight, standalone C++ inference engine for Google's Gemma models.
- gpt4free - A collection of powerful language models.
Autonomous agents that can perform tasks, research, and make decisions.
- AutoGPT - Autonomous AI for everyone. Chain together LLM calls to autonomously achieve goals.
- openclaw - Your own personal AI assistant. Any OS. Any Platform. 🦞
- AgentGPT - Assemble, configure, and deploy autonomous AI Agents in your browser.
- goose - Open-source AI agent that goes beyond code suggestions: install, execute, edit, and test.
- gpt-researcher - Autonomous agent that conducts web research and generates comprehensive reports with citations.
- storm - LLM-powered knowledge curation system that researches topics and generates full reports.
- plandex - AI-driven development in your terminal. Designed for large, real-world tasks.
User-friendly chat interfaces and personal AI assistants.
- ChatGPT-Next-Web - Cross-platform ChatGPT/Gemini UI (Web / PWA / Linux / Win / MacOS).
- khoj - Your AI second brain. Self-hostable. Build custom agents, schedule automations, do deep research.
- Perplexica - AI-powered answering engine.
- FreeAskInternet - Completely free, private, and local search aggregator using multiple LLMs.
- browse-for-me - Arc Search Browse for Me clone using Ollama and DuckDuckGo.
Retrieval Augmented Generation and document processing systems.
- private-gpt - Interact with your documents using GPT, 100% privately, no data leaks.
- quivr - Production-ready RAG platform. Any LLM, any vectorstore, any file.
- localGPT - Chat with your documents on your local device. No data leaves your device.
- docling - Get your documents ready for gen AI.
No-code/low-code platforms for building AI applications and workflows.
- dify - Production-ready platform for agentic workflow development.
- langflow - Low-code app builder for RAG and multi-agent AI applications. Python-based and agnostic.
Text-to-speech, voice cloning, and speech recognition tools.
- whisper - Robust Speech Recognition via Large-Scale Weak Supervision.
- fish-speech - SOTA Open Source TTS.
- CosyVoice - Multi-lingual large voice generation model with full-stack deployment ability.
- Real-Time-Voice-Cloning - Clone a voice in 5 seconds to generate arbitrary speech in real-time.
- MockingBird - AI voice cloning: Clone a voice in 5 seconds.
- OpenVoice - Instant voice cloning by MIT and MyShell.
- Wav2Lip - Lip sync expert for speech to lip generation.
- AudioLDM - Generate speech, sound effects, music and beyond, with text.
- AI-Video-Translation - Translate videos into multiple languages with lip sync.
Face swap, deepfake, and face animation tools.
- Deep-Live-Cam - Real-time face swap and one-click video deepfake with only a single image.
- facefusion - Industry-leading face manipulation platform.
- roop - One-click face swap.
- LivePortrait - Bring portraits to life!
- deepface - Lightweight Face Recognition and Facial Attribute Analysis (Age, Gender, Emotion, Race).
- hallo - Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation.
Text-to-video and image-to-video generation.
- Open-Sora-Plan - Open-source reproduction of Sora (OpenAI's text-to-video model).
- LTX-Video - Official repository for LTX-Video.
Text-to-image generation and diffusion models.
- ComfyUI - The most powerful and modular diffusion model GUI with a graph/nodes interface.
- Janus - Unified Multimodal Understanding and Generation Models.
- DiffSynth-Studio - Enjoy the magic of Diffusion models!
- Omost - Your image is almost there!
Image upscaling, restoration, and enhancement.
- GFPGAN - Practical algorithms for real-world face restoration.
- Real-ESRGAN - Practical algorithms for general image/video restoration.
Music generation and audio synthesis.
- bark - Text-Prompted Generative Audio Model.
- magenta-js - Music and Art Generation with Machine Learning in the browser.
AI-powered coding tools and assistants.
- fabric - Open-source framework for augmenting humans using AI prompts.
- cofounder - AI-generated apps, full stack + generative UI.
- aiXcoder-7B - Official repository of aiXcoder-7B Code Large Language Model.
- writer - AI-powered documentation writer.
Object detection, tracking, and visual analysis.
- yolov10 - Real-Time End-to-End Object Detection [NeurIPS 2024].
- samurai - Adapting Segment Anything Model for Zero-Shot Visual Tracking with Motion-Aware Memory.
Tools and platforms for deploying and managing AI systems.
- openpilot - Operating system for robotics. Upgrades driver assistance on 275+ supported cars.
Courses and educational materials for AI/ML.
- llm-course - Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
Monitoring, debugging, and evaluating AI systems.
- phoenix - AI Observability & Evaluation.
Domain-specific and experimental AI applications.
- MiniCPM-o - Gemini 2.5 Flash Level MLLM for Vision, Speech, and Multimodal Live Streaming on phones.
- LLaVA - Visual Instruction Tuning towards GPT-4V level capabilities.
- MiniCPM-V - GPT-4V Level MLLM for Single Image, Multi Image and Video.
- Qwen2-VL - Multimodal large language model series by Qwen team, Alibaba Cloud.
- llama-fs - A self-organizing file system with Llama 3.
- ai-town - Virtual town where AI characters live, chat and socialize.
- Marp-AI - Generate presentations using Ollama AI and Marp.
These projects haven't been updated in over 2 years and may be unmaintained or deprecated.
- jukebox - A Generative Model for Music. (Last updated: Nov 2020)
- stable-diffusion - A latent text-to-image diffusion model. (Last updated: Nov 2022)
- dalai - The simplest way to run LLaMA on your local machine. (Last updated: Mar 2023)
- VoiceGPT - AI powered by EdgeGPT and LLaMa.cpp that you can talk to! (Last updated: Apr 2023)
- AudioGPT - Understanding and Generating Speech, Music, Sound, and Talking Head. (Last updated: May 2023)
- stanford_alpaca - Code and documentation to train Stanford's Alpaca models. (Last updated: May 2023)
- Doctor-Dignity - LLM that can pass the US Medical Licensing Exam. Works offline. (Last updated: Sep 2023)
Contributions are welcome! Please feel free to submit a Pull Request.
To add a new tool:
- Fork this repository
- Add the repository link and description in the appropriate category
- Ensure the description is concise and accurate
- Submit a PR with your changes
For major changes or new categories, please open an issue first to discuss.
Made with ❤️ by JMcrafter26