Medical RAG Chatbot is a Retrieval-Augmented Generation (RAG) application that answers medical questions using trusted knowledge extracted from PDF documents (e.g., guidelines, manuals, clinical notes). It combines Groq LLM, Hugging Face embeddings, FAISS vector search, and LangChain to provide grounded answers with relevant context.
The app provides a Flask backend API, a lightweight HTML/CSS frontend, containerization with Docker, vulnerability scanning using Trivy, and CI/CD automation via Jenkins for AWS deployment.
- 📄 Ingest medical PDFs using PyPDF
- 🔍 Semantic retrieval using Hugging Face embeddings + FAISS
- 🤖 Context-grounded answers using Groq LLM
- 🔗 LangChain orchestration for RAG pipeline (retrieval + generation)
- 🌐 Flask API for chat + ingestion endpoints
- 🎨 Simple HTML/CSS web UI
- 🐳 Dockerized application for consistent deployment
- 🔐 Security scanning using Aqua Trivy (Docker image vulnerabilities)
- 🔁 Jenkins CI/CD pipeline for automated build, scan, and deploy
- ☁️ AWS deployment-ready workflow
- PDFs are loaded and text is extracted using PyPDF
- Text is chunked and embedded using Hugging Face embeddings
- Embeddings are indexed in FAISS
- User query is embedded and matched against FAISS for top relevant chunks
- Retrieved context is passed to the Groq LLM via LangChain
- Chatbot returns an answer grounded in retrieved evidence
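
The retrieval flow described in the steps above can be sketched in plain Python. This is a minimal, dependency-free illustration and not the project's actual code: a toy bag-of-words vector stands in for the Hugging Face embeddings, a brute-force cosine search stands in for FAISS, and the prompt is only built, not sent to the Groq LLM. All function names (`chunk_text`, `embed`, `retrieve`, `build_prompt`) and the sample sentences are hypothetical placeholders.

```python
import math
import re
from collections import Counter


def chunk_text(text, chunk_size=40, overlap=10):
    """Split text into overlapping word windows (stand-in for a text splitter)."""
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def embed(text):
    """Toy bag-of-words vector (assumption: the app uses a Hugging Face model here)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, index, k=2):
    """Brute-force top-k search (assumption: FAISS performs this step in the app)."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


def build_prompt(query, context_chunks):
    """Assemble the grounded prompt that would be passed to the LLM."""
    context = "\n\n".join(context_chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    )


# Illustrative sample text standing in for text extracted from PDFs.
documents = [
    "Hypertension is persistently elevated blood pressure.",
    "Metformin is commonly used as a first-line medication for type 2 diabetes.",
    "Asthma symptoms include wheezing, coughing, and shortness of breath.",
]

# Index step: chunk every document and store (chunk, vector) pairs.
index = [(chunk, embed(chunk)) for doc in documents for chunk in chunk_text(doc)]

# Query step: retrieve the best chunk and build the prompt for the LLM.
query = "What medication is used for type 2 diabetes?"
top_chunks = retrieve(query, index, k=1)
prompt = build_prompt(query, top_chunks)
print(prompt)
```

In the application itself, each stand-in would be replaced by the corresponding component (PyPDF for loading, Hugging Face embeddings, a FAISS index, and a Groq LLM call wired together through LangChain); the shape of the flow stays the same.
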
| Category | Tools |
|---|---|
| LLMs | Groq |
| Embeddings | Hugging Face |
| RAG Framework | LangChain |
| Vector Store | FAISS (local) |
| PDF Processing | PyPDF |
| Backend | Flask |
| Frontend | HTML / CSS |
| Containerization | Docker |
| Security Scanning | Aqua Trivy |
| CI/CD | Jenkins |
| Cloud | AWS |
| SCM | GitHub |
git clone https://github.com/your-username/medical-rag-chatbot.git
cd medical-rag-chatbot
python -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
pip install -e .
python src/app.py