📘 DocQuery-AI

Multi-PDF AI Question Answering System (Local + Cloud)

DocQuery-AI is a document question-answering system that lets users upload, select, and query multiple PDF documents using semantic search and Retrieval-Augmented Generation (RAG).

It supports both offline (local LLM) and online (cloud LLM) modes, so the same app can run privately on a local machine or be deployed to the cloud.

Answers are grounded in the uploaded PDFs and carry page-level and section-level citations, which limits hallucinations and lets each answer be checked against its source.

🌐 Live Demo (Cloud Mode)

🚀 DocQuery-AI is live here:
👉 https://docquery-ai-o9hwz6rjgeah75ytyv6l9y.streamlit.app/

⚡ This live deployment uses Groq LLM (llama-3.1-8b-instant) for fast, free cloud inference.

🧠 LLM Modes Supported

🔹 Cloud Mode (Current Live Demo)

LLM: Groq – llama-3.1-8b-instant
Inference: Cloud-based
Cost: Free tier (no billing)
Best for: Deployment, demos, sharing

🔹 Offline Mode (Local Setup)

LLM: Ollama (e.g., Phi, Mistral)
Inference: Fully local
Internet: Not required
Best for: Privacy-focused & offline use
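
Switching between the two modes could look roughly like the helper below. This is a minimal sketch, not the project's actual code: the `GROQ_API_KEY` variable name, the `select_backend` function, and the shape of the returned dictionary are all assumptions made for illustration. Only the model names come from this README.

```python
import os


def select_backend(env=None):
    """Pick cloud mode when a Groq key is present, otherwise offline mode.

    Hypothetical helper: GROQ_API_KEY and the returned dict shape are
    assumptions, not names taken from the DocQuery-AI source.
    """
    env = os.environ if env is None else env
    if env.get("GROQ_API_KEY"):
        return {"mode": "cloud", "provider": "groq", "model": "llama-3.1-8b-instant"}
    return {"mode": "offline", "provider": "ollama", "model": "phi"}


# With no key set, the sketch falls back to the local Ollama model.
print(select_backend({}))
```

Passing the environment as an argument keeps the choice easy to test without touching real environment variables.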

🚀 Key Features

📂 Multi-PDF Upload & Selection
🔍 Semantic Search using FAISS
🧠 LLM-powered Answers (Groq or Ollama)
📄 Page-wise & Section-wise Explanations
🧾 Source Citations (PDF | Section | Page)
📊 Confidence Scoring
🛡️ Hallucination Control (PDF-grounded answers only)
⚡ Fast & Lightweight UI (Streamlit)
💸 No mandatory billing required
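
To make the search, citation, and confidence features concrete, here is a small self-contained sketch. It is illustrative only: a bag-of-words vector stands in for a SentenceTransformers embedding, a linear scan stands in for a FAISS index, and using the top similarity score as the confidence value is an assumption, since this README does not say how confidence is computed. The chunk fields (`pdf`, `section`, `page`, `text`) are hypothetical names.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy bag-of-words vector; a stand-in for a SentenceTransformers embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query, chunks, top_k=2):
    """Linear-scan nearest-neighbour search; a stand-in for a FAISS lookup."""
    q = embed(query)
    scored = [(cosine(q, embed(c["text"])), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


def cite(chunk):
    """Format a source in the 'PDF | Section | Page' style."""
    return f'{chunk["pdf"]} | {chunk["section"]} | Page {chunk["page"]}'


chunks = [
    {"pdf": "signals.pdf", "section": "Continuous-Time Signals", "page": 10,
     "text": "A continuous-time signal is defined for every instant of time."},
    {"pdf": "signals.pdf", "section": "Sampling", "page": 42,
     "text": "Sampling converts a signal into a discrete sequence."},
]

score, best = retrieve("What is a continuous-time signal?", chunks, top_k=1)[0]
# Assumed confidence measure: the top similarity score, clamped to [0, 1].
confidence = max(0.0, min(1.0, score))
print(cite(best), f"confidence={confidence:.2f}")
```

Keeping the PDF name, section, and page alongside each chunk is what makes a citation possible once the chunk is retrieved.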

🧠 System Architecture

PDFs
 └── Text Extraction (PyMuPDF)
      └── Section Detection (Font-based)
           └── Embeddings (SentenceTransformers)
                └── Vector Store (FAISS)
                     └── Query Retrieval
                          └── LLM (Ollama locally, or Groq in cloud mode)
                               └── Answer + Sources
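
The font-based section detection step can be sketched as follows. This is one plausible approach, not necessarily the one used in `pdf_processor.py`: it assumes the page data has the shape returned by PyMuPDF's `page.get_text("dict")` (blocks, then lines, then spans with `size` and `text`), and the rule "a heading is at least 1.2 times the most common font size" is an illustrative guess. The `detect_sections` name is hypothetical.

```python
from collections import Counter


def detect_sections(page_dict, ratio=1.2):
    """Return the text of spans whose font size is clearly larger than body text.

    page_dict is assumed to follow the layout of PyMuPDF's
    page.get_text("dict"): blocks -> lines -> spans with "size" and "text".
    The ratio threshold is an illustrative guess.
    """
    spans = [
        span
        for block in page_dict.get("blocks", [])
        for line in block.get("lines", [])  # image blocks carry no "lines"
        for span in line.get("spans", [])
        if span["text"].strip()
    ]
    if not spans:
        return []
    # Treat the most common (rounded) font size as the body size.
    body_size = Counter(round(s["size"]) for s in spans).most_common(1)[0][0]
    return [s["text"].strip() for s in spans if s["size"] >= body_size * ratio]


# Hand-built sample in the same shape, so the sketch runs without a PDF.
sample_page = {"blocks": [{"lines": [
    {"spans": [{"size": 16.0, "text": "1. Continuous-Time Signals"}]},
    {"spans": [{"size": 10.0, "text": "A signal is a function of time."}]},
    {"spans": [{"size": 10.0, "text": "It may be real or complex valued."}]},
]}]}

print(detect_sections(sample_page))
```

With a real document, the same function could be fed `page.get_text("dict")` for each page, and the detected headings attached to the chunks that follow them.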

🧰 Tech Stack

Layer	Technology
UI	Streamlit
PDF Parsing	PyMuPDF
Embeddings	SentenceTransformers
Vector Database	FAISS
LLM	Groq (llama-3.1-8b-instant) or Ollama (phi / mistral)
Language	Python

⚙️ Installation & Setup

1️⃣ Clone the Repository

git clone https://github.com/Subhajit-Das-1/DocQuery-AI.git
cd DocQuery-AI

2️⃣ Install Dependencies

pip install -r requirements.txt

3️⃣ Install Ollama & Pull Model

ollama pull phi

4️⃣ Run the Application

streamlit run app.py
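
Before launching the app in offline mode, it can help to confirm that Ollama is running and the model has been pulled. The check below is a hedged sketch and not part of the repository: `ollama_has_model` is a hypothetical helper, and it assumes Ollama is listening on its default address, `http://localhost:11434`, and queries its model listing endpoint, `/api/tags`.

```python
import json
import urllib.request


def ollama_has_model(model, base_url="http://localhost:11434", timeout=2.0):
    """Return True if a local Ollama server lists the given model.

    Queries Ollama's GET /api/tags endpoint; returns False when the server
    is unreachable or the reply is not valid JSON.
    """
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            models = json.load(resp).get("models", [])
    except (OSError, ValueError):
        return False
    # Names look like "phi:latest", so compare the part before the tag.
    return any(m.get("name", "").split(":")[0] == model for m in models)


if ollama_has_model("phi"):
    print("Ollama is running and 'phi' is available.")
else:
    print("Start Ollama and run: ollama pull phi")
```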

🖥️ How to Use

Upload one or more PDF files
Select PDFs from the sidebar
Ask questions such as:
- What is inside this PDF?
- Explain page 10
- Explain continuous-time signals
View:
- ✅ AI-generated answer
- 📊 Confidence score
- 📌 Source pages & sections
- 🔍 Highlighted reference text
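
The example questions mix two kinds of request: a page lookup ("Explain page 10") and a topic search ("Explain continuous-time signals"). One way an app could tell them apart is sketched below. The `route_question` function and its "any mention of page N means a page lookup" rule are assumptions for illustration, not a description of how `qa_engine.py` works.

```python
import re

# Matches "page 10", "Page 3", and similar mentions.
PAGE_PATTERN = re.compile(r"\bpage\s+(\d+)\b", re.IGNORECASE)


def route_question(question):
    """Classify a question as a page lookup or a semantic search.

    Hypothetical routing rule: any 'page N' mention triggers a page lookup.
    """
    match = PAGE_PATTERN.search(question)
    if match:
        return {"kind": "page", "page": int(match.group(1))}
    return {"kind": "semantic", "query": question.strip()}


for q in ["Explain page 10", "Explain continuous-time signals"]:
    print(q, "->", route_question(q))
```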

🧩 Future Enhancements

🧠 OCR support for scanned PDFs
📊 PDF comparison mode
📝 Auto-generated notes & summaries
❓ MCQ / exam question generator
📤 Export answers to PDF / DOCX

📜 License

This project is licensed under the MIT License.

👤 Author

Subhajit Das
