A Retrieval-Augmented Generation (RAG) stack that runs locally with Ollama for language models, Qdrant for vector search, and a Streamlit UI for document ingestion and chat. Use it to curate domain-specific knowledge bases.
- Python 3.10+
- Git
- Docker Desktop (or Docker Engine) with Compose support
- Ollama running locally
- At least 8 GB RAM recommended for the default LLM (llama3.2:1b); adjust models if needed
ℹ️ The repository uses local resources only—no external APIs are required once the models are pulled.
```bash
git clone https://github.com/<your-username>/local-rag-setup.git
cd local-rag-setup

# create virtual environment (Unix/macOS)
python -m venv .venv
source .venv/bin/activate

# on Windows PowerShell
python -m venv .venv
.venv\Scripts\Activate.ps1
```

Install Python dependencies:

```bash
pip install --upgrade pip
pip install -r requirements.txt
```

Prefer uv? The same steps look like this:

```bash
uv venv --python 3.10
source .venv/bin/activate   # or .venv\Scripts\Activate.ps1 on Windows
uv pip install -r requirements.txt
```

You can then launch any command (e.g., `uv run streamlit run app.py`) without manually activating the environment.
The application reads configuration from environment variables (defaults are shown below). Create a `.env` file in the project root if you need to override them:

```
QDRANT_URL=http://localhost:6333
QDRANT_API_KEY=
QDRANT_COLLECTION=my_documents
EMBED_MODEL=mxbai-embed-large
GEN_MODEL=llama3.2:1b
RETRIEVE_K=5
INGEST_BATCH_SIZE=64
CHUNK_SIZE_DEFAULT=1000
CHUNK_OVERLAP_DEFAULT=100
DATA_PATH=data
LOG_DIR=conversation_logs
```

- `QDRANT_COLLECTION` is the default collection used when the UI first loads; you can create or switch collections at runtime.
- `DATA_PATH` is where uploaded PDFs are stored (per collection).
- `LOG_DIR` holds chat transcripts generated by the CLI tools.
- `RETRIEVE_K` controls how many chunks are retrieved per query, ranked by relevance.
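For orientation, these variables feed the `RAGSettings` dataclass in `modules/config.py`. Its exact fields are not shown in this README, so the sketch below is only an illustration, assuming plain `os.getenv` lookups and hypothetical field names:

```python
# Illustrative only: the real RAGSettings lives in modules/config.py and may
# use different field names or load .env via python-dotenv.
import os
from dataclasses import dataclass


@dataclass
class RAGSettings:
    qdrant_url: str = os.getenv("QDRANT_URL", "http://localhost:6333")
    qdrant_collection: str = os.getenv("QDRANT_COLLECTION", "my_documents")
    embed_model: str = os.getenv("EMBED_MODEL", "mxbai-embed-large")
    gen_model: str = os.getenv("GEN_MODEL", "llama3.2:1b")
    retrieve_k: int = int(os.getenv("RETRIEVE_K", "5"))
    chunk_size: int = int(os.getenv("CHUNK_SIZE_DEFAULT", "1000"))
    chunk_overlap: int = int(os.getenv("CHUNK_OVERLAP_DEFAULT", "100"))


settings = RAGSettings()
print(settings.gen_model)  # llama3.2:1b unless overridden
```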
- Launch Qdrant

  ```bash
  docker compose up -d qdrant
  ```

  Qdrant will listen on http://localhost:6333 by default. You can check container status with `docker compose ps`. The Qdrant UI is accessible at http://localhost:6333/dashboard when using the default URL.
- Pull Ollama models

  ```bash
  ollama pull mxbai-embed-large   # embeddings
  ollama pull llama3.2:1b         # generation
  ```

  Ensure the Ollama service is running in the background (`ollama serve`). You may substitute alternative models; update `EMBED_MODEL`/`GEN_MODEL` accordingly. A quick way to confirm both services are reachable is sketched after this list.
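If you want to confirm from Python that both services are up before ingesting anything, a minimal check could look like the sketch below. It assumes the `qdrant-client` package is installed and that Ollama's HTTP API is on its default port (11434); neither detail is spelled out in this README, so treat it as illustrative:

```python
# Illustrative sanity check; not part of the repository.
import json
import os
import urllib.request

from qdrant_client import QdrantClient  # assumed to be pulled in via requirements.txt

QDRANT_URL = os.getenv("QDRANT_URL", "http://localhost:6333")
OLLAMA_URL = "http://localhost:11434"  # Ollama's default HTTP port (assumption)
EMBED_MODEL = os.getenv("EMBED_MODEL", "mxbai-embed-large")
GEN_MODEL = os.getenv("GEN_MODEL", "llama3.2:1b")

# 1. Qdrant: listing collections fails fast if the container is not running.
client = QdrantClient(url=QDRANT_URL)
print("Qdrant collections:", [c.name for c in client.get_collections().collections])

# 2. Ollama: /api/tags lists the locally pulled models.
with urllib.request.urlopen(f"{OLLAMA_URL}/api/tags") as resp:
    local = [m["name"] for m in json.load(resp)["models"]]
for name in (EMBED_MODEL, GEN_MODEL):
    ok = any(tag == name or tag.startswith(name + ":") for tag in local)
    print(f"{name}: {'pulled' if ok else 'missing - run: ollama pull ' + name}")
```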
Launch the Streamlit app:

```bash
streamlit run app.py
```

- Open the Ingest Documents page.
- Pick an existing Qdrant collection or create a new one.
- Upload one or more PDF files (they will be saved under `<DATA_PATH>/<collection_name>/`).
- Adjust chunking parameters if necessary and click Process (the chunking step is sketched after this list).
- A progress bar tracks loading, chunking, and upsert status. Once complete, the collection is ready for querying.
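Behind the Process button, ingestion boils down to loading each PDF, splitting it into overlapping chunks, embedding them, and upserting into Qdrant; the real implementation is `DataIngestor` in `modules/ingestion.py`. The sketch below only illustrates the chunking step and assumes LangChain-style loaders and splitters plus a placeholder file path:

```python
# Illustrative chunking sketch; the real pipeline is DataIngestor in modules/ingestion.py.
import os

from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter

CHUNK_SIZE = int(os.getenv("CHUNK_SIZE_DEFAULT", "1000"))
CHUNK_OVERLAP = int(os.getenv("CHUNK_OVERLAP_DEFAULT", "100"))

# Placeholder path: uploaded files land under <DATA_PATH>/<collection_name>/.
pages = PyPDFLoader("data/my_documents/example.pdf").load()

# Split pages into overlapping chunks; each chunk keeps its source metadata (file, page).
splitter = RecursiveCharacterTextSplitter(chunk_size=CHUNK_SIZE, chunk_overlap=CHUNK_OVERLAP)
chunks = splitter.split_documents(pages)
print(f"{len(pages)} pages -> {len(chunks)} chunks ready to embed and upsert into Qdrant")
```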
- Switch to the Chat with LLM page.
- Select the collection you want to interrogate. The first query will trigger retrieval using the configured `RETRIEVE_K` value (see the retrieval sketch after this list).
- Ask a question in the chat box. Responses include an expandable view of the chunks retrieved from Qdrant with source metadata.
- Each collection maintains its own conversation state; switching collections resets the visible history.
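Conceptually, that retrieval step looks like the sketch below. The project's own wiring lives in `modules/qdrant_utils.py`; the `langchain-qdrant`/`langchain-ollama` imports and the sample question are assumptions, not confirmed dependencies:

```python
# Illustrative retrieval sketch; the project's own wiring lives in modules/qdrant_utils.py.
import os

from langchain_ollama import OllamaEmbeddings
from langchain_qdrant import QdrantVectorStore
from qdrant_client import QdrantClient

store = QdrantVectorStore(
    client=QdrantClient(url=os.getenv("QDRANT_URL", "http://localhost:6333")),
    collection_name=os.getenv("QDRANT_COLLECTION", "my_documents"),
    embedding=OllamaEmbeddings(model=os.getenv("EMBED_MODEL", "mxbai-embed-large")),
)

# Top-k similarity search; k mirrors RETRIEVE_K from the configuration.
retriever = store.as_retriever(search_kwargs={"k": int(os.getenv("RETRIEVE_K", "5"))})
for doc in retriever.invoke("What does chapter 2 say about onboarding?"):  # sample question
    print(doc.metadata.get("source"), "->", doc.page_content[:80])
```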
These tools are intended purely for testing and behave like the Streamlit chat, but in the terminal; the main difference is that they save conversation logs.

- `python query_LLM.py` — interactive console assistant using the same retrieval chain as the UI. It writes transcripts to `LOG_DIR` and is handy for quick smoke tests.
- `python rag_query_qdrant.py` — single-shot helper that prints a model answer for a hard-coded prompt. Useful for integration checks or scripting (a minimal sketch of the same pattern follows below).
Both commands respect the same configuration as the Streamlit app; ensure Qdrant and Ollama are running before invoking them.
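As a rough idea of what such a one-shot query can look like when scripted by hand, here is an assumed composition of the same building blocks (retrieve top-k chunks, stuff them into a prompt, ask the generation model); it is not the repository's actual chain or prompt:

```python
# Minimal one-shot RAG query; an assumed composition, not the repository's actual chain.
import os

from langchain_ollama import ChatOllama, OllamaEmbeddings
from langchain_qdrant import QdrantVectorStore
from qdrant_client import QdrantClient

question = "Summarize the key points of the uploaded documents."  # placeholder prompt

store = QdrantVectorStore(
    client=QdrantClient(url=os.getenv("QDRANT_URL", "http://localhost:6333")),
    collection_name=os.getenv("QDRANT_COLLECTION", "my_documents"),
    embedding=OllamaEmbeddings(model=os.getenv("EMBED_MODEL", "mxbai-embed-large")),
)
docs = store.similarity_search(question, k=int(os.getenv("RETRIEVE_K", "5")))
context = "\n\n".join(d.page_content for d in docs)

llm = ChatOllama(model=os.getenv("GEN_MODEL", "llama3.2:1b"))
reply = llm.invoke(
    f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {question}"
)
print(reply.content)
```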
```
.
├── app.py                  # Streamlit main UI for KB creation and chat
├── modules/                # Shared package
│   ├── __init__.py
│   ├── config.py           # RAGSettings dataclass + helpers
│   ├── ingestion.py        # DataIngestor with progress callbacks
│   ├── prompts.py          # Prompt templates for all chains
│   └── qdrant_utils.py     # Qdrant client, vector store, retriever
├── query_LLM.py            # KB chat that saves conversation logs
├── rag_query_qdrant.py     # Minimal CLI smoke test
├── docker-compose.yml      # Qdrant service definition
├── requirements.txt
└── conversation_logs/      # Created at runtime
```
- Ollama model not found: Verify that `ollama list` shows the configured model names. Pull them again if needed.
- Qdrant connection errors: Ensure the container is running and that `QDRANT_URL` in `.env` matches the exposed port (`docker compose logs qdrant` can help diagnose issues).
- Large files ingest slowly: Increase `INGEST_BATCH_SIZE` or switch to a more compact embedding model. You can monitor ingestion progress directly in the UI logs.
- Permission errors creating directories: Confirm the process has write access to `DATA_PATH` and `LOG_DIR` (a quick check is sketched below). Both are created automatically as long as their parent locations exist and are writable.
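For the last point, a short snippet like the following can confirm whether the process can actually create and write to those directories; paths default to the values from the configuration section, and the snippet is illustrative rather than part of the repository:

```python
# Illustrative write-access check for DATA_PATH and LOG_DIR.
import os
from pathlib import Path

for var, default in (("DATA_PATH", "data"), ("LOG_DIR", "conversation_logs")):
    path = Path(os.getenv(var, default))
    path.mkdir(parents=True, exist_ok=True)  # raises PermissionError if it cannot be created
    writable = os.access(path, os.W_OK)
    print(f"{var} -> {path.resolve()} ({'writable' if writable else 'NOT writable'})")
```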