Gangadhar24377/NeuroDynamics_langraph_agent

RAGent - AI-Powered Multi-Agent RAG System with LangGraph

An intelligent multi-agent system that routes queries to specialized agents for weather information and document-based question answering using LangGraph, LangChain, and OpenAI.

Deployed Link: https://langraph-weather-pdf-agent.streamlit.app/

Python 3.11+ | LangChain | LangGraph | License: MIT


Overview

This project implements an intelligent agentic system that uses LangGraph for orchestration and routing. The system automatically classifies user queries and routes them to specialized agents:

  • 🌤️ Weather Agent: Retrieves real-time weather data using OpenWeather API
  • 📄 Document RAG Agent: Answers questions from uploaded PDF documents using vector similarity search

Key Technologies

  • LangGraph: Agent orchestration and state management
  • LangChain: RAG chains and document processing
  • OpenAI GPT-4o-mini: Query classification and document QA
  • Qdrant: Cloud vector database for document embeddings
  • Streamlit: Interactive web interface
  • LangSmith: Observability and tracing

Features

Intelligent Query Routing

  • LLM-based classification using GPT-4o-mini
  • Automatic intent detection (weather vs documents)
  • Fallback mechanisms for robust handling

Weather Agent

  • Real-time weather data via OpenWeather One Call API 3.0
  • Smart query parsing (extracts location, date, weather type)
  • Multiple query formats supported:
    • "What's the weather in London?"
    • "Will it rain tomorrow in Paris?"
    • "Temperature in New York today"
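
The three query formats above can be handled by a small parser. The following is a minimal sketch, not the project's actual `query_processor.py`: the function name `parse_weather_query`, the regular expressions, and the default of "today" are all assumptions made for illustration.

```python
import re
from typing import Dict, Optional

# Hypothetical set of supported day words; the real parser may accept more.
DAY_WORDS = ("today", "tomorrow")


def parse_weather_query(query: str) -> Dict[str, Optional[str]]:
    """Extract a location and a day word from a weather question."""
    text = query.strip().rstrip("?!. ")
    lowered = text.lower()

    # Day: first matching day word, defaulting to "today" (an assumption).
    day = next((d for d in DAY_WORDS if re.search(rf"\b{d}\b", lowered)), "today")

    # Location: whatever follows the first standalone "in",
    # minus any trailing day word.
    location = None
    match = re.search(r"\bin\s+(.+)$", text, flags=re.IGNORECASE)
    if match:
        location = re.sub(r"\s+(today|tomorrow)$", "", match.group(1),
                          flags=re.IGNORECASE).strip()
    return {"location": location or None, "day": day}
```

A query with no "in <place>" phrase returns `location=None`, which a caller could treat as a prompt to ask the user for a city.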

Document RAG Agent

  • PDF document upload and processing
  • Vector similarity search using Qdrant
  • Context-aware answers with source attribution

Observability

  • LangSmith integration for complete trace visibility
  • Performance monitoring (latency, token usage, costs)
  • Debug logging at multiple levels

Comprehensive Testing

  • 100+ test cases covering:
    • API handling and error recovery
    • LLM processing and classification
    • Vector retrieval and RAG chains
    • End-to-end integration flows

Architecture

System Architecture

┌─────────────────────────────────────────────────────────────┐
│                    AI AGENTIC ARCHITECTURE                  │
└─────────────────────────────────────────────────────────────┘

                        🚀 USER QUERY
                              │
                              ▼
┌─────────────────────────────────────────────────────────────┐
│  🧠 DECISION NODE - LLM CLASSIFIER (GPT-4o-mini)           │
│                                                             │
│  • Analyzes user intent                                     │
│  • Routes to appropriate agent                              │
│  • Fallback: keyword matching                               │
└─────────────────┬───────────────────────┬───────────────────┘
                  │                       │
     Weather Query│                       │Document Query
  (temp, forecast)│                       │(PDF, assignment)
                  ▼                       ▼
┌─────────────────────────┐    ┌─────────────────────────┐
│  🌤️ WEATHER AGENT       │    │  📄 PDF RAG AGENT       │
│                         │    │                         │
│  • OpenWeather API      │    │  • Qdrant Vector DB     │
│  • Geocoding Service    │    │  • Document Embeddings  │
│  • Current + Forecast   │    │  • Similarity Search    │
│  • 150+ countries       │    │  • LangChain RetrievalQA│
└─────────┬───────────────┘    └─────────┬───────────────┘
          │                              │
          └──────────────┬───────────────┘
                         ▼
┌─────────────────────────────────────────────────────────────┐
│              ✅ UNIFIED RESPONSE OUTPUT                     │
│                                                             │
│  • Weather: Formatted forecast with conditions             │
│  • Documents: RAG-based answers from uploaded PDFs         │
│  • Error handling: Graceful fallbacks                      │
└─────────────────────────────────────────────────────────────┘

LangGraph Flow

The system uses LangGraph for agent orchestration with conditional routing:

graph TD;
    START([User Query]) --> DECISION{Decision Node<br/>LLM Classification}
    DECISION -->|Weather| WEATHER[Weather Agent<br/>OpenWeather API]
    DECISION -->|Document| PDF[PDF RAG Agent<br/>Qdrant + LangChain]
    WEATHER --> END([Response])
    PDF --> END
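
The conditional routing in the diagram can be illustrated without the LangGraph dependency. This is a plain-Python sketch of the same state-passing idea, not the LangGraph API and not the code in `graph_setup.py`; the `AgentState` fields and the stubbed node bodies are assumptions.

```python
from typing import Callable, Dict, TypedDict


class AgentState(TypedDict, total=False):
    query: str      # the user's question
    route: str      # "weather" or "document", set by the decision node
    response: str   # final answer, set by one of the agents


def decision_node(state: AgentState) -> AgentState:
    # Stand-in for the LLM classifier: a trivial keyword check.
    is_weather = "weather" in state["query"].lower()
    return {**state, "route": "weather" if is_weather else "document"}


def weather_node(state: AgentState) -> AgentState:
    # Stub: the real node would call the OpenWeather API.
    return {**state, "response": f"[weather agent] {state['query']}"}


def pdf_rag_node(state: AgentState) -> AgentState:
    # Stub: the real node would run retrieval against Qdrant.
    return {**state, "response": f"[pdf rag agent] {state['query']}"}


# Conditional edge table: decision label -> next node.
AGENTS: Dict[str, Callable[[AgentState], AgentState]] = {
    "weather": weather_node,
    "document": pdf_rag_node,
}


def run_graph(query: str) -> AgentState:
    """START -> decision -> (weather | document) -> END."""
    state = decision_node({"query": query})
    return AGENTS[state["route"]](state)
```

Each node takes the shared state and returns an updated copy, so a new agent only needs a new function and one more entry in the edge table.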

Prerequisites

Required Software

  • Python 3.11+
  • pip (comes with Python)
  • Git (for cloning the repository)

Required API Keys

  1. OpenAI API Key
    • Used for: Query classification, embeddings, document QA
  2. OpenWeather API Key
    • Used for: Real-time weather data
    • Free tier: 1,000 calls/day
  3. LangSmith API Key (optional but recommended)
    • Used for: Observability and tracing

Installation

Step 1: Clone the Repository

git clone https://github.com/Gangadhar24377/NeuroDynamics_langraph_agent.git
cd NeuroDynamics_langraph_agent/my_rag_app

Step 2: Create Virtual Environment

Using Conda (Recommended):

conda create -n assignment python=3.11
conda activate assignment

Using venv:

python -m venv venv
# Windows
venv\Scripts\activate
# Linux/Mac
source venv/bin/activate

Step 3: Install Dependencies

pip install -r requirements.txt

Step 4: Create .env File

Create a .env file in the project root:

# Copy the template
cp .env.example .env

# Edit with your API keys
notepad .env  # Windows
nano .env     # Linux/Mac

Add your API keys:

# OpenAI Configuration
OPENAI_API_KEY=sk-proj-your_openai_api_key_here

# OpenWeather Configuration
OPENWEATHERMAP_API_KEY=your_openweather_api_key_here

# LangSmith Configuration (Optional)
LANGSMITH_API_KEY=lsv2_pt_your_langsmith_api_key_here
LANGSMITH_PROJECT=NeuroDynamics_Assignment
LANGSMITH_ENDPOINT=https://api.smith.langchain.com

Step 5: Verify Installation

# Test configuration
python -c "from config import *; print('✅ Configuration loaded successfully')"

---

## ⚙️ Configuration

### Environment Variables

| Variable | Required | Description | Default |
|----------|----------|-------------|---------|
| `OPENAI_API_KEY` | ✅ Yes | OpenAI API key for LLM and embeddings | - |
| `OPENWEATHERMAP_API_KEY` | ✅ Yes | OpenWeather API key for weather data | - |
| `LANGSMITH_API_KEY` | ⚠️ Optional | LangSmith API key for tracing | - |
| `LANGSMITH_PROJECT` | ⚠️ Optional | LangSmith project name | `NeuroDynamics_Assignment` |
| `LANGSMITH_ENDPOINT` | ⚠️ Optional | LangSmith API endpoint | `https://api.smith.langchain.com` |
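
As an illustration of how the table's required and optional variables might be enforced at startup, here is a minimal sketch. It is not the project's `config.py`; the function name `load_settings` and the choice to raise `RuntimeError` are assumptions.

```python
import os
from typing import Dict, Mapping

REQUIRED = ("OPENAI_API_KEY", "OPENWEATHERMAP_API_KEY")
OPTIONAL_DEFAULTS = {
    "LANGSMITH_API_KEY": "",
    "LANGSMITH_PROJECT": "NeuroDynamics_Assignment",
    "LANGSMITH_ENDPOINT": "https://api.smith.langchain.com",
}


def load_settings(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Return all settings, failing fast if a required key is missing."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("Missing required variables: " + ", ".join(missing))
    settings = {name: env[name] for name in REQUIRED}
    # Optional variables fall back to the defaults from the table.
    for name, default in OPTIONAL_DEFAULTS.items():
        settings[name] = env.get(name) or default
    return settings
```

Passing the environment as an argument keeps the function easy to test with a plain dictionary.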

### Vector Database

The system uses **Qdrant Cloud** for vector storage:

- **Collection**: `assignment`
- **Embedding Model**: `text-embedding-3-small` (1536 dimensions)
- **Cloud required**: All data is stored on Qdrant Cloud

---

## 💻 Usage

### Starting the Application

```bash
# Activate your environment
conda activate assignment  # or: source venv/bin/activate

# Run the Streamlit app
streamlit run app.py
```

The application will open in your browser at http://localhost:8501

Using the Interface

1. Upload Documents (Optional)

  • Click "Browse files" in the sidebar
  • Upload PDF documents
  • Click "Upload & Process"
  • Documents are vectorized and stored in the Qdrant collection

2. Ask Questions

Weather Queries:

- "What's the weather in London?"
- "Will it rain tomorrow in Paris?"
- "Temperature in New York today"

Document Queries:

- "What are the requirements in the document?"
- "Tell me about the assignment"
- "What is mentioned in the PDF about AI?"

3. View Responses

  • Weather: Real-time data with temperature, conditions, humidity
  • Documents: AI-generated answers from your uploaded PDFs

Testing

Run All Tests

cd my_rag_app
python tests/run_all_tests_with_output.py

Run Individual Test Suites

# API handling tests (OpenWeather integration)
python tests/run_api_tests.py

# LLM processing tests (query classification)
python tests/run_llm_tests.py

# Retrieval logic tests (Qdrant + RAG)
python tests/run_retrieval_tests.py

# Integration tests (end-to-end flows)
python tests/run_integration_tests.py

Test Results

Results are saved in tests/tests_results/:

  • api_handling_results.txt
  • llm_processing_results.txt
  • retrieval_logic_results.txt
  • integration_results.txt
  • master_summary.txt

Test Coverage

  • 100+ test cases total
  • API Handling: 30+ tests (error handling, mocking, validation)
  • LLM Processing: 25+ tests (classification, fallbacks, consistency)
  • Retrieval Logic: 35+ tests (vector DB, RAG chains, performance)
  • Integration: 15+ tests (end-to-end flows, state management)

📁 Project Structure

my_rag_app/
│
├── app.py                          # Streamlit web interface
├── config.py                       # Configuration and environment variables
├── requirements.txt                # Python dependencies
├── .env                           # API keys (not in git)
├── README.md                      # This file
│
├── nodes/                         # LangGraph agent nodes
│   ├── decision_node.py          # LLM-based query classifier
│   ├── weather_node.py           # Weather API integration
│   └── pdf_rag_node.py           # Document RAG agent
│
├── graph/                         # LangGraph setup
│   └── graph_setup.py            # Graph builder and configuration
│
├── Langsmith_evals/               # LangSmith screenshots of each node with its outputs
│   ├── PDF_RAG_NODE/             # pdf_rag_node screenshots
│   └── WEATHER_NODE/             # weather_node screenshots
│
├── rag/                          # RAG components
│   ├── ingestion.py             # Document processing and chunking
│   ├── retriever.py             # Vector store and retrieval
│   └── prompts.py               # RAG prompts and templates
│
├── utils/                        # Utility modules
│   └── query_processor.py       # Smart query parsing
│
├── tests/                        # Test suite
│   ├── test_api_handling.py     # API tests
│   ├── test_llm_processing.py   # LLM tests
│   ├── test_retrieval_logic.py  # RAG tests
│   ├── test_integration.py      # Integration tests
│   ├── run_all_tests_with_output.py  # Test runner
│   └── tests_results/           # Test output files
│
├── local_qdrant/                 # Local vector database
│   └── collection/
│       └── documents/
│
└── data/                         # Sample documents
    └── Assignment - AI Engineer.pdf

Decision Node (Query Routing) Features:

  • Uses GPT-4o-mini for fast, cost-effective classification
  • Fallback to keyword matching if LLM fails
  • Updates state with routing decision
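
The classify-then-fall-back behaviour listed above might look like the following sketch. The LLM is passed in as a plain callable so the example stays self-contained; the keyword list, the label names, and the function names are assumptions, not the contents of `decision_node.py`.

```python
from typing import Callable

# Hypothetical keyword list used only when the LLM cannot be trusted.
WEATHER_KEYWORDS = ("weather", "temperature", "rain", "forecast", "humidity")
VALID_ROUTES = {"weather", "document"}


def keyword_route(query: str) -> str:
    """Fallback: route on simple keyword matching."""
    text = query.lower()
    return "weather" if any(k in text for k in WEATHER_KEYWORDS) else "document"


def classify(query: str, llm: Callable[[str], str]) -> str:
    """Ask the LLM for a label; fall back to keywords on failure."""
    try:
        label = llm(query).strip().lower()
    except Exception:
        # Network error, rate limit, missing key, and so on.
        return keyword_route(query)
    # An unexpected label is treated the same as a failure.
    return label if label in VALID_ROUTES else keyword_route(query)
```

Validating the label against a fixed set means a malformed LLM reply can never send the graph to a node that does not exist.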

Weather Node Features:

  • Smart query parsing (extracts location, date, weather type)
  • Geocoding support for 150+ countries
  • Error handling for invalid cities, API failures
  • Formatted, human-readable responses
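
To make the "formatted, human-readable responses" and error-handling points concrete, here is a small sketch that turns a One Call style payload into a sentence. The field names `current`, `temp`, `humidity`, and `weather[0].description` follow the One Call response format; the function name, the wording, and the assumption of metric units are illustrative only.

```python
from typing import Any, Mapping


def format_current(city: str, payload: Mapping[str, Any]) -> str:
    """Build a one-line summary, degrading gracefully on missing data."""
    current = payload.get("current")
    if not current:
        return f"Sorry, no weather data is available for {city}."
    weather = current.get("weather") or [{}]
    description = weather[0].get("description", "unknown conditions")
    parts = [f"{city}: {description}"]
    temp = current.get("temp")
    if temp is not None:
        parts.append(f"{temp:.1f}°C")  # assumes units=metric
    humidity = current.get("humidity")
    if humidity is not None:
        parts.append(f"humidity {humidity}%")
    return ", ".join(parts)
```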

PDF RAG Node Features:

  • Local Qdrant vector database
  • Similarity search with top-k retrieval
  • Context-aware answers using RetrievalQA
  • Document count checking
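
Top-k similarity search is performed by Qdrant in this project, but the underlying idea fits in a few lines. The sketch below is a brute-force illustration with cosine similarity over in-memory vectors, not the Qdrant client API; the function names and the default `k=3` are assumptions.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: Sequence[float],
          docs: List[Tuple[str, Sequence[float]]],
          k: int = 3) -> List[str]:
    """Return the texts of the k chunks most similar to the query."""
    scored = [(cosine(query_vec, vec), text) for text, vec in docs]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]
```

A vector database replaces the linear scan with an approximate index, which is what keeps retrieval fast as the collection grows.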

Document Ingestion Features:

  • Recursive character splitting for optimal chunking
  • OpenAI embeddings (text-embedding-3-small)
  • Metadata preservation
  • Chunk overlap for context continuity
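
Chunk overlap can be illustrated with a simplified fixed-size splitter. This is not LangChain's recursive character splitter, which also tries to break on separators such as paragraphs and sentences; the function name and the default sizes of 1000 and 200 characters are assumptions.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> List[str]:
    """Split text into windows that share `overlap` characters."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end
    return chunks
```

For example, `chunk_text("abcdefghij", chunk_size=4, overlap=2)` yields `["abcd", "cdef", "efgh", "ghij"]`: each chunk repeats the last two characters of the previous one, so a sentence cut at a boundary still appears whole in at least one chunk when the overlap is large enough.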

API Documentation

OpenWeather One Call API 3.0

Endpoint: https://api.openweathermap.org/data/3.0/onecall

Parameters:

  • lat, lon: Geographic coordinates
  • appid: API key
  • units: metric (Celsius) or imperial (Fahrenheit)
  • exclude: Optional data exclusions

Response: Current weather, hourly/daily forecasts, alerts
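
A request URL with the parameters listed above can be assembled as follows. This is a sketch using only the standard library; the helper name `build_onecall_url` is hypothetical, and the example performs no network call.

```python
from typing import Iterable, Optional
from urllib.parse import urlencode

ONECALL_URL = "https://api.openweathermap.org/data/3.0/onecall"


def build_onecall_url(lat: float, lon: float, api_key: str,
                      units: str = "metric",
                      exclude: Optional[Iterable[str]] = None) -> str:
    """Build a One Call 3.0 URL; `exclude` is a list such as ["minutely"]."""
    params = {"lat": lat, "lon": lon, "appid": api_key, "units": units}
    if exclude:
        # The API expects a comma-separated list of blocks to omit.
        params["exclude"] = ",".join(exclude)
    return f"{ONECALL_URL}?{urlencode(params)}"
```

The coordinates would typically come from a geocoding step, since the endpoint takes `lat` and `lon` rather than a city name.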

OpenAI API

Models Used:

  • GPT-4o-mini: Query classification, document QA
  • text-embedding-3-small: Document embeddings (1536 dimensions)

Cost Optimization:

  • Using gpt-4o-mini instead of gpt-4 for routing (10x cheaper)
  • Token usage: ~150 tokens/query for classification
  • Cost: ~$0.0002 per RAG query

Troubleshooting

Common Issues

1. API Key Errors

Error: 401 Unauthorized or API key is not configured

Solution:

# Check .env file exists
ls -la .env

# Verify API keys are loaded
python -c "from config import *; print('OpenAI:', bool(OPENAI_API_KEY)); print('Weather:', bool(OPENWEATHER_API_KEY))"

# Test API connectivity
python test.py

2. Import Errors

Error: ModuleNotFoundError: No module named 'langchain'

Solution:

# Ensure environment is activated
conda activate assignment

# Reinstall dependencies
pip install -r requirements.txt

3. Qdrant Connection Issues

Error: Could not connect to Qdrant

Solution:

# Check local_qdrant directory exists
ls -la local_qdrant/

# Recreate collection
python -c "from rag.retriever import ensure_collection_exists; ensure_collection_exists()"

4. Streamlit Port Already in Use

Error: Address already in use

Solution:

# Use a different port
streamlit run app.py --server.port 8502

# Or kill the existing process (Windows)
netstat -ano | findstr :8501
taskkill /PID <PID> /F

Performance

Metrics

| Operation | Latency | Tokens | Cost/Query |
|-----------|---------|--------|------------|
| Query Classification | 1-2s | ~150 | $0.00003 |
| Weather Query (Full) | 2-4s | ~150 | $0.00003 |
| Document Query (RAG) | 8-12s | ~1,170 | $0.00024 |
| Document Upload | 5-10s | Varies | Per doc |
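
The cost column follows from token counts and a per-token price. The sketch below uses a blended rate of $0.20 per million tokens, an assumption back-derived from the classification row (150 tokens for $0.00003) rather than an official price; actual rates differ for input and output tokens and change over time.

```python
def query_cost(tokens: int, usd_per_million_tokens: float = 0.20) -> float:
    """Estimated cost in USD for a single query."""
    # The default rate is an assumed blended figure, not a published price.
    return tokens * usd_per_million_tokens / 1_000_000
```

With this rate, 150 tokens come to about $0.00003 and 1,170 tokens to about $0.00023, close to the figures in the table.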

Optimization Tips

  1. Use GPT-4o-mini for classification (10x cheaper than GPT-4)
  2. Cache embeddings for frequently queried documents
  3. Adjust chunk size (smaller = faster but less context)
  4. Limit top-k in retrieval (fewer docs = faster responses)
  5. Enable LangSmith to identify bottlenecks

Contributing

Contributions are welcome! Please follow these steps:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

Development Setup

# Install development dependencies
pip install -r requirements-dev.txt

# Run tests before committing
python tests/run_all_tests_with_output.py

# Format code
black .
isort .

Acknowledgments

  • LangChain for the amazing RAG framework
  • LangGraph for agent orchestration
  • OpenAI for GPT and embedding models
  • Qdrant for vector database
  • OpenWeather for weather data API
  • Streamlit for the web interface


Future Enhancements

  • Add more specialized agents (news, stocks, etc.)
  • Implement conversation memory
  • Support multiple languages
  • Add voice input/output
  • Deploy to cloud (AWS/Azure/GCP)
  • Add user authentication
  • Implement caching layer
  • Add analytics dashboard

Built using LangGraph, LangChain, and OpenAI

⭐ Star this repo if you find it helpful!
