Advanced RAG with hybrid search, query classification, answer fusion, and self-correction

TEJA4704/agentic-rag-framework

🔍 Agentic RAG Framework

Python 3.10+ License: MIT

Advanced Retrieval-Augmented Generation with specialized agents for hybrid search, query classification, answer fusion, and self-correction. Implements SELF-RAG patterns for production-grade RAG systems.


🌟 Features

  • Hybrid Search - Vector + BM25 + metadata search, merged with Reciprocal Rank Fusion (RRF)
  • Query Classification - Adaptive retrieval based on query type
  • Answer Fusion - Multi-source synthesis with voting
  • Cross-Reference Validation - Fact verification across sources
  • Source Citation - APA, MLA, Chicago, IEEE formatting
  • Knowledge Gap Detection - Iterative retrieval for missing info

🏗️ Architecture

┌──────────────────────────────────────────────────────────────────┐
│                    Agentic RAG Pipeline                          │
├──────────────────────────────────────────────────────────────────┤
│                                                                  │
│  ┌──────────────────┐                                            │
│  │ Query            │  Classify: factual, analytical,            │
│  │ Classification   │  comparative, procedural                   │
│  └────────┬─────────┘                                            │
│           ↓                                                      │
│  ┌──────────────────┐                                            │
│  │ Hybrid Search    │  Vector + BM25 + Metadata                  │
│  │ (RRF Fusion)     │  Reciprocal Rank Fusion                    │
│  └────────┬─────────┘                                            │
│           ↓                                                      │
│  ┌──────────────────┐                                            │
│  │ Knowledge Gap    │  Detect missing info                       │
│  │ Detection        │  Trigger re-retrieval                      │
│  └────────┬─────────┘                                            │
│           ↓                                                      │
│  ┌──────────────────┐                                            │
│  │ Answer Fusion    │  Combine multiple sources                  │
│  │ (Voting/Hybrid)  │  Consistency analysis                      │
│  └────────┬─────────┘                                            │
│           ↓                                                      │
│  ┌──────────────────┐                                            │
│  │ Cross-Reference  │  Verify facts across sources               │
│  │ Validation       │                                            │
│  └────────┬─────────┘                                            │
│           ↓                                                      │
│  ┌──────────────────┐                                            │
│  │ Source Citation  │  APA, MLA, Chicago, IEEE                   │
│  └──────────────────┘                                            │
└──────────────────────────────────────────────────────────────────┘
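
The Knowledge Gap Detection stage feeds back into retrieval. This README does not document an agent for it, so the following is only a minimal sketch of such a loop. The function name `retrieve_with_gap_detection`, the `retrieve` and `find_gaps` callables, and the toy corpus are all assumptions made for illustration, not part of the framework's API.

```python
def retrieve_with_gap_detection(query, retrieve, find_gaps, max_rounds=3):
    """Retrieve documents, then re-retrieve for any detected gaps.

    retrieve(query)        -> iterable of documents (hypothetical callable)
    find_gaps(query, docs) -> list of follow-up queries (hypothetical callable)
    """
    docs = list(retrieve(query))
    for _ in range(max_rounds):
        follow_ups = find_gaps(query, docs)
        if not follow_ups:
            break  # nothing missing, stop early
        for follow_up in follow_ups:
            for doc in retrieve(follow_up):
                if doc not in docs:  # avoid duplicates across rounds
                    docs.append(doc)
    return docs


# Toy stand-ins for a real retriever and gap detector.
corpus = {
    "python web": ["Django is a Python web framework."],
    "javascript web": ["Express is a JavaScript web framework."],
}


def toy_retrieve(query):
    return corpus.get(query, [])


def toy_find_gaps(query, docs):
    # A comparison needs evidence about both languages.
    required = ["python", "javascript"]
    return [f"{topic} web" for topic in required
            if not any(topic in doc.lower() for doc in docs)]


docs = retrieve_with_gap_detection("python web", toy_retrieve, toy_find_gaps)
print(docs)
```

Capping the loop with `max_rounds` keeps a gap detector that never reports success from triggering unbounded retrieval.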

🚀 Quick Start

git clone https://github.com/TEJA4704/agentic-rag-framework.git
cd agentic-rag-framework
pip install -r requirements.txt

Basic Usage

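import asyncio

# The request classes and FusionStrategy are assumed to be exported from
# rag_engine.agents alongside the agents; adjust the import if they live elsewhere.
from rag_engine.agents import (
    FusionStrategy,
    HybridSearchAgent,
    HybridSearchRequest,
    QueryClassificationAgent,
    QueryClassificationRequest,
)


async def main():
    # my_llm, my_vector_db, and my_bm25_index are placeholders for your own clients.

    # Classify the query so retrieval can adapt to its type
    classifier = QueryClassificationAgent(llm_client=my_llm)
    response = await classifier.execute(QueryClassificationRequest(
        query="Compare Python vs JavaScript for web development"
    ))
    # The response wraps the classification, as in the QueryClassificationAgent section
    classification = response.classification
    print(classification.query_type)  # "comparative"
    print(classification.suggested_strategy)  # "multi_source_comparison"

    # Hybrid search with RRF fusion; the three weights sum to 1.0
    searcher = HybridSearchAgent(
        vector_store=my_vector_db,
        keyword_index=my_bm25_index
    )
    results = await searcher.execute(HybridSearchRequest(
        query="machine learning best practices",
        semantic_weight=0.5,
        keyword_weight=0.3,
        metadata_weight=0.2,
        fusion_strategy=FusionStrategy.RRF
    ))


# agent.execute() is a coroutine, so it has to run inside an event loop
asyncio.run(main())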

📚 Agents

HybridSearchAgent

Combines vector, keyword, and metadata search with score fusion.

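# HybridSearchRequest is assumed to be exported from rag_engine.agents
from rag_engine.agents import FusionStrategy, HybridSearchAgent, HybridSearchRequest

# vs and ki are your vector store and BM25 index; run this inside an async function
agent = HybridSearchAgent(vector_store=vs, keyword_index=ki)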
result = await agent.execute(HybridSearchRequest(
    query="quantum computing applications",
    fusion_strategy=FusionStrategy.RRF,  # Reciprocal Rank Fusion
    use_reranking=True
))
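
To make the fusion step concrete, here is a minimal sketch of weighted Reciprocal Rank Fusion, where each retriever contributes `weight / (k + rank)` to a document's score. The function `rrf_fuse`, the retriever names, and the document IDs are hypothetical, and whether `HybridSearchAgent` applies its weights this way is an assumption; `k=60` is the constant commonly used for RRF.

```python
from collections import defaultdict


def rrf_fuse(rankings, weights=None, k=60):
    """Fuse ranked lists of document IDs with weighted Reciprocal Rank Fusion.

    rankings: dict mapping a retriever name to its ranked doc IDs (best first).
    weights:  optional dict of per-retriever weights; missing names default to 1.0.
    k:        smoothing constant that damps the influence of top ranks.
    """
    weights = weights or {}
    scores = defaultdict(float)
    for name, ranked_ids in rankings.items():
        weight = weights.get(name, 1.0)
        for rank, doc_id in enumerate(ranked_ids, start=1):
            scores[doc_id] += weight / (k + rank)
    # Highest fused score first; ties broken by doc ID for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


fused = rrf_fuse(
    {
        "vector": ["d1", "d2", "d3"],
        "bm25": ["d2", "d1", "d4"],
        "metadata": ["d2", "d5"],
    },
    weights={"vector": 0.5, "bm25": 0.3, "metadata": 0.2},
)
print([doc_id for doc_id, _ in fused])  # ['d2', 'd1', 'd3', 'd4', 'd5']
```

Because RRF works on ranks rather than raw scores, the vector similarity and BM25 scores never need to be normalized onto a common scale.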

QueryClassificationAgent

Classifies queries by type, complexity, and intent.

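# QueryClassificationRequest is assumed to be exported from rag_engine.agents
from rag_engine.agents import QueryClassificationAgent, QueryClassificationRequest

# llm is your LLM client; run this inside an async function
agent = QueryClassificationAgent(llm_client=llm)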
result = await agent.execute(QueryClassificationRequest(
    query="How do I implement a binary search tree?"
))
print(result.classification.query_type)    # PROCEDURAL
print(result.classification.complexity)    # MODERATE
print(result.classification.intent)        # LEARNING
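
The agent itself relies on an LLM client. As a rough illustration of how a query type can drive the retrieval strategy, the sketch below uses keyword rules instead. Everything here is an assumption made for illustration: the `classify` function, the regex rules, and every strategy name except `multi_source_comparison`, which is the only one shown in this README.

```python
import re

# Hypothetical mapping from query type to retrieval strategy.
STRATEGY_BY_TYPE = {
    "comparative": "multi_source_comparison",
    "procedural": "step_by_step_retrieval",   # assumed name
    "analytical": "broad_context_retrieval",  # assumed name
    "factual": "single_pass_retrieval",       # assumed name
}

# Ordered (pattern, type) rules; the first match wins.
RULES = [
    (re.compile(r"\b(compare|versus|vs\.?|difference between)\b", re.I), "comparative"),
    (re.compile(r"\b(how do i|how to|steps? to|implement)\b", re.I), "procedural"),
    (re.compile(r"\b(why|analy[sz]e|impact|trade-?offs?)\b", re.I), "analytical"),
]


def classify(query):
    """Return (query_type, strategy); falls back to 'factual' when no rule matches."""
    for pattern, query_type in RULES:
        if pattern.search(query):
            return query_type, STRATEGY_BY_TYPE[query_type]
    return "factual", STRATEGY_BY_TYPE["factual"]


print(classify("Compare Python vs JavaScript for web development"))
print(classify("How do I implement a binary search tree?"))
print(classify("What is the capital of France?"))
```

Rules like these are brittle next to an LLM-based classifier, but they show the shape of the decision: one label per query, and one retrieval strategy per label.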

AnswerFusionAgent

Combines answers from multiple sources using ensemble techniques.

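# AnswerFusionRequest is assumed to be exported from rag_engine.agents
from rag_engine.agents import AnswerFusionAgent, AnswerFusionRequest, FusionStrategy

# llm is your LLM client; run this inside an async function
agent = AnswerFusionAgent(llm_client=llm)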
result = await agent.execute(AnswerFusionRequest(
    answers=[answer1, answer2, answer3],
    query="What is the capital of France?",
    strategy=FusionStrategy.VOTING
))
print(result.fused_answer)
print(result.consistency_score)
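
One simple reading of the voting strategy is a majority vote over normalized answers, with the share of agreeing answers reported as the consistency score. The sketch below assumes that reading; `normalize` and `fuse_by_voting` are hypothetical helpers, and the agent's actual scoring may differ.

```python
from collections import Counter


def normalize(answer):
    """Trim whitespace, drop trailing punctuation, and lowercase."""
    return answer.strip().rstrip(".!?").lower()


def fuse_by_voting(answers):
    """Return (winning answer, fraction of answers that agree with it)."""
    if not answers:
        raise ValueError("answers must not be empty")
    counts = Counter(normalize(a) for a in answers)
    winner, votes = counts.most_common(1)[0]
    # Report the first original answer whose normalized form won the vote.
    fused = next(a for a in answers if normalize(a) == winner)
    return fused, votes / len(answers)


fused, consistency = fuse_by_voting(["Paris", "paris.", "Lyon"])
print(fused, round(consistency, 2))  # Paris 0.67
```

Exact-match voting only works for short factual answers. Longer answers would need semantic grouping before votes can be counted.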

CrossReferenceValidationAgent

Validates facts across multiple sources.

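# CrossReferenceRequest is assumed to be exported from rag_engine.agents
from rag_engine.agents import CrossReferenceRequest, CrossReferenceValidationAgent

# llm is your LLM client; run this inside an async function
agent = CrossReferenceValidationAgent(llm_client=llm)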
result = await agent.execute(CrossReferenceRequest(
    primary_content="Paris is the capital of France",
    reference_sources=[source1, source2, source3]
))
print(result.overall_reliability)
print(result.inconsistencies)
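
As a very rough stand-in for LLM-based validation, the sketch below scores a claim by naive term overlap: a source counts as supporting only if it mentions every key term of the claim. The helpers `key_terms` and `cross_reference` are hypothetical, and lexical overlap is not fact verification, so treat this as an illustration of the input and output shape only.

```python
import re


def key_terms(text):
    """Lowercased word tokens longer than three characters (a crude stop-word filter)."""
    return {t for t in re.findall(r"[a-z0-9]+", text.lower()) if len(t) > 3}


def cross_reference(claim, sources):
    """Return (reliability, indices of sources that do not cover the claim)."""
    terms = key_terms(claim)
    unsupported = []
    for index, source in enumerate(sources):
        if not terms or not terms <= key_terms(source):
            unsupported.append(index)
    supported = len(sources) - len(unsupported)
    reliability = supported / len(sources) if sources else 0.0
    return reliability, unsupported


reliability, unsupported = cross_reference(
    "Paris is the capital of France",
    [
        "The capital of France is Paris.",
        "France's capital city, Paris, hosts the Louvre.",
        "Lyon is a large city in France.",
    ],
)
print(round(reliability, 2), unsupported)  # 0.67 [2]
```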

SourceCitationAgent

Generates properly formatted citations.

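# CitationRequest is assumed to be exported from rag_engine.agents
from rag_engine.agents import CitationRequest, CitationStyle, SourceCitationAgent

# Run this inside an async function
agent = SourceCitationAgent()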
result = await agent.execute(CitationRequest(
    sources=[source1, source2],
    style=CitationStyle.APA
))
print(result.bibliography)
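
For a sense of what APA output involves, here is a simplified sketch of formatting a book reference. The `Source` dataclass and its fields are assumptions, since this README does not show the shape of the source objects the agent accepts, and real APA has more rules (sentence-case and italic titles, 20+ authors, DOIs) than this handles.

```python
from dataclasses import dataclass


@dataclass
class Source:
    """Hypothetical source record; the framework's own type may differ."""
    authors: list[str]  # "Given [Middle] Family" order
    year: int
    title: str
    publisher: str


def apa_author(name):
    """'Brian W. Kernighan' -> 'Kernighan, B. W.'"""
    parts = name.split()
    initials = " ".join(f"{part[0]}." for part in parts[:-1])
    return f"{parts[-1]}, {initials}" if initials else parts[-1]


def format_apa(source):
    names = [apa_author(name) for name in source.authors]
    if len(names) > 1:
        # APA joins the final author with ", &"
        authors = ", ".join(names[:-1]) + ", & " + names[-1]
    else:
        authors = names[0]
    # Title capitalization is left exactly as given.
    return f"{authors} ({source.year}). {source.title}. {source.publisher}."


book = Source(
    authors=["Brian W. Kernighan", "Dennis M. Ritchie"],
    year=1988,
    title="The C Programming Language",
    publisher="Prentice Hall",
)
print(format_apa(book))
# Kernighan, B. W., & Ritchie, D. M. (1988). The C Programming Language. Prentice Hall.
```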

📁 Project Structure

agentic-rag-framework/
├── rag_engine/
│   ├── __init__.py
│   └── agents/
│       ├── hybrid_search_agent.py
│       ├── query_classification_agent.py
│       ├── answer_fusion_agent.py
│       ├── cross_reference_validation_agent.py
│       └── source_citation_agent.py
├── examples/
├── tests/
├── requirements.txt
└── README.md

📄 License

MIT License - See LICENSE


📬 Contact

Ravi Teja K - AI/ML Engineer
