Welcome to my digital workspace. This repository documents my journey in building Artificial Intelligence solutions. Unlike traditional engineering portfolios, this collection highlights a dual focus:
- AI Development: Writing robust Python code, fine-tuning LLMs, and building RAG pipelines.
- Product Strategy: Defining user needs, technical feasibility, and go-to-market logic for AI features.
- Core Development: Python, Colab, Git
- AI & ML: RAG, LlamaIndex, Gemini API, HuggingFace, Mistral, Phi-2, TinyLlama, PyTorch
- Product & Strategy: Jira, Figma, A/B Testing, Technical PRD Writing
- Data Engineering: Pandas, PyMuPDF, Matplotlib, Tesseract, EasyOCR, NumPy
- UI: Gradio
- The Product Problem: Users spend too much time reading mortgage application documents.
- The "Tech" Solution:
- Built a multi-stage pipeline using Python and OCR to digitize legacy mortgage document blobs. Implemented vector search (RAG) to allow loan officers to 'chat' with loan applications, and automated the extraction of key financial data into structured JSON for downstream underwriting systems.
- Stack: Python, OCR, RAG, LLM, API
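A minimal sketch of the digitization stage, assuming Tesseract (via pytesseract) for OCR and PyMuPDF for page rasterization; the file name and regex patterns are illustrative placeholders, not the production pipeline:

```python
# Sketch: OCR a scanned mortgage PDF and pull key fields into JSON.
# File name and field patterns are placeholders.
import json
import re

import fitz  # PyMuPDF
import pytesseract
from PIL import Image

def ocr_pdf(path: str) -> str:
    """Render each PDF page to an image and OCR it with Tesseract."""
    text = []
    with fitz.open(path) as doc:
        for page in doc:
            pix = page.get_pixmap(dpi=300)  # rasterize scanned page for OCR
            img = Image.frombytes("RGB", (pix.width, pix.height), pix.samples)
            text.append(pytesseract.image_to_string(img))
    return "\n".join(text)

def extract_fields(text: str) -> dict:
    """Naive regex extraction of two example fields into structured JSON."""
    loan = re.search(r"Loan Amount[:\s]+\$?([\d,]+\.?\d*)", text)
    rate = re.search(r"Interest Rate[:\s]+([\d.]+)%", text)
    return {
        "loan_amount": loan.group(1).replace(",", "") if loan else None,
        "interest_rate": rate.group(1) if rate else None,
    }

raw_text = ocr_pdf("mortgage_blob.pdf")  # placeholder file
print(json.dumps(extract_fields(raw_text), indent=2))
```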
- The Product Problem: A simple chatbot is needed as the basic building block of a larger RAG system.
- The "Tech" Solution:
- Create a simple, functional chatbot that handles user input and provides model-generated replies (see the sketch below). Retrieval will come next!
- Stack: Python, LlamaIndex, RAG, LLM, Gemini API
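A minimal sketch of the chat loop using the google-generativeai client; the model name and environment variable are assumptions:

```python
# Sketch: a bare chat loop with the Gemini API (no retrieval yet).
# Assumes GOOGLE_API_KEY is set; the model name is an assumption.
import os

import google.generativeai as genai

genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
model = genai.GenerativeModel("gemini-1.5-flash")
chat = model.start_chat()  # keeps message history for multi-turn replies

while True:
    user_input = input("You: ")
    if user_input.lower() in {"quit", "exit"}:
        break
    reply = chat.send_message(user_input)
    print("Bot:", reply.text)
```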
- The Problem: Retrieval-Augmented Generation (RAG) accuracy often suffers when querying large, complex PDF documents (like contracts or technical papers) because basic vector search alone struggles with lexical mismatches, terminology variations, and long context.
- The "Tech" Solution: This notebook demonstrates an advanced RAG pipeline that significantly boosts retrieval quality (recall and precision) by stacking three optimization techniques, sketched in code after this list:
- Query Expansion: Uses the LLM to generate multiple relevant query variations, ensuring a wider net is cast.
- Hybrid Retrieval (Vector + BM25): Combines semantic search (embeddings) with keyword search (BM25) to retrieve both conceptual and exact-term matches.
- Reranking: Employs a cross-encoder model (e.g., from Sentence Transformers) as a final filter to re-score and prioritize the most relevant retrieved chunks, maximizing the quality of context passed to the LLM.
- Stack: Python, LlamaIndex, RAG, LLM, Gemini API
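A condensed sketch of how the three techniques can stack in LlamaIndex; exact package paths vary by LlamaIndex version, and the input file, top-k values, and cross-encoder name are illustrative:

```python
# Sketch: query expansion + hybrid (vector + BM25) retrieval + cross-encoder
# reranking in LlamaIndex. Assumes an LLM (e.g., Gemini) is configured in Settings.
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex
from llama_index.core.postprocessor import SentenceTransformerRerank
from llama_index.core.query_engine import RetrieverQueryEngine
from llama_index.core.retrievers import QueryFusionRetriever
from llama_index.retrievers.bm25 import BM25Retriever

docs = SimpleDirectoryReader(input_files=["contract.pdf"]).load_data()
index = VectorStoreIndex.from_documents(docs)

# Hybrid retrieval: fuse a semantic retriever with a keyword (BM25) retriever.
# num_queries > 1 makes the fusion retriever expand the query with the LLM.
retriever = QueryFusionRetriever(
    [
        index.as_retriever(similarity_top_k=8),
        BM25Retriever.from_defaults(docstore=index.docstore, similarity_top_k=8),
    ],
    num_queries=4,      # 1 original + 3 LLM-generated variations
    similarity_top_k=8,
)

# Final filter: a cross-encoder re-scores the fused candidates.
reranker = SentenceTransformerRerank(
    model="cross-encoder/ms-marco-MiniLM-L-6-v2", top_n=3
)

query_engine = RetrieverQueryEngine.from_args(
    retriever, node_postprocessors=[reranker]
)
print(query_engine.query("What is the termination clause notice period?"))
```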
- Review Data: HERE
- The Problem: Financial documents, like the Lender's Fees Worksheet used in this demo, are often dense and semi-structured, mixing tables, line items, and prose. Extracting specific, cross-referenced data, such as calculating a total monthly payment or locating a single fee, is a manual, time-consuming, and error-prone process for end users or for automated systems relying solely on keyword search. The goal is to move from laborious human review to instant, reliable data retrieval.
- The "Tech" Solution: An optimized RAG pipeline that achieves high retrieval accuracy for both numerical and textual data (a minimal sketch follows this list):
- Parsing: It uses PyMuPDF for superior PDF text extraction, ensuring high-quality input data from the start.
- Semantic Search: It converts all document content into dense vector embeddings (🔢) using an efficient model. This enables semantic search (Vector Retrieval), allowing the system to understand the meaning of a user's question (e.g., asking for a "security protection fee") and accurately retrieve (🔍) the relevant financial line item ("Lender's Title Insurance") from the document.
- Synthesis: The retrieved context is then passed to the Gemini 2.5 Flash LLM, which synthesizes the final, accurate answer, even performing required calculations like summing monthly components.
- Stack: Python, RAG, LlamaIndex, LLM, Gemini 2.5 Flash, HuggingFace MiniLM-L6-v2, PyMuPDF (fitz)
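A minimal sketch of the three stages, assuming the llama-index HuggingFace and Gemini integrations; the file name and the Gemini model identifier are assumptions:

```python
# Sketch: parse a fees worksheet with PyMuPDF, index it with MiniLM embeddings,
# and answer questions with Gemini. File and model names are assumptions.
import fitz  # PyMuPDF

from llama_index.core import Document, Settings, VectorStoreIndex
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.llms.gemini import Gemini

# 1. Parsing: extract high-quality text per page.
with fitz.open("lenders_fees_worksheet.pdf") as pdf:  # placeholder file
    pages = [page.get_text() for page in pdf]
docs = [Document(text=t, metadata={"page": i}) for i, t in enumerate(pages)]

# 2. Semantic search: dense vector embeddings via MiniLM.
Settings.embed_model = HuggingFaceEmbedding(
    model_name="sentence-transformers/all-MiniLM-L6-v2"
)
Settings.llm = Gemini(model="models/gemini-2.5-flash")  # identifier is an assumption

# 3. Synthesis: retrieve relevant line items, let the LLM compute the answer.
index = VectorStoreIndex.from_documents(docs)
query_engine = index.as_query_engine(similarity_top_k=4)
print(query_engine.query("What is the total estimated monthly payment?"))
```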
- Review Data: HERE
- The Problem: Financial documents, like the Lender's Fees Worksheet, are dense, unstructured, and time-consuming to analyze manually. Extracting specific, cross-referenced data, such as calculating a total monthly payment or locating a single fee, is often rigid and prone to human error. The goal is to move beyond single, static queries to an instant, conversational data assistant that can handle multi-turn follow-up questions and provide reliable, grounded facts.
- The "Tech" Solution: An optimized RAG pipeline built around a Conversational Chat Engine that achieves high retrieval accuracy and maintains memory across turns (sketched in code after this list):
- Parsing: It uses PyMuPDF for superior PDF text extraction, ensuring high-quality input data from the start and preserving complex table structures.
- Semantic Search & Indexing: The pipeline converts document content into dense vector embeddings (🔢) using the highly efficient MiniLM model. This enables semantic search (Vector Retrieval), allowing the system to understand the meaning of a user's question (e.g., asking for a "security protection fee") and accurately retrieve (🔍) the relevant financial line item.
- Conversational RAG: The key upgrade is the use of the LlamaIndex ChatEngine, which automatically retrieves context for every turn of the conversation and combines the conversation history with the newly retrieved document chunks.
- Synthesis: The combined context is passed to the Gemini 2.5 Flash LLM, which synthesizes the final, accurate, and memory-aware answer, allowing users to ask complex, multi-turn follow-up questions.
- Stack: Python, RAG, LlamaIndex, LLM, Gemini 2.5 Flash, HuggingFace MiniLM-L6-v2, PyMuPDF (fitz)
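A minimal sketch of the conversational upgrade, assuming LlamaIndex's condense-plus-context chat mode; file and model names are assumptions:

```python
# Sketch: a conversational chat engine that retrieves context on every turn
# and keeps memory across turns. File and model names are assumptions.
from llama_index.core import Settings, SimpleDirectoryReader, VectorStoreIndex
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.llms.gemini import Gemini

Settings.embed_model = HuggingFaceEmbedding(
    model_name="sentence-transformers/all-MiniLM-L6-v2"
)
Settings.llm = Gemini(model="models/gemini-2.5-flash")  # identifier is an assumption

docs = SimpleDirectoryReader(input_files=["lenders_fees_worksheet.pdf"]).load_data()
index = VectorStoreIndex.from_documents(docs)

# "condense_plus_context": each turn, condense the chat history into a
# standalone question, retrieve fresh chunks, then answer using both.
chat_engine = index.as_chat_engine(chat_mode="condense_plus_context")

print(chat_engine.chat("What is the lender's title insurance fee?"))
print(chat_engine.chat("Is it included in the total closing costs?"))  # uses memory
```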
- View Presentation PDF: Full RAG Pipeline with Interactive Gradio Chatbot
- The Problem: Most automated systems struggle with merged documents. When multiple distinct files (e.g., a Resume, a PaySlip, and a Contract) are scanned into a single PDF "blob," standard AI tools treat them as one continuous stream of text.
- Key Challenges:
- Context Bleeding: Answers about a PaySlip might mistakenly pull data from a Resume.
- Inaccurate Retrieval: Standard keyword search fails to find information if the terminology differs (e.g., "Salary" vs. "Gross Pay").
- Manual Effort: Users traditionally have to manually split and categorize files before they can be processed by AI.
The "Tech" Solution: Semantic Boundary Intelligence
This project solves the "Blob" problem by shifting from simple text extraction to a Metadata-Aware RAG Pipeline.
How it works (a code sketch follows this list):
- Intelligent Splitting: The system uses LLM-based reasoning (Gemini 2.0) to analyze page transitions and detect document boundaries in real time.
- Semantic Indexing: Instead of a flat search, pages are embedded into a vector space using BAAI/bge-small-en-v1.5, allowing the system to understand the meaning of your questions.
- Intent Routing: The AI first predicts which document type contains the answer, then applies a Metadata Filter to search only that specific section. This ensures high precision and eliminates "noise" from irrelevant pages.
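A compressed sketch of the splitting-plus-routing idea, assuming Gemini for page classification and LlamaIndex metadata filters; the prompt, labels, file name, and target doc type are illustrative:

```python
# Sketch: label each page with Gemini, attach the label as metadata, and
# route a query with a metadata filter. Prompt and labels are illustrative.
import os

import fitz  # PyMuPDF
import google.generativeai as genai
from llama_index.core import Document, VectorStoreIndex
from llama_index.core.vector_stores import ExactMatchFilter, MetadataFilters
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
classifier = genai.GenerativeModel("gemini-2.0-flash")  # name is an assumption

def classify_page(text: str) -> str:
    """Ask the LLM which document type a page belongs to."""
    prompt = (
        "Classify this page as exactly one of: resume, payslip, contract.\n\n"
        + text[:2000]
    )
    return classifier.generate_content(prompt).text.strip().lower()

# Intelligent splitting: label every page of the merged "blob" PDF.
with fitz.open("merged_blob.pdf") as pdf:  # placeholder file
    docs = [
        Document(text=page.get_text(),
                 metadata={"doc_type": classify_page(page.get_text())})
        for page in pdf
    ]

# Semantic indexing with the BGE-Small embedding model.
index = VectorStoreIndex.from_documents(
    docs,
    embed_model=HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5"),
)

# Intent routing: search only the pages labeled with the predicted doc type.
filters = MetadataFilters(filters=[ExactMatchFilter(key="doc_type", value="payslip")])
retriever = index.as_retriever(filters=filters, similarity_top_k=3)
for node in retriever.retrieve("What is the gross pay?"):
    print(node.metadata["doc_type"], node.text[:80])
```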
Tech Stack
Core AI & Frameworks
- LlamaIndex: The primary orchestration framework used for data ingestion, indexing, and retrieval logic.
- Google Gemini 2.0 Flash: The "reasoning engine" responsible for document classification, boundary detection, and final response generation.
- Sentence-Transformers: Powers the semantic search capabilities via the BGE-Small embedding model.
Processing & UI
- Gradio: A Python-based UI framework used to build the interactive web dashboard, featuring custom CSS for a professional, bordered layout.
- PyMuPDF: Utilized for high-performance PDF text extraction and parsing.
- Nest-Asyncio: Manages asynchronous event loops so the Gradio UI and LLM calls run smoothly in interactive environments (see the sketch below).
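A minimal sketch of the UI wiring, assuming the `chat_engine` built in the conversational RAG sketch above; the title and callback are placeholders:

```python
# Sketch: wrap the RAG chat engine in a Gradio chat UI.
# Assumes `chat_engine` exists as in the conversational RAG sketch above.
import gradio as gr
import nest_asyncio

nest_asyncio.apply()  # let Gradio and async LLM calls share the event loop

def answer(message, history):
    """Gradio ChatInterface callback: one grounded reply per user turn."""
    return str(chat_engine.chat(message))

gr.ChatInterface(fn=answer, title="Document Q&A").launch()
```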
Artifacts demonstrating system design and product management skills.
- Translating Gen Z behavioral insights into retention-focused product features. Covers the full lifecycle from hypothesis and user research to KPI definition and final feature recommendation.
- To systematically evaluate and compare the indexing speed, retrieval speed, and retrieval quality (conciseness) of three leading open-source embedding models (MiniLM-L6-v2, BGE-small-en, and E5-small-v2) within a simple RAG pipeline; a timing sketch follows this list.
- To showcase the rigorous testing phase of the AI-Powered Document Automation Platform, demonstrating evidence-based model selection for LLMs and embedding models to achieve sub-second retrieval with factual accuracy.
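A minimal timing sketch for the benchmark, using sentence-transformers directly; the corpus, queries, and model IDs are placeholders/assumptions (E5 models also expect "query:"/"passage:" prefixes, omitted here for brevity):

```python
# Sketch: time indexing and retrieval for three embedding models.
# Corpus and queries are placeholders; quality scoring is out of scope here.
import time

from sentence_transformers import SentenceTransformer, util

corpus = ["..."] * 500   # placeholder documents
queries = ["..."] * 20   # placeholder questions

for name in (
    "sentence-transformers/all-MiniLM-L6-v2",
    "BAAI/bge-small-en",
    "intfloat/e5-small-v2",
):
    model = SentenceTransformer(name)

    t0 = time.perf_counter()
    corpus_emb = model.encode(corpus, convert_to_tensor=True)  # indexing
    index_s = time.perf_counter() - t0

    t0 = time.perf_counter()
    query_emb = model.encode(queries, convert_to_tensor=True)
    hits = util.semantic_search(query_emb, corpus_emb, top_k=3)  # retrieval
    retrieve_s = time.perf_counter() - t0

    print(f"{name}: index {index_s:.2f}s, retrieve {retrieve_s:.3f}s")
```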