RAG-Ingest: A tool for converting PDFs to markdown and indexing them for enhanced Retrieval Augmented Generation (RAG) capabilities.
Updated Nov 22, 2024 - Python
Case study using dotfurther's Open Discover Platform with the RavenDB document store to rapidly create a full-text search/eDiscovery/information governance capable demonstration application.
A simple RAG toolkit.
Self-hosted RAG engine for AI coding assistants. Ingests technical docs & code repositories locally with structure-aware chunking. Serves grounded context via MCP to prevent hallucinations in software development workflows.
Production-grade RAG chatbot with a FastAPI + LangGraph backend (Pinecone vector search + Groq LLM + Tavily web fallback) and a Streamlit chat UI, secured via API key and observable in LangSmith.
An implementation of the GraphRAG pipeline (based on the 2024 paper "From Local to Global" by Edge et al.) for query-focused summarization of large text corpora.
Self-hosted RAG prototype that ingests PDFs and HTML and lets you chat with them through a local UI.
Store millions of text chunks inside ultra-compact MP4 files, index them with local embeddings, and retrieve answers instantly for fully offline RAG with any LLM.
An AI analytics dashboard for research labs, covering analytics, collaboration, and email workflows, built with React and FastAPI.
AI-powered RAG assistant for parents to get instant, context-aware answers on Brainwonders’ career counseling programs, pricing, and services. Built with Streamlit, LangChain, ChromaDB, and Google Gemma LLM for fast, multi-document retrieval and conversational Q&A.
Async document watcher that keeps your RAG index hot. Automatically ingests new or changed documents into a live RAG pipeline with built-in observability.
Agentic RAG Chatbot using multi-agent architecture and Streamlit. Ingests PDFs, DOCX, PPTX, CSV, TXT, and Markdown files to provide contextually accurate answers with a persistent knowledge base. Supports multi-turn conversations, source citations, and dynamic document uploads.
ScriptumAI is an advanced Retrieval-Augmented Generation platform designed for document ingestion and query processing.
🗂️ Build a knowledge graph for global query-focused summarization from document corpora using the GraphRAG pipeline, enhancing information synthesis.
📊 Streamline query-focused summarization by constructing knowledge graphs and extracting insights from document corpora with the GraphRAG pipeline.