Retrieval‑Augmented Generation demo with LangChain, FAISS and OpenAI
- Downloads Wikipedia articles on a topic of your choice via `WikipediaLoader`.
- Splits text into semantically coherent chunks with `SemanticChunker`.
- Creates an in‑memory FAISS vector store with OpenAI embeddings.
- Retrieves the most relevant chunks for a user question.
- Generates an answer with GPT-4o using a prompt pulled from LangChain Hub.
- Everything runs in a single script: `naive_rag_wikipedia.py`.
```bash
# 1. Clone the repo
git clone https://github.com/felipeortizh/basic-RAG.git
cd basic-RAG

# 2. (Optional) create & activate a virtual environment
python -m venv .venv && source .venv/bin/activate   # macOS / Linux
# or on Windows
python -m venv .venv && .\.venv\Scripts\activate

# 3. Install dependencies
pip install -r requirements.txt
```

Set your OpenAI API key before running:

```bash
export OPENAI_API_KEY="sk-..."   # macOS / Linux
setx OPENAI_API_KEY "sk-..."     # Windows
```

Then run the script:

```bash
python naive_rag_wikipedia.py
```

Sample output:
```text
Starting document loading...
Documents loaded successfully: 2 documents
Creating text splits...
Created 34 text splits
Creating vector store...
Vector store created successfully
...
Result: The main areas of Artificial Intelligence are...
```
- Ingest – `WikipediaLoader` fetches up to *n* articles for the query defined in the script (default: "Artificial Intelligence").
- Chunk – `SemanticChunker` breaks each article into ~500‑token, semantically coherent chunks.
- Index – Each chunk is embedded with `OpenAIEmbeddings` and stored in a FAISS index.
- Retrieve – The top‑k chunks relevant to the user question are retrieved.
- Generate – Chunks are injected into a RAG prompt template from LangChain Hub and sent to the LLM for the final answer (see the sketch after the diagram below).

Wikipedia → Chunk → FAISS → Retriever → Prompt → LLM
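For orientation, here is a condensed sketch of what that pipeline can look like in LangChain code. It is an illustration, not a copy of `naive_rag_wikipedia.py`: the hub prompt name (`rlm/rag-prompt`), the chain wiring, and the variable names are assumptions, so check the script for the authoritative version.

```python
from langchain import hub
from langchain_community.document_loaders import WikipediaLoader
from langchain_community.vectorstores import FAISS
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
from langchain_experimental.text_splitter import SemanticChunker
from langchain_openai import ChatOpenAI, OpenAIEmbeddings

# Ingest: fetch Wikipedia articles for the query
docs = WikipediaLoader(query="Artificial Intelligence", load_max_docs=2).load()

# Chunk: split articles into semantically coherent pieces
splits = SemanticChunker(OpenAIEmbeddings()).split_documents(docs)

# Index: embed each chunk and store it in a FAISS index
vectorstore = FAISS.from_documents(splits, OpenAIEmbeddings())
retriever = vectorstore.as_retriever()

# Generate: stuff retrieved chunks into a hub prompt and call the LLM
prompt = hub.pull("rlm/rag-prompt")           # assumed prompt name
llm = ChatOpenAI(model_name="gpt-3.5-turbo")  # model per the customization table below

def format_docs(docs):
    return "\n\n".join(d.page_content for d in docs)

rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

print(rag_chain.invoke("What are the main areas of Artificial Intelligence?"))
```

The retrieved chunks and the question feed the same prompt, which is the whole trick behind naive RAG: the LLM only ever sees the top‑k chunks, never the full articles.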
| Change you want | Where to edit |
|---|---|
| Topic to search on Wikipedia | `query` argument of `WikipediaLoader` |
| Number of articles to load | `load_max_docs` |
| Embedding model | `OpenAIEmbeddings()` |
| Vector store implementation | replace `FAISS` with any LangChain store |
| LLM model | `ChatOpenAI(model_name="gpt-3.5-turbo")` |
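For example, assuming the variable names from the sketch above (they may differ in the script), those edits could look like:

```python
from langchain_community.document_loaders import WikipediaLoader
from langchain_community.vectorstores import Chroma  # requires the chromadb package
from langchain_openai import ChatOpenAI, OpenAIEmbeddings

# Different topic, more articles
docs = WikipediaLoader(query="Machine learning", load_max_docs=5).load()

# Any LangChain vector store exposes the same from_documents/as_retriever API,
# so FAISS can be swapped for, e.g., Chroma with a one-line change
vectorstore = Chroma.from_documents(splits, OpenAIEmbeddings())

# Different chat model
llm = ChatOpenAI(model_name="gpt-4o")
```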
```text
basic-RAG/
├── naive_rag_wikipedia.py   # end‑to‑end RAG pipeline
├── requirements.txt         # Python dependencies
└── README.md                # You are here
```
- Parse CLI arguments for topic, doc count & model selection
- Add persistence to the FAISS index (see the sketch below)
- Stream token-by-token LLM output
- Add unit tests
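On the persistence item: LangChain's FAISS wrapper already provides `save_local` / `load_local`, so one possible approach (the `faiss_index` directory name is a placeholder, and recent `langchain-community` releases require the deserialization flag because the docstore is pickled) is:

```python
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings

# After building the index once (see the pipeline sketch above), persist it to disk
vectorstore.save_local("faiss_index")

# On later runs, reload it instead of re-embedding everything
vectorstore = FAISS.load_local(
    "faiss_index",
    OpenAIEmbeddings(),
    allow_dangerous_deserialization=True,  # required by newer langchain-community
)
```

Reloading skips the embedding step entirely, which also saves OpenAI API calls on repeated runs.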
Contributions are welcome! Please open an issue to discuss your ideas or submit a pull request.
Released under the MIT License.
