# **Open-Source Model Experiments**

This directory contains four standalone experiments exploring
**local, open-source language models** for Retrieval-Augmented Generation
(RAG), model evaluation, recursive editing, and sustainability tracking
(energy & CO₂ emissions).
Each subfolder includes its own notebook, documentation, outputs, and
model-specific setup.

---

## Directory Structure

```text
2_open_source_models/
│
├── distilled_models/
│   └── rag_and_distilled_model/
│
├── quantized_models/
│   └── mistral7b/
│
└── slm/
    ├── google_gemma/
    └── qwen/
```

Each subfolder contains a self-contained model with its own README,
notebook(s), generated outputs, and energy/emissions logs where applicable.

---

## Project Summaries

Below is a concise description of each project, so the purpose of the
folder as a whole can be grasped at a glance.

---

### **1. Distilled Models – RAG + Instruction-Tuned Distilled LMs**

**Folder:** `distilled_models/rag_and_distilled_model/`
**Notebook:** `Apollo11_rag&distilled.ipynb`

This project uses a lightweight **LaMini-Flan-T5-248M** distilled model
combined with a **MiniLM** embedding model to run a fully local
Retrieval-Augmented Generation pipeline on the Apollo 11 dataset.
It demonstrates:

* Local embeddings and ChromaDB vector storage
* RAG-based question answering
* Evaluation across several prompt types
* Emissions tracking and generated output logs

Ideal for showing how **compact distilled models** can handle
RAG efficiently on CPU or modest GPU hardware.
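
The retrieve-then-generate shape of this pipeline can be sketched in plain Python. The real notebook uses MiniLM embeddings and ChromaDB; here, bag-of-words vectors and a plain list stand in for both, so the sketch runs without any model downloads. The chunks and query are illustrative only.

```python
from math import sqrt

# Toy stand-in for the MiniLM + ChromaDB retrieval step: bag-of-words
# vectors replace learned embeddings, a list replaces the vector store.

def embed(text: str) -> dict:
    """Crude bag-of-words 'embedding' (the real pipeline uses MiniLM)."""
    vec = {}
    for word in text.lower().split():
        word = word.strip(".,?!")
        vec[word] = vec.get(word, 0) + 1
    return vec

def cosine(a: dict, b: dict) -> float:
    dot = sum(a[w] * b.get(w, 0) for w in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Rank all chunks against the query and keep the top-k."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]

chunks = [
    "Apollo 11 landed on the Moon on July 20, 1969.",
    "Neil Armstrong was the mission commander.",
    "The Saturn V rocket launched the mission.",
]
context = retrieve("Who was the Apollo 11 mission commander?", chunks, k=1)
# The retrieved chunk is then prepended to the prompt sent to the LM:
prompt = f"Context: {context[0]}\nQuestion: Who was the Apollo 11 mission commander?"
```

In the actual project, `embed` is a sentence-transformers call and the ranked lookup is a ChromaDB query; the control flow is the same.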

---

### **2. Quantized Models – Mistral 7B RAG Pipeline**

**Folder:** `quantized_models/mistral7b/`

This project evaluates a **quantized Mistral-7B (GGUF)** model running
fully locally via `llama-cpp-python`.
It focuses on:

* Retrieval-Augmented Generation using LlamaIndex
* Local inference using a 4-bit quantized LLM
* Document processing, embedding (BGE-small), and top-k retrieval
* Practical observations on feasibility and performance on a laptop

A strong example of how quantization enables
**large-model capability at small-device cost**.
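
The core idea behind 4-bit quantization can be shown with a minimal sketch. Real GGUF formats use block-wise scales and more elaborate schemes, so this symmetric single-scale version only illustrates the memory/accuracy trade-off, not the actual file format.

```python
# Minimal sketch of symmetric 4-bit quantization (not the real GGUF scheme).

def quantize_4bit(weights: list[float]) -> tuple[list[int], float]:
    """Map floats to integers in [-8, 7] using one shared scale."""
    scale = max(abs(w) for w in weights) / 7 or 1.0
    q = [max(-8, min(7, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q: list[int], scale: float) -> list[float]:
    return [v * scale for v in q]

weights = [0.12, -0.07, 0.33, -0.29, 0.01]
q, scale = quantize_4bit(weights)
restored = dequantize(q, scale)

# Each weight now needs 4 bits instead of 32 (~8x less memory),
# at the cost of a bounded reconstruction error of at most scale/2.
max_err = max(abs(w - r) for w, r in zip(weights, restored))
```

Scaled up to 7B parameters, this is what lets a Mistral-sized model fit in a few gigabytes of laptop RAM instead of ~28 GB of float32 weights.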

---

### **3. Small Language Model (SLM): Google Gemma 2-2B**

**Folder:** `slm/google_gemma/`

This experiment implements a structured RAG workflow with Google’s lightweight
**Gemma 2-2B** model and a fixed Apollo 11 source text.
Key features include:

* Standardized 21-prompt evaluation set
* RAG pipeline with chunked retrieval
* Draft → Critic → Refiner multi-step generation
* Real-time emissions logging with CodeCarbon
* Fully reproducible testing and reporting

This project demonstrates how even very small open-weight models can
perform multi-step reasoning when paired with thoughtful prompting and revision
cycles.
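
The multi-step generation above can be sketched as a simple loop. Here `generate` stands in for a call into Gemma 2-2B, and a stub version lets the sketch run without downloading any weights; the prompt wording is illustrative, not the project's actual prompts.

```python
# Sketch of the Draft -> Critic -> Refiner cycle with a pluggable model call.

def refine_loop(question: str, context: str, generate, rounds: int = 2) -> str:
    # Draft: answer from retrieved context.
    draft = generate(f"Context: {context}\nAnswer the question: {question}")
    for _ in range(rounds):
        # Critic: ask the model to find problems in its own answer.
        critique = generate(f"List flaws in this answer: {draft}")
        # Refiner: rewrite the answer to address the critique.
        draft = generate(
            f"Rewrite the answer to fix these flaws.\n"
            f"Answer: {draft}\nFlaws: {critique}"
        )
    return draft

# Stub model so the sketch is runnable without any model weights.
def stub_generate(prompt: str) -> str:
    if prompt.startswith("List flaws"):
        return "Too vague."
    if prompt.startswith("Rewrite"):
        return "Apollo 11 landed on July 20, 1969."
    return "It landed in 1969."

answer = refine_loop("When did Apollo 11 land?", "Apollo 11 source text",
                     stub_generate)
```

Swapping `stub_generate` for a real inference call is the only change needed to drive an actual model through the same cycle.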

---

### **4. Small Language Model (SLM): Qwen 2.5B + Recursive Editing**

**Folder:** `slm/qwen/`

This notebook experiments with **Qwen 2.5B**, integrating:

* RAG retrieval
* A recursive editing loop (Draft → Critic → Refine)
* Context retrieval through Hugging Face embeddings
* Energy + CO₂ logging for each query

Outputs are saved in Markdown form with all iterations and emissions data.
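
The per-query Markdown logging step might look like the sketch below. The field names and layout are assumptions for illustration; in the actual notebook the energy and CO₂ numbers come from CodeCarbon's tracker rather than being passed in by hand.

```python
# Sketch of appending one query's iterations and emissions to a report file.
# Layout and field names are illustrative, not the notebook's exact format.

def log_query(path: str, question: str, iterations: list[str],
              energy_kwh: float, co2_kg: float) -> None:
    lines = [f"## {question}", ""]
    for i, text in enumerate(iterations, start=1):
        lines += [f"**Iteration {i}:** {text}", ""]
    lines += [f"- Energy: {energy_kwh:.6f} kWh",
              f"- CO2: {co2_kg:.6f} kg", ""]
    # Append so all queries accumulate in one report.
    with open(path, "a", encoding="utf-8") as f:
        f.write("\n".join(lines) + "\n")
```

Appending one section per query keeps every draft, critique, and refinement alongside its measured footprint in a single reviewable file.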

---

## Purpose of This Collection

This folder exists to:

* Compare how different **model sizes**, **architectures**, and
  **inference strategies** behave on the **same tasks**.
* Demonstrate **fully local RAG pipelines** using only open-source components.
* Document **energy and carbon trade-offs** in local LLM usage.
* Provide reproducible examples that can be extended or rerun with other models.

Each subfolder is designed as a standalone experiment, but together they
form a cohesive study of open-source LLM efficiency and performance.

---

## Notes

* All code is intended to run locally.
* Each folder includes its own notebook and README with instructions.
* Energy/emissions reporting is included where relevant (via CodeCarbon).
* Datasets and prompts are standardized across projects for fairness and comparability.