This repository contains the Backend Module developed as the primary functional layer for the Bachelor's Engineering Thesis: "Learning module with conversational educational system".
- University: Warsaw University of Technology (Politechnika Warszawska)
- Faculty: Faculty of Mathematics and Information Science (MiNI)
- Supervisor: dr inż. Anna Wróblewska
- Authors: Anna Ostrowska, Gabriela Majstrak, Jan Opala
The backend is a REST API built with FastAPI around a Retrieval-Augmented Generation (RAG) pipeline. Its main components are:
- Vector Database: ChromaDB for efficient semantic search and context retrieval.
- OCR & Parsing: Adaptive document processing.
- LLM Orchestration: Integration with OpenAI (GPT-4o-mini) and Google Gemini for generation, and Voyage AI for embeddings.
- Task Scheduling: APScheduler automatically synchronizes course materials from Moodle.
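The retrieval step of the RAG flow described above can be illustrated with a minimal, dependency-free sketch. It is not the project's code: the in-memory `STORE`, the hand-made 3-dimensional embeddings, and the function names `retrieve` and `build_prompt` are assumptions for illustration. In the real system ChromaDB performs the similarity search and Voyage AI produces the embeddings.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def retrieve(query_vec, store, k=2):
    """Return the texts of the k chunks most similar to the query embedding."""
    ranked = sorted(
        store,
        key=lambda item: cosine_similarity(query_vec, item["embedding"]),
        reverse=True,
    )
    return [item["text"] for item in ranked[:k]]

def build_prompt(question, context_chunks):
    """Assemble a prompt that restricts the model to the retrieved context."""
    context = "\n".join(f"- {chunk}" for chunk in context_chunks)
    return (
        "Answer using only the course context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}"
    )

# Toy stand-in for the vector database (hypothetical data).
STORE = [
    {"text": "A matrix is invertible iff its determinant is nonzero.",
     "embedding": [0.9, 0.1, 0.0]},
    {"text": "Quicksort has average complexity O(n log n).",
     "embedding": [0.0, 0.2, 0.9]},
    {"text": "The determinant of a product equals the product of determinants.",
     "embedding": [0.8, 0.3, 0.1]},
]

query_embedding = [1.0, 0.2, 0.0]  # would come from the embedding model
chunks = retrieve(query_embedding, STORE, k=2)
prompt = build_prompt("When is a matrix invertible?", chunks)
print(prompt)
```

The resulting prompt would then be sent to the generation model; only the retrieved chunks reach it, which is what keeps answers tied to course content.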
The system features a custom-built ingestion pipeline (parser_utils.py) that includes:
- Adaptive OCR
- Context-Aware Chunking
- Math Normalization
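As a rough illustration of what context-aware chunking can mean, the sketch below packs whole paragraphs into chunks so that no paragraph is cut in half. This is one plausible interpretation only: the function name `chunk_paragraphs`, the `max_chars` limit, and the paragraph-based strategy are assumptions, and the actual logic in parser_utils.py may differ.

```python
import re

def chunk_paragraphs(text, max_chars=200):
    """Pack whole paragraphs into chunks of at most max_chars characters.

    Paragraphs are separated by blank lines. A single paragraph longer than
    max_chars is kept intact as its own oversized chunk rather than split.
    """
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    chunks = []
    current = ""
    for para in paragraphs:
        candidate = f"{current}\n\n{para}" if current else para
        if len(candidate) <= max_chars or not current:
            # Paragraph fits, or it is the first one in an empty chunk.
            current = candidate
        else:
            # Close the current chunk and start a new one at this paragraph.
            chunks.append(current)
            current = para
    if current:
        chunks.append(current)
    return chunks

sample = "Definition of a limit.\n\nAn example with a sequence.\n\nA remark on continuity."
for i, chunk in enumerate(chunk_paragraphs(sample, max_chars=55)):
    print(i, repr(chunk))
```

Splitting on paragraph boundaries keeps a definition and its explanation together, which tends to give the retriever more self-contained context than fixed-width slicing.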
Key features:
- RAG Pipeline: Uses vector search to provide factually grounded answers strictly based on course content.
- Quiz Generation: Automated assessment creation based on Bloom's Taxonomy to evaluate student understanding.
- Smart Sync: Periodic background jobs that check for new course materials without manual teacher intervention.
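The Smart Sync idea, checking for new materials without teacher intervention, can be sketched as a change-detection pass over the files a course exposes. How the project actually detects changes is not stated here; content hashing, the function names `find_changed` and `sync_once`, and the in-memory dictionaries are assumptions, and the Moodle download step is omitted.

```python
import hashlib

def fingerprint(content: bytes) -> str:
    """Stable fingerprint of a file's content."""
    return hashlib.sha256(content).hexdigest()

def find_changed(remote_files: dict, known: dict) -> list:
    """Return sorted names of files that are new or whose content changed.

    remote_files maps file name -> raw bytes fetched from the course.
    known maps file name -> fingerprint recorded at the last sync.
    """
    changed = [
        name for name, content in remote_files.items()
        if known.get(name) != fingerprint(content)
    ]
    return sorted(changed)

def sync_once(remote_files: dict, known: dict) -> list:
    """Run one sync pass: report files to ingest and record their fingerprints."""
    changed = find_changed(remote_files, known)
    for name in changed:
        known[name] = fingerprint(remote_files[name])
    return changed

known_state = {}
course_files = {"lecture1.pdf": b"limits", "lecture2.pdf": b"derivatives"}
print(sync_once(course_files, known_state))  # first pass: everything is new
print(sync_once(course_files, known_state))  # second pass: nothing changed
```

A periodic scheduler job could call a function like `sync_once` and pass the returned file names to the ingestion pipeline, so unchanged materials are not re-embedded.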
Project structure:
- app/api/routes/: API endpoints for chat, quiz, and dashboard features.
- app/services/: Core business logic, including the RAG engine and embedding management.
- app/core/parser_utils.py: Technical implementation of the parsing and OCR logic.
- Dockerfile & docker-compose.yaml: Full containerization setup for easy deployment and scalability.
To run the system with Docker:
- Ensure your .env file is configured with the necessary API keys.
- Build and start the system:
  docker compose up -d --build

The API will be available at http://localhost:8000, with interactive Swagger documentation at /docs.
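Before starting the containers, it can help to verify that the .env file actually defines the keys the services expect. The sketch below is a hypothetical pre-flight check: the variable names in `REQUIRED_KEYS` are assumptions (the real names used by this project are not listed here), and the parser handles only simple `KEY=value` lines.

```python
def parse_env(text: str) -> dict:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values

# Hypothetical variable names; replace with the ones the project really uses.
REQUIRED_KEYS = ["OPENAI_API_KEY", "GOOGLE_API_KEY", "VOYAGE_API_KEY"]

def missing_keys(env: dict, required=REQUIRED_KEYS) -> list:
    """Return required keys that are absent or empty."""
    return [key for key in required if not env.get(key)]

sample_env = 'OPENAI_API_KEY=sk-test\n# embeddings\nVOYAGE_API_KEY=""\n'
print(missing_keys(parse_env(sample_env)))
```

To check a real file, the same functions can be applied to its contents, for example `missing_keys(parse_env(open(".env").read()))`.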
Developed as the primary technical part of the diploma process at Warsaw University of Technology.