
🚀 LLM-Engine: Build, Train, and Deploy Large Language Models for Chatbots



📖 Overview

LLM-Engine is a modular platform for building, training, evaluating, and deploying large language models (LLMs) for chatbot applications.
It implements a GPT-2 style Transformer decoder with a customizable architecture, enabling efficient natural language understanding and generation.


🧩 GPT-2 Model Architecture

The GPT-2 model follows the Transformer decoder architecture, consisting of stacked layers of:

  • Multi-head self-attention
  • Position-wise feed-forward layers
  • Residual connections & layer normalization

This design enables the model to capture long-range dependencies and contextual information effectively.

[Figure: GPT-2 model architecture]

Reference: Yang, Steve; Ali, Zulfikhar; Wong, Bryan (2023). FLUID-GPT (Fast Learning to Understand and Investigate Dynamics with a Generative Pre-Trained Transformer): Efficient Predictions of Particle Trajectories and Erosion. ChemRxiv. https://doi.org/10.26434/chemrxiv-2023-ppk9s
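
As a rough illustration (not the repository's actual implementation), a single decoder block of this kind can be sketched in PyTorch; the d_model, n_heads, and dropout values mirror the training flags below, and the class name DecoderBlock is purely illustrative:

import torch
import torch.nn as nn

class DecoderBlock(nn.Module):
    """One GPT-2 style decoder block: masked multi-head self-attention and a
    position-wise feed-forward network, each wrapped in a residual connection
    with layer normalization (pre-norm, as in GPT-2)."""

    def __init__(self, d_model=512, n_heads=8, dropout=0.1):
        super().__init__()
        self.ln1 = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads,
                                          dropout=dropout, batch_first=True)
        self.ln2 = nn.LayerNorm(d_model)
        self.ff = nn.Sequential(              # position-wise feed-forward
            nn.Linear(d_model, 4 * d_model),
            nn.GELU(),
            nn.Linear(4 * d_model, d_model),
            nn.Dropout(dropout),
        )

    def forward(self, x):                     # x: (batch, seq_len, d_model)
        seq_len = x.size(1)
        # Causal mask: True entries are blocked, so each token attends
        # only to itself and earlier positions.
        mask = torch.triu(torch.ones(seq_len, seq_len, dtype=torch.bool,
                                     device=x.device), 1)
        h = self.ln1(x)
        attn_out, _ = self.attn(h, h, h, attn_mask=mask, need_weights=False)
        x = x + attn_out                      # residual connection
        x = x + self.ff(self.ln2(x))          # residual connection
        return x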


⚡ Quick Start

Using Docker

chmod +x run.sh   # make the launch script executable
./run.sh          # builds and runs the project via Docker

Manual Setup

pip install -r requirements.txt   # install Python dependencies
python3 inference.py              # run inference with the trained model
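
The internals of inference.py are not shown here; as a hypothetical sketch of how greedy decoding against a GPT-2 style decoder typically works (model, encode, and decode are assumed stand-ins, not this repository's actual API):

import torch

@torch.no_grad()
def generate(model, encode, decode, prompt, max_new_tokens=50):
    """Greedy decoding: repeatedly run the model on the growing sequence
    and append the most likely next token."""
    ids = torch.tensor([encode(prompt)])           # shape (1, seq_len)
    for _ in range(max_new_tokens):
        logits = model(ids)                        # (1, seq_len, vocab_size)
        next_id = logits[0, -1].argmax().view(1, 1)
        ids = torch.cat([ids, next_id], dim=1)     # extend the sequence
    return decode(ids[0].tolist())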

📂 Dataset Preparation

Download datasets (example: ChatGPT conversations from Kaggle):

import kagglehub

# Downloads the dataset into the local kagglehub cache and returns its path.
path = kagglehub.dataset_download("noahpersaud/89k-chatgpt-conversations")
print("Path:", path)

Then preprocess:

python scripts/prepare_dataset.py --input chatlogs.jsonl --output data/word_level_dataset.csv
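
prepare_dataset.py ships with the repository; conceptually, flattening JSONL chat logs into a word-level CSV might look like the following sketch (the conversations/text field names are assumptions about the Kaggle dump's schema, not verified):

import csv
import json
import os

os.makedirs("data", exist_ok=True)

# Hypothetical sketch: write one utterance per CSV row.
with open("chatlogs.jsonl") as src, \
     open("data/word_level_dataset.csv", "w", newline="") as dst:
    writer = csv.writer(dst)
    writer.writerow(["text"])                        # single text column
    for line in src:
        record = json.loads(line)
        for turn in record.get("conversations", []):   # assumed field name
            writer.writerow([turn.get("text", "")])    # assumed field name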

🏋️ Training the Model

python3 train.py \
    --epochs 10 \
    --lr 0.0001 \
    --d_model 512 \
    --n_layers 8 \
    --n_heads 8 \
    --dropout 0.1 \
    --save_path Model.pth \
    --print_samples 3 \
    --tie_embeddings

Arguments:

  • --epochs : Number of training epochs
  • --lr : Learning rate
  • --d_model : Embedding (model) dimension
  • --n_layers : Number of Transformer decoder layers
  • --n_heads : Number of attention heads
  • --dropout : Dropout rate
  • --save_path : Path where the trained model checkpoint is saved
  • --print_samples : Number of sample generations printed during training
  • --tie_embeddings : Tie the input and output embedding weights (see the sketch after this list)
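
Weight tying (--tie_embeddings) shares one weight matrix between the input token-embedding layer and the output projection, cutting the parameter count and often improving perplexity. A minimal illustration of the idea in PyTorch, independent of this repository's model class:

import torch.nn as nn

vocab_size, d_model = 30000, 512                   # illustrative sizes
embed = nn.Embedding(vocab_size, d_model)          # input token embeddings
head = nn.Linear(d_model, vocab_size, bias=False)  # output projection (logits)
head.weight = embed.weight                         # tie: both share one matrix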

📦 Pretrained Model

git clone https://huggingface.co/anthonyhuang1909/LLM-Engine

Includes:

  • Model.pth – pretrained weights
  • vocab.json – tokenizer vocabulary
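
Hugging Face model repositories store large files via Git LFS, so running git lfs install before cloning may be necessary. A sketch of loading the checkpoint and vocabulary follows; the state-dict layout and the commented-out model constructor are assumptions, so adapt them to the repository's actual classes:

import json
import torch

# Load the tokenizer vocabulary (assumed to map token -> id).
with open("LLM-Engine/vocab.json") as f:
    vocab = json.load(f)

# Load pretrained weights on CPU so no GPU is required.
state_dict = torch.load("LLM-Engine/Model.pth", map_location="cpu")

# model = GPT2Decoder(...)            # hypothetical constructor name
# model.load_state_dict(state_dict)
# model.eval()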

⚠️ Disclaimer

This project is intended for educational and research purposes.
It demonstrates the principles of Transformer-based models at a smaller scale.


📜 License

Released under the MIT License.


Last updated: 2025-08-21
