LLM-Engine is a modular platform to build, train, evaluate, and deploy large language models (LLMs) for chatbot applications.
It implements a GPT-2 style Transformer decoder, enabling efficient natural language understanding and generation with customizable architecture.
The GPT-2-style model follows the Transformer decoder architecture, consisting of stacked layers of:
- Multi-head self-attention
- Position-wise feed-forward layers
- Residual connections & layer normalization
This design allows the model to capture long-range dependencies and contextual information effectively; a minimal sketch of one such decoder block follows the reference below.
Reference: Yang, Steve; Ali, Zulfikhar; Wong, Bryan (2023). FLUID-GPT (Fast Learning to Understand and Investigate Dynamics with a Generative Pre-Trained Transformer): Efficient Predictions of Particle Trajectories and Erosion. ChemRxiv. https://doi.org/10.26434/chemrxiv-2023-ppk9s
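For orientation, here is a minimal PyTorch sketch of one pre-norm decoder block of this kind. It is illustrative only, not the project's actual implementation; the class name and constructor arguments are assumptions, chosen to mirror the `train.py` hyperparameters listed later.

```python
# Illustrative-only sketch of a GPT-2 style pre-norm decoder block (not the repo's code).
import torch
import torch.nn as nn

class DecoderBlock(nn.Module):
    def __init__(self, d_model: int, n_heads: int, dropout: float):
        super().__init__()
        self.ln1 = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, dropout=dropout, batch_first=True)
        self.ln2 = nn.LayerNorm(d_model)
        self.ff = nn.Sequential(               # position-wise feed-forward network
            nn.Linear(d_model, 4 * d_model),
            nn.GELU(),
            nn.Linear(4 * d_model, d_model),
            nn.Dropout(dropout),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Causal mask: True entries are blocked, so each token attends only to earlier tokens.
        seq_len = x.size(1)
        mask = torch.triu(torch.ones(seq_len, seq_len, dtype=torch.bool, device=x.device), diagonal=1)
        h = self.ln1(x)
        attn_out, _ = self.attn(h, h, h, attn_mask=mask, need_weights=False)
        x = x + attn_out                       # residual connection around attention
        x = x + self.ff(self.ln2(x))           # residual connection around feed-forward
        return x
```

Stacking `n_layers` of these blocks, together with token and positional embeddings and a final projection to the vocabulary, gives the overall model shape that `train.py` configures.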
Make `run.sh` executable and launch it:

```bash
chmod +x run.sh
./run.sh
```

Install the Python dependencies:

```bash
pip install -r requirements.txt
```

Run inference with a trained model:

```bash
python3 inference.py
```
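The exact interface of `inference.py` is not shown here, but conceptually generation is an autoregressive loop. Below is a hedged sketch of greedy decoding; `model`, `encode`, and `decode` are placeholder names, not the project's API:

```python
# Hedged sketch of greedy autoregressive decoding; `model`, `encode`, and
# `decode` are placeholders, not the actual interface of inference.py.
import torch

@torch.no_grad()
def generate(model, encode, decode, prompt: str, max_new_tokens: int = 50) -> str:
    model.eval()
    ids = torch.tensor([encode(prompt)], dtype=torch.long)       # (1, seq_len)
    for _ in range(max_new_tokens):
        logits = model(ids)                                      # (1, seq_len, vocab_size)
        next_id = logits[:, -1, :].argmax(dim=-1, keepdim=True)  # most likely next token
        ids = torch.cat([ids, next_id], dim=1)                   # append and continue
    return decode(ids[0].tolist())
```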
Download datasets (example: ChatGPT conversations from Kaggle):

```python
import kagglehub

path = kagglehub.dataset_download("noahpersaud/89k-chatgpt-conversations")
print("Path:", path)
```
Then preprocess:

```bash
python scripts/prepare_dataset.py --input chatlogs.jsonl --output data/word_level_dataset.csv
```

Train the model:

```bash
python3 train.py --epochs 10 --lr 0.0001 --d_model 512 --n_layers 8 --n_heads 8 --dropout 0.1 --save_path Model.pth --print_samples 3 --tie_embeddings
```

Arguments:

- `--epochs`: training epochs
- `--lr`: learning rate
- `--d_model`: embedding dimension
- `--n_layers`: number of Transformer decoder layers
- `--n_heads`: number of attention heads
- `--dropout`: dropout rate
- `--save_path`: path to save the model
- `--print_samples`: print training samples
- `--tie_embeddings`: tie input/output embeddings
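The `--tie_embeddings` option refers to weight tying: the input embedding matrix and the output vocabulary projection share one set of parameters, which reduces model size and often helps smaller models. A minimal sketch of how this is typically wired in PyTorch (the class below is illustrative, not the repository's actual model):

```python
import torch.nn as nn

class TinyLMHead(nn.Module):
    """Illustrative-only module showing input/output embedding tying."""
    def __init__(self, vocab_size: int, d_model: int, tie_embeddings: bool = True):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, d_model)             # input embedding
        self.lm_head = nn.Linear(d_model, vocab_size, bias=False)    # output projection
        if tie_embeddings:
            # Share the same (vocab_size, d_model) weight matrix for both.
            self.lm_head.weight = self.tok_emb.weight

    def forward(self, hidden):            # hidden: (batch, seq, d_model)
        return self.lm_head(hidden)       # logits: (batch, seq, vocab_size)
```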
Clone the pretrained model from Hugging Face:

```bash
git clone https://huggingface.co/anthonyhuang1909/LLM-Engine
```

Includes:

- `Model.pth` – pretrained weights
- `vocab.json` – tokenizer vocabulary
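Loading these files for local use might look roughly like the sketch below; `build_model` and the vocabulary layout are placeholders, since the repository's actual model class and tokenizer format are not documented here.

```python
# Hedged sketch of loading the pretrained artifacts; `build_model` stands in
# for however the repository actually constructs its network.
import json
import torch

with open("LLM-Engine/vocab.json", "r", encoding="utf-8") as f:
    vocab = json.load(f)                  # assumed to be a token-to-id mapping

state_dict = torch.load("LLM-Engine/Model.pth", map_location="cpu")

# model = build_model(vocab_size=len(vocab), d_model=512, n_layers=8, n_heads=8)
# model.load_state_dict(state_dict)
# model.eval()
```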
This project is intended for educational & research purposes.
It demonstrates the principles of Transformer-based models on a smaller scale.
Released under the MIT License.
Last updated: 2025-08-21
