LLM-Engine is a modular platform to build, train, evaluate, and deploy large language models (LLMs) for chatbot applications.
It implements a GPT-2 style Transformer decoder, enabling efficient natural language understanding and generation with customizable architecture.
The GPT-2-style model follows the Transformer decoder architecture, consisting of stacked layers of:
- Multi-head self-attention
- Position-wise feed-forward layers
- Residual connections & layer normalization
This design allows the model to capture long-range dependencies and contextual information effectively; a minimal sketch of one such decoder block follows the reference below.
Reference: Yang, Steve; Ali, Zulfikhar; Wong, Bryan (2023). FLUID-GPT (Fast Learning to Understand and Investigate Dynamics with a Generative Pre-Trained Transformer): Efficient Predictions of Particle Trajectories and Erosion. ChemRxiv. https://doi.org/10.26434/chemrxiv-2023-ppk9s
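For orientation, here is a minimal PyTorch sketch of one pre-norm decoder block of this kind. It is illustrative only, not the project's actual implementation; the class name and constructor arguments are assumptions, chosen to mirror the `train.py` hyperparameters listed later.

```python
# Illustrative-only sketch of a GPT-2 style pre-norm decoder block (not the repo's code).
import torch
import torch.nn as nn

class DecoderBlock(nn.Module):
    def __init__(self, d_model: int, n_heads: int, dropout: float):
        super().__init__()
        self.ln1 = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, dropout=dropout, batch_first=True)
        self.ln2 = nn.LayerNorm(d_model)
        self.ff = nn.Sequential(               # position-wise feed-forward network
            nn.Linear(d_model, 4 * d_model),
            nn.GELU(),
            nn.Linear(4 * d_model, d_model),
            nn.Dropout(dropout),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Causal mask: True entries are blocked, so each token attends only to earlier tokens.
        seq_len = x.size(1)
        mask = torch.triu(torch.ones(seq_len, seq_len, dtype=torch.bool, device=x.device), diagonal=1)
        h = self.ln1(x)
        attn_out, _ = self.attn(h, h, h, attn_mask=mask, need_weights=False)
        x = x + attn_out                       # residual connection around attention
        x = x + self.ff(self.ln2(x))           # residual connection around feed-forward
        return x
```

Stacking `n_layers` of these blocks, together with token and positional embeddings and a final projection to the vocabulary, gives the overall model shape that `train.py` configures.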
Make `run.sh` executable and launch it:

```bash
chmod +x run.sh
./run.sh
```

Install the Python dependencies:

```bash
pip install -r requirements.txt
```

Run inference with a trained model:

```bash
python3 inference.py
```
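The exact interface of `inference.py` is not shown here, but conceptually generation is an autoregressive loop. Below is a hedged sketch of greedy decoding; `model`, `encode`, and `decode` are placeholder names, not the project's API:

```python
# Hedged sketch of greedy autoregressive decoding; `model`, `encode`, and
# `decode` are placeholders, not the actual interface of inference.py.
import torch

@torch.no_grad()
def generate(model, encode, decode, prompt: str, max_new_tokens: int = 50) -> str:
    model.eval()
    ids = torch.tensor([encode(prompt)], dtype=torch.long)       # (1, seq_len)
    for _ in range(max_new_tokens):
        logits = model(ids)                                      # (1, seq_len, vocab_size)
        next_id = logits[:, -1, :].argmax(dim=-1, keepdim=True)  # most likely next token
        ids = torch.cat([ids, next_id], dim=1)                   # append and continue
    return decode(ids[0].tolist())
```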
Download datasets (example: ChatGPT conversations from Kaggle):

```python
import kagglehub

path = kagglehub.dataset_download("noahpersaud/89k-chatgpt-conversations")
print("Path:", path)
```
Then preprocess:

```bash
python scripts/prepare_dataset.py --input chatlogs.jsonl --output data/word_level_dataset.csv
```

Train the model:

```bash
python3 train.py --epochs 10 --lr 0.0001 --d_model 512 --n_layers 8 --n_heads 8 --dropout 0.1 --save_path Model.pth --print_samples 3 --tie_embeddings
```

Arguments:

- `--epochs`: training epochs
- `--lr`: learning rate
- `--d_model`: embedding dimension
- `--n_layers`: number of Transformer decoder layers
- `--n_heads`: number of attention heads
- `--dropout`: dropout rate
- `--save_path`: path to save the model
- `--print_samples`: print training samples
- `--tie_embeddings`: tie input/output embeddings
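The `--tie_embeddings` option refers to weight tying: the input embedding matrix and the output vocabulary projection share one set of parameters, which reduces model size and often helps smaller models. A minimal sketch of how this is typically wired in PyTorch (the class below is illustrative, not the repository's actual model):

```python
import torch.nn as nn

class TinyLMHead(nn.Module):
    """Illustrative-only module showing input/output embedding tying."""
    def __init__(self, vocab_size: int, d_model: int, tie_embeddings: bool = True):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, d_model)             # input embedding
        self.lm_head = nn.Linear(d_model, vocab_size, bias=False)    # output projection
        if tie_embeddings:
            # Share the same (vocab_size, d_model) weight matrix for both.
            self.lm_head.weight = self.tok_emb.weight

    def forward(self, hidden):            # hidden: (batch, seq, d_model)
        return self.lm_head(hidden)       # logits: (batch, seq, vocab_size)
```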
Clone the pretrained model from Hugging Face:

```bash
git clone https://huggingface.co/anthonyhuang1909/LLM-Engine
```

Includes:

- `Model.pth` – pretrained weights
- `vocab.json` – tokenizer vocabulary
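Loading these files for local use might look roughly like the sketch below; `build_model` and the vocabulary layout are placeholders, since the repository's actual model class and tokenizer format are not documented here.

```python
# Hedged sketch of loading the pretrained artifacts; `build_model` stands in
# for however the repository actually constructs its network.
import json
import torch

with open("LLM-Engine/vocab.json", "r", encoding="utf-8") as f:
    vocab = json.load(f)                  # assumed to be a token-to-id mapping

state_dict = torch.load("LLM-Engine/Model.pth", map_location="cpu")

# model = build_model(vocab_size=len(vocab), d_model=512, n_layers=8, n_heads=8)
# model.load_state_dict(state_dict)
# model.eval()
```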
This project is intended for educational & research purposes.
It demonstrates the principles of Transformer-based models on a smaller scale.
Released under the MIT License.
Last updated: 2025-08-21
