GPT From Scratch

This project demonstrates how to tokenize a dataset and train a GPT-2-style model from scratch. The default parameters have been tested on an ASUS Zephyrus G14 with an RTX 4060 GPU (8 GB VRAM), a 16-core CPU, and 32 GB of RAM.


Installation

conda create -n gpt-scratch python=3.11 -y
conda activate gpt-scratch
pip install torch torchvision torchaudio  # update for your CUDA version
pip install transformers datasets accelerate
pip install tensorboard tqdm pyarrow
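
Before moving on, it's worth checking that the PyTorch install can actually see your GPU (the correct CUDA wheel depends on your driver version; see pytorch.org for the matching install command):

python -c "import torch; print(torch.__version__, torch.cuda.is_available())"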

1. Tokenization

python tokenize.py

Parameters to adjust

  • num_proc → set close to the number of CPU cores on your system.
  • batch_size → increase as much as your RAM capacity allows (see the sketch below).
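
These names match the arguments of Hugging Face datasets' map(), which the script presumably uses. As a rough sketch of what a tokenization step along these lines looks like (the dataset name, column name, and output path below are assumptions for illustration, not necessarily what tokenize.py uses):

# Minimal tokenization sketch -- dataset name, column name, and output
# path are assumed placeholders, not this repo's actual values.
from datasets import load_dataset
from transformers import GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")

def encode(batch):
    # Append the end-of-text token so documents stay separable
    # when sequences are later packed together for training.
    return {"ids": [tokenizer.encode(t) + [tokenizer.eos_token_id]
                    for t in batch["text"]]}

ds = load_dataset("openwebtext", split="train")  # assumed dataset
tokenized = ds.map(
    encode,
    batched=True,
    batch_size=1000,   # increase as much as your RAM allows
    num_proc=16,       # set close to your CPU core count
    remove_columns=ds.column_names,
)
tokenized.save_to_disk("tokenized_openwebtext")  # assumed output path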

2. Pre-training

python pretrain.py

Parameters to adjust

  • BATCH_SIZE → depends on your GPU's VRAM capacity; lower it if you hit out-of-memory errors (see the sketch below).
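
For orientation, a minimal sketch of a GPT-2-style training step with transformers (BATCH_SIZE, the context length, and the learning rate here are placeholders, not the values pretrain.py actually uses):

# Minimal GPT-2-style pretraining sketch -- all hyperparameters are
# placeholders; tune BATCH_SIZE to fit your GPU's VRAM.
import torch
from transformers import GPT2Config, GPT2LMHeadModel

BATCH_SIZE = 8      # lower this if you hit CUDA out-of-memory errors
BLOCK_SIZE = 1024   # GPT-2 context length

device = "cuda" if torch.cuda.is_available() else "cpu"
model = GPT2LMHeadModel(GPT2Config(n_positions=BLOCK_SIZE)).to(device)
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)

def train_step(batch_ids):
    # batch_ids: LongTensor of shape (BATCH_SIZE, BLOCK_SIZE) on `device`.
    # Passing labels=input_ids makes the model compute the shifted
    # next-token cross-entropy loss internally.
    out = model(input_ids=batch_ids, labels=batch_ids)
    out.loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return out.loss.item()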

3. Fine-tuning

Coming soon.
