Initial Release
A pip-installable Python package for fine-tuning Vision Transformer (ViT) models.
Features
- Easy model loading: Pretrained ViT variants (vit_b_16, vit_b_32, vit_l_16)
- Modern training: Mixed precision (AMP), cosine annealing with warmup, early stopping
- CIFAR-10/100 support: Built-in data loaders with proper train/val/test splits
- Evaluation tools: Metrics, confusion matrices, classification reports
- Attention visualization: Interpretable attention maps
- CLI: Train, evaluate, predict, and export models
- ONNX export: Deploy models to production
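As a rough illustration of the "cosine annealing with warmup" item above, the following sketch computes a learning rate per optimizer step using only the standard library. The function name `lr_at_step` and its parameters (`warmup_steps`, `min_lr`) are assumptions made for this example and are not taken from the package's API; the package's actual schedule may differ in its details.

```python
import math


def lr_at_step(step, total_steps, base_lr=1e-4, warmup_steps=500, min_lr=0.0):
    """Learning rate for a 0-indexed optimizer step (hypothetical helper)."""
    if total_steps <= warmup_steps:
        raise ValueError("total_steps must be greater than warmup_steps")
    if step < warmup_steps:
        # Linear warmup: ramp from base_lr / warmup_steps up to base_lr.
        return base_lr * (step + 1) / warmup_steps
    # Cosine annealing: progress goes from 0 to 1 over the remaining steps.
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    progress = min(progress, 1.0)  # hold at min_lr if stepped past the end
    return min_lr + 0.5 * (base_lr - min_lr) * (1 + math.cos(math.pi * progress))


# Sample the schedule at a few points of a 1000-step run with 100 warmup steps.
for s in (0, 50, 99, 100, 550, 1000):
    print(s, lr_at_step(s, total_steps=1000, base_lr=1e-4, warmup_steps=100))
```

With PyTorch, a function like this can be wrapped in `torch.optim.lr_scheduler.LambdaLR` by returning the multiplier `lr / base_lr` rather than the absolute rate.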
Installation
pip install vit-trainer
Quick Start
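from vit_trainer import Trainer, load_model, get_cifar10_loaders
# Build CIFAR-10 data loaders for the train, validation, and test splits
train_loader, val_loader, test_loader = get_cifar10_loaders(batch_size=64)
# Load a pretrained ViT-B/16 configured for the 10 CIFAR-10 classes
model = load_model("vit_b_16", num_classes=10)
# Enable mixed precision (AMP) and train for 10 epochs, validating on val_loader
trainer = Trainer(model, lr=1e-4, use_amp=True)
trainer.fit(train_loader, val_loader, epochs=10)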
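The Quick Start does not show how the validation loader drives the early stopping listed under Features. A minimal sketch of patience-based early stopping on validation loss follows; the `EarlyStopping` class, its `patience` and `min_delta` parameters, and the loss values are all hypothetical and serve only to illustrate the idea, not to describe what `Trainer.fit` does internally.

```python
class EarlyStopping:
    """Patience-based early stopping on validation loss (hypothetical helper)."""

    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience    # epochs to wait without improvement
        self.min_delta = min_delta  # minimum decrease that counts as improvement
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one epoch's validation loss; return True when training should stop."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Made-up validation losses, for illustration only.
losses = [0.90, 0.70, 0.65, 0.66, 0.67, 0.68, 0.60]
stopper = EarlyStopping(patience=3)
stopped_at = None
for epoch, loss in enumerate(losses):
    if stopper.step(loss):
        stopped_at = epoch
        break

print("stopped at epoch", stopped_at, "with best loss", stopper.best)
```

In this run the loop stops before reaching the final, lower loss, which is the usual trade-off: a larger `patience` tolerates longer plateaus at the cost of extra epochs.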