Minimal, production-style baseline for multi-class image classification on the Oxford-IIIT Pets dataset (37 breeds). Uses transfer learning with torchvision ResNet18 ImageNet weights and runs on macOS MPS or CPU.
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
python src/train.py --config configs/default.yaml
Common overrides:
python src/train.py --epochs 10 --batch-size 64 --lr 3e-4 --freeze-epochs 2 --num-workers 0
Best checkpoint is saved to ./checkpoints/best.pt.
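One way command-line overrides like these could be layered on top of config defaults is sketched below. The real `src/train.py` may work differently: `parse_overrides` is a hypothetical helper, and the values in `DEFAULTS` are placeholders standing in for whatever `configs/default.yaml` actually contains.

```python
import argparse

# Placeholder values only; the real defaults live in configs/default.yaml.
DEFAULTS = {"epochs": 5, "batch_size": 32, "lr": 1e-3, "freeze_epochs": 1, "num_workers": 0}


def parse_overrides(argv=None) -> dict:
    """Return DEFAULTS with any flags given on the command line applied on top."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--config", default="configs/default.yaml")
    parser.add_argument("--epochs", type=int)
    parser.add_argument("--batch-size", type=int)
    parser.add_argument("--lr", type=float)
    parser.add_argument("--freeze-epochs", type=int)
    parser.add_argument("--num-workers", type=int)
    args = vars(parser.parse_args(argv))
    # A real script would load the YAML file named here instead of using DEFAULTS.
    args.pop("config")
    config = dict(DEFAULTS)
    # Flags that were not passed stay None and leave the default untouched.
    config.update({key: value for key, value in args.items() if value is not None})
    return config
```

Leaving every flag's default as `None` makes it possible to tell "not passed" apart from "passed with the default value", so the config file stays the single source of defaults.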
python src/eval.py --ckpt checkpoints/best.pt
python src/predict.py --ckpt checkpoints/best.pt --image path/to/image.jpg
Example output:
Top-1: abyssinian (0.9234)
Top-5:
abyssinian (0.9234)
bengal (0.0345)
siamese (0.0121)
ragdoll (0.0098)
birman (0.0076)
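Output in this shape can be produced by applying a softmax to the model's logits and keeping the k largest probabilities. The sketch below is dependency-free for clarity. `src/predict.py` presumably works on torch tensors instead, and the names `softmax`, `top_k`, and `format_prediction` are hypothetical.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(logits, class_names, k=5):
    """Return the k most probable (class_name, probability) pairs, best first."""
    probs = softmax(logits)
    ranked = sorted(zip(class_names, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


def format_prediction(logits, class_names, k=5):
    """Render the Top-1 line followed by the Top-k list."""
    ranked = top_k(logits, class_names, k)
    best_name, best_prob = ranked[0]
    lines = [f"Top-1: {best_name} ({best_prob:.4f})", f"Top-{k}:"]
    lines += [f"{name} ({prob:.4f})" for name, prob in ranked]
    return "\n".join(lines)
```

The probabilities in the Top-5 list need not sum to 1, since the remaining mass is spread over the other 32 breeds.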
The code selects MPS automatically when torch.backends.mps.is_available() returns True; otherwise it falls back to CPU.
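The device selection just described amounts to a check along these lines. This is a minimal sketch: `select_device` is a hypothetical name, and the repository's own helper may be organised differently.

```python
import torch


def select_device() -> torch.device:
    """Prefer Apple's Metal (MPS) backend when available, otherwise use the CPU."""
    if torch.backends.mps.is_available():
        return torch.device("mps")
    return torch.device("cpu")


device = select_device()
# Tensors and models are then moved to the chosen device before use.
batch = torch.zeros(1, 3, 224, 224).to(device)
```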
- Dataset downloads to ./data (not committed)
- Checkpoints saved to ./checkpoints (not committed)
- Grad-CAM visualization
- AMP training
- Optuna hyperparameter search
- Weights & Biases logging
- ONNX export