From-scratch implementations of ML/DL architectures in Rust using Burn. Built for learning — every layer, every gradient, every training loop written by hand.
- No hidden magic — No high-level wrappers hiding what's actually happening
- Performance — Native speed with GPU acceleration via WGPU
- Type safety — Tensor dimensions checked at compile time (see the sketch after this list)
- Learning — If you can build it in Rust, you truly understand it
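As a taste of the compile-time checking, here is a minimal sketch (not code from this repo), assuming a recent Burn release with the `wgpu` feature enabled. Burn's `Tensor<B, D>` carries its rank `D` as a const generic, so passing a tensor of the wrong rank is a compile error rather than a runtime panic; exact constructor signatures vary between Burn versions.

```rust
use burn::backend::Wgpu;
use burn::tensor::Tensor;

type B = Wgpu;

// Only rank-2 tensors (matrices) are accepted here; handing this function
// a Tensor<B, 3> would fail to compile instead of panicking at runtime.
fn project(x: Tensor<B, 2>, w: Tensor<B, 2>) -> Tensor<B, 2> {
    x.matmul(w)
}

fn main() {
    let device = Default::default();
    let x = Tensor::<B, 2>::zeros([4, 16], &device);
    let w = Tensor::<B, 2>::zeros([16, 32], &device);
    println!("{:?}", project(x, w).dims()); // [4, 32]
}
```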
| Architecture | Description | Status |
|---|---|---|
| Transformer | Decoder-only Transformer (GPT-style) trained on Urdu Wikipedia | Done |
A ~10M parameter decoder-only Transformer for Urdu text completion. Includes:
- Multi-head self-attention with causal masking (sketched after this list)
- Pre-LayerNorm transformer blocks
- Byte-level BPE tokenizer (10K vocab; see the byte-level sketch below)
- Full training pipeline with GPU acceleration (WGPU)
- Interactive inference REPL
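The causal mask is what makes the decoder autoregressive: position i may only attend to positions up to i. Below is a minimal plain-Rust sketch of the idea (deliberately not the repo's Burn implementation): scores for future positions are set to negative infinity before the softmax, so they receive zero attention weight.

```rust
// Mask out future positions in a square attention-score matrix.
fn apply_causal_mask(scores: &mut [Vec<f32>]) {
    let n = scores.len();
    for i in 0..n {
        for j in (i + 1)..n {
            scores[i][j] = f32::NEG_INFINITY; // softmax turns this into 0
        }
    }
}

// Numerically stable softmax over one row of scores.
fn softmax(row: &[f32]) -> Vec<f32> {
    let max = row.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    let exps: Vec<f32> = row.iter().map(|&s| (s - max).exp()).collect();
    let sum: f32 = exps.iter().sum();
    exps.iter().map(|&e| e / sum).collect()
}

fn main() {
    // Uniform scores for a length-3 sequence, before masking.
    let mut scores = vec![vec![0.5_f32; 3]; 3];
    apply_causal_mask(&mut scores);
    for row in &scores {
        println!("{:?}", softmax(row)); // masked positions get weight 0.0
    }
}
```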
Trained on Urdu Wikipedia, generates Wikipedia-style Urdu prose.
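One reason a byte-level tokenizer suits Urdu well: the base alphabet is the 256 possible UTF-8 byte values, so Urdu script needs no special character table before BPE merges are learned. A tiny illustration of that starting point (not the repo's tokenizer; the merge step is omitted):

```rust
fn main() {
    // "Urdu" written in Urdu script: 4 characters, each 2 bytes in UTF-8.
    let text = "اردو";
    let bytes = text.as_bytes();
    // Byte-level BPE starts from these raw bytes (8 base tokens here) and
    // then learns merges over frequent byte pairs.
    println!("{} chars -> {} byte-level tokens", text.chars().count(), bytes.len());
}
```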
Model on Hugging Face: Ibzie/Urdu-Completion-Transformer-10M
See the Transformer README for full details.
Each implementation is a standalone Rust project. To run one:
cd Transformer
cargo run --release --bin train # Train the model
cargo run --release --bin infer # Run inference

Requirements:
- Rust 1.70+
- GPU with Vulkan/Metal/DX12 support (for WGPU backend)
Contributions and new architecture implementations are welcome! If you'd like to add an implementation:
- Create a new directory for the architecture
- Include a README with architecture details and usage
- Open a PR
Stars are appreciated if you find this useful.
MIT - see LICENSE
Ibrahim Akhtar (@Ibzie)