

v0.2.0 - Multimodal Architecture & Training Optimization

25 Jan 00:38 · c8e8d06


🚀 Major Release: Multimodal Transformer with Training Optimization

Core Features

  • Multimodal Architecture: Unified text, vision, and audio embedding space
  • BitNet 1.58-bit Quantization: Ternary weights {-1, 0, +1} trained with a straight-through estimator (STE); see the sketch after this list
  • Training Optimization: VSA compression, ternary math, gradient prediction
  • LoRA/QLoRA Fine-tuning: Train 40B-parameter models on a 16 GB GPU
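
To make the quantization bullet concrete, here is a minimal PyTorch sketch of BitNet b1.58-style ternary quantization with a straight-through estimator. The absmean scaling follows the BitNet b1.58 paper; the function names are illustrative, not tritter's actual API.

```python
import torch

def ternary_quantize(w: torch.Tensor) -> torch.Tensor:
    """Round weights to {-1, 0, +1} using per-tensor absmean scaling."""
    scale = w.abs().mean().clamp(min=1e-5)   # absmean scale, guarded against zero
    q = (w / scale).round().clamp(-1, 1)     # ternary values {-1, 0, +1}
    return q * scale                         # rescaled weights for the forward pass

def ternary_ste(w: torch.Tensor) -> torch.Tensor:
    """Straight-through estimator: quantized forward, identity backward.

    Gradients flow to the latent full-precision weights unchanged.
    """
    return w + (ternary_quantize(w) - w).detach()
```

In training, a layer would call something like `ternary_ste(self.weight)` in its forward pass, so the forward computation sees only ternary values while the latent full-precision weights keep receiving gradients.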

New Components

  • Tokenization: BPE (tiktoken) + AST-aware code tokenization (tree-sitter)
  • Vision: SigLIP encoder + VQ-VAE image tokenizer
  • Audio: EnCodec-style audio tokenization
  • Curation: Dataset quality gates that screen training data for security and content-quality issues
  • Embedding: KNN/VQ rounding for the embedding-prediction paradigm (see the sketch after this list)
  • Optimization: Phase-based training (FULL → PREDICT → CORRECT cycles)
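
The KNN/VQ rounding bullet refers to snapping a continuously predicted embedding back onto a discrete codebook. A minimal sketch, assuming an L2 nearest-neighbor lookup over a fixed embedding table (the function name is hypothetical, not tritter's API):

```python
import torch

def knn_round(pred: torch.Tensor, codebook: torch.Tensor) -> torch.Tensor:
    """Map predicted embeddings to their nearest codebook entries.

    pred:     (batch, dim) continuous embeddings predicted by the model
    codebook: (vocab, dim) table of valid token embeddings
    Returns:  (batch,) indices of the nearest entries by L2 distance
    """
    dists = torch.cdist(pred, codebook)  # (batch, vocab) pairwise distances
    return dists.argmin(dim=-1)          # index of the closest embedding
```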

Infrastructure

  • RTX 5080 (Blackwell) GPU support
  • Python 3.13 compatibility
  • HuggingFace Hub integration (example snippet after this list)
  • Complete training pipeline with data preparation
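
For the Hub integration, pulling a checkpoint would look roughly like the following. The repo id below is a placeholder, not a confirmed artifact; `snapshot_download` is the standard `huggingface_hub` call.

```python
from huggingface_hub import snapshot_download

# Download all files of a model repo to the local cache.
# "tzervas/tritter-125m" is a hypothetical repo id for illustration.
local_dir = snapshot_download(repo_id="tzervas/tritter-125m")
print(f"Checkpoint downloaded to {local_dir}")
```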

Model Sizes

| Size | Params | Packed Weights | Recommended VRAM |
|------|--------|----------------|------------------|
| test | ~10M   | ~2 MB          | Any              |
| 125M | 125M   | ~29 MB         | 8 GB             |
| 350M | 350M   | ~82 MB         | 8 GB             |
| 1B   | 1.1B   | 261 MB         | 8 GB             |
| 7B   | 6.2B   | 1.45 GB        | 16 GB            |
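
The packed-weight column is consistent with 2 bits per ternary parameter. A quick sanity check (my arithmetic, assuming binary MiB/GiB units):

```python
# Ternary weights pack into 2 bits each: bytes = params * 2 / 8.
for name, params in [("1B", 1.1e9), ("7B", 6.2e9)]:
    packed = params * 2 / 8
    print(name, f"{packed / 2**20:.0f} MiB")  # 1B -> ~262 MiB, 7B -> ~1478 MiB (~1.44 GiB)
```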

Standalone Modules

Installation

pip install tritter

Test Results

  • 600+ tests passing
  • Verified on RTX 5080 (16 GB)