Run a voice agent with under 400 ms latency on just 4 GB of VRAM. Fully offline, no API keys required. Optimized for the GTX 1650 and edge robotics, with zero-copy inference. (Apache 2.0)
speech-synthesis python-3 nvidia-cuda rag voice-agent quantizations offline-ai sherpa-onnx local-llm-integration ai-voice-agent edge-ai-agents local-voice-agent edge-robotics-voice-assistant setfit-model setfit-training sentence-transformers-finetune llama-fine-tune ollama-voice-agent kokora-tts parakeet-model
Updated Feb 15, 2026 - Python