Simple example of real-time local video chat with llama.cpp
How to:
- Install llama.cpp -> https://github.com/ggml-org/llama.cpp
- Run a model (tested on Mac):
  - Real-time model (quickest): llama-server -hf ggml-org/SmolVLM-500M-Instruct-GGUF
  - Gemma 3 4B (best ratio of answer quality to response time): llama-server -hf ggml-org/gemma-3-4b-it-GGUF
- Download visual-local-chat.html and open it in your browser (a sketch of how the page talks to the server is shown below)
- Start chatting :)
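
For reference, here is a minimal sketch (not the actual contents of visual-local-chat.html) of how a page like it can drive the chat: it grabs a webcam frame, encodes it as a base64 JPEG, and POSTs it together with a text prompt to llama-server's OpenAI-compatible /v1/chat/completions endpoint, assumed here to be running at the default http://localhost:8080.

```javascript
// Sketch only: assumes llama-server is running multimodal inference on
// http://localhost:8080 and the page contains a <video> element.

// Grab one webcam frame and encode it as a base64 JPEG data URL.
async function captureFrame(videoEl) {
  const canvas = document.createElement("canvas");
  canvas.width = videoEl.videoWidth;
  canvas.height = videoEl.videoHeight;
  canvas.getContext("2d").drawImage(videoEl, 0, 0);
  return canvas.toDataURL("image/jpeg", 0.8);
}

// Send the frame plus a text prompt to the OpenAI-compatible endpoint
// and return the model's reply.
async function askModel(imageDataUrl, prompt) {
  const response = await fetch("http://localhost:8080/v1/chat/completions", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      max_tokens: 128,
      messages: [{
        role: "user",
        content: [
          { type: "text", text: prompt },
          { type: "image_url", image_url: { url: imageDataUrl } },
        ],
      }],
    }),
  });
  const data = await response.json();
  return data.choices[0].message.content;
}

// Example loop: ask the model to describe the camera view about once per second.
async function start() {
  const video = document.querySelector("video");
  video.srcObject = await navigator.mediaDevices.getUserMedia({ video: true });
  await video.play();
  setInterval(async () => {
    const frame = await captureFrame(video);
    console.log(await askModel(frame, "What do you see?"));
  }, 1000);
}
```

The interval reflects the trade-off in the list above: the 500M real-time model can keep up with a shorter interval, while Gemma 3 4B typically needs a longer one.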
