What's New in v0.9.3
API Gateway Mode (afm -w -g)
- Auto-discovers and proxies to Ollama, LM Studio, Jan, llama.cpp, and other local LLM backends
- Unified model selector — all backend models appear in a single dropdown
- Model info strip — shows backend name, capabilities (Vision, Tools), and context window size
- LM Studio loaded state detection — correctly identifies loaded vs unloaded models
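Gateway auto-discovery can be pictured as probing the well-known local ports of each backend's OpenAI-compatible `/v1/models` endpoint. This is an illustrative sketch under assumed default ports (Ollama 11434, LM Studio 1234, llama.cpp 8080), not AFM's actual discovery code:

```python
import json
import urllib.request

# Assumed default ports for common local LLM backends (illustrative only;
# the real gateway may probe differently or cover more backends).
KNOWN_BACKENDS = {
    "Ollama": "http://localhost:11434/v1/models",
    "LM Studio": "http://localhost:1234/v1/models",
    "llama.cpp": "http://localhost:8080/v1/models",
}

def discover_backends(timeout=0.5):
    """Probe each well-known endpoint; return {backend: [model ids]}."""
    found = {}
    for name, url in KNOWN_BACKENDS.items():
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                data = json.load(resp)
            found[name] = [m["id"] for m in data.get("data", [])]
        except OSError:
            continue  # backend not running on this port
    return found
```

The unified dropdown is then just the union of all discovered model lists, each tagged with its backend name.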
Reasoning Model Support
- GPT-OSS / DeepSeek / Qwen reasoning — normalizes the reasoning field to reasoning_content for WebUI compatibility
- <think> tag extraction — extracts reasoning from <think>...</think> blocks in streaming responses
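Extracting `<think>` blocks from a stream is trickier than a regex, because a tag can be split across chunks. A minimal stateful parser might look like this (a simplified sketch, not AFM's actual implementation; it holds back text that could be the start of a partial tag until more data arrives):

```python
class ThinkTagExtractor:
    """Incrementally split streamed text into (reasoning, content),
    where reasoning is whatever falls inside <think>...</think>."""

    OPEN, CLOSE = "<think>", "</think>"

    def __init__(self):
        self.buffer = ""
        self.in_think = False

    def feed(self, chunk):
        """Feed one stream chunk; return (reasoning_delta, content_delta)."""
        self.buffer += chunk
        reasoning, content = [], []
        while self.buffer:
            tag = self.CLOSE if self.in_think else self.OPEN
            out = reasoning if self.in_think else content
            pos = self.buffer.find(tag)
            if pos == -1:
                # Emit everything except a possible partial tag at the end.
                safe = max(len(self.buffer) - len(tag) + 1, 0)
                out.append(self.buffer[:safe])
                self.buffer = self.buffer[safe:]
                break
            out.append(self.buffer[:pos])
            self.buffer = self.buffer[pos + len(tag):]
            self.in_think = not self.in_think
        return "".join(reasoning), "".join(content)
```

Each streamed delta is fed through the extractor; the reasoning part is then routed to reasoning_content so the WebUI renders it as a collapsible thought block.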
WebUI Enhancements
- Apple Intelligence branding in single-model mode with SF Symbol icon
- Startup flash fix — page hidden until branding applied, no more llama.cpp → AFM transition flicker
- Single-model mode fix — model selector dropdown no longer pops up when only Foundation model is available
- llama.cpp webui subtitle — shows "llama.cpp webui" under AFM heading
Streaming & Stats
- stream_options.include_usage sent to all backends (LM Studio, Ollama, Jan) for real token counts
- Estimated token counting fallback for backends without usage data
- Native streaming for Apple Foundation Model
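The usage flow above can be sketched as: prefer the real usage object a backend reports (when stream_options.include_usage is honored), and only fall back to an estimate otherwise. The ~4-characters-per-token heuristic below is an assumption for illustration, not necessarily AFM's exact fallback formula:

```python
def usage_from_stream(chunks, prompt_text, completion_text):
    """Return real usage from the stream if any chunk carries it,
    else a rough character-count estimate (~4 chars per token)."""
    for chunk in chunks:
        usage = chunk.get("usage")
        if usage:
            return usage  # backend honored stream_options.include_usage
    return {
        "prompt_tokens": max(1, len(prompt_text) // 4),
        "completion_tokens": max(1, len(completion_text) // 4),
    }
```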
Other
- pip install — pip install macafm now available as an alternative to Homebrew
- Pre-warm — model pre-warmed on server startup for faster first response
- Multimodal OCR — vision support for Foundation Model
- Fix: WebUI bundled in pip package — afm -w now correctly opens the browser when installed via pip
Installation
# Homebrew
brew tap scouzi1966/afm
brew install afm
# Upgrade (if previously installed via Homebrew)
brew update
brew upgrade afm
# pip
pip install macafm
# Upgrade (if previously installed via pip)
pip install --upgrade macafm
Quick Start
afm # API server only
afm -w # API server + WebUI
afm -w -g # WebUI + gateway (auto-discovers all local backends)
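Once the server is running, it can be exercised with any OpenAI-style client. A minimal sketch is below; the base URL/port and the "foundation" model name are placeholders (assumptions), use whatever address and model id afm reports on startup:

```python
import json
import urllib.request

# Placeholder — substitute the address afm prints when it starts.
BASE_URL = "http://localhost:9999/v1"

def chat_request(model, prompt):
    """Build an OpenAI-compatible streaming chat request with usage stats."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": True,
        "stream_options": {"include_usage": True},
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )

if __name__ == "__main__":
    req = chat_request("foundation", "Hello!")
    # urllib.request.urlopen(req)  # uncomment with the server running
```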