DGX Spark - Multi-Model LLM Serving

Local LLM infrastructure for DGX Spark (GB10 Blackwell) with vLLM, web UI, and model management. Works with 1 or 2 DGX Sparks.

⭐ If you find this repo useful, please give it a star - that's all I ask. Thanks! :D

Quick Start

./start-all.sh

Then open the Dashboard at http://localhost:5173 and start a model.

Chat: http://localhost:5173/chat

To stop all services: ./start-all.sh --stop

Features

  • Web Dashboard - Start/stop models, GPU monitoring, chat interface
  • 7 Models - Code, vision, and reasoning models, plus a 235B model served across two nodes
  • Tool Calling - Web search + sandboxed code execution
  • OpenAI API - Compatible endpoints on ports 8100-8235 (example below)
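
For example, a minimal chat completion against the Qwen3-Coder endpoint on port 8104. vLLM serves the standard OpenAI routes; the model name in the payload is an assumption here - verify it against /v1/models first:

# Minimal chat completion (model name is an assumption - check /v1/models)
curl http://localhost:8104/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "Qwen3-Coder-30B-AWQ", "messages": [{"role": "user", "content": "Hello"}]}'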

Screenshots

[Dashboard screenshot]

[Chat screenshot]

Models

Model                 Port   Best For
Qwen3-Coder-30B-AWQ   8104   Code + tools (recommended)
Qwen3-235B-AWQ        8235   Large tasks (2-node)
Qwen2-VL-7B           8101   Vision
Nemotron-3-Nano-30B   8105   Reasoning
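
Each endpoint reports the exact name of the model it serves through vLLM's standard /v1/models route, which is useful before wiring up clients:

# List the model served on a given port (8104 shown)
curl http://localhost:8104/v1/models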

Technical Reference

For Claude Code and developers

Services

Start all services: ./start-all.sh (recommended)

Service         Port   Manual Start
Web GUI         5173   cd web-gui && ./start-docker.sh
Model Manager   5175   cd model-manager && ./serve.sh
Tool Sandbox    5176   cd tool-call-sandbox && ./serve.sh
SearXNG         8080   cd searxng-docker && docker compose up -d

Key Files

  • models.yaml - All model configurations
  • shared/auth.py - API authentication (Bearer token via DGX_API_KEY; see the example below)
  • vllm-*/serve.sh - Model startup scripts
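
When DGX_API_KEY is set, clients must authenticate. A sketch, assuming the key itself is passed as the Bearer token:

# Authenticated request - assumes DGX_API_KEY doubles as the Bearer token
curl http://localhost:8104/v1/models \
  -H "Authorization: Bearer $DGX_API_KEY"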

Environment Variables

Variable         Purpose
DGX_API_KEY      Enable API authentication
DGX_RATE_LIMIT   Requests/min per IP (default: 60)
DGX_LOG_LEVEL    Log level: debug, info, warning, error (default: info)
HF_TOKEN         HuggingFace access token
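
A typical launch exports these before running the start script; a sketch with placeholder values:

export DGX_API_KEY="change-me"    # enable Bearer-token auth
export DGX_RATE_LIMIT=120         # raise the per-IP limit (default 60)
export DGX_LOG_LEVEL=debug        # verbose logging from the start
export HF_TOKEN="hf_xxx"          # needed to pull gated models
./start-all.sh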

Runtime Configuration

# Check current log level
curl http://localhost:5175/api/config/log-level

# Enable debug logging (no restart needed)
curl -X POST http://localhost:5175/api/config/log-level \
  -H "Content-Type: application/json" -d '{"level": "debug"}'

Architecture

  • Frontend: React + Vite (web-gui/)
  • APIs: FastAPI with shared auth middleware
  • Models: vLLM in Docker with CORS enabled
  • Sandbox: Seccomp + capabilities + non-root execution (illustrative flags below)
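
The sandbox hardening named above maps onto standard Docker flags. An illustrative docker run sketch - image name and profile path are placeholders, not the repository's actual invocation:

# Hypothetical invocation showing the hardening layers:
#   seccomp profile  - restricts the syscalls the sandbox can make
#   --cap-drop=ALL   - drops every Linux capability
#   --user           - runs the workload as a non-root UID:GID
docker run --rm \
  --security-opt seccomp=seccomp-profile.json \
  --cap-drop=ALL \
  --user 1000:1000 \
  tool-call-sandbox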