CLI agent benchmarker dashboard. Run multiple coding agents on the same task, watch their terminals live, and compare output side by side.
- Run `amp`, `opencode`, `claude`, `codex`, `pi`, and `droid` in parallel
- WebSocket-driven PTY streaming for live terminal output
- Explicit global stop path via `POST /stop` with a shutdown ladder (Ctrl-C, Ctrl-C, SIGTERM, SIGKILL)
- Per-agent model selectors sourced from the model config
- Dark, monospace-first UI with `ghostty-web` terminals

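The shutdown ladder listed above can be sketched roughly as follows. This is a minimal illustration, not the project's actual implementation: the `StoppableProcess` interface, the `runStopLadder` function, and the 1500 ms grace period are all hypothetical names and values chosen for the example.

```typescript
// Hypothetical sketch of an escalating stop sequence. All names and the
// default grace period are assumptions, not taken from the project source.

type LadderStep =
  | { kind: "write"; data: string }
  | { kind: "signal"; signal: "SIGTERM" | "SIGKILL" };

// Ctrl-C typed into a terminal is the ETX byte (0x03).
const CTRL_C = "\x03";

// The order from the feature list: Ctrl-C, Ctrl-C, SIGTERM, SIGKILL.
const STOP_LADDER: LadderStep[] = [
  { kind: "write", data: CTRL_C },
  { kind: "write", data: CTRL_C },
  { kind: "signal", signal: "SIGTERM" },
  { kind: "signal", signal: "SIGKILL" },
];

// Minimal view of a PTY-backed process (hypothetical interface).
interface StoppableProcess {
  write(data: string): void;
  kill(signal: string): void;
  hasExited(): boolean;
}

// Walk the ladder, waiting graceMs after each step and stopping early
// once the process has exited. Returns the number of steps applied.
async function runStopLadder(
  proc: StoppableProcess,
  graceMs = 1500,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<number> {
  let stepsUsed = 0;
  for (const step of STOP_LADDER) {
    if (proc.hasExited()) break;
    if (step.kind === "write") {
      proc.write(step.data);
    } else {
      proc.kill(step.signal);
    }
    stepsUsed++;
    await sleep(graceMs);
  }
  return stepsUsed;
}
```

Escalating in this order gives an agent two chances to clean up through its own Ctrl-C handler before the process is signalled directly, and SIGKILL is only reached if everything gentler was ignored.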
bun install
bun run dev

Open http://localhost:3000.

bun run dev # UI + PTY server
bun run ui # UI only (Vite on :3000)
bun run pty # PTY websocket server on :4000
bun run build # production build
bun run preview # preview built app
bun run start # start output server (.output/server/index.mjs)
bun run test # vitest
bun run lint # eslint
bun run format # prettier
bun run check # format + lint

- Bun
- Agent CLIs installed and available on `PATH`: `amp`, `droid`, `pi`, `codex`, `claude`, `opencode`
- Git

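To check the agent CLI requirement before a run, a small helper along these lines could report which binaries are missing. This is a sketch only: `missingAgents` and its injected `exists` callback are hypothetical, and the `/` join assumes a POSIX-style `PATH`.

```typescript
// Hypothetical pre-flight check; not part of the project's actual code.
// The agent names come from the requirements list above.
const AGENTS = ["amp", "droid", "pi", "codex", "claude", "opencode"] as const;

// Return the agents that cannot be found in any of the given PATH
// directories. `exists` is injected so the lookup stays testable; in a
// real script it could wrap a filesystem existence check.
function missingAgents(
  pathDirs: string[],
  exists: (candidate: string) => boolean,
): string[] {
  return AGENTS.filter(
    (agent) => !pathDirs.some((dir) => exists(`${dir}/${agent}`)),
  );
}
```

In a Bun or Node script, `pathDirs` could come from splitting `process.env.PATH` on `:`, with `existsSync` from `node:fs` passed as `exists`. Under Bun, `Bun.which(name)` is a shorter alternative for a single lookup.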
- Frontend: TanStack Start routes and root shell
- WebSocket client connects to `ws://localhost:4000/vt`
- Dashboard UI renders agent cards and live terminals
- Backend server spawns PTYs, streams base64 output, exposes `/diff` and `/stop`
- Theme system controls dark/light styles and tokens

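As a rough illustration of the client side of this flow, the sketch below decodes a base64 payload into raw terminal bytes. It assumes each WebSocket message is a bare base64 string. The actual wire format is not documented here and may wrap the payload (for example in JSON with an agent id), so treat `decodeChunk` and `handleMessage` as hypothetical helpers.

```typescript
// Hypothetical client-side helpers. Assumes every message on
// ws://localhost:4000/vt is a plain base64 string, which may not match
// the real protocol.

// Decode one base64 chunk into the raw bytes the PTY produced.
function decodeChunk(b64: string): Uint8Array {
  const binary = atob(b64);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return bytes;
}

// Forward string messages to a terminal sink and ignore anything else.
function handleMessage(
  data: unknown,
  onBytes: (bytes: Uint8Array) => void,
): void {
  if (typeof data === "string") {
    onBytes(decodeChunk(data));
  }
}

// Possible wiring in the browser (writeToTerminal is a placeholder):
//   const ws = new WebSocket("ws://localhost:4000/vt");
//   ws.onmessage = (event) => handleMessage(event.data, writeToTerminal);
```

Decoding to bytes rather than to a string keeps escape sequences and partial UTF-8 characters intact until the terminal emulator consumes them.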
- Wire model selection into backend runs
- Add per-agent stop controls (not just global STOP)
- Persist run metrics/logs for comparison history
