Releases: vercel-labs/agent-eval
@vercel/agent-eval@0.0.12
Patch Changes
- #18 85bfb21 Thanks @paoloricciuti! - feat: add `editPrompt` config to experiment
@vercel/agent-eval-playground@0.0.5
Patch Changes
- 6159d01 Thanks @allenzhou101! - Run playground in production mode (`next start`) instead of dev mode (`next dev`) to fix React version conflicts and "Cannot read properties of null (reading 'useInsertionEffect')" errors when running via `npx`.
@vercel/agent-eval-playground@0.0.4
Patch Changes
- 23e2d43 Thanks @allenzhou101! - Add `repository` field to package.json to fix npm provenance verification error during publishing.
v0.0.11
Patch Changes
- 558abe5 Thanks @paoloricciuti! - feat: accept array of models in experiment #10
v0.0.8
v0.0.7
What's Changed
Fixes
- CLI loads `.env.local`: CLI now loads `.env.local` before `.env` (matching integration test behavior)
- CLI version from package.json: Version is now read dynamically instead of being hardcoded
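To illustrate why load order matters for the `.env.local` fix above, here is a minimal sketch. It assumes a first-loaded-wins rule (the default behavior of the `dotenv` package, which does not overwrite variables that are already set); the notes do not say which loader the CLI uses. `parseEnv` and `mergeInLoadOrder` are hypothetical helpers, not part of agent-eval.

```typescript
// Hypothetical sketch: none of these names come from agent-eval itself.
type EnvMap = Record<string, string>;

// Own-property check that is not fooled by inherited keys such as "toString".
function hasKey(map: EnvMap, key: string): boolean {
  return Object.prototype.hasOwnProperty.call(map, key);
}

// Parse simple KEY=VALUE lines, skipping blank lines and # comments.
function parseEnv(text: string): EnvMap {
  const out: EnvMap = {};
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line.charAt(0) === "#") continue;
    const eq = line.indexOf("=");
    if (eq <= 0) continue; // no key or no "=": ignore the line
    out[line.slice(0, eq).trim()] = line.slice(eq + 1).trim();
  }
  return out;
}

// Merge sources in load order; a key set by an earlier source is never overwritten
// (assumed first-loaded-wins rule).
function mergeInLoadOrder(sources: EnvMap[]): EnvMap {
  const result: EnvMap = {};
  for (const source of sources) {
    for (const key of Object.keys(source)) {
      if (!hasKey(result, key)) result[key] = source[key];
    }
  }
  return result;
}

// .env.local is loaded first, so its API_KEY takes precedence over the one in .env.
const localVars = parseEnv("API_KEY=local-key\n");
const sharedVars = parseEnv("# shared defaults\nAPI_KEY=shared-key\nREGION=us-east-1\n");
const resolved = mergeInLoadOrder([localVars, sharedVars]);
```

Under that assumption, values in `.env.local` override `.env`, while keys that appear only in `.env` still come through.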
Changes
- Default timeout increased: Changed from 5 minutes to 10 minutes (600s)
v0.0.6
What's Changed
Fixes
- [OpenCode] Fix model format: OpenCode now requires `vercel/` prefix in model strings (e.g., `vercel/minimax/minimax-m2.1`)
- [Docker] Fix file permissions: Added `chown` after file upload so agents can edit files in the sandbox
- Timeout enforcement: Added `Promise.race` at runner level and signal abort to agent on timeout for proper resource cleanup
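The timeout enforcement item above combines two standard pieces: `Promise.race` to stop waiting, and an `AbortSignal` to tell the agent to stop working. The sketch below shows one common way to wire them together. It is not the actual runner code; `runWithTimeout` and `AgentTask` are hypothetical names chosen for illustration.

```typescript
// Hypothetical sketch: `runWithTimeout` and `AgentTask` are illustrative names,
// not agent-eval APIs.
type AgentTask<T> = (signal: AbortSignal) => Promise<T>;

async function runWithTimeout<T>(task: AgentTask<T>, timeoutMs: number): Promise<T> {
  const controller = new AbortController();
  let timer: ReturnType<typeof setTimeout> | undefined;

  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => {
      // Reject first so the timeout error is the one the caller sees,
      // then signal the task so it can release its resources.
      reject(new Error(`Timed out after ${timeoutMs}ms`));
      controller.abort();
    }, timeoutMs);
  });

  try {
    // Whichever settles first wins: the task's result or the timeout rejection.
    return await Promise.race([task(controller.signal), timeout]);
  } finally {
    // Clear the timer so a fast task does not leave it pending.
    if (timer !== undefined) clearTimeout(timer);
  }
}
```

`Promise.race` alone only stops the caller from waiting; the losing task keeps running. Passing the signal is what gives the task a chance to shut down, for example by listening for the `abort` event and closing its sandbox or child process.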
New
- Minimax model support: Added `vercel/minimax/minimax-m2.1` to supported models
- Parallel integration tests: All agents now tested on both Docker and Vercel sandboxes concurrently
- [Config] Sandbox backend selection: Moved sandbox backend selection from env var to experiment config (`sandbox: 'docker' | 'vercel' | 'auto'`)
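As a rough illustration of the `sandbox` option, the union below is copied from the note above; everything else is assumed. `ExperimentConfigSketch` and `parseSandbox` are hypothetical names, and the real experiment config has other fields that these notes do not describe.

```typescript
// The union is taken from the release note; the rest is a hypothetical illustration.
type SandboxBackend = "docker" | "vercel" | "auto";

const SANDBOX_BACKENDS: readonly string[] = ["docker", "vercel", "auto"];

// Hypothetical shape: only the field mentioned in the release note is shown.
interface ExperimentConfigSketch {
  sandbox: SandboxBackend;
}

// Narrow an untrusted value (for example, one read from JSON) to the union,
// failing loudly on typos instead of silently picking a backend.
function parseSandbox(value: unknown): SandboxBackend {
  if (typeof value === "string" && SANDBOX_BACKENDS.indexOf(value) !== -1) {
    return value as SandboxBackend;
  }
  throw new Error(`Invalid sandbox backend: ${String(value)}`);
}

const sketchConfig: ExperimentConfigSketch = { sandbox: parseSandbox("docker") };
```

Typing the field as a string union lets the compiler reject unsupported backends in a TypeScript config file, which an environment variable could not do.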
Documentation
- Updated README with correct model format and examples