Releases · vercel-labs/agent-eval · GitHub

06 Feb 21:11

@vercel/agent-eval@0.0.12

Patch Changes

#18 85bfb21 Thanks @paoloricciuti! - feat: add editPrompt config to experiment

06 Feb 22:39

@vercel/agent-eval-playground@0.0.5

Patch Changes

6159d01 Thanks @allenzhou101! - Run playground in production mode (next start) instead of dev mode (next dev) to fix React version conflicts and "Cannot read properties of null (reading 'useInsertionEffect')" errors when running via npx.

06 Feb 22:33

@vercel/agent-eval-playground@0.0.4

Patch Changes

23e2d43 Thanks @allenzhou101! - Add repository field to package.json to fix npm provenance verification error during publishing.

05 Feb 18:14

v0.0.11

Patch Changes

558abe5 Thanks @paoloricciuti! - feat: accept array of models in experiment #10
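
As a rough illustration of what accepting an array of models can mean for an experiment, the sketch below expands one experiment into one planned run per model. The `ExperimentSketch` and `PlannedRun` shapes, the `model` field name, and `expandModels` are all hypothetical; the release note does not describe the actual config schema.

```typescript
// Hypothetical shapes: the real experiment config may use different field names.
interface ExperimentSketch {
  agent: string;
  model: string | string[]; // a single model or a list of models
}

interface PlannedRun {
  agent: string;
  model: string;
}

// Normalize the model field to an array and plan one run per model.
function expandModels(experiment: ExperimentSketch): PlannedRun[] {
  const models = Array.isArray(experiment.model)
    ? experiment.model
    : [experiment.model];
  return models.map((model) => ({ agent: experiment.agent, model }));
}

// Usage: "vercel/example/model-b" is a placeholder, not a real model ID.
const runs = expandModels({
  agent: "opencode",
  model: ["vercel/minimax/minimax-m2.1", "vercel/example/model-b"],
});
console.log(runs.length); // 2
```

Normalizing to an array up front keeps the single-model form working unchanged, so existing configs would not need to be edited.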

04 Feb 00:11

gaojude

v0.0.8

Version 0.0.8

03 Feb 20:58

gaojude

v0.0.7

What's Changed

Fixes

CLI loads .env.local: CLI now loads .env.local before .env (matching integration test behavior)
CLI version from package.json: Version is now read dynamically instead of being hardcoded
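
The `.env.local` fix above is about load order. A minimal sketch of why order matters, assuming a loader that never overwrites a variable that is already set (the default behavior of the `dotenv` package; whether agent-eval uses `dotenv` is not stated here). `parseEnv` and `loadEnvFiles` are hypothetical helpers that work on file contents rather than on disk, so the example stays self-contained.

```typescript
// Hypothetical helpers: illustrative names, not agent-eval APIs.
type Env = Record<string, string | undefined>;

// Parse simple KEY=VALUE lines, skipping blanks and # comments.
function parseEnv(contents: string): Record<string, string> {
  const parsed: Record<string, string> = {};
  for (const line of contents.split(/\r?\n/)) {
    const trimmed = line.trim();
    if (trimmed === "" || trimmed.startsWith("#")) continue;
    const eq = trimmed.indexOf("=");
    if (eq <= 0) continue; // no key before "=", or no "=" at all
    parsed[trimmed.slice(0, eq).trim()] = trimmed.slice(eq + 1).trim();
  }
  return parsed;
}

// Apply files in order; a key set by an earlier file is never overwritten.
function loadEnvFiles(filesInOrder: string[], env: Env = {}): Env {
  for (const contents of filesInOrder) {
    for (const [key, value] of Object.entries(parseEnv(contents))) {
      if (env[key] === undefined) env[key] = value;
    }
  }
  return env;
}

// Usage: .env.local is loaded first, so its API_KEY takes precedence.
const envLocal = "API_KEY=local-key\n";
const envDefault = "# defaults\nAPI_KEY=shared-key\nREGION=iad1\n";
const env = loadEnvFiles([envLocal, envDefault]);
console.log(env.API_KEY, env.REGION); // local-key iad1
```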

Changes

Default timeout increased: Changed from 5 minutes to 10 minutes (600s)

03 Feb 20:15

gaojude

v0.0.6

What's Changed

Fixes

[OpenCode] Fix model format: OpenCode now requires vercel/ prefix in model strings (e.g., vercel/minimax/minimax-m2.1)
[Docker] Fix file permissions: Added chown after file upload so agents can edit files in the sandbox
Timeout enforcement: Added Promise.race at runner level and signal abort to agent on timeout for proper resource cleanup
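
The timeout fix above combines two mechanisms: `Promise.race` so the runner stops waiting, and an abort signal so the agent can release resources. A minimal sketch of that pattern follows; `runWithTimeout`, `AgentRun`, and `TimeoutError` are hypothetical names and do not reflect the actual runner code.

```typescript
// Hypothetical helper: names and signature are illustrative, not taken from agent-eval.
type AgentRun<T> = (signal: AbortSignal) => Promise<T>;

class TimeoutError extends Error {
  constructor(ms: number) {
    super(`Agent run exceeded ${ms} ms`);
    this.name = "TimeoutError";
  }
}

async function runWithTimeout<T>(run: AgentRun<T>, timeoutMs: number): Promise<T> {
  const controller = new AbortController();
  let timer: ReturnType<typeof setTimeout> | undefined;

  // Rejects once the deadline passes and signals the agent to stop.
  const deadline = new Promise<never>((_, reject) => {
    timer = setTimeout(() => {
      controller.abort();
      reject(new TimeoutError(timeoutMs));
    }, timeoutMs);
  });

  try {
    // Whichever settles first wins: the agent's result or the deadline.
    return await Promise.race([run(controller.signal), deadline]);
  } finally {
    // Clear the timer so a finished run does not keep the process alive.
    clearTimeout(timer);
  }
}
```

`Promise.race` alone only stops the caller from waiting; the losing promise keeps running. Passing the signal into the agent is what gives it a chance to shut down its own work when the deadline fires.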

New

Minimax model support: Added vercel/minimax/minimax-m2.1 to supported models
Parallel integration tests: All agents now tested on both Docker and Vercel sandboxes concurrently
[Config] Sandbox backend selection: Moved sandbox backend selection from env var to experiment config (sandbox: 'docker' | 'vercel' | 'auto')
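
The `sandbox` option above can be modeled as a string union. The sketch below shows one way such a setting might be resolved to a concrete backend. Only the three values `'docker'`, `'vercel'`, and `'auto'` come from the release notes; the default value, the `Availability` input, the preference order for `'auto'`, and the `resolveSandbox` helper are assumptions made for illustration.

```typescript
// Hypothetical types and helper: only the three string values come from the release notes.
type SandboxBackend = "docker" | "vercel" | "auto";
type ConcreteBackend = Exclude<SandboxBackend, "auto">;

interface ExperimentConfigSketch {
  sandbox?: SandboxBackend;
}

interface Availability {
  docker: boolean;
  vercel: boolean;
}

function resolveSandbox(
  config: ExperimentConfigSketch,
  available: Availability,
): ConcreteBackend {
  // Assumption: an omitted setting behaves like "auto".
  const requested = config.sandbox ?? "auto";

  if (requested !== "auto") {
    // An explicit choice is honored or rejected, never silently swapped.
    if (!available[requested]) {
      throw new Error(`Sandbox backend "${requested}" is not available`);
    }
    return requested;
  }

  // Assumed preference order for "auto": Vercel first, then Docker.
  if (available.vercel) return "vercel";
  if (available.docker) return "docker";
  throw new Error("No sandbox backend is available");
}

// Usage: with only Docker available, "auto" falls back to Docker.
console.log(resolveSandbox({ sandbox: "auto" }, { docker: true, vercel: false })); // docker
```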

Documentation

Updated README with correct model format and examples

03 Feb 17:49

gaojude

v0.0.5

What's Changed

Added Docker sandbox as alternative to Vercel Sandbox
Added test:integration:docker and test:integration:vercel scripts