SWE-bench Service for AgentCompass

A FastAPI service that wraps SWE-bench for integration with AgentCompass service-type benchmarks. It exposes a simple REST API that runs the agent on a SWE-bench task and returns the final answer together with its evaluation result.

Introduction

The FastAPI app is defined in swebench_service.py.
Endpoints:
- GET /health: health check
- POST /api/tasks: run a single SWE-bench task and return results (patch, evaluation, trajectory)
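
The two endpoints above can be exercised from a small client. The following is a minimal sketch that uses only the Python standard library. The request body schema for POST /api/tasks is not described in this README, so the payload is passed through unchanged and must be filled in by the caller; the helper names (build_health_request, build_task_request, send) are hypothetical and not part of the service.

```python
import json
import urllib.request


def build_health_request(base_url: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for the health check endpoint."""
    return urllib.request.Request(f"{base_url.rstrip('/')}/health", method="GET")


def build_task_request(base_url: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request that submits a single task.

    The payload is serialized as-is: its fields are an assumption left to the
    caller, because the request schema is not documented in this README.
    """
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/api/tasks",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> str:
    """Send a prepared request and return the response body as text."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

With the service listening on port 8080, send(build_health_request("http://localhost:8080")) returns the health check response body. A task run can take a long time, so a much larger timeout is likely needed for POST /api/tasks.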

Quick Start

1. Environment setup

Python: 3.12+ recommended

Install Python dependencies:

pip install -r requirements.txt

Docker: Required for running the agent and the evaluation in isolated environments.

2. Configuration

Set the following environment variables (in a .env file or through your deployment system):

SWE_BENCH_IMAGES_PATH: Path containing pre-downloaded SWE-bench Docker images (optional)
IMAGE_CACHE_MAX_SIZE: Max number of cached Docker images (default: 20)
THREAD_POOL_MAX_WORKERS: Number of thread pool workers (default: 1)
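
As an illustration of how these variables and their defaults fit together, here is a minimal sketch that reads them into a settings object. The ServiceSettings class and load_settings function are hypothetical names, and treating an empty SWE_BENCH_IMAGES_PATH as unset is an assumption; the service's actual parsing may differ.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ServiceSettings:
    """Hypothetical container for the three documented variables."""
    images_path: Optional[str]
    image_cache_max_size: int
    thread_pool_max_workers: int


def load_settings(env: Mapping[str, str] = os.environ) -> ServiceSettings:
    """Read the documented variables, applying the documented defaults."""
    # Assumption: an empty string is treated the same as an unset variable.
    images_path = env.get("SWE_BENCH_IMAGES_PATH") or None
    return ServiceSettings(
        images_path=images_path,
        image_cache_max_size=int(env.get("IMAGE_CACHE_MAX_SIZE", "20")),
        thread_pool_max_workers=int(env.get("THREAD_POOL_MAX_WORKERS", "1")),
    )
```

Passing a plain dict instead of os.environ makes the defaults easy to check without touching the real environment.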

Image Loading Modes

The service supports two modes for loading SWE-bench evaluation images:

Mode 1: Local Tar Files

If SWE_BENCH_IMAGES_PATH is set, the service will load pre-downloaded images from local .tar files.

Image Format: The required images are .tar files exported from Docker (e.g., using docker save). Each file should be named according to the SWE-bench naming convention, where all / and : characters in the image name are replaced with underscores (_).

For example, the Docker image:

swebench/sweb.eval.x86_64.astropy_1776_astropy-12907:latest

should be saved as:

swebench_sweb.eval.x86_64.astropy_1776_astropy-12907_latest.tar

The file names must match the image keys expected for the corresponding SWE-bench tasks, and the files must be placed in the SWE_BENCH_IMAGES_PATH directory before the service starts.
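
The naming rule described above can be sketched in a few lines of Python. The function names are hypothetical; only the replacement rule (/ and : become _) and the use of docker save come from this README.

```python
from typing import List


def image_to_tar_filename(image: str) -> str:
    """Map a Docker image name to its .tar file name.

    Every '/' and ':' in the image name is replaced with '_'.
    """
    return image.replace("/", "_").replace(":", "_") + ".tar"


def build_save_command(image: str) -> List[str]:
    """Return the docker save command that exports the image to that file."""
    return ["docker", "save", "-o", image_to_tar_filename(image), image]


if __name__ == "__main__":
    name = "swebench/sweb.eval.x86_64.astropy_1776_astropy-12907:latest"
    print(image_to_tar_filename(name))
    print(" ".join(build_save_command(name)))
```

The command list can be passed to subprocess.run on a machine where the image is already present, with the working directory set to the SWE_BENCH_IMAGES_PATH directory.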

Mode 2: Docker Hub

If SWE_BENCH_IMAGES_PATH is not set, the service will automatically pull images from Docker Hub when needed.
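
The choice between the two modes can be pictured as follows. This is a sketch under stated assumptions, not the service's implementation: it expresses each mode as the equivalent docker CLI command, whereas the service may use a different mechanism (for example a Docker SDK), and the function name image_fetch_command is hypothetical.

```python
from pathlib import Path
from typing import List, Optional


def image_fetch_command(image: str, images_path: Optional[str]) -> List[str]:
    """Return the docker CLI command that would make the image available.

    Mode 1 (images_path set): load the pre-downloaded .tar file.
    Mode 2 (images_path unset): pull the image from Docker Hub.
    """
    if images_path:
        # Same naming rule as in Mode 1: '/' and ':' become '_'.
        tar_name = image.replace("/", "_").replace(":", "_") + ".tar"
        return ["docker", "load", "-i", str(Path(images_path) / tar_name)]
    return ["docker", "pull", image]
```

Keeping the decision in one function means the rest of the code does not need to know which mode is active.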

3. Start the Service

Method 1: Run the API server

python swebench_service.py --host 0.0.0.0 --port 8080

Method 2: Docker Deployment

docker build -t swebench-server .
docker run --privileged \
    --name swebench-server \
    -p 8080:8080 \
    -e SWE_BENCH_IMAGES_PATH=/your/image/path \
    -e THREAD_POOL_MAX_WORKERS=4 \
    swebench-server
