Skip to content

refactor: API-first positioning, Python client library, and tests#17

Open
abhishek-anand wants to merge 7 commits intomainfrom
refactor-api-and-docs
Open

refactor: API-first positioning, Python client library, and tests#17
abhishek-anand wants to merge 7 commits intomainfrom
refactor-api-and-docs

Conversation

@abhishek-anand
Copy link
Collaborator

@abhishek-anand abhishek-anand commented Feb 2, 2026

Summary

  • Reposition CodeRunner from "MCP server" to "sandboxed code execution platform"
  • Add Python client library for programmatic access
  • Add unit tests (25 tests, all pass)
  • Improve container best practices

Changes

README rewrite

  • Library-first positioning (code execution focus, not MCP-centric)
  • API documentation with curl examples
  • Simplified Homebrew install: brew install instavm/cli/coderunner
  • MCP integrations moved to collapsed sections

API improvements

  • New versioned endpoints: /v1/execute, /v1/sessions, /v1/browser/*, /health
  • Old endpoints kept as deprecated (with Deprecation header)
  • Backward compatible

Python client library

from coderunner import CodeRunner, execute

cr = CodeRunner()
result = cr.execute("print(2 + 2)")

# Or one-liner
print(execute("2 + 2"))

Container best practices

  • SIGTERM/SIGINT handler for graceful kernel shutdown
  • 10-minute timeout on container system start
  • Improved cleanup.sh with graceful stop

Tests

  • 25 unit tests (client library + parsing)
  • No bloat tests requiring Jupyter/Playwright
  • All tests pass

Test plan

  • All 25 unit tests pass (pytest tests/ -v)
  • Manual test: install and run server
  • Manual test: Python client against running server

🤖 Generated with Claude Code

abhishek-anand and others added 2 commits February 2, 2026 16:11
- Rewrite README: library-first, code execution focus, no AI slop
- Add versioned API endpoints (/v1/execute, /v1/sessions, /v1/browser)
- Add deprecated wrappers for old endpoints with Deprecation header
- Add /health endpoint
- Create Python client library (coderunner package)
- Add pyproject.toml for pip installability

Breaking changes (deprecated, still work):
- /execute -> /v1/execute
- /v1/sessions/session -> /v1/sessions, /v1/sessions/{id}
- /v1/browser/interactions/* -> /v1/browser/*

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Tests:
- Add 25 unit tests for client library
- Add parsing tests for skill frontmatter
- All tests pass without requiring running server

Container improvements:
- Add SIGTERM/SIGINT handler for graceful kernel shutdown
- Add 10-minute timeout to container system start in install.sh
- Improve cleanup.sh with graceful stop (no aggressive pkill)
- Simplify Homebrew install command (auto-tap syntax)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@gemini-code-assist
Copy link

Summary of Changes

Hello @abhishek-anand, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances CodeRunner by shifting its core identity to an API-first, sandboxed code execution platform. It introduces a dedicated Python client library, making programmatic interaction more streamlined, and updates the API with versioned endpoints for better future compatibility. Concurrently, the changes improve the robustness of the container environment through better shutdown mechanisms and startup timeouts, ensuring a more reliable user experience.

Highlights

  • API-First Repositioning: CodeRunner has been refactored to emphasize an API-first approach, shifting its identity from an 'MCP server' to a 'sandboxed code execution platform'.
  • Python Client Library: A new Python client library has been introduced, providing a programmatic interface for easier interaction with CodeRunner's functionalities.
  • Comprehensive Unit Tests: Twenty-five new unit tests have been added, covering the client library and parsing logic, all of which are reported to be passing.
  • Improved Container Best Practices: Enhancements include graceful shutdown mechanisms for kernels via SIGTERM/SIGINT handlers and a 10-minute timeout for container system startup, improving robustness.
  • README Rewrite and API Versioning: The README has been updated to reflect the new library-first focus, include API documentation, and detail a simplified Homebrew installation. New versioned API endpoints (/v1/*) are introduced, with backward compatibility for older, now deprecated, endpoints.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This is a substantial pull request that successfully repositions the project with an API-first approach, introducing a well-structured Python client library and adding a suite of unit tests. The refactoring greatly improves the project's clarity and usability. The code quality is high, with notable improvements to container management like graceful shutdown and timeouts. My review focuses on a few critical areas to ensure correctness and robustness: the graceful shutdown implementation needs refinement to work as intended, and a key set of tests should be updated to validate the actual application code. I've also included some suggestions to improve the client library's error handling and the user-friendliness of the installation instructions.

server.py Outdated
Comment on lines 365 to 381
def handle_sigterm(signum, frame):
"""Handle SIGTERM signal for graceful shutdown."""
logger.info(f"Received signal {signum}, initiating shutdown...")
# Schedule shutdown in the event loop
try:
loop = asyncio.get_running_loop()
loop.create_task(graceful_shutdown())
except RuntimeError:
# No running loop, we're probably not started yet
pass
# Allow uvicorn to handle the actual shutdown
sys.exit(0)


# Register signal handlers
signal.signal(signal.SIGTERM, handle_sigterm)
signal.signal(signal.SIGINT, handle_sigterm)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

The current graceful shutdown implementation is unlikely to work as intended. Calling sys.exit(0) in the signal handler will terminate the process immediately, not allowing the graceful_shutdown() async task to complete. This defeats the purpose of cleaning up kernel resources.

The idiomatic way to handle this in a Starlette application is to use the on_event("shutdown") lifecycle hook. This allows uvicorn to manage the signals and ensures your cleanup logic is executed before the application exits.

I suggest replacing the manual signal handling with the following:

@app.on_event("shutdown")
async def on_shutdown():
    await graceful_shutdown()

# Then, you can remove the `handle_sigterm` function and the `signal.signal` calls.

Comment on lines 71 to 93
# Inline implementation of parsing logic for testing
async def parse_frontmatter(file_path):
async with aiofiles.open(file_path, mode='r') as f:
content = await f.read()
frontmatter = []
in_frontmatter = False
for line in content.splitlines():
if line.strip() == '---':
if in_frontmatter:
break
else:
in_frontmatter = True
continue
if in_frontmatter:
frontmatter.append(line)

metadata = {}
for line in frontmatter:
if ':' in line:
key, value = line.split(':', 1)
metadata[key.strip()] = value.strip()
return metadata

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

The tests within TestSkillFrontmatterParsing do not test the actual application code from server.py. Instead, the parsing logic is re-implemented inside the test function itself. This means the _parse_skill_frontmatter function in server.py is not being validated at all.

The test should import the function from server.py and test it directly. If there are concerns about side effects during import, they can be managed using pytest.mock.patch.

        # The test should import the function from server.py and test it.
        # This requires refactoring the test to handle imports from server.py.
        # For example:
        # from server import _parse_skill_frontmatter
        # 
        # result = await _parse_skill_frontmatter(skill_file)

Comment on lines 28 to 31
```bash
git clone https://github.com/instavm/coderunner.git
cd coderunner
chmod +x install.sh
./install.sh

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The installation instructions are missing a step to make the script executable and don't mention that it requires sudo. This could lead to a frustrating experience for new users as the installation would fail. It's best practice to include the chmod command and to run the script with sudo since it performs privileged operations.

Suggested change
```bash
git clone https://github.com/instavm/coderunner.git
cd coderunner
chmod +x install.sh
./install.sh
git clone https://github.com/instavm/coderunner.git
cd coderunner
chmod +x install.sh
sudo ./install.sh

Comment on lines 125 to 126
except Exception:
return False

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Catching a generic Exception is too broad and can mask underlying issues in the code. It's better to catch more specific exceptions related to network requests. For httpx, httpx.RequestError is the base exception for request-related problems and would be more appropriate here.

Suggested change
except Exception:
return False
except httpx.RequestError:
return False

server.py Outdated

# --- GRACEFUL SHUTDOWN ---

_shutdown_event = asyncio.Event()

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The _shutdown_event is defined but never used in the code. This unused variable should be removed to improve code clarity.

- Claude Code, OpenAI Codex, Cursor, Gemini CLI
- AWS, GCP, Azure, GitHub CLIs
- Config file support (~/.coderunner.config)
- Credential mount options (--with-ssh-agent, --with-credentials)
- 48h auto-update via GitHub Actions cron
- Trivy vulnerability scanning
@abhishek-anand
Copy link
Collaborator Author

/gemini review

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request is a significant and well-executed refactoring. It successfully repositions CodeRunner as an API-first platform, introduces a clean Python client library with comprehensive tests, and greatly improves the installation and container management scripts. The introduction of versioned, RESTful endpoints with a clear deprecation strategy for older endpoints is a highlight. The code quality is high, with good attention to detail like graceful shutdowns and robust error handling. I've identified a critical issue with API route ordering that needs to be addressed, along with some medium and high severity suggestions to improve security and maintainability in the Dockerfile and installation scripts. Once these points are resolved, this will be an excellent contribution to the project.

server.py Outdated
Comment on lines 1154 to 1169
# New endpoints (primary)
app.add_route("/health", api_health, methods=["GET"])
app.add_route("/v1/execute", api_execute, methods=["POST"])
app.add_route("/v1/sessions", api_start_session, methods=["POST"])
app.add_route("/v1/sessions/{session_id}", api_get_session_by_id, methods=["GET"])
app.add_route("/v1/sessions/{session_id}", api_stop_session_by_id, methods=["DELETE"])
app.add_route("/v1/browser/navigate", api_browser_navigate, methods=["POST"])
app.add_route("/v1/browser/content", api_browser_extract_content, methods=["POST"])

# Deprecated endpoints (backward compatibility - will be removed in v1.0)
app.add_route("/execute", deprecated_execute, methods=["POST"])
app.add_route("/v1/sessions/session", deprecated_session_post, methods=["POST"])
app.add_route("/v1/sessions/session", deprecated_session_get, methods=["GET"])
app.add_route("/v1/sessions/session", deprecated_session_delete, methods=["DELETE"])
app.add_route("/v1/browser/interactions/navigate", deprecated_browser_navigate, methods=["POST"])
app.add_route("/v1/browser/interactions/content", deprecated_browser_content, methods=["POST"]) No newline at end of file

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

There is a critical issue with the order of route registration. Starlette matches routes in the order they are added. Because the parameterized route /v1/sessions/{session_id} is registered before the specific route /v1/sessions/session, requests to the old /v1/sessions/session endpoint will be incorrectly handled by the new api_get_session_by_id and api_stop_session_by_id handlers.

This bypasses the deprecation warning logic in deprecated_session_get and deprecated_session_delete. To fix this, the specific, non-parameterized routes for the deprecated endpoints must be registered before their new, parameterized counterparts.

Suggested change
# New endpoints (primary)
app.add_route("/health", api_health, methods=["GET"])
app.add_route("/v1/execute", api_execute, methods=["POST"])
app.add_route("/v1/sessions", api_start_session, methods=["POST"])
app.add_route("/v1/sessions/{session_id}", api_get_session_by_id, methods=["GET"])
app.add_route("/v1/sessions/{session_id}", api_stop_session_by_id, methods=["DELETE"])
app.add_route("/v1/browser/navigate", api_browser_navigate, methods=["POST"])
app.add_route("/v1/browser/content", api_browser_extract_content, methods=["POST"])
# Deprecated endpoints (backward compatibility - will be removed in v1.0)
app.add_route("/execute", deprecated_execute, methods=["POST"])
app.add_route("/v1/sessions/session", deprecated_session_post, methods=["POST"])
app.add_route("/v1/sessions/session", deprecated_session_get, methods=["GET"])
app.add_route("/v1/sessions/session", deprecated_session_delete, methods=["DELETE"])
app.add_route("/v1/browser/interactions/navigate", deprecated_browser_navigate, methods=["POST"])
app.add_route("/v1/browser/interactions/content", deprecated_browser_content, methods=["POST"])
# Deprecated endpoints must be registered before new parameterized ones to ensure correct matching.
app.add_route("/execute", deprecated_execute, methods=["POST"])
app.add_route("/v1/sessions/session", deprecated_session_post, methods=["POST"])
app.add_route("/v1/sessions/session", deprecated_session_get, methods=["GET"])
app.add_route("/v1/sessions/session", deprecated_session_delete, methods=["DELETE"])
app.add_route("/v1/browser/interactions/navigate", deprecated_browser_navigate, methods=["POST"])
app.add_route("/v1/browser/interactions/content", deprecated_browser_content, methods=["POST"])
# New endpoints (primary)
app.add_route("/health", api_health, methods=["GET"])
app.add_route("/v1/execute", api_execute, methods=["POST"])
app.add_route("/v1/sessions", api_start_session, methods=["POST"])
app.add_route("/v1/sessions/{session_id}", api_get_session_by_id, methods=["GET"])
app.add_route("/v1/sessions/{session_id}", api_stop_session_by_id, methods=["DELETE"])
app.add_route("/v1/browser/navigate", api_browser_navigate, methods=["POST"])
app.add_route("/v1/browser/content", api_browser_extract_content, methods=["POST"])

Dockerfile Outdated
Comment on lines 122 to 123
RUN curl -fsSL https://cursor.com/install | bash && \
ln -sf /root/.local/bin/agent /usr/local/bin/cursor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-high high

Piping the output of curl directly to bash is a significant security risk. The script from cursor.com/install is executed without any verification of its contents. If the source is compromised, this could lead to arbitrary code execution during the image build process.

A safer approach is to download the script to a file, verify its checksum if one is provided, and then execute it.

Dockerfile Outdated
apt-get clean && rm -rf /var/lib/apt/lists/*

# Azure CLI
RUN curl -sL https://aka.ms/InstallAzureCLIDeb | bash

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-high high

Similar to a previous comment, piping curl to bash from aka.ms/InstallAzureCLIDeb is a security risk. It executes a script from the internet without verification. While aka.ms is a Microsoft domain, the practice itself is insecure and should be avoided. The recommended way to install the Azure CLI is by following the official instructions which involve adding the Microsoft package repository, a more secure method.

Dockerfile Outdated
Comment on lines 118 to 119
RUN npm install -g @anthropic-ai/claude-code@1 && \
pip install --no-cache-dir google-generativeai

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The package @anthropic-ai/claude-code@1 is installed here, but it was already installed on line 112. This duplicate installation should be removed to reduce build time and image size.

Additionally, it's a Docker best practice to chain RUN commands to reduce the number of layers. You could combine the npm install commands for the AI agents into a single layer.

    pip install --no-cache-dir google-generativeai

install.sh Outdated
Comment on lines 58 to 87
# --- Load config file if exists ---
declare -a ENV_VARS
if [[ -f "$CONFIG_FILE" ]]; then
echo "Loading config from $CONFIG_FILE"
while IFS='=' read -r key value; do
# Skip comments and empty lines
[[ "$key" =~ ^#.*$ ]] && continue
[[ -z "$key" ]] && continue
# Trim whitespace
key=$(echo "$key" | xargs)
value=$(echo "$value" | xargs)
if [[ -n "$key" && -n "$value" ]]; then
ENV_VARS+=("--env" "$key=$value")
fi
done < "$CONFIG_FILE"
fi

# --- Load env file if specified ---
if [[ -n "$ENV_FILE" && -f "$ENV_FILE" ]]; then
echo "Loading environment from $ENV_FILE"
while IFS='=' read -r key value; do
[[ "$key" =~ ^#.*$ ]] && continue
[[ -z "$key" ]] && continue
key=$(echo "$key" | xargs)
value=$(echo "$value" | xargs)
if [[ -n "$key" && -n "$value" ]]; then
ENV_VARS+=("--env" "$key=$value")
fi
done < "$ENV_FILE"
fi

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The logic for parsing an environment file and populating the ENV_VARS array is duplicated for the main config file and the file specified with --env-file. To improve maintainability and follow the DRY (Don't Repeat Yourself) principle, this logic can be refactored into a shell function.

# --- Helper function to parse env files ---
parse_env_file() {
    local file_path="$1"
    while IFS='=' read -r key value; do
        # Skip comments and empty lines
        [[ "$key" =~ ^#.*$ ]] && continue
        [[ -z "$key" ]] && continue
        # Trim whitespace
        key=$(echo "$key" | xargs)
        value=$(echo "$value" | xargs)
        if [[ -n "$key" && -n "$value" ]]; then
            ENV_VARS+=("--env" "$key=$value")
        fi
    done < "$file_path"
}

# --- Load environment variables ---
declare -a ENV_VARS
if [[ -f "$CONFIG_FILE" ]]; then
    echo "Loading config from $CONFIG_FILE"
    parse_env_file "$CONFIG_FILE"
fi

if [[ -n "$ENV_FILE" && -f "$ENV_FILE" ]]; then
    echo "Loading environment from $ENV_FILE"
    parse_env_file "$ENV_FILE"
fi

- Remove curl|bash for Cursor install (download first)
- Use official Azure CLI package repo instead of curl|bash
- Combine npm installs to reduce layers
- Refactor env file parsing into helper function
Starlette matches routes in registration order. The deprecated
/v1/sessions/session routes were registered after the parameterized
/v1/sessions/{session_id} routes, causing them to never be reached.

Moved literal path routes before parameterized paths so both old
and new API endpoints work correctly.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant