
glm-4.7 incorrectly reported as vision-capable, causes swebenchmultimodal evaluation to fail #1897

@juanmichelini

Description


Problem

glm-4.7 evaluations fail on the swebenchmultimodal benchmark with BrokenProcessPool errors, while the same model works correctly on text-only benchmarks such as regular swebench.

Root Cause

LiteLLM's model database incorrectly reports openrouter/z-ai/glm-4.7 as vision-capable (supports_vision: true), but OpenRouter's actual API for this model only accepts text input (input_modalities: ["text"]).

Evidence

LiteLLM claims:

{
  "model": "openrouter/z-ai/glm-4.7",
  "supports_vision": true
}

OpenRouter API shows:

{
  "id": "z-ai/glm-4.7",
  "architecture": {
    "input_modalities": ["text"]
  }
}

Impact

When swebenchmultimodal sends messages with ImageContent:

  1. SDK detects glm-4.7 as vision-capable (via LiteLLM)
  2. SDK includes images in API request
  3. OpenRouter rejects request (text-only model)
  4. Child process crashes with unhandled error
  5. Evaluation fails with BrokenProcessPool
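
For illustration, a minimal sketch of the gate in steps 1 and 2, assuming the SDK consults LiteLLM's capability database before attaching images (the function name build_message_content is hypothetical, not the SDK's actual code):

from litellm import supports_vision

def build_message_content(text, image_urls, model):
    # Sketch: attach images only when LiteLLM says the model is
    # vision-capable. For openrouter/z-ai/glm-4.7 this check
    # returns True, so images get sent to a text-only endpoint.
    content = [{"type": "text", "text": text}]
    if supports_vision(model=model):
        content += [
            {"type": "image_url", "image_url": {"url": url}}
            for url in image_urls
        ]
    return content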

Solution

Add disable_vision: true to the glm-4.7 model configuration in .github/run-eval/resolve_model_config.py to override LiteLLM's incorrect capability detection.
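
A sketch of what the override could look like, assuming the file maps model names to per-model option dicts (the dict name MODEL_OVERRIDES is an assumption; adapt to the file's actual structure):

# In .github/run-eval/resolve_model_config.py (structure assumed)
MODEL_OVERRIDES = {
    "openrouter/z-ai/glm-4.7": {
        # LiteLLM reports supports_vision=True for this model, but
        # OpenRouter accepts text input only; force vision off.
        "disable_vision": True,
    },
}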

Verification

# LiteLLM incorrectly reports vision support
python3 -c "from litellm import supports_vision; print(supports_vision('openrouter/z-ai/glm-4.7'))"
# Returns: True

# OpenRouter API shows text-only
curl -s "https://openrouter.ai/api/v1/models" | \
  jq '.data[] | select(.id == "z-ai/glm-4.7") | .architecture.input_modalities'
# Returns: ["text"]

Related

  • Works: glm-4.7 on regular swebench (text-only)
  • Works: Other models on swebenchmultimodal (actual vision support)
  • Fails: glm-4.7 on swebenchmultimodal (images sent but not supported)
