Problem
glm-4.7 evaluations fail on swebenchmultimodal benchmark with BrokenProcessPool errors, while working correctly on text-only benchmarks like regular swebench.
Root Cause
LiteLLM's model database incorrectly reports openrouter/z-ai/glm-4.7 as vision-capable (supports_vision: true), but OpenRouter's actual API for this model only accepts text input (input_modalities: ["text"]).
Evidence
LiteLLM claims:

```json
{
  "model": "openrouter/z-ai/glm-4.7",
  "supports_vision": true
}
```

OpenRouter API shows:

```json
{
  "id": "z-ai/glm-4.7",
  "architecture": {
    "input_modalities": ["text"]
  }
}
```

Impact
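The capability mismatch can be expressed as a small check over the two payloads quoted above. This is a sketch, not SDK code: `vision_mismatch` is a hypothetical helper name, and it only parses the JSON shapes shown here.

```python
import json

def vision_mismatch(litellm_supports_vision: bool, openrouter_model: dict) -> bool:
    """Return True when LiteLLM claims vision support but the OpenRouter
    model entry does not list "image" among its input modalities."""
    modalities = openrouter_model.get("architecture", {}).get("input_modalities", [])
    return litellm_supports_vision and "image" not in modalities

# Sample entry matching the OpenRouter payload quoted above.
openrouter_entry = json.loads("""
{
  "id": "z-ai/glm-4.7",
  "architecture": {"input_modalities": ["text"]}
}
""")

print(vision_mismatch(True, openrouter_entry))  # True: the two sources disagree
```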
When swebenchmultimodal sends messages with ImageContent:
- The SDK detects glm-4.7 as vision-capable (via LiteLLM)
- The SDK includes images in the API request
- OpenRouter rejects the request (text-only model)
- The child process crashes with an unhandled error
- The evaluation fails with BrokenProcessPool
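A defensive mitigation on the SDK side would be to drop image parts before calling a model known to be text-only. A minimal sketch, assuming OpenAI-style content-parts messages; `strip_image_parts` is a hypothetical helper, not the SDK's actual API:

```python
def strip_image_parts(messages: list[dict]) -> list[dict]:
    """Remove image_url content parts from OpenAI-style chat messages,
    flattening text-only content back to a plain string."""
    cleaned = []
    for msg in messages:
        content = msg.get("content")
        if isinstance(content, list):
            parts = [p for p in content if p.get("type") != "image_url"]
            if all(p.get("type") == "text" for p in parts):
                # Only text parts remain: collapse to a plain string.
                content = "".join(p.get("text", "") for p in parts)
            else:
                content = parts
        cleaned.append({**msg, "content": content})
    return cleaned

msgs = [{"role": "user", "content": [
    {"type": "text", "text": "Fix the bug shown in the screenshot."},
    {"type": "image_url", "image_url": {"url": "data:image/png;base64,..."}},
]}]
print(strip_image_parts(msgs)[0]["content"])
# Fix the bug shown in the screenshot.
```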
Solution
Add disable_vision: true to glm-4.7 model configuration in .github/run-eval/resolve_model_config.py to override LiteLLM's incorrect capability detection.
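A sketch of how such an override could layer on top of LiteLLM's detection. The dict shape and the `resolve_vision_support` function are assumptions for illustration; the real structure of resolve_model_config.py may differ.

```python
# Hypothetical per-model overrides; the actual resolve_model_config.py
# layout is not shown in this issue and may differ.
MODEL_OVERRIDES = {
    "openrouter/z-ai/glm-4.7": {
        # LiteLLM wrongly reports supports_vision for this model.
        "disable_vision": True,
    },
}

def resolve_vision_support(model: str, litellm_supports_vision: bool) -> bool:
    """Apply manual overrides on top of LiteLLM's capability detection."""
    if MODEL_OVERRIDES.get(model, {}).get("disable_vision"):
        return False
    return litellm_supports_vision
```

With this in place, the SDK would treat glm-4.7 as text-only regardless of what LiteLLM's model database claims.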
Verification
```shell
# LiteLLM incorrectly reports vision support
python3 -c "from litellm import supports_vision; print(supports_vision('openrouter/z-ai/glm-4.7'))"
# Returns: True

# OpenRouter API shows text-only
curl -s "https://openrouter.ai/api/v1/models" | \
  jq '.data[] | select(.id == "z-ai/glm-4.7") | .architecture.input_modalities'
# Returns: ["text"]
```

Related
- Works: glm-4.7 on regular swebench (text-only)
- Works: Other models on swebenchmultimodal (actual vision support)
- Fails: glm-4.7 on swebenchmultimodal (images sent but not supported)