BadRequestError: temperature=0.0 breaks evaluation with reasoning models (gpt-5.2, etc.) #166

@shubchat

Description

Summary

Running tau2 run with any reasoning model fails immediately with a 400 BadRequestError
because temperature=0.0 is hardcoded into the default LLM args.

Steps to Reproduce

tau2 run --domain airline \
  --agent-llm o4-mini \
  --user-llm o4-mini \
  --num-trials 1 --num-tasks 1
Error

  litellm.BadRequestError: AzureException BadRequestError -
  Unsupported value: 'temperature' does not support 0.0 with this model.
  Only the default (1) value is supported.
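Workaround (sketch)

Until the default is fixed, one possible workaround is to strip the hardcoded temperature from the LLM args before the litellm call whenever the target model is a reasoning model. The prefix list and helper names below are assumptions for illustration, not tau2-bench internals:

```python
# Hypothetical workaround: drop the hardcoded temperature=0.0 before calling
# litellm when the target is a reasoning model, which only accepts the
# default temperature (1). The prefix list and helper names are assumptions.
REASONING_MODEL_PREFIXES = ("o1", "o3", "o4", "gpt-5")

def is_reasoning_model(model: str) -> bool:
    # Strip an optional provider prefix such as "azure/" before matching.
    name = model.split("/")[-1]
    return name.startswith(REASONING_MODEL_PREFIXES)

def sanitize_llm_args(model: str, llm_args: dict) -> dict:
    """Return a copy of llm_args that is safe to send to the given model."""
    args = dict(llm_args)
    if is_reasoning_model(model):
        # Reasoning models reject temperature=0.0, so omit the key entirely
        # and let the API fall back to its default.
        args.pop("temperature", None)
    return args
```

Alternatively, depending on the litellm version, setting litellm.drop_params = True may cause unsupported parameters to be dropped automatically instead of raising.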
