Python: [BREAKING] Moved to a single get_response and run API #3379
Conversation
Pull request overview
This PR consolidates the Python Agent Framework's streaming and non-streaming APIs into a unified interface. The primary changes include:
Changes:
- Unified `run()` and `get_response()` methods with a `stream` parameter, replacing the separate `run_stream()` and `get_streaming_response()` methods
- Migration from decorator-based (`@use_instrumentation`, `@use_function_invocation`) to mixin-based architecture for telemetry and function invocation
- Introduction of a `ResponseStream` class for unified stream handling with hooks, finalizers, and teardown support
- Renamed `AgentExecutionException` to `AgentRunException`
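The unified shape described above can be illustrated with a minimal, self-contained sketch. All class and method names here are illustrative stand-ins for the pattern, not the actual `agent_framework` implementation:

```python
import asyncio
from typing import AsyncIterator


class ResponseStream:
    """Wraps an async generator of updates; can be iterated or collapsed."""

    def __init__(self, updates: AsyncIterator[str]) -> None:
        self._updates = updates

    def __aiter__(self) -> AsyncIterator[str]:
        return self._updates

    async def get_final_response(self) -> str:
        # Drain any remaining updates and join them into one final response.
        return "".join([u async for u in self._updates])


class Agent:
    async def _generate(self) -> AsyncIterator[str]:
        for chunk in ("Hello", ", ", "world"):
            yield chunk

    def run(self, *, stream: bool = False):
        # One entry point: stream=True hands back the stream itself,
        # stream=False collapses it into the final response.
        s = ResponseStream(self._generate())
        return s if stream else s.get_final_response()


async def demo() -> tuple:
    agent = Agent()
    full = await agent.run()                            # non-streaming
    parts = [u async for u in agent.run(stream=True)]   # streaming
    return full, parts


print(asyncio.run(demo()))  # ('Hello, world', ['Hello', ', ', 'world'])
```

The key design point is that one method replaces two near-duplicate code paths: the streaming and non-streaming cases share the same generation logic, differing only in whether the stream is handed to the caller or drained internally.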
Reviewed changes
Copilot reviewed 84 out of 85 changed files in this pull request and generated 28 comments.
| File | Description |
|---|---|
| `_types.py` | Added `ResponseStream` class for unified streaming, updated `prepare_messages` to handle `None` |
| `_clients.py` | Refactored `BaseChatClient` with unified `get_response()` method, introduced `FunctionInvokingChatClient` mixin |
| `openai/_responses_client.py` | Consolidated streaming/non-streaming into single `_inner_get_response()` method |
| `openai/_chat_client.py` | Similar consolidation for the chat completions API |
| `openai/_assistants_client.py` | Unified Assistants API with `stream` parameter |
| `_workflows/_workflow.py` | Consolidated `run()` and `run_stream()` into single `run(stream=bool)` method |
| `_workflows/_agent.py` | Updated `WorkflowAgent.run()` to use `stream` parameter |
| Test files (multiple) | Updated all tests to use `run(stream=True)` and `get_response(stream=True)` |
| Sample files (multiple) | Updated samples to demonstrate the new unified API |
| Provider clients | Updated all provider implementations (Azure, Anthropic, Bedrock, Ollama, etc.) to use mixins |
Fixes lingering CI failures: import missing response types in streaming telemetry finalizers, move AG-UI tests to `ag_ui_tests` with config updates, and track service thread IDs in the AG-UI test client. Checks: `uv run poe fmt/lint/pyright/mypy`; `uv run poe all-tests`.
Duration is a metrics-only attribute per OpenTelemetry semantic conventions. It should be recorded to the histogram but not set as a span attribute.
Duration is a metrics-only attribute. It's now passed directly to `_capture_response` instead of being included in the attributes dict that gets set on the span.
_finalize_stream already calls _close_span() in its finally block, so adding it as a separate cleanup hook is redundant.
If a user creates a streaming response but never consumes it, the cleanup hooks won't run. Now we register a weak reference finalizer that will close the span when the stream object is garbage collected, ensuring spans don't leak in this scenario.
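The weak-reference finalizer pattern described above can be sketched in plain Python (generic `Span`/`Stream` stand-ins, not the framework's actual classes):

```python
import weakref


class Span:
    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        self.closed = True


class Stream:
    def __init__(self, span: Span) -> None:
        self._span = span
        # Close the span when this stream object is garbage collected, even
        # if the caller never consumes it. The finalizer callback captures
        # only the span (not self), so it does not keep the stream alive.
        self._finalizer = weakref.finalize(self, span.close)


span = Span()
stream = Stream(span)
del stream  # never consumed; on CPython the finalizer fires immediately here
print(span.closed)  # True
```

Note that `weakref.finalize` also runs at interpreter shutdown if the object is still alive, so the span is closed in either case.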
Renamed the function to `_get_result_hooks_from_stream` and fixed it to look for the `_result_hooks` attribute, which is the correct name in the `ResponseStream` class.
Replace `ChatResponse.from_chat_response_generator()` with the new `ResponseStream.get_final_response()` pattern in integration tests for:
- Azure AI client
- OpenAI chat client
- OpenAI responses client
- Azure responses client
Tests with `tool_choice` options require at least 2 iterations:
1. First iteration to get the function call and execute the tool
2. Second iteration to get the final text response

With `max_iterations=1`, streaming tests would return early with only the function call/result but no final text content.
When using `conversation_id` (for the Responses/Assistants APIs), the server already has the function call message from the previous response. We should only send the new function result message, not all messages including the function call, which would cause a duplicate-ID error. Fix: when `conversation_id` is set, only send the last message (the tool result) instead of all of `response.messages`.
Port test from PR microsoft#3664 with updates for the new streaming API pattern. Tests that `conversation_id` is properly updated in the options dict during function invocation loop iterations.
When `tool_choice` is `'required'`, the user's intent is to force exactly one tool call. After the tool executes, return immediately with the function call and result; don't continue to call the model again. This fixes integration tests that were failing with empty text responses because, with `tool_choice='required'`, the model would keep returning function calls instead of text. Also adds regression tests for:
- `conversation_id` propagation between tool iterations (from PR microsoft#3664)
- `tool_choice='required'` returns after tool execution
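A minimal loop sketch of this early-return behavior (the model stub and names are hypothetical, not the framework's code):

```python
def run_tool_loop(turns, tool_choice: str = "auto") -> list:
    """turns: iterator of model outputs, each ("call", name) or ("text", s)."""
    history: list = []
    for kind, value in turns:
        history.append((kind, value))
        if kind == "call":
            history.append(("result", value + "-result"))
            if tool_choice == "required":
                # The caller forced exactly one tool call; return now instead
                # of asking the model again (it would keep emitting calls).
                return history
            continue
        break  # a text response ends the loop
    return history


# A model that keeps emitting tool calls forever is cut off after one call:
forced = run_tool_loop(iter([("call", "get_weather")] * 10), tool_choice="required")
print(forced)  # [('call', 'get_weather'), ('result', 'get_weather-result')]

# With "auto", the loop continues until the model produces text:
auto = run_tool_loop(iter([("call", "get_weather"), ("text", "Sunny.")]))
```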
- Add table explaining `tool_choice` values (auto, none, required)
- Explain why `tool_choice='required'` returns immediately after tool execution
- Add code example showing the difference between required and auto
- Update flow diagram to show the early return path for `tool_choice='required'`
Remove the hardcoded default of `'auto'` for `tool_choice` in `ChatAgent` init. When `tool_choice` is not specified (`None`), it will no longer be sent to the API, allowing the API's default behavior to be used. Users who want `tool_choice='auto'` can still explicitly set it, either in `default_options` or at runtime. Fixes microsoft#3585
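This follows the common omit-when-unset pattern; a sketch with a hypothetical helper (not the actual `ChatAgent` code):

```python
def prepare_options(tool_choice=None, **extra) -> dict:
    # Only send tool_choice when the user set it explicitly; with None the
    # key is omitted entirely and the API's own default applies.
    options = dict(extra)
    if tool_choice is not None:
        options["tool_choice"] = tool_choice
    return options


print(prepare_options(model="some-model"))                      # no tool_choice key
print(prepare_options(model="some-model", tool_choice="auto"))  # explicit opt-in
```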
In OpenAI Assistants client, tools were not being sent when tool_choice='none'. This was incorrect - tool_choice='none' means the model won't call tools, but tools should still be available in the request (they may be used later in the conversation). Fixes microsoft#3585
Adds a regression test to ensure that when tool_choice='none' is set but tools are provided, the tools are still sent to the API. This verifies the fix for microsoft#3585.
Apply the same fix to the OpenAI Responses client and the Azure AI client:
- OpenAI Responses: remove the else block that popped `tool_choice`/`parallel_tool_calls`
- Azure AI: remove the `tool_choice != 'none'` check when adding tools

When `tool_choice='none'`, the model won't call tools, but tools should still be sent to the API so they're available for future turns. Also update the README to clarify that `tool_choice='required'` supports multiple tools. Fixes microsoft#3585
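Put together, the intended request shape can be sketched as follows (illustrative stand-in, not the providers' actual request builders):

```python
def build_request(tools, tool_choice=None) -> dict:
    request: dict = {}
    if tools:
        # Tool definitions are always sent when provided, even with
        # tool_choice="none": the model won't call them on this turn, but
        # they stay available for later turns in the conversation.
        request["tools"] = tools
    if tool_choice is not None:
        request["tool_choice"] = tool_choice
    return request


tools = [{"name": "get_weather"}]
print(build_request(tools, "none"))  # tools present alongside tool_choice
```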
Move tool_choice processing outside of the 'if tools' block in OpenAI Responses client so tool_choice is sent to the API even when no tools are provided.
Changed test_prepare_options_removes_parallel_tool_calls_when_no_tools to test_prepare_options_preserves_parallel_tool_calls_when_no_tools to reflect that parallel_tool_calls is now preserved even when no tools are present, consistent with the tool_choice behavior.
Motivation and Context
Summary
`ResponseStream[ChatResponseUpdate, ChatResponse]` to `ResponseStream[AgentResponseUpdate, AgentResponse]`, but the object has a classmethod called `wrap` that is used to wrap the `ResponseStream` from the chat client into the new `ResponseStream` in the Agent.

Description
Contribution Checklist
Fixes #3585
Fixes #3607
Fixes #3617