
Conversation

@eavanvalkenburg
Member

@eavanvalkenburg eavanvalkenburg commented Jan 22, 2026

Motivation and Context

Summary

  • Migrate chat/agent telemetry to mixin-based usage and remove legacy decorators, with streaming telemetry now using finalizers/teardown hooks instead of consuming streams.
    • This makes the code much easier to follow, because we can set attributes on the chat client in the __init__ of those mixins (making them technically not mixins)
    • Added those parameters to the constructors, making it easier to configure things like function calling
  • Replace function invocation decorators with FunctionInvokingChatClient/FunctionInvokingMixin across clients, tests, and samples; update docs/comments accordingly.
  • Introduced a ResponseStream object that unifies the APIs
    • It is generic over TUpdate and TFinal, which in our case are usually ChatResponseUpdate and ChatResponse, or the agent equivalents.
    • It features an update_hook mechanism that lets you run code while the internal stream is being unpacked; this is mostly useful for middleware.
    • It features a teardown hook mechanism that runs when the stream is exhausted; telemetry now uses it to record the duration.
    • It features a finalizer mechanism (one or more finalizers) that runs after the end of the stream and turns the list of updates into a final object; this can be used by middleware and is also used by function calling and telemetry. (A minimal sketch of this hook model follows this list.)
    • In principle the ResponseStream is created by the lowest-level object, the actual chat client implementation, and ideally all the layers in between only use the hooks. Function calling does not work that way, because there are multiple calls to the underlying chat client that all have to be combined into a single stream at runtime. Agent also creates a new stream, because it goes from ResponseStream[ChatResponseUpdate, ChatResponse] to ResponseStream[AgentResponseUpdate, AgentResponse]; the object has a classmethod called wrap that is used to wrap the ResponseStream from the chat client into the new ResponseStream in the Agent.
  • Overall this change reduces the number of times we iterate the stream and return a new AsyncGenerator, and the new hooks make it simpler to create middleware that alters the stream (as the sample shows); it should therefore also improve performance a bit.
  • Removed use_instrumentation/use_agent_instrumentation and use_function_invocation decorators; mixins are now the supported path.
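
To make the hook model above concrete, here is a minimal, self-contained sketch of a stream object with update hooks, teardown hooks, and a finalizer. It is illustrative only and not the actual ResponseStream implementation; the class and method names (SketchResponseStream, add_update_hook, add_teardown_hook) are assumptions, while get_final_response mirrors the pattern referenced later in this PR.

```python
import asyncio
from collections.abc import AsyncIterable, Callable
from typing import Generic, TypeVar

TUpdate = TypeVar("TUpdate")
TFinal = TypeVar("TFinal")


class SketchResponseStream(Generic[TUpdate, TFinal]):
    """Illustrative stand-in: yields updates, runs hooks, builds a final object."""

    def __init__(
        self,
        source: AsyncIterable[TUpdate],
        finalizer: Callable[[list[TUpdate]], TFinal],
    ) -> None:
        self._source = source
        self._finalizer = finalizer
        self._updates: list[TUpdate] = []
        self._update_hooks: list[Callable[[TUpdate], None]] = []
        self._teardown_hooks: list[Callable[[], None]] = []

    def add_update_hook(self, hook: Callable[[TUpdate], None]) -> None:
        self._update_hooks.append(hook)

    def add_teardown_hook(self, hook: Callable[[], None]) -> None:
        self._teardown_hooks.append(hook)

    async def __aiter__(self):
        try:
            async for update in self._source:
                self._updates.append(update)
                for hook in self._update_hooks:
                    hook(update)  # e.g. middleware observing each update
                yield update
        finally:
            for hook in self._teardown_hooks:
                hook()  # e.g. telemetry recording the stream duration

    async def get_final_response(self) -> TFinal:
        async for _ in self:  # drain the stream, which triggers the hooks
            pass
        return self._finalizer(self._updates)


async def _demo() -> None:
    async def updates():
        for chunk in ("Hel", "lo"):
            yield chunk

    stream = SketchResponseStream(updates(), finalizer="".join)
    stream.add_teardown_hook(lambda: print("stream exhausted"))
    print(await stream.get_final_response())  # prints "Hello"


asyncio.run(_demo())
```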

Description

Contribution Checklist

  • The code builds clean without any errors or warnings
  • The PR follows the Contribution Guidelines
  • All unit tests pass, and I have added new tests where possible
  • Is this a breaking change? If yes, add [BREAKING] prefix to the title of the PR.

Fixes #3585
Fixes #3607
Fixes #3617

Copilot AI review requested due to automatic review settings January 22, 2026 17:34
@markwallace-microsoft markwallace-microsoft added documentation Improvements or additions to documentation python labels Jan 22, 2026
Contributor

Copilot AI left a comment


Pull request overview

This PR consolidates the Python Agent Framework's streaming and non-streaming APIs into a unified interface. The primary changes include:

Changes:

  • Unified run() and get_response() methods with a stream parameter replacing the separate run_stream() and get_streaming_response() methods (a small illustrative sketch follows this list)
  • Migration from decorator-based (@use_instrumentation, @use_function_invocation) to mixin-based architecture for telemetry and function invocation
  • Introduction of ResponseStream class for unified stream handling with hooks, finalizers, and teardown support
  • Renamed AgentExecutionException to AgentRunException
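
As a rough illustration of the first bullet, the stub below mimics the single run(..., stream=...) call shape: awaiting the call returns the final response, while passing stream=True returns an iterable of updates. StubAgent and its helpers are made-up stand-ins, not the framework's classes.

```python
import asyncio
from collections.abc import AsyncIterator


class StubAgent:
    """Stand-in agent illustrating one run() method for both modes."""

    async def _updates(self) -> AsyncIterator[str]:
        for chunk in ("Hel", "lo"):
            yield chunk

    async def _final(self, prompt: str) -> str:
        return "".join([chunk async for chunk in self._updates()])

    def run(self, prompt: str, *, stream: bool = False):
        if stream:
            return self._updates()  # iterate: async for update in ...
        return self._final(prompt)  # await: response = await ...


async def main() -> None:
    agent = StubAgent()
    print(await agent.run("hi"))                       # non-streaming final result
    async for update in agent.run("hi", stream=True):  # streaming updates
        print(update, end="")
    print()


asyncio.run(main())
```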

Reviewed changes

Copilot reviewed 84 out of 85 changed files in this pull request and generated 28 comments.

Summary per file

_types.py: Added ResponseStream class for unified streaming, updated prepare_messages to handle None
_clients.py: Refactored BaseChatClient with unified get_response() method, introduced FunctionInvokingChatClient mixin
openai/_responses_client.py: Consolidated streaming/non-streaming into single _inner_get_response() method
openai/_chat_client.py: Similar consolidation for chat completions API
openai/_assistants_client.py: Unified assistants API with stream parameter
_workflows/_workflow.py: Consolidated run() and run_stream() into single run(stream=bool) method
_workflows/_agent.py: Updated WorkflowAgent.run() to use stream parameter
Test files (multiple): Updated all tests to use run(stream=True) and get_response(stream=True)
Sample files (multiple): Updated samples to demonstrate new unified API
Provider clients: Updated all provider implementations (Azure, Anthropic, Bedrock, Ollama, etc.) to use mixins

@eavanvalkenburg eavanvalkenburg force-pushed the python_single_response branch 3 times, most recently from 07afd46 to dd65afa on January 23, 2026 10:46
@markwallace-microsoft
Member

markwallace-microsoft commented Jan 23, 2026

Python Test Coverage

Python Test Coverage Report

packages/a2a/agent_framework_a2a
   _agent.py: 148 stmts, 8 miss, 94% cover (missing: 262, 400–401, 438–439, 468–470)
packages/ag-ui/agent_framework_ag_ui
   _client.py: 150 stmts, 17 miss, 88% cover (missing: 83–84, 88–92, 96–100, 263, 295, 464–466)
   _run.py: 441 stmts, 124 miss, 71% cover (missing: 154–161, 304, 323–324, 339–340, 351, 354–355, 357–358, 360–362, 372, 382–385, 389–391, 393, 403, 406–409, 411–412, 415–421, 424–426, 429, 445–447, 454, 460–461, 463–464, 478–484, 495, 508, 510–511, 545–546, 603–605, 617–619, 643, 648–650, 766, 777–778, 785, 803–805, 839–841, 856, 862, 870, 872, 908–914, 917–920, 922–931, 934, 941–942, 947, 953–955, 968–970)
   _types.py: 36 stmts, 0 miss, 100% cover
   _utils.py: 101 stmts, 2 miss, 98% cover (missing: 257, 262)
packages/ag-ui/agent_framework_ag_ui/_orchestration
   _tooling.py: 57 stmts, 0 miss, 100% cover
packages/anthropic/agent_framework_anthropic
   _chat_client.py: 361 stmts, 150 miss, 58% cover (missing: 371, 403, 405, 420, 442–445, 454, 456, 487–491, 493, 495–496, 498, 503–504, 506, 539–540, 549, 551–552, 557, 574–575, 617, 632, 636–637, 653, 662, 664, 668–669, 712–714, 716, 729–730, 737–739, 743–745, 749–752, 763, 765, 787, 797, 819–825, 832–833, 841–842, 850–853, 860–861, 867–868, 874–875, 881, 889–891, 895, 902–903, 909–910, 916–917, 923, 931–934, 941–942, 961, 968–969, 988, 1010, 1012, 1021–1022, 1028, 1050–1051, 1057–1058, 1067–1077, 1084–1090, 1097–1103, 1110–1119, 1126–1129)
packages/azure-ai/agent_framework_azure_ai
   _agent_provider.py: 115 stmts, 3 miss, 97% cover (missing: 122–123, 251)
   _chat_client.py: 484 stmts, 75 miss, 84% cover (missing: 382, 387–388, 390–391, 394, 397, 399, 404, 665–666, 668, 671, 674, 677–682, 685, 687, 695, 707–709, 713, 716–717, 725–728, 738, 746–749, 751–752, 754–755, 762, 770–771, 779–780, 785–786, 790–797, 802, 805, 813, 819, 827–829, 832, 854–855, 988, 1016, 1031, 1152, 1178, 1187, 1196, 1329)
   _client.py: 195 stmts, 13 miss, 93% cover (missing: 360, 362, 411, 440–445, 488, 523, 525, 601)
   _project_provider.py: 115 stmts, 6 miss, 94% cover (missing: 132–133, 211, 309, 353, 386)
packages/chatkit/agent_framework_chatkit
   _converter.py: 133 stmts, 46 miss, 65% cover (missing: 116, 121, 169, 171, 341, 394, 396, 415–417, 419, 437, 439, 441, 444, 456, 466, 484, 504–528, 530–532)
packages/copilotstudio/agent_framework_copilotstudio
   _agent.py: 83 stmts, 5 miss, 93% cover (missing: 155–156, 191, 199, 316)
packages/core/agent_framework
   _agents.py: 320 stmts, 35 miss, 89% cover (missing: 473, 885, 921, 1020–1022, 1135, 1176, 1178, 1187–1192, 1198, 1200, 1210–1211, 1218, 1220–1221, 1229–1233, 1241–1242, 1244, 1249, 1251, 1285, 1325, 1345)
   _clients.py: 52 stmts, 3 miss, 94% cover (missing: 294, 495, 497)
   _middleware.py: 335 stmts, 16 miss, 95% cover (missing: 80, 83, 88, 797, 799, 801, 922, 949, 951, 976, 1057, 1061, 1183, 1187, 1248, 1322)
   _serialization.py: 105 stmts, 4 miss, 96% cover (missing: 516, 532, 542, 610)
   _tools.py: 793 stmts, 73 miss, 90% cover (missing: 232, 278, 329, 331, 359, 529, 564–565, 667, 669, 689, 707, 721, 733, 738, 740, 747, 780, 851–853, 894, 919–928, 934–943, 979, 987, 1228, 1433, 1490, 1494, 1574–1577, 1595, 1597–1598, 1703, 1759, 1761, 1777, 1779, 1844, 1871, 1924, 1992, 2194, 2223–2224, 2339–2344)
   _types.py: 1130 stmts, 102 miss, 90% cover (missing: 86, 109–110, 164, 169, 188, 190, 194, 198, 200, 202, 204, 222, 226, 252, 274, 279, 284, 288, 314, 318, 664–665, 1036, 1098, 1115, 1133, 1138, 1156, 1166, 1183–1184, 1186, 1204–1205, 1207, 1214–1215, 1217, 1252, 1263–1264, 1266, 1304, 1549, 1554, 1558, 1562, 1748, 1757, 1767, 1812, 1855–1860, 1882, 1887, 2182, 2288, 2297, 2445, 2672, 2676, 2688, 2695, 2706, 2866–2868, 2907, 2999, 3026, 3035, 3294–3296, 3299–3301, 3305, 3310, 3314, 3426–3428, 3456, 3510, 3514–3516, 3518, 3529–3530, 3533–3537, 3543)
   exceptions.py: 48 stmts, 0 miss, 100% cover
   observability.py: 606 stmts, 84 miss, 86% cover (missing: 332, 334–336, 339–341, 346–347, 353–354, 360–361, 368, 370–372, 375–377, 382–383, 389–390, 396–397, 404, 660, 663, 671–672, 675–678, 680, 683–685, 688–689, 717, 719, 730–732, 734–737, 741, 749, 850, 852, 1001, 1003, 1007–1012, 1014, 1017–1021, 1023, 1135–1136, 1138, 1189–1190, 1325, 1373–1374, 1490–1492, 1551, 1721, 1875, 1877)
packages/core/agent_framework/_workflows
   _agent.py: 284 stmts, 45 miss, 84% cover (missing: 62, 70–76, 104–105, 297, 355, 369, 382, 431–434, 440, 446, 450–451, 454–460, 464–465, 534, 541, 547–548, 559, 591, 598, 619, 628, 632, 634–636, 643)
   _agent_executor.py: 171 stmts, 23 miss, 86% cover (missing: 95, 117, 151, 167–168, 219–220, 222–223, 255–257, 265–267, 277–279, 281, 285, 289, 293–294)
   _const.py: 6 stmts, 0 miss, 100% cover
   _handoff.py: 381 stmts, 57 miss, 85% cover (missing: 110–111, 113, 142–143, 168–178, 180, 182, 184, 189, 291, 345, 370, 396, 404–405, 419, 468–469, 499, 546–548, 731, 738, 743, 830, 833, 842–845, 855, 860, 867, 873–876, 911, 916, 1113, 1116, 1124, 1142, 1149, 1224)
   _magentic.py: 614 stmts, 91 miss, 85% cover (missing: 69–78, 83, 87–98, 263, 274, 278, 298, 359, 368, 370, 412, 429, 438–439, 441–443, 445, 456, 598, 600, 640, 688, 724–726, 728, 736–739, 743–746, 789, 816–819, 910, 916, 922, 961, 999, 1028, 1045, 1056, 1110–1111, 1115–1117, 1141, 1162–1163, 1176, 1192, 1214, 1262–1263, 1301–1302, 1458, 1467, 1470, 1475, 1871, 1913, 1928, 1957)
   _runner_context.py: 168 stmts, 6 miss, 96% cover (missing: 84, 87, 383, 403, 491, 495)
   _workflow.py: 252 stmts, 17 miss, 93% cover (missing: 89, 259–261, 263–264, 282, 310, 411, 679, 713, 718, 721, 740–742, 807)
   _workflow_builder.py: 278 stmts, 36 miss, 87% cover (missing: 259, 594, 693, 700–701, 802, 805, 810, 812, 819, 822–826, 828, 890, 965, 968, 1028–1029, 1174, 1188–1195, 1197, 1200, 1202–1204, 1212)
   _workflow_context.py: 177 stmts, 24 miss, 86% cover (missing: 63–64, 72, 76, 90, 166, 191, 309, 428, 471–473, 475, 477–478, 480–481, 490–492, 494–496, 498)
packages/core/agent_framework/azure
   _chat_client.py: 79 stmts, 4 miss, 94% cover (missing: 301, 303, 316–317)
   _responses_client.py: 37 stmts, 6 miss, 83% cover (missing: 146, 169, 198–201)
packages/core/agent_framework/openai
   _assistant_provider.py: 110 stmts, 11 miss, 90% cover (missing: 156–157, 169, 294, 360, 475–480)
   _assistants_client.py: 275 stmts, 35 miss, 87% cover (missing: 359, 361, 363, 366, 370–371, 374, 377, 382–383, 385, 388–390, 395, 406, 431, 433, 435, 437, 439, 444, 447, 450, 454, 465, 550, 635, 672, 709–712, 764, 781)
   _chat_client.py: 264 stmts, 21 miss, 92% cover (missing: 180–181, 185, 295, 302, 383–390, 392–395, 405, 490, 527, 543)
   _responses_client.py: 560 stmts, 62 miss, 88% cover (missing: 277–278, 283, 314, 322, 345, 407, 439, 464, 470, 488–489, 511, 516, 572, 587, 601, 614, 669, 748, 753, 757–759, 763–764, 787, 856, 878–879, 894–895, 913–914, 1045–1046, 1062, 1064, 1139–1147, 1195, 1250, 1265, 1301–1302, 1304–1306, 1320–1322, 1332–1333, 1339, 1354)
   _shared.py: 135 stmts, 16 miss, 88% cover (missing: 63, 69–72, 151, 153, 155, 162, 164, 177, 253, 277, 341–342, 344)
packages/purview/agent_framework_purview
   _middleware.py: 95 stmts, 0 miss, 100% cover
TOTAL: 16500 stmts, 2022 miss, 87% cover

Python Unit Test Overview

Tests: 3854, Skipped: 225 💤, Failures: 0 ❌, Errors: 0 🔥, Time: 1m 12s ⏱️

@eavanvalkenburg eavanvalkenburg changed the title from "Python: [BREAKING} Python single response" to "Python: [BREAKING] Moved to a single get_response and run API" Jan 23, 2026
@eavanvalkenburg eavanvalkenburg force-pushed the python_single_response branch 4 times, most recently from 32f0473 to 5c78d91 on January 30, 2026 05:03
@eavanvalkenburg eavanvalkenburg requested a review from a team as a code owner January 30, 2026 16:25
@markwallace-microsoft markwallace-microsoft added .NET workflows Related to Workflows in agent-framework lab Agent Framework Lab labels Feb 1, 2026
@github-actions github-actions bot changed the title from "Python: [BREAKING] Moved to a single get_response and run API" to ".NET: Python: [BREAKING] Moved to a single get_response and run API" Feb 1, 2026
@eavanvalkenburg eavanvalkenburg changed the title from ".NET: Python: [BREAKING] Moved to a single get_response and run API" to "Python: [BREAKING] Moved to a single get_response and run API" Feb 1, 2026
@eavanvalkenburg
Member Author

Fixes lingering CI failures: import missing response types in streaming telemetry finalizers, move AG-UI tests to ag_ui_tests with config updates, and track service thread IDs in AG-UI test client.

Checks: uv run poe fmt/lint/pyright/mypy; uv run poe all-tests.

Duration is a metrics-only attribute per OpenTelemetry semantic conventions.
It should be recorded to the histogram but not set as a span attribute.
…ure_response

Duration is a metrics-only attribute. It's now passed directly to _capture_response
instead of being included in the attributes dict that gets set on the span.
_finalize_stream already calls _close_span() in its finally block,
so adding it as a separate cleanup hook is redundant.
If a user creates a streaming response but never consumes it, the cleanup
hooks won't run. Now we register a weak reference finalizer that will close
the span when the stream object is garbage collected, ensuring spans don't
leak in this scenario.
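
A small sketch of the weak-reference safety net described in the commit above; FakeSpan and StreamWithSafetyNet are made-up stand-ins rather than the framework's observability code.

```python
import weakref


class FakeSpan:
    """Stand-in for a telemetry span."""

    def __init__(self) -> None:
        self.closed = False

    def end(self) -> None:
        if not self.closed:
            self.closed = True
            print("span closed")


class StreamWithSafetyNet:
    """If the stream is never consumed, garbage collection still ends the span."""

    def __init__(self, span: FakeSpan) -> None:
        self._span = span
        # Register against the span's bound method, not `self`, so the
        # finalizer holds no strong reference to the stream object itself.
        self._close = weakref.finalize(self, span.end)

    def close(self) -> None:
        # Normal path: calling the finalizer runs it once and marks it dead.
        self._close()


span = FakeSpan()
stream = StreamWithSafetyNet(span)
del stream          # never consumed; CPython collects it and the finalizer runs
print(span.closed)  # True
```
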
Renamed function to _get_result_hooks_from_stream and fixed it to
look for the _result_hooks attribute, which is the correct name in
the ResponseStream class.
@eavanvalkenburg eavanvalkenburg added this pull request to the merge queue Feb 4, 2026
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Feb 4, 2026
Replace ChatResponse.from_chat_response_generator() with the new
ResponseStream.get_final_response() pattern in integration tests for:
- Azure AI client
- OpenAI chat client
- OpenAI responses client
- Azure responses client
@eavanvalkenburg eavanvalkenburg added this pull request to the merge queue Feb 4, 2026
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Feb 4, 2026
Tests with tool_choice options require at least 2 iterations:
1. First iteration to get function call and execute the tool
2. Second iteration to get the final text response

With max_iterations=1, streaming tests would return early with only
the function call/result but no final text content.
When using conversation_id (for Responses/Assistants APIs), the server
already has the function call message from the previous response. We
should only send the new function result message, not all messages
including the function call which would cause a duplicate ID error.

Fix: When conversation_id is set, only send the last message (the tool
result) instead of all response.messages.
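
A rough sketch of that rule, using hypothetical dict-shaped messages rather than the framework's message types.

```python
def messages_to_send(messages: list[dict], conversation_id: str | None) -> list[dict]:
    """When the server already stores the conversation, only send what is new."""
    if conversation_id is not None:
        # Earlier turns (including the function call) already live server-side;
        # resending them would trigger duplicate-ID errors.
        return messages[-1:]
    return messages


history = [
    {"role": "user", "content": "What's the weather?"},
    {"role": "assistant", "content": None, "function_call_id": "call_1"},
    {"role": "tool", "content": "Sunny", "function_call_id": "call_1"},
]
print(messages_to_send(history, "conv_123"))  # only the new tool result
print(len(messages_to_send(history, None)))   # 3: the full local history
```
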
…ations

Port test from PR microsoft#3664 with updates for new streaming API pattern.
Tests that conversation_id is properly updated in options dict during
function invocation loop iterations.
When tool_choice is 'required', the user's intent is to force exactly one
tool call. After the tool executes, return immediately with the function
call and result - don't continue to call the model again.

This fixes integration tests that were failing with empty text responses
because with tool_choice=required, the model would keep returning function
calls instead of text.

Also adds regression tests for:
- conversation_id propagation between tool iterations (from PR microsoft#3664)
- tool_choice=required returns after tool execution
- Add table explaining tool_choice values (auto, none, required)
- Explain why tool_choice=required returns immediately after tool execution
- Add code example showing the difference between required and auto
- Update flow diagram to show the early return path for tool_choice=required
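
Roughly, the loop behaves as sketched below; FakeClient, FakeResponse, and invocation_loop are illustrative stand-ins, not the framework's function-invocation code.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class FakeResponse:
    text: str = ""
    function_calls: list[str] = field(default_factory=list)


class FakeClient:
    """Always asks for a tool, as a model tends to when tool_choice='required'."""

    async def get_response(self, messages: list[str], tool_choice: str | None) -> FakeResponse:
        return FakeResponse(function_calls=["get_weather"])


async def invocation_loop(client: FakeClient, messages: list[str], tool_choice: str | None,
                          max_iterations: int = 3) -> FakeResponse:
    response = FakeResponse()
    for _ in range(max_iterations):
        response = await client.get_response(messages, tool_choice)
        if not response.function_calls:
            return response  # plain text answer, nothing to invoke
        results = [f"{name} -> Sunny" for name in response.function_calls]  # "execute" tools
        messages = messages + response.function_calls + results
        if tool_choice == "required":
            # The caller forced a tool call: return the call and its result now,
            # instead of asking the model again for a text turn it may never give.
            return FakeResponse(text="; ".join(results))
    return response


print(asyncio.run(invocation_loop(FakeClient(), ["What's the weather?"], "required")).text)
```
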
Remove the hardcoded default of 'auto' for tool_choice in ChatAgent init.
When tool_choice is not specified (None), it will now not be sent to the
API, allowing the API's default behavior to be used.

Users who want tool_choice='auto' can still explicitly set it either in
default_options or at runtime.

Fixes microsoft#3585
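
In request-building terms, the intended behavior is roughly the following (hypothetical helper, not the client's real code): include tool_choice only when the caller set it.

```python
def build_options(model: str, tool_choice: str | None = None) -> dict:
    """Only include tool_choice when it was explicitly set by the caller."""
    options: dict = {"model": model}
    if tool_choice is not None:
        options["tool_choice"] = tool_choice
    return options


print(build_options("some-model"))                      # no tool_choice: API default applies
print(build_options("some-model", tool_choice="auto"))  # explicit opt-in still works
```
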
In OpenAI Assistants client, tools were not being sent when
tool_choice='none'. This was incorrect - tool_choice='none' means
the model won't call tools, but tools should still be available
in the request (they may be used later in the conversation).

Fixes microsoft#3585
Adds a regression test to ensure that when tool_choice='none' is set but
tools are provided, the tools are still sent to the API. This verifies
the fix for microsoft#3585.
Apply the same fix to OpenAI Responses client and Azure AI client:
- OpenAI Responses: Remove else block that popped tool_choice/parallel_tool_calls
- Azure AI: Remove tool_choice != 'none' check when adding tools

When tool_choice='none', the model won't call tools, but tools should
still be sent to the API so they're available for future turns.

Also update README to clarify tool_choice=required supports multiple tools.

Fixes microsoft#3585
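
In the same hypothetical request-building terms: tools stay in the payload regardless of tool_choice, and 'none' only suppresses calling them on the current turn.

```python
def build_tool_options(tools: list[dict], tool_choice: str | None) -> dict:
    """Tools stay in the request; tool_choice='none' only disables calling them now."""
    options: dict = {}
    if tools:
        options["tools"] = tools
    if tool_choice is not None:
        options["tool_choice"] = tool_choice
    return options


opts = build_tool_options([{"name": "get_weather"}], "none")
print("tools" in opts, opts["tool_choice"])  # True none  (tools remain available for later turns)
```
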
Move tool_choice processing outside of the 'if tools' block in OpenAI
Responses client so tool_choice is sent to the API even when no tools
are provided.
Changed test_prepare_options_removes_parallel_tool_calls_when_no_tools to
test_prepare_options_preserves_parallel_tool_calls_when_no_tools to reflect
that parallel_tool_calls is now preserved even when no tools are present,
consistent with the tool_choice behavior.