Description
This proposal is a spin-off of https://github.com/microsoft/agent-framework/blob/main/docs/decisions/0013-python-get-response-simplification.md for .NET and (possibly) Python.
Context and Problem Statement
Currently chat clients must implement two separate methods to get responses, one for streaming and one for non-streaming. This adds complexity to the client implementations and increases the maintenance burden. The split was likely introduced because .NET cannot express proper typing for both modes with a single method. There are, however, techniques that provide a single method without losing type safety, which would also make AIAgent simpler to work with because there is only one method to learn instead of two.
Proposal
- Add AgentRunOptions.Streaming so that callers and agent implementations can set and check whether streaming is enabled.
- Remove AIAgent.RunStreamingAsync and AIAgent.RunCoreStreamingAsync.
- Change AIAgent.RunCoreAsync to return IAsyncEnumerable<AgentResponseUpdate>.
- All agent implementations (including delegated agents, aka middlewares) will only have to implement AIAgent.RunCoreAsync and, when supported, will use streaming if AgentRunOptions.Streaming is true.
- Define a new public sealed class, AgentRun (not attached to this name, feel free to change it).
- Update AIAgent.RunAsync to return AgentRun instead of Task<AgentResponse>.
- Remove the cancellationToken parameter from AIAgent.RunAsync.
- AgentRun will be in charge of calling AIAgent.RunCoreAsync with the parameters passed to AIAgent.RunAsync.
- AgentRun will implement IAsyncEnumerable<AgentResponseUpdate> so it can be iterated without further boilerplate.
- AgentRun will be awaitable and convertible to Task<AgentResponse>, so that AgentResponse resp = await agent.RunAsync("hello") still works.
- AgentRun will set AgentRunOptions.Streaming (unless the caller explicitly set it) before calling AIAgent.RunCoreAsync when being iterated.
- AgentRun can be documented to support being executed multiple times, although the results might be different on each invocation depending on the LLM provider.
Note that agent implementations won't have to deal with AgentRun; it is only a convenience artifact for agent consumers. Consumers are also not expected to interact with AgentRun directly, at least in the common case where it is awaited to obtain an AgentResponse or iterated as a stream of updates.
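To make the dual-use idea concrete, here is a minimal, self-contained sketch of the pattern in isolation. MiniRun, EchoCore and the string-based updates are hypothetical stand-ins for AgentRun, AIAgent.RunCoreAsync and AgentResponseUpdate; none of them are framework types, and the real implementation may differ.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Text;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for AgentRun: awaitable and async-enumerable at once.
public sealed class MiniRun : IAsyncEnumerable<string>
{
    // Plays the role of AIAgent.RunCoreAsync; the bool? is the Streaming flag (null = unset).
    private readonly Func<bool?, CancellationToken, IAsyncEnumerable<string>> _core;
    private readonly bool? _streaming;

    public MiniRun(Func<bool?, CancellationToken, IAsyncEnumerable<string>> core, bool? streaming = null)
    {
        _core = core;
        _streaming = streaming;
    }

    // Enables "await run"; the flag is passed through exactly as the caller set it.
    public TaskAwaiter<string> GetAwaiter() => AsResponseAsync().GetAwaiter();

    // Aggregates every update into one complete response.
    public async Task<string> AsResponseAsync(CancellationToken cancellationToken = default)
    {
        var text = new StringBuilder();
        await foreach (var update in _core(_streaming, cancellationToken).WithCancellation(cancellationToken))
        {
            text.Append(update);
        }
        return text.ToString();
    }

    // Enables "await foreach"; the flag defaults to true unless the caller set it.
    public IAsyncEnumerator<string> GetAsyncEnumerator(CancellationToken cancellationToken = default) =>
        _core(_streaming ?? true, cancellationToken).GetAsyncEnumerator(cancellationToken);
}

public static class Demo
{
    // Hypothetical agent core: yields several chunks when streaming is requested,
    // otherwise a single update carrying the whole text.
    public static async IAsyncEnumerable<string> EchoCore(
        bool? streaming,
        [EnumeratorCancellation] CancellationToken cancellationToken = default)
    {
        string[] parts = streaming == true
            ? new[] { "Hello", ", ", "world" }
            : new[] { "Hello, world" };

        foreach (var part in parts)
        {
            cancellationToken.ThrowIfCancellationRequested();
            await Task.Yield();
            yield return part;
        }
    }

    public static async Task Main()
    {
        var run = new MiniRun(EchoCore);

        // Non-streaming consumption: a single aggregated result.
        string full = await run;
        Console.WriteLine(full);

        // Streaming consumption: this executes the run a second time.
        await foreach (var update in run)
        {
            Console.WriteLine(update);
        }
    }
}
```

In this sketch the same MiniRun instance is first awaited and then iterated, so the core delegate runs twice, which mirrors the "can be executed multiple times" point in the list above.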
Implementation draft
public class AgentRunOptions
{
    /// <summary>
    /// Gets or sets a value indicating whether streaming is enabled.
    /// </summary>
    /// <remarks>
    /// <para>
    /// When set to <see langword="true"/>, the agent will produce streaming response updates that can be
    /// consumed as they are generated. When set to <see langword="false"/>, the agent will return a complete
    /// response after all processing is done.
    /// </para>
    /// <para>
    /// When iterating over the result of <see cref="AIAgent.RunAsync(IEnumerable{ChatMessage}, AgentSession?, AgentRunOptions?)"/>
    /// as an <see cref="System.Collections.Generic.IAsyncEnumerable{T}"/>, this property will be automatically set to <see langword="true"/>
    /// unless explicitly configured by the caller.
    /// </para>
    /// <para>
    /// This property only takes effect if the implementation supports streaming.
    /// If the implementation does not support streaming, this property will be ignored.
    /// </para>
    /// </remarks>
    public bool? Streaming { get; set; }

    ...
}

public abstract class AIAgent
{
    ...

    public AgentRun RunAsync(
        IEnumerable<ChatMessage> messages,
        AgentSession? session = null,
        AgentRunOptions? options = null) =>
        new(this, messages, session, options);

    ...
}

/// <summary>
/// Represents a pending agent run that can be awaited as an <see cref="AgentResponse"/> or iterated
/// as an <see cref="IAsyncEnumerable{T}"/> of <see cref="AgentResponseUpdate"/> instances.
/// </summary>
/// <remarks>
/// <para>
/// <see cref="AgentRun"/> provides a unified way to interact with agent responses regardless of
/// whether streaming is desired. It can be used in two ways:
/// </para>
/// <para>
/// 1. <b>Non-streaming:</b> Await the result directly to get a complete <see cref="AgentResponse"/>:
/// <code>
/// AgentResponse response = await agent.RunAsync("Hello");
/// </code>
/// </para>
/// <para>
/// 2. <b>Streaming:</b> Iterate over the result to receive incremental <see cref="AgentResponseUpdate"/> instances:
/// <code>
/// await foreach (var update in agent.RunAsync("Hello"))
/// {
///     Console.Write(update.Text);
/// }
/// </code>
/// </para>
/// <para>
/// When iterating, <see cref="AgentRunOptions.Streaming"/> is automatically set to <see langword="true"/>
/// unless the caller has explicitly set it. This enables agent implementations to optimize their
/// behavior based on whether streaming is requested.
/// </para>
/// <para>
/// An <see cref="AgentRun"/> can be executed multiple times, although the results may differ on
/// each invocation depending on the underlying model provider.
/// </para>
/// </remarks>
public sealed class AgentRun : IAsyncEnumerable<AgentResponseUpdate>
{
    private readonly AIAgent _agent;
    private readonly IEnumerable<ChatMessage> _messages;
    private readonly AgentSession? _session;
    private readonly AgentRunOptions? _options;

    /// <summary>
    /// Initializes a new instance of the <see cref="AgentRun"/> class.
    /// </summary>
    /// <param name="agent">The agent to run.</param>
    /// <param name="messages">The messages to send to the agent.</param>
    /// <param name="session">The conversation session to use for this invocation.</param>
    /// <param name="options">Optional configuration parameters for controlling the agent's invocation behavior.</param>
    internal AgentRun(
        AIAgent agent,
        IEnumerable<ChatMessage> messages,
        AgentSession? session,
        AgentRunOptions? options)
    {
        _ = Throw.IfNull(agent);
        _ = Throw.IfNull(messages);

        this._agent = agent;
        this._messages = messages;
        this._session = session;
        this._options = options;
    }

    /// <summary>
    /// Gets an awaiter used to await this <see cref="AgentRun"/>.
    /// </summary>
    /// <returns>An awaiter instance.</returns>
    /// <remarks>
    /// This enables using <c>await</c> directly on an <see cref="AgentRun"/> to get a complete <see cref="AgentResponse"/>.
    /// When awaited, <see cref="AgentRunOptions.Streaming"/> is left exactly as the caller configured it, so the
    /// agent implementation uses its default (typically non-streaming) behavior unless the caller set it explicitly.
    /// To pass a cancellation token, use <see cref="AsResponseAsync(CancellationToken)"/> instead.
    /// </remarks>
    public TaskAwaiter<AgentResponse> GetAwaiter() =>
        this.AsResponseAsync().GetAwaiter();

    /// <summary>
    /// Executes the agent run and returns the complete response.
    /// </summary>
    /// <param name="cancellationToken">The <see cref="CancellationToken"/> to monitor for cancellation requests. The default is <see cref="CancellationToken.None"/>.</param>
    /// <returns>A task that represents the asynchronous operation. The task result contains the complete <see cref="AgentResponse"/>.</returns>
    /// <remarks>
    /// This method aggregates all streaming updates into a single <see cref="AgentResponse"/>.
    /// The <see cref="AgentRunOptions.Streaming"/> property will not be modified, so the agent implementation
    /// will use its default streaming behavior unless the caller has explicitly set it.
    /// </remarks>
    public Task<AgentResponse> AsResponseAsync(CancellationToken cancellationToken = default) =>
        this._agent
            .RunCoreAsync(this._messages, this._session, this._options, cancellationToken)
            .ToAgentResponseAsync(cancellationToken);

    /// <summary>
    /// Returns an enumerator that asynchronously iterates through the streaming response updates.
    /// </summary>
    /// <param name="cancellationToken">
    /// A <see cref="CancellationToken"/> that may be used to cancel the asynchronous iteration.
    /// </param>
    /// <returns>An enumerator that can be used to asynchronously iterate through the streaming response updates.</returns>
    /// <remarks>
    /// <para>
    /// When iteration begins, <see cref="AgentRunOptions.Streaming"/> is automatically set to <see langword="true"/>
    /// unless the caller has explicitly configured it. This signals to the agent implementation that streaming
    /// output is desired.
    /// </para>
    /// <para>
    /// The enumeration can be performed multiple times, and each enumeration will execute a new agent run.
    /// Results may differ between enumerations depending on the underlying model provider.
    /// </para>
    /// </remarks>
    public IAsyncEnumerator<AgentResponseUpdate> GetAsyncEnumerator(CancellationToken cancellationToken = default)
    {
        // Work on a copy with streaming enabled if the caller did not set it,
        // so the caller's options instance is never mutated.
        AgentRunOptions? effectiveOptions = this._options;
        if (effectiveOptions?.Streaming is null)
        {
            effectiveOptions = effectiveOptions is null
                ? new AgentRunOptions { Streaming = true }
                : new AgentRunOptions(effectiveOptions) { Streaming = true };
        }

        return this._agent
            .RunCoreAsync(this._messages, this._session, effectiveOptions, cancellationToken)
            .GetAsyncEnumerator(cancellationToken);
    }

    /// <summary>
    /// Implicitly converts an <see cref="AgentRun"/> to a <see cref="Task{AgentResponse}"/>.
    /// </summary>
    /// <param name="run">The <see cref="AgentRun"/> to convert.</param>
    /// <returns>A task that represents the asynchronous operation. The task result contains the complete <see cref="AgentResponse"/>.</returns>
    /// <remarks>
    /// This conversion enables code like:
    /// <code>
    /// Task&lt;AgentResponse&gt; responseTask = agent.RunAsync("Hello");
    /// AgentResponse response = await responseTask;
    /// </code>
    /// </remarks>
    public static implicit operator Task<AgentResponse>(AgentRun run)
    {
        _ = Throw.IfNull(run);
        return run.AsResponseAsync();
    }
}
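Since AIAgent.RunAsync would no longer take a cancellationToken, the draft points awaiting callers at AsResponseAsync(CancellationToken). For the iteration path, the natural route is presumably the standard WithCancellation extension, which forwards the token to GetAsyncEnumerator. The sketch below shows that mechanism on its own; Updates is a hypothetical stand-in for a streaming run, not a framework API.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Tasks;

public static class CancellationDemo
{
    // Hypothetical stand-in for a streaming run: yields updates until cancelled.
    public static async IAsyncEnumerable<int> Updates(
        [EnumeratorCancellation] CancellationToken cancellationToken = default)
    {
        for (int i = 0; ; i++)
        {
            cancellationToken.ThrowIfCancellationRequested();
            await Task.Yield();
            yield return i;
        }
    }

    // Consumes updates and cancels after receiving the update with value 2.
    public static async Task<int> ConsumeAsync()
    {
        using var cts = new CancellationTokenSource();
        int seen = 0;
        try
        {
            // WithCancellation passes cts.Token to GetAsyncEnumerator(token),
            // even though the producer was created without a token.
            await foreach (var update in Updates().WithCancellation(cts.Token))
            {
                seen++;
                if (update == 2)
                {
                    cts.Cancel();
                }
            }
        }
        catch (OperationCanceledException)
        {
            // Expected: the producer observes the token on its next step.
        }

        return seen;
    }

    public static async Task Main() =>
        Console.WriteLine(await ConsumeAsync());
}
```

With the proposed AgentRun, the equivalent call site would presumably read `await foreach (var update in agent.RunAsync("Hello").WithCancellation(token))`, since AgentRun.GetAsyncEnumerator in the draft accepts the token and hands it to RunCoreAsync.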