Changelog

[Unreleased]

[0.55.0] - 2026-02-24

[0.54.2] - 2026-02-23

[0.54.1] - 2026-02-22

[0.54.0] - 2026-02-19

[0.53.1] - 2026-02-19

[0.53.0] - 2026-02-17

Added

  • Added Anthropic claude-sonnet-4-6 fallback model entry to generated model definitions.

[0.52.12] - 2026-02-13

Added

  • Added transport to StreamOptions with values "sse", "websocket", and "auto" (currently supported by openai-codex-responses).
  • Added WebSocket transport support for OpenAI Codex Responses (openai-codex-responses).

Changed

  • OpenAI Codex Responses now defaults to SSE transport unless transport is explicitly set.
  • OpenAI Codex Responses WebSocket connections are cached per sessionId and expire after 5 minutes of inactivity.
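The per-session caching described above can be pictured as an idle-expiring map. This is a minimal sketch, not the library's implementation: IdleCache and its methods are hypothetical names, and only the sessionId key and the 5-minute inactivity window come from the entry.

```typescript
// Hypothetical idle-expiring cache keyed by sessionId (illustration only).
const IDLE_TTL_MS = 5 * 60 * 1000; // 5 minutes, per the changelog entry

interface CacheEntry<T> {
  value: T;
  lastUsedMs: number;
}

class IdleCache<T> {
  private entries = new Map<string, CacheEntry<T>>();
  private ttlMs: number;

  constructor(ttlMs: number = IDLE_TTL_MS) {
    this.ttlMs = ttlMs;
  }

  // Returns the cached value, or undefined if it is missing or has been idle too long.
  get(sessionId: string, nowMs: number): T | undefined {
    const entry = this.entries.get(sessionId);
    if (!entry) return undefined;
    if (nowMs - entry.lastUsedMs >= this.ttlMs) {
      this.entries.delete(sessionId);
      return undefined;
    }
    entry.lastUsedMs = nowMs; // any use resets the inactivity timer
    return entry.value;
  }

  set(sessionId: string, value: T, nowMs: number): void {
    this.entries.set(sessionId, { value, lastUsedMs: nowMs });
  }
}
```

Passing the clock in as nowMs keeps the sketch deterministic; a real cache would also close the underlying connection when it evicts an entry.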

[0.52.11] - 2026-02-13

Added

  • Added MiniMax M2.5 model entries for minimax, minimax-cn, openrouter, and vercel-ai-gateway providers, plus minimax-m2.5-free for opencode.

[0.52.10] - 2026-02-12

Added

  • Added optional metadata field to StreamOptions for passing provider-specific metadata (e.g. Anthropic user_id for abuse tracking/rate limiting) (#1384 by @7Sageer)
  • Added gpt-5.3-codex-spark model definition for OpenAI and OpenAI Codex providers (128k context, text-only, research preview). Not yet functional; it may become available within hours or days of this release.

Changed

  • Routed GitHub Copilot Claude 4.x models through Anthropic Messages API, centralized Copilot dynamic header handling, and added Copilot Claude Anthropic stream coverage (#1353 by @NateSmyth)

Fixed

  • Fixed OpenAI completions and responses streams to tolerate malformed trailing tool-call JSON without failing parsing (#1424)

[0.52.9] - 2026-02-08

Changed

  • Updated the Antigravity system instruction to a more compact version for Google Gemini CLI compatibility

Fixed

  • Fixed Google provider tool declarations to use parametersJsonSchema, which supports full JSON Schema (anyOf, oneOf, const, etc.) (#1398 by @jarib)
  • Reverted incorrect Antigravity model change: claude-opus-4-6-thinking back to claude-opus-4-5-thinking (model doesn't exist on Antigravity endpoint)
  • Corrected opencode context windows for Claude Sonnet 4 and 4.5 (#1383)

[0.52.8] - 2026-02-07

Added

  • Added OpenRouter auto model alias for automatic model routing (#1361 by @yogasanas)

[0.52.7] - 2026-02-06

Added

  • Added AWS_BEDROCK_SKIP_AUTH and AWS_BEDROCK_FORCE_HTTP1 environment variables for connecting to unauthenticated Bedrock proxies (#1320 by @virtuald)

Fixed

  • Set OpenAI Responses API requests to store: false by default to avoid server-side history logging (#1308)
  • Re-exported TypeBox Type, Static, and TSchema from @mariozechner/pi-ai to match documentation and avoid duplicate TypeBox type identity issues in pnpm setups (#1338)
  • Fixed Bedrock adaptive thinking handling for Claude Opus 4.6 with interleaved thinking beta responses (#1323 by @markusylisiurunen)
  • Fixed AWS_BEDROCK_SKIP_AUTH environment detection to avoid process access in non-Node.js environments

[0.52.6] - 2026-02-05

[0.52.5] - 2026-02-05

Fixed

  • Fixed supportsXhigh() to treat Anthropic Messages Opus 4.6 models as xhigh-capable so streamSimple can map xhigh to adaptive effort max

[0.52.4] - 2026-02-05

[0.52.3] - 2026-02-05

Fixed

  • Fixed Bedrock Opus 4.6 model IDs (removed :0 suffix) and cache pricing for us.* and eu.* variants
  • Added missing eu.anthropic.claude-opus-4-6-v1 inference profile to model catalog
  • Fixed Claude Opus 4.6 context window metadata to 200000 for Anthropic and OpenCode providers

[0.52.2] - 2026-02-05

[0.52.1] - 2026-02-05

Added

  • Added adaptive thinking support for Claude Opus 4.6 with effort levels (low, medium, high, max)
  • Added effort option to AnthropicOptions for controlling adaptive thinking depth
  • thinkingEnabled now automatically uses adaptive thinking for Opus 4.6+ models and budget-based thinking for older models
  • streamSimple/completeSimple automatically map ThinkingLevel to effort levels for Opus 4.6

Changed

  • Updated @anthropic-ai/sdk to 0.73.0
  • Updated @aws-sdk/client-bedrock-runtime to 3.983.0
  • Updated @google/genai to 1.40.0
  • Removed fast-xml-parser override (no longer needed)

[0.52.0] - 2026-02-05

Added

  • Added Claude Opus 4.6 model to the generated model catalog
  • Added GPT-5.3 Codex model to the generated model catalog (OpenAI Codex provider only)

[0.51.6] - 2026-02-04

Fixed

  • Fixed OpenAI Codex Responses provider to respect configured baseUrl (#1244)

[0.51.5] - 2026-02-04

Changed

  • Changed Bedrock model generation to drop legacy workarounds now handled upstream (#1239 by @unexge)

[0.51.4] - 2026-02-03

[0.51.3] - 2026-02-03

Fixed

  • Fixed xhigh thinking level support check to accept gpt-5.2 model IDs (#1209)

[0.51.2] - 2026-02-03

[0.51.1] - 2026-02-02

Fixed

  • Fixed cache_control not being applied to string-format user messages in Anthropic provider

[0.51.0] - 2026-02-01

Fixed

  • Fixed cacheRetention option not being passed through in buildBaseOptions (#1154)
  • Fixed OAuth login/refresh not using HTTP proxy settings (HTTP_PROXY, HTTPS_PROXY env vars) (#1132)
  • Fixed OpenAI-compatible completions to omit unsupported strict tool fields for providers that reject them (#1172)

[0.50.9] - 2026-02-01

Added

  • Added PI_AI_ANTIGRAVITY_VERSION environment variable to override the Antigravity User-Agent version when Google updates their version requirements (#1129)
  • Added cacheRetention stream option with provider-specific mappings for prompt cache controls, defaulting to short retention (#1134)

[0.50.8] - 2026-02-01

Added

  • Added maxRetryDelayMs option to StreamOptions to cap server-requested retry delays. When a provider (e.g., Google Gemini CLI) requests a delay longer than this value, the request fails immediately with an informative error instead of waiting silently. Default: 60000ms (60 seconds). Set to 0 to disable the cap. (#1123)
  • Added Qwen thinking format support for OpenAI-compatible completions via enable_thinking. (#940 by @4h9fbZ)
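The maxRetryDelayMs rule above can be sketched as a small guard. This is an illustration under assumptions: resolveRetryDelay is a hypothetical helper, and only the option name, the 60000ms default, and the "0 disables the cap" behavior come from the entry.

```typescript
// Hypothetical helper illustrating the maxRetryDelayMs cap (not the library's code).
const DEFAULT_MAX_RETRY_DELAY_MS = 60000;

function resolveRetryDelay(
  requestedMs: number,
  maxRetryDelayMs: number = DEFAULT_MAX_RETRY_DELAY_MS,
): number {
  const capDisabled = maxRetryDelayMs === 0;
  if (!capDisabled && requestedMs > maxRetryDelayMs) {
    // Fail immediately with an informative error instead of waiting silently.
    throw new Error(
      `Server requested a ${requestedMs}ms retry delay, above the ${maxRetryDelayMs}ms cap`,
    );
  }
  return requestedMs;
}
```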

[0.50.7] - 2026-01-31

[0.50.6] - 2026-01-30

[0.50.5] - 2026-01-30

[0.50.4] - 2026-01-30

Added

  • Added Vercel AI Gateway routing support via vercelGatewayRouting option in model config (#1051 by @ben-vargas)

Fixed

  • Updated Antigravity User-Agent from 1.11.5 to 1.15.8 to fix rejected requests (#1079)
  • Fixed tool call argument defaults for Anthropic and Google history conversion when providers omit inputs (#1065)

[0.50.3] - 2026-01-29

Added

  • Added Kimi For Coding provider support (Moonshot AI's Anthropic-compatible coding API)

[0.50.2] - 2026-01-29

Added

  • Added Hugging Face provider support via OpenAI-compatible Inference Router (#994)
  • Added PI_CACHE_RETENTION environment variable to control cache TTL for Anthropic (5m vs 1h) and OpenAI (in-memory vs 24h). Set to long for extended retention. Only applies to direct API calls (api.anthropic.com, api.openai.com). (#967)
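A rough sketch of how PI_CACHE_RETENTION could translate into the TTLs listed above. The function names and the "short"/"long" type are assumptions, and the returned strings are descriptive labels taken from this entry rather than actual request fields.

```typescript
// Hypothetical mapping; the returned strings are labels from the changelog, not API fields.
type CacheRetention = "short" | "long";

function retentionFromEnv(env: Record<string, string | undefined>): CacheRetention {
  // Anything other than "long" falls back to the short default.
  return env.PI_CACHE_RETENTION === "long" ? "long" : "short";
}

function describeCacheTtl(provider: "anthropic" | "openai", retention: CacheRetention): string {
  if (provider === "anthropic") {
    return retention === "long" ? "1h" : "5m";
  }
  return retention === "long" ? "24h" : "in-memory";
}
```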

Fixed

  • Fixed OpenAI completions toolChoice handling to correctly set type: "function" wrapper (#998 by @williamtwomey)
  • Fixed cross-provider handoff failing when switching from OpenAI Responses API providers (github-copilot, openai-codex) to other providers due to pipe-separated tool call IDs not being normalized, and trailing underscores in truncated IDs being rejected by OpenAI Codex (#1022)
  • Fixed 429 rate limit errors incorrectly triggering auto-compaction instead of retry with backoff (#1038)
  • Fixed Anthropic provider to handle sensitive stop_reason returned by API (#978)
  • Fixed DeepSeek API compatibility by detecting deepseek.com URLs and disabling unsupported developer role (#1048)
  • Fixed Anthropic provider to preserve input token counts when proxies omit them in message_delta events (#1045)

[0.50.1] - 2026-01-26

[0.50.0] - 2026-01-26

Added

  • Added OpenRouter provider routing support for custom models via openRouterRouting compat field (#859 by @v01dpr1mr0s3)
  • Added azure-openai-responses provider support for Azure OpenAI Responses API. (#890 by @markusylisiurunen)
  • Added HTTP proxy environment variable support for API requests (#942 by @haoqixu)
  • Added createAssistantMessageEventStream() factory function for use in extensions.
  • Added resetApiProviders() to clear and re-register built-in API providers.

Changed

  • Refactored API streaming dispatch to use an API registry with provider-owned streamSimple mapping.
  • Moved environment API key resolution to env-api-keys.ts and re-exported it from the package entrypoint.
  • Azure OpenAI Responses provider now uses base URL configuration with deployment-aware model mapping and no longer includes service tier handling.

Fixed

  • Fixed Bun runtime detection for dynamic imports in browser-compatible modules (stream.ts, openai-codex-responses.ts, openai-codex.ts) (#922 by @dannote)
  • Fixed streaming functions to use model.api instead of hardcoded API types
  • Fixed Google providers to default tool call arguments to an empty object when omitted
  • Fixed OpenAI Responses streaming to handle arguments.done events on OpenAI-compatible endpoints (#917 by @williballenthin)
  • Fixed OpenAI Codex Responses tool strictness handling after the shared responses refactor
  • Fixed Azure OpenAI Responses streaming to guard deltas before content parts and correct metadata and handoff gating
  • Fixed OpenAI completions tool-result image batching after consecutive tool results (#902 by @terrorobe)

[0.49.3] - 2026-01-22

Added

  • Added headers option to StreamOptions for custom HTTP headers in API requests. Supported by all providers except Amazon Bedrock (which uses AWS SDK auth). Headers are merged with provider defaults and model.headers, with options.headers taking precedence.
  • Added originator option to loginOpenAICodex() for custom OAuth client identification
  • Browser compatibility for pi-ai: replaced top-level Node.js imports with dynamic imports for browser environments (#873)
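The header precedence described for the headers option can be sketched with object spreads. mergeHeaders and HeaderMap are hypothetical names, and letting model.headers override provider defaults is an assumption; the entry only states that options.headers wins.

```typescript
// Hypothetical merge helper; names are illustrative, not part of the package API.
type HeaderMap = Record<string, string>;

function mergeHeaders(
  providerDefaults: HeaderMap,
  modelHeaders: HeaderMap = {},
  optionsHeaders: HeaderMap = {},
): HeaderMap {
  // Later spreads win, so options.headers overrides everything else.
  // Note: keys are compared case-sensitively in this sketch.
  return { ...providerDefaults, ...modelHeaders, ...optionsHeaders };
}
```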

Fixed

  • Fixed OpenAI Responses API 400 error "function_call without required reasoning item" when switching between models (same provider, different model). The fix omits the id field for function_calls from different models to avoid triggering OpenAI's reasoning/function_call pairing validation (#886)

[0.49.2] - 2026-01-19

Added

  • Added AWS credential detection for ECS/Kubernetes environments: AWS_CONTAINER_CREDENTIALS_RELATIVE_URI, AWS_CONTAINER_CREDENTIALS_FULL_URI, AWS_WEB_IDENTITY_TOKEN_FILE (#848)

Fixed

  • Fixed OpenAI Responses 400 error "reasoning without following item" by skipping errored/aborted assistant messages entirely in transform-messages.ts (#838)
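The fix above amounts to filtering failed assistant turns out of the replayed history. A minimal sketch follows, assuming a simplified message shape: ChatMessage is hypothetical, and "aborted" as a stopReason value is an assumption (only "error" appears elsewhere in this changelog).

```typescript
// Simplified, hypothetical message shape for illustration.
interface ChatMessage {
  role: "user" | "assistant" | "toolResult";
  stopReason?: "stop" | "toolUse" | "error" | "aborted";
  text: string;
}

// Drops assistant messages that ended in an error or abort; everything else is kept in order.
function dropFailedAssistantMessages(messages: ChatMessage[]): ChatMessage[] {
  return messages.filter(
    (m) => !(m.role === "assistant" && (m.stopReason === "error" || m.stopReason === "aborted")),
  );
}
```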

Removed

  • Removed strictResponsesPairing compat option (no longer needed after the transform-messages fix)

[0.49.1] - 2026-01-18

Added

  • Added OpenAIResponsesCompat interface with strictResponsesPairing option for Azure OpenAI Responses API, which requires strict reasoning/message pairing in history replay (#768 by @prateekmedia)

Changed

  • Split OpenAICompat into OpenAICompletionsCompat and OpenAIResponsesCompat for type-safe API-specific compat settings

Fixed

  • Fixed tool call ID normalization for cross-provider handoffs (e.g., Codex to Antigravity Claude) (#821)

[0.49.0] - 2026-01-17

Changed

  • OpenAI Codex responses now use the context system prompt directly in the instructions field.

Fixed

  • Fixed orphaned tool results after errored assistant messages causing Codex API errors. When an assistant message has stopReason: "error", its tool calls are now excluded from pending tool tracking, preventing synthetic tool results from being generated for calls that will be dropped by provider-specific converters. (#812)
  • Fixed Bedrock Claude max_tokens handling to always exceed thinking budget tokens, preventing compaction failures. (#797 by @pjtf93)
  • Fixed Claude Code tool name normalization to match the Claude Code tool list case-insensitively and remove invalid mappings.

[0.48.0] - 2026-01-16

Fixed

  • Fixed OpenAI-compatible provider feature detection to use model.provider in addition to URL, allowing custom base URLs (e.g., proxies) to work correctly with provider-specific settings (#774)
  • Fixed Gemini 3 context loss when switching from providers without thought signatures: unsigned tool calls are now converted to text with anti-mimicry notes instead of being skipped
  • Fixed string numbers in tool arguments not being coerced to numbers during validation (#786 by @dannote)
  • Fixed Bedrock tool call IDs to use only alphanumeric characters, avoiding API errors from invalid characters (#781 by @pjtf93)
  • Fixed empty error assistant messages (from 429/500 errors) breaking the tool_use to tool_result chain by filtering them in transformMessages
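The string-number coercion fix in this release can be illustrated as follows. This is a sketch only: expectedTypes is a hypothetical stand-in for a real JSON Schema, and the actual validation path in the package may differ.

```typescript
// Hypothetical coercion step; expectedTypes stands in for a real tool schema.
function coerceNumericStrings(
  args: Record<string, unknown>,
  expectedTypes: Record<string, "number" | "string">,
): Record<string, unknown> {
  const result: Record<string, unknown> = { ...args };
  for (const [key, expected] of Object.entries(expectedTypes)) {
    const value = result[key];
    if (expected === "number" && typeof value === "string" && value.trim() !== "") {
      const parsed = Number(value);
      // Only replace the value when the string is a valid finite number.
      if (Number.isFinite(parsed)) result[key] = parsed;
    }
  }
  return result;
}
```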

[0.47.0] - 2026-01-16

Fixed

  • Fixed OpenCode provider's /v1 endpoint to use system role instead of developer role, fixing 400 Incorrect role information error for models using openai-completions API (#755 by @melihmucuk)
  • Added retry logic to OpenAI Codex provider for transient errors (429, 5xx, connection failures). Uses exponential backoff with up to 3 retries. (#733)
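A sketch of the retry policy described above. The retryable status codes and the limit of 3 retries come from the entry; the 1000ms base delay, the doubling factor, and all function names are assumptions.

```typescript
const MAX_RETRIES = 3; // from the changelog entry
const BASE_DELAY_MS = 1000; // assumption: the real base delay is not stated

// 429 and any 5xx status are treated as transient.
function isRetryableStatus(status: number): boolean {
  return status === 429 || (status >= 500 && status <= 599);
}

// attempt is zero-based, so the delays are 1000, 2000, 4000 ms.
function backoffDelayMs(attempt: number): number {
  return BASE_DELAY_MS * 2 ** attempt;
}

function retrySchedule(): number[] {
  return Array.from({ length: MAX_RETRIES }, (_, attempt) => backoffDelayMs(attempt));
}
```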

[0.46.0] - 2026-01-15

Added

  • Added MiniMax China (minimax-cn) provider support (#725 by @tallshort)
  • Added gpt-5.2-codex models for GitHub Copilot and OpenCode Zen providers (#734 by @aadishv)

Fixed

  • Avoid unsigned Gemini 3 tool calls (#741 by @roshanasingh4)
  • Fixed signature support for non-Anthropic models in Amazon Bedrock provider (#727 by @unexge)

[0.45.7] - 2026-01-13

[0.45.6] - 2026-01-13

Fixed

  • Exported parseStreamingJson from the main package for tsx dev mode compatibility

[0.45.5] - 2026-01-13

[0.45.4] - 2026-01-13

Added

  • Added Vercel AI Gateway provider with model discovery and AI_GATEWAY_API_KEY env support (#689 by @timolins)

Fixed

  • Fixed z.ai thinking/reasoning: z.ai uses thinking: { type: "enabled" } instead of OpenAI's reasoning_effort. Added thinkingFormat compat flag to handle this. (#688)
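The difference between the two request shapes can be sketched as below. The thinkingFormat flag is named in the entry, but its values ("openai", "zai") and the thinkingParams helper are assumptions made for illustration.

```typescript
// Hypothetical flag values; only the flag name thinkingFormat appears in the changelog.
type ThinkingFormat = "openai" | "zai";

function thinkingParams(
  format: ThinkingFormat,
  effort: "low" | "medium" | "high",
): Record<string, unknown> {
  if (format === "zai") {
    // z.ai style: an on/off switch rather than an effort level.
    return { thinking: { type: "enabled" } };
  }
  // OpenAI style: effort level passed as reasoning_effort.
  return { reasoning_effort: effort };
}
```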

[0.45.3] - 2026-01-13

[0.45.2] - 2026-01-13

[0.45.1] - 2026-01-13

[0.45.0] - 2026-01-13

Added

  • MiniMax provider support with M2 and M2.1 models via Anthropic-compatible API (#656 by @dannote)
  • Added Amazon Bedrock provider with prompt caching for Claude models (experimental, tested with Anthropic Claude models only) (#494 by @unexge)
  • Added serviceTier option for OpenAI Responses requests (#672 by @markusylisiurunen)
  • Anthropic caching on OpenRouter: Interactions with Anthropic models via OpenRouter now set a 5-minute cache point using Anthropic-style cache_control breakpoints on the last assistant or user message. (#584 by @nathyong)
  • Google Gemini CLI provider improvements: Added Antigravity endpoint fallback (tries daily sandbox then prod when baseUrl is unset), header-based retry delay parsing (Retry-After, x-ratelimit-reset, x-ratelimit-reset-after), stable sessionId derivation from first user message for cache affinity, empty SSE stream retry with backoff, and anthropic-beta header for Claude thinking models (#670 by @kim0)

[0.44.0] - 2026-01-12

[0.43.0] - 2026-01-11

Fixed

  • Fixed Google provider thinking detection: isThinkingPart() now only checks thought === true, not thoughtSignature. Per Google docs, thoughtSignature is for context replay and can appear on any part type. Also removed id field from functionCall/functionResponse (rejected by Vertex AI and Cloud Code Assist), and added textSignature round-trip for multi-turn reasoning context. (#631 by @theBucky)
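The corrected check reduces to a single comparison. In this sketch GeminiPart is a simplified, hypothetical part shape; only isThinkingPart(), thought, and thoughtSignature are names taken from the entry.

```typescript
// Simplified, hypothetical shape of a response part.
interface GeminiPart {
  text?: string;
  thought?: boolean;
  thoughtSignature?: string;
}

// A part counts as thinking only when thought is literally true.
// thoughtSignature is ignored because it can appear on any part type.
function isThinkingPart(part: GeminiPart): boolean {
  return part.thought === true;
}
```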

[0.42.5] - 2026-01-11

[0.42.4] - 2026-01-10

[0.42.3] - 2026-01-10

Changed

  • OpenAI Codex: switched to bundled system prompt matching opencode, changed originator to "pi", simplified prompt handling

[0.42.2] - 2026-01-10

Added

  • Added GOOGLE_APPLICATION_CREDENTIALS env var support for Vertex AI credential detection (standard for CI/production).
  • Added supportsUsageInStreaming compatibility flag for OpenAI-compatible providers that reject stream_options: { include_usage: true }. Defaults to true. Set to false in model config for providers like gatewayz.ai. (#596 by @XesGaDeus)
  • Improved Google model pricing info (#588 by @aadishv)
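The supportsUsageInStreaming flag above can be sketched as a conditional on the request body. CompatFlags and buildStreamingBody are hypothetical; the stream_options payload and the default of true come from the entry.

```typescript
// Hypothetical compat shape containing only the flag discussed here.
interface CompatFlags {
  supportsUsageInStreaming?: boolean;
}

function buildStreamingBody(model: string, compat: CompatFlags = {}): Record<string, unknown> {
  const body: Record<string, unknown> = { model, stream: true };
  // Defaults to true: usage reporting is requested unless explicitly turned off.
  if (compat.supportsUsageInStreaming ?? true) {
    body.stream_options = { include_usage: true };
  }
  return body;
}
```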

Fixed

  • Fixed os.homedir() calls at module load time; now resolved lazily when needed.
  • Fixed OpenAI Responses tool strict flag to use a boolean for LM Studio compatibility (#598 by @gnattu)
  • Fixed Google Cloud Code Assist OAuth for paid subscriptions: properly handles long-running operations for project provisioning, supports GOOGLE_CLOUD_PROJECT / GOOGLE_CLOUD_PROJECT_ID env vars for paid tiers, and handles VPC-SC affected users (#582 by @cmf)

[0.42.1] - 2026-01-09

[0.42.0] - 2026-01-09

Added

  • Added OpenCode Zen provider support with 26 models (Claude, GPT, Gemini, Grok, Kimi, GLM, Qwen, etc.). Set OPENCODE_API_KEY env var to use.

[0.41.0] - 2026-01-09

[0.40.1] - 2026-01-09

[0.40.0] - 2026-01-08

[0.39.1] - 2026-01-08

[0.39.0] - 2026-01-08

Fixed

  • Fixed Gemini CLI abort handling: detect native AbortError in retry catch block, cancel SSE reader when abort signal fires (#568 by @tmustier)
  • Fixed Antigravity provider 429 errors by aligning request payload with CLIProxyAPI v6.6.89: inject Antigravity system instruction with role: "user", set requestType: "agent", and use antigravity userAgent. Added bridge prompt to override Antigravity behavior (identity, paths, web dev guidelines) with Pi defaults. (#571 by @ben-vargas)
  • Fixed thinking block handling for cross-model conversations: thinking blocks are now converted to plain text (no <thinking> tags) when switching models. Previously, <thinking> tags caused models to mimic the pattern and output literal tags. Also fixed empty thinking blocks causing API errors. (#561)

[0.38.0] - 2026-01-08

Added

  • thinkingBudgets option in SimpleStreamOptions for customizing token budgets per thinking level on token-based providers (#529 by @melihmucuk)

Breaking Changes

  • Removed OpenAI Codex model aliases (gpt-5, gpt-5-mini, gpt-5-nano, codex-mini-latest, gpt-5-codex, gpt-5.1-codex, gpt-5.1-chat-latest). Use canonical model IDs: gpt-5.1, gpt-5.1-codex-max, gpt-5.1-codex-mini, gpt-5.2, gpt-5.2-codex. (#536 by @ghoulr)

Fixed

  • Fixed OpenAI Codex context window from 400,000 to 272,000 tokens to match Codex CLI defaults and prevent 400 errors. (#536 by @ghoulr)
  • Fixed Codex SSE error events to surface message, code, and status. (#551 by @tmustier)
  • Fixed context overflow detection for context_length_exceeded error codes.

[0.37.8] - 2026-01-07

[0.37.7] - 2026-01-07

[0.37.6] - 2026-01-06

Added

  • Exported OpenAI Codex utilities: CacheMetadata, getCodexInstructions, getModelFamily, ModelFamily, buildCodexPiBridge, buildCodexSystemPrompt, CodexSystemPrompt (#510 by @mitsuhiko)

[0.37.5] - 2026-01-06

[0.37.4] - 2026-01-06

[0.37.3] - 2026-01-06

Added

  • sessionId option in StreamOptions for providers that support session-based caching. OpenAI Codex provider uses this to set prompt_cache_key and routing headers.

[0.37.2] - 2026-01-05

Fixed

  • Codex provider now always includes reasoning.encrypted_content even when custom include options are passed (#484 by @kim0)

[0.37.1] - 2026-01-05

[0.37.0] - 2026-01-05

Breaking Changes

  • OpenAI Codex models no longer have per-thinking-level variants (e.g., gpt-5.2-codex-high). Use the base model ID and set thinking level separately. The Codex provider clamps reasoning effort to what each model supports internally. (initial implementation by @ben-vargas in #472)

Added

  • Headless OAuth support for all callback-server providers (Google Gemini CLI, Antigravity, OpenAI Codex): paste redirect URL when browser callback is unreachable (#428 by @ben-vargas, #468 by @crcatala)
  • Cancellable GitHub Copilot device code polling via AbortSignal

Fixed

  • Codex requests now omit the reasoning field entirely when thinking is off, letting the backend use its default instead of forcing a value. (#472)

[0.36.0] - 2026-01-05

Added

  • OpenAI Codex OAuth provider with Responses API streaming support: openai-codex-responses streaming provider with SSE parsing, tool-call handling, usage/cost tracking, and PKCE OAuth flow (#451 by @kim0)

Fixed

  • Vertex AI dummy value for getEnvApiKey(): Returns "<authenticated>" when Application Default Credentials are configured (~/.config/gcloud/application_default_credentials.json exists) and both GOOGLE_CLOUD_PROJECT (or GCLOUD_PROJECT) and GOOGLE_CLOUD_LOCATION are set. This allows streamSimple() to work with Vertex AI without explicit apiKey option. The ADC credentials file existence check is cached per-process to avoid repeated filesystem access.
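The conditions above can be written as one predicate. This sketch passes the file check in as a boolean so it stays self-contained; vertexEnvApiKey is a hypothetical name, and the per-process caching of the file check is left out.

```typescript
// Hypothetical helper: adcFileExists stands in for the cached credentials-file check.
function vertexEnvApiKey(
  env: Record<string, string | undefined>,
  adcFileExists: boolean,
): string | undefined {
  const project = env.GOOGLE_CLOUD_PROJECT || env.GCLOUD_PROJECT;
  const location = env.GOOGLE_CLOUD_LOCATION;
  // All three conditions must hold before the dummy value is returned.
  if (adcFileExists && project && location) return "<authenticated>";
  return undefined;
}
```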

[0.35.0] - 2026-01-05

[0.34.2] - 2026-01-04

[0.34.1] - 2026-01-04

[0.34.0] - 2026-01-04

[0.33.0] - 2026-01-04

[0.32.3] - 2026-01-03

Fixed

  • Google Vertex AI models no longer appear in available models list without explicit authentication. Previously, getEnvApiKey() returned a dummy value for google-vertex, causing models to show up even when Google Cloud ADC was not configured.

[0.32.2] - 2026-01-03

[0.32.1] - 2026-01-03

[0.32.0] - 2026-01-03

Added

  • Vertex AI provider with ADC (Application Default Credentials) support. Authenticate with gcloud auth application-default login, set GOOGLE_CLOUD_PROJECT and GOOGLE_CLOUD_LOCATION, and access Gemini models via Vertex AI. (#300 by @default-anton)

Fixed

  • Gemini CLI rate limit handling: Added automatic retry with server-provided delay for 429 errors. Parses delay from error messages like "Your quota will reset after 39s" and waits accordingly. Falls back to exponential backoff for other transient errors. (#370)
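Parsing a delay out of a message like the one quoted above can be done with a small regular expression. This is a sketch, not the provider's actual parser: parseQuotaResetDelayMs is a hypothetical name, and it only handles the seconds form shown in the entry.

```typescript
// Extracts a delay in milliseconds from text such as "Your quota will reset after 39s".
// Returns undefined when no such hint is present.
function parseQuotaResetDelayMs(message: string): number | undefined {
  const match = /reset after (\d+(?:\.\d+)?)s/i.exec(message);
  if (!match) return undefined;
  return Math.ceil(Number(match[1]) * 1000);
}
```

When the function returns undefined, the caller would fall back to exponential backoff, as the entry describes for other transient errors.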

[0.31.1] - 2026-01-02

[0.31.0] - 2026-01-02

Breaking Changes

  • Agent API moved: All agent functionality (agentLoop, agentLoopContinue, AgentContext, AgentEvent, AgentTool, AgentToolResult, etc.) has moved to @mariozechner/pi-agent-core. Import from that package instead of @mariozechner/pi-ai.

Added

  • GoogleThinkingLevel type: Exported type that mirrors Google's ThinkingLevel enum values ("THINKING_LEVEL_UNSPECIFIED" | "MINIMAL" | "LOW" | "MEDIUM" | "HIGH"). Allows configuring Gemini thinking levels without importing from @google/genai.
  • ANTHROPIC_OAUTH_TOKEN env var: Now checked before ANTHROPIC_API_KEY in getEnvApiKey(), allowing OAuth tokens to take precedence.
  • event-stream.js export: AssistantMessageEventStream utility now exported from package index.
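The ANTHROPIC_OAUTH_TOKEN precedence noted above is a simple fallback chain. anthropicKeyFromEnv is a hypothetical stand-in for the Anthropic branch of getEnvApiKey(), written against a plain object so it does not depend on process.env.

```typescript
// Hypothetical stand-in for the Anthropic lookup: the OAuth token wins when both are set.
function anthropicKeyFromEnv(env: Record<string, string | undefined>): string | undefined {
  return env.ANTHROPIC_OAUTH_TOKEN || env.ANTHROPIC_API_KEY;
}
```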

Changed

  • OAuth uses Web Crypto API: PKCE generation and OAuth flows now use Web Crypto API (crypto.subtle) instead of Node.js crypto module. This improves browser compatibility while still working in Node.js 20+.
  • Deterministic model generation: generate-models.ts now sorts providers and models alphabetically for consistent output across runs. (#332 by @mrexodia)

Fixed

  • OpenAI completions empty content blocks: Empty text or thinking blocks in assistant messages are now filtered out before sending to the OpenAI completions API, preventing validation errors. (#344 by @default-anton)
  • Thinking token duplication: Fixed thinking content duplication with chutes.ai provider. The provider was returning thinking content in both reasoning_content and reasoning fields, causing each chunk to be processed twice. Now only the first non-empty reasoning field is used.
  • zAi provider API mapping: Fixed zAi models to use openai-completions API with correct base URL (https://api.z.ai/api/coding/paas/v4) instead of incorrect Anthropic API mapping. (#344, #358 by @default-anton)
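The thinking-duplication fix above comes down to reading one reasoning field per chunk. In this sketch the field names are from the entry, while ReasoningDelta, pickReasoningText, and the order in which the fields are checked are assumptions.

```typescript
// Simplified, hypothetical delta shape with the two fields named in the changelog.
interface ReasoningDelta {
  reasoning_content?: string;
  reasoning?: string;
}

// Returns the first non-empty reasoning field so a chunk is never processed twice.
function pickReasoningText(delta: ReasoningDelta): string | undefined {
  const candidates = [delta.reasoning_content, delta.reasoning];
  return candidates.find((value) => typeof value === "string" && value.length > 0);
}
```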

[0.28.0] - 2025-12-25

Breaking Changes

  • OAuth storage removed (#296): All storage functions (loadOAuthCredentials, saveOAuthCredentials, setOAuthStorage, etc.) removed. Callers are responsible for storing credentials.
  • OAuth login functions: loginAnthropic, loginGitHubCopilot, loginGeminiCli, loginAntigravity now return OAuthCredentials instead of saving to disk.
  • refreshOAuthToken: Now takes (provider, credentials) and returns new OAuthCredentials instead of saving.
  • getOAuthApiKey: Now takes (provider, credentials) and returns { newCredentials, apiKey } or null.
  • OAuthCredentials type: No longer includes type: "oauth" discriminator. Callers add discriminator when storing.
  • setApiKey, resolveApiKey: Removed. Callers must manage their own API key storage/resolution.
  • getApiKey: Renamed to getEnvApiKey. Only checks environment variables for known providers.

[0.27.7] - 2025-12-24

Fixed

  • Thinking tag leakage: Fixed Claude mimicking literal </thinking> tags in responses. Unsigned thinking blocks (from aborted streams) are now converted to plain text without <thinking> tags. The TUI still displays them as thinking blocks. (#302 by @nicobailon)

[0.25.1] - 2025-12-21

Added

  • xhigh thinking level support: Added supportsXhigh() function to check if a model supports xhigh reasoning level. Also clamps xhigh to high for OpenAI models that don't support it. (#236 by @theBucky)
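The clamping rule can be sketched as a one-line guard. The ThinkingLevel union below is an assumed set of values, and the boolean parameter stands in for the result of supportsXhigh(); only the xhigh-to-high clamp itself is from the entry.

```typescript
// Assumed set of levels; only "high" and "xhigh" are confirmed by this entry.
type ThinkingLevel = "minimal" | "low" | "medium" | "high" | "xhigh";

// modelSupportsXhigh stands in for the result of supportsXhigh(model).
function clampThinkingLevel(level: ThinkingLevel, modelSupportsXhigh: boolean): ThinkingLevel {
  return level === "xhigh" && !modelSupportsXhigh ? "high" : level;
}
```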

Fixed

  • Gemini multimodal tool results: Fixed images in tool results causing flaky/broken responses with Gemini models. For Gemini 3, images are now nested inside functionResponse.parts per the docs. For older models (which don't support multimodal function responses), images are sent in a separate user message.

  • Queued message steering: When getQueuedMessages is provided, the agent loop now checks for queued user messages after each tool call and skips remaining tool calls in the current assistant message when a queued message arrives (emitting error tool results).

  • Double API version path in Google provider URL: Fixed Gemini API calls returning 404 after baseUrl support was added. The SDK was appending its default apiVersion to baseUrl which already included the version path. (#251 by @shellfyred)

  • Anthropic SDK retries re-enabled: Restored SDK-level retries (default 2) for transient HTTP failures, reversing the maxRetries: 0 setting from 0.18.2. (#252)

[0.23.5] - 2025-12-19

Added

  • Gemini 3 Flash thinking support: Extended thinking level support for Gemini 3 Flash models (MINIMAL, LOW, MEDIUM, HIGH) to match Pro models' capabilities. (#212 by @markusylisiurunen)

  • GitHub Copilot thinking models: Added thinking support for additional Copilot models (o3-mini, o1-mini, o1-preview). (#234 by @aadishv)

Fixed

  • Gemini tool result format: Fixed tool result format for Gemini 3 Flash Preview which strictly requires { output: value } for success and { error: value } for errors. Previous format using { result, isError } was rejected by newer Gemini models. Also improved type safety by removing as any casts. (#213, #220)

  • Google baseUrl configuration: Google provider now respects baseUrl configuration for custom endpoints or API proxies. (#216, #221 by @theBucky)

  • GitHub Copilot vision requests: Added Copilot-Vision-Request header when sending images to GitHub Copilot models. (#222)

  • GitHub Copilot X-Initiator header: Fixed X-Initiator logic to check last message role instead of any message in history. This ensures proper billing when users send follow-up messages. (#209)
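The Gemini tool result format required by the first fix in this list can be sketched as a tiny converter. toGeminiFunctionResponse is a hypothetical name; the { output: value } and { error: value } shapes are the ones stated in that entry.

```typescript
// Wraps a tool result in the shape newer Gemini models accept:
// { output: value } on success, { error: value } on failure.
function toGeminiFunctionResponse(
  value: unknown,
  isError: boolean,
): { output: unknown } | { error: unknown } {
  return isError ? { error: value } : { output: value };
}
```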

[0.22.3] - 2025-12-16

Added

  • Image limits test suite: Added comprehensive tests for provider-specific image limitations (max images, max size, max dimensions). Discovered actual limits: Anthropic (100 images, 5MB, 8000px), OpenAI (500 images, ≥25MB), Gemini (~2500 images, ≥40MB), Mistral (8 images, ~15MB), OpenRouter (~40 images context-limited, ~15MB). (#120)

  • Tool result streaming: Added tool_execution_update event and optional onUpdate callback to AgentTool.execute() for streaming tool output during execution. Tools can now emit partial results (e.g., bash stdout) that are forwarded to subscribers. (#44)

  • X-Initiator header for GitHub Copilot: Added X-Initiator header handling for GitHub Copilot provider to ensure correct call accounting (agent calls are not deducted from quota). Sets initiator based on last message role. (#200 by @kim0)
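Choosing the X-Initiator value from the last message role could look like the sketch below. The header values "user" and "agent", the role names, and the treatment of an empty history are assumptions; the entry only says the initiator is set from the last message role.

```typescript
type MessageRole = "user" | "assistant" | "toolResult";

// Assumed header values: "user" for a user-initiated call, "agent" otherwise.
// An empty history falls through to "agent" in this sketch, which is also an assumption.
function xInitiator(messages: { role: MessageRole }[]): "user" | "agent" {
  const last = messages[messages.length - 1];
  return last?.role === "user" ? "user" : "agent";
}
```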

Changed

  • Normalized tool_execution_end result: tool_execution_end event now always contains AgentToolResult (no longer AgentToolResult | string). Errors are wrapped in the standard result format.

Fixed

  • Reasoning disabled by default: When reasoning option is not specified, thinking is now explicitly disabled for all providers. Previously, some providers like Gemini with "dynamic thinking" would use their default (thinking ON), causing unexpected token usage. This was the original intended behavior. (#180 by @markusylisiurunen)

[0.22.2] - 2025-12-15

Added

  • Interleaved thinking for Anthropic: Added interleavedThinking option to AnthropicOptions. When enabled, Claude 4 models can think between tool calls and reason after receiving tool results. Enabled by default (no extra token cost, just unlocks the capability). Set interleavedThinking: false to disable.

[0.22.1] - 2025-12-15

Dedicated to Peter's shoulder (@steipete)

Added

  • Interleaved thinking for Anthropic: Enabled interleaved thinking in the Anthropic provider, allowing Claude models to output thinking blocks interspersed with text responses.

[0.22.0] - 2025-12-15

Added

  • GitHub Copilot provider: Added github-copilot as a known provider with models sourced from models.dev. Includes Claude, GPT, Gemini, Grok, and other models available through GitHub Copilot. (#191 by @cau1k)

Fixed

  • GitHub Copilot gpt-5 models: Fixed API selection for gpt-5 models to use openai-responses instead of openai-completions (gpt-5 models are not accessible via completions endpoint)

  • GitHub Copilot cross-model context handoff: Fixed context handoff failing when switching between GitHub Copilot models using different APIs (e.g., gpt-5 to claude-sonnet-4). Tool call IDs from OpenAI Responses API were incompatible with other models. (#198)

  • Gemini 3 Pro thinking levels: Thinking level configuration now works correctly for Gemini 3 Pro models. Previously all levels mapped to -1 (minimal thinking). Now LOW/MEDIUM/HIGH properly control test-time computation. (#176 by @markusylisiurunen)

[0.18.2] - 2025-12-11

Changed

  • Anthropic SDK retries disabled: Set maxRetries: 0 on Anthropic client to allow application-level retry handling. The SDK's built-in retries were interfering with coding-agent's retry logic. (#157)

[0.18.1] - 2025-12-10

Added

  • Mistral provider: Added support for Mistral AI models via the OpenAI-compatible API. Includes automatic handling of Mistral-specific requirements (tool call ID format). Set MISTRAL_API_KEY environment variable to use.

Fixed

  • Fixed Mistral 400 errors after aborted assistant messages by skipping empty assistant messages (no content, no tool calls) (#165)

  • Removed synthetic assistant bridge message after tool results for Mistral (no longer required as of Dec 2025) (#165)

  • Fixed bug where ANTHROPIC_API_KEY environment variable was deleted globally after first OAuth token usage, causing subsequent prompts to fail (#164)

[0.17.0] - 2025-12-09

Breaking Changes

  • Removed provider-level tool argument validation. Validation now happens in agentLoop via executeToolCalls, allowing models to retry on validation errors. For manual tool execution, use validateToolCall(tools, toolCall) or validateToolArguments(tool, toolCall).

Added

  • agentLoopContinue function: Continue an agent loop from existing context without adding a new user message. Validates that the last message is user or toolResult. Useful for retry after context overflow or resuming from manually-added tool results.

  • Added validateToolCall(tools, toolCall) helper that finds the tool by name and validates arguments.

  • OpenAI compatibility overrides: Added compat field to Model for openai-completions API, allowing explicit configuration of provider quirks (supportsStore, supportsDeveloperRole, supportsReasoningEffort, maxTokensField). Falls back to URL-based detection if not set. Useful for LiteLLM, custom proxies, and other non-standard endpoints. (#133, thanks @fink-andreas for the initial idea and PR)

  • xhigh reasoning level: Added xhigh to ReasoningEffort type for OpenAI codex-max models. For non-OpenAI providers (Anthropic, Google), xhigh is automatically mapped to high. (#143)

Changed

  • Updated SDK versions: OpenAI SDK 5.21.0 → 6.10.0, Anthropic SDK 0.61.0 → 0.71.2, Google GenAI SDK 1.30.0 → 1.31.0

[0.13.0] - 2025-12-06

Breaking Changes

  • Added totalTokens field to Usage type: All code that constructs Usage objects must now include the totalTokens field. This field represents the total tokens processed by the LLM (input + output + cache). For OpenAI and Google, this uses native API values (total_tokens, totalTokenCount). For Anthropic, it's computed as input + output + cacheRead + cacheWrite.
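For the Anthropic case, the computation described above is a plain sum. UsageCounts and computeTotalTokens are hypothetical names; the four field names and the formula are taken from the entry.

```typescript
// Hypothetical subset of the Usage type with the four counters named in the entry.
interface UsageCounts {
  input: number;
  output: number;
  cacheRead: number;
  cacheWrite: number;
}

// Anthropic-style total: input + output + cacheRead + cacheWrite.
function computeTotalTokens(usage: UsageCounts): number {
  return usage.input + usage.output + usage.cacheRead + usage.cacheWrite;
}
```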

[0.12.10] - 2025-12-04

Added

  • Added gpt-5.1-codex-max model support

Fixed

  • OpenAI Token Counting: Fixed usage.input to exclude cached tokens for OpenAI providers. Previously, input included cached tokens, causing double-counting when calculating total context size via input + cacheRead. Now input represents non-cached input tokens across all providers, making input + output + cacheRead + cacheWrite the correct formula for total context size.

  • Fixed Claude Opus 4.5 cache pricing (was 3x too expensive)

    • Corrected cache_read: $1.50 → $0.50 per MTok
    • Corrected cache_write: $18.75 → $6.25 per MTok
    • Added manual override in scripts/generate-models.ts until upstream fix is merged
    • Submitted PR to models.dev: anomalyco/models.dev#439

[0.9.4] - 2025-11-26

Initial release with multi-provider LLM support.