fix: persist MCP tool call outputs to session storage#2388
fix: persist MCP tool call outputs to session storage#2388saakshigupta2002 wants to merge 1 commit intoopenai:mainfrom
Conversation
When the model returns McpCall items with output data, the SDK was only creating ToolCallItem for the call but not ToolCallOutputItem for the result. This caused MCP tool call results to be missing from session persistence, making session replay incomplete. Changes: - Create ToolCallOutputItem for McpCall items that have output or error - Add mcp_call -> mcp_call_output mapping to _TOOL_CALL_TO_OUTPUT_TYPE - Update drop_orphan_function_calls to check 'id' field for mcp_call items (McpCall uses 'id' instead of 'call_id') - Add comprehensive tests for MCP tool call output persistence Fixes openai#2384
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 61a2dfd257
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| "apply_patch_call": "apply_patch_call_output", | ||
| "computer_call": "computer_call_output", | ||
| "local_shell_call": "local_shell_call_output", | ||
| "mcp_call": "mcp_call_output", |
There was a problem hiding this comment.
Preserve legacy MCP calls without output items
By adding mcp_call to _TOOL_CALL_TO_OUTPUT_TYPE, drop_orphan_function_calls will now treat every MCP call as requiring a matching mcp_call_output item. That means sessions persisted before this change (or any caller-supplied history that only includes mcp_call items with output/error inline) will have those MCP calls dropped during normalize_resumed_input, losing tool history and potentially changing replay behavior. Consider exempting MCP calls that already include output/error, or backfilling an output item when normalizing legacy input.
Useful? React with 👍 / 👎.
| mcp_call_output: dict[str, Any] = { | ||
| "type": "mcp_call_output", | ||
| "call_id": output.id, | ||
| "output": mcp_output_content or "", | ||
| } |
There was a problem hiding this comment.
Preserve falsy MCP outputs instead of blanking
When constructing the synthetic mcp_call_output, the code uses mcp_output_content or "" for the stored output. If an MCP tool legitimately returns a falsy but meaningful value (e.g., 0, False, [], or {}), this will be persisted as an empty string instead, so session storage and replay lose the real output and the raw_item diverges from ToolCallOutputItem.output. Use an explicit None check (or drop the fallback) to avoid corrupting falsy outputs.
Useful? React with 👍 / 👎.
When the model returns McpCall items with output data, the SDK was only creating ToolCallItem for the call but not ToolCallOutputItem for the result. This caused MCP tool call results to be missing from session persistence, making session replay incomplete.
Changes:
Fixes #2384