
Commit 25eb0d4

andreibogdan, claude, danielchalef, and prasmussen15 authored and committed
Fix Azure OpenAI integration for v1 API compatibility (getzep#1192)
* Fix MCP server documentation to reference correct entry point

  Update all documentation references from graphiti_mcp_server.py to main.py.
  The old filename was causing "No such file or directory" errors when users
  tried to run the commands as documented. The actual entry point is main.py
  in the mcp_server directory.

  Changes:
  - Update 7 command examples in README.md
  - Update example configuration file with correct path

* @andreibogdan has signed the CLA in getzep#1179

* Add extracted edge facts to entity summaries (getzep#1182)

  * Add extracted edge facts to entity summaries

    Update _extract_entity_summary to include facts from edges connected to
    each node. Edge facts are appended to the existing summary, and LLM
    summarization is only triggered if the combined content exceeds the
    character limit.

    - Add edges parameter to extract_attributes_from_nodes and related functions
    - Filter edges per node before passing to attribute extraction
    - Append edge facts (newline-separated) to node summary
    - Skip LLM call when combined summary is within length limits

  * Remove unused reflexion prompts and invalidate_edges v1

    - Remove reflexion prompts from extract_nodes.py and extract_edges.py
    - Remove extract_nodes_reflexion function from node_operations.py
    - Remove unused v1 function from invalidate_edges.py

  * Filter out None/empty edge facts when building summary

  * Remove unused MissedEntities import

  * Optimize edge filtering with pre-built lookup dictionary

    Replace O(N * E) per-node edge filtering with O(E + N) pre-built dictionary
    lookup. Edges are now indexed by node UUID once before the gather operation
    (a sketch of this pattern follows the revert note below).

  * Handle empty summary edge case

    Return early if summary_with_edges is empty after stripping, avoiding
    storing empty summaries when node.summary and all edge facts are empty.

  * Update tests to reflect summary optimization behavior

    Tests now expect that short summaries are kept as-is without LLM calls.
    Added a new test to verify the LLM is called when the summary exceeds the
    character limit due to edge facts.

  * format

  * Bump version to 0.27.0

  * lock

  * change version

* Fix dependabot security vulnerabilities (getzep#1184)

  Update lock files to address multiple security alerts:
  - pyasn1: 0.6.1 → 0.6.2 (CVE-2026-23490)
  - langchain-core: 0.3.74 → 0.3.83 (CVE-2025-68664)
  - mcp: 1.9.4 → 1.26.0 (DNS rebinding, DoS)
  - azure-core: 1.34.0 → 1.38.0 (deserialization)
  - starlette: 0.46.2/0.47.1 → 0.50.0/0.52.1 (DoS vulnerabilities)
  - python-multipart: 0.0.20 → 0.0.22 (arbitrary file write)
  - fastapi: 0.115.14 → 0.128.0 (for starlette compatibility)
  - nbconvert: 7.16.6 → 7.17.0
  - orjson: 3.11.5 → 3.11.6
  - protobuf: 6.33.4 → 6.33.5

* Revert "Fix dependabot security vulnerabilities" (getzep#1185)

  This reverts commit 30cd907.
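The O(E + N) indexing described in the optimization commit above, as a minimal sketch. This is not the actual graphiti-core code; the attribute names (source_node_uuid, target_node_uuid, node.uuid) are assumptions for illustration:

from collections import defaultdict

def build_edge_lookup(edges):
    """Index edges by node UUID once: O(E) to build, O(1) lookup per node."""
    edges_by_node: dict[str, list] = defaultdict(list)
    for edge in edges:
        # Register each edge under both endpoints so either node can find it.
        edges_by_node[edge.source_node_uuid].append(edge)  # attribute names assumed
        edges_by_node[edge.target_node_uuid].append(edge)
    return edges_by_node

# Per-node filtering then replaces an O(E) scan with a dict lookup:
#     node_edges = edges_by_node.get(node.uuid, [])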
* Pin mcp_server to graphiti-core 0.26.3 (getzep#1186)

  * Fix dependabot security vulnerabilities in dependencies

    Update lock files to address multiple security alerts (the same dependency
    updates as in getzep#1184, listed above).

  * Pin mcp_server to graphiti-core 0.26.3 from PyPI

    - Change dependency from >=0.23.1 to ==0.26.3
    - Remove editable source override to use published package
    - Addresses code review feedback about RC version usage

  * Fix remaining security vulnerabilities in mcp_server

    Update vulnerable transitive dependencies:
    - aiohttp: 3.12.15 → 3.13.3 (High: zip bomb, DoS)
    - urllib3: 2.5.0 → 2.6.3 (High: decompression bomb bypass)
    - filelock: 3.19.1 → 3.20.3 (Medium: TOCTOU symlink)

* Update code review workflows to use claude-opus-4-5-20251101 (getzep#1189)

  * Update manual code review workflow to use claude-opus-4-5-20251101
  * Update auto code review workflow to use claude-opus-4-5-20251101

* Fix Azure OpenAI integration for v1 API compatibility

  This commit addresses several issues with Azure OpenAI integration:

  1. Azure OpenAI Client (graphiti_core/llm_client/azure_openai_client.py):
     - Use AsyncOpenAI with v1 endpoint instead of AsyncAzureOpenAI
     - Implement separate handling for reasoning models (responses.parse) vs
       non-reasoning models (beta.chat.completions.parse)
     - Add custom response handler to parse both response formats correctly
     - Fix RefusalError import path from llm_client.errors

  2. MCP Server Factories (mcp_server/src/services/factories.py):
     - Update Azure OpenAI factory to use v1 compatibility endpoint
     - Use same deployment for both main and small models in Azure
     - Add support for custom embedder endpoints (Ollama compatibility)
     - Add support for custom embedding dimensions
     - Remove unused Azure AD authentication code (TODO for future)
     - Add reasoning model detection for OpenAI provider

  3. MCP Server Configuration (mcp_server/pyproject.toml):
     - Add local graphiti-core source dependency for development

  4. Tests (tests/llm_client/test_azure_openai_client.py):
     - Update test mocks to support beta.chat.completions.parse
     - Update test expectations for non-reasoning model path

  These changes enable Azure OpenAI to work correctly with both reasoning and
  non-reasoning models, support custom embedder endpoints like Ollama, and
  maintain compatibility with the OpenAI v1 API specification.
* fix: address code review feedback

  - Remove local development dependency from mcp_server/pyproject.toml that
    would break PyPI installations
  - Move json import to top of azure_openai_client.py
  - Add comments explaining why non-reasoning models use
    beta.chat.completions.parse instead of responses.parse (Azure v1
    compatibility limitation)

* fix: address minor code review issues

  - Add noqa comment for unused response_model parameter (inherited from the
    abstract method interface)
  - Fix misleading comment in factories.py that referenced Azure OpenAI in the
    regular OpenAI case

Co-authored-by: Claude (us.anthropic.claude-sonnet-4-5-20250929-v1:0) <noreply@anthropic.com>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: Claude Haiku 4.5 <noreply@anthropic.com>
Co-authored-by: Daniel Chalef <131175+danielchalef@users.noreply.github.com>
Co-authored-by: Preston Rasmussen <109292228+prasmussen15@users.noreply.github.com>
Co-authored-by: prestonrasmussen <prasmuss15@gmail.com>
1 parent 5d5a3f3 commit 25eb0d4
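The heart of item 1 in the message above is constructing a plain AsyncOpenAI client against Azure's v1 compatibility surface. A minimal sketch of that construction, assuming Microsoft's documented /openai/v1/ endpoint shape; the resource name and environment variable are placeholders, not graphiti's actual configuration:

import os

from openai import AsyncOpenAI  # note: not AsyncAzureOpenAI

# Azure's OpenAI v1 compatibility endpoint lives under /openai/v1/ on the
# resource host; a plain OpenAI client pointed there speaks the v1 API.
client = AsyncOpenAI(
    base_url='https://my-resource.openai.azure.com/openai/v1/',  # placeholder resource
    api_key=os.environ['AZURE_OPENAI_API_KEY'],  # placeholder variable name
)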

File tree

6 files changed: +260 additions, -681 deletions


graphiti_core/llm_client/azure_openai_client.py

Lines changed: 72 additions & 20 deletions
@@ -14,8 +14,9 @@
 limitations under the License.
 """
 
+import json
 import logging
-from typing import ClassVar
+from typing import Any, ClassVar
 
 from openai import AsyncAzureOpenAI, AsyncOpenAI
 from openai.types.chat import ChatCompletionMessageParam
@@ -63,34 +64,52 @@ async def _create_structured_completion(
         reasoning: str | None,
         verbosity: str | None,
     ):
-        """Create a structured completion using Azure OpenAI's responses.parse API."""
-        supports_reasoning = self._supports_reasoning_features(model)
-        request_kwargs = {
-            'model': model,
-            'input': messages,
-            'max_output_tokens': max_tokens,
-            'text_format': response_model,  # type: ignore
-        }
+        """Create a structured completion using Azure OpenAI.
 
-        temperature_value = temperature if not supports_reasoning else None
-        if temperature_value is not None:
-            request_kwargs['temperature'] = temperature_value
-
-        if supports_reasoning and reasoning:
-            request_kwargs['reasoning'] = {'effort': reasoning}  # type: ignore
-
-        if supports_reasoning and verbosity:
-            request_kwargs['text'] = {'verbosity': verbosity}  # type: ignore
+        For reasoning models (GPT-5, o1, o3): uses responses.parse API
+        For regular models (GPT-4o, etc): uses chat.completions with response_format
+        """
+        supports_reasoning = self._supports_reasoning_features(model)
 
-        return await self.client.responses.parse(**request_kwargs)
+        if supports_reasoning:
+            # Use responses.parse for reasoning models (o1, o3, gpt-5)
+            request_kwargs = {
+                'model': model,
+                'input': messages,
+                'max_output_tokens': max_tokens,
+                'text_format': response_model,  # type: ignore
+            }
+
+            if reasoning:
+                request_kwargs['reasoning'] = {'effort': reasoning}  # type: ignore
+
+            if verbosity:
+                request_kwargs['text'] = {'verbosity': verbosity}  # type: ignore
+
+            return await self.client.responses.parse(**request_kwargs)
+        else:
+            # Use beta.chat.completions.parse for non-reasoning models (gpt-4o, etc.)
+            # Azure's v1 compatibility endpoint doesn't fully support responses.parse
+            # for non-reasoning models, so we use the structured output API instead
+            request_kwargs = {
+                'model': model,
+                'messages': messages,
+                'max_tokens': max_tokens,
+                'response_format': response_model,  # Structured output
+            }
+
+            if temperature is not None:
+                request_kwargs['temperature'] = temperature
+
+            return await self.client.beta.chat.completions.parse(**request_kwargs)
 
     async def _create_completion(
         self,
         model: str,
         messages: list[ChatCompletionMessageParam],
         temperature: float | None,
         max_tokens: int,
-        response_model: type[BaseModel] | None = None,
+        response_model: type[BaseModel] | None = None,  # noqa: ARG002 - inherited from abstract method
     ):
         """Create a regular completion with JSON format using Azure OpenAI."""
         supports_reasoning = self._supports_reasoning_features(model)
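Condensed out of the hunk above, the two-API dispatch works as in the standalone sketch below. The prefix check is a stand-in for graphiti's _supports_reasoning_features (which may differ), and the Entity model is illustrative:

from pydantic import BaseModel

class Entity(BaseModel):
    name: str

async def structured_parse(client, model: str, messages, schema: type[BaseModel]):
    # Stand-in for _supports_reasoning_features; the real check lives in
    # azure_openai_client.py and may use different rules.
    if model.startswith(('o1', 'o3', 'gpt-5')):
        # Reasoning models: Responses API takes input= and text_format=
        return await client.responses.parse(
            model=model, input=messages, text_format=schema
        )
    # Non-reasoning models: structured output via beta.chat.completions.parse,
    # which takes messages= and response_format=
    return await client.beta.chat.completions.parse(
        model=model, messages=messages, response_format=schema
    )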
@@ -108,6 +127,39 @@ async def _create_completion(
 
         return await self.client.chat.completions.create(**request_kwargs)
 
+    def _handle_structured_response(self, response: Any) -> dict[str, Any]:
+        """Handle structured response parsing for both reasoning and non-reasoning models.
+
+        For reasoning models (responses.parse): uses response.output_text
+        For regular models (beta.chat.completions.parse): uses response.choices[0].message.parsed
+        """
+        # Check if this is a ParsedChatCompletion (from beta.chat.completions.parse)
+        if hasattr(response, 'choices') and response.choices:
+            # Standard ParsedChatCompletion format
+            message = response.choices[0].message
+            if hasattr(message, 'parsed') and message.parsed:
+                # The parsed object is already a Pydantic model, convert to dict
+                return message.parsed.model_dump()
+            elif hasattr(message, 'refusal') and message.refusal:
+                from graphiti_core.llm_client.errors import RefusalError
+
+                raise RefusalError(message.refusal)
+            else:
+                raise Exception(f'Invalid response from LLM: {response.model_dump()}')
+        elif hasattr(response, 'output_text'):
+            # Reasoning model response format (responses.parse)
+            response_object = response.output_text
+            if response_object:
+                return json.loads(response_object)
+            elif hasattr(response, 'refusal') and response.refusal:
+                from graphiti_core.llm_client.errors import RefusalError
+
+                raise RefusalError(response.refusal)
+            else:
+                raise Exception(f'Invalid response from LLM: {response.model_dump()}')
+        else:
+            raise Exception(f'Unknown response format: {type(response)}')
+
     @staticmethod
     def _supports_reasoning_features(model: str) -> bool:
         """Return True when the Azure model supports reasoning/verbosity options."""
