Add OpenAI-compatible enrichment endpoint support #37
- Add EnrichmentType enum to support both 'ollama' and 'openai' enrichment types
- Implement _enrich_openai() method with OpenAI Vision API support
- Refactor _enrich() to route to appropriate enrichment method based on type
- Add api_key configuration option for OpenAI-compatible authentication
- Update config validation to handle enrichment type and optional api_key
- Update README with documentation for OpenAI-compatible enrichment
- Add config.openai-example.json showing OpenAI configuration example
- Make keep_alive validation conditional (only for Ollama enrichment)

Co-Authored-By: Chris Dzombak <chris@dzombak.com>
Walkthrough

Adds a generic, pluggable vision enrichment system with two types (Ollama, OpenAI-compatible). Updates the README to describe the unified enrichment config. Introduces an OpenAI example config. Extends config parsing to validate enrichment.type and api_key. Implements dispatch in ntfy to call either the Ollama or OpenAI enrichment path, with prompt-file handling and error/timeout handling.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    autonumber
    participant Cam as Camera/Detector
    participant N as Notifier
    participant E as Enrichment (_enrich)
    participant P as Prompt FS
    participant S as Enrichment Service
    note over N,E: Enrichment enabled
    Cam->>N: ObjectNotification
    N->>E: _enrich(notification)
    E->>P: Read prompt file
    alt type == OLLAMA
        E->>S: POST /generate (image URL + prompt)
        S-->>E: JSON { response }
    else type == OPENAI
        E->>E: Base64 image
        E->>S: POST /chat.completions (messages + image) [Auth if api_key]
        S-->>E: JSON { choices[0].message.content }
    end
    E-->>N: Enriched ObjectNotification (or original on error)
    N-->>Cam: Continue pipeline
```
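The routing the diagram describes can be reduced to a small dispatch table. This is an illustrative sketch only, not the project's actual `_enrich` implementation; the `enrich` function name and the placeholder handler bodies are assumptions. It shows the one behavior the diagram guarantees: on any failure, the original notification passes through unchanged.

```python
import enum


class EnrichmentType(enum.Enum):
    OLLAMA = "ollama"
    OPENAI = "openai"


def enrich(etype, notification: str) -> str:
    # Route to the backend for the configured type; placeholder lambdas
    # stand in for the real HTTP calls to Ollama / OpenAI.
    handlers = {
        EnrichmentType.OLLAMA: lambda n: f"{n} [described via Ollama]",
        EnrichmentType.OPENAI: lambda n: f"{n} [described via OpenAI]",
    }
    try:
        return handlers[etype](notification)
    except Exception:
        # Fallback path at the end of the diagram: return the
        # original notification on any error.
        return notification
```

An unknown type (or any handler failure) simply yields the input notification, so the notification pipeline never stalls on enrichment errors.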
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
Pre-merge checks and finishing touches

- ❌ Failed checks (1 warning)
- ✅ Passed checks (2 passed)
@coderabbitai review
✅ Actions performed: Review triggered.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
ntfy.py (1)
314-335: Ollama path: add resp.raise_for_status() and guard non‑JSON responses

Currently you parse JSON without checking HTTP status; non‑JSON bodies will raise and bubble out. Harden this path.
```diff
         try:
             resp = requests.post(
                 self._config.enrichment.endpoint,
                 json={
@@
                 },
                 timeout=self._config.enrichment.timeout_s,
             )
-            parsed = resp.json()
+            resp.raise_for_status()
+            try:
+                parsed = resp.json()
+            except ValueError as e:
+                logger.error(f"enrichment returned non-JSON response: {e}")
+                logger.debug(f"raw response: {resp.text[:512]}")
+                return n
         except requests.Timeout:
             logger.error("enrichment request timed out")
             return n
         except requests.RequestException as e:
             logger.error(f"enrichment failed: {e}")
             return n
```

README.md (1)
11-11: Tiny typos

Quick wording fixes.

```diff
-... results in another notifiation ...
+... results in another notification ...
```

```diff
-To make `drivewway-monitor`'s API securely accessible ...
+To make `driveway-monitor`'s API securely accessible ...
```

```diff
-`driveway-monitor` is designed to facilite monitoring ...
+`driveway-monitor` is designed to facilitate monitoring ...
```

Also applies to: 417-417, 425-425
🧹 Nitpick comments (11)
ntfy.py (6)
75-85: from_str should raise ValueError instead of KeyError

Raising a bare KeyError forces callers to rely on dict semantics. Prefer raising ValueError for bad input to match parsing expectations.
```diff
 class EnrichmentType(Enum):
@@
     @staticmethod
     def from_str(etype: str) -> "EnrichmentType":
-        return {
+        try:
+            return {
                 EnrichmentType.OLLAMA.value.lower(): EnrichmentType.OLLAMA,
                 EnrichmentType.OPENAI.value.lower(): EnrichmentType.OPENAI,
-        }[etype.lower()]
+            }[etype.lower()]
+        except KeyError as e:
+            raise ValueError(f"unknown enrichment type: {etype}") from e
```
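Applied in isolation, the fixed `from_str` behaves like the self-contained sketch below. The lowercase string enum values are an assumption (the project's actual values are not visible in this diff); the error-translation pattern is the point.

```python
import enum


class EnrichmentType(enum.Enum):
    # Assumed values; the real enum may use different casing.
    OLLAMA = "ollama"
    OPENAI = "openai"

    @staticmethod
    def from_str(etype: str) -> "EnrichmentType":
        try:
            return {
                EnrichmentType.OLLAMA.value.lower(): EnrichmentType.OLLAMA,
                EnrichmentType.OPENAI.value.lower(): EnrichmentType.OPENAI,
            }[etype.lower()]
        except KeyError as e:
            # Translate dict-lookup failure into the parsing error
            # callers expect for bad input.
            raise ValueError(f"unknown enrichment type: {etype}") from e
```

Callers can now catch `ValueError` uniformly for all config-parsing failures instead of special-casing `KeyError` from this one path.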
87-97: Don't expose secrets in object repr; hide api_key in EnrichmentConfig

If this dataclass is logged or printed, the API key will be exposed.
```diff
 @dataclasses.dataclass
 class EnrichmentConfig:
@@
-    api_key: Optional[str] = None
+    api_key: Optional[str] = dataclasses.field(default=None, repr=False)
```
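A minimal demonstration of what `repr=False` buys, using illustrative fields (the project's real dataclass has more):

```python
import dataclasses
from typing import Optional


@dataclasses.dataclass
class EnrichmentConfig:
    endpoint: str = ""
    # repr=False keeps the secret out of any log line that prints
    # the dataclass; the field still works normally otherwise.
    api_key: Optional[str] = dataclasses.field(default=None, repr=False)


cfg = EnrichmentConfig(endpoint="https://api.example.com", api_key="sk-secret")
# repr(cfg) includes the endpoint but omits api_key entirely
```

Note that `dataclasses.field(repr=False)` removes the field (name and value) from the generated `__repr__`, so even the key name never appears in logs that print the config object.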
345-347: Reduce log verbosity for raw model output

Raw model output can be large/noisy. Use debug.

```diff
-        logger.info(f"response: {model_resp_str}")
+        logger.debug(f"response: {model_resp_str[:512]}")
```
391-413: OpenAI path: request JSON mode explicitly

Ask for a JSON object to reduce parsing failures across providers that support OpenAI's response_format.

```diff
             json={
                 "model": self._config.enrichment.model,
                 "messages": [
@@
                 ],
-                "max_tokens": 300,
+                "max_tokens": 300,
+                "response_format": {"type": "json_object"},
             },
```

Note: If a specific "OpenAI‑compatible" provider doesn't support response_format yet, this will 4xx; your existing error handling will fall back gracefully. If that becomes noisy, consider a config flag to toggle this.
437-439: Reduce log verbosity for raw model output

Same as the Ollama path: prefer debug and truncate.

```diff
-        logger.info(f"response: {model_resp_str}")
+        logger.debug(f"response: {model_resp_str[:512]}")
```
342-371: DRY: unify model JSON parsing

The "parse -> expect keys -> build ObjectNotification" flow is duplicated. Consider extracting a small helper to parse and validate the model's JSON for both paths.
```diff
+    def _parse_enrichment_json(self, logger, model_resp_str: str) -> Optional[str]:
+        try:
+            model_resp_parsed = json.loads(model_resp_str)
+        except json.JSONDecodeError as e:
+            logger.info(f"enrichment model did not produce valid JSON: {e}")
+            logger.debug(f"response: {model_resp_str[:512]}")
+            return None
+        if "type" not in model_resp_parsed and "error" not in model_resp_parsed:
+            logger.info("enrichment model did not produce expected JSON keys")
+            return None
+        desc = model_resp_parsed.get("desc", "unknown")
+        if not desc or desc == "unknown":
+            logger.info(
+                f"enrichment model could not produce a useful description: "
+                f"{model_resp_parsed.get('error', '(no error returned)')}"
+            )
+            return None
+        return desc
```

Then call this helper in both `_enrich_ollama` and `_enrich_openai`.
Also applies to: 441-463
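For reference, here is the suggested helper as a standalone, runnable function (free function with a module-level logger instead of a method, so it can be exercised in isolation; the name and the expected JSON keys mirror the diff above):

```python
import json
import logging
from typing import Optional

logger = logging.getLogger(__name__)


def parse_enrichment_json(model_resp_str: str) -> Optional[str]:
    """Validate the model's JSON reply; return the description or None."""
    try:
        parsed = json.loads(model_resp_str)
    except json.JSONDecodeError as e:
        logger.info(f"enrichment model did not produce valid JSON: {e}")
        logger.debug(f"response: {model_resp_str[:512]}")
        return None
    # The model is expected to emit either a classification ("type")
    # or an "error" key; anything else is malformed.
    if "type" not in parsed and "error" not in parsed:
        logger.info("enrichment model did not produce expected JSON keys")
        return None
    desc = parsed.get("desc", "unknown")
    if not desc or desc == "unknown":
        logger.info("enrichment model could not produce a useful description")
        return None
    return desc
```

Both enrichment paths would then reduce to "call the backend, get a string, call the shared parser," so a format change in the model contract is fixed in one place.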
config.py (1)
304-317: Optional: support env‑based secrets to keep configs clean

Consider allowing `api_key` to be sourced from an environment variable (e.g., accept values like `"env:OPENAI_API_KEY"`).
Happy to draft a small helper for “env:” expansion if you want it in this PR.
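One possible shape for that helper, assuming the `env:` prefix convention suggested above (the function name `expand_env_value` is hypothetical):

```python
import os


def expand_env_value(value: str) -> str:
    # "env:NAME" resolves to the environment variable NAME at load
    # time; any other value passes through unchanged, so existing
    # literal api_key configs keep working.
    if value.startswith("env:"):
        name = value[len("env:"):]
        resolved = os.environ.get(name)
        if resolved is None:
            raise ValueError(f"environment variable {name} is not set")
        return resolved
    return value
```

Config parsing would call this on `api_key` (and possibly `token`) after reading the JSON, keeping secrets out of committed config files while leaving the schema unchanged.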
config.openai-example.json (1)
27-37: Avoid including secret‑looking placeholders; mitigate gitleaks noise

Static analysis flags the sample token/api_key as secrets. Use unmistakably fake values to avoid false positives, or omit them entirely.
```diff
   "notifier": {
     "server": "https://ntfy.example.com",
-    "token": "tk_0123456789ABCDEF",
+    "token": "REPLACE_ME_TOKEN",
@@
   "enrichment": {
     "enable": true,
     "type": "openai",
     "endpoint": "https://api.openai.com/v1/chat/completions",
     "model": "gpt-4o",
-    "api_key": "sk-your-api-key-here",
+    "api_key": "REPLACE_ME_OPENAI_API_KEY",
```

Alternatively, remove these fields from the example and document setting them via environment variables.
Also applies to: 39-49
README.md (3)
9-10: Broaden optional enrichment blurb to match new OpenAI-compatible support

This paragraph still mentions only Ollama. Suggest updating to reflect both enrichment types.

```diff
-Optionally, `driveway-monitor` can also use an instance of [Ollama](https://ollama.com) to provide a detailed description of the object that triggered the notification.
+Optionally, `driveway-monitor` can enrich notifications using either an instance of [Ollama](https://ollama.com) or an OpenAI‑compatible vision API to provide a detailed description of the object that triggered the notification.
```
171-176: Clarify timeout default and prompt-file fallback behavior

- Document the default of `enrichment.timeout_s` (if any).
- State what happens when a classification has no entry in `prompt_files` (skip enrichment vs. fallback prompt).

Example tweak:

```diff
-You can change the timeout for enrichment to generate a response by setting `enrichment.timeout_s`...
+You can change the timeout for enrichment by setting `enrichment.timeout_s` (default: <document default>)...
```

```diff
-Using enrichment requires providing a _prompt file_ for each YOLO object classification ...
+Provide a _prompt file_ for each YOLO classification you want to enrich. If a classification is missing, the system will <skip enrichment | use the default prompt at path X> (document actual behavior). Prompt files should be UTF‑8 text.
```
210-218: Config schema: add small safety notes

- Explicitly note that `api_key` should be sourced from env/secrets and not committed.
- For Azure/compatible providers, mention that the `endpoint` format may differ and may require provider-specific query params.

```diff
-  - `api_key`: (Optional) API key for authentication. Used for OpenAI-compatible endpoints.
+  - `api_key`: (Optional) API key for authentication. Used for OpenAI-compatible endpoints. Prefer providing via environment variable or Docker secret; avoid committing secrets to `config.json`.
```

```diff
-  - `endpoint`: Complete URL to the API endpoint. For Ollama: e.g. `http://localhost:11434/api/generate`. For OpenAI-compatible: e.g. `https://api.openai.com/v1/chat/completions`.
+  - `endpoint`: Complete URL to the API endpoint. For Ollama: e.g. `http://localhost:11434/api/generate`. For OpenAI-compatible: e.g. `https://api.openai.com/v1/chat/completions`. Some providers (e.g., Azure OpenAI) use different paths and require an `api-version` query parameter.
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- README.md (2 hunks)
- config.openai-example.json (1 hunks)
- config.py (3 hunks)
- ntfy.py (3 hunks)
🧰 Additional context used
🪛 Gitleaks (8.28.0)
config.openai-example.json
[high] 29-29: Detected a Generic API Key, potentially exposing access to various services and sensitive operations.
(generic-api-key)
🔇 Additional comments (6)
ntfy.py (2)
293-300: Dispatch looks good

Clear routing to Ollama vs OpenAI with a safe fallback log. LGTM.
467-468: Skip preload for non‑Ollama: good

Early return avoids unnecessary calls. LGTM.
config.py (1)
252-262: Type parsing/validation looks correct

Gracefully maps strings to EnrichmentType with a clear error for invalid values. LGTM.
README.md (3)
153-156: Enrichment overview reads well

Clear and matches the new capabilities.

163-164: Ollama example looks correct

Endpoint and default type note are accurate.
165-170: OpenAI-compatible enrichment: update to current Chat/Responses API and secure key handling
- Replace the legacy `/v1/chat/completions` endpoint with the modern Chat/Responses API endpoint (e.g. `/v1/chat/responses`) and target the latest multimodal model id (check the official OpenAI API docs for today's exact model identifier).
- Supply `enrichment.api_key` via an environment variable or Docker secret instead of embedding it in JSON.
- Note Azure OpenAI (and other compatible providers) may use different header names (`api-key` vs `Authorization: Bearer`) and require an `api-version` query parameter.

Please verify the canonical endpoint path and current multimodal model id in the OpenAI API documentation and update the README accordingly.
```python
if cfg.notifier.enrichment.type == EnrichmentType.OLLAMA:
    cfg.notifier.enrichment.keep_alive = enrichment_dict.get(
        "keep_alive", cfg.notifier.enrichment.keep_alive
    )
    if not isinstance(cfg.notifier.enrichment.keep_alive, str):
        raise ConfigValidationError("enrichment.keep_alive must be a str")
cfg.notifier.enrichment.api_key = enrichment_dict.get(
    "api_key", cfg.notifier.enrichment.api_key
)
if not isinstance(cfg.notifier.enrichment.keep_alive, str):
    raise ConfigValidationError("enrichment.keep_alive must be a str")
if cfg.notifier.enrichment.api_key is not None and not isinstance(
    cfg.notifier.enrichment.api_key, str
):
    raise ConfigValidationError("enrichment.api_key must be a string")
```
Prevent sending API keys over plain HTTP
If type=openai and an api_key is provided, require an HTTPS endpoint to avoid leaking secrets.
```diff
 if cfg.notifier.enrichment.type == EnrichmentType.OLLAMA:
     cfg.notifier.enrichment.keep_alive = enrichment_dict.get(
         "keep_alive", cfg.notifier.enrichment.keep_alive
     )
     if not isinstance(cfg.notifier.enrichment.keep_alive, str):
         raise ConfigValidationError("enrichment.keep_alive must be a str")
 cfg.notifier.enrichment.api_key = enrichment_dict.get(
     "api_key", cfg.notifier.enrichment.api_key
 )
 if cfg.notifier.enrichment.api_key is not None and not isinstance(
     cfg.notifier.enrichment.api_key, str
 ):
     raise ConfigValidationError("enrichment.api_key must be a string")
+if (
+    cfg.notifier.enrichment.type == EnrichmentType.OPENAI
+    and cfg.notifier.enrichment.api_key
+    and cfg.notifier.enrichment.endpoint.casefold().startswith("http://")
+):
+    raise ConfigValidationError(
+        "enrichment.endpoint must use https:// when enrichment.api_key is set"
+    )
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
if cfg.notifier.enrichment.type == EnrichmentType.OLLAMA:
    cfg.notifier.enrichment.keep_alive = enrichment_dict.get(
        "keep_alive", cfg.notifier.enrichment.keep_alive
    )
    if not isinstance(cfg.notifier.enrichment.keep_alive, str):
        raise ConfigValidationError("enrichment.keep_alive must be a str")
cfg.notifier.enrichment.api_key = enrichment_dict.get(
    "api_key", cfg.notifier.enrichment.api_key
)
if cfg.notifier.enrichment.api_key is not None and not isinstance(
    cfg.notifier.enrichment.api_key, str
):
    raise ConfigValidationError("enrichment.api_key must be a string")
if (
    cfg.notifier.enrichment.type == EnrichmentType.OPENAI
    and cfg.notifier.enrichment.api_key
    and cfg.notifier.enrichment.endpoint.casefold().startswith("http://")
):
    raise ConfigValidationError(
        "enrichment.endpoint must use https:// when enrichment.api_key is set"
    )
```
🤖 Prompt for AI Agents
In config.py around lines 304-317, add validation for EnrichmentType.OPENAI:
when cfg.notifier.enrichment.type == EnrichmentType.OPENAI and an api_key is
provided, ensure the enrichment endpoint (cfg.notifier.enrichment.endpoint) is
present and starts with "https://"; if it is missing or uses plain "http://"
raise ConfigValidationError("enrichment.endpoint must be an HTTPS URL when using
an api_key") so API keys are not sent over plain HTTP.
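Reduced to a standalone function, the check could look like this sketch. The function name and the string-typed `enrichment_type` parameter are illustrative; the real code compares against `EnrichmentType.OPENAI` on the parsed config object.

```python
from typing import Optional


def check_endpoint_scheme(
    enrichment_type: str, api_key: Optional[str], endpoint: str
) -> None:
    # Reject plain-HTTP endpoints whenever an API key would be sent,
    # so the Bearer token can never leak over an unencrypted link.
    if (
        enrichment_type == "openai"
        and api_key
        and endpoint.casefold().startswith("http://")
    ):
        raise ValueError(
            "enrichment.endpoint must use https:// when enrichment.api_key is set"
        )
```

The `casefold()` guards against mixed-case schemes like `HTTP://`; the Ollama path (typically keyless, on a local network) is deliberately untouched.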
This PR adds support for using OpenAI-compatible vision API endpoints for enrichment, in addition to the existing Ollama support.
Changes
- Add `EnrichmentType` enum to support both 'ollama' and 'openai' enrichment types
- Implement `_enrich_openai()` method with OpenAI Vision API support using the standard chat completions format
- Refactor `_enrich()` to route to the appropriate enrichment method based on configured type
- Add `api_key` configuration option for OpenAI-compatible authentication
- Make `keep_alive` validation conditional (only required for Ollama enrichment)
- Add `config.openai-example.json` showing an OpenAI configuration example

Configuration
The new `enrichment.type` field supports two values:

- `ollama` (default): Uses the existing Ollama API format
- `openai`: Uses the OpenAI Vision API format with Bearer token authentication

Example OpenAI configuration:
```json
{
  "enrichment": {
    "enable": true,
    "type": "openai",
    "endpoint": "https://api.openai.com/v1/chat/completions",
    "model": "gpt-4o",
    "api_key": "sk-...",
    "timeout_s": 10,
    "prompt_files": {
      "car": "enrichment-prompts/llava_prompt_car.txt"
    }
  }
}
```

Compatibility
This change is fully backward compatible: existing configurations without the `type` field will default to `ollama` behavior.

Link to Devin run: https://app.devin.ai/sessions/bfc8622697694196822f707afb40690c
Requested by: Chris Dzombak (chris@dzombak.com) (@cdzombak)
Summary by CodeRabbit
New Features
Documentation