
Add OpenAI-compatible enrichment endpoint support#37

Merged
cdzombak merged 2 commits into main from devin/1759171879-openai-enrichment-support
Sep 29, 2025
Conversation

@devin-ai-integration
Contributor

@devin-ai-integration devin-ai-integration bot commented Sep 29, 2025

This PR adds support for using OpenAI-compatible vision API endpoints for enrichment, in addition to the existing Ollama support.

Changes

  • Added EnrichmentType enum to support both 'ollama' and 'openai' enrichment types
  • Implemented _enrich_openai() method with OpenAI Vision API support using the standard chat completions format
  • Refactored _enrich() to route to the appropriate enrichment method based on configured type
  • Added api_key configuration option for OpenAI-compatible authentication
  • Updated config validation to handle enrichment type and optional api_key
  • Made keep_alive validation conditional (only required for Ollama enrichment)
  • Updated README with comprehensive documentation for both enrichment types
  • Added config.openai-example.json showing OpenAI configuration example
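As a rough sketch of the enum and routing described above (illustrative only; the string returns here are stand-ins for the PR's actual `_enrich_ollama`/`_enrich_openai` implementations):

```python
import enum

class EnrichmentType(enum.Enum):
    OLLAMA = "ollama"
    OPENAI = "openai"

    @staticmethod
    def from_str(etype: str) -> "EnrichmentType":
        # Case-insensitive lookup by enum value; an unknown string raises ValueError.
        return EnrichmentType(etype.lower())

def enrich(etype: EnrichmentType, notification):
    # Route to the backend-specific enrichment path based on the configured type.
    if etype is EnrichmentType.OLLAMA:
        return "via-ollama"   # stand-in for _enrich_ollama(notification)
    if etype is EnrichmentType.OPENAI:
        return "via-openai"   # stand-in for _enrich_openai(notification)
    return notification  # unrecognized type: return the notification unchanged
```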

Configuration

The new enrichment.type field supports two values:

  • ollama (default): Uses existing Ollama API format
  • openai: Uses OpenAI Vision API format with Bearer token authentication

Example OpenAI configuration:

{
  "enrichment": {
    "enable": true,
    "type": "openai",
    "endpoint": "https://api.openai.com/v1/chat/completions",
    "model": "gpt-4o",
    "api_key": "sk-...",
    "timeout_s": 10,
    "prompt_files": {
      "car": "enrichment-prompts/llava_prompt_car.txt"
    }
  }
}
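With type set to openai, the request follows the standard chat completions vision format. A minimal sketch of the payload and auth-header construction (function names are illustrative, not the PR's; the image is assumed to be JPEG bytes):

```python
import base64
from typing import Optional

def build_openai_payload(model: str, prompt: str, image_bytes: bytes) -> dict:
    # The vision format embeds the image as a base64 data URL inside a
    # multimodal "user" message alongside the text prompt.
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {
                        "type": "image_url",
                        "image_url": {"url": f"data:image/jpeg;base64,{b64}"},
                    },
                ],
            }
        ],
        "max_tokens": 300,
    }

def auth_headers(api_key: Optional[str]) -> dict:
    # Bearer token auth is attached only when an api_key is configured.
    return {"Authorization": f"Bearer {api_key}"} if api_key else {}
```

The payload would then be POSTed to the configured endpoint, e.g. `requests.post(endpoint, json=payload, headers=auth_headers(api_key), timeout=timeout_s)`.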

Compatibility

This change is fully backward compatible: existing configurations without the type field will default to ollama behavior.


Link to Devin run: https://app.devin.ai/sessions/bfc8622697694196822f707afb40690c

Requested by: Chris Dzombak (chris@dzombak.com) (@cdzombak)

Summary by CodeRabbit

  • New Features

    • Added OpenAI-compatible vision enrichment alongside existing Ollama option.
    • New enrichment.type selector (openai or ollama) with optional API key support.
    • Unified enrichment timeout setting and improved validation with clear errors for invalid types.
    • Ollama-specific keep_alive now applies only when using Ollama.
  • Documentation

    • Updated README to describe a generic vision AI enrichment flow and revised config keys.
    • Added config.openai-example.json demonstrating OpenAI setup, prompts per class, and notifier/web settings.

- Add EnrichmentType enum to support both 'ollama' and 'openai' enrichment types
- Implement _enrich_openai() method with OpenAI Vision API support
- Refactor _enrich() to route to appropriate enrichment method based on type
- Add api_key configuration option for OpenAI-compatible authentication
- Update config validation to handle enrichment type and optional api_key
- Update README with documentation for OpenAI-compatible enrichment
- Add config.openai-example.json showing OpenAI configuration example
- Make keep_alive validation conditional (only for Ollama enrichment)

Co-Authored-By: Chris Dzombak <chris@dzombak.com>
@devin-ai-integration
Contributor Author

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring

@coderabbitai
Copy link

coderabbitai bot commented Sep 29, 2025

Walkthrough

Adds a generic, pluggable vision enrichment system with two types (Ollama, OpenAI-compatible). Updates README to describe unified enrichment config. Introduces an OpenAI example config. Extends config parsing to validate enrichment.type and api_key. Implements dispatch in ntfy to call either Ollama or OpenAI enrichment paths with prompt-file handling and error/timeout handling.

Changes

  • Documentation (README.md): Rewrites enrichment docs to a generic vision AI API model; documents new fields (enable, type, endpoint, model, prompt_files, timeout_s, api_key, keep_alive); adjusts general timeout language; minor text fixes.
  • Config Example (config.openai-example.json): Adds an OpenAI-compatible example config showing enrichment.type=openai, endpoint/model/api_key, prompt_files per class, and existing model/tracker/notifier/web settings.
  • Config Parsing & Validation (config.py): Imports EnrichmentType and parses enrichment.type via from_str; validates api_key (string) and keep_alive only for Ollama; guards invalid types.
  • Enrichment Engine (ntfy.py): Adds the EnrichmentType enum; extends EnrichmentConfig with type and api_key; routes _enrich to _enrich_ollama or _enrich_openai; implements both paths with prompt reading, request/timeout handling, response parsing, and fallback behavior; keeps preload limited to Ollama.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant Cam as Camera/Detector
  participant N as Notifier
  participant E as Enrichment (_enrich)
  participant P as Prompt FS
  participant S as Enrichment Service
  note over N,E: Enrichment enabled

  Cam->>N: ObjectNotification
  N->>E: _enrich(notification)
  E->>P: Read prompt file
  alt type == OLLAMA
    E->>S: POST /generate (image URL + prompt)
    S-->>E: JSON { response }
  else type == OPENAI
    E->>E: Base64 image
    E->>S: POST /chat/completions (messages + image) [Auth if api_key]
    S-->>E: JSON { choices[0].message.content }
  end
  E-->>N: Enriched ObjectNotification (or original on error)
  N-->>Cam: Continue pipeline

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

I twitch my whiskers at branching flows,
Two paths to wisdom where the prompt wind blows.
Ollama’s burrow, OpenAI sky,
I hop between with a keen, bright eye.
Logs in my paws, timeouts kept tight—
Enriched and swift, I bound into night. 🐇✨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

  • Docstring Coverage ⚠️ Warning: Docstring coverage is 16.67%, below the required threshold of 80.00%. Run @coderabbitai generate docstrings to improve coverage.

✅ Passed checks (2 passed)

  • Description Check ✅ Passed: Skipped because CodeRabbit's high-level summary is enabled.
  • Title Check ✅ Passed: The title succinctly captures the primary change (adding OpenAI-compatible enrichment endpoint support alongside existing Ollama support) and is concise, clear, and free of extraneous detail.

@cdzombak
Owner

@coderabbitai review

@coderabbitai

coderabbitai bot commented Sep 29, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
ntfy.py (1)

314-335: Ollama path: add resp.raise_for_status() and guard non‑JSON responses

Currently you parse JSON without checking HTTP status; non‑JSON bodies will raise and bubble out. Harden this path.

         try:
             resp = requests.post(
                 self._config.enrichment.endpoint,
                 json={
@@
                 },
                 timeout=self._config.enrichment.timeout_s,
             )
-            parsed = resp.json()
+            resp.raise_for_status()
+            try:
+                parsed = resp.json()
+            except ValueError as e:
+                logger.error(f"enrichment returned non-JSON response: {e}")
+                logger.debug(f"raw response: {resp.text[:512]}")
+                return n
         except requests.Timeout:
             logger.error("enrichment request timed out")
             return n
         except requests.RequestException as e:
             logger.error(f"enrichment failed: {e}")
             return n
README.md (1)

11-11: Tiny typos

Quick wording fixes.

-... results in another notifiation ...
+... results in another notification ...
-To make `drivewway-monitor`'s API securely accessible ...
+To make `driveway-monitor`'s API securely accessible ...
-`driveway-monitor` is designed to facilite monitoring ...
+`driveway-monitor` is designed to facilitate monitoring ...

Also applies to: 417-417, 425-425

🧹 Nitpick comments (11)
ntfy.py (6)

75-85: from_str should raise ValueError instead of KeyError

Returning a KeyError forces callers to rely on dict semantics. Prefer raising ValueError for bad input to match parsing expectations.

 class EnrichmentType(Enum):
@@
     @staticmethod
     def from_str(etype: str) -> "EnrichmentType":
-        return {
+        try:
+            return {
             EnrichmentType.OLLAMA.value.lower(): EnrichmentType.OLLAMA,
             EnrichmentType.OPENAI.value.lower(): EnrichmentType.OPENAI,
-        }[etype.lower()]
+            }[etype.lower()]
+        except KeyError as e:
+            raise ValueError(f"unknown enrichment type: {etype}") from e

87-97: Don’t expose secrets in object repr; hide api_key in EnrichmentConfig

If this dataclass is logged or printed, the API key will be exposed.

 @dataclasses.dataclass
 class EnrichmentConfig:
@@
-    api_key: Optional[str] = None
+    api_key: Optional[str] = dataclasses.field(default=None, repr=False)

345-347: Reduce log verbosity for raw model output

Raw model output can be large/noisy. Use debug.

-            logger.info(f"response: {model_resp_str}")
+            logger.debug(f"response: {model_resp_str[:512]}")

391-413: OpenAI path: request JSON mode explicitly

Ask for a JSON object to reduce parsing failures across providers that support OpenAI’s response_format.

                 json={
                     "model": self._config.enrichment.model,
                     "messages": [
@@
                     ],
-                    "max_tokens": 300,
+                    "max_tokens": 300,
+                    "response_format": {"type": "json_object"},
                 },

Note: If a specific “OpenAI‑compatible” provider doesn’t support response_format yet, this will 4xx; your existing error handling will fall back gracefully. If that becomes noisy, consider a config flag to toggle this.


437-439: Reduce log verbosity for raw model output

Same as the Ollama path: prefer debug and truncate.

-            logger.info(f"response: {model_resp_str}")
+            logger.debug(f"response: {model_resp_str[:512]}")

342-371: DRY: unify model JSON parsing

The “parse -> expect keys -> build ObjectNotification” flow is duplicated. Consider extracting a small helper to parse and validate the model’s JSON for both paths.

+    def _parse_enrichment_json(self, logger, model_resp_str: str) -> Optional[str]:
+        try:
+            model_resp_parsed = json.loads(model_resp_str)
+        except json.JSONDecodeError as e:
+            logger.info(f"enrichment model did not produce valid JSON: {e}")
+            logger.debug(f"response: {model_resp_str[:512]}")
+            return None
+        if "type" not in model_resp_parsed and "error" not in model_resp_parsed:
+            logger.info("enrichment model did not produce expected JSON keys")
+            return None
+        desc = model_resp_parsed.get("desc", "unknown")
+        if not desc or desc == "unknown":
+            logger.info(
+                f"enrichment model could not produce a useful description: "
+                f"{model_resp_parsed.get('error', '(no error returned)')}"
+            )
+            return None
+        return desc

Then call this helper in both _enrich_ollama and _enrich_openai.

Also applies to: 441-463

config.py (1)

304-317: Optional: support env‑based secrets to keep configs clean

Consider allowing api_key to be sourced from an environment variable (e.g., accept values like "env:OPENAI_API_KEY").

Happy to draft a small helper for “env:” expansion if you want it in this PR.
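Such a helper could be as small as the following sketch (the "env:" prefix convention is this suggestion's, not something the PR implements):

```python
import os
from typing import Optional

def expand_env_secret(value: Optional[str]) -> Optional[str]:
    # Values of the form "env:VAR_NAME" are replaced with the contents of
    # the named environment variable; any other value passes through as-is.
    if value and value.startswith("env:"):
        var_name = value[len("env:"):]
        resolved = os.environ.get(var_name)
        if resolved is None:
            raise ValueError(f"environment variable {var_name} is not set")
        return resolved
    return value
```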

config.openai-example.json (1)

27-37: Avoid including secret‑looking placeholders; mitigate gitleaks noise

Static analysis flags the sample token/api_key as secrets. Use unmistakably fake values to avoid false positives, or omit them entirely.

   "notifier": {
     "server": "https://ntfy.example.com",
-    "token": "tk_0123456789ABCDEF",
+    "token": "REPLACE_ME_TOKEN",
@@
   "enrichment": {
     "enable": true,
     "type": "openai",
     "endpoint": "https://api.openai.com/v1/chat/completions",
     "model": "gpt-4o",
-    "api_key": "sk-your-api-key-here",
+    "api_key": "REPLACE_ME_OPENAI_API_KEY",

Alternatively, remove these fields from the example and document setting them via environment variables.

Also applies to: 39-49

README.md (3)

9-10: Broaden optional enrichment blurb to match new OpenAI-compatible support

This paragraph still mentions only Ollama. Suggest updating to reflect both enrichment types.

-Optionally, `driveway-monitor` can also use an instance of [Ollama](https://ollama.com) to provide a detailed description of the object that triggered the notification.
+Optionally, `driveway-monitor` can enrich notifications using either an instance of [Ollama](https://ollama.com) or an OpenAI‑compatible vision API to provide a detailed description of the object that triggered the notification.

171-176: Clarify timeout default and prompt-file fallback behavior

  • Document the default of enrichment.timeout_s (if any).
  • State what happens when a classification has no entry in prompt_files (skip enrichment vs. fallback prompt).

Example tweak:

-You can change the timeout for enrichment to generate a response by setting `enrichment.timeout_s`...
+You can change the timeout for enrichment by setting `enrichment.timeout_s` (default: <document default>)...
-Using enrichment requires providing a _prompt file_ for each YOLO object classification ...
+Provide a _prompt file_ for each YOLO classification you want to enrich. If a classification is missing, the system will <skip enrichment | use the default prompt at path X> (document actual behavior). Prompt files should be UTF‑8 text.

210-218: Config schema: add small safety notes

  • Explicitly note that api_key should be sourced from env/secrets and not committed.
  • For Azure/compatible providers, mention that endpoint format may differ and may require provider-specific query params.
-  - `api_key`: (Optional) API key for authentication. Used for OpenAI-compatible endpoints.
+  - `api_key`: (Optional) API key for authentication. Used for OpenAI-compatible endpoints. Prefer providing via environment variable or Docker secret; avoid committing secrets to `config.json`.
-  - `endpoint`: Complete URL to the API endpoint. For Ollama: e.g. `http://localhost:11434/api/generate`. For OpenAI-compatible: e.g. `https://api.openai.com/v1/chat/completions`.
+  - `endpoint`: Complete URL to the API endpoint. For Ollama: e.g. `http://localhost:11434/api/generate`. For OpenAI-compatible: e.g. `https://api.openai.com/v1/chat/completions`. Some providers (e.g., Azure OpenAI) use different paths and require an `api-version` query parameter.
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 22103ec and f03810d.

📒 Files selected for processing (4)
  • README.md (2 hunks)
  • config.openai-example.json (1 hunks)
  • config.py (3 hunks)
  • ntfy.py (3 hunks)
🧰 Additional context used
🪛 Gitleaks (8.28.0)
config.openai-example.json

[high] 29-29: Detected a Generic API Key, potentially exposing access to various services and sensitive operations.

(generic-api-key)

🔇 Additional comments (6)
ntfy.py (2)

293-300: Dispatch looks good

Clear routing to Ollama vs OpenAI with a safe fallback log. LGTM.


467-468: Skip preload for non‑Ollama: good

Early return avoids unnecessary calls. LGTM.

config.py (1)

252-262: Type parsing/validation looks correct

Gracefully maps strings to EnrichmentType with a clear error for invalid values. LGTM.

README.md (3)

153-156: Enrichment overview reads well

Clear and matches the new capabilities.


163-164: Ollama example looks correct

Endpoint and default type note are accurate.


165-170: OpenAI-compatible enrichment: note the Responses API and secure key handling

  • /v1/chat/completions remains the most widely supported format among OpenAI-compatible providers; the newer Responses API endpoint (/v1/responses) is an alternative for OpenAI itself. Target a current multimodal model id (check the official OpenAI API docs for today's exact model identifier).
  • Supply enrichment.api_key via an environment variable or Docker secret instead of embedding it in JSON.
  • Note Azure OpenAI (and other compatible providers) may use different header names (api-key vs Authorization: Bearer) and require an api-version query parameter.

Please verify the canonical endpoint path and current multimodal model id in the OpenAI API documentation and update the README accordingly.

Comment on lines +304 to 317
if cfg.notifier.enrichment.type == EnrichmentType.OLLAMA:
cfg.notifier.enrichment.keep_alive = enrichment_dict.get(
"keep_alive", cfg.notifier.enrichment.keep_alive
)
if not isinstance(cfg.notifier.enrichment.keep_alive, str):
raise ConfigValidationError("enrichment.keep_alive must be a str")
cfg.notifier.enrichment.api_key = enrichment_dict.get(
"api_key", cfg.notifier.enrichment.api_key
)
if not isinstance(cfg.notifier.enrichment.keep_alive, str):
raise ConfigValidationError("enrichment.keep_alive must be a str")
if cfg.notifier.enrichment.api_key is not None and not isinstance(
cfg.notifier.enrichment.api_key, str
):
raise ConfigValidationError("enrichment.api_key must be a string")


⚠️ Potential issue | 🟠 Major

Prevent sending API keys over plain HTTP

If type=openai and an api_key is provided, require an HTTPS endpoint to avoid leaking secrets.

         if cfg.notifier.enrichment.type == EnrichmentType.OLLAMA:
             cfg.notifier.enrichment.keep_alive = enrichment_dict.get(
                 "keep_alive", cfg.notifier.enrichment.keep_alive
             )
             if not isinstance(cfg.notifier.enrichment.keep_alive, str):
                 raise ConfigValidationError("enrichment.keep_alive must be a str")
         cfg.notifier.enrichment.api_key = enrichment_dict.get(
             "api_key", cfg.notifier.enrichment.api_key
         )
         if cfg.notifier.enrichment.api_key is not None and not isinstance(
             cfg.notifier.enrichment.api_key, str
         ):
             raise ConfigValidationError("enrichment.api_key must be a string")
+        if (
+            cfg.notifier.enrichment.type == EnrichmentType.OPENAI
+            and cfg.notifier.enrichment.api_key
+            and cfg.notifier.enrichment.endpoint.casefold().startswith("http://")
+        ):
+            raise ConfigValidationError(
+                "enrichment.endpoint must use https:// when enrichment.api_key is set"
+            )
🤖 Prompt for AI Agents
In config.py around lines 304-317, add validation for EnrichmentType.OPENAI:
when cfg.notifier.enrichment.type == EnrichmentType.OPENAI and an api_key is
provided, ensure the enrichment endpoint (cfg.notifier.enrichment.endpoint) is
present and starts with "https://"; if it is missing or uses plain "http://"
raise ConfigValidationError("enrichment.endpoint must be an HTTPS URL when using
an api_key") so API keys are not sent over plain HTTP.

@cdzombak cdzombak merged commit 04665ac into main Sep 29, 2025
13 checks passed
