Conversation
Signed-off-by: Boris Bliznioukov <blib@mail.com>
Pull request overview
Adds support for configuring a larger local model context window (via chat request options) to reduce silent failures when processing exceeds the default context limit.
Changes:
- Extend local chat request payloads to include `options.num_ctx` (sketched below).
- Add a human-readable error mapping for context-limit related failures.
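For reference, a minimal sketch of the payload shape this change points at, assuming `ChatOptions` is a small `Encodable` wrapper; apart from Ollama's documented `options.num_ctx` field, the type and property names below are illustrative, not the PR's actual declarations:

```swift
import Foundation

// Illustrative types only; the real OllamaProvider request models may differ.
struct ChatOptions: Encodable {
    let num_ctx: Int
}

struct LocalChatRequest: Encodable {
    let model: String
    let messages: [[String: String]]
    let options: ChatOptions?   // nil is omitted from the encoded JSON (no override)
}

let body = LocalChatRequest(
    model: "llama3",            // placeholder model name
    messages: [["role": "user", "content": "Summarize the last recording segment."]],
    options: ChatOptions(num_ctx: 8192)   // raise the window above the ~4096-token default
)
let payload = try? JSONEncoder().encode(body)  // JSON includes "options":{"num_ctx":8192}
```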
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| Dayflow/Dayflow/Core/AI/OllamaProvider.swift | Adds options.num_ctx to local chat-completions requests based on a stored/default context size. |
| Dayflow/Dayflow/Core/AI/LLMService.swift | Improves user-facing error message when failures appear related to context limits. |
```swift
private var preferredContextSize: Int? {
    let stored = UserDefaults.standard.integer(forKey: "llmLocalContextSize")
    if stored > 0 { return stored }
    return 8192
}
```
preferredContextSize reads llmLocalContextSize, but this key isn’t written anywhere else in the codebase, so it will always fall back to 8192 for all users. That means the app will always override the backend/model default context size even when the user hasn’t opted in, and there’s no way to disable it (setting 0 still becomes 8192). Consider returning nil when no explicit value is set (e.g., by checking object(forKey:)), and only applying a default when you’re sure it’s safe for the selected engine/model.
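A minimal sketch of that suggestion, assuming the same defaults key and that callers treat `nil` as "don't send `num_ctx` at all":

```swift
private var preferredContextSize: Int? {
    let defaults = UserDefaults.standard
    // `integer(forKey:)` alone can't distinguish "unset" from an explicit 0,
    // so check for the presence of the key first.
    guard defaults.object(forKey: "llmLocalContextSize") != nil else {
        return nil  // no explicit value: keep the backend/model default
    }
    let stored = defaults.integer(forKey: "llmLocalContextSize")
    return stored > 0 ? stored : nil  // 0 or a negative value also disables the override
}
```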
```swift
private func makeLocalOptions() -> ChatOptions? {
```
makeLocalOptions() currently applies options.num_ctx unconditionally, but OllamaProvider is also used for LM Studio and “custom” OpenAI-compatible endpoints (see isLMStudio/isCustomEngine and the /v1/chat/completions URL). Since options/num_ctx are non-standard for OpenAI chat-completions, this can break strict servers. Consider gating these options to engines/endpoints that explicitly support them (or feature-detect via a capability check).
Suggested change:

```swift
private func makeLocalOptions() -> ChatOptions? {
    // Only attach non-standard `options`/`num_ctx` for native Ollama endpoints.
    // LM Studio and custom engines typically use strict OpenAI-compatible schemas.
    if isLMStudio || isCustomEngine {
        return nil
    }
```
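Putting that gate together with a nil-able `preferredContextSize`, one possible shape for the whole function; the `ChatOptions(num_ctx:)` initializer and the `isLMStudio`/`isCustomEngine` flags are assumed from the surrounding provider code, not quoted from it:

```swift
private func makeLocalOptions() -> ChatOptions? {
    // Only attach non-standard `options`/`num_ctx` for native Ollama endpoints;
    // LM Studio and custom OpenAI-compatible engines may reject unknown fields.
    if isLMStudio || isCustomEngine { return nil }
    // Skip the override entirely when the user hasn't configured a context size.
    guard let contextSize = preferredContextSize else { return nil }
    return ChatOptions(num_ctx: contextSize)
}
```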
```swift
switch true {
case errorDescription.contains("context window") || errorDescription.contains("context length") || errorDescription.contains("max context") || errorDescription.contains("num_ctx"):
    return "The local AI hit its context limit. Increase the context size in your local model settings (e.g., LM Studio or Ollama) or process a shorter recording."
```
This fallback maps any “context length/window” error to a local AI instruction, but getHumanReadableError is used for all providers (Gemini/Dayflow backend/etc.). If a cloud provider returns a context-limit error, this message will be misleading. Consider branching on providerType (or the error domain) to emit local-specific guidance only for .ollamaLocal, and a more general context-limit message for other providers.
| return "The local AI hit its context limit. Increase the context size in your local model settings (e.g., LM Studio or Ollama) or process a shorter recording." | |
| if case .ollamaLocal = providerType { | |
| // Local AI-specific guidance | |
| return "The local AI hit its context limit. Increase the context size in your local model settings (e.g., LM Studio or Ollama) or process a shorter recording." | |
| } else { | |
| // Provider-agnostic context limit guidance | |
| return "The AI hit its context limit. Try processing a shorter recording or reducing the amount of input, or adjust the context/window settings in your AI provider's configuration." | |
| } |
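If `providerType` isn't already in scope where this mapping runs, it would need to be threaded into the error-mapping helper; the exact signature of `getHumanReadableError` isn't shown in this hunk, so treat that plumbing as an open question rather than part of the suggestion.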
Issue #202
Processing sometimes fails silently with the default context window (~4096 tokens): it simply stops and the transcript fails. After increasing the model context size manually, processing works as expected.