Add stop sequences support for response generation #12

scouzi1966 wants to merge 1 commit into main from
Conversation
This commit implements stop sequences functionality, allowing users to specify strings at which the model should stop generating text. This is a standard OpenAI API feature that improves output control.

Features:
- CLI parameter: `--stop "seq1,seq2"` (comma-separated stop sequences)
- API parameter: `"stop": ["seq1", "seq2"]` (array of strings)
- Works in both streaming and non-streaming modes
- Stop sequences from the CLI and API are merged, with duplicates removed
- The stop sequence itself is excluded from the output
- Generation stops at the earliest occurrence when multiple sequences match

Implementation:
- Added the `--stop` parameter to `RootCommand` and `ServeCommand` in `main.swift`
- Updated `Server.swift` to accept and pass the stop parameter
- Updated `ChatCompletionsController` with a `mergeStopSequences` helper
- Updated `FoundationModelService` with an `applyStopSequences` method
- Enhanced the `CLAUDE.md` documentation with examples and usage

Use cases:
- Structured output formatting (JSON, XML, etc.)
- Limiting response length at specific markers
- Multi-step generation with clear boundaries
- Code generation with stop markers

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
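As a sketch of the API shape described above (the model name and message are placeholders, not taken from this PR), a chat completions request body using the new parameter might look like:

```json
{
  "model": "placeholder-model",
  "messages": [
    { "role": "user", "content": "List three fruits, then write END" }
  ],
  "stop": ["END", "\n\n"]
}
```

Per the behavior described above, generation would halt at the earliest match of `END` or a blank line, and the matched sequence itself would not appear in the response.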
Reviewer's Guide

This PR adds support for stop sequences by introducing a new CLI/API parameter, merging sequences from both sources, truncating model output at the earliest stop match, and updating documentation and logging accordingly.

Sequence diagram for merging and applying stop sequences during response generation:

```mermaid
sequenceDiagram
  actor CLI as CLI User
  participant Root as RootCommand
  participant Serve as ServeCommand
  participant Server as Server
  participant Chat as ChatCompletionsController
  participant Model as FoundationModelService
  CLI->>Root: Provide --stop parameter
  Root->>Serve: Pass stop parameter
  Serve->>Server: Pass stop parameter
  Server->>Chat: Pass stop parameter
  Chat->>Chat: mergeStopSequences(cliStop, apiStop)
  Chat->>Model: generateResponse(..., stop: mergedStop)
  Model->>Model: applyStopSequences(content, stopSequences)
  Model-->>Chat: Return truncated content
  Chat-->>Server: Return response
  Server-->>Serve: Return response
  Serve-->>Root: Return response
  Root-->>CLI: Output response
```
Class diagram for stop sequence support in response generation:

```mermaid
classDiagram
  class RootCommand {
    +temperature: Double?
    +randomness: String?
    +permissiveGuardrails: Bool
    +stop: String?
    +run()
    -runSinglePrompt(prompt: String, adapter: String?)
  }
  class ServeCommand {
    +temperature: Double?
    +randomness: String?
    +permissiveGuardrails: Bool
    +stop: String?
    +run()
  }
  class Server {
    +temperature: Double?
    +randomness: String?
    +permissiveGuardrails: Bool
    +stop: String?
    +constructor(..., stop: String?)
  }
  class ChatCompletionsController {
    +temperature: Double?
    +randomness: String?
    +permissiveGuardrails: Bool
    +stop: String?
    +constructor(..., stop: String?)
    -mergeStopSequences(cliStop: String?, apiStop: [String]?): [String]?
  }
  class FoundationModelService {
    +generateResponse(..., stop: [String]?): String
    +generateStreamingResponseWithTiming(..., stop: [String]?): (String, Double)
    -applyStopSequences(content: String, stopSequences: [String]?): String
  }
  RootCommand --> ServeCommand
  ServeCommand --> Server
  Server --> ChatCompletionsController
  ChatCompletionsController --> FoundationModelService
```
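The `applyStopSequences` step shown in the diagrams might be sketched as follows. This is an illustrative reconstruction, not the PR's actual diff; it lets a nil index double as the "not found" flag, so no separate boolean is needed.

```swift
import Foundation

// Illustrative sketch of applyStopSequences, not the PR's actual code.
// Truncates content at the earliest occurrence of any stop sequence,
// excluding the sequence itself from the output.
func applyStopSequences(_ content: String, stopSequences: [String]?) -> String {
    guard let stops = stopSequences, !stops.isEmpty else { return content }

    // A nil index doubles as the "no stop found" flag.
    var earliest: String.Index?
    for stop in stops where !stop.isEmpty {
        if let range = content.range(of: stop),
           earliest == nil || range.lowerBound < earliest! {
            earliest = range.lowerBound
        }
    }
    guard let cut = earliest else { return content }
    return String(content[..<cut])
}
```

For example, applying stops `["END", ","]` to `"one, two END"` would truncate at the comma, since the earliest match wins.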
Hey there - I've reviewed your changes - here's some feedback:
- Consider centralizing the parsing of comma-separated stop sequences into an array at the CLI parsing stage to avoid duplicating splitting logic in both RootCommand and ChatCompletionsController.
- The applyStopSequences helper truncates only after the full response is available; for true streaming support, consider detecting and halting on stop sequences mid-stream so you don’t emit extra chunks.
- Filter out empty or whitespace-only stop sequences after splitting to avoid unintended early truncation when users supply consecutive commas or trailing commas.
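For the second point, stopping mid-stream rather than after the fact, one common approach is to hold back as many trailing characters as the longest stop sequence could span, so a stop split across two chunks is never emitted. A sketch under that assumption (none of these names appear in the PR):

```swift
import Foundation

// Sketch of a streaming-safe stop filter; an assumption, not the PR's code.
struct StreamingStopFilter {
    let stops: [String]
    private var buffer = ""
    // Hold back enough trailing characters that a stop sequence split
    // across two chunks cannot slip through.
    private var holdback: Int { max(0, (stops.map(\.count).max() ?? 1) - 1) }

    /// Feed one streamed chunk; returns text safe to emit now and
    /// whether a stop sequence was hit.
    mutating func feed(_ chunk: String) -> (emit: String, done: Bool) {
        buffer += chunk
        // Stop at the earliest occurrence of any sequence.
        var earliest: String.Index?
        for stop in stops where !stop.isEmpty {
            if let range = buffer.range(of: stop),
               earliest == nil || range.lowerBound < earliest! {
                earliest = range.lowerBound
            }
        }
        if let cut = earliest {
            let out = String(buffer[..<cut])
            buffer = ""
            return (out, true)
        }
        let safeCount = max(0, buffer.count - holdback)
        let cutIndex = buffer.index(buffer.startIndex, offsetBy: safeCount)
        let out = String(buffer[..<cutIndex])
        buffer = String(buffer[cutIndex...])
        return (out, false)
    }
}
```

The caller would stop requesting chunks from the model as soon as `done` is true, which avoids emitting extra chunks after the stop point.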
## Individual Comments
### Comment 1
Location: `Sources/MacLocalAPI/main.swift:290-291`

```diff
 let message = Message(role: "user", content: prompt)
 DebugLogger.log("Generating response...")
-let response = try await foundationService.generateResponse(for: [message], temperature: temperature, randomness: randomness)
+let stopSequences = stop?.split(separator: ",").map { String($0.trimmingCharacters(in: .whitespaces)) }
+let response = try await foundationService.generateResponse(for: [message], temperature: temperature, randomness: randomness, stop: stopSequences)
 DebugLogger.log("Response generated successfully")
 result = .success(response)
```
**suggestion:** Consider extracting stop sequence parsing into a shared utility to avoid duplication.

Centralizing the stop sequence parsing will help maintain consistency and simplify future updates to the parsing logic.

Suggested implementation:

```swift
let stopSequences = StopSequenceParser.parse(stop)
let response = try await foundationService.generateResponse(for: [message], temperature: temperature, randomness: randomness, stop: stopSequences)
```

```swift
struct StopSequenceParser {
    static func parse(_ stop: String?) -> [String]? {
        guard let stop = stop else { return nil }
        return stop.split(separator: ",").map { String($0.trimmingCharacters(in: .whitespaces)) }
    }
}
```
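Taken together with the empty-sequence concern raised elsewhere in this review, the shared parser could also drop blank entries, and the merge step could deduplicate while preserving order. A sketch with illustrative names (not the PR's actual helpers):

```swift
import Foundation

// Sketch combining the review suggestions; illustrative, not the PR's code.
enum StopSequences {
    /// Parse a comma-separated CLI value, dropping empty or whitespace-only
    /// entries so ",," or a trailing "," cannot cause unintended truncation.
    static func parse(_ stop: String?) -> [String]? {
        guard let stop else { return nil }
        let parts = stop.split(separator: ",")
            .map { $0.trimmingCharacters(in: .whitespaces) }
            .filter { !$0.isEmpty }
        return parts.isEmpty ? nil : parts
    }

    /// Merge CLI and API sequences, removing duplicates while keeping order.
    static func merge(cli: [String]?, api: [String]?) -> [String]? {
        var seen = Set<String>()
        let merged = ((cli ?? []) + (api ?? [])).filter { seen.insert($0).inserted }
        return merged.isEmpty ? nil : merged
    }
}
```

With this design, `parse("END, ,")` would yield `["END"]` rather than an array containing an empty string that matches everywhere.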
### Comment 2

Location: `Sources/MacLocalAPI/Models/FoundationModelService.swift:611-612`

```diff
+    return content
+}
+
+var shortestStopIndex: String.Index? = nil
+var foundStop = false
+
+// Find the earliest occurrence of any stop sequence
```

**nitpick:** Using a boolean flag for `foundStop` is redundant, since `shortestStopIndex` can serve the same purpose.

Consider removing the `foundStop` flag and relying on `shortestStopIndex` being `nil` to indicate whether a stop was found.
Summary by Sourcery
Implement configurable stop sequences for response generation by extending CLI and API interfaces, merging inputs, and truncating outputs at specified markers.