
Add stop sequences support for response generation #12

Open

scouzi1966 wants to merge 1 commit into main from
claude/new-feature-proposal-011CUVxoYP8TQgaX3kDGSfve

Conversation

scouzi1966 (Owner) commented Oct 26, 2025

This commit implements stop sequences functionality, allowing users to specify strings where the model should stop generating text. This is a standard OpenAI API feature that enhances output control.

Features:

  • CLI parameter: --stop "seq1,seq2" (comma-separated stop sequences)
  • API parameter: "stop": ["seq1", "seq2"] (array of strings)
  • Works in both streaming and non-streaming modes
  • Stop sequences from CLI and API are merged with duplicates removed
  • The stop sequence itself is excluded from the output
  • Stops at the earliest occurrence when multiple sequences match
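
The truncation behavior described above can be sketched as follows. This is a minimal sketch: `applyStopSequences` is named in this PR, but the body here is an assumption reconstructed from the described behavior (earliest match wins, the stop sequence itself is excluded):

```swift
import Foundation

// Find the earliest occurrence of any stop sequence and cut the
// output just before it; return the content unchanged if none match.
func applyStopSequences(_ content: String, stopSequences: [String]?) -> String {
    guard let stopSequences, !stopSequences.isEmpty else { return content }

    // Track the earliest match position across all sequences.
    var earliest: String.Index? = nil
    for sequence in stopSequences where !sequence.isEmpty {
        if let range = content.range(of: sequence),
           earliest == nil || range.lowerBound < earliest! {
            earliest = range.lowerBound
        }
    }

    // Exclude the stop sequence itself from the output.
    if let earliest {
        return String(content[..<earliest])
    }
    return content
}
```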

Implementation:

  • Added --stop parameter to RootCommand and ServeCommand in main.swift
  • Updated Server.swift to accept and pass stop parameter
  • Updated ChatCompletionsController with mergeStopSequences helper
  • Updated FoundationModelService with applyStopSequences method
  • Enhanced CLAUDE.md documentation with examples and usage
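
The merge step listed above could look like the following. `mergeStopSequences` is the helper named in this PR; this body is an assumption based on the stated behavior (CLI string split on commas, merged with the API array, duplicates removed):

```swift
import Foundation

// Combine the CLI's comma-separated stop string with the API's array,
// trimming whitespace, dropping empty entries, and de-duplicating
// while preserving first-seen order.
func mergeStopSequences(cliStop: String?, apiStop: [String]?) -> [String]? {
    var merged: [String] = []
    var seen = Set<String>()

    let cliSequences = cliStop?
        .split(separator: ",")
        .map { $0.trimmingCharacters(in: .whitespaces) } ?? []

    for sequence in cliSequences + (apiStop ?? []) where !sequence.isEmpty {
        if seen.insert(sequence).inserted {
            merged.append(sequence)
        }
    }
    return merged.isEmpty ? nil : merged
}
```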

Use cases:

  • Structured output formatting (JSON, XML, etc.)
  • Limiting response length at specific markers
  • Multi-step generation with clear boundaries
  • Code generation with stop markers

🤖 Generated with Claude Code

Summary by Sourcery

Implement configurable stop sequences for response generation by extending CLI and API interfaces, merging inputs, and truncating outputs at specified markers.

New Features:

  • Add support for stop sequences in both CLI (--stop) and API ("stop") parameters
  • Merge and dedupe stop sequences from CLI and API inputs
  • Truncate generated output at the earliest stop sequence and exclude the sequence itself
  • Enable stop sequences in both streaming and non-streaming response modes

Documentation:

  • Update documentation with stop sequence usage examples and parameter details

Co-Authored-By: Claude <noreply@anthropic.com>

sourcery-ai bot commented Oct 26, 2025

Reviewer's Guide

This PR adds support for stop sequences by introducing a new CLI/API parameter, merging sequences from both sources, truncating model output at the earliest stop match, and updating documentation and logging accordingly.

Sequence diagram for merging and applying stop sequences during response generation

sequenceDiagram
    actor CLI as CLI User
    participant Root as RootCommand
    participant Serve as ServeCommand
    participant Server as Server
    participant Chat as ChatCompletionsController
    participant Model as FoundationModelService

    CLI->>Root: Provide --stop parameter
    Root->>Serve: Pass stop parameter
    Serve->>Server: Pass stop parameter
    Server->>Chat: Pass stop parameter
    Chat->>Chat: mergeStopSequences(cliStop, apiStop)
    Chat->>Model: generateResponse(..., stop: mergedStop)
    Model->>Model: applyStopSequences(content, stopSequences)
    Model-->>Chat: Return truncated content
    Chat-->>Server: Return response
    Server-->>Serve: Return response
    Serve-->>Root: Return response
    Root-->>CLI: Output response

Class diagram for stop sequence support in response generation

classDiagram
    class RootCommand {
        +temperature: Double?
        +randomness: String?
        +permissiveGuardrails: Bool
        +stop: String?
        +run()
        -runSinglePrompt(prompt: String, adapter: String?)
    }
    class ServeCommand {
        +temperature: Double?
        +randomness: String?
        +permissiveGuardrails: Bool
        +stop: String?
        +run()
    }
    class Server {
        +temperature: Double?
        +randomness: String?
        +permissiveGuardrails: Bool
        +stop: String?
        +constructor(..., stop: String?)
    }
    class ChatCompletionsController {
        +temperature: Double?
        +randomness: String?
        +permissiveGuardrails: Bool
        +stop: String?
        +constructor(..., stop: String?)
        -mergeStopSequences(cliStop: String?, apiStop: [String]?): [String]?
    }
    class FoundationModelService {
        +generateResponse(..., stop: [String]?): String
        +generateStreamingResponseWithTiming(..., stop: [String]?): (String, Double)
        -applyStopSequences(content: String, stopSequences: [String]?): String
    }
    RootCommand --> ServeCommand
    ServeCommand --> Server
    Server --> ChatCompletionsController
    ChatCompletionsController --> FoundationModelService

File-Level Changes

Introduce stop sequence parameter and propagate through CLI/server
  • Add @option stop to RootCommand and ServeCommand
  • Extend Server init signature to accept stop parameter
  • Pass stop value when instantiating Server
  • Include stop in debug logging output
  Files: Sources/MacLocalAPI/main.swift, Sources/MacLocalAPI/Server.swift

Merge stop sequences from CLI and API in controller
  • Add stop property to ChatCompletionsController init
  • Implement mergeStopSequences helper to combine and dedupe sequences
  • Use merged stop sequences in generateResponse and generateStreaming calls
  Files: Sources/MacLocalAPI/Controllers/ChatCompletionsController.swift

Apply stop sequences in model service
  • Extend generateResponse and generateStreamingResponseWithTiming to accept stop list
  • Insert applyStopSequences call to truncate content
  • Add private applyStopSequences method to locate and remove earliest stop
  Files: Sources/MacLocalAPI/Models/FoundationModelService.swift

Document stop sequences usage and examples
  • Add Stop Sequences section with CLI/API usage in CLAUDE.md
  • Include stop examples in build/test commands
  • Revise advanced sampling section for consistency
  Files: CLAUDE.md



@sourcery-ai sourcery-ai bot left a comment


Hey there - I've reviewed your changes - here's some feedback:

  • Consider centralizing the parsing of comma-separated stop sequences into an array at the CLI parsing stage to avoid duplicating splitting logic in both RootCommand and ChatCompletionsController.
  • The applyStopSequences helper truncates only after the full response is available; for true streaming support, consider detecting and halting on stop sequences mid-stream so you don’t emit extra chunks.
  • Filter out empty or whitespace-only stop sequences after splitting to avoid unintended early truncation when users supply consecutive commas or trailing commas.
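
The mid-stream concern can be addressed with a holdback buffer: emit only the prefix that cannot be part of a pending stop-sequence match, and halt as soon as a full sequence appears. A hedged sketch of that idea (`StreamingStopDetector` is a hypothetical name, not part of this PR):

```swift
import Foundation

// Buffers enough trailing characters to cover a partial stop-sequence
// match, emits only the safe prefix, and signals a halt when a full
// stop sequence appears in the accumulated stream.
struct StreamingStopDetector {
    let stopSequences: [String]
    private var buffer = ""

    init(stopSequences: [String]) {
        self.stopSequences = stopSequences
    }

    // Returns the text that is safe to emit for this chunk, plus a flag
    // telling the caller to stop consuming the stream.
    mutating func consume(_ chunk: String) -> (emit: String, stopped: Bool) {
        buffer += chunk
        for sequence in stopSequences where !sequence.isEmpty {
            if let range = buffer.range(of: sequence) {
                // Emit everything before the stop sequence, then halt.
                let output = String(buffer[..<range.lowerBound])
                buffer = ""
                return (output, true)
            }
        }
        // Hold back enough characters to cover a partial match at the tail.
        let holdback = (stopSequences.map { $0.count }.max() ?? 1) - 1
        let safeCount = max(0, buffer.count - holdback)
        let safe = String(buffer.prefix(safeCount))
        buffer = String(buffer.dropFirst(safeCount))
        return (safe, false)
    }
}
```

The holdback length is the longest stop sequence minus one, since that is the longest partial match that could straddle a chunk boundary.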
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- Consider centralizing the parsing of comma-separated stop sequences into an array at the CLI parsing stage to avoid duplicating splitting logic in both RootCommand and ChatCompletionsController.
- The applyStopSequences helper truncates only after the full response is available; for true streaming support, consider detecting and halting on stop sequences mid-stream so you don’t emit extra chunks.
- Filter out empty or whitespace-only stop sequences after splitting to avoid unintended early truncation when users supply consecutive commas or trailing commas.

## Individual Comments

### Comment 1
<location> `Sources/MacLocalAPI/main.swift:290-291` </location>
<code_context>
                     let message = Message(role: "user", content: prompt)
                     DebugLogger.log("Generating response...")
-                    let response = try await foundationService.generateResponse(for: [message], temperature: temperature, randomness: randomness)
+                    let stopSequences = stop?.split(separator: ",").map { String($0.trimmingCharacters(in: .whitespaces)) }
+                    let response = try await foundationService.generateResponse(for: [message], temperature: temperature, randomness: randomness, stop: stopSequences)
                     DebugLogger.log("Response generated successfully")
                     result = .success(response)
</code_context>

<issue_to_address>
**suggestion:** Consider extracting stop sequence parsing into a shared utility to avoid duplication.

Centralizing the stop sequence parsing will help maintain consistency and simplify future updates to the parsing logic.

Suggested implementation:

```
                    let stopSequences = StopSequenceParser.parse(stop)
                    let response = try await foundationService.generateResponse(for: [message], temperature: temperature, randomness: randomness, stop: stopSequences)

```

```
struct StopSequenceParser {
    static func parse(_ stop: String?) -> [String]? {
        guard let stop = stop else { return nil }
        return stop.split(separator: ",").map { String($0.trimmingCharacters(in: .whitespaces)) }
    }
}

```
</issue_to_address>

### Comment 2
<location> `Sources/MacLocalAPI/Models/FoundationModelService.swift:611-612` </location>
<code_context>
+            return content
+        }
+
+        var shortestStopIndex: String.Index? = nil
+        var foundStop = false
+
+        // Find the earliest occurrence of any stop sequence
</code_context>

<issue_to_address>
**nitpick:** Using a boolean flag for foundStop is redundant since shortestStopIndex can serve the same purpose.

Consider removing the foundStop flag and rely on shortestStopIndex being nil to indicate whether a stop was found.
</issue_to_address>
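
The flag-free pattern the reviewer suggests can be sketched like this: a non-nil `shortestStopIndex` already records that a stop was found, so no separate boolean is needed (the function name and surrounding code here are assumptions, not the PR's actual implementation):

```swift
import Foundation

// Return the position of the earliest stop-sequence match, or nil if
// no sequence occurs; the optional itself doubles as the "found" flag.
func earliestStopIndex(in content: String, stopSequences: [String]) -> String.Index? {
    var shortestStopIndex: String.Index? = nil
    for sequence in stopSequences where !sequence.isEmpty {
        if let range = content.range(of: sequence),
           shortestStopIndex == nil || range.lowerBound < shortestStopIndex! {
            shortestStopIndex = range.lowerBound
        }
    }
    return shortestStopIndex
}
```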

