
Cap max_tokens and n to prevent local API DoS #3637

Open
arditbe wants to merge 2 commits into nomic-ai:main from arditbe:fix-dos-max-tokens-n


arditbe commented Dec 6, 2025

My Changes

Added server-side upper bounds for max_tokens and n in BaseCompletionRequest::parseImpl.
Requests that exceed these limits are now rejected with HTTP 400 via InvalidRequestError, which prevents memory and CPU exhaustion on the /v1/completions and /v1/chat/completions endpoints.

Issue ticket number and link

Fixes #3635

Checklist before requesting a review

  • I have performed a self-review of my code.
  • If it is a core feature, I have added thorough tests.
  • I have added thorough documentation for my code.
  • I have tagged PR with relevant project labels.
  • If this PR addresses a bug, I have provided both a screenshot/video of the original bug and the working solution.

Demo

N/A (validation change)

Steps to Reproduce

  1. Start GPT4All with the API server enabled.
  2. Send a request to /v1/completions or /v1/chat/completions with max_tokens or n above the limits listed under Notes.
  3. Observe that the server now responds with 400 instead of consuming excessive resources.
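
For step 2, a request body along the following lines should trigger the new check. This is only a sketch: `buildCompletionBody` is a hypothetical helper, and the model name and prompt are placeholders. Only the `max_tokens` and `n` fields matter for the reproduction.

```cpp
#include <string>

// Hypothetical helper: builds a JSON body for /v1/completions.
// "placeholder-model" and the prompt are stand-ins, not real values.
std::string buildCompletionBody(long long maxTokens, long long n)
{
    return std::string("{\"model\":\"placeholder-model\",\"prompt\":\"hello\",")
        + "\"max_tokens\":" + std::to_string(maxTokens)
        + ",\"n\":" + std::to_string(n) + "}";
}
```

Calling `buildCompletionBody(100000000, 1000)` yields a body whose `max_tokens` and `n` are both far above the proposed limits, so posting it to the local API server with any HTTP client should return 400 once this change is applied.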

Notes

Limits used: max_tokens <= 4096, n <= 8. Happy to adjust per maintainer preference.
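
For reviewers who want the rule in isolation, the upper-bound check can be sketched roughly as below. This is not the code in the diff: `validateCompletionLimits` and the constant names are hypothetical, the sketch returns an error string where the server would raise InvalidRequestError, and it uses only the standard library rather than the project's own JSON handling.

```cpp
#include <optional>
#include <string>

// Hypothetical constants mirroring the limits proposed in this PR.
constexpr long long kMaxTokensLimit = 4096;
constexpr long long kMaxNLimit = 8;

// Returns an error message if a value exceeds its upper bound, or
// std::nullopt if both values are within the limits. Only upper bounds
// are checked here; lower-bound validation is out of scope for this sketch.
std::optional<std::string> validateCompletionLimits(long long maxTokens, long long n)
{
    if (maxTokens > kMaxTokensLimit)
        return "'max_tokens' must not exceed " + std::to_string(kMaxTokensLimit);
    if (n > kMaxNLimit)
        return "'n' must not exceed " + std::to_string(kMaxNLimit);
    return std::nullopt;
}
```

With these values, `validateCompletionLimits(4096, 8)` passes while `validateCompletionLimits(4097, 1)` and `validateCompletionLimits(16, 9)` each produce an error. Keeping the limits as named constants would make them easy to change if maintainers prefer different bounds.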

This PR adds server-side upper bounds for max_tokens and n in BaseCompletionRequest::parseImpl. Requests exceeding limits now return 400 via InvalidRequestError, preventing memory/CPU exhaustion on /v1/completions and /v1/chat/completions. Fixes nomic-ai#3635.

Signed-off-by: ardit <88629825+arditbe@users.noreply.github.com>

arditbe commented Dec 6, 2025

Hi! This PR adds server-side caps for max_tokens and n in BaseCompletionRequest to prevent local API DoS on /v1/completions and /v1/chat/completions. Fixes #3635. Happy to adjust limits or add tests if you prefer.



Development

Successfully merging this pull request may close these issues.

Uncontrolled Resource Consumption in /v1/completions and /v1/chat/completions Endpoints (Memory + CPU DoS)

1 participant