…_pct When --prompt-cache-max-pct is set but --prompt-cache-max-len is not explicitly provided, auto-compute common_tokens as int(prompt_tokens * prompt_cache_max_pct / 100). This makes '--prompt-cache-max-pct 100 --prompt-tokens 8192' do the intuitive thing: build an 8192-token shared prefix so the server can actually cache it. Previously, --prompt-cache-max-len defaulted to 0, which meant TranslationDataset always built an empty shared prefix regardless of --prompt-cache-max-pct, defeating prompt caching. Changed --prompt-cache-max-len default from 0 to None so we can distinguish 'not provided' from 'explicitly set to 0'. Co-authored-by: Aidan Do <aidando73@gmail.com>
help="Maximum number of shared prefix tokens across requests. "
     "If not specified but --prompt-cache-max-pct is set, auto-computed as "
     "int(prompt_tokens * prompt_cache_max_pct / 100). Defaults to 0 (no shared prefix) "
     "when neither this nor --prompt-cache-max-pct is provided.",
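For context, a minimal sketch of how this help text might sit inside an argparse definition (the parser setup and flag `type`/`default` shown here are assumptions, not the exact benchmark code):

```python
import argparse

parser = argparse.ArgumentParser(description="Benchmark load test (sketch)")
parser.add_argument(
    "--prompt-cache-max-len",
    type=int,
    default=0,  # 0 means "no shared prefix" unless --prompt-cache-max-pct overrides it
    help="Maximum number of shared prefix tokens across requests. "
         "If not specified but --prompt-cache-max-pct is set, auto-computed as "
         "int(prompt_tokens * prompt_cache_max_pct / 100). Defaults to 0 (no shared prefix) "
         "when neither this nor --prompt-cache-max-pct is provided.",
)
```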
Done. The default for --prompt-cache-max-len is back to 0. The auto-computation logic now works as:

- If --prompt-cache-max-len is explicitly set to a non-zero value, use it directly.
- Otherwise, if --prompt-cache-max-pct is provided, auto-compute common_tokens = int(prompt_tokens * prompt_cache_max_pct / 100).
- If neither is meaningfully set, common_tokens stays 0.

So --prompt-cache-max-pct 100 --prompt-tokens 8192 works on its own, producing common_tokens = 8192.
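The precedence rules above can be sketched as a small helper (the function name and signature are assumptions for illustration; the actual benchmark code may arrange this differently):

```python
def resolve_common_tokens(prompt_tokens, prompt_cache_max_len=0,
                          prompt_cache_max_pct=None):
    """Resolve the shared-prefix length (common_tokens) for the dataset.

    Sketch of the precedence described in the review comment; with a
    default of 0, an explicit non-zero --prompt-cache-max-len wins.
    """
    # Explicit non-zero --prompt-cache-max-len takes precedence.
    if prompt_cache_max_len:
        return prompt_cache_max_len
    # Otherwise derive the length from --prompt-cache-max-pct if given.
    if prompt_cache_max_pct is not None:
        return int(prompt_tokens * prompt_cache_max_pct / 100)
    # Neither set meaningfully: no shared prefix.
    return 0
```

With these rules, `resolve_common_tokens(8192, 0, 100)` yields 8192, matching the intended behaviour of `--prompt-cache-max-pct 100 --prompt-tokens 8192`.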
Per review feedback, keep the default at 0 instead of None. The auto-computation from --prompt-cache-max-pct now triggers when prompt_cache_max_len is 0 (the default) and prompt_cache_max_pct is set. An explicit non-zero --prompt-cache-max-len still takes precedence. Co-authored-by: Aidan Do <aidando73@gmail.com>
Cleaner idiom: use None to mean 'not provided' rather than overloading 0. Behaviour is identical. Co-authored-by: Aidan Do <aidando73@gmail.com>
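A sketch of the None-based variant this commit describes (function name assumed for illustration): with `None` as the "not provided" sentinel, an explicit `--prompt-cache-max-len 0` is honoured as 0 instead of triggering the percentage-based auto-computation.

```python
def resolve_common_tokens(prompt_tokens, prompt_cache_max_len=None,
                          prompt_cache_max_pct=None):
    # None means "not provided", so an explicit 0 is respected as-is.
    if prompt_cache_max_len is not None:
        return prompt_cache_max_len
    # Not provided: derive from --prompt-cache-max-pct if given.
    if prompt_cache_max_pct is not None:
        return int(prompt_tokens * prompt_cache_max_pct / 100)
    return 0
```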


Auto-compute prompt cache length in benchmark load test to enable caching.
Previously, --prompt-cache-max-len defaulted to 0, causing TranslationDataset to build an empty shared prefix (common_tokens=0). This meant prompt caching was effectively disabled even when --prompt-cache-max-pct was set. The fix changes the default of --prompt-cache-max-len to None and, when it is not explicitly provided but --prompt-cache-max-pct is, calculates common_tokens as int(prompt_tokens * prompt_cache_max_pct / 100). This ensures that passing --prompt-cache-max-pct 100 --prompt-tokens 8192 now correctly generates a shared prefix for caching.