Negative KV sequence length error in Attention op#4316

Open
jinminxi104 wants to merge 1 commit into InternLM:main from jinminxi104:fix_kvseq
@jinminxi104
Collaborator

While testing on the Ascend platform, the attention operator raised an error because it received a negative KV sequence length.
(ShareGPT case with 10k prompt + qwen-235B)
Root Cause: The issue occurs when ignore_history is too large in a data-parallel configuration with dp > 1.
Error Logs:

(RayWorkerWrapper pid=2472948) 2026-02-02 12:06:39,393 - lmdeploy - ERROR - model_inputs.py:112 - ignore: tensor([323,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
(RayWorkerWrapper pid=2472948)           0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0]
the value of the key/value's actual sequence lengths[0] must be greater than or equal to 0, but it is -294[FUNC:ParseActualSeqLens][FILE:incre_flash_attention_tiling.cc][LINE:875]

This behavior was introduced in #4265; versions prior to this PR do not exhibit the bug.
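
The failure mode in the log can be illustrated with a small sketch. It assumes the attention op receives an effective KV length equal to the sequence's KV length minus its ignored-history count; that formula, the helper name effective_kv_seqlens, and the KV length of 29 are assumptions chosen for illustration, not taken from the lmdeploy source. Only the stale value 323 and the resulting -294 come from the log above.

```python
from typing import List


def effective_kv_seqlens(kv_seqlens: List[int],
                         num_ignored_history: List[int]) -> List[int]:
    """Hypothetical helper: KV length per sequence after dropping ignored history."""
    return [kv - ign for kv, ign in zip(kv_seqlens, num_ignored_history)]


# Slot 0 still carries a stale ignore count of 323 (value seen in the log).
stale_ignore = [323, 0, 0]
# Illustrative KV lengths; 29 is an assumed value that reproduces -294.
kv_seqlens = [29, 40, 12]

lens = effective_kv_seqlens(kv_seqlens, stale_ignore)
# Slots whose effective length is negative would be rejected by the kernel.
negative_slots = [i for i, n in enumerate(lens) if n < 0]
print(lens, negative_slots)
```

Under these assumptions, any slot whose stale ignore count exceeds its current KV length produces a negative value, which matches the shape of the ParseActualSeqLens check in the error message.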

@jinminxi104 jinminxi104 marked this pull request as ready for review February 2, 2026 15:33
Copilot AI review requested due to automatic review settings February 2, 2026 15:33
@jinminxi104 jinminxi104 changed the title from "reset ignore_history to 0" to "Negative KV sequence length error in Attention op" Feb 2, 2026
Contributor

Copilot AI left a comment

Pull request overview

Fixes a dp>1 decoding failure (negative KV sequence length) by ensuring num_ignored_history does not retain stale/incorrect values when sliding-window attention is disabled.

Changes:

  • When cache_config.window_size <= 0, create_model_inputs_delta() now emits a zero num_ignored_history tensor (instead of None) so downstream decoding-input updates reset ignore history to 0.
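
The difference between emitting None and emitting zeros can be sketched as follows. This is a simplified model, not the lmdeploy implementation: the Delta class, make_delta, apply_delta, and the emit_zeros flag are hypothetical names, and plain lists stand in for tensors. It assumes the downstream update keeps the previous value whenever the delta field is None.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Delta:
    """Hypothetical stand-in for the delta passed to the decoding-input update."""
    num_ignored_history: Optional[List[int]]


def make_delta(window_size: int, batch_size: int,
               computed_ignore: Optional[List[int]] = None,
               emit_zeros: bool = True) -> Delta:
    """Build a delta; with sliding window disabled, emit zeros or None."""
    if window_size <= 0:
        # emit_zeros=False models the old behavior, True models the fix.
        return Delta([0] * batch_size if emit_zeros else None)
    return Delta(list(computed_ignore or [0] * batch_size))


def apply_delta(current: List[int], delta: Delta) -> List[int]:
    """Assumed update rule: None means 'leave the existing value untouched'."""
    if delta.num_ignored_history is None:
        return current
    return list(delta.num_ignored_history)


stale = [323, 0, 0]
old_result = apply_delta(stale, make_delta(0, 3, emit_zeros=False))
new_result = apply_delta(stale, make_delta(0, 3))
print(old_result, new_result)
```

With the None delta the stale 323 survives the update, while the zero delta overwrites it, which is the reset the review summary describes.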

@grimoire grimoire self-requested a review February 3, 2026 02:50