
Fix CUDA illegal memory access bug in monotonic_rnnt #14

Merged
SimBe195 merged 2 commits into main from mono_rnnt_illegal_memory_fix
Feb 3, 2026


@SimBe195 SimBe195 commented Feb 2, 2026

The gradient tensor in the monotonic RNN-T loss computation has essentially B * T * (S+1) * V elements. For large vocabulary sizes and long sequences, this count can exceed the signed 32-bit integer limit (2^31 - 1 = 2,147,483,647). In the current implementation of the gradient CUDA kernel, the index used for writing into the gradient tensor (grads[bts * *V + v] = ...) has type int, so such an overflow can produce a negative index and thus an illegal memory access error. Changing the index type to int64_t fixes the issue.
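
To make the arithmetic concrete, here is a minimal host-side C++ sketch of the index computation. It is not the kernel code from this PR: the helper names `grad_offset` and `overflows_int32` are hypothetical, and `V` is passed by value here for simplicity. It only illustrates why the product has to be evaluated in 64 bits.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper: flat offset into a gradient tensor viewed as
// [B * T * (S+1), V], where bts is the flattened (batch, time, label-position)
// index and v is the vocabulary index.
inline std::int64_t grad_offset(int bts, int V, int v) {
    assert(bts >= 0 && V > 0 && v >= 0 && v < V);
    // Widen one operand before multiplying. With plain `bts * V`, both
    // operands are int, so the product would be evaluated in 32 bits.
    return static_cast<std::int64_t>(bts) * V + v;
}

// Hypothetical helper: true if the offset does not fit in a signed 32-bit int.
inline bool overflows_int32(std::int64_t offset) {
    return offset > std::numeric_limits<std::int32_t>::max();
}
```

As an illustrative (made-up) shape, B = 16, T = 1000, S = 200, V = 1000 gives 16 * 1000 * 201 = 3,216,000 rows and 3,216,000,000 elements in total, so the offset of the last element, 3,215,999,999, is already beyond what a 32-bit int can hold.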

@SimBe195 SimBe195 merged commit 422d579 into main Feb 3, 2026
1 check passed
@SimBe195 SimBe195 deleted the mono_rnnt_illegal_memory_fix branch February 3, 2026 07:27

3 participants