
fix rotary embedding for transformers v5 #4303

Merged
lvhan028 merged 5 commits into InternLM:main from grimoire:support-trans-v5
Feb 5, 2026

Conversation

@grimoire
Collaborator

Thanks for your contribution; we appreciate it a lot. The following instructions will make your pull request healthier and help it receive feedback more easily. If you do not understand some items, don't worry; just open the pull request and ask the maintainers for help.

Motivation

Please describe the motivation of this PR and the goal you want to achieve through this PR.

Modification

Please briefly describe what modification is made in this PR.

BC-breaking (Optional)

Does the modification introduce changes that break the backward-compatibility of the downstream repositories?
If so, please describe how it breaks the compatibility and how the downstream projects should modify their code to keep compatibility with this PR.

Use cases (Optional)

If this PR introduces a new feature, it is better to list some use cases here, and update the documentation.

Checklist

  1. Pre-commit or other linting tools are used to fix potential lint issues.
  2. The modification is covered by complete unit tests. If not, please add more unit tests to ensure correctness.
  3. If the modification has a dependency on downstream projects of a newer version, this PR should be tested with all supported versions of downstream projects.
  4. The documentation has been modified accordingly, like docstring or example tutorials.

Copilot AI review requested due to automatic review settings January 28, 2026 06:52
Contributor

Copilot AI left a comment


Pull request overview

Updates RoPE (rotary embedding) configuration handling to improve compatibility with Transformers v5 configs by introducing helpers to read rope_theta and (partially) rope_parameters.

Changes:

  • Added _get_rope_parameters() and get_rope_theta() helpers in rotary_embedding.py.
  • Switched multiple model implementations from direct config.rope_theta access to get_rope_theta(config).
  • Updated RoPE scaling param extraction in rotary_embedding.py to read from rope_parameters when present (a sketch of this lookup follows below).
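
The rope_parameters-based lookup mentioned in the last item is not quoted on this page, so the following is only a minimal sketch of what such a v5-aware helper might look like; the name _get_rope_parameters comes from the change list above, but this body is an assumption, not the actual lmdeploy implementation.

from transformers import PretrainedConfig


def _get_rope_parameters(config: PretrainedConfig) -> dict:
    """Sketch only: return the RoPE scaling dict, preferring the transformers
    v5 `rope_parameters` field and falling back to the pre-v5 `rope_scaling`."""
    rope_params = getattr(config, 'rope_parameters', None)
    if rope_params is not None:
        # transformers v5 consolidates rope_theta and scaling settings here
        return rope_params
    # pre-v5 configs keep scaling under `rope_scaling` (may be absent or None)
    return getattr(config, 'rope_scaling', None) or {}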

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 8 comments.

File | Description
lmdeploy/pytorch/nn/rotary_embedding.py | Adds v5-aware helpers and updates RoPE param/theta extraction logic.
lmdeploy/pytorch/models/phi3_moe.py | Uses get_rope_theta when building the rotary embedding.
lmdeploy/pytorch/models/minicpm3.py | Uses get_rope_theta when building the rotary embedding.
lmdeploy/pytorch/models/llama4.py | Uses get_rope_theta for the vision rotary frequency computation.
lmdeploy/pytorch/models/internlm2_ve.py | Uses get_rope_theta when building the rotary embedding.
lmdeploy/pytorch/models/deepseek_v32.py | Uses get_rope_theta when building rotary embedding params.
lmdeploy/pytorch/models/deepseek_v2.py | Uses get_rope_theta when building rotary embedding params.
lmdeploy/pytorch/models/deepseek_mtp.py | Uses get_rope_theta when building rotary embedding params.
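
As the table shows, the model-side change is mechanical: call sites stop reading config.rope_theta directly and go through the helper instead. Below is a minimal sketch of that pattern; the wrapper function and its return value are placeholders for illustration, not actual lmdeploy model code.

from lmdeploy.pytorch.nn.rotary_embedding import get_rope_theta


def build_rotary_params(config):
    # before: rope_base = config.rope_theta, which can fail on transformers v5
    # configs that only expose `rope_parameters`
    # after: read through the helper so both config layouts work
    rope_base = get_rope_theta(config)
    return dict(base=rope_base)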


Comment on lines 189 to 197
def get_rope_theta(config: PretrainedConfig, default: int = 100000) -> int:
    """Get rope theta from config."""
    if hasattr(config, 'rope_parameters'):
        # for transformers v5
        rope_base = config.rope_parameters.get('rope_theta', default)
    else:
        rope_base = config.rope_theta
    return rope_base


Copilot AI Jan 28, 2026


The new Transformers v5 compatibility paths (_get_rope_parameters / get_rope_theta) don’t appear to be covered by tests. Adding a small unit test that constructs a dummy PretrainedConfig with rope_parameters (and without rope_scaling / rope_theta) would help prevent regressions in RoPE parameter parsing.
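
A minimal sketch of such a test, assuming a plain namespace object is an acceptable stand-in for a dummy config; the test name and values are illustrative, not part of the PR.

from types import SimpleNamespace

from lmdeploy.pytorch.nn.rotary_embedding import get_rope_theta


def test_get_rope_theta_prefers_v5_rope_parameters():
    # v5-style config: theta lives inside the `rope_parameters` dict
    cfg_v5 = SimpleNamespace(rope_parameters={'rope_theta': 123456})
    assert get_rope_theta(cfg_v5) == 123456

    # pre-v5 config: plain top-level `rope_theta` attribute
    cfg_v4 = SimpleNamespace(rope_theta=10000)
    assert get_rope_theta(cfg_v4) == 10000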

Collaborator

@RunningLeon left a comment


LGTM

@lvhan028 merged commit c0e07f1 into InternLM:main on Feb 5, 2026
4 of 5 checks passed

4 participants