
Fix qwen_vl key error in dist converter #706

Open
ShareLer wants to merge 1 commit into alibaba:main from ShareLer:fix_qwen_vl

Conversation

@ShareLer

When converting a qwen_vl distributed checkpoint (HF -> MCore) with a recent version of transformers (>= 4.52.0), the weight names read from the model differ from the names in the index file, so the conversion fails.

error stack:

[rank0]: File "/shareler/LLM/pai-megatron-patch-didi/toolkits/distributed_checkpoints_convertor/impl/qwen2_5_vl/h2m_synchronizer.py", line 26, in sync_params
[rank0]: super().sync_params(self._mgmodel.language_model, self._hfmodel.model.language_model)
[rank0]: File "/shareler/LLM/pai-megatron-patch-didi/toolkits/distributed_checkpoints_convertor/impl/general/synchronizer.py", line 118, in sync_params
[rank0]: self.set_preprocess_state(mg_model=mg_model, hf_model=hf_model)
[rank0]: File "/shareler/LLM/pai-megatron-patch-didi/toolkits/distributed_checkpoints_convertor/impl/general/h2m_synchronizer.py", line 109, in set_preprocess_state
[rank0]: self.copy(
[rank0]: File "/shareler/LLM/pai-megatron-patch-didi/toolkits/distributed_checkpoints_convertor/impl/general/synchronizer.py", line 150, in copy
[rank0]: return self._copy_impl(src_tensor, dst_tensor, **kwargs)
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/shareler/LLM/pai-megatron-patch-didi/toolkits/distributed_checkpoints_convertor/impl/general/h2m_synchronizer.py", line 105, in _copy_impl
[rank0]: dst_tensor.data.copy_(split_mapping[param_type])
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/shareler/LLM/pai-megatron-patch-didi/toolkits/distributed_checkpoints_convertor/impl/general/h2m_synchronizer.py", line 90, in
[rank0]: ParamType.COLUMN: lambda x: torch.chunk(self.load_tensor(x), tp_size, dim=0)[tp_rank],
[rank0]: ^^^^^^^^^^^^^^^^^^^
[rank0]: File "/shareler/LLM/pai-megatron-patch-didi/toolkits/distributed_checkpoints_convertor/impl/qwen2_5_vl/h2m_synchronizer.py", line 54, in load_tensor
[rank0]: file = _get_filename_from_key(key)
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/shareler/LLM/pai-megatron-patch-didi/toolkits/distributed_checkpoints_convertor/impl/qwen2_5_vl/h2m_synchronizer.py", line 41, in _get_filename_from_key
[rank0]: raise KeyError(f'{key} not found in index file')
[rank0]: KeyError: 'model.language_model.embed_tokens.weight not found in index file'
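The failure mode above can be reproduced and worked around with a tolerant index lookup. The sketch below is a hypothetical helper (not the PR's actual code): transformers >= 4.52.0 inserted a `language_model.` segment into Qwen2.5-VL weight names (e.g. `model.embed_tokens.weight` became `model.language_model.embed_tokens.weight`), so a lookup can try both spellings before raising:

```python
# Hypothetical sketch of a fallback key lookup; the real fix lives in
# the PR's qwen2_5_vl/h2m_synchronizer.py.

def get_filename_from_key(key: str, weight_map: dict) -> str:
    """Resolve a weight key to its checkpoint shard, tolerating the
    pre/post-4.52.0 naming difference in transformers."""
    if key in weight_map:
        return weight_map[key]
    # Older spelling: drop the inserted "language_model." segment.
    legacy = key.replace("model.language_model.", "model.", 1)
    if legacy in weight_map:
        return weight_map[legacy]
    # Newer spelling: insert the segment.
    renamed = key.replace("model.", "model.language_model.", 1)
    if renamed in weight_map:
        return weight_map[renamed]
    raise KeyError(f'{key} not found in index file')

# An index built with the pre-4.52.0 names still resolves:
index = {"model.embed_tokens.weight": "model-00001-of-00002.safetensors"}
print(get_filename_from_key("model.language_model.embed_tokens.weight", index))
```

Either direction of the rename is handled, so the same converter works against checkpoints saved by older and newer transformers releases.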

