
Fix qwen_vl key error in dist converter #706

Open
ShareLer wants to merge 1 commit into alibaba:main from ShareLer:fix_qwen_vl

Conversation

@ShareLer

When converting a qwen_vl distributed checkpoint (HF -> MCore) with a recent version of transformers (>= 4.52.0), the weight names read from the model differ from the names in the index file, so the conversion fails.

error stack:

[rank0]: File "/shareler/LLM/pai-megatron-patch-didi/toolkits/distributed_checkpoints_convertor/impl/qwen2_5_vl/h2m_synchronizer.py", line 26, in sync_params
[rank0]: super().sync_params(self._mgmodel.language_model, self._hfmodel.model.language_model)
[rank0]: File "/shareler/LLM/pai-megatron-patch-didi/toolkits/distributed_checkpoints_convertor/impl/general/synchronizer.py", line 118, in sync_params
[rank0]: self.set_preprocess_state(mg_model=mg_model, hf_model=hf_model)
[rank0]: File "/shareler/LLM/pai-megatron-patch-didi/toolkits/distributed_checkpoints_convertor/impl/general/h2m_synchronizer.py", line 109, in set_preprocess_state
[rank0]: self.copy(
[rank0]: File "/shareler/LLM/pai-megatron-patch-didi/toolkits/distributed_checkpoints_convertor/impl/general/synchronizer.py", line 150, in copy
[rank0]: return self._copy_impl(src_tensor, dst_tensor, **kwargs)
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/shareler/LLM/pai-megatron-patch-didi/toolkits/distributed_checkpoints_convertor/impl/general/h2m_synchronizer.py", line 105, in _copy_impl
[rank0]: dst_tensor.data.copy_(split_mapping[param_type])
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/shareler/LLM/pai-megatron-patch-didi/toolkits/distributed_checkpoints_convertor/impl/general/h2m_synchronizer.py", line 90, in
[rank0]: ParamType.COLUMN: lambda x: torch.chunk(self.load_tensor(x), tp_size, dim=0)[tp_rank],
[rank0]: ^^^^^^^^^^^^^^^^^^^
[rank0]: File "/shareler/LLM/pai-megatron-patch-didi/toolkits/distributed_checkpoints_convertor/impl/qwen2_5_vl/h2m_synchronizer.py", line 54, in load_tensor
[rank0]: file = _get_filename_from_key(key)
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/shareler/LLM/pai-megatron-patch-didi/toolkits/distributed_checkpoints_convertor/impl/qwen2_5_vl/h2m_synchronizer.py", line 41, in _get_filename_from_key
[rank0]: raise KeyError(f'{key} not found in index file')
[rank0]: KeyError: 'model.language_model.embed_tokens.weight not found in index file'
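The failure mode above can be reproduced and worked around with a tolerant index lookup. The sketch below is a hypothetical helper (not the PR's actual code): transformers >= 4.52.0 inserted a `language_model.` segment into Qwen2.5-VL weight names (e.g. `model.embed_tokens.weight` became `model.language_model.embed_tokens.weight`), so a lookup can try both spellings before raising:

```python
# Hypothetical sketch of a fallback key lookup; the real fix lives in
# the PR's qwen2_5_vl/h2m_synchronizer.py.

def get_filename_from_key(key: str, weight_map: dict) -> str:
    """Resolve a weight key to its checkpoint shard, tolerating the
    pre/post-4.52.0 naming difference in transformers."""
    if key in weight_map:
        return weight_map[key]
    # Older spelling: drop the inserted "language_model." segment.
    legacy = key.replace("model.language_model.", "model.", 1)
    if legacy in weight_map:
        return weight_map[legacy]
    # Newer spelling: insert the segment.
    renamed = key.replace("model.", "model.language_model.", 1)
    if renamed in weight_map:
        return weight_map[renamed]
    raise KeyError(f'{key} not found in index file')

# An index built with the pre-4.52.0 names still resolves:
index = {"model.embed_tokens.weight": "model-00001-of-00002.safetensors"}
print(get_filename_from_key("model.language_model.embed_tokens.weight", index))
```

Either direction of the rename is handled, so the same converter works against checkpoints saved by older and newer transformers releases.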

