Code for the EMNLP 2025 paper "Superpose Task-specific Features for Model Merging".
- Authors: Haiquan Qiu, You Wu, Dong Li, Jianmin Guo, Quanming Yao
- Paper:
stf/
│
├── merge/                       # Core STF merging scripts
│   ├── stf_lora.py              # GPT-2 LoRA adapter merging
│   ├── stf_t5.py                # T5 merging
│   ├── stf_vit.py               # ViT merging
│   └── utils.py                 # Shared helpers: state-dict/vector conversion, merge matrix
│
├── eval/
│   ├── LoRA/                    # LoRA evaluation utilities
│   │   ├── data/                # E2E / DART / WebNLG raw & formatted data
│   │   ├── eval/                # Official / third-party generation metrics
│   │   ├── merged_checkpoint/
│   │   ├── pretrained_checkpoints/
│   │   ├── src/                 # GPT-2 decoding, beam search, wrappers
│   │   ├── trained_models/      # LoRA checkpoints of finetuned models
│   │   ├── create_datasets.sh
│   │   ├── download_pretrained_checkpoints.sh
│   │   └── eval.sh
│   │
│   ├── t5/
│   │   ├── merged_checkpoint/
│   │   └── src/
│   │       ├── data/            # Dataset readers & batching
│   │       ├── eval/            # Evaluation & scoring components
│   │       ├── model/           # T5Wrapper, loading & merge ops
│   │       ├── train/           # Training configuration
│   │       ├── utils/           # Distributed + general utilities
│   │       └── inference.py     # Evaluation entry point
│   │
│   └── ViT/
│       ├── checkpoints/
│       ├── datasets/
│       ├── merged_checkpoint/
│       ├── src/
│       └── download.sh
│
├── LICENSE
├── README.md
├── requirements.txt
└── THIRD_PARTY_LICENSES.md
- stf_lora.py: GPT-2 LoRA (Low-Rank Adaptation) adapter merging
- stf_t5.py: T5 model merging implementation
- stf_vit.py: Vision Transformer (ViT) model merging implementation
- utils.py: merging utilities: the core merge-matrix computation, plus the state_dict_to_vector and vector_to_state_dict converters (see the usage sketch after this list)
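The converter names above come from merge/utils.py; the bodies below are a minimal self-contained sketch of what such helpers typically do (the repository's exact signatures may differ), showing the usual round trip: flatten state dicts to vectors, combine them, and unflatten.

```python
import torch

def state_dict_to_vector(state_dict):
    # Flatten every tensor, in a fixed key order, into one 1-D vector.
    return torch.cat([t.reshape(-1) for _, t in sorted(state_dict.items())])

def vector_to_state_dict(vector, reference):
    # Invert the flattening, taking keys and shapes from a reference state dict.
    out, offset = {}, 0
    for key, t in sorted(reference.items()):
        out[key] = vector[offset:offset + t.numel()].view_as(t)
        offset += t.numel()
    return out

# Round trip: build a task vector and apply it back to the pretrained weights.
pretrained = {"w": torch.zeros(2, 2), "b": torch.zeros(2)}
finetuned = {"w": torch.ones(2, 2), "b": torch.ones(2)}
task_vector = state_dict_to_vector(finetuned) - state_dict_to_vector(pretrained)
merged = vector_to_state_dict(state_dict_to_vector(pretrained) + task_vector, pretrained)
assert torch.equal(merged["w"], finetuned["w"])
```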
- Create a conda environment and activate it:
conda create --name stf python=3.9
conda activate stf
- Install the dependencies:
python -m pip install -r requirements.txt
- Download the ViT checkpoints (7 finetuned checkpoints, the pretrained checkpoint, and the classification heads) from https://drive.google.com/drive/folders/1u_Tva6x0p6oxu5Eo0ZZsf-520Cc_3MKw to eval/ViT/checkpoints/:
gdown --folder https://drive.google.com/drive/folders/1u_Tva6x0p6oxu5Eo0ZZsf-520Cc_3MKw -O eval/ViT/checkpoints
- Fetch the remaining assets with the provided script:
cd eval/ViT
bash download.sh
cd ../..
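To sanity-check the download, a small sketch like the following can walk the checkpoint folder and load each file; the filenames depend on the Drive folder layout, so none are assumed here.

```python
import torch
from pathlib import Path

# Walk eval/ViT/checkpoints/ and try to load every .pt file on CPU.
# Full-model pickles may additionally require the repo's src/ on PYTHONPATH.
for path in sorted(Path("eval/ViT/checkpoints").rglob("*.pt")):
    ckpt = torch.load(path, map_location="cpu")
    state_dict = ckpt.state_dict() if hasattr(ckpt, "state_dict") else ckpt
    print(path, f"-> {len(state_dict)} tensors")
```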
- Download the 3 GPT-2 Medium LoRA checkpoints (E2E, DART, WebNLG) to eval/LoRA/trained_models/GPT2_M/:
mkdir -p eval/LoRA/trained_models/GPT2_M
wget https://github.com/microsoft/LoRA/releases/download/GPT-2/gpt2_md_lora_e2e.pt -O eval/LoRA/trained_models/GPT2_M/gpt2_md_lora_e2e.pt
wget https://github.com/microsoft/LoRA/releases/download/GPT-2/gpt2_md_lora_dart.pt -O eval/LoRA/trained_models/GPT2_M/gpt2_md_lora_dart.pt
wget https://github.com/microsoft/LoRA/releases/download/GPT-2/gpt2_md_lora_webnlg.pt -O eval/LoRA/trained_models/GPT2_M/gpt2_md_lora_webnlg.pt
- Download the pretrained checkpoints to eval/LoRA/pretrained_checkpoints/, create the datasets at eval/LoRA/data/, and install the evaluation utilities at eval/LoRA/eval/:
cd eval/LoRA/
bash download_pretrained_checkpoints.sh
bash create_datasets.sh
bash eval/download_evalscript.sh
cd eval/GenerationEval
bash install_dependencies.sh
cd ../../../..
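A quick way to confirm the LoRA checkpoints downloaded correctly is to peek at their low-rank tensors; the wrapping of the state dict below is an assumption (checkpoints are commonly saved either bare or under a "model_state_dict" key), so verify against the actual files.

```python
import torch

# Load one downloaded checkpoint and list a few of its LoRA tensors.
ckpt = torch.load("eval/LoRA/trained_models/GPT2_M/gpt2_md_lora_e2e.pt", map_location="cpu")
state_dict = ckpt["model_state_dict"] if isinstance(ckpt, dict) and "model_state_dict" in ckpt else ckpt
lora_keys = [k for k in state_dict if "lora" in k.lower()]
print(f"{len(lora_keys)} LoRA tensors, e.g.:")
for k in lora_keys[:4]:
    print(" ", k, tuple(state_dict[k].shape))
```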
- Download the 3 Llama-2-7B checkpoints from Hugging Face, e.g.:
qualis2006/llama-2-7b-int4-python-code-18k
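One way to fetch them is with huggingface_hub's snapshot_download; the local target directory below is an illustrative choice, not one fixed by this repo.

```python
from huggingface_hub import snapshot_download

# Fetch one of the Llama-2-7B checkpoints; repeat for the other two repos.
snapshot_download(
    repo_id="qualis2006/llama-2-7b-int4-python-code-18k",
    local_dir="checkpoints/llama-2-7b-int4-python-code-18k",
)
```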
- Run the merging scripts from the repository root:
python merge/stf_vit.py
python merge/stf_t5.py
python merge/stf_lora.py
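For orientation, the sketch below shows plain task-vector arithmetic (Ilharco et al.'s baseline), the pattern these scripts build on; it is not the STF algorithm itself, which superposes task-specific features instead of naively summing task vectors (see merge/stf_*.py).

```python
import torch

def task_arithmetic_merge(pretrained, finetuned_list, lam=0.3):
    """Baseline merge: merged = pretrained + lam * sum_i (finetuned_i - pretrained)."""
    merged = {}
    for key, w0 in pretrained.items():
        delta = sum(ft[key] - w0 for ft in finetuned_list)
        merged[key] = w0 + lam * delta
    return merged

# Toy example with two "tasks" on a one-tensor model.
w0 = {"w": torch.zeros(2, 2)}
merged = task_arithmetic_merge(w0, [{"w": torch.ones(2, 2)}, {"w": -torch.ones(2, 2)}])
```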
- Evaluate the merged GPT-2 LoRA model:
cd eval/LoRA
bash eval.sh
@inproceedings{qiu2025stf,
  title     = {Superpose Task-specific Features for Model Merging},
  author    = {Qiu, Haiquan and Wu, You and Li, Dong and Guo, Jianmin and Yao, Quanming},
  booktitle = {Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
  year      = {2025},
  publisher = {Association for Computational Linguistics},
  note      = {Corresponding author: Quanming Yao (qyaoaa@tsinghua.edu.cn)}
}