Superpose Task-specific Features for Model Merging (EMNLP 2025)

Code for the EMNLP 2025 paper "Superpose Task-specific Features for Model Merging".

  • Authors: Haiquan Qiu, You Wu, Dong Li, Jianmin Guo, Quanming Yao
  • Paper:

File Structure

stf/
│
├── merge/                         # Core STF merging scripts
│   ├── stf_lora.py                # GPT-2 LoRA adapter merging
│   ├── stf_t5.py                  # T5 merging
│   ├── stf_vit.py                 # ViT merging
│   └── utils.py                   # Common functions: state dict <-> vector conversion, merge matrix
│
├── eval/
│   ├── lora/                      # LoRA evaluation utilities
│   │   ├── data/                  # E2E / DART / WebNLG raw & formatted data
│   │   ├── eval/                  # Official / third-party generation metrics
│   │   ├── merged_checkpoint/
│   │   ├── pretrained_checkpoints/
│   │   ├── src/                   # GPT-2 decoding, beam search, wrappers
│   │   ├── trained_models/        # LoRA checkpoints of finetuned models
│   │   ├── create_dataset.sh
│   │   ├── download_pretrained_checkpoints.sh
│   │   └── eval.sh
│   │
│   ├── t5/
│   │   ├── merged_checkpoint/
│   │   └── src/
│   │       ├── data/              # Dataset readers & batching
│   │       ├── eval/              # Evaluation & scoring components
│   │       ├── model/             # T5Wrapper, loading & merge ops
│   │       ├── train/             # Training configuration
│   │       ├── utils/             # Distributed + general utilities
│   │       └── inference.py       # Evaluation utilities
│   │
│   └── ViT/
│       ├── checkpoints/
│       ├── datasets/
│       ├── merged_checkpoint/
│       ├── src/
│       └── download.sh
│
├── LICENSE
├── README.md
├── requirements.txt
└── THIRD_PARTY_LICENSES.md

Core Files

  • stf_lora.py: LoRA (Low-Rank Adaptation) adapter merging for GPT-2
  • stf_t5.py: T5 model merging implementation
  • stf_vit.py: Vision Transformer (ViT) model merging implementation
  • utils.py: merging utilities, including the core merge-matrix function and the state_dict_to_vector / vector_to_state_dict helpers
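The conversion helpers in utils.py map between a model's state dict and a single flat parameter vector, which lets merging operate on plain tensors. The sketch below is illustrative only (it is not the repository's implementation); it shows one common way such a round trip is written, using a reference state dict to recover shapes:

```python
import torch

def state_dict_to_vector(state_dict):
    """Flatten every tensor in a state dict into one 1-D vector.
    Sorting the keys gives a deterministic parameter ordering."""
    keys = sorted(state_dict.keys())
    return torch.cat([state_dict[k].reshape(-1) for k in keys])

def vector_to_state_dict(vector, reference_state_dict):
    """Invert state_dict_to_vector, using a reference dict for shapes."""
    keys = sorted(reference_state_dict.keys())
    out, offset = {}, 0
    for k in keys:
        numel = reference_state_dict[k].numel()
        out[k] = vector[offset:offset + numel].reshape(reference_state_dict[k].shape)
        offset += numel
    return out

# Round trip: the reconstruction matches the original state dict.
sd = {"w": torch.randn(2, 3), "b": torch.randn(3)}
vec = state_dict_to_vector(sd)
restored = vector_to_state_dict(vec, sd)
assert all(torch.equal(sd[k], restored[k]) for k in sd)
```

The key point is that flattening and unflattening must use the same key ordering, otherwise parameters end up reshaped into the wrong layers.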

Setup

  1. Create a virtual environment and activate it.
conda create --name stf python=3.9
conda activate stf

  2. Install dependencies:
python -m pip install -r requirements.txt

Download Checkpoints and Datasets

T5

ViT

Model checkpoint

Download the 7 finetuned ViT checkpoints, the pretrained checkpoint, and the classification heads to eval/ViT/checkpoints/:

https://drive.google.com/drive/folders/1u_Tva6x0p6oxu5Eo0ZZsf-520Cc_3MKw

gdown --folder https://drive.google.com/drive/folders/1u_Tva6x0p6oxu5Eo0ZZsf-520Cc_3MKw -O eval/ViT/checkpoints
Dataset
cd eval/ViT
bash download.sh
cd ../..

LoRA

cd eval/LoRA/
Model checkpoints

Download the 3 GPT-2 Medium LoRA checkpoints to eval/LoRA/trained_models/GPT2_M/:

  • E2E
  • DART
  • WebNLG

mkdir -p trained_models/GPT2_M
wget https://github.com/microsoft/LoRA/releases/download/GPT-2/gpt2_md_lora_e2e.pt -O trained_models/GPT2_M/gpt2_md_lora_e2e.pt

wget https://github.com/microsoft/LoRA/releases/download/GPT-2/gpt2_md_lora_dart.pt -O trained_models/GPT2_M/gpt2_md_lora_dart.pt

wget https://github.com/microsoft/LoRA/releases/download/GPT-2/gpt2_md_lora_webnlg.pt -O trained_models/GPT2_M/gpt2_md_lora_webnlg.pt

Download the pretrained checkpoints to stf/eval/LoRA/pretrained_checkpoints:

bash download_pretrained_checkpoints.sh

Dataset

Create the datasets at stf/eval/LoRA/data:

bash create_datasets.sh

Download the evaluation utilities to stf/eval/LoRA/eval:

bash eval/download_evalscript.sh
cd GenerationEval
bash install_dependencies.sh
cd ../../../..

LLM

Three Hugging Face Llama-2-7B checkpoints are used:

  • LinkSoul/Chinese-Llama-2-7b
  • meta-math/MetaMath-7B-V1.0
  • qualis2006/llama-2-7b-int4-python-code-18k

Merge Experiments and Performance Testing

ViT-B-32

python merge/stf_vit.py

T5

python merge/stf_t5.py

LoRA (GPT-2 Medium)

python merge/stf_lora.py
cd eval/LoRA
bash eval.sh
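For orientation before running the scripts above, the simplest baseline form of checkpoint merging is plain task arithmetic: add a scaled sum of task vectors (finetuned minus pretrained weights) back onto the pretrained model. This is not the STF merge-matrix construction from merge/utils.py, only a generic illustration with hypothetical names:

```python
import torch

def merge_task_vectors(pretrained, finetuned_list, alpha=0.3):
    """Task arithmetic baseline: merged = pretrained + alpha * sum_i (finetuned_i - pretrained).
    Illustrative only; STF's actual merging is more involved."""
    merged = {}
    for k in pretrained:
        task_sum = sum(ft[k] - pretrained[k] for ft in finetuned_list)
        merged[k] = pretrained[k] + alpha * task_sum
    return merged

# Toy example with two "finetuned" checkpoints of a one-tensor model.
pre = {"w": torch.zeros(2)}
fts = [{"w": torch.ones(2)}, {"w": 2 * torch.ones(2)}]
merged = merge_task_vectors(pre, fts, alpha=0.5)  # w = 0 + 0.5 * (1 + 2) = 1.5
```

The scaling coefficient alpha trades off how strongly each task's features are injected; naive summation is exactly the interference problem that feature-superposition methods aim to reduce.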

Reference

@inproceedings{qiu2025stf,
  title     = {Superpose Task-specific Features for Model Merging},
  author    = {Qiu, Haiquan and Wu, You and Li, Dong and Guo, Jianmin and Yao, Quanming},
  booktitle = {Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
  year      = {2025},
  publisher = {Association for Computational Linguistics},
  note      = {Corresponding author: Quanming Yao (qyaoaa@tsinghua.edu.cn)}
}
