Phantom-Data: Towards a General Subject-Consistent Video Generation Dataset
Zhuowei Chen * , Bingchuan Li * †, Tianxiang Ma * , Lijie Liu * , Mingcong Liu, Yi Zhang, Gen Li, Xinghui Li, Siyu Zhou, Qian He, Xinglong Wu
* Equal contribution, † Project lead
Intelligent Creation Lab, ByteDance
- We have released the dataset, built upon Koala-36M, on Hugging Face as Phantom-data-Koala36M.
- TODO: add more detailed instructions on how to use this dataset after the national holiday.
If Phantom-Data is helpful, please consider giving the repo a ⭐.
If you find this project useful for your research, please consider citing our paper.
@article{chen2025phantom-data,
title={Phantom-Data: Towards a General Subject-Consistent Video Generation Dataset},
author={Chen, Zhuowei and Li, Bingchuan and Ma, Tianxiang and Liu, Lijie and Liu, Mingcong and Zhang, Yi and Li, Gen and Li, Xinghui and Zhou, Siyu and He, Qian and Wu, Xinglong},
journal={arXiv preprint arXiv:2506.18851},
year={2025}
}
If you have any comments or questions regarding this open-source project, please open a new issue or contact Zhuowei Chen.