Description
The LFM codebase is comprehensive and very well done. We tried to reproduce your results: the FID reported in your repo for the lsun_church dataset is 5.54, but our best run only reached 6.61. We believe the paper is authentic and reliable; could you help us find where we went wrong? Thank you very much. On the lsun_church dataset, we launched training on A6000 GPUs with the following command:
```shell
accelerate launch --multi_gpu --num_processes 10 train_flow_latent.py --exp church_f8_dit --dataset lsun_church --datadir <our/data/dir> --batch_size 48 --num_workers 4 --num_epoch 600 --image_size 256 --f 8 --num_in_channels 4 --num_out_channels 4 --nf 256 --ch_mult 1 2 3 4 --attn_resolution 16 8 4 --num_res_blocks 2 --lr 1e-4 --scale_factor 0.18215 --no_lr_decay --model_type DiT-L/2 --num_classes 1 --label_dropout 0. --save_content --save_content_every 10
```
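One thing we considered (a sketch of our own reasoning, not something from your repo): under `accelerate launch --multi_gpu`, the effective global batch size is the per-GPU `--batch_size` times `--num_processes`, so a different GPU count than your reference run would change the global batch and the optimization trajectory:

```python
# Effective (global) batch size under `accelerate launch --multi_gpu`:
# per-process batch size times number of processes. If the reference
# run used a different GPU count, the global batch differs, which
# could plausibly shift the final FID.
num_processes = 10   # --num_processes in the command above
per_gpu_batch = 48   # --batch_size in the command above
global_batch = num_processes * per_gpu_batch
print(global_batch)  # 480
```

Could you confirm how many GPUs and what per-GPU batch size were used for the reported 5.54 result?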
The environment is:
```
torch==2.0.0 numpy==1.26.4 lmdb==1.4.1 diffusers==0.20.0 transformers==4.30.2 huggingface_hub==0.14.1 accelerate==0.20.3 torchdiffeq==0.2.3 ml_collections==0.1.1 omegaconf==2.3.0 timm==0.9.2 ninja==1.11.1 blobfile==2.0.2 einops==0.6.1 opencv-python==4.8.1.78 scikit-image==0.21.0
```
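To rule out environment drift on our side, we compared installed package versions against the list above with a small helper (our own hypothetical script, not part of the repo; the `EXPECTED` dict is a subset of the versions listed):

```python
from importlib.metadata import version, PackageNotFoundError

# Subset of the environment listed above (assumed relevant to sampling/FID).
EXPECTED = {
    "torch": "2.0.0",
    "numpy": "1.26.4",
    "diffusers": "0.20.0",
    "accelerate": "0.20.3",
    "torchdiffeq": "0.2.3",
}

def check_versions(expected):
    """Return {package: (expected, installed)} for every mismatch.

    A missing package is reported with installed version None.
    """
    mismatches = {}
    for pkg, want in expected.items():
        try:
            have = version(pkg)
        except PackageNotFoundError:
            have = None
        if have != want:
            mismatches[pkg] = (want, have)
    return mismatches

if __name__ == "__main__":
    for pkg, (want, have) in check_versions(EXPECTED).items():
        print(f"{pkg}: expected {want}, found {have}")
```

Our environment matched the list exactly, so we don't think package versions explain the gap.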
The command that produced our best evaluation result is:
```shell
python test_flow_latent.py --exp church_f8_dit --dataset lsun_church --batch_size 80 --epoch_id 575 --image_size 256 --f 8 --num_in_channels 4 --num_out_channels 4 --nf 256 --ch_mult 1 2 3 4 --attn_resolution 16 8 4 --num_res_blocks 2 --master_port 12345 --num_process_per_node 1 --method dopri5 --model_type DiT-L/2 --num_classes 1 --label_dropout 0. --compute_fid --pretrained_autoencoder_ckpt <our/ckpt/dir>
```