Hi,
First, thank you for your fantastic work on this project! I have a couple of questions regarding the implementation of SD21 in the codebase:
(1) The current codebase includes the SD21 model, but it doesn't seem to work as expected because the implementation only supports epsilon prediction. Would it be possible to add support for v-prediction for SD21?
(2) Is there a theoretical reason behind only implementing epsilon prediction, or is it more of a practical limitation?
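For context on what I mean by v-prediction support: as far as I understand, a v-prediction output can be converted back to an epsilon (or x0) estimate with a small amount of algebra, so the rest of an epsilon-based pipeline might not need to change. Below is a minimal sketch in plain Python, assuming a variance-preserving schedule where `x_t = alpha_t * x0 + sigma_t * eps`, `alpha_t**2 + sigma_t**2 == 1`, and `v = alpha_t * eps - sigma_t * x0`. The function names are my own and are not taken from this codebase.

```python
import math


def v_to_eps(x_t, v, alpha_t, sigma_t):
    """Recover the noise estimate from a v-prediction.

    Assumes alpha_t**2 + sigma_t**2 == 1 (variance-preserving schedule).
    """
    return alpha_t * v + sigma_t * x_t


def v_to_x0(x_t, v, alpha_t, sigma_t):
    """Recover the clean-sample estimate from a v-prediction."""
    return alpha_t * x_t - sigma_t * v


# Round-trip check on scalars (hypothetical values, for illustration only).
alpha_bar = 0.64                      # cumulative product of alphas at step t
alpha_t = math.sqrt(alpha_bar)        # signal scale
sigma_t = math.sqrt(1.0 - alpha_bar)  # noise scale

x0, eps = 0.3, -1.2
x_t = alpha_t * x0 + sigma_t * eps    # forward diffusion
v = alpha_t * eps - sigma_t * x0      # v-prediction target

eps_rec = v_to_eps(x_t, v, alpha_t, sigma_t)
x0_rec = v_to_x0(x_t, v, alpha_t, sigma_t)
```

If this matches how the SD21 checkpoint was trained, wrapping the model output with something like `v_to_eps` might be enough, but I may be missing details specific to this codebase.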
Thank you for your time and help!