Open
Labels
enhancement: New feature or request
Description
Is there an existing issue for this?
- I have searched the existing issues
Contact Details
No response
What should this feature add?
SDNQ is a Diffusers-compatible quantizer that works with all major platforms. It includes support for SVDQuant, previously introduced in the CUDA-only Nunchaku, which allows models to run at 4-bit precision with minimal quality loss.
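For context on why the SVDQuant approach keeps quality at 4 bits, here is a rough, self-contained sketch of the core idea: split a weight matrix into a high-precision low-rank part plus a residual, and quantize only the residual. This is not SDNQ's or Nunchaku's code. The helper names (`quantize_4bit`, `rank1_approx`), the toy 4x4 matrix, the rank of 1, and the per-tensor symmetric scheme are all assumptions made for illustration, and the activation-smoothing step of the real method is omitted.

```python
import math

def quantize_4bit(matrix):
    """Symmetric per-tensor 4-bit round trip: quantize to [-8, 7], then dequantize."""
    max_abs = max(abs(x) for row in matrix for x in row)
    scale = max_abs / 7 if max_abs > 0 else 1.0
    return [[max(-8, min(7, round(x / scale))) * scale for x in row] for row in matrix]

def rank1_approx(matrix, iters=20):
    """Best rank-1 approximation via power iteration (stand-in for a truncated SVD)."""
    rows, cols = len(matrix), len(matrix[0])
    v = [1.0] * cols
    for _ in range(iters):
        u = [sum(matrix[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        norm_u = math.sqrt(sum(x * x for x in u))
        u = [x / norm_u for x in u]
        v = [sum(matrix[i][j] * u[i] for i in range(rows)) for j in range(cols)]
        sigma = math.sqrt(sum(x * x for x in v))
        v = [x / sigma for x in v]
    return [[sigma * u[i] * v[j] for j in range(cols)] for i in range(rows)]

def max_abs_error(a, b):
    """Largest element-wise absolute difference between two matrices."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

# Toy weights (assumed): a dominant rank-1 component with large entries,
# plus small deterministic detail that naive 4-bit quantization tends to lose.
u_true = [4.0, -3.0, 2.0, 1.0]
v_true = [3.0, 1.0, -2.0, 2.0]
weights = [
    [u_true[i] * v_true[j] + 0.05 * math.sin(3 * i + 7 * j + 1) for j in range(4)]
    for i in range(4)
]

# Baseline: quantize the whole matrix directly.
direct_error = max_abs_error(weights, quantize_4bit(weights))

# Split: keep the low-rank part in full precision, quantize only the residual.
low_rank = rank1_approx(weights)
residual = [[w - l for w, l in zip(rw, rl)] for rw, rl in zip(weights, low_rank)]
residual_q = quantize_4bit(residual)
split = [[l + r for l, r in zip(rl, rr)] for rl, rr in zip(low_rank, residual_q)]
split_error = max_abs_error(weights, split)

print("direct 4-bit max error:", direct_error)
print("low-rank + 4-bit residual max error:", split_error)
```

Because the large values end up in the low-rank branch, the residual has a much smaller range, so the 16 available levels cover it far more finely than they cover the original matrix.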
SDNQ may be suitable as the default quantizer for Invoke: it is compatible with more end-user systems than bitsandbytes, and it avoids the upcast to FP16 that GGUF quantization requires. A range of prequantized checkpoints is available here.
As part of adopting SDNQ, it would be helpful to install triton-windows for Windows CUDA and ROCm installations.
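A minimal sketch of how an installer could pick the Triton package per platform. `triton_requirement` is a hypothetical helper, not part of Invoke's installer, and it assumes the upstream `triton` package is used on Linux while `triton-windows` is used on Windows. It does not distinguish CUDA from ROCm.

```python
import sys

def triton_requirement(platform_name=None):
    """Return the pip package name for Triton on a platform, or None if unsupported.

    Hypothetical helper for illustration; platform_name follows sys.platform
    conventions ("win32", "linux", "darwin").
    """
    if platform_name is None:
        platform_name = sys.platform
    if platform_name.startswith("win"):
        return "triton-windows"  # Windows build of Triton
    if platform_name.startswith("linux"):
        return "triton"          # upstream package (assumed for Linux)
    return None                  # e.g. macOS: assume no Triton package

requirement = triton_requirement()
if requirement is not None:
    print(f"pip install {requirement}")
```

The same choice could be written declaratively as a dependency with an environment marker, for example `triton-windows; sys_platform == "win32"`, if the project prefers to keep this in its package metadata.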
Alternatives
No response
Additional Content
Updated 2/3: Triton for Windows now experimentally supports AMD (triton-windows#188)