[enhancement]: Support for SD.Next Quantizer #8789

@iwr-redmond

Is there an existing issue for this?

  • I have searched the existing issues

Contact Details

No response

What should this feature add?

SDNQ is a Diffusers-compatible quantizer that works with all major platforms. It includes support for SVDQuant, previously introduced in the CUDA-only Nunchaku, which allows models to run at 4-bit precision with minimal quality loss.
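For context on what 4-bit precision involves, here is a minimal sketch of symmetric per-group 4-bit quantization in plain Python. It is a generic illustration only, not SDNQ's or Nunchaku's implementation (SVDQuant additionally absorbs outliers into a low-rank branch, which is not shown). The function names and the `group_size` parameter are hypothetical.

```python
def quantize_int4(weights, group_size=4):
    """Quantize floats to signed 4-bit codes in [-8, 7], one scale per group.

    Hypothetical helper for illustration; not part of any SDNQ API.
    """
    codes, scales = [], []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        max_abs = max(abs(w) for w in group)
        # Map the largest magnitude in the group to code 7; avoid dividing by zero.
        scale = max_abs / 7 if max_abs > 0 else 1.0
        scales.append(scale)
        codes.extend(max(-8, min(7, round(w / scale))) for w in group)
    return codes, scales


def dequantize_int4(codes, scales, group_size=4):
    """Recover approximate floats from 4-bit codes and per-group scales."""
    return [c * scales[i // group_size] for i, c in enumerate(codes)]
```

Each weight is stored as one of 16 integer codes plus a shared scale per group, so the round-trip error per weight is at most about half of that group's scale.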

SDNQ may be suitable as the default quantizer for Invoke: it is compatible with more end-user systems than bitsandbytes, and it avoids the upcasting to FP16 inherent to GGUF quantization. A range of prequantized checkpoints is available here.

As part of adopting SDNQ, it would be helpful to install triton-windows for Windows CUDA and ROCm installations.
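The platform condition above could be sketched roughly as follows. This is only an illustration of the selection logic, not Invoke's installer code. The helper name `extra_packages` and the backend strings `"cuda"` and `"rocm"` are assumptions; only the package name `triton-windows` comes from this request.

```python
def extra_packages(system, backend):
    """Return extra pip packages for an OS name and GPU backend.

    Hypothetical helper: `system` is expected to look like the result of
    platform.system() (e.g. "Windows", "Linux"), and `backend` is an
    assumed lowercase label such as "cuda", "rocm", or "cpu".
    """
    if system == "Windows" and backend in {"cuda", "rocm"}:
        return ["triton-windows"]
    return []
```

For example, `extra_packages(platform.system(), "cuda")` would add the package only when running on Windows.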

Alternatives

No response

Additional Content

Updated 2/3: Triton for Windows now experimentally supports AMD (triton-windows#188)

Labels

enhancement (New feature or request)
