huggingface/diffusers
#13068 (Closed)
Description
Error:
NotImplementedError: NVFP4Tensor dispatch: attempting to run unimplemented operator/function: func=<OpOverload(op='aten.expand', overload='default')>, types=(<class 'torchao.prototype.mx_formats.nvfp4_tensor.NVFP4Tensor'>,), arg_types=(<class 'torchao.prototype.mx_formats.nvfp4_tensor.NVFP4Tensor'>, <class 'list'>), kwarg_types={}

Code:
import torch
from diffusers import FluxPipeline
from torchao.quantization import quantize_
from torchao.prototype.mx_formats.inference_workflow import (
    NVFP4DynamicActivationNVFP4WeightConfig,
    NVFP4WeightOnlyConfig,
)

config = NVFP4WeightOnlyConfig(
    use_dynamic_per_tensor_scale=True,
)

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
).to("cuda")

# Quantize the transformer in place, then compile its repeated blocks.
quantize_(pipe.transformer, config=config)
pipe.transformer.compile_repeated_blocks(fullgraph=True)

# The error is raised during this call.
_ = pipe("a dog", num_images_per_prompt=4)

The same error occurs with NVFP4DynamicActivationNVFP4WeightConfig.
I am using PyTorch 2.10.0 and nightly TorchAO. I am on B200 with CUDA 12.9.
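
Since TorchAO here is a nightly build, the exact installed versions may matter for reproducing this. A minimal sketch for printing them with only the standard library follows. The distribution names `torch`, `torchao`, and `diffusers` are assumptions (a nightly wheel may be published under a different name), and `installed_version` and `version_report` are hypothetical helper names, not part of any of these libraries.

```python
from importlib import metadata

# Assumed distribution names; adjust if the nightly build uses a different one.
PACKAGES = ("torch", "torchao", "diffusers")


def installed_version(name: str) -> str:
    """Return the installed version of a distribution, or 'not installed'."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return "not installed"


def version_report(packages=PACKAGES) -> dict:
    """Map each distribution name to its installed version string."""
    return {name: installed_version(name) for name in packages}


if __name__ == "__main__":
    for name, version in version_report().items():
        print(f"{name}: {version}")
```

Pasting this output alongside the GPU model and CUDA version should make it easier to tell which nightly the error was seen on.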