[core] make flux hidden states contiguous #13068
base: main
Conversation
Pull request overview
This PR addresses performance issues with NVFP4 quantization by making hidden states contiguous after the split operation in the Flux transformer attention processor. Ensuring the tensors have a contiguous memory layout before they are passed to the linear layers enables significant speed improvements when using NVFP4 quantization.
Changes:
- Added .contiguous() calls to hidden_states and encoder_hidden_states after the split_with_sizes operation in FluxAttnProcessor
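To illustrate why the change described above helps, here is a minimal, self-contained sketch (toy shapes and variable names, not the actual Flux code): chunks returned by split_with_sizes along dim=1 are views into the parent tensor and are non-contiguous whenever the batch dimension is greater than 1, which is exactly the layout that slows down quantized linear kernels such as the NVFP4 path.

```python
import torch

# Toy illustration (shapes are made up, not the real Flux dims):
# split_with_sizes returns views that share the parent's storage.
x = torch.randn(2, 6, 8)  # (batch, sequence, channels)
encoder_part, image_part = x.split_with_sizes([2, 4], dim=1)

# Both chunks keep the parent's strides, so neither is contiguous when batch > 1.
print(encoder_part.is_contiguous(), image_part.is_contiguous())  # False False

# The fix in spirit: materialize dense buffers before the
# (possibly NVFP4-quantized) output projections see them.
encoder_part = encoder_part.contiguous()
image_part = image_part.contiguous()
print(encoder_part.is_contiguous(), image_part.is_contiguous())  # True True
```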
  hidden_states = attn.to_out[0](hidden_states.contiguous())
  hidden_states = attn.to_out[1](hidden_states)
- encoder_hidden_states = attn.to_add_out(encoder_hidden_states)
+ encoder_hidden_states = attn.to_add_out(encoder_hidden_states.contiguous())
Copilot AI commented on Feb 3, 2026:
The same contiguous() fix should be applied to the FluxIPAdapterAttnProcessor class. Lines 240-242 have an identical split_with_sizes pattern but are missing the contiguous() calls. For consistency and to ensure NVFP4 quantization benefits are available across all attention processors, please add contiguous() calls at lines 240 and 242 similar to these changes.
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
yiyixuxu left a comment:
thanks!
What does this PR do?
Fixes pytorch/ao#3783.
NVFP4 offers nice speed benefits, which this PR unlocks:
More results are in https://gist.github.com/sayakpaul/6e6883db921149a87d35cfde4b4dd5d8.
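As a rough usage sketch (not taken from this PR), NVFP4 quantization could be applied to the Flux transformer with torchao roughly as follows; the NVFP4InferenceConfig import path and the pipeline settings are assumptions that may differ across torchao/diffusers versions, so see the gist above and pytorch/ao#3783 for the actual setup behind the reported numbers.

```python
import torch
from diffusers import FluxPipeline
from torchao.quantization import quantize_
# Assumed import path for the NVFP4 config; check your torchao version.
from torchao.prototype.mx_formats import NVFP4InferenceConfig

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
).to("cuda")

# Quantize the transformer's linear layers to NVFP4 in place.
quantize_(pipe.transformer, NVFP4InferenceConfig())

image = pipe("a cat holding a sign", num_inference_steps=28).images[0]
image.save("flux_nvfp4.png")
```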