
[Triton]-Flashattn - sync the changes from tridao PR2217#1980

Open
tianwyan wants to merge 2 commits into main from tianwyan/navi_fa

Conversation


@tianwyan tianwyan commented Feb 5, 2026

Motivation

This PR enables Flash Attention Triton support for AMD RDNA3 (Navi) GPUs, specifically targeting the gfx1100 architecture. The goal is to bring Flash Attention performance optimizations to consumer-grade AMD GPUs while leveraging the unique Infinity Cache (LLC) architecture for improved memory throughput.

Technical Details

New Architecture Support:

  • Added gfx1100 (RDNA3/Navi 31) to the supported GPU architectures in the Triton Flash Attention backend
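As a rough illustration of what "adding an architecture to the supported list" looks like, here is a hypothetical sketch of the kind of gate such a backend performs. The function and set names below are assumptions for illustration, not the actual identifiers in this repository.

```python
# Hypothetical sketch; the real backend's function and list names may differ.
_SUPPORTED_GFX_ARCHS = {
    "gfx90a",   # CDNA2 (MI200)
    "gfx942",   # CDNA3 (MI300)
    "gfx1100",  # RDNA3 / Navi 31 -- the architecture this PR enables
}

def is_triton_fa_supported(gcn_arch_name: str) -> bool:
    """Return True if the Triton Flash Attention path supports this GPU.

    `gcn_arch_name` is the string reported by the HIP runtime,
    e.g. "gfx1100:sramecc-:xnack-"; only the base arch is compared.
    """
    base_arch = gcn_arch_name.split(":")[0]
    return base_arch in _SUPPORTED_GFX_ARCHS
```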

Performance Optimizations:

  • Implemented Infinity Cache (LLC) awareness to optimize memory access patterns and reduce DRAM bandwidth pressure
  • Enabled exp2 instruction by default for faster exponential calculations on RDNA3
  • Added additional Triton autotuning configurations optimized for Navi's wavefront and cache characteristics

Code Cleanup:

  • Renamed "L2 cache" terminology to "Infinity Cache (LLC)" throughout the codebase to accurately reflect AMD's cache hierarchy and avoid confusion with the traditional L2 cache

Test Plan

  • Functional testing on AMD Radeon RX 7900 XTX (gfx1100)
  • Verified Flash Attention forward pass correctness against reference implementation
  • Benchmarked memory bandwidth utilization with and without LLC awareness

Test Result

  • All existing Triton Flash Attention tests pass on gfx1100
  • ~2-4x performance improvement with LLC-aware implementation on memory-bound attention workloads
  • LLC awareness significantly reduces DRAM bandwidth pressure by better utilizing the 96MB Infinity Cache on RDNA3
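To make the bandwidth claim concrete, the arithmetic below estimates whether the K and V tensors for a given sequence length stay resident in RDNA3's 96 MB Infinity Cache. This is illustrative back-of-the-envelope math under assumed shapes, not the heuristic the kernel actually uses.

```python
def kv_fits_llc(seqlen_k, head_dim, num_heads, dtype_bytes=2,
                llc_bytes=96 * 1024 * 1024):
    """Rough check of whether K + V for one batch fit in the 96 MB
    Infinity Cache (illustrative only; not the kernel's real heuristic)."""
    kv_bytes = 2 * seqlen_k * head_dim * num_heads * dtype_bytes  # K and V
    return kv_bytes <= llc_bytes

# Example: fp16, 32 heads, head_dim 128 -> K+V for 4096 tokens is 64 MiB,
# which fits; at 8192 tokens (128 MiB) it no longer does, and the kernel
# would fall back to streaming from DRAM.
```

When the working set fits, repeated K/V reads across query blocks are served from the LLC instead of DRAM, which is where the memory-bound speedup comes from.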

@tianwyan tianwyan requested review from a team and micmelesse February 5, 2026 08:00
…he_aware.py

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
