Feat: Trtllm-gen MxFP8 MoE integration #2505
Merged
IwakuraRein merged 26 commits into flashinfer-ai:main from IwakuraRein:siyuanf/mxfp8-trtllm-integration on Feb 17, 2026
+605 −161
Commits
557db0a  wip: not compiles yet (nekorobov)
45cdb86  fix: compiles, but hangs in autotuning (nekorobov)
d8c15b4  banned splitK and tileN 256, unit test works (nekorobov)
8a7a269  Merge remote-tracking branch 'origin/main' into nkorobov/mxfp8-trtllm… (IwakuraRein)
77c49a7  upd (IwakuraRein)
3e1a29f  add mxfp8 bench (IwakuraRein)
b12c461  fix test (IwakuraRein)
46eddfa  upd comments (IwakuraRein)
b046320  drop tile==8 and use unroll loop 2x (IwakuraRein)
acf0c39  fix test (IwakuraRein)
2702ee2  WAR: drop all UnrollLoop2xForMma kernels (IwakuraRein)
1dc688d  Merge remote-tracking branch 'origin/main' into siyuanf/mxfp8-trtllm-… (IwakuraRein)
4e83b82  address comment (IwakuraRein)
aae1719  fix unit test (IwakuraRein)
73d7594  fix hang and segfault (nekorobov)
4354ec4  use permute cache in unit test (WIP) (IwakuraRein)
0944312  use permute cache in unit test (WIP) (IwakuraRein)
aa85e94  Revert "use permute cache in unit test (WIP)" (IwakuraRein)
a7ebf1e  Merge remote-tracking branch 'origin/main' into siyuanf/mxfp8-trtllm-… (IwakuraRein)
4815a0c  address comments (IwakuraRein)
e18d73c  intermediate_size_factor (IwakuraRein)
b9f198d  Merge remote-tracking branch 'origin/main' into siyuanf/mxfp8-trtllm-… (IwakuraRein)
c310276  address comments (IwakuraRein)
33acaa2  quick fix (IwakuraRein)
03cac02  fix intermediate_size_factor initialization (IwakuraRein)
19417d1  allow split k (IwakuraRein)