Commit 0c4cfc3
feat: Add native FP8 model support with scale_inv dequantization
Add support for FP8-quantized checkpoints such as Qwen3-FP8.
FP8 weights that ship with per-block scale factors (scale_inv) can now be loaded and run.
Changes:
bumblebee.ex:
- Add :preserve_source_types option to load_model/2 to keep FP8 types
pytorch_params.ex:
- Pass preserve_source_types through param loading pipeline
- Modify ensure_type/3 to preserve FP8 types when option is set
layers.ex:
- Add fp8_aware_dense/3 layer that handles FP8-quantized weights
- Implement block-wise dequantization using the scale_inv parameter
- Fall back automatically to identity scaling for non-FP8 models
layers/transformer.ex:
- Add :attention_dense option to blocks/2, block/2, multi_head_attention/4
- Allow a custom dense function for the Q, K, V, and output projections
text/qwen3.ex:
- Update decoder to use fp8_aware_dense for attention via attention_dense
- Update gated_ffn to use fp8_aware_dense for FFN layers
- Add scale_inv to params_mapping for all attention and FFN layers
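The block-wise dequantization described for fp8_aware_dense can be illustrated with a small, language-neutral sketch. It is plain Python, not the Elixir code from this commit, and it rests on assumptions: the function name, the tile size, and the convention that each weight is multiplied by the scale_inv entry of the tile it falls in are all chosen for illustration. The weights are shown as ordinary floats standing in for already-decoded FP8 values.

```python
def dequantize_blockwise(weight, scale_inv, block_size):
    """Multiply each weight element by the scale of the tile it belongs to.

    weight: list of rows (floats standing in for decoded FP8 values).
    scale_inv: one scale per (block_size x block_size) tile (assumed layout).
    """
    return [
        [
            value * scale_inv[i // block_size][j // block_size]
            for j, value in enumerate(row)
        ]
        for i, row in enumerate(weight)
    ]


# A 4x4 weight split into 2x2 tiles, so scale_inv is 2x2.
weight = [
    [1.0, 2.0, 3.0, 4.0],
    [1.0, 2.0, 3.0, 4.0],
    [1.0, 2.0, 3.0, 4.0],
    [1.0, 2.0, 3.0, 4.0],
]
scale_inv = [
    [0.5, 2.0],
    [1.0, 4.0],
]
dequantized = dequantize_blockwise(weight, scale_inv, block_size=2)

# Identity scaling: a single scale of 1.0 covering the whole matrix
# leaves the weights unchanged, which mirrors the non-FP8 fallback.
unchanged = dequantize_blockwise(weight, [[1.0]], block_size=4)
```

With a scale of 1.0 the same code path serves unquantized models, which is one plausible reason to fall back to identity scaling rather than branch on the weight type.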
The implementation supports two loading modes:
- Pre-dequantization: convert FP8 -> F32 before loading
- Native FP8: load FP8 weights directly and apply scale_inv at runtime
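The two modes above should produce the same layer output and differ only in when the scale is applied. The following plain-Python sketch illustrates that under stated assumptions: the helper names are hypothetical, the weights are floats standing in for FP8 values, and a single scale covers the whole 2x2 matrix. It is not the Elixir implementation from this commit.

```python
def dequantize(weight, scale_inv, block_size):
    """Scale every element by the scale_inv entry of its tile (assumed layout)."""
    return [
        [v * scale_inv[i // block_size][j // block_size] for j, v in enumerate(row)]
        for i, row in enumerate(weight)
    ]


def dense(x, weight):
    """Bias-free dense layer for one input vector: y[j] = sum_i x[i] * weight[i][j]."""
    return [
        sum(x[i] * weight[i][j] for i in range(len(x)))
        for j in range(len(weight[0]))
    ]


def dense_native(x, weight, scale_inv, block_size):
    """Keep the quantized weight and apply scale_inv inside the forward pass."""
    return dense(x, dequantize(weight, scale_inv, block_size))


quantized = [[1.0, 2.0], [3.0, 4.0]]
scale_inv = [[0.5]]
x = [1.0, 1.0]

# Mode 1: pre-dequantization. Convert once, then store and use full-precision weights.
full_precision = dequantize(quantized, scale_inv, block_size=2)
y_pre = dense(x, full_precision)

# Mode 2: native. Store the quantized weight plus scale_inv and scale at runtime.
y_native = dense_native(x, quantized, scale_inv, block_size=2)
```

The trade-off is memory against per-call work: an F32 weight takes four bytes per element where FP8 takes one, while the native path repeats the scaling on every forward pass.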
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
1 parent bbd4d83 commit 0c4cfc3
File tree
5 files changed: +257 -27 lines changed
- lib
  - bumblebee
    - conversion
    - layers
    - text