Commit ff76165

Metal backend: port quantized GEMM from MLX (#17362)
Optimized the 4-bit linear op in the Metal backend for the matrix-matrix case (M > 1) by porting the quantized GEMM (QMM) shaders from MLX.
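
For readers unfamiliar with the operation, here is a minimal pure-Python reference of what a 4-bit quantized linear computes in the matrix-matrix case. It is a sketch under stated assumptions, not the shader code from this commit: the group-wise affine scheme (`w = scale * q + bias`), the two-nibbles-per-byte packing, and the names `pack_int4`, `unpack_int4`, and `quantized_matmul` are all hypothetical and chosen only for illustration.

```python
# Illustrative reference only. The packing layout, group size, and function
# names are assumptions; they are not taken from the commit or from MLX.

def pack_int4(values):
    """Pack 4-bit integers (0..15) two per byte, low nibble first."""
    assert len(values) % 2 == 0
    return [(values[i] & 0xF) | ((values[i + 1] & 0xF) << 4)
            for i in range(0, len(values), 2)]


def unpack_int4(packed):
    """Inverse of pack_int4: expand each byte into two 4-bit integers."""
    out = []
    for byte in packed:
        out.append(byte & 0xF)
        out.append(byte >> 4)
    return out


def quantized_matmul(x, packed_w, scales, biases, group_size):
    """Compute y = x @ W^T with W stored as packed 4-bit values.

    x:        M rows of K floats (M > 1 is the matrix-matrix case)
    packed_w: N rows of K/2 packed bytes
    scales, biases: N rows of K/group_size floats (one pair per group)
    """
    y = []
    for row in x:
        out_row = []
        for n, packed_row in enumerate(packed_w):
            acc = 0.0
            for k, q in enumerate(unpack_int4(packed_row)):
                g = k // group_size
                # Dequantize on the fly, then accumulate the dot product.
                acc += row[k] * (scales[n][g] * q + biases[n][g])
            out_row.append(acc)
        y.append(out_row)
    return y


# Small example: M = 2, K = 4, N = 2, group_size = 2.
x = [[1.0, 2.0, 3.0, 4.0], [0.5, 0.5, 0.5, 0.5]]
packed_w = [pack_int4([1, 2, 3, 4]), pack_int4([15, 0, 7, 8])]
scales = [[0.5, 1.0], [0.25, 2.0]]
biases = [[0.0, -1.0], [-1.0, 0.0]]
print(quantized_matmul(x, packed_w, scales, biases, group_size=2))
# prints [[20.5, 106.75], [3.25, 15.875]]
```

With M = 1 each dequantized weight is used once, so the work reduces to a matrix-vector product. With M > 1 the same dequantized weights are reused across every row of `x`, which is the reuse a dedicated GEMM kernel can exploit.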
1 parent fd0adb6 commit ff76165

1 file changed: 808 additions, 246 deletions

