GitHub - sueszli/llvm-to-air: the missing link to Apple GPUs

▌  ▌ ▌ ▌▙▗▌  ▐       ▞▀▖▜▘▛▀▖
▌  ▌ ▚▗▘▌▘▌  ▜▀ ▞▀▖  ▙▄▌▐ ▙▄▘
▌  ▌ ▝▞ ▌ ▌  ▐ ▖▌ ▌  ▌ ▌▐ ▌▚ 
▀▀▘▀▀▘▘ ▘ ▘   ▀ ▝▀   ▘ ▘▀▘▘ ▘

// Reverse-engineered compiler stack for Apple Silicon GPUs
//
// Coming soon to xDSL:
// https://github.com/xdslproject/xdsl/blob/main/xdsl/backend/mps/__init__.py

MLIR is the right abstraction for portable performance across heterogeneous
hardware, but it has no Apple Silicon GPU backend [^1]. This is a problem
because (1) Apple Silicon market share is growing fast, (2) most of these
machines have powerful GPUs sitting idle and (3) the world needs more compute
for everything from protein folding to ML training.

Mojo proved that targeting Apple GPUs via MLIR->LLVM->AIR->MetalLib works, but
their implementation is closed source [^2].

This project reverse engineers that missing piece and provides an open source
implementation of the LLVM IR to AIR lowering pass.

  +----------------+      +----------------+      +----------------+
  |    Frontend    |----->|  MLIR Dialect  |----->|  LLVM Bitcode  |
  |                |      |                |      | (Open Source)  |
  +----------------+      +----------------+      +----------------+
                                                           |
                                                 [ src/llvm_to_air.py ]
                                                           |
  +----------------+      +----------------+      +--------v-------+
  |   Apple GPU    |<-----|    Metallib    |<-----|   AIR Bitcode  |
  |   (M-Series)   |      |    (Binary)    |      |  (Proprietary) |
  +----------------+      +----------------+      +----------------+

The core contribution is `src/llvm_to_air.py`, which takes LLVM IR and lowers it
to Apple's intermediate representation, AIR. This enables a full compilation
pipeline from high-level MLIR dialects down to executable code on Apple Silicon
GPUs. I used xDSL to write the entire compiler stack in Python, which keeps it
accessible and hackable.

Fair warning: this is experimental and brittle. AIR is closed source and
undocumented, so everything here is reverse engineered. But it works, and to
my knowledge this is the first open source end-to-end stack for Apple Silicon
GPUs.

--------------------------------------------------------------------------------
Performance
--------------------------------------------------------------------------------

The mandelbrot benchmark shows a 1150x speedup 🔥 over the vanilla Python
implementation ("plain" in the results below).

$ uv run demo_mandelbrot.py

    mandelbrot benchmark (1,048,576 pixels)
                                                                                                        
    results (avg latency ms):
    gpu            : 2.47 ms
    numba          : 188.56 ms
    numpy          : 1519.57 ms
    numpy+numba    : 1820.99 ms
    plain          : 2840.38 ms

    relative to vanilla python:
    gpu            : 1150.23x faster
    numba          : 15.06x faster
    numpy          : 1.87x faster
    numpy+numba    : 1.56x faster
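
The relative figures can be re-derived from the average latencies above. The
following is a minimal sketch with the numbers copied from the output; the
dictionary and function names are illustrative and not part of the repository.
Because the printed latencies are rounded to two decimals, the recomputed GPU
ratio lands close to 1150x rather than exactly on the reported 1150.23x.

```python
# Average latencies in ms, copied from the benchmark output above.
latencies_ms = {
    "gpu": 2.47,
    "numba": 188.56,
    "numpy": 1519.57,
    "numpy+numba": 1820.99,
    "plain": 2840.38,
}


def speedups(latencies, baseline="plain"):
    """Return how many times faster each entry is than the baseline."""
    base = latencies[baseline]
    return {name: base / ms for name, ms in latencies.items() if name != baseline}


for name, factor in speedups(latencies_ms).items():
    print(f"{name:<15}: {factor:.2f}x faster")
```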

--------------------------------------------------------------------------------
Lisp Frontend
--------------------------------------------------------------------------------

There's also a tiny Common Lisp subset as a frontend.

$ uv run demo_linalg.py

    (print
        (add
            (matmul
                (tensor (2 3) (-1.0 2.0 -3.0 4.0 -5.0 6.0))
                (tensor (3 2) (7.0 8.0 9.0 10.0 11.0 12.0))
            )
            (tensor (2 2) (100.0 100.0 100.0 100.0))
        )
    )


    Tensor(2 x 2):
            78.000000 76.000000
            149.000000 154.000000
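
The printed tensor can be checked by hand. The following plain-Python sketch
performs the same computation on the CPU, assuming the flat value lists are in
row-major order (which is consistent with the output above). The helper
functions are hypothetical stand-ins that mirror the Lisp operators; they do
not use the project's frontend or the GPU.

```python
def tensor(shape, values):
    """Build a row-major nested list from a flat list of values."""
    rows, cols = shape
    assert len(values) == rows * cols
    return [list(values[r * cols:(r + 1) * cols]) for r in range(rows)]


def matmul(a, b):
    """Multiply two nested-list matrices (rows of a times columns of b)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def add(a, b):
    """Element-wise sum of two matrices with the same shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


result = add(
    matmul(
        tensor((2, 3), [-1.0, 2.0, -3.0, 4.0, -5.0, 6.0]),
        tensor((3, 2), [7.0, 8.0, 9.0, 10.0, 11.0, 12.0]),
    ),
    tensor((2, 2), [100.0, 100.0, 100.0, 100.0]),
)

for row in result:
    print(" ".join(f"{v:.6f}" for v in row))
```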

--------------------------------------------------------------------------------
References
--------------------------------------------------------------------------------

[^1] MLIR: https://discourse.llvm.org/t/rfc-mps-dialect-in-mlir/77102
[^2] Mojo: https://forum.modular.com/t/apple-silicon-gpu-support-in-mojo/2295
