Add ROCm GPU backend support for stable-diffusion.cpp #980

Open

makn87amd wants to merge 17 commits into main from rocm_gpu_support

Conversation

@makn87amd commented Jan 29, 2026

This PR adds ROCm support to sd.cpp.

Only one GPU configuration (stx-halo) has been tested so far; help is needed testing other GPU configurations.

- Add sdcpp_backend option with --sdcpp CLI flag and LEMONADE_SDCPP env var
- Support 'cpu' (default) and 'rocm' backends for sd.cpp
- Download ROCm-enabled sd.cpp binaries for AMD GPU acceleration
- Add backend-specific versioning in backend_versions.json
- Set PATH for ROCm DLLs on Windows for HIP runtime loading
- Add api_image_gen_rocm.py example demonstrating ROCm image generation
- Fix example script to use correct API parameters (model_name, model)
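The backend selection described above can be sketched as follows. This is a minimal, hypothetical Python illustration: only the option name, the `LEMONADE_SDCPP` env var, and the allowed values (`cpu`, `rocm`) come from the PR; the function name and the precedence order (CLI flag over env var over default) are assumptions.

```python
import os

# Hypothetical helper illustrating sdcpp_backend resolution; only the
# env var name (LEMONADE_SDCPP) and allowed values come from the PR.
def resolve_sdcpp_backend(cli_value=None):
    allowed = ("cpu", "rocm")
    # Assumed precedence: --sdcpp flag, then LEMONADE_SDCPP, then the 'cpu' default.
    backend = cli_value or os.environ.get("LEMONADE_SDCPP") or "cpu"
    if backend not in allowed:
        raise ValueError(f"sdcpp_backend must be one of {allowed}, got {backend!r}")
    return backend
```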
@makn87amd (Author)

The example needs to be pruned. It was auto-generated and has a ton of helper functions.

{"envname", "LEMONADE_LLAMACPP_ARGS"},
{"help", "Custom arguments to pass to llama-server (must not conflict with managed args)"}
}},
// ADDED: sd.cpp backend selection option
Collaborator:

pretty noisy/pointless comment, no?

{"option_name", "sdcpp_backend"},
{"type_name", "BACKEND"},
{"allowed_values", {"cpu", "rocm"}},
{"envname", "LEMONADE_SDCPP"},
Collaborator:

please also add this new environment variable to data/lemonade.conf

@makn87amd makn87amd marked this pull request as ready for review February 4, 2026 18:44
@makn87amd makn87amd marked this pull request as draft February 5, 2026 23:55
@makn87amd makn87amd marked this pull request as ready for review February 6, 2026 02:30
@jeremyfowers jeremyfowers mentioned this pull request Feb 6, 2026
@ramkrishna2910 (Contributor) left a comment

Added some minor comments, looks great overall!

Comment on lines 189 to 197
#ifdef _WIN32
filename = "sd-" + short_version + "-bin-win-rocm-x64.zip";
#elif defined(__linux__)
filename = "sd-" + short_version + "-bin-linux-rocm-x64.zip";
#else
throw std::runtime_error("ROCm sd.cpp only supported on Windows and Linux");
#endif
std::cout << "[SDServer] Using ROCm GPU backend" << std::endl;
} else {
Contributor:

The indentation feels a little off here. Shouldn't this be inside the backend_ == "rocm" branch?

Author:

Yes, you're right. Somewhere in a source file a diff showed up with the indentation you mention, so I thought this was your preference.

I was probably comparing it to nonsense. Let me fix it.
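For reference, the intended nesting can be sketched in Python. The function name and `SHORT_VERSION` value are hypothetical stand-ins; the archive filenames and error message come from the diff above, and the cpu branch is left out because the diff truncates at the `else`.

```python
import sys

SHORT_VERSION = "vX.Y.Z"  # stand-in; the real value comes from backend versioning

def sdcpp_archive_name(backend, platform=sys.platform):
    # The whole platform #ifdef chain should sit inside the rocm branch.
    if backend == "rocm":
        if platform == "win32":
            return f"sd-{SHORT_VERSION}-bin-win-rocm-x64.zip"
        if platform.startswith("linux"):
            return f"sd-{SHORT_VERSION}-bin-linux-rocm-x64.zip"
        raise RuntimeError("ROCm sd.cpp only supported on Windows and Linux")
    # cpu branch elided here, matching the truncated diff
    raise NotImplementedError("cpu filename not shown in this diff")
```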

// stable-diffusion.cpp - Windows/Linux x86_64
{"sd-cpp", "default", {"windows", "linux"}, {
{"sd-cpp", "rocm", {"windows", "linux"}, {
{"amd_igpu", {"gfx1150", "gfx1151"}},
Contributor:

gfx1150 on stx did not work for me, and I don't think it's supported by ROCm.

Author:

Yes, you're right. Thomas was able to get this to run, but he had to disable his IGPU.

Author:

Should I just take out "linux" support for rocm? We had discussed this. The change would be:

{"sd-cpp", "rocm", {"windows"}, {
    {"amd_igpu", {"gfx1151"}},                     
    {"amd_dgpu", {"gfx110X", "gfx120X"}},                      
}},

Collaborator:

what goes wrong?

Contributor:

Yeah, just remove linux from the list.

@superm1 stablediffusion.cpp doesn't publish Linux ROCm binaries on release, only Windows. We'll need to work with them to get them building for Linux too. I don't suppose that's something you'd be interested in...?

Collaborator:

> @superm1 stablediffusion.cpp doesn't publish Linux ROCm binaries on release, only Windows. We'll need to work with them to get them building for Linux too. I don't suppose that's something you'd be interested in...?

leejet/stable-diffusion.cpp#1258

Collaborator:

These binaries support more architectures than the Windows HIP builds do. Here's what I added:

gfx1151
gfx1150
gfx1100
gfx1101
gfx1102
gfx1200
gfx1201

@@ -0,0 +1,66 @@
#!/usr/bin/env python3
Contributor:

Do we need a new example file, or can we use the existing example and add a command-line parameter?

Author:

Good point!

@jeremyfowers (Contributor)

Not only does this work on my Radeon 9070 XT, it is SO FREAKING FAST. Love it!

RDNA 4 is supported :)
