fix: Improve efficiency and apply overall fixes to the role processes #22

Merged
spetrosi merged 5 commits into linux-system-roles:main from spetrosi:rm-extra-info-readme on Oct 2, 2025

@spetrosi spetrosi commented Oct 2, 2025

Improvements after QE tests

@spetrosi spetrosi requested a review from richm as a code owner October 2, 2025 07:52

sourcery-ai bot commented Oct 2, 2025

Reviewer's Guide

Refreshed the README to provide clearer guidance on installing and using the system OpenMPI package and ensure consistent phrasing for loading environment modules.

File-Level Changes

Change: Clarify purpose and installation advice for system OpenMPI
Details:
  • Replaced the brief utility description with guidance on non-CUDA and non-GPU use cases
  • Added a note on safe co-existence with other OpenMPI installations
  • Enhanced introductory sentence for module usage
Files: README.md

Change: Standardize lmod command phrasing
Details:
  • Changed “for this openmpi” to “to select this openmpi”
  • Applied the wording update in both the system OpenMPI and OpenMPI v5.x sections
Files: README.md

@sourcery-ai sourcery-ai bot left a comment

Hey there - I've reviewed your changes and they look great!


@spetrosi spetrosi changed the title from "docs: Update info in readme per Dave's suggestions" to "fix: Improve efficiency and apply overall fixes to the role processes" Oct 2, 2025
* Install kernel-headers and -devel for all installed kernels
* Restart dkms after installing nvidia drivers
openmpi needs to be configured with the rebuilt, PMIx-aware UCX
library, not the default one shipped in HPCX.

Make sure the lmod for the library has the correct library paths set
up, too.
Testing of the openmpi-5.0.8 library resulted in this warning being
emitted:

[tmp-chhhgt6r-gpu-1:25690] SET UCX_TLS=tcp
[LOG_CAT_SBGP] libnuma.so: cannot open shared object file: No such file
or directory
[LOG_CAT_SBGP] Failed to dlopen libnuma.so. Fallback to GROUP_BY_SOCKET
manual.
[LOG_CAT_SBGP] libnuma.so: cannot open shared object file: No such file
or directory
[LOG_CAT_SBGP] Failed to dlopen libnuma.so. Fallback to GROUP_BY_SOCKET
manual.

The lib64/libnuma.so link is provided by the numactl-devel package,
not numactl-libs. Add the -devel package to the dependency list for
early install.
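
The failure described above can be reproduced outside of MPI. The following is a minimal sketch, not part of the role: it asks the dynamic loader to open a library by its exact name, the same way the `dlopen` call in the warning does. The helper name `can_dlopen` is hypothetical, and the versioned name `libnuma.so.1` is an assumption about what numactl-libs installs.

```python
import ctypes


def can_dlopen(library_name: str) -> bool:
    """Return True if the dynamic loader can open library_name by that exact name."""
    try:
        # ctypes.CDLL calls dlopen() on POSIX systems and raises OSError on failure.
        ctypes.CDLL(library_name)
    except OSError:
        return False
    return True


if __name__ == "__main__":
    # "libnuma.so" is the unversioned name the warning reports as missing.
    # "libnuma.so.1" is an assumed versioned name, checked only for comparison.
    for name in ("libnuma.so", "libnuma.so.1"):
        status = "found" if can_dlopen(name) else "missing"
        print(f"{name}: {status}")
```

On a host where only the runtime library package is present, one would expect the unversioned name to be reported as missing until the -devel package is installed.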
@spetrosi spetrosi force-pushed the rm-extra-info-readme branch from b29de92 to d0b0f4a Compare October 2, 2025 11:17
Rename the openmpi-5.0.8 module to indicate that it only
supports CUDA-based applications on hardware with NVIDIA GPUs.

Add new wrapper scripts for the NVIDIA HPCX openmpi libraries.
There are two versions, hpcx and hpcx-pmix, which point to
the original library and the one rebuilt with PMIx, respectively.
The wrappers declare conflicts with the other MPI libraries and
also allow us to remove the /etc/lmod/.modulespath file that
points to the base HPCX module files.
@spetrosi spetrosi force-pushed the rm-extra-info-readme branch from d0b0f4a to 296b704 Compare October 2, 2025 11:46
@spetrosi spetrosi merged commit 6d52009 into linux-system-roles:main Oct 2, 2025
21 checks passed