feat: Install system openmpi for usecases that don't need GPUs#19
feat: Install system openmpi for usecases that don't need GPUs#19spetrosi merged 16 commits intolinux-system-roles:mainfrom
Conversation
* Install mpitest-opnmpi and lmod anyway * Add documentation about modules * Add variables hpc_install_system_openmpi and hpc_build_openmpi * Use HPC-X wording for consisstency * Use 0644 where executing perms is not required * Apply sunrpc configuration via template * Add tests for tuning
Reviewer's GuideThis PR enhances the HPC role by installing a system-provided OpenMPI alongside the GPU-enabled build, refactors build flags, adds firewall configuration support, standardizes file modes and handler ordering, and enriches testing for ulimits and sysctl tunings. Class diagram for OpenMPI installation optionsclassDiagram
class HPC_Role {
+hpc_install_system_openmpi: bool
+hpc_build_openmpi: bool
+hpc_manage_firewall: bool
+hpc_manage_storage: bool
+hpc_tuning: bool
}
class OpenMPI {
+system_openmpi
+gpu_enabled_openmpi
}
HPC_Role --> OpenMPI: installs
OpenMPI : system_openmpi from AppStream
OpenMPI : gpu_enabled_openmpi built from source
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Hey there - I've reviewed your changes - here's some feedback:
- The firewall task is including the storage role instead of a firewall role and even has a typo in its name; please correct it to include the proper firewall role and fix the 'firewll' typo.
- Variable names for building OpenMPI are inconsistent—defaults and README use hpc_build_openmpi while tasks use hpc_build_nvidia_openmpi—please unify them across the role.
- You've changed several Lua module files and templates to 0755 (executable) where 0644 would likely suffice; please review and tighten permissions to avoid unnecessary executable bits.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The firewall task is including the storage role instead of a firewall role and even has a typo in its name; please correct it to include the proper firewall role and fix the 'firewll' typo.
- Variable names for building OpenMPI are inconsistent—defaults and README use hpc_build_openmpi while tasks use hpc_build_nvidia_openmpi—please unify them across the role.
- You've changed several Lua module files and templates to 0755 (executable) where 0644 would likely suffice; please review and tighten permissions to avoid unnecessary executable bits.
## Individual Comments
### Comment 1
<location> `tasks/main.yml:45` </location>
<code_context>
- "{{ __hpc_nvidia_cuda_repo }}"
- "{{ __hpc_microsoft_prod_repo }}"
+- name: Configure firewll to use trusted zone as default
+ when: hpc_manage_firewall
+ include_role:
</code_context>
<issue_to_address>
**nitpick (typo):** Typo in task name: 'firewll' should be 'firewall'.
Please update the task name to 'firewall' for consistency.
```suggestion
- name: Configure firewall to use trusted zone as default
```
</issue_to_address>
### Comment 2
<location> `tasks/main.yml:342` </location>
<code_context>
owner: root
group: root
- mode: '0644'
+ mode: '0755'
- name: Install GDRCopy packages
</code_context>
<issue_to_address>
**question:** Changed file mode from '0644' to '0755' for module files; review necessity of executable permissions.
Executable permissions are unnecessary for these module files; '0644' is standard unless a specific need for '0755' exists.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| owner: root | ||
| group: root | ||
| mode: '0644' | ||
| mode: '0755' |
There was a problem hiding this comment.
question: Changed file mode from '0644' to '0755' for module files; review necessity of executable permissions.
Executable permissions are unnecessary for these module files; '0644' is standard unless a specific need for '0755' exists.
ec5720b to
cf7a403
Compare
981a02f to
bcff17f
Compare
bcff17f to
a528c4e
Compare
ce479c7 to
2fda79e
Compare
2fda79e to
e010790
Compare
| or __hpc_versionlock_content.stdout is not search(item + "-[0-9]") | ||
| loop: "{{ __hpc_versionlock_rpms }}" | ||
|
|
||
| - name: Update all packages to bring system to the latest state |
There was a problem hiding this comment.
this needs to happen at the very beginning of the process, before you do any version lock or any installation of anything related to HPC.
1619576 to
147ef03
Compare
147ef03 to
3bd5fd4
Compare
|
[citest] |
9082c57 to
3e5f2a0
Compare
|
[citest] |
Enhancement: Install system openmpi for usecases that don't need GPUs
Reason: OpenMPI that the role builds with NVidia GPU support doesn't work on systems that do not have GPU. Simply do not fall back to CPU.
Result: The role installs both openmpi alongside, users can select the one to use with environment modules.
Authored with assistance from Cursor AI.
Summary by Sourcery
Introduce a dual-path OpenMPI installation supporting both OS packages and NVIDIA-enabled builds, add optional firewall trusted zone configuration, unify configuration templates, update file modes, and enhance documentation and tests.
New Features:
Enhancements:
Documentation:
Tests: