(Following up on an email thread)
It looks like the page Running your AI training jobs on Satori using Slurm
contains some incorrect info on GPUs and exclusivity. I'm guessing this might be left over from a time when GPUs were exposed to jobs differently? E.g.:
getting-started/satori-workload-manager-using-slurm.rst
Lines 33 to 40 in 940cdd6
| exclusive. That means that unless you ask otherwise, the GPUs on the node(s)
| you are assigned may already be in use by another user. That means if you
| request a node with 2GPU's the 2 other GPUs on that node may be engaged by
| another job. This allows us to more efficently allocate all of the GPU
| resources. This may require some additional checking to make sure you can
| uniquely use all of the GPU's on a machine. If you're in doubt, you can request
| the node to be 'exclusive' . See below on how to request exclusive access in
| an interactive and batch situation.
I don't think any additional checking is required, nor is it necessary to request exclusive use of the node. My understanding of the current behavior (per @adamdrucker) is that a job gets exclusive use of any GPUs it requests via the --gres flag, and in my experience any additional, unallocated GPUs are simply not exposed to the job at all.
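For what it's worth, this is easy to check from an interactive job. The commands below are just a sketch assuming a 4-GPU AC922 node as in the docs; the exact GPU indices and the fact that Slurm sets CUDA_VISIBLE_DEVICES depend on the site's GRES/cgroup configuration:

```bash
# Sketch: request 2 of a node's 4 GPUs and inspect what the job actually sees.
srun --gres=gpu:2 -N 1 --time 0:30:00 --pty /bin/bash

# Inside the allocation, only the granted GPUs should be visible:
nvidia-smi -L                 # should list 2 GPUs, not 4
echo $CUDA_VISIBLE_DEVICES    # e.g. "0,1" -- the unallocated GPUs are hidden from the job
```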
getting-started/satori-workload-manager-using-slurm.rst
Lines 65 to 78 in 940cdd6
| srun --gres=gpu:4 -N 1 --mem=1T --time 1:00:00 -I --pty /bin/bash
| This will request an AC922 node with 4x GPUs from the Satori (normal
| queue) for 1 hour.
| If you need to make sure no one else can allocate the unused GPU's on the machine you can use
| .. code:: bash
| srun --gres=gpu:4 -N 1 --exclusive --mem=1T --time 1:00:00 -I --pty /bin/bash
| this will request exclusive use of an interactive node with 4GPU's
I believe the first command above is sufficient to ensure that nobody else can allocate the four GPUs on the node, right?
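A minimal batch equivalent without --exclusive might look like the sketch below; only the #SBATCH lines mirror the docs' example, and the workload line is a placeholder:

```bash
#!/bin/bash
#SBATCH -N 1
#SBATCH --gres=gpu:4        # all 4 GPUs on the node -- the allocation itself reserves them
#SBATCH --mem=1T
#SBATCH --time=1:00:00
# No --exclusive needed just to keep other jobs off these GPUs;
# once they are allocated here, no other job can be granted them.

python train.py             # placeholder workload
```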
getting-started/satori-workload-manager-using-slurm.rst
Lines 178 to 179 in 940cdd6
| - line 13: ``--exclusive`` means that you want full use of the GPUS on the nodes you are reserving. Leaving this out allows
| the GPU resources you're not using on the node to be shared.
Again, my understanding is that this isn't necessary and may even be detrimental: requesting all GPUs on a node is sufficient to give the job exclusive use of them, and omitting the --exclusive flag unless it's really needed (e.g. you need all resources available on a node, not just all of its GPUs) gives the scheduler more flexibility to combine GPU-heavy, CPU-light jobs with jobs that need only CPU cores.
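To illustrate the scheduling benefit (a hypothetical pairing, not something from the docs; script names and core/memory counts are made up): a job that takes all four GPUs but only a few cores can coexist on the same node with a CPU-only job, as long as neither asks for --exclusive:

```bash
# GPU-heavy, CPU-light job: all 4 GPUs, a handful of cores, no --exclusive
sbatch --gres=gpu:4 -N 1 -c 8 --time=1:00:00 train_gpu.sh

# CPU-only job that the scheduler could place on the same node's remaining cores
sbatch -N 1 -n 1 -c 32 --mem=128G --time=1:00:00 preprocess_cpu.sh
```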
Don't have the bandwidth to open a PR at the moment, but hope the above helps! (And please let me know if I misunderstood any of this...)