feat: add pass-device-specs flag for volcano-vgpu-device-plugin #104

Open
vcr1311 wants to merge 1 commit into Project-HAMi:main from vcr1311:vikash_rai_dev

@vcr1311 vcr1311 commented Feb 6, 2026

This commit adds support for the --pass-device-specs flag, enabling volcano-vgpu-device-plugin to work with standard OCI container runtimes (containerd, docker) that do not have nvidia-container-runtime installed.

When enabled, the device plugin explicitly passes GPU device file paths to kubelet via the DeviceSpec field in ContainerAllocateResponse. Kubelet then mounts these devices (/dev/nvidia*, /dev/nvidiactl, /dev/nvidia-uvm, etc.) directly into containers, bypassing the need for nvidia-container-runtime.
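As a rough illustration of the mechanism described above, the following self-contained Go sketch shows how an allocate response might carry device entries only when the flag is on. The `DeviceSpec` and `ContainerAllocateResponse` types here are trimmed local stand-ins that mirror the field names of the kubelet device plugin API (`ContainerPath`, `HostPath`, `Permissions`); `buildResponse` is a hypothetical helper, not code from this PR.

```go
package main

import "fmt"

// DeviceSpec is a local stand-in mirroring the kubelet device plugin
// API message of the same name (assumption: only these fields are needed here).
type DeviceSpec struct {
	ContainerPath string // path of the device inside the container
	HostPath      string // path of the device on the node
	Permissions   string // cgroup permissions, e.g. "rw"
}

// ContainerAllocateResponse is a trimmed stand-in for the real API type.
type ContainerAllocateResponse struct {
	Envs    map[string]string
	Devices []*DeviceSpec
}

// buildResponse (hypothetical) populates Devices only when passDeviceSpecs is
// true, so the disabled case leaves the response unchanged.
func buildResponse(passDeviceSpecs bool, devicePaths []string) *ContainerAllocateResponse {
	resp := &ContainerAllocateResponse{Envs: map[string]string{}}
	if !passDeviceSpecs {
		return resp
	}
	for _, p := range devicePaths {
		resp.Devices = append(resp.Devices, &DeviceSpec{
			ContainerPath: p,
			HostPath:      p,
			Permissions:   "rw",
		})
	}
	return resp
}

func main() {
	resp := buildResponse(true, []string{"/dev/nvidia0", "/dev/nvidiactl", "/dev/nvidia-uvm"})
	for _, d := range resp.Devices {
		fmt.Printf("%s -> %s (%s)\n", d.HostPath, d.ContainerPath, d.Permissions)
	}
}
```

Mapping each host path to the same container path is an assumption made for simplicity; the actual PR may choose paths and permissions differently.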

Changes:

  • Add --pass-device-specs command-line flag (default: false)
  • Add PassDeviceSpecs configuration variable in config package
  • Implement GetDeviceSpecs() helper method to build device specifications
  • Modify Allocate() method to populate response.Devices when flag is enabled
  • Support both regular GPU and MIG device paths
  • Handle control devices (/dev/nvidiactl, etc.) with existence checks
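The helper and the existence checks listed above could look roughly like the sketch below. This is an assumption-laden illustration, not the PR's `GetDeviceSpecs()`: the function name `deviceSpecsFor`, the `exists` callback, and the exact contents of `controlDevices` are hypothetical, and the GPU or MIG device paths are taken as input rather than discovered.

```go
package main

import (
	"fmt"
	"os"
)

// DeviceSpec is a local stand-in for the kubelet device plugin API type.
type DeviceSpec struct {
	ContainerPath string
	HostPath      string
	Permissions   string
}

// controlDevices lists control nodes named in the PR description
// (assumption: the real list in the PR may be longer).
var controlDevices = []string{"/dev/nvidiactl", "/dev/nvidia-uvm"}

// fileExists reports whether a path can be stat'ed on the host.
func fileExists(path string) bool {
	_, err := os.Stat(path)
	return err == nil
}

// deviceSpecsFor (hypothetical) returns specs for the allocated GPU paths,
// followed by each control device that passes the existence check.
func deviceSpecsFor(gpuPaths []string, exists func(string) bool) []*DeviceSpec {
	var specs []*DeviceSpec
	add := func(p string) {
		specs = append(specs, &DeviceSpec{ContainerPath: p, HostPath: p, Permissions: "rw"})
	}
	for _, p := range gpuPaths {
		add(p)
	}
	for _, p := range controlDevices {
		if exists(p) {
			add(p)
		}
	}
	return specs
}

func main() {
	for _, s := range deviceSpecsFor([]string{"/dev/nvidia0"}, fileExists) {
		fmt.Println(s.HostPath)
	}
}
```

Passing the existence check in as a function keeps the helper testable on machines without NVIDIA devices; that is a design choice of this sketch only.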

This maintains full backward compatibility - when the flag is disabled (default), the plugin behaves exactly as before, relying on nvidia-container-runtime for device mounting.
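The default-off behaviour can be sketched with Go's standard `flag` package. The plugin may well use a different CLI library, so `parsePassDeviceSpecs` and the help text below are hypothetical; only the flag name and its `false` default come from the description above.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// parsePassDeviceSpecs (hypothetical) reads the --pass-device-specs flag.
// The flag defaults to false, so omitting it keeps the existing behaviour.
func parsePassDeviceSpecs(args []string) (bool, error) {
	fs := flag.NewFlagSet("volcano-vgpu-device-plugin", flag.ContinueOnError)
	pass := fs.Bool("pass-device-specs", false, "pass GPU device paths to kubelet via DeviceSpec")
	if err := fs.Parse(args); err != nil {
		return false, err
	}
	return *pass, nil
}

func main() {
	pass, err := parsePassDeviceSpecs(os.Args[1:])
	if err != nil {
		os.Exit(2)
	}
	fmt.Println("pass-device-specs:", pass)
}
```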

Signed-off-by: Vikash Rai <vikash.ingress@gmail.com>


hami-robot bot commented Feb 6, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: vcr1311
Once this PR has been reviewed and has the lgtm label, please assign archlitchi for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@hami-robot hami-robot bot requested a review from archlitchi February 6, 2026 03:41
@hami-robot hami-robot bot requested a review from SataQiu February 6, 2026 03:41

hami-robot bot commented Feb 6, 2026

Welcome @vcr1311! It looks like this is your first PR to Project-HAMi/volcano-vgpu-device-plugin 🎉

@hami-robot hami-robot bot added the size/L label Feb 6, 2026