Description
In a heterogeneous cluster (MIG and non-MIG GPU nodes), the Volcano vGPU device plugin ignores per-node settings from the volcano-vgpu-node-config ConfigMap because command-line flag defaults override them.
Environment:
Kubernetes: v1.28.15
Volcano: v1.13.0
Problem:
The defaults for --mig-strategy (none) and --device-split-count (2), defined in main.go, take precedence over the per-node values in the ConfigMap, so the per-node settings never take effect.
Observed Behavior:
Non-MIG node (gpu24042)
Config: "operatingmode": "hami-core", "devicesplitcount": 4
Result: volcano.sh/vgpu-number: "2" (the configured devicesplitcount of 4 is ignored)
allocatable:
  cpu: "4"
  ephemeral-storage: "58801084319"
  hugepages-2Mi: "0"
  memory: 16274732Ki
  pods: "110"
  volcano.sh/vgpu-cores: "100"
  volcano.sh/vgpu-memory: "12288"
  volcano.sh/vgpu-number: "2"
capacity:
  cpu: "4"
  ephemeral-storage: 63803260Ki
  hugepages-2Mi: "0"
  memory: 16377132Ki
  pods: "110"
  volcano.sh/vgpu-cores: "100"
  volcano.sh/vgpu-memory: "12288"
  volcano.sh/vgpu-number: "2"
MIG node (gracehopper)
Config: "operatingmode": "mig"
Result: MIG mode not activated; advertises zero resources.
Expected Behavior:
Non-MIG node honors devicesplitcount: 4.
For reference, the resources currently reported by the MIG node (gracehopper):
allocatable:
  cpu: "72"
  ephemeral-storage: "849546416770"
  hugepages-2Mi: "0"
  hugepages-16Gi: "0"
  hugepages-512Mi: "0"
  memory: 548096704Ki
  nvidia.com/gpu: "0"
  nvidia.com/mig-1g.12gb: "0"
  nvidia.com/mig-3g.48gb: "0"
  pods: "110"
  volcano.sh/vgpu-cores: "0"
  volcano.sh/vgpu-memory: "0"
  volcano.sh/vgpu-number: "0"
capacity:
  cpu: "72"
  ephemeral-storage: 921816860Ki
  hugepages-2Mi: "0"
  hugepages-16Gi: "0"
  hugepages-512Mi: "0"
  memory: 548199104Ki
  nvidia.com/gpu: "0"
  nvidia.com/mig-1g.12gb: "0"
  nvidia.com/mig-3g.48gb: "0"
  pods: "110"
  volcano.sh/vgpu-cores: "0"
  volcano.sh/vgpu-memory: "0"
  volcano.sh/vgpu-number: "4"
The relevant entries in the volcano-vgpu-node-config ConfigMap:
gpu24042: |
  {
    "nodeconfig": [
      {
        "name": "gpu24042",
        "operatingmode": "hami-core",
        "devicememoryscaling": 1,
        "devicesplitcount": 4,
        "migstrategy": "none"
      }
    ]
  }
gracehopper: |
  {
    "nodeconfig": [
      {
        "name": "gracehopper",
        "operatingmode": "mig",
        "migstrategy": "mixed"
      }
    ]
  }
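To make the shape of these entries concrete, here is a small Go sketch that parses one ConfigMap value and looks up the entry for a given node. The JSON keys are the ones shown above. The Go type names (`nodeEntry`, `nodeConfigFile`) and the `lookupNode` helper are hypothetical and are not taken from the plugin source.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// nodeEntry mirrors the JSON keys used in the ConfigMap entries.
// The Go type and field names are hypothetical.
type nodeEntry struct {
	Name                string  `json:"name"`
	OperatingMode       string  `json:"operatingmode"`
	DeviceMemoryScaling float64 `json:"devicememoryscaling"`
	DeviceSplitCount    uint    `json:"devicesplitcount"`
	MigStrategy         string  `json:"migstrategy"`
}

// nodeConfigFile is the top-level object holding the "nodeconfig" array.
type nodeConfigFile struct {
	NodeConfig []nodeEntry `json:"nodeconfig"`
}

// lookupNode parses one ConfigMap value and returns the entry
// whose name matches nodeName, or an error if there is none.
func lookupNode(raw []byte, nodeName string) (*nodeEntry, error) {
	var file nodeConfigFile
	if err := json.Unmarshal(raw, &file); err != nil {
		return nil, fmt.Errorf("parse node config: %w", err)
	}
	for i := range file.NodeConfig {
		if file.NodeConfig[i].Name == nodeName {
			return &file.NodeConfig[i], nil
		}
	}
	return nil, fmt.Errorf("no entry for node %q", nodeName)
}

// gracehopperJSON is the gracehopper value from the ConfigMap in this report.
const gracehopperJSON = `{
  "nodeconfig": [
    {
      "name": "gracehopper",
      "operatingmode": "mig",
      "migstrategy": "mixed"
    }
  ]
}`

func main() {
	entry, err := lookupNode([]byte(gracehopperJSON), "gracehopper")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s: mode=%s strategy=%s\n", entry.Name, entry.OperatingMode, entry.MigStrategy)
}
```

Keys missing from an entry, such as devicesplitcount for gracehopper, decode to Go zero values, so a caller can tell "not configured" apart from an explicit setting before falling back to the flag default.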