43 changes: 38 additions & 5 deletions modules/virt-querying-metrics.adoc
@@ -23,7 +23,7 @@ ifndef::openshift-rosa,openshift-dedicated[]
The following query can identify virtual machines that are waiting for Input/Output (I/O):

`kubevirt_vmi_vcpu_wait_seconds_total`::
Returns the wait time (in seconds) on I/O for vCPUs of a virtual machine. Type: Counter.

A value above '0' means that the vCPU wants to run, but the host scheduler cannot run it yet. This inability to run indicates that there is an issue with I/O.

@@ -33,8 +33,8 @@ To query the vCPU metric, the `schedstats=enable` kernel argument must first be
====

`kubevirt_vmi_vcpu_delay_seconds_total`::
Returns the cumulative time, in seconds, that a vCPU was enqueued by the host scheduler but could not run immediately.
This delay appears to the virtual machine as _steal time_, which is CPU time lost when the host runs other workloads. Steal time can impact performance and often indicates CPU overcommitment or contention on the host.
Type: Counter.

*Example vCPU delay query*
@@ -168,14 +168,14 @@ The following query returns the top 3 VMs where the guest is performing the most
[source,promql]
----
topk(3, sum by (name, namespace) (rate(kubevirt_vmi_memory_swap_in_traffic_bytes[6m])) + sum by (name, namespace) (rate(kubevirt_vmi_memory_swap_out_traffic_bytes[6m]))) > 0
----

[NOTE]
====
Memory swapping indicates that the virtual machine is under memory pressure. Increasing the memory allocation of the virtual machine can mitigate this issue.
====

[id="virt-promql-aaq-metrics_{context}"]
== Monitoring AAQ operator metrics
The following metrics are exposed by the Application Aware Quota (AAQ) controller for monitoring resource quotas:

@@ -184,3 +184,36 @@ Returns the current quota usage and the CPU and memory limits enforced by the AA

`kube_application_aware_resourcequota_creation_timestamp`::
Returns the time, in UNIX timestamp format, when the AAQ Operator resource is created. Type: Gauge.

[id="virt-vm-label-metrics_{context}"]
== VM label metrics

`kubevirt_vm_labels`::
Returns virtual machine labels as Prometheus labels. Type: Gauge.
+
You can expose and ignore specific labels by editing the `kubevirt-vm-labels-config` config map. After you apply the config map to your cluster, the configuration is loaded dynamically.
+
*Example config map*
+
[source,yaml]
----
apiVersion: v1
kind: ConfigMap
metadata:
  name: kubevirt-vm-labels-config
  namespace: openshift-cnv
data:
  allowlist: "*"
  ignorelist: ""
----
+
where:

* `allowlist` specifies labels to expose.
** If `allowlist` has a value of `"*"`, all labels are included.
** If `allowlist` has a value of `""`, the metric does not return any labels.
** If `allowlist` contains a list of label keys, only the explicitly named labels are exposed. For example: `allowlist: "example.io/name,example.io/version"`.
* `ignorelist` specifies labels to ignore. The ignore list overrides the allow list.
** The `ignorelist` field does not support wildcard patterns. It can be empty or include a list of specific labels to ignore.
** If `ignorelist` has a value of `""`, no labels are ignored.
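
The allow list and ignore list rules above can be illustrated with a short Python sketch. This is only a model of the documented behavior, not the KubeVirt implementation: the `exposed_labels` function, its parameters, and the sample labels are hypothetical, and trimming whitespace around comma-separated keys is an assumption.

```python
def exposed_labels(vm_labels, allowlist, ignorelist):
    """Model which VM labels are exposed under the rules described above.

    Hypothetical helper for illustration only; not part of KubeVirt.
    """
    def parse(value):
        # Assumption: keys are comma-separated; surrounding whitespace is trimmed.
        return {key.strip() for key in value.split(",") if key.strip()}

    ignored = parse(ignorelist)  # no wildcard support in the ignore list
    if allowlist.strip() == "*":
        allowed = set(vm_labels)  # "*" includes every label
    else:
        allowed = parse(allowlist)  # "" yields an empty set: nothing exposed

    # The ignore list overrides the allow list.
    return {
        key: value
        for key, value in vm_labels.items()
        if key in allowed and key not in ignored
    }


# Sample labels are made up for this example.
labels = {
    "example.io/name": "db",
    "example.io/version": "1.2",
    "team": "storage",
}

all_labels = exposed_labels(labels, "*", "")
no_labels = exposed_labels(labels, "", "")
named_only = exposed_labels(labels, "example.io/name,example.io/version", "")
overridden = exposed_labels(labels, "*", "team")
```

In this model, `overridden` contains every label except `team`, because an entry in `ignorelist` takes precedence even when `allowlist` is `"*"`.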