@@ -17,10 +17,10 @@ spec:
description: "Customer cloud environment is unreachable from the management cluster due to invalid aws credentials"
summary: "Cluster has invalid AWS credentials"
# Clusters tend to have their `hypershift_cluster_invalid_aws_creds` set to > 0 while the HCP didn't finish the installation, thus we check
-# that the HCP is not rolling out in our expression (= hypershift_cluster_waiting_initial_avaibility_duration_seconds does not exist)
+# that the HCP is not rolling out in our expression (= hypershift_cluster_waiting_initial_availability_duration_seconds does not exist)
# hypershift_cluster_waiting_initial_avaibility_duration_seconds stops being emitted once the HCP is rolled out
# This will be fixed with https://issues.redhat.com/browse/OCPBUGS-63353
-expr: (max by (exported_namespace, _id) (hypershift_cluster_invalid_aws_creds) == 1) unless on (exported_namespace) (hypershift_cluster_waiting_initial_avaibility_duration_seconds or hypershift_cluster_deleting_duration_seconds)
+expr: (max by (exported_namespace, _id) (hypershift_cluster_invalid_aws_creds) == 1) unless on (exported_namespace) (hypershift_cluster_waiting_initial_availability_duration_seconds or hypershift_cluster_deleting_duration_seconds)
for: 4m # api-ErrorBudgetBurn is our highest SLA and triggers after 5 minutes of CrashLooping kube-apiserver pods. KAS pods can CrashLoop due to missing OIDC/invalid AWS permissions. To reduce self-resolving alerts, we need the limited support to be in place before the alert triggers.
labels:
severity: warning
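The exclusion logic described in the comment above can be sketched in Python. This is only an illustration of how PromQL's `unless on (exported_namespace)` filters series, not the Prometheus implementation; the helper `unless_on` and the namespaces `ns-a`, `ns-b`, `ns-c` are hypothetical, and sample values are omitted because `unless` matches on labels only.

```python
def unless_on(lhs, rhs, on):
    """Mimic PromQL `lhs unless on (<labels>) rhs` on lists of label sets.

    Keeps each lhs series whose values for the `on` labels match no rhs series.
    """
    rhs_keys = {tuple(labels.get(name, "") for name in on) for labels in rhs}
    return [
        labels
        for labels in lhs
        if tuple(labels.get(name, "") for name in on) not in rhs_keys
    ]


# Hypothetical clusters currently reporting invalid AWS credentials.
invalid_creds = [
    {"exported_namespace": "ns-a", "_id": "1"},
    {"exported_namespace": "ns-b", "_id": "2"},
    {"exported_namespace": "ns-c", "_id": "3"},
]
# Hypothetical series from the two metrics on the right-hand side.
rolling_out = [{"exported_namespace": "ns-a"}]  # HCP still installing
deleting = [{"exported_namespace": "ns-c"}]     # HCP being deleted

# For the purpose of exclusion, `or` on the right-hand side behaves like
# a union of the matching keys, so concatenating the lists is enough here.
firing = unless_on(invalid_creds, rolling_out + deleting, on=["exported_namespace"])
print([series["exported_namespace"] for series in firing])  # ['ns-b']

# If a metric name on the right-hand side matches no series at all,
# that operand is empty and nothing is excluded on its account.
assert unless_on(invalid_creds, [], on=["exported_namespace"]) == invalid_creds
```

The last assertion shows why the spelling of the metric name matters: a selector that matches no series does not raise an error, it just stops suppressing the alert.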
2 changes: 1 addition & 1 deletion hack/00-osd-managed-cluster-config-integration.yaml.tmpl
@@ -51086,7 +51086,7 @@ objects:
cluster due to invalid aws credentials
summary: Cluster has invalid AWS credentials
expr: (max by (exported_namespace, _id) (hypershift_cluster_invalid_aws_creds)
-== 1) unless on (exported_namespace) (hypershift_cluster_waiting_initial_avaibility_duration_seconds
+== 1) unless on (exported_namespace) (hypershift_cluster_waiting_initial_availability_duration_seconds
or hypershift_cluster_deleting_duration_seconds)
for: 4m
labels:
2 changes: 1 addition & 1 deletion hack/00-osd-managed-cluster-config-production.yaml.tmpl
@@ -51086,7 +51086,7 @@ objects:
cluster due to invalid aws credentials
summary: Cluster has invalid AWS credentials
expr: (max by (exported_namespace, _id) (hypershift_cluster_invalid_aws_creds)
-== 1) unless on (exported_namespace) (hypershift_cluster_waiting_initial_avaibility_duration_seconds
+== 1) unless on (exported_namespace) (hypershift_cluster_waiting_initial_availability_duration_seconds
or hypershift_cluster_deleting_duration_seconds)
for: 4m
labels:
2 changes: 1 addition & 1 deletion hack/00-osd-managed-cluster-config-stage.yaml.tmpl
@@ -51086,7 +51086,7 @@ objects:
cluster due to invalid aws credentials
summary: Cluster has invalid AWS credentials
expr: (max by (exported_namespace, _id) (hypershift_cluster_invalid_aws_creds)
-== 1) unless on (exported_namespace) (hypershift_cluster_waiting_initial_avaibility_duration_seconds
+== 1) unless on (exported_namespace) (hypershift_cluster_waiting_initial_availability_duration_seconds
or hypershift_cluster_deleting_duration_seconds)
for: 4m
labels: