SREP-3117: Add E2E tests for RHOBS Synthetic Monitoring on HyperShift#461
SREP-3117: Add E2E tests for RHOBS Synthetic Monitoring on HyperShift#461Mhodesty wants to merge 12 commits intoopenshift:masterfrom
Conversation
|
@Mhodesty: This pull request references SREP-3117 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: Mhodesty The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## master #461 +/- ##
==========================================
- Coverage 55.91% 55.46% -0.45%
==========================================
Files 31 31
Lines 2749 2771 +22
==========================================
Hits 1537 1537
- Misses 1120 1139 +19
- Partials 92 95 +3
🚀 New features to boost your workflow:
|
| // Skip VPC endpoint check for test HCPs (e.g., osde2e tests without real VPC infrastructure) | ||
| if _, osde2eTesting := hostedcontrolplane.Annotations["routemonitor.openshift.io/osde2e-testing"]; osde2eTesting { | ||
| return true, nil | ||
| } |
There was a problem hiding this comment.
Just calling out that this is another thing we should work to get coverage for 🙂
| for idx := range cdList.Items { | ||
| cd := &cdList.Items[idx] |
There was a problem hiding this comment.
I think we can simplify this to
| for idx := range cdList.Items { | |
| cd := &cdList.Items[idx] | |
| for _, cd := range cdList.Items { |
| return tokenResp.AccessToken, nil | ||
| } | ||
|
|
||
| func listRHOBSProbes(baseURL, labelSelector string, creds *OIDCCredentials) ([]map[string]interface{}, error) { |
There was a problem hiding this comment.
Rather than return a slice of maps of strings to interface{}, it might be cleaner/easier to import or copy the Probe struct used by the rhobs-synthetic-agent
|
/retest |
2 similar comments
|
/retest |
|
/retest |
|
/retest |
|
/retest |
…we need the rhobsAPI to pass the e2e pipeline.
WalkthroughAdds a skip-infrastructure-health-check configuration flag (CLI/ConfigMap/env) and propagates it through the manager into HostedControlPlane reconciliation so health checks and internal monitoring deployment can be short-circuited when enabled. Also updates tests, docs, and dependencies. Changes
sequenceDiagram
participant CLI/ConfigMap
participant Manager
participant HCP_Reconciler
participant RHOBS_API
CLI/ConfigMap->>Manager: start with skip-infrastructure-health-check value
Manager->>HCP_Reconciler: reconcile HCP with RHOBSConfig{SkipInfrastructureHealthCheck: value}
alt Skip enabled
HCP_Reconciler->>HCP_Reconciler: short-circuit health checks & skip deployInternalMonitoringObjects
HCP_Reconciler-->>Manager: reconciliation success (no infra checks)
else Skip disabled
HCP_Reconciler->>RHOBS_API: create/update probes and monitoring objects
RHOBS_API-->>HCP_Reconciler: probe responses/status
HCP_Reconciler-->>Manager: reconciliation success (with infra checks)
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes 🚥 Pre-merge checks | ✅ 3 | ❌ 2❌ Failed checks (2 warnings)
✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 golangci-lint (2.5.0)Error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
@Mhodesty: This pull request references SREP-3117 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
There was a problem hiding this comment.
Actionable comments posted: 5
♻️ Duplicate comments (1)
controllers/hostedcontrolplane/hostedcontrolplane.go (1)
328-332:SkipInfrastructureHealthCheckalso skips internal monitoring object deployment — consider naming clarity.This flag now controls three distinct behaviors: skipping HCP readiness checks, skipping VPC endpoint checks, and skipping internal monitoring deployment (Routes + RouteMonitors). The name "skip-infrastructure-health-check" implies only the health-check portion. While this is consistent with the test-environment use case (lightweight HCPs), the overloaded semantics could surprise operators who set the flag expecting only health checks to be bypassed. A rename (e.g.,
skip-infrastructure-checks) or documentation note would reduce ambiguity.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@controllers/hostedcontrolplane/hostedcontrolplane.go` around lines 328 - 332, The config flag SkipInfrastructureHealthCheck is currently used in deployInternalMonitoringObjects, readiness checks, and VPC endpoint checks which overloads its meaning; update RHOBSConfig and all callers (e.g., deployInternalMonitoringObjects, any readiness check functions, and VPC endpoint check code) to use a clearer name such as SkipInfrastructureChecks (or add an explicit SkipMonitoringDeployment boolean) and propagate the rename/added field everywhere it is read, or alternatively add a comment/docstring near RHOBSConfig and the deployInternalMonitoringObjects function clarifying that SkipInfrastructureHealthCheck also skips internal monitoring deployment; ensure tests and any config parsing accept the new field or alias the old name to avoid breaking behavior.
🧹 Nitpick comments (8)
go.mod (1)
3-3: Consider bumping togo 1.25.7(orgo 1.26).Go 1.25.7 (released 2026-02-04) includes security fixes to the
gocommand andcrypto/tlspackage. Go 1.26 was released on February 10, 2026 with further security improvements. Thegodirective ingo.modsets the minimum required version, so pinning to 1.25.5 still allows building with newer toolchains, but updating the directive ensures CI and downstream consumers aren't misled about the minimum tested version. At minimum, bumping togo 1.25.7would pick up thecrypto/tlssecurity fix.-go 1.25.5 +go 1.25.7🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@go.mod` at line 3, Update the Go module version in go.mod by changing the go directive from "go 1.25.5" to at least "go 1.25.7" (or "go 1.26") so CI and consumers reflect the minimum tested toolchain; edit the go directive line in go.mod to the chosen version and commit the change.controllers/hostedcontrolplane/hostedcontrolplane_test.go (1)
845-996:TestHostedControlPlaneReconciler_getRHOBSConfighas no coverage for the newSkipInfrastructureHealthCheckfield.None of the existing test cases sets
"skip-infrastructure-health-check"in the ConfigMap data, and none of theexpectedstructs assertsSkipInfrastructureHealthCheck: true. Since the field was introduced in this PR andgetRHOBSConfigshould read it from the ConfigMap (matching howonly-public-clustersis handled), at least two cases are missing:
- ConfigMap with
"skip-infrastructure-health-check": "true"→ expectSkipInfrastructureHealthCheck: true- ConfigMap with
"skip-infrastructure-health-check": "false"(or absent) → expectSkipInfrastructureHealthCheck: false♻️ Example test case to add
{ name: "ConfigMap present with skip-infrastructure-health-check true - propagates field", configMap: &corev1.ConfigMap{ ObjectMeta: metav1.ObjectMeta{ Name: configMapName, Namespace: config.OperatorNamespace, }, Data: map[string]string{ "probe-api-url": "https://api.example.com/probes", "probe-tenant": "some-tenant", "oidc-client-id": "some-id", "oidc-client-secret": "some-secret", "oidc-issuer-url": "https://issuer.example.com", "only-public-clusters": "false", "skip-infrastructure-health-check": "true", }, }, fallbackConfig: fallbackConfig, expected: RHOBSConfig{ ProbeAPIURL: "https://api.example.com/probes", Tenant: "some-tenant", OIDCClientID: "some-id", OIDCClientSecret: "some-secret", OIDCIssuerURL: "https://issuer.example.com", OnlyPublicClusters: false, SkipInfrastructureHealthCheck: true, }, },🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@controllers/hostedcontrolplane/hostedcontrolplane_test.go` around lines 845 - 996, TestHostedControlPlaneReconciler_getRHOBSConfig is missing coverage for the new SkipInfrastructureHealthCheck field; update the test table in TestHostedControlPlaneReconciler_getRHOBSConfig to include at least two cases that exercise the ConfigMap key "skip-infrastructure-health-check": one where the ConfigMap Data sets it to "true" and the expected RHOBSConfig has SkipInfrastructureHealthCheck: true, and one where it is "false" or absent and expected is false; ensure the ConfigMap entries and expected RHOBSConfig objects in those test cases match how getRHOBSConfig parses "only-public-clusters" (use the same parsing/trim logic) so the new field is asserted correctly.test/e2e/route_monitor_operator_tests.go (3)
1111-1131: Usek8serrors.IsNotFoundinstead of string matching for not-found detection.Same concern as the conflict check — string-based error matching is fragile.
k8serrors.IsNotFound(err)is the idiomatic approach.♻️ Suggested fix
err := k8s.Get(ctx, crdName, "", u) if err != nil { - if strings.Contains(err.Error(), "not found") || strings.Contains(err.Error(), "NotFound") { + if k8serrors.IsNotFound(err) { return false, nil } return false, err }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/e2e/route_monitor_operator_tests.go` around lines 1111 - 1131, In crdExists, replace fragile string matching on err.Error() with the idiomatic k8serrors.IsNotFound check: after calling k8s.Get(ctx, crdName, "", u), if err != nil use k8serrors.IsNotFound(err) to return (false, nil) for not-found and otherwise return (false, err); keep the rest of the unstructured GroupVersionKind logic intact and import k8s.io/apimachinery/pkg/api/errors as k8serrors if not already imported.
870-926: Usek8serrors.IsConflictinstead of string matching for conflict detection.Line 913–914 detects optimistic concurrency conflicts by checking for substrings in the error message. This is fragile and locale-dependent. Use the typed error check instead.
♻️ Suggested fix
+ "k8s.io/apimachinery/pkg/api/errors" // already imported as k8serrors ... // Check if it's a conflict error - if strings.Contains(err.Error(), "object has been modified") || - strings.Contains(err.Error(), "the object has been modified") { + if k8serrors.IsConflict(err) {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/e2e/route_monitor_operator_tests.go` around lines 870 - 926, The retry loop in setHostedControlPlaneAvailable currently detects update conflicts by checking substrings in the error text, which is fragile; replace that string-matching logic with k8serrors.IsConflict(err) (importing "k8s.io/apimachinery/pkg/api/errors" as k8serrors) when inspecting the error returned by k8s.Update(ctx, u) so conflicts trigger the existing retry/backoff path and non-conflict errors still return immediately; keep the same retry count, log call (GinkgoLogr.V...), and sleep behavior but use k8serrors.IsConflict to detect optimistic concurrency conflicts.
1245-1286: OIDC client secret stored in a ConfigMap rather than a Secret.
createRMOConfigMapwritesoidc-client-secretinto aConfigMap. ConfigMaps are not encrypted at rest (by default) and have weaker RBAC expectations than Secrets. While this mirrors the existing operator config pattern (the operator reads from the same ConfigMap), storing credentials in a ConfigMap is a security concern worth documenting or migrating in the future.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/e2e/route_monitor_operator_tests.go` around lines 1245 - 1286, The function createRMOConfigMap currently writes the sensitive oidc client secret into a ConfigMap; instead create/update a Kubernetes Secret (Opaque) to hold creds.ClientSecret (key "oidc-client-secret") and remove that key from the ConfigMap data. Modify createRMOConfigMap to: 1) create/update a Secret object for the OIDC secret (use creds.ClientSecret and the same namespace "openshift-route-monitor-operator"), handling existing Secret via Get + Update and using creds.ClientSecret as the value; 2) stop placing "oidc-client-secret" into the ConfigMap.Data and only put non-sensitive fields (probe-api-url, oidc-client-id, oidc-issuer-url, skip-infrastructure-health-check) into the ConfigMap; and 3) ensure Update/Create logic mirrors the existing Get/Update/Create pattern used in createRMOConfigMap for both the ConfigMap and the new Secret. Ensure the secret key name matches "oidc-client-secret" so consumers can locate it.controllers/hostedcontrolplane/healthcheck.go (1)
66-70: Consider adding a log statement when skipping health checks.When
SkipInfrastructureHealthCheckis true, the function silently returnstrue. Adding a debug-level log here would help operators confirm the skip is taking effect and diagnose unexpected behavior in environments where this flag may be inadvertently set.🔧 Suggested improvement
func (r *HostedControlPlaneReconciler) hcpReady(ctx context.Context, hostedcontrolplane *hypershiftv1beta1.HostedControlPlane, cfg RHOBSConfig) (bool, error) { // Skip health check for test environments (e.g., osde2e tests without real kube-apiserver) if cfg.SkipInfrastructureHealthCheck { + ctrl.LoggerFrom(ctx).V(1).Info("Skipping HCP health check due to SkipInfrastructureHealthCheck flag") return true, nil }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@controllers/hostedcontrolplane/healthcheck.go` around lines 66 - 70, Add a debug log when hcpReady short-circuits due to RHOBSConfig.SkipInfrastructureHealthCheck: inside HostedControlPlaneReconciler.hcpReady, before returning true,nil when cfg.SkipInfrastructureHealthCheck is set, call the reconciler's logger (or ctx logger) to emit a debug message that clearly states the health check was skipped and includes identifying context (e.g., hostedcontrolplane.Name/Namespace and the fact SkipInfrastructureHealthCheck=true) so operators can confirm the skip is taking effect.main.go (1)
176-182: Same one-way boolean override issue as ingetRHOBSConfig.This mirrors the pattern flagged in
hostedcontrolplane.goLine 136: the ConfigMap can only enable the flag, never disable it when the CLI default istrue. If you apply the bidirectional fix ingetRHOBSConfig, apply it here too for consistency.♻️ Suggested fix
- if configData.SkipInfrastructureHealthCheck { - setupLog.V(1).Info("Using skip-infrastructure-health-check from ConfigMap", "skipInfrastructureHealthCheck", configData.SkipInfrastructureHealthCheck) - skipInfrastructureHealthCheck = configData.SkipInfrastructureHealthCheck - configMapParams = append(configMapParams, "skip-infrastructure-health-check") - } else { - flagParams = append(flagParams, "skip-infrastructure-health-check") - }Note: In
getConfigFromConfigMap(Line 389), the raw"true"string check is fine since the field will betrueonly if the ConfigMap explicitly says"true". But in the consumer block above, checkingconfigData.SkipInfrastructureHealthCheck(a bool) means the ConfigMap said"true"— you'd also need to check if the key existed but was set to"false"in order to override the CLI. Consider using a tri-state (string or*bool) inOperatorConfigto distinguish "not set" from "set to false".🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@main.go` around lines 176 - 182, The current consumer logic using configData.SkipInfrastructureHealthCheck only detects when the ConfigMap set the value to true and cannot override a CLI default true to false; update the handling so the OperatorConfig can represent tri-state (e.g., use *bool or string for SkipInfrastructureHealthCheck) and in the consumer branch (where skipInfrastructureHealthCheck, configMapParams and flagParams are set) check for existence/explicit false as well as true; specifically change OperatorConfig.SkipInfrastructureHealthCheck to a pointer or string in getConfigFromConfigMap, preserve the existing check for the raw "true" there, and in the block that assigns skipInfrastructureHealthCheck and appends to configMapParams/flagParams, test nil vs non-nil (or explicit "false") so the ConfigMap can both enable and disable the flag (override CLI) rather than only enabling it.controllers/hostedcontrolplane/hostedcontrolplane.go (1)
136-138: Boolean config values can only be enabled, never disabled, via ConfigMap override.The pattern
if configMap == "true" { cfg.X = true }means that once the CLI flag setsSkipInfrastructureHealthCheck = true, the ConfigMap cannot override it back tofalse. The same applies toOnlyPublicClusters(Line 133). If the intent is for the ConfigMap to be the authoritative source (as described in the PR summary: "ConfigMap overrides CLI"), consider parsing both"true"and"false"explicitly:♻️ Suggested fix for bidirectional override
- if strings.TrimSpace(configMap.Data["skip-infrastructure-health-check"]) == "true" { - cfg.SkipInfrastructureHealthCheck = true - } + if v := strings.TrimSpace(configMap.Data["skip-infrastructure-health-check"]); v != "" { + cfg.SkipInfrastructureHealthCheck = v == "true" + }The same pattern should apply to
only-public-clustersat Line 133–135 for consistency.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@controllers/hostedcontrolplane/hostedcontrolplane.go` around lines 136 - 138, The ConfigMap override currently only sets cfg.SkipInfrastructureHealthCheck (and cfg.OnlyPublicClusters) to true when the value is "true", preventing a ConfigMap from disabling a flag set by CLI; change the logic to parse explicit boolean values from the ConfigMap (e.g. use strconv.ParseBool on configMap.Data["skip-infrastructure-health-check"] and configMap.Data["only-public-clusters"]) and, if parsing succeeds, assign the parsed bool to cfg.SkipInfrastructureHealthCheck and cfg.OnlyPublicClusters respectively so the ConfigMap becomes the authoritative source for enabling or disabling these flags.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@test/e2e/README.md`:
- Line 541: Replace the fragile hardcoded line range in README.md with a
reference to the actual symbol that embeds the CRD definitions in
route_monitor_operator_tests.go (e.g., the const or function name that
holds/returns the embedded CRDs), so update "The CRD definitions are embedded in
`route_monitor_operator_tests.go` (lines 46-96)" to instead mention that symbol
(for example: "The CRD definitions are embedded in
`route_monitor_operator_tests.go` via <EMBED_SYMBOL_NAME>"), ensuring reviewers
can find the embedding even if line numbers change.
- Line 20: Add language specifiers to the three fenced code blocks that open
without one (the ASCII/architecture/CI diagrams currently starting with ``` at
the three locations in the README) by changing their opening fences to include
"text" (e.g., ```text) so Markdown Linter MD040 warnings are resolved; update
the three diagram blocks that begin with ``` to ```text (the ASCII flow diagram,
the architecture diagram, and the CI/CD architecture diagram).
In `@test/e2e/route_monitor_operator_tests.go`:
- Around line 777-780: The code appends labelSelector directly to reqURL
(variables reqURL, baseURL, labelSelector) which can break for characters like
+, &, = or spaces; URL-encode the selector before appending — either build query
params with net/url by creating url.Values, calling Add("label_selector",
labelSelector) and using v.Encode(), or call url.QueryEscape(labelSelector) and
append "?label_selector="+escapedSelector so the final reqURL is properly
encoded.
- Around line 1217-1243: The getOIDCCredentials function returns a partially
populated OIDCCredentials when only EXTERNAL_SECRET_OIDC_CLIENT_ID is set;
update it to validate that EXTERNAL_SECRET_OIDC_CLIENT_SECRET,
EXTERNAL_SECRET_OIDC_ISSUER_URL and PROBE_API_URL (and ClientID) are all
non-empty before returning; if any are missing, return an error listing the
missing env vars instead of a partially filled struct (reference
getOIDCCredentials and OIDCCredentials and the env names
EXTERNAL_SECRET_OIDC_CLIENT_ID, EXTERNAL_SECRET_OIDC_CLIENT_SECRET,
EXTERNAL_SECRET_OIDC_ISSUER_URL, PROBE_API_URL).
- Around line 826-846: getRHOBSProbe performs an unauthenticated GET and will
401 on protected RHOBS endpoints; update it to mirror listRHOBSProbes by
obtaining the OIDC token (reuse the existing token retrieval logic used by
listRHOBSProbes or accept a token parameter) and send an Authorization: Bearer
<token> header on the request (use http.NewRequest + req.Header.Set and
client.Do instead of client.Get). Also update the probe-activation caller (the
probe activation check that calls getRHOBSProbe) to pass a token if you change
the function signature.
---
Duplicate comments:
In `@controllers/hostedcontrolplane/hostedcontrolplane.go`:
- Around line 328-332: The config flag SkipInfrastructureHealthCheck is
currently used in deployInternalMonitoringObjects, readiness checks, and VPC
endpoint checks which overloads its meaning; update RHOBSConfig and all callers
(e.g., deployInternalMonitoringObjects, any readiness check functions, and VPC
endpoint check code) to use a clearer name such as SkipInfrastructureChecks (or
add an explicit SkipMonitoringDeployment boolean) and propagate the rename/added
field everywhere it is read, or alternatively add a comment/docstring near
RHOBSConfig and the deployInternalMonitoringObjects function clarifying that
SkipInfrastructureHealthCheck also skips internal monitoring deployment; ensure
tests and any config parsing accept the new field or alias the old name to avoid
breaking behavior.
---
Nitpick comments:
In `@controllers/hostedcontrolplane/healthcheck.go`:
- Around line 66-70: Add a debug log when hcpReady short-circuits due to
RHOBSConfig.SkipInfrastructureHealthCheck: inside
HostedControlPlaneReconciler.hcpReady, before returning true,nil when
cfg.SkipInfrastructureHealthCheck is set, call the reconciler's logger (or ctx
logger) to emit a debug message that clearly states the health check was skipped
and includes identifying context (e.g., hostedcontrolplane.Name/Namespace and
the fact SkipInfrastructureHealthCheck=true) so operators can confirm the skip
is taking effect.
In `@controllers/hostedcontrolplane/hostedcontrolplane_test.go`:
- Around line 845-996: TestHostedControlPlaneReconciler_getRHOBSConfig is
missing coverage for the new SkipInfrastructureHealthCheck field; update the
test table in TestHostedControlPlaneReconciler_getRHOBSConfig to include at
least two cases that exercise the ConfigMap key
"skip-infrastructure-health-check": one where the ConfigMap Data sets it to
"true" and the expected RHOBSConfig has SkipInfrastructureHealthCheck: true, and
one where it is "false" or absent and expected is false; ensure the ConfigMap
entries and expected RHOBSConfig objects in those test cases match how
getRHOBSConfig parses "only-public-clusters" (use the same parsing/trim logic)
so the new field is asserted correctly.
In `@controllers/hostedcontrolplane/hostedcontrolplane.go`:
- Around line 136-138: The ConfigMap override currently only sets
cfg.SkipInfrastructureHealthCheck (and cfg.OnlyPublicClusters) to true when the
value is "true", preventing a ConfigMap from disabling a flag set by CLI; change
the logic to parse explicit boolean values from the ConfigMap (e.g. use
strconv.ParseBool on configMap.Data["skip-infrastructure-health-check"] and
configMap.Data["only-public-clusters"]) and, if parsing succeeds, assign the
parsed bool to cfg.SkipInfrastructureHealthCheck and cfg.OnlyPublicClusters
respectively so the ConfigMap becomes the authoritative source for enabling or
disabling these flags.
In `@go.mod`:
- Line 3: Update the Go module version in go.mod by changing the go directive
from "go 1.25.5" to at least "go 1.25.7" (or "go 1.26") so CI and consumers
reflect the minimum tested toolchain; edit the go directive line in go.mod to
the chosen version and commit the change.
In `@main.go`:
- Around line 176-182: The current consumer logic using
configData.SkipInfrastructureHealthCheck only detects when the ConfigMap set the
value to true and cannot override a CLI default true to false; update the
handling so the OperatorConfig can represent tri-state (e.g., use *bool or
string for SkipInfrastructureHealthCheck) and in the consumer branch (where
skipInfrastructureHealthCheck, configMapParams and flagParams are set) check for
existence/explicit false as well as true; specifically change
OperatorConfig.SkipInfrastructureHealthCheck to a pointer or string in
getConfigFromConfigMap, preserve the existing check for the raw "true" there,
and in the block that assigns skipInfrastructureHealthCheck and appends to
configMapParams/flagParams, test nil vs non-nil (or explicit "false") so the
ConfigMap can both enable and disable the flag (override CLI) rather than only
enabling it.
In `@test/e2e/route_monitor_operator_tests.go`:
- Around line 1111-1131: In crdExists, replace fragile string matching on
err.Error() with the idiomatic k8serrors.IsNotFound check: after calling
k8s.Get(ctx, crdName, "", u), if err != nil use k8serrors.IsNotFound(err) to
return (false, nil) for not-found and otherwise return (false, err); keep the
rest of the unstructured GroupVersionKind logic intact and import
k8s.io/apimachinery/pkg/api/errors as k8serrors if not already imported.
- Around line 870-926: The retry loop in setHostedControlPlaneAvailable
currently detects update conflicts by checking substrings in the error text,
which is fragile; replace that string-matching logic with
k8serrors.IsConflict(err) (importing "k8s.io/apimachinery/pkg/api/errors" as
k8serrors) when inspecting the error returned by k8s.Update(ctx, u) so conflicts
trigger the existing retry/backoff path and non-conflict errors still return
immediately; keep the same retry count, log call (GinkgoLogr.V...), and sleep
behavior but use k8serrors.IsConflict to detect optimistic concurrency
conflicts.
- Around line 1245-1286: The function createRMOConfigMap currently writes the
sensitive oidc client secret into a ConfigMap; instead create/update a
Kubernetes Secret (Opaque) to hold creds.ClientSecret (key "oidc-client-secret")
and remove that key from the ConfigMap data. Modify createRMOConfigMap to: 1)
create/update a Secret object for the OIDC secret (use creds.ClientSecret and
the same namespace "openshift-route-monitor-operator"), handling existing Secret
via Get + Update and using creds.ClientSecret as the value; 2) stop placing
"oidc-client-secret" into the ConfigMap.Data and only put non-sensitive fields
(probe-api-url, oidc-client-id, oidc-issuer-url,
skip-infrastructure-health-check) into the ConfigMap; and 3) ensure
Update/Create logic mirrors the existing Get/Update/Create pattern used in
createRMOConfigMap for both the ConfigMap and the new Secret. Ensure the secret
key name matches "oidc-client-secret" so consumers can locate it.
ℹ️ Review info
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to data retention organization setting
⛔ Files ignored due to path filters (1)
go.sumis excluded by!**/*.sum
📒 Files selected for processing (12)
config/manager/manager.yamlcontrollers/hostedcontrolplane/healthcheck.gocontrollers/hostedcontrolplane/hostedcontrolplane.gocontrollers/hostedcontrolplane/hostedcontrolplane_test.gocontrollers/hostedcontrolplane/synthetics.godeploy/route-monitor-operator-controller-manager.Deployment.yamlgo.modhack/olm-registry/rmo-config-template.yamlmain.gopkg/rhobs/client_test.gotest/e2e/README.mdtest/e2e/route_monitor_operator_tests.go
|
|
||
| The RMO e2e test suite validates the complete synthetic monitoring workflow: | ||
|
|
||
| ``` |
There was a problem hiding this comment.
Fenced code blocks are missing language specifiers.
Lines 20, 288, and 432 all open a ``` block without a language tag, triggering MD040 warnings.
📄 Suggested fixes
Line 20 — ASCII flow diagram; use text:
-```
+```text
HostedControlPlane CR → RMO → RHOBS API → Synthetics Agent → Prometheus/Blackbox ExporterLine 288 — architecture diagram; use text:
-```
+```text
┌──────────────────...Line 432 — CI/CD architecture diagram; use text:
-```
+```text
┌──────────────────...Also applies to: 288-288, 432-432
🧰 Tools
🪛 markdownlint-cli2 (0.21.0)
[warning] 20-20: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/e2e/README.md` at line 20, Add language specifiers to the three fenced
code blocks that open without one (the ASCII/architecture/CI diagrams currently
starting with ``` at the three locations in the README) by changing their
opening fences to include "text" (e.g., ```text) so Markdown Linter MD040
warnings are resolved; update the three diagram blocks that begin with ``` to
```text (the ASCII flow diagram, the architecture diagram, and the CI/CD
architecture diagram).
test/e2e/README.md
Outdated
| **Solution:** The test should install CRDs automatically in `BeforeAll`. If this fails: | ||
| - Check test logs for CRD installation errors | ||
| - Verify cluster has permissions to create CRDs | ||
| - The CRD definitions are embedded in `route_monitor_operator_tests.go` (lines 46-96) |
There was a problem hiding this comment.
Hardcoded line reference will become stale.
(lines 46-96) is fragile — any insertion above those lines will silently point reviewers at the wrong code. Reference the embedding function or a const name instead.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/e2e/README.md` at line 541, Replace the fragile hardcoded line range in
README.md with a reference to the actual symbol that embeds the CRD definitions
in route_monitor_operator_tests.go (e.g., the const or function name that
holds/returns the embedded CRDs), so update "The CRD definitions are embedded in
`route_monitor_operator_tests.go` (lines 46-96)" to instead mention that symbol
(for example: "The CRD definitions are embedded in
`route_monitor_operator_tests.go` via <EMBED_SYMBOL_NAME>"), ensuring reviewers
can find the embedding even if line numbers change.
| func getRHOBSProbe(baseURL, probeID string) (map[string]interface{}, error) { | ||
| client := &http.Client{Timeout: 10 * time.Second} | ||
|
|
||
| resp, err := client.Get(baseURL + "/" + probeID) | ||
| if err != nil { | ||
| return nil, fmt.Errorf("failed to get probe: %w", err) | ||
| } | ||
| defer resp.Body.Close() | ||
|
|
||
| if resp.StatusCode != http.StatusOK { | ||
| body, _ := io.ReadAll(resp.Body) | ||
| return nil, fmt.Errorf("RHOBS API returned status %d: %s", resp.StatusCode, string(body)) | ||
| } | ||
|
|
||
| var probe map[string]interface{} | ||
| if err := json.NewDecoder(resp.Body).Decode(&probe); err != nil { | ||
| return nil, fmt.Errorf("failed to decode probe response: %w", err) | ||
| } | ||
|
|
||
| return probe, nil | ||
| } |
There was a problem hiding this comment.
getRHOBSProbe is missing OIDC authentication — will fail with 401 on protected APIs.
listRHOBSProbes correctly obtains an OIDC token and sets the Authorization header, but getRHOBSProbe performs an unauthenticated client.Get(...). This will cause the probe activation check at Line 488 to fail if the RHOBS API requires authentication (which it almost certainly does, given the auth on the list endpoint).
🐛 Proposed fix — add OIDC auth to getRHOBSProbe
-func getRHOBSProbe(baseURL, probeID string) (map[string]interface{}, error) {
- client := &http.Client{Timeout: 10 * time.Second}
+func getRHOBSProbe(baseURL, probeID string, creds *OIDCCredentials) (map[string]interface{}, error) {
+ accessToken, err := getOIDCAccessToken(creds)
+ if err != nil {
+ return nil, fmt.Errorf("failed to get OIDC access token: %w", err)
+ }
+
+ client := &http.Client{Timeout: 10 * time.Second}
- resp, err := client.Get(baseURL + "/" + probeID)
+ req, err := http.NewRequest("GET", baseURL+"/"+probeID, nil)
+ if err != nil {
+ return nil, fmt.Errorf("failed to create request: %w", err)
+ }
+ req.Header.Set("Authorization", "Bearer "+accessToken)
+
+ resp, err := client.Do(req)Then update the call site at Line 488:
- p, err := getRHOBSProbe(rhobsAPIURL, probeID)
+ p, err := getRHOBSProbe(rhobsAPIURL, probeID, oidcCredentials)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| func getRHOBSProbe(baseURL, probeID string) (map[string]interface{}, error) { | |
| client := &http.Client{Timeout: 10 * time.Second} | |
| resp, err := client.Get(baseURL + "/" + probeID) | |
| if err != nil { | |
| return nil, fmt.Errorf("failed to get probe: %w", err) | |
| } | |
| defer resp.Body.Close() | |
| if resp.StatusCode != http.StatusOK { | |
| body, _ := io.ReadAll(resp.Body) | |
| return nil, fmt.Errorf("RHOBS API returned status %d: %s", resp.StatusCode, string(body)) | |
| } | |
| var probe map[string]interface{} | |
| if err := json.NewDecoder(resp.Body).Decode(&probe); err != nil { | |
| return nil, fmt.Errorf("failed to decode probe response: %w", err) | |
| } | |
| return probe, nil | |
| } | |
| func getRHOBSProbe(baseURL, probeID string, creds *OIDCCredentials) (map[string]interface{}, error) { | |
| accessToken, err := getOIDCAccessToken(creds) | |
| if err != nil { | |
| return nil, fmt.Errorf("failed to get OIDC access token: %w", err) | |
| } | |
| client := &http.Client{Timeout: 10 * time.Second} | |
| req, err := http.NewRequest("GET", baseURL+"/"+probeID, nil) | |
| if err != nil { | |
| return nil, fmt.Errorf("failed to create request: %w", err) | |
| } | |
| req.Header.Set("Authorization", "Bearer "+accessToken) | |
| resp, err := client.Do(req) | |
| if err != nil { | |
| return nil, fmt.Errorf("failed to get probe: %w", err) | |
| } | |
| defer resp.Body.Close() | |
| if resp.StatusCode != http.StatusOK { | |
| body, _ := io.ReadAll(resp.Body) | |
| return nil, fmt.Errorf("RHOBS API returned status %d: %s", resp.StatusCode, string(body)) | |
| } | |
| var probe map[string]interface{} | |
| if err := json.NewDecoder(resp.Body).Decode(&probe); err != nil { | |
| return nil, fmt.Errorf("failed to decode probe response: %w", err) | |
| } | |
| return probe, nil | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/e2e/route_monitor_operator_tests.go` around lines 826 - 846,
getRHOBSProbe performs an unauthenticated GET and will 401 on protected RHOBS
endpoints; update it to mirror listRHOBSProbes by obtaining the OIDC token
(reuse the existing token retrieval logic used by listRHOBSProbes or accept a
token parameter) and send an Authorization: Bearer <token> header on the request
(use http.NewRequest + req.Header.Set and client.Do instead of client.Get). Also
update the probe-activation caller (the probe activation check that calls
getRHOBSProbe) to pass a token if you change the function signature.
|
@Mhodesty: This pull request references SREP-3117 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
@Mhodesty: all tests passed! Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
There was a problem hiding this comment.
Actionable comments posted: 3
♻️ Duplicate comments (1)
test/e2e/README.md (1)
20-22:⚠️ Potential issue | 🟡 MinorAdd language identifiers to diagram fences.
These three ASCII/diagram blocks still open with bare backticks, which triggers MD040 and can cause inconsistent rendering. Use ```text for the diagrams.
🛠️ Suggested fix
-``` +```text HostedControlPlane CR → RMO → RHOBS API → Synthetics Agent → Prometheus/Blackbox Exporter-``` +```text ┌─────────────────────────────────────────────────────────────┐ │ Test Process │ ... └─────────────────────────────────────────────────────────────┘-``` +```text ┌─────────────────────────────────────────────────────────────┐ │ Real OpenShift Cluster (Integration/Staging) │ ... └─────────────────────────────────────────────────────────────┘Also applies to: 288-314, 432-467
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/e2e/README.md` around lines 20 - 22, Update the three ASCII diagram code fences so they include a language identifier to satisfy MD040; replace the opening triple backticks for each diagram (for example the block that starts with "HostedControlPlane CR → RMO → RHOBS API → Synthetics Agent → Prometheus/Blackbox Exporter" and the two larger box diagrams starting "┌─────────────────────────────────────────────────────────────┐") with ```text so each diagram block begins with ```text and ends with ``` to ensure consistent markdown rendering.
🧹 Nitpick comments (2)
test/e2e/route_monitor_operator_tests.go (2)
855-858: Prefer API conflict detection helpers.String-matching conflict messages is brittle; use
k8serrors.IsConflict(err)for correctness across API versions.🛠️ Suggested fix
- if strings.Contains(err.Error(), "object has been modified") || - strings.Contains(err.Error(), "the object has been modified") { + if k8serrors.IsConflict(err) {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/e2e/route_monitor_operator_tests.go` around lines 855 - 858, Replace the brittle string-based conflict detection that checks err.Error() with the Kubernetes helper k8serrors.IsConflict(err): remove the strings.Contains(...) checks and use k8serrors.IsConflict(err) in the conditional that decides to retry; add the import alias (k8serrors) for "k8s.io/apimachinery/pkg/api/errors" if not present so the code compiles. Ensure the retry path stays the same (the block that currently runs when the string match is true) but is triggered by k8serrors.IsConflict(err) instead.
1015-1024: Use k8s error helpers instead of string matching.
AlreadyExists,Forbidden, andNotFoundshould be detected viak8serrorshelpers; message matching is fragile.🛠️ Suggested fix
- if err := installCRDFromYAML(ctx, k8s, crd.yamlDef); err != nil { - // Check if CRD already exists (race condition or detection failure) - if strings.Contains(err.Error(), "AlreadyExists") || strings.Contains(err.Error(), "already exists") { + if err := installCRDFromYAML(ctx, k8s, crd.yamlDef); err != nil { + // Check if CRD already exists (race condition or detection failure) + if k8serrors.IsAlreadyExists(err) { GinkgoLogr.Info("CRD already exists (detected during create), skipping", "crd", crd.name) continue } // Check if this is a permission error - if strings.Contains(err.Error(), "Forbidden") || strings.Contains(err.Error(), "forbidden") { + if k8serrors.IsForbidden(err) { GinkgoLogr.Error(err, "Permission denied - cannot install CRD", "crd", crd.name) missingCRDs = append(missingCRDs, crd.name) continue }- if err != nil { - // Check if it's a "not found" error - if strings.Contains(err.Error(), "not found") || strings.Contains(err.Error(), "NotFound") { + if err != nil { + // Check if it's a "not found" error + if k8serrors.IsNotFound(err) {Also applies to: 1063-1067
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/e2e/route_monitor_operator_tests.go` around lines 1015 - 1024, The error checks in the installCRDFromYAML error handling should use Kubernetes error helpers instead of fragile string matching; update the conditional logic around installCRDFromYAML (and the similar block handling errors later) to call k8serrors.IsAlreadyExists(err), k8serrors.IsForbidden(err) and k8serrors.IsNotFound(err) (importing k8s.io/apimachinery/pkg/api/errors as k8serrors) and adjust the branches that log and append to missingCRDs accordingly (keep existing messages and behavior but replace strings.Contains(err.Error(), "...") with the k8serrors helper calls for functions like installCRDFromYAML and wherever the duplicate check appears).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@test/e2e/route_monitor_operator_tests.go`:
- Around line 1141-1157: The code currently treats an empty probe-api-url as
valid and returns an OIDCCredentials struct, which prevents env var fallback;
update the incomplete-credentials check to include probeAPIURL (the variable
probeAPIURL and the returned OIDCCredentials struct) so that if probeAPIURL ==
"" the function logs the incomplete state and returns an error (as it does for
clientID/clientSecret/issuerURL), allowing the caller to use env fallback;
adjust the GinkgoLogr.Info call to also report has_probe_api_url (probeAPIURL !=
"") for clarity.
- Around line 642-650: getOIDCAccessToken is posting to creds.IssuerURL but that
field is the issuer base URL, not the token endpoint; update getOIDCAccessToken
to perform OIDC discovery against creds.IssuerURL +
"/.well-known/openid-configuration" to fetch the token_endpoint and then POST
the client_credentials grant to that token_endpoint (or alternatively rename the
OIDCCredentials.IssuerURL/env var to indicate it is already a token endpoint and
validate it). Ensure you parse the discovery JSON for "token_endpoint", handle
errors, and use that URL for the http.NewRequest instead of creds.IssuerURL.
- Around line 948-958: The loop deletes RMO pods using
labels.SelectorFromSet(rmoDeployment.Spec.Selector.MatchLabels) which ignores
MatchExpressions; instead construct the selector once from the full deployment
LabelSelector via metav1.LabelSelectorAsSelector(&rmoDeployment.Spec.Selector)
(handle the returned error) and store it in a variable (e.g., deploySelector)
outside the podList iteration, then use
deploySelector.Matches(labels.Set(pod.Labels)) inside the loop and remove the
current SelectorFromSet usage so pods are matched by the complete selector.
---
Duplicate comments:
In `@test/e2e/README.md`:
- Around line 20-22: Update the three ASCII diagram code fences so they include
a language identifier to satisfy MD040; replace the opening triple backticks for
each diagram (for example the block that starts with "HostedControlPlane CR →
RMO → RHOBS API → Synthetics Agent → Prometheus/Blackbox Exporter" and the two
larger box diagrams starting
"┌─────────────────────────────────────────────────────────────┐") with ```text
so each diagram block begins with ```text and ends with ``` to ensure consistent
markdown rendering.
---
Nitpick comments:
In `@test/e2e/route_monitor_operator_tests.go`:
- Around line 855-858: Replace the brittle string-based conflict detection that
checks err.Error() with the Kubernetes helper k8serrors.IsConflict(err): remove
the strings.Contains(...) checks and use k8serrors.IsConflict(err) in the
conditional that decides to retry; add the import alias (k8serrors) for
"k8s.io/apimachinery/pkg/api/errors" if not present so the code compiles. Ensure
the retry path stays the same (the block that currently runs when the string
match is true) but is triggered by k8serrors.IsConflict(err) instead.
- Around line 1015-1024: The error checks in the installCRDFromYAML error
handling should use Kubernetes error helpers instead of fragile string matching;
update the conditional logic around installCRDFromYAML (and the similar block
handling errors later) to call k8serrors.IsAlreadyExists(err),
k8serrors.IsForbidden(err) and k8serrors.IsNotFound(err) (importing
k8s.io/apimachinery/pkg/api/errors as k8serrors) and adjust the branches that
log and append to missingCRDs accordingly (keep existing messages and behavior
but replace strings.Contains(err.Error(), "...") with the k8serrors helper calls
for functions like installCRDFromYAML and wherever the duplicate check appears).
ℹ️ Review info
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to data retention organization setting
📒 Files selected for processing (2)
test/e2e/README.mdtest/e2e/route_monitor_operator_tests.go
| // getOIDCAccessToken gets an access token from the OIDC provider using client credentials flow | ||
| func getOIDCAccessToken(creds *OIDCCredentials) (string, error) { | ||
| data := url.Values{} | ||
| data.Set("grant_type", "client_credentials") | ||
| data.Set("client_id", creds.ClientID) | ||
| data.Set("client_secret", creds.ClientSecret) | ||
|
|
||
| req, err := http.NewRequest("POST", creds.IssuerURL, strings.NewReader(data.Encode())) | ||
| if err != nil { |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# Find and examine the getOIDCAccessToken function and related code
rg -A 20 "func getOIDCAccessToken" test/e2e/route_monitor_operator_tests.goRepository: openshift/route-monitor-operator
Length of output: 804
🏁 Script executed:
# Look for OIDCCredentials struct definition
rg -B 5 -A 15 "type OIDCCredentials struct" test/e2e/route_monitor_operator_tests.goRepository: openshift/route-monitor-operator
Length of output: 948
🏁 Script executed:
# Search for where IssuerURL is populated
rg "IssuerURL" test/e2e/route_monitor_operator_tests.go -B 2 -A 2Repository: openshift/route-monitor-operator
Length of output: 874
🏁 Script executed:
# Look for EXTERNAL_SECRET_OIDC_ISSUER_URL usage
rg "EXTERNAL_SECRET_OIDC_ISSUER_URL" test/e2e/route_monitor_operator_tests.go -B 2 -A 2Repository: openshift/route-monitor-operator
Length of output: 888
🏁 Script executed:
# Search for OIDC discovery or token endpoint references
rg -i "token.*endpoint|well-known|.well-known|discovery" test/e2e/route_monitor_operator_tests.goRepository: openshift/route-monitor-operator
Length of output: 176
Verify whether IssuerURL is actually a token endpoint URL or perform OIDC discovery.
getOIDCAccessToken posts to creds.IssuerURL directly, but the environment variable EXTERNAL_SECRET_OIDC_ISSUER_URL and standard OIDC semantics indicate this is the issuer base URL, not the token endpoint. The token endpoint must be discovered via {issuerURL}/.well-known/openid-configuration. Posting directly to an issuer URL will fail with real OIDC providers. Either rename the env var if it actually contains a token endpoint URL, or implement OIDC discovery to obtain the correct token endpoint.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/e2e/route_monitor_operator_tests.go` around lines 642 - 650,
getOIDCAccessToken is posting to creds.IssuerURL but that field is the issuer
base URL, not the token endpoint; update getOIDCAccessToken to perform OIDC
discovery against creds.IssuerURL + "/.well-known/openid-configuration" to fetch
the token_endpoint and then POST the client_credentials grant to that
token_endpoint (or alternatively rename the OIDCCredentials.IssuerURL/env var to
indicate it is already a token endpoint and validate it). Ensure you parse the
discovery JSON for "token_endpoint", handle errors, and use that URL for the
http.NewRequest instead of creds.IssuerURL.
| // Delete RMO pods matching the deployment's label selector | ||
| for i := range podList.Items { | ||
| pod := &podList.Items[i] | ||
| if pod.Namespace != rmoNamespace { | ||
| continue | ||
| } | ||
|
|
||
| // Check if pod matches deployment's selector | ||
| labelSelector := labels.SelectorFromSet(rmoDeployment.Spec.Selector.MatchLabels) | ||
| if labelSelector.Matches(labels.Set(pod.Labels)) { | ||
| GinkgoLogr.Info("Deleting RMO pod to pick up CRDs", "pod", pod.Name) |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# First, locate and examine the test file
find . -name "route_monitor_operator_tests.go" -type fRepository: openshift/route-monitor-operator
Length of output: 118
🏁 Script executed:
# Read the specific lines mentioned in the review
if [ -f "test/e2e/route_monitor_operator_tests.go" ]; then
sed -n '930,970p' "test/e2e/route_monitor_operator_tests.go"
fiRepository: openshift/route-monitor-operator
Length of output: 1477
🏁 Script executed:
# Check for imports to understand what packages are being used
head -50 "test/e2e/route_monitor_operator_tests.go" | grep -E "import|labels|metav1"Repository: openshift/route-monitor-operator
Length of output: 165
🏁 Script executed:
# Search for the entire function context to understand what's happening
sed -n '900,1000p' "test/e2e/route_monitor_operator_tests.go"Repository: openshift/route-monitor-operator
Length of output: 3064
Use the deployment's full label selector to avoid deleting unrelated pods.
SelectorFromSet(MatchLabels) only handles the MatchLabels field and ignores MatchExpressions. If the deployment's selector uses matchExpressions or relies only on them, the current code will fail to match the correct pods. Build the selector from the full LabelSelector using metav1.LabelSelectorAsSelector() and reuse it outside the loop.
🛠️ Suggested fix
- // Delete RMO pods matching the deployment's label selector
+ // Delete RMO pods matching the deployment's label selector
+ labelSelector, err := metav1.LabelSelectorAsSelector(rmoDeployment.Spec.Selector)
+ if err != nil {
+ return fmt.Errorf("invalid deployment label selector: %w", err)
+ }
for i := range podList.Items {
pod := &podList.Items[i]
if pod.Namespace != rmoNamespace {
continue
}
- // Check if pod matches deployment's selector
- labelSelector := labels.SelectorFromSet(rmoDeployment.Spec.Selector.MatchLabels)
+ // Check if pod matches deployment's selector
if labelSelector.Matches(labels.Set(pod.Labels)) {📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| // Delete RMO pods matching the deployment's label selector | |
| for i := range podList.Items { | |
| pod := &podList.Items[i] | |
| if pod.Namespace != rmoNamespace { | |
| continue | |
| } | |
| // Check if pod matches deployment's selector | |
| labelSelector := labels.SelectorFromSet(rmoDeployment.Spec.Selector.MatchLabels) | |
| if labelSelector.Matches(labels.Set(pod.Labels)) { | |
| GinkgoLogr.Info("Deleting RMO pod to pick up CRDs", "pod", pod.Name) | |
| // Delete RMO pods matching the deployment's label selector | |
| labelSelector, err := metav1.LabelSelectorAsSelector(rmoDeployment.Spec.Selector) | |
| if err != nil { | |
| return fmt.Errorf("invalid deployment label selector: %w", err) | |
| } | |
| for i := range podList.Items { | |
| pod := &podList.Items[i] | |
| if pod.Namespace != rmoNamespace { | |
| continue | |
| } | |
| // Check if pod matches deployment's selector | |
| if labelSelector.Matches(labels.Set(pod.Labels)) { | |
| GinkgoLogr.Info("Deleting RMO pod to pick up CRDs", "pod", pod.Name) |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/e2e/route_monitor_operator_tests.go` around lines 948 - 958, The loop
deletes RMO pods using
labels.SelectorFromSet(rmoDeployment.Spec.Selector.MatchLabels) which ignores
MatchExpressions; instead construct the selector once from the full deployment
LabelSelector via metav1.LabelSelectorAsSelector(&rmoDeployment.Spec.Selector)
(handle the returned error) and store it in a variable (e.g., deploySelector)
outside the podList iteration, then use
deploySelector.Matches(labels.Set(pod.Labels)) inside the loop and remove the
current SelectorFromSet usage so pods are matched by the complete selector.
| // Check if all required fields are present | ||
| clientID := existingCM.Data["oidc-client-id"] | ||
| clientSecret := existingCM.Data["oidc-client-secret"] | ||
| issuerURL := existingCM.Data["oidc-issuer-url"] | ||
| probeAPIURL := existingCM.Data["probe-api-url"] | ||
|
|
||
| if clientID == "" || clientSecret == "" || issuerURL == "" { | ||
| GinkgoLogr.Info("ConfigMap exists but has incomplete credentials", "has_client_id", clientID != "", "has_client_secret", clientSecret != "", "has_issuer_url", issuerURL != "") | ||
| return nil, fmt.Errorf("incomplete credentials in ConfigMap") | ||
| } | ||
|
|
||
| return &OIDCCredentials{ | ||
| ClientID: clientID, | ||
| ClientSecret: clientSecret, | ||
| IssuerURL: issuerURL, | ||
| ProbeAPIURL: probeAPIURL, | ||
| }, nil |
There was a problem hiding this comment.
Treat missing probe-api-url as incomplete ConfigMap.
If probe-api-url is empty, you return credentials and skip env fallback, then fail later. Treat it as incomplete so env vars can populate it.
🛠️ Suggested fix
- if clientID == "" || clientSecret == "" || issuerURL == "" {
- GinkgoLogr.Info("ConfigMap exists but has incomplete credentials", "has_client_id", clientID != "", "has_client_secret", clientSecret != "", "has_issuer_url", issuerURL != "")
+ if clientID == "" || clientSecret == "" || issuerURL == "" || probeAPIURL == "" {
+ GinkgoLogr.Info("ConfigMap exists but has incomplete credentials",
+ "has_client_id", clientID != "",
+ "has_client_secret", clientSecret != "",
+ "has_issuer_url", issuerURL != "",
+ "has_probe_api_url", probeAPIURL != "")
return nil, fmt.Errorf("incomplete credentials in ConfigMap")
}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/e2e/route_monitor_operator_tests.go` around lines 1141 - 1157, The code
currently treats an empty probe-api-url as valid and returns an OIDCCredentials
struct, which prevents env var fallback; update the incomplete-credentials check
to include probeAPIURL (the variable probeAPIURL and the returned
OIDCCredentials struct) so that if probeAPIURL == "" the function logs the
incomplete state and returns an error (as it does for
clientID/clientSecret/issuerURL), allowing the caller to use env fallback;
adjust the GinkgoLogr.Info call to also report has_probe_api_url (probeAPIURL !=
"") for clarity.
There was a problem hiding this comment.
This file is boilerplated, might want to create one with different name . Or you can propose update in boilerplate!
Summary
This PR adds comprehensive E2E tests for RHOBS Synthetic Monitoring integration with HostedControlPlane (HCP) resources on HyperShift. The tests validate the full lifecycle of synthetic probe creation, management, and cleanup via the RHOBS API.
What We Added
1. E2E Test Suite (
test/e2e/route_monitor_operator_tests.go)Test Coverage:
private=truelabelHelper Functions:
getOIDCAccessToken(),getOrCreateOIDCCredentials(),getOIDCCredentialsFromConfigMap()listRHOBSProbes(),getRHOBSProbe()ensureCRDsInstalled(),crdExists(),installCRDFromYAML()restartRMODeployment(),setHostedControlPlaneAvailable(),createMCStyleHCP(),labelTestClusterAsManagementCluster()Embedded CRD Definitions:
HostedControlPlane(hypershift.openshift.io/v1beta1)VpcEndpoint(avo.openshift.io/v1alpha2)ClusterDeployment(hive.openshift.io/v1)2. Test Annotation Support (
controllers/hostedcontrolplane/)Added
routemonitor.openshift.io/osde2e-testingannotation check to skip infrastructure requirements for test HCPs:healthcheck.go:hostedcontrolplane.go:This allows E2E tests to create minimal HCP resources focused on RHOBS API integration without requiring full cluster infrastructure.
Issues Encountered and Resolved
1. CRD Installation
ensureCRDsInstalled()function to install embedded CRDs if missing, plusrestartRMODeployment()to reload RMO after CRD installation2. Infrastructure Requirements
routemonitor.openshift.io/osde2e-testingannotation support to skip infrastructure health checks for test HCPsTesting
Tests validated in local osde2e environment:
All HCP synthetic monitoring tests pass successfully, verifying:
Related Issues
Files Changed
test/e2e/route_monitor_operator_tests.go- Added RHOBS Synthetic Monitoring test suitecontrollers/hostedcontrolplane/healthcheck.go- Added osde2e-testing annotation supportcontrollers/hostedcontrolplane/hostedcontrolplane.go- Added osde2e-testing annotation supportgo.mod,go.sum- Dependency updates🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Tests
Documentation
Chores