NETOBSERV-2596: Make console plugin controller use health metadata for config, set some default rules as recording #2388

Merged
jotak merged 5 commits into netobserv:main from jotak:unify-data-model
Feb 4, 2026

Conversation

@jotak
Member

@jotak jotak commented Jan 28, 2026

Description

  • Refactor all alerts to implement a HealthRule interface
    • HealthRule provides the Annotations, RecordingName and the PrometheusRule
    • RecordingName now provided explicitly
    • Split logic between "builder" and "context"
  • Console plugin controller just dumps annotations to config
  • Change some defaults to Recording
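
To make the refactoring above more concrete, here is a minimal sketch of what such an interface could look like. The method names `Annotations` and `RecordingName` come from the description; the signatures, the simplified `Rule` type (a stand-in for the real PrometheusRule), the `ruleContext` struct, and the metric names are assumptions made for illustration, not the code merged in this PR.

```go
package main

import "fmt"

// HealthRule is a hypothetical shape for the interface named in the description.
// The method names follow the PR text; the signatures are assumptions.
type HealthRule interface {
	// RecordingName returns the explicitly provided recording rule name.
	RecordingName() string
	// Annotations returns the metadata forwarded to the console plugin.
	Annotations() map[string]string
	// Rule returns a simplified stand-in for the generated Prometheus rule.
	Rule() Rule
}

// Rule is a simplified stand-in, not the real Prometheus operator API type.
type Rule struct {
	Record string // set for recording rules
	Alert  string // set for alerting rules
	Expr   string
}

// ruleContext is a hypothetical "context": settings shared by rule builders.
type ruleContext struct {
	mode      string // "recording" or "alert"
	threshold string
}

// dnsErrors is an illustrative rule implementation with made-up metric names.
type dnsErrors struct{ ctx ruleContext }

func (r *dnsErrors) RecordingName() string { return "example:dns_errors:rate2m" }

func (r *dnsErrors) Annotations() map[string]string {
	return map[string]string{"summary": "Too many DNS errors", "threshold": r.ctx.threshold}
}

func (r *dnsErrors) Rule() Rule {
	expr := "rate(example_dns_errors_total[2m])"
	if r.ctx.mode == "recording" {
		// Recording mode: only record the value; the threshold lives in annotations.
		return Rule{Record: r.RecordingName(), Expr: expr}
	}
	// Alert mode: compare the expression against the threshold.
	return Rule{Alert: "DNSErrors", Expr: expr + " > " + r.ctx.threshold}
}

func main() {
	var hr HealthRule = &dnsErrors{ctx: ruleContext{mode: "recording", threshold: "5"}}
	fmt.Println(hr.Rule().Record, hr.Annotations()["threshold"])
}
```

With this shape, a consumer such as the console plugin controller only needs `RecordingName` and `Annotations`, regardless of whether the rule is emitted as an alert or as a recording.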

Dependencies

Checklist

If you are not familiar with our processes or don't know what to answer in the list below, let us know in a comment: the maintainers will take care of that.

  • Is this PR backed with a JIRA ticket? If so, make sure it is written as a title prefix (in general, PRs affecting the NetObserv/Network Observability product should be backed with a JIRA ticket - especially if they bring user facing changes).
  • Does this PR require product documentation?
    • If so, make sure the JIRA epic is labeled with "documentation" and provides a description relevant for doc writers, such as use cases or scenarios. Any required step to activate or configure the feature should be documented there, such as new CRD knobs.
  • Does this PR require a product release notes entry?
    • If so, fill in "Release Note Text" in the JIRA.
  • Is there anything else the QE team should know before testing? E.g: configuration changes, environment setup, etc.
    • If so, make sure it is described in the JIRA ticket.
  • QE requirements (check 1 from the list):
    • Standard QE validation, with pre-merge tests unless stated otherwise.
    • Regression tests only (e.g. refactoring with no user-facing change).
    • No QE (e.g. trivial change with high reviewer's confidence, or per agreement with the QE team).

TrendDuration: &v1.Duration{Duration: time.Hour},
TrendOffset: &v1.Duration{Duration: time.Minute * 30},
TrendDuration: &v1.Duration{Duration: time.Minute * 30},
// TrendOffset: &v1.Duration{Duration: 24 * time.Hour},
Member Author

TODO: roll back those temporary changes

Member

Is this for testing?

Maybe we could have a way to set default values for development without changing the code every time, similar to useMocks in the plugin 😉

}

func (r *latencyTrend) RecordingName() string {
return buildRecordingRuleName(r.ctx, "tcp_latency_p90", "2m")
Member Author

TODO: rename to include the comparison offset
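
As a rough illustration of that TODO, a name builder could append the offset as an extra segment. The helper `recordingNameWithOffset`, the `example` prefix, and the colon-separated format are assumptions; the real `buildRecordingRuleName` signature and naming scheme are not shown on this page.

```go
package main

import (
	"fmt"
	"strings"
)

// recordingNameWithOffset is a hypothetical helper: it joins a prefix, a metric
// name, an interval and an optional comparison offset with colons.
func recordingNameWithOffset(prefix, metric, interval, offset string) string {
	parts := []string{prefix, metric, interval}
	if offset != "" {
		// Only add the offset segment when a comparison offset is configured.
		parts = append(parts, "offset"+offset)
	}
	return strings.Join(parts, ":")
}

func main() {
	fmt.Println(recordingNameWithOffset("example", "tcp_latency_p90", "2m", "24h"))
	fmt.Println(recordingNameWithOffset("example", "tcp_latency_p90", "2m", ""))
}
```

Including the offset in the name keeps two trend rules that differ only by their comparison window from colliding on the same recording name.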

@codecov

codecov bot commented Jan 28, 2026

Codecov Report

❌ Patch coverage is 87.76224% with 70 lines in your changes missing coverage. Please review.
✅ Project coverage is 72.18%. Comparing base (b15953e) to head (925f138).
⚠️ Report is 2 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| internal/pkg/metrics/alerts/dns_nxdomain.go | 50.00% | 16 Missing ⚠️ |
| internal/pkg/metrics/alerts/builder.go | 89.32% | 6 Missing and 5 partials ⚠️ |
| internal/pkg/metrics/alerts/netpol_denied.go | 56.00% | 11 Missing ⚠️ |
| internal/pkg/metrics/alerts/context.go | 91.11% | 6 Missing and 2 partials ⚠️ |
| internal/pkg/metrics/alerts/device_drops.go | 82.22% | 8 Missing ⚠️ |
| internal/pkg/metrics/alerts/ingress_errors.go | 80.00% | 6 Missing and 1 partial ⚠️ |
| ...l/pkg/metrics/alerts/ingress_http_latency_trend.go | 82.50% | 6 Missing and 1 partial ⚠️ |
| internal/pkg/metrics/alerts/promql.go | 81.81% | 1 Missing and 1 partial ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2388      +/-   ##
==========================================
+ Coverage   71.92%   72.18%   +0.25%     
==========================================
  Files          93      104      +11     
  Lines       10501    10619     +118     
==========================================
+ Hits         7553     7665     +112     
- Misses       2469     2479      +10     
+ Partials      479      475       -4     
Flag Coverage Δ
unittests 72.18% <87.76%> (+0.25%) ⬆️

Files with missing lines Coverage Δ
internal/controller/consoleplugin/config/config.go 75.00% <ø> (ø)
.../controller/consoleplugin/consoleplugin_objects.go 86.72% <100.00%> (+0.18%) ⬆️
internal/controller/flp/flp_monolith_reconciler.go 68.27% <100.00%> (ø)
internal/controller/flp/flp_transfo_reconciler.go 63.49% <100.00%> (ø)
internal/pkg/metrics/alerts/dns_errors.go 100.00% <100.00%> (ø)
internal/pkg/metrics/alerts/external_trend.go 100.00% <100.00%> (ø)
internal/pkg/metrics/alerts/ipsec_errors.go 100.00% <100.00%> (ø)
internal/pkg/metrics/alerts/kernel_drops.go 100.00% <100.00%> (ø)
internal/pkg/metrics/alerts/latency_trend.go 100.00% <100.00%> (ø)
internal/pkg/metrics/alerts/operational_alerts.go 100.00% <100.00%> (ø)
... and 8 more

... and 3 files with indirect coverage changes

@jotak jotak added the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label Jan 29, 2026
@github-actions

New images:

  • quay.io/netobserv/network-observability-operator:7f084ed
  • quay.io/netobserv/network-observability-operator-bundle:v0.0.0-sha-7f084ed
  • quay.io/netobserv/network-observability-operator-catalog:v0.0.0-sha-7f084ed

They will expire after two weeks.

To deploy this build:

# Direct deployment, from operator repo
IMAGE=quay.io/netobserv/network-observability-operator:7f084ed make deploy

# Or using operator-sdk
operator-sdk run bundle quay.io/netobserv/network-observability-operator-bundle:v0.0.0-sha-7f084ed

Or as a Catalog Source:

apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: netobserv-dev
  namespace: openshift-marketplace
spec:
  sourceType: grpc
  image: quay.io/netobserv/network-observability-operator-catalog:v0.0.0-sha-7f084ed
  displayName: NetObserv development catalog
  publisher: Me
  updateStrategy:
    registryPoll:
      interval: 1m

Contributor

@leandroberetta leandroberetta left a comment

First round of review, looks good. I understand the changes, and the code looks cleaner for sure.

Sampling int `yaml:"sampling" json:"sampling"`
Features []string `yaml:"features" json:"features"`
Fields []FieldConfig `yaml:"fields" json:"fields"`
RecordingAnnotations map[string]map[string]string `yaml:"recordingAnnotations,omitempty" json:"recordingAnnotations,omitempty"`
Contributor

A question, although I think I know the answer: the change here is to just send annotations instead of health rules, right?

Member Author

Yes, it fully closes the gap between alerts and recording rules: the console plugin was already able to get annotations on alerts, but not on recordings.

Member

@jpinsonneau jpinsonneau left a comment

Code looks good

Thanks @jotak!

@openshift-ci openshift-ci bot added lgtm and removed lgtm labels Jan 30, 2026
@github-actions github-actions bot removed the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label Jan 30, 2026
@openshift-ci

openshift-ci bot commented Jan 30, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from jpinsonneau. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@jotak jotak added the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label Feb 2, 2026
@github-actions

github-actions bot commented Feb 2, 2026

New images:

  • quay.io/netobserv/network-observability-operator:823f507
  • quay.io/netobserv/network-observability-operator-bundle:v0.0.0-sha-823f507
  • quay.io/netobserv/network-observability-operator-catalog:v0.0.0-sha-823f507

They will expire after two weeks.

To deploy this build:

# Direct deployment, from operator repo
IMAGE=quay.io/netobserv/network-observability-operator:823f507 make deploy

# Or using operator-sdk
operator-sdk run bundle quay.io/netobserv/network-observability-operator-bundle:v0.0.0-sha-823f507

Or as a Catalog Source:

apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: netobserv-dev
  namespace: openshift-marketplace
spec:
  sourceType: grpc
  image: quay.io/netobserv/network-observability-operator-catalog:v0.0.0-sha-823f507
  displayName: NetObserv development catalog
  publisher: Me
  updateStrategy:
    registryPoll:
      interval: 1m

jotak added 3 commits February 2, 2026 13:02
…me default rules as recording

- Refactor all alerts to implement a HealthRule interface
  - HealthRule provides the Annotations, RecordingName and the
    PrometheusRule
  - RecordingName now provided explicitly
  - Split logic between "builder" and "context"
- Console plugin controller just dumps annotations to config
- Change some defaults to Recording
Since these annotations are a user-exposed API, it is preferable to use
more k8s-standard terminology, i.e. "workload" instead of "owner".
@github-actions github-actions bot removed the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label Feb 2, 2026
@jotak jotak added the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label Feb 2, 2026
@github-actions

github-actions bot commented Feb 2, 2026

New images:

  • quay.io/netobserv/network-observability-operator:88f417c
  • quay.io/netobserv/network-observability-operator-bundle:v0.0.0-sha-88f417c
  • quay.io/netobserv/network-observability-operator-catalog:v0.0.0-sha-88f417c

They will expire after two weeks.

To deploy this build:

# Direct deployment, from operator repo
IMAGE=quay.io/netobserv/network-observability-operator:88f417c make deploy

# Or using operator-sdk
operator-sdk run bundle quay.io/netobserv/network-observability-operator-bundle:v0.0.0-sha-88f417c

Or as a Catalog Source:

apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: netobserv-dev
  namespace: openshift-marketplace
spec:
  sourceType: grpc
  image: quay.io/netobserv/network-observability-operator-catalog:v0.0.0-sha-88f417c
  displayName: NetObserv development catalog
  publisher: Me
  updateStrategy:
    registryPoll:
      interval: 1m

@github-actions github-actions bot removed the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label Feb 2, 2026
@jotak jotak added the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label Feb 2, 2026
@github-actions

github-actions bot commented Feb 2, 2026

New images:

  • quay.io/netobserv/network-observability-operator:18b8d5f
  • quay.io/netobserv/network-observability-operator-bundle:v0.0.0-sha-18b8d5f
  • quay.io/netobserv/network-observability-operator-catalog:v0.0.0-sha-18b8d5f

They will expire after two weeks.

To deploy this build:

# Direct deployment, from operator repo
IMAGE=quay.io/netobserv/network-observability-operator:18b8d5f make deploy

# Or using operator-sdk
operator-sdk run bundle quay.io/netobserv/network-observability-operator-bundle:v0.0.0-sha-18b8d5f

Or as a Catalog Source:

apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: netobserv-dev
  namespace: openshift-marketplace
spec:
  sourceType: grpc
  image: quay.io/netobserv/network-observability-operator-catalog:v0.0.0-sha-18b8d5f
  displayName: NetObserv development catalog
  publisher: Me
  updateStrategy:
    registryPoll:
      interval: 1m

@github-actions github-actions bot removed the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label Feb 2, 2026
@github-actions

github-actions bot commented Feb 2, 2026

New images:

  • quay.io/netobserv/network-observability-operator:f9d7c1e
  • quay.io/netobserv/network-observability-operator-bundle:v0.0.0-sha-f9d7c1e
  • quay.io/netobserv/network-observability-operator-catalog:v0.0.0-sha-f9d7c1e

They will expire after two weeks.

To deploy this build:

# Direct deployment, from operator repo
IMAGE=quay.io/netobserv/network-observability-operator:f9d7c1e make deploy

# Or using operator-sdk
operator-sdk run bundle quay.io/netobserv/network-observability-operator-bundle:v0.0.0-sha-f9d7c1e

Or as a Catalog Source:

apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: netobserv-dev
  namespace: openshift-marketplace
spec:
  sourceType: grpc
  image: quay.io/netobserv/network-observability-operator-catalog:v0.0.0-sha-f9d7c1e
  displayName: NetObserv development catalog
  publisher: Me
  updateStrategy:
    registryPoll:
      interval: 1m

@jotak jotak added the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label Feb 2, 2026
@github-actions

github-actions bot commented Feb 2, 2026

New images:

  • quay.io/netobserv/network-observability-operator:73d01f7
  • quay.io/netobserv/network-observability-operator-bundle:v0.0.0-sha-73d01f7
  • quay.io/netobserv/network-observability-operator-catalog:v0.0.0-sha-73d01f7

They will expire after two weeks.

To deploy this build:

# Direct deployment, from operator repo
IMAGE=quay.io/netobserv/network-observability-operator:73d01f7 make deploy

# Or using operator-sdk
operator-sdk run bundle quay.io/netobserv/network-observability-operator-bundle:v0.0.0-sha-73d01f7

Or as a Catalog Source:

apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: netobserv-dev
  namespace: openshift-marketplace
spec:
  sourceType: grpc
  image: quay.io/netobserv/network-observability-operator-catalog:v0.0.0-sha-73d01f7
  displayName: NetObserv development catalog
  publisher: Me
  updateStrategy:
    registryPoll:
      interval: 1m

Comment on lines +37 to +40
NodeLabels []string `json:"nodeLabels,omitempty"`
NamespaceLabels []string `json:"namespaceLabels,omitempty"`
WorkloadLabels []string `json:"workloadLabels,omitempty"`
KindLabels []string `json:"kindLabels,omitempty"`
Member

Member Author

Oh yeah, I missed that, good catch. Fixed in the last commit.

@github-actions github-actions bot removed the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label Feb 2, 2026
@jotak jotak changed the title Make console plugin controller use health metadata for config, set some default rules as recording NETOBSERV-2596: Make console plugin controller use health metadata for config, set some default rules as recording Feb 2, 2026
@openshift-ci-robot
Collaborator

openshift-ci-robot commented Feb 2, 2026

@jotak: This pull request references NETOBSERV-2596 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the bug to target the "4.22.0" version, but no target version was set.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@jotak jotak added the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label Feb 2, 2026
@github-actions

github-actions bot commented Feb 2, 2026

New images:

  • quay.io/netobserv/network-observability-operator:c37987f
  • quay.io/netobserv/network-observability-operator-bundle:v0.0.0-sha-c37987f
  • quay.io/netobserv/network-observability-operator-catalog:v0.0.0-sha-c37987f

They will expire after two weeks.

To deploy this build:

# Direct deployment, from operator repo
IMAGE=quay.io/netobserv/network-observability-operator:c37987f make deploy

# Or using operator-sdk
operator-sdk run bundle quay.io/netobserv/network-observability-operator-bundle:v0.0.0-sha-c37987f

Or as a Catalog Source:

apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: netobserv-dev
  namespace: openshift-marketplace
spec:
  sourceType: grpc
  image: quay.io/netobserv/network-observability-operator-catalog:v0.0.0-sha-c37987f
  displayName: NetObserv development catalog
  publisher: Me
  updateStrategy:
    registryPoll:
      interval: 1m

Contributor

@leandroberetta leandroberetta left a comment

LGTM, although there are some TODOs.

@openshift-ci openshift-ci bot removed the lgtm label Feb 3, 2026
@github-actions github-actions bot removed the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label Feb 3, 2026
@openshift-ci openshift-ci bot added the lgtm label Feb 3, 2026
@memodi
Member

memodi commented Feb 3, 2026

/label qe-approved

@openshift-ci openshift-ci bot added the qe-approved QE has approved this pull request label Feb 3, 2026
@openshift-ci-robot
Collaborator

openshift-ci-robot commented Feb 3, 2026

@jotak: This pull request references NETOBSERV-2596 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the bug to target the "4.22.0" version, but no target version was set.

@jotak
Member Author

jotak commented Feb 4, 2026

/cherry-pick release-1.11

@openshift-cherrypick-robot

@jotak: once the present PR merges, I will cherry-pick it on top of release-1.11 in a new PR and assign it to you.

Details

In response to this:

/cherry-pick release-1.11

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@jotak jotak merged commit 21e6368 into netobserv:main Feb 4, 2026
14 of 15 checks passed
@openshift-cherrypick-robot

@jotak: new pull request created: #2413


Labels

jira/valid-reference lgtm qe-approved QE has approved this pull request

7 participants