Commit 3b1d687

feat(opentelemetry) add failover (#591)
* feat(opentelemetry) add failover
* fix indentation
* update secrets
* update docs, test-values
* change endpoint type to string
* update docs
1 parent be02fe0 commit 3b1d687

7 files changed: +93 -26 lines changed

opentelemetry/README.md

Lines changed: 13 additions & 5 deletions
@@ -10,7 +10,7 @@ The main terminologies used in this document can be found in [core-concepts](htt
 
 OpenTelemetry is an observability framework and toolkit for creating and managing telemetry data such as metrics, logs and traces. Unlike other observability tools, OpenTelemetry is vendor and tool agnostic, meaning it can be used with a variety of observability backends, including open source tools such as _OpenSearch_ and _Prometheus_.
 
-The focus of the plugin is to provide easy-to-use configurations for common use cases of receiving, processing and exporting telemetry data in Kubernetes. The storage and visualization of the same is intentionally left to other tools.
+The focus of the Plugin is to provide easy-to-use configurations for common use cases of receiving, processing and exporting telemetry data in Kubernetes. The storage and visualization of the same is intentionally left to other tools.
 
 Components included in this Plugin:
 
@@ -21,6 +21,7 @@ Components included in this Plugin:
 - [k8sevents Receiver](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/receiver/k8seventsreceiver)
 - [journald Receiver](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/receiver/journaldreceiver)
 - [prometheus/internal](https://opentelemetry.io/docs/collector/internal-telemetry/)
+- [Connector](https://opentelemetry.io/docs/collector/building/connector/)
 - [OpenSearch Exporter](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/exporter/opensearchexporter)
 
 ## Architecture
@@ -45,7 +46,7 @@ This guide provides a quick and straightforward way to use **OpenTelemetry** as
 **Step 1:**
 
 You can install the `OpenTelemetry` package in your cluster by installing it with [Helm](https://helm.sh/docs/helm/helm_install) manually or let the Greenhouse platform lifecycle do it for you automatically. For the latter, you can either:
-1. Go to Greenhouse dashboard and select the **OpenTelemetry** plugin from the catalog. Specify the cluster and required option values.
+1. Go to Greenhouse dashboard and select the **OpenTelemetry** Plugin from the catalog. Specify the cluster and required option values.
 2. Create and specify a `Plugin` resource in your Greenhouse central cluster according to the [examples](#examples).
 
 **Step 2:**
@@ -70,6 +71,10 @@ Based on the backend selection the telemetry data will be exporter to the backen
 
 Greenhouse regularly performs integration tests that are bundled with **OpenTelemetry**. These provide feedback on whether all the necessary resources are installed and continuously up and running. You will find messages about this in the plugin status and also in the Greenhouse dashboard.
 
+## Failover Connector
+
+The **OpenTelemetry** Plugin comes with a [Failover Connector](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/connector/failoverconnector) for OpenSearch for two users. The connector will periodically try to establish a stable connection for the prefered user (`failover_username_a`) and in case of a failed try, the connector will try to establish a connection with the fallback user (`failover_username_b`). This feature can be used to secure the shipping of logs in case of expiring credentials or password rotation.
+
 ## Configuration
 
 | Name | Description | Type | required |
@@ -78,9 +83,12 @@ Greenhouse regularly performs integration tests that are bundled with **OpenTele
 `openTelemetry.logsCollector.kvmConfig.enabled` | Activates the configuration for KVM logs (requires logsCollector to be enabled) | bool | `false` |
 `openTelemetry.logsCollector.cephConfig.enabled` | Activates the configuration for Ceph logs (requires logsCollector to be enabled) | bool | `false` |
 `openTelemetry.metricsCollector.enabled` | Activates the standard configuration for metrics | bool | `false` |
-`openTelemetry.openSearchLogs.username` | Username for OpenSearch endpoint | secret | `false` |
-`openTelemetry.openSearchLogs.password` | Password for OpenSearch endpoint | secret | `false` |
-`openTelemetry.openSearchLogs.endpoint` | Endpoint URL for OpenSearch | secret | `false` |
+`openTelemetry.openSearchLogs.failover_username_a` | Username for OpenSearch endpoint | secret | `true` |
+`openTelemetry.openSearchLogs.failover_password_a` | Password for OpenSearch endpoint | secret | `true` |
+`openTelemetry.openSearchLogs.failover_username_b` | Second Username (as a failover) for OpenSearch endpoint | secret | `true` |
+`openTelemetry.openSearchLogs.failover_password_b` | Second Password (as a failover) for OpenSearch endpoint | secret | `true` |
+`openTelemetry.openSearchLogs.endpoint` | Endpoint URL for OpenSearch | string | `true` |
+`openTelemetry.openSearchLogs.index` | Name for OpenSearch index | string | `false` |
 `openTelemetry.region` | Region label for logging | string | `false` |
 `openTelemetry.cluster` | Cluster label for logging | string | `false` |
 `openTelemetry.prometheus.additionalLabels` | Label selector for Prometheus resources to be picked-up by the operator | map | `false` |
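
For orientation, the options in the README table above could be supplied as Helm values roughly as follows. This is a minimal sketch, not part of the commit: the nesting follows the `.Values.openTelemetry.openSearchLogs.*` references in the chart templates, while the endpoint URL, user names, passwords, index, cluster and region are hypothetical placeholders.

```yaml
# Hypothetical override file; every value below is a placeholder.
openTelemetry:
  openSearchLogs:
    endpoint: https://opensearch.example.com:9200
    # Preferred credentials, tried first by the failover connector.
    failover_username_a: logs-writer-a
    failover_password_a: change-me-a
    # Fallback credentials, used when the preferred user fails.
    failover_username_b: logs-writer-b
    failover_password_b: change-me-b
    # The templates append "-datastream", so logs would go to "my-logs-datastream".
    index: my-logs
  cluster: my-cluster
  region: my-region
```

Because the index name now comes from `index` rather than from the user name, both users in this sketch would write to the same data stream.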

opentelemetry/chart/Chart.yaml

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@
 apiVersion: v2
 appVersion: v0.116.0
 name: opentelemetry-operator
-version: 0.7.2
+version: 0.7.3
 description: OpenTelemetry Operator Helm chart for Kubernetes
 icon: https://raw.githubusercontent.com/cncf/artwork/a718fa97fffec1b9fd14147682e9e3ac0c8817cb/projects/opentelemetry/icon/color/opentelemetry-icon-color.png
 type: application

opentelemetry/chart/ci/test-values.yaml

Lines changed: 5 additions & 2 deletions
@@ -33,8 +33,11 @@ opentelemetry-operator:
 openTelemetry:
 openSearchLogs:
 endpoint: test
-username: test
-password: test
+failover_username_a: test
+failover_password_a: test
+failover_username_b: test
+failover_password_b: test
+index: test
 cluster: test
 region: test
 logsCollector:

opentelemetry/chart/templates/logs-collector.yaml

Lines changed: 47 additions & 9 deletions
@@ -35,6 +35,8 @@ spec:
 value: "{{ .Values.openTelemetry.cluster }}"
 - name: region
 value: "{{ .Values.openTelemetry.region }}"
+- name: index
+value: "{{ .Values.openTelemetry.openSearchLogs.index }}"
 envFrom:
 - secretRef:
 name: otel-basic-auth
@@ -149,6 +151,17 @@ spec:
 key: log.type
 value: "k8sevents"
 
+attributes/failover_username_a:
+actions:
+- action: insert
+key: failover_username_opensearch
+value: ${failover_username_a}
+attributes/failover_username_b:
+actions:
+- action: insert
+key: failover_username_opensearch
+value: ${failover_username_b}
+
 transform/journal:
 error_mode: ignore
 log_statements:
@@ -288,28 +301,45 @@ spec:
 exporters:
 debug:
 verbosity: basic
-opensearch/logs:
+opensearch/failover_a:
 http:
 auth:
-authenticator: basicauth
+authenticator: basicauth/failover_a
 endpoint: {{ .Values.openTelemetry.openSearchLogs.endpoint }}
-logs_index: ${username}-datastream
+logs_index: ${index}-datastream
+opensearch/failover_b:
+http:
+auth:
+authenticator: basicauth/failover_b
+endpoint: {{ .Values.openTelemetry.openSearchLogs.endpoint }}
+logs_index: ${index}-datastream
 
 prometheus:
 endpoint: 0.0.0.0:9999
 
 extensions:
-basicauth:
+basicauth/failover_a:
 client_auth:
-username: ${username}
-password: ${password}
+username: ${failover_username_a}
+password: ${failover_password_a}
+basicauth/failover_b:
+client_auth:
+username: ${failover_username_b}
+password: ${failover_password_b}
 
 connectors:
 forward: {}
-
+failover:
+priority_levels:
+- [logs/failover_a]
+- [logs/failover_b]
+retry_interval: 2m
+retry_gap: 30s
+max_retries: 0
 service:
 extensions:
-- basicauth
+- basicauth/failover_a
+- basicauth/failover_b
 {{- if .Values.openTelemetry.prometheus.podMonitor.enabled }}
 telemetry:
 metrics:
@@ -320,7 +350,15 @@ spec:
 logs/forward:
 receivers: [forward]
 processors: [batch]
-exporters: [opensearch/logs]
+exporters: [failover]
+logs/failover_a:
+receivers: [failover]
+processors: [attributes/failover_username_a]
+exporters: [opensearch/failover_a]
+logs/failover_b:
+receivers: [failover]
+processors: [attributes/failover_username_b]
+exporters: [opensearch/failover_b]
 logs/containerd:
 receivers: [filelog/containerd]
 processors: [k8sattributes,attributes/cluster,transform/ingress]
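
Since indentation is not visible in the hunks above, the failover wiring introduced in this file can be read as the following collector configuration fragment. It is a sketch reassembled from the added lines: the keys and values are taken from the diff, whereas the nesting (including placing the pipelines under `service.pipelines`) is assumed from the standard OpenTelemetry Collector configuration layout. The comments on the retry settings reflect one reading of the failover connector documentation and should be verified against the connector version in use.

```yaml
connectors:
  forward: {}
  failover:
    # Levels are listed in priority order: failover_a is preferred, failover_b is the fallback.
    priority_levels:
      - [logs/failover_a]
      - [logs/failover_b]
    retry_interval: 2m   # assumed: how often a return to a higher-priority level is attempted
    retry_gap: 30s       # assumed: pause between trying two levels within one retry interval
    max_retries: 0       # assumed: 0 disables the per-level retry limit

service:
  pipelines:
    # All forwarded logs are handed to the failover connector instead of a single exporter.
    logs/forward:
      receivers: [forward]
      processors: [batch]
      exporters: [failover]
    # Each priority level is its own pipeline, with its own credentials and exporter.
    logs/failover_a:
      receivers: [failover]
      processors: [attributes/failover_username_a]
      exporters: [opensearch/failover_a]
    logs/failover_b:
      receivers: [failover]
      processors: [attributes/failover_username_b]
      exporters: [opensearch/failover_b]
```

Both levels target the same endpoint and index, so the only difference between them is the `basicauth` extension used and the `failover_username_opensearch` attribute stamped on each log record.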

opentelemetry/chart/templates/secret.yaml

Lines changed: 5 additions & 2 deletions
@@ -13,5 +13,8 @@ metadata:
 {{ toYaml .Values.openTelemetry.customLabels | nindent 4 }}
 {{- end }}
 data:
-password: {{ .Values.openTelemetry.openSearchLogs.password | b64enc | quote }}
-username: {{ .Values.openTelemetry.openSearchLogs.username | b64enc | quote }}
+failover_username_a: {{ .Values.openTelemetry.openSearchLogs.failover_username_a | b64enc | quote }}
+failover_password_a: {{ .Values.openTelemetry.openSearchLogs.failover_password_a | b64enc | quote }}
+failover_username_b: {{ .Values.openTelemetry.openSearchLogs.failover_username_b | b64enc | quote }}
+failover_password_b: {{ .Values.openTelemetry.openSearchLogs.failover_password_b | b64enc | quote }}
+

opentelemetry/chart/values.yaml

Lines changed: 5 additions & 2 deletions
@@ -41,8 +41,11 @@ openTelemetry:
 
 openSearchLogs:
 endpoint:
-username:
-password:
+failover_username_a:
+failover_password_a:
+failover_username_b:
+failover_password_b:
+index:
 cluster:
 region:
 logsCollector:

opentelemetry/plugindefinition.yaml

Lines changed: 17 additions & 5 deletions
@@ -6,14 +6,14 @@ kind: PluginDefinition
 metadata:
 name: opentelemetry
 spec:
-version: 0.7.2
+version: 0.7.3
 displayName: OpenTelemetry
 description: Observability framework for instrumenting, generating, collecting, and exporting telemetry data such as traces, metrics, and logs.
 icon: https://raw.githubusercontent.com/cloudoperators/greenhouse-extensions/main/opentelemetry/logo.png
 helmChart:
 name: opentelemetry-operator
 repository: oci://ghcr.io/cloudoperators/greenhouse-extensions/charts
-version: 0.7.2
+version: 0.7.3
 options:
 - default: true
 description: Activates the standard configuration for logs
@@ -36,17 +36,29 @@ spec:
 required: false
 type: bool
 - description: Username for OpenSearch endpoint
-name: openTelemetry.openSearchLogs.username
+name: openTelemetry.openSearchLogs.failover_username_a
 required: true
-type: secret
+type: string
 - description: Password for OpenSearch endpoint
-name: openTelemetry.openSearchLogs.password
+name: openTelemetry.openSearchLogs.failover_password_a
+required: true
+type: secret
+- description: Second Username (as a failover) for OpenSearch endpoint
+name: openTelemetry.openSearchLogs.failover_username_b
+required: true
+type: secret
+- description: Second Password (as a failover) for OpenSearch endpoint
+name: openTelemetry.openSearchLogs.failover_password_b
 required: true
 type: secret
 - description: Endpoint URL for OpenSearch
 name: openTelemetry.openSearchLogs.endpoint
 required: false
 type: string
+- description: Name for OpenSearch index
+name: openTelemetry.openSearchLogs.index
+required: false
+type: string
 - description: Region label for logging
 name: openTelemetry.region
 required: false
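
A Greenhouse `Plugin` resource that sets the options defined above might look like the sketch below. It rests on assumptions that are not confirmed by this commit: the `greenhouse.sap/v1alpha1` API version and the `optionValues` / `valueFrom.secret` shape follow the Greenhouse Plugin API as commonly documented, and the namespace, cluster name, Secret name, Secret keys and all values are hypothetical. Options typed `secret` in the PluginDefinition are read from a Kubernetes Secret here, while `failover_username_a`, typed `string` in this commit, is given inline.

```yaml
apiVersion: greenhouse.sap/v1alpha1   # assumed API group/version
kind: Plugin
metadata:
  name: opentelemetry
  namespace: my-organization          # hypothetical organization namespace
spec:
  pluginDefinition: opentelemetry
  clusterName: my-cluster             # hypothetical target cluster
  optionValues:
    - name: openTelemetry.openSearchLogs.endpoint
      value: https://opensearch.example.com:9200
    - name: openTelemetry.openSearchLogs.index
      value: my-logs
    - name: openTelemetry.openSearchLogs.failover_username_a
      value: logs-writer-a
    - name: openTelemetry.openSearchLogs.failover_password_a
      valueFrom:
        secret:
          name: opensearch-credentials   # hypothetical Secret
          key: password_a
    - name: openTelemetry.openSearchLogs.failover_username_b
      valueFrom:
        secret:
          name: opensearch-credentials
          key: username_b
    - name: openTelemetry.openSearchLogs.failover_password_b
      valueFrom:
        secret:
          name: opensearch-credentials
          key: password_b
```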
