fix: readiness probe check moved to background#413

Open
imightbuyaboat wants to merge 3 commits into mittwald:master from MEDIASCOPE-JSC:feature/background-readiness

@imightbuyaboat

To reproduce the issue, a ConfigMap of the following form was created:

apiVersion: v1
kind: ConfigMap
metadata:
  name: test-replicator-cm-1
  namespace: test-replicator
  annotations:
    replicator.v1.mittwald.de/replicate-to-matching: replicate-test-replicate-cm
data:
  ...

along with 200 namespaces of the following form:

apiVersion: v1
kind: Namespace
metadata:
  name: test-ns-1
  labels:
    replicate-test-replicate-cm: "true"

When many fields of the source ConfigMap were changed, the replicator could not finish synchronizing it within 60 seconds while handling a readiness probe request. As a result, the readiness probe check failed and the replicator restarted:

time="2026-01-26T12:18:44Z" level=info msg="Readiness probe: syncing for all replicators: 39.703885888 s"
time="2026-01-26T12:18:44Z" level=info msg="Readiness probe: syncing for replicator ServiceAccount: 7.91e-07 s"
time="2026-01-26T12:18:44Z" level=info msg="Readiness probe: syncing for replicator RoleBinding: 5.41e-07 s"
time="2026-01-26T12:18:44Z" level=info msg="Readiness probe: syncing for replicator Role: 6.41e-07 s"
time="2026-01-26T12:18:44Z" level=info msg="Readiness probe: syncing for replicator ConfigMap: 39.703815402000004 s"
time="2026-01-26T12:18:05Z" level=info msg="Readiness probe: syncing for replicator Secret: 2.303e-06 s"

The readiness probe failure was reported as:

Readiness probe failed: Get "http://..../readyz": context deadline exceeded (Client.Timeout exceeded while awaiting headers)

To fix this, replicator synchronization is moved to a background process that writes its result to the notReady variable. The readiness probe handler only reads the current value of notReady and returns it, so it no longer waits for synchronization to finish.

A -components-sync-period parameter has also been added; it sets the interval between replicator synchronization runs.
