Add support for externally managed members #1214
CaptainIRS wants to merge 6 commits into gardener:master
How to categorize this PR?
/area control-plane
/area high-availability
/kind enhancement
/kind api-change
What this PR does / why we need it:
This pull request introduces a new mode of operation in `etcd-druid` to support etcd clusters whose members are managed by an external actor, rather than by `etcd-druid` via the `StatefulSet` controller. This is primarily to enable use cases like Gardener's self-hosted shoot clusters (GEP-28), where a tool like `gardenadm` is responsible for deploying and managing etcd members as static pods on the control plane nodes.

To enable this, a new field, `spec.externallyManagedMemberAddresses`, is introduced in the `Etcd` API. When this field is populated with a list of member IP addresses, `etcd-druid`'s behavior changes as follows:

- `etcd-druid` no longer manages the lifecycle of etcd pods. It creates the `StatefulSet` with `replicas=0` to serve as a template but does not create any pods itself.
- `POD_NAME` is set to `<etcd-name>-<ip-address>` so that the external tool can spin up `etcd` pods with the `etcd-backup-restore` sidecar using the template.
- The `initial-cluster` and `advertised-<peer|client>-urls` configuration in the `ConfigMap` is populated using the IPs from `spec.externallyManagedMemberAddresses`.
- Members are named after their IP addresses (e.g., `etcd-main-192.168.0.1`) instead of StatefulSet ordinals.
- No `Service`s are created.
- No `PodDisruptionBudget` is created.

`etcd-druid` continues to provide value by:

- Generating the etcd `ConfigMap`.
- Handling maintenance tasks (`etcd-backup-restore` full/delta snapshots, compaction, de-fragmentation, etc.).

This approach provides a generic mechanism to integrate `etcd-druid` with diverse deployment strategies beyond the default `StatefulSet`-based model, making it more flexible.

This work supersedes the previous approach in PR #1117, opting for a more explicit API field (`spec.externallyManagedMemberAddresses`) over an annotation, along with design changes that overcome the limitations of the previous approach.

Which issue(s) this PR fixes:
Part of #1071
Supersedes #1117
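As a rough illustration of the IP-based naming and `initial-cluster` configuration described above, the following Go sketch derives member names of the form `<etcd-name>-<ip-address>` and joins them into an `initial-cluster` style string. It is not taken from the PR: the function names, the `https` scheme, and the peer port `2380` (etcd's conventional default) are assumptions made for the example.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
	"strings"
)

// memberName follows the <etcd-name>-<ip-address> scheme from the PR description.
// The helper itself is hypothetical and not part of etcd-druid.
func memberName(etcdName, ip string) string {
	return fmt.Sprintf("%s-%s", etcdName, ip)
}

// initialCluster builds a comma-separated "name=peerURL" list from member IPs.
// The scheme and peer port are passed in because the real values are assumptions here.
func initialCluster(etcdName string, ips []string, scheme string, peerPort int) (string, error) {
	entries := make([]string, 0, len(ips))
	for _, ip := range ips {
		// Reject anything that is not a literal IP address.
		if net.ParseIP(ip) == nil {
			return "", fmt.Errorf("invalid IP address %q", ip)
		}
		hostPort := net.JoinHostPort(ip, strconv.Itoa(peerPort))
		entries = append(entries, fmt.Sprintf("%s=%s://%s", memberName(etcdName, ip), scheme, hostPort))
	}
	return strings.Join(entries, ","), nil
}

func main() {
	ips := []string{"192.168.0.1", "192.168.0.2", "192.168.0.3"}
	cluster, err := initialCluster("etcd-main", ips, "https", 2380)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(cluster)
}
```

Because the names and URLs are derived purely from the address list, no Kubernetes `Service` DNS names are needed for peer discovery in this sketch.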
Special notes for your reviewer:
The core logic is triggered by the presence of the `spec.externallyManagedMemberAddresses` field. Please pay close attention to the validation rules for this new field, as they are crucial for preventing unsupported transitions:

- […] `spec.replicas`.
- […] `Etcd` resource.
- […] `etcd-druid` […] `etcd-druid`-managed one.

When `externallyManagedMemberAddresses` is set, `etcd-druid` effectively transitions from a pod lifecycle manager to a configuration and maintenance provider for an externally managed cluster.

Release note:
A new field `spec.externallyManagedMemberAddresses` has been added to the `Etcd` API. When this field is specified with a list of IP addresses, `etcd-druid` enters a new operational mode where it does not manage the etcd member pods directly. In this mode, `etcd-druid` will not create pods, services, or `PodDisruptionBudget`s. Instead, it will generate the etcd `ConfigMap` with a configuration tailored for the provided member IP addresses, enabling peer communication without relying on Kubernetes services. This feature allows external actors, such as `gardenadm`, to manage the lifecycle of etcd members (e.g., as static pods) while still leveraging `etcd-druid` for configuration generation, status reporting, and other management tasks.
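The mode switch summarized in the release note could be sketched in Go as follows. This is a simplified illustration, not the PR's implementation: `EtcdSpec`, `Plan`, `isExternallyManaged`, and `planFor` are hypothetical names, and the spec type is a trimmed-down stand-in for the real `Etcd` API type.

```go
package main

import "fmt"

// EtcdSpec is a hypothetical, trimmed-down stand-in for the Etcd API spec.
type EtcdSpec struct {
	Replicas                         int32
	ExternallyManagedMemberAddresses []string
}

// Plan records which resources the operator would manage (hypothetical type).
type Plan struct {
	StatefulSetReplicas       int32
	CreateServices            bool
	CreatePodDisruptionBudget bool
	GenerateConfigMap         bool
}

// isExternallyManaged reports whether the new field is populated.
func isExternallyManaged(spec EtcdSpec) bool {
	return len(spec.ExternallyManagedMemberAddresses) > 0
}

// planFor maps the spec to the behavior described in the PR text:
// externally managed clusters keep the StatefulSet at zero replicas and
// skip Services and the PodDisruptionBudget, but still get a ConfigMap.
func planFor(spec EtcdSpec) Plan {
	if isExternallyManaged(spec) {
		return Plan{StatefulSetReplicas: 0, GenerateConfigMap: true}
	}
	return Plan{
		StatefulSetReplicas:       spec.Replicas,
		CreateServices:            true,
		CreatePodDisruptionBudget: true,
		GenerateConfigMap:         true,
	}
}

func main() {
	external := EtcdSpec{
		Replicas:                         3,
		ExternallyManagedMemberAddresses: []string{"192.168.0.1", "192.168.0.2", "192.168.0.3"},
	}
	fmt.Printf("%+v\n", planFor(external))
	fmt.Printf("%+v\n", planFor(EtcdSpec{Replicas: 3}))
}
```

Keying the decision on the presence of the field, rather than on a separate flag, matches the description's statement that the core logic is triggered by the field being set.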