Table of Contents
- Cluster Provisioning
- Monitor the Install Job
- Managed DNS
- Cluster Adoption
- Configuration Management
- Cluster Deprovisioning
Cluster provisioning begins when a caller creates a ClusterDeployment CR, which is the core Hive resource used to control the lifecycle of a cluster and the Hive API entrypoint.
Hive comes with an optional hiveutil binary to assist in creating the ClusterDeployment and its dependencies. See the hiveutil documentation for more information.
For clouds with support for automated IP allocation and DNS configuration (AWS, Azure, IBM Cloud, and GCP), an OpenShift installation requires a live and functioning DNS zone in the cloud account into which you will be installing the new cluster(s). For example, if you own example.com, you could create a hive.example.com subdomain in Route53 and ensure that you have made the appropriate NS entries under example.com to delegate to the Route53 zone. When creating a new cluster, the installer will create DNS entries under hive.example.com as needed for the cluster(s).
In addition to the default OpenShift DNS support, Hive offers a DNS feature called Managed DNS. With Managed DNS, Hive can automatically create delegated zones for approved base domains. For example, if hive.example.com exists and is specified as your managed domain, you can specify a base domain of cluster1.hive.example.com on your ClusterDeployment, and Hive will create this zone for you, add forwarding records in the base domain, wait for it to resolve, and then proceed with installation. Read here for more details.
For other platforms/clouds (OpenStack and vSphere), there is presently no native DNS auto-configuration available, so some up-front DNS configuration is required before a cluster can be installed. It will typically be necessary to reserve virtual IPs (VIPs) that will be used for the cluster's management (e.g. api.mycluster.hive.example.com) and for the cluster's default ingress routes (e.g. \*.apps.mycluster.hive.example.com). Each platform/cloud has its own system for allocating or reserving these IPs. Once the IPs are reserved, DNS entries must be published as A records (or local host entries must be added on the host(s) running Hive to manage the DNS-to-IP translations) so that the cluster's API endpoint will be accessible to Hive.
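The two DNS names that need records can be derived from the cluster name and base domain. The following is a minimal sketch, not part of Hive; the function name `required_dns_names` is hypothetical and only illustrates the naming pattern described above.

```python
def required_dns_names(cluster_name: str, base_domain: str) -> dict:
    """Return the DNS names that must resolve to the reserved VIPs.

    Hypothetical helper for illustration: it only builds the names,
    it does not create or verify any DNS records.
    """
    cluster_domain = f"{cluster_name}.{base_domain}"
    return {
        # Name for the cluster's management (API) VIP.
        "api": f"api.{cluster_domain}",
        # Wildcard name for the cluster's default ingress VIP.
        "ingress": f"*.apps.{cluster_domain}",
    }


if __name__ == "__main__":
    for role, name in required_dns_names("mycluster", "hive.example.com").items():
        print(f"{role}: {name}")
```

Each name would then be published as an A record pointing at the corresponding reserved VIP.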
OpenShift installation requires a pull secret obtained from try.openshift.com. You can specify an individual pull secret for each cluster Hive creates, or you can use a global pull secret that will be used by all of the clusters Hive creates.
oc create secret generic mycluster-pull-secret --from-file=.dockerconfigjson=/path/to/pull-secret --type=kubernetes.io/dockerconfigjson --namespace mynamespace
apiVersion: v1
data:
.dockerconfigjson: REDACTED
kind: Secret
metadata:
name: mycluster-pull-secret
namespace: mynamespace
type: kubernetes.io/dockerconfigjson
When a global pull secret is defined in the hive namespace and a ClusterDeployment-specific pull secret is specified, the registry authentication in both secrets will be merged and used by the new OpenShift cluster.
When a registry exists in both pull secrets, precedence will be given to the contents of the cluster-specific pull secret.
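The merge rule can be illustrated with a short sketch. This is not Hive's implementation; it assumes both secrets hold the standard dockerconfigjson layout (a top-level "auths" object keyed by registry), and the function name `merge_pull_secrets` is hypothetical.

```python
import json


def merge_pull_secrets(global_secret: str, cluster_secret: str) -> str:
    """Merge two dockerconfigjson documents.

    Registries from both inputs are kept; when a registry appears in
    both, the cluster-specific entry replaces the global one.
    """
    merged = dict(json.loads(global_secret).get("auths", {}))
    # Cluster-specific entries are applied last, so they take precedence.
    merged.update(json.loads(cluster_secret).get("auths", {}))
    return json.dumps({"auths": merged}, sort_keys=True)


if __name__ == "__main__":
    global_cfg = '{"auths": {"quay.io": {"auth": "global"}, "registry.redhat.io": {"auth": "global"}}}'
    cluster_cfg = '{"auths": {"quay.io": {"auth": "cluster"}}}'
    print(merge_pull_secrets(global_cfg, cluster_cfg))
```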
The global pull secret must live in the hive namespace and is referenced in the HiveConfig.
oc create secret generic global-pull-secret --from-file=.dockerconfigjson=/path/to/pull-secret --type=kubernetes.io/dockerconfigjson --namespace hive
apiVersion: v1
data:
.dockerconfigjson: REDACTED
kind: Secret
metadata:
name: global-pull-secret
namespace: hive
type: kubernetes.io/dockerconfigjson
oc patch hiveconfig hive --type=merge --patch '{"spec": {"globalPullSecretRef": {"name": "global-pull-secret"}}}'
spec:
globalPullSecretRef:
name: global-pull-secret
Hive needs to know what version of OpenShift to install. A Hive cluster represents available versions via the ClusterImageSet resource, and there can be multiple ClusterImageSets available. Each ClusterImageSet references an OpenShift release image. A ClusterDeployment references a ClusterImageSet via the spec.provisioning.imageSetRef property.
Alternatively, you can specify an individual OpenShift release image in the ClusterDeployment spec.provisioning.releaseImage property.
An example ClusterImageSet:
apiVersion: hive.openshift.io/v1
kind: ClusterImageSet
metadata:
name: openshift-v4.3.0
spec:
releaseImage: quay.io/openshift-release-dev/ocp-release:4.3.0-x86_64
Hive requires credentials to the cloud account into which it will install OpenShift clusters. Refer to the installer documentation for the required level of permissions for each cloud.
Create a secret containing your AWS access key and secret access key:
oc create secret generic <mycluster>-aws-creds -n hive --from-literal=aws_access_key_id=<AWS_ACCESS_KEY_ID> --from-literal=aws_secret_access_key=<AWS_SECRET_ACCESS_KEY>
Take care when using the YAML below: the data values must be base64-encoded.
apiVersion: v1
data:
aws_access_key_id: REDACTED
aws_secret_access_key: REDACTED
kind: Secret
metadata:
name: mycluster-aws-creds
namespace: mynamespace
type: Opaque
Create a secret containing your Azure service principal:
apiVersion: v1
data:
osServicePrincipal.json: REDACTED
kind: Secret
metadata:
name: mycluster-azure-creds
namespace: mynamespace
type: Opaque
Create a secret containing your GCP service account key:
apiVersion: v1
data:
osServiceAccount.json: REDACTED
kind: Secret
metadata:
name: mycluster-gcp-creds
namespace: mynamespace
type: Opaque
Create a secret containing your IBM Cloud API key:
apiVersion: v1
stringData:
ibmcloud_api_key: IBMCLOUDAPIKEY
kind: Secret
metadata:
name: mycluster-ibm-creds
namespace: mynamespace
type: Opaque
IBM Cloud credential secrets must be provided as manifests for installation. Follow instructions for using ccoctl to generate IBM Cloud service IDs and place manifests generated from running ccoctl ibmcloud create-service-id within a secret that will be referenced by the ClusterDeployment.
Create a manifests secret containing secrets generated by ccoctl:
oc create secret generic mycluster-manifests -n mynamespace --from-file=<manifests directory>
Create a secret containing your vSphere credentials information:
apiVersion: v1
stringData:
password: secretpassword
username: vsphereuser
kind: Secret
metadata:
name: mycluster-vsphere-creds
namespace: mynamespace
type: Opaque
Create a secret containing your vSphere CA certificate.
- From the vCenter home page, download the vCenter’s root CA certificates. Click Download trusted root CA certificates in the vSphere Web Services SDK section. Download, wget or curl the /certs/download.zip file.
wget https://<vCenter>/certs/download.zip
- Extract the compressed file that contains the vCenter root CA certificates. The contents of the compressed file resemble the following file structure:
certs
├── lin
│ ├── 108f4d17.0
│ ├── 108f4d17.r1
│ ├── 7e757f6a.0
│ ├── 8e4f8471.0
│ └── 8e4f8471.r0
├── mac
│ ├── 108f4d17.0
│ ├── 108f4d17.r1
│ ├── 7e757f6a.0
│ ├── 8e4f8471.0
│ └── 8e4f8471.r0
└── win
├── 108f4d17.0.crt
├── 108f4d17.r1.crl
├── 7e757f6a.0.crt
├── 8e4f8471.0.crt
└── 8e4f8471.r0.crl
3 directories, 15 files
- Create a single file by concatenating all the files in certs/lin. Save the file somewhere permanent -- you'll need it for each vSphere cluster you want to create.
cat certs/lin/* > /home/me/vsphere/ca.cert
Create a secret containing the combined CA bundle data within a .cacert key:
oc create secret generic mycluster-vsphere-certs --from-file=.cacert=/home/me/vsphere/ca.cert
apiVersion: v1
stringData:
.cacert: |
-----BEGIN CERTIFICATE-----
CA BUNDLE DATA HERE
-----END CERTIFICATE-----
kind: Secret
metadata:
name: mycluster-vsphere-certs
namespace: mynamespace
type: Opaque
Create a secret containing your OpenStack clouds.yaml file:
apiVersion: v1
data:
clouds.yaml: REDACTED
kind: Secret
metadata:
name: mycluster-openstack-creds
namespace: mynamespace
type: Opaque
To provision an OpenShift cluster on Nutanix using Hive, you must provide the necessary cloud credentials. These credentials are used by Hive to interact with the Nutanix environment and perform cluster provisioning operations.
Hive requires the following credentials for Nutanix:
- Prism Central Username: The username with sufficient privileges to create and manage virtual machines.
- Prism Central Password: The password associated with the provided username.
The Nutanix credentials must be stored as a Kubernetes secret in the namespace where Hive operates. Create a secret with the following format:
apiVersion: v1
kind: Secret
metadata:
name: nutanix-cloud-credentials
namespace: hive
type: Opaque
data:
username: <base64-encoded-username>
password: <base64-encoded-password>
To create the secret using oc, first encode the values in Base64:
echo -n "<value>" | base64
Then, apply the secret using:
oc apply -f nutanix-cloud-credentials.yaml
In addition to the Hive credentials, OpenShift requires additional secrets in specific namespaces for authentication with Nutanix. These credentials can be created manually or by using the CCO utility (ccoctl) to generate the credential Secret manifests for the OpenShift installer. (See the following link for more details)
This secret is required by the OpenShift Machine API to manage machines on Nutanix:
apiVersion: v1
kind: Secret
metadata:
name: nutanix-credentials
namespace: openshift-machine-api
type: Opaque
stringData:
credentials: |
[{"type":"basic_auth","data":{"prismCentral":{"username":"${NUTANIX_USERNAME}","password":"${NUTANIX_PASSWORD}"}}}]
This secret is required by the OpenShift Cloud Controller Manager to integrate OpenShift with Nutanix infrastructure:
apiVersion: v1
kind: Secret
metadata:
name: nutanix-credentials
namespace: openshift-cloud-controller-manager
type: Opaque
stringData:
credentials: |
[{"type":"basic_auth","data":{"prismCentral":{"username":"${NUTANIX_USERNAME}","password":"${NUTANIX_PASSWORD}"}}}]
- Hive Secret: Used by Hive for provisioning clusters and managing resources.
- Machine API Secret: Required for the OpenShift Machine API to manage and create worker nodes on Nutanix.
- Cloud Controller Manager Secret: Enables OpenShift to interact with Nutanix for networking, load balancing, and other infrastructure-related tasks.
Each of these secrets plays a critical role in ensuring a seamless integration between OpenShift and Nutanix, allowing for automated cluster deployment and lifecycle management.
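The credentials value used by the Machine API and Cloud Controller Manager secrets above is a JSON array. As a rough sketch, it could be rendered from a username and password as follows; the function name `nutanix_credentials_payload` is hypothetical, and the structure simply mirrors the manifests shown above.

```python
import json


def nutanix_credentials_payload(username: str, password: str) -> str:
    """Render the JSON string for the 'credentials' key of the
    nutanix-credentials secrets, matching the layout shown above."""
    payload = [
        {
            "type": "basic_auth",
            "data": {
                "prismCentral": {
                    "username": username,
                    "password": password,
                }
            },
        }
    ]
    # Compact separators reproduce the single-line form used in the manifests.
    return json.dumps(payload, separators=(",", ":"))


if __name__ == "__main__":
    print(nutanix_credentials_payload("admin", "s3cret"))
```

Using json.dumps rather than string substitution keeps the value valid JSON even when the password contains quotes or backslashes.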
Once the secret is created, reference it in the ClusterDeployment or ClusterPool configuration:
spec:
platform:
nutanix:
credentialsSecretRef:
name: nutanix-cloud-credentials
This ensures that Hive can retrieve the necessary credentials to interact with Nutanix for cluster provisioning.
- Ensure that the Nutanix user has appropriate permissions to create and manage virtual machines, networks, and other resources required by OpenShift installation and management.
- Verify that network connectivity exists between the OpenShift cluster nodes and the Nutanix infrastructure endpoints (Prism Central and Prism Elements).
- If the Nutanix Prism Central uses certificates that are not trusted by default (such as those signed by a private certificate authority), additional TLS configuration may be required.
- During installation (day 0), Prism Central certificates can be trusted by specifying an additionalTrustBundle in the install-config.yaml. After installation (day 2), ongoing communication by Hive requires configuring a certificatesSecretRef in the ClusterDeployment platform configuration.
- If both additionalTrustBundle and certificatesSecretRef are provided, they can reference different certificate bundles if needed. Otherwise, the certificates from certificatesSecretRef will be used for both installation and day 2 operations.
By setting up these credentials correctly, Hive will be able to deploy OpenShift clusters on Nutanix efficiently.
When using certificates to establish trust with Nutanix Prism Central, Hive handles certificates in the following ways depending on the configuration:
- Case 1: additionalTrustBundle set in install-config.yaml, no certificatesSecretRef in Hive
  The installer will use the provided trust bundle. Hive will not inject any additional certificates.
  Important: If the cluster nodes (or installer pod) environment does not have SSL_CERT_DIR configured properly, the installation might fail due to untrusted Prism Central certificates.
- Case 2: certificatesSecretRef set in Hive, no additionalTrustBundle in install-config.yaml
  Hive will automatically inject the certificates from certificatesSecretRef into the install-config's additionalTrustBundle before installation.
  It will also set additionalTrustBundlePolicy: Always to ensure the certificates are trusted both during installation and runtime.
- Case 3: both additionalTrustBundle and certificatesSecretRef are set
  Hive will not modify the install-config. The installer will use the additionalTrustBundle exactly as provided.
  No certificate injection will occur, and the install-config's bundle will be used for establishing trust.
By carefully choosing where to specify certificates, you can control whether the trust setup is handled at install time, during day 2 operations, or both.
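The three cases reduce to a small decision rule. The sketch below models that rule only; it is not Hive's code, and the names `TrustDecision` and `resolve_trust_bundle` are hypothetical. A policy of None means the install-config is left as provided.

```python
from typing import NamedTuple, Optional


class TrustDecision(NamedTuple):
    bundle: Optional[str]   # bundle the installer ends up using
    policy: Optional[str]   # additionalTrustBundlePolicy set by Hive, if any
    injected: bool          # whether Hive modified the install-config


def resolve_trust_bundle(
    install_config_bundle: Optional[str],
    secret_bundle: Optional[str],
) -> TrustDecision:
    """Model the certificate handling cases described above."""
    if install_config_bundle:
        # Cases 1 and 3: the install-config bundle is used exactly as
        # provided, whether or not certificatesSecretRef is also set.
        return TrustDecision(install_config_bundle, None, False)
    if secret_bundle:
        # Case 2: certificates from certificatesSecretRef are injected
        # and the policy is set to Always.
        return TrustDecision(secret_bundle, "Always", True)
    # Neither is configured: nothing to inject.
    return TrustDecision(None, None, False)


if __name__ == "__main__":
    print(resolve_trust_bundle(None, "SECRET-BUNDLE"))
```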
Hive automatically injects the necessary Nutanix credentials during the provisioning process. Therefore, there is no need to manually specify the Prism Central username and password in the install configuration. By referencing the created secrets, Hive ensures secure and seamless authentication with Nutanix.
(Optional) Hive uses the provided SSH key pair to SSH into the machines in the remote cluster. Hive connects via SSH to gather logs in the event of an installation failure. The SSH key pair is optional, but neither the user nor Hive will be able to SSH into the machines if it is not supplied.
Create a Kubernetes secret containing an SSH key pair in PEM format (typically generated with ssh-keygen -m PEM):
apiVersion: v1
data:
ssh-privatekey: REDACTED
ssh-publickey: REDACTED
kind: Secret
metadata:
name: mycluster-ssh-key
namespace: mynamespace
type: Opaque
The OpenShift installer InstallConfig must be stored in a secret and referenced in the ClusterDeployment. This allows Hive to more easily support installing multiple versions of OpenShift.
First, retrieve the public key for the SSH key pair you created earlier, if you created one:
ssh_public_key=$(oc extract secret/mycluster-ssh-key --keys=ssh-publickey --to=-)
Then create a file called install-config.yaml that will contain your InstallConfig. The example below provides an InstallConfig for AWS.
cat >./install-config.yaml <<-EOF
apiVersion: v1
baseDomain: hive.example.com
compute:
- name: worker
platform:
aws:
rootVolume:
iops: 100
size: 120
type: gp3
type: m5.xlarge
replicas: 3
controlPlane:
name: master
platform:
aws:
rootVolume:
iops: 100
size: 120
type: gp3
type: m5.xlarge
metadata:
name: mycluster
networking:
clusterNetwork:
- cidr: 10.128.0.0/14
hostPrefix: 23
machineNetwork:
- cidr: 10.0.0.0/16
networkType: OVNKubernetes
serviceNetwork:
- 172.30.0.0/16
platform:
aws:
region: us-east-1
pullSecret: mycluster-pull-secret
# Remove the line below if you did not create an SSH key.
sshKey: $ssh_public_key
EOF
Finally, create a generic Kubernetes secret from the InstallConfig you just created:
oc create secret generic mycluster-install-config --from-file=install-config.yaml=./install-config.yaml
For Azure, replace the contents of compute.platform and controlPlane.platform with:
azure:
osDisk:
diskSizeGB: 128
type: Standard_D2s_v3
and replace the contents of platform with:
azure:
cloudName: AzurePublicCloud
baseDomainResourceGroupName: my-bdrgn
region: centralus
Note: cloudName specifies the Azure Cloud in which to create the cluster, e.g. AzurePublicCloud or AzureUSGovernmentCloud.
For GCP, replace the contents of compute.platform and controlPlane.platform with:
gcp:
type: n1-standard-4
and replace the contents of platform with:
gcp:
projectID: myproject
region: us-east1
For IBM Cloud, replace the contents of compute.platform and controlPlane.platform. Note that type is any valid IBM Cloud instance type. type may be omitted to use OpenShift installation defaults.
ibmcloud:
type: bx2-4x16
and populate the top-level platform fields with the appropriate information:
platform:
ibmcloud:
region: us-east
and ensure that the top-level credentialsMode field has been set to Manual.
credentialsMode: Manual
For vSphere, ensure the compute and controlPlane fields are empty.
controlPlane:
compute:
and populate the top-level platform fields with the appropriate information:
platform:
vsphere:
apiVIP: 192.168.1.10
cluster: devel
datacenter: dc1
defaultDatastore: ds1
folder: /dc1/vm/CLUSTER_NAME
ingressVIP: 192.168.1.11
network: "VM Network"
password: secretpassword
username: vsphereuser
vCenter: vcenter.example.com
For OpenStack, replace the contents of compute.platform with:
openstack:
type: m1.large
Note: Use an instance type that meets the minimum requirement for the version of OpenShift being installed.
and replace the contents of controlPlane.platform with:
openstack:
type: ci.m4.xlarge
Note: Use an instance type that meets the minimum requirement for the version of OpenShift being installed.
and replace the contents of platform with:
openstack:
cloud: mycloud
computeFlavor: m1.large
externalNetwork: openstack_network_name
lbFloatingIP: 10.0.111.158
For Nutanix, you need to specify the Nutanix platform configuration in your install-config.yaml. Below is the required platform section:
platform:
nutanix:
apiVIPs:
- 10.0.2.12
ingressVIPs:
- 10.0.2.11
prismCentral:
endpoint:
address: "prism-central.example.com"
port: 9440
prismElements:
- endpoint:
address: "prism-element-1.example.com"
port: 9440
uuid: "prism-elements-uuid-1234"
name: "Prism-Element-1"
subnetUUIDs:
- "subnet-uuid-1234"
failureDomains:
- name: "Local_AZ"
subnetUUIDs:
- "subnet-uuid-1234"
prismElement:
endpoint:
address: "prism-element-1.example.com"
port: 9440
uuid: "prism-elements-uuid-1234"
name: "Prism-Element-1"
Note: The failureDomains section is optional and can be omitted if not required.
- nutanix-creds: A secret containing the credentials for Prism Central.
- install-config: A secret holding the OpenShift install configuration.
- ssh-private-key: A secret containing the SSH private key for cluster access.

- Ensure that the prismCentral.endpoint and prismElements.endpoint addresses specified in the install-config are reachable from the environment where Hive runs.
- The subnetUUIDs must correspond to existing Nutanix subnets where the cluster nodes will be deployed.
Cluster provisioning begins when a ClusterDeployment is created.
Note that some parts are duplicated with the InstallConfig.
An example ClusterDeployment for AWS:
apiVersion: hive.openshift.io/v1
kind: ClusterDeployment
metadata:
name: mycluster
namespace: mynamespace
spec:
baseDomain: hive.example.com
clusterName: mycluster
platform:
aws:
credentialsSecretRef:
name: mycluster-aws-creds
region: us-east-1
provisioning:
imageSetRef:
name: openshift-v4.3.0
installConfigSecretRef:
name: mycluster-install-config
sshPrivateKeySecretRef:
name: mycluster-ssh-key
pullSecretRef:
name: mycluster-pull-secret
For Azure, replace the contents of spec.platform with:
azure:
baseDomainResourceGroupName: my-bdrgn
credentialsSecretRef:
name: mycluster-azure-creds
cloudName: AzurePublicCloud
region: centralus
Note: cloudName specifies the Azure Cloud in which to create the cluster, e.g. AzurePublicCloud or AzureUSGovernmentCloud.
For GCP, replace the contents of spec.platform with:
gcp:
credentialsSecretRef:
name: mycluster-gcp-creds
region: us-east1
For IBM Cloud, replace the contents of spec.platform with:
ibmcloud:
credentialsSecretRef:
name: mycluster-ibm-creds
region: us-east
and add a manifests secret reference to spec.provisioning:
provisioning:
manifestsSecretRef:
name: mycluster-manifests
For vSphere, replace the contents of spec.platform with:
vsphere:
certificatesSecretRef:
name: mycluster-vsphere-certs
cluster: devel
credentialsSecretRef:
name: mycluster-vsphere-creds
datacenter: dc1
defaultDatastore: ds1
folder: /dc1/vm/CLUSTER_NAME
network: "VM Network"
vCenter: vsphere.example.com
For OpenStack, replace the contents of spec.platform with:
openstack:
cloud: mycloud
credentialsSecretRef:
name: mycluster-openstack-creds
MachinePool is a YAML configuration by which you can create and scale worker nodes on a deployed cluster. A MachinePool will create MachineSet resources on the deployed cluster. If supported on your cloud, those MachineSets will automatically span all AZs, or you can specify an explicit list.
A MachinePool for the worker MachineSets is not required. If the user creates a MachinePool for the worker MachineSets, then Hive will manage the worker MachineSets.
MachinePool reconciliation is limited to updating MachineSet replicas to match the replicas configured for the MachinePool. Additionally, any existing Labels or Taints on the MachineSets will be overridden if they clash with those on the MachinePool. In case of duplicate taints, the taint encountered first will be preserved and the rest collapsed on the MachineSets.
MachinePool platform is immutable and any changes made to MachinePool.spec.platform are blocked by a validating webhook. The Machine Config Operator does not support updating existing machines when platform details are changed in a MachineSet and consequently Hive does not support making such changes to MachinePool platform, see HIVE-2024.
The recommended workaround when platform details must be changed is to replace the MachinePool by creating an adjacent MachinePool with the desired configuration.
- Create replacement MachinePool with desired configuration and MachinePool.spec.replicas = 0.
- Scale down the old MachinePool while scaling up the replacement MachinePool.
InstallConfig is limited to the one worker pool, but Hive can sync additional MachinePools Day 2.
apiVersion: hive.openshift.io/v1
kind: MachinePool
metadata:
name: mycluster-worker
namespace: mynamespace
spec:
clusterDeploymentRef:
name: mycluster
name: worker
platform:
aws:
rootVolume:
iops: 100
size: 120
type: gp3
type: m5.xlarge
replicas: 3
For Azure, replace the contents of spec.platform with:
azure:
osDisk:
diskSizeGB: 128
type: Standard_D2s_v3
For GCP, replace the contents of spec.platform with:
gcp:
type: n1-standard-4
WARNING: Due to some naming restrictions on various components in GCP, Hive will restrict you to a max of 35 MachinePools (including the original worker pool created by default). We are left with only a single character to differentiate the machines and nodes from a pool, and 'm' is already reserved for the master hosts, leaving us with a-z (minus m) and 0-9 for a total of 35. Hive will automatically create a MachinePoolNameLease for GCP MachinePools to grab one of the available characters until none are left, at which point your MachinePool will not be provisioned.
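The figure of 35 follows directly from the character set described in the warning. As a quick check, here is a minimal sketch that enumerates it; the function name `gcp_pool_lease_characters` is hypothetical and this is not how Hive allocates leases internally.

```python
import string


def gcp_pool_lease_characters() -> list:
    """Enumerate the single characters available to GCP MachinePools:
    lowercase a-z without 'm' (reserved for masters), plus digits 0-9."""
    letters = [c for c in string.ascii_lowercase if c != "m"]
    return letters + list(string.digits)


if __name__ == "__main__":
    chars = gcp_pool_lease_characters()
    print(len(chars), "".join(chars))
```

Twenty-five letters plus ten digits give the 35 available pool identifiers.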
For IBM Cloud, replace the contents of spec.platform. Note that type is any valid IBM Cloud instance type. type may be omitted to use OpenShift installation defaults.
ibmcloud:
type: bx2-4x16
For vSphere, replace the contents of spec.platform with the settings you want for the instances. Note that static IPs are not supported.
vsphere:
coresPerSocket: 1
cpus: 2
memoryMB: 8192
osDisk:
diskSizeGB: 120
For OpenStack, replace the contents of spec.platform with the settings you want for the instances:
openstack:
rootVolume:
size: 10
type: ceph
flavor: m1.large
For Nutanix, replace the contents of spec.platform with the settings you want for the instances:
nutanix:
prismCentral:
address: prism-central.example.com
port: 9440
credentialsSecretRef:
name: nutanix-creds
certificatesSecretRef:
name: prism-central-cert
failureDomains:
- name: "Local_AZ"
subnetUUIDs:
- "subnet-uuid-1234"
prismElement:
endpoint:
address: "prism-element-1.example.com"
port: 9440
uuid: "prism-elements-uuid-1234"
name: "Prism-Element-1"
The desired Availability Zones (AZ) to create new worker nodes in can be specified in the MachinePool YAML (spec.platform.<provider>.zones), for example:
apiVersion: hive.openshift.io/v1
kind: MachinePool
metadata:
name: mycluster-worker
namespace: mynamespace
spec:
clusterDeploymentRef:
name: mycluster
name: worker
platform:
aws:
rootVolume:
iops: 100
size: 120
type: gp3
type: m5.xlarge
zones:
- us-east-1a
- us-east-1b
replicas: 3
If the Availability Zones are not configured in the MachinePool, then all of the AZs in the region will be used and a MachineSet resource will be created for each AZ (only relevant for public cloud providers).
MachinePools can be configured to auto-scale the number of worker nodes as needed based on resource utilization of the deployed cluster (this feature creates a ClusterAutoscaler resource in the deployed cluster).
apiVersion: hive.openshift.io/v1
kind: MachinePool
metadata:
name: mycluster-worker
namespace: mynamespace
spec:
clusterDeploymentRef:
name: mycluster
name: worker
platform:
aws:
rootVolume:
iops: 100
size: 120
type: gp3
type: m5.xlarge
autoscaling:
minReplicas: 3
maxReplicas: 6
The number of minimum replicas must be equivalent to the number of configured Availability Zones.
The spec.replicas and spec.autoscaling configurations cannot be configured simultaneously.
The spec.autoscaling.maxReplicas is an optional field. If it is not configured, then nodes will be auto-scaled without restriction based on resource utilization needs.
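These constraints can be checked before a MachinePool is submitted. The sketch below is a hypothetical pre-flight check, not Hive's validating webhook: it covers the mutual exclusion of spec.replicas and spec.autoscaling and treats maxReplicas as optional. The check that minReplicas does not exceed maxReplicas is an added assumption, and the Availability Zone rule is deliberately left out.

```python
from typing import List, Optional


def validate_machine_pool_scaling(
    replicas: Optional[int],
    min_replicas: Optional[int],
    max_replicas: Optional[int],
) -> List[str]:
    """Return a list of problems found in a MachinePool scaling config.

    None means the corresponding field is not set.
    """
    errors = []
    autoscaling = min_replicas is not None or max_replicas is not None
    if replicas is not None and autoscaling:
        errors.append("spec.replicas and spec.autoscaling cannot both be set")
    # maxReplicas is optional; only compare when both bounds are given.
    # Assumption (not stated in the text): min must not exceed max.
    if (
        min_replicas is not None
        and max_replicas is not None
        and min_replicas > max_replicas
    ):
        errors.append("autoscaling.minReplicas exceeds autoscaling.maxReplicas")
    return errors


if __name__ == "__main__":
    print(validate_machine_pool_scaling(3, 3, 6))
```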
A MachinePool configured in auto-scaling mode creates a ClusterAutoscaler on the deployed cluster. ClusterAutoscalers can co-exist and work with Horizontal Pod Autoscalers to ensure that there are enough available nodes to meet the auto-scaled pod replica count requirements. See this excerpt from the OpenShift documentation:
The horizontal pod autoscaler (HPA) and the cluster autoscaler modify cluster resources in different ways. The HPA changes the deployment’s or replica set’s number of replicas based on the current CPU load. If the load increases, the HPA creates new replicas, regardless of the amount of resources available to the cluster. If there are not enough resources, the cluster autoscaler adds resources so that the HPA-created pods can run. If the load decreases, the HPA stops some replicas. If this action causes some nodes to be underutilized or completely empty, the cluster autoscaler deletes the unnecessary nodes.
Hive supports bare metal provisioning as provided by openshift-install.
At present this feature requires a separate pre-existing libvirt provisioning host to run the bootstrap node. This host will require very specific network configuration that far exceeds the scope of Hive documentation. See Bare Metal Platform Customization for more information.
To provision bare metal clusters with Hive:
Create a Secret containing a bare metal enabled InstallConfig. This InstallConfig must contain a libvirtURI property pointing to the provisioning host.
Create a Secret containing the SSH private key that can connect to your libvirt provisioning host, without a passphrase.
apiVersion: v1
kind: Secret
metadata:
name: provisioning-host-ssh-private-key
namespace: mynamespace
stringData:
ssh-privatekey: |-
-----BEGIN RSA PRIVATE KEY-----
REDACTED
-----END RSA PRIVATE KEY-----
type: Opaque
Create a ConfigMap for manifests to inject into the installer, containing a nested ConfigMap for metal3 config.
NOTE: This will no longer be required as of OpenShift 4.4+.
kind: ConfigMap
apiVersion: v1
metadata:
name: my-baremetal-cluster-install-manifests
namespace: mynamespace
data:
99_metal3-config.yaml: |
kind: ConfigMap
apiVersion: v1
metadata:
name: metal3-config
namespace: openshift-machine-api
data:
http_port: "6180"
provisioning_interface: "enp1s0"
provisioning_ip: "172.22.0.3/24"
dhcp_range: "172.22.0.10,172.22.0.100"
deploy_kernel_url: "http://172.22.0.3:6180/images/ironic-python-agent.kernel"
deploy_ramdisk_url: "http://172.22.0.3:6180/images/ironic-python-agent.initramfs"
ironic_endpoint: "http://172.22.0.3:6385/v1/"
ironic_inspector_endpoint: "http://172.22.0.3:5050/v1/"
cache_url: "http://192.168.111.1/images"
rhcos_image_url: "https://releases-art-rhcos.svc.ci.openshift.org/art/storage/releases/rhcos-4.3/43.81.201911192044.0/x86_64/rhcos-43.81.201911192044.0-openstack.x86_64.qcow2.gz"
Create a ClusterDeployment. Note the libvirtSSHPrivateKeySecretRef and sshKnownHosts for bare metal:
apiVersion: hive.openshift.io/v1
kind: ClusterDeployment
metadata:
name: my-baremetal-cluster
namespace: mynamespace
annotations:
hive.openshift.io/try-install-once: "true"
spec:
baseDomain: test.example.com
clusterName: my-baremetal-cluster
controlPlaneConfig:
servingCertificates: {}
platform:
baremetal:
libvirtSSHPrivateKeySecretRef:
name: provisioning-host-ssh-private-key
provisioning:
installConfigSecretRef:
name: my-baremetal-cluster-install-config
sshPrivateKeySecretRef:
name: my-baremetal-hosts-ssh-private-key
manifestsSecretRef:
name: my-baremetal-cluster-install-manifests
imageSetRef:
name: my-clusterimageset
sshKnownHosts:
# SSH known host info for the libvirt provisioning server to avoid a prompt during non-interactive install:
- "10.1.8.90 ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBKWjJRzeUVuZs4yxSy4eu45xiANFIIbwE3e1aPzGD58x/NX7Yf+S8eFKq4RrsfSaK2hVJyJjvVIhUsU9z2sBJP8="
pullSecretRef:
name: my-baremetal-cluster-pull-secret
There is presently no support for MachinePool management on bare metal clusters. The pools defined in your InstallConfig are authoritative.
There is presently no support for "deprovisioning" a bare metal cluster. Deleting a bare metal ClusterDeployment therefore has no impact on the running cluster: the ClusterDeployment is simply removed from Hive and the systems remain running. This may change in the future.
- Get the namespace in which your cluster deployment was created
- Get the install pod name
oc get pods -l "hive.openshift.io/job-type=provision,hive.openshift.io/cluster-deployment-name=${CLUSTER_NAME}" -o jsonpath='{.items[0].metadata.name}'
- Run the following command to watch the cluster deployment
oc logs -f <install-pod-name> -c hive
Alternatively, you can watch the summarized output of the installer using
oc exec -c hive <install-pod-name> -- tail -f /tmp/openshift-install-console.log
In the event of installation failures, please see Troubleshooting.
Hive can be configured as follows to upload logs to an AWS S3 bucket when provisioning fails.
- Create an S3 bucket. The bucket must be accessible from the environment from which your cluster will be provisioned, using credentials you will specify (below). Take note of the name of the bucket and the region in which you created it.
- Create a credentials secret. This secret must exist in the target namespace of your Hive deployment (HiveConfig.spec.targetNamespace, default hive) and contain AWS credentials sufficient to write to your bucket. The secret data should contain base64-encoded values for "aws_access_key_id" and "aws_secret_access_key". (You may wish to reuse the secret from your cluster deployment.)
- Create an SSH private key secret. The secret data must contain a key called ssh-privatekey whose value is the base64-encoded contents of the private key file corresponding to the public key in your install config. Create this secret in the namespace of your ClusterDeployment.
- Tell HiveConfig where to find your bucket. Under .spec.failedProvisionConfig.aws, add the bucket name, the reference to the AWS credentials secret, and the region. For example:
spec:
  failedProvisionConfig:
    aws:
      bucket: failed-provision-logs
      credentialsSecretRef:
        name: test-retry-aws-creds
      region: us-east-1
- Ensure your ClusterDeployment is configured with your SSH private key secret. Reference the SSH private key secret in your ClusterDeployment's .spec.provisioning.sshPrivateKeySecretRef. For example:
spec:
  provisioning:
    sshPrivateKeySecretRef:
      name: mycluster-ssh-key
(If using hiveutil, you can provide the key pair from your file system via --ssh-private-key-file and --ssh-public-key-file.)
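The setup steps above could be scripted roughly as follows. This is a sketch under stated assumptions: the HiveConfig is named `hive`, the target namespace is the default `hive`, and the environment variables `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, `CD_NAMESPACE` and `SSH_PRIVATE_KEY_FILE` are placeholders introduced here. The function names are illustrative. `oc create secret generic` base64-encodes the values itself.

```shell
#!/bin/sh
# Render a JSON merge patch for HiveConfig's .spec.failedProvisionConfig.aws.
failed_provision_patch() {
  bucket=$1; secret=$2; region=$3
  printf '{"spec":{"failedProvisionConfig":{"aws":{"bucket":"%s","credentialsSecretRef":{"name":"%s"},"region":"%s"}}}}' \
    "$bucket" "$secret" "$region"
}

# Hypothetical wrapper: create both secrets, then point HiveConfig at the bucket.
configure_log_upload() {
  oc create secret generic test-retry-aws-creds -n hive \
    --from-literal=aws_access_key_id="$AWS_ACCESS_KEY_ID" \
    --from-literal=aws_secret_access_key="$AWS_SECRET_ACCESS_KEY" &&
  oc create secret generic mycluster-ssh-key -n "$CD_NAMESPACE" \
    --from-file=ssh-privatekey="$SSH_PRIVATE_KEY_FILE" &&
  oc patch hiveconfig hive --type=merge \
    -p "$(failed_provision_patch failed-provision-logs test-retry-aws-creds us-east-1)"
}
```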
The troubleshooting doc provides more information about extracting and processing the logs.
Once the cluster is provisioned, the admin kubeconfig will be stored in a secret. You can use this with:
./hack/get-kubeconfig.sh ${CLUSTER_NAME} > ${CLUSTER_NAME}.kubeconfig
export KUBECONFIG=${CLUSTER_NAME}.kubeconfig
oc get nodes
- Get the web console URL:
oc get cd ${CLUSTER_NAME} -o jsonpath='{ .status.webConsoleURL }'
- Retrieve the password for the kubeadmin user:
oc extract secret/$(oc get cd ${CLUSTER_NAME} -o jsonpath='{.spec.clusterMetadata.adminPasswordSecretRef.name}') --to=-
Hive can optionally create delegated DNS zones for each cluster.
NOTE: This feature only works for provisioning to AWS, GCP, and Azure.
To use this feature:
- Manually create a DNS zone for your "root" domain (i.e. hive.example.com in the example below) and ensure your DNS is operational.
- Create a secret in the "hive" namespace with your cloud credentials with permissions to manage the root zone.
  - AWS
    apiVersion: v1
    data:
      aws_access_key_id: REDACTED
      aws_secret_access_key: REDACTED
    kind: Secret
    metadata:
      name: route53-aws-creds
    type: Opaque
    The following AWS IAM permissions should be associated with these credentials:
    route53:ChangeResourceRecordSets
    route53:ChangeTagsForResource
    route53:CreateHostedZone
    route53:DeleteHostedZone
    route53:GetHostedZone
    route53:ListHostedZonesByName
    route53:ListResourceRecordSets
    route53:ListTagsForResource
    tag:GetResources
  - GCP
    apiVersion: v1
    data:
      osServiceAccount.json: REDACTED
    kind: Secret
    metadata:
      name: gcp-creds
    type: Opaque
  - Azure
    Service principal needs DNS Zone Contributor role on DNS zone resource.
    apiVersion: v1
    data:
      osServicePrincipal.json: REDACTED
    kind: Secret
    metadata:
      name: azure-creds
    type: Opaque
- Update your HiveConfig to enable externalDNS and set the list of managed domains:
  - AWS
    apiVersion: hive.openshift.io/v1
    kind: HiveConfig
    metadata:
      name: hive
    spec:
      managedDomains:
      - aws:
          credentialsSecretRef:
            name: route53-aws-creds
        domains:
        - hive.example.com
  - GCP
    apiVersion: hive.openshift.io/v1
    kind: HiveConfig
    metadata:
      name: hive
    spec:
      managedDomains:
      - gcp:
          credentialsSecretRef:
            name: gcp-creds
        domains:
        - hive.example.com
  - Azure
    apiVersion: hive.openshift.io/v1
    kind: HiveConfig
    metadata:
      name: hive
    spec:
      managedDomains:
      - azure:
          credentialsSecretRef:
            name: azure-creds
        domains:
        - hive.example.com
- Specify which domains Hive is allowed to manage by adding them to the .spec.managedDomains[].domains list. When specifying manageDNS: true in a ClusterDeployment, the ClusterDeployment's baseDomain must be a direct child of one of these domains, otherwise the ClusterDeployment creation will result in a validation error. The baseDomain must also be unique to that cluster and must not be used in any other ClusterDeployment, including on separate Hive instances. As such, a domain may exist in the .spec.managedDomains[].domains list in multiple Hive instances. Note that the specified credentials must be valid to add and remove NS record entries for all domains listed in .spec.managedDomains[].domains.
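The "direct child" rule can be checked locally before a ClusterDeployment is submitted. The following is a minimal sketch of that rule as described above, not Hive's own validation code. The function name `is_direct_child` is hypothetical, and Hive's admission check remains the authority.

```shell
#!/bin/sh
# Succeeds when $1 (baseDomain) sits exactly one label below $2 (managed domain).
is_direct_child() {
  base=$1; managed=$2
  # The base domain must end with ".<managed domain>".
  case "$base" in
    *".$managed") ;;
    *) return 1 ;;
  esac
  label=${base%".$managed"}
  # What remains must be a single, non-empty label (no further dots).
  case "$label" in
    ""|*.*) return 1 ;;
  esac
  return 0
}

# Usage:
#   is_direct_child mydomain.hive.example.com hive.example.com && echo "ok"
```

Under this rule, mydomain.hive.example.com is accepted for the managed domain hive.example.com, while a.b.hive.example.com and hive.example.com itself are rejected.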
You can now create clusters with manageDNS enabled and a base domain of mydomain.hive.example.com.
bin/hiveutil create-cluster --base-domain=mydomain.hive.example.com mycluster --manage-dns
Hive will then:
- Create a mydomain.hive.example.com DNS zone.
- Create NS records in the hive.example.com zone to delegate DNS to the new mydomain.hive.example.com DNS zone.
- Wait for the SOA record for the new domain to be resolvable, indicating that DNS is functioning.
- Launch the install, which will create DNS entries for the new cluster ("*.apps.mycluster.mydomain.hive.example.com", "api.mycluster.mydomain.hive.example.com", etc) in the new mydomain.hive.example.com DNS zone.
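To confirm from your own workstation that the delegated zone resolves, a similar wait can be approximated with `dig`. This is only a sketch of the idea and not what Hive runs internally. The function name `wait_for_soa` and the `SOA_LOOKUP` override (useful for testing without DNS) are assumptions introduced here.

```shell
#!/bin/sh
# Poll until an SOA record for the domain is returned, or give up.
# Arguments: domain [attempts] [delay-seconds]
wait_for_soa() {
  domain=$1; attempts=${2:-30}; delay=${3:-10}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # SOA_LOOKUP is a hypothetical override; by default dig performs the query.
    if [ -n "$(${SOA_LOOKUP:-dig +short SOA} "$domain" 2>/dev/null)" ]; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Usage: wait_for_soa mydomain.hive.example.com && echo "zone is resolvable"
```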
It is possible to adopt cluster deployments into Hive. This will allow you to manage the cluster as if it had been provisioned by Hive.
To do so you will need to create a ClusterDeployment with Spec.Installed set to True, no Spec.Provisioning section, and include the following:
- cluster INFRAID (obtained from oc get infrastructure cluster -o json | jq .status.infrastructureName)
- cluster ID (obtained from oc get clusterversion version -o json | jq .spec.clusterID)
- reference to a properly formatted admin kubeconfig Secret:
oc create secret generic mycluster-admin-kubeconfig --from-file=kubeconfig=/tmp/admin.kubeconfig
- Spec.Platform.YourCloudProvider for your cluster, most importantly region and a properly formatted credentials Secret
Use Spec.PreserveOnDelete = true if you do not want Hive to deprovision resources when the ClusterDeployment is deleted.
apiVersion: hive.openshift.io/v1
kind: ClusterDeployment
metadata:
  name: my-gcp-cluster
  namespace: hive
spec:
  baseDomain: gcp.example.com
  clusterMetadata:
    adminKubeconfigSecretRef:
      name: my-gcp-cluster-admin-kubeconfig
    clusterID: 61010205-c91d-44c9-8394-3e1790bd76f3
    infraID: my-gcp-cluster-wsvdn
  clusterName: my-gcp-cluster
  installed: true
  platform:
    gcp:
      credentialsSecretRef:
        name: my-gcp-creds
      region: us-east1
  pullSecretRef:
    name: pull-secret
If the cluster you are looking to adopt is on AWS and leverages PrivateLink, you'll also need to include that setting under spec.platform.aws to ensure the VPC Endpoint Service for the cluster is tracked in the ClusterDeployment.
platform:
  aws:
    credentialsSecretRef:
      name: my-aws-cluster-creds
    privateLink:
      enabled: true
    region: us-east-1
If the cluster you are looking to adopt is on AWS and uses a shared VPC, you will also need to include the name of the hosted zone role in spec.clusterMetadata.platform.aws.hostedZoneRole.
clusterMetadata:
  adminKubeconfigSecretRef:
    name: my-gcp-cluster-admin-kubeconfig
  clusterID: 61010205-c91d-44c9-8394-3e1790bd76f3
  infraID: my-gcp-cluster-wsvdn
  platform:
    aws:
      hostedZoneRole: account-b-zone-role
If the cluster you are looking to adopt is on GCP and uses a shared VPC, you will also need to include the name of the network project ID in spec.clusterMetadata.platform.gcp.networkProjectID.
clusterMetadata:
  adminKubeconfigSecretRef:
    name: my-gcp-cluster-admin-kubeconfig
  clusterID: 61010205-c91d-44c9-8394-3e1790bd76f3
  infraID: my-gcp-cluster-wsvdn
  platform:
    gcp:
      networkProjectID: some@project.id
hiveutil is a development-focused CLI tool which can be built from the hive repo. To adopt a cluster, specify the following flags:
bin/hiveutil create-cluster --namespace=namespace-to-adopt-into --base-domain=example.com mycluster --adopt --adopt-admin-kubeconfig=/path/to/cluster/admin/kubeconfig --adopt-infra-id=[INFRAID] --adopt-cluster-id=[CLUSTERID]
If you wish to transfer ownership of a cluster which is already managed by hive, and have access to the ClusterDeployment, there is no need to create a new ClusterDeployment using hiveutil. Instead, simply do the following:
- Save the current ClusterDeployment and relevant creds and certs manifests locally.
oc get cd <clusterdeployment_name> -n <namespace> -o yaml > clusterdeployment.yaml
oc get secrets <clusterdeployment_name_creds> -n <namespace> -o yaml > clusterdeployment_creds.yaml
- Edit the ClusterDeployment, setting spec.preserveOnDelete to true. This ensures that the next step will only release the hive resources without destroying the cluster in the cloud infrastructure.
- Delete the ClusterDeployment.
- From the hive instance that will adopt the cluster, oc apply the ClusterDeployment, creds and certs manifests you saved in the first step.
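The transfer steps above might be scripted along these lines. This is a sketch under stated assumptions rather than a supported procedure. The `OC` override (set `OC=echo` to preview the commands) and the function names are introduced here for illustration. The saved manifests may still carry server-populated metadata that needs cleaning up before they are applied elsewhere, and that cleanup is not shown.

```shell
#!/bin/sh
# OC may be overridden, e.g. OC=echo, to preview commands without running them.
# This variable is an assumption of the sketch, not an oc or Hive convention.
OC=${OC:-oc}

# Steps 1-3: run while logged in to the Hive instance that currently owns the cluster.
release_cluster() {
  name=$1; ns=$2; creds=$3
  $OC get cd "$name" -n "$ns" -o yaml > clusterdeployment.yaml &&
  $OC get secrets "$creds" -n "$ns" -o yaml > clusterdeployment_creds.yaml &&
  # Keep cloud resources when the ClusterDeployment is deleted.
  $OC patch cd "$name" -n "$ns" --type=merge -p '{"spec":{"preserveOnDelete":true}}' &&
  $OC delete cd "$name" -n "$ns"
}

# Step 4: run while logged in to the Hive instance that will adopt the cluster.
adopt_cluster() {
  $OC apply -f clusterdeployment_creds.yaml -f clusterdeployment.yaml
}
```

Applying the credentials before the ClusterDeployment means the secrets it references already exist when it is created.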
hive-operator deploys each component (the hive-controllers and hiveadmission Deployments; and the hive-clustersync and hive-machinepool StatefulSets) with default resource requests.
If you need to scale any of these components vertically, you may add one or more deploymentConfig sections to HiveConfig's spec. For example:
deploymentConfig:
- deploymentName: hive-controllers
  resources:
    requests:
      memory: 256Mi
- deploymentName: hive-clustersync
  resources:
    requests:
      cpu: 30m
      memory: 257Mi
    limits:
      cpu: 50m
- deploymentName: hiveadmission
  resources:
    requests:
      cpu: 20m
For each entry, the deploymentName must match the metadata.name of the Deployment/StatefulSet.
The resources field is a standard corev1.ResourceRequirements.
See below for information on horizontally scaling the clustersync or machinepool controller.
Note: The hive-operator itself must be scaled by directly editing its Deployment.
Hive offers two CRDs for applying configuration in a cluster once it is installed: SyncSet for config destined for specific clusters in a specific namespace, and SelectorSyncSet for config destined for any cluster matching a label selector.
For more information please see the SyncSet documentation.
The clustersync and machinepool controllers are designed to scale horizontally. Increasing the number of controller replicas increases the number of pods running, and therefore the number of clusters that can have syncsets or machinepools applied to them simultaneously.
In order to scale these controllers, a section like the following should be added to HiveConfig:
spec:
  controllersConfig:
    controllers:
    - config:
        replicas: 3
      name: clustersync
The above example scales the clustersync controller. Use a (separate) section with name: machinepool to scale the machinepool controller.
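The same setting could be applied from the command line with a merge patch. This is a minimal sketch, assuming the HiveConfig is named `hive`. The helper name `controller_replicas_patch` is hypothetical. Because a JSON merge patch replaces lists wholesale, the patch overwrites any existing `controllers` entries, so include every entry you want to keep.

```shell
#!/bin/sh
# Render a JSON merge patch that sets the replica count for a single controller.
controller_replicas_patch() {
  controller=$1; replicas=$2
  printf '{"spec":{"controllersConfig":{"controllers":[{"name":"%s","config":{"replicas":%d}}]}}}' \
    "$controller" "$replicas"
}

# Usage (overwrites the whole controllers list):
#   oc patch hiveconfig hive --type=merge -p "$(controller_replicas_patch clustersync 3)"
```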
Hive offers explicit API support for configuring identity providers in the OpenShift clusters it provisions. This is technically powered by the above SyncSet mechanism, but is provided directly in the API to support configuring per cluster identity providers, merged with global identity providers, all of which must land in the same object in the cluster.
For more information please see the SyncIdentityProvider documentation.
oc delete clusterdeployment ${CLUSTER_NAME} --wait=false
Deleting a ClusterDeployment will create a ClusterDeprovision resource, which in turn will launch a pod to attempt to delete all cloud resources created for and by the cluster. This is done by scanning the cloud provider for resources tagged with the cluster's generated InfraID (e.g. kubernetes.io/cluster/mycluster-fcp4z=owned or sigs.k8s.io/cluster-api-provider-aws/cluster/mycluster-fcp4z=owned). Once all resources have been deleted, the pod will terminate, finalizers will be removed, and the ClusterDeployment and dependent objects will be removed. The deprovision process is powered by vendoring the same code from the OpenShift installer used for openshift-install destroy cluster.
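To check for leftovers after a deprovision on AWS, the ownership tags mentioned above can be derived from the InfraID and used as a filter. This is a sketch under assumptions: the function names are hypothetical, the AWS CLI is assumed to be installed and configured for the cluster's account and region, and this is not how Hive itself performs the scan.

```shell
#!/bin/sh
# Print the ownership tags (as quoted in the text above) for a given InfraID.
cluster_owned_tags() {
  infra_id=$1
  printf '%s\n' \
    "kubernetes.io/cluster/${infra_id}=owned" \
    "sigs.k8s.io/cluster-api-provider-aws/cluster/${infra_id}=owned"
}

# Hypothetical check: list ARNs of AWS resources still carrying the first tag.
list_leftover_aws_resources() {
  infra_id=$1
  aws resourcegroupstaggingapi get-resources \
    --tag-filters "Key=kubernetes.io/cluster/${infra_id},Values=owned" \
    --query 'ResourceTagMappingList[].ResourceARN' --output text
}

# Usage: list_leftover_aws_resources mycluster-fcp4z
```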

