playingfield/iec62443-kubernetes

Kubernetes & IEC 62443 OT Security Reviewed

Bas Meijer 2025

Securing Kubernetes Clusters

A deployment of Kubernetes in OT should be secured with defense-in-depth principles. The seven foundational requirements (FRs) of the ISA/IEC 62443 standard can be used as a framework for designing the solution. Nine attack scenarios help to model threats. The Kyverno policy agent helps enforce the required policies.

  • FR1: Identification and Authentication Control (IAC) - Ensures access is restricted to authenticated entities
  • FR2: Use Control (UC) - Enforces authorization and least privilege principles
  • FR3: System Integrity (SI) - Protects against unauthorized system modifications
  • FR4: Data Confidentiality (DC) - Prevents unauthorized data disclosure through encryption
  • FR5: Restricted Data Flow (RDF) - Implements network segmentation and flow control
  • FR6: Timely Response to Events (TRE) - Enables incident detection and response
  • FR7: Resource Availability (RA) - Ensures system availability and prevents denial of service

While IEC 62443 was designed for industrial control systems, its risk-based, defense-in-depth approach translates effectively to cloud-native Kubernetes environments. Each risk scenario requires layered security controls across multiple foundational requirements, with configuration management and runtime governance being critical success factors. This document discusses nine risk scenarios, each with its perspective, attack vectors, and mitigations.

Target: Kubernetes Cluster

Kubernetes is a portable, extensible, open source platform for managing containerized workloads and services that facilitates both declarative configuration and automation. A Kubernetes cluster consists of a control plane and one or more worker nodes. Typically, the control plane runs in a highly available configuration on 3 to 5 dedicated nodes (servers). One cluster can scale to thousands of worker nodes.

The control plane is the brain of Kubernetes. It is responsible for managing the state of the cluster, which is stored in etcd (a key-value database). The control plane consists of components such as the API server, scheduler, and controller manager. It is not a network path that a container can use to "escape", "tunnel", or "obtain control". Only the API server should be accessible from the network; the scheduler and the controller manager should bind to localhost.

The data plane, on the other hand, carries the actual network traffic between pods and services. It is a dynamic, software-defined network that runs on top of the VM network. This "overlay network" lets pods communicate across different VM nodes without the underlying VM network needing to be aware of the pod IP addresses. The CNI plugin assigns IP addresses to pods, and kube-proxy manages network rules to handle service routing and load balancing. Calico and Cilium are popular CNI choices, among others. Kubernetes comes with many options and add-ons that enable a flexible architecture.

Scenario 1: Container Escape and Gaining Control

This scenario involves an attacker breaking out of the security boundary of a single container to gain control of the underlying host worker node. Once on the node, the attacker can move laterally to other containers or even attempt to compromise the control plane. This is a critical first step for many sophisticated attacks.

Relevant Attack Vectors

  • Kernel Vulnerabilities: Exploiting vulnerabilities in the host's operating system kernel that allow a process within a container to elevate its privileges or escape the container's namespace isolation.
  • Misconfigured Privileged Containers: If a container is run with the privileged: true flag, it essentially has root access to the host. An attacker could exploit a vulnerability in an application inside this container to take control of the host.
  • Container Runtime Exploits: Finding and exploiting weaknesses in the container runtime (e.g., containerd).
  • Mounting Sensitive Host Directories: A misconfigured container that mounts sensitive directories from the host (e.g., /var/run/docker.sock) can give an attacker a direct path to manipulate the host and its other containers.

Possible Mitigations

  • Use a limited set of minimal base images.
  • Remove shells and other tools from container images so that an attacker cannot 'live off the land'.
  • Consider alternative container runtimes with stronger isolation.
  • Apply Pod Security Standards (Baseline/Restricted) to prevent unsafe capabilities.
  • Run containers as non-root, drop unnecessary Linux capabilities, and prohibit use of privileged: true unless absolutely necessary.
  • Require strong, unique credentials for host OS logins (disable password SSH, use key-based auth).
  • Restrict hostPath mounts to only required, non-sensitive directories. Avoid mounting host secrets into pods.
  • Keep host OS kernel and container runtime patched.
  • Enable seccomp, AppArmor/SELinux profiles for all workloads.
  • Use encrypted node-local storage for sensitive data.
  • Isolate nodes by function (dedicated security zones for sensitive workloads).
  • Use a RuntimeClass to run sensitive or untrusted pods on a sandboxed runtime (gVisor, Kata), separated from other workloads.
  • Alert on abnormal host system calls from containers.
  • Monitor for privileged container deployments via admission controllers (Kyverno, OPA Gatekeeper).
  • Implement node auto-replacement in cluster autoscaler for compromised nodes.
  • Maintain hot spare nodes for workload failover.
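
Several of the pod-level mitigations above can be checked mechanically before a manifest reaches the cluster. The following is a minimal sketch, assuming the Pod manifest has already been parsed into a Python dict; the function name `pod_findings` and the allowlist `ALLOWED_HOST_PATHS` are hypothetical and not part of Kubernetes or Kyverno. It illustrates the kind of rule an admission controller would enforce and is not a replacement for one.

```python
from typing import Any

# Hypothetical allowlist of hostPath directories considered non-sensitive.
ALLOWED_HOST_PATHS = {"/var/log/app"}


def pod_findings(pod: dict[str, Any]) -> list[str]:
    """Return human-readable findings for one Pod manifest (parsed into a dict)."""
    spec = pod.get("spec", {})
    pod_sc = spec.get("securityContext", {})
    findings: list[str] = []

    # hostPath volumes are only accepted when explicitly allowlisted.
    for volume in spec.get("volumes", []):
        path = volume.get("hostPath", {}).get("path")
        if path is not None and path not in ALLOWED_HOST_PATHS:
            findings.append(f"volume {volume.get('name')}: hostPath {path} is not allowlisted")

    for container in spec.get("initContainers", []) + spec.get("containers", []):
        name = container.get("name")
        sc = container.get("securityContext", {})
        if sc.get("privileged") is True:
            findings.append(f"{name}: privileged container")
        if sc.get("allowPrivilegeEscalation") is not False:
            findings.append(f"{name}: allowPrivilegeEscalation is not false")
        # A container-level runAsNonRoot overrides the pod-level value.
        if sc.get("runAsNonRoot", pod_sc.get("runAsNonRoot")) is not True:
            findings.append(f"{name}: runAsNonRoot is not true")
        if "ALL" not in sc.get("capabilities", {}).get("drop", []):
            findings.append(f"{name}: does not drop ALL capabilities")
    return findings


if __name__ == "__main__":
    risky_pod = {
        "spec": {
            "volumes": [{"name": "sock", "hostPath": {"path": "/var/run/docker.sock"}}],
            "containers": [{"name": "app", "securityContext": {"privileged": True}}],
        }
    }
    for finding in pod_findings(risky_pod):
        print(finding)
```

The checks mirror a subset of the Restricted Pod Security Standard. Seccomp and AppArmor/SELinux profiles are deliberately left out of this sketch.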

Scenario 2: Accessing Other Containers

This scenario focuses on lateral movement within the data plane. An attacker who has compromised one container tries to access other containers they shouldn't, either to exfiltrate data or to gain a further foothold. The goal is to move from a less sensitive container to a more sensitive one.

Relevant Attack Vectors

  • Open Network Policies: The most common vector. By default, pods can communicate with any other pod in the cluster. If Kubernetes Network Policies are not in place to restrict this communication, an attacker can freely scan and attack other containers.
  • Vulnerable Services: Exploiting vulnerabilities in services running inside other containers. For example, a vulnerable web server or an unauthenticated database instance.
  • Credential Theft: If a container is compromised, the attacker may be able to steal credentials or secrets stored within its environment variables or file system, which can then be used to authenticate to other services.

Possible Mitigations

  • Define zones by creating distinct namespaces for different application tiers or security levels.
  • Create deny-by-default policies for incoming and outgoing traffic.
  • Secure the conduits between these namespaces using NetworkPolicy.
  • Implement comprehensive vulnerability management for container images and patch application-level vulnerabilities.
  • Use short-lived Kubernetes secrets (referenced via secretRef) or external secret management.
  • Apply role-based access control (RBAC).
  • Authenticate service-to-service calls with mTLS.
  • Limit service account permissions to the least privilege required.
  • Enable container runtime process isolation.
  • Alert on unexpected inter-pod communication attempts.
  • Enable audit logging for Kubernetes API calls between services.
  • Use rate-limiting and connection quotas for internal services to limit damage.
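
The zone-and-conduit idea above maps onto two NetworkPolicy objects per namespace: one that denies everything, and one that opens a single, named conduit. The sketch below builds both as Python dicts and prints them as JSON, which `kubectl apply -f` accepts alongside YAML. The namespace names (`scada`, `historian`), the port, and the helper names are assumptions chosen for illustration. The `kubernetes.io/metadata.name` label is set automatically on namespaces by current Kubernetes versions.

```python
import json
from typing import Any


def default_deny(namespace: str) -> dict[str, Any]:
    """NetworkPolicy that selects every pod in the namespace and allows no traffic."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "default-deny-all", "namespace": namespace},
        # An empty podSelector matches all pods; listing both policy types
        # without any rules denies all ingress and all egress.
        "spec": {"podSelector": {}, "policyTypes": ["Ingress", "Egress"]},
    }


def allow_conduit(namespace: str, from_namespace: str, port: int) -> dict[str, Any]:
    """NetworkPolicy that admits TCP traffic on one port from one other namespace."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": f"allow-from-{from_namespace}", "namespace": namespace},
        "spec": {
            "podSelector": {},
            "policyTypes": ["Ingress"],
            "ingress": [
                {
                    "from": [
                        {
                            "namespaceSelector": {
                                "matchLabels": {"kubernetes.io/metadata.name": from_namespace}
                            }
                        }
                    ],
                    "ports": [{"protocol": "TCP", "port": port}],
                }
            ],
        },
    }


if __name__ == "__main__":
    # Hypothetical zones: the historian namespace may reach scada on port 4840.
    for policy in (default_deny("scada"), allow_conduit("scada", "historian", 4840)):
        print(json.dumps(policy, indent=2))
```

Note that a deny-all egress policy also blocks DNS lookups, so a real deployment would need an additional egress rule for the cluster DNS service. The policies only take effect when the CNI plugin enforces NetworkPolicy.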

Scenario 3: Attacking the Control Plane

This is an attack on the "brain" of the cluster. An attacker's goal here is to gain control over the Kubernetes control plane, which would give them the ability to create, modify, or delete any resource in the cluster.

Relevant Attack Vectors

  • API Server Vulnerabilities: Exploiting a vulnerability in the Kubernetes API Server itself. This could be a publicly known CVE or a zero-day exploit.
  • Weak or Stolen Credentials: Obtaining a kubeconfig file or a service account token with overly permissive access and using it to directly interact with the API server.
  • Role-Based Access Control (RBAC) Misconfigurations: If RBAC is not properly configured, a low-privilege user or service account might have elevated permissions that an attacker can exploit.
  • Man-in-the-Middle Attacks: Intercepting communication between cluster components (like the Kubelet and API Server) to steal credentials or inject malicious commands.

Possible Mitigations

  • Keep Kubernetes updated.
  • Use RBAC to separate admin and operational accounts.
  • Use a centralized identity provider (such as Active Directory).
  • Configure service accounts with minimal permissions and disable token automounting for the default service account.
  • Run the API server with only the admission plugins it needs.
  • Enforce TLS for all API server communications.
  • Restrict API server network access using firewall rules.
  • Enable API server audit logging.
  • Deploy a highly available control plane.
  • Rate-limit API requests per user/IP.
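
RBAC misconfigurations are easier to catch when roles are reviewed automatically. As a minimal sketch, assuming Role or ClusterRole manifests are available as Python dicts (for example from `kubectl get clusterroles -o json`), the hypothetical helper `risky_rules` below flags three patterns: wildcards, verbs that permit privilege escalation, and read access to secrets. The selection of patterns is an assumption for illustration, not a complete review.

```python
from typing import Any

# Verbs that let a subject grant itself more rights or act as someone else.
ESCALATION_VERBS = {"escalate", "bind", "impersonate"}


def risky_rules(role: dict[str, Any]) -> list[str]:
    """Return reasons why a Role or ClusterRole manifest looks over-permissive."""
    name = role.get("metadata", {}).get("name", "<unnamed>")
    reasons: list[str] = []
    for index, rule in enumerate(role.get("rules", [])):
        verbs = set(rule.get("verbs", []))
        resources = set(rule.get("resources", []))
        if "*" in verbs or "*" in resources:
            reasons.append(f"{name} rule {index}: wildcard verbs or resources")
        if verbs & ESCALATION_VERBS:
            found = sorted(verbs & ESCALATION_VERBS)
            reasons.append(f"{name} rule {index}: escalation verbs {found}")
        if "secrets" in resources and verbs & {"get", "list", "watch"}:
            reasons.append(f"{name} rule {index}: read access to secrets")
    return reasons


if __name__ == "__main__":
    operator_role = {
        "metadata": {"name": "ops"},
        "rules": [
            {"apiGroups": [""], "resources": ["pods"], "verbs": ["get", "list"]},
            {"apiGroups": ["*"], "resources": ["*"], "verbs": ["*"]},
        ],
    }
    print(risky_rules(operator_role))
```

A check like this fits well in a periodic job or in the pipeline that applies RBAC manifests, so that a wildcard rule is noticed before it is bound to an account.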

Scenario 4: Attacking etcd

etcd is Kubernetes' distributed key-value store, and it holds the entire state of the cluster, including all configuration data, secrets, and pod information. A direct attack on etcd is catastrophic.

Relevant Attack Vectors

  • Overlay Network Access: This is the primary vector. If the etcd pods are not isolated by strong network policies or if the network is not properly segmented, an attacker who gains access to a worker node could potentially communicate with the etcd cluster.
  • Direct Network Access: If etcd runs as a service on the (control plane) hosts, its TCP ports may be accessible over the network, depending on firewall configuration.
  • Lack of Encryption: If communication with etcd is not encrypted with TLS, an attacker could snoop on the network and steal sensitive data in transit.
  • Compromised Node: An attacker who has a foothold on a node could use that access to gain a direct network connection to the etcd pods running on the same or other nodes.
  • Weak Authentication: Misconfigured etcd with weak or default authentication, allowing unauthorized users to read or write data.

Possible Mitigations

  • Deploy etcd on (control plane) hosts with systemd, not as pods in Kubernetes.
  • Use a separate PKI/certificate authority for etcd and for the control plane.
  • Require client certificate authentication and limit etcd access to only the API server.
  • Harden the host operating system.
  • Monitor etcd logs for unauthorized access attempts.
  • Run etcd as a 3- or 5-member HA cluster.
  • Regularly back up etcd and test restore procedures.
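
The recommendation of 3 or 5 members follows from how etcd reaches consensus: a write needs a majority (quorum) of members, so the number of failures a cluster survives is the member count minus the quorum. The short calculation below works this out; the function names are illustrative only.

```python
def quorum(members: int) -> int:
    """Smallest majority of a cluster with the given number of members."""
    return members // 2 + 1


def failure_tolerance(members: int) -> int:
    """Number of members that can fail while a quorum remains."""
    return members - quorum(members)


if __name__ == "__main__":
    for size in range(1, 6):
        print(f"{size} members: quorum {quorum(size)}, tolerates {failure_tolerance(size)} failure(s)")
    # 3 members: quorum 2, tolerates 1 failure(s)
    # 4 members: quorum 3, tolerates 1 failure(s)
    # 5 members: quorum 3, tolerates 2 failure(s)
```

An even-sized cluster tolerates no more failures than the next smaller odd size, which is why odd member counts are preferred.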

Scenario 5: Attacking the Worker Nodes

This scenario involves an attacker targeting the operating system of the Kubernetes worker nodes themselves, rather than just the containers running on them. A compromised worker node can be used to spy on, tamper with, or take down all pods running on it. It also serves as a launchpad for attacks on other nodes or the control plane.

Relevant Attack Vectors

  • Vulnerable Kubelet: The Kubelet is an agent that runs on each worker node and communicates with the control plane. An attacker could exploit a vulnerability in the Kubelet to execute code on the host.
  • SSH Access: SSH is usually enabled. If it is protected by weak credentials or has an unpatched vulnerability, an attacker could gain direct access to the node's operating system.

Possible Mitigations

  • Harden the hosts.
  • Apply patch management to the operating system.
  • Manage container users and host users carefully with privileged login controls.
  • Use readOnlyRootFilesystem for containers.
  • Implement comprehensive SSH security: public key authentication, limited access from management servers, disabling password authentication.
  • Secure Kubelet API: lock down access, restrict ports to control plane traffic only, and monitor audit logs for unusual requests.
  • Enable Linux audit logging and forward it to the SOC.
  • Apply access management to the hosts.
  • Deploy intrusion detection.
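
The SSH settings mentioned above can be verified on each node as part of host hardening. The following sketch reads the text of an `sshd_config` file and reports deviations for two keywords. It assumes OpenSSH semantics (keywords are case-insensitive and the first value found for a keyword wins) and deliberately ignores `Include` files and everything from the first `Match` block onward, so it is a simplification of what `sshd -T` would report. The function names are hypothetical.

```python
# OpenSSH defaults for the two keywords checked here.
DEFAULTS = {"passwordauthentication": "yes", "pubkeyauthentication": "yes"}
# Values this sketch expects on a hardened node.
REQUIRED = {"passwordauthentication": "no", "pubkeyauthentication": "yes"}


def global_settings(config_text: str) -> dict[str, str]:
    """Collect global keyword values; the first occurrence of a keyword wins."""
    settings: dict[str, str] = {}
    for raw_line in config_text.splitlines():
        parts = raw_line.split()
        if not parts or parts[0].startswith("#"):
            continue  # blank line or comment
        keyword = parts[0].lower()
        if keyword == "match":
            break  # conditional blocks are out of scope for this sketch
        if len(parts) >= 2:
            settings.setdefault(keyword, parts[1].lower())
    return settings


def ssh_findings(config_text: str) -> list[str]:
    """Compare effective global settings against the required values."""
    effective = {**DEFAULTS, **global_settings(config_text)}
    return [
        f"{keyword} should be {wanted}, found {effective[keyword]}"
        for keyword, wanted in REQUIRED.items()
        if effective[keyword] != wanted
    ]


if __name__ == "__main__":
    sample = "# managed by configuration management\nPasswordAuthentication no\nPubkeyAuthentication yes\n"
    print(ssh_findings(sample))  # no findings for this sample
    print(ssh_findings(""))      # defaults leave password authentication enabled
```

Restricting SSH access to the management servers is a network-level control (firewall rules) and is not covered by this check.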

Scenario 6: Cluster Admin Compromised

This is arguably the most dangerous scenario, as it bypasses many internal cluster security controls. The attacker assumes the identity of a legitimate, highly-privileged user.

Relevant Attack Vectors

  • Stolen Credentials: A kubeconfig file with cluster-admin privileges being exfiltrated from the management server.
  • Compromised Management Server: The management server being compromised through a different attack, allowing the attacker to access its local files and credentials.
  • Lack of Multi-Factor Authentication (MFA): Without MFA, a stolen password is all an attacker needs.

Possible Mitigations

  • Separate on-premises work from office work environments.
  • Limit cluster admin access to a small group and avoid liberal use of cluster admin privileges.
  • Implement secure credential management, like public key authentication for SSH.
  • Enforce signed kubectl plugins.
  • Require a management server for admin network connectivity.
  • Alert on admin role usage outside normal hours.
  • Maintain break-glass procedures with secondary admin credentials.
  • Ensure cluster backup & restore capability.
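
Alerting on admin activity outside normal hours can be prototyped against the API server audit log. The sketch below inspects a single audit event (one JSON object per line in the log) and assumes two things that are not stated in the source: that admin activity is recognizable by membership of the `system:masters` group, and that normal hours are Monday to Friday, 07:00 to 19:00 UTC. Both would need to be adapted to the actual RBAC bindings and shift schedule.

```python
from datetime import datetime
from typing import Any

# Assumption: cluster admins authenticate as members of this built-in group.
ADMIN_GROUPS = {"system:masters"}
# Assumption: normal working hours in UTC, Monday to Friday.
WORK_START_HOUR, WORK_END_HOUR = 7, 19


def is_off_hours_admin_event(event: dict[str, Any]) -> bool:
    """Return True when an audit event was made by an admin outside working hours."""
    groups = set(event.get("user", {}).get("groups", []))
    if not groups & ADMIN_GROUPS:
        return False
    # Audit timestamps are RFC 3339 in UTC; older Python versions do not parse "Z".
    received = datetime.fromisoformat(event["requestReceivedTimestamp"].replace("Z", "+00:00"))
    is_weekend = received.weekday() >= 5
    in_hours = WORK_START_HOUR <= received.hour < WORK_END_HOUR
    return is_weekend or not in_hours


if __name__ == "__main__":
    night_event = {
        "user": {"username": "admin", "groups": ["system:masters", "system:authenticated"]},
        "verb": "delete",
        "requestReceivedTimestamp": "2025-01-15T02:13:45.123456Z",
    }
    print(is_off_hours_admin_event(night_event))  # True: 02:13 UTC on a Wednesday
```

In practice this rule would live in the SIEM or log pipeline that receives the audit log. The sketch only shows the decision logic.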

Scenario 7: Supply Chain Attacks

This is a modern, critical attack vector that targets the integrity of the software before it ever runs in your cluster. An attacker compromises the build and deployment process, not the runtime environment itself.

Relevant Attack Vectors

  • Compromised Container Images: An attacker injects malicious code into a seemingly legitimate base image on a public registry.
  • Vulnerable Dependencies: Malicious code is introduced via a third-party library or dependency used in the application's source code.
  • Compromised CI/CD Pipeline: The attacker gains access to the continuous integration/continuous delivery (CI/CD) pipeline, allowing them to modify application code or images before they are deployed.

Possible Mitigations

  • Scan images for vulnerabilities pre-deployment.
  • Restrict image sources to trusted registries.
  • Sign and verify container images with cosign, and block unsigned image deployments.
  • Pin dependencies to verified versions.
  • Use admission control to block unverified images.
  • Maintain a company registry for container images.
  • Do not allow the image tag "latest".
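
Two of the mitigations above, trusted registries and no "latest" tag, depend only on the image reference string and can be checked without pulling the image. The sketch below parses a reference using the usual container naming convention (the first path component is a registry only if it contains a dot or a colon, or is `localhost`; otherwise the reference resolves to Docker Hub). The registry name `registry.example.internal` and the function names are hypothetical. Signature verification with cosign is a separate step and is not shown.

```python
# Hypothetical company registry; replace with the registries you actually trust.
TRUSTED_REGISTRIES = {"registry.example.internal"}


def parse_image(reference: str) -> tuple[str, str, str]:
    """Split an image reference into (registry, tag, digest); missing parts are empty."""
    name, _, digest = reference.partition("@")
    # A tag can only appear in the last path segment; a colon earlier is a registry port.
    if ":" in name.rsplit("/", 1)[-1]:
        name, _, tag = name.rpartition(":")
    else:
        tag = ""
    first, slash, _ = name.partition("/")
    if slash and ("." in first or ":" in first or first == "localhost"):
        registry = first
    else:
        registry = "docker.io"  # implicit default registry
    return registry, tag, digest


def image_findings(reference: str) -> list[str]:
    """Return policy violations for one image reference."""
    registry, tag, digest = parse_image(reference)
    findings: list[str] = []
    if registry not in TRUSTED_REGISTRIES:
        findings.append(f"{reference}: registry {registry} is not trusted")
    # Without a digest, a missing tag resolves to "latest", which is mutable.
    if not digest and tag in ("", "latest"):
        findings.append(f"{reference}: uses the mutable 'latest' tag")
    return findings


if __name__ == "__main__":
    for image in ("nginx", "registry.example.internal/scada/hmi:1.4.2"):
        print(image, image_findings(image))
```

Pinning by digest (`name@sha256:...`) is the strictest form, since a tag such as `1.4.2` can still be overwritten in the registry.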

Scenario 8: Denial of Service (DoS) Attacks

Unlike attacks aimed at gaining control, a DoS attack's goal is to make the cluster or specific applications within it unavailable. This can be just as damaging, leading to financial loss or reputational damage.

Relevant Attack Vectors

  • Resource Exhaustion: An attacker deploys a malicious workload designed to consume all available CPU, memory, or disk space on a node.
  • Exploiting HPA (Horizontal Pod Autoscaler): An attacker generates a high volume of traffic to a service, tricking the autoscaler into creating a massive number of pods that exhaust cluster resources.
  • Targeting Core Services: Overwhelming the Kubernetes API server with a flood of requests, making it unresponsive.

Possible Mitigations

  • Priority Classes - Control which pods get scheduled first and which get evicted first during resource pressure.
  • PodDisruptionBudgets - Ensure minimum availability during voluntary disruptions like node maintenance.
  • ResourceQuotas - Constrain the total amount of resources a namespace can consume.
  • LimitRanges - Set default CPU/memory requests and limits for pods within a namespace to prevent any single pod from consuming excessive resources.
  • Readiness & Liveness Probes - Keep traffic away from pods that are not ready and restart containers that have stopped responding.
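
The interaction between an autoscaler and a ResourceQuota comes down to simple arithmetic: a namespace quota on `requests.cpu` caps the number of replicas that can be created, regardless of the autoscaler's `maxReplicas`. The worked example below uses assumed numbers (a quota of 8 CPUs and a request of 500m per pod) and hypothetical function names. It handles only plain and milli-CPU quantities such as "2", "1.5", or "500m".

```python
def cpu_millicores(quantity: str) -> int:
    """Convert a Kubernetes CPU quantity such as '500m' or '1.5' to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return round(float(quantity) * 1000)


def max_replicas_within_quota(quota_cpu: str, per_pod_cpu_request: str) -> int:
    """Largest number of pods whose summed CPU requests still fit in the quota."""
    return cpu_millicores(quota_cpu) // cpu_millicores(per_pod_cpu_request)


if __name__ == "__main__":
    # Assumed values: quota of 8 CPUs, each pod requests 500m.
    ceiling = max_replicas_within_quota("8", "500m")
    print(ceiling)  # 16
    hpa_max_replicas = 50  # hypothetical autoscaler setting
    print(min(hpa_max_replicas, ceiling))  # effective upper bound: 16
```

With these numbers a traffic flood can drive the namespace to at most 16 pods, because pod creation beyond the quota is rejected. Setting `maxReplicas` at or below that ceiling keeps the limit explicit rather than relying on rejections.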

Scenario 9: Exploiting Misconfigurations

This is a broad category that underpins many other attacks. While some of our scenarios mention misconfigurations as a vector, it's a valuable exercise to consider it as a stand-alone, high-level threat.

Relevant Attack Vectors

  • Insecure Default Settings: Relying on default Kubernetes settings that may be insecure, such as a permissive Pod Security Admission policy.
  • Exposed Secrets: Storing secrets in plain text or in insecure locations like environment variables. Critical secrets should not be long-lived Kubernetes secrets.
  • Open Network Ports: A service or pod is inadvertently exposed to the network, creating an easy entry point for an attacker.
  • Lack of Pod Security: Running a pod with overly permissive security contexts, such as allowPrivilegeEscalation: true or running as a root user.

Possible Mitigations

  • Place the Kubernetes cluster in a firewalled VLAN.
  • Use extended admission control (Kyverno).
  • Use external secrets management (HashiCorp Vault).
  • Enable authentication for all exposed services.
  • Apply RBAC reviews periodically.
  • Regular CIS Kubernetes Benchmark scans with kube-bench.
  • Automated validation of manifests in deployment.
  • Enable secret scanning tools.
  • Close unused network ports with NetworkPolicies.
  • Limit network exposure via ingress.
  • Alert on new public service exposures.
  • Maintain configuration in version control.
  • Test restore from known good configuration versions.
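
Automated validation of manifests can start small. As a minimal sketch, assuming manifests are parsed into Python dicts before deployment, the hypothetical function `manifest_findings` below flags two of the misconfigurations discussed in this scenario: Services that are exposed outside the cluster, and environment variables that appear to carry a secret as a literal value. The name pattern is a heuristic chosen for illustration and will miss secrets with unusual names.

```python
import re
from typing import Any

# Heuristic: environment variable names that usually hold credentials.
SECRET_NAME = re.compile(r"PASSWORD|PASSWD|SECRET|TOKEN|API_?KEY", re.IGNORECASE)
# Service types that are reachable from outside the cluster network.
EXTERNAL_SERVICE_TYPES = {"NodePort", "LoadBalancer"}


def manifest_findings(manifest: dict[str, Any]) -> list[str]:
    """Return findings for one Service or Pod manifest; other kinds yield nothing."""
    kind = manifest.get("kind")
    name = manifest.get("metadata", {}).get("name", "<unnamed>")
    spec = manifest.get("spec", {})
    findings: list[str] = []

    if kind == "Service" and spec.get("type") in EXTERNAL_SERVICE_TYPES:
        findings.append(f"Service {name}: exposed externally via {spec['type']}")

    if kind == "Pod":
        for container in spec.get("containers", []):
            for env in container.get("env", []):
                # A literal "value" (instead of "valueFrom") stores the secret in the manifest.
                if "value" in env and SECRET_NAME.search(env.get("name", "")):
                    findings.append(f"Pod {name}: {env['name']} is set as a plain-text value")
    return findings


if __name__ == "__main__":
    service = {"kind": "Service", "metadata": {"name": "hmi"}, "spec": {"type": "NodePort"}}
    print(manifest_findings(service))
```

Run against the manifests kept in version control, a check like this gives early feedback in the pipeline, while Kyverno or another admission controller remains the enforcement point in the cluster.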