Fix AKS Scaling (Cluster Autoscaler) #112

@mithunshanbhag

We're investigating the following options for AKS scaling:

1. ACI VIRTUAL NODES

Status

Currently BLOCKED.

Where

Change Description

  • Redeployed the AKS cluster via the bicep template from the mithun/hpa2 branch, which uses the Azure CNI network plugin (instead of the default kubenet plugin).
  • Had to manually modify the AKS vnet to create a new subnet, aci-subnet, with address space 10.255.0.0/16.
  • Attached the new subnet to the existing AKS cluster using az aks enable-addons (full instructions here).
  • Applied the Deployment.yaml manifest from the mithun/hpa2 branch, which adds nodeSelector and tolerations settings so that the pods run only on virtual nodes.
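
For reference, the scheduling fields mentioned in the last bullet can be sketched as follows. This is a minimal sketch, not the actual Deployment.yaml from the mithun/hpa2 branch (which is not reproduced in this issue). The label and toleration values are assumptions based on the commonly documented AKS virtual node settings and should be checked against the real node with kubectl describe node. The function name virtual_node_scheduling is hypothetical.

```python
import json


def virtual_node_scheduling() -> dict:
    """Return pod-spec fields intended to pin a pod to ACI virtual nodes.

    ASSUMPTION: these labels and tolerations follow the commonly documented
    AKS virtual node configuration; they are not copied from the branch.
    """
    return {
        "nodeSelector": {
            "kubernetes.io/role": "agent",
            "kubernetes.io/os": "linux",
            "type": "virtual-kubelet",
        },
        "tolerations": [
            # Tolerate the taint that the virtual kubelet places on its node.
            {"key": "virtual-kubelet.io/provider", "operator": "Exists"},
            {"key": "azure.com/aci", "effect": "NoSchedule"},
        ],
    }


if __name__ == "__main__":
    # Print as JSON so it can be compared with spec.template.spec in the manifest.
    print(json.dumps(virtual_node_scheduling(), indent=2))
```

These fields belong under spec.template.spec of the Deployment; comparing them with the manifest on the branch is one quick way to rule out a typo in a label key.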

Issue Details

The pods (configured to run in ACI virtual nodes) are stuck in waiting state.

image

The logs only show that an active endpoint is not being detected for the services / ingress.

image

Hypothesis

  • Could be related to the switch to the Azure CNI network plugin from the default kubenet plugin.
  • Could be related to the nodeSelector and tolerations changes made in the Deployment.yaml file to configure the pods to run only on virtual nodes.
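
To help test the second hypothesis offline, the sketch below checks whether a pod's nodeSelector and tolerations are compatible with a node's labels and taints. It is a simplified model of the Kubernetes matching rules (it ignores affinity, resources, and everything else the scheduler considers). The node labels and taint in the example are assumptions about what a virtual node typically reports, not values captured from this cluster.

```python
def tolerates(toleration: dict, taint: dict) -> bool:
    """Return True if a single toleration matches a single taint."""
    operator = toleration.get("operator", "Equal")  # Kubernetes default
    key = toleration.get("key", "")
    if key == "":
        # An empty key matches every taint key, but only with Exists.
        if operator != "Exists":
            return False
    elif key != taint["key"]:
        return False
    # An empty effect on the toleration matches every effect.
    if toleration.get("effect") and toleration["effect"] != taint["effect"]:
        return False
    if operator == "Exists":
        return True
    return toleration.get("value", "") == taint.get("value", "")


def can_schedule(pod_spec: dict, node_labels: dict, node_taints: list) -> bool:
    """Simplified check: nodeSelector must match and blocking taints must be tolerated."""
    selector_ok = all(
        node_labels.get(k) == v for k, v in pod_spec.get("nodeSelector", {}).items()
    )
    blocking = [t for t in node_taints if t["effect"] in ("NoSchedule", "NoExecute")]
    taints_ok = all(
        any(tolerates(tol, taint) for tol in pod_spec.get("tolerations", []))
        for taint in blocking
    )
    return selector_ok and taints_ok


if __name__ == "__main__":
    # ASSUMED values; replace with output from kubectl describe node.
    labels = {"type": "virtual-kubelet", "kubernetes.io/role": "agent"}
    taints = [{"key": "virtual-kubelet.io/provider", "value": "azure", "effect": "NoSchedule"}]
    pod = {
        "nodeSelector": {"type": "virtual-kubelet"},
        "tolerations": [{"key": "virtual-kubelet.io/provider", "operator": "Exists"}],
    }
    print(can_schedule(pod, labels, taints))
```

If this check passes with the real labels and taints, the scheduling settings are less likely to be the cause and the networking hypothesis deserves more attention.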

2. CLUSTER AUTOSCALER

Status

Currently INVESTIGATING.

Where

Change Description

  • Enable autoscaling with minCount: 1 and maxCount: 10
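
As a rough illustration of the change above, the sketch below builds the autoscaling fields for a node pool and validates the bounds before they are copied into the template. The property names enableAutoScaling, minCount, and maxCount match the AKS agent pool profile schema; the helper name autoscaler_profile and the validation rules are assumptions made for this example.

```python
def autoscaler_profile(min_count: int, max_count: int) -> dict:
    """Return the autoscaling fields for an AKS agent pool profile.

    ASSUMPTION: only basic bounds are checked here; AKS may enforce
    additional limits (for example on system node pools).
    """
    if min_count < 0:
        raise ValueError("minCount must not be negative")
    if max_count < min_count:
        raise ValueError("maxCount must be greater than or equal to minCount")
    return {
        "enableAutoScaling": True,
        "minCount": min_count,
        "maxCount": max_count,
    }


if __name__ == "__main__":
    # Values taken from the change description in this issue.
    print(autoscaler_profile(1, 10))
```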

Issue Details

Hypothesis

Currently none, still investigating.

Misc Notes

The ingress controller was stuck in the PENDING state for a few minutes after provisioning, then moved to the OK state on its own.

image
