86 changes: 86 additions & 0 deletions dev/load-test/README.md
@@ -0,0 +1,86 @@
# Agent Sandbox Load Testing

This directory contains configuration files for running load tests on the Agent Sandbox using [ClusterLoader2](https://github.com/kubernetes/perf-tests/tree/master/clusterloader2).

## Prerequisites

1. **Kubernetes Cluster**: You need a running Kubernetes cluster.
2. **Agent Sandbox Controller**: The controller and CRDs must be installed on the cluster.
3. **Go**: ClusterLoader2 is written in Go, and you build it from source to run the load tests.

## Setup

### 1. Install Agent Sandbox Controller

Install the agent-sandbox controller and its CRDs with the following commands.

```bash
# Replace "vX.Y.Z" with a specific version tag (e.g., "v0.1.0") from
# https://github.com/kubernetes-sigs/agent-sandbox/releases
export VERSION="vX.Y.Z"

# To install only the core components:
kubectl apply -f https://github.com/kubernetes-sigs/agent-sandbox/releases/download/${VERSION}/manifest.yaml

# To install the extension components:
kubectl apply -f https://github.com/kubernetes-sigs/agent-sandbox/releases/download/${VERSION}/extensions.yaml
```

### 2. Install ClusterLoader2

Follow the [ClusterLoader2 getting-started guide](https://github.com/kubernetes/perf-tests/blob/master/clusterloader2/docs/GETTING_STARTED.md#clusterloader2) to install ClusterLoader2. This creates a local copy of the `perf-tests` repository, which contains the `clusterloader2` directory.

## Running the Load Test

The load test is defined in `agent-sandbox-load-test.yaml`.

It creates a specified number of Sandboxes using the template in `cluster-loader-sandbox.yaml` and measures startup latency.
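
As a rough guide to how long the creation step should take, the number of Sandboxes can be divided by the tuning set's QPS. The sketch below assumes ClusterLoader2 issues create requests at approximately the configured `qps` and ignores scheduling and startup time; `estimated_phase_seconds` is a hypothetical helper, not part of ClusterLoader2.

```python
def estimated_phase_seconds(replicas: int, qps: float) -> float:
    """Approximate time to issue `replicas` requests at `qps` requests per second.

    This only covers request pacing; it does not include the time the
    Sandboxes take to become Ready.
    """
    if qps <= 0:
        raise ValueError("qps must be positive")
    return replicas / qps


if __name__ == "__main__":
    # Sample config in this directory: replicasPerNamespace=20, qps=10.
    print(estimated_phase_seconds(20, 10))    # 2.0
    # high-volume-test.yaml ramp-up: replicasPerNamespace=3000, qps=10.
    print(estimated_phase_seconds(3000, 10))  # 300.0
```

For the sample config this gives about 2 seconds, which is in line with the `Create Sandboxes` step duration in the example `junit.xml` shown under "Verify results".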

### 1. Build the cluster loader

Change into the `perf-tests/clusterloader2` directory, then build the binary:

```bash
go build -o clusterloader2 ./cmd/clusterloader.go
```

### 2. Run the load test

To run the test against a GKE cluster, execute the command below. For other cluster types, change the `--provider` value.

```bash
./clusterloader2 \
  --testconfig=../../agent-sandbox/dev/load-test/agent-sandbox-load-test.yaml \
  --kubeconfig="$HOME/.kube/config" \
  --provider=gke
```

To run the test against a local kind cluster, first install kind by following
the [kind installation](https://kind.sigs.k8s.io/docs/user/quick-start#installation) guide.

Then execute the command below:

```bash
./clusterloader2 \
  --testconfig=../../agent-sandbox/dev/load-test/agent-sandbox-load-test.yaml \
  --kubeconfig="$HOME/.kube/config" \
  --provider=kind
```

**Note:** Ensure you are in the `clusterloader2/` directory when running this command, as the configuration references `agent-sandbox-load-test.yaml` via a relative path.

### 3. Verify results

After the test finishes, the results are saved to `junit.xml` in the `clusterloader2/` directory.
The output looks like this:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<testsuite name="ClusterLoaderV2" tests="0" failures="0" errors="0" time="57.957">
<testcase name="agent-sandbox-load-test overall (../../agent-sandbox-initial-playing/load-test/agent-sandbox-load-test.yaml)" classname="ClusterLoaderV2" time="57.955555557"></testcase>
<testcase name="agent-sandbox-load-test: [step: 01] Start Startup Latency Measurement [00] - SandboxStartupLatency" classname="ClusterLoaderV2" time="0.225971844"></testcase>
<testcase name="agent-sandbox-load-test: [step: 02] Create Sandboxes" classname="ClusterLoaderV2" time="2.012305727"></testcase>
<testcase name="agent-sandbox-load-test: [step: 03] Wait for Sandboxes to be Ready [00] - WaitForSandboxes" classname="ClusterLoaderV2" time="5.095579777"></testcase>
<testcase name="agent-sandbox-load-test: [step: 04] Gather Results [00] - SandboxStartupLatency" classname="ClusterLoaderV2" time="0.126157956"></testcase>
</testsuite>
```
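
To review the per-step timings without reading the XML by hand, the report can be summarized with a short script. This is a minimal sketch using only the Python standard library; `summarize_junit` is a hypothetical helper, and it assumes failed steps are reported with the conventional JUnit `<failure>` or `<error>` child elements.

```python
import xml.etree.ElementTree as ET


def summarize_junit(source):
    """Return (name, seconds, failed) for every <testcase> in a JUnit report.

    `source` may be a file path or an open file object.
    """
    root = ET.parse(source).getroot()
    rows = []
    for case in root.iter("testcase"):
        # Assumption: a failed step carries a <failure> or <error> child.
        failed = case.find("failure") is not None or case.find("error") is not None
        rows.append((case.get("name"), float(case.get("time", "0")), failed))
    return rows
```

Calling `summarize_junit("junit.xml")` from the `clusterloader2/` directory returns one tuple per step, which can then be printed or checked for failures in CI.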
29 changes: 29 additions & 0 deletions dev/load-test/agent-sandbox-sample-load-test.yaml
@@ -0,0 +1,29 @@
name: agent-sandbox-load-test
namespace:
  number: 1
tuningSets:
- name: BurstCreate
  qpsLoad:
    qps: 10 # Adjust based on your cluster's capacity
steps:
- name: Start Startup Latency Measurement
  measurements:
  - identifier: SandboxStartupLatency
    method: PodStartupLatency # We use Pod latency as Sandboxes wrap Pods
    params:
      action: start
      labelSelector: test = agent-load
- name: Create Sandboxes
  phases:
  - namespaceRange: {min: 1, max: 1}
    replicasPerNamespace: 20 # Total sandboxes to spin up
    tuningSet: BurstCreate
    objectBundle:
    - basename: agent-env
      objectTemplatePath: "cluster-loader-sandbox.yaml"
- name: Wait for Sandboxes to be Ready
  measurements:
  - identifier: SandboxStartupLatency # Must match the identifier used with action: start
    method: PodStartupLatency
    params:
      action: gather
7 changes: 7 additions & 0 deletions dev/load-test/cluster-loader-sandbox-claim.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
apiVersion: extensions.agents.x-k8s.io/v1alpha1
kind: SandboxClaim
metadata:
  name: {{.Name}}
spec:
  sandboxTemplateRef:
    name: {{.TemplateName}}
17 changes: 17 additions & 0 deletions dev/load-test/cluster-loader-sandbox-template.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,17 @@
apiVersion: extensions.agents.x-k8s.io/v1alpha1
kind: SandboxTemplate
metadata:
  name: {{.Name}}
spec:
  podTemplate:
    metadata:
      labels:
        latency-type: {{.LatencyType}}
    spec:
      restartPolicy: Never
      runtimeClassName: gvisor

Review comment (Member): Can we template this to `{{.RuntimeClass}}`? This would allow us to run these load tests against local clusters (e.g., kind with runc) or accept other runtimeClassNames in general. It could also help with CI verification.

      containers:
      - name: python-agent
        image: python:3.11-slim
        command: ["/bin/sh", "-c"]
        args: ["echo 'Hello from the Sandbox!' && sleep 3600"]
17 changes: 17 additions & 0 deletions dev/load-test/cluster-loader-sandbox.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,17 @@
apiVersion: agents.x-k8s.io/v1alpha1
kind: Sandbox
metadata:
  # CL2 will automatically populate {{.Name}}
  name: {{.Name}}
spec:
  podTemplate:
    metadata:
      labels:
        churn-group: {{.Group}}
    spec:
      terminationGracePeriodSeconds: 1
      restartPolicy: Never
      containers:
      - name: sample-sandbox
        image: alpine
        command: ["/bin/sh", "-c", "echo 'Hello from the Agent Sandbox!'; sleep 3600"]
9 changes: 9 additions & 0 deletions dev/load-test/cluster-loader-warmpool.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,9 @@
# Warmpool Template
apiVersion: extensions.agents.x-k8s.io/v1alpha1
kind: SandboxWarmPool
metadata:
  name: {{.Name}}
spec:
  replicas: {{.Replicas}}
  sandboxTemplateRef:
    name: {{.TemplateName}}
71 changes: 71 additions & 0 deletions dev/load-test/high-volume-test.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,71 @@
name: high-volume-test
automanagedNamespaces: 0

tuningSets:
- name: RampUp
qpsLoad:
qps: 10 # 10 sandbox creations per second
- name: RampDown
qpsLoad:
qps: 50 # 50 sandbox deletions per second

steps:
# 1. Start Metric Measurement
- name: Start Metric Measurement
measurements:
- identifier: SchedulingThroughput
method: SchedulingThroughput
params:
action: start
- identifier: SandboxStartupLatency
method: PodStartupLatency
params:
action: start
labelSelector: "agents.x-k8s.io/sandbox-name-hash"

# 2. Ramp up the load linearly at a steady 10 QPS.
- name: Linear Ramp Up
phases:
# Ramp Up to total replicas at a linear rate of 10 / sec.
- name: Scale Up
namespaceRange: {min: 1, max: 1}
tuningSet: RampUp
replicasPerNamespace: 3000 # Total sandboxes to spin up
objectBundle:
- basename: agent-sandbox
identifier: "linear-rampup"
objectTemplatePath: cluster-loader-sandbox.yaml
templateFillMap:
Group: linear-rampup
waits:
- type: Running
labelSelector: "agents.x-k8s.io/sandbox-name-hash"

- name: Wait for Prometheus Scrape
measurements:
- identifier: Wait
method: Sleep
params:
duration: 20s

# 3. Gather Measurement
- name: Gather Latency Measurement
measurements:
- identifier: SchedulingThroughput
method: SchedulingThroughput
params:
action: gather
- identifier: SandboxStartupLatency
method: PodStartupLatency
params:
action: gather

# 4. Delete Sandboxes to keep the cluster clean
- name: Cleanup Sandboxes
phases:
- namespaceRange: {min: 1, max: 1}
replicasPerNamespace: 0
tuningSet: RampDown
objectBundle:
- basename: agent-sandbox
objectTemplatePath: cluster-loader-sandbox.yaml
88 changes: 88 additions & 0 deletions dev/load-test/medium-scale-concurrent-load-test.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,88 @@
name: performance-churn-test-steady-state
automanagedNamespaces: 0

tuningSets:
- name: ConstantChurn
qpsLoad:
qps: 2 # Constant rate of 2 sandbox creations and 2 deletions per second
- name: QuickFinalDeletion
qpsLoad:
qps: 100 # 100 sandbox deletions per second for final cleanup

steps:
# 1. Initial Warmup to establish steady state for Group A
- name: Initial Warmup
phases:
- namespaceRange: {min: 1, max: 1}
replicasPerNamespace: 1200
tuningSet: ConstantChurn
objectBundle:
- basename: churn-warmup-group
objectTemplatePath: cluster-loader-sandbox.yaml
templateFillMap:
Group: warmup-group
waits:
- type: Running
labelSelector: "agents.x-k8s.io/sandbox-name-hash"

# 2. Start Measurements for Group B sandbox creations
- name: Start Measurements
measurements:
- identifier: PodStartupLatency
method: PodStartupLatency
params:
action: start
labelSelector: "churn-group=continuous-churn-group"

# 3. Run a continuous churn loop: sandboxes are created in Group B
# while sandboxes in Group A are deleted concurrently,
# which maintains a steady state.
- name: Continuous Churn Cycle
phases:
- name: Create
namespaceRange: {min: 1, max: 1}
replicasPerNamespace: 1200
tuningSet: ConstantChurn
objectBundle:
- basename: churn-continuous-churn-group
objectTemplatePath: cluster-loader-sandbox.yaml
templateFillMap:
Group: continuous-churn-group
waits:
- type: Running
labelSelector: "agents.x-k8s.io/sandbox-name-hash"
- name: Delete
namespaceRange: {min: 1, max: 1}
replicasPerNamespace: 0
tuningSet: ConstantChurn
objectBundle:
- basename: churn-warmup-group
objectTemplatePath: cluster-loader-sandbox.yaml
waits:
- type: Deleted
labelSelector: "churn-group=warmup-group"

# 4. Gather Measurements for Group B sandbox creations
- name: Gather Measurements
measurements:
- identifier: PodStartupLatency
method: PodStartupLatency
params:
action: gather
labelSelector: "churn-group=continuous-churn-group"

# 5. Cleanup any remaining Sandboxes to keep the cluster clean
- name: Cleanup Remaining Sandboxes
phases:
- namespaceRange: {min: 1, max: 1}
replicasPerNamespace: 0
tuningSet: ConstantChurn
objectBundle:
- basename: churn-warmup-group
objectTemplatePath: cluster-loader-sandbox.yaml
- namespaceRange: {min: 1, max: 1}
replicasPerNamespace: 0
tuningSet: QuickFinalDeletion
objectBundle:
- basename: churn-continuous-churn-group
objectTemplatePath: cluster-loader-sandbox.yaml