AI coding agent guide for YDB SLO Action project.
GitHub Actions toolkit for automated SLO testing of YDB database SDKs.
Two actions:
- init: deploys YDB cluster with chaos testing and Prometheus monitoring
- report: generates performance reports, posts them as PR comments with base branch comparison
Stack: TypeScript (ESM, Node.js 24), Bun (bundler), Docker Compose, GitHub Actions API, Prometheus
- Actions split into lifecycle files (main.ts, post.ts) and utility modules (lib/)
- Lifecycle files = thin orchestrators
- Business logic = focused modules (one per domain: Docker, GitHub API, Prometheus, etc.)
- Prevents monolithic files
- Docker Compose defines all infrastructure
- Metrics = YAML files (not hardcoded)
- Chaos scenarios = shell scripts
- deploy/ directory copied to .slo/ at runtime
- Users customize without code changes
- init uploads metrics/logs/PR data as artifacts
- report downloads artifacts from current + base branch
- Allows separate jobs, re-running reports without re-testing
- Prometheus queries in YAML
- Docker Compose configurable via env vars
- Chaos scenarios = scripts (no rebuild needed)
- Action inputs for customization
- Action manages infrastructure only
- Users bring their own test scripts
- Sane defaults, full customization available
- Reports auto-posted to PR comments
action-name/
├── action.yml # Interface definition
├── main.ts # Entry point (orchestrator)
├── post.ts # Cleanup/post-processing
└── lib/ # Utility modules (single responsibility)
Rules:
- Entry points = orchestrators (high-level steps only)
- Modules = specialists (one domain each)
- Files > 150-200 lines → consider splitting
- Single responsibility principle
- Source: init/ and report/
- Output: dist/ (auto-generated, NEVER edit manually)
- Husky pre-commit hook auto-rebuilds and stages dist/
- Run bun run bundle to rebuild manually
- deploy/ = all infrastructure definitions (Docker Compose, configs, chaos scenarios, metrics)
- Copied to .slo/ at runtime
- Local testing = CI testing
Rootfs pattern:
- Each image dir (e.g., ydb/, chaos/) has Dockerfile + rootfs/ dir
- COPY rootfs / in Dockerfile copies entire rootfs/ to container root
- Example: deploy/chaos/rootfs/opt/ydb.tech/scripts/ → /opt/ydb.tech/scripts/ in container
- Makes container filesystem structure explicit and easy to navigate
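The rootfs mapping above can be expressed as a one-line rule: everything after the rootfs/ segment is the absolute path inside the container. The sketch below is only an illustration of that rule; containerPath is a hypothetical helper, not part of the project.

```typescript
// Hypothetical helper: maps a repo path under <image>/rootfs/ to its container path
function containerPath(repoPath: string): string | undefined {
	let marker = '/rootfs/'
	let index = repoPath.indexOf(marker)
	// Paths outside a rootfs/ directory are not copied by COPY rootfs /
	if (index === -1) return undefined
	// Keep the slash that ends the marker so the result is absolute
	return repoPath.slice(index + marker.length - 1)
}
```

For the example in the list, containerPath('deploy/chaos/rootfs/opt/ydb.tech/scripts/') yields '/opt/ydb.tech/scripts/'.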
# Setup
bun install
# Development
bun run bundle # Build and verify
# Commit (husky auto-handles dist/ rebuild and staging)
git commit -m "emoji subject"
Testing: E2E in real GitHub Actions workflows (no unit tests). Stub SLO test in repo verifies action works.
TypeScript conventions:
- ESM with .js extensions in imports: import { x } from './module.js'
- node: protocol for built-ins: import * as fs from 'node:fs'
- Prefer let over const
- No semicolons, single quotes, tabs (Prettier config)
Formatting: Auto via Prettier + oxlint (runs on pre-commit)
- main.ts runs PRE user workload → deploys infrastructure, saves state via saveState()
- User workload runs
- post.ts runs POST → collects metrics, uploads artifacts, cleanup
State passed via saveState()/getState(): cwd, workload name, PR number, start timestamp
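Because saveState()/getState() carry plain strings, the state listed above has to be encoded in main.ts and decoded in post.ts. The following is a minimal sketch of that handoff, assuming the functions come from @actions/core; the SloState type, its field names, and the key names are hypothetical.

```typescript
// Hypothetical shape of the state handed from main.ts to post.ts
type SloState = {
	cwd: string
	workload: string
	pullNumber: number
	startedAt: Date
}

// State values are stored as strings, so every field is encoded as text
function encodeState(state: SloState): Record<string, string> {
	return {
		cwd: state.cwd,
		workload: state.workload,
		pull: String(state.pullNumber),
		start: state.startedAt.toISOString(),
	}
}

// read is expected to behave like getState: it returns '' for a missing key
function decodeState(read: (key: string) => string): SloState {
	let pullNumber = Number.parseInt(read('pull'), 10)
	let startedAt = new Date(read('start'))
	if (Number.isNaN(pullNumber) || Number.isNaN(startedAt.getTime())) {
		throw new Error('State is missing or malformed, did main.ts run?')
	}
	return { cwd: read('cwd'), workload: read('workload'), pullNumber, startedAt }
}
```

In main.ts each pair from encodeState would be passed to saveState(key, value), and post.ts would call decodeState(getState). Failing loudly on malformed state keeps post.ts from uploading artifacts under a wrong workload name.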
- Define metrics in YAML: name, PromQL query, optional type, optional step
- (Optional) Control numeric precision in reports with round (number step), e.g. round: 0.01 to round to 2 decimal places.
  - Rounding is applied consistently to instant values and aggregated range values during analysis/report rendering.
  - This avoids floating-point noise in charts/tables without adding name-based heuristics in code.
- Parse YAML at runtime
- Query Prometheus API
- Serialize as JSONL (not JSON array)
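Two of the steps above are easy to get subtly wrong: rounding to a step without reintroducing floating-point noise, and writing JSONL rather than a JSON array. The sketch below illustrates both under stated assumptions: the MetricRecord shape and all function names are hypothetical, and the rounding is shown here only because round belongs to the metric definition (the guide says it is applied during analysis/report rendering).

```typescript
// Hypothetical metric record, field names are illustrative
type MetricRecord = { name: string; value: number }

// Round to the nearest multiple of step, then trim floating-point noise
function roundToStep(value: number, step: number): number {
	// Decimals in the step, e.g. 0.01 -> 2 (exponent notation such as 1e-7 is not handled)
	let decimals = (String(step).split('.')[1] ?? '').length
	return Number((Math.round(value / step) * step).toFixed(decimals))
}

// JSONL: one JSON document per line, no enclosing array
function toJsonl(records: MetricRecord[]): string {
	return records.map((r) => JSON.stringify(r) + '\n').join('')
}

// Blank lines are skipped so a trailing newline does not break parsing
function fromJsonl(text: string): MetricRecord[] {
	return text
		.split('\n')
		.filter((line) => line.trim() !== '')
		.map((line) => JSON.parse(line) as MetricRecord)
}
```

With round: 0.01, roundToStep(1.23456, 0.01) gives 1.23, and with round: 0.5, roundToStep(7.3, 0.5) gives 7.5.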
- Download current run metrics from artifacts
- Fetch latest successful base branch workflow run
- Download base branch metrics
- Merge (current first, base second)
- Render with ASCII charts
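The merge and render steps above can be sketched with plain data. This is an illustration only: the MetricPoint shape, the ref tag, and the bar renderer are assumptions, and the real report may compare and draw metrics differently.

```typescript
// Hypothetical record shape, the real JSONL fields may differ
type MetricPoint = { name: string; value: number }
type TaggedPoint = MetricPoint & { ref: 'current' | 'base' }

// Merge keeps the documented order: current first, base second
function mergeRuns(current: MetricPoint[], base: MetricPoint[]): TaggedPoint[] {
	return [
		...current.map((p) => ({ ...p, ref: 'current' as const })),
		...base.map((p) => ({ ...p, ref: 'base' as const })),
	]
}

// Relative change of current against base, per metric name
function compareRuns(points: TaggedPoint[]): Map<string, number> {
	let baseByName = new Map<string, number>()
	for (let p of points) {
		if (p.ref === 'base') baseByName.set(p.name, p.value)
	}
	let deltas = new Map<string, number>()
	for (let p of points) {
		let baseValue = baseByName.get(p.name)
		// Skip metrics without a usable base value to avoid dividing by zero
		if (p.ref !== 'current' || baseValue === undefined || baseValue === 0) continue
		deltas.set(p.name, (p.value - baseValue) / baseValue)
	}
	return deltas
}

// Minimal ASCII bar, a stand-in for the report's real chart renderer
function bar(value: number, max: number, width = 20): string {
	let ratio = max > 0 ? Math.min(Math.max(value, 0), max) / max : 0
	let filled = Math.round(ratio * width)
	return '#'.repeat(filled) + '.'.repeat(width - filled)
}
```

A metric that exists only on the current branch simply has no entry in the result, which lets the report mark it as new instead of showing a misleading delta.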
Principle: Simple shell scripts, easy to write/understand.
Pattern:
#!/bin/sh
set -e
. /opt/ydb.tech/scripts/chaos/libchaos.sh
echo "Scenario: Description"
nodeForChaos=$(get_random_database_node)
# chaos logic (docker stop/pause/network manipulation)
# restore to healthy state
echo "Scenario completed"
Naming: NN-descriptive-name.sh (e.g., 01-graceful-stop.sh)
Helper functions: get_random_database_node, get_random_storage_node, get_random_node, log "msg"
Rules:
- Always restore system to healthy state
- Use randomization (random node selection)
- Add echo for observability
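A small check for the NN-descriptive-name.sh convention can catch misnamed scenario files early. The sketch assumes that NN means exactly two digits and that the description is lowercase kebab-case; neither is confirmed by this guide, and parseScenarioName is a hypothetical helper.

```typescript
// Assumption: NN is exactly two digits and the description is lowercase kebab-case
let scenarioPattern = /^(\d{2})-([a-z0-9]+(?:-[a-z0-9]+)*)\.sh$/

// Hypothetical helper: returns undefined for names that break the convention
function parseScenarioName(file: string): { order: number; slug: string } | undefined {
	let match = scenarioPattern.exec(file)
	if (!match) return undefined
	let [, order = '', slug = ''] = match
	return { order: Number(order), slug }
}
```

For example, parseScenarioName('01-graceful-stop.sh') returns order 1 and slug 'graceful-stop', while 'graceful-stop.sh' is rejected.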
emoji subject (max 80 chars)
Body: WHAT and WHY (not HOW). Wrap at 80 chars.
Emojis: ✨ feature | 🐛 fix | 📝 docs | ♻️ refactor | 🔧 config/build | 🐳 docker | 🧪 tests | 🚀 CI/CD
Style: Imperative mood, capital after emoji, no period at end of subject
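Most of the subject-line rules above are mechanical and can be checked before committing. The sketch below is a hypothetical checker, not a hook that exists in the repo; it cannot verify imperative mood, and it measures the 80-char limit in UTF-16 code units, which is an assumption about how the limit is counted.

```typescript
// Emojis listed in the convention above
let commitEmojis = ['✨', '🐛', '📝', '♻️', '🔧', '🐳', '🧪', '🚀']

// Returns a list of problems, an empty list means the subject passes
function checkSubject(subject: string): string[] {
	let problems: string[] = []
	let emoji = commitEmojis.find((e) => subject.startsWith(e + ' '))
	if (!emoji) {
		problems.push('must start with a known emoji followed by a space')
	} else {
		let first = subject.slice(emoji.length + 1).charAt(0)
		// A capital letter equals its uppercase form and differs from its lowercase form
		if (first === '' || first !== first.toUpperCase() || first === first.toLowerCase()) {
			problems.push('text after the emoji must start with a capital letter')
		}
	}
	// Assumption: length in UTF-16 code units, so some emojis count as 2
	if (subject.length > 80) problems.push('must be at most 80 chars')
	if (subject.endsWith('.')) problems.push('must not end with a period')
	return problems
}
```

checkSubject('✨ Add metrics rounding') returns an empty list, while checkSubject('fix stuff.') reports both the missing emoji and the trailing period.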
Minimal inputs:
- github_token (API access)
- workload_name (test identifier)
Extension points:
- Custom metrics: metrics_yaml or metrics_yaml_path input
- Custom chaos: add scenarios to fork
- Custom analysis: download artifacts
Report: Finds existing comment and updates (one per workload)
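One common way to keep a single comment per workload is to embed a hidden marker in the comment body and search for it on the next run. The sketch below shows that technique as an assumption; the marker format, the IssueComment type, and planComment are hypothetical, and the action may identify its comment differently.

```typescript
// Minimal view of a PR comment, as returned by a comments listing
type IssueComment = { id: number; body: string }

// Hypothetical hidden marker, unique per workload
function commentMarker(workload: string): string {
	return `<!-- slo-report:${workload} -->`
}

// Decide whether to update the existing comment or create a new one
function planComment(comments: IssueComment[], workload: string, report: string) {
	let marker = commentMarker(workload)
	let body = `${marker}\n${report}`
	let existing = comments.find((c) => c.body.includes(marker))
	if (existing) return { action: 'update' as const, id: existing.id, body }
	return { action: 'create' as const, body }
}
```

Because the marker includes the workload name, two workloads reporting on the same PR each keep their own comment.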
- Never edit dist/ manually: auto-generated, changes will be lost
- Import paths must include .js: import { x } from './module.js' (ESM requirement)
- Docker Compose cwd matters: always set to directory with compose.yml
- Artifact naming: {workload}-{type}.{extension} (e.g., my-workload-metrics.jsonl)
- Husky handles dist/ rebuild: don't commit dist/ manually
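The artifact naming rule above is simple to build but slightly ambiguous to parse, because workload names may contain dashes. The sketch below uses hypothetical helpers and assumes the type itself never contains a dash or a dot.

```typescript
type ArtifactParts = { workload: string; type: string; extension: string }

// Builds a name following {workload}-{type}.{extension}
function artifactName(parts: ArtifactParts): string {
	return `${parts.workload}-${parts.type}.${parts.extension}`
}

// Assumption: the type is the last dash-separated part before the extension
function parseArtifactName(name: string): ArtifactParts | undefined {
	let dot = name.lastIndexOf('.')
	let dash = name.lastIndexOf('-', dot)
	if (dot <= 0 || dash <= 0 || dash + 1 >= dot) return undefined
	return {
		workload: name.slice(0, dash),
		type: name.slice(dash + 1, dot),
		extension: name.slice(dot + 1),
	}
}
```

For my-workload-metrics.jsonl this yields workload 'my-workload', type 'metrics', and extension 'jsonl'.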
- github_token: minimum permissions (resolve PR, artifacts, comments)
- Chaos container: privileged Docker socket access (review scripts carefully)
- Artifacts may contain sensitive data (logs, metrics)
Update this file when: core architectural decisions, design patterns, workflow, or coding standards change. Focus on PRINCIPLES, not file listings.