Production-oriented automation for provisioning and operating a multi-node k3s cluster across Proxmox hosts, with TrueNAS-backed persistent storage and Traefik ingress.
Beginner journey (from scratch): docs/BEGINNER_JOURNEY.md
Quick operator start: docs/GETTING_STARTED.md
Kubernetes manager + Dockhand sync: docs/DOCKHAND_HEADLAMP_WORKFLOW.md
Notify monitoring + alerting: docs/NOTIFY_MONITORING.md
Latest live ops audit: docs/OPERATIONS_AUDIT_2026-02-22.md
Contributing: CONTRIBUTING.md
Security policy: SECURITY.md
Security rulebook: docs/SECURITY_RULEBOOK.md
Deep environment reference: STACK.md
- Terraform VM provisioning across multiple Proxmox hosts
- Ansible-based cluster bootstrap and app deployment
- Traefik ingress with MetalLB LoadBalancer IP
- Native Kubernetes web manager (Headlamp) exposed via Traefik
- Mixed persistent storage (truenas-nfs + local-path) based on live workload constraints
- Security-first manifest policy: tracked app manifests contain no plaintext secret values
Mac/Linux control node
|
|- Terraform -> Proxmox PVE1 + PVE2
| |- k3s-master-01
| |- k3s-worker-01
| |- k3s-worker-02
| '- k3s-worker-03
|
|- Ansible -> k3s bootstrap + addons
| |- MetalLB
| |- Traefik
| '- NFS storage class
|
'- kubectl -> app manifests -> PVCs on TrueNAS NFS
.
|- ansible/ # cluster + apps playbooks
|- manifests/ # namespaces, ingress, app manifests
| |- apps/
| |- ingress/
| '- secrets/ # local-only overlays (ignored)
|- scripts/ # end-to-end lifecycle scripts
|- terraform/ # Proxmox infrastructure definitions
|- docs/ # operator documentation and security runbooks
|- STACK.md # full environment reference
'- README.md
- bash, git, python3
- terraform, ansible, kubectl, helm
- Proxmox API access + template VM available
- TrueNAS NFS export for persistent volumes
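The tool prerequisites above can be checked before running any lifecycle script. The following is a minimal sketch, not part of this repository: the function names `missing_tools` and `preflight` are hypothetical, and the check only confirms that each command is on PATH, not that its version is suitable.

```shell
# Hypothetical helper (not shipped in scripts/): print each named tool
# that is not found on PATH, one per line.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Check the CLI prerequisites listed in this README.
# Returns non-zero and lists the gaps on stderr if anything is missing.
preflight() {
  local missing
  missing="$(missing_tools bash git python3 terraform ansible kubectl helm)"
  if [ -n "$missing" ]; then
    printf 'missing tools:\n%s\n' "$missing" >&2
    return 1
  fi
  echo "all prerequisite tools found"
}
```

Sourcing this snippet and calling `preflight` reports any gaps. Proxmox API access and the TrueNAS NFS export still have to be verified separately.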
# 1) Install tooling
bash scripts/00-install-tools.sh
# 2) Prepare Proxmox template (one-time)
bash scripts/00-create-proxmox-template.sh
# 3) Configure Terraform vars locally
cp terraform/terraform.tfvars.example terraform/terraform.tfvars
# edit terraform/terraform.tfvars (do not commit)
# 4) Provision VMs
bash scripts/01-provision.sh
# 5) Bootstrap k3s
export K3S_TOKEN="CHANGE_ME_MIN_20_CHARS"
bash scripts/02-cluster-setup.sh
# 6) Configure storage
bash scripts/03-storage-setup.sh
# 7) Create runtime secrets locally (see manifests/secrets/README.md)
# Then deploy apps
bash scripts/04-deploy-apps.sh
# 8) Sync kube contexts into Dockhand tracked environments
bash scripts/05-sync-dockhand-contexts.sh
- Real credentials must never be committed.
- App manifests reference Secrets by name only; they do not carry real secret values.
- Runtime secrets should be created in manifests/secrets/*.yml (ignored) or directly with kubectl create secret ....
- Local sensitive artifacts (.k3s-node-token, kubeconfig-raw.yml, terraform/terraform.tfvars, terraform.tfstate) are ignored by .gitignore.
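To confirm that the local artifacts named above really are covered by ignore rules, `git check-ignore` can be asked about each path. This is a sketch under assumptions: the function name `unignored_paths` is hypothetical, the paths are taken as written in the list above, and the command is expected to run from the repository root.

```shell
# Hypothetical helper: print each path that git does NOT report as ignored.
# `git check-ignore -q` exits 0 when the path matches an ignore rule.
# Any other exit status (including running outside a repository) is
# treated here as "not ignored", so the path is printed.
unignored_paths() {
  local path
  for path in "$@"; do
    git check-ignore -q -- "$path" 2>/dev/null || printf '%s\n' "$path"
  done
}

# Example call, using the artifact names from the list above:
#   unignored_paths .k3s-node-token kubeconfig-raw.yml \
#     terraform/terraform.tfvars terraform.tfstate
```

Empty output means every listed path matches an ignore rule. Any path that is printed deserves a look before the next commit.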
Example secret creation:
kubectl create secret generic vaultwarden-secret -n vaultwarden \
--from-literal=ADMIN_TOKEN='CHANGE_ME' \
--from-literal=DOMAIN='https://vault.smartmur.ca' \
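--dry-run=client -o yaml | kubectl apply -f -
# bash -n syntax-checks only its first file argument, so loop over the scripts
for f in scripts/*.sh; do bash -n "$f"; done
pre-commit run --all-files
python3 scripts/security_scrub.py
Install hooks once per clone: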
brew install pre-commit
pre-commit install
- CI checks: .github/workflows/ci.yml
- Dependabot updates: .github/dependabot.yml
- Dependabot auto-merge flow: .github/workflows/dependabot-automerge.yml
- Cluster not reachable: kubectl cluster-info
- App rollout stuck: kubectl get pods -A and kubectl logs -n <ns> deploy/<app>
- Traefik VIP missing: kubectl get svc traefik -n traefik
- Placeholder guard triggered: replace CHANGE_ME_* values via secret creation flow
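For illustration, a placeholder guard of the kind mentioned above could look like the sketch below. It is not the repository's actual guard: the function name `is_real_secret` is hypothetical, and the 20-character default is only inferred from the `CHANGE_ME_MIN_20_CHARS` placeholder used for `K3S_TOKEN`.

```shell
# Hypothetical guard: succeed only if the value is non-empty, does not
# start with the CHANGE_ME placeholder prefix, and is at least min_len
# characters long (default 20, an assumption based on the K3S_TOKEN hint).
is_real_secret() {
  local value="$1" min_len="${2:-20}"
  case "$value" in
    ''|CHANGE_ME*) return 1 ;;
  esac
  [ "${#value}" -ge "$min_len" ]
}

# Example use before bootstrapping:
#   is_real_secret "${K3S_TOKEN:-}" || {
#     echo "K3S_TOKEN is unset, too short, or still a placeholder" >&2
#     exit 1
#   }
```

Running a check like this locally before a deploy surfaces leftover placeholders early, so the guard does not trip partway through.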
bash scripts/99-teardown.sh