This one's a bit more of a challenge than CPU/Memory, due to three problems:

1. CloudCost Exporter does not emit metrics for persistent volumes on Azure (grafana/cloudcost-exporter#236)
2. AWS EBS cost metrics do not have a cluster label (grafana/cloudcost-exporter#450)
3. Persistent volumes in GKE and EKS emit the total hourly cost of the volume, _not_ the hourly cost per GiB

I used the Prometheus `or` operator (https://prometheus.io/docs/prometheus/latest/querying/operators/#logical-set-binary-operators) to work around the missing Azure PV costs. Effectively, the query will attempt to find the average cost of PVs for:

1. EKS volumes via CloudCost Exporter
2. GKE volumes via CloudCost Exporter
3. Azure volumes via OpenCost

This works because we're only querying one cluster at a time _by name_, and we rely upon the fact that cluster names are unique within Grafana Labs infrastructure.

The missing cluster label on the EKS cost metrics, and on persistent volume cost metrics in general, can be overcome by using the `kube_persistentvolume_capacity_bytes` metric emitted by kube-state-metrics.

This was tested by looking at an EKS cluster like so:

```shell
go run ./cmd/estimator/ \
  -use.cloud.cost.exporter.metrics=true \
  -from $PWD/pkg/costmodel/testdata/resource/StatefulSet.json \
  -to $PWD/pkg/costmodel/testdata/resource/StatefulSet-more-storage.json \
  -http.config.file ~/.config/dev.yaml \
  -prometheus.address $PROMETHEUS_ADDRESS \
  dev-us-east-0
```
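For illustration, problem 3 above comes down to dividing each volume's total hourly cost by its capacity in GiB before averaging. The following is a minimal Go sketch of that arithmetic only, not the code in this PR: the `volume` type and `averageCostPerGiBHour` function are hypothetical names, and it assumes each volume's hourly cost has already been paired with the capacity reported by `kube_persistentvolume_capacity_bytes`.

```go
package main

import (
	"errors"
	"fmt"
)

// bytesPerGiB is the number of bytes in one gibibyte (2^30).
const bytesPerGiB = 1 << 30

// volume is a hypothetical pairing of a persistent volume's total hourly
// cost with its capacity in bytes. It is not a type from the PR.
type volume struct {
	HourlyCostUSD float64
	CapacityBytes float64
}

// averageCostPerGiBHour converts each volume's total hourly cost into an
// hourly cost per GiB and returns the mean across all volumes.
// Volumes without a positive capacity are skipped to avoid dividing by zero.
func averageCostPerGiBHour(vols []volume) (float64, error) {
	var sum float64
	var n int
	for _, v := range vols {
		if v.CapacityBytes <= 0 {
			continue
		}
		sum += v.HourlyCostUSD / (v.CapacityBytes / bytesPerGiB)
		n++
	}
	if n == 0 {
		return 0, errors.New("no volumes with a positive capacity")
	}
	return sum / float64(n), nil
}

func main() {
	// Made-up example values, chosen only to show the unit conversion.
	vols := []volume{
		{HourlyCostUSD: 0.01, CapacityBytes: 100 * bytesPerGiB},
		{HourlyCostUSD: 0.005, CapacityBytes: 50 * bytesPerGiB},
	}
	avg, err := averageCostPerGiBHour(vols)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("average cost: $%.6f per GiB-hour\n", avg)
}
```

In the PR itself this normalization would presumably happen inside the PromQL query rather than in Go; the sketch only makes the unit conversion explicit.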
| pv_hourly_cost{cluster="%s"} | ||
| )[24h:1m] | ||
| )` | ||
| cloudcostQueryPersistentVolumeCost = ` |
Any particular reason to use 24h for opencost vs instant query for cloudcost-exporter metrics?
Great question! During the hackathon, I had arbitrarily picked a 24h lookback window. When testing out the new queries, there was a negligible difference between using a lookback and an instant query. The resulting values were ~$0.00001 different. I don't think that difference is worth the computational increase of issuing the queries with a 24h lookback.
I can't really link to the Explore queries, as they would surface internal metrics. If you want, I can share them in Slack.
thanks for the details, no need to hard proof them :) 👍🏼