Commit c774183

OCPBUGS-55755: gather P50, P95 and P99 for etcd disk metrics from all CI runs (#70577)
* gather P50, P95 and P99 for etcd disk metrics from all CI runs
  Data will be charted and analyzed to help inform the alert thresholds we should ship with.
* Fix a query problem I hope is why we didn't get metrics here
* Fix use of irate vs rate and add a p999 to capture spikes
1 parent 13a9fc0 commit c774183

1 file changed: +10 -0 lines changed

ci-operator/step-registry/gather/extra/gather-extra-commands.sh

Lines changed: 10 additions & 0 deletions
@@ -563,6 +563,16 @@ ${t_all} cluster:etcd:write:requests:latency:total:quantile histogram_quanti
 ${t_install} cluster:etcd:write:requests:latency:install:quantile histogram_quantile(0.99, sum(rate(etcd_request_duration_seconds_bucket{operation=~"create|update|delete"}[${d_install}])) by (le,scope))
 ${t_test} cluster:etcd:write:requests:latency:test:quantile histogram_quantile(0.99, sum(rate(etcd_request_duration_seconds_bucket{operation=~"create|update|delete"}[${d_test}])) by (le,scope))
 
+${t_test} cluster:etcd:disk:wal:fsync:test:p999:quantile avg(histogram_quantile(0.999, rate(etcd_disk_wal_fsync_duration_seconds_bucket{job="etcd"}[${d_test}])))
+${t_test} cluster:etcd:disk:wal:fsync:test:p99:quantile avg(histogram_quantile(0.99, rate(etcd_disk_wal_fsync_duration_seconds_bucket{job="etcd"}[${d_test}])))
+${t_test} cluster:etcd:disk:wal:fsync:test:p95:quantile avg(histogram_quantile(0.95, rate(etcd_disk_wal_fsync_duration_seconds_bucket{job="etcd"}[${d_test}])))
+${t_test} cluster:etcd:disk:wal:fsync:test:p50:quantile avg(histogram_quantile(0.50, rate(etcd_disk_wal_fsync_duration_seconds_bucket{job="etcd"}[${d_test}])))
+
+${t_test} cluster:etcd:disk:backend:commit:test:p999:quantile avg(histogram_quantile(0.999, rate(etcd_disk_backend_commit_duration_seconds_bucket{job=~".*etcd.*"}[${d_test}])))
+${t_test} cluster:etcd:disk:backend:commit:test:p99:quantile avg(histogram_quantile(0.99, rate(etcd_disk_backend_commit_duration_seconds_bucket{job=~".*etcd.*"}[${d_test}])))
+${t_test} cluster:etcd:disk:backend:commit:test:p95:quantile avg(histogram_quantile(0.95, rate(etcd_disk_backend_commit_duration_seconds_bucket{job=~".*etcd.*"}[${d_test}])))
+${t_test} cluster:etcd:disk:backend:commit:test:p50:quantile avg(histogram_quantile(0.50, rate(etcd_disk_backend_commit_duration_seconds_bucket{job=~".*etcd.*"}[${d_test}])))
+
 ${t_all} cluster:etcd:read:requests:latency:total:avg sum(rate(etcd_request_duration_seconds_sum{operation=~"get|list|listWithCount"}[${d_all}])) by (le,scope) / sum(rate(etcd_request_duration_seconds_count{operation=~"get|list|listWithCount"}[${d_all}])) by (le,scope)
 ${t_install} cluster:etcd:read:requests:latency:install:avg sum(rate(etcd_request_duration_seconds_sum{operation=~"get|list|listWithCount"}[${d_install}])) by (le,scope) / sum(rate(etcd_request_duration_seconds_count{operation=~"get|list|listWithCount"}[${d_install}])) by (le,scope)
 ${t_test} cluster:etcd:read:requests:latency:test:avg sum(rate(etcd_request_duration_seconds_sum{operation=~"get|list|listWithCount"}[${d_test}])) by (le,scope) / sum(rate(etcd_request_duration_seconds_count{operation=~"get|list|listWithCount"}[${d_test}])) by (le,scope)
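
Each line in the hunk above reads as three whitespace-separated fields: an evaluation timestamp, a name for the resulting series, and a PromQL expression. The diff does not show how the script consumes these lines, so the following is only a minimal sketch of how one such line could be split, assuming that three-field layout. The function name `parse_query_line` and the values given to `t_test` and `d_test` are hypothetical and are not taken from the script.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from gather-extra-commands.sh): split one query
# line into timestamp, series name, and PromQL expression.
# Assumes the three-field, whitespace-separated layout seen in the diff.
parse_query_line() {
  local line="$1" ts name query
  # read -r keeps backslashes literal; the last variable receives the rest
  # of the line, so spaces inside the PromQL expression are preserved.
  read -r ts name query <<< "${line}"
  # Reject blank or incomplete lines, such as the blank separators in the diff.
  if [[ -z "${ts}" || -z "${name}" || -z "${query}" ]]; then
    return 1
  fi
  printf 'time=%s\nname=%s\nquery=%s\n' "${ts}" "${name}" "${query}"
}

# Stand-in values for ${t_test} and ${d_test}; both are assumptions.
t_test=1700000000
d_test=30m

line="${t_test} cluster:etcd:disk:wal:fsync:test:p99:quantile avg(histogram_quantile(0.99, rate(etcd_disk_wal_fsync_duration_seconds_bucket{job=\"etcd\"}[${d_test}])))"
parse_query_line "${line}"
```

Under these assumptions, the `query` field is what would be evaluated at the `time` field's timestamp, and the `name` field labels the stored result. Note that each added query applies `histogram_quantile` to per-series bucket rates first and only then takes `avg`, so the recorded value is an average of per-member quantiles rather than a quantile over all members combined.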
