Compare commits

...

58 Commits

Author SHA1 Message Date
MichelHollands
7e38d19814 Update Tempo Distributed 2024-05-08 07:03:26 +00:00
Michel Hollands
3879207e05 Merge pull request #101 from grafana/fix_minio_secret_name
Fix secret name
2024-05-07 14:40:52 +01:00
Michel Hollands
cd42da2197 Fix secret name
Signed-off-by: Michel Hollands <michel.hollands@gmail.com>
2024-05-07 14:39:20 +01:00
Michel Hollands
56cab04af8 Merge pull request #92 from grafana/use_secret_for_minio
Use a secret for the Minio access
2024-05-07 12:37:07 +01:00
Michel Hollands
b99140d3f4 Merge pull request #97 from grafana/chore/update-tempo-distributed
[dependency] Update the Tempo Distributed subchart
2024-05-07 11:00:53 +01:00
MichelHollands
749e271455 Update Tempo Distributed 2024-05-07 09:22:20 +00:00
Michel Hollands
de91b4dac7 Merge pull request #96 from grafana/chore/update-tempo-distributed
[dependency] Update the Tempo Distributed subchart
2024-05-07 09:22:43 +01:00
MichelHollands
9f6e52d7a1 Update Tempo Distributed 2024-05-07 08:22:14 +00:00
Michel Hollands
025bb5b0c3 Merge pull request #94 from grafana/chore/update-loki
[dependency] Update the Loki subchart
2024-05-07 08:51:44 +01:00
MichelHollands
0b31eae425 Update loki 2024-05-07 07:02:51 +00:00
Michel Hollands
ab42a96949 Update installation instructions
Signed-off-by: Michel Hollands <michel.hollands@gmail.com>
2024-05-06 16:29:33 +01:00
Michel Hollands
386ff25fca Use the secret in the ruler for the dashboards
Signed-off-by: Michel Hollands <michel.hollands@gmail.com>
2024-05-06 16:18:44 +01:00
Michel Hollands
c6889131a7 Use structuredConfig correctly
Signed-off-by: Michel Hollands <michel.hollands@gmail.com>
2024-05-06 16:12:48 +01:00
Michel Hollands
2739bae0c0 Use correct variables
Signed-off-by: Michel Hollands <michel.hollands@gmail.com>
2024-05-03 15:40:36 +01:00
Michel Hollands
cea8076b75 Start using a secret
Signed-off-by: Michel Hollands <michel.hollands@gmail.com>
2024-05-03 15:38:07 +01:00
Michel Hollands
29b831ca00 Merge pull request #91 from grafana/set_step_time_to_1_minute_in_dashboards
Set the interval to 1m to match the scrape interval
2024-05-03 09:58:40 +01:00
Michel Hollands
09cf8f812c Merge pull request #90 from grafana/chore/update-tempo-distributed
[dependency] Update the Tempo Distributed subchart
2024-05-03 09:58:25 +01:00
Michel Hollands
f8436a8e44 Set the interval to 1m to match the scrape interval
Signed-off-by: Michel Hollands <michel.hollands@gmail.com>
2024-05-03 09:28:26 +01:00
MichelHollands
2b26abedbb Update Tempo Distributed 2024-05-03 07:02:32 +00:00
Michel Hollands
017c041007 Merge pull request #89 from grafana/add_extra_metrics
Add way to gather extra metrics and logs
2024-05-02 17:15:18 +01:00
Michel Hollands
e7ad1383a6 Add way to gather extra metrics and logs
Signed-off-by: Michel Hollands <michel.hollands@gmail.com>
2024-05-02 17:14:09 +01:00
Michel Hollands
2906836eae Merge pull request #88 from grafana/more_dashboard_cleanup
Cleanup dashboards
2024-05-02 14:25:50 +01:00
Michel Hollands
c70ef27e48 Set scrape interval to 1m
Signed-off-by: Michel Hollands <michel.hollands@gmail.com>
2024-05-02 14:24:36 +01:00
Michel Hollands
3c187def47 Fix error in query
Signed-off-by: Michel Hollands <michel.hollands@gmail.com>
2024-05-02 14:24:09 +01:00
Michel Hollands
54eda36ec3 Cleanup dashboards we won't ship
Signed-off-by: Michel Hollands <michel.hollands@gmail.com>
2024-05-02 13:30:45 +01:00
Michel Hollands
bc33e5a2a5 Merge pull request #83 from grafana/chore/update-loki
[dependency] Update the Loki subchart
2024-05-02 13:29:03 +01:00
Michel Hollands
31e82bbf16 Merge pull request #87 from grafana/remove_non_loki_dashboards
Remove non Loki and agent charts
2024-05-02 10:49:11 +01:00
Michel Hollands
52c1bf1778 Remove non loki and agent charts
Signed-off-by: Michel Hollands <michel.hollands@gmail.com>
2024-05-02 10:47:58 +01:00
MichelHollands
2c5c4d8e38 Update loki 2024-05-02 07:02:34 +00:00
Michel Hollands
b6a5a3cfe3 Merge pull request #85 from grafana/open_extra_ports_for_otlp
Enabled traces from Loki
2024-05-01 17:40:35 +01:00
Michel Hollands
a01992194b Use OpenTelemetry token for traces
Signed-off-by: Michel Hollands <michel.hollands@gmail.com>
2024-05-01 17:24:38 +01:00
Michel Hollands
636b654828 Add docs on how to setup Loki
Signed-off-by: Michel Hollands <michel.hollands@gmail.com>
2024-05-01 17:20:00 +01:00
Michel Hollands
5d553e50f6 Add config for Loki traces to OTEL
Signed-off-by: Michel Hollands <michel.hollands@gmail.com>
2024-05-01 16:54:51 +01:00
Michel Hollands
3f200115f9 Consider the meta monitoring namespace for logs as well
Signed-off-by: Michel Hollands <michel.hollands@gmail.com>
2024-05-01 10:16:57 +01:00
Michel Hollands
f0bdf0760d Merge pull request #82 from grafana/fix_correct_version
Update version correctly
2024-04-29 16:44:48 +01:00
Michel Hollands
314b1db19b Update version correctly
Signed-off-by: Michel Hollands <michel.hollands@gmail.com>
2024-04-29 16:44:28 +01:00
Michel Hollands
b547784d54 Merge pull request #81 from grafana/update_version
Update Chart to version 0.0.2
2024-04-29 16:40:03 +01:00
Michel Hollands
af4cd1f8c0 Update to version 2
Signed-off-by: Michel Hollands <michel.hollands@gmail.com>
2024-04-29 16:39:05 +01:00
Michel Hollands
116119bdc4 Merge pull request #80 from grafana/fix_logic_for_secrets
Fix a few more things
2024-04-29 16:36:24 +01:00
Michel Hollands
df794115f0 Fix a few more things
Signed-off-by: Michel Hollands <michel.hollands@gmail.com>
2024-04-29 16:34:21 +01:00
Michel Hollands
c26e509f65 Merge pull request #79 from grafana/use_updated_dashboards2
Cleanup dashboards
2024-04-29 15:17:15 +01:00
Michel Hollands
95f7905e34 Cleanup dashboards
Signed-off-by: Michel Hollands <michel.hollands@gmail.com>
2024-04-29 15:14:43 +01:00
Michel Hollands
ad1b619a33 Merge pull request #78 from grafana/chore/update-minio
[dependency] Update the Minio subchart
2024-04-29 08:35:41 +01:00
MichelHollands
446c0be743 Update minio 2024-04-29 07:02:52 +00:00
Michel Hollands
be7a32de27 Merge pull request #77 from grafana/filter_cadvisor_kubelet_metrics_on_namespace
Only get cadvisor and kubelet metrics from the required namespaces
2024-04-28 14:33:32 +01:00
Michel Hollands
e41b2f360f Merge branch 'main' into filter_cadvisor_kubelet_metrics_on_namespace 2024-04-28 14:33:07 +01:00
Michel Hollands
1cafd696c7 Merge pull request #76 from grafana/add_creation_of_dashboard
Create the mixin locally
2024-04-26 17:27:46 +01:00
Michel Hollands
c614f41d66 Only keep metrics from the monitored namespaces
Signed-off-by: Michel Hollands <michel.hollands@gmail.com>
2024-04-26 16:45:40 +01:00
Michel Hollands
2144cea411 Reformat loki-rules
Signed-off-by: Michel Hollands <michel.hollands@gmail.com>
2024-04-26 14:43:08 +01:00
Michel Hollands
81a017551b This was moved to a separate PR
Signed-off-by: Michel Hollands <michel.hollands@gmail.com>
2024-04-26 14:20:39 +01:00
Michel Hollands
1871a4ef87 Only get cadvisor and kubelet metrics from the required namespaces
Signed-off-by: Michel Hollands <michel.hollands@gmail.com>
2024-04-26 14:15:33 +01:00
Michel Hollands
11d80263a7 Update dashboards from Loki
Signed-off-by: Michel Hollands <michel.hollands@gmail.com>
2024-04-26 14:08:59 +01:00
Michel Hollands
cdb0bee56e First draft
Signed-off-by: Michel Hollands <michel.hollands@gmail.com>
2024-04-25 19:15:00 +01:00
Michel Hollands
58171a6a42 Merge pull request #75 from grafana/more_docs_fixes
Small docs fixes
2024-04-25 15:29:02 +01:00
Michel Hollands
c65445384b Merge pull request #72 from grafana/chore/update-tempo-distributed
[dependency] Update the Tempo Distributed subchart
2024-04-25 15:28:37 +01:00
Michel Hollands
1f980f393e Small docs fixes
Signed-off-by: Michel Hollands <michel.hollands@gmail.com>
2024-04-25 15:28:09 +01:00
Michel Hollands
47d9190eda Merge pull request #74 from grafana/add_more_docs
Add docs on how to install dashboards and rules in the cloud
2024-04-25 15:25:26 +01:00
MichelHollands
d6faaf88f5 Update Tempo Distributed 2024-04-25 07:02:32 +00:00
68 changed files with 6001 additions and 59675 deletions

1
.gitignore vendored Normal file
View File

@@ -0,0 +1 @@
.DS_Store

View File

@@ -19,7 +19,7 @@ In the cloud mode the logs, metrics and/or traces are sent to Grafana Cloud.
To enable cloud mode set `cloud.<logs|metrics|traces>.enabled` to true. The `endpoint`, `username` and `password` settings for your Grafana Cloud logs, metrics and traces instances have to be filled in as well.
Both modes can be enabled at the same time.
Both modes can be enabled at the same time. Cloud mode is preferred.
## Installation
@@ -33,8 +33,6 @@ For more instructions including how to update the chart go to the [installation]
- Specify PII regexes that are applied to logs before they are sent to Loki (cloud or local). The capture group in the regex is replaced with *****.
- a Grafana instance is installed (when local mode is used) with the relevant datasources installed. The following dashboards are installed:
- logs dashboards
- metrics dashboards
- traces dashboards
- agent dashboards
- Retention is set to 24 hours

View File

@@ -1,7 +1,7 @@
dependencies:
- name: loki
repository: https://grafana.github.io/helm-charts
version: 6.3.4
version: 6.5.0
- name: alloy
repository: https://grafana.github.io/helm-charts
version: 0.1.1
@@ -10,9 +10,9 @@ dependencies:
version: 5.3.0
- name: tempo-distributed
repository: https://grafana.github.io/helm-charts
version: 1.9.3
version: 1.9.9
- name: minio
repository: https://charts.min.io
version: 5.1.0
digest: sha256:3a47359025acb478698760e2f27b1608fab95174508e32f86e31a06d7856fe32
generated: "2024-04-23T07:02:48.466787696Z"
version: 5.2.0
digest: sha256:5328702b5f6b0487aba8f7bc77d6abfcd5e094569e9205cd725971e3e31255dd
generated: "2024-05-08T07:03:21.797461955Z"

View File

@@ -13,7 +13,7 @@ type: application
# This is the chart version. This version number should be incremented each time you make changes
# to the chart and its templates, including the app version.
# Versions are expected to follow Semantic Versioning (https://semver.org/)
version: 0.0.1
version: 0.0.2
# This is the version number of the application being deployed. This version number should be
# incremented each time you make changes to the application. Versions are not expected to
# follow Semantic Versioning. They should reflect the version the application is using.
@@ -22,7 +22,7 @@ appVersion: "0.0.1"
dependencies:
- name: loki
repository: https://grafana.github.io/helm-charts
version: 6.3.4
version: 6.5.0
condition: local.logs.enabled
- name: alloy
repository: https://grafana.github.io/helm-charts
@@ -33,9 +33,9 @@ dependencies:
condition: local.metrics.enabled
- name: tempo-distributed
repository: https://grafana.github.io/helm-charts
version: 1.9.3
version: 1.9.9
condition: local.traces.enabled
- name: minio
repository: https://charts.min.io
version: 5.1.0
version: 5.2.0
condition: local.minio.enabled

Binary file not shown.

Binary file not shown.

View File

@@ -51,6 +51,7 @@
"overrides": [ ]
},
"id": 1,
"interval": "1m",
"links": [ ],
"options": {
"legend": {
@@ -64,7 +65,7 @@
"span": 6,
"targets": [
{
"expr": "sum(loki_ingester_memory_chunks{cluster=\"$cluster\", job=~\"$namespace/(loki|enterprise-logs)-write\"})",
"expr": "sum(loki_ingester_memory_chunks{cluster=\"$cluster\", job=~\"$namespace/(.*ingester.*|(loki|enterprise-logs)-write|loki-single-binary)\"})",
"format": "time_series",
"legendFormat": "series",
"legendLink": null
@@ -98,6 +99,7 @@
"overrides": [ ]
},
"id": 2,
"interval": "1m",
"links": [ ],
"options": {
"legend": {
@@ -111,7 +113,7 @@
"span": 6,
"targets": [
{
"expr": "sum(loki_ingester_memory_chunks{cluster=\"$cluster\", job=~\"$namespace/(loki|enterprise-logs)-write\"}) / sum(loki_ingester_memory_streams{cluster=\"$cluster\", job=~\"$namespace/(loki|enterprise-logs)-write\"})",
"expr": "sum(loki_ingester_memory_chunks{cluster=\"$cluster\", job=~\"$namespace/(.*ingester.*|(loki|enterprise-logs)-write|loki-single-binary)\"}) / sum(loki_ingester_memory_streams{cluster=\"$cluster\", job=~\"$namespace/(.*ingester.*|(loki|enterprise-logs)-write|loki-single-binary)\"})",
"format": "time_series",
"legendFormat": "chunks",
"legendLink": null
@@ -157,6 +159,7 @@
"overrides": [ ]
},
"id": 3,
"interval": "1m",
"links": [ ],
"nullPointMode": "null as zero",
"options": {
@@ -171,19 +174,19 @@
"span": 6,
"targets": [
{
"expr": "histogram_quantile(0.99, sum(rate(loki_ingester_chunk_utilization_bucket{cluster=\"$cluster\", job=~\"$namespace/(loki|enterprise-logs)-write\"}[$__rate_interval])) by (le)) * 1",
"expr": "histogram_quantile(0.99, sum(rate(loki_ingester_chunk_utilization_bucket{cluster=\"$cluster\", job=~\"$namespace/(.*ingester.*|(loki|enterprise-logs)-write|loki-single-binary)\"}[$__rate_interval])) by (le)) * 1",
"format": "time_series",
"legendFormat": "99th Percentile",
"refId": "A"
},
{
"expr": "histogram_quantile(0.50, sum(rate(loki_ingester_chunk_utilization_bucket{cluster=\"$cluster\", job=~\"$namespace/(loki|enterprise-logs)-write\"}[$__rate_interval])) by (le)) * 1",
"expr": "histogram_quantile(0.50, sum(rate(loki_ingester_chunk_utilization_bucket{cluster=\"$cluster\", job=~\"$namespace/(.*ingester.*|(loki|enterprise-logs)-write|loki-single-binary)\"}[$__rate_interval])) by (le)) * 1",
"format": "time_series",
"legendFormat": "50th Percentile",
"refId": "B"
},
{
"expr": "sum(rate(loki_ingester_chunk_utilization_sum{cluster=\"$cluster\", job=~\"$namespace/(loki|enterprise-logs)-write\"}[$__rate_interval])) * 1 / sum(rate(loki_ingester_chunk_utilization_count{cluster=\"$cluster\", job=~\"$namespace/(loki|enterprise-logs)-write\"}[$__rate_interval]))",
"expr": "sum(rate(loki_ingester_chunk_utilization_sum{cluster=\"$cluster\", job=~\"$namespace/(.*ingester.*|(loki|enterprise-logs)-write|loki-single-binary)\"}[$__rate_interval])) * 1 / sum(rate(loki_ingester_chunk_utilization_count{cluster=\"$cluster\", job=~\"$namespace/(.*ingester.*|(loki|enterprise-logs)-write|loki-single-binary)\"}[$__rate_interval]))",
"format": "time_series",
"legendFormat": "Average",
"refId": "C"
@@ -235,6 +238,7 @@
"overrides": [ ]
},
"id": 4,
"interval": "1m",
"links": [ ],
"nullPointMode": "null as zero",
"options": {
@@ -249,19 +253,19 @@
"span": 6,
"targets": [
{
"expr": "histogram_quantile(0.99, sum(rate(loki_ingester_chunk_age_seconds_bucket{cluster=\"$cluster\", job=~\"$namespace/(loki|enterprise-logs)-write\"}[$__rate_interval])) by (le)) * 1e3",
"expr": "histogram_quantile(0.99, sum(rate(loki_ingester_chunk_age_seconds_bucket{cluster=\"$cluster\", job=~\"$namespace/(.*ingester.*|(loki|enterprise-logs)-write|loki-single-binary)\"}[$__rate_interval])) by (le)) * 1e3",
"format": "time_series",
"legendFormat": "99th Percentile",
"refId": "A"
},
{
"expr": "histogram_quantile(0.50, sum(rate(loki_ingester_chunk_age_seconds_bucket{cluster=\"$cluster\", job=~\"$namespace/(loki|enterprise-logs)-write\"}[$__rate_interval])) by (le)) * 1e3",
"expr": "histogram_quantile(0.50, sum(rate(loki_ingester_chunk_age_seconds_bucket{cluster=\"$cluster\", job=~\"$namespace/(.*ingester.*|(loki|enterprise-logs)-write|loki-single-binary)\"}[$__rate_interval])) by (le)) * 1e3",
"format": "time_series",
"legendFormat": "50th Percentile",
"refId": "B"
},
{
"expr": "sum(rate(loki_ingester_chunk_age_seconds_sum{cluster=\"$cluster\", job=~\"$namespace/(loki|enterprise-logs)-write\"}[$__rate_interval])) * 1e3 / sum(rate(loki_ingester_chunk_age_seconds_count{cluster=\"$cluster\", job=~\"$namespace/(loki|enterprise-logs)-write\"}[$__rate_interval]))",
"expr": "sum(rate(loki_ingester_chunk_age_seconds_sum{cluster=\"$cluster\", job=~\"$namespace/(.*ingester.*|(loki|enterprise-logs)-write|loki-single-binary)\"}[$__rate_interval])) * 1e3 / sum(rate(loki_ingester_chunk_age_seconds_count{cluster=\"$cluster\", job=~\"$namespace/(.*ingester.*|(loki|enterprise-logs)-write|loki-single-binary)\"}[$__rate_interval]))",
"format": "time_series",
"legendFormat": "Average",
"refId": "C"
@@ -325,6 +329,7 @@
"overrides": [ ]
},
"id": 5,
"interval": "1m",
"links": [ ],
"nullPointMode": "null as zero",
"options": {
@@ -339,19 +344,19 @@
"span": 6,
"targets": [
{
"expr": "histogram_quantile(0.99, sum(rate(loki_ingester_chunk_entries_bucket{cluster=\"$cluster\", job=~\"$namespace/(loki|enterprise-logs)-write\"}[$__rate_interval])) by (le)) * 1",
"expr": "histogram_quantile(0.99, sum(rate(loki_ingester_chunk_entries_bucket{cluster=\"$cluster\", job=~\"$namespace/(.*ingester.*|(loki|enterprise-logs)-write|loki-single-binary)\"}[$__rate_interval])) by (le)) * 1",
"format": "time_series",
"legendFormat": "99th Percentile",
"refId": "A"
},
{
"expr": "histogram_quantile(0.50, sum(rate(loki_ingester_chunk_entries_bucket{cluster=\"$cluster\", job=~\"$namespace/(loki|enterprise-logs)-write\"}[$__rate_interval])) by (le)) * 1",
"expr": "histogram_quantile(0.50, sum(rate(loki_ingester_chunk_entries_bucket{cluster=\"$cluster\", job=~\"$namespace/(.*ingester.*|(loki|enterprise-logs)-write|loki-single-binary)\"}[$__rate_interval])) by (le)) * 1",
"format": "time_series",
"legendFormat": "50th Percentile",
"refId": "B"
},
{
"expr": "sum(rate(loki_ingester_chunk_entries_sum{cluster=\"$cluster\", job=~\"$namespace/(loki|enterprise-logs)-write\"}[$__rate_interval])) * 1 / sum(rate(loki_ingester_chunk_entries_count{cluster=\"$cluster\", job=~\"$namespace/(loki|enterprise-logs)-write\"}[$__rate_interval]))",
"expr": "sum(rate(loki_ingester_chunk_entries_sum{cluster=\"$cluster\", job=~\"$namespace/(.*ingester.*|(loki|enterprise-logs)-write|loki-single-binary)\"}[$__rate_interval])) * 1 / sum(rate(loki_ingester_chunk_entries_count{cluster=\"$cluster\", job=~\"$namespace/(.*ingester.*|(loki|enterprise-logs)-write|loki-single-binary)\"}[$__rate_interval]))",
"format": "time_series",
"legendFormat": "Average",
"refId": "C"
@@ -403,6 +408,7 @@
"overrides": [ ]
},
"id": 6,
"interval": "1m",
"links": [ ],
"options": {
"legend": {
@@ -416,7 +422,7 @@
"span": 6,
"targets": [
{
"expr": "sum(rate(loki_chunk_store_index_entries_per_chunk_sum{cluster=\"$cluster\", job=~\"$namespace/(loki|enterprise-logs)-write\"}[5m])) / sum(rate(loki_chunk_store_index_entries_per_chunk_count{cluster=\"$cluster\", job=~\"$namespace/(loki|enterprise-logs)-write\"}[5m]))",
"expr": "sum(rate(loki_chunk_store_index_entries_per_chunk_sum{cluster=\"$cluster\", job=~\"$namespace/(.*ingester.*|(loki|enterprise-logs)-write|loki-single-binary)\"}[$__rate_interval])) / sum(rate(loki_chunk_store_index_entries_per_chunk_count{cluster=\"$cluster\", job=~\"$namespace/(.*ingester.*|(loki|enterprise-logs)-write|loki-single-binary)\"}[$__rate_interval]))",
"format": "time_series",
"legendFormat": "Index Entries",
"legendLink": null
@@ -462,6 +468,7 @@
"overrides": [ ]
},
"id": 7,
"interval": "1m",
"links": [ ],
"options": {
"legend": {
@@ -475,7 +482,7 @@
"span": 6,
"targets": [
{
"expr": "loki_ingester_flush_queue_length{cluster=\"$cluster\", job=~\"$namespace/(loki|enterprise-logs)-write\"} or cortex_ingester_flush_queue_length{cluster=\"$cluster\", job=~\"$namespace/(loki|enterprise-logs)-write\"}",
"expr": "loki_ingester_flush_queue_length{cluster=\"$cluster\", job=~\"$namespace/(.*ingester.*|(loki|enterprise-logs)-write|loki-single-binary)\"} or cortex_ingester_flush_queue_length{cluster=\"$cluster\", job=~\"$namespace/(.*ingester.*|(loki|enterprise-logs)-write|loki-single-binary)\"}",
"format": "time_series",
"legendFormat": "{{pod}}",
"legendLink": null
@@ -658,6 +665,7 @@
},
"fill": 10,
"id": 8,
"interval": "1m",
"linewidth": 0,
"links": [ ],
"options": {
@@ -673,7 +681,7 @@
"stack": true,
"targets": [
{
"expr": "sum by (status) (\n label_replace(label_replace(rate(loki_ingester_chunk_age_seconds_count{cluster=\"$cluster\", job=~\"$namespace/(loki|enterprise-logs)-write\"}[$__rate_interval]),\n \"status\", \"${1}xx\", \"status_code\", \"([0-9])..\"),\n \"status\", \"${1}\", \"status_code\", \"([a-zA-Z]+)\"))\n",
"expr": "sum by (status) (\n label_replace(label_replace(rate(loki_ingester_chunk_age_seconds_count{cluster=\"$cluster\", job=~\"$namespace/(.*ingester.*|(loki|enterprise-logs)-write|loki-single-binary)\"}[$__rate_interval]),\n \"status\", \"${1}xx\", \"status_code\", \"([0-9])..\"),\n \"status\", \"${1}\", \"status_code\", \"([a-zA-Z]+)\"))\n",
"format": "time_series",
"legendFormat": "{{status}}",
"refId": "A"
@@ -719,6 +727,7 @@
"overrides": [ ]
},
"id": 9,
"interval": "1m",
"links": [ ],
"options": {
"legend": {
@@ -732,7 +741,7 @@
"span": 6,
"targets": [
{
"expr": "sum(rate(loki_ingester_chunks_flushed_total{cluster=\"$cluster\", job=~\"$namespace/(loki|enterprise-logs)-write\"}[$__rate_interval]))",
"expr": "sum(rate(loki_ingester_chunks_flushed_total{cluster=\"$cluster\", job=~\"$namespace/(.*ingester.*|(loki|enterprise-logs)-write|loki-single-binary)\"}[$__rate_interval]))",
"format": "time_series",
"legendFormat": "{{pod}}",
"legendLink": null
@@ -766,6 +775,7 @@
"overrides": [ ]
},
"id": 10,
"interval": "1m",
"links": [ ],
"options": {
"legend": {
@@ -780,7 +790,7 @@
"stack": true,
"targets": [
{
"expr": "sum by (reason) (rate(loki_ingester_chunks_flushed_total{cluster=\"$cluster\", job=~\"$namespace/(loki|enterprise-logs)-write\"}[$__rate_interval])) / ignoring(reason) group_left sum(rate(loki_ingester_chunks_flushed_total{cluster=\"$cluster\", job=~\"$namespace/(loki|enterprise-logs)-write\"}[$__rate_interval]))",
"expr": "sum by (reason) (rate(loki_ingester_chunks_flushed_total{cluster=\"$cluster\", job=~\"$namespace/(.*ingester.*|(loki|enterprise-logs)-write|loki-single-binary)\"}[$__rate_interval])) / ignoring(reason) group_left sum(rate(loki_ingester_chunks_flushed_total{cluster=\"$cluster\", job=~\"$namespace/(.*ingester.*|(loki|enterprise-logs)-write|loki-single-binary)\"}[$__rate_interval]))",
"format": "time_series",
"legendFormat": "{{reason}}",
"legendLink": null
@@ -837,13 +847,14 @@
"hideZeroBuckets": false,
"highlightCards": true,
"id": 11,
"interval": "1m",
"legend": {
"show": true
},
"span": 12,
"targets": [
{
"expr": "sum by (le) (rate(loki_ingester_chunk_utilization_bucket{cluster=\"$cluster\", job=~\"$namespace/(loki|enterprise-logs)-write\"}[$__rate_interval]))",
"expr": "sum by (le) (rate(loki_ingester_chunk_utilization_bucket{cluster=\"$cluster\", job=~\"$namespace/(.*ingester.*|(loki|enterprise-logs)-write|loki-single-binary)\"}[$__rate_interval]))",
"format": "heatmap",
"intervalFactor": 2,
"legendFormat": "{{le}}",
@@ -899,13 +910,14 @@
"hideZeroBuckets": false,
"highlightCards": true,
"id": 12,
"interval": "1m",
"legend": {
"show": true
},
"span": 12,
"targets": [
{
"expr": "sum(rate(loki_ingester_chunk_size_bytes_bucket{cluster=\"$cluster\", job=~\"$namespace/(loki|enterprise-logs)-write\"}[$__rate_interval])) by (le)",
"expr": "sum(rate(loki_ingester_chunk_size_bytes_bucket{cluster=\"$cluster\", job=~\"$namespace/(.*ingester.*|(loki|enterprise-logs)-write|loki-single-binary)\"}[$__rate_interval])) by (le)",
"format": "heatmap",
"intervalFactor": 2,
"legendFormat": "{{le}}",
@@ -968,6 +980,7 @@
"overrides": [ ]
},
"id": 13,
"interval": "1m",
"links": [ ],
"options": {
"legend": {
@@ -981,19 +994,19 @@
"span": 12,
"targets": [
{
"expr": "histogram_quantile(0.99, sum(rate(loki_ingester_chunk_size_bytes_bucket{cluster=\"$cluster\", job=~\"$namespace/(loki|enterprise-logs)-write\"}[1m])) by (le))",
"expr": "histogram_quantile(0.99, sum(rate(loki_ingester_chunk_size_bytes_bucket{cluster=\"$cluster\", job=~\"$namespace/(.*ingester.*|(loki|enterprise-logs)-write|loki-single-binary)\"}[$__rate_interval])) by (le))",
"format": "time_series",
"legendFormat": "p99",
"legendLink": null
},
{
"expr": "histogram_quantile(0.90, sum(rate(loki_ingester_chunk_size_bytes_bucket{cluster=\"$cluster\", job=~\"$namespace/(loki|enterprise-logs)-write\"}[1m])) by (le))",
"expr": "histogram_quantile(0.90, sum(rate(loki_ingester_chunk_size_bytes_bucket{cluster=\"$cluster\", job=~\"$namespace/(.*ingester.*|(loki|enterprise-logs)-write|loki-single-binary)\"}[$__rate_interval])) by (le))",
"format": "time_series",
"legendFormat": "p90",
"legendLink": null
},
{
"expr": "histogram_quantile(0.50, sum(rate(loki_ingester_chunk_size_bytes_bucket{cluster=\"$cluster\", job=~\"$namespace/(loki|enterprise-logs)-write\"}[1m])) by (le))",
"expr": "histogram_quantile(0.50, sum(rate(loki_ingester_chunk_size_bytes_bucket{cluster=\"$cluster\", job=~\"$namespace/(.*ingester.*|(loki|enterprise-logs)-write|loki-single-binary)\"}[$__rate_interval])) by (le))",
"format": "time_series",
"legendFormat": "p50",
"legendLink": null
@@ -1039,6 +1052,7 @@
"overrides": [ ]
},
"id": 14,
"interval": "1m",
"links": [ ],
"options": {
"legend": {
@@ -1052,19 +1066,19 @@
"span": 12,
"targets": [
{
"expr": "histogram_quantile(0.5, sum(rate(loki_ingester_chunk_bounds_hours_bucket{cluster=\"$cluster\", job=~\"$namespace/(loki|enterprise-logs)-write\"}[5m])) by (le))",
"expr": "histogram_quantile(0.5, sum(rate(loki_ingester_chunk_bounds_hours_bucket{cluster=\"$cluster\", job=~\"$namespace/(.*ingester.*|(loki|enterprise-logs)-write|loki-single-binary)\"}[$__rate_interval])) by (le))",
"format": "time_series",
"legendFormat": "p50",
"legendLink": null
},
{
"expr": "histogram_quantile(0.99, sum(rate(loki_ingester_chunk_bounds_hours_bucket{cluster=\"$cluster\", job=~\"$namespace/(loki|enterprise-logs)-write\"}[5m])) by (le))",
"expr": "histogram_quantile(0.99, sum(rate(loki_ingester_chunk_bounds_hours_bucket{cluster=\"$cluster\", job=~\"$namespace/(.*ingester.*|(loki|enterprise-logs)-write|loki-single-binary)\"}[$__rate_interval])) by (le))",
"format": "time_series",
"legendFormat": "p99",
"legendLink": null
},
{
"expr": "sum(rate(loki_ingester_chunk_bounds_hours_sum{cluster=\"$cluster\", job=~\"$namespace/(loki|enterprise-logs)-write\"}[5m])) / sum(rate(loki_ingester_chunk_bounds_hours_count{cluster=\"$cluster\", job=~\"$namespace/(loki|enterprise-logs)-write\"}[5m]))",
"expr": "sum(rate(loki_ingester_chunk_bounds_hours_sum{cluster=\"$cluster\", job=~\"$namespace/(.*ingester.*|(loki|enterprise-logs)-write|loki-single-binary)\"}[$__rate_interval])) / sum(rate(loki_ingester_chunk_bounds_hours_count{cluster=\"$cluster\", job=~\"$namespace/(.*ingester.*|(loki|enterprise-logs)-write|loki-single-binary)\"}[$__rate_interval]))",
"format": "time_series",
"legendFormat": "avg",
"legendLink": null

View File

@@ -35,6 +35,7 @@
"fill": 1,
"format": "none",
"id": 1,
"interval": "1m",
"legend": {
"avg": false,
"current": false,
@@ -110,6 +111,7 @@
"fill": 1,
"format": "dtdurations",
"id": 2,
"interval": "1m",
"legend": {
"avg": false,
"current": false,
@@ -213,6 +215,7 @@
"overrides": [ ]
},
"id": 3,
"interval": "1m",
"links": [ ],
"options": {
"legend": {
@@ -260,6 +263,7 @@
"overrides": [ ]
},
"id": 4,
"interval": "1m",
"links": [ ],
"options": {
"legend": {
@@ -307,6 +311,7 @@
"overrides": [ ]
},
"id": 5,
"interval": "1m",
"links": [ ],
"options": {
"legend": {
@@ -366,6 +371,7 @@
"overrides": [ ]
},
"id": 6,
"interval": "1m",
"links": [ ],
"options": {
"legend": {
@@ -413,6 +419,7 @@
"overrides": [ ]
},
"id": 7,
"interval": "1m",
"links": [ ],
"options": {
"legend": {
@@ -460,6 +467,7 @@
"overrides": [ ]
},
"id": 8,
"interval": "1m",
"links": [ ],
"options": {
"legend": {
@@ -519,6 +527,7 @@
"overrides": [ ]
},
"id": 9,
"interval": "1m",
"links": [ ],
"options": {
"legend": {
@@ -566,6 +575,7 @@
"overrides": [ ]
},
"id": 10,
"interval": "1m",
"links": [ ],
"options": {
"legend": {
@@ -579,7 +589,7 @@
"span": 6,
"targets": [
{
"expr": "sum(rate(loki_compactor_deleted_lines{cluster=~\"$cluster\",job=~\"$namespace/(loki|enterprise-logs)-read\"}[$__rate_interval])) by (user)",
"expr": "sum(rate(loki_compactor_deleted_lines{cluster=~\"$cluster\", namespace=~\"$namespace\", pod=~\"(.*/compactor|(loki|enterprise-logs)-backend.*|loki-single-binary)\"}[$__rate_interval])) by (user)",
"format": "time_series",
"legendFormat": "{{user}}",
"legendLink": null
@@ -603,10 +613,11 @@
{
"datasource": "$loki_datasource",
"id": 11,
"interval": "1m",
"span": 6,
"targets": [
{
"expr": "{cluster=~\"$cluster\", namespace=~\"$namespace\", container=\"compactor\"} |~ \"Started processing delete request|delete request for user marked as processed\" | logfmt | line_format \"{{.ts}} user={{.user}} delete_request_id={{.delete_request_id}} msg={{.msg}}\" ",
"expr": "{cluster=~\"$cluster\", namespace=~\"$namespace\", pod=~\"(.*/compactor|(loki|enterprise-logs)-backend.*|loki-single-binary)\"} |~ \"Started processing delete request|delete request for user marked as processed\" | logfmt | line_format \"{{.ts}} user={{.user}} delete_request_id={{.delete_request_id}} msg={{.msg}}\" ",
"refId": "A"
}
],
@@ -616,10 +627,11 @@
{
"datasource": "$loki_datasource",
"id": 12,
"interval": "1m",
"span": 6,
"targets": [
{
"expr": "{cluster=~\"$cluster\", namespace=~\"$namespace\", container=\"compactor\"} |~ \"delete request for user added\" | logfmt | line_format \"{{.ts}} user={{.user}} query='{{.query}}'\"",
"expr": "{cluster=~\"$cluster\", namespace=~\"$namespace\", pod=~\"(.*/compactor|(loki|enterprise-logs)-backend.*|loki-single-binary)\"} |~ \"delete request for user added\" | logfmt | line_format \"{{.ts}} user={{.user}} query='{{.query}}'\"",
"refId": "A"
}
],
@@ -701,6 +713,16 @@
"tagsQuery": "",
"type": "query",
"useTags": false
},
{
"hide": 0,
"label": null,
"name": "loki_datasource",
"options": [ ],
"query": "loki",
"refresh": 1,
"regex": "",
"type": "datasource"
}
]
},

View File

@@ -38,6 +38,7 @@
},
"hiddenSeries": false,
"id": 35,
"interval": "1m",
"legend": {
"avg": false,
"current": false,
@@ -114,6 +115,11 @@
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fieldConfig": {
"defaults": {
"unit": "s"
}
},
"fill": 1,
"fillGradient": 0,
"gridPos": {
@@ -124,6 +130,7 @@
},
"hiddenSeries": false,
"id": 41,
"interval": "1m",
"legend": {
"avg": false,
"current": false,
@@ -211,6 +218,7 @@
},
"hiddenSeries": false,
"id": 36,
"interval": "1m",
"legend": {
"avg": false,
"current": false,
@@ -236,7 +244,7 @@
"steppedLine": false,
"targets": [
{
"expr": "sum(rate(container_cpu_usage_seconds_total{cluster=\"$cluster\", namespace=\"$namespace\", pod=~\"$deployment.*\", pod=~\"$pod\", container=~\"$container\"}[5m]))",
"expr": "sum(rate(container_cpu_usage_seconds_total{cluster=\"$cluster\", namespace=\"$namespace\", pod=~\"$deployment.*\", pod=~\"$pod\", container=~\"$container\"}[$__rate_interval]))",
"refId": "A"
}
],
@@ -287,6 +295,11 @@
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fieldConfig": {
"defaults": {
"unit": "bytes"
}
},
"fill": 1,
"fillGradient": 0,
"gridPos": {
@@ -297,6 +310,7 @@
},
"hiddenSeries": false,
"id": 40,
"interval": "1m",
"legend": {
"avg": false,
"current": false,
@@ -373,6 +387,11 @@
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fieldConfig": {
"defaults": {
"unit": "binBps"
}
},
"fill": 1,
"fillGradient": 0,
"gridPos": {
@@ -383,6 +402,7 @@
},
"hiddenSeries": false,
"id": 38,
"interval": "1m",
"legend": {
"avg": false,
"current": false,
@@ -408,7 +428,7 @@
"steppedLine": false,
"targets": [
{
"expr": "sum(rate(container_network_transmit_bytes_total{cluster=\"$cluster\", namespace=\"$namespace\", pod=~\"$deployment.*\", pod=~\"$pod\"}[5m]))",
"expr": "sum(rate(container_network_transmit_bytes_total{cluster=\"$cluster\", namespace=\"$namespace\", pod=~\"$deployment.*\", pod=~\"$pod\"}[$__rate_interval]))",
"refId": "A"
}
],
@@ -459,6 +479,11 @@
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fieldConfig": {
"defaults": {
"unit": "binBps"
}
},
"fill": 1,
"fillGradient": 0,
"gridPos": {
@@ -469,6 +494,7 @@
},
"hiddenSeries": false,
"id": 39,
"interval": "1m",
"legend": {
"avg": false,
"current": false,
@@ -494,7 +520,7 @@
"steppedLine": false,
"targets": [
{
"expr": "sum(rate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=\"$namespace\", pod=~\"$deployment.*\", pod=~\"$pod\"}[5m]))",
"expr": "sum(rate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=\"$namespace\", pod=~\"$deployment.*\", pod=~\"$pod\"}[$__rate_interval]))",
"refId": "A"
}
],
@@ -555,6 +581,7 @@
},
"hiddenSeries": false,
"id": 37,
"interval": "1m",
"legend": {
"avg": false,
"current": false,
@@ -632,6 +659,11 @@
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fieldConfig": {
"defaults": {
"unit": "ops"
}
},
"fill": 1,
"fillGradient": 0,
"gridPos": {
@@ -642,6 +674,7 @@
},
"hiddenSeries": false,
"id": 42,
"interval": "1m",
"legend": {
"avg": false,
"current": false,
@@ -667,7 +700,7 @@
"steppedLine": false,
"targets": [
{
"expr": "sum(rate(promtail_custom_bad_words_total{cluster=\"$cluster\", exported_namespace=\"$namespace\", exported_pod=~\"$deployment.*\", exported_pod=~\"$pod\", container=~\"$container\"}[5m])) by (level)",
"expr": "sum(rate(promtail_custom_bad_words_total{cluster=\"$cluster\", exported_namespace=\"$namespace\", exported_pod=~\"$deployment.*\", exported_pod=~\"$pod\", container=~\"$container\"}[$__rate_interval])) by (level)",
"legendFormat": "{{level}}",
"refId": "A"
}
@@ -719,6 +752,11 @@
"dashLength": 10,
"dashes": false,
"datasource": "$loki_datasource",
"fieldConfig": {
"defaults": {
"unit": "ops"
}
},
"fill": 1,
"fillGradient": 0,
"gridPos": {
@@ -729,6 +767,7 @@
},
"hiddenSeries": false,
"id": 31,
"interval": "1m",
"legend": {
"avg": false,
"current": false,
@@ -771,7 +810,7 @@
"steppedLine": false,
"targets": [
{
"expr": "sum(rate({cluster=\"$cluster\", namespace=\"$namespace\", pod=~\"$deployment.*\", pod=~\"$pod\", container=~\"$container\" } |logfmt| level=\"$level\" |= \"$filter\" [5m])) by (level)",
"expr": "sum(rate({cluster=\"$cluster\", namespace=\"$namespace\", pod=~\"$deployment.*\", pod=~\"$pod\", container=~\"$container\" } |logfmt| level=\"$level\" |= \"$filter\" | __error__=\"\" [$__interval])) by (level)",
"intervalFactor": 3,
"legendFormat": "{{level}}",
"refId": "A"
@@ -827,6 +866,7 @@
"y": 6
},
"id": 29,
"interval": "1m",
"maxDataPoints": "",
"options": {
"showLabels": false,

View File

@@ -1,723 +0,0 @@
{
"annotations": {
"list": [ ]
},
"editable": true,
"fiscalYearStartMonth": 0,
"gnetId": null,
"graphTooltip": 0,
"hideControls": false,
"iteration": 1635347545534,
"links": [
{
"asDropdown": true,
"icon": "external link",
"includeVars": true,
"keepTime": true,
"tags": [
"loki"
],
"targetBlank": false,
"title": "Loki Dashboards",
"type": "dashboards"
}
],
"liveNow": false,
"panels": [
{
"datasource": "${datasource}",
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"mappings": [ ],
"noValue": "0",
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "red",
"value": 1
}
]
}
},
"overrides": [ ]
},
"gridPos": {
"h": 10,
"w": 2,
"x": 0,
"y": 0
},
"id": 2,
"options": {
"colorMode": "value",
"graphMode": "area",
"justifyMode": "auto",
"orientation": "auto",
"reduceOptions": {
"calcs": [
"lastNotNull"
],
"fields": "",
"values": false
},
"textMode": "auto"
},
"pluginVersion": "8.3.0-38205pre",
"targets": [
{
"datasource": "${datasource}",
"exemplar": false,
"expr": "sum(loki_ruler_wal_appender_ready) by (pod, tenant) == 0",
"instant": true,
"interval": "",
"legendFormat": "",
"refId": "A"
}
],
"title": "Appenders Not Ready",
"type": "stat"
},
{
"datasource": "${datasource}",
"description": "",
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"axisLabel": "",
"axisPlacement": "auto",
"barAlignment": 0,
"drawStyle": "line",
"fillOpacity": 0,
"gradientMode": "none",
"hideFrom": {
"legend": false,
"tooltip": false,
"viz": false
},
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
"scaleDistribution": {
"type": "linear"
},
"showPoints": "auto",
"spanNulls": false,
"stacking": {
"group": "A",
"mode": "none"
},
"thresholdsStyle": {
"mode": "off"
}
},
"mappings": [ ],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "red",
"value": 80
}
]
}
},
"overrides": [ ]
},
"gridPos": {
"h": 10,
"w": 11,
"x": 2,
"y": 0
},
"id": 4,
"options": {
"legend": {
"calcs": [ ],
"displayMode": "list",
"placement": "bottom"
},
"tooltip": {
"mode": "single"
}
},
"targets": [
{
"datasource": "${datasource}",
"exemplar": true,
"expr": "sum(rate(loki_ruler_wal_samples_appended_total{tenant=~\"${tenant}\"}[$__rate_interval])) by (tenant) > 0",
"interval": "",
"legendFormat": "{{tenant}}",
"refId": "A"
}
],
"title": "Samples Appended to WAL per Second",
"type": "timeseries"
},
{
"datasource": "${datasource}",
"description": "Series are unique combinations of labels",
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"axisLabel": "",
"axisPlacement": "auto",
"barAlignment": 0,
"drawStyle": "line",
"fillOpacity": 0,
"gradientMode": "none",
"hideFrom": {
"legend": false,
"tooltip": false,
"viz": false
},
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
"scaleDistribution": {
"type": "linear"
},
"showPoints": "auto",
"spanNulls": false,
"stacking": {
"group": "A",
"mode": "none"
},
"thresholdsStyle": {
"mode": "off"
}
},
"mappings": [ ],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "red",
"value": 80
}
]
}
},
"overrides": [ ]
},
"gridPos": {
"h": 10,
"w": 11,
"x": 13,
"y": 0
},
"id": 5,
"options": {
"legend": {
"calcs": [ ],
"displayMode": "list",
"placement": "bottom"
},
"tooltip": {
"mode": "single"
}
},
"targets": [
{
"datasource": "${datasource}",
"exemplar": true,
"expr": "sum(rate(loki_ruler_wal_storage_created_series_total{tenant=~\"${tenant}\"}[$__rate_interval])) by (tenant) > 0",
"interval": "",
"legendFormat": "{{tenant}}",
"refId": "A"
}
],
"title": "Series Created per Second",
"type": "timeseries"
},
{
"datasource": "${datasource}",
"description": "Difference between highest timestamp appended to WAL and highest timestamp successfully written to remote storage",
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"axisLabel": "",
"axisPlacement": "auto",
"barAlignment": 0,
"drawStyle": "line",
"fillOpacity": 0,
"gradientMode": "none",
"hideFrom": {
"legend": false,
"tooltip": false,
"viz": false
},
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
"scaleDistribution": {
"type": "linear"
},
"showPoints": "auto",
"spanNulls": false,
"stacking": {
"group": "A",
"mode": "none"
},
"thresholdsStyle": {
"mode": "off"
}
},
"mappings": [ ],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "red",
"value": 80
}
]
}
},
"overrides": [ ]
},
"gridPos": {
"h": 10,
"w": 12,
"x": 0,
"y": 10
},
"id": 6,
"options": {
"legend": {
"calcs": [ ],
"displayMode": "list",
"placement": "bottom"
},
"tooltip": {
"mode": "single"
}
},
"targets": [
{
"datasource": "${datasource}",
"exemplar": true,
"expr": "loki_ruler_wal_prometheus_remote_storage_highest_timestamp_in_seconds{tenant=~\"${tenant}\"}\n- on (tenant)\n (\n loki_ruler_wal_prometheus_remote_storage_queue_highest_sent_timestamp_seconds{tenant=~\"${tenant}\"}\n or vector(0)\n )",
"interval": "",
"legendFormat": "{{tenant}}",
"refId": "A"
}
],
"title": "Write Behind",
"type": "timeseries"
},
{
"datasource": "${datasource}",
"description": "",
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"axisLabel": "",
"axisPlacement": "auto",
"barAlignment": 0,
"drawStyle": "line",
"fillOpacity": 0,
"gradientMode": "none",
"hideFrom": {
"legend": false,
"tooltip": false,
"viz": false
},
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
"scaleDistribution": {
"type": "linear"
},
"showPoints": "auto",
"spanNulls": false,
"stacking": {
"group": "A",
"mode": "none"
},
"thresholdsStyle": {
"mode": "off"
}
},
"mappings": [ ],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "red",
"value": 80
}
]
}
},
"overrides": [ ]
},
"gridPos": {
"h": 10,
"w": 12,
"x": 12,
"y": 10
},
"id": 7,
"options": {
"legend": {
"calcs": [ ],
"displayMode": "list",
"placement": "bottom"
},
"tooltip": {
"mode": "single"
}
},
"targets": [
{
"datasource": "${datasource}",
"exemplar": true,
"expr": "sum(rate(loki_ruler_wal_prometheus_remote_storage_samples_total{tenant=~\"${tenant}\"}[$__rate_interval])) by (tenant) > 0",
"interval": "",
"legendFormat": "{{tenant}}",
"refId": "A"
}
],
"title": "Samples Sent per Second",
"type": "timeseries"
},
{
"datasource": "${datasource}",
"description": "\n",
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"axisLabel": "",
"axisPlacement": "auto",
"barAlignment": 0,
"drawStyle": "line",
"fillOpacity": 0,
"gradientMode": "none",
"hideFrom": {
"legend": false,
"tooltip": false,
"viz": false
},
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
"scaleDistribution": {
"type": "linear"
},
"showPoints": "auto",
"spanNulls": false,
"stacking": {
"group": "A",
"mode": "none"
},
"thresholdsStyle": {
"mode": "off"
}
},
"mappings": [ ],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "red",
"value": 80
}
]
},
"unit": "bytes"
},
"overrides": [ ]
},
"gridPos": {
"h": 10,
"w": 12,
"x": 0,
"y": 20
},
"id": 8,
"options": {
"legend": {
"calcs": [ ],
"displayMode": "list",
"placement": "bottom"
},
"tooltip": {
"mode": "single"
}
},
"targets": [
{
"datasource": "${datasource}",
"exemplar": true,
"expr": "sum by (tenant) (loki_ruler_wal_disk_size{tenant=~\"${tenant}\"})",
"interval": "",
"legendFormat": "{{tenant}}",
"refId": "A"
}
],
"title": "WAL Disk Size",
"type": "timeseries"
},
{
"datasource": "${datasource}",
"description": "Some number of pending samples is expected, but if remote-write is failing this value will remain high",
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"axisLabel": "",
"axisPlacement": "auto",
"barAlignment": 0,
"drawStyle": "line",
"fillOpacity": 0,
"gradientMode": "none",
"hideFrom": {
"legend": false,
"tooltip": false,
"viz": false
},
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
"scaleDistribution": {
"type": "linear"
},
"showPoints": "auto",
"spanNulls": false,
"stacking": {
"group": "A",
"mode": "none"
},
"thresholdsStyle": {
"mode": "off"
}
},
"mappings": [ ],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "red",
"value": 80
}
]
}
},
"overrides": [ ]
},
"gridPos": {
"h": 10,
"w": 12,
"x": 12,
"y": 20
},
"id": 9,
"options": {
"legend": {
"calcs": [ ],
"displayMode": "list",
"placement": "bottom"
},
"tooltip": {
"mode": "single"
}
},
"targets": [
{
"datasource": "${datasource}",
"exemplar": true,
"expr": "max(loki_ruler_wal_prometheus_remote_storage_samples_pending{tenant=~\"${tenant}\"}) by (tenant,pod) > 0",
"interval": "",
"legendFormat": "{{tenant}}",
"refId": "A"
}
],
"title": "Pending Samples",
"type": "timeseries"
}
],
"refresh": "10s",
"rows": [ ],
"schemaVersion": 14,
"style": "dark",
"tags": [
"loki"
],
"templating": {
"list": [
{
"current": {
"text": "default",
"value": "default"
},
"hide": 0,
"label": "Data source",
"name": "datasource",
"options": [ ],
"query": "prometheus",
"refresh": 1,
"regex": "",
"type": "datasource"
},
{
"allValue": null,
"current": {
"text": "prod",
"value": "prod"
},
"datasource": "$datasource",
"hide": 0,
"includeAll": false,
"label": "cluster",
"multi": false,
"name": "cluster",
"options": [ ],
"query": "label_values(loki_build_info, cluster)",
"refresh": 1,
"regex": "",
"sort": 2,
"tagValuesQuery": "",
"tags": [ ],
"tagsQuery": "",
"type": "query",
"useTags": false
},
{
"allValue": null,
"current": {
"text": "prod",
"value": "prod"
},
"datasource": "$datasource",
"hide": 0,
"includeAll": false,
"label": "namespace",
"multi": false,
"name": "namespace",
"options": [ ],
"query": "label_values(loki_build_info{cluster=~\"$cluster\"}, namespace)",
"refresh": 1,
"regex": "",
"sort": 2,
"tagValuesQuery": "",
"tags": [ ],
"tagsQuery": "",
"type": "query",
"useTags": false
},
{
"hide": 0,
"label": null,
"name": "loki_datasource",
"options": [ ],
"query": "loki",
"refresh": 1,
"regex": "",
"type": "datasource"
},
{
"allValue": ".+",
"current": { },
"datasource": "$datasource",
"hide": 0,
"includeAll": true,
"label": null,
"multi": false,
"name": "tenant",
"options": [ ],
"query": "query_result(sum by (id) (grafanacloud_logs_instance_info) and sum(label_replace(loki_tenant:active_streams{cluster=\"$cluster\",namespace=\"$namespace\"},\"id\",\"$1\",\"tenant\",\"(.*)\")) by(id))",
"refresh": 0,
"regex": "/\"([^\"]+)\"/",
"sort": 1,
"tagValuesQuery": "",
"tags": [ ],
"tagsQuery": "",
"type": "query",
"useTags": false
}
]
},
"time": {
"from": "now-1h",
"to": "now"
},
"timepicker": {
"refresh_intervals": [
"5s",
"10s",
"30s",
"1m",
"5m",
"15m",
"30m",
"1h",
"2h",
"1d"
],
"time_options": [
"5m",
"15m",
"1h",
"6h",
"12h",
"24h",
"2d",
"7d",
"30d"
]
},
"timezone": "utc",
"title": "Loki / Recording Rules",
"uid": "recording-rules",
"version": 0,
"weekStart": ""
}

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

View File

@@ -90,6 +90,7 @@
]
},
"id": 1,
"interval": "1m",
"links": [ ],
"options": {
"legend": {
@@ -103,19 +104,19 @@
"span": 4,
"targets": [
{
"expr": "sum by(pod) (rate(container_cpu_usage_seconds_total{cluster=~\"$cluster\", namespace=~\"$namespace\", container=\"loki\", pod=~\"(loki|enterprise-logs)-read.*\"}[$__rate_interval]))",
"expr": "sum by(pod) (rate(container_cpu_usage_seconds_total{cluster=~\"$cluster\", namespace=~\"$namespace\", pod=~\"(.*compactor.*|(loki|enterprise-logs)-backend.*|loki-single-binary)\"}[$__rate_interval]))",
"format": "time_series",
"legendFormat": "{{pod}}",
"legendLink": null
},
{
"expr": "min(kube_pod_container_resource_requests{cluster=~\"$cluster\", namespace=~\"$namespace\", container=\"loki\", pod=~\"(loki|enterprise-logs)-read.*\", resource=\"cpu\"} > 0)",
"expr": "min(kube_pod_container_resource_requests{cluster=~\"$cluster\", namespace=~\"$namespace\", pod=~\"(.*compactor.*|(loki|enterprise-logs)-backend.*|loki-single-binary)\", resource=\"cpu\"} > 0)",
"format": "time_series",
"legendFormat": "request",
"legendLink": null
},
{
"expr": "min(container_spec_cpu_quota{cluster=~\"$cluster\", namespace=~\"$namespace\", container=\"loki\", pod=~\"(loki|enterprise-logs)-read.*\"} / container_spec_cpu_period{cluster=~\"$cluster\", namespace=~\"$namespace\", container=\"loki\", pod=~\"(loki|enterprise-logs)-read.*\"})",
"expr": "min(container_spec_cpu_quota{cluster=~\"$cluster\", namespace=~\"$namespace\", pod=~\"(.*compactor.*|(loki|enterprise-logs)-backend.*|loki-single-binary)\"} / container_spec_cpu_period{cluster=~\"$cluster\", namespace=~\"$namespace\", pod=~\"(.*compactor.*|(loki|enterprise-logs)-backend.*|loki-single-binary)\"})",
"format": "time_series",
"legendFormat": "limit",
"legendLink": null
@@ -191,6 +192,7 @@
]
},
"id": 2,
"interval": "1m",
"links": [ ],
"options": {
"legend": {
@@ -204,19 +206,19 @@
"span": 4,
"targets": [
{
"expr": "max by(pod) (container_memory_working_set_bytes{cluster=~\"$cluster\", namespace=~\"$namespace\", container=\"loki\", pod=~\"(loki|enterprise-logs)-read.*\"})",
"expr": "max by(pod) (container_memory_working_set_bytes{cluster=~\"$cluster\", namespace=~\"$namespace\", pod=~\"(.*compactor.*|(loki|enterprise-logs)-backend.*|loki-single-binary)\"})",
"format": "time_series",
"legendFormat": "{{pod}}",
"legendLink": null
},
{
"expr": "min(kube_pod_container_resource_requests{cluster=~\"$cluster\", namespace=~\"$namespace\", container=\"loki\", pod=~\"(loki|enterprise-logs)-read.*\", resource=\"memory\"} > 0)",
"expr": "min(kube_pod_container_resource_requests{cluster=~\"$cluster\", namespace=~\"$namespace\", pod=~\"(.*compactor.*|(loki|enterprise-logs)-backend.*|loki-single-binary)\", resource=\"memory\"} > 0)",
"format": "time_series",
"legendFormat": "request",
"legendLink": null
},
{
"expr": "min(container_spec_memory_limit_bytes{cluster=~\"$cluster\", namespace=~\"$namespace\", container=\"loki\", pod=~\"(loki|enterprise-logs)-read.*\"} > 0)",
"expr": "min(container_spec_memory_limit_bytes{cluster=~\"$cluster\", namespace=~\"$namespace\", pod=~\"(.*compactor.*|(loki|enterprise-logs)-backend.*|loki-single-binary)\"} > 0)",
"format": "time_series",
"legendFormat": "limit",
"legendLink": null
@@ -253,6 +255,7 @@
"overrides": [ ]
},
"id": 3,
"interval": "1m",
"links": [ ],
"options": {
"legend": {
@@ -266,7 +269,7 @@
"span": 4,
"targets": [
{
"expr": "sum by(pod) (go_memstats_heap_inuse_bytes{cluster=~\"$cluster\", job=~\"($namespace)/(loki|enterprise-logs)-read\"})",
"expr": "sum by(pod) (go_memstats_heap_inuse_bytes{cluster=~\"$cluster\", job=~\"($namespace)/(.*compactor|(loki|enterprise-logs)-backend.*|loki-single-binary)\"})",
"format": "time_series",
"legendFormat": "{{pod}}",
"legendLink": null
@@ -317,6 +320,7 @@
},
"fill": 1,
"id": 4,
"interval": "1m",
"legend": {
"avg": false,
"current": false,
@@ -423,6 +427,7 @@
"overrides": [ ]
},
"id": 5,
"interval": "1m",
"links": [ ],
"options": {
"legend": {
@@ -482,6 +487,7 @@
"overrides": [ ]
},
"id": 6,
"interval": "1m",
"links": [ ],
"options": {
"legend": {
@@ -529,6 +535,7 @@
"overrides": [ ]
},
"id": 7,
"interval": "1m",
"links": [ ],
"options": {
"legend": {
@@ -590,6 +597,7 @@
},
"fill": 1,
"id": 8,
"interval": "1m",
"legend": {
"avg": false,
"current": false,
@@ -696,6 +704,7 @@
"overrides": [ ]
},
"id": 9,
"interval": "1m",
"links": [ ],
"options": {
"legend": {
@@ -743,6 +752,7 @@
"overrides": [ ]
},
"id": 10,
"interval": "1m",
"links": [ ],
"options": {
"legend": {
@@ -802,6 +812,7 @@
"overrides": [ ]
},
"id": 11,
"interval": "1m",
"links": [ ],
"options": {
"legend": {
@@ -849,6 +860,7 @@
"overrides": [ ]
},
"id": 12,
"interval": "1m",
"links": [ ],
"options": {
"legend": {
@@ -896,6 +908,7 @@
"overrides": [ ]
},
"id": 13,
"interval": "1m",
"links": [ ],
"options": {
"legend": {
@@ -956,6 +969,7 @@
},
"format": "short",
"id": 14,
"interval": "1m",
"links": [ ],
"options": {
"legend": {
@@ -1004,6 +1018,7 @@
"overrides": [ ]
},
"id": 15,
"interval": "1m",
"links": [ ],
"nullPointMode": "null as zero",
"options": {
@@ -1095,6 +1110,7 @@
},
"format": "short",
"id": 16,
"interval": "1m",
"links": [ ],
"options": {
"legend": {
@@ -1143,6 +1159,7 @@
"overrides": [ ]
},
"id": 17,
"interval": "1m",
"links": [ ],
"nullPointMode": "null as zero",
"options": {
@@ -1233,6 +1250,7 @@
"overrides": [ ]
},
"id": 18,
"interval": "1m",
"links": [ ],
"options": {
"legend": {
@@ -1280,6 +1298,7 @@
"overrides": [ ]
},
"id": 19,
"interval": "1m",
"links": [ ],
"options": {
"legend": {
@@ -1327,6 +1346,7 @@
"overrides": [ ]
},
"id": 20,
"interval": "1m",
"links": [ ],
"options": {
"legend": {
@@ -1367,7 +1387,7 @@
"span": 12,
"targets": [
{
"expr": "{cluster=~\"$cluster\", job=~\"($namespace)/(loki|enterprise-logs)-read\"}",
"expr": "{cluster=~\"$cluster\", job=~\"($namespace)/(.*compactor|(loki|enterprise-logs)-backend.*|loki-single-binary)\"}",
"refId": "A"
}
],

View File

@@ -22,6 +22,273 @@
],
"refresh": "10s",
"rows": [
{
"collapse": false,
"height": "250px",
"panels": [
{
"datasource": "$datasource",
"fieldConfig": {
"defaults": {
"custom": {
"drawStyle": "line",
"fillOpacity": 10,
"lineWidth": 1,
"pointSize": 5,
"showPoints": "never",
"spanNulls": false,
"stacking": {
"group": "A",
"mode": "none"
}
},
"thresholds": {
"mode": "absolute",
"steps": [ ]
},
"unit": "short"
},
"overrides": [
{
"matcher": {
"id": "byName",
"options": "request"
},
"properties": [
{
"id": "color",
"value": {
"fixedColor": "#FFC000",
"mode": "fixed"
}
},
{
"id": "custom.fillOpacity",
"value": 0
}
]
},
{
"matcher": {
"id": "byName",
"options": "limit"
},
"properties": [
{
"id": "color",
"value": {
"fixedColor": "#E02F44",
"mode": "fixed"
}
},
{
"id": "custom.fillOpacity",
"value": 0
}
]
}
]
},
"id": 1,
"interval": "1m",
"links": [ ],
"options": {
"legend": {
"showLegend": true
},
"tooltip": {
"mode": "single",
"sort": "none"
}
},
"span": 4,
"targets": [
{
"expr": "sum by(pod) (rate(container_cpu_usage_seconds_total{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"distributor\"}[$__rate_interval]))",
"format": "time_series",
"legendFormat": "{{pod}}",
"legendLink": null
},
{
"expr": "min(kube_pod_container_resource_requests{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"distributor\", resource=\"cpu\"} > 0)",
"format": "time_series",
"legendFormat": "request",
"legendLink": null
},
{
"expr": "min(container_spec_cpu_quota{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"distributor\"} / container_spec_cpu_period{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"distributor\"})",
"format": "time_series",
"legendFormat": "limit",
"legendLink": null
}
],
"title": "CPU",
"tooltip": {
"sort": 2
},
"type": "timeseries"
},
{
"datasource": "$datasource",
"fieldConfig": {
"defaults": {
"custom": {
"drawStyle": "line",
"fillOpacity": 10,
"lineWidth": 1,
"pointSize": 5,
"showPoints": "never",
"spanNulls": false,
"stacking": {
"group": "A",
"mode": "none"
}
},
"thresholds": {
"mode": "absolute",
"steps": [ ]
},
"unit": "bytes"
},
"overrides": [
{
"matcher": {
"id": "byName",
"options": "request"
},
"properties": [
{
"id": "color",
"value": {
"fixedColor": "#FFC000",
"mode": "fixed"
}
},
{
"id": "custom.fillOpacity",
"value": 0
}
]
},
{
"matcher": {
"id": "byName",
"options": "limit"
},
"properties": [
{
"id": "color",
"value": {
"fixedColor": "#E02F44",
"mode": "fixed"
}
},
{
"id": "custom.fillOpacity",
"value": 0
}
]
}
]
},
"id": 2,
"interval": "1m",
"links": [ ],
"options": {
"legend": {
"showLegend": true
},
"tooltip": {
"mode": "single",
"sort": "none"
}
},
"span": 4,
"targets": [
{
"expr": "max by(pod) (container_memory_working_set_bytes{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"distributor\"})",
"format": "time_series",
"legendFormat": "{{pod}}",
"legendLink": null
},
{
"expr": "min(kube_pod_container_resource_requests{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"distributor\", resource=\"memory\"} > 0)",
"format": "time_series",
"legendFormat": "request",
"legendLink": null
},
{
"expr": "min(container_spec_memory_limit_bytes{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"distributor\"} > 0)",
"format": "time_series",
"legendFormat": "limit",
"legendLink": null
}
],
"title": "Memory (workingset)",
"tooltip": {
"sort": 2
},
"type": "timeseries"
},
{
"datasource": "$datasource",
"fieldConfig": {
"defaults": {
"custom": {
"drawStyle": "line",
"fillOpacity": 10,
"lineWidth": 1,
"pointSize": 5,
"showPoints": "never",
"spanNulls": false,
"stacking": {
"group": "A",
"mode": "none"
}
},
"thresholds": {
"mode": "absolute",
"steps": [ ]
},
"unit": "bytes"
},
"overrides": [ ]
},
"id": 3,
"interval": "1m",
"links": [ ],
"options": {
"legend": {
"showLegend": true
},
"tooltip": {
"mode": "single",
"sort": "none"
}
},
"span": 4,
"targets": [
{
"expr": "sum by(pod) (go_memstats_heap_inuse_bytes{cluster=~\"$cluster\", job=~\"($namespace)/.*distributor\"})",
"format": "time_series",
"legendFormat": "{{pod}}",
"legendLink": null
}
],
"title": "Memory (go heap inuse)",
"tooltip": {
"sort": 2
},
"type": "timeseries"
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Distributor",
"titleSize": "h6"
},
{
"collapse": false,
"collapsed": false,
@@ -51,7 +318,8 @@
"overrides": [ ]
},
"gridPos": { },
"id": 1,
"id": 4,
"interval": "1m",
"links": [ ],
"options": {
"legend": {
@@ -64,7 +332,7 @@
},
"targets": [
{
"expr": "sum by(pod) (loki_ingester_memory_streams{cluster=~\"$cluster\", job=~\"($namespace)/(loki|enterprise-logs)-write\"})",
"expr": "sum by(pod) (loki_ingester_memory_streams{cluster=~\"$cluster\", job=~\"($namespace)/(.*ingester.*|(loki|enterprise-logs)-write|loki-single-binary)\"})",
"format": "time_series",
"legendFormat": "{{pod}}",
"legendLink": null
@@ -140,7 +408,8 @@
]
},
"gridPos": { },
"id": 2,
"id": 5,
"interval": "1m",
"links": [ ],
"options": {
"legend": {
@@ -153,19 +422,19 @@
},
"targets": [
{
"expr": "sum by(pod) (rate(container_cpu_usage_seconds_total{cluster=~\"$cluster\", namespace=~\"$namespace\", container=\"loki\", pod=~\"(loki|enterprise-logs)-write.*\"}[$__rate_interval]))",
"expr": "sum by(pod) (rate(container_cpu_usage_seconds_total{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"loki|ingester\", pod=~\"(.*ingester.*|(loki|enterprise-logs)-write.*|loki-single-binary)\"}[$__rate_interval]))",
"format": "time_series",
"legendFormat": "{{pod}}",
"legendLink": null
},
{
"expr": "min(kube_pod_container_resource_requests{cluster=~\"$cluster\", namespace=~\"$namespace\", container=\"loki\", pod=~\"(loki|enterprise-logs)-write.*\", resource=\"cpu\"} > 0)",
"expr": "min(kube_pod_container_resource_requests{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"loki|ingester\", pod=~\"(.*ingester.*|(loki|enterprise-logs)-write.*|loki-single-binary)\", resource=\"cpu\"} > 0)",
"format": "time_series",
"legendFormat": "request",
"legendLink": null
},
{
"expr": "min(container_spec_cpu_quota{cluster=~\"$cluster\", namespace=~\"$namespace\", container=\"loki\", pod=~\"(loki|enterprise-logs)-write.*\"} / container_spec_cpu_period{cluster=~\"$cluster\", namespace=~\"$namespace\", container=\"loki\", pod=~\"(loki|enterprise-logs)-write.*\"})",
"expr": "min(container_spec_cpu_quota{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"loki|ingester\", pod=~\"(.*ingester.*|(loki|enterprise-logs)-write.*|loki-single-binary)\"} / container_spec_cpu_period{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"loki|ingester\", pod=~\"(.*ingester.*|(loki|enterprise-logs)-write.*|loki-single-binary)\"})",
"format": "time_series",
"legendFormat": "limit",
"legendLink": null
@@ -241,7 +510,8 @@
]
},
"gridPos": { },
"id": 3,
"id": 6,
"interval": "1m",
"links": [ ],
"options": {
"legend": {
@@ -254,19 +524,19 @@
},
"targets": [
{
"expr": "max by(pod) (container_memory_working_set_bytes{cluster=~\"$cluster\", namespace=~\"$namespace\", container=\"loki\", pod=~\"(loki|enterprise-logs)-write.*\"})",
"expr": "max by(pod) (container_memory_working_set_bytes{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"loki|ingester\", pod=~\"(.*ingester.*|(loki|enterprise-logs)-write.*|loki-single-binary)\"})",
"format": "time_series",
"legendFormat": "{{pod}}",
"legendLink": null
},
{
"expr": "min(kube_pod_container_resource_requests{cluster=~\"$cluster\", namespace=~\"$namespace\", container=\"loki\", pod=~\"(loki|enterprise-logs)-write.*\", resource=\"memory\"} > 0)",
"expr": "min(kube_pod_container_resource_requests{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"loki|ingester\", pod=~\"(.*ingester.*|(loki|enterprise-logs)-write.*|loki-single-binary)\", resource=\"memory\"} > 0)",
"format": "time_series",
"legendFormat": "request",
"legendLink": null
},
{
"expr": "min(container_spec_memory_limit_bytes{cluster=~\"$cluster\", namespace=~\"$namespace\", container=\"loki\", pod=~\"(loki|enterprise-logs)-write.*\"} > 0)",
"expr": "min(container_spec_memory_limit_bytes{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"loki|ingester\", pod=~\"(.*ingester.*|(loki|enterprise-logs)-write.*|loki-single-binary)\"} > 0)",
"format": "time_series",
"legendFormat": "limit",
"legendLink": null
@@ -303,7 +573,8 @@
"overrides": [ ]
},
"gridPos": { },
"id": 4,
"id": 7,
"interval": "1m",
"links": [ ],
"options": {
"legend": {
@@ -316,7 +587,7 @@
},
"targets": [
{
"expr": "sum by(pod) (go_memstats_heap_inuse_bytes{cluster=~\"$cluster\", job=~\"($namespace)/(loki|enterprise-logs)-write\"})",
"expr": "sum by(pod) (go_memstats_heap_inuse_bytes{cluster=~\"$cluster\", job=~\"($namespace)/(.*ingester.*|(loki|enterprise-logs)-write|loki-single-binary)\"})",
"format": "time_series",
"legendFormat": "{{pod}}",
"legendLink": null
@@ -353,7 +624,8 @@
"overrides": [ ]
},
"gridPos": { },
"id": 5,
"id": 8,
"interval": "1m",
"links": [ ],
"options": {
"legend": {
@@ -366,7 +638,7 @@
},
"targets": [
{
"expr": "sum by(instance, pod, device) (rate(node_disk_written_bytes_total[$__rate_interval])) + ignoring(pod) group_right() (label_replace(count by(instance, pod, device) (container_fs_writes_bytes_total{cluster=~\"$cluster\", namespace=~\"$namespace\", container=\"loki\", pod=~\"(loki|enterprise-logs)-write.*\", device!~\".*sda.*\"}), \"device\", \"$1\", \"device\", \"/dev/(.*)\") * 0)\n",
"expr": "sum by(instance, pod, device) (rate(node_disk_written_bytes_total[$__rate_interval])) + ignoring(pod) group_right() (label_replace(count by(instance, pod, device) (container_fs_writes_bytes_total{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"loki|ingester\", pod=~\"(.*ingester.*|(loki|enterprise-logs)-write.*|loki-single-binary)\", device!~\".*sda.*\"}), \"device\", \"$1\", \"device\", \"/dev/(.*)\") * 0)\n",
"format": "time_series",
"legendFormat": "{{pod}} - {{device}}",
"legendLink": null
@@ -400,7 +672,8 @@
"overrides": [ ]
},
"gridPos": { },
"id": 6,
"id": 9,
"interval": "1m",
"links": [ ],
"options": {
"legend": {
@@ -413,7 +686,7 @@
},
"targets": [
{
"expr": "sum by(instance, pod, device) (rate(node_disk_read_bytes_total[$__rate_interval])) + ignoring(pod) group_right() (label_replace(count by(instance, pod, device) (container_fs_writes_bytes_total{cluster=~\"$cluster\", namespace=~\"$namespace\", container=\"loki\", pod=~\"(loki|enterprise-logs)-write.*\", device!~\".*sda.*\"}), \"device\", \"$1\", \"device\", \"/dev/(.*)\") * 0)\n",
"expr": "sum by(instance, pod, device) (rate(node_disk_read_bytes_total[$__rate_interval])) + ignoring(pod) group_right() (label_replace(count by(instance, pod, device) (container_fs_writes_bytes_total{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"loki|ingester\", pod=~\"(.*ingester.*|(loki|enterprise-logs)-write.*|loki-single-binary)\", device!~\".*sda.*\"}), \"device\", \"$1\", \"device\", \"/dev/(.*)\") * 0)\n",
"format": "time_series",
"legendFormat": "{{pod}} - {{device}}",
"legendLink": null
@@ -447,7 +720,8 @@
"overrides": [ ]
},
"gridPos": { },
"id": 7,
"id": 10,
"interval": "1m",
"links": [ ],
"options": {
"legend": {
@@ -460,7 +734,7 @@
},
"targets": [
{
"expr": "max by(persistentvolumeclaim) (kubelet_volume_stats_used_bytes{cluster=~\"$cluster\", namespace=~\"$namespace\"} / kubelet_volume_stats_capacity_bytes{cluster=~\"$cluster\", namespace=~\"$namespace\"}) and count by(persistentvolumeclaim) (kube_persistentvolumeclaim_labels{cluster=~\"$cluster\", namespace=~\"$namespace\",label_name=~\"(loki|enterprise-logs)-write.*\"})",
"expr": "max by(persistentvolumeclaim) (kubelet_volume_stats_used_bytes{cluster=~\"$cluster\", namespace=~\"$namespace\"} / kubelet_volume_stats_capacity_bytes{cluster=~\"$cluster\", namespace=~\"$namespace\"}) and count by(persistentvolumeclaim) (kube_persistentvolumeclaim_labels{cluster=~\"$cluster\", namespace=~\"$namespace\",label_name=~\"(.*ingester.*|(loki|enterprise-logs)-write|loki-single-binary).*\"})",
"format": "time_series",
"legendFormat": "{{persistentvolumeclaim}}",
"legendLink": null
@@ -474,7 +748,7 @@
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Write path",
"title": "Ingester",
"titleSize": "h6",
"type": "row"
}

File diff suppressed because it is too large Load Diff

View File

@@ -1,836 +0,0 @@
{
"__requires": [
{
"id": "grafana",
"name": "Grafana",
"type": "grafana",
"version": "8.0.0"
}
],
"annotations": {
"list": [ ]
},
"editable": true,
"gnetId": null,
"graphTooltip": 1,
"hideControls": false,
"links": [
{
"asDropdown": true,
"icon": "external link",
"includeVars": true,
"keepTime": true,
"tags": [
"mimir"
],
"targetBlank": false,
"title": "Mimir dashboards",
"type": "dashboards"
}
],
"refresh": "10s",
"rows": [
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": { },
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 0,
"id": 1,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"links": [ ],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
{
"alias": "request",
"color": "#FFC000",
"dashLength": 5,
"dashes": true,
"fill": 0
},
{
"alias": "limit",
"color": "#E02F44",
"dashLength": 5,
"dashes": true,
"fill": 0
}
],
"spaceLength": 10,
"span": 4,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "sum by(pod) (rate(container_cpu_usage_seconds_total{cluster=~\"$cluster\", namespace=~\"$namespace\",container=~\"alertmanager\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{pod}}",
"legendLink": null
},
{
"expr": "min(container_spec_cpu_quota{cluster=~\"$cluster\", namespace=~\"$namespace\",container=~\"alertmanager\"} / container_spec_cpu_period{cluster=~\"$cluster\", namespace=~\"$namespace\",container=~\"alertmanager\"})",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "limit",
"legendLink": null
},
{
"expr": "min(kube_pod_container_resource_requests{cluster=~\"$cluster\", namespace=~\"$namespace\",container=~\"alertmanager\",resource=\"cpu\"})",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "request",
"legendLink": null
}
],
"thresholds": [ ],
"timeFrom": null,
"timeShift": null,
"title": "CPU",
"tooltip": {
"sort": 2
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [ ]
},
"yaxes": [
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
},
{
"aliasColors": { },
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 0,
"id": 2,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"links": [ ],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
{
"alias": "request",
"color": "#FFC000",
"dashLength": 5,
"dashes": true,
"fill": 0
},
{
"alias": "limit",
"color": "#E02F44",
"dashLength": 5,
"dashes": true,
"fill": 0
}
],
"spaceLength": 10,
"span": 4,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "max by(pod) (container_memory_working_set_bytes{cluster=~\"$cluster\", namespace=~\"$namespace\",container=~\"alertmanager\"})",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{pod}}",
"legendLink": null
},
{
"expr": "min(container_spec_memory_limit_bytes{cluster=~\"$cluster\", namespace=~\"$namespace\",container=~\"alertmanager\"} > 0)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "limit",
"legendLink": null
},
{
"expr": "min(kube_pod_container_resource_requests{cluster=~\"$cluster\", namespace=~\"$namespace\",container=~\"alertmanager\",resource=\"memory\"})",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "request",
"legendLink": null
}
],
"thresholds": [ ],
"timeFrom": null,
"timeShift": null,
"title": "Memory (workingset)",
"tooltip": {
"sort": 2
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [ ]
},
"yaxes": [
{
"format": "bytes",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
},
{
"aliasColors": { },
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 0,
"id": 3,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"links": [ ],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [ ],
"spaceLength": 10,
"span": 4,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "sum by(pod) (go_memstats_heap_inuse_bytes{cluster=~\"$cluster\", namespace=~\"$namespace\",container=~\"alertmanager\"})",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{pod}}",
"legendLink": null
}
],
"thresholds": [ ],
"timeFrom": null,
"timeShift": null,
"title": "Memory (go heap inuse)",
"tooltip": {
"sort": 2
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [ ]
},
"yaxes": [
{
"format": "bytes",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Alertmanager",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": { },
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 10,
"id": 4,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 0,
"links": [ ],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [ ],
"spaceLength": 10,
"span": 6,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sum by(pod) (rate(container_network_receive_bytes_total{cluster=~\"$cluster\", namespace=~\"$namespace\",pod=~\"(.*mimir-)?alertmanager.*\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{pod}}",
"legendLink": null
}
],
"thresholds": [ ],
"timeFrom": null,
"timeShift": null,
"title": "Receive bandwidth",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [ ]
},
"yaxes": [
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
},
{
"aliasColors": { },
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 10,
"id": 5,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 0,
"links": [ ],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [ ],
"spaceLength": 10,
"span": 6,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sum by(pod) (rate(container_network_transmit_bytes_total{cluster=~\"$cluster\", namespace=~\"$namespace\",pod=~\"(.*mimir-)?alertmanager.*\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{pod}}",
"legendLink": null
}
],
"thresholds": [ ],
"timeFrom": null,
"timeShift": null,
"title": "Transmit bandwidth",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [ ]
},
"yaxes": [
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": { },
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 10,
"id": 6,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 0,
"links": [ ],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [ ],
"spaceLength": 10,
"span": 6,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sum by(instance, pod, device) (\n rate(\n node_disk_written_bytes_total[$__rate_interval]\n )\n)\n+\nignoring(pod) group_right() (\n label_replace(\n count by(\n instance,\n pod,\n device\n )\n (\n container_fs_writes_bytes_total{\n cluster=~\"$cluster\", namespace=~\"$namespace\",\n container=~\"alertmanager\",\n device!~\".*sda.*\"\n }\n ),\n \"device\",\n \"$1\",\n \"device\",\n \"/dev/(.*)\"\n ) * 0\n)\n\n",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{pod}} - {{device}}",
"legendLink": null
}
],
"thresholds": [ ],
"timeFrom": null,
"timeShift": null,
"title": "Disk writes",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [ ]
},
"yaxes": [
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
},
{
"aliasColors": { },
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 10,
"id": 7,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 0,
"links": [ ],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [ ],
"spaceLength": 10,
"span": 6,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sum by(instance, pod, device) (\n rate(\n node_disk_read_bytes_total[$__rate_interval]\n )\n) + ignoring(pod) group_right() (\n label_replace(\n count by(\n instance,\n pod,\n device\n )\n (\n container_fs_writes_bytes_total{\n cluster=~\"$cluster\", namespace=~\"$namespace\",\n container=~\"alertmanager\",\n device!~\".*sda.*\"\n }\n ),\n \"device\",\n \"$1\",\n \"device\",\n \"/dev/(.*)\"\n ) * 0\n)\n\n",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{pod}} - {{device}}",
"legendLink": null
}
],
"thresholds": [ ],
"timeFrom": null,
"timeShift": null,
"title": "Disk reads",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [ ]
},
"yaxes": [
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Disk",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": { },
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 0,
"id": 8,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"links": [ ],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [ ],
"spaceLength": 10,
"span": 12,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "max by(persistentvolumeclaim) (\n kubelet_volume_stats_used_bytes{cluster=~\"$cluster\", namespace=~\"$namespace\"} /\n kubelet_volume_stats_capacity_bytes{cluster=~\"$cluster\", namespace=~\"$namespace\"}\n)\nand\ncount by(persistentvolumeclaim) (\n kube_persistentvolumeclaim_labels{\n cluster=~\"$cluster\", namespace=~\"$namespace\",\n label_name=~\"(alertmanager).*\"\n }\n)\n",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{persistentvolumeclaim}}",
"legendLink": null
}
],
"thresholds": [ ],
"timeFrom": null,
"timeShift": null,
"title": "Disk space utilization",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [ ]
},
"yaxes": [
{
"format": "percentunit",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "",
"titleSize": "h6"
}
],
"schemaVersion": 14,
"style": "dark",
"tags": [
"mimir"
],
"templating": {
"list": [
{
"current": {
"text": "default",
"value": "default"
},
"hide": 0,
"label": "Data Source",
"name": "datasource",
"options": [ ],
"query": "prometheus",
"refresh": 1,
"regex": "",
"type": "datasource"
},
{
"allValue": ".*",
"current": {
"text": "prod",
"value": "prod"
},
"datasource": "$datasource",
"hide": 0,
"includeAll": true,
"label": "cluster",
"multi": false,
"name": "cluster",
"options": [ ],
"query": "label_values(cortex_build_info, cluster)",
"refresh": 1,
"regex": "",
"sort": 1,
"tagValuesQuery": "",
"tags": [ ],
"tagsQuery": "",
"type": "query",
"useTags": false
},
{
"allValue": null,
"current": {
"text": "prod",
"value": "prod"
},
"datasource": "$datasource",
"hide": 0,
"includeAll": false,
"label": "namespace",
"multi": false,
"name": "namespace",
"options": [ ],
"query": "label_values(cortex_build_info{cluster=~\"$cluster\"}, namespace)",
"refresh": 1,
"regex": "",
"sort": 1,
"tagValuesQuery": "",
"tags": [ ],
"tagsQuery": "",
"type": "query",
"useTags": false
}
]
},
"time": {
"from": "now-1h",
"to": "now"
},
"timepicker": {
"refresh_intervals": [
"5s",
"10s",
"30s",
"1m",
"5m",
"15m",
"30m",
"1h",
"2h",
"1d"
],
"time_options": [
"5m",
"15m",
"1h",
"6h",
"12h",
"24h",
"2d",
"7d",
"30d"
]
},
"timezone": "utc",
"title": "Mimir / Alertmanager resources",
"uid": "a6883fb22799ac74479c7db872451092",
"version": 0
}

File diff suppressed because it is too large Load Diff

View File

@@ -1,940 +0,0 @@
{
"__requires": [
{
"id": "grafana",
"name": "Grafana",
"type": "grafana",
"version": "8.0.0"
}
],
"annotations": {
"list": [ ]
},
"editable": true,
"gnetId": null,
"graphTooltip": 1,
"hideControls": false,
"links": [
{
"asDropdown": true,
"icon": "external link",
"includeVars": true,
"keepTime": true,
"tags": [
"mimir"
],
"targetBlank": false,
"title": "Mimir dashboards",
"type": "dashboards"
}
],
"refresh": "10s",
"rows": [
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": { },
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 0,
"id": 1,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"links": [ ],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
{
"alias": "request",
"color": "#FFC000",
"dashLength": 5,
"dashes": true,
"fill": 0
},
{
"alias": "limit",
"color": "#E02F44",
"dashLength": 5,
"dashes": true,
"fill": 0
}
],
"spaceLength": 10,
"span": 6,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "sum by(pod) (rate(container_cpu_usage_seconds_total{cluster=~\"$cluster\", namespace=~\"$namespace\",container=~\"compactor\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{pod}}",
"legendLink": null
},
{
"expr": "min(container_spec_cpu_quota{cluster=~\"$cluster\", namespace=~\"$namespace\",container=~\"compactor\"} / container_spec_cpu_period{cluster=~\"$cluster\", namespace=~\"$namespace\",container=~\"compactor\"})",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "limit",
"legendLink": null
},
{
"expr": "min(kube_pod_container_resource_requests{cluster=~\"$cluster\", namespace=~\"$namespace\",container=~\"compactor\",resource=\"cpu\"})",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "request",
"legendLink": null
}
],
"thresholds": [ ],
"timeFrom": null,
"timeShift": null,
"title": "CPU",
"tooltip": {
"sort": 2
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [ ]
},
"yaxes": [
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
},
{
"aliasColors": { },
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 0,
"id": 2,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"links": [ ],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [ ],
"spaceLength": 10,
"span": 6,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "sum by(pod) (go_memstats_heap_inuse_bytes{cluster=~\"$cluster\", namespace=~\"$namespace\",container=~\"compactor\"})",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{pod}}",
"legendLink": null
}
],
"thresholds": [ ],
"timeFrom": null,
"timeShift": null,
"title": "Memory (go heap inuse)",
"tooltip": {
"sort": 2
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [ ]
},
"yaxes": [
{
"format": "bytes",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "CPU and memory",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": { },
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 0,
"id": 3,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"links": [ ],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
{
"alias": "request",
"color": "#FFC000",
"dashLength": 5,
"dashes": true,
"fill": 0
},
{
"alias": "limit",
"color": "#E02F44",
"dashLength": 5,
"dashes": true,
"fill": 0
}
],
"spaceLength": 10,
"span": 6,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "max by(pod) (container_memory_rss{cluster=~\"$cluster\", namespace=~\"$namespace\",container=~\"compactor\"})",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{pod}}",
"legendLink": null
},
{
"expr": "min(container_spec_memory_limit_bytes{cluster=~\"$cluster\", namespace=~\"$namespace\",container=~\"compactor\"} > 0)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "limit",
"legendLink": null
},
{
"expr": "min(kube_pod_container_resource_requests{cluster=~\"$cluster\", namespace=~\"$namespace\",container=~\"compactor\",resource=\"memory\"})",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "request",
"legendLink": null
}
],
"thresholds": [ ],
"timeFrom": null,
"timeShift": null,
"title": "Memory (RSS)",
"tooltip": {
"sort": 2
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [ ]
},
"yaxes": [
{
"format": "bytes",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
},
{
"aliasColors": { },
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 0,
"id": 4,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"links": [ ],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
{
"alias": "request",
"color": "#FFC000",
"dashLength": 5,
"dashes": true,
"fill": 0
},
{
"alias": "limit",
"color": "#E02F44",
"dashLength": 5,
"dashes": true,
"fill": 0
}
],
"spaceLength": 10,
"span": 6,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "max by(pod) (container_memory_working_set_bytes{cluster=~\"$cluster\", namespace=~\"$namespace\",container=~\"compactor\"})",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{pod}}",
"legendLink": null
},
{
"expr": "min(container_spec_memory_limit_bytes{cluster=~\"$cluster\", namespace=~\"$namespace\",container=~\"compactor\"} > 0)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "limit",
"legendLink": null
},
{
"expr": "min(kube_pod_container_resource_requests{cluster=~\"$cluster\", namespace=~\"$namespace\",container=~\"compactor\",resource=\"memory\"})",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "request",
"legendLink": null
}
],
"thresholds": [ ],
"timeFrom": null,
"timeShift": null,
"title": "Memory (workingset)",
"tooltip": {
"sort": 2
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [ ]
},
"yaxes": [
{
"format": "bytes",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": { },
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 10,
"id": 5,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 0,
"links": [ ],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [ ],
"spaceLength": 10,
"span": 6,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sum by(pod) (rate(container_network_receive_bytes_total{cluster=~\"$cluster\", namespace=~\"$namespace\",pod=~\"(.*mimir-)?compactor.*\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{pod}}",
"legendLink": null
}
],
"thresholds": [ ],
"timeFrom": null,
"timeShift": null,
"title": "Receive bandwidth",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [ ]
},
"yaxes": [
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
},
{
"aliasColors": { },
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 10,
"id": 6,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 0,
"links": [ ],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [ ],
"spaceLength": 10,
"span": 6,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sum by(pod) (rate(container_network_transmit_bytes_total{cluster=~\"$cluster\", namespace=~\"$namespace\",pod=~\"(.*mimir-)?compactor.*\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{pod}}",
"legendLink": null
}
],
"thresholds": [ ],
"timeFrom": null,
"timeShift": null,
"title": "Transmit bandwidth",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [ ]
},
"yaxes": [
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": { },
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 10,
"id": 7,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 0,
"links": [ ],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [ ],
"spaceLength": 10,
"span": 4,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sum by(instance, pod, device) (\n rate(\n node_disk_written_bytes_total[$__rate_interval]\n )\n)\n+\nignoring(pod) group_right() (\n label_replace(\n count by(\n instance,\n pod,\n device\n )\n (\n container_fs_writes_bytes_total{\n cluster=~\"$cluster\", namespace=~\"$namespace\",\n container=~\"compactor\",\n device!~\".*sda.*\"\n }\n ),\n \"device\",\n \"$1\",\n \"device\",\n \"/dev/(.*)\"\n ) * 0\n)\n\n",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{pod}} - {{device}}",
"legendLink": null
}
],
"thresholds": [ ],
"timeFrom": null,
"timeShift": null,
"title": "Disk writes",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [ ]
},
"yaxes": [
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
},
{
"aliasColors": { },
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 10,
"id": 8,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 0,
"links": [ ],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [ ],
"spaceLength": 10,
"span": 4,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sum by(instance, pod, device) (\n rate(\n node_disk_read_bytes_total[$__rate_interval]\n )\n) + ignoring(pod) group_right() (\n label_replace(\n count by(\n instance,\n pod,\n device\n )\n (\n container_fs_writes_bytes_total{\n cluster=~\"$cluster\", namespace=~\"$namespace\",\n container=~\"compactor\",\n device!~\".*sda.*\"\n }\n ),\n \"device\",\n \"$1\",\n \"device\",\n \"/dev/(.*)\"\n ) * 0\n)\n\n",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{pod}} - {{device}}",
"legendLink": null
}
],
"thresholds": [ ],
"timeFrom": null,
"timeShift": null,
"title": "Disk reads",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [ ]
},
"yaxes": [
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
},
{
"aliasColors": { },
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 0,
"id": 9,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"links": [ ],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [ ],
"spaceLength": 10,
"span": 4,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "max by(persistentvolumeclaim) (\n kubelet_volume_stats_used_bytes{cluster=~\"$cluster\", namespace=~\"$namespace\"} /\n kubelet_volume_stats_capacity_bytes{cluster=~\"$cluster\", namespace=~\"$namespace\"}\n)\nand\ncount by(persistentvolumeclaim) (\n kube_persistentvolumeclaim_labels{\n cluster=~\"$cluster\", namespace=~\"$namespace\",\n label_name=~\"(compactor).*\"\n }\n)\n",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{persistentvolumeclaim}}",
"legendLink": null
}
],
"thresholds": [ ],
"timeFrom": null,
"timeShift": null,
"title": "Disk space utilization",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [ ]
},
"yaxes": [
{
"format": "percentunit",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Disk",
"titleSize": "h6"
}
],
"schemaVersion": 14,
"style": "dark",
"tags": [
"mimir"
],
"templating": {
"list": [
{
"current": {
"text": "default",
"value": "default"
},
"hide": 0,
"label": "Data Source",
"name": "datasource",
"options": [ ],
"query": "prometheus",
"refresh": 1,
"regex": "",
"type": "datasource"
},
{
"allValue": ".+",
"current": {
"selected": true,
"text": "All",
"value": "$__all"
},
"datasource": "$datasource",
"hide": 0,
"includeAll": true,
"label": "cluster",
"multi": true,
"name": "cluster",
"options": [ ],
"query": "label_values(cortex_build_info, cluster)",
"refresh": 1,
"regex": "",
"sort": 1,
"tagValuesQuery": "",
"tags": [ ],
"tagsQuery": "",
"type": "query",
"useTags": false
},
{
"allValue": ".+",
"current": {
"selected": true,
"text": "All",
"value": "$__all"
},
"datasource": "$datasource",
"hide": 0,
"includeAll": false,
"label": "namespace",
"multi": true,
"name": "namespace",
"options": [ ],
"query": "label_values(cortex_build_info{cluster=~\"$cluster\"}, namespace)",
"refresh": 1,
"regex": "",
"sort": 1,
"tagValuesQuery": "",
"tags": [ ],
"tagsQuery": "",
"type": "query",
"useTags": false
}
]
},
"time": {
"from": "now-1h",
"to": "now"
},
"timepicker": {
"refresh_intervals": [
"5s",
"10s",
"30s",
"1m",
"5m",
"15m",
"30m",
"1h",
"2h",
"1d"
],
"time_options": [
"5m",
"15m",
"1h",
"6h",
"12h",
"24h",
"2d",
"7d",
"30d"
]
},
"timezone": "utc",
"title": "Mimir / Compactor resources",
"uid": "09a5c49e9cdb2f2b24c6d184574a07fd",
"version": 0
}

File diff suppressed because it is too large Load Diff

View File

@@ -1,312 +0,0 @@
{
"__requires": [
{
"id": "grafana",
"name": "Grafana",
"type": "grafana",
"version": "8.0.0"
}
],
"annotations": {
"list": [ ]
},
"editable": true,
"gnetId": null,
"graphTooltip": 1,
"hideControls": false,
"links": [
{
"asDropdown": true,
"icon": "external link",
"includeVars": true,
"keepTime": true,
"tags": [
"mimir"
],
"targetBlank": false,
"title": "Mimir dashboards",
"type": "dashboards"
}
],
"refresh": "10s",
"rows": [
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": { },
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 10,
"id": 1,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 0,
"links": [ ],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [ ],
"spaceLength": 10,
"span": 12,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "count(cortex_config_hash{cluster=~\"$cluster\", namespace=~\"$namespace\"}) by (sha256)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "sha256:{{sha256}}",
"legendLink": null
}
],
"thresholds": [ ],
"timeFrom": null,
"timeShift": null,
"title": "Startup config file hashes",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [ ]
},
"yaxes": [
{
"format": "instances",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Startup config file",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": { },
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 10,
"id": 2,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 0,
"links": [ ],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [ ],
"spaceLength": 10,
"span": 12,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "count(cortex_runtime_config_hash{cluster=~\"$cluster\", namespace=~\"$namespace\"}) by (sha256)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "sha256:{{sha256}}",
"legendLink": null
}
],
"thresholds": [ ],
"timeFrom": null,
"timeShift": null,
"title": "Runtime config file hashes",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [ ]
},
"yaxes": [
{
"format": "instances",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Runtime config file",
"titleSize": "h6"
}
],
"schemaVersion": 14,
"style": "dark",
"tags": [
"mimir"
],
"templating": {
"list": [
{
"current": {
"text": "default",
"value": "default"
},
"hide": 0,
"label": "Data Source",
"name": "datasource",
"options": [ ],
"query": "prometheus",
"refresh": 1,
"regex": "",
"type": "datasource"
},
{
"allValue": ".+",
"current": {
"selected": true,
"text": "All",
"value": "$__all"
},
"datasource": "$datasource",
"hide": 0,
"includeAll": true,
"label": "cluster",
"multi": true,
"name": "cluster",
"options": [ ],
"query": "label_values(cortex_build_info, cluster)",
"refresh": 1,
"regex": "",
"sort": 1,
"tagValuesQuery": "",
"tags": [ ],
"tagsQuery": "",
"type": "query",
"useTags": false
},
{
"allValue": ".+",
"current": {
"selected": true,
"text": "All",
"value": "$__all"
},
"datasource": "$datasource",
"hide": 0,
"includeAll": true,
"label": "namespace",
"multi": true,
"name": "namespace",
"options": [ ],
"query": "label_values(cortex_build_info{cluster=~\"$cluster\"}, namespace)",
"refresh": 1,
"regex": "",
"sort": 1,
"tagValuesQuery": "",
"tags": [ ],
"tagsQuery": "",
"type": "query",
"useTags": false
}
]
},
"time": {
"from": "now-1h",
"to": "now"
},
"timepicker": {
"refresh_intervals": [
"5s",
"10s",
"30s",
"1m",
"5m",
"15m",
"30m",
"1h",
"2h",
"1d"
],
"time_options": [
"5m",
"15m",
"1h",
"6h",
"12h",
"24h",
"2d",
"7d",
"30d"
]
},
"timezone": "utc",
"title": "Mimir / Config",
"uid": "5d9d0b4724c0f80d68467088ec61e003",
"version": 0
}

View File

@@ -1,938 +0,0 @@
{
"__requires": [
{
"id": "grafana",
"name": "Grafana",
"type": "grafana",
"version": "8.0.0"
}
],
"annotations": {
"list": [ ]
},
"editable": true,
"gnetId": null,
"graphTooltip": 1,
"hideControls": false,
"links": [
{
"asDropdown": true,
"icon": "external link",
"includeVars": true,
"keepTime": true,
"tags": [
"mimir"
],
"targetBlank": false,
"title": "Mimir dashboards",
"type": "dashboards"
}
],
"refresh": "10s",
"rows": [
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": { },
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 10,
"id": 1,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 0,
"links": [ ],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [ ],
"spaceLength": 10,
"span": 6,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sum by(component) (rate(thanos_objstore_bucket_operations_total{cluster=~\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{component}}",
"legendLink": null
}
],
"thresholds": [ ],
"timeFrom": null,
"timeShift": null,
"title": "RPS / component",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [ ]
},
"yaxes": [
{
"format": "reqps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
},
{
"datasource": "$datasource",
"fieldConfig": {
"defaults": {
"max": 1,
"min": 0,
"noValue": "0",
"unit": "percentunit"
}
},
"id": 2,
"links": [ ],
"options": {
"legend": {
"showLegend": true
},
"tooltip": {
"mode": "single",
"sort": "none"
}
},
"span": 6,
"targets": [
{
"expr": "sum by(component) (rate(thanos_objstore_bucket_operation_failures_total{cluster=~\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])) / sum by(component) (rate(thanos_objstore_bucket_operations_total{cluster=~\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])) >= 0",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{component}}",
"legendLink": null
}
],
"title": "Error rate / component",
"type": "timeseries"
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Components",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": { },
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 10,
"id": 3,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 0,
"links": [ ],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [ ],
"spaceLength": 10,
"span": 6,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sum by(operation) (rate(thanos_objstore_bucket_operations_total{cluster=~\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{operation}}",
"legendLink": null
}
],
"thresholds": [ ],
"timeFrom": null,
"timeShift": null,
"title": "RPS / operation",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [ ]
},
"yaxes": [
{
"format": "reqps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
},
{
"datasource": "$datasource",
"fieldConfig": {
"defaults": {
"max": 1,
"min": 0,
"noValue": "0",
"unit": "percentunit"
}
},
"id": 4,
"links": [ ],
"options": {
"legend": {
"showLegend": true
},
"tooltip": {
"mode": "single",
"sort": "none"
}
},
"span": 6,
"targets": [
{
"expr": "sum by(operation) (rate(thanos_objstore_bucket_operation_failures_total{cluster=~\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])) / sum by(operation) (rate(thanos_objstore_bucket_operations_total{cluster=~\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])) >= 0",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{operation}}",
"legendLink": null
}
],
"title": "Error rate / operation",
"type": "timeseries"
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Operations",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": { },
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 1,
"id": 5,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"links": [ ],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [ ],
"spaceLength": 10,
"span": 4,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "histogram_quantile(0.99, sum(rate(thanos_objstore_bucket_operation_duration_seconds_bucket{cluster=~\"$cluster\", namespace=~\"$namespace\",operation=\"get\"}[$__rate_interval])) by (le)) * 1e3",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "99th Percentile",
"refId": "A"
},
{
"expr": "histogram_quantile(0.50, sum(rate(thanos_objstore_bucket_operation_duration_seconds_bucket{cluster=~\"$cluster\", namespace=~\"$namespace\",operation=\"get\"}[$__rate_interval])) by (le)) * 1e3",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "50th Percentile",
"refId": "B"
},
{
"expr": "sum(rate(thanos_objstore_bucket_operation_duration_seconds_sum{cluster=~\"$cluster\", namespace=~\"$namespace\",operation=\"get\"}[$__rate_interval])) * 1e3 / sum(rate(thanos_objstore_bucket_operation_duration_seconds_count{cluster=~\"$cluster\", namespace=~\"$namespace\",operation=\"get\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "Average",
"refId": "C"
}
],
"thresholds": [ ],
"timeFrom": null,
"timeShift": null,
"title": "Op: Get",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [ ]
},
"yaxes": [
{
"format": "ms",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
},
{
"aliasColors": { },
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 1,
"id": 6,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"links": [ ],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [ ],
"spaceLength": 10,
"span": 4,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "histogram_quantile(0.99, sum(rate(thanos_objstore_bucket_operation_duration_seconds_bucket{cluster=~\"$cluster\", namespace=~\"$namespace\",operation=\"get_range\"}[$__rate_interval])) by (le)) * 1e3",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "99th Percentile",
"refId": "A"
},
{
"expr": "histogram_quantile(0.50, sum(rate(thanos_objstore_bucket_operation_duration_seconds_bucket{cluster=~\"$cluster\", namespace=~\"$namespace\",operation=\"get_range\"}[$__rate_interval])) by (le)) * 1e3",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "50th Percentile",
"refId": "B"
},
{
"expr": "sum(rate(thanos_objstore_bucket_operation_duration_seconds_sum{cluster=~\"$cluster\", namespace=~\"$namespace\",operation=\"get_range\"}[$__rate_interval])) * 1e3 / sum(rate(thanos_objstore_bucket_operation_duration_seconds_count{cluster=~\"$cluster\", namespace=~\"$namespace\",operation=\"get_range\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "Average",
"refId": "C"
}
],
"thresholds": [ ],
"timeFrom": null,
"timeShift": null,
"title": "Op: GetRange",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [ ]
},
"yaxes": [
{
"format": "ms",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
},
{
"aliasColors": { },
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 1,
"id": 7,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"links": [ ],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [ ],
"spaceLength": 10,
"span": 4,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "histogram_quantile(0.99, sum(rate(thanos_objstore_bucket_operation_duration_seconds_bucket{cluster=~\"$cluster\", namespace=~\"$namespace\",operation=\"exists\"}[$__rate_interval])) by (le)) * 1e3",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "99th Percentile",
"refId": "A"
},
{
"expr": "histogram_quantile(0.50, sum(rate(thanos_objstore_bucket_operation_duration_seconds_bucket{cluster=~\"$cluster\", namespace=~\"$namespace\",operation=\"exists\"}[$__rate_interval])) by (le)) * 1e3",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "50th Percentile",
"refId": "B"
},
{
"expr": "sum(rate(thanos_objstore_bucket_operation_duration_seconds_sum{cluster=~\"$cluster\", namespace=~\"$namespace\",operation=\"exists\"}[$__rate_interval])) * 1e3 / sum(rate(thanos_objstore_bucket_operation_duration_seconds_count{cluster=~\"$cluster\", namespace=~\"$namespace\",operation=\"exists\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "Average",
"refId": "C"
}
],
"thresholds": [ ],
"timeFrom": null,
"timeShift": null,
"title": "Op: Exists",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [ ]
},
"yaxes": [
{
"format": "ms",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": { },
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 1,
"id": 8,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"links": [ ],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [ ],
"spaceLength": 10,
"span": 4,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "histogram_quantile(0.99, sum(rate(thanos_objstore_bucket_operation_duration_seconds_bucket{cluster=~\"$cluster\", namespace=~\"$namespace\",operation=\"attributes\"}[$__rate_interval])) by (le)) * 1e3",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "99th Percentile",
"refId": "A"
},
{
"expr": "histogram_quantile(0.50, sum(rate(thanos_objstore_bucket_operation_duration_seconds_bucket{cluster=~\"$cluster\", namespace=~\"$namespace\",operation=\"attributes\"}[$__rate_interval])) by (le)) * 1e3",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "50th Percentile",
"refId": "B"
},
{
"expr": "sum(rate(thanos_objstore_bucket_operation_duration_seconds_sum{cluster=~\"$cluster\", namespace=~\"$namespace\",operation=\"attributes\"}[$__rate_interval])) * 1e3 / sum(rate(thanos_objstore_bucket_operation_duration_seconds_count{cluster=~\"$cluster\", namespace=~\"$namespace\",operation=\"attributes\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "Average",
"refId": "C"
}
],
"thresholds": [ ],
"timeFrom": null,
"timeShift": null,
"title": "Op: Attributes",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [ ]
},
"yaxes": [
{
"format": "ms",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
},
{
"aliasColors": { },
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 1,
"id": 9,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"links": [ ],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [ ],
"spaceLength": 10,
"span": 4,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "histogram_quantile(0.99, sum(rate(thanos_objstore_bucket_operation_duration_seconds_bucket{cluster=~\"$cluster\", namespace=~\"$namespace\",operation=\"upload\"}[$__rate_interval])) by (le)) * 1e3",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "99th Percentile",
"refId": "A"
},
{
"expr": "histogram_quantile(0.50, sum(rate(thanos_objstore_bucket_operation_duration_seconds_bucket{cluster=~\"$cluster\", namespace=~\"$namespace\",operation=\"upload\"}[$__rate_interval])) by (le)) * 1e3",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "50th Percentile",
"refId": "B"
},
{
"expr": "sum(rate(thanos_objstore_bucket_operation_duration_seconds_sum{cluster=~\"$cluster\", namespace=~\"$namespace\",operation=\"upload\"}[$__rate_interval])) * 1e3 / sum(rate(thanos_objstore_bucket_operation_duration_seconds_count{cluster=~\"$cluster\", namespace=~\"$namespace\",operation=\"upload\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "Average",
"refId": "C"
}
],
"thresholds": [ ],
"timeFrom": null,
"timeShift": null,
"title": "Op: Upload",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [ ]
},
"yaxes": [
{
"format": "ms",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
},
{
"aliasColors": { },
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 1,
"id": 10,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"links": [ ],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [ ],
"spaceLength": 10,
"span": 4,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "histogram_quantile(0.99, sum(rate(thanos_objstore_bucket_operation_duration_seconds_bucket{cluster=~\"$cluster\", namespace=~\"$namespace\",operation=\"delete\"}[$__rate_interval])) by (le)) * 1e3",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "99th Percentile",
"refId": "A"
},
{
"expr": "histogram_quantile(0.50, sum(rate(thanos_objstore_bucket_operation_duration_seconds_bucket{cluster=~\"$cluster\", namespace=~\"$namespace\",operation=\"delete\"}[$__rate_interval])) by (le)) * 1e3",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "50th Percentile",
"refId": "B"
},
{
"expr": "sum(rate(thanos_objstore_bucket_operation_duration_seconds_sum{cluster=~\"$cluster\", namespace=~\"$namespace\",operation=\"delete\"}[$__rate_interval])) * 1e3 / sum(rate(thanos_objstore_bucket_operation_duration_seconds_count{cluster=~\"$cluster\", namespace=~\"$namespace\",operation=\"delete\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "Average",
"refId": "C"
}
],
"thresholds": [ ],
"timeFrom": null,
"timeShift": null,
"title": "Op: Delete",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [ ]
},
"yaxes": [
{
"format": "ms",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "",
"titleSize": "h6"
}
],
"schemaVersion": 14,
"style": "dark",
"tags": [
"mimir"
],
"templating": {
"list": [
{
"current": {
"text": "default",
"value": "default"
},
"hide": 0,
"label": "Data Source",
"name": "datasource",
"options": [ ],
"query": "prometheus",
"refresh": 1,
"regex": "",
"type": "datasource"
},
{
"allValue": ".+",
"current": {
"selected": true,
"text": "All",
"value": "$__all"
},
"datasource": "$datasource",
"hide": 0,
"includeAll": true,
"label": "cluster",
"multi": true,
"name": "cluster",
"options": [ ],
"query": "label_values(cortex_build_info, cluster)",
"refresh": 1,
"regex": "",
"sort": 1,
"tagValuesQuery": "",
"tags": [ ],
"tagsQuery": "",
"type": "query",
"useTags": false
},
{
"allValue": ".+",
"current": {
"selected": true,
"text": "All",
"value": "$__all"
},
"datasource": "$datasource",
"hide": 0,
"includeAll": true,
"label": "namespace",
"multi": true,
"name": "namespace",
"options": [ ],
"query": "label_values(cortex_build_info{cluster=~\"$cluster\"}, namespace)",
"refresh": 1,
"regex": "",
"sort": 1,
"tagValuesQuery": "",
"tags": [ ],
"tagsQuery": "",
"type": "query",
"useTags": false
}
]
},
"time": {
"from": "now-1h",
"to": "now"
},
"timepicker": {
"refresh_intervals": [
"5s",
"10s",
"30s",
"1m",
"5m",
"15m",
"30m",
"1h",
"2h",
"1d"
],
"time_options": [
"5m",
"15m",
"1h",
"6h",
"12h",
"24h",
"2d",
"7d",
"30d"
]
},
"timezone": "utc",
"title": "Mimir / Object Store",
"uid": "e1324ee2a434f4158c00a9ee279d3292",
"version": 0
}

View File

@@ -1,266 +0,0 @@
{
"__requires": [
{
"id": "grafana",
"name": "Grafana",
"type": "grafana",
"version": "8.0.0"
}
],
"annotations": {
"list": [ ]
},
"editable": true,
"gnetId": null,
"graphTooltip": 1,
"hideControls": false,
"links": [
{
"asDropdown": true,
"icon": "external link",
"includeVars": true,
"keepTime": true,
"tags": [
"mimir"
],
"targetBlank": false,
"title": "Mimir dashboards",
"type": "dashboards"
}
],
"refresh": "",
"rows": [
{
"collapse": false,
"height": "250px",
"panels": [
{
"datasource": "${datasource}",
"id": 1,
"span": 12,
"targets": [
{
"expr": "max by(limit_name) (cortex_limits_defaults{cluster=~\"$cluster\",namespace=~\"$namespace\"})",
"instant": true,
"legendFormat": "",
"refId": "A"
}
],
"title": "Defaults",
"transformations": [
{
"id": "labelsToFields",
"options": { }
},
{
"id": "merge",
"options": { }
},
{
"id": "organize",
"options": {
"excludeByName": {
"Time": true
},
"indexByName": {
"Value": 1,
"limit_name": 0
}
}
},
{
"id": "sortBy",
"options": {
"fields": { },
"sort": [
{
"field": "limit_name"
}
]
}
}
],
"type": "table"
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"datasource": "${datasource}",
"id": 2,
"span": 12,
"targets": [
{
"expr": "max by(user, limit_name) (cortex_limits_overrides{cluster=~\"$cluster\",namespace=~\"$namespace\",user=~\"${tenant_id}\"})",
"instant": true,
"legendFormat": "",
"refId": "A"
}
],
"title": "Per-tenant overrides",
"transformations": [
{
"id": "labelsToFields",
"options": {
"mode": "columns",
"valueLabel": "limit_name"
}
},
{
"id": "merge",
"options": { }
},
{
"id": "organize",
"options": {
"excludeByName": {
"Time": true
},
"indexByName": {
"user": 0
}
}
}
],
"type": "table"
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "",
"titleSize": "h6"
}
],
"schemaVersion": 14,
"style": "dark",
"tags": [
"mimir"
],
"templating": {
"list": [
{
"current": {
"text": "default",
"value": "default"
},
"hide": 0,
"label": "Data Source",
"name": "datasource",
"options": [ ],
"query": "prometheus",
"refresh": 1,
"regex": "",
"type": "datasource"
},
{
"allValue": ".*",
"current": {
"text": "prod",
"value": "prod"
},
"datasource": "$datasource",
"hide": 0,
"includeAll": true,
"label": "cluster",
"multi": false,
"name": "cluster",
"options": [ ],
"query": "label_values(cortex_build_info, cluster)",
"refresh": 1,
"regex": "",
"sort": 1,
"tagValuesQuery": "",
"tags": [ ],
"tagsQuery": "",
"type": "query",
"useTags": false
},
{
"allValue": null,
"current": {
"text": "prod",
"value": "prod"
},
"datasource": "$datasource",
"hide": 0,
"includeAll": false,
"label": "namespace",
"multi": false,
"name": "namespace",
"options": [ ],
"query": "label_values(cortex_build_info{cluster=~\"$cluster\"}, namespace)",
"refresh": 1,
"regex": "",
"sort": 1,
"tagValuesQuery": "",
"tags": [ ],
"tagsQuery": "",
"type": "query",
"useTags": false
},
{
"current": {
"selected": true,
"text": ".*",
"value": ".*"
},
"hide": 0,
"label": "Tenant ID",
"name": "tenant_id",
"options": [
{
"selected": true,
"text": ".*",
"value": ".*"
}
],
"query": ".*",
"type": "textbox"
}
]
},
"time": {
"from": "now-1h",
"to": "now"
},
"timepicker": {
"refresh_intervals": [
"5s",
"10s",
"30s",
"1m",
"5m",
"15m",
"30m",
"1h",
"2h",
"1d"
],
"time_options": [
"5m",
"15m",
"1h",
"6h",
"12h",
"24h",
"2d",
"7d",
"30d"
]
},
"timezone": "utc",
"title": "Mimir / Overrides",
"uid": "1e2c358600ac53f09faea133f811b5bb",
"version": 0
}

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

View File

@@ -1,362 +0,0 @@
{
"__requires": [
{
"id": "grafana",
"name": "Grafana",
"type": "grafana",
"version": "8.0.0"
}
],
"annotations": {
"list": [ ]
},
"editable": true,
"gnetId": null,
"graphTooltip": 1,
"hideControls": false,
"links": [
{
"asDropdown": true,
"icon": "external link",
"includeVars": true,
"keepTime": true,
"tags": [
"mimir"
],
"targetBlank": false,
"title": "Mimir dashboards",
"type": "dashboards"
}
],
"refresh": "10s",
"rows": [
{
"collapse": false,
"height": "200px",
"panels": [
{
"id": 1,
"options": {
"content": "This dashboard identifies scaling-related issues by suggesting services that you might want to scale up.\nThe table that follows contains a suggested number of replicas and the reason why.\nIf the system is failing and depending on the reason, try scaling up to the specified number.\nThe specified numbers are intended as helpful guidelines when things go wrong, rather than prescriptive guidelines.\n\nReasons:\n- **sample_rate**: There are not enough replicas to handle the\n sample rate. Applies to distributor and ingesters.\n- **active_series**: There are not enough replicas\n to handle the number of active series. Applies to ingesters.\n- **cpu_usage**: There are not enough replicas\n based on the CPU usage of the jobs vs the resource requests.\n Applies to all jobs.\n- **memory_usage**: There are not enough replicas based on the memory\n usage vs the resource requests. Applies to all jobs.\n- **active_series_limits**: There are not enough replicas to hold 60% of the\n sum of all the per tenant series limits.\n- **sample_rate_limits**: There are not enough replicas to handle 60% of the\n sum of all the per tenant rate limits.\n",
"mode": "markdown"
},
"span": 12,
"title": "",
"type": "text"
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Service scaling",
"titleSize": "h6"
},
{
"collapse": false,
"height": "400px",
"panels": [
{
"aliasColors": { },
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 1,
"id": 2,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"links": [ ],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [ ],
"sort": {
"col": 0,
"desc": false
},
"spaceLength": 10,
"span": 12,
"stack": false,
"steppedLine": false,
"styles": [
{
"alias": "Time",
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"pattern": "Time",
"type": "hidden"
},
{
"alias": "Required Replicas",
"colorMode": null,
"colors": [ ],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 0,
"link": false,
"linkTargetBlank": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value",
"thresholds": [ ],
"type": "number",
"unit": "short"
},
{
"alias": "Cluster",
"colorMode": null,
"colors": [ ],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTargetBlank": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "__name__",
"thresholds": [ ],
"type": "hidden",
"unit": "short"
},
{
"alias": "Cluster",
"colorMode": null,
"colors": [ ],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTargetBlank": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "cluster",
"thresholds": [ ],
"type": "number",
"unit": "short"
},
{
"alias": "Service",
"colorMode": null,
"colors": [ ],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTargetBlank": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "deployment",
"thresholds": [ ],
"type": "number",
"unit": "short"
},
{
"alias": "Namespace",
"colorMode": null,
"colors": [ ],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTargetBlank": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "namespace",
"thresholds": [ ],
"type": "number",
"unit": "short"
},
{
"alias": "Reason",
"colorMode": null,
"colors": [ ],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTargetBlank": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "reason",
"thresholds": [ ],
"type": "number",
"unit": "short"
},
{
"alias": "",
"colorMode": null,
"colors": [ ],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"pattern": "/.*/",
"thresholds": [ ],
"type": "string",
"unit": "short"
}
],
"targets": [
{
"expr": "sort_desc(\n cluster_namespace_deployment_reason:required_replicas:count{cluster=~\"$cluster\", namespace=~\"$namespace\"}\n > ignoring(reason) group_left\n cluster_namespace_deployment:actual_replicas:count{cluster=~\"$cluster\", namespace=~\"$namespace\"}\n)\n",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "A"
}
],
"thresholds": [ ],
"timeFrom": null,
"timeShift": null,
"title": "Workload-based scaling",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"transform": "table",
"type": "table",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [ ]
},
"yaxes": [
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Scaling",
"titleSize": "h6"
}
],
"schemaVersion": 14,
"style": "dark",
"tags": [
"mimir"
],
"templating": {
"list": [
{
"current": {
"text": "default",
"value": "default"
},
"hide": 0,
"label": "Data Source",
"name": "datasource",
"options": [ ],
"query": "prometheus",
"refresh": 1,
"regex": "",
"type": "datasource"
},
{
"allValue": ".+",
"current": {
"selected": true,
"text": "All",
"value": "$__all"
},
"datasource": "$datasource",
"hide": 0,
"includeAll": true,
"label": "cluster",
"multi": true,
"name": "cluster",
"options": [ ],
"query": "label_values(cortex_build_info, cluster)",
"refresh": 1,
"regex": "",
"sort": 1,
"tagValuesQuery": "",
"tags": [ ],
"tagsQuery": "",
"type": "query",
"useTags": false
},
{
"allValue": ".+",
"current": {
"selected": true,
"text": "All",
"value": "$__all"
},
"datasource": "$datasource",
"hide": 0,
"includeAll": true,
"label": "namespace",
"multi": true,
"name": "namespace",
"options": [ ],
"query": "label_values(cortex_build_info{cluster=~\"$cluster\"}, namespace)",
"refresh": 1,
"regex": "",
"sort": 1,
"tagValuesQuery": "",
"tags": [ ],
"tagsQuery": "",
"type": "query",
"useTags": false
}
]
},
"time": {
"from": "now-1h",
"to": "now"
},
"timepicker": {
"refresh_intervals": [
"5s",
"10s",
"30s",
"1m",
"5m",
"15m",
"30m",
"1h",
"2h",
"1d"
],
"time_options": [
"5m",
"15m",
"1h",
"6h",
"12h",
"24h",
"2d",
"7d",
"30d"
]
},
"timezone": "utc",
"title": "Mimir / Scaling",
"uid": "64bbad83507b7289b514725658e10352",
"version": 0
}

View File

@@ -1,323 +0,0 @@
{
"__requires": [
{
"id": "grafana",
"name": "Grafana",
"type": "grafana",
"version": "8.0.0"
}
],
"annotations": {
"list": [ ]
},
"editable": true,
"gnetId": null,
"graphTooltip": 1,
"hideControls": false,
"links": [
{
"asDropdown": true,
"icon": "external link",
"includeVars": true,
"keepTime": true,
"tags": [
"mimir"
],
"targetBlank": false,
"title": "Mimir dashboards",
"type": "dashboards"
}
],
"refresh": "",
"rows": [
{
"collapse": false,
"height": "250px",
"panels": [
{
"datasource": "${lokidatasource}",
"fieldConfig": {
"overrides": [
{
"matcher": {
"id": "byName",
"options": "Time range"
},
"properties": [
{
"id": "mappings",
"value": [
{
"from": "",
"id": 1,
"text": "Instant query",
"to": "",
"type": 1,
"value": "0"
}
]
},
{
"id": "unit",
"value": "s"
}
]
},
{
"matcher": {
"id": "byName",
"options": "Step"
},
"properties": [
{
"id": "unit",
"value": "s"
}
]
}
]
},
"id": 1,
"span": 12,
"targets": [
{
"expr": "{cluster=~\"$cluster\",namespace=~\"$namespace\",name=~\"query-frontend.*\"} |= \"query stats\" != \"/api/v1/read\" | logfmt | user=~\"${tenant_id}\" | response_time > ${min_duration}",
"instant": false,
"legendFormat": "",
"range": true,
"refId": "A"
}
],
"title": "Slow queries",
"transformations": [
{
"id": "extractFields",
"options": {
"source": "labels"
}
},
{
"id": "calculateField",
"options": {
"alias": "Time range",
"binary": {
"left": "param_end",
"operator": "-",
"reducer": "sum",
"right": "param_start"
},
"mode": "binary",
"reduce": {
"reducer": "sum"
},
"replaceFields": false
}
},
{
"id": "organize",
"options": {
"excludeByName": {
"Line": true,
"Time": true,
"caller": true,
"cluster": true,
"container": true,
"host": true,
"id": true,
"job": true,
"labels": true,
"level": true,
"line": true,
"method": true,
"msg": true,
"name": true,
"namespace": true,
"param_end": true,
"param_start": true,
"param_time": true,
"path": true,
"pod": true,
"pod_template_hash": true,
"query_wall_time_seconds": true,
"stream": true,
"traceID": true,
"tsNs": true
},
"indexByName": {
"Time range": 3,
"param_query": 2,
"param_step": 4,
"response_time": 5,
"ts": 0,
"user": 1
},
"renameByName": {
"org_id": "Tenant ID",
"param_query": "Query",
"param_step": "Step",
"response_time": "Duration"
}
}
}
],
"type": "table"
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "",
"titleSize": "h6"
}
],
"schemaVersion": 14,
"style": "dark",
"tags": [
"mimir"
],
"templating": {
"list": [
{
"current": {
"text": "default",
"value": "default"
},
"hide": 0,
"label": "Data Source",
"name": "datasource",
"options": [ ],
"query": "prometheus",
"refresh": 1,
"regex": "",
"type": "datasource"
},
{
"allValue": ".*",
"current": {
"text": "prod",
"value": "prod"
},
"datasource": "$datasource",
"hide": 0,
"includeAll": true,
"label": "cluster",
"multi": false,
"name": "cluster",
"options": [ ],
"query": "label_values(cortex_build_info, cluster)",
"refresh": 1,
"regex": "",
"sort": 1,
"tagValuesQuery": "",
"tags": [ ],
"tagsQuery": "",
"type": "query",
"useTags": false
},
{
"allValue": null,
"current": {
"text": "prod",
"value": "prod"
},
"datasource": "$datasource",
"hide": 0,
"includeAll": false,
"label": "namespace",
"multi": false,
"name": "namespace",
"options": [ ],
"query": "label_values(cortex_build_info{cluster=~\"$cluster\"}, namespace)",
"refresh": 1,
"regex": "",
"sort": 1,
"tagValuesQuery": "",
"tags": [ ],
"tagsQuery": "",
"type": "query",
"useTags": false
},
{
"hide": 0,
"includeAll": false,
"label": "Logs datasource",
"multi": false,
"name": "lokidatasource",
"query": "loki",
"type": "datasource"
},
{
"current": {
"selected": true,
"text": "5s",
"value": "5s"
},
"hide": 0,
"label": "Min duration",
"name": "min_duration",
"options": [
{
"selected": true,
"text": "5s",
"value": "5s"
}
],
"query": "5s",
"type": "textbox"
},
{
"current": {
"selected": true,
"text": ".*",
"value": ".*"
},
"hide": 0,
"label": "Tenant ID",
"name": "tenant_id",
"options": [
{
"selected": true,
"text": ".*",
"value": ".*"
}
],
"query": ".*",
"type": "textbox"
}
]
},
"time": {
"from": "now-1h",
"to": "now"
},
"timepicker": {
"refresh_intervals": [
"5s",
"10s",
"30s",
"1m",
"5m",
"15m",
"30m",
"1h",
"2h",
"1d"
],
"time_options": [
"5m",
"15m",
"1h",
"6h",
"12h",
"24h",
"2d",
"7d",
"30d"
]
},
"timezone": "utc",
"title": "Mimir / Slow queries",
"uid": "6089e1ce1e678788f46312a0a1e647e6",
"version": 0
}

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

View File

@@ -1,4 +1,3 @@
groups:
- name: "loki_rules"
rules:
- expr: "histogram_quantile(0.99, sum(rate(loki_request_duration_seconds_bucket[5m]))
@@ -50,4 +49,4 @@ groups:
record: "cluster_namespace_job_route:loki_request_duration_seconds_sum:sum_rate"
- expr: "sum(rate(loki_request_duration_seconds_count[5m])) by (cluster, namespace,
job, route)"
record: "cluster_namespace_job_route:loki_request_duration_seconds_count:sum_rate"
record: "cluster_namespace_job_route:loki_request_duration_seconds_count:sum_rate"

View File

@@ -6,6 +6,15 @@
{{- join ", " $list }}
{{- end }}
{{- define "agent.all_namespaces" -}}
{{- $list := list }}
{{- range .Values.namespacesToMonitor }}
{{- $list = append $list (printf "%s" .) }}
{{- end }}
{{- $list = append $list .Release.Namespace }}
{{- join "|" $list }}
{{- end }}
{{- define "agent.loki_write_targets" -}}
{{- $list := list }}
{{- if .Values.local.logs.enabled }}
@@ -42,7 +51,29 @@
{{- $list = append $list ("otelcol.exporter.otlp.local.input") }}
{{- end }}
{{- if .Values.cloud.traces.enabled }}
{{- $list = append $list ("otelcol.exporter.otlp.cloud.input") }}
{{- $list = append $list ("otelcol.exporter.otlphttp.cloud.input") }}
{{- end }}
{{- join ", " $list }}
{{- end }}
{{- define "agent.all_logs" -}}
{{- $list := list }}
{{- range .Values.logs.retain }}
{{- $list = append $list . }}
{{- end }}
{{- range .Values.logs.extraLogs }}
{{- $list = append $list . }}
{{- end }}
{{- join "|" $list }}
{{- end }}
{{- define "agent.all_metrics" -}}
{{- $list := list }}
{{- range .Values.metrics.retain }}
{{- $list = append $list . }}
{{- end }}
{{- range .Values.metrics.extraMetrics }}
{{- $list = append $list . }}
{{- end }}
{{- join "|" $list }}
{{- end }}

View File

@@ -40,10 +40,12 @@ data:
{{- if or .Values.local.logs.enabled .Values.cloud.logs.enabled }}
// Logs
{{- if .Values.cloud.logs.enabled }}
remote.kubernetes.secret "logs_credentials" {
namespace = "{{- $.Release.Namespace -}}"
name = "{{- .Values.cloud.logs.secret -}}"
}
{{- end }}
loki.source.kubernetes "pods" {
clustering {
@@ -57,9 +59,9 @@ data:
loki.process "filter" {
forward_to = [ {{ include "agent.loki_write_targets" . }} ]
{{- if not (empty .Values.logs.retain) }}
{{- if or (not (empty .Values.logs.retain)) (not (empty .Values.logs.extraLogs)) }}
stage.match {
selector = "{cluster=\"{{- .Values.clusterLabelValue -}}\", namespace=~\"{{- join "|" .Values.namespacesToMonitor -}}|{{- $.Release.Namespace -}}\", pod=~\"loki.*\"} !~ \"{{ join "|" .Values.logs.retain }}\""
selector = "{cluster=\"{{- .Values.clusterLabelValue -}}\", namespace=~\"{{- join "|" .Values.namespacesToMonitor -}}|{{- $.Release.Namespace -}}\", pod=~\"loki.*\"} !~ \"{{ include "agent.all_logs" . }}\""
action = "drop"
}
{{- end }}
@@ -80,10 +82,12 @@ data:
{{- if or .Values.local.metrics.enabled .Values.cloud.metrics.enabled }}
// Metrics
{{- if .Values.cloud.metrics.enabled }}
remote.kubernetes.secret "metrics_credentials" {
namespace = "{{- $.Release.Namespace -}}"
name = "{{- .Values.cloud.metrics.secret -}}"
}
{{- end }}
discovery.kubernetes "metric_pods" {
role = "pod"
@@ -133,7 +137,14 @@ data:
prometheus.relabel "filter" {
rule {
source_labels = ["__name__"]
regex = "({{ join "|" .Values.metrics.retain }})"
regex = "({{ include "agent.all_metrics" . }})"
action = "keep"
}
rule {
source_labels = ["namespace"]
regex = "{{ include "agent.all_namespaces" . }}"
action = "keep"
}
@@ -154,6 +165,10 @@ data:
// Based on https://github.com/Chewie/loutretelecom-manifests/blob/main/manifests/addons/monitoring/config.river
discovery.kubernetes "all_nodes" {
role = "node"
namespaces {
own_namespace = true
names = [ {{ include "agent.namespaces" . }} ]
}
}
discovery.relabel "all_nodes" {
@@ -267,10 +282,12 @@ data:
{{- if or .Values.local.traces.enabled .Values.cloud.traces.enabled }}
// Traces
{{- if .Values.cloud.traces.enabled }}
remote.kubernetes.secret "traces_credentials" {
namespace = "{{- $.Release.Namespace -}}"
name = "{{- .Values.cloud.traces.secret -}}"
}
{{- end }}
// Shamelessly copied from https://github.com/grafana/intro-to-mlt/blob/main/agent/config.river
otelcol.receiver.otlp "otlp_receiver" {
@@ -278,13 +295,23 @@ data:
// In this case, the Agent is listening on all available bindable addresses on port 4317 (which is the
// default OTLP gRPC port) for the OTLP protocol.
grpc {
endpoint = "0.0.0.0:4317"
endpoint = "0.0.0.0:4317"
}
// We define where to send the output of all ingested traces. In this case, to the OpenTelemetry batch processor
// named 'default'.
output {
traces = [otelcol.processor.batch.default.input]
traces = [otelcol.processor.batch.default.input]
}
}
otelcol.receiver.jaeger "jaeger" {
protocols {
thrift_http {}
}
output {
traces = [otelcol.processor.batch.default.input]
}
}
@@ -305,7 +332,7 @@ data:
{{- if .Values.local.logs.enabled }}
loki.write "local" {
endpoint {
url = "http://loki-gateway.{{- .Release.Namespace -}}.svc.cluster.local:80/loki/api/v1/push"
url = "http://{{- .Release.Namespace -}}-loki-gateway.{{- .Release.Namespace -}}.svc.cluster.local:80/loki/api/v1/push"
}
}
{{- end }}
@@ -318,25 +345,6 @@ data:
}
{{- end }}
{{- if or .Values.local.traces.enabled .Values.cloud.traces.enabled }}
// The OpenTelemetry exporter exports processed trace spans to another target that is listening for OTLP format traces.
// A unique label, 'local', is added to uniquely identify this exporter.
otelcol.exporter.otlp "local" {
// Define the client for exporting.
client {
// Send to the locally running Tempo instance, on port 4317 (OTLP gRPC).
endpoint = "meta-tempo-distributor:4317"
// Configure TLS settings for communicating with the endpoint.
tls {
// The connection is insecure.
insecure = true
// Do not verify TLS certificates when connecting.
insecure_skip_verify = true
}
}
}
{{- end }}
{{- if .Values.cloud.logs.enabled }}
loki.write "cloud" {
endpoint {
@@ -362,7 +370,7 @@ data:
{{- end }}
{{- if .Values.cloud.traces.enabled }}
otelcol.exporter.otlp "cloud" {
otelcol.exporter.otlphttp "cloud" {
client {
endpoint = nonsensitive(remote.kubernetes.secret.traces_credentials.data["endpoint"])
auth = otelcol.auth.basic.creds.handler

View File

@@ -27,58 +27,6 @@ data:
path: /var/lib/grafana/dashboards/loki-2
orgId: 1
type: file
{{- end }}
{{- if .Values.dashboards.metrics.enabled }}
- disableDeletion: true
editable: false
folder: Mimir
name: mimir-1
options:
path: /var/lib/grafana/dashboards/mimir-1
orgId: 1
type: file
- disableDeletion: true
editable: false
folder: Mimir
name: mimir-2
options:
path: /var/lib/grafana/dashboards/mimir-2
orgId: 1
type: file
- disableDeletion: true
editable: false
folder: Mimir
name: mimir-3
options:
path: /var/lib/grafana/dashboards/mimir-3
orgId: 1
type: file
- disableDeletion: true
editable: false
folder: Mimir
name: mimir-4
options:
path: /var/lib/grafana/dashboards/mimir-4
orgId: 1
type: file
- disableDeletion: true
editable: false
folder: Mimir
name: mimir-5
options:
path: /var/lib/grafana/dashboards/mimir-5
orgId: 1
type: file
{{- end }}
{{- if .Values.dashboards.traces.enabled }}
- disableDeletion: true
editable: false
folder: Tempo
name: tempo-1
options:
path: /var/lib/grafana/dashboards/tempo-1
orgId: 1
type: file
{{- end }}
- disableDeletion: true
editable: false

View File

@@ -12,7 +12,7 @@ data:
# List of data sources to delete from the database.
deleteDatasources:
- name: Loki
- name: Loki
orgId: 1
# List of data sources to insert/update depending on what's
@@ -32,7 +32,7 @@ data:
uid: loki_ds
# <string> Sets the data source's URL, including the
# port.
url: http://loki-gateway.{{- $.Release.Namespace -}}.svc.cluster.local
url: http://{{- $.Release.Namespace -}}-loki-gateway.{{- $.Release.Namespace -}}.svc.cluster.local
# <bool> Toggles whether the data source is pre-selected
# for new panels. You can set only one default
# data source per organization.
@@ -61,6 +61,10 @@ data:
# <bool> Allows users to edit data sources from the
# Grafana UI.
editable: true
# Extra config.
jsonData:
# Scrape interval
timeInterval: 1m
{{- end }}
{{- if .Values.local.traces.enabled }}
- name: Tempo

View File

@@ -75,22 +75,6 @@ spec:
- mountPath: /var/lib/grafana/dashboards/loki-2
name: loki-dashboards-2
{{- end }}
{{- if .Values.dashboards.metrics.enabled }}
- mountPath: /var/lib/grafana/dashboards/mimir-1
name: mimir-dashboards-1
- mountPath: /var/lib/grafana/dashboards/mimir-2
name: mimir-dashboards-2
- mountPath: /var/lib/grafana/dashboards/mimir-3
name: mimir-dashboards-3
- mountPath: /var/lib/grafana/dashboards/mimir-4
name: mimir-dashboards-4
- mountPath: /var/lib/grafana/dashboards/mimir-5
name: mimir-dashboards-5
{{- end }}
{{- if .Values.dashboards.traces.enabled }}
- mountPath: /var/lib/grafana/dashboards/tempo-1
name: tempo-dashboards-1
{{- end }}
- mountPath: /var/lib/grafana/dashboards/agent-1
name: agent-dashboards-1
volumes:
@@ -111,28 +95,6 @@ spec:
configMap:
name: loki-dashboards-2
{{- end }}
{{- if .Values.dashboards.metrics.enabled }}
- name: mimir-dashboards-1
configMap:
name: mimir-dashboards-1
- name: mimir-dashboards-2
configMap:
name: mimir-dashboards-2
- name: mimir-dashboards-3
configMap:
name: mimir-dashboards-3
- name: mimir-dashboards-4
configMap:
name: mimir-dashboards-4
- name: mimir-dashboards-5
configMap:
name: mimir-dashboards-5
{{- end }}
{{- if .Values.dashboards.traces.enabled }}
- name: tempo-dashboards-1
configMap:
name: tempo-dashboards-1
{{- end }}
- name: agent-dashboards-1
configMap:
name: agent-dashboards-1

View File

@@ -1,19 +0,0 @@
{{- if and .Values.local.grafana.enabled .Values.dashboards.metrics.enabled }}
---
apiVersion: v1
kind: ConfigMap
metadata:
name: mimir-dashboards-1
namespace: {{ $.Release.Namespace }}
data:
"mimir-alertmanager-resources.json": |
{{ $.Files.Get "src/dashboards/mimir-alertmanager-resources.json" | fromJson | toJson }}
"mimir-alertmanager.json": |
{{ $.Files.Get "src/dashboards/mimir-alertmanager.json" | fromJson | toJson }}
"mimir-compactor-resources.json": |
{{ $.Files.Get "src/dashboards/mimir-compactor-resources.json" | fromJson | toJson }}
"mimir-compactor.json": |
{{ $.Files.Get "src/dashboards/mimir-compactor.json" | fromJson | toJson }}
"mimir-config.json": |
{{ $.Files.Get "src/dashboards/mimir-config.json" | fromJson | toJson }}
{{- end }}

View File

@@ -1,19 +0,0 @@
{{- if and .Values.local.grafana.enabled .Values.dashboards.metrics.enabled }}
---
apiVersion: v1
kind: ConfigMap
metadata:
name: mimir-dashboards-2
namespace: {{ $.Release.Namespace }}
data:
"mimir-object-store.json": |
{{ $.Files.Get "src/dashboards/mimir-object-store.json" | fromJson | toJson }}
"mimir-overrides.json": |
{{ $.Files.Get "src/dashboards/mimir-overrides.json" | fromJson | toJson }}
"mimir-overview-networking.json": |
{{ $.Files.Get "src/dashboards/mimir-overview-networking.json" | fromJson | toJson }}
"mimir-overview-resources.json": |
{{ $.Files.Get "src/dashboards/mimir-overview-resources.json" | fromJson | toJson }}
"mimir-overview.json": |
{{ $.Files.Get "src/dashboards/mimir-overview.json" | fromJson | toJson }}
{{- end }}

View File

@@ -1,19 +0,0 @@
{{- if and .Values.local.grafana.enabled .Values.dashboards.metrics.enabled }}
---
apiVersion: v1
kind: ConfigMap
metadata:
name: mimir-dashboards-3
namespace: {{ $.Release.Namespace }}
data:
"mimir-queries.json": |
{{ $.Files.Get "src/dashboards/mimir-queries.json" | fromJson | toJson }}
"mimir-reads-networking.json": |
{{ $.Files.Get "src/dashboards/mimir-reads-networking.json" | fromJson | toJson }}
"mimir-reads-resources.json": |
{{ $.Files.Get "src/dashboards/mimir-reads-resources.json" | fromJson | toJson }}
"mimir-reads.json": |
{{ $.Files.Get "src/dashboards/mimir-reads.json" | fromJson | toJson }}
"mimir-remote-ruler-reads-resources.json": |
{{ $.Files.Get "src/dashboards/mimir-remote-ruler-reads-resources.json" | fromJson | toJson }}
{{- end }}

View File

@@ -1,19 +0,0 @@
{{- if and .Values.local.grafana.enabled .Values.dashboards.metrics.enabled }}
---
apiVersion: v1
kind: ConfigMap
metadata:
name: mimir-dashboards-4
namespace: {{ $.Release.Namespace }}
data:
"mimir-remote-ruler-reads.json": |
{{ $.Files.Get "src/dashboards/mimir-remote-ruler-reads.json" | fromJson | toJson }}
"mimir-rollout-progress.json": |
{{ $.Files.Get "src/dashboards/mimir-rollout-progress.json" | fromJson | toJson }}
"mimir-ruler.json": |
{{ $.Files.Get "src/dashboards/mimir-ruler.json" | fromJson | toJson }}
"mimir-scaling.json": |
{{ $.Files.Get "src/dashboards/mimir-scaling.json" | fromJson | toJson }}
"mimir-slow-queries.json": |
{{ $.Files.Get "src/dashboards/mimir-slow-queries.json" | fromJson | toJson }}
{{- end }}

View File

@@ -1,19 +0,0 @@
{{- if and .Values.local.grafana.enabled .Values.dashboards.metrics.enabled }}
---
apiVersion: v1
kind: ConfigMap
metadata:
name: mimir-dashboards-5
namespace: {{ $.Release.Namespace }}
data:
"mimir-tenants.json": |
{{ $.Files.Get "src/dashboards/mimir-tenants.json" | fromJson | toJson }}
"mimir-top-tenants.json": |
{{ $.Files.Get "src/dashboards/mimir-top-tenants.json" | fromJson | toJson }}
"mimir-writes-networking.json": |
{{ $.Files.Get "src/dashboards/mimir-writes-networking.json" | fromJson | toJson }}
"mimir-writes-resources.json": |
{{ $.Files.Get "src/dashboards/mimir-writes-resources.json" | fromJson | toJson }}
"mimir-writes.json": |
{{ $.Files.Get "src/dashboards/mimir-writes.json" | fromJson | toJson }}
{{- end }}

View File

@@ -1,21 +0,0 @@
{{- if and .Values.local.grafana.enabled .Values.dashboards.traces.enabled }}
---
apiVersion: v1
kind: ConfigMap
metadata:
name: tempo-dashboards-1
namespace: {{ $.Release.Namespace }}
data:
"tempo-operational.json": |
{{ $.Files.Get "src/dashboards/tempo-operational.json" | fromJson | toJson }}
"tempo-reads.json": |
{{ $.Files.Get "src/dashboards/tempo-reads.json" | fromJson | toJson }}
"tempo-resources.json": |
{{ $.Files.Get "src/dashboards/tempo-resources.json" | fromJson | toJson }}
"tempo-rollout-progress.json": |
{{ $.Files.Get "src/dashboards/tempo-rollout-progress.json" | fromJson | toJson }}
"tempo-tenants.json": |
{{ $.Files.Get "src/dashboards/tempo-tenants.json" | fromJson | toJson }}
"tempo-writes.json": |
{{ $.Files.Get "src/dashboards/tempo-writes.json" | fromJson | toJson }}
{{- end }}

View File

@@ -49,6 +49,9 @@ spec:
- containerPort: 7946
name: memberlist
protocol: TCP
envFrom:
- secretRef:
name: minio
readinessProbe:
failureThreshold: 3
httpGet:

View File

@@ -70,12 +70,14 @@ logs:
- caller=metrics.go
# This shows any errors
- level=error
# This shows the ingest requests and is very noisy. Uncomment to include.
# - caller=push.go
# Log lines for delete requests
- delete request for user added
- Started processing delete request
- delete request for user marked as processed
# This shows the ingest requests and is very noisy. Uncomment to include.
# - caller=push.go
# Additional log lines to retain
extraLogs: []
metrics:
# The list of metrics to retain for logging dashboards
@@ -103,7 +105,9 @@ metrics:
- go_memstats_heap_inuse_bytes
- kubelet_volume_stats_used_bytes
- kubelet_volume_stats_capacity_bytes
- kube_deployment_created
- kube_persistentvolumeclaim_labels
- kube_pod_container_info
- kube_pod_container_resource_requests
- kube_pod_container_status_last_terminated_reason
- kube_pod_container_status_restarts_total
@@ -136,6 +140,7 @@ metrics:
- loki_distributor_bytes_received_total
- loki_distributor_lines_received_total
- loki_distributor_structured_metadata_bytes_received_total
- loki_index_request_duration_seconds_count
- loki_ingester_chunk_age_seconds_bucket
- loki_ingester_chunk_age_seconds_count
- loki_ingester_chunk_age_seconds_sum
@@ -172,8 +177,10 @@ metrics:
- node_disk_read_bytes_total
- node_disk_written_bytes_total
- promtail_custom_bad_words_total
# Additional metrics to retain
extraMetrics: []
# Set enabled = true to add the default logs/metrics/traces dashboards to the local Grafana
# Set enabled = true to add the default logs dashboards to the local Grafana
dashboards:
logs:
enabled: true
@@ -182,11 +189,6 @@ dashboards:
traces:
enabled: true
global:
minio:
rootUser: "rootuser"
rootPassword: "rootpassword"
kubeStateMetrics:
# Scrape https://github.com/kubernetes/kube-state-metrics by default
enabled: true
@@ -222,9 +224,9 @@ loki:
common:
storage:
s3:
access_key_id: "{{ .Values.global.minio.rootUser }}"
access_key_id: "${rootUser}"
endpoint: "{{ .Release.Name }}-minio.{{ .Release.Namespace }}.svc:9000"
secret_access_key: "{{ .Values.global.minio.rootPassword }}"
secret_access_key: "${rootPassword}"
compactor:
retention_enabled: true
delete_request_store: s3
@@ -247,8 +249,24 @@ loki:
installOperator: false
lokiCanary:
enabled: false
test:
enabled: false
write:
extraArgs:
- "-config.expand-env=true"
extraEnvFrom:
- secretRef:
name: "minio"
read:
extraArgs:
- "-config.expand-env=true"
extraEnvFrom:
- secretRef:
name: "minio"
backend:
extraArgs:
- "-config.expand-env=true"
extraEnvFrom:
- secretRef:
name: "minio"
alloy:
alloy:
@@ -264,6 +282,15 @@ alloy:
memory: '600Mi'
limits:
memory: '4Gi'
extraPorts:
- name: "otel"
port: 4317
targetPort: 4317
protocol: "TCP"
- name: "thrifthttp"
port: 14268
targetPort: 14268
protocol: "TCP"
controller:
type: "statefulset"
autoscaling:
@@ -276,30 +303,31 @@ alloy:
mimir-distributed:
minio:
enabled: false
global:
extraEnvFrom:
- secretRef:
name: "minio"
mimir:
structuredConfig:
alertmanager_storage:
s3:
bucket_name: mimir-ruler
access_key_id: "{{ .Values.global.minio.rootUser }}"
endpoint: "{{ .Release.Name }}-minio.{{ .Release.Namespace }}.svc:9000"
secret_access_key: "{{ .Values.global.minio.rootPassword }}"
insecure: true
blocks_storage:
backend: s3
s3:
bucket_name: mimir-tsdb
access_key_id: "{{ .Values.global.minio.rootUser }}"
endpoint: "{{ .Release.Name }}-minio.{{ .Release.Namespace }}.svc:9000"
secret_access_key: "{{ .Values.global.minio.rootPassword }}"
insecure: true
ruler_storage:
s3:
bucket_name: mimir-ruler
access_key_id: "{{ .Values.global.minio.rootUser }}"
endpoint: "{{ .Release.Name }}-minio.{{ .Release.Namespace }}.svc:9000"
secret_access_key: "{{ .Values.global.minio.rootPassword }}"
insecure: true
common:
storage:
backend: s3
s3:
bucket_name: mimir-ruler
access_key_id: "${rootUser}"
endpoint: "{{ .Release.Name }}-minio.{{ .Release.Namespace }}.svc:9000"
secret_access_key: "${rootPassword}"
insecure: true
limits:
compactor_blocks_retention_period: 30d
@@ -312,12 +340,39 @@ tempo-distributed:
s3:
bucket: tempo
endpoint: "{{ .Release.Name }}-minio.{{ .Release.Namespace }}.svc:9000"
access_key: "{{ .Values.global.minio.rootUser }}"
secret_key: "{{ .Values.global.minio.rootPassword }}"
access_key: "${rootUser}"
secret_key: "${rootPassword}"
insecure: true
compactor:
compaction:
block_retention: 30d
distributor:
extraArgs:
- "-config.expand-env=true"
extraEnvFrom:
- secretRef:
name: "minio"
ingester:
extraArgs:
- "-config.expand-env=true"
extraEnvFrom:
- secretRef:
name: "minio"
compactor:
extraArgs:
- "-config.expand-env=true"
extraEnvFrom:
- secretRef:
name: "minio"
querier:
extraArgs:
- "-config.expand-env=true"
extraEnvFrom:
- secretRef:
name: "minio"
queryFrontend:
extraArgs:
- "-config.expand-env=true"
extraEnvFrom:
- secretRef:
name: "minio"
traces:
otlp:
http:
@@ -326,8 +381,7 @@ tempo-distributed:
enabled: true
minio:
rootUser: rootuser
rootPassword: rootpassword
existingSecret: "minio"
buckets:
- name: loki-chunks
policy: none

View File

@@ -25,21 +25,21 @@
```
kubectl create secret generic logs -n meta \
--from-literal=username=<logs username> \
--from-literal=password=<token>
--from-literal=password=<token> \
--from-literal=endpoint='https://logs-prod-us-central1.grafana.net/loki/api/v1/push'
kubectl create secret generic metrics -n meta \
--from-literal=username=<metrics username> \
--from-literal=password=<token>
--from-literal=password=<token> \
--from-literal=endpoint='https://prometheus-us-central1.grafana.net/api/prom/push'
kubectl create secret generic traces -n meta \
--from-literal=username=<traces username> \
--from-literal=password=<token>
--from-literal=endpoint='https://tempo-us-central1.grafana.net/tempo'
--from-literal=username=<OTLP instance ID> \
--from-literal=password=<token> \
--from-literal=endpoint='https://otlp-gateway-prod-us-east-0.grafana.net/otlp'
```
The logs, metrics and traces usernames are the `User / Username / Instance IDs` of the Loki, Prometheus/Mimir and Tempo instances in Grafana Cloud. From `Home` in Grafana click on `Stacks`. Then go to the `Details` pages of Loki, Prometheus/Mimir and Tempo.
The logs, metrics and traces usernames are the `User / Username / Instance IDs` of the Loki, Prometheus/Mimir and OpenTelemetry instances in Grafana Cloud. From `Home` in Grafana click on `Stacks`. Then go to the `Details` pages of Loki and Prometheus/Mimir. For OpenTelemetry go to the `Configure` page.
1. Create a values.yaml file based on the [default one](../charts/meta-monitoring/values.yaml). Fill in the names of the secrets created above as needed. An example minimal values.yaml looks like this:
@@ -67,6 +67,14 @@
kubectl create namespace meta
```
1. Create a secret named `minio` with the user and password for the local Minio:
```
kubectl create secret generic minio -n meta \
--from-literal=rootPassword=<password> \
--from-literal=rootUser=<user>
```
1. Create a values.yaml file based on the [default one](../charts/meta-monitoring/values.yaml). An example minimal values.yaml looks like this:
```
@@ -131,7 +139,9 @@
## Installing the dashboards on Grafana Cloud
For each of the dashboard files in charts/meta-monitoring/src/dashboards do the following:
Only the files for the application monitored have to be copied. When monitoring Loki import dashboard files starting with 'loki-'.
For each of the dashboard files in charts/meta-monitoring/src/dashboards folder do the following:
1. Click on 'Dashboards' in Grafana
@@ -141,8 +151,6 @@ For each of the dashboard files in charts/meta-monitoring/src/dashboards do the
1. Click 'Import'
Only the files for the application monitored have to be copied. When monitoring Loki import dashboard files starting with 'loki-'.
## Installing the rules on Grafana Cloud
1. Select the rules files in charts/meta-monitoring/src/rules for the application to monitor. When monitoring Loki use loki-rules.yaml.
@@ -155,12 +163,31 @@ Only the files for the application monitored have to be copied. When monitoring
1. Use them to load the rules using mimirtool as follows:
```
```
mimirtool rules load --address=<your_cloud_prometheus_endpoint> --id=<your_instance_id> --key=<your_cloud_access_policy_token> *.yaml
```
1. To check the rules you have uploaded run:
```
```
mimirtool rules print --address=<your_cloud_prometheus_endpoint> --id=<your_instance_id> --key=<your_cloud_access_policy_token>
```
```
## Configure Loki to send traces
1. In the Loki config enable tracing:
```
loki:
tracing:
enabled: true
```
1. Add the following environment variables to your Loki binaries. When using the Loki Helm chart these can be added using the `extraEnv` setting for the Loki components.
1. JAEGER_ENDPOINT: http address of the mmc-alloy service installed by the meta-monitoring chart, for example "http://mmc-alloy:14268/api/traces"
1. JAEGER_AGENT_TAGS: extra tags you would like to add to the spans, for example 'cluster="abc",namespace="def"'
1. JAEGER_SAMPLER_TYPE: the sampling strategy, for example to sample all use 'const' with a value of 1 for the next environment variable
1. JAEGER_SAMPLER_PARAM: 1
1. If Loki is installed in a different namespace you can create an [ExternalName service](https://kubernetes.io/docs/concepts/services-networking/service/#externalname) in Kubernetes to point to the mmc-alloy service in the meta monitoring namespace

20
scripts/clone_loki_mixin.sh Executable file
View File

@@ -0,0 +1,20 @@
#!/usr/bin/env bash
clean_up() {
test -d "$tmp_dir" && rm -fr "$tmp_dir"
}
here=${PWD}
tmp_dir=$( mktemp -d -t my-script )
cd $tmp_dir
echo "Cloning Loki"
git clone --filter=blob:none --no-checkout "https://github.com/grafana/loki"
cd loki
git sparse-checkout init --cone
git checkout main
git sparse-checkout set production/loki-mixin
echo "Copying production/loki-mixin to ${here}"
cp -r production ${here}

View File

@@ -0,0 +1,18 @@
(import 'dashboards.libsonnet') +
(import 'alerts.libsonnet') +
(import 'recording_rules.libsonnet') + {
grafanaDashboardFolder: 'Loki Meta Monitoring',
_config+:: {
internal_components: false,
// The Meta Monitoring helm chart uses Grafana Alloy instead of promtail
promtail+: {
enabled: false,
},
meta_monitoring+: {
enabled: true,
},
},
}