Temp check in

Signed-off-by: Michel Hollands <michel.hollands@gmail.com>
Put the annotations in the right place
2024-05-31 13:54:18 +01:00 · 2024-05-22 17:14:59 +01:00 · 2024-05-22 17:06:01 +01:00 · 2024-05-22 16:58:07 +01:00 · 2024-05-22 16:57:40 +01:00 · 2024-05-22 16:56:50 +01:00
21 changed files with 288 additions and 121 deletions
--- a/.github/configs/cluster-config.yaml
+++ b/.github/configs/cluster-config.yaml
@@ -0,0 +1,19 @@
+apiVersion: kind.x-k8s.io/v1alpha4
+kind: Cluster
+nodes:
+  - role: control-plane
+    kubeadmConfigPatches:
+      - |
+        kind: ClusterConfiguration
+        controllerManager:
+          extraArgs:
+            bind-address: 0.0.0.0
+            secure-port: "10257"
+        scheduler:
+          extraArgs:
+            bind-address: 0.0.0.0
+            secure-port: "10259"
+      - |
+        kind: KubeProxyConfiguration
+        metricsBindAddress: 0.0.0.0:10249
+  - role: worker
--- a/.github/workflows/helm-ci.yml
+++ b/.github/workflows/helm-ci.yml
@@ -1,6 +1,7 @@
 ---
 name: helm-ci
 on:
+  workflow_dispatch:
  pull_request:
    paths:
      - "charts/meta-monitoring/**"
@@ -19,48 +20,48 @@ jobs:
      - name: Lint Yaml
        run: make helm-lint

-  # call-test:
-  #   name: Test Helm Chart
-  #   runs-on: ubuntu-latest
-  #   steps:
-  #     - name: Checkout
-  #       uses: actions/checkout@v3
-  #       with:
-  #         fetch-depth: 0
+  call-test:
+    name: Test Helm Chart
+    runs-on: ubuntu-latest
+    steps:
+      - name: Checkout
+        uses: actions/checkout@v4
+        with:
+          fetch-depth: 0

-  #     - name: Set up Helm
-  #       uses: azure/setup-helm@v3
-  #       with:
-  #         version: v3.8.2
+      - name: Set up Helm
+        uses: azure/setup-helm@v3
+        with:
+          version: v3.14.0

-  #     # Python is required because `ct lint` runs Yamale (https://github.com/23andMe/Yamale) and
-  #     # yamllint (https://github.com/adrienverge/yamllint) which require Python
-  #     - name: Set up Python
-  #       uses: actions/setup-python@v4
-  #       with:
-  #         python-version: 3.7
+      # Python is required because `ct lint` runs Yamale (https://github.com/23andMe/Yamale) and
+      # yamllint (https://github.com/adrienverge/yamllint) which require Python
+      - name: Set up Python
+        uses: actions/setup-python@v4
+        with:
+          python-version: 3.9

-  #     - name: Set up chart-testing
-  #       uses: helm/chart-testing-action@v2.4.0
+      - name: Set up chart-testing
+        uses: helm/chart-testing-action@v2

-  #     - name: Run chart-testing (list-changed)
-  #       id: list-changed
-  #       run: |
-  #         changed=$(ct list-changed --config "${CT_CONFIGFILE}")
-  #         if [[ -n "$changed" ]]; then
-  #           echo "changed=true" >> $GITHUB_OUTPUT
-  #         fi
+      - name: Run chart-testing (list-changed)
+        id: list-changed
+        run: |
+          changed=$(ct list-changed --config "${CT_CONFIGFILE}")
+          if [[ -n "$changed" ]]; then
+            echo "changed=true" >> $GITHUB_OUTPUT
+          fi

-  #     - name: Run chart-testing (lint)
-  #       run: ct lint --config "${CT_CONFIGFILE}" --check-version-increment=false
+      - name: Run chart-testing (lint)
+        run: ct lint --config "${CT_CONFIGFILE}" --check-version-increment=false

-  #     - name: Create kind cluster
-  #       uses: helm/kind-action@v1.8.0
-  #       if: steps.list-changed.outputs.changed == 'true'
-  #       with:
-  #         config: tools/kind.config
+      - name: Create kind cluster
+        uses: helm/kind-action@v1
+        if: steps.list-changed.outputs.changed == 'true'
+        with:
+           config: "${{ github.workspace }}/.github/configs/cluster-config.yaml"

-  #     - name: Run chart-testing (install)
-  #       run: |
-  #         changed=$(ct list-changed --config "${CT_CONFIGFILE}")
-  #         ct install --config "${CT_CONFIGFILE}"
+      - name: Run chart-testing (install)
+        run: |
+          changed=$(ct list-changed --config "${CT_CONFIGFILE}")
+          ct install --config "${CT_CONFIGFILE}"
--- a/README.md
+++ b/README.md
@@ -1,8 +1,6 @@
 # meta-monitoring-chart

-This is a meta-monitoring chart for Loki.
-
-Note that this is pre-production software at the moment.
+This is a meta-monitoring chart for Loki, specifically Loki installed via the Loki helm chart.

 ## Local and cloud modes

@@ -11,19 +9,15 @@ to small Loki, Mimir and Tempo installations running in the meta-monitoring name

 ![local mode](docs/images/Meta%20monitoring%20local.png)

-To enable local mode set `local.<logs|metrics|traces>.enabled` to true.
-
 In the cloud mode the logs, metrics and/or traces are sent to Grafana Cloud.

 ![cloud mode](docs/images/Meta%20monitoring%20cloud.png)

-To enable cloud mode set `cloud.<logs|metrics|traces>.enabled` to true. The `endpoint`, `username` and `password` settings for your Grafana Cloud logs, metrics and traces instances have to be filled in as well.
-
 Both modes can be enabled at the same time. Cloud mode is preferred.

 ## Installation

-For more instructions including how to update the chart go to the [installation](docs/installation.md) page.
+For more instructions including how to install the chart go to the [installation](docs/installation.md) page.

 ## Supported features

@@ -33,8 +27,7 @@ For more instructions including how to update the chart go to the [installation]
 - Specify PII regexes that are applied to logs before they are sent to Loki (cloud or local). The capture group in the regex is replaced with *****.
 - a Grafana instance is installed (when local mode is used) with the relevant datasources installed. The following dashboards are installed:
  - logs dashboards
-  - agent dashboards
- Retention is set to 24 hours
+  - Alloy dashboards

 Most of these features are enabled by default. See the values.yaml file for how to enable/disable them.

@@ -42,8 +35,7 @@ Most of these features are enabled by default. See the values.yaml file for how

 - This has not been tested on Openshift yet.
 - The underlying Loki, Mimir and Tempo are at the default size installed by the Helm chart. This might need changing when monitoring bigger Loki, Mimir or Tempo installations.
- MinIO is used as storage at the moment with a limited retention. At the moment this chart cannot be used for monitoring over longer periods.
- Agent self monitoring is not done at the moment.
+- MinIO is used as storage for the local mode at the moment with a limited retention. At the moment this chart cannot be used for monitoring over longer periods.

 ## Developer help topics

--- a/charts/meta-monitoring/Chart.lock
+++ b/charts/meta-monitoring/Chart.lock
@@ -1,10 +1,10 @@
 dependencies:
 - name: loki
  repository: https://grafana.github.io/helm-charts
-  version: 6.5.1
+  version: 6.5.2
 - name: alloy
  repository: https://grafana.github.io/helm-charts
-  version: 0.1.1
+  version: 0.3.0
 - name: mimir-distributed
  repository: https://grafana.github.io/helm-charts
  version: 5.3.0
@@ -14,5 +14,5 @@ dependencies:
 - name: minio
  repository: https://charts.min.io
  version: 5.2.0
-digest: sha256:e0c7af6d328fe35f4b9a3557235f458d92225b84b1366dbb77c4626d3cdb5be9
-generated: "2024-05-09T07:02:42.911579524Z"
+digest: sha256:0eaa504de24724505fa4fff5169cd86628465ec366c253392c4ed24f15902b6b
+generated: "2024-05-22T07:02:54.054326052Z"
--- a/charts/meta-monitoring/Chart.yaml
+++ b/charts/meta-monitoring/Chart.yaml
@@ -13,7 +13,7 @@ type: application
 # This is the chart version. This version number should be incremented each time you make changes
 # to the chart and its templates, including the app version.
 # Versions are expected to follow Semantic Versioning (https://semver.org/)
-version: 0.0.3
+version: 1.0.0
 # This is the version number of the application being deployed. This version number should be
 # incremented each time you make changes to the application. Versions are not expected to
 # follow Semantic Versioning. They should reflect the version the application is using.
@@ -22,11 +22,11 @@ appVersion: "0.0.1"
 dependencies:
 - name: loki
  repository: https://grafana.github.io/helm-charts
-  version: 6.5.1
+  version: 6.5.2
  condition: local.logs.enabled
 - name: alloy
  repository: https://grafana.github.io/helm-charts
-  version: 0.1.1
+  version: 0.3.0
 - name: mimir-distributed
  repository: https://grafana.github.io/helm-charts
  version: 5.3.0
--- a/charts/meta-monitoring/charts/alloy-0.1.1.tgz
+++ b/charts/meta-monitoring/charts/alloy-0.1.1.tgz
--- a/charts/meta-monitoring/charts/alloy-0.3.0.tgz
+++ b/charts/meta-monitoring/charts/alloy-0.3.0.tgz
--- a/charts/meta-monitoring/charts/loki-6.5.1.tgz
+++ b/charts/meta-monitoring/charts/loki-6.5.1.tgz
--- a/charts/meta-monitoring/charts/loki-6.5.2.tgz
+++ b/charts/meta-monitoring/charts/loki-6.5.2.tgz
--- a/charts/meta-monitoring/ci/local-values.yaml
+++ b/charts/meta-monitoring/ci/local-values.yaml
@@ -0,0 +1,116 @@
+namespacesToMonitor:
+- loki
+
+local:
+  grafana:
+    enabled: true
+  logs:
+    enabled: true
+  metrics:
+    enabled: true
+  traces:
+    enabled: true
+  minio:
+    enabled: true
+    createSecret: false
+
+cloud:
+  logs:
+    enabled: false
+    secret: logs
+  metrics:
+    enabled: false
+    secret: metrics
+  traces:
+    enabled: false
+    secret: traces
+
+grafana:
+  ingress:
+    hosts:
+      - host: monitoring.example.com
+        paths:
+          - path: /
+            pathType: Prefix
+
+minio:
+  existingSecret: ""
+  rootUser: "abcdefghi"
+  rootPassword: "defghijkl"
+
+loki:
+  deploymentMode: SingleBinary
+  singleBinary:
+    replicas: 1
+    resources:
+      limits:
+        cpu: 3
+        memory: 4Gi
+      requests:
+        cpu: 2
+        memory: 2Gi
+    extraEnv:
+      # Keep a little bit lower than memory limits
+      - name: GOMEMLIMIT
+        value: 3750MiB
+
+  chunksCache:
+    # default is 500MB, with limited memory keep this smaller
+    writebackSizeLimit: 10MB
+
+  # Zero out replica counts of other deployment modes
+  backend:
+    replicas: 0
+  read:
+    replicas: 0
+  write:
+    replicas: 0
+
+  ingester:
+    replicas: 0
+  querier:
+    replicas: 0
+  queryFrontend:
+    replicas: 0
+  queryScheduler:
+    replicas: 0
+  distributor:
+    replicas: 0
+  compactor:
+    replicas: 0
+  indexGateway:
+    replicas: 0
+  bloomCompactor:
+    replicas: 0
+  bloomGateway:
+    replicas: 0
+
+mimir-distributed:
+  minio:
+    enabled: false
+  global:
+    extraEnvFrom:
+    - secretRef:
+        name: "meta-minio"
+
+tempo-distributed:
+  distributor:
+    extraEnvFrom:
+    - secretRef:
+        name: "meta-minio"
+  ingester:
+    extraEnvFrom:
+    - secretRef:
+        name: "meta-minio"
+  compactor:
+    extraEnvFrom:
+    - secretRef:
+        name: "meta-minio"
+  querier:
+    extraEnvFrom:
+    - secretRef:
+        name: "meta-minio"
+  queryFrontend:
+    extraEnvFrom:
+    - secretRef:
+        name: "meta-minio"
--- a/charts/meta-monitoring/ct.yaml
+++ b/charts/meta-monitoring/ct.yaml
@@ -6,6 +6,5 @@ chart-dirs:
 chart-repos:
  - grafana=https://grafana.github.io/helm-charts
  - minio=https://charts.min.io
-helm-extra-args: --timeout 1200s
 check-version-increment: false
 validate-maintainers: false
--- a/charts/meta-monitoring/src/dashboards/loki-operational.json
+++ b/charts/meta-monitoring/src/dashboards/loki-operational.json
@@ -1824,7 +1824,7 @@
                  "steppedLine": false,
                  "targets": [
                     {
-                        "expr": "sum by(pod) (rate(container_cpu_usage_seconds_total{cluster=~\"$cluster\", namespace=~\"$namespace\", pod=~\"(.*distributor.*|(loki|enterprise-logs)-write)\"}[$__rate_interval]))",
+                        "expr": "sum by(pod) (rate(container_cpu_usage_seconds_total{cluster=~\"$cluster\", namespace=~\"$namespace\", pod=~\"(.*distributor.*|(loki|enterprise-logs)-write.*|$namespace-[0-9]+)\"}[$__rate_interval]))",
                        "intervalFactor": 3,
                        "legendFormat": "{{pod}}",
                        "refId": "A"
@@ -1921,7 +1921,7 @@
                  "steppedLine": false,
                  "targets": [
                     {
-                        "expr": "go_memstats_heap_inuse_bytes{cluster=\"$cluster\", namespace=\"$namespace\", job=~\"(.*/distributor|(loki|enterprise-logs)-write|.*/loki)\"}",
+                        "expr": "go_memstats_heap_inuse_bytes{cluster=\"$cluster\", namespace=\"$namespace\", job=~\"(.*/.*distributor|$namespace/(loki|enterprise-logs)-write|.*/loki|$namespace/loki-single-binary)\"}",
                        "instant": false,
                        "intervalFactor": 3,
                        "legendFormat": "{{pod}}",
@@ -2525,7 +2525,7 @@
                  "steppedLine": false,
                  "targets": [
                     {
-                        "expr": "sum by(pod) (rate(container_cpu_usage_seconds_total{cluster=~\"$cluster\", namespace=~\"$namespace\", pod=~\"(.*ingester.*|(loki|enterprise-logs)-write|loki-single-binary)\"}[$__rate_interval]))",
+                        "expr": "sum by(pod) (rate(container_cpu_usage_seconds_total{cluster=~\"$cluster\", namespace=~\"$namespace\", pod=~\"(.*ingester.*|(loki|enterprise-logs)-write.*|loki-single-binary|$namespace-[0-9]+)\"}[$__rate_interval]))",
                        "intervalFactor": 3,
                        "legendFormat": "{{pod}}",
                        "refId": "A"
@@ -2622,7 +2622,7 @@
                  "steppedLine": false,
                  "targets": [
                     {
-                        "expr": "go_memstats_heap_inuse_bytes{cluster=\"$cluster\", namespace=\"$namespace\", pod=~\"(.*ingester.*|(loki|enterprise-logs)-write|loki-single-binary)\"}",
+                        "expr": "go_memstats_heap_inuse_bytes{cluster=\"$cluster\", namespace=\"$namespace\", pod=~\"(.*ingester.*|(loki|enterprise-logs)-write.*|loki-single-binary|$namespace-[0-9]+)\"}",
                        "instant": false,
                        "intervalFactor": 3,
                        "legendFormat": "{{pod}}",
@@ -3308,7 +3308,7 @@
                  "steppedLine": false,
                  "targets": [
                     {
-                        "expr": "sum by(reason) (rate(loki_ingester_chunks_flushed_total{cluster=~\"$cluster\",job=~\"$namespace/.*ingester.*\", namespace=~\"$namespace\"}[$__rate_interval])) / ignoring(reason) group_left sum(rate(loki_ingester_chunks_flushed_total{cluster=~\"$cluster\",job=~\"$namespace/.*ingester.*\", namespace=~\"$namespace\"}[$__rate_interval]))",
+                        "expr": "sum by(reason) (rate(loki_ingester_chunks_flushed_total{cluster=~\"$cluster\",job=~\"($namespace)/(.*ingester.*|(loki|enterprise-logs)-write|loki-single-binary)\", namespace=~\"$namespace\"}[$__rate_interval])) / ignoring(reason) group_left sum(rate(loki_ingester_chunks_flushed_total{cluster=~\"$cluster\",job=~\"($namespace)/(.*ingester.*|(loki|enterprise-logs)-write|loki-single-binary)\", namespace=~\"$namespace\"}[$__rate_interval]))",
                        "interval": "",
                        "legendFormat": "{{ reason }}"
                     }
@@ -3388,7 +3388,7 @@
                  "reverseYBuckets": false,
                  "targets": [
                     {
-                        "expr": "sum by (le) (rate(loki_ingester_chunk_utilization_bucket{cluster=\"$cluster\", job=~\"($namespace)/(ingester|(loki|enterprise-logs)-write|loki-single-binary)\"}[$__rate_interval]))",
+                        "expr": "sum by (le) (rate(loki_ingester_chunk_utilization_bucket{cluster=\"$cluster\", job=~\"($namespace)/(.*ingester|(loki|enterprise-logs)-write|loki-single-binary)\"}[$__rate_interval]))",
                        "format": "heatmap",
                        "instant": false,
                        "interval": "",
@@ -3481,7 +3481,7 @@
                  "steppedLine": false,
                  "targets": [
                     {
-                        "expr": "sum by(pod) (rate(container_cpu_usage_seconds_total{cluster=~\"$cluster\", namespace=~\"$namespace\", pod=~\"(.*querier.*|(loki|enterprise-logs)-read|loki-single-binary)\"}[$__rate_interval]))",
+                        "expr": "sum by(pod) (rate(container_cpu_usage_seconds_total{cluster=~\"$cluster\", namespace=~\"$namespace\", pod=~\"(.*querier.*|(loki|enterprise-logs)-read.*|loki-single-binary|$namespace-[0-9]+)\"}[$__rate_interval]))",
                        "intervalFactor": 3,
                        "legendFormat": "{{pod}}",
                        "refId": "A"
@@ -3578,7 +3578,7 @@
                  "steppedLine": false,
                  "targets": [
                     {
-                        "expr": "go_memstats_heap_inuse_bytes{cluster=\"$cluster\", namespace=\"$namespace\", pod=~\"(.*querier.*|(loki|enterprise-logs)-read|.*loki-single-binary)\"}",
+                        "expr": "go_memstats_heap_inuse_bytes{cluster=\"$cluster\", namespace=\"$namespace\", pod=~\"(.*querier.*|(loki|enterprise-logs)-read.*|.*loki-single-binary|$namespace-[0-9]+)\"}",
                        "instant": false,
                        "intervalFactor": 3,
                        "legendFormat": "{{pod}}",
--- a/charts/meta-monitoring/src/dashboards/loki-reads-resources.json
+++ b/charts/meta-monitoring/src/dashboards/loki-reads-resources.json
@@ -104,19 +104,19 @@
                  "span": 4,
                  "targets": [
                     {
-                        "expr": "sum by(pod) (rate(container_cpu_usage_seconds_total{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"query-frontend\"}[$__rate_interval]))",
+                        "expr": "sum by(pod) (rate(container_cpu_usage_seconds_total{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"query-frontend|loki\", pod=~\"query-frontend|loki-read-.*|$namespace-[0-9]*\"}[$__rate_interval]))",
                        "format": "time_series",
                        "legendFormat": "{{pod}}",
                        "legendLink": null
                     },
                     {
-                        "expr": "min(kube_pod_container_resource_requests{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"query-frontend\", resource=\"cpu\"} > 0)",
+                        "expr": "min(kube_pod_container_resource_requests{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"query-frontend|loki\", pod=~\"query-frontend|loki-read-.*|$namespace-[0-9]*\", resource=\"cpu\"} > 0)",
                        "format": "time_series",
                        "legendFormat": "request",
                        "legendLink": null
                     },
                     {
-                        "expr": "min(container_spec_cpu_quota{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"query-frontend\"} / container_spec_cpu_period{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"query-frontend\"})",
+                        "expr": "min(container_spec_cpu_quota{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"query-frontend|loki\", pod=~\"query-frontend|loki-read-.*|$namespace-[0-9]*\"} / container_spec_cpu_period{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"query-frontend|loki\", pod=~\"query-frontend|loki-read-.*|$namespace-[0-9]*\"})",
                        "format": "time_series",
                        "legendFormat": "limit",
                        "legendLink": null
@@ -206,19 +206,19 @@
                  "span": 4,
                  "targets": [
                     {
-                        "expr": "max by(pod) (container_memory_working_set_bytes{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"query-frontend\"})",
+                        "expr": "max by(pod) (container_memory_working_set_bytes{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"query-frontend|loki\", pod=~\"query-frontend|loki-read-.*|$namespace-[0-9]*\"})",
                        "format": "time_series",
                        "legendFormat": "{{pod}}",
                        "legendLink": null
                     },
                     {
-                        "expr": "min(kube_pod_container_resource_requests{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"query-frontend\", resource=\"memory\"} > 0)",
+                        "expr": "min(kube_pod_container_resource_requests{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"query-frontend|loki\", pod=~\"query-frontend|loki-read-.*|$namespace-[0-9]*\", resource=\"memory\"} > 0)",
                        "format": "time_series",
                        "legendFormat": "request",
                        "legendLink": null
                     },
                     {
-                        "expr": "min(container_spec_memory_limit_bytes{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"query-frontend\"} > 0)",
+                        "expr": "min(container_spec_memory_limit_bytes{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"query-frontend|loki\", pod=~\"query-frontend|loki-read-.*|$namespace-[0-9]*\"} > 0)",
                        "format": "time_series",
                        "legendFormat": "limit",
                        "legendLink": null
@@ -269,7 +269,7 @@
                  "span": 4,
                  "targets": [
                     {
-                        "expr": "sum by(pod) (go_memstats_heap_inuse_bytes{cluster=~\"$cluster\", job=~\"($namespace)/.*query-frontend\"})",
+                        "expr": "sum by(pod) (go_memstats_heap_inuse_bytes{cluster=~\"$cluster\", job=~\"($namespace)/(.*query-frontend|loki-read|loki-single-binary)\"})",
                        "format": "time_series",
                        "legendFormat": "{{pod}}",
                        "legendLink": null
@@ -371,19 +371,19 @@
                  "span": 4,
                  "targets": [
                     {
-                        "expr": "sum by(pod) (rate(container_cpu_usage_seconds_total{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"query-scheduler\"}[$__rate_interval]))",
+                        "expr": "sum by(pod) (rate(container_cpu_usage_seconds_total{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"query-scheduler|loki\", pod=~\"query-scheduler|loki-read-.*|$namespace-[0-9]*\"}[$__rate_interval]))",
                        "format": "time_series",
                        "legendFormat": "{{pod}}",
                        "legendLink": null
                     },
                     {
-                        "expr": "min(kube_pod_container_resource_requests{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"query-scheduler\", resource=\"cpu\"} > 0)",
+                        "expr": "min(kube_pod_container_resource_requests{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"query-scheduler|loki\", pod=~\"query-scheduler|loki-read-.*|$namespace-[0-9]*\", resource=\"cpu\"} > 0)",
                        "format": "time_series",
                        "legendFormat": "request",
                        "legendLink": null
                     },
                     {
-                        "expr": "min(container_spec_cpu_quota{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"query-scheduler\"} / container_spec_cpu_period{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"query-scheduler\"})",
+                        "expr": "min(container_spec_cpu_quota{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"query-scheduler|loki\", pod=~\"query-scheduler|loki-read-.*|$namespace-[0-9]*\"} / container_spec_cpu_period{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"query-scheduler|loki\", pod=~\"query-scheduler|loki-read-.*|$namespace-[0-9]*\"})",
                        "format": "time_series",
                        "legendFormat": "limit",
                        "legendLink": null
@@ -473,19 +473,19 @@
                  "span": 4,
                  "targets": [
                     {
-                        "expr": "max by(pod) (container_memory_working_set_bytes{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"query-scheduler\"})",
+                        "expr": "max by(pod) (container_memory_working_set_bytes{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"query-scheduler|loki\", pod=~\"query-scheduler|loki-read-.*|$namespace-[0-9]*\"})",
                        "format": "time_series",
                        "legendFormat": "{{pod}}",
                        "legendLink": null
                     },
                     {
-                        "expr": "min(kube_pod_container_resource_requests{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"query-scheduler\", resource=\"memory\"} > 0)",
+                        "expr": "min(kube_pod_container_resource_requests{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"query-scheduler|loki\", pod=~\"query-scheduler|loki-read-.*|$namespace-[0-9]*\", resource=\"memory\"} > 0)",
                        "format": "time_series",
                        "legendFormat": "request",
                        "legendLink": null
                     },
                     {
-                        "expr": "min(container_spec_memory_limit_bytes{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"query-scheduler\"} > 0)",
+                        "expr": "min(container_spec_memory_limit_bytes{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"query-scheduler|loki\", pod=~\"query-scheduler|loki-read-.*|$namespace-[0-9]*\"} > 0)",
                        "format": "time_series",
                        "legendFormat": "limit",
                        "legendLink": null
@@ -536,7 +536,7 @@
                  "span": 4,
                  "targets": [
                     {
-                        "expr": "sum by(pod) (go_memstats_heap_inuse_bytes{cluster=~\"$cluster\", job=~\"($namespace)/.*query-scheduler\"})",
+                        "expr": "sum by(pod) (go_memstats_heap_inuse_bytes{cluster=~\"$cluster\", job=~\"($namespace)/(.*query-scheduler|loki-read|loki-single-binary)\"})",
                        "format": "time_series",
                        "legendFormat": "{{pod}}",
                        "legendLink": null
@@ -638,19 +638,19 @@
                  },
                  "targets": [
                     {
-                        "expr": "sum by(pod) (rate(container_cpu_usage_seconds_total{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"querier\"}[$__rate_interval]))",
+                        "expr": "sum by(pod) (rate(container_cpu_usage_seconds_total{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"querier|loki\", pod=~\"query-frontend|loki-read-.*|$namespace-[0-9]*\"}[$__rate_interval]))",
                        "format": "time_series",
                        "legendFormat": "{{pod}}",
                        "legendLink": null
                     },
                     {
-                        "expr": "min(kube_pod_container_resource_requests{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"querier\", resource=\"cpu\"} > 0)",
+                        "expr": "min(kube_pod_container_resource_requests{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"querier|loki\", pod=~\"query-frontend|loki-read-.*|$namespace-[0-9]*\", resource=\"cpu\"} > 0)",
                        "format": "time_series",
                        "legendFormat": "request",
                        "legendLink": null
                     },
                     {
-                        "expr": "min(container_spec_cpu_quota{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"querier\"} / container_spec_cpu_period{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"querier\"})",
+                        "expr": "min(container_spec_cpu_quota{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"querier|loki\", pod=~\"query-frontend|loki-read-.*|$namespace-[0-9]*\"} / container_spec_cpu_period{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"querier|loki\", pod=~\"query-frontend|loki-read-.*|$namespace-[0-9]*\"})",
                        "format": "time_series",
                        "legendFormat": "limit",
                        "legendLink": null
@@ -740,19 +740,19 @@
                  },
                  "targets": [
                     {
-                        "expr": "max by(pod) (container_memory_working_set_bytes{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"querier\"})",
+                        "expr": "max by(pod) (container_memory_working_set_bytes{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"querier|loki\", pod=~\"query-frontend|loki-read-.*|$namespace-[0-9]*\"})",
                        "format": "time_series",
                        "legendFormat": "{{pod}}",
                        "legendLink": null
                     },
                     {
-                        "expr": "min(kube_pod_container_resource_requests{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"querier\", resource=\"memory\"} > 0)",
+                        "expr": "min(kube_pod_container_resource_requests{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"querier|loki\", pod=~\"query-frontend|loki-read-.*|$namespace-[0-9]*\", resource=\"memory\"} > 0)",
                        "format": "time_series",
                        "legendFormat": "request",
                        "legendLink": null
                     },
                     {
-                        "expr": "min(container_spec_memory_limit_bytes{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"querier\"} > 0)",
+                        "expr": "min(container_spec_memory_limit_bytes{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"querier|loki\", pod=~\"query-frontend|loki-read-.*|$namespace-[0-9]*\"} > 0)",
                        "format": "time_series",
                        "legendFormat": "limit",
                        "legendLink": null
@@ -803,7 +803,7 @@
                  },
                  "targets": [
                     {
-                        "expr": "sum by(pod) (go_memstats_heap_inuse_bytes{cluster=~\"$cluster\", job=~\"($namespace)/.*querier\"})",
+                        "expr": "sum by(pod) (go_memstats_heap_inuse_bytes{cluster=~\"$cluster\", job=~\"($namespace)/(.*querier|loki-read|loki-single-binary)\"})",
                        "format": "time_series",
                        "legendFormat": "{{pod}}",
                        "legendLink": null
@@ -854,7 +854,7 @@
                  },
                  "targets": [
                     {
-                        "expr": "sum by(instance, pod, device) (rate(node_disk_written_bytes_total[$__rate_interval])) + ignoring(pod) group_right() (label_replace(count by(instance, pod, device) (container_fs_writes_bytes_total{cluster=~\"$cluster\", namespace=~\"$namespace\", container=\"querier\", device!~\".*sda.*\"}), \"device\", \"$1\", \"device\", \"/dev/(.*)\") * 0)\n",
+                        "expr": "sum by(instance, pod, device) (rate(node_disk_written_bytes_total[$__rate_interval])) + ignoring(pod) group_right() (label_replace(count by(instance, pod, device) (container_fs_writes_bytes_total{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"query-frontend|loki\", pod=~\"query-frontend|loki-read-.*|$namespace-[0-9]*\", device!~\".*sda.*\"}), \"device\", \"$1\", \"device\", \"/dev/(.*)\") * 0)\n",
                        "format": "time_series",
                        "legendFormat": "{{pod}} - {{device}}",
                        "legendLink": null
@@ -902,7 +902,7 @@
                  },
                  "targets": [
                     {
-                        "expr": "sum by(instance, pod, device) (rate(node_disk_read_bytes_total[$__rate_interval])) + ignoring(pod) group_right() (label_replace(count by(instance, pod, device) (container_fs_writes_bytes_total{cluster=~\"$cluster\", namespace=~\"$namespace\", container=\"querier\", device!~\".*sda.*\"}), \"device\", \"$1\", \"device\", \"/dev/(.*)\") * 0)\n",
+                        "expr": "sum by(instance, pod, device) (rate(node_disk_read_bytes_total[$__rate_interval])) + ignoring(pod) group_right() (label_replace(count by(instance, pod, device) (container_fs_writes_bytes_total{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"query-frontend|loki\", pod=~\"query-frontend|loki-read-.*|$namespace-[0-9]*\", device!~\".*sda.*\"}), \"device\", \"$1\", \"device\", \"/dev/(.*)\") * 0)\n",
                        "format": "time_series",
                        "legendFormat": "{{pod}} - {{device}}",
                        "legendLink": null
@@ -1462,19 +1462,19 @@
                  },
                  "targets": [
                     {
-                        "expr": "sum by(pod) (rate(container_cpu_usage_seconds_total{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"bloom-gateway\"}[$__rate_interval]))",
+                        "expr": "sum by(pod) (rate(container_cpu_usage_seconds_total{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"bloom-gateway|loki\", pod=~\"query-frontend|loki-read-.*|$namespace-[0-9]*\"}[$__rate_interval]))",
                        "format": "time_series",
                        "legendFormat": "{{pod}}",
                        "legendLink": null
                     },
                     {
-                        "expr": "min(kube_pod_container_resource_requests{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"bloom-gateway\", resource=\"cpu\"} > 0)",
+                        "expr": "min(kube_pod_container_resource_requests{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"bloom-gateway|loki\", pod=~\"query-frontend|loki-read-.*|$namespace-[0-9]*\", resource=\"cpu\"} > 0)",
                        "format": "time_series",
                        "legendFormat": "request",
                        "legendLink": null
                     },
                     {
-                        "expr": "min(container_spec_cpu_quota{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"bloom-gateway\"} / container_spec_cpu_period{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"bloom-gateway\"})",
+                        "expr": "min(container_spec_cpu_quota{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"bloom-gateway|loki\", pod=~\"query-frontend|loki-read-.*|$namespace-[0-9]*\"} / container_spec_cpu_period{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"bloom-gateway|loki\", pod=~\"query-frontend|loki-read-.*|$namespace-[0-9]*\"})",
                        "format": "time_series",
                        "legendFormat": "limit",
                        "legendLink": null
@@ -1564,19 +1564,19 @@
                  },
                  "targets": [
                     {
-                        "expr": "max by(pod) (container_memory_working_set_bytes{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"bloom-gateway\"})",
+                        "expr": "max by(pod) (container_memory_working_set_bytes{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"bloom-gateway|loki\", pod=~\"query-frontend|loki-read-.*|$namespace-[0-9]*\"})",
                        "format": "time_series",
                        "legendFormat": "{{pod}}",
                        "legendLink": null
                     },
                     {
-                        "expr": "min(kube_pod_container_resource_requests{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"bloom-gateway\", resource=\"memory\"} > 0)",
+                        "expr": "min(kube_pod_container_resource_requests{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"bloom-gateway|loki\", pod=~\"query-frontend|loki-read-.*|$namespace-[0-9]*\", resource=\"memory\"} > 0)",
                        "format": "time_series",
                        "legendFormat": "request",
                        "legendLink": null
                     },
                     {
-                        "expr": "min(container_spec_memory_limit_bytes{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"bloom-gateway\"} > 0)",
+                        "expr": "min(container_spec_memory_limit_bytes{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"bloom-gateway|loki\", pod=~\"query-frontend|loki-read-.*|$namespace-[0-9]*\"} > 0)",
                        "format": "time_series",
                        "legendFormat": "limit",
                        "legendLink": null
@@ -1627,7 +1627,7 @@
                  },
                  "targets": [
                     {
-                        "expr": "sum by(pod) (go_memstats_heap_inuse_bytes{cluster=~\"$cluster\", job=~\"($namespace)/.*bloom-gateway\"})",
+                        "expr": "sum by(pod) (go_memstats_heap_inuse_bytes{cluster=~\"$cluster\", job=~\"($namespace)/(.*bloom-gateway|loki-read|loki-single-binary)\"})",
                        "format": "time_series",
                        "legendFormat": "{{pod}}",
                        "legendLink": null
@@ -1678,7 +1678,7 @@
                  },
                  "targets": [
                     {
-                        "expr": "sum by(instance, pod, device) (rate(node_disk_written_bytes_total[$__rate_interval])) + ignoring(pod) group_right() (label_replace(count by(instance, pod, device) (container_fs_writes_bytes_total{cluster=~\"$cluster\", namespace=~\"$namespace\", container=\"bloom-gateway\", device!~\".*sda.*\"}), \"device\", \"$1\", \"device\", \"/dev/(.*)\") * 0)\n",
+                        "expr": "sum by(instance, pod, device) (rate(node_disk_written_bytes_total[$__rate_interval])) + ignoring(pod) group_right() (label_replace(count by(instance, pod, device) (container_fs_writes_bytes_total{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"bloom-gateway|loki\", pod=~\"query-frontend|loki-read-.*|$namespace-[0-9]*\", device!~\".*sda.*\"}), \"device\", \"$1\", \"device\", \"/dev/(.*)\") * 0)\n",
                        "format": "time_series",
                        "legendFormat": "{{pod}} - {{device}}",
                        "legendLink": null
@@ -1726,7 +1726,7 @@
                  },
                  "targets": [
                     {
-                        "expr": "sum by(instance, pod, device) (rate(node_disk_read_bytes_total[$__rate_interval])) + ignoring(pod) group_right() (label_replace(count by(instance, pod, device) (container_fs_writes_bytes_total{cluster=~\"$cluster\", namespace=~\"$namespace\", container=\"bloom-gateway\", device!~\".*sda.*\"}), \"device\", \"$1\", \"device\", \"/dev/(.*)\") * 0)\n",
+                        "expr": "sum by(instance, pod, device) (rate(node_disk_read_bytes_total[$__rate_interval])) + ignoring(pod) group_right() (label_replace(count by(instance, pod, device) (container_fs_writes_bytes_total{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"bloom-gateway|loki\", pod=~\"query-frontend|loki-read-.*|$namespace-[0-9]*\", device!~\".*sda.*\"}), \"device\", \"$1\", \"device\", \"/dev/(.*)\") * 0)\n",
                        "format": "time_series",
                        "legendFormat": "{{pod}} - {{device}}",
                        "legendLink": null
@@ -2189,19 +2189,19 @@
                  },
                  "targets": [
                     {
-                        "expr": "sum by(pod) (rate(container_cpu_usage_seconds_total{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"ruler\"}[$__rate_interval]))",
+                        "expr": "sum by(pod) (rate(container_cpu_usage_seconds_total{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"ruler|loki\", pod=~\"ruler|loki-backend-.*|$namespace-[0-9]*\"}[$__rate_interval]))",
                        "format": "time_series",
                        "legendFormat": "{{pod}}",
                        "legendLink": null
                     },
                     {
-                        "expr": "min(kube_pod_container_resource_requests{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"ruler\", resource=\"cpu\"} > 0)",
+                        "expr": "min(kube_pod_container_resource_requests{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"ruler|loki\", pod=~\"ruler|loki-backend-.*|$namespace-[0-9]*\", resource=\"cpu\"} > 0)",
                        "format": "time_series",
                        "legendFormat": "request",
                        "legendLink": null
                     },
                     {
-                        "expr": "min(container_spec_cpu_quota{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"ruler\"} / container_spec_cpu_period{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"ruler\"})",
+                        "expr": "min(container_spec_cpu_quota{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"ruler|loki\", pod=~\"ruler|loki-backend-.*|$namespace-[0-9]*\"} / container_spec_cpu_period{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"ruler|loki\", pod=~\"ruler|loki-backend-.*|$namespace-[0-9]*\"})",
                        "format": "time_series",
                        "legendFormat": "limit",
                        "legendLink": null
@@ -2291,19 +2291,19 @@
                  },
                  "targets": [
                     {
-                        "expr": "max by(pod) (container_memory_working_set_bytes{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"ruler\"})",
+                        "expr": "max by(pod) (container_memory_working_set_bytes{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"ruler|loki\", pod=~\"ruler|loki-backend-.*|$namespace-[0-9]*\"})",
                        "format": "time_series",
                        "legendFormat": "{{pod}}",
                        "legendLink": null
                     },
                     {
-                        "expr": "min(kube_pod_container_resource_requests{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"ruler\", resource=\"memory\"} > 0)",
+                        "expr": "min(kube_pod_container_resource_requests{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"ruler|loki\", pod=~\"ruler|loki-backend-.*|$namespace-[0-9]*\", resource=\"memory\"} > 0)",
                        "format": "time_series",
                        "legendFormat": "request",
                        "legendLink": null
                     },
                     {
-                        "expr": "min(container_spec_memory_limit_bytes{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"ruler\"} > 0)",
+                        "expr": "min(container_spec_memory_limit_bytes{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"ruler|loki\", pod=~\"ruler|loki-backend-.*|$namespace-[0-9]*\"} > 0)",
                        "format": "time_series",
                        "legendFormat": "limit",
                        "legendLink": null
@@ -2354,7 +2354,7 @@
                  },
                  "targets": [
                     {
-                        "expr": "sum by(pod) (go_memstats_heap_inuse_bytes{cluster=~\"$cluster\", job=~\"($namespace)/ruler\"})",
+                        "expr": "sum by(pod) (go_memstats_heap_inuse_bytes{cluster=~\"$cluster\", job=~\"($namespace)/(.*ruler|loki-backend|loki-single-binary)\"})",
                        "format": "time_series",
                        "legendFormat": "{{pod}}",
                        "legendLink": null
--- a/charts/meta-monitoring/src/dashboards/loki-writes-resources.json
+++ b/charts/meta-monitoring/src/dashboards/loki-writes-resources.json
@@ -104,7 +104,7 @@
                  "span": 4,
                  "targets": [
                     {
-                        "expr": "sum by(pod) (rate(container_cpu_usage_seconds_total{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"distributor\"}[$__rate_interval]))",
+                        "expr": "sum by(pod) (rate(container_cpu_usage_seconds_total{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"distributor|loki\", pod=~\"distributor|loki-write-.*|$namespace-[0-9]*\"}[$__rate_interval]))",
                        "format": "time_series",
                        "legendFormat": "{{pod}}",
                        "legendLink": null
@@ -116,7 +116,7 @@
                        "legendLink": null
                     },
                     {
-                        "expr": "min(container_spec_cpu_quota{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"distributor\"} / container_spec_cpu_period{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"distributor\"})",
+                        "expr": "min(container_spec_cpu_quota{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"distributor|loki\", pod=~\"distributor|loki-write-.*|$namespace-[0-9]*\"} / container_spec_cpu_period{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"distributor|loki\", pod=~\"distributor|loki-write-.*|$namespace-[0-9]*\"})",
                        "format": "time_series",
                        "legendFormat": "limit",
                        "legendLink": null
@@ -206,7 +206,7 @@
                  "span": 4,
                  "targets": [
                     {
-                        "expr": "max by(pod) (container_memory_working_set_bytes{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"distributor\"})",
+                        "expr": "max by(pod) (container_memory_working_set_bytes{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"distributor|loki\", pod=~\"distributor|loki-write-.*|$namespace-[0-9]*\"})",
                        "format": "time_series",
                        "legendFormat": "{{pod}}",
                        "legendLink": null
@@ -218,7 +218,7 @@
                        "legendLink": null
                     },
                     {
-                        "expr": "min(container_spec_memory_limit_bytes{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"distributor\"} > 0)",
+                        "expr": "min(container_spec_memory_limit_bytes{cluster=~\"$cluster\", namespace=~\"$namespace\", container=~\"distributor|loki\", pod=~\"distributor|loki-write-.*|$namespace-[0-9]*\"} > 0)",
                        "format": "time_series",
                        "legendFormat": "limit",
                        "legendLink": null
@@ -269,7 +269,7 @@
                  "span": 4,
                  "targets": [
                     {
-                        "expr": "sum by(pod) (go_memstats_heap_inuse_bytes{cluster=~\"$cluster\", job=~\"($namespace)/.*distributor\"})",
+                        "expr": "sum by(pod) (go_memstats_heap_inuse_bytes{cluster=~\"$cluster\", job=~\"($namespace)/(.*distributor|loki-write|loki-single-binary)\"})",
                        "format": "time_series",
                        "legendFormat": "{{pod}}",
                        "legendLink": null
--- a/charts/meta-monitoring/templates/agent/config.yaml
+++ b/charts/meta-monitoring/templates/agent/config.yaml
@@ -135,6 +135,11 @@ data:
    }

    prometheus.relabel "filter" {
+      rule {
+        target_label = "cluster"
+        replacement = "{{- .Values.clusterLabelValue -}}"
+      }
+
      rule {
        source_labels = ["__name__"]
        regex = "({{ include "agent.all_metrics" . }})"
@@ -330,7 +335,7 @@ data:
    {{- if .Values.local.logs.enabled }}
    loki.write "local" {
      endpoint {
-        url = "http://{{- .Release.Namespace -}}-loki-gateway.{{- .Release.Namespace -}}.svc.cluster.local:80/loki/api/v1/push"
+        url = "http://loki-write.{{- .Release.Namespace -}}.svc.cluster.local:3100/loki/api/v1/push"
      }
    }
    {{- end }}
--- a/charts/meta-monitoring/templates/minio/secret.yaml
+++ b/charts/meta-monitoring/templates/minio/secret.yaml
@@ -0,0 +1,13 @@
+{{- if .Values.local.minio.createSecret }}
+apiVersion: v1
+kind: Secret
+metadata:
+  name: minio
+  namespace: {{ $.Release.Namespace }}
+  annotations:
+    "helm.sh/hook": pre-install
+    "helm.sh/hook-weight": "-5"
+data:
+  rootUser: dmFsdWUtMg0KDQo=
+  rootPassword: dmFsdWUtMg0KDQo=
+{{- end }}
--- a/charts/meta-monitoring/templates/ruler/ruler.yaml
+++ b/charts/meta-monitoring/templates/ruler/ruler.yaml
@@ -51,7 +51,11 @@ spec:
              protocol: TCP
          envFrom:
          - secretRef:
+{{- if .Values.local.minio.enabled }}
+             name: {{ $.Release.Namespace }}-minio
+{{- else }}
             name: minio
+{{- end }}
          readinessProbe:
            failureThreshold: 3
            httpGet:
--- a/charts/meta-monitoring/templates/validate.yaml
+++ b/charts/meta-monitoring/templates/validate.yaml
@@ -41,3 +41,4 @@
 {{- if empty .Values.metrics.retain -}}
  {{- fail "All metrics will be collected, please specify some in metrics.retain" -}}
 {{- end -}}
+
--- a/charts/meta-monitoring/values.yaml
+++ b/charts/meta-monitoring/values.yaml
@@ -2,7 +2,7 @@
 namespacesToMonitor:
 - loki
 # The name of the cluster where this will be installed
-clusterLabelValue: "meta-monitoring"
+clusterLabelValue: "meta"
 # Set to true to write logs, metrics or traces to Grafana Cloud
 # The secrets have to be created first
 cloud:
@@ -26,7 +26,8 @@ local:
  traces:
    enabled: false
  minio:
-    enabled: false # This should be set to true if any of the previous is enabled
+    enabled: false  # This should be set to true if any of the previous is enabled
+    createSecret: false  # This is used for testing, do not use in production
 grafana:
  version: 10.4.2
  # Gateway ingress configuration
@@ -52,14 +53,14 @@ grafana:
            #     port:
            #       number: TODO
    # -- TLS configuration for the gateway ingress. Hosts passed through the `tpl` function to allow templating
-    #tls:
+    # tls:
    #  - secretName: grafana-tls
    #    hosts:
    #      - monitoring.example.com
 logs:
  # Adding regexes here will add a stage.replace block for logs. For more information see
  # https://grafana.com/docs/agent/latest/flow/reference/components/loki.process/#stagereplace-block
-  piiRegexes: null # This example replaces the word after password with *****
+  piiRegexes: null  # This example replaces the word after password with *****
 # - expression: "password (\\\\S+)"
 #   source: ""         # Empty uses the log message
 #   replace: "*****""
@@ -149,6 +150,7 @@ metrics:
  - kube_pod_container_resource_requests
  - kube_pod_container_status_last_terminated_reason
  - kube_pod_container_status_restarts_total
+  - loki_azure_blob_request_duration_seconds_bucket
  - loki_boltdb_shipper_compact_tables_operation_duration_seconds
  - loki_boltdb_shipper_compact_tables_operation_last_successful_run_timestamp_seconds
  - loki_boltdb_shipper_retention_marker_count_total
@@ -174,11 +176,15 @@ metrics:
  - loki_compactor_deleted_lines
  - loki_compactor_oldest_pending_delete_request_age_seconds
  - loki_compactor_pending_delete_requests_count
+  - loki_consul_request_duration_seconds_bucket
  - loki_discarded_samples_total
  - loki_discarded_bytes_total
  - loki_distributor_bytes_received_total
  - loki_distributor_lines_received_total
  - loki_distributor_structured_metadata_bytes_received_total
+  - loki_gcs_request_duration_seconds_bucket
+  - loki_gcs_request_duration_seconds_count
+  - loki_index_request_duration_seconds_bucket
  - loki_index_request_duration_seconds_count
  - loki_ingester_chunk_age_seconds_bucket
  - loki_ingester_chunk_age_seconds_count
@@ -191,6 +197,7 @@ metrics:
  - loki_ingester_chunk_entries_sum
  - loki_ingester_chunk_size_bytes_bucket
  - loki_ingester_chunk_utilization_bucket
+  - loki_ingester_chunk_utilization_count
  - loki_ingester_chunk_utilization_sum
  - loki_ingester_chunks_flushed_total
  - loki_ingester_flush_queue_length
@@ -208,6 +215,8 @@ metrics:
  - loki_ruler_wal_prometheus_remote_storage_samples_total
  - loki_ruler_wal_samples_appended_total
  - loki_ruler_wal_storage_created_series_total
+  - loki_s3_request_duration_seconds_bucket
+  - loki_s3_request_duration_seconds_count
  - loki_write_batch_retries_total
  - loki_write_dropped_bytes_total
  - loki_write_dropped_entries_total
--- a/docs/dev_update_dependencies.md
+++ b/docs/dev_update_dependencies.md
@@ -1,8 +1,12 @@
 # Update the dependencies

-The dependencies are the version of Loki, Mimir, Agent and so on that are included in this chart.
+The dependencies are the versions of Loki, Mimir, Agent and so on that are included in this chart.
 The current versions can be found in the [Chart.yaml](../charts/meta-monitoring/Chart.yaml) file.

+A Github action runs daily to see if updated versions are available. A PR will be created.
+
+The manual steps are as follows:
+
 Run this in the charts/meta-monitoring directory after updating a dependency:

 ```
--- a/docs/installation.md
+++ b/docs/installation.md
@@ -4,7 +4,7 @@

 1. Use an existing Grafana Cloud account or setup a new one. Then create an access token:

-   1. In Grafana go to Administration -> Users and Access -> Cloud access policies.
+   1. In a Grafana instance on Grafana Cloud go to Administration -> Users and Access -> Cloud access policies.

   1. Click `Create access policy`.

@@ -39,7 +39,7 @@
    --from-literal=endpoint='https://otlp-gateway-prod-us-east-0.grafana.net/otlp'
   ```

-   The logs, metrics and traces usernames are the `User / Username / Instance IDs` of the Loki, Prometheus/Mimir and OpenTelemetry instances in Grafana Cloud. From `Home` in Grafana click on `Stacks`. Then go to the `Details` pages of Loki and Prometheus/Mimir. For OpenTelemetry go to the `Configure` page.
+   The logs, metrics and traces usernames are the `User / Username / Instance IDs` of the Loki, Prometheus/Mimir and OpenTelemetry instances in Grafana Cloud. From `Home` in Grafana click on `Stacks`. Then go to the `Details` pages of Loki and Prometheus/Mimir. For OpenTelemetry go to the `Configure` page. The endpoints will also have to be changed to match your settings.

 1. Create a values.yaml file based on the [default one](../charts/meta-monitoring/values.yaml). Fill in the names of the secrets created above as needed. An example minimal values.yaml looks like this:

@@ -102,7 +102,7 @@
       enabled: true
   ```

-## Installing the chart
+## Installing, updating and deleting the chart

 1. Add the repo

@@ -175,7 +175,7 @@ For each of the dashboard files in charts/meta-monitoring/src/dashboards folder

 ## Configure Loki to send traces

-1. In the Loki config enable tracing:
+1. In the Loki that is being monitored enable tracing in the config:

   ```
   loki:
@@ -194,4 +194,8 @@ For each of the dashboard files in charts/meta-monitoring/src/dashboards folder

 ## Configure external access using an Ingress in local mode

-When using local mode by default a Kubernetes [Ingress](https://kubernetes.io/docs/concepts/services-networking/ingress/) object is created to access the Grafana instance. This will need to be adapted to your cloud provider by updating the `grafana.ingress` section of the `values.yaml` file provided to Helm. Check the documentation of your cloud provider for available options.
+When using local mode by default a Kubernetes [Ingress](https://kubernetes.io/docs/concepts/services-networking/ingress/) object is created to access the Grafana instance. This will need to be adapted to your cloud provider by updating the `grafana.ingress` section of the `values.yaml` file provided to Helm. Check the documentation of your cloud provider for available options.
+
+## Kube-state-metrics
+
+Metrics about Kubernetes objects are scraped from [kube-state-metrics](https://github.com/kubernetes/kube-state-metrics). This needs to be installed in the cluster. The `kubeStateMetrics.endpoint` entry in values.yaml should be set to it's address (without the `/metrics` part in the URL).
Author	SHA1	Message	Date
Michel Hollands	cb31d42f57	Temp check in Signed-off-by: Michel Hollands <michel.hollands@gmail.com>	2024-05-31 13:54:18 +01:00
Michel Hollands	e6db102da8	Put the annotations in the right place Signed-off-by: Michel Hollands <michel.hollands@gmail.com>	2024-05-22 17:14:59 +01:00
Michel Hollands	1a33ef0d2b	Turn on ci Signed-off-by: Michel Hollands <michel.hollands@gmail.com>	2024-05-22 17:06:01 +01:00
Michel Hollands	0e95fcc5cb	Add default secret for testing Signed-off-by: Michel Hollands <michel.hollands@gmail.com>	2024-05-22 16:58:07 +01:00
Michel Hollands	f5b0477a2d	Use correct name for values file Signed-off-by: Michel Hollands <michel.hollands@gmail.com>	2024-05-22 16:57:40 +01:00
Michel Hollands	2939c3cd63	Add ci dir to chart with local values Signed-off-by: Michel Hollands <michel.hollands@gmail.com>	2024-05-22 16:56:50 +01:00
Michel Hollands	76c8884a3c	Add pre-install hook Signed-off-by: Michel Hollands <michel.hollands@gmail.com>	2024-05-22 16:53:26 +01:00
Michel Hollands	edc556b074	Add default secret for testing Signed-off-by: Michel Hollands <michel.hollands@gmail.com>	2024-05-22 16:53:26 +01:00
Michel Hollands	e5e13ac517	Remove empty line Signed-off-by: Michel Hollands <michel.hollands@gmail.com>	2024-05-22 16:53:26 +01:00
Michel Hollands	8b9ed3c9f7	Use correct name for values file Signed-off-by: Michel Hollands <michel.hollands@gmail.com>	2024-05-22 16:53:26 +01:00
Michel Hollands	844708681f	Add ci dir to chart with local values Signed-off-by: Michel Hollands <michel.hollands@gmail.com>	2024-05-22 16:53:26 +01:00
Michel Hollands	ce216cd558	Remove unused parameter Signed-off-by: Michel Hollands <michel.hollands@gmail.com>	2024-05-22 16:53:26 +01:00
Michel Hollands	0418d16a1b	Fix lint issues Signed-off-by: Michel Hollands <michel.hollands@gmail.com>	2024-05-22 16:53:26 +01:00
Michel Hollands	8cff0e0e75	Fix lint issues Signed-off-by: Michel Hollands <michel.hollands@gmail.com>	2024-05-22 16:53:26 +01:00
Michel Hollands	65995dce4f	Empty change to provoke CI Signed-off-by: Michel Hollands <michel.hollands@gmail.com>	2024-05-22 16:53:26 +01:00
Michel Hollands	4d42fb664d	Merge pull request #127 from grafana/chore/update-dependencies [dependency] Update the subcharts	2024-05-22 09:14:16 +01:00
MichelHollands	9457c25ced	Update dependencies	2024-05-22 07:03:07 +00:00
Michel Hollands	ca686afc3e	Merge pull request #125 from grafana/update_version Update to version 1.0	2024-05-14 14:10:13 +01:00
Michel Hollands	4b01214225	Update to version 1.0 Signed-off-by: Michel Hollands <michel.hollands@gmail.com>	2024-05-14 14:09:41 +01:00
Michel Hollands	0e63a86fe5	Merge pull request #124 from grafana/update_main_page Update the README and docs	2024-05-14 14:08:00 +01:00
Michel Hollands	4e8b2be044	Update the README and docs Signed-off-by: Michel Hollands <michel.hollands@gmail.com>	2024-05-14 14:06:49 +01:00
Michel Hollands	df12d96f9c	Merge pull request #123 from grafana/cleanup_ci Comment out installation in CI for now	2024-05-14 11:51:24 +01:00
Michel Hollands	fcb5de6793	Comment out installation in CI for now Signed-off-by: Michel Hollands <michel.hollands@gmail.com>	2024-05-14 11:50:57 +01:00
Michel Hollands	661662caec	Merge pull request #121 from grafana/add_ci_install Add CI install step	2024-05-14 10:52:01 +01:00
Michel Hollands	2a681ce1eb	Add workflow dispatch Signed-off-by: Michel Hollands <michel.hollands@gmail.com>	2024-05-14 10:50:12 +01:00
Michel Hollands	52e4516e04	Add CI install step Signed-off-by: Michel Hollands <michel.hollands@gmail.com>	2024-05-14 10:48:45 +01:00
Michel Hollands	95085c4e72	Merge pull request #114 from grafana/chore/update-dependencies [dependency] Update the subcharts	2024-05-14 10:38:16 +01:00
Michel Hollands	55d3c9d723	Merge pull request #120 from grafana/add_ksm_docs Add kube-state-metrics documentation	2024-05-14 10:31:47 +01:00
Michel Hollands	618ab3778b	Add kube-state-metrics documentation Signed-off-by: Michel Hollands <michel.hollands@gmail.com>	2024-05-14 10:31:11 +01:00
Michel Hollands	89d9bdb5e2	Merge pull request #119 from grafana/fix_cluster_name Use shorter name for cluster	2024-05-14 08:47:56 +01:00
Michel Hollands	291f680c16	Use shorter name for cluster Signed-off-by: Michel Hollands <michel.hollands@gmail.com>	2024-05-14 08:46:45 +01:00
MichelHollands	3658769c7a	Update dependencies	2024-05-14 07:04:01 +00:00
Michel Hollands	1be9bc8d0a	Merge pull request #118 from grafana/fix_dashboards Fix dashboards a bit more	2024-05-13 17:04:45 +01:00
Michel Hollands	81d63a4383	Fix CPU usage of ssd querier Signed-off-by: Michel Hollands <michel.hollands@gmail.com>	2024-05-13 16:59:05 +01:00
Michel Hollands	333ba3a3fd	Add cluster to kube state metrics Signed-off-by: Michel Hollands <michel.hollands@gmail.com>	2024-05-13 16:58:07 +01:00
Michel Hollands	7aa091cbf8	Merge pull request #117 from grafana/fix_dashboards Fix dashboards	2024-05-13 14:48:45 +01:00
Michel Hollands	d309a5bc50	Fix mistakes Signed-off-by: Michel Hollands <michel.hollands@gmail.com>	2024-05-13 14:44:08 +01:00
Michel Hollands	346dd4968e	Make reads-resources work for all 3 deployment modes Signed-off-by: Michel Hollands <michel.hollands@gmail.com>	2024-05-13 14:36:53 +01:00
Michel Hollands	f5c9fa0593	Update operation so it works with all types of deployment Signed-off-by: Michel Hollands <michel.hollands@gmail.com>	2024-05-13 14:07:59 +01:00
Michel Hollands	d5e8df856d	Update writes dashboard work with all types Signed-off-by: Michel Hollands <michel.hollands@gmail.com>	2024-05-13 14:06:41 +01:00
Michel Hollands	2d85e7e120	Update dashboards so they work with single binary Signed-off-by: Michel Hollands <michel.hollands@gmail.com>	2024-05-13 10:56:35 +01:00
Michel Hollands	1a4a1ad885	Fix ruler panel Signed-off-by: Michel Hollands <michel.hollands@gmail.com>	2024-05-13 10:17:55 +01:00
Michel Hollands	c1ff364c29	Add missing metric in reads dashboard Signed-off-by: Michel Hollands <michel.hollands@gmail.com>	2024-05-13 09:45:37 +01:00
Michel Hollands	bd0ef0e2cc	Add missing values Signed-off-by: Michel Hollands <michel.hollands@gmail.com>	2024-05-13 09:21:05 +01:00
Michel Hollands	0216163885	Add chunk reason Signed-off-by: Michel Hollands <michel.hollands@gmail.com>	2024-05-13 09:11:20 +01:00
Michel Hollands	c42718649f	Fix distributor memory panel Signed-off-by: Michel Hollands <michel.hollands@gmail.com>	2024-05-13 09:03:02 +01:00
Michel Hollands	650df8217a	Merge pull request #116 from grafana/fix_loki_write_endpoint Fix local write end point	2024-05-13 08:24:18 +01:00
Michel Hollands	f7946ff713	Fix local write end point Signed-off-by: Michel Hollands <michel.hollands@gmail.com>	2024-05-12 14:32:39 +01:00
Michel Hollands	b312fc37fc	Merge pull request #115 from grafana/fix_traces_forwarding Fix local tracing pipeline	2024-05-10 15:44:03 +01:00