# Monitoring with Prometheus and Grafana
The monitoring stack runs three components: Prometheus collects and stores metrics, Grafana visualises them, and Alertmanager routes alerts. All three are deployed as a single Helm release from the prometheus-community chart repository, managed by Flux.
## Directory structure

Everything lives under `infrastructure/base/monitoring/`:

```text
infrastructure/base/monitoring/
├── kustomization.yaml
├── helmrepository.yaml
├── helmrelease.yaml
└── grafana-secret.yaml
```

The Kustomization lists all three manifests:
```yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - grafana-secret.yaml
  - helmrepository.yaml
  - helmrelease.yaml
```

## Helm source
`helmrepository.yaml` points Flux at the upstream chart index:
```yaml
apiVersion: source.toolkit.fluxcd.io/v1
kind: HelmRepository
metadata:
  name: prometheus-community
  namespace: flux-system
spec:
  interval: 1h0m0s
  url: https://prometheus-community.github.io/helm-charts
```

Flux refreshes this index every hour and notifies the HelmRelease controller when a new chart version appears.
## HelmRelease

`helmrelease.yaml` pins chart version 82.15.0 and installs everything into the `monitoring` namespace:
```yaml
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: kube-prometheus
  namespace: flux-system
spec:
  chart:
    spec:
      chart: kube-prometheus-stack
      reconcileStrategy: ChartVersion
      sourceRef:
        kind: HelmRepository
        name: prometheus-community
      version: 82.15.0
  interval: 5m0s
  releaseName: kube-prometheus
  storageNamespace: monitoring
  targetNamespace: monitoring
  values:
    # ...
```

`storageNamespace` and `targetNamespace` both point to `monitoring`. Flux reconciles the release every five minutes, reverting any drift back to the declared state.
## Grafana credentials

Grafana admin credentials live in `grafana-secret.yaml`, encrypted with SOPS and age. The file holds two fields, `admin-user` and `admin-password`, both encrypted at rest in Git. SOPS decrypts them during reconciliation using the cluster's age key.
The HelmRelease references the secret by name rather than embedding plaintext values:
```yaml
grafana:
  admin:
    existingSecret: grafana-admin
```

When Helm renders the chart, it reads credentials from the `grafana-admin` Secret in the `monitoring` namespace. The secret name in the encrypted file matches this reference.
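For reference, the plaintext form of such a secret, before SOPS encryption, is an ordinary Kubernetes Secret. A minimal sketch with placeholder values (not the real credentials, which stay encrypted in Git):

```yaml
# Plaintext sketch only; the committed file is the SOPS-encrypted form of this.
apiVersion: v1
kind: Secret
metadata:
  name: grafana-admin        # must match grafana.admin.existingSecret
  namespace: monitoring
stringData:
  admin-user: admin          # placeholder
  admin-password: change-me  # placeholder
```

Using `stringData` lets you write the values unencoded; Kubernetes base64-encodes them into `data` on admission.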
## Persistent storage

Without persistence, Prometheus and Grafana lose all data on pod restart. The HelmRelease configures PVCs for all three components:
| Component | Size | Access mode |
|---|---|---|
| Prometheus | 20 GiB | ReadWriteOnce |
| Grafana | 10 GiB | ReadWriteOnce |
| Alertmanager | 2 GiB | ReadWriteOnce |
`ReadWriteOnce` means one node mounts the volume at a time, which is appropriate for single-replica deployments.
A PVC must request a concrete storage size; Kubernetes does not support unbounded claims. If the size is omitted from the HelmRelease values, the chart's default applies.
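In the chart's values, the sizes in the table map onto per-component storage settings. A sketch using the kube-prometheus-stack value layout (the storage class is left to the cluster default; adjust sizes to your needs):

```yaml
prometheus:
  prometheusSpec:
    storageSpec:
      volumeClaimTemplate:
        spec:
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 20Gi
grafana:
  persistence:
    enabled: true
    size: 10Gi
alertmanager:
  alertmanagerSpec:
    storage:
      volumeClaimTemplate:
        spec:
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 2Gi
```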
## Reclaim policy risk

The default storage class on this cluster uses reclaim policy `Delete`. If you remove the HelmRelease from Git, or run `flux delete helmrelease`, Flux uninstalls the release. That deletes the PVCs, and the `Delete` policy then destroys the backing volumes.
To protect a volume, patch its reclaim policy to Retain after Kubernetes binds it:
```sh
kubectl get pvc -n monitoring
kubectl get pv
kubectl patch pv <pv-name> -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'
```

With `Retain`, deleting the PVC moves the PV to the `Released` state rather than destroying the disk. Recovery requires manually re-binding a new PVC to that volume.
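A sketch of that recovery step, assuming the retained PV is named `pvc-1234-abcd` (a hypothetical name): first clear the PV's stale `claimRef` (e.g. with `kubectl patch`) so it returns to `Available`, then create a PVC that pins `volumeName` to it:

```yaml
# Hypothetical recovery manifest; names and size must match your retained PV.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: kube-prometheus-grafana   # hypothetical claim name
  namespace: monitoring
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 10Gi
  volumeName: pvc-1234-abcd       # binds directly to the retained PV
```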
## Accessing Grafana

### IngressRoute (recommended)

The local overlay adds a Traefik IngressRoute that routes `grafana.k8s.local` to the Grafana service:
```yaml
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: grafana
  namespace: monitoring
spec:
  entryPoints:
    - websecure
  routes:
    - match: Host(`grafana.k8s.local`)
      kind: Rule
      services:
        - name: kube-prometheus-grafana
          port: 80
  tls: {}
```

The route uses the `websecure` entrypoint (port 443) with TLS enabled. Add `grafana.k8s.local` to your `/etc/hosts` file pointing at the cluster's load balancer address, then open https://grafana.k8s.local.
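For example, if the load balancer answered at 192.168.1.240 (a made-up address; substitute your cluster's), the `/etc/hosts` entry would be:

```text
192.168.1.240  grafana.k8s.local
```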
### Port-forward

If the IngressRoute is unavailable or you want direct access without DNS, forward the Grafana service port to your local machine:
```sh
kubectl port-forward -n monitoring svc/kube-prometheus-grafana 3000:80
```

Then open http://localhost:3000. Log in with the credentials stored in the `grafana-admin` secret.
To retrieve the password:
```sh
kubectl get secret -n monitoring grafana-admin \
  -o jsonpath="{.data.admin-password}" | base64 -d; echo
```

## Verify the deployment

After committing the manifests and pushing, trigger reconciliation and check status:
```sh
flux reconcile source git flux-system
flux get helmreleases -A
kubectl get pods -n monitoring
```

A healthy deployment shows the operator, Grafana, Prometheus, node-exporter, and kube-state-metrics pods all running. The Alertmanager pod may enter a crash loop on first deploy if the node's inotify limit is too low; see the inotify troubleshooting guide for the fix.