Prometheus full - no data being displayed in Grafana

Grafana shows no data. Grafana reads its data from either Prometheus or InfluxDB; this article covers the case where the volume backing Prometheus is full.

Check the Prometheus pod logs for "no space left on device" errors. First find the pod, then view its logs:

k get pods -A | grep prometheus

k logs -f nci-service-monitoring-default-prometheus-758755bd9b-zw75v -n nci-service-monitoring-default

level=warn ts=2022-03-14T15:34:34.862Z caller=manager.go:619 component="rule manager" group=elasticsearch-alert-rules msg="Rule sample appending failed" err="write to WAL: log samples: write /prometheus/wal/00000604: no space left on device"

The key part of the message is "no space left on device". Note that the pod name suffix will not be the same on your system.
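To check for this error without reading the whole log, the matching lines can be counted. The following is a minimal sketch and not part of the original procedure: the function name count_no_space_errors is hypothetical, and kubectl is written out in full on the assumption that k in this article is an alias for it.

```shell
# Count the log lines on stdin that report a full disk.
count_no_space_errors() {
    grep -c "no space left on device"
}

# Example (substitute your own pod name and namespace):
#   kubectl logs <prometheus-pod> -n <namespace> | count_no_space_errors
```

A non-zero count suggests the Prometheus volume is full and the steps below apply.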

Confirm by opening a shell in the pod and checking the disk usage of /prometheus:

k exec -it -n nci-service-monitoring-default nci-service-monitoring-default-prometheus-5ff8b957d-6ps7p -- sh

/prometheus $ df -h | grep rb
/dev/rbd4 10G 9.7G 0G 98% /prometheus
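The same check can be scripted by parsing the df output. This is a sketch under stated assumptions: the function names prometheus_usage_pct and is_prometheus_full are hypothetical, the default threshold of 90 percent is an arbitrary choice, and the mount point is assumed to be /prometheus as in the output above.

```shell
# Print the Use% value (digits only) for the /prometheus mount,
# reading df output from stdin.
prometheus_usage_pct() {
    awk '$NF == "/prometheus" { gsub(/%/, "", $(NF-1)); print $(NF-1) }'
}

# Succeed (exit 0) when usage is at or above the threshold (default 90).
is_prometheus_full() {
    pct=$(prometheus_usage_pct)
    [ -n "$pct" ] && [ "$pct" -ge "${1:-90}" ]
}

# Example (substitute your own pod name and namespace):
#   kubectl exec -n <namespace> <prometheus-pod> -- df -h /prometheus \
#       | is_prometheus_full 95 && echo "Prometheus volume is nearly full"
```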

Solution 1:

Log on to the pod and remove the files in the wal directory:

k exec -it -n nci-service-monitoring-default nci-service-monitoring-default-prometheus-5ff8b957d-6ps7p -- sh

cd /prometheus/wal/
rm *
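A plain rm * does not remove subdirectories, and a Prometheus wal directory may also contain checkpoint subdirectories. The sketch below empties the directory recursively and refuses to run on a path that does not end in /wal. The function name clear_wal is hypothetical and the guard is an added precaution, not part of the original procedure. Be aware that samples held only in the WAL (recent data not yet written to a block) are likely to be lost.

```shell
# Remove everything inside a WAL directory, keeping the directory itself.
# Refuses any path whose last component is not "wal".
clear_wal() {
    dir=${1%/}
    case "$dir" in
        */wal) ;;
        *) echo "refusing: $dir is not a wal directory" >&2; return 1 ;;
    esac
    [ -d "$dir" ] || { echo "no such directory: $dir" >&2; return 1; }
    rm -rf -- "$dir"/*
}

# Example, run in a shell where the directory is visible:
#   clear_wal /prometheus/wal
# One-line equivalent from outside the pod (no guard):
#   kubectl exec -n <namespace> <prometheus-pod> -- sh -c 'rm -rf /prometheus/wal/*'
```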

If you cannot remove the files from within the pod, find out which server the pod is running on (nciloader2 in our example):

[protean@nciloader3 ~]$ k get pods -A -o wide | grep -i prometheus
nci-service-monitoring-default nci-service-monitoring-default-prometheus-5ff8b957d-6ps7p 1/1 Running 0 24h 10.244.1.218 nciloader2
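The node name can also be pulled out of that output automatically. This is a rough sketch: the function name prometheus_pod_nodes is hypothetical, and it assumes the NODE column is the eighth field, which holds for output like the line above but not when the RESTARTS column contains extra words.

```shell
# Print "<pod> <node>" for each Prometheus pod found in
# `kubectl get pods -A -o wide` output on stdin.
# Assumes NODE is field 8 (RESTARTS must be a bare number).
prometheus_pod_nodes() {
    awk 'tolower($2) ~ /prometheus/ { print $2, $8 }'
}

# Example:
#   kubectl get pods -A -o wide | prometheus_pod_nodes
# Alternative that avoids column parsing, for a known pod:
#   kubectl get pod <prometheus-pod> -n <namespace> -o jsonpath='{.spec.nodeName}'
```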

SSH to that server and run df to see which RBD device is full:

df -h | grep rbd

Change to the mount point of the full device, where you should see the wal directory. cd into wal and remove the files.

Solution 2:

Alternatively, resize the RBD PersistentVolume that backs the pod: https://conf1.ds.jdsu.net/wiki/display/NCIR/Resize+the+RBD+PersistentVolume+in+Cluster?src=contextnavpagetreemode