Submitted by ryan.grace on April 5, 2024 - 3:36am
Problem:
NO SPACE LEFT ON DEVICE - note: the pod names will not be the same on your system
k logs -f nci-service-monitoring-default-prometheus-758755bd9b-zw75v -n nci-service-monitoring-default
level=warn ts=2022-03-14T15:34:34.862Z caller=manager.go:619 component="rule manager" group=elasticsearch-alert-rules msg="Rule sample appending failed" err="write to WAL: log samples: write /prometheus/wal/00000604: no space left on device"
k exec -it -n nci-service-monitoring-default nci-service-monitoring-default-prometheus-5ff8b957d-6ps7p -- sh
/prometheus $ df -h | grep rb
/dev/rbd4 10G 9.7G 0G 98% /prometheus
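The check above can also be scripted. This is a minimal sketch, not part of the original procedure: usage_pct is a hypothetical helper name, and it assumes the mount point is the last column of the df output and contains no spaces.

```shell
# usage_pct (hypothetical helper): read df-style output on stdin and
# print the Use% value (digits only) for the given mount point.
usage_pct() {
  mount="$1"
  # Match rows whose last field is the mount point; the field before it is Use%.
  awk -v m="$mount" '$NF == m { gsub(/%/, "", $(NF-1)); print $(NF-1) }'
}
```

From inside the pod this could be used as: df -h | usage_pct /prometheus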
Solution:
solution 1:
log onto the pod and remove the files in the wal directory (this discards the most recent samples that have not yet been compacted into blocks)
k exec -it -n nci-service-monitoring-default nci-service-monitoring-default-prometheus-5ff8b957d-6ps7p -- sh
cd /prometheus/wal/
rm *
if you cannot remove the files from within the pod, find out which server the pod is running on (nciloader2 in our example)
[protean@nciloader3 ~]$ k get pods -A -o wide | grep -i prometheus
nci-service-monitoring-default nci-service-monitoring-default-prometheus-5ff8b957d-6ps7p 1/1 Running 0 24h 10.244.1.218 nciloader2
ssh onto that server
run df to see which rbd device is full
df -h | grep rbd
cd to the mount directory of the full device, where you should see the wal directory; cd into wal and remove the files.
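The node lookup in this solution can also be done without grepping the wide pod listing. This is a minimal sketch, assuming k is an alias for kubectl; pod_node is a hypothetical helper name, not something that exists on the servers.

```shell
# pod_node (hypothetical helper): print the name of the node a pod runs on.
# Usage: pod_node <namespace> <pod-name>
pod_node() {
  # .spec.nodeName is the node the scheduler assigned the pod to.
  kubectl get pod "$2" -n "$1" -o jsonpath='{.spec.nodeName}'
}
```

For the example pod this would be called as: pod_node nci-service-monitoring-default nci-service-monitoring-default-prometheus-5ff8b957d-6ps7p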
solution 2:
you can try to resize the pod's RBD persistent volume: https://conf1.ds.jdsu.net/wiki/display/NCIR/Resize+the+RBD+PersistentVol...