Well, I just stumbled upon something: my Nagios at work wasn't working anymore, so I went looking.
nagios3 ~ [0] > tail -f /var/log/nagios/nagios.log
[1238658394] Error: Unable to save status file: No space left on device
[1238658403] Error: Unable to save status file: No space left on device
[1238658413] Error: Unable to save status file: No space left on device
[1238658423] SERVICE ALERT: tsm1;POWER WARN;OK;SOFT;4;-u OK - 0
[1238658423] Error: Unable to save status file: No space left on device
[1238658433] SERVICE ALERT: tsm2;LOAD;WARNING;SOFT;1;WARNING - load average: 6.25, 5.72, 5.36
[1238658433] Error: Unable to save status file: No space left on device
[1238658443] Error: Unable to save status file: No space left on device
[1238658453] Error: Unable to save status file: No space left on device
[1238658463] Error: Unable
After that: zip, nada. Next step, check whether the device is really full. Okay, df it is:
nagios3 ~ [130] > df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda2       3.5G  1.2G  2.1G  37% /
udev            506M   88K  506M   1% /dev
/dev/sdb1       7.9G  7.7G     0 100% /var
So /var really is completely full, and now we need to find out who's hogging the space. Since I had a suspicion (pnp4nagios), I went straight for /var/lib:
nagios3 lib [0] > du -sh *
16K     CAM
1.1M    YaST2
8.0K    acpi
4.0K    apache2
28K     autoinstall
16K     dhcpcd
4.0K    empty
96K     hardware
4.0K    logrotate.status
8.0K    misc
78M     mysql
2.1M    nagios
4.0K    net-snmp
4.0K    news
24K     nfs
8.0K    nobody
36K     ntp
4.0K    pam_devperm
824K    php5
359M    pnp4nagios
22M     rpm
28K     scpm
4.0K    smpppd
4.0K    sshd
4.0K    support
8.0K    suseRegister
4.0K    uniconf
4.0K    update-messages
4.0K    wwwrun
33M     zmd
14M     zypp
That wasn't it, so on to the next place that's suspicious most of the time: /var/log.
nagios3 log [0] > du -sh *
5.2G    YaST2
4.0K    acpid
1.4G    apache2
28K     boot.msg
28K     boot.omsg
4.0K    cups
4.0K    dsmerror.log
148K    dsmsched.log
4.0K    faillog
4.0K    krb5
12K     lastlog
4.0K    localmessages
16K     mail
16K     mail.info
198M    messages
0       mysqld.log
14M     nagios
0       ntp
4.0K    pnp4nagios
4.0K    sa
8.0K    scpm
4.0K    vmdesched.log
16K     vmware-imc
4.0K    vmware-tools-guestd
82M     warn
348K    wtmp
115M    zmd-backend.log
24M     zmd-messages.log
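With a listing this long it is easy to overlook the one entry that matters. Here is a minimal sketch of the same hunt with the output sorted by size, so the biggest hog ends up on the last line. `biggest_entries` is a made-up helper name and not part of the session above. Sizes are printed in KiB so that plain `sort -n` can compare them, and the `*` glob skips dotfiles.

```shell
# Hypothetical helper (not from the original session): show the N largest
# entries of a directory, largest last.
biggest_entries() {
    dir=${1:-.}      # directory to inspect, defaults to the current one
    count=${2:-5}    # how many entries to show
    # du -s: one total per entry; -k: sizes in KiB so sort -n works on them.
    # The subshell keeps the cd from leaking into the calling shell.
    ( cd "$dir" && du -sk -- * 2>/dev/null | sort -n | tail -n "$count" )
}
```

Calling it as `biggest_entries /var/log 5` would print the five largest entries under /var/log, one per line, as size in KiB followed by the name.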
I was like “WTF? 5.2G for YaST2 logs?” when I first saw that du output. As of now, I've got a cron job emptying /var/log/YaST2 every 24 hours.
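For the curious, a cleanup job along those lines could look roughly like this. It is only a sketch, not my actual crontab: the script path, the 03:00 schedule and the `clean_logs` name are all assumptions made for illustration. It deletes regular files only and leaves the directory tree in place.

```shell
#!/bin/sh
# Sketch of a daily log cleanup. Assumed crontab entry (path and time are
# illustrative, not the original crontab):
#   0 3 * * * /usr/local/bin/clean-logs.sh /var/log/YaST2

clean_logs() {
    logdir=$1
    # Refuse to run on anything that is not an existing directory.
    [ -d "$logdir" ] || return 1
    # Remove regular files only; subdirectories themselves are kept.
    find "$logdir" -type f -exec rm -f -- {} +
}

# When run as a script with a directory argument, clean that directory.
if [ "$#" -gt 0 ]; then
    clean_logs "$1"
fi
```

One caveat: deleting a file that a running process still holds open does not free the space until that process closes it, so this only helps for logs that are not being written at the moment the job runs.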