Hi,
A VMware ESXi host becomes unresponsive when its root ramdisk is full.
Symptoms:
- Can’t move VMs away from an ESXi host (vMotion)
- vSphere HA is not working
- ESXi Host does not join the cluster
- Could not start vpxa daemon
~ # cat /var/log/vmkwarning.log
shows warnings like
WARNING: VisorFS
....
Obj: 1940: Cannot create file /var/run/vmware/tickets/vmtck-526c7464-61d9-f8 for process hostd-worker because the inode table of its ramdisk (root) is full.
....
~ # cat /var/log/vpxa.log
....
[FFA691A0 warning 'Libs'] Cannot make directory /var/run/vmware/root_0/1438931685015351_5476781: No space left on device
...
There are multiple possible causes:
- No space is left
- inode table is full
I checked the free space on the visorfs ramdisks:
~ # vdf
......
-----
Ramdisk       1k-blocks     Used  Available  Use%  Mounted on
root              32768    17401      15367   53%  --
etc               28672      288      28384    1%  --
tmp              196608       12     196596    0%  --
hostdstats       822272    16628     805644    2%  --
snmptraps          1024        0       1024    0%  --
Free space looks fine. So do the inodes:
/var/log # stat -f /
File: "/"
ID: 100000000 Namelen: 127 Type: visorfs
Block size: 4096
Blocks: Total: 443834 Free: 266260 Available: 266260
Inodes: Total: 524288 Free: 519045
Yet a simple
~ # mkdir /var/test
fails with a not-enough-space-on-disk error.
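The inode check above can be condensed into a small helper. This is only a sketch: `inode_usage` is a hypothetical name, not an ESXi tool, and it assumes `stat -f` prints an `Inodes: Total: ... Free: ...` line in the form shown above.

```shell
#!/bin/sh
# inode_usage (hypothetical helper): reads `stat -f <path>` output on stdin
# and summarises the line that looks like
#   Inodes: Total: 524288 Free: 519045
inode_usage() {
    awk '/Inodes:/ {
        total = $3   # number after "Total:"
        free  = $5   # number after "Free:"
        if (total > 0)
            printf "inodes total=%d free=%d used=%d%%\n", total, free, (total - free) * 100 / total
    }'
}

# Usage on the host:
#   stat -f / | inode_usage
```

Note that these figures appear to describe visorfs as a whole (the block total is far larger than the 32 MB root ramdisk), which may be why they look healthy while the root ramdisk's own inode table is full. If your ESXi version provides it, `esxcli system visorfs ramdisk list` reports inode usage per ramdisk and may be more telling.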
This is very strange behaviour. I looked for files in the root filesystem, starting in /var:
/var # find /var/ | wc -l
4217
A closer look at the directory shows that there are a lot of files in /var/log, all named log.svs.???????.
~ # ls -l /var/log/log.svs.*
can’t list the files 🙁, too many arguments.
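The glob fails because the shell expands `log.svs.*` into one huge argument list before `ls` ever runs. A minimal sketch of a way to count the files anyway, letting `find` do the matching; `count_matching` is a hypothetical helper name, and it assumes the host's `find` supports `-name`.

```shell
#!/bin/sh
# count_matching DIR PATTERN (hypothetical helper):
# count the files under DIR whose names match PATTERN.
# The pattern is passed quoted, so find expands it rather than the shell,
# which sidesteps the argument-list limit.
count_matching() {
    find "$1" -name "$2" | wc -l | tr -d ' '
}

# Usage on the host:
#   count_matching /var/log 'log.svs.*'
```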
I decided to delete all of them that are older than 1 day:
~ # find /var/log -mtime +1 | grep 'log\.svs\.[0-9]' | xargs rm
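The same cleanup can be sketched with `find` alone, so only files named log.svs.* are ever handed to `rm` and each deleted name is printed. `purge_old_logs` is a hypothetical helper name, and the sketch assumes the host's `find` supports `-name`, `-print` and `-exec` in addition to the `-mtime` used above.

```shell
#!/bin/sh
# purge_old_logs DIR [DAYS] (hypothetical helper):
# delete files named log.svs.* under DIR that match `find -mtime +DAYS`
# (DAYS defaults to 1), printing each name as it is removed.
purge_old_logs() {
    dir=$1
    days=${2:-1}
    find "$dir" -name 'log.svs.*' -mtime +"$days" -print -exec rm -f {} \;
}

# Usage on the host:
#   purge_old_logs /var/log 1
```

To preview what would be removed, run the same `find` without the `-exec rm -f {} \;` part first.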
After deleting the files I could start the vpxa daemon, and the host joined the cluster.
~ # /etc/init.d/vpxa start
The root filesystem is now at 5% usage:
~ # vdf
Ramdisk       1k-blocks     Used  Available  Use%  Mounted on
root              32768     1844      30924    5%  --
etc               28672      288      28384    1%  --
tmp              196608      108     196500    0%  --
hostdstats       822272    13044     809228    1%  --
A useful link to VMware's support page.
Michael
Great article!
To get the number of free inodes, since vdf does not show it:
stat -f / | grep Inodes | awk '{ print $NF }'
Really great, thank you.
The full log folder was also causing the MKS connection problem.