One of my customers had the database partition of a VMware vCenter Server Appliance fill up twice within six months. After the first outage, the customer increased the size of the partition, which is mounted at /storage/db. A few months later, just a few days ago, the vCSA became unresponsive again, once more because of a full database partition. The customer increased the size of the database partition again (~ 200 GB!!), and today I had time to take a look at this nasty vCSA.
Within two days, the storage usage of the database partition increased from 75% to 77%. First, I checked the size of the database:
vcsa:/opt/vmware/vpostgres/current/bin # /opt/vmware/vpostgres/current/bin/psql -h localhost -U vc VCDB
Type "help" for help.
VCDB=> SELECT pg_database.datname, pg_size_pretty(pg_database_size(pg_database.datname)) AS size FROM pg_database;
datname | size
template1 | 5353 kB
template0 | 5345 kB
postgres | 5449 kB
VCDB | 2007 MB
As you can see, the database itself was only about 2 GB in size. The pg_log directory was more interesting:
vcsa:/storage/db/vpostgres # du -shc /storage/db/vpostgres/*
The directory was full of log files. The log files contained only one message:
vcsa:/storage/db/vpostgres/pg_log # more postgresql-2015-03-04_090525.log
123462 tm:2015-03-04 09:05:25.488 UTC db:VCDB pid:1527 WARNING: there is already a transaction in progress
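To get a feeling for how much of the log volume this single warning accounts for, the matching lines can be counted across all log files. The following is only a minimal sketch: the function name count_warnings is made up for illustration, and it assumes the log files follow the postgresql-*.log naming shown above.

```shell
#!/bin/sh
# Hypothetical helper (not part of the vCSA): counts the lines that contain
# the "transaction in progress" warning in all postgresql-*.log files
# of the directory passed as first argument.
count_warnings() {
    log_dir="$1"
    # cat concatenates all matching log files; grep -c prints the number
    # of lines that contain the warning text (0 if there are none).
    cat "$log_dir"/postgresql-*.log 2>/dev/null \
        | grep -c "there is already a transaction in progress"
}
```

On the appliance it could be called as count_warnings /storage/db/vpostgres/pg_log, and the result compared with the overall size reported by du -sh for the same directory.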
This led me to VMware KB2092127 (After upgrading to vCenter Server Appliance 5.5 Update 2, pg_log file reports this error: WARNING: there is already a transaction in progress). And yes, this appliance had very likely been upgraded to U2. The fix described in KB2092127 is really easy to implement. Please note that it is only a workaround. As mentioned in the article, there is currently no permanent solution.