14 March 2012
Backups, mlocate.db and PRUNEPATHS on CentOS 5.x
While carving out some free space on a full backup server, I found a 1.6GB file, /var/lib/mlocate/mlocate.db, that was present in every backup of the backup server itself (yeah I know, don't back up a machine to itself). My backup scheme uses rsync with hardlinks, so copies of files that don't change take no extra space... but this file was changing.
My first thought was to add it to the excludes for that backup job but after looking into what mlocate.db is, I changed my mind.
mlocate.db is the index/database for the locate command. It's regenerated nightly by a cron job that runs updatedb, which has a config file at /etc/updatedb.conf. The PRUNEPATHS setting in there lists the directories to exclude from indexing, and by default it includes some preset dirs.
All the hosts that are backed up to this server have their backups kept in subdirs of /backup, which was not being excluded from the nightly updatedb indexing. So every file from every backed-up host was being indexed by the nightly updatedb run, resulting in a 1.6GB mlocate.db file.
I have /var/log, /var/cache, /backup and some others excluded from the nightly "backup of the backup server" itself, but not /var/lib/. So I was backing up /var/lib/mlocate/mlocate.db daily, and since it changed daily, my hardlink scheme couldn't save any space on it, chewing up 1.6GB/day.
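To make the hardlink point concrete, here is a rough Python sketch of the idea (not my actual rsync job): a file that is identical to the previous snapshot gets hardlinked and costs nothing, while a file that changed gets a full new copy. The snapshot() helper, the file names, and the flat one-directory layout are all made up for illustration.

```python
import filecmp
import os
import shutil
import tempfile

def snapshot(src_dir, prev_dir, new_dir):
    """Copy a flat src_dir into new_dir, hardlinking files unchanged since prev_dir."""
    os.makedirs(new_dir)
    for name in os.listdir(src_dir):
        src = os.path.join(src_dir, name)
        dst = os.path.join(new_dir, name)
        prev = os.path.join(prev_dir, name) if prev_dir else None
        if prev and os.path.isfile(prev) and filecmp.cmp(src, prev, shallow=False):
            os.link(prev, dst)      # unchanged: share the inode, no extra space
        else:
            shutil.copy2(src, dst)  # new or changed: a full copy is stored

def write(path, text):
    with open(path, "w") as handle:
        handle.write(text)

root = tempfile.mkdtemp()
src = os.path.join(root, "src")
day1 = os.path.join(root, "day1")
day2 = os.path.join(root, "day2")
os.makedirs(src)
write(os.path.join(src, "static.conf"), "never changes\n")
write(os.path.join(src, "mlocate.db"), "index built on day 1\n")

snapshot(src, None, day1)
write(os.path.join(src, "mlocate.db"), "index rebuilt on day 2\n")
snapshot(src, day1, day2)

static_shared = os.path.samefile(os.path.join(day1, "static.conf"),
                                 os.path.join(day2, "static.conf"))
db_shared = os.path.samefile(os.path.join(day1, "mlocate.db"),
                             os.path.join(day2, "mlocate.db"))
print("static.conf shares an inode across snapshots:", static_shared)
print("mlocate.db shares an inode across snapshots:", db_shared)
shutil.rmtree(root)
```

The unchanged file ends up as one inode shared by both snapshots; the rebuilt index does not, so every night's copy takes its full size on disk.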
So, the proper fix is to add "/backup" to PRUNEPATHS in /etc/updatedb.conf, which prevents updatedb from indexing anything under /backup.
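I made the change by hand in an editor, but if you have several backup servers to fix, something like the following Python sketch could do it. It works on the config text as a string; the add_prune_path() helper is hypothetical, and the sample config below is only an illustration of the file's format, not necessarily what your /etc/updatedb.conf contains.

```python
import re

# Matches an uncommented line of the form: PRUNEPATHS = "/a /b /c"
PRUNEPATHS_RE = re.compile(r'^([ \t]*PRUNEPATHS[ \t]*=[ \t]*")([^"]*)(")', re.MULTILINE)

def add_prune_path(conf_text, path):
    """Return conf_text with path added to PRUNEPATHS, unless it is already listed."""
    def _append(match):
        paths = match.group(2).split()
        if path not in paths:
            paths.append(path)
        return match.group(1) + " ".join(paths) + match.group(3)

    new_text, count = PRUNEPATHS_RE.subn(_append, conf_text, count=1)
    if count == 0:
        # No PRUNEPATHS line found: append one rather than silently doing nothing.
        new_text = conf_text.rstrip("\n") + '\nPRUNEPATHS = "%s"\n' % path
    return new_text

# Illustrative sample only; check the real file on your own system.
SAMPLE_CONF = (
    'PRUNEFS = "auto afs iso9660 sfs udf"\n'
    'PRUNEPATHS = "/afs /media /net /sfs /tmp /udev /var/tmp"\n'
)

updated = add_prune_path(SAMPLE_CONF, "/backup")
print(updated)
```

Running it a second time on its own output leaves the text unchanged, so it should be safe to re-run. Writing the result back to /etc/updatedb.conf (as root, after keeping a copy of the original) is left out on purpose.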
I then ran the updatedb cron job manually (sh /etc/cron.daily/mlocate.cron on my system) to see how things looked. Now the mlocate.db file is >99% smaller:
-rw-r----- 1 root slocate 1.4M Mar 14 08:51 /var/lib/mlocate/mlocate.db
That should save at least 20GB in 2 weeks of daily backups, more with weeklies kept for a year.
While that may not seem like a lot of space with modern SATA disks, it is for servers with SCSI arrays using disks from the 36 and 72 GB era.
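For anyone who wants to check the numbers, here is the back-of-the-envelope arithmetic as a small Python sketch. The sizes are the ones reported above (1.6GB before, 1.4M after); treating 1GB as 1024MB and counting exactly 14 daily copies are my own simplifying assumptions.

```python
MB_PER_GB = 1024.0                 # assumption: binary units, as ls -h reports them

db_before_mb = 1.6 * MB_PER_GB     # mlocate.db while /backup was being indexed
db_after_mb = 1.4                  # mlocate.db with /backup in PRUNEPATHS
daily_copies = 14                  # assumption: two weeks of daily backups

# Relative shrink of the database file itself.
reduction_pct = (1 - db_after_mb / db_before_mb) * 100

# Space no longer spent storing a fresh full copy every night.
saved_gb = (db_before_mb - db_after_mb) * daily_copies / MB_PER_GB

print("mlocate.db shrank by %.2f%%" % reduction_pct)
print("space saved over %d dailies: %.1f GB" % (daily_copies, saved_gb))
```

Under those assumptions the file shrinks by a bit over 99.9%, and two weeks of dailies save a little over 22GB, which is where the "at least 20GB" figure comes from.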