Help needed building Mac exclude list

I’m in the process of building an OS X exclude list. I discovered that Time Machine (Apple’s backup tool) has one, and since TM can be trusted on this matter, I took inspiration from it.
The Time Machine exclude list can be found at
/System/Library/CoreServices/backupd.bundle/Contents/Resources/StdExclusions.plist
The file contains (on my Mac):

    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE plist PUBLIC "-//Apple Computer//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
    <plist version="1.0">
    <dict>
    	<!-- paths we do not want to include in a system backup -->
    	<key>PathsExcluded</key>
    	<array>
    		<string>/.MobileBackups</string>
    		<string>/MobileBackups.trash</string>
    		<string>/.MobileBackups.trash</string>
    		<string>/.Spotlight-V100</string>
    		<string>/.TemporaryItems</string>
    		<string>/.Trashes</string>
    		<string>/.com.apple.backupd.mvlist.plist</string>
    		<string>/.fseventsd</string>
    		<string>/.hotfiles.btree</string>
    		<string>/Backups.backupdb</string>
    		<string>/Desktop DB</string>
    		<string>/Desktop DF</string>
    		<string>/Network/Servers</string>
    		<string>/Library/Updates</string>
    		<string>/Previous Systems</string>
    		<string>/Users/Shared/SC Info</string>
    		<string>/Users/Guest</string>
    		<string>/dev</string>
    		<string>/home</string>
    		<string>/net</string>
    		<string>/private/var/db/com.apple.backupd.backupVerification</string>
    		<string>/private/var/db/efw_cache</string>
    		<string>/private/var/db/Spotlight</string>			<!-- old tiger location of the Spotlight db -->
    		<string>/private/var/db/Spotlight-V100</string>		<!-- old tiger location of the Spotlight db -->
    		<string>/private/var/db/systemstats</string>
    		<string>/private/var/lib/postfix/greylist.db</string>
    	</array>
    	<!-- paths where we need to capture top level folder to restore disk structure, but don't want to backup any contents -->
    	<key>ContentsExcluded</key>
    	<array>
    		<string>/Volumes</string>
    		<string>/Network</string>
    		<string>/automount</string>
    		<string>/.vol</string>
    		<string>/tmp</string>
    		<string>/cores</string>
    		<string>/private/tmp</string>
    		<string>/private/Network</string>
    		<string>/private/tftpboot</string>
    		<string>/private/var/automount</string>
    		<string>/private/var/folders</string>
    		<string>/private/var/run</string>
    		<string>/private/var/tmp</string>
    		<string>/private/var/vm</string>
    		<string>/private/var/db/dhcpclient</string>
    		<string>/private/var/db/fseventsd</string>
    		<string>/Library/Caches</string>
    		<string>/Library/Logs</string>
    		<string>/System/Library/Caches</string>
    		<string>/System/Library/Extensions/Caches</string>
    	</array>
    	<!-- paths where we need to capture entire subtree folder layout to restore disk structure, but don't want to backup contained files -->
    	<key>FileContentsExcluded</key>
    	<array>
    		<string>/private/var/log</string>
    		<string>/private/var/spool/cups</string>
    		<string>/private/var/spool/fax</string>
    		<string>/private/var/spool/uucp</string>
    	</array>
    	<!-- standard user paths we want to skip for each user (subpath relative to root of home directory) -->
    	<key>UserPathsExcluded</key>
    	<array>
    		<string>Library/Application Support/SyncServices/data.version</string>
    		<string>Library/Application Support/Ubiquity</string>
    		<string>Library/Caches</string>
    		<string>Library/Logs</string>
    		<string>Library/Mail/Envelope Index</string>
    		<string>Library/Mail/Envelope Index-journal</string>
    		<string>Library/Mail/AvailableFeeds</string>
    		<string>Library/Mail/Metadata/BackingStoreUpdateJournal</string>
    		<string>Library/Mail/V2/MailData/Envelope Index</string>
    		<string>Library/Mail/V2/MailData/Envelope Index-journal</string>
    		<string>Library/Mail/V2/MailData/AvailableFeeds</string>
    		<string>Library/Mail/V2/MailData/BackingStoreUpdateJournal</string>
    		<string>Library/Mail/V2/MailData/Envelope Index-shm</string>
    		<string>Library/Mail/V2/MailData/Envelope Index-wal</string>
    		<string>Library/Mirrors</string>
    		<string>Library/PubSub/Database</string>
    		<string>Library/PubSub/Downloads</string>
    		<string>Library/PubSub/Feeds</string>
    		<string>Library/Safari/Icons.db</string>
    		<string>Library/Safari/WebpageIcons.db</string>
    		<string>Library/Safari/HistoryIndex.sk</string>
    	</array>
    </dict>
    </plist>

So we can see four categories:
1- paths we do not want to include in a system backup
2- paths where we need to capture top level folder to restore disk structure, but don’t want to backup any contents
3- paths where we need to capture entire subtree folder layout to restore disk structure, but don’t want to backup contained files
4- standard user paths we want to skip for each user (subpath relative to root of home directory)
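
To work with these four categories, the entries of each array first have to be pulled out of the plist. Here is a minimal sketch of that step. The function name `plist_array` is made up for this example, and it assumes the layout shown in the listing above (one `<string>` per line, no XML entities); a real plist parser would be more robust.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the <string> entries of the array that follows
# <key>$1</key>, reading a StdExclusions-style XML plist on stdin.
# Line-based sketch only; it does not decode XML entities such as &amp;.
plist_array() {
    awk -v key="$1" '
        # Start collecting once the requested key is seen
        index($0, "<key>" key "</key>") { in_key = 1; next }
        # Stop at the end of that key’s array
        in_key && /<\/array>/           { exit }
        # Print the text between <string> and </string>
        in_key && match($0, /<string>[^<]*<\/string>/) {
            print substr($0, RSTART + 8, RLENGTH - 17)
        }
    '
}
```

For example, `plist_array ContentsExcluded < StdExclusions.plist` would print `/Volumes`, `/Network` and so on, one path per line.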

And we must add a fifth one: the application file exclusions. On OS X, developers can specify in a file’s metadata whether a file of their app should be left out of backups. To find these files, we must use the following command:

$ sudo mdfind "com_apple_backup_excludeItem = 'com.apple.backupd'"

For categories 1, 2, and 4, the translation from XML to a restic exclude list should be easy:
1- /.MobileBackups becomes /.MobileBackups
2- /Volumes becomes /Volumes/*
4- Library/Application Support/SyncServices/data.version becomes
/Users/*/Library/Application Support/SyncServices/data.version
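
The three mappings above can be sketched as a small bash function. This is only an illustration: the function name `to_restic_pattern` is made up, the category labels simply reuse the plist keys, and the paths are assumed to have been extracted from the plist already.

```bash
#!/usr/bin/env bash
# Hypothetical helper: turn one StdExclusions.plist entry into a restic
# exclude pattern, following the three mappings described in the post.
to_restic_pattern() {
    local category="$1" path="$2"
    case "$category" in
        PathsExcluded)     printf '%s\n' "$path" ;;           # 1: exclude the path itself
        ContentsExcluded)  printf '%s/*\n' "$path" ;;         # 2: keep the folder, drop its contents
        UserPathsExcluded) printf '/Users/*/%s\n' "$path" ;;  # 4: apply to every home directory
        *) printf 'unsupported category: %s\n' "$category" >&2; return 1 ;;
    esac
}

to_restic_pattern PathsExcluded '/.MobileBackups'     # prints /.MobileBackups
to_restic_pattern ContentsExcluded '/Volumes'         # prints /Volumes/*
to_restic_pattern UserPathsExcluded 'Library/Caches'  # prints /Users/*/Library/Caches
```

Category 3 (FileContentsExcluded) is deliberately rejected here, because it has no such one-line translation.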

5- The app exclude list should be obtained by

$ sudo mdfind "com_apple_backup_excludeItem = 'com.apple.backupd'" >> resticexcludes.txt

I’m still facing two problems:
A - How can I translate category 3, "paths where we need to capture entire subtree folder layout to restore disk structure, but don’t want to backup contained files"?
B - What if I install a new app? Should I generate the app file exclude list before each backup?

Thanks for your help
Regards

I don’t know about Mac because I use Linux, but here is my backup script. I hope it helps.

    #!/usr/bin/env bash

    # Set the repository password
    export RESTIC_PASSWORD='yourresticpassword'

    # Set the restic repository location
    export RESTIC_REPOSITORY='/path/to/your/repo'

    # Remove stale locks left by an interrupted run
    restic unlock

    # Back up the home directory, excluding unwanted directories
    # (Downloads, Desktop, .dbus and the caches are skipped)
    restic backup /home/USER --tag TAG --verbose \
        --exclude='/home/USER/Downloads' \
        --exclude='/home/USER/Desktop' \
        --exclude='/home/USER/.dbus' \
        --exclude='/home/USER/.cache' \
        --exclude='/home/*/.cache/*'

    # Check that the data is correctly stored in the repo
    restic check
    restic snapshots

    # Remove old snapshots according to the retention policy
    restic forget \
        --keep-hourly 8 \
        --keep-daily 7 \
        --keep-weekly 4 \
        --keep-monthly 6 \
        --keep-yearly 10

    # Delete the data of the forgotten snapshots
    restic prune

    # Clear the credentials from the environment
    unset RESTIC_PASSWORD

    exit 0

Thanks for your script, that’ll be useful once my exclude list is complete.

Perhaps you know how to translate this sentence into bash:
"paths where we need to capture entire subtree folder layout to restore disk structure, but don’t want to backup contained files"

In fact I’m looking for a way to do that, which means I have to keep the whole directory tree with no files inside. I thought

$ find /private/var/log -type d >> resticexcludes.txt

should do the trick, but it’s obviously the other way around: that excludes every directory along with its content, whereas I want to include these directories without their content.

Finally, the real question here seems to be: how do I back up an empty directory tree, without the files?
Everything in the exclude options seems to be about files. What setup should I use to make restic back up the directories but not the files inside them?

I really wish I could help more but I don’t really have an answer about that. In fact, looking at your question right now I don’t even know if my first answer was really related to your post.

I’m not sure how you’d do that with an exclude file. However, a workaround that might work would be a post-restore shell script that re-creates those folders. The idea is that if you can’t find a way to have restic keep only the subtree folder layout (no files), a script could create the folders for you after you run restic restore.
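
A rough sketch of that workaround, under a few assumptions: both function names are made up, the directory list has to be saved somewhere that is itself backed up, and `mkdir -p` only recreates the folders, not their original owners or permissions.

```bash
#!/usr/bin/env bash
# Sketch of the post-restore workaround: record the directory layout before
# the backup, then recreate it after a restore. Function names are hypothetical.

# Write every directory under $1 (one per line) to the list file $2.
save_dir_layout() {
    find "$1" -type d > "$2"
}

# Recreate each directory listed in $1; existing ones are left untouched.
# Ownership and permissions are NOT restored by this sketch.
restore_dir_layout() {
    local dir
    while IFS= read -r dir; do
        mkdir -p "$dir"
    done < "$1"
}
```

Before the backup one would run something like `save_dir_layout /private/var/log layout.txt`, and after `restic restore`, `restore_dir_layout layout.txt`.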

I’m planning to build the exclude list each time before backup with

$ find /private/var/log -type f > excludelist.txt

and then to feed this exclude list to restic.
I think this will do the trick, as every file in these paths is going to be excluded, but I end up with an exclude list of about 2000 lines. Is that a problem for restic? Does it impact backup speed?
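
As a sketch of how that rebuild step could look for several directories at once: the function name `build_file_excludes` is hypothetical, and the directories passed in would be the FileContentsExcluded entries. One caveat to keep in mind is that restic reads exclude entries as patterns, so a file name containing characters such as `*` or `[` would probably not be matched literally.

```bash
#!/usr/bin/env bash
# Sketch: rebuild the file-level exclude list from scratch before each backup.
# build_file_excludes is a hypothetical name. Usage:
#   build_file_excludes OUTPUT_FILE DIR [DIR...]
build_file_excludes() {
    local out="$1"; shift
    : > "$out"                          # truncate, so stale entries never pile up
    local dir
    for dir in "$@"; do
        [ -d "$dir" ] || continue       # skip directories missing on this machine
        find "$dir" -type f >> "$out"   # list files only; directories stay in the backup
    done
}
```

For example, `build_file_excludes excludelist.txt /private/var/log /private/var/spool/cups`, followed by a `restic backup` run with `--exclude-file=excludelist.txt`.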

I’m not sure. I recommend benchmarking it with the time command to see what impact it has. Restic is pretty efficient (barring network bottlenecks) so I doubt it will have a major impact. I’m not familiar enough with the source code to say for sure, but testing with and without the excludes file will give you the best idea as it will reflect results within your environment on your hardware.
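
A possible way to script that comparison is sketched below. It uses `date +%s` instead of the `time` command itself, only so the result is easy to capture in a variable; `elapsed_seconds` is a made-up helper name, and the restic invocations are left as comments because they need a configured repository.

```bash
#!/usr/bin/env bash
# Sketch: run a command and print how many whole seconds it took.
# elapsed_seconds is a hypothetical helper; the command’s own output is discarded.
elapsed_seconds() {
    local start end
    start=$(date +%s)
    "$@" > /dev/null 2>&1
    end=$(date +%s)
    echo $(( end - start ))
}

# Example comparison (commented out, needs RESTIC_REPOSITORY and a password):
# elapsed_seconds restic backup /private/var/log
# elapsed_seconds restic backup /private/var/log --exclude-file=excludelist.txt
```

It is probably worth running each variant more than once, since caching and data already present in the repository can skew a single measurement.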

If anyone is interested, I’ve made a quick and dirty script to build restic excludes for Mac, inspired by the Time Machine excludes.

You can find the scripts at https://gist.github.com/Vartkat/1b097d8e9a6ad648bd3a356be86d97af

Unfortunately it looks like this file no longer exists on 10.12.