Best way to back up Google Takeout ZIP files?

Hi, I’ve downloaded my Google data using https://takeout.google.com/ and I’m wondering what’s the optimal way to back this up to a Restic repository?

The download from Google is a multi-gigabyte ZIP file containing lots of files: emails, photos, etc.

When I download successive takeouts from Google over time, many of the files within the ZIPs will presumably be the same (for example each successive ZIP will contain all the same photos, plus any new ones that I’ve taken since). But each successive ZIP file taken as a whole will be different from the previous ZIP.

For the sake of Restic’s deduplication, would it make sense to extract the ZIPs locally first and back up the extracted contents to a Restic repo? Or is it fine to just back up the ZIP files directly?

Same question applies to downloads of your Apple data from privacy.apple.com.

Thanks!

“Optimal way” needs consideration: what do you optimise for?

  • Backup space: Decompress the ZIPs, then deduplicate with restic. Unchanged blocks then consume no additional space (beyond a pointer to the existing block content).

  • Backup time: It might be worth considering not using restic at all. The ZIP is already compressed, so a backup takes only the time needed to copy it, as opposed to lengthy deduplication.

  • Laptop battery time: Ditto. Skipping deduplication saves power.

  • Easy access to individual files: Decompress ZIP, then use restic. That way you can retrieve each file without processing a whole ZIP, and can search across backups.
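
If you go the "decompress, then restic" route, it could look roughly like the sketch below. This is only an illustration, not a tested workflow: the helper names (extract_takeout, restic_backup_command, backup_takeout), the "takeout" tag, and any paths you pass in are my own assumptions. It also assumes restic is on your PATH, the repository is already initialised, and the password is supplied via the environment (e.g. RESTIC_PASSWORD_FILE).

```python
import subprocess
import zipfile
from pathlib import Path


def extract_takeout(zip_paths, dest_dir):
    """Extract every ZIP into dest_dir and return the extracted file paths."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    for zip_path in zip_paths:
        with zipfile.ZipFile(zip_path) as archive:
            archive.extractall(dest)
    return sorted(p for p in dest.rglob("*") if p.is_file())


def restic_backup_command(repo, source_dir, tag="takeout"):
    """Build the restic command line; the tag name is an arbitrary choice."""
    return ["restic", "-r", str(repo), "backup", str(source_dir), "--tag", tag]


def backup_takeout(zip_paths, staging_dir, repo):
    """Extract the ZIPs, then back up the extracted tree with restic."""
    extract_takeout(zip_paths, staging_dir)
    # check=True raises CalledProcessError if restic exits with an error.
    subprocess.run(restic_backup_command(repo, staging_dir), check=True)
```

You would call it with something like backup_takeout(list_of_zip_paths, "/path/to/staging", "/path/to/repo"). Files left in the staging directory from an earlier run are included in the next snapshot too, so start from an empty directory if each snapshot should mirror only the newest takeout.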

HTH