Restic backup deletes from data directory

We recently started using Restic to push a fairly large dataset (around 3T) to S3 for archival storage. These are filesystem backups from about 60 individual Linux servers, which are pushed to a central server three times a week. Once that completes, the central server uses restic to push the entire set up to S3.

Because this is a fairly high-security setup (some of this data falls under PCI compliance rules), I want to ensure backup data is not accidentally deleted. So I’ve started setting up a CloudWatch Event + Lambda that triggers on any DeleteObject event in the bucket. If the deleted object is in the locks folder, the Lambda exits immediately; if it’s anywhere else in the bucket, it raises an alert for us to examine.

However, during my initial testing of this Lambda, it detected a delete in the data directory: data/ec/ec12297407d6ac25c8d35c0a220fc57cfe2a78bc9404aca2adb848e107b4194d. At the time this occurred, the only action running against the repository was a “restic backup” command that started earlier this morning.
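For illustration, here is a minimal sketch of the kind of filter described above. It is not the actual Lambda. It assumes the event arrives in the CloudTrail-via-CloudWatch-Events shape, with the bucket and key under detail.requestParameters, and that the repository sits at the bucket root so lock files live under locks/. The names classify_delete and LOCKS_PREFIX are made up for this example, and the print call stands in for whatever alerting is actually used.

```python
# Hypothetical sketch: decide whether a DeleteObject event deserves an alert.
# Assumption: the restic repository is at the bucket root, so locks are "locks/...".
LOCKS_PREFIX = "locks/"


def classify_delete(event):
    """Return "alert" for deletes outside the locks folder, else "ignore"."""
    detail = event.get("detail") or {}
    if detail.get("eventName") != "DeleteObject":
        return "ignore"  # not a delete, nothing to do
    params = detail.get("requestParameters") or {}
    key = params.get("key", "")
    if key.startswith(LOCKS_PREFIX):
        return "ignore"  # lock files are removed routinely
    return "alert"  # anything else (data/, index/, snapshots/, ...) is unexpected


def lambda_handler(event, context):
    verdict = classify_delete(event)
    if verdict == "alert":
        params = (event.get("detail") or {}).get("requestParameters") or {}
        # Placeholder for a real notification (SNS, email, pager, ...).
        print("ALERT: unexpected delete of %s in bucket %s"
              % (params.get("key"), params.get("bucketName")))
    return {"verdict": verdict}
```

With this shape, a delete of locks/abc is ignored, while a delete of the data/ec/... object mentioned above would produce an alert.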

As I understand it, restic’s backup command should only add data to the repository, never remove it; data should not be deleted until a prune/forget/etc. is run. So I can’t explain why this occurred, and if it’s normal behavior, I can’t set up this alert as I had hoped.

This is correct.

It would be helpful if you could find a consistent reproduction of the problem. We could then help determine if it is a bug in restic or an incorrect invocation.

Note also that object storage like S3 uses an “eventual consistency” model where operations may not immediately be applied to storage. The operations return success only when the operation has been committed to a durable queue, not necessarily when the change is visible in the bucket. It’s possible that a deletion was delayed for some amount of time, though this is unlikely.

Indeed, @cdhowie is right: restic should not delete things from the data directory during backup. However, I can imagine that the client library we’re using for accessing S3 may retry a failed upload and, before retrying, remove the partially uploaded file. Apart from cases like that, it is a bug if restic removes data files during backup.

If you’re interested in how the repository format works, we’ve a design document here: