Restic check --read-data --read-data-subset 1/12

kitwo · March 2, 2020, 3:32pm

Hiya,

I tried to get my head around this by reading the various forum posts out there.

If I’ve understood the command correctly:

restic check --read-data --read-data-subset 1/12

Will divide my 2TB worth of data into 12 sections, and read the 1st section to verify its integrity. If I ran my script once a week, then all of my data should be checked every 3 months (12 weeks).

The part that confuses me is this: if the next section is, e.g., 2/12, how would restic know to check it automatically when I run the same command?

I don’t see how, but wanted to double check whether I need to increment the section ‘n’ each week, or whether it is normal to only check one portion of the data each time, e.g. 1/12?

Many thanks,

kit

rawtaz · March 2, 2020, 4:18pm

Which of the 12 sections restic checks is determined by the first number you give it in the n/t argument. If you give it 1/12 it will check the first section, if you give it 2/12 it will check the second section, etc. So it’s not automatic: you explicitly tell it which section to check (and how many sections to split the data into). See the bottom part of "Working with repositories" in the restic 0.16.3 documentation.
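
To make that concrete, here is a minimal sketch (not taken from the restic docs) that only prints the twelve commands you would have to run, one per scheduled run, to cover the whole repository. Nothing is executed, and the function name print_check_commands is made up for illustration. It passes --read-data-subset on its own; as far as I know, restic refuses to combine it with --read-data.

```shell
#!/bin/sh
# Dry run: print one check command per subset; nothing is executed.
# The argument is the "t" in n/t; 12 matches the example in this thread.
print_check_commands() {
  total=$1
  n=1
  while [ "$n" -le "$total" ]; do
    echo "restic check --read-data-subset=$n/$total"
    n=$((n + 1))
  done
}

print_check_commands 12
```

Running the same n every time would re-read the same section and never touch the other eleven.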

kitwo · March 4, 2020, 7:21am

Thank you Rawtaz,

Thank you for clarifying that. Would you be able to suggest whether it’s sufficient to run check --read-data-subset n/52 once each week? I figured it would read most of the data each year, excluding new data that has been added.

Or should all of the data be read say every 3 months?

I opted to check it each week for 52 weeks, as I found a simple way to work out the week number, but didn’t find anything on incrementing each week for say n/12 sections. Would appreciate any pointers on that.

Kind regards,

Kit

764287 · March 5, 2020, 8:42am

There are no general rules for things like that; it always depends on your use case. You need to ask yourself some questions:

How important is the data that you are backing up?
How much is changing?
Would check be interrupting your scheduled backups (as it requires an exclusive lock) too much?
Will it be catastrophic to find out that some hardware failure made the last 3 months of your backups (or parts of them) useless? Or do you have a 2nd set of independent backups anyway?

kitwo · March 5, 2020, 11:25am

Ok, I’ll have a think about that,

many thanks,

Kit

ifedorenko · March 5, 2020, 4:18pm

FWIW, here is the bash script I use to do a rolling check of all pack files over a 30-day period:

# failure email config
NAME=$(hostname)
EMAIL=...

# all data is checked over CHECK_PERIOD number of days
# current day calculated as 1 + (days since 1970-01-01 modulo $CHECK_PERIOD), i.e. 1..$CHECK_PERIOD
# (date +%s returns seconds since 1970-01-01 00:00:00 UTC)
CHECK_PERIOD=30
CHECK_DAY=$(( 1 + ($(date +%s) / 86400) % $CHECK_PERIOD ))

LOG=/var/log/restic-backup.log
RESTIC=/usr/local/bin/restic

printf '\nrestic check data start %s\n' "$(date +%Y%m%d_%H%M%S)" >> "$LOG"
$RESTIC \
  --cache-dir /var/cache/restic \
  check --with-cache --read-data-subset=$CHECK_DAY/$CHECK_PERIOD >> "$LOG" 2>&1

# $? still holds the exit status of the restic command above; non-zero means the check failed
if [ $? -ne 0 ]; then
  mail -s "$NAME backup check failed" "$EMAIL" <<EOF
...
$(tail "$LOG")
EOF
  exit 1
fi
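
For the weekly n/12 schedule asked about earlier in the thread, the same modulo trick should work with weeks instead of days. This is only a sketch, assuming the job runs once a week from cron; weekly_subset is a hypothetical helper name, and the restic command is echoed rather than executed so the sketch can be tried without a repository.

```shell
#!/bin/sh
# weekly_subset EPOCH_SECONDS TOTAL
# Prints n in 1..TOTAL, advancing by one every 7 days (604800 seconds).
# Weeks are counted from 1970-01-01, a Thursday, so n changes on Thursdays 00:00 UTC.
weekly_subset() {
  echo $(( 1 + ($1 / 604800) % $2 ))
}

CHECK_PERIOD=12
CHECK_WEEK=$(weekly_subset "$(date +%s)" "$CHECK_PERIOD")

# Dry run: drop the leading "echo" to actually run the check.
echo restic check --read-data-subset="$CHECK_WEEK/$CHECK_PERIOD"
```

With CHECK_PERIOD=12 every section comes up once per 12 weeks, provided the job really runs every week; a skipped week leaves that section unread until the next cycle.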

kitwo · March 7, 2020, 3:57pm

Thank you for sharing this, really appreciate it