Feature request: --sort-by

Something like: --sort-by [mtime|size|path|ext][,asc|desc].

This option could take one of several forms, or eventually a combination of them:

  • A single option.
  • Multiple hierarchical options, e.g.:
    • Sort first by extension, then by size
  • Multiple option groups, e.g.:
    • Sort first by age groups increasing by 10x (e.g. 1-10 days old, 10-100 days, 100-1000 days, etc.)
    • Then within each of those groupings, by size
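
To make the hierarchical and grouped forms above concrete, here is a rough Python sketch of how they could be expressed as sort keys. None of this is Restic code; `Entry`, `age_bucket`, `by_ext_then_size`, and `by_age_bucket_then_size` are hypothetical names made up for illustration, and files under one day old are assumed to fall into the first group.

```python
import os
from typing import List, NamedTuple

SECONDS_PER_DAY = 86400.0


class Entry(NamedTuple):
    """Hypothetical record for one file found during a scan."""
    path: str
    size: int     # bytes
    mtime: float  # seconds since the epoch


def age_bucket(mtime: float, now: float) -> int:
    """Return 0 for files under 10 days old, 1 for 10-100 days, 2 for 100-1000 days, etc."""
    age_days = (now - mtime) / SECONDS_PER_DAY
    bucket = 0
    limit = 10.0
    while age_days >= limit:
        bucket += 1
        limit *= 10.0
    return bucket


def by_ext_then_size(entries: List[Entry]) -> List[Entry]:
    """Hierarchical sort: first by extension, then by size (both ascending)."""
    return sorted(
        entries,
        key=lambda e: (os.path.splitext(e.path)[1].lower(), e.size),
    )


def by_age_bucket_then_size(entries: List[Entry], now: float) -> List[Entry]:
    """Grouped sort: youngest 10x age group first, then by size within each group."""
    return sorted(entries, key=lambda e: (age_bucket(e.mtime, now), e.size))
```

Because Python compares tuples element by element, each extra sort level is just one more element in the key tuple, which is roughly what a comma-separated list of sort options would map to.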

Based on this comment, I understand this wouldn’t be easy, and isn’t how Restic was designed. It appears that Restic simply walks the directory tree and backs up files as it encounters them. (?) This also appears to be how most FLOSS backup solutions work, but not how most commercial all-in-one “backup product and cloud storage” solutions work. Crashplan and Carbonite, for example, back up newest first (actually, Crashplan does some combination of newest and smallest first).

My rationale (I suspect I’m not alone):

  • The older my data gets, the more ways I already have it backed up, both locally and in the cloud. I really don’t want to waste time, space, and money backing up stuff first that’s already backed up 20 ways to Sunday.
  • Most of my data is organized in folder structures like “…/YYYY/YYYYMMDD/…”. With quasi-alphabetical directory crawling, my oldest stuff would get backed up first.
  • Not even considering those factors, my newest data is simply more important. (E.g. active working projects that generate income. You’re more likely to get audited for recent tax years than past. Etc.)
  • With 7TB of data to back up, and growing ~1.5x per year, I can’t wait 1-2 years to back up my most recent changes to cloud storage!

I realize I can get creative with things like an ever-updating configuration of mount --bind and exclude options, to get small blocks of newest stuff backed up first. And I could live with that. But I’d really like a “set-and-forget” solution that kept my newest stuff backing up. That’s one of the many things I loved about Crashplan, backing up newest first. (Among all of the awful things, like the rut of disastrous business decisions Code42 seems to be stuck in lately which has me convinced they won’t survive much longer.)

How does Crashplan do it? It does a full filesystem scan once a day (or however frequently you define). In addition to that, it subscribes to filesystem change notifications.

Note that, to some extent, you can do this yourself (assuming a Unixy environment) using e.g. find -type f -mtime -30 to only list files that were last modified in the last 30 days, and combine this with restic backup --files-from.
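
For anyone who would rather script this than chain shell commands, here is a minimal Python sketch of the same idea: collect regular files modified within the last N days, newest first, and write them one per line to a file that can be passed to restic backup --files-from. The function names `recent_files` and `write_file_list` are hypothetical, the cutoff only roughly mirrors what find -mtime -30 selects, and the sketch makes no assumption about the order in which restic actually processes the listed files. Paths with unusual characters may need extra care, depending on how the restic version in use interprets each line.

```python
import os
import stat
import time
from typing import List, Optional


def recent_files(root: str, max_age_days: float, now: Optional[float] = None) -> List[str]:
    """Return regular files under root modified within max_age_days, newest first."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path, follow_symlinks=False)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if stat.S_ISREG(st.st_mode) and st.st_mtime > cutoff:
                found.append((st.st_mtime, path))
    # Newest first; ties are broken by path so the output is stable.
    found.sort(key=lambda item: (-item[0], item[1]))
    return [path for _mtime, path in found]


def write_file_list(paths: List[str], list_path: str) -> None:
    """Write one path per line, for use as a --files-from style list."""
    with open(list_path, "w", encoding="utf-8") as fh:
        for path in paths:
            fh.write(path + "\n")
```

A possible usage would be calling `write_file_list(recent_files("/data", 30), "/tmp/recent.txt")` and then running restic backup --files-from /tmp/recent.txt, where both paths are placeholders.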

Bingo! That’s totally the solution. I’ve already been working on a separate solution for generating --files-from style output (for rsync but would also work for this), from an input definition file of include & exclude regexes and other declarations - basically the first bullet points section at the top of this post. Thanks a million.