Restic never finds a parent snapshot

I’ve recently started using restic and, after some teething problems, things have settled in nicely. However, running at verbosity level 2 (-v -v) gives me lots of info, and one of the things I’ve noticed for every snapshot is this line:

no parent snapshot found, will read all files

I looked at various posts regarding parent snapshots, and this one:

seemed to have a list of things by fd0 that I could check for, which I’ve done and everything seems to be fine.

My operating system is Ubuntu 16.04 on a small Rackspace server, and I’m only backing up two directory paths:

/data/www/*
/home/nexargi/GI/mysql/*

which are in an include file.

My server name hasn’t changed.

I’m running restic via a cron job and a bash script which has a number of variables, but essentially the restic backup command is in the following format:

restic backup -v -v --tag main --files-from ${rbkup}restic-includefiles.txt --repository-file $rrepo --password-file ${rbkup}rchuparustom

Below is the output of cat snapshot for the most recent snapshot, which may prove useful:

…/restic/restic --repository-file restic-repofile.txt --password-file rchuparustom cat snapshot 4547713d
repository 5e1dceea opened successfully, password is correct
{
  "time": "2021-05-02T03:00:01.329523007Z",
  "tree": "5c353287007b9801afd20773ca22fa27d777561ebbd74e9c78ecaf574d7d82ba",
  "paths": [
    "/data/www/gi",
    "/data/www/gi20140224_1612.tar.gz",
    "/data/www/gi20150126.tar.gz",
    "/data/www/r5",
    "/data/www/r5.tar.gz",
    "/data/www/r5_old",
    "/home/nexargi/GI/mysql/cronerrlog.txt",
    "/home/nexargi/GI/mysql/cronlog.txt",
    "/home/nexargi/GI/mysql/cronlog20170326.txt",
    "/home/nexargi/GI/mysql/cronlog20170327.txt",
    "/home/nexargi/GI/mysql/cronlog20170328.txt",
    "/home/nexargi/GI/mysql/cronlog20170329.txt",
    "/home/nexargi/GI/mysql/cronlog20191225.txt",
    "/home/nexargi/GI/mysql/cronlog20210425.txt",
    "/home/nexargi/GI/mysql/cronlog20210426.txt",
    "/home/nexargi/GI/mysql/cronlog20210427.txt",
    "/home/nexargi/GI/mysql/cronlog20210428.txt",
    "/home/nexargi/GI/mysql/cronlog20210429.txt",
    "/home/nexargi/GI/mysql/cronlog20210430.txt",
    "/home/nexargi/GI/mysql/gidatadump.sh",
    "/home/nexargi/GI/mysql/giproddata20140222_BeforeDelTrialdatabase.sql.tar.gz",
    "/home/nexargi/GI/mysql/giproddata20140222_NewServerBeforeStart.sql.tar.gz",
    "/home/nexargi/GI/mysql/giproddata20140222_OldServerFinal.sql.tar.gz",
    "/home/nexargi/GI/mysql/giproddata20170326.sql.tar.gz",
    "/home/nexargi/GI/mysql/giproddata20170327.sql.tar.gz",
    "/home/nexargi/GI/mysql/giproddata20170328.sql.tar.gz",
    "/home/nexargi/GI/mysql/giproddata20170329.sql.tar.gz",
    "/home/nexargi/GI/mysql/giproddata20170426_1616.sql",
    "/home/nexargi/GI/mysql/giproddata20170426_1616.sql.tar.gz",
    "/home/nexargi/GI/mysql/giproddata20180307_1810.sql",
    "/home/nexargi/GI/mysql/giproddata20180307_1810.sql.tar.gz",
    "/home/nexargi/GI/mysql/giproddata20191225.sql.tar.gz",
    "/home/nexargi/GI/mysql/giproddata20210426.sql.tar.gz",
    "/home/nexargi/GI/mysql/giproddata20210427.sql.tar.gz",
    "/home/nexargi/GI/mysql/giproddata20210428.sql.tar.gz",
    "/home/nexargi/GI/mysql/giproddata20210429.sql.tar.gz",
    "/home/nexargi/GI/mysql/giproddata20210430.sql.tar.gz",
    "/home/nexargi/GI/mysql/giproddata20210501.sql.tar.gz",
    "/home/nexargi/GI/mysql/giproddatatest.sql"
  ],
  "hostname": "server-02",
  "username": "nexargi",
  "uid": 1001,
  "gid": 1001,
  "tags": [
    "main"
  ]
}

Currently my backup needs are small, so reading each file again is not a major issue, but I’m hoping to ramp things up and would rather resolve this now.

All help gratefully received with thanks.

Using --files-from with a changing input file prevents restic from choosing a parent. Why don’t you simply give restic these paths to back up?

Note that there is also

which would allow you to manually change the paths (and thus allow restic to find a parent).

@alexweiss thanks very much for your prompt response. You said:

Using --files-from with a changing input file prevents restic from choosing a parent

But I’m not changing the input file. It’s the same input file. The reason I’m using an input file is that, longer term, I’m going to have MANY MORE paths, so I was writing a script that would let me change things in the file without having to update the script all the time.

I could understand if my paths inside the input file were changing but currently they are not.

Hope I’m understanding you correctly and also answering appropriately.

I’m also wondering what the difference is between --files-from and --set-paths-from? I must admit I hadn’t looked at --set-paths-from.


Looking at the PR for --set-paths-from it would appear that it still has not been merged into the main branch. Given my relative inexperience in this arena I really can’t afford to take anything other than what the main branch provides.

Thanks very much though, for your efforts.

Sorry for misleading you about your --files-from option. Of course, this is equivalent to putting all the paths on the command line.

In your case the problem is that

/data/www/*
/home/nexargi/GI/mysql/*

resolve to all entries within these directories. So if the files or directories within those two directories change, the backup paths have changed and restic will not find a parent snapshot.

Try using

/data/www/
/home/nexargi/GI/mysql/

instead. (The same could be achieved using the PR above, but you are right, this is still experimental.)
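
For an include file that will grow over time, a small check can catch wildcard lines before they cause this again. This is only a sketch: the helper name wildcard_lines and the temporary include file are hypothetical, nothing here calls restic, and it assumes (as described above) that any line containing *, ? or [ expands to a list of entries that can change between runs.

```shell
#!/bin/sh
# Sketch: flag lines in an include file that contain wildcard characters
# (*, ? or [). The helper name and the temporary file are hypothetical.

# wildcard_lines FILE: print every line of FILE that contains a wildcard,
# prefixed with its line number.
wildcard_lines() {
    grep -n '[*?[]' "$1"
}

# Example include file: one wildcard entry, one plain directory entry.
include_file=$(mktemp)
cat > "$include_file" <<'EOF'
/data/www/*
/home/nexargi/GI/mysql/
EOF

flagged=$(wildcard_lines "$include_file")
echo "lines with wildcards:"
echo "$flagged"

rm -f "$include_file"
```

Only the first line of the example file is reported; the plain directory entry passes.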

@alexweiss Aaah I see the problem about having * at the end of the path. I hadn’t realised the significance although for all intents and purposes surely the path with and without the * should be considered equal. Perhaps a Change Request?

These are not at all equal. The one without the * means “that folder”, and the one with * means “all the things in that folder”. So in the context of restic, the former resolves to one item and the latter resolves to multiple items, which can also change over time, as @alexweiss pointed out. So there’s nothing to fix here; this is something one has to manage, and it’s the same regardless of whether you use restic or some other software.
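
A minimal sketch of that difference, using a scratch directory with made-up entries (nothing here calls restic): the directory argument stays the same from one run to the next, while the wildcard expansion follows whatever the directory contains at that moment.

```shell
#!/bin/sh
# Sketch: dir/ is always one fixed argument, while dir/* expands to the
# directory's current entries. Scratch directory and names are illustrative.

tmp=$(mktemp -d)
mkdir "$tmp/www"
touch "$tmp/www/gi" "$tmp/www/r5"

# What a command would receive on the first run.
dir_before="$tmp/www/"
glob_before=$(echo "$tmp/www/"*)

# A new file appears before the next run.
touch "$tmp/www/r5.tar.gz"

# What a command would receive on the second run.
dir_after="$tmp/www/"
glob_after=$(echo "$tmp/www/"*)

[ "$dir_before" = "$dir_after" ] && echo "directory argument: unchanged"
[ "$glob_before" != "$glob_after" ] && echo "wildcard expansion: changed"

rm -rf "$tmp"
```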

@rawtaz restic takes ‘that folder’ to mean everything in that folder, which is also what ‘*’ means.

I truly am not understanding the subtle difference here but since not having the ‘*’ works I’m happy to leave it at that.

I would however request that the documentation perhaps clarifies that a bit better.

It’s quite simple: The * means “all items”, so when you put that into the context of foo/* that means “all items in the foo/ directory”. While at the same time foo/ only means “the foo directory”.

I understand that, but at the same time you have to understand that this is absolute fundamentals of regular path management. This is not something that has to do with restic in any way, this is about syntax that is dictated by your shell and how it lets you specify paths and glob/match paths. We cannot turn the restic documentation into an education platform for basics of using one’s shell and other similar basics, that’s why we don’t add things like this. It would be quite large and take focus from the restic-specific things. FWIW we already have a bit of non-restic specific information in there, but there’s a limit to how “low” or generic we can and should go.

@rawtaz Sure I understand. Thanks for the explanation.

Thanks. And I apologize if I came across as rude or similar. Re-reading my text, it might seem a bit condescending; that wasn’t my intention.

@rawtaz No problem. Always appreciate your comments.

tl;dr: I think the confusion lies in using wildcards to specify the list of file/directory paths to back up and not realizing that restic uses the expansion of those paths to group sets of related backups.

This is kind of long, but I’m hoping to provide clarity through verbosity.

Given the following files in the directory foo in my home directory:

$ find foo -type f
foo/eep/op
foo/flim/flam
foo/bar/blargh
foo/bar/boo
foo/bar/baz

Because restic backs up recursively, if I want to back up all of foo/bar I only need to specify ~/foo/bar:

$ restic backup -v -r ~/repo/test ~/foo/bar
open repository
enter password for repository:
repository e55f0b71 opened successfully, password is correct
created new cache in /Users/jdwhite/Library/Caches/restic
lock repository
load index files
start scan on [/Users/jdwhite/foo/bar]
start backup on [/Users/jdwhite/foo/bar]
scan finished in 0.219s: 3 files, 15 B

Files:           3 new,     0 changed,     0 unmodified
Dirs:            4 new,     0 changed,     0 unmodified
Data Blobs:      1 new
Tree Blobs:      5 new
Added to the repo: 2.550 KiB

processed 3 files, 15 B in 0:00
snapshot 1206987f saved

Note the snapshot ID, 1206987f.
Now I’m going to add a file in foo/bar called boink and do the backup again:

$ restic backup -v -r ~/repo/test ~/foo/bar
open repository
enter password for repository:
repository e55f0b71 opened successfully, password is correct
lock repository
load index files
using parent snapshot 1206987f    <=== Note this snapshot matches output above.
start scan on [/Users/jdwhite/foo/bar]
start backup on [/Users/jdwhite/foo/bar]
scan finished in 0.229s: 4 files, 22 B

Files:           1 new,     0 changed,     3 unmodified
Dirs:            0 new,     4 changed,     0 unmodified
Data Blobs:      1 new
Tree Blobs:      5 new
Added to the repo: 2.913 KiB

processed 4 files, 22 B in 0:00
snapshot 11bd4d5d saved

Now I list snapshots:

$ restic snapshots -r ~/repo/test
enter password for repository:
repository e55f0b71 opened successfully, password is correct
ID        Time                 Host                      Tags        Paths
-------------------------------------------------------------------------------------------
1206987f  2021-05-03 17:19:54  caprica.home.menelos.com              /Users/jdwhite/foo/bar
11bd4d5d  2021-05-03 17:24:34  caprica.home.menelos.com              /Users/jdwhite/foo/bar
-------------------------------------------------------------------------------------------
2 snapshots

Hopefully this is as expected.
Now I’m going to delete foo/bar/boink and back up the same files again, but to a second test repository, using the wildcard suffix to specify paths:

$ restic backup -v -r ~/repo/test2 ~/foo/bar/*
open repository
enter password for repository:
repository cb51bdf5 opened successfully, password is correct
created new cache in /Users/jdwhite/Library/Caches/restic
lock repository
load index files
start scan on [/Users/jdwhite/foo/bar/baz /Users/jdwhite/foo/bar/blargh /Users/jdwhite/foo/bar/boo]
start backup on [/Users/jdwhite/foo/bar/baz /Users/jdwhite/foo/bar/blargh /Users/jdwhite/foo/bar/boo]
scan finished in 0.218s: 3 files, 15 B

Files:           3 new,     0 changed,     0 unmodified
Dirs:            4 new,     0 changed,     0 unmodified
Data Blobs:      1 new
Tree Blobs:      5 new
Added to the repo: 2.550 KiB

processed 3 files, 15 B in 0:00
snapshot 034b927e saved

Note the status on Files/Dirs/Blobs/processed: they’re the same as when I didn’t specify the wildcard.
I backed up the same data, same number of bytes.

What’s different is what was explicitly scanned for – compare the start scan on and start backup on lines between the first backup with no wildcard versus wildcard.

No wildcard:

start scan on [/Users/jdwhite/foo/bar]
start backup on [/Users/jdwhite/foo/bar]

Wildcard:

start scan on [/Users/jdwhite/foo/bar/baz /Users/jdwhite/foo/bar/blargh /Users/jdwhite/foo/bar/boo]
start backup on [/Users/jdwhite/foo/bar/baz /Users/jdwhite/foo/bar/blargh /Users/jdwhite/foo/bar/boo]

Again, as in the non-wildcard test, I’m going to add a file in foo/bar called boink and back up again:

$ restic backup -v -r ~/repo/test2 ~/foo/bar/*
open repository
enter password for repository:
repository cb51bdf5 opened successfully, password is correct
lock repository
load index files
start scan on [/Users/jdwhite/foo/bar/baz /Users/jdwhite/foo/bar/blargh /Users/jdwhite/foo/bar/boink /Users/jdwhite/foo/bar/boo]
start backup on [/Users/jdwhite/foo/bar/baz /Users/jdwhite/foo/bar/blargh /Users/jdwhite/foo/bar/boink /Users/jdwhite/foo/bar/boo]
scan finished in 0.272s: 4 files, 22 B

Files:           4 new,     0 changed,     0 unmodified
Dirs:            4 new,     0 changed,     0 unmodified
Data Blobs:      1 new
Tree Blobs:      5 new
Added to the repo: 2.913 KiB

processed 4 files, 22 B in 0:00
snapshot e17f3729 saved

Note the absence of the “using parent snapshot” line. Also note the difference in the start scan on and start backup on lines between the two backups using the wildcard suffix: the path list differs.

First wildcard backup:

start scan on [/Users/jdwhite/foo/bar/baz /Users/jdwhite/foo/bar/blargh /Users/jdwhite/foo/bar/boo]
start backup on [/Users/jdwhite/foo/bar/baz /Users/jdwhite/foo/bar/blargh /Users/jdwhite/foo/bar/boo]

Second wildcard backup:

start scan on [/Users/jdwhite/foo/bar/baz /Users/jdwhite/foo/bar/blargh /Users/jdwhite/foo/bar/boink /Users/jdwhite/foo/bar/boo]
start backup on [/Users/jdwhite/foo/bar/baz /Users/jdwhite/foo/bar/blargh /Users/jdwhite/foo/bar/boink /Users/jdwhite/foo/bar/boo]

Looking at the snapshots for the second test repo:

$ restic -r ~/repo/test2 snapshots
enter password for repository:
repository cb51bdf5 opened successfully, password is correct
ID        Time                 Host                      Tags        Paths
--------------------------------------------------------------------------------------------------
034b927e  2021-05-03 17:32:41  caprica.home.menelos.com              /Users/jdwhite/foo/bar/baz
                                                                     /Users/jdwhite/foo/bar/blargh
                                                                     /Users/jdwhite/foo/bar/boo

e17f3729  2021-05-03 17:41:22  caprica.home.menelos.com              /Users/jdwhite/foo/bar/baz
                                                                     /Users/jdwhite/foo/bar/blargh
                                                                     /Users/jdwhite/foo/bar/boink
                                                                     /Users/jdwhite/foo/bar/boo
--------------------------------------------------------------------------------------------------

Note the difference in Paths. 034b927e is not a parent of e17f3729 because the Paths are not the same.
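
If the lists are long, the differing entry can be picked out mechanically. This is a rough sketch that writes the two Paths lists from the output above into temporary files by hand and compares them with comm. It assumes both lists are sorted, which they are here; the temporary files are only for illustration.

```shell
#!/bin/sh
# Sketch: find the paths that appear in the second snapshot's list but not
# in the first. The lists are copied by hand from the snapshots output.

first=$(mktemp)
second=$(mktemp)

# Paths of snapshot 034b927e.
printf '%s\n' \
    /Users/jdwhite/foo/bar/baz \
    /Users/jdwhite/foo/bar/blargh \
    /Users/jdwhite/foo/bar/boo > "$first"

# Paths of snapshot e17f3729.
printf '%s\n' \
    /Users/jdwhite/foo/bar/baz \
    /Users/jdwhite/foo/bar/blargh \
    /Users/jdwhite/foo/bar/boink \
    /Users/jdwhite/foo/bar/boo > "$second"

# comm -13 suppresses lines unique to the first file and lines common to
# both, leaving only the lines unique to the second file.
only_in_second=$(comm -13 "$first" "$second")
echo "only in second snapshot: $only_in_second"

rm -f "$first" "$second"
```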

Compare with the non-wildcard repo:

$ restic -r ~/repo/test snapshots
enter password for repository:
repository e55f0b71 opened successfully, password is correct
ID        Time                 Host                      Tags        Paths
-------------------------------------------------------------------------------------------
1206987f  2021-05-03 17:19:54  caprica.home.menelos.com              /Users/jdwhite/foo/bar
11bd4d5d  2021-05-03 17:24:34  caprica.home.menelos.com              /Users/jdwhite/foo/bar
-------------------------------------------------------------------------------------------
2 snapshots

1206987f is a parent snapshot of 11bd4d5d and the Paths are the same.

Note: I know one is a parent of the other from the restic backup output, not the restic snapshots output. I don’t think you can assume that from the restic snapshots output alone, but I may be mistaken.

Again, the same files are being backed up whether you specify the wildcard or not, but because restic is snapshot-based, a previous snapshot whose paths differ from the current backup’s is not considered a parent.

@nexar - after your initial backup when you added new files to a directory that is specified as a backup path (in your case by using a wildcard) you’re changing the explicit path list. When that list doesn’t match a previous backup’s Path list then there is no parent.
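
To make that rule concrete, here is a toy model of it. It is not restic’s implementation: the functions record_snapshot and find_parent and the text file used as history are all hypothetical, and the only rule modelled is the one described in this thread, namely that an earlier snapshot qualifies as parent only if its path list is identical. The shortened paths are for illustration.

```shell
#!/bin/sh
# Toy model (NOT restic's code): a snapshot qualifies as parent only if it
# was taken with exactly the same path list. Each line of the history file
# holds "<id> <path list>". Paths containing spaces are not supported.

history_file=$(mktemp)

# record_snapshot ID PATH...: remember the paths a snapshot was taken with.
record_snapshot() {
    id=$1
    shift
    echo "$id $*" >> "$history_file"
}

# find_parent PATH...: print the ID of the most recent snapshot with the
# same path list, or "none" if no snapshot matches.
find_parent() {
    wanted=$*
    parent=none
    while read -r id paths; do
        if [ "$paths" = "$wanted" ]; then
            parent=$id
        fi
    done < "$history_file"
    echo "$parent"
}

# Without a wildcard the path list is identical on every run.
record_snapshot 1206987f /foo/bar
dir_parent=$(find_parent /foo/bar)

# With a wildcard the second run has an extra entry (boink).
record_snapshot 034b927e /foo/bar/baz /foo/bar/blargh /foo/bar/boo
glob_parent=$(find_parent /foo/bar/baz /foo/bar/blargh /foo/bar/boink /foo/bar/boo)

echo "directory run: parent $dir_parent"
echo "wildcard run:  parent $glob_parent"

rm -f "$history_file"
```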

In the end, I think the confusion lies in using wildcards to specify the list of file/directory paths to back up and not realizing that restic uses the expansion of those paths to group sets of related backups.

I may be over-simplifying things here, or just plain wrong. I’m sure someone will correct me if I am.

Hope this helps,
-Jason

It’s not restic doing it. It has nothing to do with restic. It’s just basics of how the shell works.

Here’s a simple (if not the simplest) example showing that it’s the shell that expands the *, as per the common globbing rules:

$ echo foo2/
foo2/

$ echo foo2/*
foo2/apa.txt foo2/blah.txt foo2/peepdf_0.3.zip

Notice how in the second example, the shell expands the additional * to all files in the foo2/ directory.

The same happens when you specify foo2/* or similar on the command line when calling restic - the shell expands the foo2/* and the result is what restic sees and is thereby asked to back up.

To clarify even further: with the above example files, the command restic backup foo2/* would be equivalent to restic backup foo2/apa.txt foo2/blah.txt foo2/peepdf_0.3.zip, because the shell expands the * before executing the command. By contrast, restic backup foo2/ would just back up the foo2/ directory.
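
The same point can be checked without restic at all. In this sketch a small function, count_args (a hypothetical stand-in for any command), reports how many arguments it actually receives. The files are recreated in a scratch directory, and the quoted case is included to show that the shell leaves a quoted pattern alone.

```shell
#!/bin/sh
# Sketch: count the arguments a command receives for a directory, for an
# unquoted wildcard, and for a quoted wildcard. count_args is a stand-in
# for any command; the scratch directory is only for illustration.

tmp=$(mktemp -d)
mkdir "$tmp/foo2"
touch "$tmp/foo2/apa.txt" "$tmp/foo2/blah.txt" "$tmp/foo2/peepdf_0.3.zip"

# Print the number of arguments received.
count_args() {
    echo "$#"
}

plain=$(count_args "$tmp/foo2/")       # the directory itself
expanded=$(count_args "$tmp/foo2/"*)   # unquoted *: expanded by the shell
quoted=$(count_args "$tmp/foo2/*")     # quoted *: passed through literally

echo "directory: $plain, unquoted wildcard: $expanded, quoted wildcard: $quoted"

rm -rf "$tmp"
```

With three files in the directory, the unquoted wildcard yields three arguments, while the directory and the quoted pattern yield one each.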

There are several articles about this on the Internet, here’s one: Bash Globbing Tutorial

Yes, shell globbing is shell globbing. What I was hoping to make clear is that by using wildcards the path list can differ between backups (and did in the case that started this thread), and that having differing paths between backups means restic won’t form a parent/child relationship between them.

I did not mean to say that restic was doing the expansion, which is why I said restic uses the expansion.

Yeah, that’s what @alexweiss said in Restic never finds a parent snapshot - #4 by alexweiss.

Guys, I am truly grateful for the detailed explanations. In my simple mind it was the end result that mattered, i.e. /foo/ and /foo/* end up backing up the same thing, and hence I couldn’t understand why a parent was not being found. However, I am now a lot more educated because of your explanations.

Again, thanks very much.
