Hi,
I am trying to restore a deleted file exactly like it was before deletion. I am not able to do it.
This is the file before deletion (ls -lkh)
1.1G -rw------- 1 root root /var/snap/lxd/common/lxd/disks/default.img (1045376 bytes)
This is the file after restic restore
31G -rw------- 1 root root /var/snap/lxd/common/lxd/disks/default.img (31457288)
This is the file after restic restore --sparse
1021M -rw------- 1 root root /var/snap/lxd/common/lxd/disks/default.img (1050484)
Is it possible to restore the file like it was before (in terms of disk space)? If yes, how? If not, why?
And my second question: why is the --sparse option not the default? Is there a potential problem with that option? Is there a reason I should not add it to all my restore scripts?
Just to clarify, all three files are identical ("same sha256 checksum") except for the space used on the drive.
restic does not handle sparse files (correctly). During backup, no information about sparseness is saved, and the --sparse option for restore is nothing but a simple hack to replace empty blobs (= blobs full of zeroes) with a sparse hole in the restored file.
This also means that a non-sparse file containing only zeroes (assumption: large enough) will be restored to a sparse file if you use restore --sparse.
Unless the sparseness information is completely saved by backup, a file cannot be exactly restored regarding sparseness.
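To illustrate the effect without involving restic at all, here is a minimal sketch using GNU coreutils. It assumes a Linux system with GNU dd, cp, du and sha256sum and a filesystem that supports holes; the scratch directory and file names are made up for the example. cp --sparse=always is only a stand-in for restore --sparse: it is not restic's implementation, it just shows the same kind of outcome (a file of real zeroes comes out sparse, with unchanged content).

```shell
#!/usr/bin/env bash
# Sketch only: GNU coreutils are used to mimic the effect, restic is not involved.
set -eu
workdir=$(mktemp -d)   # hypothetical scratch directory

# 8 MiB of real zero bytes: a NON-sparse file (the zeroes are actually written).
dd if=/dev/zero of="$workdir/dense.img" bs=1M count=8 status=none

# Copy while turning runs of zeroes into holes (stand-in for restore --sparse).
cp --sparse=always "$workdir/dense.img" "$workdir/restored.img"

# The content is identical, so the checksums match.
sum_dense=$(sha256sum < "$workdir/dense.img" | cut -d' ' -f1)
sum_restored=$(sha256sum < "$workdir/restored.img" | cut -d' ' -f1)
echo "dense:    $sum_dense"
echo "restored: $sum_restored"

# Length of the files (apparent size) versus space actually allocated on disk.
du -k --apparent-size "$workdir/dense.img" "$workdir/restored.img"
du -k "$workdir/dense.img" "$workdir/restored.img"
```

On a filesystem that supports holes, the last du call would typically report far less allocated space for restored.img than for dense.img, while both files keep the same length and checksum.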
@alexweiss Thanks.
Unless the sparseness information is completely saved by backup, a file cannot be exactly restored regarding sparseness
How about rustic and sparse files?
[Found the answer in issue #3914. Rustic does not support sparse files either.]
This also means that a non-sparse file containing only zeroes (assumption: large enough) will be restored to a sparse file if you use restore --sparse.
Is there any downside to that? I mean, could this create any issues?
I am wondering now whether restic is the right tool for me. Or are all backup tools like that? Am I expecting too much from a backup tool?
What is the reason why you want to exactly restore the sparse regions of a file? It won't affect the file content, so it's purely a non-functional aspect. What is "exact" enough? Technically, the restored file contents will always be stored at different parts of a disk than before and thus yield a different data layout on disk.
Spending time on this would be counterproductive and of dubious use. At the moment, restic restore --sparse does the best available thing: it restores the content while keeping the ballooning of originally sparse files to a minimum.
What is the reason why you want to exactly restore the sparse regions of a file? It won't affect the file content, so it's purely a non-functional aspect.
@MichaelEischer I guess the problem is that with Restic's default options a few "innocent looking" files could suddenly take a huge amount of disk space after a restore (see my example: 1 GB vs 30 GB). I would not consider this a "non-functional aspect" even though the content is technically the same.
The workaround for that would be to use the --sparse option, but then apparently all files could become sparse, and apparently that is not a good thing either. Of course we could try to exclude and include sparse files in separate backup/restore runs. Often we don't know in advance which files are sparse, although a shell script could find them, but the hope was that the backup tool would do all that.
What is "exact" enough?
Sparse files from the source should be restored as sparse by default. If the restored sparse file is actually a few MBs smaller, that would be "exact enough" for me.
Once the backup command detects that a file is sparse, then we can just as well restore it exactly. There's already an issue for that: Precise tracking of sparseness information · Issue #3914 · restic/restic · GitHub.
Ok, great. Then there is hope (like with "compression").
Sure, but it is not necessary, because, as you said already, "it's purely a non-functional aspect". To save time in the implementation, I would go for the simple version (not "preserving the exact file regions").
I will use this Linux one-liner to find and exclude sparse files:
find . -type f -printf "%S\t%p\n" | gawk '$1 < 1.0 {print}'
Although that results in a rather large number of files. I just need to find a tool that handles sparse files correctly to back up those files (maybe Duplicacy).
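For anyone who wants to turn the result into an exclude list, here is a hedged variant of the one-liner wrapped in a function. It assumes GNU find (for -printf and %S) and any POSIX awk; the function name list_sparse_files and the scratch file names are hypothetical. %S is the ratio of allocated space to file length, so values below 1 can also show up for files that are small on disk for other reasons (for example on filesystems with transparent compression), which may account for some of the hits.

```shell
#!/usr/bin/env bash
set -eu

# list_sparse_files DIR: print the paths of regular files under DIR whose
# allocated size is smaller than their length (GNU find's %S "sparseness" < 1).
# Limitation: paths containing tabs or newlines are not handled correctly.
list_sparse_files() {
  find "$1" -type f -printf '%S\t%p\n' \
    | awk -F'\t' '$1 < 1.0 { print $2 }'
}

# Small demonstration in a scratch directory (all names are made up).
demo=$(mktemp -d)
truncate -s 16M "$demo/sparse.img"       # one big hole, no data written
printf 'hello\n' > "$demo/dense.txt"     # ordinary small file
list_sparse_files "$demo"
```

The output could be redirected to a file and handed to restic backup via --exclude-file. As far as I know restic treats each line of that file as a pattern, so paths containing characters such as * or [ would need extra care.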
Why does it matter? Or rather, what breaks if you always use --sparse?
Without --sparse: sparse files are restored "unsparse", so they need more space.
With --sparse: "unsparse" files are restored sparse, so they need less space.
In both cases a restore of your backup does not match the original state, and this is not what I would expect from a backup tool.
And of course I am not talking about where the bits are stored on the disk.
In my opinion, size, timestamps, content, permissions and maybe more should be the same after a restore.
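One way to check these properties after a test restore is to compare the metadata side by side. The sketch below assumes GNU stat, cp, truncate and cmp on Linux; the helper names describe and same_content are hypothetical, and cp --sparse=never only imitates a restore without --sparse (it is not restic).

```shell
#!/usr/bin/env bash
set -eu

# describe FILE: length in bytes, allocated blocks, permissions, mtime (epoch).
describe() {
  stat -c '%n: length=%s blocks=%b mode=%a mtime=%Y' "$1"
}

# same_content A B: succeeds when both files contain identical bytes.
same_content() {
  cmp -s "$1" "$2"
}

demo=$(mktemp -d)                        # hypothetical scratch directory
truncate -s 4M "$demo/original.img"      # sparse "original"

# Imitate a restore that expands holes but keeps mode and timestamps.
cp --sparse=never --preserve=mode,timestamps \
   "$demo/original.img" "$demo/restored.img"

describe "$demo/original.img"
describe "$demo/restored.img"
same_content "$demo/original.img" "$demo/restored.img" && echo "content identical"
```

In a comparison like this, length, mode, mtime and content should match, and the blocks value is the only field expected to differ, which is the difference being argued about in this thread.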
Yes, with "--sparse" the restored files are smaller than the originals. Yes, there is a ticket to follow up on that, so if this is the only complaint, then you can stop reading here.
My question still stands: What breaks if you always use sparse? AFAICT nothing, except someone's test routine which might compare actual disk usage, as measured by, for example, "du".
A small problem is that --sparse is not the default, so some users might be caught by surprise. But otherwise spot on.
I asked this question already, but the developer dodged the question.
I don't know the answer either, but if there is no problem, why aren't all files on your Linux (or Windows or APFS) installation sparse? Why isn't "restore sparse" the default in Restic? Why isn't "cp --sparse" the default? And so on. Common sense suggests "there is probably a reason".
I saw this in a recent StackExchange question.
IMO people should avoid creating/using sparse files unless they absolutely need them. Such files result in insane amount of FS fragmentation and extra work from the FS driver. There are many more disadvantages and pitfalls. ...
... all works well if you have >80% free space (looks crazy but that's what it is). If you're under 60% and have lots of files, it all goes downhill fast. ext4 cannot defragment free space, only individual files and fragmentation quickly becomes an insurmountable issue. AFAIK xfs is the only native Linux FS which can defragment everything (files and free space). [1]
[1] How transparent are sparse files for applications? (Unix & Linux Stack Exchange)
One use case for having non-sparse files: if you want to "reserve" some space on disk for whatever purpose, you could save an empty (zero-filled) non-sparse file which you can delete on demand. If you are in such a setting and have backup+restore, you'll lose that "space reservation".
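A small sketch of that difference, assuming GNU coreutils on Linux (truncate, dd, stat); the directory and file names are placeholders. A file created with truncate is one big hole and reserves essentially nothing, whereas writing real zeroes with dd normally allocates the blocks, which is the point of a space reservation. On filesystems with compression or deduplication, even written zeroes may not reserve any space.

```shell
#!/usr/bin/env bash
set -eu
reserve_dir=$(mktemp -d)   # hypothetical location for the placeholder files

# Sparse placeholder: has a length of 4 MiB but reserves (almost) no blocks.
truncate -s 4M "$reserve_dir/placeholder-sparse"

# Dense placeholder: 4 MiB of zero bytes are really written to disk.
dd if=/dev/zero of="$reserve_dir/placeholder-dense" bs=1M count=4 status=none

# %s = length in bytes, %b = allocated blocks, %B = bytes per reported block.
stat -c '%n: length=%s allocated=%b blocks of %B bytes' "$reserve_dir"/placeholder-*

# In an emergency, deleting the dense placeholder is what frees real space.
rm "$reserve_dir/placeholder-dense"
```

Since both placeholders contain only zeroes, a backup tool that does not record sparseness cannot tell them apart, so after a restore with --sparse the dense one would presumably come back as a hole.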
Funnily enough, exactly this option was discussed in this forum as a remedy for repositories being unprunable due to full disks.
Related topic: xkcd: Workflow
This also applies to things changed by a backup + restore.
I don't think anybody dodged the question, actually. When I read it I thought it was rhetorical. Just go for it!
I tested Duplicacy with sparse files and, out of the box, Duplicacy restores sparse files exactly as they were ("sparse in, sparse out", "same size on disk").
I will have to use Duplicacy now too. That is unfortunate, because Duplicacy really is an "unpleasant to use" backup tool. I guess I can still use Restic for folders which do not have special files, like my "image and video" collection.