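SEEK_HOLE is an lseek() whence value that asks the kernel to move the file offset to the start of the next “hole” at or after the given offset. If the offset already points into the middle of a hole, the file offset is simply set to that offset; if there are no further holes, it is set to the end of the file, which counts as an implicit hole.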
http://man7.org/linux/man-pages/man2/lseek.2.html
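As a rough illustration of how an application might use this, here is a minimal sketch that lists the data extents of a file by alternating SEEK_DATA and SEEK_HOLE. It assumes Linux and Python 3, where `os.SEEK_DATA` and `os.SEEK_HOLE` are available; the function name `data_extents` is made up for this example.

```python
import errno
import os


def data_extents(path):
    """Return (start, end) byte ranges that the filesystem reports as data."""
    extents = []
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        offset = 0
        while offset < size:
            try:
                # Jump to the next byte at or after `offset` that is data.
                start = os.lseek(fd, offset, os.SEEK_DATA)
            except OSError as exc:
                if exc.errno == errno.ENXIO:
                    break  # nothing but a hole remains before end of file
                raise
            # Jump to the next hole after that data; the end of the file
            # counts as an implicit hole, so this always succeeds.
            end = os.lseek(fd, start, os.SEEK_HOLE)
            extents.append((start, end))
            offset = end
    finally:
        os.close(fd)
    return extents
```

On a filesystem without sparse file support, this should simply report the whole file as one data extent, which is why programs can fall back gracefully when holes are missing but cannot easily cope with holes that have moved.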
Not specifically, but the presence of SEEK_HOLE would indicate that some programs likely use this functionality.
To elaborate on my “all or nothing” approach, programs are likely written so that they don’t depend on holes being there – otherwise, they would not work on filesystems without sparse file support. (That is, they should be able to function if a file was copied to and from a filesystem that doesn’t support sparse files, or restored from a backup.)
However, it is probably a very reasonable assumption on the part of many developers that if there are holes, the holes are where the application put them; otherwise SEEK_HOLE would serve no purpose. Restic’s chunking approach virtually guarantees that restored holes will not end up in the same place:
- Holes that are too small won’t get their own chunk, so they won’t be restored.
- At the border of a hole, particularly at its beginning, the chunker is likely to place some of the hole’s zeros in a chunk that isn’t all zeros, causing the restored hole to start too late in the file.
- A sparse file may also contain a run of zeros that wasn’t a hole, and turning that section into a hole could confuse the application.
- To a lesser degree, putting holes in a file that wasn’t originally sparse could pose some problems. In particular, it can lead to more file fragmentation when a hole is later filled, and to an optimistic report of free space: as soon as the holes are filled with data, additional volume space is consumed. If a sysadmin is not expecting files to be sparse in the first place, this could cause problems down the line after a restore, when the sysadmin believes they have more free disk space than they really do. (In other words, a run of allocated zeros not only reduces fragmentation, but also reserves the disk space.)
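The first two points can be shown with a small simulation. This is only a sketch under stated assumptions: `restored_holes` is a hypothetical helper, the chunk boundaries are picked by hand, and the sizes are tiny. It is not restic's actual chunker, which chooses boundaries from the content and uses much larger chunks.

```python
def restored_holes(data, boundaries):
    """Return the (start, end) ranges a restore would turn into holes if it
    treated every all-zero chunk as a hole.

    `boundaries` lists the end offset of each chunk, in ascending order.
    """
    holes = []
    start = 0
    for end in boundaries:
        chunk = data[start:end]
        if chunk and not any(chunk):  # every byte in the chunk is zero
            if holes and holes[-1][1] == start:
                # Extend the previous hole when zero chunks are adjacent.
                holes[-1] = (holes[-1][0], end)
            else:
                holes.append((start, end))
        start = end
    return holes


# Original layout: data, then a hole covering bytes 100..1100, then data.
original = b"A" * 100 + bytes(1000) + b"B" * 100
print(restored_holes(original, [300, 700, 1200]))

# A small hole (bytes 100..150) that never gets a chunk of its own.
small = b"A" * 100 + bytes(50) + b"B" * 100
print(restored_holes(small, [125, 250]))
```

With these hand-picked boundaries, the first call reports a single hole at (300, 700) even though the original hole spanned (100, 1100), and the second call reports no hole at all.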
It does not seem like guessing where holes should be put is a good idea. If we want to implement this hack, I would suggest making it opt-in with a flag, and giving ample warnings in the documentation that this flag could break applications.
I submit that it would be a better use of time and resources to work towards a patch that adds real support for sparse files, rather than investing in a hack that has the potential to restore effectively corrupt data (from the perspective of an application).