While reading issue #2679 I learned that Restic assembles all pack files in a temporary folder, thus practically piping all data through a folder that typically resides on the system disk on Windows systems. In my case the system disk is a relatively small SSD partition (around 128G) while the data being backed up is on a much larger spinning drive (4TB). If all the data first goes to the SSD, that may cause some wear on the SSD.
So my questions are:
Does Restic work as I assume?
How can I find out if those temp files (User\AppData\Local\Temp\restic-temp-pack-[…]) are really written to disk or if Windows manages to cache them in RAM? I checked in Windows File Explorer and these files have a size > 0 at times. But that does not mean they are written to disk, does it?
Does anyone use a RAMDISK as temp folder for restic?
Would it be reasonable/possible to change Restic code to assemble pack files in RAM right away?
There may be cached writes but the data will eventually be written, yes. The filesystem layer has no idea how a file will be used and will never store a file only in RAM, though it may keep the contents in RAM to supply read requests without having to consult the disk.
Maybe. Keep in mind that restic’s memory usage is already quite high on large repositories, so putting more stuff into RAM may not be ideal.
You should be able to set TEMP to control which directory restic uses.
Windows actually has the option to keep files marked with a special temporary flag (FILE_ATTRIBUTE_TEMPORARY) in RAM as long as possible before writing them to disk (and to avoid the write completely if the file is deleted shortly after closing). As far as I know, however, Go's temp file mechanism does not set this flag. There might still be some Windows magic happening that keeps the temporary files primarily in RAM.
For the current pack size of around 8MB it would probably work to keep everything in memory. However, if the pack size is increased in the future, then the memory usage will definitely become a problem.
Let’s say I use a ramdisk. That necessarily means software I do not particularly trust, because I have not used ramdisks in years. Let’s say that ramdisk software has some type of bug that causes it to randomly flip bits. Would Restic detect that change in its temp files?
I saw a hash being created in the code but could not find the place where it’s checked.
The hash is calculated while writing the temporary pack file (lines 70 and 98 in the packer_manager). For uploading, the file is just read from disk and uploaded (lines 101 and 106).
If a ramdisk damages the data sometime between writing and uploading the pack, then a damaged file will end up in the backend.
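To make that gap concrete, here is a minimal Go sketch of what a check before upload could look like. It is not restic's code: the function names, the `pack-sketch-` prefix, and the error value are hypothetical, and SHA-256 is used only because that is the hash restic uses for its IDs. The sketch hashes the data while writing the temp file, simulates a single flipped bit, and then re-reads the file to compare it against the recorded hash.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"errors"
	"fmt"
	"io"
	"os"
)

// errCorrupt is a hypothetical error for a temp file that no longer matches its hash.
var errCorrupt = errors.New("temp file does not match the hash recorded at write time")

// writeWithHash writes data to a new temp file in dir (the default temp
// directory if dir is empty) and hashes it in the same pass.
func writeWithHash(dir string, data []byte) (string, [sha256.Size]byte, error) {
	var sum [sha256.Size]byte
	f, err := os.CreateTemp(dir, "pack-sketch-")
	if err != nil {
		return "", sum, err
	}
	h := sha256.New()
	if _, err := io.MultiWriter(f, h).Write(data); err != nil {
		f.Close()
		return "", sum, err
	}
	copy(sum[:], h.Sum(nil))
	return f.Name(), sum, f.Close()
}

// verifyBeforeUpload re-reads the file and compares its hash with want.
func verifyBeforeUpload(path string, want [sha256.Size]byte) error {
	f, err := os.Open(path)
	if err != nil {
		return err
	}
	defer f.Close()
	h := sha256.New()
	if _, err := io.Copy(h, f); err != nil {
		return err
	}
	if !bytes.Equal(h.Sum(nil), want[:]) {
		return errCorrupt
	}
	return nil
}

// flipFirstBit simulates a storage fault by toggling one bit in the file.
func flipFirstBit(path string) error {
	b, err := os.ReadFile(path)
	if err != nil {
		return err
	}
	if len(b) == 0 {
		return errors.New("file is empty")
	}
	b[0] ^= 0x01
	return os.WriteFile(path, b, 0o600)
}

func main() {
	path, sum, err := writeWithHash("", []byte("pack data"))
	if err != nil {
		fmt.Println("write failed:", err)
		return
	}
	defer os.Remove(path)

	fmt.Println("intact file:", verifyBeforeUpload(path, sum))
	if err := flipFirstBit(path); err != nil {
		fmt.Println("could not simulate fault:", err)
		return
	}
	fmt.Println("after bit flip:", verifyBeforeUpload(path, sum))
}
```

The cost of such a check is one extra read pass over every pack before it is uploaded.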
Did you check whether the temporary pack files are actually written to disk using the task manager or resource manager?
In the meantime I checked the “host writes” with CrystalDiskInfo on the SSD before and after a restic backup. The value did not increase, so it seems that the temp pack files are not actually written to disk. I checked with Explorer that the files were actually shown in the Windows temp folder on the SSD during backup. For comparison I copied a few GB directly to the SSD and the host writes increased. The measurement seems to be accurate to within 1 GB.
So there does not seem to be an issue, at least with the configuration I use (Win 10 v19.03, Crucial SSD).
If you have used Linux then you have used a ramdisk on a regular basis. Most distros provide a ramdisk in /dev/shm that is used by the system for various things. On many distros, /tmp itself is a ramdisk.
With all due respect, this is FUD. Such a bug could also exist in any filesystem driver and silently corrupt data on your disk.
It would be better to run tests for yourself to determine the reliability of your storage software than to opine about possible reliability issues in theoretical software you aren’t even using.