Does check --read-data use RAM or I/O?

The docs are not clear. If I do restic check --read-data 100 and my repository is S3, does restic download and check my repo using my local hard disk or local RAM? Meaning, is I/O being wasted downloading and checking files or is it all being done with RAM?

I don’t know the answer, but I expect it could be found out easily enough by running restic check --read-data and then running iotop -p <pid of restic> and watching the output.
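On Linux, the same measurement can be scripted without iotop by sampling the per-process I/O counters. This is only a minimal sketch: it assumes a Linux /proc filesystem and permission to read the restic process's io file, and the helper names (parse_proc_io, read_proc_io, disk_delta) are made up for illustration. The read_bytes and write_bytes fields count traffic to the storage layer, so write_bytes should stay small if pack data is only handled in memory.

```python
from pathlib import Path


def parse_proc_io(text: str) -> dict[str, int]:
    """Parse the 'key: value' lines of a Linux /proc/<pid>/io file."""
    counters = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines that do not contain a colon
            counters[key.strip()] = int(value.strip())
    return counters


def read_proc_io(pid: int) -> dict[str, int]:
    """Read the current I/O counters of a running process (Linux only)."""
    return parse_proc_io(Path(f"/proc/{pid}/io").read_text())


def disk_delta(before: dict[str, int], after: dict[str, int]) -> dict[str, int]:
    """Storage-layer bytes read and written between two samples."""
    return {key: after[key] - before[key] for key in ("read_bytes", "write_bytes")}
```

Call read_proc_io(pid) once shortly after the check starts and again later, then pass both samples to disk_delta to see how much local disk traffic the process caused in between.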

IIRC, check --read-data reads the data blobs directly from the backend and checks them in memory.

(Note that tree packs that are not yet present in the cache dir are read from the backend and stored there, unless --no-cache is used. But this is the intended behavior :wink: )

This is what Gemini says:

The short answer is no, restic check --read-data does not write the backup data to your local disk (SSD/HDD) to process it.

It performs “streaming” verification. Here is exactly how that impacts your local storage:

1. No “Staging” of Data (Zero Write Load)

When restic downloads a 500MB pack file from your repository (cloud or external drive) to check it, it does not save that file to a temporary folder on your hard drive.

  • The Flow: Repository → Network/USB → RAM → CPU (Hash Calculation) → Discard.

  • The Benefit: You do not need free disk space equivalent to your backup size to run a check. You also avoid “wearing out” your local SSD with massive write operations.

2. The Exception: Metadata Cache (Minor I/O)

While it doesn’t write the content (your photos, documents) to disk, it does write the metadata (indexes) to your local cache directory (usually ~/.cache/restic).

  • What this is: Small files that tell restic where your data is located.

  • Impact: This results in very minimal local disk I/O—typically a few megabytes or gigabytes of writes, even for massive repositories.

3. Scenario: What creates the I/O?

The specific “I/O” load depends entirely on where the repository lives:

  • If Repo is Remote (Cloud/SFTP): Your local disk I/O is near 0%. The bottleneck is your network.

  • If Repo is on USB/NAS: The external drive will experience 100% Read I/O. Your internal OS drive will sit idle.

Summary

You can safely run --read-data on a laptop with a small SSD without worrying about filling up the drive or causing heavy wear and tear.

The answer from Gemini is somewhat usable, although rather blurry on the details. To give a bit more detail on the cache behavior:

restic check --read-data creates an empty, temporary cache and downloads all necessary files from the repository. Each file is downloaded only once. The files in the index folder and a subset of the files from the data folder are stored in the newly created cache. All other files are processed only in memory. The files in the cache may be accessed in a rather random pattern while metadata is processed, but in total everything in the cache is also read once.

By the end, the temporary cache will normally have reached a size similar to that of the cache used for other restic operations.

All non-metadata files are processed directly in RAM and not written to disk, so they cause no disk I/O, only network I/O.
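The in-memory processing described above can be illustrated with a streaming hash check. This is a minimal sketch, not restic's actual code (restic is written in Go): it relies only on the fact that a repository file is named after the SHA-256 hash of its contents, and verify_stream and the 1 MiB chunk size are hypothetical choices for illustration. The data passes through a fixed-size buffer and is never written to disk. The real check does more than this, such as verifying the individual blobs inside each pack.

```python
import hashlib
import io
from typing import BinaryIO


def verify_stream(stream: BinaryIO, expected_hex: str, chunk_size: int = 1 << 20) -> bool:
    """Hash a stream chunk by chunk and compare the digest with the expected name.

    Only one chunk is held in memory at a time; nothing is written to disk.
    """
    hasher = hashlib.sha256()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # an empty read signals end of stream
            break
        hasher.update(chunk)
    return hasher.hexdigest() == expected_hex


# Stand-in for a downloaded pack file: in-memory bytes instead of a network body.
payload = b"example pack contents"
name = hashlib.sha256(payload).hexdigest()

print(verify_stream(io.BytesIO(payload), name))         # intact data
print(verify_stream(io.BytesIO(payload + b"x"), name))  # corrupted data
```

Any file-like object works as the stream, so the same function would accept an HTTP response body without staging it in a temporary file.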

(If you specify --no-cache, then metadata may be downloaded multiple times and in a very large number of requests.)
