Hi!
I just tested restic on some data and would like to share the results and discuss some observations.
Setup
restic version: restic 0.16.3 compiled with go1.21.6 on linux/amd64
data: 121191 files, 3.869 TiB, mostly video and photo files
source machine: Ubuntu 22.04.3, ZFS, compression on (zstd, default), dedup off
destination machine: Synology NAS, ext4
The source machine connects to the destination machine via SFTP through a 1000Mbps network.
The repo is initialized without any parameters.
Results
The backup took about 42.5 hours to finish.
restic reported “3.655 TiB added to repo, 3.531 TiB stored on disk”.
With ncdu I manually measured that the source files (on ZFS) have an apparent size of 3.8TiB and an on-disk size of 3.6TiB. (The apparent size differs from the number reported by restic. I am not sure why, but I guess there might be a bug in my exclude filters.)
There were 220568 files in the repo.
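As a rough sanity check on these numbers, here is a minimal sketch that derives the average throughput and the average repo file size from the figures above. The variable names are made up for illustration, and it assumes the whole 42.5 hours was spent on this one backup run and that nearly all repo files are pack files (index, snapshot and key files are ignored).

```python
# Back-of-the-envelope numbers from the Setup and Results sections.
TIB = 2**40
MIB = 2**20

elapsed_s = 42.5 * 3600          # backup duration in seconds
source_bytes = 3.869 * TIB       # data scanned on the source
stored_bytes = 3.531 * TIB       # data stored on disk in the repo
repo_files = 220568              # files in the repo after the backup

# Average rates over the whole run, in decimal MB/s.
read_rate_mb_s = source_bytes / elapsed_s / 1e6
write_rate_mb_s = stored_bytes / elapsed_s / 1e6

# Theoretical ceiling of a 1000 Mbps link, ignoring protocol overhead.
link_mb_s = 1000 / 8

# Average size of a repo file, assuming almost all of them are packs.
avg_file_mib = stored_bytes / repo_files / MIB

print(f"average read rate:  {read_rate_mb_s:.1f} MB/s")
print(f"average write rate: {write_rate_mb_s:.1f} MB/s")
print(f"link ceiling:       {link_mb_s:.0f} MB/s")
print(f"average repo file:  {avg_file_mib:.1f} MiB")
```

The average comes out at roughly 25 to 28 MB/s, well below the 125 MB/s ceiling of the link. The average repo file of about 17 MiB is close to what I understand to be restic's default target pack size of 16 MiB, so the file count looks plausible.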
Observations
If I read it correctly, dedup saved ~200GiB of data (3.869TiB - 3.655TiB), which surprised me.
Meanwhile, the final repo size on disk (~3.5TiB) is similar to the original data size on disk (~3.6TiB). I didn't check whether ZFS and restic use the same compression level, but I expected the restic repo to be smaller, because dedup had already saved quite a few bytes.
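To make the comparison concrete, here is a small sketch of the arithmetic behind these two observations. It assumes that "added to repo" is the size of new data after dedup but before compression, and that "stored on disk" is the same data after compression. The ncdu figures only have one decimal place, so the ZFS percentage is very approximate.

```python
# Savings implied by the reported figures (all inputs in TiB).
GIB_PER_TIB = 1024

scanned_tib = 3.869       # total data processed by restic
added_tib = 3.655         # "added to repo" (assumed: after dedup, uncompressed)
stored_tib = 3.531        # "stored on disk" (assumed: after compression)
zfs_apparent_tib = 3.8    # ncdu apparent size on the source
zfs_on_disk_tib = 3.6     # ncdu on-disk size on the source

dedup_saved_gib = (scanned_tib - added_tib) * GIB_PER_TIB
compression_saved_gib = (added_tib - stored_tib) * GIB_PER_TIB

dedup_pct = (scanned_tib - added_tib) / scanned_tib * 100
compression_pct = (added_tib - stored_tib) / added_tib * 100
zfs_pct = (zfs_apparent_tib - zfs_on_disk_tib) / zfs_apparent_tib * 100

print(f"dedup saved:       {dedup_saved_gib:.0f} GiB ({dedup_pct:.1f}%)")
print(f"compression saved: {compression_saved_gib:.0f} GiB ({compression_pct:.1f}%)")
print(f"ZFS compression:   {zfs_pct:.1f}% (approximate)")
```

Under these assumptions, dedup accounts for about 219GiB and restic's compression for about 127GiB. Both compression ratios are only a few percent, which would be plausible for video and photo files that are already compressed.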
It also seems that the system was not fully utilized during the backup, and I was not able to determine the bottleneck. I have checked the following:
- The CPU usage is low on both machines
- On the source machine there are 8 cores, but restic was using at most 200%.
- The ssh process also used ~10% CPU
- I was using a customized sftp.command, basically ssh with a specific private key.
- Network load was ~50MB/s
- If I had to guess, I’d say the bottleneck is the network, but I have no idea.
- Disks were not fully utilized. I know the HDDs on both machines can handle at least 100MB/s.
Lastly, it seems that restic became completely idle from time to time.
The screenshot below shows CPU and Network usage.
Is this behavior expected?