Hi,
I have two different restic jobs that run against the same repository. The first one reads a file from stdin and adds it to the repo. The second scans a file system and adds its data. They take significantly different amounts of time, and I wonder why that is.
Some basic information: this is restic 0.15.1 compiled with go1.19.5 on linux/amd64. The jobs run in Kubernetes pods and don’t have persistent storage to keep a cache. The repository is on Hetzner Object Storage (which claims to be S3-compatible, runs on Ceph, and may have some performance issues). It has around 6k snapshots from multiple “hosts”, which all run these two jobs. The repo itself is only a few GB in size.
The first job runs something along the lines of:
restic backup --verbose \
--host "${SOME_CONSTANT_HOST_FOR_BOTH_JOBS}" \
--tag "${SOME_CONSTANT_TAGS_FOR_THIS_JOB}" \
--stdin --stdin-filename "/b-mariadb-db.sql" </mnt/shared/fifo
and outputs (with added times):
2026-01-30T13:02:50.828473441Z open repository
2026-01-30T13:02:52.134840301Z lock repository
2026-01-30T13:02:52.541555554Z load index files
2026-01-30T13:02:52.970665666Z read data from stdin
2026-01-30T13:02:52.970708581Z start scan on [/b-mariadb-db.sql]
2026-01-30T13:02:52.971071269Z start backup on [/b-mariadb-db.sql]
2026-01-30T13:02:52.971092412Z scan finished in 0.837s: 1 files, 0 B
2026-01-30T13:02:55.206683755Z
2026-01-30T13:02:55.206725267Z Files: 1 new, 0 changed, 0 unmodified
2026-01-30T13:02:55.206731890Z Dirs: 0 new, 0 changed, 0 unmodified
2026-01-30T13:02:55.206737498Z Data Blobs: 2 new
2026-01-30T13:02:55.206742521Z Tree Blobs: 1 new
2026-01-30T13:02:55.206747296Z Added to the repository: 3.412 MiB (798.876 KiB stored)
2026-01-30T13:02:55.206752028Z
2026-01-30T13:02:55.206756982Z processed 1 files, 32.582 MiB in 0:03
2026-01-30T13:02:55.206774517Z snapshot 90ba907e saved
The whole job is done in ~5s. It takes restic around 2s to get from open repository to load index files.
The other job runs with something like:
restic backup --verbose \
--host "${SOME_CONSTANT_HOST_FOR_BOTH_JOBS}" \
--tag "${SOME_CONSTANT_TAGS_FOR_THIS_JOB}" \
/mnt/strg/
and outputs:
2026-01-30T13:03:13.461515792Z open repository
2026-01-30T13:03:14.485344783Z lock repository
2026-01-30T13:17:20.641713149Z using parent snapshot c752c167
2026-01-30T13:17:20.641774042Z load index files
2026-01-30T13:17:21.261648763Z start scan on [/mnt/strg/]
2026-01-30T13:17:21.261684731Z start backup on [/mnt/strg/]
2026-01-30T13:17:21.261732099Z scan finished in 846.777s: 12 files, 62.730 MiB
2026-01-30T13:17:24.598012735Z
2026-01-30T13:17:24.598085190Z Files: 0 new, 12 changed, 0 unmodified
2026-01-30T13:17:24.598107641Z Dirs: 0 new, 4 changed, 0 unmodified
2026-01-30T13:17:24.598117881Z Data Blobs: 0 new
2026-01-30T13:17:24.598127829Z Tree Blobs: 5 new
2026-01-30T13:17:24.598138729Z Added to the repository: 8.041 KiB (2.946 KiB stored)
2026-01-30T13:17:24.598149700Z
2026-01-30T13:17:24.598160460Z processed 12 files, 62.730 MiB in 14:10
2026-01-30T13:17:24.598171942Z snapshot f379d51c saved
Total execution time is a little over 14 minutes, and restic spends almost all of it (about 846 s) between lock repository and using parent snapshot, before it even reaches load index files. Since there is no cache, I guess it has to fetch everything from the backend.
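To make the per-phase gaps easier to see, here is a minimal sketch that prints the time between consecutive log lines. It assumes GNU date (for `-d` and `%N`) and lines of the form "timestamp message"; the function name `phase_gaps` is made up for this post.

```shell
# Print the gap in seconds between consecutive timestamped log lines.
# Assumption: GNU date is available and each line is "<RFC 3339 timestamp> <message>".
phase_gaps() {
  prev_ts=""
  prev_msg=""
  while read -r ts msg; do
    # Convert the timestamp to epoch seconds with a fractional part; skip unparsable lines.
    now=$(date -u -d "$ts" +%s.%N) || continue
    if [ -n "$prev_ts" ]; then
      # The gap is attributed to the earlier line, i.e. the phase that was running.
      awk -v a="$prev_ts" -v b="$now" -v m="$prev_msg" \
        'BEGIN { printf "%8.1fs  %s\n", b - a, m }'
    fi
    prev_ts=$now
    prev_msg=$msg
  done
}

phase_gaps <<'EOF'
2026-01-30T13:03:13.461515792Z open repository
2026-01-30T13:03:14.485344783Z lock repository
2026-01-30T13:17:20.641713149Z using parent snapshot c752c167
2026-01-30T13:17:20.641774042Z load index files
EOF
```

For the second job this attributes roughly 846 s to the phase that starts at lock repository, and about one second to everything before it.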
I assume it reads all snapshots to find the parent (the using parent snapshot line), which it does not do when it reads from stdin. I wonder whether it would be worth skipping the parent search altogether. Is that even possible? What would happen then? Would restic have to scan and hash all files in the directory again? And before uploading, would it notice that an object is already in the repository and skip the upload?
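One candidate is the `--force` flag of `restic backup`, which is documented as forcing a re-read of the target files and overriding the parent. Below is a sketch of the second job with it, on the unverified assumption that in 0.15.1 it also skips the snapshot search. The wrapper name `backup_without_parent` and the `RESTIC` override are hypothetical and only there so the command can be dry-run.

```shell
# Hypothetical wrapper around the second job, with --force added.
# Assumption (not verified): --force makes restic skip the parent lookup entirely.
# RESTIC defaults to the real binary; set RESTIC=echo to print the arguments instead.
backup_without_parent() {
  "${RESTIC:-restic}" backup --verbose \
    --force \
    --host "${SOME_CONSTANT_HOST_FOR_BOTH_JOBS}" \
    --tag "${SOME_CONSTANT_TAGS_FOR_THIS_JOB}" \
    "$@"
}

# Usage: backup_without_parent /mnt/strg/
```

If that assumption holds, the trade-off would be that every file is read and hashed again on each run, which for 12 files and about 63 MiB should be cheap compared to 14 minutes of waiting.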
Any advice would be much appreciated.
Best regards,
Frank.