Does repo-size not change after repacking?

Hi guys,
I ran a small experiment on a random dataset (about 338 GB).

  1. Created multiple repositories, each with a different pack size, and backed up the random dataset to each of them.
  2. I saw that a pack size of 8 is the sweet spot for the source data, so I tried to convert the 128 MiB packed repository to 8 MiB. The command was "--from-repo D:\128\ copy -r D:\128_CopyRepackTo_8 --pack-size 8 --compression max"

Results below:

[image: screenshot of the resulting repository sizes]

Why is the size of the latest repo, called "128_CopyRepackTo_8", not the same as that of the 8 MiB repo? Does repacking only split the 128 MiB files into 8 MiB files without touching the data again?

Thank you!

A pack file contains a bunch of file chunks that each have a size of 1 MB on average (or smaller for small files). For the overall size of the repository, it therefore makes no significant difference whether the chunks are grouped into 8 MB or 128 MB pack files.
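To make that concrete, here is a minimal Python sketch (not restic code). It groups the same list of chunks into packs of two different target sizes and compares the totals. The chunk sizes, the greedy grouping in pack_chunks and the 2000-chunk dataset are all assumptions made up for illustration, and the sketch ignores the small per-pack overhead a real repository has.

```python
import random

MiB = 1024 * 1024

def pack_chunks(chunk_sizes, pack_size):
    """Greedily group chunks into packs of at most roughly pack_size bytes.

    Hypothetical helper for illustration; restic's real packer differs.
    """
    packs, current, current_size = [], [], 0
    for size in chunk_sizes:
        # Start a new pack once the current one would exceed the target.
        if current and current_size + size > pack_size:
            packs.append(current)
            current, current_size = [], 0
        current.append(size)
        current_size += size
    if current:
        packs.append(current)
    return packs

rng = random.Random(0)
# Made-up chunk sizes between 0.5 MiB and 1.5 MiB (about 1 MiB on average).
chunks = [rng.randint(MiB // 2, 3 * MiB // 2) for _ in range(2000)]

packs_8 = pack_chunks(chunks, 8 * MiB)
packs_128 = pack_chunks(chunks, 128 * MiB)

total_8 = sum(sum(pack) for pack in packs_8)
total_128 = sum(sum(pack) for pack in packs_128)

print("packs at 8 MiB:  ", len(packs_8))
print("packs at 128 MiB:", len(packs_128))
print("same total bytes:", total_8 == total_128)
```

The number of pack files changes a lot, but the stored bytes are the same chunks either way, so the total does not move.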

Judging from that output, it looks like you're comparing something that is completely unrelated to the pack size. If you create independent repositories, each repo will use a different chunker polynomial, which determines how files are cut into chunks. Apparently the deduplication in your test dataset is very sensitive to how those chunks are cut. For a proper comparison of pack sizes, create one repository and then create all the other ones using restic init --copy-chunker-params --from-repo first-repo.
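As a rough illustration of why the chunker parameters matter, the following Python sketch cuts the same data with a toy content-defined chunker. It is not restic's algorithm: a keyed hash over a sliding window stands in for the rolling fingerprint, and the key stands in for the per-repository polynomial. The function cut_points, the keys and the window and mask values are all hypothetical.

```python
import hashlib
import random

def cut_points(data, key, window=16, mask=0x3FF):
    """Return the offsets where a toy content-defined chunker cuts `data`.

    A keyed hash of the last `window` bytes decides whether to cut;
    `key` plays the role of the per-repository chunker parameter.
    """
    cuts = []
    for end in range(window, len(data)):
        digest = hashlib.blake2b(
            data[end - window:end], digest_size=4, key=key
        ).digest()
        # Cut wherever the low bits of the hash are all zero.
        if int.from_bytes(digest, "big") & mask == 0:
            cuts.append(end)
    return cuts

rng = random.Random(1)
data = bytes(rng.getrandbits(8) for _ in range(64 * 1024))

cuts_a = cut_points(data, key=b"repo-a")
cuts_a_again = cut_points(data, key=b"repo-a")
cuts_b = cut_points(data, key=b"repo-b")

# Same parameter: identical cuts, so identical chunks.
print("same key, same cuts:", cuts_a == cuts_a_again)
# Different parameter: the cuts are expected to land elsewhere.
print("different key, same cuts:", cuts_a == cuts_b)
print("cut offsets shared by both:", len(set(cuts_a) & set(cuts_b)))
```

With the same parameter the chunks line up exactly, which is what --copy-chunker-params gives you across repositories. With different parameters the same files end up as different chunks, so any size difference you measure comes from the chunking rather than the pack size.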

copy does not change how files are cut into chunks, even when copying the chunks into a repository with a different chunker polynomial. That is why 128_CopyRepackTo_8 has the size of the 128 repository it was copied from rather than the size of 8.