Huge amount of data read from S3 backend

Note that forget only accesses/removes comparatively small files under /snapshots, and those are usually contained in your local cache. If you specify the snapshots to remove directly, it will just remove those snapshot files; otherwise it reads all snapshot files to determine which ones to delete.
As a rule of thumb: forget does not need much traffic.
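A minimal sketch of the two cases above (the repository URL and snapshot IDs are placeholders):

```shell
# Remove two specific snapshots directly -- only those snapshot files are touched:
restic -r s3:s3.amazonaws.com/my-bucket forget 4bba301e 79766175

# Apply a retention policy -- restic reads all snapshot files to decide what to delete:
restic -r s3:s3.amazonaws.com/my-bucket forget --keep-daily 7 --keep-weekly 4
```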

Another case is if you pass --prune to forget. This is just a shortcut for running forget and then prune.
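In other words, these two invocations are equivalent (the policy and repository URL are just examples):

```shell
restic -r s3:s3.amazonaws.com/my-bucket forget --keep-last 10 --prune

# ...is the same as running:
restic -r s3:s3.amazonaws.com/my-bucket forget --keep-last 10
restic -r s3:s3.amazonaws.com/my-bucket prune
```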

prune, on the other hand, always causes a lot of traffic for remote repositories.

No, it uses only metadata, which is usually also contained in your local cache.

If you don’t specify --read-data, check also uses only metadata. By default, however, check does not use the local cache, which means this metadata will be downloaded from the remote repository; specify --with-cache to make it use the cache.
If you use --read-data (or --read-data-subset), it will have to download a lot of data from your remote repository.
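For example (the repository URL is a placeholder):

```shell
# Metadata-only check, reusing the local cache to avoid re-downloading metadata:
restic -r s3:s3.amazonaws.com/my-bucket check --with-cache

# Verify a random 10% of the pack files -- downloads that fraction of the data:
restic -r s3:s3.amazonaws.com/my-bucket check --read-data-subset=10%
```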
