Restore performance recommendations for metadata

Hello everyone. I am restoring a large dataset of 35 TB with lots of little files and wondering if it’s possible to speed up the process. At the moment it seems like the data is already in place, but the comparison and metadata sync process looks like it will take a very long time, even though there is very little CPU activity.

I am using an S3-compatible backend on OCI with the --pack-size 64 --overwrite if-newer options at the moment. I am running the restore on some very beefy systems with 80 cores and 3 TB of memory, with a file system that can handle 100K+ IOPS, so I could turn up the thread count if that were an option.
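For context, here is roughly the shape of the command I'm running. This is a sketch only: the repository URL, bucket, and target path are placeholders, and I'm assuming a restic-style CLI (which is what these flags resemble), so adjust for your actual tool:

```shell
# Hypothetical invocation; endpoint, bucket, and target are placeholders.
restic -r s3:https://<namespace>.compat.objectstorage.<region>.oraclecloud.com/<bucket> \
  --pack-size 64 \
  restore latest \
  --target /mnt/restore \
  --overwrite if-newer
```

With --overwrite if-newer, files already present on disk are compared against the snapshot and only replaced if the snapshot copy is newer, which is why the run is dominated by metadata comparison rather than data transfer.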

Any suggestions?


Actually, the restore might have been working just fine, but since it was in a pod I think it was somehow restarting after being submitted by the Volcano scheduler. I was able to get the restore down to minutes by leveraging an Argo Workflows job and fanning it out over multiple nodes!