Restoring from a mounted repo stored on both Google Cloud Storage and S3 was running at 3.5 to 7.5 MB/s on a 1 Gbit line. The destination was a RAID array capable of writing data at nearly 10 GB/s. I tried a lot of tricks to speed this up. Most of them didn’t work. I’m documenting some of them here in the hope that someone else can avoid the same costly experiments.
Convert coldline to standard
The first thing I suspected of slowing the process down was the storage class, since the blobs were all nearline or coldline. On GCS the rewrite operation wasn’t working for me, so the workaround was to clone the entire repo into another bucket with the standard storage class. This was expensive and did not speed up the restore. Would not recommend.
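For reference, a minimal sketch of that clone step, assuming the google-cloud-storage Python client, placeholder bucket names, and a destination bucket already created with the standard default storage class (copied objects should pick up that default):

```python
# Hypothetical sketch: clone a nearline/coldline repo bucket into a
# standard-class bucket. Bucket names are placeholders.
from google.cloud import storage

client = storage.Client()
src = client.bucket("my-repo-coldline")    # existing nearline/coldline bucket
dst = client.bucket("my-repo-standard")    # pre-created with STANDARD default class

for blob in client.list_blobs(src):
    # Server-side copy; the new object should take on the destination
    # bucket's default storage class.
    src.copy_blob(blob, dst, blob.name)
```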
Proxy data on a fast cloud box
I provisioned a compute instance with a high-performance SSD attached that was large enough to hold the entire repo, copied the repo over, and then tried the restore from that location. This was also expensive and did not improve performance. Would not recommend.
Copy the repo locally
There was enough extra room on the RAID array that I figured, why not copy the entire kit and caboodle down and restore from there? Not only was this a very, very slow operation, it took a lot of retries for individual files. This was the most disappointing effort because of all the extra time it added to the operation. In the end, restore speeds from the locally mounted copy were only marginally faster, and that’s when the multiple concurrent rsync operations didn’t all hang because the RAID server was both the source and the destination for the restore. I’d often find all the rsyncs hung because one of them was missing a single blob file. Would only recommend this if the source and destination are different filesystems and the source has fast read access.
Multiple concurrent rsyncs from cloud
In the end I went back to basics. I had already broken the restore down into several scripts that ran rsyncs against different parts of the restored filesystem. I could run them all in parallel, and each one ran at the original rates from above. The individual restore speed is no faster, but the aggregate restore is much better. Mounting directly from the cloud proved to be the most effective approach because serving multiple blobs at random is no sweat for cloud storage. This is the best method I could find. If I had to do this over again, I’d write a script that would carve up the final filesystem and automatically generate any number of individual rsync scripts to run in parallel; a rough sketch of that idea is below.
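Something like this sketch is what I have in mind. The mount point, destination, and worker count are placeholders; it splits the restore into one rsync per top-level directory of the mounted snapshot and runs them concurrently:

```python
#!/usr/bin/env python3
# Hypothetical sketch: carve the restore into per-directory rsync jobs and
# run them in parallel against the mounted repo. Paths and WORKERS are
# placeholders to tune for your own layout.
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

MOUNTED_REPO = Path("/mnt/repo/snapshots/latest")   # cloud repo mounted read-only
DESTINATION = Path("/raid/restore")                  # fast local RAID array
WORKERS = 8                                          # number of concurrent rsyncs

def restore_dir(subdir: Path) -> int:
    """Run one rsync for a single top-level directory of the snapshot."""
    dest = DESTINATION / subdir.name
    dest.mkdir(parents=True, exist_ok=True)
    cmd = ["rsync", "-a", "--partial", f"{subdir}/", f"{dest}/"]
    return subprocess.call(cmd)

if __name__ == "__main__":
    top_level = [p for p in MOUNTED_REPO.iterdir() if p.is_dir()]
    with ThreadPoolExecutor(max_workers=WORKERS) as pool:
        for subdir, rc in zip(top_level, pool.map(restore_dir, top_level)):
            status = "ok" if rc == 0 else f"rsync exited {rc}"
            print(f"{subdir.name}: {status}")
```

Carving by top-level directory is the simplest split; if your tree is lopsided you’d want to cut along whatever boundaries give roughly even amounts of data per job.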
Hope this helps someone down the line.