With restic dump -a zip, is the dump file actually compressed, or is it just "stored" like a tarball? If it's just "stored", is there any way to use compression? I mostly want this for Windows users, when I dump their data from my repository either into OneDrive or onto an external drive for them. For Mac/Linux it's easy enough to just pipe the output to zstd or xz.
Judging from the code, the data should just be "stored". But you might want to dump some data and check the compression method that was actually used (a quick check is sketched below the patch). It should be easy to patch, though:
diff --git a/internal/dump/zip.go b/internal/dump/zip.go
index e5ef5c95b..b4340d909 100644
--- a/internal/dump/zip.go
+++ b/internal/dump/zip.go
@@ -37,6 +37,7 @@ func (d *Dumper) dumpNodeZip(ctx context.Context, node *restic.Node, zw *zip.Wri
 		Name:               filepath.ToSlash(relPath),
 		UncompressedSize64: node.Size,
 		Modified:           node.ModTime,
+		Method:             zip.Deflate,
 	}

 	header.SetMode(node.Mode)
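For the check, something like this small standalone program would do (it is not part of restic, just archive/zip from the stdlib; point it at any dump output). Method 0 means Store, i.e. uncompressed, and 8 means Deflate:

package main

import (
	"archive/zip"
	"fmt"
	"log"
	"os"
)

func main() {
	// Print the compression method of every entry in the given archive.
	// archive/zip defaults FileHeader.Method to zip.Store (0) when it is
	// not set explicitly, which is why dump entries come out uncompressed.
	r, err := zip.OpenReader(os.Args[1])
	if err != nil {
		log.Fatal(err)
	}
	defer r.Close()

	for _, f := range r.File {
		fmt.Printf("method=%d  %s\n", f.Method, f.Name)
	}
}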
Awesome, thanks! I checked with 7-Zip on a Windows box and it did indeed report a compression ratio of 100%, i.e. the data is just stored.
Yes! But it only works when dumping a folder; when I dump a single file, restic doesn't compress it yet (a single file is written out directly, without a zip container).
So I never did get around to trying this. It came up again recently when dumping a rather large user profile for someone.
I just added the Method: zip.Deflate line and it worked fine, albeit pretty slowly. I can see why it wasn't on by default! I did some research, figured out how to register a custom compressor, and used flate.BestSpeed, which worked much faster with decent results (archive size / wall-clock time):
uncompressed:     125.20 GB / 10:35.72
deflate:           93.87 GB / 53:35.04
flate.BestSpeed:   96.62 GB / 16:07.70
When a user loses a significant amount of data, I compress it into a tar.gz or tar.xz for Mac users or a zip for Windows users, then upload it to OneDrive. I typically use an external compressor for the zip to speed up the upload/download; the scientists I work with often have highly compressible data. However, running an external compressor against restic mount can be slow and sometimes fails outright (especially with special characters, which my scientists love to put in filenames). This patch should allow for an easier and more reliable transfer process.
Here's the code I wrote for flate.BestSpeed. As far as further increasing speed goes, I'm interested in trying to implement multithreaded zip support, or perhaps trying something like saracen/fastzip; a simpler drop-in idea is sketched after the code below.
Disclaimer: I do NOT know Golang, and am totally winging this after some googling. I'm shocked it compiled at all, honestly.
internal/dump/zip.go
package dump

import (
	"archive/zip"
	"compress/flate"
	"context"
	"io"
	"path/filepath"

	"github.com/restic/restic/internal/errors"
	"github.com/restic/restic/internal/restic"
)

// fastCompressor trades compression ratio for speed.
func fastCompressor(w io.Writer) (io.WriteCloser, error) {
	return flate.NewWriter(w, flate.BestSpeed)
}

func (d *Dumper) dumpZip(ctx context.Context, ch <-chan *restic.Node) (err error) {
	w := zip.NewWriter(d.w)
	defer w.Close()

	// Replace the default Deflate compressor with the BestSpeed variant.
	// RegisterCompressor has no error return, so there is nothing to check here.
	w.RegisterCompressor(zip.Deflate, fastCompressor)

	for node := range ch {
		if err := d.dumpNodeZip(ctx, node, w); err != nil {
			return err
		}
	}
	return nil
}

func (d *Dumper) dumpNodeZip(ctx context.Context, node *restic.Node, zw *zip.Writer) error {
	relPath, err := filepath.Rel("/", node.Path)
	if err != nil {
		return err
	}

	header := &zip.FileHeader{
		Name:               filepath.ToSlash(relPath),
		UncompressedSize64: node.Size,
		Modified:           node.ModTime,
		Method:             zip.Deflate,
	}
	header.SetMode(node.Mode)
	if IsDir(node) {
		header.Name += "/"
	}

	w, err := zw.CreateHeader(header)
	if err != nil {
		return errors.Wrap(err, "ZipHeader")
	}

	if IsLink(node) {
		if _, err = w.Write([]byte(node.LinkTarget)); err != nil {
			return errors.Wrap(err, "Write")
		}
		return nil
	}

	return d.writeNode(ctx, w, node)
}