Kopia vs Restic comparison

I ran a similar test against a local USB 3.0 hard disk, but with a different dataset and OS (Linux this time; the test above was on Windows):

  • Restic 0.9.6: Added to the repo: 4.841 GiB, processed 20542 files, 5.243 GiB in 1:10
  • Kopia 0.5.2: same in 2m15s (with zstd compression) and 1m22s without

I used restic 0.9.6 compiled with go1.13.4 on windows/amd64. That bug was supposedly introduced only with go1.14.

In Kopia, hashing and uploading seem to be strongly decoupled. I'm not sure how Restic does this exactly, but if it holds back the next block until the previous block has been uploaded, to ensure consistency at all times (or something similar; I am guessing naively here), that could also explain the difference in performance.
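To make the "decoupled" idea concrete, here is a minimal sketch of a pipeline in which hashing hands blocks to a separate uploader through a bounded queue instead of waiting for each upload to finish. It is not how Kopia or Restic is actually implemented; `backup`, `slow_upload`, and `max_pending` are hypothetical names chosen for illustration.

```python
import hashlib
import queue
import threading
import time


def backup(blocks, upload, max_pending=8):
    """Hash blocks on the calling thread while a worker thread uploads them.

    Illustrative only: error handling is omitted, and a real tool would
    use several uploaders and persist an index of what was stored.
    """
    # Bounded queue: hashing only stalls if uploads fall far behind.
    pending = queue.Queue(maxsize=max_pending)
    uploaded = []

    def uploader():
        while True:
            item = pending.get()
            if item is None:  # sentinel: no more blocks
                return
            digest, data = item
            upload(digest, data)
            uploaded.append(digest)

    worker = threading.Thread(target=uploader)
    worker.start()
    try:
        for data in blocks:
            digest = hashlib.sha256(data).hexdigest()
            # Hand the block off and move straight on to hashing the next one.
            pending.put((digest, data))
    finally:
        pending.put(None)
        worker.join()
    return uploaded


def slow_upload(digest, data):
    """Stand-in for a slow backend (network or disk latency)."""
    time.sleep(0.01)


if __name__ == "__main__":
    ids = backup([b"a" * 1024, b"b" * 1024, b"c" * 1024], slow_upload)
    print(len(ids), "blocks uploaded")
```

With a strictly sequential design, the total time is the sum of hashing and upload time per block; with the queue in between, the slower of the two stages dominates instead.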

Source: internal SATA SSD, Btrfs, 58496 files 31.029 GiB. Target: internal SATA harddisk, Ext4.

  • Kopia 0.5.2 (AppImage downloaded from GitHub), repository initialized with default parameters (GUI)
  • Restic 0.9.6 (current Arch package), repository also initialized with default parameters (CLI)

My results:

  • Kopia: 7:48, repository size 30.72 GiB
  • Restic: 6:55, repository size 30.56 GiB

So, no surprise here.
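For anyone who prefers throughput to wall-clock time, the figures above can be converted with a few lines of Python. This assumes throughput is measured against the 31.029 GiB source size rather than the resulting repository size; the helper names are made up for this example.

```python
GIB_IN_MIB = 1024


def seconds(minutes, secs):
    """Convert a m:ss duration into seconds."""
    return minutes * 60 + secs


def throughput_mib_s(size_gib, duration_s):
    """Average throughput in MiB/s for a source of size_gib GiB."""
    return size_gib * GIB_IN_MIB / duration_s


source_gib = 31.029
kopia_s = seconds(7, 48)   # 468 s
restic_s = seconds(6, 55)  # 415 s

print(f"Kopia:  {throughput_mib_s(source_gib, kopia_s):.1f} MiB/s")
print(f"Restic: {throughput_mib_s(source_gib, restic_s):.1f} MiB/s")
print(f"Kopia took {kopia_s / restic_s - 1:.0%} longer")
```

That works out to roughly 68 MiB/s for Kopia and 77 MiB/s for Restic, with Kopia taking about 13% longer on this run.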


From an earlier comment, by eventual consistency I do mean being able to handle object storage (e.g. AWS S3) eventual consistency correctly.

I am also benchmarking the two systems and hope to have a blog post with more data out soon. kopia results are quite impressive.


Does restic (via rclone) have issues with handling S3 backends? I think that rclone does all the work on managing the backend pieces and so restic should be ok too?

(I don’t know much about eventual consistency, but as long as restic doesn’t immediately read recently uploaded files and modify them, it shouldn’t be a problem?)

Despite Kopia being more recent it will be interesting to see how the release and feature implementation cadence compares to Restic.

Admittedly there is a balance to be had in terms of stability, but some feature discussions for Restic date back several years, often with promising contributions sitting idle with little directional feedback.

Kopia author here - I just found this thread.

Eventual consistency improvements in Kopia 0.6 (to be released soon) mean that on a reasonable eventually-consistent store such as S3 which guarantees “read-after-create” semantics (with the “as long as you don’t peek before creating” caveat) but does not guarantee “list-after-create”, kopia will be able to ensure:

  • connected repository clients will see their own writes for as long as repository is connected
  • you will be able to see writes from other repository clients and they won’t “go back in time”

This is done using caching and two-stage compaction logs, which guarantee that as long as eventual consistency settles writes within ~1hr (currently that time is hardcoded, but could be configurable for extreme cases in the future), clients should observe this reasonable behavior and garbage collection remains safe.
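The two guarantees above can be illustrated with a toy model. This is only a sketch of the general caching idea, not Kopia's implementation: it ignores deletions, compaction logs, and the settle-time window entirely, and `LaggyStore` and `Client` are hypothetical classes invented for this example.

```python
class LaggyStore:
    """Toy object store: get() sees new objects at once, list() lags behind."""

    def __init__(self):
        self._objects = {}
        self._listed = set()

    def put(self, key, data):
        self._objects[key] = data

    def get(self, key):
        return self._objects[key]

    def list(self):
        return set(self._listed)

    def settle(self):
        """Simulate the listing catching up with all writes."""
        self._listed = set(self._objects)

    def rewind(self):
        """Simulate a stale replica answering list() with an old view."""
        self._listed = set()


class Client:
    """Remembers every key it wrote or observed, so its view never shrinks."""

    def __init__(self, store):
        self._store = store
        self._known = set()

    def put(self, key, data):
        self._store.put(key, data)
        self._known.add(key)  # own writes are visible immediately

    def list(self):
        # Merge the (possibly stale) listing into what was seen before.
        self._known |= self._store.list()
        return set(self._known)


if __name__ == "__main__":
    store = LaggyStore()
    a, b = Client(store), Client(store)
    a.put("k1", b"x")
    print(a.list())  # a sees its own write although store.list() is still empty
    store.settle()
    print(b.list())  # b sees a's write once the listing has settled
    store.rewind()
    print(b.list())  # b's view does not go back in time
```

The hard part in a real repository is what this sketch leaves out: deciding when it is safe to forget or delete something, which is where the settle-time assumption comes in.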

BTW. It’s been >3 months since 0.5 release and there have been big performance improvements since then in the Git repository, so when benchmarking it’s best to compile the binary yourself.

If you have any more questions, feel free to ping me on https://slack.kopia.io

FWIW, fast-forward a number of years and that will be the case with most software, including Kopia.

You may well be right, and my comment was not intended as a criticism.

For those of us on the Restic bandwagon it is simply disheartening at times to realise that features/improvements we desire are unlikely to materialise any time soon (in my case VSS support and of course speedier prunes).


Being one that gets a mail for every comment/issue/PR/etc in the restic repository I can tell you that there’s a lot of work, not to mention thought, going on even if it may not seem so (including for the prune command). I personally think the next restic release will be quite an improvement.

The fact that there’s some years-old issues and PRs is mostly due to us not wanting to straight off close the ones we don’t envision merging soon (we’d rather keep them around for various reasons) as well as there being a lot more work to dealing with even seemingly simple ones than most people probably imagine. Writing code is easy, reviewing and making sure code won’t break things is hard and takes a lot of time, even for seemingly simple things. Not meant as an excuse or anything, just saying there’s more to it than one might think.

I have your VSS thing (assuming yours is the one I’m thinking of) on the radar; it’s one of those I’d like to get merged sometime soon. But not a lot of testing has been done on it yet, and if I’m not mistaken someone reported a 32-bit problem with it just the other day, so it’s not entirely straightforward.


Fair comment, I understand and appreciate all that you have said, no doubt most Restic users would echo the same.

Sadly I am not a budding contributor; indeed, my programming skills are more than a little rusty (and not in the trendy way). That said, I do recognise the work which must go into ensuring stability, while also balancing the challenge of feature parity.

Looking forward to the next release, I am sure it will be well received.


If your particular issue is with prunes, Alex Weiss has done some good work on a new prune mechanism, with built binaries, here:

@jkowalski since we have you here: I guess you did some market research and were aware of Restic prior to starting Kopia. Skimming Kopia’s documentation its architecture seems to resemble Restic a lot.

Can you share some details regarding the (planned) differences and maybe your motivation to start from scratch rather than, e.g., forking Restic?

Believe it or not, I have never actually used restic, even to this day. I think I only became aware of it around 1.5 years ago, when somebody using Kopia asked me about a comparison.

I’ve been experimenting with building what is now Kopia for >5 years now (btw it used to be called “FREDI” which stood for “fast, remote, encrypted, deduplicated, incremental” backup). Initially it was meant to be more of a personal research project than a practical tool. I’ve been working for Google Cloud since 2012 and I became fascinated with the potential for very cheap and virtually unlimited and highly-available storage. I wanted to build not just a cloud backup utility, but first and foremost a properly-layered, fully client-side encrypted, multi-user, content addressable storage without a dedicated server (so only using GCS or S3). You can still see remnants of it in https://github.com/kopia/repo

Over the early years I was experimenting with organizing the repository (or “vault” as it used to be called) and tried, wrote, rewrote and renamed things a lot. I think I went through maybe 3 or 4 major ideas for packing, index organization, object naming, splitting, compression and encryption.

Around the 0.3 timeframe, when it became clear I was onto something, I started focusing more on polishing the user experience and adding an HTML-based UI, and things became usable enough that I could finally get rid of my CrashPlan dependency on all my computers around the house.

At this point (with 0.6.0 release out very shortly), Kopia has almost all the core features I originally envisioned in a personal/LAN backup solution.

Since Kopia started recently getting more serious adoption and contributions from Kasten and others, we started focusing more on robustness, performance, resource utilization, safety, etc. We just got major Garbage Collection safety and performance improvements in.

In addition to what we have today, for v1.0.0 I want to be able to have endurance tests for bigger repositories (10TB+), complete the UI and finish support for remote repositories with sharing/deduplicating data amongst folks where there’s not complete trust (imagine LAN or a dorm room situation).

What’s beyond is largely up to community - folks are already trying to push Kopia in some interesting directions and I’m sure more of that will come.


Thanks for your detailed response, Jarek! It’s really interesting to see how these two projects developed in a very similar direction, apparently independently of each other! It will be interesting to compare Restic to Kopia to see what each project’s strengths are and which use cases they are best suited for!

I recently benchmarked different versions of kopia and, given the discussions there, also included restic 0.9.6 in the mix. Results include both large file and small file benchmarks and are available at https://blog.kasten.io/benchmarking-kopia-architecture-scale-and-performance. I would be happy to answer any questions people have.


That’s quite interesting. You might want to expand into some other backup programs to add value to the backup comparison:

(a) Borg / Duplicati: These consolidate data chunks into 500 MB (Borg) / 50 MB (Duplicati) blobs, which may show interesting results for the small file test you did.

(b) Duplicacy: The dedup of Restic doesn’t deal well with changes that shift the chunk boundaries (see some work I did in link below). Duplicacy is much better in that respect (and generally a strong performer; it is lock-free like kopia). Worth a look and a comparison as well.
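For readers unfamiliar with what "changes that shift the chunk boundaries" means in general, the toy comparison below inserts a single byte at the front of a file and counts how many chunks survive unchanged under fixed-size versus content-defined chunking. It does not model the chunkers that Restic, Kopia, or Duplicacy actually use and makes no claim about how any of them behaves; the window size, mask, and the use of CRC32 over a sliding window as a stand-in for a real rolling hash are all assumptions made for this example.

```python
import random
import zlib

WINDOW = 16  # bytes hashed to decide whether to cut
MASK = 0x3F  # cut when the low 6 bits are zero, so ~64-byte chunks on average


def fixed_chunks(data, size=64):
    """Cut every `size` bytes, regardless of content."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def content_defined_chunks(data):
    """Cut wherever the hash of the last WINDOW bytes matches the mask.

    CRC32 of the window stands in for a rolling hash; it is recomputed
    at every position, which is slow but keeps the sketch short.
    """
    chunks, start = [], 0
    for end in range(WINDOW, len(data) + 1):
        if zlib.crc32(data[end - WINDOW:end]) & MASK == 0:
            chunks.append(data[start:end])
            start = end
    if start < len(data):
        chunks.append(data[start:])
    return chunks


def shared(chunker, old, new):
    """Number of distinct chunks that both versions have in common."""
    return len(set(chunker(old)) & set(chunker(new)))


rng = random.Random(0)
old = bytes(rng.getrandbits(8) for _ in range(20000))
new = b"!" + old  # one inserted byte shifts every later offset

print("fixed-size shared chunks:     ", shared(fixed_chunks, old, new))
print("content-defined shared chunks:", shared(content_defined_chunks, old, new))
```

With fixed-size cuts, the shared count should be at or near zero, because every chunk after the insertion now starts one byte later. With content-defined cuts, only the chunk containing the insertion changes, since the cut points depend on the surrounding bytes rather than on absolute offsets.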


Interesting benchmark; however, I’d like to see results for a more common use case: snapshotting an existing directory where very little, or nothing, has changed.

Consider this: I run hourly snapshots of ~500 000 files, and I want to know which one is least demanding (IO / CPU), and fastest in daily usage of taking snapshots where very little has changed.

My preliminary tests show that kopia is faster:
restic: 1 minute 02 seconds, IO usage 58534 reads, read bytes 1.6 MB
kopia: 25 seconds, IO usage 100558 reads, read bytes 1.9 MB

The IO usage figures are very rough; my benchmarking game isn’t up to the task yet. But the total time taken is correct: kopia is considerably faster if you take snapshots often.

If it runs for only one minute every hour and you don’t want it to heavily hit on CPU and IO, you can consider using a command like this:

/usr/bin/nice -n 19 /usr/bin/ionice -c3 /usr/bin/restic ...

Good point. They didn’t make it into the post but I also ran a “0 change” backup right after the first full run for all tools. For the large files experiment, it didn’t make too much of a difference in absolute terms (kopia: 0.5 sec vs. restic 3 sec). For the small files experiment, it was a bit more (kopia: 37 secs vs. restic 69 secs).