FWIW, fast-forward a number of years and that will be the case with most software, including Kopia.
You may well be right, and my comment was not intended as a criticism.
For those of us on the Restic bandwagon it is simply disheartening at times to realise that features/improvements we desire are unlikely to materialise any time soon (in my case VSS support and of course speedier prunes).
Being one who gets a mail for every comment/issue/PR/etc in the restic repository, I can tell you that there’s a lot of work, not to mention thought, going on even if it may not seem so (including for the prune command). I personally think the next restic release will be quite an improvement.
The fact that there are some years-old issues and PRs is mostly due to us not wanting to straight off close the ones we don’t envision merging soon (we’d rather keep them around for various reasons), as well as there being a lot more work in dealing with even seemingly simple ones than most people probably imagine. Writing code is easy; reviewing and making sure code won’t break things is hard and takes a lot of time, even for seemingly simple things. Not meant as an excuse or anything, just saying there’s more to it than one might think.
I have your VSS thing (assuming yours is the one I’m thinking of) on the radar; it’s one of those I’d like to get merged sometime soon. But there still hasn’t been a lot of testing done on it, and if I’m not mistaken someone reported a 32-bit problem with it just the other day, so it’s not entirely straightforward.
Fair comment, I understand and appreciate all that you have said, no doubt most Restic users would echo the same.
Sadly I am not a budding contributor; indeed my programming skills are more than a little rusty (and not in the trendy way). That said, I do recognise the work which must go into ensuring stability, while also balancing the challenge of feature parity.
Looking forward to the next release, I am sure it will be well received.
If your particular issue is with prunes, there is some good work by Alex Weiss, with built binaries, on a new prune mechanism here:
@jkowalski since we have you here: I guess you did some market research and were aware of Restic prior to starting Kopia. Skimming Kopia’s documentation its architecture seems to resemble Restic a lot.
Can you share some details regarding the (planned) differences and maybe your motivation to start from scratch rather than, e.g., forking Restic?
Believe it or not, I have never actually used restic, even to this day. I think I only became aware of it around 1.5 years ago, when somebody using Kopia asked me about a comparison.
I’ve been experimenting with building what is now Kopia for >5 years now (btw it used to be called “FREDI” which stood for “fast, remote, encrypted, deduplicated, incremental” backup). Initially it was meant to be more of a personal research project than a practical tool. I’ve been working for Google Cloud since 2012 and I became fascinated with the potential for very cheap and virtually unlimited and highly-available storage. I wanted to build not just a cloud backup utility, but first and foremost a properly-layered, fully client-side encrypted, multi-user, content addressable storage without a dedicated server (so only using GCS or S3). You can still see remnants of it in https://github.com/kopia/repo
Over the early years I was experimenting with organizing the repository (or “vault” as it used to be called) and tried, wrote and rewrote, and renamed things a lot. I think I went through maybe 3 or 4 major ideas for packing, index organization, object naming, splitting, compression, encryption.
Around the 0.3 timeframe, when it became clear I was onto something, I started focusing more on polishing the user experience and adding an HTML-based UI, and things became usable enough that I could finally get rid of my CrashPlan dependency on all my computers around the house.
At this point (with the 0.6.0 release out very shortly), Kopia has almost all the core features I originally envisioned in a personal/LAN backup solution.
Since Kopia recently started getting more serious adoption and contributions from Kasten and others, we started focusing more on robustness, performance, resource utilization, safety, etc. We just got major Garbage Collection safety and performance improvements in.
In addition to what we have today, for v1.0.0 I want to be able to have endurance tests for bigger repositories (10TB+), complete the UI and finish support for remote repositories with sharing/deduplicating data amongst folks where there’s not complete trust (imagine LAN or a dorm room situation).
What’s beyond is largely up to the community: folks are already trying to push Kopia in some interesting directions and I’m sure more of that will come.
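For readers new to the term, the “content addressable storage” idea mentioned in the post above can be illustrated with a minimal Python sketch. It is not Kopia’s repository format: the `ContentStore` class and its methods are hypothetical names made up for this illustration, SHA-256 is just an assumed hash choice, and encryption, packing and indexing are left out entirely.

```python
import hashlib


class ContentStore:
    """Toy in-memory content-addressable store (hypothetical, illustration only)."""

    def __init__(self):
        self._blobs = {}  # content ID (hex digest) -> stored bytes

    def put(self, data: bytes) -> str:
        # The key is derived from the content itself, so identical data
        # always maps to the same ID and is stored only once.
        content_id = hashlib.sha256(data).hexdigest()
        self._blobs.setdefault(content_id, data)
        return content_id

    def get(self, content_id: str) -> bytes:
        data = self._blobs[content_id]
        # Re-hashing on read detects a blob that no longer matches its ID.
        if hashlib.sha256(data).hexdigest() != content_id:
            raise ValueError("blob does not match its content ID")
        return data

    def __len__(self) -> int:
        return len(self._blobs)


store = ContentStore()
first = store.put(b"same bytes")
second = store.put(b"same bytes")  # deduplicated: no new blob is stored
print(first == second, len(store))
```

Because the ID is a pure function of the bytes, any client can compute it without asking a server, which is roughly why such a store can sit directly on an object store like GCS or S3.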
Thanks for your detailed response, Jarek! It’s really interesting to see how these two projects developed in a very similar direction, apparently independent of each other! It will be interesting to compare Restic to Kopia to see what each project’s strengths are and which use cases they are best suited for!
I recently benchmarked different versions of kopia and, given the discussions there, also included restic 0.9.6 in the mix. Results include both large file and small file benchmarks and are available at https://blog.kasten.io/benchmarking-kopia-architecture-scale-and-performance. I would be happy to answer any questions people have.
That’s quite interesting. You might want to expand into some other backup programs to add value to the backup comparison:
(a) Borg / Duplicati: These consolidate data chunks into 500 MB (Borg) / 50 MB (Duplicati) blobs, which may show interesting results for the small file test you did.
(b) Duplicacy: The dedup of Restic doesn’t deal well with changes that shift the chunk boundaries (see some work I did in link below). Duplicacy is much better in that respect (and generally a strong performer; it is lock-free like kopia). Worth a look and a comparison as well.
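To make the point about shifting chunk boundaries concrete, here is a small Python sketch of the general idea. It is a toy and not the chunker of any tool named in this thread: the chunk size, the mask and the gear-style rolling hash are all assumptions chosen for the demo. It compares how many chunks a deduplicating store could reuse after a one-byte insert, with fixed-offset cuts versus content-defined cuts.

```python
import random

# Hypothetical parameters chosen for the demo only.
FIXED_SIZE = 1024      # bytes per fixed-size chunk
MASK = (1 << 10) - 1   # cut when the low 10 hash bits are zero (~1 KiB average)
_rng = random.Random(42)
GEAR = [_rng.getrandbits(32) for _ in range(256)]  # one random value per byte value


def fixed_chunks(data: bytes) -> list:
    """Cut at fixed offsets, regardless of content."""
    return [data[i:i + FIXED_SIZE] for i in range(0, len(data), FIXED_SIZE)]


def content_defined_chunks(data: bytes) -> list:
    """Cut wherever a rolling (gear-style) hash of recent bytes hits a pattern."""
    chunks, start, h = [], 0, 0
    for i, byte in enumerate(data):
        # Older bytes are shifted out of the hash, so the cut decision
        # depends only on the last few bytes, not on the absolute offset.
        h = ((h << 1) + GEAR[byte]) & 0xFFFFFFFF
        if (h & MASK) == 0:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])
    return chunks


def shared(old: list, new: list) -> int:
    """Number of distinct chunks present in both versions (reusable by dedup)."""
    return len(set(old) & set(new))


original = bytes(_rng.getrandbits(8) for _ in range(64 * 1024))
modified = b"X" + original  # a one-byte insert at the front shifts every offset

fixed_reused = shared(fixed_chunks(original), fixed_chunks(modified))
cdc_reused = shared(content_defined_chunks(original), content_defined_chunks(modified))
print("fixed-size chunks reused:", fixed_reused)
print("content-defined chunks reused:", cdc_reused)
```

With fixed offsets every chunk after the insert changes, while content-defined cuts resynchronise shortly after the edit. How well a real tool copes depends on details this sketch ignores, such as minimum and maximum chunk sizes and how chunks are packed.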
Interesting benchmark; however, I’d like to see results for a more common use case: snapshotting an existing directory where very little, or nothing, has changed.
Consider this: I run hourly snapshots of ~500 000 files, and I want to know which one is least demanding (IO / CPU), and fastest in daily usage of taking snapshots where very little has changed.
My preliminary tests show that kopia is faster:
restic: 1 minute 02 seconds, IO usage 58534 reads, read bytes: 1.6 MB
kopia: 25 seconds, IO usage 100558 reads, read bytes: 1.9 MB
The IO usage figures are very rough; I haven’t got my benchmarking game up to the task yet. But the total time taken is correct: kopia is considerably faster if you run snapshots often.
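For anyone wanting to repeat this kind of wall-clock comparison, a minimal Python sketch of a timing harness follows. The `time_command` helper is a hypothetical name, and the command it runs here is only a stand-in so the sketch works anywhere; swap in whatever snapshot command you actually use. It measures elapsed time only, not IO or CPU.

```python
import subprocess
import sys
import time


def time_command(argv, runs=3):
    """Run argv several times and return the wall-clock durations in seconds."""
    durations = []
    for _ in range(runs):
        started = time.perf_counter()
        # check=True raises if the command fails, so a broken run is not timed silently.
        subprocess.run(argv, check=True, stdout=subprocess.DEVNULL)
        durations.append(time.perf_counter() - started)
    return durations


# Stand-in command (assumption): replace with the snapshot command under test.
durations = time_command([sys.executable, "-c", "pass"], runs=2)
print(f"min {min(durations):.2f}s, max {max(durations):.2f}s over {len(durations)} runs")
```

Repeating the run a few times and looking at the minimum helps separate the tool’s own cost from filesystem cache effects, which matter a lot when hundreds of thousands of files are scanned.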
If it runs for only one minute every hour and you don’t want it to hit CPU and IO heavily, you can consider using a command like this:
/usr/bin/nice -n 19 /usr/bin/ionice -c3 /usr/bin/restic ...
Good point. They didn’t make it into the post but I also ran a “0 change” backup right after the first full run for all tools. For the large files experiment, it didn’t make too much of a difference in absolute terms (kopia: 0.5 sec vs. restic 3 sec). For the small files experiment, it was a bit more (kopia: 37 secs vs. restic 69 secs).
Having tried both, it doesn’t come as a surprise that both have strong points that the other one misses (kopia: compression, performance, GUI; restic: rclone backend, maturity, documentation/forum, user base). I know this is a long shot, but a combined restia/kopic would leap-frog both projects and save a lot of (re-)implementation effort. @fd0, @jkowalski – any luck?
I applaud the idealism; however, this really does seem like a huge ask for two well-invested and independent authors. That said, we watch with interest to see what evolves (joint effort or otherwise).
I don’t think a merge is a good thing — it is the (friendly) competition that drives the progress.
Take the rclone backend: that would be work, but presumably, after some cost-benefit analysis, Kopia would implement it eventually.
It’s usually already a huge amount of work to reunite two projects which have split some time in the past. But restic and kopia don’t share any code at all, along with having wildly different internal abstractions, command-line interface philosophy and repository format.
So merging the projects would essentially amount to a full-blown rewrite, at which point the result would just be a third backup program. But that doesn’t preclude either project from benefiting from good ideas of the other.
I’ve just come across restic and kopia. I found restic first, then read this thread and followed some of the benchmark links and thought kopia sounded better.
I could not find it in the Debian repos, so I installed kopia from the latest release on its GitHub page. I then spent hours trying to get kopia to connect over sftp, and failed! It seems to have a particular implementation of SSH that means you can’t use ssh key agents, or local .ssh/config files, or encrypted private keys(!). I then spent a good while trying to get kopia to connect over webdav, and failed there, too. There was scant documentation for these on the kopia documentation site.
I thought I’d try restic. In 5 minutes I had it connected to my SFTP storage and working at committing my first chunk of data, apparently without error (maybe I’m typing too soon…!).
I value kopia’s compression, but restic was quick and easy to get going.
Since yesterday, Kopia does have a Discourse forum as well.
Please see HELP WANTED: Testing Windows VSS support.