CERN is testing restic for its backups

Maybe another success story: https://cds.cern.ch/record/2659420

(If this is not the right place for this, please delete this post.)

6 Likes

Awesome, thanks for the hint!

1 Like

@fd0 maybe you can contact them to see what happens at LAAAARGE scale :wink:
However, this is still a WIP, but a promising one.

1 Like

Heh, they have 16k users with (combined) 3PB of data, but they use one repository (in one S3 bucket) per user, so the memory usage will not be such a huge issue :slight_smile: Good trade-off, IMHO.

And it’s just at the evaluation stage for now. I’m curious about the results of their evaluation…

1 Like

I bet you are. :slight_smile: I am too, and I hope they publish their results or recommendations.

Keep calm, that is their goal :slight_smile: For now they have just 200 users with a total of 5M files.
That is already more than one would have for a personal backup :wink:

1 Like

I would say that surviving a WIP at an organisation like CERN, with this number of files, is already a reason to open a glass (or two) of your favourite drink.

4 Likes

Hi, I am the person running this project; I’ve been around for a while, bothering you on the forum/GitHub :slight_smile:

As a quick update, this project is progressing fast and I’m very confident about it going into production at some point. Currently we are backing up 370 accounts daily, and we plan to increase that to 1k shortly.

Also, if we make a mess in one repository, only one user would be affected :slight_smile: Another reason for this layout is that we get more flexibility with bucket placement policies, like moving important users to critical areas, adding extra S3-side replication for certain users, etc. The main drawback is that we don’t get the full power of de-duplication, but as you said, it’s a fair trade-off.
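For anyone curious how the one-repository-per-user layout looks in practice, here is a minimal sketch. The endpoint, bucket prefix, and source path are hypothetical, not CERN’s actual setup, and the script only prints the restic commands it would run so it stays self-contained:

```shell
#!/bin/sh
# Sketch: one restic repository (= one S3 bucket) per user.
# "s3.example.org", the "restic-<user>" bucket naming, and the
# /home/<user> source path are illustrative assumptions.

backup_user() {
    user="$1"
    repo="s3:https://s3.example.org/restic-${user}"

    # In a real run, credentials would come from a secrets store first:
    #   export AWS_ACCESS_KEY_ID=...  AWS_SECRET_ACCESS_KEY=...
    #   export RESTIC_PASSWORD=...

    # Print instead of executing, so the sketch runs without restic installed.
    echo "restic -r ${repo} init"                  # first run only
    echo "restic -r ${repo} backup /home/${user}"  # daily backup
}

backup_user alice
backup_user bob
```

Because each user gets a separate repository, per-user operations (replication policy, deletion, restore) stay independent; the cost, as noted above, is that chunks are only de-duplicated within a single user’s repository.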

Yes, for sure! Right now the orchestration tools are tightly coupled to our environment, but if this goes into production my idea is to make them more generic and share them.

I will keep you updated on any news regarding this project, and feel free to contact me if you have any questions :slight_smile:

4 Likes

@robvalca when you say S3 I assume Ceph, right?

1 Like

@fbarbeira Yes, we are using Ceph + RadosGW.

1 Like

I’m looking forward to that! :slight_smile:

1 Like

I’m very glad to hear that! It’s the same approach that we are implementing in our infrastructure. Not as ambitious as yours, but also huge (4k users and 1PB of data).

I will stay tuned to your advances! :smiley:

Dear friends,

Tomorrow at Ceph Day at CERN I will give a short recap of the current status of this project, which is still very promising and growing (14.5k users now, 35M files processed per day, 270T of combined backup repositories). I have a slide just for restic, and I will try to spread the word about all its beauty!

Cheers!

8 Likes

Thank you very much for keeping us posted :slight_smile: