One repo - multiple sources possible?

Hello,

our work setup is that each of us has the full project locally on our workstations (film music production here…). We often run into the problem that someone needs to pick up a file that was created by someone else, and that person forgot to send the file after finishing work. No, we do not want any synchronization software running on our project folder - we have team members who create huge messes in their folders, and it would mess up everyone else's folder as well.

My thought was that I could set up a restic server on my side and back up the project folders of all other workstations to that one restic repo. Not really as a backup, but as a way of accessing files when everyone else is asleep (time zone difference coming into play here…) and can't send missing files.

That way I (or any of the team members who are a little more technically skilled) could pick up files from the backup snapshots anytime I want.

I would have to expose a restic server to the internet from my location, though…
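For what it's worth, restic's companion rest-server is one way to do that. A rough sketch, assuming recent rest-server - the hostname, paths, credentials, and certificate locations are all placeholders:

```shell
# On the central machine: serve /srv/restic over HTTPS with per-user
# credentials and append-only mode, so workstations can add snapshots
# but not delete or overwrite existing data.
rest-server --path /srv/restic --listen :8000 \
    --tls --tls-cert /etc/restic/cert.pem --tls-key /etc/restic/key.pem \
    --append-only --htpasswd-file /etc/restic/.htpasswd

# On each workstation: back up the project folder to that server.
export RESTIC_REPOSITORY="rest:https://alice:secret@backup.example.com:8000/"
export RESTIC_PASSWORD_FILE="$HOME/.restic-password"
restic backup /projects/film-score
```

Append-only mode matters when a server is internet-facing: a compromised (or careless) workstation can add snapshots but cannot destroy existing ones.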

Is this feasible or am I overlooking major pitfalls? The deduplication should work for this use case as well, right?

I believe that you actually SHOULD use synchronization. Not to a common central directory, but to a per-user directory on a central device. If you are afraid that this would require huge disk usage on that central device, use for example BTRFS, which has deduplication built in.

That would have the added "benefit" (I like tinkering…) that I finally have a real argument to build a solid storage machine and familiarize myself with modern storage on Linux… something I've been holding off on because of my desire to minimize power consumption, and Raspberry Pis are just not made to handle large storage and data throughput… I tried that multiple times (with ext4 file systems) but got unrecoverable drive corruption every time after a (short!) while… most likely due to the USB connection… I don't know…

Hmmm… I was having so much fun using Restic that I thought that this could be a way of dealing with the challenge.

EDIT: sorry, I had tried BTRFS and ZFS not BTFS - just realized that…

As much as I love restic, I wouldn’t use it as a “file syncing” tool. You’d be much, much better off in my opinion to use something like Nextcloud with shared folders. Then you all have access to up-to-the-minute files etc.

You'd use restic to back up the central Nextcloud datastore.

You could do what you're proposing, but your workflow would be hard - back up from site A, then site B would have to restore, etc. Not a good workflow.

BTRFS is what I meant…! :slight_smile: My bad. Correcting my previous reply now.

The folders to sync are up to 3 TB (one of the projects…) - I'm a bit afraid of keeping that much data in sync using a synchronization tool… I've had absolutely great success using Syncthing - but never on a dataset that large - and I have had Syncthing produce failures before, too… I would not trust Nextcloud with this amount of data - I love Nextcloud, but I've had mediocre results with both self-hosted and commercially hosted setups - there are sync errors, timeouts, deletion delays - it's not a complete mess, but for me it was messy enough to be scared of it…

And perhaps I'm underestimating the time restic takes to actually create a snapshot for that… I've only tried it with one of our smaller projects (450 GB). It's incredibly fast - but of course that's with a locally attached repo.

Hmm… but no matter what, I will probably have to set up a dedicated machine for whichever solution I choose anyway.

Sorry, I’m rambling a little bit…


I would look more at something like rsync to achieve what you're after.

I mean, try it with restic, but I just don't think shoehorning a backup solution into a replication/sync situation will work well for you. But I've been wrong 1,000 times before, so this could well be 1,001!

I get it, but synchronization was not my question. We've used Backblaze for this exact purpose before - picking up something someone forgot to upload - and it works quite well, so I thought, why not build this ourselves?

The question I really had was whether restic's deduplication works just as well when backups come from different sources into the same repo.

Yes it does.

But, as you might have realized by now, restic (and quite possibly no other backup program either) does not seem to be a good choice for your use case, at least as far as I understand your use case.

Thank you!

Yes, I understand that and I agree!

Having to protect the files from certain team members is the biggest issue we’re having and is the reason why at the moment we are moving material into place by hand.

And until I have a deduplicating file system in place Restic seems to be the next best choice for being able to go back in time and pick something up from a previous snapshot.

Some realities are what they are and can’t be changed even if rational people see that there are better ways.


You’re welcome. :slight_smile:

Hmm… THAT is a use case for restic! Back up every 10 minutes or so (there is an option to not create a snapshot if nothing changed), then empower the team members to restore for themselves.
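A sketch of what that could look like as a cron job - the repository address and paths are placeholders, and the skip option requires a reasonably recent restic version:

```shell
# Hypothetical crontab entry: snapshot the project every 10 minutes.
# --skip-if-unchanged (newer restic releases) avoids creating a new
# snapshot when nothing has changed since the parent snapshot.
*/10 * * * *  restic -r rest:https://backup.example.com:8000/ \
    --password-file "$HOME/.restic-password" \
    backup --skip-if-unchanged /projects/film-score
```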

You might very well get increased productivity from this, for reasons similar to the reduced number of emergency stops on the Toyota production line when they introduced the right for EVERYONE to stop production instead of only some selected people. :slight_smile:

Just some food for thought.

I see several messages saying this is not a good idea, but other than “workflow would be difficult”, I did not find a concrete reason why among all the replies.

I think it’s great, with appropriate pruning it should work really well.

The workflow *may* be difficult, but for example I have a script that leverages restic find to pull down one specific file from an entire repository. It throws up a list of snapshots which contain that file, I scroll down to the one I want, hit enter, and that's it.
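The core of such a script is small. A sketch of the idea, not the actual script from the post above - the snapshot picker and paths are left manual here:

```shell
#!/bin/sh
# Usage: pick-file.sh <filename>
# Assumes RESTIC_REPOSITORY and RESTIC_PASSWORD_FILE are set.
file="$1"

# `restic find` lists the snapshots (with IDs and full paths)
# that contain a file matching the pattern.
restic find "$file"

# From the output, pick a snapshot ID and restore just that file:
#   restic restore <snapshot-id> --include "/full/path/to/$file" --target .
# or stream a single version straight to disk:
#   restic dump <snapshot-id> "/full/path/to/$file" > "$file"
```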
