Using a database in cache to reduce memory consumption?


#1

I was reading the discussion in this rclone issue regarding caches, where people talked about how a database could be used to avoid loading everything into memory. restic also currently seems to have memory issues with some expensive operations like check and prune.

What do restic developers think of using a database to help solve this issue?

The index in the remote repository would stay encrypted, as it is right now, but as it is downloaded, a local decrypted database could be built in the cache, reducing memory consumption for certain operations.
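
To make the idea concrete, here is a minimal sketch, assuming SQLite as the local store and a simplified index entry (blob ID, pack ID, offset, length). The table layout and the function names `open_index`, `add_entries` and `lookup` are hypothetical; this is not restic's actual index format or API.

```python
import sqlite3

def open_index(path):
    """Open (or create) the local index database at the given path."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS blobs ("
        " blob_id TEXT PRIMARY KEY,"
        " pack_id TEXT NOT NULL,"
        " offset INTEGER NOT NULL,"
        " length INTEGER NOT NULL)"
    )
    return db

def add_entries(db, entries):
    """Insert decrypted index entries as they are downloaded."""
    with db:  # commits on success, rolls back on error
        db.executemany(
            "INSERT OR REPLACE INTO blobs VALUES (?, ?, ?, ?)", entries
        )

def lookup(db, blob_id):
    """Return (pack_id, offset, length) for a blob, or None if unknown."""
    return db.execute(
        "SELECT pack_id, offset, length FROM blobs WHERE blob_id = ?",
        (blob_id,),
    ).fetchone()

# Hypothetical usage: ":memory:" keeps the example self-contained;
# a real cache would pass a file path inside the cache directory.
db = open_index(":memory:")
add_entries(db, [("blob-a", "pack-1", 0, 512), ("blob-b", "pack-1", 512, 128)])
print(lookup(db, "blob-b"))  # ('pack-1', 512, 128)
```

With a layout like this, an operation only pulls the rows it asks for into memory, and the database file can be rebuilt from the encrypted remote index at any time.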


#2

I’ve already thought about something like this. The memory issues are probably caused by my suboptimal index handling code, which keeps a rather large data structure in memory. However, before we make restic more complex, I’d like to trim down and improve that data structure. Let’s see how that goes first.


#3

Duplicati depends on SQLite databases for all its operations; the drawback there is database performance.

How would this work in restic's case?

Would the database be used only temporarily to reduce memory load, or would it also store local caches and logs?


#4

Using a database is a good idea, but performance would be the next question.

Memory I/O vs. disk I/O:
I don't think it makes much difference, though it varies from PC to PC.

Database format: SQLite, a simple text file, or a central database like MySQL?
A text file is faster than SQLite, I think. A central database would use some bandwidth, but that should be acceptable for very large datasets (10 TB and up) coming from multiple sources.
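
Rather than guess, the text-file-versus-SQLite question can be measured. Below is a rough sketch, assuming a made-up one-line-per-blob text format and a two-column table (neither is restic's real index layout). All function names are hypothetical. It times a single lookup of the last entry with each approach; results will differ from machine to machine, and a single timing is only a starting point.

```python
import os
import sqlite3
import tempfile
import time

def build_txt(path, n):
    """Write n entries as 'key value' lines to a plain text file."""
    with open(path, "w") as f:
        for i in range(n):
            f.write(f"blob-{i} pack-{i % 100}\n")

def build_sqlite(path, n):
    """Write the same n entries to an SQLite table keyed by blob ID."""
    db = sqlite3.connect(path)
    db.execute("CREATE TABLE blobs (blob_id TEXT PRIMARY KEY, pack_id TEXT)")
    with db:  # one transaction for all inserts
        db.executemany(
            "INSERT INTO blobs VALUES (?, ?)",
            ((f"blob-{i}", f"pack-{i % 100}") for i in range(n)),
        )
    return db

def lookup_txt(path, key):
    """Linear scan: a plain text file has no index to jump to the key."""
    with open(path) as f:
        for line in f:
            k, v = line.split()
            if k == key:
                return v
    return None

def lookup_sqlite(db, key):
    """Indexed lookup through the primary key."""
    row = db.execute(
        "SELECT pack_id FROM blobs WHERE blob_id = ?", (key,)
    ).fetchone()
    return row[0] if row else None

def timed(fn, *args):
    """Return (result, elapsed seconds) for a single call."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start

tmp = tempfile.mkdtemp()
n = 50_000
txt_path = os.path.join(tmp, "index.txt")
build_txt(txt_path, n)
db = build_sqlite(os.path.join(tmp, "index.db"), n)

key = f"blob-{n - 1}"  # last entry: worst case for the linear scan
for name, fn, src in (("txt", lookup_txt, txt_path), ("sqlite", lookup_sqlite, db)):
    value, seconds = timed(fn, src, key)
    print(f"{name}: {value} in {seconds:.6f}s")
```

Sequentially reading a whole text file may well be fast; the sketch is about keyed lookups, which is where the two formats differ structurally.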

How would this work in restic's case?
Would the database be used only temporarily to reduce memory load, or would it also store local caches and logs?


#5

I think it’s probably a bit early to answer that. The feature hasn’t even been properly designed yet, and benchmarking will likely be a big part of the design process.