Debugging: File paths exposed in debug logs - thoughts on anonymization?

Hey everyone,

I’ve noticed that when sharing debug logs to troubleshoot issues, restic redacts sensitive info like access keys (which is great!) but leaves file paths completely visible. This can be a privacy issue since paths often contain:

  • Usernames (/home/john/...)
  • Sensitive folder names (think “TaxDocuments”, “MedicalRecords”, etc.)
  • Sometimes even recovery keys or passwords in filenames :grimacing:

Currently, if I need to share logs, I have to manually go through and redact paths, which is tedious and error-prone. I’m wondering if others have this concern too?

I’m thinking of submitting a feature request for something like a --anonymize-paths flag that would transform paths like:

/home/john/Documents/FinancialRecords/crypto-keys.txt

into something like:

/path1/path2/path3/path4/file1.txt

This would keep the structure intact for debugging but hide the actual names.
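
To make the proposed mapping concrete, here is a minimal Python sketch of the transformation. It is not restic code, and restic has no such flag today. `PathAnonymizer` is a hypothetical name, and the sketch assumes POSIX-style paths whose last component is a file with its extension kept, as in the example above.

```python
import posixpath


class PathAnonymizer:
    """Maps each distinct path component to a stable numbered placeholder."""

    def __init__(self):
        self._dirs = {}   # real directory name -> "pathN"
        self._files = {}  # real file name -> "fileN"

    def anonymize(self, path):
        parts = [p for p in path.split("/") if p]
        if not parts:
            return path
        *dirs, name = parts
        # The same directory name always gets the same placeholder,
        # so the tree structure stays recognizable in the log.
        out = [self._dirs.setdefault(d, f"path{len(self._dirs) + 1}") for d in dirs]
        # Assumption: the last component is a file; keep only its extension.
        _, ext = posixpath.splitext(name)
        out.append(self._files.setdefault(name, f"file{len(self._files) + 1}") + ext)
        prefix = "/" if path.startswith("/") else ""
        return prefix + "/".join(out)


anonymizer = PathAnonymizer()
print(anonymizer.anonymize("/home/john/Documents/FinancialRecords/crypto-keys.txt"))
```

Note that the kept extension still leaks a little information, and the numbering depends on the order in which paths are seen.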

Questions for the community:

  1. Is this something others would find useful?
  2. Any suggestions on how this should work?
  3. Are there existing workarounds you use?

I have already opened a GitHub Issue: [Feature Request] Redact sensitive file paths in debug output · Issue #5430 · restic/restic

  • Restic will never know what is a sensitive path in your specific case.

  • This means that for obfuscation to happen, you have to provide restic with something, let’s call it a pattern, that defines what is sensitive.

  • That pattern can be a static path, a glob pattern, a regular expression, or similar.

  • You might find that you want to provide one or more of these patterns, of varying complexity.

  • An option by which you specify a pattern can be of limited functionality - a static path would work fine, even a glob, but a complex regular expression might be tricky to provide as the value to an option on the command line.

  • That leads you to wanting to provide your pattern(s) using files similar to the exclude files that restic currently supports, which support globbing.

  • Either way, regardless of how you provide the patterns to restic, you still need to produce them - that is, you need to come up with one or more patterns that match the strings you want to obfuscate (and preferably nothing else).

  • At the end of the day, what is stopping you from just taking these patterns and applying them to your debug log file using tools that are suitable for doing so, such as a simple string replacement or a search and replace using regular expressions?

  • The effort would be more or less the same: you come up with the patterns and apply them to a text file. It might even be simpler, since you can use the right tool for whatever your specific patterns require instead of shoehorning the patterns into command-line options or pattern files.
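
As a rough illustration of that post-processing approach, a few lines of Python are enough to apply a list of patterns to a log before sharing it. The patterns, the replacement strings, and the sample line are made up for the example. They are not real restic output, and you would substitute patterns that fit your own paths.

```python
import re

# Hypothetical patterns: adjust these to whatever is sensitive in your paths.
PATTERNS = [
    (re.compile(r"/home/[^/\s]+"), "/home/USER"),
    (re.compile(r"TaxDocuments|MedicalRecords"), "REDACTED_DIR"),
]


def redact(text):
    """Apply every pattern in order and return the redacted text."""
    for pattern, replacement in PATTERNS:
        text = pattern.sub(replacement, text)
    return text


# Invented sample line, only to show the effect.
print(redact("open /home/john/TaxDocuments/2023.pdf: permission denied"))
```

To process a whole file, read it, pass the contents through `redact`, and write the result to a new file that you then review before posting.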

That aspect is also discussed to some extent in Debug builds of restic include auth tokens. · Issue #1123 · restic/restic · GitHub already. To transform the filenames, my suggestion would be to just pick some secret on restic startup and hash it together with each part of the file path. The hard part is something else: catching every filename is rather difficult, as there are lots of places where filenames can end up in error and log messages.
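
A minimal sketch of that hashing idea, written in Python rather than restic's Go and not taken from restic's code: a random secret is chosen once per process and each path component is replaced by a keyed hash (HMAC-SHA256 here, which is my choice for the example). The function names and the truncation to 12 hex characters are assumptions.

```python
import hashlib
import hmac
import secrets

# Picked once at startup; never written to the log.
SECRET = secrets.token_bytes(32)


def hash_component(name, secret=SECRET, length=12):
    """Replace one path component with a truncated keyed hash."""
    digest = hmac.new(secret, name.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:length]


def hash_path(path, secret=SECRET):
    """Hash every non-empty component, keeping the separators in place."""
    return "/".join(hash_component(p, secret) if p else "" for p in path.split("/"))


print(hash_path("/home/john/Documents/FinancialRecords/crypto-keys.txt"))
```

Within one run, identical components map to identical tokens, so the directory structure stays comparable across log lines. Across runs the secret differs, so tokens from separate logs cannot be correlated or looked up in a precomputed table.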

Anonymized paths won’t be sufficient to debug everything (in particular issues related to filename matching!), but would be a useful thing to have for most cases. But it has to be clear that this is just a best-effort mechanism. We don’t have the capacity to ensure that every single debug statement and error message (e.g. from most syscalls) is guaranteed to be anonymized.

Maybe the best approach is to ensure that all output in restic goes through a logger and then apply a filter centrally. (I haven’t thought this through, though, with regard to relative paths etc.)
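
To illustrate what a central filter could look like, here is a sketch using Python's standard `logging` module as a stand-in. It says nothing about how restic's own logging is structured. `PathRedactingFilter` and the regular expression are assumptions, and the pattern only catches absolute paths without spaces, which is exactly the kind of gap the relative-path caveat points at.

```python
import logging
import re

# Heuristic: a token that starts with "/" and runs until whitespace or ":".
# Relative paths and paths containing spaces are NOT caught.
PATH_RE = re.compile(r"(?<!\S)/[^\s:]+")


class PathRedactingFilter(logging.Filter):
    """Rewrites every record passing through a handler before it is emitted."""

    def filter(self, record):
        # Resolve %-style arguments first so paths passed as args are covered.
        record.msg = PATH_RE.sub("<path>", record.getMessage())
        record.args = None
        return True


handler = logging.StreamHandler()
# Attaching the filter to the handler covers every logger that writes to it.
handler.addFilter(PathRedactingFilter())
log = logging.getLogger("redaction-sketch")
log.setLevel(logging.DEBUG)
log.addHandler(handler)
log.propagate = False

log.debug("open %s: permission denied", "/home/john/TaxDocuments/2023.pdf")
```

The filter could just as well call a hashing or numbering function instead of substituting a fixed `<path>` placeholder.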

:see_no_evil_monkey: