Hi all,
Before I embark on this project, I wanted to see if anyone else had already done something similar.
I'm hoping to beef up my exclude lists. I've written a small script that takes a list of ~10 recent snapshots and `restic diff`s them, and with a little awk / sed I'm left with a 500,000-line file. I can `sort | uniq -c | sort` it to see that there are a few files that are getting frequently updated and could be considered for exclusion, but I think I'd get much better bang for buck by excluding culprit parent directories (instead of exact file matches).
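For context, the script is roughly along these lines. This is a simplified, untested sketch: the awk filter is just my guess at the `<change type> <path>` lines that `restic diff` prints, it assumes jq is available, and it will trip over paths containing spaces.

```
# Diff each of the last ~10 snapshots against its predecessor and keep just
# the changed paths.  tail -n 10 assumes the JSON list is oldest-first, like
# the table output; the awk filter assumes "<change type> <path>" lines.
prev=""
for id in $(restic snapshots --json | jq -r '.[].id' | tail -n 10); do
    [ -n "$prev" ] && restic diff "$prev" "$id"
    prev="$id"
done | awk 'NF == 2 && $2 ~ /^\// { print $2 }' > changes.txt

# Exact-path counts (what I am doing today)
sort changes.txt | uniq -c | sort -rn | head -n 20
```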
The best idea I've had so far is to build a tree structure that, when printed, includes the count of children under each directory, sorts branches by that count, and perhaps lets me truncate the depth.
I could probably manage something just by sorting the file and using `awk -F/ { stuff }` to determine the depth, but I usually end up regretting it when I start using bash scripts for something not entirely trivial.
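For the fixed-depth version, something like this (untested) sketch is probably enough; it truncates each path to at most a few directory components and counts changes per prefix. The depth of 3 is an arbitrary starting point.

```
# Count changes per parent directory, truncated to at most <d> components.
awk -F/ -v d=3 '{
    n = (NF - 2 < d) ? NF - 2 : d   # never go past the parent dir of the file
    out = ""
    for (i = 2; i <= n + 1; i++) out = out "/" $i
    if (out != "") print out
}' changes.txt | sort | uniq -c | sort -rn | head -n 30
```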
```
head -n 100 changes.txt | tree -a --fromfile
```

^^ Something like that, but with file counts for each directory, would work.
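The closest I've come up with is the gawk sketch below, which rolls the counts up into every ancestor directory and prints an indented tree. It's only a sketch: siblings come out in path order rather than count order, and maxdepth=4 is arbitrary.

```
# Roll change counts up into every ancestor directory (up to maxdepth)
# and print an indented tree with per-directory totals.
gawk -F/ -v maxdepth=4 '{
    p = ""
    for (i = 2; i <= NF - 1 && i <= maxdepth + 1; i++) {
        p = p "/" $i
        count[p]++
    }
}
END {
    n = asorti(count, dirs)              # sort directory paths (gawk only)
    for (k = 1; k <= n; k++) {
        d = dirs[k]
        depth = gsub(/\//, "/", d)       # number of slashes == depth
        indent = substr("                ", 1, (depth - 1) * 2)
        printf "%s%s  (%d)\n", indent, dirs[k], count[dirs[k]]
    }
}' changes.txt
```

Sorting each level of the tree by count would mean buffering the whole thing and sorting siblings before printing, which is about where I stop wanting to do this in awk.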
Some of these (below) might be adaptable, but by default they count actual files in a directory tree rather than reading a list of filenames from stdin: