Piping restic on Windows - how to deal with ANSI escape sequences

I have the following line in my bash script:

restic backup --verbose --verbose --files-from $SRC_FILE --exclude-file $EXCLUDE_FILE | grep -v unchanged

The purpose is to get live output of every changed file so that I can decide afterwards whether the snapshot is worth keeping (as restic lacks a dry-run feature).

Now I would like to port this command to Windows. This is my initial attempt:

restic -r test_restic backup test --verbose --verbose | findstr /V unchanged

However, when run in a cmd window on my Windows 10 machine (1809), something like the following is produced:

e[2Kopen repository
e[2Klock repository
e[2Kload index files
e[2Kusing parent snapshot 958accad
e[2Kstart scan on [test]
e[2Kstart backup on [test]
e[2Kscan finished in 0.246s: 4 files, 10.277 KiBrors
e[2K[0:00] 0 files 0 B, total 1 files 26 B, 0 errors
e[2KFiles:           0 new,     0 changed,     4 unmodified
e[2KDirs:            0 new,     0 changed,     0 unmodified
e[2KData Blobs:      0 newtal 1 files 26 B, 0 errors
e[2KTree Blobs:      0 newtal 1 files 26 B, 0 errors
e[2KAdded to the repo: 0 B  l 1 files 26 B, 0 errors
e[2K[0:00] 0 files 0 B, total 1 files 26 B, 0 errors
e[2Kprocessed 4 files, 10.277 KiB in 0:00B, 0 errors
e[2Ksnapshot 0f361c77 savedal 1 files 26 B, 0 errors
e[2K[0:00] 4 files 10.277 KiB, total 4 files 10.277 KiB, 0 errors
e[1A

I believe this is caused by the ANSI escape codes not being handled properly. Does anyone have an idea how to deal with this? Simply removing the ANSI escape codes does not work, as the output is still somewhat garbled.

Any help would be greatly appreciated. Thanks a lot.


Here is an answer to your question:
Running cmd gives you one Windows command-line interface, while running PowerShell gives you another. PowerShell is a big improvement over .bat files.
I append the following to my command to convert the output to ASCII:
| Out-File -FilePath .\restic_temp_stdout.log -encoding ascii
PowerShell needs to be installed:
https://docs.microsoft.com/en-us/powershell/scripting/install/installing-powershell-core-on-windows?view=powershell-7
JSON-style output is also available from restic with the --json flag.

But may I ask why you would not want to keep a valid backup?

Thank you for your answer. However, as no one had answered until now, I have already gone ahead and parsed the JSON output (using the --json flag) in Python. That was much more work, but now that it is done there is no point in going back. (I also added a nice progress bar using tqdm.)

The original reason I wanted to port the script to Windows is that it was written when I was using Linux. Some time later I switched to Windows but kept running the bash script under WSL. Over time I found that my backup script was the only thing I used WSL for, so I decided to port the script to Python and delete WSL, which consumes more than 2GB of space. Originally I tried to do it in a “pythonic” way like below:

process = subprocess.Popen(cmds, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
while True:
    output = process.stdout.readline()
    if process.poll() is not None and output == b'':
        break
    if output:
        line = output.decode(sys.stdout.encoding).strip()
        if "unchanged " not in line:
            print(line)

But this code still produced output polluted with ANSI codes. I then tried a simpler approach, piping the output directly to findstr, and that failed as well. That is when I made this post, hoping that someone would have some insight.
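
For anyone who wants to stay with the plain-text output, here is a minimal sketch of how each line read in the loop above could be cleaned before filtering. It rests on an assumption that is not confirmed anywhere in this thread: that the leftover garbling comes from status updates ending in a carriage return, so that the next message overwrites them on a real terminal. The helper names `visible_text` and `keep_line` are made up for illustration.

```python
import re

# CSI sequences such as ESC[2K (erase line) or ESC[1A (cursor up).
CSI_RE = re.compile(r'\x1b\[[0-?]*[ -/]*[@-~]')


def visible_text(raw: str) -> str:
    """Approximate what a terminal would leave on screen for one raw line."""
    text = CSI_RE.sub('', raw).rstrip('\r\n')
    # Assumption: a status update is followed by '\r', so only the text
    # after the last carriage return is the message that stays visible.
    return text.rsplit('\r', 1)[-1]


def keep_line(raw: str) -> bool:
    """True for non-empty lines that do not report an unchanged file."""
    text = visible_text(raw)
    return bool(text.strip()) and 'unchanged ' not in text
```

In the loop, `line = output.decode(...)` would then be passed through `visible_text`, and printed only when `keep_line` returns True. If the real stream positions the cursor differently, the `rsplit` step would need adjusting.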

Alternatively, I tried using colorama in Python, which solved the ANSI code problem, but every status update was still printed on its own line instead of staying at the bottom and being replaced in place.
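
A status line that stays at the bottom can also be drawn without any ANSI codes, using only carriage returns and padding with spaces. The following is a rough sketch of that idea, not what restic or colorama do internally; the class name `StatusLine` and its methods are hypothetical.

```python
import sys


class StatusLine:
    """Keep one status line at the bottom using carriage returns only."""

    def __init__(self, stream=sys.stdout):
        self.stream = stream
        self.width = 0  # length of the status text currently on screen

    def _erase(self) -> str:
        # Overwrite the old status with spaces, then return to column 0.
        return '\r' + ' ' * self.width + '\r' if self.width else ''

    def status(self, text: str) -> None:
        """Replace the current status line with new text (no newline)."""
        self.stream.write(self._erase() + text)
        self.stream.flush()
        self.width = len(text)

    def message(self, text: str) -> None:
        """Print a permanent line; the next status() redraws below it."""
        self.stream.write(self._erase() + text + '\n')
        self.stream.flush()
        self.width = 0
```

Calling `status()` for every progress update and `message()` for every changed file keeps the file list scrolling while the progress text stays on the last line. This only works while the status text fits within the console width.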

Your suggestion of PowerShell is interesting. I have PowerShell too but due to this I have continued to use the Windows command line. However, as you are redirecting the output to a file, restic might correctly detect that and disable the ANSI escape codes. So the problem may not be solved by switching to PowerShell alone. I’ll try the behavior in PowerShell later when I have time.

As for why I might not want to keep a valid backup: this is rare, but sometimes files that should not be backed up unexpectedly slip into the backup. For example, when I had just started with Python, I once had __pycache__ files and PyInstaller’s intermediate build files slip into the backup. When I see unnecessary files added, I can revert the backup, adjust my exclude criteria, and try again.

I really hope that there will be a --dry-run feature, or something like --no-ansi for when the shell detection fails (without needing to parse JSON). But I understand that this is an open source project and the developer is overwhelmed, so I will accept the current status. I’ll mark this as solved for now, but I’d still appreciate better insights from anyone.

I am also using Python to read the output of restic. Converting to ASCII does not remove all of the quirks of the output. A simplistic program:
import json

with open(in_file, "r") as fin:  # in_file: path to the saved restic output
    for line in fin:
        start = line.find('{')  # skip any junk before the JSON object
        if start >= 0:
            data = json.loads(line[start:])  # returns a dictionary
(Shows my tech abilities when I have not determined how to paste python code properly. How do you do it?)
If you want a copy of the program I can send it to you, but it is so simplistic for an actual python programmer that it is not worth anything. For example perhaps the data should be put into pandas, which I have never used, to do better reporting.

You likely know this, but I’ll mention it for anyone else: be sure to tell your Windows anti-virus to ignore the restic exe. Since restic reads many files, the virus checker would otherwise scan every one of them.

Thank you for the code! The approach that I have used is a little bit different:

import json
import os
import subprocess
import sys
import re

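# Matches ANSI escape sequences (ESC-prefixed and 8-bit C1 forms).
ANSI_ESCAPE = re.compile(r'(?:\x1B[@-_]|[\x80-\x9F])[0-?]*[ -/]*[@-~]')

def escape_ansi(line):
    """Return line with ANSI escape sequences removed."""
    return ANSI_ESCAPE.sub('', line)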

# ...

cmds = [
    'restic', '--repo', restic_path_str, 'backup',
    '--verbose', '--verbose',
    '--files-from', files_from_path_str,
    '--exclude-file', exclude_file_str,
    '--json'
]

process = subprocess.Popen(cmds, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)

while True:
    output = process.stdout.readline()
    if process.poll() is not None and output == b'':
        break
    if output:
        line = escape_ansi(output.decode(sys.stdout.encoding).strip())
        try:
            j = json.loads(line)
            # ...
            # Analyse the resultant dict and show fancy output
        except json.JSONDecodeError:
            # Print the line as is
            print(line)
retval = process.poll()

# ...
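
As a rough illustration of the "analyse the resultant dict" step that is elided above, the sketch below turns one line of output into display text. It is not the poster's actual code. The field names (`message_type`, `action`, `item`, `files_new`, `files_changed`, `snapshot_id`) are assumptions based on restic's documented JSON output for the backup command and should be checked against the installed version; the function name `describe` is made up.

```python
import json


def describe(line: str):
    """Turn one line of `restic backup --json` output into display text.

    Returns None for lines that should be hidden.
    """
    try:
        msg = json.loads(line)
    except json.JSONDecodeError:
        return line  # not JSON: show the line as-is
    if not isinstance(msg, dict):
        return line

    kind = msg.get('message_type')  # assumed field name
    if kind == 'verbose_status':
        action = msg.get('action')
        if action == 'unchanged':
            return None  # same effect as `grep -v unchanged`
        return f"{action} {msg.get('item', '')}".rstrip()
    if kind == 'status':
        return None  # progress update: feed a progress bar instead
    if kind == 'summary':
        return (f"snapshot {msg.get('snapshot_id', '?')}: "
                f"{msg.get('files_new', 0)} new, "
                f"{msg.get('files_changed', 0)} changed")
    return line
```

Inside the loop this would replace the `try`/`except` body: call `describe(line)` and print the result when it is not None.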

I am not an actual programmer either, just doing Python programming as a hobby. The escape_ansi function was actually shamelessly copied from StackOverflow…

And I’m not going to put the data into pandas anyway. My backup script should have nothing to do with data science :joy:

And for how to show properly formatted code in this forum, you can either indent all the code with four spaces, or paste your code between a pair of three backticks like this:

```
Code here
    Indentation preserved!
```

This is Markdown syntax, FYI.

And yep I have also told Windows Defender to ignore restic.exe. Thank you

Thanks.
Learned something new today.