I've seen several posts related to this, but no clear answer. Say I want to print the string s=u'\xe9\xe1' in a terminal that only supports ASCII (e.g., LC_ALL=C; python3). Is there any way to make the following the default behaviour:

import sys
s = u'\xe9\xe1'
s = s.encode(sys.stdout.encoding, 'replace').decode(sys.stdout.encoding)
print(s)

I.e., I want the string to print something, even garbage, rather than raise an exception (UnicodeEncodeError). I'm using Python 3.5.

I would like to avoid writing this for every string that may contain non-ASCII characters.

1 Answer

You can do one of three things:

  • Adjust the error handler for stdout and stderr with the PYTHONIOENCODING environment variable:

    export PYTHONIOENCODING=:replace
    

    Note the leading colon: I specified only the error handler, not the codec, so the default encoding is kept.

  • Replace the stdout TextIOWrapper, setting a different error handler:

    import sys
    import io
    
    sys.stdout = io.TextIOWrapper(
        sys.stdout.buffer, encoding=sys.stdout.encoding, 
        errors='replace',
        line_buffering=sys.stdout.line_buffering)
    
  • Create a separate TextIOWrapper instance around sys.stdout.buffer and pass that in as the file argument when printing:

    import sys
    import io
    
    replacing_stdout = io.TextIOWrapper(
        sys.stdout.buffer, encoding=sys.stdout.encoding, 
        errors='replace',
        line_buffering=sys.stdout.line_buffering)
    
    print(s, file=replacing_stdout)
    
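To see what the 'replace' error handler does without depending on your terminal's locale, here is a minimal sketch that swaps sys.stdout.buffer for an in-memory io.BytesIO. The helper name make_replacing_writer and the hard-coded 'ascii' encoding are assumptions made for illustration; in real code you would pass sys.stdout.buffer and sys.stdout.encoding as shown above.

```python
import io


def make_replacing_writer(buffer, encoding="ascii"):
    """Hypothetical helper: wrap a binary buffer in a text layer that
    substitutes '?' for characters the encoding cannot represent."""
    return io.TextIOWrapper(
        buffer,
        encoding=encoding,
        errors="replace",
        newline="\n",          # write '\n' unchanged so the bytes are the same on every platform
        line_buffering=True,
    )


raw = io.BytesIO()             # stands in for sys.stdout.buffer on an ASCII-only terminal
out = make_replacing_writer(raw)

print("\xe9\xe1", file=out)    # would raise UnicodeEncodeError with errors='strict'
out.flush()

result = raw.getvalue()
print(result)                  # b'??\n'
```

Each character that cannot be encoded is written as a single '?', so the output is lossy but printing never raises.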
1 Comment

This is exactly what I was looking for - much appreciated! (I went with option 2)
