Seemingly simple question: How do I print() a string in Python 3? It should be as simple as:
print(my_string)
But that doesn't work reliably. Depending on the content of my_string, the environment variables, and the OS you use, it can raise a UnicodeEncodeError exception:
>>> print("\u3423")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character '\u3423' in position 0: ordinal not in range(128)
Is there a clean portable way to fix this?
To expand a bit: the problem is that a Python 3 string is a sequence of Unicode characters, while the terminal can use any encoding. If you are lucky, your terminal can handle all the characters in the string and everything is fine. If it can't (e.g. somebody set LANG=C), you get an exception.
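To make the mismatch concrete, here is a minimal sketch that checks whether a string is representable in a given encoding. The helper name can_encode is made up for illustration; sys.stdout.encoding is the standard attribute that reports what print() will encode to.

```python
import sys


def can_encode(text, encoding):
    """Return True if every character of text is representable in encoding."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


if __name__ == "__main__":
    # sys.stdout.encoding is derived from the environment (LANG, LC_ALL, ...).
    # It can be None if sys.stdout was replaced, hence the fallback.
    stdout_encoding = getattr(sys.stdout, "encoding", None) or "ascii"
    print("stdout encoding:", stdout_encoding)
    print("ascii can encode U+3423:", can_encode("\u3423", "ascii"))
    print("utf-8 can encode U+3423:", can_encode("\u3423", "utf-8"))
```

Running this under different LANG settings should show the reported stdout encoding changing, which is what decides whether print() succeeds.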
If you manually encode a string in Python3 you can supply an error handler that ignores or replaces unencodable characters:
"\u3423".encode("ascii", errors="replace")
For print() I don't see an easy way to plug in an error handler, and even if there is one, a plain error handler seems like a terrible idea, as it would modify the data. A conditional error handler might work (i.e. check isatty() and decide what to do based on that), but it seems awfully hacky to go through all that trouble just to print() a string, and I am not even sure that it wouldn't fail in some cases.
A real-world example of this problem would be this one:
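The conditional idea could look roughly like the sketch below. This is only one possible reading of it, under stated assumptions: the function name print_for_humans and its interactive parameter are invented here, and the policy (escape unencodable characters only when writing to a terminal, stay strict otherwise so piped data is never silently altered) is an assumption, not an established recipe.

```python
import sys


def print_for_humans(text, stream=None, interactive=None):
    """Print text, escaping unencodable characters only on a terminal.

    Hypothetical helper: when the stream is not a terminal, the text is
    passed through unchanged, so a UnicodeEncodeError can still occur.
    """
    stream = sys.stdout if stream is None else stream
    if interactive is None:
        interactive = stream.isatty()
    if interactive:
        # Round-trip through the stream's encoding so that only
        # representable characters remain; others become \uXXXX escapes.
        encoding = getattr(stream, "encoding", None) or "ascii"
        text = text.encode(encoding, errors="backslashreplace").decode(encoding)
    print(text, file=stream)
```

Whether strict failure is the right choice for pipes and files is debatable; forcing UTF-8 output there would be another option, at the cost of ignoring the locale.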
sys.stdout = io.TextIOWrapper(sys.stdout.detach(), errors='backslashreplace')
With LANG=C python3 -c 'print("\u3423")' I can reproduce your error, while with LANG=en_US.UTF-8 it works just fine.
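To see what the TextIOWrapper approach does without touching the real sys.stdout, here is a self-contained sketch that wraps an in-memory byte buffer instead. The function name make_lenient_ascii_stream is made up, and the ASCII encoding is forced here only to simulate a LANG=C terminal; a real script would wrap sys.stdout.detach() and need import io and import sys.

```python
import io


def make_lenient_ascii_stream(raw):
    """Wrap a binary stream so unencodable characters are escaped, not fatal."""
    # newline="\n" disables platform newline translation so the output
    # bytes are the same on every OS.
    return io.TextIOWrapper(
        raw, encoding="ascii", errors="backslashreplace", newline="\n"
    )


raw = io.BytesIO()
out = make_lenient_ascii_stream(raw)
print("\u3423", file=out)  # would raise UnicodeEncodeError with errors="strict"
out.flush()
result = raw.getvalue()
print(result)
```

On Python 3.7 and later, sys.stdout.reconfigure(errors='backslashreplace') should achieve the same effect on the real stream without detaching it.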