
I have a command-line program written in Python, and when I pipe its output through another program, sys.stdout.encoding is None. This makes sense, I suppose: the output could be going to another program, or to a file you're redirecting into, or whatever, and Python doesn't know what encoding is desired. But neither do I! This program will be used by many different people (humor me) in different ways. Should I play it safe and output only ASCII (replacing non-ASCII characters with question marks)? Or should I output UTF-8, since it's so widespread these days?

4 Answers


I suggest you use the current locale.

Python2> import locale
Python2> locale.getpreferredencoding()
'UTF-8'

The system knows what it should be, and the other side, if it also uses the current locale, will do the right thing.
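A minimal sketch of that idea, assuming you want to honour the stream's own encoding when Python reports one and fall back to the locale only when it is None. The helper name pick_output_encoding is made up for illustration; it is not part of any library.

```python
import codecs
import locale
import sys


def pick_output_encoding(stream):
    """Return the stream's encoding, or the locale's preferred one if unset.

    Hypothetical helper for illustration only.
    """
    encoding = getattr(stream, "encoding", None)
    if not encoding:
        # Piped or redirected output may report no encoding, as in the question.
        encoding = locale.getpreferredencoding()
    # Normalise the name (e.g. "UTF8" -> "utf-8") and fail early on an unknown codec.
    return codecs.lookup(encoding).name


if __name__ == "__main__":
    encoding = pick_output_encoding(sys.stdout)
    # "replace" substitutes "?" for characters the chosen encoding cannot represent.
    data = u"caf\u00e9\n".encode(encoding, "replace")
    # Python 3 exposes the byte stream as sys.stdout.buffer; Python 2 writes bytes directly.
    getattr(sys.stdout, "buffer", sys.stdout).write(data)
```

Encoding with "replace" keeps the program from crashing when the locale turns out to be plain ASCII, at the cost of question marks in the output.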



You should use the value returned by locale.getpreferredencoding().



If your application doesn't really deal with much internationalisation, ASCII should suffice. Otherwise, I'd say UTF-8 or, better still, UTF-16 should be the order of the day.

1 Comment

The whole point of UTF-8 is that programs that don't explicitly deal with internationalisation mostly Just Work on non-ASCII data. UTF-16 is an evil non-starter on Linux; even on Windows it's never appropriate when you are just guessing.
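A small illustration of why that holds, assuming the downstream tool treats its input as bytes: ASCII text is byte-for-byte identical in UTF-8, whereas UTF-16 interleaves NUL bytes that byte-oriented tools tend to choke on. The variable names are only for this sketch.

```python
text = u"plain ASCII"

utf8 = text.encode("utf-8")
utf16 = text.encode("utf-16-le")  # little-endian, without a byte order mark

# ASCII-only text produces exactly the same bytes in UTF-8 as in ASCII.
assert utf8 == text.encode("ascii")

# UTF-16 uses two bytes per character here, so every other byte is NUL.
assert b"\x00" in utf16
assert b"\x00" not in utf8
```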

You should output UTF-8 because that's what everyone should be using. It's a bug not to be. ;)
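If you go this route, one way to do it is to wrap the underlying byte stream yourself rather than rely on whatever sys.stdout.encoding reports. This is a sketch for Python 3, and utf8_writer is a hypothetical helper name, not an existing API.

```python
import io


def utf8_writer(binary_stream):
    """Wrap a binary stream so that text written to it is encoded as UTF-8.

    Hypothetical helper for illustration only.
    """
    return io.TextIOWrapper(
        binary_stream,
        encoding="utf-8",
        newline="\n",        # write "\n" as-is, with no platform translation
        write_through=True,  # hand writes straight to the binary stream
    )


# Demonstration against an in-memory buffer instead of the real stdout.
buf = io.BytesIO()
out = utf8_writer(buf)
out.write(u"caf\u00e9\n")
out.flush()
encoded = buf.getvalue()
```

In a real program you would pass sys.stdout.buffer instead of the in-memory buffer. On Python 3.7 and later, sys.stdout.reconfigure(encoding="utf-8") is another option.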

1 Comment

Sorry this is more of a comment than a constructive answer.
