
For instance, if my string contains 'नमस्ते', how do I print the Unicode escape sequences for all the characters in the string?

2 Answers

10

If you want the \u escapes for each character (what you'd type to redefine the string in pure ASCII Python code), use the unicode-escape codec:

>>> 'नमसत'.encode('unicode-escape')
b'\\u0928\\u092e\\u0938\\u0924'

If it needs to end up as a str rather than bytes, decode it back as ASCII (printing the result removes the quoting and doubled backslashes from the display):

>>> print('नमसत'.encode('unicode-escape').decode('ascii'))
\u0928\u092e\u0938\u0924
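
To get one escape per character rather than a single joined string, the same codec can be applied to each code point in turn. This is a minimal sketch, not part of the original answer; the helper name `char_escapes` is hypothetical. Note that the codec only emits `\uXXXX` for code points between U+0100 and U+FFFF: printable ASCII comes through unchanged, other code points below U+0100 become `\xNN`, and code points above U+FFFF become `\UXXXXXXXX`.

```python
def char_escapes(text):
    """Return one escape string per code point in text (hypothetical helper)."""
    # Encode each code point separately, then decode the ASCII bytes to str.
    return [ch.encode('unicode-escape').decode('ascii') for ch in text]


# Print each escape on its own line; printing avoids the doubled
# backslashes that repr() would show.
for esc in char_escapes('नमस्ते'):
    print(esc)
```

For the OP's string this yields six escapes, because the virama (U+094D) and the vowel sign (U+0947) are separate code points.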

3 Comments

It looks like you lost a few chars there. For the OP's string of 'नमस्ते' I get b'\\u0928\\u092e\\u0938\\u094d\\u0924\\u0947'
@PM2Ring: Sigh. Stupid terminal didn't support the characters, probably lost it in the copy & paste. I hope the OP gets the idea. :-)
Thanks for the answer! One more question: is there any way I could map each individual Unicode escape sequence to its UTF-8 symbol?
4
>>> s = "नमस्ते"
>>> s.encode('utf-8')
b'\xe0\xa4\xa8\xe0\xa4\xae\xe0\xa4\xb8\xe0\xa5\x8d\xe0\xa4\xa4\xe0\xa5\x87'
>>> s.encode('unicode-escape')
b'\\u0928\\u092e\\u0938\\u094d\\u0924\\u0947'
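
The follow-up question in the comments (mapping each escape back to its character and UTF-8 bytes) could be approached by decoding the escape with the same codec. The following is a sketch under that assumption; `escape_to_char` and `describe` are hypothetical names, not something either answer defines.

```python
def escape_to_char(escape):
    """Turn an escape such as '\\u0928' back into the character it names."""
    # The escape text is pure ASCII, so encode it to bytes first, then let
    # the unicode-escape codec interpret the backslash sequence.
    return escape.encode('ascii').decode('unicode-escape')


def describe(text):
    """Return (escape, character, UTF-8 bytes) for each code point in text."""
    rows = []
    for ch in text:
        esc = ch.encode('unicode-escape').decode('ascii')
        rows.append((esc, escape_to_char(esc), ch.encode('utf-8')))
    return rows


for esc, ch, utf8 in describe('नमस्ते'):
    print(esc, ch, utf8)
```

Combining marks such as U+094D are printed on their own here, so a terminal may render them attached to a placeholder or to the preceding space.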
