How can I convert the string below to JSON in Python 3?

This is my code:

import ast
mystr = b'[{\'1459161763632\': \'this_is_a_test\'}, {\'1459505002853\': "{\'hello\': 12345}"}, {\'1459505708472\': "{\'world\': 98765}"}]'
chunk = str(mystr)
chunk = ast.literal_eval(chunk)
print(chunk)

Running from Python2 I get:

[{'1459161763632': 'this_is_a_test'}, {'1459505002853': "{'hello': 12345}"}, {'1459505708472': "{'world': 98765}"}]

Running from Python3 I get:

b'[{\'1459161763632\': \'this_is_a_test\'}, {\'1459505002853\': "{\'hello\': 12345}"}, {\'1459505708472\': "{\'world\': 98765}"}]'

How can I run from Python3 and get the same result as Python2?

1
  • For Py3: chunk.decode('utf8') or mystr.decode('utf8') Commented Apr 1, 2016 at 10:43

2 Answers

3

What you have in mystr is a bytes object. Decode it as ASCII first, then evaluate the result:

>>> ast.literal_eval(mystr.decode('ascii'))
[{'1459161763632': 'this_is_a_test'}, {'1459505002853': "{'hello': 12345}"}, {'1459505708472': "{'world': 98765}"}]

Or, more generally, decode it as UTF-8 so that non-ASCII characters are handled as well:

>>> ast.literal_eval(mystr.decode('utf-8'))
[{'1459161763632': 'this_is_a_test'}, {'1459505002853': "{'hello': 12345}"}, {'1459505708472': "{'world': 98765}"}]

The default encoding for decode is UTF-8, as its help text shows:

>>> help(mystr.decode)
Help on built-in function decode:

decode(...) method of builtins.bytes instance
    B.decode(encoding='utf-8', errors='strict') -> str
...

So you don't have to specify the encoding at all:

>>> ast.literal_eval(mystr.decode())
[{'1459161763632': 'this_is_a_test'}, {'1459505002853': "{'hello': 12345}"}, {'1459505708472': "{'world': 98765}"}]
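
Since the question title asks for JSON, here is a minimal sketch of the full round trip, assuming the goal is a JSON string produced with the standard json module (the names records and as_json are only illustrative):

```python
import ast
import json

mystr = b'[{\'1459161763632\': \'this_is_a_test\'}, {\'1459505002853\': "{\'hello\': 12345}"}, {\'1459505708472\': "{\'world\': 98765}"}]'

# Decode the bytes to text, then parse the Python literal into a list of dicts.
records = ast.literal_eval(mystr.decode('utf-8'))

# Serialize the parsed structure as JSON text (double-quoted keys and values).
as_json = json.dumps(records)
print(as_json)
```

Note that the inner values such as "{'hello': 12345}" remain plain strings after this step. If they are meant to be nested objects, each one would need its own ast.literal_eval call before serializing.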

2 Comments

Good answer, I've added some information about the source of the confusion, namely the 'b' prefix, to an answer below. Feel free to edit that into your own answer if you think it will help other readers.
@TomRees ... Let's keep it for your answer, you did effort for that, so you must be rewarded... ;)
2

Iron Fist beat me to the fix. To extend his answer: the 'b' prefix on the literal tells Python 3 (but not Python 2) to treat it as a byte sequence, not a text string.

As a result, the .decode method is needed to convert the bytes back into a string. In Python 2 the 'b' prefix is accepted but ignored, because str is already a byte string there, hence the difference.

See What does the 'b' character do in front of a string literal? for more information on this.
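
The difference described in this answer can be seen directly in Python 3. The sketch below uses a short made-up sample (raw is a hypothetical value, not the data from the question) to show why the original code printed a bytes literal: str() on a bytes object returns its repr, so literal_eval simply rebuilds the same bytes object.

```python
import ast

raw = b"[{'key': 'value'}]"  # hypothetical sample, shorter than the question's data

# In Python 3, str() on bytes gives the repr, including the b prefix and quotes.
as_repr = str(raw)

# Evaluating that repr just reconstructs the original bytes object.
round_trip = ast.literal_eval(as_repr)
print(type(round_trip))

# Decoding first yields real text, which literal_eval parses into a list.
parsed = ast.literal_eval(raw.decode('utf-8'))
print(type(parsed))
```

Passing an encoding to str, as in str(raw, 'utf-8'), is equivalent to raw.decode('utf-8') and would also work in place of the decode call.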
