
I have a JSON file that I am reading in Python. The file contents are below.

{
    "cities": [
        "NY",
        "SFO",
        "LA",
        "NJ"
    ],
    "companies": [
        "Apple",
        "Samsung",
        "Walmart"
    ],
    "devices": [
        "iphone",
        "ipad",
        "ipod",
        "watch"
    ]
}

I want to create Python lists from this JSON file. This is what I have done so far.

# Open JSON file in Python
import json

with open('test.json') as in_file:
    test_data = json.load(in_file)

# Query the output variable test_data 
test_data
{u'cities': [u'NY', u'SFO', u'LA', u'NJ'], u'companies': [u'Apple', u'Samsung', u'Walmart'], u'devices': [u'iphone', u'ipad', u'ipod', u'watch']}

# find type of test_data
type(test_data)
<type 'dict'>

# create list from test_data
device = test_data['devices']

# Check content of list created
device
[u'iphone', u'ipad', u'ipod', u'watch']

As you can see, the list contains unicode strings. I want it to be a list of plain Python str objects.

I can do it like this:

device_list = [str(x) for x in device]
device_list
['iphone', 'ipad', 'ipod', 'watch']

Is there a better way to do this?

2
  • Does it really matter if you have unicode objects instead of str objects? Commented May 25, 2018 at 18:15
  • @chepner It doesn't really matter, but I would like to know how to do it in case it ever does. Commented May 25, 2018 at 18:16

3 Answers

1

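I first thought changing json.load to json.loads would fix your issue, but on Python 2 both return unicode strings. Round-tripping the data through yaml should give you plain str objects for ASCII-only strings, removing any need to map.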

Try this.

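import json
import yaml

# Read the raw JSON text; the with block closes the file for us
with open('temp.json', 'r') as f:
    json_str = f.read()

content = json.loads(json_str)

# On Python 2 this round trip should turn ASCII-only unicode strings into str
# and return a dictionary; safe_load avoids constructing arbitrary objects
content = yaml.safe_load(json.dumps(content))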

content
{'cities': ['NY', 'SFO', 'LA', 'NJ'], 'companies': ['Apple', 'Samsung', 'Walmart'], 'devices': ['iphone', 'ipad', 'ipod', 'watch']}

content['devices']
['iphone', 'ipad', 'ipod', 'watch']
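As a quick sanity check on the json.load versus json.loads question, the sketch below parses the same text both ways and compares the results. It assumes Python 3, where parsed strings are already str; the inline raw string and io.StringIO stand in for a real file, and the variable names are illustrative only.

```python
import io
import json

# Stand-in for the file contents (assumption: a shortened version of the question's data)
raw = '{"devices": ["iphone", "ipad", "ipod", "watch"]}'

# json.loads parses a string; json.load parses a file-like object
from_string = json.loads(raw)
from_file = json.load(io.StringIO(raw))

# Both calls share the same decoder, so the parsed objects compare equal
same_result = from_string == from_file

# Collect the type names of the list elements
element_types = {type(item).__name__ for item in from_string["devices"]}

print(same_result)
print(element_types)
```

The two functions differ only in where the input comes from, so switching between them does not change the type of the strings you get back.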

2 Comments

It still gives me unicode lists, even when I use json.loads.
Are you on Python 2? I made an edit. I think this will work for you now.
1

One approach is to use map

Ex:

l = [u'iphone', u'ipad', u'ipod', u'watch']
print(map(str, l))

In Python 3, map returns an iterator, so wrap it in list():

print(list(map(str, l)))

Output:

['iphone', 'ipad', 'ipod', 'watch']

In most cases it makes little difference whether the strings are unicode or str.
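To apply the same conversion to every list in the parsed file rather than a single one, a dict comprehension is one option. This is a minimal sketch, assuming every value is a flat list of ASCII-only strings as in the question; the inline raw string stands in for test.json, and the names raw and converted are illustrative.

```python
import json

# Stand-in for test.json (assumption: a shortened version of the question's data)
raw = '{"cities": ["NY", "SFO"], "devices": ["iphone", "ipad"]}'
data = json.loads(raw)

# Convert each key and each list element with str();
# this only works safely for ASCII-only text on Python 2
converted = {
    str(key): [str(item) for item in values]
    for key, values in data.items()
}

print(converted["devices"])
```

On Python 3 the parsed strings are already str, so the comprehension simply rebuilds an equal dictionary.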


1

The reason you get back a list of unicode objects is that JSON uses Unicode. For plain ASCII strings, it would be sufficient to simply call str, but for "real" Unicode, you need to encode them first.

>>> [str(x) for x in json.loads(u'["foo"]')]
['foo']

>>> [str(x) for x in json.loads(u'["föö"]')]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode characters in position 1-2: ordinal not in range(128)

>>> [x.encode('utf8') for x in json.loads(u'["föö"]')]
['f\xc3\xb6\xc3\xb6']
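The encode step above can be applied to a whole parsed structure with a small recursive helper. This is only a sketch: encode_text and text_type are hypothetical names, not part of the json module, and the helper assumes the data contains only dicts, lists, strings, and other JSON scalars. Note that on Python 3 the result holds bytes objects, which is the closest equivalent of a Python 2 str.

```python
import json
import sys

# Text type to look for: str on Python 3, unicode on Python 2
text_type = str if sys.version_info[0] >= 3 else unicode  # noqa: F821


def encode_text(obj, encoding="utf-8"):
    """Recursively encode every text string inside parsed JSON data."""
    if isinstance(obj, text_type):
        return obj.encode(encoding)
    if isinstance(obj, list):
        return [encode_text(item, encoding) for item in obj]
    if isinstance(obj, dict):
        # Keys are strings in JSON, so they are encoded as well
        return {
            encode_text(key, encoding): encode_text(value, encoding)
            for key, value in obj.items()
        }
    # Numbers, booleans and None pass through unchanged
    return obj


# The escape sequence \u00f6 is the letter o with a diaeresis
encoded = encode_text(json.loads('["f\\u00f6\\u00f6", 1, {"k": "v"}]'))
print(encoded)
```

Encoding explicitly avoids the UnicodeEncodeError that str() raises on non-ASCII text in Python 2.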

