I have a dataset containing multiple nested JSON objects like the following:

{
"coordinates": null,
"acoustic_features": {
    "instrumentalness": "0.00479",
    "liveness": "0.18",
    "speechiness": "0.0294",
    "danceability": "0.634",
    "valence": "0.342",
    "loudness": "-8.345",
    "tempo": "125.044",
    "acousticness": "0.00035",
    "energy": "0.697",
    "mode": "1",
    "key": "6"
},
"artist_id": "b2980c722a1ace7a30303718ce5491d8",
"place": null,
"geo": null,
"tweet_lang": "en",
"source": "Share.Radionomy.com",
"track_title": "8eeZ",
"track_id": "cd52b3e5b51da29e5893dba82a418a4b",
"artist_name": "Dominion",
"entities": {
    "hashtags": [{
        "text": "nowplaying",
        "indices": [0, 11]
    }, {
        "text": "goth",
        "indices": [51, 56]
    }, {
        "text": "deathrock",
        "indices": [57, 67]
    }, {
        "text": "postpunk",
        "indices": [68, 77]
    }],
    "symbols": [],
    "user_mentions": [],
    "urls": [{
        "indices": [28, 50],
        "expanded_url": "cathedral13.com/blog13",
        "display_url": "cathedral13.com/blog13",
        "url": "t.co/Tatf4hEVkv"
    }]
},
"created_at": "2014-01-01 05:54:21",
"text": "#nowplaying Dominion - 8eeZ Tatf4hEVkv #goth #deathrock #postpunk",
"user": {
    "location": "middle of nowhere",
    "lang": "en",
    "time_zone": "Central Time (US & Canada)",
    "name": "Cathedral 13",
    "entities": null,
    "id": 81496937,
    "description": "I\u2019m a music junkie who is currently responsible for 
Cathedral 13 internet radio (goth, deathrock, post-punk)which has been online 
since 06/20/02."
},
"id": 418243774842929150
}

I want the output file to have the following format:

user_id1 - track_id - hashtag1
user_id1 - track_id - hashtag2
user_id1 - track_id - hashtag3
user_id2 - track_id - hashtag1
user_id2 - track_id - hashtag2
....

That is, for this example the output should be:

81496937  cd52b3e5b51da29e5893dba82a418a4b  nowplaying
81496937  cd52b3e5b51da29e5893dba82a418a4b  goth
81496937  cd52b3e5b51da29e5893dba82a418a4b  deathrock
81496937  cd52b3e5b51da29e5893dba82a418a4b  postpunk

I have written the following code to do that:

import json
import csv
with open('final_dataset_json.json') as data_file:
        data = json.load(data_file)

uth = open('uth.csv','wb')

cvwriter = csv.writer(uth)

for entry in data:
    text_list = [hashtag['text'] for hashtag in entry['entities']['hashtags']]
    for line in text_list:
        csvwriter.writerow([entry['user']['id'],entry['track_id'],line.strip()+'\n')

uth.close()

How can I achieve the given output?

  • You haven't stated what problem(s) you are having with your code. Commented Jul 17, 2017 at 10:26

2 Answers

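csv.writer writes one row per writerow call, so pass all the column values for that row in a single list. The writer also adds the line ending itself, so there is no need to append '\n' to the last column.

Replacing the writerow line with this one should be enough: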

    csvwriter.writerow([entry['user']['id'],entry['track_id'],line.strip()])
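
For context, here is a minimal sketch of the whole loop with that fix applied. It assumes Python 3 and that the data has already been parsed into a list of dictionaries; the function name write_hashtag_rows and the trimmed sample record are illustrative only and are not part of the original code.

```python
import csv
import io
import json


def write_hashtag_rows(entries, out_file):
    """Write one (user id, track id, hashtag) row per hashtag in each entry."""
    writer = csv.writer(out_file)
    for entry in entries:
        for hashtag in entry['entities']['hashtags']:
            # writerow takes the whole row as one list and ends the line itself
            writer.writerow([entry['user']['id'], entry['track_id'], hashtag['text'].strip()])


# Small in-memory example shaped like the record in the question (trimmed).
sample = json.loads('''[{
    "track_id": "cd52b3e5b51da29e5893dba82a418a4b",
    "entities": {"hashtags": [{"text": "nowplaying"}, {"text": "goth"}]},
    "user": {"id": 81496937}
}]''')

buffer = io.StringIO()
write_hashtag_rows(sample, buffer)
print(buffer.getvalue())
```

To write to a real file on Python 3, open it in text mode with open('uth.csv', 'w', newline='') and pass the file object as out_file; the 'wb' mode in the question only works with the csv module on Python 2. If the columns should not be comma-separated, csv.writer accepts a delimiter argument, for example delimiter='\t'.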
2 Comments

I get the following error and I can't understand why; there is no problem with indentation: csvwriter.writerow([entry['user']['id'],entry['track_id'],line.strip()]) NameError: name 'csvwriter' is not defined
@AsmitaPoddar In your code the writer is named cvwriter, not csvwriter, so the call should be cvwriter.writerow([entry['user']['id'],entry['track_id'],line.strip()]). cvwriter is the csv.writer object bound to the file you want to write to.

Once the JSON is parsed this is a simple dictionary lookup (Python has a built-in json module):

import json

# json_str is assumed to hold the text of one JSON object like the one in the question
d = json.loads(json_str)
for ht in d['entities']['hashtags']:
    print('{} - {} - {}'.format(d['user']['id'], d['track_id'], ht['text']))

Yields:

81496937 - cd52b3e5b51da29e5893dba82a418a4b - nowplaying
81496937 - cd52b3e5b51da29e5893dba82a418a4b - goth
81496937 - cd52b3e5b51da29e5893dba82a418a4b - deathrock
81496937 - cd52b3e5b51da29e5893dba82a418a4b - postpunk
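
The same lookup can be extended to many objects and a CSV file. The sketch below assumes the dataset stores one JSON object per line, which the question does not confirm; if the file is a single JSON array, json.load is the right call instead. The name iter_hashtag_rows and the two sample records are hypothetical and trimmed to the fields used here.

```python
import csv
import io
import json


def iter_hashtag_rows(lines):
    """Yield (user id, track id, hashtag) for each hashtag in each JSON line."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        record = json.loads(line)
        entities = record.get('entities') or {}  # tolerate "entities": null
        for hashtag in entities.get('hashtags') or []:
            yield record['user']['id'], record['track_id'], hashtag['text']


# Two hypothetical records, one per line.
sample_lines = [
    '{"track_id": "t1", "user": {"id": 1}, "entities": {"hashtags": [{"text": "goth"}, {"text": "postpunk"}]}}',
    '{"track_id": "t2", "user": {"id": 2}, "entities": null}',
]

out = io.StringIO()
csv.writer(out).writerows(iter_hashtag_rows(sample_lines))
print(out.getvalue())
```

Because the generator reads one line at a time, an open file object can be passed in place of sample_lines without loading the whole dataset into memory.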

1 Comment

I would like to store this in a csv file. I have multiple json objects for which I would like to do so.
