2

I am trying to convert nested json objects get from 'https://api.data.gov.sg/v1/transport/carpark-availability' into dataframe, but incomplete data is retrieved in the dataframe.

Below are my codes:

    res = requests.get("https://api.data.gov.sg/v1/transport/carpark-availability")
    j = res.json()
    df = pd.DataFrame(j)
    print(df.head())

Below is the incomplete result: items 0 {'timestamp': '2021-03-10T23:07:27+08:00', 'ca...

I reckon maybe I need to use json_normalize, could someone kindly enlighten?

1 Answer 1

2

Try the below

import requests
import pandas as pd
import copy

res = requests.get("https://api.data.gov.sg/v1/transport/carpark-availability")
data = res.json()
lst = data['items'][0]['carpark_data']
normalized = []
for entry in lst:
    temp = entry['carpark_info']
    for idx, x in enumerate(temp):
        _entry = copy.deepcopy(entry)
        del _entry['carpark_info']
        _entry.update(x)
        normalized.append(_entry)
df = pd.DataFrame(normalized)
print(df.head(110))

output

    carpark_number      update_datetime total_lots lot_type lots_available
0             HE12  2021-03-10T23:41:38         91        C             51
1              HLM  2021-03-10T23:41:18        583        C            477
2              RHM  2021-03-10T23:41:38        322        C             59
3             BM29  2021-03-10T23:41:47         97        C             95
4              Q81  2021-03-10T23:41:26         96        C             73
..             ...                  ...        ...      ...            ...
105             C7  2021-03-10T23:41:47        459        C            221
106            C17  2021-03-10T23:29:09         42        C              0
107            C17  2021-03-10T23:29:09         35        Y             17
108            C17  2021-03-10T23:29:09          1        H              1
109           C20M  2021-03-10T23:41:39        551        C            440
Sign up to request clarification or add additional context in comments.

3 Comments

There are 128 instances where there is more than one item in carpark_info. This method takes the first one. For example, take a look at indexes 106, 190, and 202 in lst.
@jakub you are right. My code assume that there is only 1 item in ['carpark_info']. So the question is how to handle them. A proper (?) solution can be to generate N records where N is the len of ['carpark_info']
@jakub I have changed the code in order to support your comment.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.