I am trying to extract data from a JSON file. I can read most of the fields, but a few nested ones are giving me trouble. Here is a sample from the file:

{"orders":[{
  "order_id":9000,
  "flight_start":"2017-06-15T05:00:00.000Z",
  "flight_end":"2017-06-22T05:00:00.000Z",
  "spots":[{
      "spot_id":7354259,
      "spot_length":15}],
  "constraints":{
      "forbid":[{
        "network":"BRVO"},
        {"network":"DSE"},
        {"network":"ESPN"},
        {"network":"DFC"},
        {"hours":[2,6],
         "days_of_week":["Monday","Tuesday","Thursday","Friday"]},
        {"hours":[2,6],
         "days_of_week":["Saturday","Sunday"]}],
      "allocation":[{
         "hours":[6,9],
         "impressions":{
             "min":0.05,
             "max":0.05},
         "days_of_week":["Monday","Tuesday","Wednesday","Thursday","Friday"]},{
         "hours":[20,0],
         "impressions":{"min":0.5,"max":0.5},
         "days_of_week":["Monday","Tuesday","Wednesday","Thursday","Friday"]},{
         "budget":{
             "min":1,
             "max":1},
         "spot_length":15}]}}]}

I cannot get all the values of the network key: for each order, my code returns only the first network in the forbid list.

I am using the following code:

 import urllib.request
 import json
 url = 'http://vw-test.elasticbeanstalk.com/test'
 json_obj = urllib.request.urlopen(url).read().decode('UTF-8')
 data = json.loads(json_obj)
 for i in data["orders"]:
     k = i["order_id"]
     j = i["flight_start"]
     l = i["flight_end"]
     m = i['spots']
     for value in m:
         a = value["spot_length"]
         b = value["spot_id"]
     n = i["constraints"]
     c = n["forbid"]
     d = c[0]
     e = d["network"]
     print(e)

If anyone could help me figure this out, I would be very grateful.

1 Answer

The JSON data in your question isn't complete. Making some assumptions about the rest, this could work:

for i in data["orders"]:
    k = i["order_id"]
    j = i["flight_start"]
    l = i["flight_end"]
    m = i["spots"]
    for value in m:
        a = value["spot_length"]
        b = value["spot_id"]
    n = i["constraints"]
    c = n["forbid"]
    # c[0] is only the first entry; loop over all of them and keep the
    # ones that have a "network" key (the hours/days entries do not).
    networks = [d["network"] for d in c if "network" in d]
    print(networks)
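
For reference, here is a self-contained sketch of the same idea, run against a trimmed copy of the sample from the question instead of the live URL. The helper name `forbidden_networks` is made up for illustration. It also uses `.get` so that an order without a `constraints` or `forbid` key yields an empty list, which is an assumption about how you would want missing keys handled.

```python
import json

# Trimmed copy of the sample from the question (one order only).
SAMPLE = '''
{"orders": [{
  "order_id": 9000,
  "constraints": {
    "forbid": [
      {"network": "BRVO"},
      {"network": "DSE"},
      {"network": "ESPN"},
      {"network": "DFC"},
      {"hours": [2, 6], "days_of_week": ["Saturday", "Sunday"]}
    ]
  }
}]}
'''


def forbidden_networks(order):
    """Return every "network" value listed under constraints.forbid."""
    # Assumption: a missing "constraints" or "forbid" key means no networks.
    forbid = order.get("constraints", {}).get("forbid", [])
    # Entries holding "hours"/"days_of_week" have no "network" key, so skip them.
    return [entry["network"] for entry in forbid if "network" in entry]


data = json.loads(SAMPLE)
for order in data["orders"]:
    print(order["order_id"], forbidden_networks(order))
```

With the sample above this prints `9000 ['BRVO', 'DSE', 'ESPN', 'DFC']`. The original code printed only `BRVO` because `c[0]` selects just the first element of the `forbid` list.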

3 Comments

Yes, this works, thanks a lot. BTW, the link to the JSON file is given in the code if you want to take a look.
You're welcome. I just meant that the sample data shown in your question needed to end with "spot_length":15}]}}]} to be well-formed.
Yeah, you are right, sorry about that. I have fixed it in the question. Thank you so much.
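
A quick way to check whether a pasted sample is well-formed is to feed it to `json.loads` and catch the error. This is only a minimal sketch: the helper name `is_well_formed` and the two shortened sample strings are illustrative, not taken from the real file.

```python
import json


def is_well_formed(text):
    """Return True if text parses as JSON, False otherwise."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


# A shortened stand-in for the sample, cut off before the closing brackets.
truncated = '{"orders":[{"constraints":{"allocation":[{"spot_length":15'
# The same text with the ending mentioned in the comment above.
complete = truncated + '}]}}]}'

print(is_well_formed(truncated))  # False
print(is_well_formed(complete))   # True
```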
