I'm scraping a website for product reviews. I can successfully get the JSON data, but I'm having trouble parsing it. The data is nested like this: payload -> reviews -> 22Y6N61W6TO2 -> customerReviews.
The data I want is at the "customerReviews" level. However, the "22Y6N61W6TO2" key is specific to the item and will be different when looking at another item.
I don't want to maintain a separate Python script for each item just to account for this one value. Is there something like a wildcard I can use to reach the data I'm after without hardcoding that key?
I'm using Python 3 with requests and the json module in my script.
My script looks like this:
import json
import pandas as pd
with open('data.json', 'r') as f:
    data = json.load(f)
df = pd.json_normalize(data['payload']['reviews']['22Y6N61W6TO2']['customerReviews'])
print(df)
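To make the "wildcard" idea concrete, here is a minimal sketch of one possible approach: loop over whatever keys sit under reviews instead of naming one. It assumes every entry under reviews is a dict that may hold a customerReviews list. The function name collect_customer_reviews and the added productId field are made up for this example, and the sample dict is a cut-down stand-in for the real file.

```python
def collect_customer_reviews(data):
    """Gather customerReviews from every entry under payload -> reviews.

    Hypothetical helper: the item-specific key is never named, it is
    simply whatever key the loop encounters.
    """
    rows = []
    for product_id, review_block in data["payload"]["reviews"].items():
        # .get() tolerates an entry that has no customerReviews list.
        for review in review_block.get("customerReviews", []):
            # Keep the dynamic key next to each review (assumed field name).
            rows.append({"productId": product_id, **review})
    return rows


# Cut-down stand-in for the JSON loaded from data.json.
sample = {
    "payload": {
        "reviews": {
            "22Y6N61W6TO2": {
                "totalReviewCount": 759,
                "customerReviews": [
                    {"reviewId": "248695872", "rating": 5.0, "reviewTitle": "Amazing"}
                ],
            }
        }
    }
}

reviews = collect_customer_reviews(sample)
print(len(reviews), reviews[0]["productId"])
```

The returned list of dicts could then be handed to pd.json_normalize in place of the hardcoded lookup. If reviews is known to hold exactly one entry, next(iter(data['payload']['reviews'].values())) would fetch that entry without a loop.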
Below is a section of the JSON I'm working with:
"payload": {
  "products": {},
  "offers": {},
  "idmlMap": {},
  "reviews": {
    "22Y6N61W6TO2": {
      "averageOverallRating": 4.4783,
      "roundedAverageOverallRating": 4.5,
      "overallRatingRange": 5.0,
      "totalReviewCount": 759,
      "recommendedPercentage": 89,
      "ratingValueOneCount": 35,
      "ratingValueTwoCount": 27,
      "ratingValueThreeCount": 30,
      "ratingValueFourCount": 115,
      "ratingValueFiveCount": 552,
      "percentageOneCount": 4,
      "percentageTwoCount": 3,
      "percentageThreeCount": 3,
      "percentageFourCount": 15,
      "percentageFiveCount": 72,
      "activeSort": "relevancy",
      "pagination": {
        "total": 759,
        "pages": [
          {
            "num": 1,
            "gap": false,
            "active": true,
            "url": "sort=relevancy&page=1"
          },
          {
            "num": 2,
            "gap": false,
            "active": false,
            "url": "sort=relevancy&page=2"
          },
          {
            "num": 3,
            "gap": false,
            "active": false,
            "url": "sort=relevancy&page=3"
          },
          {
            "num": 4,
            "gap": false,
            "active": false,
            "url": "sort=relevancy&page=4"
          },
          {
            "num": 5,
            "gap": false,
            "active": false,
            "url": "sort=relevancy&page=5"
          },
          {
            "num": 6,
            "gap": false,
            "active": false,
            "url": "sort=relevancy&page=6"
          },
          {
            "num": 0,
            "gap": true,
            "active": false
          },
          {
            "num": 38,
            "gap": false,
            "active": false,
            "url": "sort=relevancy&page=38"
          }
        ],
        "next": {
          "num": 0,
          "gap": false,
          "active": false,
          "url": "sort=relevancy&page=2"
        },
        "currentSpan": "1-20"
      },
      "customerReviews": [
        {
          "reviewId": "248695872",
          "authorId": "13b0b650b7694a54267279bf80e0fdfa99cc7c3c5150d32aff7db274e74c07f5f6e7f7b6c4fe8cb64a007c9e3c0f0c04",
          "negativeFeedback": 0,
          "positiveFeedback": 0,
          "rating": 5.0,
          "reviewTitle": "Amazing",
          "reviewText": "This thing is amazing. I cooked bbq ribs in 30 mins. Then caramelized for 6 mins in my oven. They was awesome. Best kitchen appliance of 2020. Wish i had bought it before dec 31st. Buy one folks. You'll love it.",
          "reviewSubmissionTime": "1/1/2021",
          "userNickname": "Keith",
          "badges": [
            {
              "badgeType": "Custom",
              "id": "VerifiedPurchaser",
              "contentType": "REVIEW"
            }
          ],
          "userAttributes": {},
          "photos": [
            {
              "Id": "e917ed53-cf49-48af-b454-42f3fd87536a",
              "Sizes": {
                "normal": {
                  "Id": "normal",
                  "Url": "https://i5.walmartimages.com/dfw/6e29e393-988c/k2-_d716ba9d-2c5b-4f82-b9a6-588575975fe6.v1.bin"
                },
                "thumbnail": {
                  "Id": "thumbnail",
                  "Url": "https://i5.walmartimages.com/dfw/6e29e393-988c/k2-_d716ba9d-2c5b-4f82-b9a6-588575975fe6.v1.bin?odnWidth=150&odnHeight=150&odnBg=ffffff"
                }
              },
              "SizesOrder": [
                "normal",
                "thumbnail"
              ]
            }
          ],
          "videos": [],
          "externalSource": "bazaarvoice"
        }