I have what seems like a rather simple question, but I can't wrap my head around it.
I have a pandas dataframe for Tweets. The location of the users is registered in a variable named "Location" in various ways:
When the location is well recorded, I often get:
{'country_code': 'tr', 'state': 'Central Anatolia Region', 'county': 'Çankaya', 'city': 'Ankara'}
or
{'country_code': 'tr', 'state': 'Black Sea Region', 'city': 'Trabzon'}
But sometimes, all I get is:
{'country_code': 'tr'}
{'country_code': 'tr', 'state': 'Batman'}
and often, there's nothing and all that's registered is this:
{}
I want to write a script that can create new variables in my pandas dataframe for these individual values. In other words, if country_code is registered for a specific row, then I want the value in question to be recorded in a variable named country_code. And so on for state, county, and city. If nothing is there, it can simply input a blank or an NA for all the missing variables in question (county, state, city).
The end result should be four new variables in my dataframe (country_code, state, county, and city), each filled from the corresponding value in the "Location" variable, or left blank/NA when that value is not registered.
Can someone help by any chance?
Thank you so much!
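One way this could be done is sketched below. It assumes each entry in "Location" is an actual Python dict (not a string) and uses a small made-up dataframe named df that mirrors the examples above. Building a second dataframe from the list of dicts leaves NaN wherever a key is missing, and reindex makes sure all four columns exist even if a key never appears.

```python
import pandas as pd

# Hypothetical sample data mirroring the examples in the question.
df = pd.DataFrame({
    "Location": [
        {"country_code": "tr", "state": "Central Anatolia Region",
         "county": "Çankaya", "city": "Ankara"},
        {"country_code": "tr", "state": "Black Sea Region", "city": "Trabzon"},
        {"country_code": "tr"},
        {"country_code": "tr", "state": "Batman"},
        {},
    ]
})

keys = ["country_code", "state", "county", "city"]

# One row per dict; keys missing from a dict become NaN.
# reindex guarantees the four columns exist, in this order.
expanded = pd.DataFrame(df["Location"].tolist(), index=df.index).reindex(columns=keys)

# Attach the new columns next to the original "Location" column.
df = df.join(expanded)
print(df[keys])
```

Passing index=df.index keeps the new columns aligned with the original rows even when the dataframe does not have a default 0..n-1 index.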
DataFrame you are showing a dict. Is it a list of dict that you are referring to?
type(newdf2['Location']) Out[31]: pandas.core.series.Series
newdf2.
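Since type() on the column always reports a Series, it does not say whether the individual entries are dicts or strings that merely look like dicts. A minimal sketch for checking this, and for parsing strings when needed, follows. The series s and the helper to_dict are made up for illustration, and the sketch assumes that any string entries are valid Python literals.

```python
import ast
import pandas as pd

# Hypothetical column where the dicts were stored as text
# (for example after a round trip through a CSV file).
s = pd.Series(["{'country_code': 'tr', 'state': 'Batman'}", "{}"], name="Location")

# Inspect the type of each element rather than the type of the column.
print(s.map(type).value_counts())

def to_dict(value):
    """Return value as a dict, parsing it first if it is a string."""
    if isinstance(value, dict):
        return value
    if isinstance(value, str):
        return ast.literal_eval(value)  # safe parsing of Python literals
    return {}  # treat NaN/None as "no location recorded"

parsed = s.map(to_dict)
```

Once every entry is a real dict, the column can be expanded into separate variables in the usual way.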