I am working with CSV files where several of the columns have a simple json object (several key value pairs) while other columns are normal. Here is an example:
name,dob,stats
john smith,1/1/1980,"{""eye_color"": ""brown"", ""height"": 160, ""weight"": 76}"
dave jones,2/2/1981,"{""eye_color"": ""blue"", ""height"": 170, ""weight"": 85}"
bob roberts,3/3/1982,"{""eye_color"": ""green"", ""height"": 180, ""weight"": 94}"
After using df = pandas.read_csv('file.csv'), what's the most efficient way to parse and split the stats column into additional columns?
After about an hour, the only thing I could come up with was:
import json
stdf = df['stats'].apply(json.loads)
stlst = list(stdf)
stjson = json.dumps(stlst)
df.join(pandas.read_json(stjson))
This seems like I'm doing it wrong, and it's quite a bit of work considering I'll need to do this on three columns regularly.
*Edit: desired output is the dataframe object below. added following lines of code to get there in my (crappy) way: