I want to process a CSV file on my local hard disk in chunks using pandas. My processing code is ready and runs without error on the whole dataset. The problem arises when I run the same code on the chunks.
I thought the chunks might be a different type of object, so I checked with type(chunk), and it is the same as type(whole_data): both are DataFrames.
What I tried:
import pandas as pd
whole_data = pd.read_csv('data.csv', sep=',', header=0)
whole_data['cuisines'] = whole_data.cuisines.apply(lambda x: ','+x)
This gives me the expected result. But when I try running the same code on chunks as:
for chunk in pd.read_csv('data.csv', sep=',', header=0, chunksize=1000):
    chunk['cuisines'] = chunk.cuisines.apply(lambda x: ','+x)
This gives me an error: TypeError: can only concatenate str (not "float") to str
I expect the output to be the same as the output I got when running the code on the whole dataset.
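For anyone trying to reproduce this: the error message suggests that at least one value in the cuisines column is a float rather than a string, and pandas represents a missing value in a text column as the float NaN. Below is a minimal sketch of how the offending rows could be located chunk by chunk, and how the comma could be prepended to string values only. It is not a confirmed diagnosis of my file. SAMPLE_CSV is made-up data standing in for data.csv, and the function names non_string_rows and prepend_comma are hypothetical helpers, not pandas APIs.

```python
import io

import pandas as pd

# Made-up sample standing in for data.csv (assumption); row "B" has an
# empty cuisines field, which read_csv parses as NaN (a float) by default.
SAMPLE_CSV = "name,cuisines\nA,Thai\nB,\nC,Italian\nD,Indian\n"


def non_string_rows(source, column="cuisines", chunksize=2):
    """Return the index labels of rows whose `column` value is not a str."""
    bad = []
    for chunk in pd.read_csv(source, sep=",", header=0, chunksize=chunksize):
        # Row labels keep counting across chunks, so they identify file rows.
        for label, value in chunk[column].items():
            if not isinstance(value, str):
                bad.append(label)
    return bad


def prepend_comma(source, column="cuisines", chunksize=2):
    """Prepend ',' to string values only; non-string values are left as-is."""
    parts = []
    for chunk in pd.read_csv(source, sep=",", header=0, chunksize=chunksize):
        chunk[column] = chunk[column].apply(
            lambda x: "," + x if isinstance(x, str) else x
        )
        parts.append(chunk)
    # Reassemble the processed chunks into a single DataFrame.
    return pd.concat(parts)


bad_rows = non_string_rows(io.StringIO(SAMPLE_CSV))
result = prepend_comma(io.StringIO(SAMPLE_CSV))
print(bad_rows)
print(result)
```

If non_string_rows reports any rows for the real file, those are the values that make ','+x fail. Whether they should stay missing, be filled with an empty string, or be dropped depends on what the output is supposed to look like.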