Python script to skip specific column in CSV files

Question

asked Oct 6, 2020 in Data Science by blackindya (18.4k points)

I have Python code that filters the data by a specific column and creates multiple CSV files.

Here is my main csv file:

Name, City, Email
john cty_1 [email protected]
jack cty_1 [email protected]
...
Ross cty_2 [email protected]
Rachel cty_2 [email protected]
...

My Python logic currently produces a separate CSV file for each city. The existing logic is:

from itertools import groupby
import csv
with open('filtered_final.csv') as csv_file:
    reader = csv.reader(csv_file)
    next(reader) #skip header

    #Group by column (city)
    lst = sorted(reader, key=lambda x : x[1])
    groups = groupby(lst, key=lambda x : x[1])
    #Write file for each city
    for k,g in groups:
        filename = k[21:] + '.csv'
        with open(filename, 'w', newline='') as fout:
            csv_output = csv.writer(fout)
            csv_output.writerow(["Name","City","Email"]) #header
            for line in g:
                csv_output.writerow(line)

I want to remove the "City" column from each of the new CSV files.

1 Answer

supriya · Answer 1 · 2020-10-06T09:20:41+0000

If your data is small enough to fit in RAM, you can read the whole file in and then do a groupby:

import pandas as pd
df = pd.read_csv('filtered_final.csv')
for city, data in df[['Name','Email']].groupby(df['City']):
    data.to_csv(f'{city}_data.csv', index=False)
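
If you would rather stay with the csv module from the question, a minimal sketch of the same idea is to leave the city value out of each row as it is written. The function name split_by_city and its parameters (city_index, header, out_dir) are hypothetical names chosen for illustration. It also assumes a comma-delimited file and names each output file after the full city value, whereas the question slices the name with k[21:]. Like the original code, it sorts all rows in memory.

```python
import csv
import os
from itertools import groupby


def split_by_city(src_path, city_index=1, header=("Name", "Email"), out_dir="."):
    """Write one CSV per city, omitting the city column.

    Returns the list of file paths written. All names here are
    illustrative assumptions, not part of the original question.
    """
    with open(src_path, newline='') as csv_file:
        reader = csv.reader(csv_file)
        next(reader)  # skip the header row
        # groupby only groups adjacent rows, so sort by city first;
        # blank lines are skipped to avoid an IndexError
        rows = sorted((row for row in reader if row),
                      key=lambda row: row[city_index])

    written = []
    for city, group in groupby(rows, key=lambda row: row[city_index]):
        # Assumption: the full city value is used as the file name
        filename = os.path.join(out_dir, f"{city}.csv")
        with open(filename, 'w', newline='') as fout:
            writer = csv.writer(fout)
            writer.writerow(header)  # header without "City"
            for row in group:
                # Keep every field except the one at city_index
                writer.writerow(row[:city_index] + row[city_index + 1:])
        written.append(filename)
    return written
```

Calling split_by_city('filtered_final.csv') would then write one file per city into the current directory, each containing only the Name and Email columns.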

