in Data Science by (50.2k points)

I want to append (merge) all the CSV files in a folder using Python pandas.

For example, say the folder has two CSV files, test1.csv and test2.csv, as follows:

A_Id    P_Id    CN1         CN2         CN3

AAA     111     702         709         740

BBB     222     1727        1734        1778

and

A_Id    P_Id    CN1         CN2         CN3

CCC     333     710        750          750

DDD     444     180        734          778

The Python script I wrote is as follows:

#!/usr/bin/python

import pandas as pd

import glob

all_data = pd.DataFrame()

for f in glob.glob("testfolder/*.csv"):

    df = pd.read_csv(f)

    all_data = all_data.append(df)

all_data.to_csv('testfolder/combined.csv')

Although combined.csv seems to contain all the appended rows, it looks like this:

      CN1     CN2     CN3     A_Id    P_Id

  0   710     750     750     CCC     333

  1   180     734     778     DDD     444

  0   702     709     740     AAA     111

  1   1727    1734    1778    BBB     222

Whereas it should look like this:

A_Id   P_Id   CN1    CN2    CN3

AAA    111    702    709    740

BBB    222    1727   1734   1778

CCC    333    710    750    750

DDD    444    180    734    778

Why are the first two columns moved to the end?

Why are the rows of the second file placed before those of the first, rather than after them?

What am I missing? And how can I get rid of the 0s and 1s in the first column?

P.S.: Since these are large CSV files, I thought of using pandas.

1 Answer

by (108k points)

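To keep the columns in their original order, reselect them inside the loop using the column order of the file you just read:

all_data = all_data.append(df)[df.columns.tolist()]

Note that DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so on a current version this line has to be rewritten with pd.concat.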
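
For what it's worth, here is a rough sketch of how the whole script could look on a current pandas version, where pd.concat takes the place of append. The function name combine_csvs and the output_name parameter are made up for this example and are not from the question. The sketch assumes every file is comma-separated and has the same columns. It covers the three points raised: glob.glob returns files in no guaranteed order, so the paths are sorted; ignore_index=True replaces the repeated 0, 1 row labels with one running index; and index=False keeps that index out of the written file.

```python
import glob
import os

import pandas as pd


def combine_csvs(folder, output_name="combined.csv"):
    """Concatenate every CSV in `folder` and write the result into the same folder."""
    output_path = os.path.join(folder, output_name)

    # glob returns paths in arbitrary order, so sort them for a predictable row order.
    # Skip the output file itself so a rerun does not read the previous result back in.
    paths = sorted(
        p for p in glob.glob(os.path.join(folder, "*.csv"))
        if os.path.abspath(p) != os.path.abspath(output_path)
    )

    frames = [pd.read_csv(p) for p in paths]

    # ignore_index=True renumbers the rows 0..n-1 instead of repeating 0, 1 per file.
    combined = pd.concat(frames, ignore_index=True)

    # Keep the first file's column order (assumes every file has the same columns).
    combined = combined[frames[0].columns.tolist()]

    # index=False leaves the row numbers out of the written file.
    combined.to_csv(output_path, index=False)
    return combined
```

Called as combine_csvs("testfolder"), this should write testfolder/combined.csv with the rows of test1.csv followed by those of test2.csv and no leading index column. Reading all files into a list and concatenating once also avoids copying the growing DataFrame on every loop iteration, which matters for large files.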
