
I'm trying to add a new column (two columns, in fact) to a pandas dataframe, with the data coming from another dataframe.

I have the following two dataframes (they are examples for this purpose; the original dataframes are much bigger):

In [116]: df0
Out[116]:
   A  B  C
0  0  1  0
1  2  3  2
2  4  5  4
3  5  5  5

In [118]: df1
Out[118]:
   A  D  E
0  2  7  2
1  6  5  5
2  4  3  2
3  0  1  0
4  5  4  6
5  0  1  0

And I want a new dataframe (or the columns added to df0, either is fine) like this:

df2:
   A  B  C  D  E
0  0  1  0  1  0
1  2  3  2  7  2
2  4  5  4  3  2
3  5  5  5  4  6

As you can see, the row with A=6, which is present in df1 but not in df0, does not appear in the result. Also, the row with A=0 is duplicated in df1 but appears only once in df2.

Actually, I'm having trouble with the selection method. I can do this:

df1.loc[df1['A'].isin(df0['A'])]

But I'm not sure how to keep only the unique rows (remember that df1 can contain duplicated data) and add the two columns to df2 (or add them to df0). I've searched here, but I don't see how to apply something like groupby, or even map.

Any idea?

Thanks!

1 Answer


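You can use merge, joining on column A rather than on the index (an index-to-index merge would pair rows by position and produce A_x and A_y columns). Drop the duplicated A values from df1 first so that each row of df0 matches at most once:

import pandas as pd

# Keep the first df1 row for each A, then left-join onto df0 by A
df2 = pd.merge(df0, df1.drop_duplicates(subset='A'), on='A', how='left')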
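For reference, here is a self-contained sketch that rebuilds the two sample frames from the question and produces the requested df2. It assumes that rows should be matched on column A and that, when A repeats in df1, the first occurrence is the one to keep. The question does not say which duplicate wins, and the name df1_unique is only illustrative.

```python
import pandas as pd

# Sample frames copied from the question.
df0 = pd.DataFrame({"A": [0, 2, 4, 5], "B": [1, 3, 5, 5], "C": [0, 2, 4, 5]})
df1 = pd.DataFrame(
    {"A": [2, 6, 4, 0, 5, 0], "D": [7, 5, 3, 1, 4, 1], "E": [2, 5, 2, 0, 6, 0]}
)

# Assumption: when A repeats in df1, keep the first occurrence.
df1_unique = df1.drop_duplicates(subset="A", keep="first")

# Left join on A: every row of df0 is kept, and df1 rows whose A is
# absent from df0 (here A=6) are dropped.
df2 = df0.merge(df1_unique, on="A", how="left")

print(df2)
```

With how='left', a value of A in df0 that has no match in df1 would still appear in df2, with NaN in D and E. If such rows should be dropped instead, how='inner' is probably the better choice.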
