I'm trying to join two DataFrames by index. They can have columns in common, and I only want to take a value from the second DataFrame if the corresponding value in the first is NaN or doesn't exist. I'm using the pandas example, so I've got:

import pandas as pd

df1 = pd.DataFrame({'A': ['A0', 'A1', 'A2', 'A3'],
                    'B': ['B0', 'B1', 'B2', 'B3'],
                    'C': ['C0', 'C1', 'C2', 'C3'],
                    'D': ['D0', 'D1', 'D2', 'D3']},
                   index=[0, 1, 2, 3])

as

    A   B   C   D
0  A0  B0  C0  D0
1  A1  B1  C1  D1
2  A2  B2  C2  D2
3  A3  B3  C3  D3

and

df4 = pd.DataFrame({'B': ['B2p', 'B3p', 'B6p', 'B7p'],
                    'D': ['D2p', 'D3p', 'D6p', 'D7p'],
                    'F': ['F2p', 'F3p', 'F6p', 'F7p']},
                   index=[2, 3, 6, 7])

as

     B    D    F
2  B2p  D2p  F2p
3  B3p  D3p  F3p
6  B6p  D6p  F6p
7  B7p  D7p  F7p

and the desired result is:

     A    B    C    D    F
0   A0   B0   C0   D0  NaN
1   A1   B1   C1   D1  NaN
2   A2   B2   C2   D2  F2p
3   A3   B3   C3   D3  F3p
6  NaN  B6p  NaN  D6p  F6p
7  NaN  B7p  NaN  D7p  F7p

1 Answer


You can use combine_first(). The row and column indices of the resulting DataFrame are the union of the two, and values from the calling DataFrame take priority. Where an index or column is absent from one of the DataFrames, the value from the other is used (the same behaviour as if it contained a NaN):

df1.combine_first(df4)
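To see this end to end, here is a minimal sketch that rebuilds the two DataFrames from the question and checks both cases: rows and columns missing from df1, and a NaN inside df1. The name `gappy` and the deliberately blanked cell are assumptions added for illustration and are not part of the original question.

```python
import pandas as pd

df1 = pd.DataFrame({'A': ['A0', 'A1', 'A2', 'A3'],
                    'B': ['B0', 'B1', 'B2', 'B3'],
                    'C': ['C0', 'C1', 'C2', 'C3'],
                    'D': ['D0', 'D1', 'D2', 'D3']},
                   index=[0, 1, 2, 3])

df4 = pd.DataFrame({'B': ['B2p', 'B3p', 'B6p', 'B7p'],
                    'D': ['D2p', 'D3p', 'D6p', 'D7p'],
                    'F': ['F2p', 'F3p', 'F6p', 'F7p']},
                   index=[2, 3, 6, 7])

# df1 is the caller, so its non-missing values win wherever both frames
# have a value; df4 only supplies rows, columns, or cells df1 lacks.
result = df1.combine_first(df4)
print(result)

# Hypothetical variant: blank out one cell of df1 to show that a missing
# value in the caller is filled from df4 as well.
gappy = df1.copy()
gappy.loc[2, 'B'] = None
filled = gappy.combine_first(df4)
print(filled.loc[2, 'B'])
```

Note that the order matters: df4.combine_first(df1) would instead keep df4's values (B2p, D2p, and so on) in the overlapping rows 2 and 3.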
