Join two DataFrames by index and columns

Question

asked Jul 29, 2019 in Python by Rajesh Malhotra (19.9k points)

I'm trying to join two DataFrames by index. They may have columns in common, and I only want to take a value from the second DataFrame when the corresponding value in the first is NaN or doesn't exist. I'm using the example from the pandas documentation, so I've got:

import pandas as pd

df1 = pd.DataFrame({'A': ['A0', 'A1', 'A2', 'A3'],
                    'B': ['B0', 'B1', 'B2', 'B3'],
                    'C': ['C0', 'C1', 'C2', 'C3'],
                    'D': ['D0', 'D1', 'D2', 'D3']},
                   index=[0, 1, 2, 3])

as

    A   B   C   D
0  A0  B0  C0  D0
1  A1  B1  C1  D1
2  A2  B2  C2  D2
3  A3  B3  C3  D3

and

df4 = pd.DataFrame({'B': ['B2p', 'B3p', 'B6p', 'B7p'],
                    'D': ['D2p', 'D3p', 'D6p', 'D7p'],
                    'F': ['F2p', 'F3p', 'F6p', 'F7p']},
                   index=[2, 3, 6, 7])

as

     B    D    F
2  B2p  D2p  F2p
3  B3p  D3p  F3p
6  B6p  D6p  F6p
7  B7p  D7p  F7p

and the desired result is:

     A    B    C    D    F
0   A0   B0   C0   D0  NaN
1   A1   B1   C1   D1  NaN
2   A2   B2   C2   D2  F2p
3   A3   B3   C3   D3  F3p
6  NaN  B6p  NaN  D6p  F6p
7  NaN  B7p  NaN  D7p  F7p

1 Answer

Anirudh Singh · Answer 1 · 2019-07-29T05:52:02+0000

You can use combine_first(). The row and column indices of the resulting DataFrame will be the union of the two. Values from df1 take priority; where df1 has a NaN, or where an index or column is absent from df1, the value from df4 is used instead (an absent entry behaves the same as a NaN):

df1.combine_first(df4)
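To check this against the frames in the question, here is a minimal, self-contained sketch. The names `result`, `df1_gap`, and `filled` are only illustrative and do not appear in the original question; the second part adds a hypothetical NaN to df1 to show that an explicit NaN is filled the same way as a missing row or column.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'A': ['A0', 'A1', 'A2', 'A3'],
                    'B': ['B0', 'B1', 'B2', 'B3'],
                    'C': ['C0', 'C1', 'C2', 'C3'],
                    'D': ['D0', 'D1', 'D2', 'D3']},
                   index=[0, 1, 2, 3])

df4 = pd.DataFrame({'B': ['B2p', 'B3p', 'B6p', 'B7p'],
                    'D': ['D2p', 'D3p', 'D6p', 'D7p'],
                    'F': ['F2p', 'F3p', 'F6p', 'F7p']},
                   index=[2, 3, 6, 7])

# df1 values win wherever they are present; df4 only fills the holes.
result = df1.combine_first(df4)
print(result)

# Illustrative assumption: put an explicit NaN into a copy of df1.
df1_gap = df1.copy()
df1_gap.loc[2, 'B'] = np.nan

# The NaN at (2, 'B') is replaced by df4's value at the same position.
filled = df1_gap.combine_first(df4)
print(filled.loc[2, 'B'])
```

Note that the order matters: df4.combine_first(df1) would instead prefer df4's values for rows 2 and 3, which is not what the question asks for.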
