Explore Courses Blog Tutorials Interview Questions
0 votes
in Data Science by (17.6k points)

I have two pandas dataframes (df_1df_2) with the same columns, but in one dataframe (df_1) some values of one column are missing. So I want to fill in those missing values from df_2, but only when the the values of two columns match.

Here is a little example what my data looks like:

df_1: df_1

df_2: df_2

I tried to add the missing values with:

df_1.update(df_2, overwrite=False)

But the problem is, that it will fill in the values, even when just one column matches. I want to fill in the value when the columns "housenumber" AND "street" matches.

1 Answer

0 votes
by (41.4k points)

In both DataFrames,you need  set_index for Multiindex and after that  combine_first or fillna:

df1 = df_1.set_index(["housenumber", "street"])

df2 = df_2.set_index(["housenumber", "street"])

df = df1.combine_first(df2).reset_index()

df = df1.fillna(df2).reset_index()

Browse Categories