0 votes
1 view
in Python by (6.1k points)

I have two dataframes and I want to find the difference between dataframe1 and dataframe2 based on the condition. What I mean is the following:

df.ref_well:

zone depth

a    34

b    23

c    11

d    35

e    -9999

df_well

zone depth

a    17

c    15

d    25

f    11

what I want is to generate the df3 with the zone name and the difference between the depth of the same zones in df1 and df3:

df3 = well- ref well (the same zones)

zone depth

a    17

b    -9999

c    -4

d    10

e    -9999

I have tried to iterate through dfs separately and identify the same zones, and if they are equal to find the difference:

ref_well_zone_count=len(df_ref_well.iloc[:,0])

well_zone_count=len(df_well.iloc[:,0])

delta_depth=[]

for ref_zone in range(ref_well_zone_count):

    for well_zone in range(well_zone_count):

        if df_ref_well.iloc[ref_zone,0]==df_well.iloc[well_zone,0]:

            delta_tvdss.append(df_well.iloc[well_zone, 1] - df_ref_well.iloc[ref_zone, 1])

The problem is I can't fill the results into the new column I am not able to insert them, so when I try adding the delta_depth as a column it says that:

ValueError: Length of values does not match length of index

But if I print out the results it calculates perfectly

1 Answer

0 votes
by (12.1k points)

You didn't specify what you want to do if there is no match. So I will assume no match means depth = 0

Link 2 df together using merge, then fill those that doesn't have a match will have 0 by default:

df3 = pd.merge(ref_well,df_well, on=['zone'], how='outer').fillna(0)

Calculate the difference and put it back

df3['diff'] = df3.depth_x - df3.depth_y

...