Back

Explore Courses Blog Tutorials Interview Questions
0 votes
2 views
in Data Science by (17.6k points)

Suppose I have dataframe df1 which includes two columns - A & B. Value of A represents the lower range and value of B represents the upper range.

  A     B

10.5  20.5

30.5  40.5

50.5  60.5

I've another dataframe which includes two columns - C & D containing a different range of numbers.

  C     D

12.34  15.90

13.68  19.13

33.5   35.60

35.12  38.76

50.6   59.1

Now I want to list all the pairs from df2 that fall under the groups (between the lower and upper range) in the df1.

Final output should be like this -

     Key                Values

(10.5, 20.5)  [(12.34, 15.90), (13.68, 19.13)]

(30.5, 40.5)  [(33.5, 35.60), (35.12, 38.76)]

(50.5, 60.5)  [(50.6, 59.1)]

The solution should be efficient as I have 5000 groups of range and 85000 numbers from different range.

1 Answer

0 votes
by (41.4k points)

Firstly, you have to  create output DF with the keys from df with A and B

df = pd.DataFrame({'Key':[(a,b) for a,b in df1.itertuples(index=False)]})

Then, you have to define a function to find range in df2 within the range from the Keys column

def find_range( key, df_2):

    mask = (key[0] <= df_2['C']) & (key[1] >= df_2['D'])

    return [(c,d) for c,d in df_2[mask].itertuples(index=False)]

Now using apply for creating the column Values.

df['Values'] = df['Key'].apply(find_range, args=(df2,))

 Output:

            Key                           Values

0  (10.5, 20.5)  [(12.34, 15.9), (13.68, 19.13)]

1  (30.5, 40.5)                   [(33.5, 35.6)]

If you want to be build successful data science career then enroll for best data science certification.

Welcome to Intellipaat Community. Get your technical queries answered by top developers!

30.5k questions

32.5k answers

500 comments

108k users

Browse Categories

...