
I want to group by my 'group' column first, then change values in my 'result' column based on a condition involving the 'result' and 'rank' columns.

This is my code:

import pandas as pd

import numpy as np

group = ['g1','g1','g1','g1','g1','g2','g2','g2','g2','g2','g2']

rank = ['1','2','3','4','5','1','2','3','4','5','6']

result = ['1','4','2','4','4','1','4','4','2','4','4']

df = pd.DataFrame({"group": group, "rank": rank, "result": result})

group   rank    result

0   g1        1       1

1   g1        2       4

2   g1        3       2

3   g1        4       4

4   g1        5       4

5   g2        1       1

6   g2        2       4

7   g2        3       4

8   g2        4       2

9   g2        5       4

10  g2        6       4

Within each group, I want to change result from 4 to 6 for the rows whose rank is greater than the rank of the row where result = 2.

For example, in g1 the row with result = 2 has rank 3, so the results at ranks 4 and 5 become 6.

In g2, the row with result = 2 has rank 4, so the results at ranks 5 and 6 become 6.

In this case, the desired output will be:

group   rank    result

0   g1        1       1

1   g1        2       4

2   g1        3       2

3   g1        4       6

4   g1        5       6

5   g2        1       1

6   g2        2       4

7   g2        3       4

8   g2        4       2

9   g2        5       6

10  g2        6       6

Can anyone help me solve this?


Use Series.where to replace rank with NaN in every row except those where result is 2, then use GroupBy.transform with 'first' to repeat the first non-missing rank across each group. Finally, compare each rank against that value with Series.gt and set the value 6 with DataFrame.loc:

#convert to integers so values compare numerically (as strings, '10' sorts before '2')

df[['rank','result']] = df[['rank','result']].astype(int)

s = df['rank'].where(df['result'].eq(2)).groupby(df['group']).transform('first')

df.loc[df['rank'].gt(s), 'result'] = 6

print(df)

group  rank  result

0     g1     1       1

1     g1     2       4

2     g1     3       2

3     g1     4       6

4     g1     5       6

5     g2     1       1

6     g2     2       4

7     g2     3       4

8     g2     4       2

9     g2     5       6

10    g2     6       6
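To make the intermediate values easier to follow, here is a minimal sketch of the same approach split into steps. The names `marked`, `threshold`, and `mask` are introduced only for illustration. The extra `result == 4` condition is an assumption based on the question's wording ("from 4 to 6"); the answer above changes every row ranked after the threshold, which gives the same output on this sample data.

```python
import pandas as pd

# Same sample data as in the question, already stored as integers.
df = pd.DataFrame({
    "group": ["g1"] * 5 + ["g2"] * 6,
    "rank": [1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 6],
    "result": [1, 4, 2, 4, 4, 1, 4, 4, 2, 4, 4],
})

# Step 1: keep rank only where result == 2; every other row becomes NaN.
marked = df["rank"].where(df["result"].eq(2))

# Step 2: repeat the first non-NaN rank of each group on every row of that group.
threshold = marked.groupby(df["group"]).transform("first")

# Step 3: select rows ranked after the threshold.
# Assumption: only rows whose result is currently 4 should be changed.
mask = df["rank"].gt(threshold) & df["result"].eq(4)
df.loc[mask, "result"] = 6

print(threshold.tolist())
print(df)
```

If a group has no row with result = 2, its threshold is NaN, the comparison is False for every row in that group, and the group is left unchanged.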
