
For each id, I want to calculate the fraction of rows in which col1 is True, and store that value in a new column on every row of the id.

Here is an example of my data:

id     col1
 1     True
 1     True
 1     False
 1     True
 2     False
 2     False

The new column should look like this:

id     col1     num_true
 1     True     0.75
 1     True     0.75
 1     False    0.75
 1     True     0.75
 2     False    0
 2     False    0

This is what I tried to do:

df['num_true'] = df[df['col1'] == 'True'].groupby('id')['col1'].count()
df['num_col1_id'] = df.groupby('id')['col1'].transform('count')
df['perc_true'] = df.num_true / df.num_col1_id

1 Answer


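Use groupby with transform('mean'). Since True counts as 1 and False as 0, the mean of a boolean column within each group is the fraction of True values, and transform returns that value for every row of the group:

df['num_true'] = df.groupby('id')['col1'].transform('mean')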

   id   col1  num_true
0   1   True      0.75
1   1   True      0.75
2   1  False      0.75
3   1   True      0.75
4   2  False      0.00
5   2  False      0.00
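
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the sample data from the question and applies the same transform. The question compares col1 to the string 'True', so the column might actually hold strings rather than booleans. That is only a guess, and the df_str variant below is a hypothetical illustration of how that case could be handled by converting to booleans first.

```python
import pandas as pd

# Sample data copied from the question (col1 assumed to be boolean)
df = pd.DataFrame({
    "id": [1, 1, 1, 1, 2, 2],
    "col1": [True, True, False, True, False, False],
})

# The mean of a boolean column per group is the fraction of True values;
# transform returns one value per original row, aligned with df's index.
df["num_true"] = df.groupby("id")["col1"].transform("mean")

# Hypothetical variant: col1 stored as the strings 'True' / 'False'.
# Convert to booleans with .eq('True') before grouping.
df_str = df.assign(col1=df["col1"].map({True: "True", False: "False"}))
df_str["num_true"] = (
    df_str["col1"].eq("True").groupby(df_str["id"]).transform("mean")
)

print(df)
```

As far as I can tell, the attempt in the question fails for two reasons. If col1 is boolean, comparing it to the string 'True' matches no rows. And even when the filter does match, the filtered groupby(...).count() result is indexed by id rather than by row, so assigning it to df['num_true'] aligns on the wrong labels and leaves most rows as NaN.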
