How do I sum values in a column that match a given condition using pandas?

Question

2 Answers

Shrutiparna · Answer 1 · 2019-05-21T10:23:14+0000

@Alex, you can use any of the following approaches:

1. Using groupby(), which splits the DataFrame into groups by the value in column 'X'. Sum 'Y' within each group, then pick the group where X is 1:

df.groupby('X')['Y'].sum()[1]

13

2. Similarly, we can use Boolean indexing, where loc selects the rows that match the condition and the column to sum:

df.loc[df['X'] == 1, 'Y'].sum()

13

3. query() can also be used to filter the rows you are interested in:

df.query("X == 1")['Y'].sum()

13
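To try the three approaches side by side, here is a minimal sketch. The sample data is an assumption: it reuses the X and Y columns of the table shown later in this answer, which is consistent with the result of 13 above.

```python
import pandas as pd

# Assumed sample data (X and Y columns of the table in this answer).
df = pd.DataFrame({"X": [1, 1, 2, 1, 2], "Y": [3, 4, 6, 6, 3]})

# 1. groupby: sum Y per value of X, then look up the group labelled 1.
total_groupby = df.groupby("X")["Y"].sum()[1]

# 2. Boolean indexing: keep rows where X == 1, select column Y, sum it.
total_loc = df.loc[df["X"] == 1, "Y"].sum()

# 3. query: filter rows with an expression string, then sum Y.
total_query = df.query("X == 1")["Y"].sum()

print(total_groupby, total_loc, total_query)  # 13 13 13
```

groupby() computes the sum for every value of X, so it is most useful when you need more than one group; for a single condition, loc or query() does less work.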

Similarly, suppose you had three columns:

X Y Z
1 3 2
1 4 2
2 6 2
1 6 2
2 3 2

If you want to sum the values of Y in the rows where Z is 2 and X is 2, you can use any of the following:

1. groupby() on both columns, then select the group where X is 2 and Z is 2

df.groupby(['X', 'Z'])['Y'].sum().loc[(2, 2)]

2. query()

df.query("X == 2 and Z == 2")['Y'].sum()

3. Boolean indexing

df.loc[(df['X'] == 2) & (df['Z'] == 2), 'Y'].sum()
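As a rough check of the two-condition case, the sketch below builds the three-column table from this answer and applies each approach. Grouping on both X and Z (rather than X alone) is what lets groupby() honor both conditions; the variable names are only illustrative.

```python
import pandas as pd

# The three-column table from this answer.
df = pd.DataFrame({
    "X": [1, 1, 2, 1, 2],
    "Y": [3, 4, 6, 6, 3],
    "Z": [2, 2, 2, 2, 2],
})

# groupby on both keys gives a Series indexed by (X, Z) pairs.
total_groupby = df.groupby(["X", "Z"])["Y"].sum().loc[(2, 2)]

# query: combine conditions with `and` inside the expression string.
total_query = df.query("X == 2 and Z == 2")["Y"].sum()

# Boolean indexing: combine masks with `&`, each wrapped in parentheses.
total_loc = df.loc[(df["X"] == 2) & (df["Z"] == 2), "Y"].sum()

print(total_groupby, total_query, total_loc)  # 9 9 9
```

Note that Boolean masks need `&` with parentheses around each comparison; Python's `and` keyword does not work element-wise on Series.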


Vishal · Answer 2 · 2019-09-18T09:38:30+0000

You can also do this without groupby or loc by putting the condition directly inside the indexing brackets. Let the name of the DataFrame be df. Then you can try:

df[df['a']==1]['b'].sum()

Or you can pass the filtered column to Python's built-in sum():

sum(df[df['a']==1]['b'])

Another way is to use the NumPy library:

import numpy as np
print(np.where(df['a']==1, df['b'],0).sum())
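Here is a small self-contained sketch of the three variants in this answer. The DataFrame contents and the column names a and b are assumed sample data, chosen only for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; columns 'a' and 'b' follow the naming used above.
df = pd.DataFrame({"a": [1, 1, 2, 1, 2], "b": [3, 4, 6, 6, 3]})

# Filter rows with a Boolean mask, then sum column b with the pandas method.
total_mask = df[df["a"] == 1]["b"].sum()

# Same filter, summed with Python's built-in sum().
total_builtin = sum(df[df["a"] == 1]["b"])

# np.where keeps b where a == 1 and substitutes 0 elsewhere, then sums.
total_where = np.where(df["a"] == 1, df["b"], 0).sum()

print(total_mask, total_builtin, total_where)  # 13 13 13
```

One difference worth knowing: the pandas .sum() skips missing values by default, while the built-in sum() and the NumPy array sum return NaN if a matching row of b is NaN.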
