I have 2 columns:

X Y

1 3

1 4

2 6

1 6

2 3

How to sum up values of Y where X=1 e.g this will give me [3+4+6=13] in pandas?

+2 votes

@Alex , you may refer to the following approaches:

1.Using **groupby() **which splits the dataframe into parts according to the value in column ‘X’ -

df.groupby('X')['Y'].sum()[1]

13

2.Similarly, we can use **Boolean indexing** where loc is used to handle indexing of rows and columns-

df.loc[df['X'] == 1, 'Y'].sum()

13

3.**Query **can also be used in order to filter rows you are interested in-

df.query("X == 1")['Y'].sum()

13

Similarly, if you had three columns :

Ex-

** X Y Z**

1 3 2

1 4 2

2 6 2

1 6 2

2 3 2

And you want to sum the rows of Y where Z is 2 and X is 2 ,then we may use the following:

1.groupby()

df.groupby('X')['Y'].sum()

2.Query

df.query("X == 2 and Z == 2")['Y'].sum()

3.Boolean indexing

df.loc[(df['X'] == 2) & (df['Z'] == 2), 'Y'].sum()

If You want to learn python for data science visit this python course by Intellipaat.

0 votes

You can also do this without using groupby or loc. By simply including the condition in code. Let the name of dataframe be df. Then you can try :

df[df['a']==1]['b'].sum()

or you can also try :

sum(df[df['a']==1]['b'])

Another way could be to use the numpy library of python :

import numpy as np

print(np.where(df['a']==1, df['b'],0).sum())

You can use the following video tutorials to clear all your doubts:-