Multiple aggregations of the same column using pandas GroupBy.agg()

Question

asked Aug 24, 2019 in Data Science by sourav (17.6k points)

Given the following (totally overkill) example data frame:

import datetime as dt
import numpy as np  # needed for np.random.randn and np.repeat below
import pandas as pd

df = pd.DataFrame({
    "date": [dt.date(2012, x, 1) for x in range(1, 11)],
    "returns": 0.05 * np.random.randn(10),
    "dummy": np.repeat(1, 10),
})

is there an existing built-in way to apply two different aggregating functions to the same column, without having to call agg multiple times?

The intuitive, but non-working, way to do it would be:

# Assume `function1` and `function2` are defined for aggregating.
df.groupby("dummy").agg({"returns":function1, "returns":function2})

A dict can't hold duplicate keys, though, so the second entry would silently replace the first. Is there another way to express the input to agg? Perhaps a list of tuples [(column, function)] would work better, allowing multiple functions to be applied to the same column? But agg seems to accept only a dictionary.

Is there a workaround for this besides defining an auxiliary function that just applies both of the functions inside of it? (How would this work with aggregation anyway?)

1 Answer

Shlok Pandey · Answer 1 · 2019-08-24T10:44:50+0000

Pass the functions as a list:

In [20]: df.groupby("dummy").agg({"returns": [np.mean, np.sum]})
Out[20]:
        returns
            sum      mean
dummy
1      0.285833  0.028583
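For anyone who wants to run the list form end to end, here is a minimal sketch. The fixed seed, the use of `default_rng`, and the string aliases "mean" and "sum" (in place of np.mean and np.sum) are assumptions added for reproducibility and are not part of the original answer. Current pandas releases order the output columns as the functions are listed.

```python
import datetime as dt

import numpy as np
import pandas as pd

# Rebuild the example frame; the seed is an assumption, chosen only so
# that repeated runs produce the same numbers.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "date": [dt.date(2012, x, 1) for x in range(1, 11)],
    "returns": 0.05 * rng.standard_normal(10),
    "dummy": np.repeat(1, 10),
})

# A list of functions for one column yields one output column per function,
# under a two-level column index: ("returns", "mean") and ("returns", "sum").
result = df.groupby("dummy").agg({"returns": ["mean", "sum"]})
print(result)
```

The result is indexed by the group key, so a single value can be read with result.loc[1, ("returns", "sum")].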

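or, on pandas versions before 1.0, as a dictionary that also renames the output columns (this nested form was deprecated in 0.20 and raises SpecificationError since 1.0):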

In [21]: df.groupby('dummy').agg({'returns':
                                  {'Mean': np.mean, 'Sum': np.sum}})
Out[21]:
        returns
            Sum      Mean
dummy
1      0.285833  0.028583
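On pandas 0.25 and later, named aggregation covers the same renaming use case. The sketch below is an illustration rather than part of the original answer: the small frame and its values are made up, and the output names Mean and Sum are simply carried over from the example above.

```python
import pandas as pd

# Small stand-in frame; the values are invented for illustration.
df = pd.DataFrame({
    "returns": [0.01, -0.02, 0.03, 0.04],
    "dummy": [1, 1, 1, 1],
})

# Named aggregation on the frame: output_name=(column, function).
named = df.groupby("dummy").agg(
    Mean=("returns", "mean"),
    Sum=("returns", "sum"),
)

# The same idea on a single selected column: output_name=function.
per_column = df.groupby("dummy")["returns"].agg(Mean="mean", Sum="sum")

print(named)
print(per_column)
```

Both calls return a frame with flat column names, which avoids the two-level column index produced by the list form.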
