Aggregations refer to any data transformation that produces scalar values from arrays. You may wonder what is going on when you invoke mean() on a GroupBy object. Many common aggregations, such as mean, median, standard deviation, and count, have optimized implementations. However, you are not limited to this set of methods.

You can use aggregations of your own devising and additionally call any method that is also defined on the grouped object. For example, you might recall that quantile computes sample quantiles of a Series or a DataFrame’s columns. quantile was not always explicitly implemented for GroupBy (recent pandas versions do provide it directly), but because it is a Series method it is available for use either way.
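As a minimal sketch of calling quantile on a grouped object: the DataFrame below and its column names key1 and data1 are made up for illustration and do not come from the original text.

```python
import pandas as pd

# Hypothetical data; key1 and data1 are illustrative names.
df = pd.DataFrame({
    "key1": ["a", "a", "b", "b", "a"],
    "data1": [1.0, 2.0, 3.0, 4.0, 5.0],
})

grouped = df.groupby("key1")

# 0.5 quantile (the median) of data1 within each group.
q50 = grouped["data1"].quantile(0.5)
print(q50)
```

The result is a Series indexed by the group keys, one quantile value per group.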

To use your own aggregation functions, pass any function that aggregates an array to the aggregate or agg method:
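A small sketch of a user-defined aggregation, assuming a made-up DataFrame; the helper name peak_to_peak and the columns key1 and data1 are illustrative only.

```python
import pandas as pd

# Hypothetical data; key1 and data1 are illustrative names.
df = pd.DataFrame({
    "key1": ["a", "a", "b", "b", "a"],
    "data1": [1.0, 2.0, 3.0, 4.0, 5.0],
})


def peak_to_peak(arr):
    """Reduce an array of values to one scalar: the max minus the min."""
    return arr.max() - arr.min()


# agg calls peak_to_peak once per group and collects the scalars.
ptp = df.groupby("key1")["data1"].agg(peak_to_peak)
print(ptp)
```

Custom functions like this are generally slower than the optimized built-in aggregations, since they are called group by group.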

agg() method

We can also apply several functions in a single aggregation: call the agg() method and pass every function we want to apply as a list.
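A brief sketch of passing a list of functions to agg(), using made-up data (the names key1 and data1 are assumptions for illustration):

```python
import pandas as pd

# Hypothetical data; key1 and data1 are illustrative names.
df = pd.DataFrame({
    "key1": ["a", "a", "b", "b", "a"],
    "data1": [1.0, 2.0, 3.0, 4.0, 5.0],
})

# Each function in the list becomes one column of the result.
stats = df.groupby("key1")["data1"].agg(["sum", "mean"])
print(stats)
```

The result has one row per group and one column per function, named after the function.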

We can also perform multiple computations on a window DataFrame. In that case, each function produces a separate result column for every input column. For example, applying sum and mean to a window DataFrame with three columns gives six result columns: three for the sum and three for the mean.
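The six-column case can be sketched as follows. The three-column DataFrame and the window size of 2 are assumptions chosen for illustration, since the original example is not shown.

```python
import pandas as pd

# Hypothetical three-column data; A, B, C are illustrative names.
df = pd.DataFrame({
    "A": [1.0, 2.0, 3.0, 4.0],
    "B": [10.0, 20.0, 30.0, 40.0],
    "C": [100.0, 200.0, 300.0, 400.0],
})

# Rolling window over 2 rows; apply sum and mean to every column.
result = df.rolling(window=2).agg(["sum", "mean"])

# Columns form a two-level index: (input column, function name).
print(result.columns.tolist())
print(result)
```

The first row is NaN for every column because a window of 2 rows is not yet full there.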

You don’t need to accept the names that GroupBy gives to the columns; notably, lambda functions have the name ‘<lambda>’, which makes them hard to identify (you can see for yourself by looking at a function’s __name__ attribute). Thus, if you pass a list of (name, function) tuples, the first element of each tuple will be used as the DataFrame column name (you can think of a list of 2-tuples as an ordered mapping):
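A short sketch of naming result columns with (name, function) tuples. The data and the names average and spread are made up for illustration.

```python
import pandas as pd

# Hypothetical data; key1 and data1 are illustrative names.
df = pd.DataFrame({
    "key1": ["a", "a", "b", "b", "a"],
    "data1": [1.0, 2.0, 3.0, 4.0, 5.0],
})

# A lambda reports "<lambda>" as its name, which is not a useful label.
spread_fn = lambda x: x.max() - x.min()
print(spread_fn.__name__)

# The first element of each tuple becomes the column name.
renamed = df.groupby("key1")["data1"].agg([
    ("average", "mean"),
    ("spread", spread_fn),
])
print(renamed)
```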

With a DataFrame you have more options, as you can specify a list of functions to apply to all of the columns or different functions per column.
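Both options can be sketched on a small made-up DataFrame; the column names key1, data1, and data2 are assumptions for illustration.

```python
import pandas as pd

# Hypothetical data; key1, data1, data2 are illustrative names.
df = pd.DataFrame({
    "key1": ["a", "a", "b", "b", "a"],
    "data1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "data2": [10.0, 20.0, 30.0, 40.0, 50.0],
})

grouped = df.groupby("key1")

# Option 1: the same list of functions applied to every column.
all_cols = grouped.agg(["min", "max"])

# Option 2: a dict mapping each column to its own function.
per_col = grouped.agg({"data1": "mean", "data2": "sum"})

print(all_cols)
print(per_col)
```

With a list, the result columns are hierarchical (column, function); with a dict of single functions, the result keeps the original flat column names.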

Apply Multiple Functions on Multiple Columns of a DataFrame

So far we have computed statistics on the whole DataFrame, but we can also select a single column, or several columns, and compute statistics on just those.
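A minimal sketch of restricting the computation to selected columns, assuming a made-up three-column DataFrame and a rolling window of 2 rows (both are illustrative choices, not taken from the original):

```python
import pandas as pd

# Hypothetical data; A, B, C are illustrative names.
df = pd.DataFrame({
    "A": [1.0, 2.0, 3.0, 4.0],
    "B": [10.0, 20.0, 30.0, 40.0],
    "C": [100.0, 200.0, 300.0, 400.0],
})

# One column: the result has one column per function.
single = df["A"].rolling(window=2).agg(["sum", "mean"])

# Two columns: the result has (column, function) pairs; C is left out.
subset = df[["A", "B"]].rolling(window=2).agg(["sum", "mean"])

print(single)
print(subset)
```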


Once the rolling, expanding, and ewm objects are created, several methods are available to perform aggregations on the data. The aggregate method can perform one or several computations at once. We can aggregate by passing a function to the entire DataFrame, or select a Series (or multiple Series) via standard __getitem__, in which case the function is applied only to that selection.
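The three kinds of window object can be sketched like this. The data, the window size, and the span value are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical data; A and B are illustrative names.
df = pd.DataFrame({
    "A": [1.0, 2.0, 3.0, 4.0],
    "B": [10.0, 20.0, 30.0, 40.0],
})

r = df.rolling(window=2)

# Aggregate over the entire DataFrame.
whole = r.aggregate("sum")

# Select one Series via __getitem__ and aggregate only that.
one = r["A"].aggregate("sum")

# Expanding window: every row aggregates all rows up to and including it.
exp = df.expanding().aggregate("sum")

# Exponentially weighted window, shown here with its mean method.
ewm_mean = df["A"].ewm(span=2).mean()

print(whole)
print(one)
print(exp)
print(ewm_mean)
```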
