When working with Pandas DataFrames, you often need to find the maximum value across specific columns for each row and store the result in a new column. This operation is common in data analysis, decision-making, and reporting.
In this blog, we explore several ways to do this in Pandas: df.max(), apply(), and numpy.maximum.reduce().
Ways to Add a Column for Row-Wise Max Value
Let us explore a few ways to add a column for row-wise max value.
1. Creating a DataFrame
import pandas as pd
# Sample data that is used
data = {
'A': [10, 20, 30, 40],
'B': [5, 25, 35, 15],
'C': [8, 22, 31, 50]
}
# Creating a DataFrame out of the given data
df = pd.DataFrame(data)
print(df)
Output:
A B C
0 10 5 8
1 20 25 22
2 30 35 31
3 40 15 50
Now, once our DataFrame is created, we can perform the following operations.
2. Using df.max(axis=1)
The simplest way to compute the row-wise maximum is to call max() with axis=1 on the columns of interest:
import pandas as pd
data = {
'A': [10, 20, 30, 40],
'B': [5, 25, 35, 15],
'C': [8, 22, 31, 50]
}
df = pd.DataFrame(data)
# Take the maximum across columns A, B, and C for each row
df['Max_Value'] = df[['A', 'B', 'C']].max(axis=1)
print(df)
Output:
A B C Max_Value
0 10 5 8 10
1 20 25 22 25
2 30 35 31 35
3 40 15 50 50
3. Using apply() with max()
Another approach is to use apply() with max(). It is slower than df.max(axis=1) because it calls a Python function once per row, but it is useful when each row needs custom logic.
df['Max_Value'] = df[['A', 'B', 'C']].apply(lambda row: row.max(), axis=1)
print(df)
Output:
A B C Max_Value
0 10 5 8 10
1 20 25 22 25
2 30 35 31 35
3 40 15 50 50
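To illustrate the kind of per-row logic where apply() is worth its cost, here is a minimal sketch. The helper max_below_cap, the cap of 32, and the column name Capped_Max are hypothetical, chosen only for this example: each row gets the largest value that does not exceed the cap.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [10, 20, 30, 40],
    'B': [5, 25, 35, 15],
    'C': [8, 22, 31, 50],
})

# Hypothetical helper: largest value in the row that is <= cap
def max_below_cap(row, cap=32):
    allowed = row[row <= cap]
    # Fall back to the cap itself if every value in the row exceeds it
    return allowed.max() if not allowed.empty else cap

df['Capped_Max'] = df[['A', 'B', 'C']].apply(max_below_cap, axis=1)
print(df['Capped_Max'].tolist())  # [10, 25, 31, 15]
```

Simple rules like this one can usually also be written in vectorized form, so apply() is best kept for logic that is awkward to express otherwise.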
4. Using numpy.maximum.reduce() for Efficiency
For large DataFrames, NumPy's maximum.reduce() is an alternative that operates directly on the underlying NumPy array, which can avoid some Pandas overhead. Pass axis=1 so that the reduction runs across columns and returns one value per row; the default axis=0 would return one value per column instead.
import numpy as np
# axis=1 reduces across columns, giving one maximum per row
df['Max_Value'] = np.maximum.reduce(df[['A', 'B', 'C']].to_numpy(), axis=1)
print(df)
Output:
A B C Max_Value
0 10 5 8 10
1 20 25 22 25
2 30 35 31 35
3 40 15 50 50
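Whether the NumPy route is actually faster depends on the data and the library versions, so it is worth measuring on your own dataset. The sketch below builds a hypothetical 100,000-row DataFrame of random integers, checks that both methods agree, and times them with timeit. The sizes and repeat counts are arbitrary assumptions.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical test data: 100,000 rows of random integers
rng = np.random.default_rng(0)
cols = ['A', 'B', 'C']
big = pd.DataFrame(rng.integers(0, 1000, size=(100_000, 3)), columns=cols)

via_pandas = big[cols].max(axis=1)
via_numpy = np.maximum.reduce(big[cols].to_numpy(), axis=1)

# Both approaches should produce identical row-wise maxima
same = np.array_equal(via_pandas.to_numpy(), via_numpy)
print("Results match:", same)

# Timings vary by machine, so no expected values are shown here
t_pandas = timeit.timeit(lambda: big[cols].max(axis=1), number=5)
t_numpy = timeit.timeit(
    lambda: np.maximum.reduce(big[cols].to_numpy(), axis=1), number=5
)
print(f"pandas: {t_pandas:.4f}s, numpy: {t_numpy:.4f}s")
```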
Examples to Add a Column for Row-Wise Max Value
1. Selecting Particular Columns Dynamically
When you want the maximum over only some of the columns, keep their names in a list and select them before calling max(). Here only columns A and C are compared:
your_columns = ['A', 'C']
df['Max_Value'] = df[your_columns].max(axis=1)
print(df)
Output:
A B C Max_Value
0 10 5 8 10
1 20 25 22 22
2 30 35 31 31
3 40 15 50 50
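The column list can also be built from a condition instead of being typed by hand. As a minimal sketch, the example below adds a hypothetical text column called name, keeps only the numeric columns with select_dtypes(), and then keeps those whose mean exceeds an arbitrary threshold of 21.

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['w', 'x', 'y', 'z'],   # hypothetical non-numeric column
    'A': [10, 20, 30, 40],
    'B': [5, 25, 35, 15],
    'C': [8, 22, 31, 50],
})

# Start from the numeric columns only, so 'name' is excluded
numeric_cols = df.select_dtypes(include='number').columns

# Keep columns whose mean is above the (arbitrary) threshold
selected = [c for c in numeric_cols if df[c].mean() > 21]
print(selected)  # ['A', 'C']

df['Max_Value'] = df[selected].max(axis=1)
print(df['Max_Value'].tolist())  # [10, 22, 31, 50]
```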
2. Managing Missing (NaN) Values
When your dataset contains NaN values, max() ignores them by default (skipna=True). If you would rather treat missing values as a specific number, fill them with that value before calculating the maximum:
df['Max_Value'] = df[['A', 'B', 'C']].fillna(0).max(axis=1)
Alternatively, pass skipna=False if the result should be NaN whenever any value in the row is NaN. With the default skipna=True, a row already yields NaN when all of its values are NaN.
df['Max_Value'] = df[['A', 'B', 'C']].max(axis=1, skipna=False)
Output:
A B C Max_Value
0 10 5 8 10
1 20 25 22 25
2 30 35 31 35
3 40 15 50 50
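The sample DataFrame has no missing values, so the three options give the same result on it. The sketch below uses a small hypothetical DataFrame, df_nan, that does contain NaN values, to show how the default behaviour, fillna(0), and skipna=False differ.

```python
import numpy as np
import pandas as pd

# Hypothetical data: row 1 has one NaN, row 2 is entirely NaN
df_nan = pd.DataFrame({
    'A': [10, np.nan, np.nan],
    'B': [5, 25, np.nan],
    'C': [8, 22, np.nan],
})
cols = ['A', 'B', 'C']

# Default: NaN is skipped; an all-NaN row still gives NaN
df_nan['Max_Default'] = df_nan[cols].max(axis=1)

# Missing values are treated as 0 before taking the maximum
df_nan['Max_Filled'] = df_nan[cols].fillna(0).max(axis=1)

# Any NaN in the row makes the result NaN
df_nan['Max_Strict'] = df_nan[cols].max(axis=1, skipna=False)

print(df_nan[['Max_Default', 'Max_Filled', 'Max_Strict']])
```

Note that filling with 0 changes the answer for rows whose real values are all negative, so choose the fill value with the data in mind.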
Conclusion
Adding a column for the row-wise maximum value in a Pandas DataFrame is straightforward. df.max(axis=1) is the simplest option and is efficient for most cases, numpy.maximum.reduce() with axis=1 can reduce overhead on larger datasets, and apply() offers flexibility when each row needs custom logic.