Python Pandas add a column for row-wise max value of selected columns

When working with Pandas DataFrames, you often need to find the maximum value across specific columns for each row and store the result in a new column. This operation is common in data analysis and reporting.


In this blog, we explore several ways to achieve this in Pandas: df.max(axis=1), apply(), and numpy.maximum.reduce().

Ways to Add a Column for Row-Wise Max Value

Let us explore a few ways to add a column for row-wise max value.

1. Creating a DataFrame

import pandas as pd

# Sample data that is used

data = {

    'A': [10, 20, 30, 40],

    'B': [5, 25, 35, 15],

    'C': [8, 22, 31, 50]

}

# Creating a DataFrame out of the given data

df = pd.DataFrame(data)

print(df)

Output:

    A   B   C

0  10   5   8

1  20  25  22

2  30  35  31

3  40  15  50

Now that the DataFrame is created, we can apply the following methods.

2. Using df.max(axis=1)

The simplest way to compute the row-wise maximum is to call max() with axis=1 on the selected columns:

import pandas as pd

data = {

    'A': [10, 20, 30, 40],

    'B': [5, 25, 35, 15],

    'C': [8, 22, 31, 50]

}

df = pd.DataFrame(data)

print(df)

df['Max_Value'] = df[['A', 'B', 'C']].max(axis=1)

print(df)

Output:

    A   B   C     Max_Value

0  10   5   8        10

1  20  25  22        25

2  30  35  31        35

3  40  15  50        50

3. Using apply() with max()

Another approach is to use apply() with max(). It is slower than df.max(axis=1) because the function is called once per row, but it is useful when the row-wise logic is more complex than a plain maximum.

df['Max_Value'] = df[['A', 'B', 'C']].apply(lambda row: row.max(), axis=1)

print(df)

Output:

    A   B   C     Max_Value

0  10   5   8        10

1  20  25  22        25

2  30  35  31        35

3  40  15  50        50
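
To illustrate the kind of row-wise logic where apply() is worth its overhead, here is a minimal sketch that takes the maximum only over values below a cutoff. The function name capped_max, the cutoff of 32, and the Capped_Max column are hypothetical and chosen just for this example.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [10, 20, 30, 40],
    'B': [5, 25, 35, 15],
    'C': [8, 22, 31, 50],
})

def capped_max(row, cutoff=32):
    # Hypothetical rule: drop values at or above the cutoff, then take the max.
    # If no value passes the filter, max() on the empty selection returns NaN.
    allowed = row[row < cutoff]
    return allowed.max()

df['Capped_Max'] = df[['A', 'B', 'C']].apply(capped_max, axis=1)
print(df['Capped_Max'].tolist())  # [10, 25, 31, 15]
```

For a rule this simple, a vectorized expression would also work; apply() becomes more attractive as the per-row logic grows.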

4. Using numpy.maximum.reduce() for Efficiency

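When you are working with large DataFrames, NumPy's maximum.reduce() function is a possible alternative. It operates directly on the underlying NumPy array and can avoid some Pandas overhead, although the actual gain depends on your data and is worth measuring. Pass axis=1 so the reduction runs across columns for each row, and note that np.maximum propagates NaN values instead of skipping them.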

import numpy as np

# Reduce across the columns of each row (axis=1); only the source columns are included
df['Max_Value'] = np.maximum.reduce(df[['A', 'B', 'C']].values, axis=1)

print(df)

Output:

    A   B   C     Max_Value

0  10   5   8        10

1  20  25  22        25

2  30  35  31        35

3  40  15  50        50
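
Because speed claims depend heavily on data size and dtype, it may help to time both approaches yourself. The sketch below is one way to do that, assuming a hypothetical DataFrame of 100,000 rows of random integers; the names big, with_pandas, and with_numpy are illustrative only, and no timings are quoted because they vary by machine.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical benchmark data: 100,000 rows of random integers
rng = np.random.default_rng(0)
cols = ['A', 'B', 'C']
big = pd.DataFrame(rng.integers(0, 1000, size=(100_000, 3)), columns=cols)

def with_pandas():
    return big[cols].max(axis=1)

def with_numpy():
    # axis=1 reduces across columns, giving one value per row
    return np.maximum.reduce(big[cols].to_numpy(), axis=1)

# Both approaches should agree before their speed is compared
assert (with_pandas().to_numpy() == with_numpy()).all()

for fn in (with_pandas, with_numpy):
    seconds = timeit.timeit(fn, number=20)
    print(f"{fn.__name__}: {seconds:.4f} s for 20 runs")
```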

Examples to Add a Column for Row-Wise Max Value

1. Selecting Particular Columns Dynamically

When you want the maximum for only some of the columns, store their names in a list and pass that list to the DataFrame. The list can be built dynamically, for example from a condition on the columns:

your_columns = ['A', 'C']  

df['Max_Value'] = df[your_columns].max(axis=1)

print(df)

Output:


    A   B   C     Max_Value

0  10   5   8        10

1  20  25  22        22

2  30  35  31        31

3  40  15  50        50
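
The column list can also be derived from a condition instead of being typed by hand. The following sketch shows one possible approach: it keeps only numeric columns and then only those whose mean exceeds a threshold. The Label column and the threshold of 21 are assumptions made up for this example.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [10, 20, 30, 40],
    'B': [5, 25, 35, 15],
    'C': [8, 22, 31, 50],
    'Label': ['w', 'x', 'y', 'z'],  # hypothetical non-numeric column
})

# Condition 1: keep only numeric columns
numeric_cols = df.select_dtypes(include='number').columns.tolist()

# Condition 2: keep numeric columns whose mean exceeds a hypothetical threshold
selected = [col for col in numeric_cols if df[col].mean() > 21]

df['Max_Value'] = df[selected].max(axis=1)
print(selected)                   # ['A', 'C']
print(df['Max_Value'].tolist())   # [10, 22, 31, 50]
```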

2. Managing Missing (NaN) Values

When your dataset contains NaN values, max() skips them by default (skipna=True). If you would rather treat missing values as a specific number, fill them with a default value before calculating the maximum:

df['Max_Value'] = df[['A', 'B', 'C']].fillna(0).max(axis=1)

Alternatively, if you want the result to be NaN whenever any value in the row is NaN, pass skipna=False:

df['Max_Value'] = df[['A', 'B', 'C']].max(axis=1, skipna=False)

Output (the sample data has no NaN values, so the result is unchanged):

    A   B   C     Max_Value

0  10   5   8        10

1  20  25  22        25

2  30  35  31        35

3  40  15  50        50
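
Since the sample data contains no missing values, the difference between these options is easier to see on data that does. The sketch below uses a small hypothetical DataFrame with one partially missing row and one fully missing row; the column names Max_Default, Max_Filled, and Max_Strict are illustrative.

```python
import numpy as np
import pandas as pd

# Hypothetical data with gaps: row 1 has one NaN, row 2 is entirely NaN
df = pd.DataFrame({
    'A': [10, np.nan, np.nan],
    'B': [5, 25, np.nan],
    'C': [8, 22, np.nan],
})
cols = ['A', 'B', 'C']

df['Max_Default'] = df[cols].max(axis=1)               # NaN skipped: 10, 25, NaN
df['Max_Filled'] = df[cols].fillna(0).max(axis=1)      # NaN treated as 0: 10, 25, 0
df['Max_Strict'] = df[cols].max(axis=1, skipna=False)  # any NaN gives NaN: 10, NaN, NaN
print(df)
```

Note that with the default settings, a row made up entirely of NaN values still yields NaN, because there is nothing left to compare.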

Conclusion

Adding a column for the row-wise maximum value in a Pandas DataFrame is straightforward. df.max(axis=1) is the simplest option and is efficient for most cases. numpy.maximum.reduce() with axis=1 can reduce overhead on larger datasets, though it does not skip NaN values. If you need flexibility, apply() handles more complex row-wise logic at the cost of speed.