How can I merge rows if a column data is same and change a value of another specific column on merged column efficiently in pandas?

Question

asked May 19, 2020 in Data Science by blackindya (18.4k points)

If columns are the same then I want to merge the rows. To do it I am using grouby command then replace the value of the column based on the condition given. Is there any other way better than this. Any suggestion is appreciated

This is the code:

data={'Name': {0: 'Sam', 1: 'Amy', 2: 'Cat', 3: 'Sam', 4: 'Kathy'},
'Subject1': {0: 'Math', 1: 'Science', 2: 'Art', 3: np.nan, 4: 'Science'},
'Subject2': {0: np.nan, 1: np.nan, 2: np.nan, 3: 'English', 4: np.nan},
'Result': {0: 'Pass', 1: 'Pass', 2: 'Fail', 3: 'TBD', 4: 'Pass'}}
df=pd.DataFrame(data)
df=df.groupby('Name').agg({
'Subject1': 'first',
'Subject2': 'first',
'Result': ', '.join}).reset_index()
df['Result']=df['Result'].apply(lambda x: 'RESULT_FAILED' if x=='Pass, TBD' else x )
data frame looks like this:
Name Subject1 Subject2 Result
0 Sam Math NaN Pass
1 Amy Science NaN Pass
2 Cat Art NaN Fail
3 Sam NaN English TBD
4 Kathy Science NaN Pass

1 Answer

supriya · Answer 1 · 2020-05-19T03:46:48+0000

In your sample, you have unique names that contain no NAN subjectX values. That is your SubjectX column has a single non-NAN value for duplicating the Name. So you can use the code below for the solution:

import numpy as np
df_final = (df.fillna('').groupby('Name', as_index=False).agg(''.join)
.replace({'':np.nan, 'PassTBD': 'RESULT_FAILED'}))
Out[16]:
Name Subject1 Subject2 Result
0 Amy Science NaN Pass
1 Cat Art NaN Fail
2 Kathy Science NaN Pass
3 Sam Math English RESULT_FAILED

If you are a beginner and want to know more about Data Science the do check out the Data Science course

How can I merge rows if a column data is same and change a value of another specific column on merged column efficiently in pandas?

1 Answer

Related questions

Browse Categories

Browse By Domains

Popular Courses

Popular Tutorials

Popular Resources