0 votes
2 views
in Data Science by (18.4k points)

I have a pandas DataFrame with a column containing the surname and given name(s) of several tennis players, like this:

   | Player              |
   |---------------------|
0  | 'Roddick Andy'      |
1  | 'Federer Roger'     |
2  | 'Tsonga Jo Wilfred' |

I want to keep the full surname and reduce the given name, and the middle name if there is one, to initials. The resulting column should look like this:

   | Player            |
   |-------------------|
0  | 'Roddick A.'      |
1  | 'Federer R.'      |
2  | 'Tsonga J.W.'     | N.B. J.W. with no space

Does anyone have any suggestions? 

1 Answer

0 votes
by (36.8k points)

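Here is an approach using str.extractall and groupby. Each regex match captures one pair of words separated by whitespace; for a player with a middle name, a second match captures an empty Surname together with the middle name. Taking the first Surname and concatenating the initials of every Name then gives the desired result: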

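(df.Player
   # one row per regex match; the raw string avoids invalid-escape warnings
   .str.extractall(r'(?P<Surname>\w*)\s(?P<Name>\w*)')
   # collect the matches that belong to the same original row
   .groupby(level=0)
   .agg({'Surname': 'first',
         # first letter of each name plus '.', concatenated without spaces
         'Name': lambda x: x.str[0].add('.').sum()
         })
   # join surname and initials with a single space
   .agg(' '.join, axis=1)
)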

Output:

0     Roddick A.
1     Federer R.
2    Tsonga J.W.
dtype: object
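
If the regex chain feels hard to follow, the same result can probably be reached with a plain Python helper applied to each value. The following is only a minimal sketch, not the method used above: the helper name abbreviate is made up for illustration, the sample DataFrame is rebuilt from the question, and it assumes every entry is a non-empty string with the surname as the first word (no missing values, no multi-word surnames).

```python
import pandas as pd

# Sample data rebuilt from the question
df = pd.DataFrame(
    {"Player": ["Roddick Andy", "Federer Roger", "Tsonga Jo Wilfred"]}
)


def abbreviate(full_name: str) -> str:
    """Keep the first word (surname) and turn each later word into an initial."""
    # Assumes full_name contains at least one word
    surname, *given = full_name.split()
    if not given:
        # Nothing to abbreviate, e.g. a single-word entry
        return surname
    # Initials are concatenated without spaces, e.g. "J.W."
    initials = "".join(f"{name[0]}." for name in given)
    return f"{surname} {initials}"


# Hypothetical usage: overwrite the column with the abbreviated names
df["Player"] = df["Player"].map(abbreviate)
print(df)
```

A helper like this is easier to extend (for example, to skip missing values) at the cost of running row by row in Python rather than through the vectorized string methods.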
