0 votes
in Data Science by (18.4k points)

I have a data frame named df1, having a column "name_str". Example below:


0 alp:ha

1 bra:vo

2 charl:ie

I need to create another column containing up to 5 characters, starting right after the colon (:). This is the code I used:

import pandas as pd

data = {'name_str':["alp:ha", "bra:vo", "charl:ie"]}

#indx = ["name_1",]

df1 = pd.DataFrame(data=data)

n= df1['name_str'].str.find(":")+1

df1['slize'] = df1['name_str'].str.slice(n,2)


But the output is disappointing: every row is NaN.

name_str slize

0 alp:ha NaN

1 bra:vo NaN

2 charl:ie NaN

The output should have been:

name_str slize

0 alp:ha ha

1 bra:vo vo

2 charl:ie ie

1 Answer

0 votes
by (36.8k points)

Use str.extract to extract everything after the colon with this regular expression: :(.*)

df1['slize'] = df1.name_str.str.extract(':(.*)')

>>> df1

name_str slize

0    alp:ha    ha

1    bra:vo    vo

2  charl:ie    ie
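As for why the original attempt returned NaN: str.slice is documented as taking plain integers for start and stop, and it applies the same positions to every row. Passing the Series n as start is therefore the likely cause of the NaN values, and stop=2 would end the slice before the colon anyway. If you would rather keep the find-based idea, here is a minimal sketch that slices each row with its own start position. The 5-character cap comes from the question, and the variable name start is just illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({"name_str": ["alp:ha", "bra:vo", "charl:ie"]})

# Position of the first character after the colon, one value per row.
# Note: str.find returns -1 when there is no colon, so start would be 0
# and the slice below would return the first 5 characters of the string.
start = df1["name_str"].str.find(":") + 1

# Slice each string with its own start, keeping at most 5 characters.
df1["slize"] = [s[i:i + 5] for s, i in zip(df1["name_str"], start)]

print(df1)
```

This runs a Python-level loop over the rows, so on large frames str.extract is usually the tidier choice.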


To keep at most 5 characters after the colon, limit the capture group:

df1['slize'] = df1.name_str.str.extract(':(.{,5})')
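If a regular expression feels heavy here, the same 5-character cap can be sketched without one, using str.split and positional slicing. This assumes the part you want starts after the first colon. The extra row "delta:foxtrot" is made-up sample data, added only to show the cap taking effect.

```python
import pandas as pd

# "delta:foxtrot" is a hypothetical extra row, not from the question.
df1 = pd.DataFrame({"name_str": ["alp:ha", "bra:vo", "charl:ie", "delta:foxtrot"]})

# Split once on the first colon, take the part after it,
# then keep at most the first 5 characters of that part.
after_colon = df1["name_str"].str.split(":", n=1).str[1]
df1["slize"] = after_colon.str[:5]

print(df1)
```

Rows without a colon have no second piece after the split, so they should come out as NaN rather than raising an error.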

