in Data Science by (18.4k points)

df:

    Col_A        Month
0   March 2020   Mar
1   March 20     Mar
2   Ebg 2020     Mar
3   17 GOFE      Mar
4   APR 17       Mar
5   16 HGN       Nov
6   2015 ref     May
7   18Jun        Jul

How can I replace the digits inside a string column of a pandas DataFrame? For example, I need to replace the digits in Col_A with 2019 or 19: if the run of digits in Col_A is 4 characters long, it should become 2019; otherwise it should become 19.

Output:

    Col_A        Month
0   March 2019   Mar
1   March 19     Mar
2   Ebg 2019     Mar
3   19 GOFE      Mar
4   APR 19       Mar
5   19 HGN       Nov
6   2019 ref     May
7   19Jun        Jul

1 Answer

0 votes
by (36.8k points)

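Here is how you can do it with Python's re module:

import pandas as pd
from re import sub, findall

df = pd.DataFrame(...)

# Raw strings (r'...') keep \d from being read as a string escape.
# If the value contains a four-digit run, replace it with 2019;
# otherwise replace a two-digit run with 19.
df['Col_A'] = [sub(r'\d\d\d\d', '2019', m)
               if findall(r'\d\d\d\d', m)
               else sub(r'\d\d', '19', m)
               for m in df['Col_A']]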
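To try this on the question's data, here is a minimal sketch that builds the sample frame by hand and applies the same check-then-substitute logic. The rows are copied from the question; the helper name `replace_year` is made up for illustration and is not part of pandas or re.

```python
import re

import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Col_A": ["March 2020", "March 20", "Ebg 2020", "17 GOFE",
              "APR 17", "16 HGN", "2015 ref", "18Jun"],
    "Month": ["Mar", "Mar", "Mar", "Mar", "Mar", "Nov", "May", "Jul"],
})


def replace_year(text):
    """Hypothetical helper: same logic as the list comprehension above."""
    # A four-digit run is present: replace it with 2019.
    if re.search(r"\d{4}", text):
        return re.sub(r"\d{4}", "2019", text)
    # Otherwise replace a two-digit run with 19.
    return re.sub(r"\d{2}", "19", text)


df["Col_A"] = [replace_year(m) for m in df["Col_A"]]
print(df)
```

On these eight rows the result should match the expected output in the question.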

UPDATE: Another way:

import pandas as pd
from re import sub, findall

df = pd.DataFrame(...)

# Same logic as above, applied per value with Series.map.
df['Col_A'] = df.Col_A.map(lambda m: sub('[0-9]{4}', '2019', m)
                           if findall('[0-9]{4}', m)
                           else sub('[0-9]{2}', '19', m))
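A possible variation, not from the original answer: `Series.str.replace` accepts a callable as the replacement when `regex=True`, so the length check can be done per matched run of digits in a single pass. The helper name `year_for` and the short sample Series are assumptions for illustration. Note that this sketch treats any run that is not exactly four digits as "19", so it may behave differently from the code above on runs of three, five, or more digits.

```python
import pandas as pd


def year_for(match):
    """Hypothetical helper: pick the replacement from the matched run's length."""
    return "2019" if len(match.group(0)) == 4 else "19"


# A few values taken from the question's Col_A.
col = pd.Series(["March 2020", "March 20", "18Jun", "2015 ref"])

# \d+ matches each whole run of digits; year_for decides what replaces it.
result = col.str.replace(r"\d+", year_for, regex=True)
print(result.tolist())
```

This avoids scanning each string twice (once to test, once to substitute), at the cost of a slightly less obvious replacement argument.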
