
df:

        Col_A        Month
    0   March 2020   Mar
    1   March 20     Mar
    2   Ebg 2020     Mar
    3   17 GOFE      Mar
    4   APR 17       Mar
    5   16 HGN       Nov
    6   2015 ref     May
    7   18Jun        Jul

How can I replace the digits inside a string column of a pandas DataFrame? For example, I need to replace the digits in Col_A with 2019 or 19: if the number in Col_A is 4 digits long it should become 2019, otherwise it should become 19.

Output:

        Col_A        Month
    0   March 2019   Mar
    1   March 19     Mar
    2   Ebg 2019     Mar
    3   19 GOFE      Mar
    4   APR 19       Mar
    5   19 HGN       Nov
    6   2019 ref     May
    7   19Jun        Jul

1 Answer


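You can do this with Python's re module:

import pandas as pd
from re import sub, search

df = pd.DataFrame(...)

# If the string contains a four-digit run, replace it with 2019;
# otherwise replace each two-digit run with 19.
# Raw strings (r'...') keep \d from being treated as a string escape.
df['Col_A'] = [sub(r'\d{4}', '2019', m)
               if search(r'\d{4}', m)
               else sub(r'\d{2}', '19', m)
               for m in df['Col_A']]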

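UPDATE: Another way, using Series.map with the same logic:

import pandas as pd
from re import sub, search

df = pd.DataFrame(...)

# Same rule as above, applied to each value through map().
df['Col_A'] = df['Col_A'].map(lambda m: sub(r'[0-9]{4}', '2019', m)
                              if search(r'[0-9]{4}', m)
                              else sub(r'[0-9]{2}', '19', m))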
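
A third option, offered as a sketch rather than the definitive way: pandas' Series.str.replace accepts a callable as the replacement when regex=True, so the length check can be made per match. The helper name to_2019 is made up for this example, and the DataFrame is rebuilt from the sample in the question. Note that this behaves slightly differently from the two snippets above: it treats every run of digits as one unit, so a run that is not exactly 4 digits long (for example 3 digits) becomes 19 as a whole.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Col_A": ["March 2020", "March 20", "Ebg 2020", "17 GOFE",
              "APR 17", "16 HGN", "2015 ref", "18Jun"],
    "Month": ["Mar", "Mar", "Mar", "Mar", "Mar", "Nov", "May", "Jul"],
})

def to_2019(match):
    # Hypothetical helper: a four-digit run becomes "2019",
    # any other run of digits becomes "19".
    return "2019" if len(match.group()) == 4 else "19"

# \d+ matches each whole run of digits; the callable picks the replacement.
df["Col_A"] = df["Col_A"].str.replace(r"\d+", to_2019, regex=True)
print(df)
```

Because the decision is made for each match rather than for each row, a value that contains both a 4-digit and a 2-digit number would have both replaced.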
