
I have a column in a data frame and I am trying to extract the first 8 digits from each string. How can I do it?

Input:

Shipment ID
20180504-S-20000
20180514-S-20537
20180514-S-20541
20180514-S-20644
20180514-S-20644
20180516-S-20009
20180516-S-20009
20180516-S-20009
20180516-S-20009

Output:

Order_Date
20180504
20180514
20180514
20180514
20180514
20180516
20180516
20180516
20180516

I tried the code below, but it didn't work:

data['Order_Date'] = data['Shipment ID'][:8]

1 Answer


You need to slice through the .str accessor, which applies the slice to each value of the Series:

data['Order_Date'] = data['Shipment ID'].str[:8]
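As a minimal, self-contained sketch of the same idea: the data frame below is a made-up sample (not the asker's real data), and the None entry is an assumption added only to show that .str slicing leaves missing values missing instead of raising an error.

```python
import pandas as pd

# Hypothetical sample data; the None stands in for a missing shipment ID.
data = pd.DataFrame(
    {"Shipment ID": ["20180504-S-20000", "20180514-S-20537", None]}
)

# .str[:8] slices each string element; the missing value stays missing.
data["Order_Date"] = data["Shipment ID"].str[:8]

print(data)
```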

If the column contains no NaN values, a list comprehension is usually faster:

data['Order_Date'] = [x[:8] for x in data['Shipment ID']]

print(data)

        Shipment ID Order_Date
0  20180504-S-20000   20180504
1  20180514-S-20537   20180514
2  20180514-S-20541   20180514
3  20180514-S-20644   20180514
4  20180514-S-20644   20180514
5  20180516-S-20009   20180516
6  20180516-S-20009   20180516
7  20180516-S-20009   20180516
8  20180516-S-20009   20180516
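The list comprehension fails on missing values because a plain x[:8] raises a TypeError when x is None or a float NaN. If the column might contain them, one possible workaround is to guard on the type. This is only a sketch with made-up sample data:

```python
import pandas as pd

# Hypothetical sample data with one missing shipment ID.
data = pd.DataFrame(
    {"Shipment ID": ["20180504-S-20000", None, "20180516-S-20009"]}
)

# Slice only real strings; pass missing values through unchanged.
data["Order_Date"] = [
    x[:8] if isinstance(x, str) else x for x in data["Shipment ID"]
]

print(data)
```

Whether this is still faster than .str[:8] depends on the data, so it is worth timing both on the real column.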

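Slicing the Series directly (without .str) selects rows by position instead, which is why the original attempt did not work. For example, this returns the first 2 rows, not the first 2 characters of each value:

print(data['Shipment ID'][:2])

0    20180504-S-20000
1    20180514-S-20537
Name: Shipment ID, dtype: object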
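To make the difference concrete, here is a small sketch on a made-up three-row frame (the variable names first_rows and first_chars are only for illustration). It also shows what the original assignment does: the row slice is aligned on the index, so rows outside the slice end up as NaN.

```python
import pandas as pd

# Hypothetical sample data.
data = pd.DataFrame(
    {"Shipment ID": ["20180504-S-20000", "20180514-S-20537", "20180514-S-20541"]}
)

# Slicing the Series selects rows: the first 2 rows, full strings.
first_rows = data["Shipment ID"][:2]

# Slicing through .str selects characters: the first 2 characters of every row.
first_chars = data["Shipment ID"].str[:2]

# Assigning the row slice aligns on the index, so row 2 has no match and becomes NaN.
data["Order_Date"] = data["Shipment ID"][:2]

print(first_rows)
print(first_chars)
print(data)
```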
