Back

Explore Courses Blog Tutorials Interview Questions
0 votes
2 views
in Data Science by (17.6k points)

I'm working with a large csv file and the next to last column has a string of text that I want to split by a specific delimiter. I was wondering if there is a simple way to do this using pandas or python?

CustNum  CustomerName     ItemQty  Item   Seatblocks                 ItemExt

32363    McCartney, Paul      3     F04    2:218:10:4,6                   60

31316    Lennon, John        25     F01    1:13:36:1,12 1:13:37:1,13     300

I want to split by the space (' ') and then the colon (':') in the Seatblocks column, but each cell would result in a different number of columns. I have a function to rearrange the columns so the Seatblocks column is at the end of the sheet, but I'm not sure what to do from there. I can do it in excel with the built in text-to-columns function and a quick macro, but my dataset has too many records for excel to handle.

Ultimately, I want to take records such John Lennon's and create multiple lines, with the info from each set of seats on a separate line.

1 Answer

0 votes
by (41.4k points)

Using the below code splits the Seatblocks by space and gives each its own row.

In [129]: df

Out[129]: 

   CustNum     CustomerName  ItemQty Item                 Seatblocks  ItemExt

0    32363  McCartney, Paul        3  F04               2:218:10:4,6       60

1    31316     Lennon, John       25  F01  1:13:36:1,12 1:13:37:1,13      300

In [130]: s = df['Seatblocks'].str.split(' ').apply(Series, 1).stack()

In [131]: s.index = s.index.droplevel(-1) # to line up with df's index

In [132]: s.name = 'Seatblocks' # needs a name to join

In [133]: s

Out[133]: 

0    2:218:10:4,6

1    1:13:36:1,12

1    1:13:37:1,13

Name: Seatblocks, dtype: object

In [134]: del df['Seatblocks']

In [135]: df.join(s)

Out[136]: 

   CustNum     CustomerName  ItemQty Item  ItemExt    Seatblocks

0    32363  McCartney, Paul        3  F04       60  2:218:10:4,6

1    31316     Lennon, John       25  F01      300  1:13:36:1,12

1    31316     Lennon, John       25  F01      300  1:13:37:1,13

If you want to learn more about Pandas then visit this Python Course designed by the industrial experts.

Welcome to Intellipaat Community. Get your technical queries answered by top developers!

30.5k questions

32.6k answers

500 comments

108k users

Browse Categories

...