Intellipaat Back

Explore Courses Blog Tutorials Interview Questions
0 votes
2 views
in Data Science by (17.6k points)

I'm working with a large csv file and the next to last column has a string of text that I want to split by a specific delimiter. I was wondering if there is a simple way to do this using pandas or python?

CustNum  CustomerName     ItemQty  Item   Seatblocks                 ItemExt

32363    McCartney, Paul      3     F04    2:218:10:4,6                   60

31316    Lennon, John        25     F01    1:13:36:1,12 1:13:37:1,13     300

I want to split by the space (' ') and then the colon (':') in the Seatblocks column, but each cell would result in a different number of columns. I have a function to rearrange the columns so the Seatblocks column is at the end of the sheet, but I'm not sure what to do from there. I can do it in excel with the built in text-to-columns function and a quick macro, but my dataset has too many records for excel to handle.

Ultimately, I want to take records such John Lennon's and create multiple lines, with the info from each set of seats on a separate line.

1 Answer

0 votes
by (41.4k points)

Using the below code splits the Seatblocks by space and gives each its own row.

In [129]: df

Out[129]: 

   CustNum     CustomerName  ItemQty Item                 Seatblocks  ItemExt

0    32363  McCartney, Paul        3  F04               2:218:10:4,6       60

1    31316     Lennon, John       25  F01  1:13:36:1,12 1:13:37:1,13      300

In [130]: s = df['Seatblocks'].str.split(' ').apply(Series, 1).stack()

In [131]: s.index = s.index.droplevel(-1) # to line up with df's index

In [132]: s.name = 'Seatblocks' # needs a name to join

In [133]: s

Out[133]: 

0    2:218:10:4,6

1    1:13:36:1,12

1    1:13:37:1,13

Name: Seatblocks, dtype: object

In [134]: del df['Seatblocks']

In [135]: df.join(s)

Out[136]: 

   CustNum     CustomerName  ItemQty Item  ItemExt    Seatblocks

0    32363  McCartney, Paul        3  F04       60  2:218:10:4,6

1    31316     Lennon, John       25  F01      300  1:13:36:1,12

1    31316     Lennon, John       25  F01      300  1:13:37:1,13

If you want to learn more about Pandas then visit this Python Course designed by the industrial experts.

31k questions

32.8k answers

501 comments

693 users

Browse Categories

...