0 votes
in Data Science by (43.2k points)

I have a pandas DataFrame with a column named 'City, State, Country'. I want to separate this column into three new columns: 'City', 'State' and 'Country'.

0                 HUN
1                 ESP
2                 GBR
3                 ESP
4                 FRA
5             ID, USA
6             GA, USA
7    Hoboken, NJ, USA
8             NJ, USA
9                 AUS

Splitting the column into three columns is trivial enough:

location_df = df['City, State, Country'].apply(lambda x: pd.Series(x.split(',')))

However, this creates left-aligned data:

         0    1    2
0      HUN  NaN  NaN
1      ESP  NaN  NaN
2      GBR  NaN  NaN
3      ESP  NaN  NaN
4      FRA  NaN  NaN
5       ID  USA  NaN
6       GA  USA  NaN
7  Hoboken   NJ  USA
8       NJ  USA  NaN
9      AUS  NaN  NaN

How would one go about creating the new columns with the data right-aligned? Would I need to iterate through every row, count the number of commas and handle the contents individually?

1 Answer

0 votes
by (92.8k points)

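You can reverse the split parts of each value, so that the country always lands in the first column, followed by the state and then the city: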

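import pandas as pd

# Split on commas, strip stray whitespace, and reverse so the country comes first
foo = lambda x: pd.Series([part.strip() for part in reversed(x.split(','))])

rev = df['City, State, Country'].apply(foo)

print(rev)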

     0    1        2
0  HUN  NaN      NaN
1  ESP  NaN      NaN
2  GBR  NaN      NaN
3  ESP  NaN      NaN
4  FRA  NaN      NaN
5  USA   ID      NaN
6  USA   GA      NaN
7  USA   NJ  Hoboken
8  USA   NJ      NaN
9  AUS  NaN      NaN
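
If you want the columns in the original City, State, Country order with the values right-aligned, one possible variation is to pad each split list on the left instead of reversing it. The sketch below is only an illustration: the helper name split_right_aligned is made up for this example, the sample DataFrame is rebuilt from the data in the question, and it assumes no value has more than three comma-separated parts.

```python
import pandas as pd

# Sample data rebuilt from the question
df = pd.DataFrame({
    "City, State, Country": [
        "HUN", "ESP", "GBR", "ESP", "FRA",
        "ID, USA", "GA, USA", "Hoboken, NJ, USA", "NJ, USA", "AUS",
    ]
})

COLUMNS = ["City", "State", "Country"]

def split_right_aligned(value):
    """Hypothetical helper: split on commas and left-pad with None.

    Assumes at most three comma-separated parts per value.
    """
    parts = [part.strip() for part in value.split(",")]
    padded = [None] * (len(COLUMNS) - len(parts)) + parts
    return pd.Series(padded, index=COLUMNS)

location_df = df["City, State, Country"].apply(split_right_aligned)
print(location_df)
```

Because every row returns a Series with the same three labels, the country always ends up in the last column and the missing city or state values are left empty at the front of the row.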
