I'm using Python 2.7 and reading a CSV file into a DataFrame. I need to strip whitespace from all the string-like cells, leaving the other cells unchanged.

Here is what I'm doing:

def remove_whitespace(x):
    if isinstance(x, basestring):
        return x.strip()
    else:
        return x

my_data = my_data.applymap(remove_whitespace)

Is there a better or more idiomatic pandas way to do this?

Is there a more efficient way (perhaps by working column-wise)?

I've tried searching for a definitive answer, but most questions on this topic seem to be about how to strip whitespace from the column names themselves, or assume the cells are all strings.

1 Answer

You could use pandas' Series.str.strip() method to do this quickly for each string-like column:

>>> import pandas as pd
>>> data = pd.DataFrame({'values': ['   ABC   ', '   DEF', '  GHI  ']})
>>> data
      values
0     ABC   
1        DEF
2      GHI  
>>> data['values'].str.strip()
0    ABC
1    DEF
2    GHI
Name: values, dtype: object
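
To apply this across a whole DataFrame while leaving non-string cells unchanged, something like the sketch below might work. The helper name strip_string_columns and the sample frame are made up for illustration. It assumes unique column names and that every object column holds at least some strings, which is what you would normally get from read_csv. Because .str.strip() returns NaN for non-string entries in a mixed column, the sketch falls back to the original value in those positions.

```python
import pandas as pd
from pandas.api.types import is_object_dtype, is_string_dtype


def strip_string_columns(frame):
    """Return a copy of `frame` with whitespace stripped from string cells.

    Hypothetical helper: assumes unique column names and that object
    columns contain at least some strings.
    """
    result = frame.copy()
    for name in result.columns:
        column = result[name]
        # Numeric, boolean and datetime columns are left untouched.
        if not (is_object_dtype(column) or is_string_dtype(column)):
            continue
        stripped = column.str.strip()
        # .str.strip() yields NaN for non-string entries, so keep the
        # original value wherever the stripped result is missing.
        result[name] = stripped.where(stripped.notna(), column)
    return result


# Illustrative data standing in for the frame read from the CSV file.
my_data = pd.DataFrame({
    "name": ["  ABC  ", "DEF   ", "   GHI"],
    "count": [1, 2, 3],
    "mixed": ["  x ", 5, None],
})
cleaned = strip_string_columns(my_data)
print(cleaned)
```

This only touches the string-like columns, so numeric columns skip the per-cell Python call that applymap would make. Whether that is noticeably faster will depend on the data, so it is worth timing on your own file.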
