I tried passing the dtype parameter to read_csv as dtype={n: pandas.Categorical}, but it does not work: the resulting column has dtype object instead of category. The manual is unclear on this.

1 Answer

Since pandas 0.19.0, you can pass dtype='category' to read_csv to read every column as categorical:

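import io  # standard-library StringIO; pd.compat.StringIO was removed in later pandas releases
import pandas as pd

data = 'col1,col2,col3\na,b,1\na,b,2\nc,d,3'

# Read every column as a categorical
df = pd.read_csv(io.StringIO(data), dtype='category')

print(df)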

  col1 col2 col3
0    a    b    1
1    a    b    2
2    c    d    3

print(df.dtypes)

col1    category
col2    category
col3    category
dtype: object
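
To confirm the conversion programmatically rather than by reading the printed dtypes, something like the following sketch could be used. It assumes a reasonably recent pandas where pd.CategoricalDtype is available at the top level; the variable names all_categorical and col1_categories are only illustrative.

```python
import io
import pandas as pd

data = 'col1,col2,col3\na,b,1\na,b,2\nc,d,3'
df = pd.read_csv(io.StringIO(data), dtype='category')

# Every column should report a categorical dtype.
all_categorical = all(isinstance(t, pd.CategoricalDtype) for t in df.dtypes)
print(all_categorical)

# The distinct values found in col1 become its categories.
col1_categories = list(df['col1'].cat.categories)
print(col1_categories)
```

The .cat accessor is only available on categorical columns, so calling it on a column that was read as object raises an AttributeError, which is another quick way to notice that the dtype was not applied.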

To convert only specific columns to category, pass dtype a dictionary that maps column names to dtypes. Columns not listed in the dictionary keep their inferred dtype:

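import io  # standard-library StringIO; pd.compat.StringIO was removed in later pandas releases

# Only col1 is read as a categorical
df = pd.read_csv(io.StringIO(data), dtype={'col1': 'category'})

print(df)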

  col1 col2  col3
0    a    b     1
1    a    b     2
2    c    d     3

print(df.dtypes)

col1    category
col2      object
col3       int64
dtype: object
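
As for the original attempt: pandas.Categorical is the array class, not a dtype, which is likely why passing it as dtype did not have the intended effect. If the goal is to control the categories or their order at read time, a CategoricalDtype instance can be passed in the dictionary instead of the string 'category'. The sketch below assumes a pandas version that accepts CategoricalDtype in read_csv (newer than the 0.19.0 mentioned above); the category list, which includes an unused value 'b', is purely hypothetical.

```python
import io
import pandas as pd
from pandas.api.types import CategoricalDtype

data = 'col1,col2,col3\na,b,1\na,b,2\nc,d,3'

# Hypothetical category set: 'b' never appears in col1 but is still
# registered as a valid category, and the categories are ordered.
col1_type = CategoricalDtype(categories=['a', 'b', 'c'], ordered=True)

df = pd.read_csv(io.StringIO(data), dtype={'col1': col1_type})

print(df.dtypes)
print(df['col1'].cat.categories)
```

Declaring the categories up front keeps them consistent across several files, even when a given file does not contain every value.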
