
I have this:

import pandas as pd

df = pd.DataFrame(dict(person=['andy', 'rubin', 'ciara', 'jack'],
                       item=['a', 'b', 'a', 'c'],
                       group=['c1', 'c2', 'c3', 'c1'],
                       age=[23, 24, 19, 49]))

df:

    age group item person
0   23  c1    a    andy
1   24  c2    b    rubin
2   19  c3    a    ciara
3   49  c1    c    jack

What I want is the number of unique values in each column. I know I can do something like:

len(df.person.unique())

for every column.

Is there a way to do this in one go for all columns?

I tried to do:

for column in df.columns:
    print(len(df.column.unique()))

but I know this is not right.

How can I accomplish this?

1 Answer


To count the unique values in every column at once, apply pd.Series.nunique to the DataFrame:

df.apply(pd.Series.nunique)

age       4
group     3
item      3
person    4
dtype: int64
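Two alternatives may also be worth knowing; the following is a minimal sketch, assuming a reasonably recent pandas in which DataFrame.nunique is available. It shows the direct df.nunique() call and a corrected version of the loop from the question. The loop in the question fails because df.column looks for a column literally named "column"; indexing with df[column] uses the value of the loop variable instead. The names counts and loop_counts are only illustrative.

```python
import pandas as pd

# Same data as in the question.
df = pd.DataFrame({
    "person": ["andy", "rubin", "ciara", "jack"],
    "item": ["a", "b", "a", "c"],
    "group": ["c1", "c2", "c3", "c1"],
    "age": [23, 24, 19, 49],
})

# DataFrame.nunique counts the distinct non-null values in each column.
counts = df.nunique()
print(counts)

# Corrected loop: df[column] looks the column up by the loop variable's value.
loop_counts = {column: len(df[column].unique()) for column in df.columns}
print(loop_counts)
```

One difference to keep in mind: nunique ignores missing values by default (dropna=True), while unique() includes NaN among its results, so the two approaches can disagree on columns that contain missing data. This example has none, so both give the same counts.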
