I have this:

import pandas as pd

df = pd.DataFrame(dict(person=['andy', 'rubin', 'ciara', 'jack'],
                       item=['a', 'b', 'a', 'c'],
                       group=['c1', 'c2', 'c3', 'c1'],
                       age=[23, 24, 19, 49]))

df:

    age group item person
0   23  c1    a    andy
1   24  c2    b    rubin
2   19  c3    a    ciara
3   49  c1    c    jack

What I want is the number of unique items in each column. I know I can do something like:

len(df.person.unique())

for every column.

Is there a way to do this in one go for all columns?

I tried to do:

for column in df.columns:
    print(len(df.column.unique()))

but I know this is not right.

How can I accomplish this?

1 Answer


To count the unique items in every column in one go, apply pd.Series.nunique to each column with df.apply:

df.apply(pd.Series.nunique)

age       4
group     3
item      3
person    4
dtype: int64
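
As a minimal sketch, assuming a reasonably recent pandas, the same counts are also available from DataFrame.nunique directly, and the loop from the question works once the column is looked up with df[column] rather than df.column (attribute access looks for a column literally named "column"). The names counts and loop_counts are only illustrative. Column order in the result may differ from the output above, since newer pandas versions keep the dict's insertion order.

```python
import pandas as pd

df = pd.DataFrame({
    "person": ["andy", "rubin", "ciara", "jack"],
    "item": ["a", "b", "a", "c"],
    "group": ["c1", "c2", "c3", "c1"],
    "age": [23, 24, 19, 49],
})

# Number of distinct values per column, returned as a Series indexed by column name.
counts = df.nunique()

# The loop from the question, fixed: index with df[column] instead of df.column.
loop_counts = {column: len(df[column].unique()) for column in df.columns}

print(counts)
print(loop_counts)
```

One difference to keep in mind: nunique ignores missing values by default (dropna=True), while len(df[column].unique()) counts NaN as a value, so the two can disagree on columns that contain NaN.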
