in Data Science by (17.6k points)

I am creating a DataFrame named "salesdata" that has a column named "Outlet_Size", and this column contains some missing data. This is my code:

# Define a dictionary
cat_dict = {}

# Get all the values of the column
outlet_size_values = salesdata.Outlet_Size.values
unique_outlet_size_val = list(set(outlet_size_values))
print(unique_outlet_size_val)

The output I am getting is [nan, 'High', 'Medium', 'Small']. I don't want the missing value (nan) to be part of my list, and I don't want to create a new list just to remove it.

1 Answer

by (41.4k points)

You can chain two pandas methods: dropna removes the NaN values, and unique then returns each remaining value once, which is the equivalent of building a set.

salesdata.Outlet_Size.dropna().unique()
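To make the one-liner above concrete, here is a minimal, self-contained sketch. The sample rows are made up for illustration (only the names "salesdata" and "Outlet_Size" come from the question), and the trailing tolist() call is an optional addition for the case where a plain Python list is wanted, as in the original code.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's data; only the column name is from the question.
salesdata = pd.DataFrame(
    {"Outlet_Size": ["Medium", np.nan, "High", "Small", "Medium", np.nan]}
)

# dropna() discards the missing entries; unique() keeps each remaining value once,
# in order of first appearance; tolist() converts the result to a plain Python list.
unique_outlet_size_val = salesdata["Outlet_Size"].dropna().unique().tolist()

print(unique_outlet_size_val)  # ['Medium', 'High', 'Small']
```

Because the whole expression is chained, no intermediate list has to be created and filtered by hand.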

Alternatively, you can use numpy.unique, which returns the unique values in sorted order:

import pandas as pd
import numpy as np

np.unique(salesdata.Outlet_Size.dropna().values)
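The two approaches can differ in the order of the result, which may matter if the list is later used to build a mapping. The sketch below uses made-up sample rows to compare them; the variable names are illustrative, and to_numpy() is used here in place of .values as an assumption about what works consistently across pandas versions.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column name comes from the question.
salesdata = pd.DataFrame(
    {"Outlet_Size": ["Small", np.nan, "High", "Medium", "Small"]}
)

# Drop the missing values first so that only strings remain.
sizes = salesdata["Outlet_Size"].dropna()

# pandas unique(): values in order of first appearance.
in_order = sizes.unique().tolist()

# numpy.unique(): values in sorted order.
sorted_vals = np.unique(sizes.to_numpy()).tolist()

print(in_order)     # ['Small', 'High', 'Medium']
print(sorted_vals)  # ['High', 'Medium', 'Small']
```

Either result contains the same values without nan; choose the pandas version to preserve the order in the data, or the NumPy version for a sorted result.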
