
I have a DataFrame named "salesdata" with a column named "Outlet_Size", and this column contains some missing values. This is my code:

#defining a dictionary

cat_dict ={}

#getting all the values of the column

outlet_size_values = salesdata.Outlet_Size.values

unique_outlet_size_val = list(set(outlet_size_values))  

print(unique_outlet_size_val)

The output I am getting is [nan, 'High', 'Medium', 'Small']. I don't want the missing value (nan) to be part of my list, and I don't want to create a new list just to filter it out.

1 Answer


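You can chain two pandas methods: dropna to remove the NaN values, then unique to get the set-equivalent result. Note that unique returns a NumPy array rather than a list, in order of first appearance; append .tolist() if you need a plain Python list.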

salesdata.Outlet_Size.dropna().unique()
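
To see the one-liner above in action, here is a minimal sketch. The sample rows are made up for illustration (the real contents of salesdata are not shown in the question), and .tolist() is added only because the question asks for a list.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for the asker's salesdata frame.
salesdata = pd.DataFrame(
    {"Outlet_Size": ["Medium", np.nan, "High", "Small", "Medium", np.nan]}
)

# dropna() discards the missing entries; unique() then returns the distinct
# values in order of first appearance; tolist() converts the array to a list.
unique_outlet_size_val = salesdata["Outlet_Size"].dropna().unique().tolist()

print(unique_outlet_size_val)  # ['Medium', 'High', 'Small']
```

No intermediate list is built by hand here: the result is assigned once, directly from the chained call.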

Alternatively, you can use numpy.unique, which returns the distinct values in sorted order:

import numpy as np

np.unique(salesdata.Outlet_Size.dropna().values)
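
The NumPy variant differs from the pandas one mainly in ordering, which the following sketch illustrates. As before, the sample rows are invented for the example and are not the asker's actual data.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for the asker's salesdata frame.
salesdata = pd.DataFrame(
    {"Outlet_Size": ["Medium", np.nan, "High", "Small", "Medium"]}
)

# Drop the missing entries first, then let np.unique deduplicate and sort.
sorted_sizes = np.unique(salesdata["Outlet_Size"].dropna().values).tolist()

print(sorted_sizes)  # ['High', 'Medium', 'Small']
```

Calling dropna() before np.unique matters: it keeps NaN out of the result and avoids comparing floats with strings during the sort.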
