in Data Science by (18.4k points)
I am a student and I have started learning data science using Jupyter Notebook. I downloaded a dataset from the internet; it has 5 columns and more than 10,000 records, and it also contains NA values. I am trying to remove those values in my Jupyter notebook but I am not getting the desired results. Can anyone help me solve this?

1 Answer

0 votes
by (36.8k points)

When you are working with a large dataset, it is common to find NA values. An NA value indicates that nothing was filled in for that particular cell. These values should be handled before you go further, because if you build machine learning models on top of a dataset that still contains them, you are unlikely to get the desired results.
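Before removing anything, it can help to see how many NA values there are and where they sit. The following is a minimal sketch on a small made-up DataFrame (the name `df` and its values are only for illustration, not your real dataset); it uses `isna()` to count missing values per column and the number of rows that contain at least one.

```python
import numpy as np
import pandas as pd

# Illustrative data only: column B has a NaN and column C has a NaT
df = pd.DataFrame({
    "A": [1, 2, 3, 4, 5],
    "B": [1, 2, np.nan, 4, 5],
    "C": [1, 2, 3, pd.NaT, 5],
})

# Number of missing values in each column
na_per_column = df.isna().sum()

# Number of rows that have at least one missing value
rows_with_na = int(df.isna().any(axis=1).sum())

print(na_per_column)
print("Rows with at least one NA:", rows_with_na)
```

If the counts come back as zero on your real data, the missing entries may be stored as text placeholders rather than real NA values, which would be worth checking before calling `dropna()`.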

To remove the NA values from the dataset, you can use the code below. I am using a small DataFrame called Test as my dataset, as shown:

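import numpy as np
import pandas as pd

# Small example dataset: column B contains a NaN and column C contains a NaT
Test = pd.DataFrame({'A': [1, 2, 3, 4, 5], 'B': [1, 2, np.nan, 4, 5], 'C': [1, 2, 3, pd.NaT, 5]})
print(Test)

# dropna() returns a new DataFrame without the rows that contain any NA value,
# so the result is assigned back to Test
Test = Test.dropna()
print(Test)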

Output (after dropna()):

   A    B  C
0  1  1.0  1
1  2  2.0  2
4  5  5.0  5
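If `dropna()` seems to have no effect in your notebook, one possible cause (an assumption, since the question does not show the code you ran) is that the result was not assigned back: `dropna()` returns a new DataFrame and leaves the original untouched. The sketch below shows that, along with a few standard `dropna()` options; the variable names are only for illustration.

```python
import numpy as np
import pandas as pd

Test = pd.DataFrame({
    "A": [1, 2, 3, 4, 5],
    "B": [1, 2, np.nan, 4, 5],
    "C": [1, 2, 3, pd.NaT, 5],
})

# Calling dropna() without assigning the result leaves Test unchanged
Test.dropna()
unchanged_rows = len(Test)

# Drop rows only when a specific column is missing
only_b = Test.dropna(subset=["B"])

# Keep rows that have at least 2 non-NA values
at_least_two = Test.dropna(thresh=2)

# Drop every row with any NA and renumber the index from 0
cleaned = Test.dropna().reset_index(drop=True)

print(unchanged_rows, len(only_b), len(at_least_two))
print(cleaned)
```

Which option fits depends on how much data you can afford to lose: dropping on a `subset` of columns keeps more rows than a plain `dropna()`.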



...