Continuous or categorical data in data science

Question

asked Jul 13, 2019 in Data Science by sourav (17.6k points)

I am building an automated cleaning process that cleans null values from a dataset. I found a few functions, such as mode, median, and mean, that can be used to fill NaN values in the data. But which one should I select? If the data is categorical, it has to be either mode or median, while for continuous data it has to be mean or median. So, to determine whether the data is categorical or continuous, I decided to build a machine learning classification model.
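The fill rule described above can be sketched as follows. This is only a minimal illustration, assuming missing entries are represented as `None` and the column type is already known; `fill_missing` and the `kind` labels are hypothetical names, and the sketch uses mean for continuous data and mode for categorical data (median would be a drop-in alternative for either).

```python
import statistics


def fill_missing(values, kind):
    """Return a copy of `values` with None entries filled.

    kind: "continuous" -> fill with the mean of the observed values
          "categorical" -> fill with the most frequent observed value
    (Both labels are hypothetical names used only for this sketch.)
    """
    observed = [v for v in values if v is not None]
    if not observed:
        # Nothing to learn a fill value from; return the column unchanged.
        return list(values)

    if kind == "continuous":
        fill = statistics.mean(observed)
    elif kind == "categorical":
        fill = statistics.mode(observed)
    else:
        raise ValueError(f"unknown kind: {kind!r}")

    return [fill if v is None else v for v in values]


ages = fill_missing([1.0, None, 3.0], "continuous")
colors = fill_missing(["a", "b", None, "a"], "categorical")
```

Swapping `statistics.mean` for `statistics.median` makes the continuous fill less sensitive to outliers.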

I used the following features:
1) standard deviation of the data
2) number of unique values in the data
3) total number of rows in the data
4) ratio of unique values to total rows
5) minimum value of the data
6) maximum value of the data
7) number of values between the median and the 75th percentile
8) number of values between the median and the 25th percentile
9) number of values between the 75th percentile and the upper whisker
10) number of values between the 25th percentile and the lower whisker
11) number of values above the upper whisker
12) number of values below the lower whisker
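The twelve features listed above could be computed for one numeric column roughly like this. It is a sketch under assumptions the question does not state: whiskers are taken as the usual box-plot fences at 1.5 times the interquartile range, quartiles use the "inclusive" method of `statistics.quantiles`, values equal to the median are not counted in either median bucket, and `column_features` and the dictionary keys are hypothetical names.

```python
import statistics


def column_features(values):
    """Compute the 12 summary features for one numeric column (needs >= 2 values)."""
    xs = sorted(float(v) for v in values)
    n = len(xs)
    q1, med, q3 = statistics.quantiles(xs, n=4, method="inclusive")

    # Assumption: whiskers follow the 1.5 * IQR box-plot convention.
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr

    n_unique = len(set(xs))
    return {
        "std": statistics.pstdev(xs),
        "n_unique": n_unique,
        "n_rows": n,
        "unique_ratio": n_unique / n,
        "min": xs[0],
        "max": xs[-1],
        "median_to_q3": sum(1 for x in xs if med < x <= q3),
        "q1_to_median": sum(1 for x in xs if q1 <= x < med),
        "q3_to_upper_whisker": sum(1 for x in xs if q3 < x <= upper),
        "lower_whisker_to_q1": sum(1 for x in xs if lower <= x < q1),
        "above_upper_whisker": sum(1 for x in xs if x > upper),
        "below_lower_whisker": sum(1 for x in xs if x < lower),
    }


features = column_features([1, 2, 3, 4, 5, 6, 7, 8, 100])
```

Each column of a dataset would give one such feature row, with the label (continuous or categorical) supplied by hand for training.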

First, with these 12 features and around 55 training samples, I trained a logistic regression model on the normalized features to predict label 1 (continuous) or 0 (categorical).
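With only about 55 samples, a single train/test split can look better than it really is, so cross-validation is one way to check a model like this. The sketch below assumes scikit-learn and NumPy are installed. The feature matrix `X` and labels `y` are random placeholder data, not the dataset from the question, and the pipeline is just one reasonable setup rather than the asker's exact code.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Placeholder data only: 55 "columns", each described by 12 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(55, 12))
# Synthetic labels: 1 = continuous, 0 = categorical (made up for illustration).
y = (X[:, 3] + 0.5 * rng.normal(size=55) > 0).astype(int)

# Scaling lives inside the pipeline so it is re-fit on each training fold,
# which avoids leaking statistics from the held-out fold.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(model, X, y, cv=cv, scoring="accuracy")
print("mean accuracy:", scores.mean(), "std:", scores.std())
```

A large spread between folds would suggest that 55 samples are too few to trust the result.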

The fun part is that it worked!

But did I do it the right way? Is this a correct method to predict the nature of the data? Please advise me on how I could improve it further.

1 Answer

Shlok Pandey · Answer 1 · 2019-07-20T08:59:49+0000

Below is a better approach that can help you take this system further, though it is somewhat time-consuming.

For each column with missing data, find the nearest neighbor and replace the missing value with that neighbor's value. Suppose you have k columns, excluding the target. For each column, treat it as the dependent variable and the remaining k-1 columns as the independent variables.

Then find the nearest neighbor of the row with the missing value; that neighbor's value in the column is the desired value for the missing attribute.
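The nearest-neighbor idea above can be sketched in plain Python as follows. This is one possible reading of the approach, not a definitive implementation: it assumes numeric columns, `None` for missing values, and Euclidean distance averaged over the columns both rows have observed. `nearest_neighbor_impute` is a hypothetical name.

```python
import math


def nearest_neighbor_impute(rows):
    """Fill None entries with the value from the nearest row that has that column.

    rows: list of equal-length lists of numbers, with None for missing values.
    """
    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            best_dist, best_val = math.inf, None
            for k, other in enumerate(rows):
                # A donor must be a different row with column j observed.
                if k == i or other[j] is None:
                    continue
                # Compare only on the other columns observed in both rows.
                shared = [
                    c for c in range(len(row))
                    if c != j and row[c] is not None and other[c] is not None
                ]
                if not shared:
                    continue
                dist = math.sqrt(
                    sum((row[c] - other[c]) ** 2 for c in shared) / len(shared)
                )
                if dist < best_dist:
                    best_dist, best_val = dist, other[j]
            if best_val is not None:
                filled[i][j] = best_val
    return filled


data = [
    [1.0, 10.0, 100.0],
    [1.1, None, 101.0],
    [5.0, 50.0, 500.0],
]
completed = nearest_neighbor_impute(data)
```

In practice the columns should be put on a comparable scale first, otherwise the column with the largest range dominates the distance. scikit-learn's `KNNImputer` offers a ready-made variant that averages over several neighbors.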

