Data Science: How to process and clean data when csv file has thousands of columns

Question

1 Answer

supriya · Answer 1 · 2020-02-07T12:17:13+0000

In machine learning, statistics, and information theory, dimensionality reduction is the process of reducing the number of random variables under consideration by obtaining a set of principal variables.

Models with a very large number of features are generally not preferred for training, since the extra features tend to reduce accuracy and also make training more costly.
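One common way to get such a set of principal variables is principal component analysis (PCA). The following is only a minimal sketch, assuming the columns are numeric; the toy data, the variable names, and the 95% variance threshold are made up for illustration and are not part of the original question.

```r
# Toy data (hypothetical): 100 rows, 50 numeric columns driven by 3 hidden factors
set.seed(42)
latent   <- matrix(rnorm(100 * 3), nrow = 100)
loadings <- matrix(rnorm(3 * 50), nrow = 3)
noise    <- matrix(rnorm(100 * 50, sd = 0.1), nrow = 100)
wide     <- as.data.frame(latent %*% loadings + noise)

# Drop constant columns, since they cannot be scaled to unit variance
keep <- vapply(wide, function(col) var(col) > 0, logical(1))
wide <- wide[, keep, drop = FALSE]

# PCA on centered and scaled columns
pca <- prcomp(wide, center = TRUE, scale. = TRUE)

# Cumulative proportion of variance explained by the components
cum_var <- cumsum(pca$sdev^2) / sum(pca$sdev^2)

# Smallest number of components reaching 95% of the variance (arbitrary threshold)
n_comp <- which(cum_var >= 0.95)[1]

# Reduced dataset: one row per observation, one column per kept component
reduced <- pca$x[, seq_len(n_comp), drop = FALSE]
dim(reduced)
```

The reduced matrix can then be used in place of the original columns when training a model. PCA is just one option here; other feature selection or reduction methods may suit a given dataset better.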

The first step is to pre-process the dataset, which includes removing missing values.

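To remove the missing values in R, assuming the dataset marks them with the string " ?", the code is as follows:

# Recode the " ?" placeholder as NA
data[data == " ?"] <- NA
# Drop every row that still contains at least one NA
data <- na.omit(data)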
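With thousands of columns, na.omit on its own can discard most rows, because a row is dropped as soon as any single column is missing. A hedged sketch of one way around this is to first drop columns that are mostly missing and only then remove incomplete rows. The toy data frame and the 50% threshold below are assumptions chosen for illustration.

```r
# Toy data frame (hypothetical); " ?" is the missing-value marker used above
data <- data.frame(
  a = c("1", " ?", "3", "4"),
  b = c(" ?", " ?", " ?", "8"),
  c = c("x", "y", "z", "w"),
  stringsAsFactors = FALSE
)

# Recode the placeholder as NA
data[data == " ?"] <- NA

# Fraction of missing values in each column
na_frac <- colMeans(is.na(data))

# Keep only columns that are at most 50% missing (threshold is an assumption)
data <- data[, na_frac <= 0.5, drop = FALSE]

# Now drop the remaining rows that still contain an NA
data <- na.omit(data)
```

Filtering columns first usually keeps far more rows than calling na.omit directly on the full table, at the cost of losing the sparsest columns.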

Since you mentioned that you are new to Data Science, learning Data Science with R will help you solve this problem.

