I am getting the error “Error in knn(train = prc_train, test = prc_test, cl = prc_train_labels, : no missing values are allowed”

Question

asked Apr 9, 2020 in Data Science by blackindya (18.4k points)

I am a beginner in data science and have started learning from the internet. I downloaded the cancer dataset and started to build a kNN model on that dataset. The code is given below:

stringsAsFactors = FALSE
str(prc)
prc <- prc[-1] #removes the first variable(id) from the data set.
table(prc$diagnosis_result) # counts the patients in each diagnosis class
prc$diagnosis <- factor(prc$diagnosis_result, levels = c("B", "M"), labels = c("Benign", "Malignant")) # relabel B/M as Benign/Malignant
round(prop.table(table(prc$diagnosis)) * 100, digits = 1) # class proportions as percentages, rounded to 1 decimal place (digits = 1)
normalize <- function(x) {
  return((x - min(x)) / (max(x) - min(x)))
} # very important step: min-max normalization rescales each feature to a common scale
prc_n <- as.data.frame(lapply(prc[2:9], normalize))
summary(prc_n$radius)
prc_train <- prc_n[1:65,]
prc_test <- prc_n[66:100,]
prc_train_labels <- prc[1:65, 1]
prc_test_labels <- prc[66:100, 1]
library(class)
prc_test_pred <- knn(train = prc_train, test = prc_test, cl = prc_train_labels,k=10)
library(gmodels)
CrossTable(x=prc_test_labels, y=prc_test_pred, prop.chisq=FALSE)

The line that assigns prc_test_pred fails with the following error:
Error in knn(train = prc_train, test = prc_test, cl = prc_train_labels, : no missing values are allowed

Can anyone help me?

1 Answer

supriya · Answer 1 · 2020-04-09T08:05:52+0000

I don't know exactly where your code runs into the problem, but here is a simple, complete example that builds the entire model, so you can compare it with your own code and correct it. It should also give you an idea of how to get started.

Here is the code:

# Load the prostate cancer data set
prc <- read.csv("https://raw.githubusercontent.com/duttashi/learnr/master/data/misc/Prostate_Cancer.csv", header = TRUE, stringsAsFactors = FALSE)
prc <- prc[-1]  # drop the id column
prc$diagnosis <- factor(prc$diagnosis_result, levels = c("B", "M"), labels = c("Benign", "Malignant"))
# Min-max normalization: rescale each feature to a common scale
normalize <- function(x) {
  return((x - min(x)) / (max(x) - min(x)))
}
prc_n <- as.data.frame(lapply(prc[2:9], normalize))  # normalize columns 2 to 9
# First 65 rows for training, rows 66 to 100 for testing
prc_train <- prc_n[1:65, ]
prc_test <- prc_n[66:100, ]
prc_train_labels <- prc[1:65, 1]   # column 1 holds the B/M labels
prc_test_labels <- prc[66:100, 1]
library(class)
prc_test_pred <- knn(train = prc_train, test = prc_test, cl = prc_train_labels, k = 10)
library(gmodels)
CrossTable(x = prc_test_labels, y = prc_test_pred, prop.chisq = FALSE)
# -------------------------------------------------------------------------
#    Cell Contents
# |-------------------------|
# |                       N |
# |           N / Row Total |
# |           N / Col Total |
# |         N / Table Total |
# |-------------------------|
#
#
# Total Observations in Table:  35
#
#
#                 | prc_test_pred
# prc_test_labels |         B |         M | Row Total |
# ----------------|-----------|-----------|-----------|
#               B |         6 |        13 |        19 |
#                 |     0.316 |     0.684 |     0.543 |
#                 |     0.857 |     0.464 |           |
#                 |     0.171 |     0.371 |           |
# ----------------|-----------|-----------|-----------|
#               M |         1 |        15 |        16 |
#                 |     0.062 |     0.938 |     0.457 |
#                 |     0.143 |     0.536 |           |
#                 |     0.029 |     0.429 |           |
# ----------------|-----------|-----------|-----------|
#    Column Total |         7 |        28 |        35 |
#                 |     0.200 |     0.800 |           |
# ----------------|-----------|-----------|-----------|
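
If your own script still fails, the error message means that knn() found an NA or NaN in train, test, or cl. The sketch below shows one way to look for it. It uses a small made-up data frame (toy and its columns are hypothetical, not the prostate data) to reproduce two possible causes, which may or may not be what happened in your case: min-max normalizing a constant column, which computes 0 / 0 and gives NaN, and indexing rows that do not exist, which gives rows filled with NA.

```r
# Hypothetical toy data standing in for prc: only 6 rows, one constant column.
toy <- data.frame(
  label  = c("B", "M", "B", "M", "B", "M"),
  radius = c(10, 12, 9, 15, 11, 14),
  flat   = rep(5, 6),                # constant column: max(x) - min(x) is 0
  stringsAsFactors = FALSE
)

normalize <- function(x) {
  (x - min(x)) / (max(x) - min(x))
}

toy_n <- as.data.frame(lapply(toy[2:3], normalize))

# Possible cause 1: a constant column normalizes to NaN (0 / 0).
# is.na() is TRUE for NaN as well as NA, so this counts both per column.
na_per_column <- colSums(is.na(toy_n))
print(na_per_column)

# Possible cause 2: asking for rows past the end of the data yields NA rows.
toy_test <- toy_n[5:8, "radius", drop = FALSE]   # rows 7 and 8 do not exist
rows_with_na <- sum(!complete.cases(toy_test))
print(rows_with_na)

# The label vector passed as cl is affected by out-of-range rows in the same way.
toy_labels <- toy[5:8, 1]
print(anyNA(toy_labels))
```

On your real data, the same checks would be colSums(is.na(prc_n)), anyNA(prc_train_labels), and nrow(prc). If nrow(prc) is smaller than 100, the slice 66:100 runs past the last row.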

Hope this will help you.
