0 votes
2 views
in Data Science by (18.4k points)

I am trying to build a neural network in R. I read that categorical data should be one-hot encoded before training, so I tried to implement it, but I get the following error when I train the model:

Error in model.frame.default(formula = nndf$class ~ ., data = train) : 

  invalid type (list) for variable 'nndf$class'

I read that the formula should be passed in this form:

class ~ x1 + x2

But I still have no idea how I should pass the data.

The code I am using is as follows:

nndf$al <- one_hot(as.data.table(nndf$al))

nndf$su <- one_hot(as.data.table(nndf$su))

nndf$rbc <- one_hot(as.data.table(nndf$rbc))

nndf$pc <- one_hot(as.data.table(nndf$pc))

nndf$pcc <- one_hot(as.data.table(nndf$pcc))

nndf$ba <- one_hot(as.data.table(nndf$ba))

nndf$htn <- one_hot(as.data.table(nndf$htn))

nndf$dm <- one_hot(as.data.table(nndf$dm))

nndf$cad <- one_hot(as.data.table(nndf$cad))

nndf$appet <- one_hot(as.data.table(nndf$appet))

nndf$pe <- one_hot(as.data.table(nndf$pe))

nndf$ane <- one_hot(as.data.table(nndf$ane))

nndf$class <- one_hot(as.data.table(nndf$class))

class(nndf$class)

# view the dataframe to ensure one hot encoding is correct

summary(nndf)

# randomly sample rows for tt split

train_idx <- sample(1:nrow(nndf), 0.8 * nrow(nndf))

test_idx <- setdiff(1:nrow(nndf), train_idx)

# prepare training set and corresponding labels

train <- nndf[train_idx,]

# prepare testing set and corresponding labels

X_test <- nndf[test_idx,]

y_test <- nndf[test_idx, "class"]

# create model with a single hidden layer containing 500 neurons

model <- nnet(nndf$class~., train, maxit=150, size=10)

# prediction

X_pred <- predict(train, type="raw")

1 Answer

0 votes
by (36.8k points)

Since you said that all the variables are categorical, you need to convert every variable except the response variable to one-hot encoded form. Encode the whole table at once rather than assigning the result back to a single column. To do so, use the command below:

one_hot_df <- one_hot(as.data.table(nndf[, -13])) # 13 is the index of the `class` variable; one_hot() expects a data.table
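For context, the error in the question most likely comes from assigning a whole table to a single column: one_hot() returns a table, so `nndf$class <- one_hot(...)` leaves `class` as a list-type column, which model.frame() rejects. Below is a minimal sketch in base R. The data frame `toy` and its columns are hypothetical, and a plain data.frame stands in for the one_hot() output so that no extra package is needed.

```r
# Hypothetical toy data; `toy`, `x`, and `cls` are illustrative names only.
toy <- data.frame(x = c(1, 2, 3), cls = factor(c("a", "b", "a")))

# Stand-in for the table that one_hot() would return for `cls`.
encoded <- data.frame(cls_a = c(1, 0, 1), cls_b = c(0, 1, 0))

# Assigning a table to one column nests it inside the data frame.
toy$cls <- encoded
col_type <- typeof(toy$cls)  # the nested column is stored as a list

# model.frame() refuses list-type variables; capture the message instead of stopping.
msg <- tryCatch(
  {
    model.frame(cls ~ x, data = toy)
    "no error"
  },
  error = function(e) conditionMessage(e)
)

print(col_type)
print(msg)
```

Encoding all predictors in one call, as in the command above, avoids the nested column entirely.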

Or, using the model.matrix method:

model_mat_df <- model.matrix(~ . - 1, nndf[, -13])
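To illustrate what model.matrix() produces, here is a small sketch on a hypothetical two-column data frame (`toy` is an assumption; the column names are borrowed from the question only for readability). Note that with `~ . - 1` only the first factor keeps all of its levels, while later factors drop a reference level. If an indicator column for every level of every factor is wanted, one common idiom is to pass full contrasts through `contrasts.arg`.

```r
# Hypothetical toy data with two categorical predictors.
toy <- data.frame(
  rbc = factor(c("normal", "abnormal", "normal")),
  htn = factor(c("yes", "no", "no"))
)

# `~ . - 1` removes the intercept: the first factor is fully expanded,
# later factors still drop their first (reference) level.
mm <- model.matrix(~ . - 1, data = toy)
print(colnames(mm))

# Optional: request an indicator column for every level of every factor.
full <- model.matrix(
  ~ . - 1,
  data = toy,
  contrasts.arg = lapply(toy, contrasts, contrasts = FALSE)
)
print(colnames(full))
```

Either matrix is numeric, so it can be combined with the response column in the next step.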

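Then convert class to a factor and add it to either of the data frames above:

class <- as.factor(nndf$class)

final_df <- data.frame(model_mat_df, class) # data.frame() keeps class as a factor; cbind() on a matrix would coerce it to integer codes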

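Split final_df into train and test sets and fit the model on the training set, referring to the response by its column name (class ~ .) rather than nndf$class, so that it is looked up inside train: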

nnet(class~., train, maxit=150, size=10)
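The split, fit, and predict steps might look roughly like the sketch below. It is only an illustration under assumptions: `final_df` here is a small synthetic stand-in (the columns `htnyes` and `dmyes` and the class labels are made up), and the nnet package is assumed to be installed, which is normally the case because it ships with R as a recommended package.

```r
library(nnet)

set.seed(42)  # make the random split and initial weights reproducible

# Hypothetical stand-in for final_df: two indicator columns plus a factor response.
n <- 100
final_df <- data.frame(
  htnyes = rbinom(n, 1, 0.5),
  dmyes  = rbinom(n, 1, 0.5)
)
final_df$class <- factor(
  ifelse(final_df$htnyes + final_df$dmyes > 0, "ckd", "notckd")
)

# 80/20 train/test split on row indices.
train_idx <- sample(seq_len(nrow(final_df)), size = floor(0.8 * nrow(final_df)))
train <- final_df[train_idx, ]
test  <- final_df[-train_idx, ]

# The response is named by column, so nnet() finds it inside `train`.
model <- nnet(class ~ ., data = train, size = 10, maxit = 150, trace = FALSE)

# predict() takes the fitted model first, then the new data.
pred <- predict(model, newdata = test, type = "class")
accuracy <- mean(pred == test$class)
print(accuracy)
```

Note that predict() is called on the fitted model, not on the training data as in the question's last line.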
