in Machine Learning by (19k points)

How is Q-learning different from value iteration in reinforcement learning?

I know Q-learning is model-free and training samples are transitions (s, a, s', r). But since we know the transitions and the reward for every transition in Q-learning, is it not the same as model-based learning where we know the reward for a state and action pair, and the transitions for every action from a state (be it stochastic or deterministic)? I do not understand the difference.

1 Answer

by (33.2k points)

The support vector machine (SVM) in the e1071 package uses the "one-against-one" strategy for multiclass classification. 
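As a rough illustration of what "one-against-one" means in practice, the sketch below fits an SVM on all three iris species and looks at the pairwise decision values. The names fit3, pred3 and dv are made up for this example. It assumes the e1071 package is installed and relies on predict() returning the decision values as an attribute when decision.values = TRUE is passed.

```r
library(e1071)

# Fit on all three iris species so the multiclass strategy is exercised
fit3 = svm(Species ~ ., data = iris, kernel = "linear")

# Ask for the pairwise decision values alongside the predicted labels
pred3 = predict(fit3, iris, decision.values = TRUE)
dv = attr(pred3, "decision.values")

# One column per pair of classes, i.e. k * (k - 1) / 2 columns for k classes
colnames(dv)
ncol(dv) == choose(nlevels(iris$Species), 2)
```

With k = 3 classes this gives one binary classifier per pair of species, and the predicted label is chosen by voting among them.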

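You can inspect a fitted model with the generic functions plot and summary. The example below fits a two-class linear SVM on a subset of iris and recomputes its decision values by hand: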

# Load the package that provides svm()

library(e1071)

# Subset the iris dataset to only 2 labels and 2 features

iris.part = subset(iris, Species != 'setosa')

iris.part$Species = factor(iris.part$Species)

iris.part = iris.part[, c(1,2,5)]

# Fit svm model

fit = svm(Species ~ ., data=iris.part, type='C-classification', kernel='linear')

# Make a plot of the model

dev.new(width=5, height=5)

plot(fit, iris.part)

# Tabulate actual labels vs. fitted labels

pred = predict(fit, iris.part)

table(Actual=iris.part$Species, Fitted=pred)

# Compute the weight vector of the linear decision boundary from the support vectors

w = t(fit$coefs) %*% fit$SV

# Calculate decision values manually

iris.scaled = scale(iris.part[,-3], fit$x.scale[[1]], fit$x.scale[[2]]) 

t(w %*% t(as.matrix(iris.scaled))) - fit$rho

# Should equal...



Hope this answer helps.
