0 votes
1 view
in Machine Learning by (15.7k points)

How is Q-learning different from value iteration in reinforcement learning?

I know Q-learning is model-free and training samples are transitions (s, a, s', r). But since we know the transitions and the reward for every transition in Q-learning, is it not the same as model-based learning where we know the reward for a state and action pair, and the transitions for every action from a state (be it stochastic or deterministic)? I do not understand the difference.

1 Answer

0 votes
by (33.2k points)

The support vector machine (SVM) in the e1071 package uses the "one-against-one" strategy for multiclass classification. 

You can use the generic functions plot and summary:

# Subset the iris dataset to only 2 labels and 2 features

iris.part = subset(iris, Species != 'setosa')

iris.part$Species = factor(iris.part$Species)

iris.part = iris.part[, c(1,2,5)]

# Fit svm model

fit = svm(Species ~ ., data=iris.part, type='C-classification', kernel='linear')

# Make a plot of the model

dev.new(width=5, height=5)

plot(fit, iris.part)

# Tabulate actual labels vs. fitted labels

pred = predict(fit, iris.part)

table(Actual=iris.part$Species, Fitted=pred)

w = t(fit$coefs) %*% fit$SV

# Calculate decision values manually

iris.scaled = scale(iris.part[,-3], fit$x.scale[[1]], fit$x.scale[[2]]) 

t(w %*% t(as.matrix(iris.scaled))) - fit$rho

# Should equal...

fit$decision.values

For more details, study SVM Algorithms. For more details, Machine Learning Online Course.

Hope this answer helps.

Welcome to Intellipaat Community. Get your technical queries answered by top developers !


Categories

...