in AI and Deep Learning

How is Q-learning different from value iteration in reinforcement learning?

I know Q-learning is model-free and training samples are transitions (s, a, s', r). But since we know the transitions and the reward for every transition in Q-learning, is it not the same as model-based learning where we know the reward for a state and action pair, and the transitions for every action from a state (be it stochastic or deterministic)? I do not understand the difference.

1 Answer


Q-learning is a model-free reinforcement learning algorithm. Its goal is to learn a policy that tells the agent which action to take in each state. It does not require a model of the environment (hence "model-free"), and it can handle problems with stochastic transitions and rewards without requiring any adaptations.

Value iteration is used when you have the model of the environment: the transition probabilities and rewards. That is, you know the probability of getting from state s into state s' with action a, and the reward received along the way. This makes it a model-based, dynamic-programming method.
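As a concrete sketch, here is value iteration on a tiny, hypothetical two-state MDP. The states, actions, transition probabilities, and rewards below are made up purely for illustration; the point is that the full table `P` must be known in advance:

```python
# Value iteration on a made-up 2-state MDP (states 0,1; actions 0,1).
# P[s][a] is a list of (prob, next_state, reward) triples -- the complete
# model that value iteration assumes you are given.
GAMMA = 0.9

P = {
    0: {0: [(1.0, 0, 0.0)],                  # stay in 0, no reward
        1: [(0.8, 1, 1.0), (0.2, 0, 0.0)]},  # usually reach 1, reward 1
    1: {0: [(1.0, 0, 0.0)],                  # go back to 0, no reward
        1: [(1.0, 1, 2.0)]},                 # stay in 1, reward 2
}

V = {s: 0.0 for s in P}
for _ in range(1000):
    # Bellman optimality backup using the known probabilities directly
    new_V = {
        s: max(
            sum(p * (r + GAMMA * V[s2]) for p, s2, r in P[s][a])
            for a in P[s]
        )
        for s in P
    }
    if max(abs(new_V[s] - V[s]) for s in P) < 1e-9:
        V = new_V
        break
    V = new_V

# V[1] converges to 2 / (1 - GAMMA) = 20 for this toy model
```

Note that every backup sums over `P[s][a]`; the algorithm never interacts with or samples from the environment.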

In contrast, you might only have a black box: a simulator that, given the current state and an action, returns a sampled next state and reward, but never reveals the underlying transition probabilities. Since all you can do is sample transitions (s, a, s', r), you are model-free. This is when you apply Q-learning.
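To make the contrast concrete, here is a tabular Q-learning sketch against the same kind of toy problem, but treated as a black box: the agent can only call `step(s, a)` and observe a sampled outcome. The environment, learning rate, and exploration constants are all illustrative assumptions:

```python
import random

# Tabular Q-learning: the agent never sees transition probabilities,
# only sampled (s, a, s', r) transitions from the step() simulator.
random.seed(0)
GAMMA, ALPHA, EPS = 0.9, 0.1, 0.1  # illustrative hyperparameters
STATES, ACTIONS = [0, 1], [0, 1]

def step(s, a):
    """Black-box simulator: returns a sampled (next_state, reward)."""
    if s == 0 and a == 1:
        return (1, 1.0) if random.random() < 0.8 else (0, 0.0)
    if s == 1 and a == 1:
        return 1, 2.0
    return 0, 0.0

Q = {s: {a: 0.0 for a in ACTIONS} for s in STATES}
s = 0
for _ in range(50000):
    # epsilon-greedy action selection
    if random.random() < EPS:
        a = random.choice(ACTIONS)
    else:
        a = max(Q[s], key=Q[s].get)
    s2, r = step(s, a)
    # temporal-difference update: built from one sampled transition,
    # no expectation over a known model
    Q[s][a] += ALPHA * (r + GAMMA * max(Q[s2].values()) - Q[s][a])
    s = s2

# Q[1][1] approaches the optimal value 2 / (1 - GAMMA) = 20
```

The key line is the update: where value iteration averages over `P[s][a]`, Q-learning plugs in the single observed `(s2, r)` and lets repeated sampling do the averaging.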
