**Q-learning** is a model-free reinforcement learning algorithm. The goal of Q-learning is to learn a policy, which tells the agent what action to take in each state. It does not require a model of the environment (hence "model-free"), and it can handle problems with stochastic transitions and rewards without requiring adaptations.

**Value iteration** is used when you have the transition probabilities, that is, when you know the probability of moving from state x to state x' under action a.
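To make this concrete, here is a minimal sketch of value iteration on a made-up two-state problem. The table `P`, the discount `gamma`, and the tolerance `tol` are all assumptions chosen for illustration; the point is only that the update reads the known probabilities directly.

```python
# Hypothetical MDP: P[state][action] is a list of (probability, next_state, reward).
# State 1 is absorbing with zero reward; from state 0, action 1 reaches it 80% of the time.
P = [
    [[(1.0, 0, 0.0)], [(0.8, 1, 1.0), (0.2, 0, 0.0)]],  # state 0
    [[(1.0, 1, 0.0)], [(1.0, 1, 0.0)]],                  # state 1
]


def q_value(P, V, s, a, gamma):
    """Expected return of taking action a in state s, using the known model."""
    return sum(p * (r + gamma * V[s2]) for p, s2, r in P[s][a])


def value_iteration(P, gamma=0.9, tol=1e-8):
    V = [0.0] * len(P)
    while True:
        delta = 0.0
        for s in range(len(P)):
            # Bellman optimality backup: best expected return over all actions.
            best = max(q_value(P, V, s, a, gamma) for a in range(len(P[s])))
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:  # stop once no state value changed noticeably
            break
    # Greedy policy with respect to the converged values.
    policy = [
        max(range(len(P[s])), key=lambda a: q_value(P, V, s, a, gamma))
        for s in range(len(P))
    ]
    return V, policy


V, policy = value_iteration(P)
print(V, policy)
```

For this toy table, the value of state 0 should approach 0.8 / 0.82 (about 0.976), and the greedy action in state 0 is action 1.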

In contrast, you might only have a black box: something you can simulate, where the outcome depends on the state and action and is only observed as exploration moves forward, but you are never given the transition probabilities themselves. In that case you are model-free, and this is when you apply Q-learning.
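A rough sketch of that setting, using tabular Q-learning with epsilon-greedy exploration. The simulator `step`, its 80% success chance, and the hyperparameters (`alpha`, `gamma`, `epsilon`, `episodes`) are assumptions invented for this example; the learner only ever calls `step` and never sees the probabilities inside it.

```python
import random


def step(state, action, rng):
    """Hypothetical black-box simulator: returns (next_state, reward, done).

    Only state 0 is non-terminal. Action 1 reaches the terminal state 1
    with reward 1 most of the time; otherwise the agent stays in state 0.
    """
    if action == 1 and rng.random() < 0.8:
        return 1, 1.0, True
    return 0, 0.0, False


def q_learning(step_fn, n_states, n_actions, episodes=2000, alpha=0.1,
               gamma=0.9, epsilon=0.1, max_steps=50, seed=0):
    rng = random.Random(seed)
    Q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        s = 0  # every episode starts in state 0
        for _ in range(max_steps):
            # Epsilon-greedy: mostly exploit the current estimates, sometimes explore.
            if rng.random() < epsilon:
                a = rng.randrange(n_actions)
            else:
                a = max(range(n_actions), key=lambda i: Q[s][i])
            s2, r, done = step_fn(s, a, rng)
            # Q-learning target: sampled reward plus discounted best next value.
            target = r if done else r + gamma * max(Q[s2])
            Q[s][a] += alpha * (target - Q[s][a])
            s = s2
            if done:
                break
    return Q


Q = q_learning(step, n_states=2, n_actions=2)
print(Q[0])
```

The update uses only sampled transitions, so the same loop would work with any simulator that has this `step` shape. With the fixed seed, the estimate for action 1 in state 0 should end up above the estimate for action 0.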
