RNNs have seen the most success when working with sequences of words and paragraphs, a field generally called natural language processing.
This includes both written text and spoken language represented as a time series. RNNs are also used in generative models that require a sequence output, not only with text but in applications such as generating handwriting.
Use RNNs For:

- Text data
- Speech data
- Generative models that require a sequence output

RNNs are not appropriate for tabular datasets, as you would see in a CSV file or spreadsheet, or for image data input.
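The points above come down to input shape: an RNN consumes a sequence per example rather than a flat row of columns. The following minimal sketch (assuming TensorFlow/Keras is available, with toy random data standing in for a real corpus) shows the 3-D input an RNN expects, (batch, timesteps, features):

```python
import numpy as np
import tensorflow as tf

# Toy sequence data: 32 sequences, each with 10 timesteps of 8 features.
x = np.random.rand(32, 10, 8).astype("float32")
y = np.random.randint(0, 2, size=(32, 1)).astype("float32")

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(10, 8)),           # (timesteps, features) per example
    tf.keras.layers.LSTM(16),                        # recurrent layer reads the sequence
    tf.keras.layers.Dense(1, activation="sigmoid"),  # sequence-level prediction
])
model.compile(optimizer="adam", loss="binary_crossentropy")
model.fit(x, y, epochs=1, verbose=0)
```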
The main idea behind the RL Tuner model is to take an RNN trained on data and refine it using RL. The model uses a standard DQN implementation, complete with an experience (replay) buffer and a Target Q-network.
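As a rough illustration of one of those DQN ingredients (this is not Magenta's code; names and sizes are illustrative), an experience buffer is simply a fixed-size store of transitions that is sampled at random during training; the Target Q-network is just a second copy of the Q-network whose weights are synced only periodically:

```python
import random
from collections import deque

class ReplayBuffer:
    """Fixed-size store of (state, action, reward, next_state) transitions."""
    def __init__(self, capacity=10000):
        self.transitions = deque(maxlen=capacity)

    def add(self, state, action, reward, next_state):
        self.transitions.append((state, action, reward, next_state))

    def sample(self, batch_size=32):
        # Sample a random minibatch to decorrelate consecutive experiences.
        return random.sample(self.transitions, min(batch_size, len(self.transitions)))
```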
The trained Note RNN supplies the initial weights of the Q-network and Target Q-network, and a third copy is used as the Reward RNN. The Reward RNN remains constant during training and supplies part of the reward function used to train the model. The figure below illustrates these ideas.
For a better understanding, refer to the following link:
https://magenta.tensorflow.org/2016/11/09/tuning-recurrent-networks-with-reinforcement-learning
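To make the ideas described above concrete, here is a simplified Python sketch of how the three copies of the trained Note RNN are wired together and how the reward is assembled. It is not the Magenta implementation: the `NoteRNN` stand-in, `music_theory_reward`, the number of notes, and the constant `c` are all illustrative placeholders.

```python
import copy
import numpy as np

class NoteRNN:
    """Stand-in for a trained Note RNN producing log-probabilities over next notes."""
    def __init__(self, n_notes=38, seed=0):
        rng = np.random.default_rng(seed)
        self.weights = rng.normal(size=(n_notes, n_notes))  # placeholder "trained" weights

    def log_prob(self, state, action):
        logits = self.weights[state]
        return float(logits[action] - np.log(np.exp(logits).sum()))

# The trained Note RNN initialises the Q-network and the Target Q-network,
# and a third frozen copy serves as the Reward RNN.
note_rnn   = NoteRNN()
q_network  = copy.deepcopy(note_rnn)   # updated by DQN training
target_q   = copy.deepcopy(note_rnn)   # periodically synced from q_network
reward_rnn = copy.deepcopy(note_rnn)   # held constant; supplies part of the reward

def music_theory_reward(state, action):
    """Hypothetical hand-crafted reward encoding music-theory rules."""
    return 1.0 if action != state else -1.0  # e.g. discourage repeating the same note

def total_reward(state, action, c=0.5):
    # The Reward RNN's log-probability of the action is combined with the
    # music-theory reward divided by a constant c.
    return reward_rnn.log_prob(state, action) + music_theory_reward(state, action) / c

print(total_reward(state=5, action=7))
```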