Now, let’s understand ‘What is LSTM?’ LSTM stands for long short-term memory, a type of network used in the field of Deep Learning. It is a variety of recurrent neural network (RNN) that is capable of learning long-term dependencies, especially in sequence prediction problems. Like other RNNs, an LSTM has feedback connections, so it can process entire sequences of data, such as speech or text, and not only single data points such as images. This finds application in speech recognition, machine translation, etc. LSTM is a special kind of RNN that performs well on a large variety of problems.
Next in this extensive blog on ‘What is LSTM?’, let us discuss the logic behind it.
The Logic Behind LSTM
The central role in an LSTM model is held by a memory cell known as the ‘cell state,’ which maintains its state over time. The cell state is the horizontal line that runs through the top of the diagram below. It can be visualized as a conveyor belt along which information flows almost unchanged, with only minor linear interactions.
Information can be added to or removed from the cell state, and this is regulated by gates. Gates optionally let information flow in and out of the cell. Each gate is made up of a sigmoid neural net layer and a pointwise multiplication operation.
The sigmoid layer gives out numbers between zero and one, where zero means ‘nothing should be let through,’ and one means ‘everything should be let through.’
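To make the gate idea concrete, here is a minimal sketch in plain Python. The function names `sigmoid` and `apply_gate` and the sample numbers are made up for illustration; in a real LSTM the gate inputs come from learned weights applied to the previous hidden state and the current input.

```python
import math

def sigmoid(x):
    """Squash a real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def apply_gate(gate_logits, values):
    """Pointwise multiplication: scale each value by sigmoid(logit)."""
    return [sigmoid(g) * v for g, v in zip(gate_logits, values)]

# Hypothetical gate pre-activations: strongly negative, neutral, strongly positive.
gate_logits = [-10.0, 0.0, 10.0]
values = [5.0, 5.0, 5.0]

gated = apply_gate(gate_logits, values)
print([round(g, 3) for g in gated])  # prints [0.0, 2.5, 5.0]
```

The first value is almost entirely blocked, the second is let through halfway, and the third passes almost untouched.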
Further in this ‘What is LSTM?’ blog, you will learn about the various differences between LSTM and RNN.
LSTM vs RNN
Consider the task of modifying certain information in a calendar. To do this, a plain RNN completely transforms the existing data by applying a function to all of it at every step. An LSTM, by contrast, makes small modifications to the data through simple pointwise multiplications and additions applied to the cell state. This is how LSTM forgets and remembers things selectively, which makes it an improvement over plain RNNs.
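The difference can be sketched with scalar states in plain Python. This is an illustration under assumptions, not a production implementation: the weights in `params` are hand-picked (hypothetical) so that the forget gate stays close to one and the input gate close to zero, and the gate equations follow the commonly used LSTM formulation with a forget gate.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def rnn_step(h_prev, x, w_h, w_x, b):
    """Plain RNN: the whole state is recomputed through tanh at every step."""
    return math.tanh(w_h * h_prev + w_x * x + b)

def lstm_step(c_prev, h_prev, x, p):
    """Scalar LSTM step; p maps a gate name to (w_h, w_x, b)."""
    def pre(name):
        w_h, w_x, b = p[name]
        return w_h * h_prev + w_x * x + b
    f = sigmoid(pre("forget"))     # how much of the old cell state to keep
    i = sigmoid(pre("input"))      # how much of the candidate to write
    o = sigmoid(pre("output"))     # how much of the cell state to expose
    g = math.tanh(pre("candidate"))
    c = f * c_prev + i * g         # multiplicative keep plus additive write
    h = o * math.tanh(c)
    return c, h

# Hypothetical weights: forget gate near 1, input gate near 0.
params = {
    "forget": (0.0, 0.0, 10.0),
    "input": (0.0, 0.0, -10.0),
    "output": (0.0, 0.0, 0.0),
    "candidate": (0.5, 0.5, 0.0),
}

c, h, h_rnn = 0.8, 0.0, 0.8        # both models start by remembering 0.8
for x in [1.0, -1.0] * 25:         # 50 steps of distracting input
    c, h = lstm_step(c, h, x, params)
    h_rnn = rnn_step(h_rnn, x, 0.5, 0.5, 0.0)

print(round(c, 3), round(h_rnn, 3))
```

With these settings the LSTM cell state stays close to its starting value of 0.8 after 50 steps, while the plain RNN state has been overwritten by the most recent inputs.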
Now consider that you want to process data with periodic patterns in it, such as predicting the sales of colored powder, which peak at the time of Holi in India. A good strategy is to look back at the sales records of the previous year. So, the model needs to know what data should be forgotten and what should be stored for later reference. In theory, recurrent neural networks can do this. In practice, however, they suffer from two problems, exploding gradients and vanishing gradients, that make them hard to train on long sequences.
Here, LSTM introduces memory units, called cell states, to solve this problem. The designed cells may be seen as differentiable memory.
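To get a feel for vanishing and exploding gradients, note that backpropagating through a recurrent network multiplies the gradient by some factor at every time step. The sketch below treats that factor as a single constant number, which is a simplification (in a real network it depends on the weights and activations at each step), and the helper name `gradient_scale` is made up for this example.

```python
def gradient_scale(per_step_factor, steps):
    """Multiply a gradient of 1.0 by the same factor once per time step."""
    scale = 1.0
    for _ in range(steps):
        scale *= per_step_factor
    return scale

vanishing = gradient_scale(0.9, 100)    # factor slightly below 1
exploding = gradient_scale(1.1, 100)    # factor slightly above 1
preserved = gradient_scale(0.999, 100)  # e.g. a forget gate held close to 1

print(f"{vanishing:.2e} {exploding:.2e} {preserved:.3f}")
# roughly 2.66e-05, 1.38e+04 and 0.905
```

Along the cell state, the per-step factor on the direct path is the forget gate value, so a gate that stays close to one lets the gradient survive over many steps.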
Next in this ‘What is LSTM?’ blog, you will come across the numerous applications of LSTM.
LSTM networks find useful applications in the following areas:
- Language modeling
- Machine translation
- Handwriting recognition
- Image captioning
- Image generation using attention models
- Question answering
- Video-to-text conversion
- Polyphonic music modeling
- Speech synthesis
- Protein secondary structure prediction
This list does give an idea about the areas in which LSTM is employed but not how exactly it is used. Let’s understand the types of sequence learning problems that LSTM networks are capable of addressing.
LSTM neural networks are capable of solving numerous tasks that earlier learning algorithms, such as plain RNNs, struggle with. LSTM can capture long-term temporal dependencies effectively without running into as many optimization hurdles, which makes it suitable for problems where the relevant context lies many time steps in the past.
Let’s discuss Bidirectional LSTMs in this ‘What is LSTM?’ blog.
What are Bidirectional LSTMs?
These are like an upgrade over LSTMs. In bidirectional LSTMs, each training sequence is presented forward and backward to two separate recurrent nets. Both nets are connected to the same output layer. As a result, bidirectional LSTMs have complete information about every point in a given sequence: everything before it and everything after it.
But how can you rely on information that has not arrived yet? The human brain does this all the time: it picks up words, sounds, or whole sentences that might, at first, make no sense but mean something in light of what comes later. Conventional recurrent neural networks are only capable of using the previous context to get information. In bidirectional LSTMs, by contrast, the information is obtained by processing the data in both directions within two hidden layers, which feed into the same output layer. This helps bidirectional LSTMs access long-range context in both directions.
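The two-direction idea can be sketched in a few lines of Python. To keep the example short, it uses a simple scalar tanh recurrence in place of a full LSTM cell, and the weights `w_h` and `w_x` are arbitrary illustrative values; only the forward and backward wiring is the point here.

```python
import math

def run_rnn(sequence, w_h=0.5, w_x=1.0):
    """Return the hidden state at every step of a scalar tanh recurrence."""
    h, states = 0.0, []
    for x in sequence:
        h = math.tanh(w_h * h + w_x * x)
        states.append(h)
    return states

def bidirectional(sequence):
    """Pair each position's forward state with its backward state."""
    forward = run_rnn(sequence)
    # Process the reversed sequence, then flip back to the original order.
    backward = run_rnn(sequence[::-1])[::-1]
    return list(zip(forward, backward))

# A single event in the middle of an otherwise empty sequence.
seq = [0.0, 0.0, 1.0, 0.0, 0.0]
features = bidirectional(seq)
for position, (fwd, bwd) in enumerate(features):
    print(position, round(fwd, 3), round(bwd, 3))
```

At position 0 the forward state is still zero because the event has not been seen yet, while the backward state is already nonzero because it has read the sequence from the end. At the last position the roles are swapped.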
To Sum up!
LSTM networks are indeed an improvement over plain RNNs, as they can achieve whatever RNNs might achieve with much better finesse. As intimidating as they can seem, LSTMs do provide better results and are truly a big step in Deep Learning. With more such technologies coming up, you can expect to get more accurate predictions and have a better understanding of what choices to make.