I've checked the source code for both functions, and it seems that LSTM() makes the LSTM network in general, while LSTMCell() only returns one cell.

However, in most cases people use only one LSTM cell in their program. Does this mean that when you have only one LSTM cell (e.g. in a simple Seq2Seq model), calling LSTMCell() and LSTM() makes no difference?

1 Answer

  • LSTM is a recurrent layer.
  • LSTMCell is an object (which happens to be a layer too) used by the LSTM layer that contains the calculation logic for one step.
  • A recurrent layer contains a cell object. The cell contains the core code for the calculations of each step, while the recurrent layer commands the cell and performs the actual recurrent calculations.
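To make the split between cell and layer concrete, here is a minimal NumPy sketch. It is not the Keras implementation: the function names lstm_cell_step and run_recurrent_layer, the weight names W, U, b, and the gate ordering (input, forget, candidate, output) are all assumptions chosen for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_cell_step(x_t, h, c, W, U, b):
    """Hypothetical 'cell': the calculation for ONE time step."""
    units = h.shape[-1]
    z = x_t @ W + h @ U + b             # shape (batch, 4 * units)
    i = sigmoid(z[:, :units])           # input gate
    f = sigmoid(z[:, units:2 * units])  # forget gate
    g = np.tanh(z[:, 2 * units:3 * units])  # candidate values
    o = sigmoid(z[:, 3 * units:])       # output gate
    c_new = f * c + i * g               # new cell state
    h_new = o * np.tanh(c_new)          # new hidden state
    return h_new, c_new

def run_recurrent_layer(xs, W, U, b, units):
    """Hypothetical 'recurrent layer': loops the cell over the time axis."""
    batch, timesteps, _ = xs.shape
    h = np.zeros((batch, units))
    c = np.zeros((batch, units))
    for t in range(timesteps):
        h, c = lstm_cell_step(xs[:, t, :], h, c, W, U, b)
    return h  # hidden state after the last step

rng = np.random.default_rng(0)
units, input_dim = 4, 3
W = rng.normal(size=(input_dim, 4 * units))
U = rng.normal(size=(units, 4 * units))
b = np.zeros(4 * units)
xs = rng.normal(size=(2, 5, input_dim))  # (batch, timesteps, features)

last_h = run_recurrent_layer(xs, W, U, b, units)
```

The cell function knows nothing about sequences; only the layer function iterates over time and carries the states forward.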

Usually, people use LSTM layers in their code, or they use RNN layers that wrap an LSTMCell.

The two are almost the same: an LSTM layer is an RNN layer that uses an LSTMCell, as you can check in the source code.
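A quick way to check this is to build both variants and copy the weights from one to the other. This is a sketch assuming TensorFlow's bundled Keras (tf.keras); the sizes and the random input are arbitrary, and the small tolerance allows for floating-point differences between the two code paths.

```python
import numpy as np
import tensorflow as tf

units, timesteps, input_dim = 4, 5, 3
x = tf.constant(np.random.rand(2, timesteps, input_dim).astype("float32"))

# Variant 1: the ready-made recurrent layer.
lstm_layer = tf.keras.layers.LSTM(units)

# Variant 2: a generic RNN layer wrapping an LSTMCell.
rnn_layer = tf.keras.layers.RNN(tf.keras.layers.LSTMCell(units))

# Call each layer once so that its weights are created.
out_lstm = lstm_layer(x)
_ = rnn_layer(x)

# Copy kernel, recurrent kernel and bias so both use identical parameters.
rnn_layer.set_weights(lstm_layer.get_weights())
out_rnn = rnn_layer(x)

# With shared weights, the two outputs should agree up to rounding.
same = np.allclose(np.asarray(out_lstm), np.asarray(out_rnn), atol=1e-5)
print(same)
```

Both outputs have shape (batch, units), since neither layer was asked to return the full sequence.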

About the number of cells:

Despite its name, LSTMCell is not a single cell in the sense of a single unit. It is an object that manages all the units (what we might think of as the individual cells). In the same source code, you can see that the units argument is passed when creating the LSTMCell instance.
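You can see this by calling an LSTMCell directly on a single time step. Again a sketch assuming tf.keras, with arbitrary sizes: the one cell object produces a state vector with units entries per sample, not a single scalar.

```python
import tensorflow as tf

units, input_dim, batch = 4, 3, 2
cell = tf.keras.layers.LSTMCell(units)

# One time step of input plus the initial hidden and cell states.
x_t = tf.random.normal((batch, input_dim))
h0 = tf.zeros((batch, units))
c0 = tf.zeros((batch, units))

# A single call computes one step for ALL units at once.
output, (h1, c1) = cell(x_t, [h0, c0])

print(cell.units)         # the number of units this one cell manages
print(output.shape)       # (batch, units)
print(cell.kernel.shape)  # (input_dim, 4 * units): four gates per unit
```

Note that there is no time axis anywhere in this snippet. Looping over the sequence is the job of the surrounding RNN or LSTM layer.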

...