in Machine Learning by (19k points)

I would like to understand how an RNN, specifically an LSTM, works with multiple input dimensions in Keras and TensorFlow. By that I mean the input shape is (batch_size, timesteps, input_dim) where input_dim > 1.
I think the images below illustrate the concept of an LSTM quite well for input_dim = 1.
Does this mean that if input_dim > 1, x is no longer a single value but an array? And if so, do the weights also become arrays, with the same shape as x plus the context?

[Images: LSTM structure diagrams]

1 Answer

by (33.1k points)

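Keras creates a computational graph that executes the sequence in your bottom picture once per unit. That means the state value C is always a scalar, one per unit. What changes when input_dim > 1 is x: at each timestep it is an array with one entry per feature, and every unit gets one kernel weight per feature for each of its internal paths. All units are processed at once, and each feature contributes through its own weight.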
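To make the shapes concrete, here is a minimal pure-Python sketch of a single LSTM timestep for one sample. It follows the standard LSTM equations with the gate layout Keras uses (i, f, c, o) and assumes tanh and sigmoid activations; it is not Keras's actual implementation, and the function name `lstm_step` and all weight values are made up for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x_t, h_prev, c_prev, kernel, recurrent_kernel, bias):
    """One LSTM timestep for a single sample (illustrative sketch).

    x_t:              list of `features` floats (input at this timestep)
    h_prev, c_prev:   lists of `units` floats (previous output and cell state)
    kernel:           features x (4 * units) nested list
    recurrent_kernel: units x (4 * units) nested list
    bias:             list of 4 * units floats
    Gate columns are laid out as [i | f | c | o].
    """
    units = len(h_prev)
    # Pre-activation of every gate of every unit: each entry is a weighted
    # sum over all features of x_t and all previous outputs in h_prev.
    z = [
        sum(x * kernel[k][j] for k, x in enumerate(x_t))
        + sum(h * recurrent_kernel[k][j] for k, h in enumerate(h_prev))
        + bias[j]
        for j in range(4 * units)
    ]
    i = [sigmoid(v) for v in z[0 * units:1 * units]]    # input gate
    f = [sigmoid(v) for v in z[1 * units:2 * units]]    # forget gate
    g = [math.tanh(v) for v in z[2 * units:3 * units]]  # candidate state
    o = [sigmoid(v) for v in z[3 * units:4 * units]]    # output gate
    # One scalar cell state and one scalar output per unit.
    c = [f[u] * c_prev[u] + i[u] * g[u] for u in range(units)]
    h = [o[u] * math.tanh(c[u]) for u in range(units)]
    return h, c


# Hypothetical sizes and weights, chosen only to show the shapes.
features, units = 2, 1
kernel = [[0.1, 0.2, 0.3, 0.4],
          [0.5, 0.6, 0.7, 0.8]]            # features x (4 * units)
recurrent_kernel = [[0.0, 0.0, 0.0, 0.0]]  # units x (4 * units)
bias = [0.0, 0.0, 0.0, 0.0]

h, c = lstm_step([1.0, 2.0], [0.0], [0.0], kernel, recurrent_kernel, bias)
print(len(h), len(c))
```

Adding a feature adds one row to `kernel`, while `h` and `c` still hold one value per unit.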

For example:

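import keras.models as kem

import keras.layers as kel

units, timesteps, features = 1, 1, 1  # sizes for the first summary below

model = kem.Sequential()

lstm = kel.LSTM(units, input_shape=(timesteps, features))

model.add(lstm)

model.summary()

num_params = (4 * features * units) + (4 * units * units) + (4 * units)

print('num_params', num_params)

# Recent Keras versions do not expose kernel_c / bias_c directly, so slice

# them out of the fused weights; the gates are stored in the order i, f, c, o.

kernel, recurrent_kernel, bias = lstm.get_weights()

print('kernel_c', kernel[:, 2 * units:3 * units].shape)

print('bias_c', bias[2 * units:3 * units].shape)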

Here the factor 4 stands for the four internal paths f, i, c, and o in your bottom picture. The first term counts the kernel weights, the second the recurrent kernel weights, and the last the biases. For

units = 1

timesteps = 1

features = 1

we see

Layer (type)                 Output Shape              Param #

=======================================================

lstm_1 (LSTM)                (None, 1)                 12

=======================================================

Total params: 12.0

Trainable params: 12

Non-trainable params: 0.0

_______________________________________________________

num_params 12

kernel_c (1, 1)

bias_c (1,)

and for

units = 1

timesteps = 1

features = 2

we see

Layer (type)                 Output Shape              Param #

=======================================================

lstm_1 (LSTM)                (None, 1)                 16

=======================================================

Total params: 16.0

Trainable params: 16

Non-trainable params: 0.0

_______________________________________________________

num_params 16

kernel_c (2, 1)

bias_c (1,)
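The parameter counts in the two summaries above can also be reproduced by hand from the formula, without building a model. This is a small sketch; the helper name `lstm_param_count` is made up for illustration, and it assumes a plain LSTM layer with a bias term.

```python
def lstm_param_count(features, units):
    """Trainable parameters of one LSTM layer with bias (illustrative helper)."""
    kernel = 4 * features * units       # input weights, one set per gate
    recurrent_kernel = 4 * units * units  # weights on the previous output
    bias = 4 * units                    # one bias per gate per unit
    return kernel + recurrent_kernel + bias


for features in (1, 2):
    print('features', features, 'num_params', lstm_param_count(features, units=1))
```

Going from one feature to two adds 4 * units = 4 kernel weights, which accounts for the jump from 12 to 16 parameters; the recurrent kernel and the bias do not depend on the number of features.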

