Using Keras for video prediction (time series)

Question

asked Jul 23, 2019 in Machine Learning by ParasSharma1 (19k points)

I want to predict the next frame of a (greyscale) video given N previous frames - using CNNs or RNNs in Keras. Most tutorials and other information regarding time series prediction and Keras use a 1-dimensional input in their network but mine would be 3D (N frames x rows x cols)

I'm currently really unsure what a good approach for this problem would be. My ideas include:

Using one or more LSTM layers. The problem here is that I'm not sure whether they're suited to take a series of images instead a series of scalars as input. Wouldn't the memory consumption explode? If it is okay to use them: How can I use them in Keras for higher dimensions?

Using 3D convolution on the input (the stack of previous video frames). This raises other questions: Why would this help when I'm not doing a classification but a prediction? How can I stack the layers in such a way that the input of the network has dimensions (N x cols x rows) and the output (1 x cols x rows)?

I'm pretty new to CNNs/RNNs and Keras and would appreciate any hint into the right direction.

1 Answer

Anurag · Answer 1 · 2019-07-23T14:05:31+0000

In the current version of Keras (v1.2.2), this layer is already included and can be imported using

from keras.layers.convolutional_recurrent import ConvLSTM2D

To use this layer, the video data has to be formatted as follows:

[nb_samples, nb_frames, width, height, channels] # if using dim_ordering = 'tf'
[nb_samples, nb_frames, channels, width, height] # if using dim_ordering = 'th'

For more details on this, study Artificial Intelligence Course.

Using Keras for video prediction (time series)

1 Answer

Related questions

Browse Categories

Browse By Domains

Popular Courses

Popular Tutorials

Popular Resources