How do I set TensorFlow RNN state when state_is_tuple=True?

Question

asked Jul 25, 2019 in Machine Learning by ParasSharma1 (19k points)

I have written an RNN language model using TensorFlow. The model is implemented as an RNN class. The graph structure is built in the constructor, while RNN.train and RNN.test methods run it.

I want to be able to reset the RNN state when I move to a new document in the training set, or when I want to run a validation set during training. I do this by managing the state inside the training loop, passing it into the graph via a feed dictionary.

In the constructor I define the RNN like so:

cell = tf.nn.rnn_cell.LSTMCell(hidden_units)
rnn_layers = tf.nn.rnn_cell.MultiRNNCell([cell] * layers)
self.reset_state = rnn_layers.zero_state(batch_size, dtype=tf.float32)
self.state = tf.placeholder(tf.float32, self.reset_state.get_shape(), "state")
self.outputs, self.next_state = tf.nn.dynamic_rnn(
    rnn_layers, self.embedded_input, time_major=True,
    initial_state=self.state)

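The training loop looks like this:

for document in documents:
    # Start every document from the zero state
    state = session.run(self.reset_state)
    for x, y in document:
        # session.run returns one value per fetch; keep only the new state
        _, state = session.run([self.train_step, self.next_state],
                               feed_dict={self.x: x, self.y: y, self.state: state})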

x and y are batches of training data in a document. The idea is that I pass the latest state along after each batch, except when I start a new document, when I zero out the state by running self.reset_state.

This all works. Now I want to change my RNN to use the recommended state_is_tuple=True. However, I don't know how to pass the more complicated LSTM state object via a feed dictionary. Also I don't know what arguments to pass to the self.state = tf.placeholder(...) line in my constructor.

What is the correct strategy here? There still isn't much example code or documentation for dynamic_rnn available.

1 Answer

Anurag · Answer 1 · 2019-07-25T12:21:23+0000

The difficulty is that a single TensorFlow placeholder can only be fed a Python list or a NumPy array, so you can't pass the state between runs as a tuple of LSTMStateTuple objects through one placeholder.

I solved this by saving the state in a tensor.

For example:

initial_state = np.zeros((num_layers, 2, batch_size, state_size))

The "2" comes from the two components of an LSTM layer's state: the cell state and the hidden state.
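The layout of that array can be checked with NumPy alone. This is a minimal sketch; the sizes are illustrative values and are not taken from the original post. It assumes index 0 on the second axis holds the cell state and index 1 the hidden state, which matches the (c, h) field order of LSTMStateTuple.

```python
import numpy as np

# Illustrative sizes only; pick the values your model actually uses.
num_layers, batch_size, state_size = 2, 3, 4

# One array holds the state of every layer.
# Axis 0: layer index; axis 1: (cell state c, hidden state h).
initial_state = np.zeros((num_layers, 2, batch_size, state_size), dtype=np.float32)

# Per-layer views: [layer, 0] is c and [layer, 1] is h.
c_layer0 = initial_state[0, 0]
h_layer0 = initial_state[0, 1]

print(initial_state.shape)  # (2, 2, 3, 4)
print(c_layer0.shape)       # (3, 4)
```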

When building the graph you unpack and create the tuple state like this:

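state_placeholder = tf.placeholder(tf.float32, [num_layers, 2, batch_size, state_size])
# One [2, batch_size, state_size] tensor per layer
# (tf.unstack was called tf.unpack before TensorFlow 1.0)
l = tf.unstack(state_placeholder, axis=0)
# Index 0 is the cell state c, index 1 is the hidden state h
rnn_tuple_state = tuple(
    [tf.nn.rnn_cell.LSTMStateTuple(l[idx][0], l[idx][1])
     for idx in range(num_layers)]
)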

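Then you get the new state the usual way:

# Build one LSTMCell object per layer; reusing a single cell object
# across layers can fail or share weights in later TF 1.x releases
cells = [tf.nn.rnn_cell.LSTMCell(state_size, state_is_tuple=True)
         for _ in range(num_layers)]
cell = tf.nn.rnn_cell.MultiRNNCell(cells, state_is_tuple=True)
outputs, state = tf.nn.dynamic_rnn(cell, series_batch_input, initial_state=rnn_tuple_state)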
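To carry the state from one batch to the next, the tuple state fetched from the graph has to be packed back into the single array that the placeholder expects. The sketch below shows that round trip without TensorFlow so it can be run on its own. The namedtuple is a stand-in for tf.nn.rnn_cell.LSTMStateTuple, and `pack_state` and `fake_run` are hypothetical helpers introduced here for illustration; `fake_run` merely imitates fetching `state` with session.run, which, as far as I know, returns the same nested tuple structure filled with NumPy arrays.

```python
from collections import namedtuple

import numpy as np

# Stand-in for tf.nn.rnn_cell.LSTMStateTuple so the sketch runs without TensorFlow.
LSTMStateTuple = namedtuple("LSTMStateTuple", ["c", "h"])


def pack_state(tuple_state):
    """Stack a per-layer tuple of (c, h) arrays into [num_layers, 2, batch, size]."""
    return np.stack([np.stack([layer.c, layer.h]) for layer in tuple_state])


def fake_run(state_array):
    """Hypothetical stand-in for session.run(state, feed_dict={...}).

    Returns a new tuple state, one LSTMStateTuple per layer.
    """
    return tuple(
        LSTMStateTuple(c=layer[0] + 1.0, h=layer[1] + 2.0)
        for layer in state_array
    )


# Illustrative sizes only.
num_layers, batch_size, state_size = 2, 3, 4

# Zero state at the start of a document.
state = np.zeros((num_layers, 2, batch_size, state_size), dtype=np.float32)

for _ in range(3):                  # one iteration per batch
    next_state = fake_run(state)    # tuple of LSTMStateTuple values
    state = pack_state(next_state)  # back to the placeholder's shape

print(state.shape)  # (2, 2, 3, 4)
```

In the real training loop, the packed array would be fed as `feed_dict={state_placeholder: state}` on the next batch and replaced with a fresh `np.zeros(...)` array at the start of each new document.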

