
I am trying to define my own RNNCell (an Echo State Network) in TensorFlow, according to the definition below:

x(t + 1) = tanh(Win*u(t) + W*x(t) + Wfb*y(t))

y(t) = Wout*z(t)

z(t) = [x(t), u(t)]

x is the state, u is the input, y is the output. Win, W, and Wfb are not trainable. All weights are randomly initialized, but W is modified like this: "Set a certain percentage of the elements of W to 0, and scale W to keep its spectral radius below 1.0."

I have this code to build the equations:

x = tf.Variable(tf.reshape(tf.zeros([N]), [-1, N]), trainable=False, name="state_vector")
W = tf.Variable(tf.random_normal([N, N], 0.0, 0.05), trainable=False)
# TODO: setup W according to the ESN paper
W_x = tf.matmul(x, W)

u = tf.placeholder("float", [None, K], name="input_vector")
W_in = tf.Variable(tf.random_normal([K, N], 0.0, 0.05), trainable=False)
W_in_u = tf.matmul(u, W_in)

z = tf.concat([x, u], 1)
W_out = tf.Variable(tf.random_normal([K + N, L], 0.0, 0.05))
y = tf.matmul(z, W_out)
W_fb = tf.Variable(tf.random_normal([L, N], 0.0, 0.05), trainable=False)
W_fb_y = tf.matmul(y, W_fb)

x_next = tf.tanh(W_in_u + W_x + W_fb_y)

y_ = tf.placeholder("float", [None, L], name="train_output")

My problem is two-fold. First, I don't know how to implement this as a subclass of RNNCell. Second, I don't know how to generate a W tensor according to the above specification.

Any help with either of these questions is greatly appreciated. Maybe I can figure out a way to prepare W, but I sure as hell don't understand how to implement my own RNN as a subclass of RNNCell.

1 Answer


Create the RNN cell. TensorFlow provides built-in support for LSTM, GRU (a slightly different architecture than LSTM), and simple RNN cells; there is no built-in ESN cell, so a custom one is written by subclassing tf.nn.rnn_cell.RNNCell and implementing its state_size and output_size properties and its __call__ method. Here is a complete example with the built-in LSTM cell:

import tensorflow as tf

# train_input, train_output, test_input, and test_output are assumed to be
# prepared elsewhere: sequences of length 20 with 1 feature each, and
# one-hot targets over 21 classes.
data = tf.placeholder(tf.float32, [None, 20, 1])
target = tf.placeholder(tf.float32, [None, 21])

num_hidden = 24
cell = tf.nn.rnn_cell.LSTMCell(num_hidden, state_is_tuple=True)
val, state = tf.nn.dynamic_rnn(cell, data, dtype=tf.float32)

# Take the output of the last time step.
val = tf.transpose(val, [1, 0, 2])
last = tf.gather(val, int(val.get_shape()[0]) - 1)

weight = tf.Variable(tf.truncated_normal([num_hidden, int(target.get_shape()[1])]))
bias = tf.Variable(tf.constant(0.1, shape=[target.get_shape()[1]]))
prediction = tf.nn.softmax(tf.matmul(last, weight) + bias)

cross_entropy = -tf.reduce_sum(target * tf.log(tf.clip_by_value(prediction, 1e-10, 1.0)))
optimizer = tf.train.AdamOptimizer()
minimize = optimizer.minimize(cross_entropy)

# Fraction of misclassified examples.
mistakes = tf.not_equal(tf.argmax(target, 1), tf.argmax(prediction, 1))
error = tf.reduce_mean(tf.cast(mistakes, tf.float32))

init_op = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init_op)

batch_size = 1000
no_of_batches = int(len(train_input) / batch_size)
epoch = 5000

for i in range(epoch):
    ptr = 0
    for j in range(no_of_batches):
        inp, out = train_input[ptr:ptr + batch_size], train_output[ptr:ptr + batch_size]
        ptr += batch_size
        sess.run(minimize, {data: inp, target: out})
    print("Epoch - ", str(i))

incorrect = sess.run(error, {data: test_input, target: test_output})
print('Epoch {:2d} error {:3.1f}%'.format(i + 1, 100 * incorrect))
sess.close()
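The asker's second problem, generating W with a given sparsity and spectral radius, is simplest to do in NumPy before handing the matrix to TensorFlow. A minimal sketch, where the helper name make_reservoir_weights and the default connectivity and spectral_radius values are illustrative assumptions, not values from the ESN paper:

```python
import numpy as np

def make_reservoir_weights(n, connectivity=0.1, spectral_radius=0.9, seed=0):
    """Random N x N reservoir matrix: mostly zeros, rescaled so its
    spectral radius (largest |eigenvalue|) equals spectral_radius < 1."""
    rng = np.random.RandomState(seed)
    W = rng.normal(0.0, 0.05, size=(n, n))
    # Set a certain percentage of the elements of W to 0.
    mask = rng.rand(n, n) < connectivity
    W = W * mask
    # Scale W to keep its spectral radius below 1.0 (echo state property).
    radius = np.max(np.abs(np.linalg.eigvals(W)))
    if radius > 0:
        W *= spectral_radius / radius
    return W

W_np = make_reservoir_weights(100)
```

The result can then be used as the initial value of a non-trainable variable, e.g. tf.Variable(W_np.astype(np.float32), trainable=False).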

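For the first problem, the update a custom ESN cell would compute inside its __call__ method is just the three equations from the question, and it can be checked framework-free. A minimal NumPy sketch of one step (esn_step and the sizes K, N, L are illustrative; the order of operations follows the question's own code, which computes y(t) from z(t) = [x(t), u(t)] and then x(t+1) = tanh(Win·u(t) + W·x(t) + Wfb·y(t))):

```python
import numpy as np

K, N, L = 3, 50, 2  # input, reservoir, and output sizes (question's notation)
rng = np.random.RandomState(0)

# Fixed (non-trainable) weights, as in the question.
W_in = rng.normal(0.0, 0.05, (K, N))
W = rng.normal(0.0, 0.05, (N, N))
W_fb = rng.normal(0.0, 0.05, (L, N))
# The only trainable weight: the readout.
W_out = rng.normal(0.0, 0.05, (K + N, L))

def esn_step(x, u):
    """One ESN update for a batch: returns (x_next, y)."""
    z = np.concatenate([x, u], axis=1)              # z(t) = [x(t), u(t)]
    y = z @ W_out                                   # y(t) = Wout * z(t)
    x_next = np.tanh(u @ W_in + x @ W + y @ W_fb)   # x(t+1)
    return x_next, y

x = np.zeros((1, N))
u = rng.normal(size=(1, K))
x, y = esn_step(x, u)
```

In a tf.nn.rnn_cell.RNNCell subclass, x would be the cell state, u the input, and (x_next, y) the values returned from __call__, with state_size = N and output_size = L.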
