
I have a model trained using Keras with TensorFlow as the backend, and I now need to turn it into a TensorFlow graph for a certain application. I tried this and ran a prediction to check that it works, but the values I get are very different from what model.predict() returns. For instance:

import numpy as np
import tensorflow as tf
from keras.models import load_model

# load the trained Keras model
model = load_model('model_file.h5')

# call the model on a placeholder to get its output as a TF tensor
x_placeholder = tf.placeholder(tf.float32, shape=(None,7214,1))
y = model(x_placeholder)

x = np.ones((1,7214,1))
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print("Predictions from:\ntf graph:      "+str(sess.run(y, feed_dict={x_placeholder:x})))
    print("keras predict: "+str(model.predict(x)))

returns:

Predictions from:
tf graph:      [[-0.1015993   0.07432419  0.0592984 ]]
keras predict: [[ 0.39339241  0.57949686 -3.67846966]]

The values from Keras predict are correct, but the tf graph results are not.

If it helps to know the final intended application: I am building a Jacobian matrix with the tf.gradients() function, but it currently does not match the output of Theano's jacobian function, which gives the correct Jacobian. Here is my TensorFlow Jacobian code:

x = tf.placeholder(tf.float32, shape=(None,7214,1))
y = tf.reshape(model(x)[0],[-1])
y_list = tf.unstack(y)
jacobian_list = [tf.gradients(y_, x)[0] for y_ in y_list]
jacobian = tf.stack(jacobian_list)
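
One framework-independent way to check a Jacobian like the one above is a central finite-difference estimate. The sketch below is only an illustration: the function name numerical_jacobian and the step size eps are hypothetical choices, not part of Keras, TensorFlow, or Theano, and it assumes f is any callable that maps an array to an array.

```python
import numpy as np


def numerical_jacobian(f, x, eps=1e-6):
    """Estimate the Jacobian of f at x with central differences.

    Hypothetical helper for sanity checks. Returns an array of shape
    (number of outputs, number of inputs), with inputs and outputs flattened.
    """
    x = np.asarray(x, dtype=np.float64)
    flat = x.ravel()
    n_outputs = np.asarray(f(x)).size
    jac = np.zeros((n_outputs, flat.size))
    for i in range(flat.size):
        # perturb one input element at a time, in both directions
        step = np.zeros_like(flat)
        step[i] = eps
        y_plus = np.asarray(f((flat + step).reshape(x.shape))).ravel()
        y_minus = np.asarray(f((flat - step).reshape(x.shape))).ravel()
        jac[:, i] = (y_plus - y_minus) / (2.0 * eps)
    return jac


# Toy check: for a linear map f(v) = W @ v the Jacobian is W itself.
W = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])
J = numerical_jacobian(lambda v: W @ v, np.ones(3))
```

For the model in this question, f could be something like lambda v: model.predict(v.reshape(1, 7214, 1))[0], which would give a (3, 7214) matrix at the cost of 2 * 7214 forward passes. A float32 network will probably need a larger eps than the default used here.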

EDIT: Model code

import numpy as np

from keras.models import Sequential
from keras.layers import Dense, InputLayer, Flatten
from keras.layers.convolutional import Conv1D
from keras.layers.convolutional import MaxPooling1D
from keras.optimizers import Adam
from keras.callbacks import EarlyStopping, ReduceLROnPlateau

# activation function used following every layer except for the output layers
activation = 'relu'

# model weight initializer
initializer = 'he_normal'

# shape of input data that is fed into the input layer
input_shape = (None,7214,1)

# number of filters used in the convolutional layers
num_filters = [4,16]

# length of the filters in the convolutional layers
filter_length = 8

# length of the maxpooling window 
pool_length = 4

# number of nodes in each of the hidden fully connected layers
num_hidden_nodes = [256,128]

# number of samples fed into model at once during training
batch_size = 64

# maximum number of epochs for model training
max_epochs = 30

# initial learning rate for optimization algorithm
lr = 0.0007

# exponential decay rate for the 1st moment estimates for optimization algorithm
beta_1 = 0.9

# exponential decay rate for the 2nd moment estimates for optimization algorithm
beta_2 = 0.999

# a small constant for numerical stability for optimization algorithm
optimizer_epsilon = 1e-08

model = Sequential([

    InputLayer(batch_input_shape=input_shape),

    Conv1D(kernel_initializer=initializer, activation=activation, padding="same", filters=num_filters[0], kernel_size=filter_length),

    Conv1D(kernel_initializer=initializer, activation=activation, padding="same", filters=num_filters[1], kernel_size=filter_length),

    MaxPooling1D(pool_size=pool_length),

    Flatten(),

    Dense(units=num_hidden_nodes[0], kernel_initializer=initializer, activation=activation),

    Dense(units=num_hidden_nodes[1], kernel_initializer=initializer, activation=activation),

    Dense(units=3, activation="linear", input_dim=num_hidden_nodes[1]),
]) 

# compile model
loss_function = 'mean_squared_error'
early_stopping_min_delta = 0.0001
early_stopping_patience = 4
reduce_lr_factor = 0.5
reduce_lr_epsilon = 0.0009
reduce_lr_patience = 2
reduce_lr_min = 0.00008

optimizer = Adam(lr=lr, beta_1=beta_1, beta_2=beta_2, epsilon=optimizer_epsilon, decay=0.0)

early_stopping = EarlyStopping(monitor='val_loss', min_delta=early_stopping_min_delta,
                               patience=early_stopping_patience, verbose=2, mode='min')

reduce_lr = ReduceLROnPlateau(monitor='loss', factor=reduce_lr_factor, epsilon=reduce_lr_epsilon,
                              patience=reduce_lr_patience, min_lr=reduce_lr_min, mode='min', verbose=2)

model.compile(optimizer=optimizer, loss=loss_function)

model.fit(train_x, train_y, validation_data=(cv_x, cv_y),
      epochs=max_epochs, batch_size=batch_size, verbose=2,
      callbacks=[reduce_lr,early_stopping])

model.save('model_file.h5')

1 Answer


With the TensorFlow backend, your Keras code is already building a TF graph, so you can use that graph directly instead of creating a new one.

By default, Keras uses one graph and one session. You can access the session via K.get_session(), where K is keras.backend. The graph associated with it is then K.get_session().graph.
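
A minimal sketch of that idea, assuming the TF 1.x and standalone Keras APIs used in the question. The likely cause of the mismatch is that the question's snippet opens a fresh tf.Session and runs tf.global_variables_initializer() in it, which resets the variables to their initial values, so the trained weights are not the ones being evaluated. Running the output tensor in the session Keras already holds avoids that. The helper names predict_via_graph and compare_graph_and_keras are hypothetical, not part of either library.

```python
import numpy as np


def predict_via_graph(session, output_tensor, input_placeholder, x):
    """Evaluate output_tensor in an existing session, feeding x to the placeholder.

    Hypothetical helper: no new session is created and no initializer is run,
    so whatever weights the session already holds are the ones used.
    """
    return session.run(output_tensor, feed_dict={input_placeholder: x})


def compare_graph_and_keras(model_path='model_file.h5'):
    """Hypothetical usage, assuming TF 1.x with standalone Keras installed."""
    import tensorflow as tf
    from keras import backend as K
    from keras.models import load_model

    model = load_model(model_path)
    x_placeholder = tf.placeholder(tf.float32, shape=(None, 7214, 1))
    y = model(x_placeholder)
    x = np.ones((1, 7214, 1))

    # reuse the session that holds the loaded weights
    sess = K.get_session()
    graph_out = predict_via_graph(sess, y, x_placeholder, x)
    keras_out = model.predict(x)
    return graph_out, keras_out
```

If this assumption about the cause is right, the two arrays returned by compare_graph_and_keras() should agree up to floating-point error. The same session can then be used to evaluate the tf.gradients() tensors for the Jacobian.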

These related threads may also be relevant to you:

https://intellipaat.com/community/4050/role-of-flatten-in-keras
https://intellipaat.com/community/3409/how-to-concatenate-two-layers-in-keras
