
First of all, I realize from a methodological standpoint why a loss function must depend on the output of a neural network. This question comes more from an experiment I've been doing while trying to understand Keras and TensorFlow a bit better. Consider the following:

import numpy as np
from keras.layers import Input, Dense
from keras.models import Model
from keras import backend as K

input_1 = Input((5,))
hidden_a = Dense(2)(input_1)
output = Dense(1)(hidden_a)

m3 = Model(input_1, output)

def myLoss(y_true, y_pred):
    return K.sum(hidden_a)                    # (A)
    #return K.sum(hidden_a) + 0*K.sum(y_pred) # (B)

m3.compile(optimizer='adam', loss=myLoss)

x = np.random.random(size=(10, 5))
y = np.random.random(size=(10, 1))
m3.fit(x, y, epochs=25)

This code raises:

ValueError: An operation has `None` for gradient. Please make sure that all of your ops have a gradient defined (i.e. are differentiable). Common ops without gradient: K.argmax, K.round, K.eval.

but it runs if you swap line (A) for line (B), even though nothing has changed numerically.

The former case seems like it should be perfectly fine to me. The computation graph is well-defined, and the loss should be differentiable with respect to the weights. But it seems that Keras requires y_pred to appear in the loss function somehow, regardless of whether it has any effect.

Thanks!

1 Answer


Keras does not require y_pred to appear in the loss function. It does, however, require that every trainable variable be reachable from the loss.

When you call m3.fit(), Keras computes the gradient of the loss with respect to each of the model's trainable weights. If the loss does not depend on some of the variables in the trainable_weights collection, the gradient for those variables is undefined (None), and the gradient computation fails with the error above.
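You can see this directly. Here is a minimal diagnostic sketch (an assumption on my part, not part of the original question: it relies on the TF1-era Keras backend, where K.gradients wraps tf.gradients and returns None for unconnected variables):

# Hypothetical check: which trainable weights does loss (A) actually reach?
loss_a = K.sum(hidden_a)
grads = K.gradients(loss_a, m3.trainable_weights)

# The output Dense layer's kernel and bias are not on the path from
# the inputs through hidden_a to loss_a, so their gradients come back
# as None -- exactly what the ValueError complains about.
print([g is None for g in grads])  # e.g. [False, False, True, True]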

You can avoid this by referencing y_pred in the loss, even if the reference does nothing numerically (that is what loss (B) does). Alternatively, you can freeze the layers the loss does not reach, since the optimizer cannot update them anyway.

So in your case, you just have to freeze your output layer:

output = Dense(1, trainable=False)(hidden_a)
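Putting it together, here is a minimal end-to-end sketch of the question's script with that one change (assuming standalone Keras on a TensorFlow 1.x backend, as in the original code):

import numpy as np
from keras.layers import Input, Dense
from keras.models import Model
from keras import backend as K

input_1 = Input((5,))
hidden_a = Dense(2)(input_1)
output = Dense(1, trainable=False)(hidden_a)  # frozen, so no gradient is needed for it

m3 = Model(input_1, output)

def myLoss(y_true, y_pred):
    return K.sum(hidden_a)  # loss (A), now valid: it reaches every trainable weight

m3.compile(optimizer='adam', loss=myLoss)

x = np.random.random(size=(10, 5))
y = np.random.random(size=(10, 1))
m3.fit(x, y, epochs=25)

With the output layer frozen, the only trainable weights left are those of hidden_a, which the loss does reach, so every gradient is defined and training proceeds.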

