I am trying to train a sequence tagging model (LSTM), where the sequence labels are either 1 (first class), 2 (second class), or 0 (don't care).
I tried to write my own loss function that ignores the zeros:
```python
import keras.backend as K

def my_loss(y_true, y_pred):
    """(sum([(t-p)**2 for t,p in zip(y_true, y_pred)]) / n_nonzero)**0.5"""
    mask = K.cast(y_true > 0, "float32")
    return K.sqrt(K.sum(K.square(y_pred * mask - y_true), axis=-1) / K.sum(mask))
```
This essentially computes a root-mean-squared error over the non-zero labels only.
However, I get loss=nan when training the model.
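I can reproduce the nan outside of Keras with plain NumPy (assuming the backend does the same arithmetic): as soon as a batch contains no non-zero labels, the denominator K.sum(mask) is zero and the division produces nan. The function name my_loss_np here is just my label for the NumPy rewrite:

```python
import numpy as np

def my_loss_np(y_true, y_pred):
    # NumPy translation of the K.* loss above (assumed equivalent)
    mask = (y_true > 0).astype("float32")
    return np.sqrt(
        np.sum(np.square(y_pred * mask - y_true), axis=-1) / np.sum(mask)
    )

# A batch in which every label is 0 ("don't care"):
# the mask sums to 0, so the loss is 0/0 = nan
y_true = np.array([[0.0, 0.0, 0.0]])
y_pred = np.array([[0.3, 0.1, 0.2]])
print(my_loss_np(y_true, y_pred))  # [nan]
```

With at least one non-zero label in the batch the same function returns a finite value, so the problem seems tied to all-zero batches.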
What am I doing wrong?
What is the standard way to ignore certain labels during training?