in AI and Deep Learning by (50.2k points)

I have an input image with greyscale values ranging from, let's say, 25000 to 35000. I'm doing binary pixel-wise classification, so the ground-truth output is a matrix of 0s and 1s.

Does anyone know what the default output activation function is? Is it a ReLU? I want it to be a softmax function, in which case each prediction value would be between 0 and 1 (close to my ground-truth data).
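To illustrate the property asked about here (softmax squashing arbitrary logits into values between 0 and 1 that sum to 1 per pixel), a minimal NumPy sketch with made-up logit values:

```python
import numpy as np

def softmax(logits):
    # Subtract the row max for numerical stability, then normalize.
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    exps = np.exp(shifted)
    return exps / np.sum(exps, axis=-1, keepdims=True)

# Hypothetical per-pixel logit pairs (class 0 vs. class 1).
logits = np.array([[2.0, -1.0],
                   [0.5,  3.0]])
probs = softmax(logits)
# Every entry lies in (0, 1) and each row sums to 1,
# so the outputs are directly comparable to 0/1 ground truth.
```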

I'm using example code from here, which I have adjusted to work with my data.

I have a working network that trains, but the minibatch loss is currently about 425 with an accuracy of 0.0, whereas for the linked LSTM MNIST example code the minibatch loss was about 0.1 and the accuracy about 1.0. I hope that if I can change the activation function to softmax, I can improve the results.

1 Answer

by (108k points)

The activation functions here apply to the individual LSTM cells. The output layer is defined outside the RNN/LSTM code; in fact, it uses an explicitly created softmax as part of the cross-entropy cost function. The default activation function for BasicLSTMCell is tf.tanh. You can change it by passing the optional activation argument when creating the BasicLSTMCell object; it accepts any TensorFlow op that takes a single input and produces a single output of the same shape.
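To make "softmax as part of the cross-entropy cost" concrete, here is a NumPy sketch of the fused computation (this is what TensorFlow's tf.nn.softmax_cross_entropy_with_logits performs internally; the logit and label values below are made up for illustration):

```python
import numpy as np

def softmax_cross_entropy(logits, labels):
    # Numerically stable log-softmax: shift by the row max before exponentiating.
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.sum(np.exp(shifted), axis=-1, keepdims=True))
    # Cross-entropy against one-hot labels, one loss value per example/pixel.
    return -np.sum(labels * log_probs, axis=-1)

# Hypothetical logits for two pixels and their one-hot binary labels.
logits = np.array([[4.0, -2.0],
                   [1.0,  1.5]])
labels = np.array([[1.0, 0.0],
                   [0.0, 1.0]])
loss = softmax_cross_entropy(logits, labels)
# loss[0] is small (confident, correct); loss[1] is larger (less confident).
```

In the linked TF 1.x example the same pattern appears as tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=pred, labels=y)), while changing the cell activation would look like BasicLSTMCell(n_hidden, activation=tf.nn.relu) — assuming the TF 1.x API the example uses.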
