+1 vote
in Machine Learning by (4.2k points)

I was wondering what the difference is between the Activation layer and the Dense layer in Keras.

Since the Activation layer seems to be a fully connected layer, and Dense has a parameter for passing an activation function, what is the best practice?

Let's imagine a fictional network like this: Input -> Dense -> Dropout -> Final Layer. Should the final layer be Dense(activation=softmax) or Activation(softmax)? Which is cleaner, and why?

Thanks, everyone!

2 Answers

+1 vote
by (160 points)

The best practice is to avoid using the softmax function in the hidden layers of a neural net. Softmax maps its inputs to values in the range (0, 1) that sum to 1, so its output can be read as the probability of each label. That is why softmax is generally used at the last layer of the network.
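To make the (0, 1) range concrete, here is a minimal pure-Python sketch of softmax. The function name and the sample scores are made up for illustration; this is not how Keras implements it internally.

```python
import math

def softmax(scores):
    """Turn a list of raw scores into values in (0, 1) that sum to 1."""
    # Subtracting the max does not change the result but avoids overflow in exp().
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw outputs of a final layer for three classes.
raw_scores = [2.0, -1.0, 0.5]
probs = softmax(raw_scores)
print(probs)       # three values, each strictly between 0 and 1
print(sum(probs))  # approximately 1.0
```

The largest raw score still gets the largest probability, so the predicted class is unchanged; only the scale is.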

Moreover, if you use Dense(activation=softmax), Keras creates the dense layer and applies softmax on top of it inside the same layer, so the model returns the result directly. You won't be able to retrieve the raw (pre-softmax) outputs of the last layer from that model; you only get the class probabilities.

Hope this helps.

+1 vote
by (6.8k points)

Using Dense(activation=softmax) is computationally equivalent to first adding a Dense layer and then adding Activation(softmax). However, the second approach has one advantage: you can retrieve the outputs of the last layer (before activation) from a model defined that way. With the first approach, those pre-activation values are not exposed as a layer output.
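Here is a rough sketch of both variants side by side, assuming TensorFlow's bundled Keras (`tensorflow.keras`). The layer sizes, the input shape, and the names `fused`, `split`, `logits` and `logit_model` are made up for illustration and are not from the question.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

def build(separate_activation):
    """Input -> Dense -> Dropout -> final layer, in either style."""
    inputs = keras.Input(shape=(8,))
    h = layers.Dense(16, activation="relu")(inputs)
    h = layers.Dropout(0.5)(h)
    if separate_activation:
        # Linear Dense, then softmax as its own layer.
        logits = layers.Dense(3, name="logits")(h)
        outputs = layers.Activation("softmax")(logits)
    else:
        # Softmax applied inside the Dense layer.
        outputs = layers.Dense(3, activation="softmax")(h)
    return keras.Model(inputs, outputs)

fused = build(separate_activation=False)
split = build(separate_activation=True)

# Give both models identical weights so their outputs can be compared.
split.set_weights(fused.get_weights())

x = np.random.rand(4, 8).astype("float32")
probs_fused = fused.predict(x, verbose=0)
probs_split = split.predict(x, verbose=0)
print(np.allclose(probs_fused, probs_split, atol=1e-6))

# Only the split model has a layer whose output is the pre-softmax values.
logit_model = keras.Model(split.inputs, split.get_layer("logits").output)
logits = logit_model.predict(x, verbose=0)
print(logits.shape)
```

Both models have the same trainable weights, since the Activation layer adds none. The difference is only in which intermediate tensors you can reach afterwards.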

