I was wondering what the difference is between an Activation layer and a Dense layer in Keras.
Since the Activation layer seems to act like a fully connected layer, and Dense has a parameter to pass an activation function, what is the best practice?
Let's imagine a fictional network like this: Input -> Dense -> Dropout -> Final Layer. Should the final layer be Dense(activation='softmax') or Dense followed by Activation('softmax')? Which is the cleanest, and why?
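To make the two options concrete, here is a minimal sketch of what I mean (the layer sizes, input shape, and model names are just placeholders):

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Activation

# Option A: activation passed directly to the final Dense layer
model_a = Sequential([
    Dense(64, activation='relu', input_shape=(100,)),
    Dropout(0.5),
    Dense(10, activation='softmax'),
])

# Option B: final Dense is linear, followed by a separate Activation layer
model_b = Sequential([
    Dense(64, activation='relu', input_shape=(100,)),
    Dropout(0.5),
    Dense(10),
    Activation('softmax'),
])
```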
Thanks, everyone!