Using Dense(activation=softmax) is computationally equivalent to first adding a Dense layer and then adding Activation(softmax). However, the second approach has one advantage: you can retrieve the outputs of the last layer (before activation) from a model defined this way. With the first approach, that is not directly possible.
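Here is a minimal sketch of the second approach, assuming tf.keras and the functional API. The layer sizes, the random input, and the layer names "logits" and "probs" are made up for illustration; they are not from the question.

```python
import numpy as np
from tensorflow import keras

# Second approach: a Dense layer with no activation, then a separate Activation layer.
# Shapes and layer names here are illustrative assumptions.
inputs = keras.Input(shape=(4,))
logits = keras.layers.Dense(3, name="logits")(inputs)             # linear output
probs = keras.layers.Activation("softmax", name="probs")(logits)  # softmax on top
model = keras.Model(inputs, probs)

# Because the pre-activation tensor belongs to its own layer, a second model
# sharing the same weights can stop right before the softmax.
logit_model = keras.Model(inputs, model.get_layer("logits").output)

x = np.random.rand(2, 4).astype("float32")
p = model.predict(x, verbose=0)        # probabilities, shape (2, 3)
z = logit_model.predict(x, verbose=0)  # pre-activation outputs, shape (2, 3)

# Sanity check: applying softmax by hand to z should reproduce p.
e = np.exp(z - z.max(axis=1, keepdims=True))
manual = e / e.sum(axis=1, keepdims=True)
```

With Dense(3, activation="softmax") instead, the layer's output tensor is already the softmax result, so there is no intermediate layer output to point a second model at.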