Keras: Difference between Kernel and Activity regularizers

The activity regularizer works as a function of the output of the layer and is mostly used to regularize hidden units, while weight_regularizer, as the name says, works on the weights themselves (e.g. making them decay). In short, you can express the regularization loss either as a function of the layer's output (activity_regularizer) or as a function of its weights (weight_regularizer).
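Here is a minimal sketch showing both arguments on a Dense layer (the layer sizes and penalty factors are illustrative, not recommendations): the L2 penalty on the kernel shrinks the weight matrix, while the L1 penalty on the activity pushes the layer's outputs toward sparsity.

    # Sketch only: sizes and regularization factors are arbitrary examples.
    from keras.models import Sequential
    from keras.layers import Dense
    from keras import regularizers

    model = Sequential([
        Dense(
            64,
            activation="relu",
            input_shape=(20,),
            kernel_regularizer=regularizers.l2(1e-4),    # penalty on the weight matrix
            activity_regularizer=regularizers.l1(1e-5),  # penalty on the layer's output
        ),
        Dense(1),
    ])
    model.compile(optimizer="adam", loss="mse")

Both penalties are simply added to the total loss during training; they differ only in which tensor they are computed from.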

The new kernel_regularizer replaces weight_regularizer – although it’s not very clear from the documentation.

From the definition of kernel_regularizer:

kernel_regularizer: Regularizer function applied to the kernel weights matrix (see regularizer).

And activity_regularizer:

activity_regularizer: Regularizer function applied to the output of the layer (its "activation") (see regularizer).

Important Edit: Note that there is a bug in the activity_regularizer that was only fixed in version 2.1.4 of Keras (at least with the Tensorflow backend). In older versions, the activity regularizer function is applied to the input of the layer, instead of to the output (the actual activations of the layer, as intended). So beware: if you are using an older version of Keras (before 2.1.4), activity regularization will probably not work as intended.
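As a quick sanity check (a sketch, assuming the standalone keras package where __version__ is exposed), you can warn yourself when running on a pre-2.1.4 install:

    # Sketch: warn if the installed Keras predates the activity_regularizer fix.
    import keras
    from distutils.version import LooseVersion

    if LooseVersion(keras.__version__) < LooseVersion("2.1.4"):
        print("Warning: in Keras {} activity_regularizer is applied to the "
              "layer input, not its output.".format(keras.__version__))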

You can see the commit on GitHub

Five months ago François Chollet provided a fix to the activity regularizer, which was then included in Keras 2.1.4.
