What does opt.apply_gradients() do in TensorFlow?

Question

The update rule that apply_gradients actually applies depends on the specific optimizer. Take a look at the implementation of apply_gradients in the tf.train.Optimizer class: it relies on the derived classes to implement the update rule in the methods _apply_dense and _apply_sparse. The update rule you are referring to is the one implemented by GradientDescentOptimizer.
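The division of labour described above can be illustrated with a toy model in plain Python. This is only a simplified sketch, not TensorFlow's code: ToyOptimizer, ToyGradientDescent, the list-based "variables" and the dict-based "sparse gradient" are hypothetical stand-ins, and the real methods build graph ops instead of returning numbers.

```python
class ToyOptimizer:
    """Base class: owns the loop, delegates the update rule to subclasses."""

    def apply_gradients(self, grads_and_vars):
        results = []
        for grad, var in grads_and_vars:
            # Pick the sparse or dense hook depending on the gradient type.
            if isinstance(grad, dict):
                results.append(self._apply_sparse(grad, var))
            else:
                results.append(self._apply_dense(grad, var))
        return results

    def _apply_dense(self, grad, var):
        raise NotImplementedError

    def _apply_sparse(self, grad, var):
        raise NotImplementedError


class ToyGradientDescent(ToyOptimizer):
    def __init__(self, learning_rate):
        self._learning_rate = learning_rate

    def _apply_dense(self, grad, var):
        # x <- x - eta * g for every element
        return [x - self._learning_rate * g for x, g in zip(var, grad)]

    def _apply_sparse(self, grad, var):
        # grad maps index -> gradient value; other entries stay unchanged
        return [x - self._learning_rate * grad.get(i, 0.0)
                for i, x in enumerate(var)]


opt = ToyGradientDescent(learning_rate=0.5)
dense_result, sparse_result = opt.apply_gradients([
    ([2.0, 4.0], [1.0, 1.0]),   # dense gradient for a two-element variable
    ({1: 4.0}, [1.0, 1.0]),     # sparse gradient touching only index 1
])
```

The base class never needs to know the update rule; swapping in a different subclass changes the rule without touching apply_gradients.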

Regarding your desired positive additive update: if what you are calling opt is an instance of GradientDescentOptimizer, then you can indeed achieve what you want with:

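grads_and_vars = opt.compute_gradients(E, [v])
eta = opt._learning_rate  # private attribute of GradientDescentOptimizer
# Shift each gradient by p/eta so that the step -eta*(g - p/eta) equals -eta*g + p.
# p is assumed to be the desired additive update from the question.
my_grads_and_vars = [(g - (1.0 / eta) * p, var) for g, var in grads_and_vars]
opt.apply_gradients(my_grads_and_vars)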
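To see why shifting the gradient by p/eta works, the arithmetic can be checked with ordinary floats. The numbers below are made up for illustration; the learning rate is a power of two so that the floating-point comparison is exact.

```python
eta = 0.5   # learning rate (hypothetical value)
v = 3.0     # current value of the variable
g = 4.0     # gradient dE/dv
p = 0.25    # desired additive change

shifted_g = g - (1 / eta) * p   # the modified gradient
v_new = v - eta * shifted_g     # plain gradient descent step on it

# The result is the usual descent step plus the additive change p.
assert v_new == (v - eta * g) + p
```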

The more elegant way to do this is probably to write a new optimizer (inheriting from tf.train.Optimizer) that implements your desired update rule directly.
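As a rough idea of the shape such a subclass could take, here is a toy version in plain Python. It is a sketch under stated assumptions, not TensorFlow code: ToyOptimizer stands in for tf.train.Optimizer, ShiftedGradientDescent and its shift argument are hypothetical names, and scalars replace tensors. In TensorFlow itself the subclass would override _apply_dense and _apply_sparse and return update ops.

```python
class ToyOptimizer:
    """Stand-in for the base class: loops over pairs, delegates the rule."""

    def apply_gradients(self, grads_and_vars):
        return [self._apply_dense(g, v) for g, v in grads_and_vars]

    def _apply_dense(self, grad, var):
        raise NotImplementedError


class ShiftedGradientDescent(ToyOptimizer):
    """Applies x <- x - eta * g + p directly, without rewriting gradients."""

    def __init__(self, learning_rate, shift):
        self._learning_rate = learning_rate
        self._shift = shift

    def _apply_dense(self, grad, var):
        return var - self._learning_rate * grad + self._shift


opt = ShiftedGradientDescent(learning_rate=0.5, shift=0.25)
(v_new,) = opt.apply_gradients([(4.0, 3.0)])  # one (gradient, variable) pair
```

Putting the shift inside the update rule avoids reaching into the private _learning_rate attribute from calling code.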
