Mirror descent
December 29, 2019 — August 29, 2023
A first-order gradient method that can “adapt to the problem structure”. Q: How does it relate to natural gradient descent, which has a similar description?
Beck and Teboulle (2003):
The mirror descent algorithm (MDA) was introduced by Nemirovsky and Yudin for solving convex optimization problems. This method exhibits an efficiency estimate that is mildly dependent on the decision variables dimension, and thus suitable for solving very large-scale optimization problems. We present a new derivation and analysis of this algorithm. We show that the MDA can be viewed as a nonlinear projected-subgradient type method, derived from using a general distance-like function instead of the usual Euclidean squared distance. Within this interpretation, we derive in a simple way convergence and efficiency estimates. We then propose an Entropic mirror descent algorithm for convex minimization over the unit simplex, with a global efficiency estimate proven to be mildly dependent on the dimension of the problem.
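To make the entropic variant in that abstract concrete, here is a minimal sketch rather than the authors' own implementation. The general step replaces the squared Euclidean distance in projected subgradient descent with a Bregman divergence $D_\psi$: $x_{k+1} = \arg\min_{x \in X} \{ t_k \langle g_k, x \rangle + D_\psi(x, x_k) \}$, where $g_k$ is a subgradient at $x_k$. When $\psi$ is the negative entropy and $X$ is the unit simplex, this has the closed form $x_{k+1,i} \propto x_{k,i} \exp(-t_k g_{k,i})$, a multiplicative update followed by renormalization. The function name `entropic_mirror_descent`, its parameters, and the constant step size are my assumptions for illustration; the nonsmooth analysis usually takes a step size decaying like $1/\sqrt{k}$ instead.

```python
import math


def entropic_mirror_descent(grad, dim, steps=2000, step_size=0.5):
    """Sketch of entropic mirror descent over the unit simplex.

    grad      -- hypothetical callable returning a (sub)gradient as a list
    dim       -- dimension of the simplex
    steps     -- number of iterations (assumed, not tuned)
    step_size -- constant step size (an assumption; fine for smooth objectives)
    """
    # Start at the barycentre, which maximizes entropy.
    x = [1.0 / dim] * dim
    for _ in range(steps):
        g = grad(x)
        # Shifting by min(g) leaves the normalized result unchanged
        # but keeps every exponent non-positive, avoiding overflow.
        shift = min(g)
        w = [xi * math.exp(-step_size * (gi - shift)) for xi, gi in zip(x, g)]
        total = sum(w)
        # Renormalizing is the Bregman projection back onto the simplex.
        x = [wi / total for wi in w]
    return x


# Usage: minimize f(x) = 0.5 * ||x - p||^2 over the simplex, with p in its interior.
p = [0.7, 0.2, 0.1]
x_hat = entropic_mirror_descent(lambda x: [xi - pi for xi, pi in zip(x, p)], dim=3)
print(x_hat)
```

Because the update is multiplicative, the iterates stay strictly positive and sum to one without an explicit Euclidean projection, which is the sense in which the method adapts to the geometry of the simplex.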
Bubeck’s lectures are good: Bubeck (2019).
T Lienart, Mirror descent algorithm.