Adaptive Single-Loop Methods for Stochastic Minimax Optimization on Riemannian Manifolds
Stochastic minimax optimization on Riemannian manifolds has recently attracted significant attention due to its broad range of applications, such as robust training of neural networks and robust maximum likelihood estimation. Existing optimization methods for these problems typically require selecting stepsizes based on prior knowledge of problem-specific parameters, such as Lipschitz-type constants and (geodesic) strong concavity constants. Unfortunately, these parameters are often unknown in practice. To overcome this issue, we develop single-loop adaptive methods that automatically adjust stepsizes using cumulative Riemannian (stochastic) gradient norms. We first propose a deterministic single-loop Riemannian adaptive gradient descent ascent method and show that it attains an $ε$-stationary point within $O(ε^{-2})$ iterations. This deterministic method is of independent interest and lays the foundation for our subsequent stochastic method. In particular, we propose the Riemannian stochastic adaptive gradient descent ascent method, which finds an $ε$-stationary point in $O(ε^{-6})$ iterations. Under additional second-order smoothness, this iteration complexity further improves to $O(ε^{-4})$, which even outperforms the corresponding complexity result in Euclidean space. Numerical experiments on real-world applications, including regularized robust maximum likelihood estimation and robust training of neural networks with orthonormal weights, demonstrate the effectiveness of adaptivity in practice.
💡 Research Summary
This paper addresses stochastic minimax optimization on Riemannian manifolds, a problem class that arises in robust neural-network training and robust maximum-likelihood estimation. Existing algorithms for such problems typically require prior knowledge of problem-specific constants, such as Lipschitz smoothness and strong concavity parameters, in order to set stepsizes, which is unrealistic in practice. The authors propose single-loop adaptive methods that automatically tune stepsizes using cumulative Riemannian (stochastic) gradient norms, thereby removing the need for prior knowledge of these constants.
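Concretely, stepsize rules driven by cumulative gradient norms typically take an AdaGrad-Norm form. The display below is one plausible instantiation for the descent variable, written as an illustrative assumption rather than the paper's exact rule (the base stepsize $η_0$ and offset $β_0$ are hypothetical constants):

$$\eta_t = \frac{\eta_0}{\sqrt{\beta_0 + \sum_{s=0}^{t} \left\| \operatorname{grad} f(x_s, y_s) \right\|_{x_s}^2}},$$

where $\operatorname{grad} f$ denotes the Riemannian gradient and $\|\cdot\|_{x_s}$ is the norm induced by the Riemannian metric at $x_s$; in the stochastic setting, the gradient is replaced by its stochastic estimate.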
The work proceeds in two stages. First, a deterministic algorithm called Riemannian Adaptive Gradient Descent-Ascent (RAGDA) is introduced for the problem

$$\min_{x \in \mathcal{M}} \max_{y \in \mathcal{Y}} f(x, y),$$

where $\mathcal{M}$ is a Riemannian manifold; RAGDA is shown to reach an $ε$-stationary point within $O(ε^{-2})$ iterations. Second, its stochastic counterpart, the Riemannian stochastic adaptive gradient descent ascent method, handles the setting where only stochastic Riemannian gradients are available and finds an $ε$-stationary point in $O(ε^{-6})$ iterations, which improves to $O(ε^{-4})$ under additional second-order smoothness.
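To make the single-loop structure concrete, the sketch below implements an adaptive gradient descent-ascent loop of this flavor in Python on the unit sphere, with both stepsizes driven by cumulative squared gradient norms. It is a minimal illustration under assumed choices, not the authors' RAGDA: the sphere manifold, the normalization retraction, the toy bilinear objective, and the constants `eta_x`, `eta_y`, and `beta` are all hypothetical.

```python
import numpy as np

# Minimal sketch of a single-loop adaptive gradient descent-ascent method
# in the spirit of the paper (NOT the authors' exact RAGDA algorithm).
# Manifold: unit sphere S^{d-1}; the ascent variable y lives in Euclidean space.

def sphere_project(x, g):
    """Project a Euclidean gradient g onto the tangent space of the sphere at x."""
    return g - np.dot(g, x) * x

def sphere_retract(x, v):
    """Retract a tangent vector v at x back onto the unit sphere."""
    z = x + v
    return z / np.linalg.norm(z)

def adaptive_gda(grad_x, grad_y, x0, y0, eta_x=1.0, eta_y=1.0,
                 beta=1e-8, iters=1000):
    """Single-loop GDA with AdaGrad-Norm-style stepsizes driven by
    cumulative (Riemannian) gradient norms -- one plausible instantiation."""
    x, y = x0 / np.linalg.norm(x0), y0.copy()
    acc_x, acc_y = beta, beta  # cumulative squared gradient norms
    for _ in range(iters):
        gx = sphere_project(x, grad_x(x, y))  # Riemannian gradient in x
        gy = grad_y(x, y)                     # Euclidean gradient in y
        acc_x += np.dot(gx, gx)
        acc_y += np.dot(gy, gy)
        # Stepsizes shrink with accumulated gradient energy; no Lipschitz
        # or strong-concavity constants are required.
        x = sphere_retract(x, -(eta_x / np.sqrt(acc_x)) * gx)  # descent in x
        y = y + (eta_y / np.sqrt(acc_y)) * gy                  # ascent in y
    return x, y

# Toy usage: f(x, y) = x^T A y - 0.5 * ||y||^2, strongly concave in y.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))
gx = lambda x, y: A @ y
gy = lambda x, y: A.T @ x - y
x_star, y_star = adaptive_gda(gx, gy, rng.standard_normal(5), np.zeros(3))
```

Note how neither update uses a Lipschitz or strong-concavity constant: the accumulated gradient energy alone scales the steps, which is the essence of the adaptivity the paper exploits.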