Dynamic Proximal Gradient Algorithms for Schatten-$p$ Quasi-Norm Regularized Problems


This paper investigates numerical methods for the Schatten-$p$ quasi-norm regularized problem with $p \in [0,1]$, which has been widely studied for finding low-rank solutions of linear inverse problems and has found successful applications across mathematics and the applied sciences. We propose a dynamic proximal gradient algorithm that, through the use of the Cayley transformation, avoids computationally expensive singular value decompositions at each iteration, thereby significantly reducing the computational cost. The algorithm incorporates two step-size selection strategies: an adaptive backtracking search and an explicit step-size rule. We establish the sublinear convergence of the proposed algorithm for all $p \in [0,1]$ within the framework of the Kurdyka-Łojasiewicz property. Notably, under mild assumptions, we show that the generated sequence converges to a stationary point of the objective function. For the special case $p=1$, linear convergence is further established under a strict complementarity-type regularity condition commonly used in the convergence analysis of forward-backward splitting algorithms. Preliminary numerical results validate the superior computational efficiency of the proposed algorithm.


💡 Research Summary

The paper addresses the computational challenges of solving low‑rank matrix recovery problems regularized by the Schatten‑p quasi‑norm (0 ≤ p ≤ 1). The standard formulation is
  min_{X ∈ ℝ^{m×n}}  F(X) := ½‖A(X) − b‖₂² + λ‖X‖_{S_p}^p,
where A is a linear operator, b is the observation vector, and ‖X‖_{S_p} = (∑_{i=1}^n σ_i^p)^{1/p} for p ∈ (0,1] (with ‖X‖_{S_0} = rank(X)), the σ_i being the singular values of X. Existing algorithms (fixed‑point/forward‑backward splitting, smoothing majorization, iteratively reweighted schemes) all require a full singular value decomposition (SVD) at every iteration, which is prohibitive for large‑scale matrices.
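As a concrete reading of the formulation, here is a minimal NumPy sketch of the regularizer ∑_i σ_i^p, computed via a full SVD — exactly the per-iteration cost the paper's algorithm is designed to avoid. The function name and tolerance are illustrative, not from the paper:

```python
import numpy as np

def schatten_p(X, p, tol=1e-10):
    """Schatten-p quasi-norm raised to the p-th power, sum_i sigma_i^p.

    For p = 0 this counts the nonzero singular values, i.e. rank(X).
    """
    sigma = np.linalg.svd(X, compute_uv=False)  # singular values only
    if p == 0:
        return np.count_nonzero(sigma > tol)
    return float(np.sum(sigma ** p))

# The regularized objective would then read:
#   F(X) = 0.5 * np.linalg.norm(A(X) - b)**2 + lam * schatten_p(X, p)
```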

Key idea.
Assume an approximate solution X̂ = Û^⊤ D(σ̂) V̂ is available. By exploiting the invariance of the objective under orthogonal transformations, the problem can be rewritten in terms of three variables:

  • σ ∈ ℝⁿ (the singular values),
  • E ∈ S(m) and F ∈ S(n) (skew‑symmetric matrices that generate orthogonal updates).

The new objective is
  F̂(σ, E, F) = f_Ω(σ, E, F) + λ‖σ‖_p^p,
with Ω = (Û, V̂) and f_Ω(σ, E, F) = ½‖A((Û e^E)^⊤ D(σ) (V̂ e^F)) − b‖₂².

Algorithm (DPGA).

  1. Proximal step on σ.
    Compute
      σ_{k+1} = prox_{t_k λ‖·‖_p^p}(σ_k − t_k ∇_σ f_{Ω_k}(σ_k, 0, 0)).
    For p ∈ {0, ½, 2/3, 1} this proximal operator has a closed‑form solution, making the step cheap.
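For the two simplest cases the closed-form proximal maps are the classical elementwise thresholding operators; a minimal sketch for p = 1 (soft-thresholding) and p = 0 (hard-thresholding) follows. The p = 1/2 and 2/3 half- and two-thirds-thresholding formulas also exist in closed form but are omitted here; the function name is illustrative:

```python
import numpy as np

def prox_p(x, t, lam, p):
    """Elementwise prox of t*lam*|.|^p, sketched for p in {0, 1}."""
    x = np.asarray(x, dtype=float)
    if p == 1:
        # soft-thresholding: shrink magnitudes by t*lam
        return np.sign(x) * np.maximum(np.abs(x) - t * lam, 0.0)
    if p == 0:
        # hard-thresholding: keep entries whose magnitude beats the 0-"norm" penalty
        return np.where(np.abs(x) > np.sqrt(2.0 * t * lam), x, 0.0)
    raise NotImplementedError("p = 1/2 and 2/3 also have closed forms, not sketched here")
```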

  2. Cayley update of orthogonal factors.
    Set E_k = −s_k ∇_E f_{Ω_k}(σ_k, 0, 0) and F_k = −s_k ∇_F f_{Ω_k}(σ_k, 0, 0).
    Update the orthogonal matrices via the Cayley transform:
      (I + ½E_k)^⊤ U_{k+1} = (I − ½E_k)^⊤ U_k,
      (I + ½F_k)^⊤ V_{k+1} = (I − ½F_k)^⊤ V_k.
    These are linear systems whose coefficient matrices approach the identity as E_k, F_k → 0, so they can be solved efficiently with a few iterations.
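The Cayley update can be sketched with a dense solve, as below; since E_k is skew-symmetric, the update preserves orthogonality of U exactly. This is illustrative only — for large matrices the paper instead exploits that the coefficient matrix is close to the identity and solves the system inexactly in a few iterations:

```python
import numpy as np

def cayley_update(U, E):
    """One Cayley-transform update of an orthogonal factor.

    Solves (I + E/2)^T U_next = (I - E/2)^T U for U_next, with E
    skew-symmetric, so that U_next remains orthogonal.
    """
    I = np.eye(U.shape[0])
    return np.linalg.solve((I + 0.5 * E).T, (I - 0.5 * E).T @ U)
```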

  3. Step‑size selection.
    Two strategies are proposed:

    • A backtracking search that enlarges (t_k, s_k) while the sufficient‑descent condition
        F(X_{k+1}) + α‖(σ_{k+1} − σ_k, E_k, F_k)‖² ≤ F(X_k)
      continues to hold.
    • An explicit rule based on Lipschitz-type constants (l_k^σ, l_k^Ω) and current gradient norms, guaranteeing the same descent property without an extra line search.
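A conventional shrinking backtracking loop enforcing a sufficient-descent condition of this form might look like the sketch below. Note this is the standard shrink-until-descent variant, not necessarily the paper's adaptive scheme; the helpers `F` and `update` are hypothetical stand-ins for the objective evaluation and the joint (σ, E, F) update:

```python
def backtracking_step(F, update, x, t0=1.0, alpha=1e-4, beta=0.5, max_iter=50):
    """Shrink the trial step t until F(x_new) + alpha * dist**2 <= F(x).

    `update(x, t)` is assumed to return (x_new, dist), where dist measures
    the size of the step, e.g. ||(sigma_{k+1} - sigma_k, E_k, F_k)||.
    """
    Fx = F(x)
    t = t0
    for _ in range(max_iter):
        x_new, dist = update(x, t)
        if F(x_new) + alpha * dist ** 2 <= Fx:  # sufficient descent
            return x_new, t
        t *= beta  # shrink and retry
    return x, 0.0  # no acceptable step found
```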

Convergence analysis.

  • A sufficient‑descent lemma (Lemma 3.1) together with analytic expressions for the gradients (Proposition 3.2) yields the descent inequality (2.9).
  • The objective F̂ satisfies the Kurdyka‑Łojasiewicz (KL) property with an exponent θ ∈ …
