Global exponential stability of primal-dual gradient flow dynamics based on the proximal augmented Lagrangian: A Lyapunov-based approach
For a class of nonsmooth composite optimization problems with linear equality constraints, we utilize a Lyapunov-based approach to establish the global exponential stability of the primal-dual gradient flow dynamics based on the proximal augmented La…
Authors: Dongsheng Ding, Mihailo R. Jovanovic
Dongsheng Ding, Student Member, IEEE, and Mihailo R. Jovanović, Fellow, IEEE

Abstract: For a class of nonsmooth composite optimization problems with linear equality constraints, we utilize a Lyapunov-based approach to establish the global exponential stability of the primal-dual gradient flow dynamics based on the proximal augmented Lagrangian. The result holds when the differentiable part of the objective function is strongly convex with a Lipschitz continuous gradient; the non-differentiable part is proper, lower semi-continuous, and convex; and the matrix in the linear constraint is full row rank. Our quadratic Lyapunov function generalizes recent results from strongly convex problems with either affine equality or inequality constraints to a broader class of composite optimization problems with nonsmooth regularizers and it provides a worst-case lower bound of the exponential decay rate. Finally, we use computational experiments to demonstrate that our convergence rate estimate is less conservative than the existing alternatives.

I. INTRODUCTION

Primal-dual gradient flow dynamics belong to a class of Lagrangian-based methods for constrained optimization problems. Among other applications, such dynamics have found use in network utility maximization [1], resource allocation [2], distributed optimization [3], and feedback-based online optimization [4] problems. Stability conditions for various forms of the gradient flow dynamics have been proposed since their introduction in the 1950's [5].

The Lyapunov-based approach has been an effective tool for studying the stability of primal-dual algorithms starting with the seminal paper of Arrow, Hurwicz, and Uzawa [5].
They utilized a quadratic Lyapunov function to establish the global asymptotic stability of the primal-dual dynamics for strictly convex-concave Lagrangians. This early result was extended to the problems in which the Lagrangian is either strictly convex or strictly concave [6]. A simplified Lyapunov function was also proposed for linearly-convex or linearly-concave Lagrangians. For a projected variant of the primal-dual dynamics that could account for inequality constraints, a Krasovskii-based Lyapunov function was combined with LaSalle's invariance principle to show the global asymptotic stability in [1]. The invariance principle was also specialized to discontinuous Carathéodory systems and a quadratic Lyapunov function was used to show the global asymptotic stability of projected primal-dual gradient flow dynamics under globally-strict or locally-strong convexity-concavity assumptions [7], [8]. Additional information about the utility of a Lyapunov-based analysis in optimization can be found in a recent reference [9].

Financial support from the National Science Foundation under awards ECCS-1708906 and ECCS-1809833 is gratefully acknowledged. D. Ding and M. R. Jovanović are with the Ming Hsieh Department of Electrical and Computer Engineering, University of Southern California, Los Angeles, CA 90089. E-mails: dongshed@usc.edu, mihailo@usc.edu.

In [10], the theory of proximal operators was combined with the augmented Lagrangian approach to solve optimization problems in which the objective function can be decomposed into the sum of a strongly convex term with a Lipschitz continuous gradient and a convex non-differentiable term. By evaluating the augmented Lagrangian along a certain manifold, a continuously differentiable function of both primal and dual variables was obtained.
This function was named the proximal augmented Lagrangian and the theory of integral quadratic constraints (IQCs) in the frequency domain was employed to prove the global exponential stability of the resulting primal-dual dynamics [10]. This method yields an evolution model with a continuous right-hand side even for nonsmooth problems and it avoids the explicit construction of a Lyapunov function. In [11], a quadratic Lyapunov function was used to prove similar properties for a narrower class of problems that involve strongly convex and smooth objective functions with either affine equality or inequality constraints. More recently, this Lyapunov-based result was extended to account for variations in the constraints [12], [13] and the theory of IQCs was used to prove global exponential stability of the differential equations that govern the evolution of proximal gradient and Douglas-Rachford splitting flows [14].

Herein, we utilize a Lyapunov-based approach to establish the global exponential stability of the primal-dual gradient flow dynamics resulting from the proximal augmented Lagrangian. As mentioned above, this method was introduced in [10] to solve a class of nonsmooth composite optimization problems. When the differentiable part of the objective function is strongly convex with a Lipschitz continuous gradient; the non-differentiable part is proper, lower semi-continuous, and convex; and the matrix in the linear constraint is full row rank, we construct a new quadratic Lyapunov function for the underlying primal-dual dynamics. This Lyapunov function allows us to derive a worst-case lower bound on the exponential decay rate. In contrast to [1], [7], [8], [15], our gradient flow dynamics are projection-free and there are no nonsmooth terms in the Lyapunov function.
We employ the theory of IQCs in the time domain to obtain a quadratic Lyapunov function that establishes the global exponential stability of the primal-dual gradient flow dynamics resulting from the proximal augmented Lagrangian framework for nonsmooth composite optimization. Our Lyapunov function is more general than the one in [11] and it yields less conservative convergence rate estimates. This extends the results of [11] from strongly convex problems with affine equality or inequality constraints to a broader class of optimization problems with nonsmooth regularizers.

The remainder of the paper is organized as follows. In Section II, we provide background material, formulate the nonsmooth composite optimization problem, and describe the proximal augmented Lagrangian as well as the resulting primal-dual gradient flow dynamics. In Section III, we construct a quadratic Lyapunov function for verifying the global exponential stability of the primal-dual dynamics. In Section IV, we use computational experiments to illustrate the utility of our results. In Section V, we close the paper with concluding remarks.

II. PROBLEM FORMULATION AND BACKGROUND

We consider convex composite optimization problems in which the objective function consists of a continuously differentiable term $f$ and a non-differentiable term $g$,

$$\begin{array}{ll} \underset{x,\,z}{\text{minimize}} & f(x) + g(z) \\ \text{subject to} & Tx - z = 0 \end{array} \tag{1}$$

where $T \in \mathbb{R}^{m \times n}$ is a matrix that relates the optimization variables $x \in \mathbb{R}^n$ and $z \in \mathbb{R}^m$.

Assumption 1: Problem (1) is feasible and its minimum is finite.

Assumption 2: The continuously differentiable function $f$ is $m_f$-strongly convex with an $L_f$-Lipschitz continuous gradient and the non-differentiable function $g$ is proper, lower semi-continuous, and convex.

Assumption 3: The matrix $T \in \mathbb{R}^{m \times n}$ has full row rank.

A. Proximal augmented Lagrangian

The proximal operator of the function $g$ is given by [16]

$$\operatorname{prox}_{\mu g}(v) := \underset{x}{\operatorname{argmin}} \left( g(x) + \frac{1}{2\mu}\|x - v\|^2 \right)$$

and the associated value function is the Moreau envelope,

$$M_{\mu g}(v) := g(\operatorname{prox}_{\mu g}(v)) + \frac{1}{2\mu}\|\operatorname{prox}_{\mu g}(v) - v\|^2$$

where $\mu$ is a positive parameter. The Moreau envelope is continuously differentiable, even when $g$ is not, and its gradient is determined by

$$\nabla M_{\mu g}(v) = \frac{1}{\mu}\left( v - \operatorname{prox}_{\mu g}(v) \right).$$

The augmented Lagrangian of the constrained optimization problem (1) is given by

$$\mathcal{L}(x, z; y) = f(x) + g(z) + y^T(Tx - z) + \frac{1}{2\mu}\|Tx - z\|^2$$

where $x \in \mathbb{R}^n$ and $z \in \mathbb{R}^m$ are the primal variables, $y \in \mathbb{R}^m$ is a dual variable, and $\mu$ is a positive parameter. Completion of squares brings $\mathcal{L}(x, z; y)$ into the following form

$$\mathcal{L}(x, z; y) = f(x) + g(z) + \frac{1}{2\mu}\|z - (Tx + \mu y)\|^2 - \frac{\mu}{2}\|y\|^2.$$

The minimizer of the augmented Lagrangian with respect to $z$ is

$$z^\star_\mu(x; y) = \operatorname{prox}_{\mu g}(Tx + \mu y)$$

where $\operatorname{prox}_{\mu g}$ denotes the proximal operator of the function $g$. Restriction of $\mathcal{L}$ along the manifold determined by $z^\star_\mu(x; y)$ yields the proximal augmented Lagrangian [10],

$$\mathcal{L}_\mu(x; y) := \mathcal{L}(x, z^\star_\mu(x; y); y) = f(x) + M_{\mu g}(Tx + \mu y) - \frac{\mu}{2}\|y\|^2 \tag{2}$$

where $M_{\mu g}$ is the Moreau envelope of the function $g$. Continuous differentiability of the proximal augmented Lagrangian $\mathcal{L}_\mu(x; y)$ with respect to both $x$ and $y$ follows from continuous differentiability of $M_{\mu g}$ and Lipschitz continuity of the gradient of $f$.

B. Examples

We next provide examples of convex optimization problems that can be brought into the form (1). For instance, the problem with linear equality constraints,

$$\begin{array}{ll} \underset{x}{\text{minimize}} & f(x) \\ \text{subject to} & Tx = b \end{array} \tag{3}$$

where $b \in \mathbb{R}^m$ is a given vector, can be cast as (1) by choosing $g(z)$ to be an indicator function, $g(z_i) := \{ 0, \; z_i = b_i; \; \infty, \; \text{otherwise} \}$.
In this case, the proximal operator is given by $\operatorname{prox}_{\mu g}(v_i) = b_i$, the associated Moreau envelope is $M_{\mu g}(v_i) = \frac{1}{2\mu}(v_i - b_i)^2$, and the gradient of the Moreau envelope is $\nabla M_{\mu g}(v_i) = (v_i - b_i)/\mu$.

The problem with linear inequality constraints,

$$\begin{array}{ll} \underset{x}{\text{minimize}} & f(x) \\ \text{subject to} & Tx \leq b \end{array} \tag{4}$$

where $b \in \mathbb{R}^m$ is a given vector, can be cast as (1) by choosing $g(z)$ to be an indicator function, $g(z_i) := \{ 0, \; z_i \leq b_i; \; \infty, \; \text{otherwise} \}$. The proximal operator is $\operatorname{prox}_{\mu g}(v_i) = \min\{v_i, b_i\}$, the associated Moreau envelope is $M_{\mu g}(v_i) = \{ \frac{1}{2\mu}(v_i - b_i)^2, \; v_i > b_i; \; 0, \; \text{otherwise} \}$, and the gradient of the Moreau envelope is $\nabla M_{\mu g}(v_i) = \max(0, (v_i - b_i)/\mu)$.

Unconstrained optimization problems with nonsmooth regularizers can also be represented by (1). For example, consider the logistic regression with elastic net regularization [17],

$$\underset{x}{\text{minimize}} \;\; \ell(x) + \tfrac{1}{2}\|x\|^2 + \|x\|_1 \tag{5}$$

where the logistic loss is given by $\ell(x) = \sum_{i=1}^{d} \left( \log(1 + \mathrm{e}^{a_i^T x}) - y_i a_i^T x \right)$, $a_i$ is the feature vector, and $y_i \in \{0, 1\}$ is the corresponding label. Choosing $f(x) := \ell(x) + \frac{1}{2}\|x\|^2$, $g(z) := \|z\|_1$, and $T := I$ brings (5) into (1). The proximal operator is the soft-thresholding $\operatorname{prox}_{\mu g}(v_i) = \operatorname{sign}(v_i) \max\{|v_i| - \mu, 0\}$, the associated Moreau envelope is the Huber function $M_{\mu g}(v_i) = \{ \frac{1}{2\mu} v_i^2, \; |v_i| \leq \mu; \; |v_i| - \frac{\mu}{2}, \; |v_i| \geq \mu \}$, and the gradient of the Moreau envelope is the saturation function $\nabla M_{\mu g}(v_i) = \operatorname{sign}(v_i) \min(|v_i|/\mu, 1)$.

C. Primal-dual gradient flow dynamics

The primal-dual gradient flow dynamics can be used to compute the saddle points of (2),

$$\dot{w} = F(w) \tag{6a}$$

where $w := [\, x^T \; y^T \,]^T$ and

$$F(w) := \begin{bmatrix} -\nabla_x \mathcal{L}_\mu(x; y) \\ \nabla_y \mathcal{L}_\mu(x; y) \end{bmatrix} = \begin{bmatrix} -(\nabla f(x) + T^T \nabla M_{\mu g}(Tx + \mu y)) \\ \mu\,(\nabla M_{\mu g}(Tx + \mu y) - y) \end{bmatrix}. \tag{6b}$$

Let $\bar{w} := [\, \bar{x}^T \; \bar{y}^T \,]^T$ denote the equilibrium points of (6), i.e., the solutions to $F(\bar{w}) = 0$.
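The closed-form expressions of Section II-B and the vector field (6b) can be sketched in a few lines of plain Python. This is a minimal illustration rather than the authors' implementation: all function names are hypothetical, the proximal maps act on one scalar component at a time, and `T` is assumed to be a dense matrix stored as a list of rows.

```python
import math

def prox_l1(v, mu):
    """Soft-thresholding: proximal operator of g(z) = |z|."""
    return math.copysign(max(abs(v) - mu, 0.0), v)

def grad_moreau_l1(v, mu):
    """Saturation function: gradient of the Huber (Moreau) envelope of |z|."""
    return math.copysign(min(abs(v) / mu, 1.0), v)

def prox_leq(v, mu, b):
    """Proximal operator of the indicator of {z <= b} (a projection)."""
    return min(v, b)

def grad_moreau_leq(v, mu, b):
    """Gradient of the Moreau envelope of the indicator of {z <= b}."""
    return max(0.0, (v - b) / mu)

def prox_eq(v, mu, b):
    """Proximal operator of the indicator of {z = b}."""
    return b

def grad_moreau_eq(v, mu, b):
    """Gradient of the Moreau envelope of the indicator of {z = b}."""
    return (v - b) / mu

def primal_dual_field(x, y, grad_f, grad_moreau, T, mu):
    """Right-hand side F(w) of (6b); returns the pair (dx/dt, dy/dt)."""
    m, n = len(T), len(x)
    # v = T x + mu y, the argument of the Moreau envelope
    v = [sum(T[i][j] * x[j] for j in range(n)) + mu * y[i] for i in range(m)]
    gm = [grad_moreau(vi) for vi in v]   # elementwise gradient of the envelope
    gf = grad_f(x)
    dx = [-(gf[j] + sum(T[i][j] * gm[i] for i in range(m))) for j in range(n)]
    dy = [mu * (gm[i] - y[i]) for i in range(m)]
    return dx, dy

if __name__ == "__main__":
    # Scalar example: f(x) = x^2 / 2 subject to x = 1, with mu = 2.
    # The KKT point is x = 1, y = -1, so F should vanish there.
    dx, dy = primal_dual_field(
        [1.0], [-1.0],
        grad_f=lambda x: [x[0]],
        grad_moreau=lambda v: grad_moreau_eq(v, 2.0, 1.0),
        T=[[1.0]], mu=2.0,
    )
    print(dx, dy)
```

Each pair of functions satisfies the identity $\nabla M_{\mu g}(v) = (v - \operatorname{prox}_{\mu g}(v))/\mu$ from Section II-A, which gives a convenient sanity check when adding other regularizers.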
The Lagrangian of the optimization problem (1) is given by $f(x) + g(z) + y^T(Tx - z)$ and the associated KKT optimality conditions are

$$\begin{array}{rcl} 0 &=& \nabla f(x^\star) + T^T y^\star \\ 0 &\in& \partial g(z^\star) - y^\star \\ 0 &=& T x^\star - z^\star \end{array} \tag{7}$$

where $\partial g$ is the subdifferential of $g$. The following lemma establishes the relation between $\bar{w}$ and the optimality conditions (7); see [10] for details.

Lemma 1: Let Assumptions 1 and 2 hold. The equilibrium point $\bar{w} := [\, \bar{x}^T \; \bar{y}^T \,]^T$ of the primal-dual gradient flow dynamics (6) satisfies optimality conditions (7) with $\bar{z} := \operatorname{prox}_{\mu g}(T\bar{x} + \mu\bar{y})$. Moreover, $(\bar{x}, \bar{z})$ is the optimal solution of nonsmooth composite optimization problem (1).

Under Assumptions 1-3, the global exponential stability of the primal-dual gradient flow dynamics (6) was established in [10] by employing the theory of integral quadratic constraints in the frequency domain. An upper bound on the convergence rate was also obtained but the explicit form for the quadratic Lyapunov function was not provided. Recent reference [11] used a Lyapunov-based approach to show the global exponential stability for a class of problems with a strongly convex and smooth objective function $f$ subject to either affine equality or inequality constraints. In our preliminary work [18], a similar quadratic Lyapunov function was used to prove global exponential stability of the primal-dual gradient flow dynamics (6). In what follows, we employ the theory of IQCs in the time domain to obtain a quadratic Lyapunov function that establishes the global exponential stability of (6) and yields less conservative convergence rate estimates.

III. GLOBAL EXPONENTIAL STABILITY VIA QUADRATIC LYAPUNOV FUNCTION

In this section, we identify a quadratic Lyapunov function that can be used to establish the global exponential stability of the primal-dual gradient flow dynamics (6) for strongly convex problems (1) with full row rank matrix $T$ and provide an estimate of the convergence rate.

A. A system-theoretic viewpoint of primal-dual dynamics

Inspired by [19], we view (6) as a feedback interconnection of an LTI system with static nonlinearities; see Fig. 1. These are determined by the gradient of the smooth part of the objective function $\nabla f$ and the proximal operator $\operatorname{prox}_{\mu g}$. Structural properties of nonlinear terms that we exploit in our analysis are specified in Assumption 2.

[Figure 1: feedback interconnection of the LTI system $G$ with the block-diagonal nonlinearity $\Delta = \operatorname{diag}(\Delta_1, \Delta_2)$; signals $u_1$, $u_2$, $\xi_1 = x$, $\xi_2 = Tx + \mu y$.]

Fig. 1: Block diagram of primal-dual gradient flow dynamics (6): $G$ is an exponentially stable LTI system in (8a) and $\Delta$ is a static nonlinear map that satisfies quadratic constraint (9).

Let $u = [\, u_1^T \; u_2^T \,]^T$ and $\xi = [\, \xi_1^T \; \xi_2^T \,]^T$, with

$$\begin{array}{rcl} \xi_1 &:=& x \\ \xi_2 &:=& Tx + \mu y \\ u_1 &:=& \Delta_1(\xi_1) = \nabla f(x) - m_f x \\ u_2 &:=& \Delta_2(\xi_2) = \operatorname{prox}_{\mu g}(Tx + \mu y). \end{array}$$

For strongly convex $f$, the primal-dual dynamics (6) can be cast as an LTI system $G$ in feedback with a nonlinear block $\Delta$, where

$$\dot{w} = Aw + Bu, \quad \xi = Cw, \quad u = \Delta(\xi) \tag{8a}$$

with

$$A = \begin{bmatrix} -(m_f I + \frac{1}{\mu} T^T T) & -T^T \\ T & 0 \end{bmatrix}, \quad B = \begin{bmatrix} -I & \frac{1}{\mu} T^T \\ 0 & -I \end{bmatrix}, \quad C = \begin{bmatrix} I & 0 \\ T & \mu I \end{bmatrix}. \tag{8b}$$

The input is given by $u = \Delta(\xi)$ where $\Delta$ is a block-diagonal map with the diagonal blocks $\Delta_1(\xi_1)$ and $\Delta_2(\xi_2)$. These nonlinearities satisfy the pointwise quadratic inequalities [10]

$$\begin{bmatrix} \xi_i - \bar{\xi}_i \\ u_i - \bar{u}_i \end{bmatrix}^T \begin{bmatrix} 0 & \hat{L}_i I \\ \hat{L}_i I & -2I \end{bmatrix} \begin{bmatrix} \xi_i - \bar{\xi}_i \\ u_i - \bar{u}_i \end{bmatrix} \geq 0$$

where $\bar{w} := [\, \bar{x}^T \; \bar{y}^T \,]^T$ is the equilibrium point of system (8), $\bar{\xi}_1 = \bar{x}$, $\bar{\xi}_2 = T\bar{x} + \mu\bar{y}$, $\hat{L}_1 := L_f - m_f$, and $\hat{L}_2 = 1$.
This is because $\Delta_1$ is the gradient of the convex function $f(\xi_1) - (m_f/2)\|\xi_1\|^2$ and, thus, it is Lipschitz continuous with parameter $L_f - m_f$ [19, Proposition 5]; and $\Delta_2$ is given by the proximal operator of the function $g$ and, thus, it is firmly non-expansive (i.e., Lipschitz continuous with parameter one) [16]. These quadratic constraints can be combined into

$$\begin{bmatrix} \xi - \bar{\xi} \\ u - \bar{u} \end{bmatrix}^T \underbrace{\begin{bmatrix} 0 & \Pi_0 \\ \Pi_0 & -2\Lambda \end{bmatrix}}_{\Pi} \begin{bmatrix} \xi - \bar{\xi} \\ u - \bar{u} \end{bmatrix} \geq 0 \tag{9}$$

where

$$\Pi_0 = \begin{bmatrix} \lambda_1 \hat{L}_1 I & 0 \\ 0 & \lambda_2 I \end{bmatrix}, \quad \Lambda = \begin{bmatrix} \lambda_1 I & 0 \\ 0 & \lambda_2 I \end{bmatrix}$$

and $\lambda_1$, $\lambda_2$ are non-negative scalars.

B. Lyapunov-based analysis for global exponential stability

For the primal-dual gradient flow dynamics (6) with equilibrium point $\bar{w}$, we propose a quadratic Lyapunov function candidate

$$V(\tilde{w}) = \tilde{w}^T P \tilde{w} \tag{10a}$$

with $\tilde{w} := w - \bar{w}$ and

$$P = \alpha \begin{bmatrix} I & (1/\mu)\, T^T \\ (1/\mu)\, T & (1 + m_f/\mu)\, I + (1/\mu^2)\, T T^T \end{bmatrix} \tag{10b}$$

where $\alpha$ is a positive parameter, $m_f$ is the strong convexity modulus of the function $f$, $\mu$ is the augmented Lagrangian parameter, and $T$ is the full row rank matrix associated with the linear equality constraint in (1). The matrix $P$ is positive definite and, for $A$ in (8b), we have

$$A^T P + P A = -2\alpha \begin{bmatrix} m_f I & 0 \\ 0 & (1/\mu)\, T T^T \end{bmatrix} \prec 0. \tag{11}$$

Thus, $A$ is a Hurwitz matrix and the LTI system in Fig. 1 is exponentially stable. Furthermore, the derivative of $V$ along the solutions of (8a) is determined by

$$\dot{V} = \begin{bmatrix} \tilde{w} \\ \tilde{u} \end{bmatrix}^T \begin{bmatrix} A^T P + P A & P B \\ B^T P & 0 \end{bmatrix} \begin{bmatrix} \tilde{w} \\ \tilde{u} \end{bmatrix} \tag{12a}$$

where $\tilde{u} := u - \bar{u}$, and the substitution of the output equation $\xi = Cw$ in (8a) into (9) yields the quadratic inequality

$$\begin{bmatrix} \tilde{w} \\ \tilde{u} \end{bmatrix}^T \begin{bmatrix} 0 & C^T \Pi_0 \\ \Pi_0 C & -2\Lambda \end{bmatrix} \begin{bmatrix} \tilde{w} \\ \tilde{u} \end{bmatrix} \geq 0. \tag{12b}$$

The sufficient condition for the global exponential stability of (8) is obtained by adding (12b) to (12a) and it amounts to the existence of a positive constant $\rho$ such that

$$\begin{bmatrix} -(A^T P + P A + 2\rho P) & -(P B + C^T \Pi_0) \\ -(P B + C^T \Pi_0)^T & 2\Lambda \end{bmatrix} \succ 0. \tag{13}$$

If this condition holds, we have $\dot{V} \leq -2\rho V$.
Thus, $V(\tilde{w}(t)) \leq V(\tilde{w}(0))\, \mathrm{e}^{-2\rho t}$ and, since $P \succ 0$,

$$\|\tilde{w}(t)\| \leq \sqrt{\kappa_p}\, \|\tilde{w}(0)\|\, \mathrm{e}^{-\rho t}, \quad \text{for all } t \geq 0$$

where $\kappa_p$ is the condition number of the matrix $P$. Since $\Lambda \succ 0$, the remaining task is to verify the existence of the positive parameters $\alpha$, $\mu$, $\lambda_1$, $\lambda_2$, and $\rho$ such that

$$-(A^T P + P A + 2\rho P) - \tfrac{1}{2} (P B + C^T \Pi_0)\, \Lambda^{-1} (P B + C^T \Pi_0)^T \succ 0 \tag{14}$$

which follows from the application of the Schur complement to (13).

We are now ready to prove the global exponential stability of the primal-dual gradient flow dynamics (6) and provide estimates of the convergence rate $\rho$ for $L_f > m_f$. A similar result can be established for $L_f = m_f$.

Theorem 2: Let Assumptions 1-3 hold, let $L_f > m_f$, and let $\sigma_{\max}(T)$ be the largest singular value of the matrix $T$. Then, the global exponential stability of the primal-dual gradient flow dynamics (6) can be established with Lyapunov function (10) if the augmented Lagrangian parameter satisfies

$$\mu > \max\left( \frac{L_f - m_f}{4}, \; \frac{\sigma^2_{\max}(T)}{8 m_f} \left( 1 + \sqrt{1 + \frac{16 m_f^2}{\sigma^2_{\max}(T)}} \right) \right). \tag{15}$$

Proof: If (14) holds for $\rho = 0$, the continuity of the left-hand side of (14) with respect to $\rho$ implies the existence of $\rho > 0$ such that (14) holds. For $\rho = 0$, (14) becomes

$$-(A^T P + P A) - \tfrac{1}{2} (P B + C^T \Pi_0)\, \Lambda^{-1} (P B + C^T \Pi_0)^T \succ 0 \tag{16}$$

where $A^T P + P A$ is given by (11), and $P B + C^T \Pi_0$ reads

$$\begin{bmatrix} (\lambda_1 \hat{L}_1 - \alpha)\, I & \lambda_2 T^T \\ -(\alpha/\mu)\, T & (\mu\lambda_2 - \alpha(1 + m_f/\mu))\, I \end{bmatrix}$$

where $\hat{L}_1 := L_f - m_f > 0$. Thus, the matrix $M := \frac{1}{2} (P B + C^T \Pi_0)\, \Lambda^{-1} (P B + C^T \Pi_0)^T$ is given by

$$M = \begin{bmatrix} M_1 & M_0^T \\ M_0 & M_2 \end{bmatrix} \tag{17}$$

where

$$\begin{array}{rcl} M_1 &=& \dfrac{1}{2} \left( \dfrac{(\alpha - \lambda_1 \hat{L}_1)^2}{\lambda_1}\, I + \lambda_2 T^T T \right) \\[2mm] M_2 &=& \dfrac{1}{2} \left( \dfrac{\alpha^2}{\lambda_1 \mu^2}\, T T^T + \dfrac{1}{\lambda_2} \left( \mu\lambda_2 - \alpha\left(1 + \dfrac{m_f}{\mu}\right) \right)^2 I \right) \\[2mm] M_0 &=& \dfrac{1}{2} \left( \dfrac{\alpha(\alpha - \lambda_1 \hat{L}_1)}{\lambda_1 \mu} + \mu\lambda_2 - \alpha\left(1 + \dfrac{m_f}{\mu}\right) \right) T \end{array}$$

and $\alpha$, $\lambda_1$, $\lambda_2$, and $\mu$ are positive parameters that have to be selected such that (16) holds.
Setting $\lambda_1 := \alpha/\hat{L}_1$ and $\lambda_2 := (\alpha/\mu)(1 + m_f/\mu)$ yields $M_0 = 0$ and (16) simplifies to

$$\begin{bmatrix} 2\alpha m_f I - \frac{\lambda_2}{2}\, T^T T & 0 \\ 0 & \frac{\alpha}{\mu} \left( 2 - \frac{\alpha}{2\lambda_1 \mu} \right) T T^T \end{bmatrix} \succ 0$$

which holds if $4\alpha m_f > \lambda_2 \sigma^2_{\max}(T)$ and $4\mu\lambda_1 > \alpha$. Combining these two conditions with the above definitions of $\lambda_1$ and $\lambda_2$ yields (15).

We next utilize the choices of parameters $\lambda_1$ and $\lambda_2$ in Theorem 2 to estimate the convergence rate $\rho$.

Proposition 3: Let Assumptions 1-3 hold, let $L_f > m_f$, and let $\sigma_{\min}(T)$ and $\sigma_{\max}(T)$ be the smallest and the largest singular values of the matrix $T$. Then, the primal-dual gradient flow dynamics (6) are globally exponentially stable with the rate

$$\rho \geq \rho_0(\mu) := \frac{\sigma^2_{\min}(T)}{2\,(\mu + m_f + \sigma^2_{\max}(T)/\mu)} \tag{18a}$$

if $\mu > \max(L_f - m_f, \hat{\mu})$, where

$$\hat{\mu} := \inf \{ \mu \in [\sigma_{\max}(T), \infty), \; \beta(\mu) < 2 m_f \} \tag{18b}$$

$$\beta(\mu) := \frac{(m_f + \mu)\, \sigma^2_{\max}(T)}{2\mu^2} + \frac{2\rho_0(\mu)\,(\mu + 4\rho_0(\mu))}{\mu}. \tag{18c}$$

Proof: See Appendix A.

IV. COMPUTATIONAL EXPERIMENTS

We next provide an example to demonstrate the merits of our approach. Let us consider optimization problem (1) with

$$f(x) = \tfrac{1}{2}\, x^T Q x + q^T x, \quad g(z) = \left\{ \begin{array}{ll} 0, & z \leq b \\ \infty, & \text{otherwise} \end{array} \right. \tag{19}$$

where $x$ and $q$ are $n$-dimensional vectors, $Q \in \mathbb{R}^{n \times n}$ is a positive definite matrix, $T \in \mathbb{R}^{m \times n}$ is a full row rank matrix, and $b \in \mathbb{R}^m$ is a given vector. The gradient of the Moreau envelope is determined by $\nabla M_{\mu g}(v_i) = \max(0, (v_i - b_i)/\mu)$ and $(L_f, m_f)$ are the largest and the smallest eigenvalues of the matrix $Q$, respectively.

We use the Matlab ODE solver ode45 to simulate the primal-dual gradient flow dynamics (6) and set $n = m = 10$, $q = 10 \times \mathrm{randn}(n, 1)$, and $Q = H H^T + K$, where $H = \mathrm{randn}(n, n)$ and $K = \mathrm{diag}(\exp(\mathrm{randn}(n, 1)))$. We choose $b$ to be a vector of all ones, set $T = I$, and report results for $(L_f, m_f) = (1.24, 1.03)$ and $(L_f, m_f) = (27.81, 1.03)$.
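The setup above can be mimicked on a small scale. The sketch below is not the experiment reported in the paper: it assumes a hypothetical 2-dimensional instance of (19) with $T = I$ and hand-picked `Q`, `q`, `b`, and it swaps Matlab's ode45 for a fixed-step classical Runge-Kutta integrator written in plain Python. It evaluates the rate estimate (18a), the function $\beta(\mu)$ in (18c), and integrates (6) from the origin.

```python
import math

# Hypothetical 2-D instance of problem (19) with T = I (illustrative values only).
Q = [[2.0, 0.5], [0.5, 1.5]]
q = [-4.0, 1.0]
b = [1.0, 1.0]
MU = 2.0

def eig_sym_2x2(M):
    """Smallest and largest eigenvalues of a symmetric 2x2 matrix."""
    tr = M[0][0] + M[1][1]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    root = math.sqrt(tr * tr - 4.0 * det)
    return (tr - root) / 2.0, (tr + root) / 2.0

def rho0(mu, m_f, s_min, s_max):
    """Lower bound on the convergence rate, formula (18a)."""
    return s_min ** 2 / (2.0 * (mu + m_f + s_max ** 2 / mu))

def beta(mu, m_f, s_min, s_max):
    """Function beta(mu) from (18c)."""
    r = rho0(mu, m_f, s_min, s_max)
    return (m_f + mu) * s_max ** 2 / (2.0 * mu ** 2) + 2.0 * r * (mu + 4.0 * r) / mu

def field(w, mu=MU):
    """Right-hand side (6b) for the quadratic f and inequality indicator g, T = I."""
    x, y = w[:2], w[2:]
    grad_m = [max(0.0, (x[i] + mu * y[i] - b[i]) / mu) for i in range(2)]
    grad_f = [Q[i][0] * x[0] + Q[i][1] * x[1] + q[i] for i in range(2)]
    dx = [-(grad_f[i] + grad_m[i]) for i in range(2)]
    dy = [mu * (grad_m[i] - y[i]) for i in range(2)]
    return dx + dy

def rk4_step(w, h):
    """One step of the classical fourth-order Runge-Kutta scheme."""
    k1 = field(w)
    k2 = field([wi + 0.5 * h * ki for wi, ki in zip(w, k1)])
    k3 = field([wi + 0.5 * h * ki for wi, ki in zip(w, k2)])
    k4 = field([wi + h * ki for wi, ki in zip(w, k3)])
    return [wi + h / 6.0 * (a + 2.0 * c + 2.0 * d + e)
            for wi, a, c, d, e in zip(w, k1, k2, k3, k4)]

def simulate(w0, t_final=60.0, h=0.01):
    """Integrate (6a) from w0 over [0, t_final] with step h."""
    w = list(w0)
    for _ in range(int(round(t_final / h))):
        w = rk4_step(w, h)
    return w

m_f, L_f = eig_sym_2x2(Q)
rate = rho0(MU, m_f, 1.0, 1.0)           # singular values of T = I are all one
w_end = simulate([0.0, 0.0, 0.0, 0.0])
w_bar = [1.0, -1.0, 2.5, 0.0]            # KKT point of this instance, derived by hand
err = math.sqrt(sum((u - v) ** 2 for u, v in zip(w_end, w_bar)))
print(f"m_f={m_f:.3f}, L_f={L_f:.3f}, rho0={rate:.4f}, final error={err:.2e}")
```

For this instance, `MU = 2` exceeds $L_f - m_f$ and gives $\beta(\mu) < 2 m_f$, which is used here as a numerical stand-in for the condition $\mu > \max(L_f - m_f, \hat{\mu})$ under the monotonicity of $\beta$ argued in Appendix A.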
Figure 2a demonstrates the exponential convergence of dynamics (6) with $(L_f, m_f) = (1.24, 1.03)$ for different values of $\mu$. We note that the convergence rate decreases when $\mu$ becomes larger than 2. For a given value of $\mu$ that satisfies the conditions of Proposition 3, we use formula (18a) to estimate the lower bound on the convergence rate $\rho_0$. We compare our estimate with [11, Theorem 2] and [18, Theorem 6]. As shown in Fig. 3a, Proposition 3 provides a less conservative estimate of the convergence rate than the existing methods. As Figs. 2b and 3b illustrate, similar observations can be made for a larger condition number, $(L_f, m_f) = (27.81, 1.03)$. Clearly, the increase in condition number reduces the rate of exponential decay and our estimates are less conservative than those provided in the literature.

[Figure 2: two panels, (a) and (b); vertical axis $\|w(t) - w^\star\|$, horizontal axis time (seconds).]

Fig. 2: Convergence of the primal-dual gradient flow dynamics (6) for problem (19) with (a) $(L_f, m_f) = (1.24, 1.03)$ and (b) $(L_f, m_f) = (27.81, 1.03)$.

[Figure 3: two panels, (a) and (b); vertical axis $\rho$, horizontal axis $\mu$.]

Fig. 3: Convergence rate estimates, as a function of $\mu$, resulting from (18a) (– –), [18, Theorem 6] (···), and [11, Theorem 2] (– -) for problem (1) with (19) and (a) $(L_f, m_f) = (1.24, 1.03)$; (b) $(L_f, m_f) = (27.81, 1.03)$.

V. CONCLUDING REMARKS

In this paper, we use a Lyapunov-based approach to establish global exponential stability of the primal-dual gradient flow dynamics resulting from the proximal augmented Lagrangian framework for nonsmooth composite optimization. We provide a worst-case estimate of the exponential decay rate when the differentiable part of the objective function is strongly convex and its gradient is Lipschitz continuous. For a quadratic programming problem, computational experiments are used to show that our estimate of the convergence rate is less conservative compared to the existing literature. Our ongoing work focuses on identifying a quadratic Lyapunov function that can certify the global exponential stability of a second-order primal-dual method for nonsmooth composite optimization [20].

APPENDIX

A. Proof of Proposition 3

We show that (14) holds for $\rho = \rho_0(\mu)$. Substitution of the expressions for $A^T P + P A$ and $\frac{1}{2}(P B + C^T \Pi_0)\, \Lambda^{-1} (B^T P + \Pi_0 C)$ given by (11) and (17) into (14) yields

$$R = \begin{bmatrix} R_1 & R_0^T \\ R_0 & R_2 \end{bmatrix} \succ 0 \tag{20}$$

where

$$\begin{array}{rcl} R_1 &=& 2\alpha(m_f - \rho)\, I - \dfrac{(\alpha - \lambda_1 \hat{L}_1)^2}{2\lambda_1}\, I - \dfrac{\lambda_2}{2}\, T^T T \\[2mm] R_2 &=& \left( \dfrac{2\alpha}{\mu} - \dfrac{\alpha^2}{2\lambda_1 \mu^2} \right) T T^T - 2\rho\alpha \left( \left(1 + \dfrac{m_f}{\mu}\right) I + \dfrac{1}{\mu^2}\, T T^T \right) - \dfrac{1}{2\lambda_2} \left( \mu\lambda_2 - \alpha\left(1 + \dfrac{m_f}{\mu}\right) \right)^2 I \\[2mm] R_0 &=& -\dfrac{\alpha}{\mu} \left( \dfrac{\alpha - \lambda_1 \hat{L}_1}{2\lambda_1} + 2\rho \right) T - \dfrac{1}{2} \left( \mu\lambda_2 - \alpha\left(1 + \dfrac{m_f}{\mu}\right) \right) T. \end{array}$$

Here, $\hat{L}_1 := L_f - m_f > 0$, and $\alpha$, $\lambda_1$, $\lambda_2$, and $\mu$ are positive parameters that have to be selected such that (14) holds for $\rho = \rho_0(\mu)$. We set $\lambda_1 := \alpha/\hat{L}_1$, $\lambda_2 := \alpha(1 + m_f/\mu)/\mu$, and add/subtract $3\alpha T T^T/(2\mu)$ to $R_2$ to obtain

$$R_2 = \alpha \left( \frac{2}{\mu} - \frac{\hat{L}_1}{2\mu^2} \right) T T^T - \frac{3\alpha}{2\mu}\, T T^T + \frac{\alpha}{\mu}\, T T^T - 2\rho\alpha \left( \left(1 + \frac{m_f}{\mu}\right) I + \frac{1}{\mu^2}\, T T^T \right) + \frac{\alpha}{2\mu}\, T T^T.$$

If $\mu \geq \hat{L}_1$, then

$$\alpha \left( \frac{2}{\mu} - \frac{\hat{L}_1}{2\mu^2} \right) T T^T - \frac{3\alpha}{2\mu}\, T T^T \succeq 0. \tag{21}$$

Furthermore, for $\rho = \rho_0$, we have

$$\frac{\alpha}{\mu}\, T T^T - 2\rho\alpha \left( \left(1 + \frac{m_f}{\mu}\right) I + \frac{1}{\mu^2}\, T T^T \right) \succeq \frac{\alpha}{\mu}\, \sigma^2_{\min}(T)\, I - 2\rho\alpha \left( 1 + \frac{m_f}{\mu} + \frac{\sigma^2_{\max}(T)}{\mu^2} \right) I = 0. \tag{22}$$

Combining (21) and (22) with the definition of $R_2$ yields $R_2 \succeq \alpha T T^T/(2\mu)$ and the positive definiteness of $R_2$ follows from the fact that $T$ is a full row rank matrix. The application of the Schur complement requires $R_1 - R_0^T R_2^{-1} R_0 \succ 0$. Since $R_2 \succeq \alpha T T^T/(2\mu) \succ 0$ implies $R_2^{-1} \preceq (2\mu/\alpha)(T T^T)^{-1}$, this condition holds if $R_1 - (2\mu/\alpha)\, R_0^T (T T^T)^{-1} R_0 \succ 0$.
For $\beta(\mu)$ and $\hat{\mu}$ given by (18c) and (18b), respectively, if $\mu > \hat{\mu}$, we have

$$R_1 - \frac{2\mu}{\alpha}\, R_0^T (T T^T)^{-1} R_0 \succeq \alpha \left( 2 m_f - \left( 2\rho_0 + \frac{\sigma^2_{\max}(T)}{2\mu} \left(1 + \frac{m_f}{\mu}\right) + \frac{8\rho_0^2}{\mu} \right) \right) I = \alpha\, (2 m_f - \beta(\mu))\, I \succ 0.$$

We now prove the existence of such $\hat{\mu}$. Since $\rho_0(\mu)$ is monotonically decreasing for $\mu \geq \sigma_{\max}(T)$, $\beta(\mu)$ monotonically decreases to zero on the interval $\mu \in [\sigma_{\max}(T), \infty)$. There are two cases: (i) if $\beta(\sigma_{\max}(T)) > 2 m_f$, then $\beta(\bar{\mu}) = 2 m_f$ for some $\bar{\mu} \in [\sigma_{\max}(T), \infty)$ and $\hat{\mu} = \bar{\mu}$; (ii) if $\beta(\sigma_{\max}(T)) \leq 2 m_f$, then $\beta(\mu) < \beta(\sigma_{\max}(T)) \leq 2 m_f$ for all $\mu \in (\sigma_{\max}(T), \infty)$. Thus $\hat{\mu} = \sigma_{\max}(T)$. Therefore, the set in (18b) is nonempty and such $\hat{\mu}$ always exists.

REFERENCES

[1] D. Feijer and F. Paganini, "Stability of primal-dual gradient dynamics and applications to network optimization," Automatica, vol. 46, no. 12, pp. 1974–1981, 2010.
[2] D. Ding and M. R. Jovanović, "A primal-dual Laplacian gradient flow dynamics for distributed resource allocation problems," in Proceedings of the 2018 American Control Conference, 2018, pp. 5316–5320.
[3] J. Wang and N. Elia, "A control perspective for centralized and distributed convex optimization," in Proceedings of the 50th IEEE Conference on Decision and Control, 2011, pp. 3800–3805.
[4] M. Colombino, E. Dall'Anese, and A. Bernstein, "Online optimization as a feedback controller: stability and tracking," IEEE Trans. Control Netw. Syst., 2019, doi:10.1109/TCNS.2019.2906916.
[5] K. J. Arrow, L. Hurwicz, and H. Uzawa, Studies in linear and non-linear programming. Stanford University Press, 1958.
[6] A. Cherukuri, B. Gharesifard, and J. Cortés, "Saddle-point dynamics: conditions for asymptotic stability of saddle points," SIAM J. Control Optim., vol. 55, no. 1, pp. 486–511, 2017.
[7] A. Cherukuri, E. Mallada, and J. Cortés, "Asymptotic convergence of constrained primal-dual dynamics," Syst. Control Lett., vol. 87, pp. 10–15, 2016.
[8] A. Cherukuri, E. Mallada, S. Low, and J. Cortés, "The role of convexity on saddle-point dynamics: Lyapunov function and robustness," IEEE Trans. Automat. Control, vol. 63, no. 8, pp. 2449–2464, 2018.
[9] B. Polyak and P. Shcherbakov, "Lyapunov functions: An optimization theory perspective," IFAC-PapersOnLine, vol. 50, no. 1, pp. 7456–7461, 2017.
[10] N. K. Dhingra, S. Z. Khong, and M. R. Jovanović, "The proximal augmented Lagrangian method for nonsmooth composite optimization," IEEE Trans. Automat. Control, vol. 64, no. 7, pp. 2861–2868, 2019.
[11] G. Qu and N. Li, "On the exponential stability of primal-dual gradient dynamics," IEEE Control Syst. Lett., vol. 3, no. 1, pp. 43–48, 2019.
[12] Y. Tang, G. Qu, and N. Li, "Semi-global exponential stability of primal-dual gradient dynamics for constrained convex optimization," 2019.
[13] X. Chen and N. Li, "Exponential stability of primal-dual gradient dynamics with non-strong convexity," 2019.
[14] S. Hassan-Moghaddam and M. R. Jovanović, "Proximal gradient flow and Douglas-Rachford splitting dynamics: global exponential stability via integral quadratic constraints," Automatica, 2019, submitted.
[15] J. Cortés and S. K. Niederländer, "Distributed coordination for nonsmooth convex optimization via saddle-point dynamics," J. Nonlinear Sci., pp. 1–26, 2018.
[16] N. Parikh and S. Boyd, "Proximal algorithms," Foundations and Trends in Optimization, vol. 1, no. 3, pp. 127–239, 2014.
[17] H. Zou and T. Hastie, "Regularization and variable selection via the elastic net," J. R. Stat. Soc. B, vol. 67, no. 2, pp. 301–320, 2005.
[18] D. Ding and M. R. Jovanović, "Global exponential stability of primal-dual gradient flow dynamics based on the proximal augmented Lagrangian," in Proceedings of the 2019 American Control Conference, 2019, pp. 3414–3419.
[19] L. Lessard, B. Recht, and A. Packard, "Analysis and design of optimization algorithms via integral quadratic constraints," SIAM J. Optim., vol. 26, no. 1, pp. 57–95, 2016.
[20] N. K. Dhingra, S. Z. Khong, and M. R. Jovanović, "A second order primal-dual method for nonsmooth convex composite optimization," IEEE Trans. Automat. Control, 2017, conditionally accepted.