Region of Attraction for Power Systems using Gaussian Process and Converse Lyapunov Function -- Part I: Theoretical Framework and Off-line Study


Authors: Chao Zhai, Hung D. Nguyen

Abstract—This paper introduces a novel framework to construct the region of attraction (ROA) of a power system centered around a stable equilibrium by using stable state trajectories of the system dynamics. Most existing works on estimating the ROA rely on analytical Lyapunov functions, which are subject to two limitations: an analytic Lyapunov function may not always be readily available, and the resulting ROA may be overly conservative. This work overcomes these two limitations by leveraging the converse Lyapunov theorem in control theory to eliminate the need for an analytic Lyapunov function, and by learning the unknown Lyapunov function with the Gaussian Process (GP) approach. In addition, a Gaussian Process Upper Confidence Bound (GP-UCB) based sampling algorithm is designed to reconcile the trade-off between exploitation, for enlarging the ROA, and exploration, for reducing the uncertainty of the sampling region. Within the constructed ROA, the system state is guaranteed, with a specified confidence level, to converge to the stable equilibrium. Numerical simulations are conducted to validate the assessment approach on the ROA of the single machine infinite bus system and the New England 39-bus system. Numerical results demonstrate that our approach can significantly enlarge the estimated ROA compared with that of the analytic Lyapunov counterpart.

Index Terms—Stability assessment, region of attraction, power systems, Lyapunov function, Gaussian process

NOMENCLATURE

α(z)      a class Γ function
V(x)      Lyapunov function
V̂(x)      estimate of the Lyapunov function
V⋆(x)     an existing Lyapunov function
φ(x, t)   state trajectory of the nonlinear dynamical system
k(x, x′)  covariance function of a GP, or kernel function in the reproducing kernel Hilbert space (RKHS)
S         real region of attraction of the nonlinear system
Ω_c       level set of the Lyapunov function with upper bound c
N         number of stable sampling points
Δt        time interval of the numerical method for solving the differential equation
δ         parameter specifying the confidence level
A_N       set of N stable sampling points
μ_i(x)    mean of the Gaussian process at the i-th iteration
σ_i(x)    standard deviation of the Gaussian process at the i-th iteration
‖·‖       2-norm in Euclidean space
‖·‖_k     induced RKHS norm with the kernel k(x, x′)

Chao Zhai and Hung D. Nguyen are with the School of Electrical and Electronic Engineering, Nanyang Technological University, 50 Nanyang Avenue, Singapore 639798. Email: zhaichao@ntu.edu.sg and hunghtd@ntu.edu.sg. Corresponding author: Hung D. Nguyen.

I. INTRODUCTION

The region of attraction (ROA) of a complex dynamical system provides a useful measure of its stability level and robustness against external disturbances. It is therefore of great importance to safety-critical systems (e.g., power systems, nuclear reactor control systems, engine control systems), whose stability has to be guaranteed before they are deployed in practice. For power systems, the ROA refers to a subspace of operating states from which the system converges to a steady-state equilibrium. There are various approaches for estimating the ROA of a general nonlinear system, such as contraction analysis [1], level sets of Lyapunov functions [8, 17], the sum-of-squares technique [18, 19], and sampling-based methods [20]. Nevertheless, these approaches largely rely on deterministic models and may not be applicable to uncertainties in more realistic systems.
As a non-parametric method, a GP is flexible in incorporating prior information as well as quantifying uncertainty [21]. By regarding the unknown function or dynamics as a GP, Bayesian machine learning provides a powerful tool for both regression and inference using a prior belief and sample data, generating the posterior distribution of the unknown function [7, 22]. The GP approach has therefore found wide application in fields such as bandit settings [23], robotics [24], and classifier design [25], to name just a few. As is well known, determining a suitable Lyapunov function for a complex nonlinear dynamical system is a challenging problem. For a stable equilibrium point of a dynamical system, the converse Lyapunov theorem ensures the existence of Lyapunov functions and enables us to compute their values using stable state trajectories [17]. In practice, the exact values of the Lyapunov function are not available due to numerical error and time restrictions. For this reason, it is desirable to regard the unknown Lyapunov function as a GP. In this way, the posterior distribution of the Lyapunov function can be obtained by learning from the sampled data (i.e., estimated values of the Lyapunov function), which makes it possible to construct and evaluate the ROA of nonlinear dynamical systems. Thus, this paper centers on the quantitative evaluation of the ROA of power systems from a GP perspective. Compared with existing work [24, 26, 27], the key contributions of this work are:
1) We propose a theoretical framework based on the converse Lyapunov theorem for estimating the ROA of a general nonlinear system around an equilibrium without constructing an analytic Lyapunov function.
2) We develop a GP-UCB based sampling algorithm for creating a sampling set and learning the unknown Lyapunov function using stable state trajectories.
3) For an existing Lyapunov function, our approach allows us to extend the certified ROA with a guaranteed confidence level.

The remainder of this paper is organized as follows. Section II introduces the region of attraction of a general dynamical system and the estimation of the Lyapunov function. Section III presents the GP approach for learning the unknown Lyapunov function, followed by the main results on the GP-UCB based sampling algorithm in Section IV. Numerical simulations validating the proposed approach on the IEEE test systems are presented in Section V. Finally, Section VI draws conclusions and discusses future work.

II. THE ROA OF A GENERAL DYNAMICAL SYSTEM

The ROA of a general dynamical system normally refers to a region from which every state converges to the stable equilibrium point as time goes to infinity. Likewise, the ROA of a power system is viewed as the set of operating states, such as rotor angles and frequencies, that converge to the stable equilibrium, which corresponds to the solution of the power flow problem [4, 9–12, 16], after being subjected to a disturbance. The corresponding convergent trajectory is regarded as a stable state trajectory. The estimation of the ROA essentially depends on the construction of a Lyapunov function and its level sets (see Fig. 1). In practice, constructing an analytic Lyapunov function for a nonlinear system is a challenging task. The converse Lyapunov theorem [17] allows us to estimate the value of a Lyapunov function without using its analytic form. For a power system with a stable state trajectory x(t), t ≥ 0, a commonly used converse Lyapunov function is V(x) = ∫₀^∞ ‖x(t)‖² dt [27]. We generalize the converse Lyapunov function by introducing a more general function α(·) (see the definition in Appendix VI-A) and a solution trajectory φ(·) in Lemma 1. The existence of such a generalized Lyapunov function is guaranteed in theory (i.e., Theorem 4.17 in [17]), as follows.

Lemma 1. Without loss of generality, let x = 0 be an asymptotically stable equilibrium point of the nonlinear system ẋ = f(x), where f : X → ℝⁿ is locally Lipschitz and S is the region of attraction. Then there is a continuous positive definite function W(x) such that

V(x) = ∫₀^∞ α(‖φ(x, t)‖) dt,  V(0) = 0

and

dV(x)/dt = (∂V(x)/∂x) f(x) ≤ −W(x), ∀x ∈ S   (1)

with

dφ(x, t)/dt = f(φ(x, t)),  φ(x, 0) = x.   (2)

Here α(z) is a class Γ function (see Appendix VI-A), and the level set Ω_c = {x ∈ ℝⁿ | V(x) ≤ c}, c > 0, is a compact subset of S = {x ∈ ℝⁿ | lim_{t→+∞} φ(x, t) = 0}.

Fig. 1: Illustration of the level sets of a Lyapunov function. The red ellipse describes the level set {x ∈ ℝ² | V(x) ≤ C₁}, while the blue one denotes the level set {x ∈ ℝ² | V(x) ≤ C₂} with C₁ < C₂ and x = (x₁, x₂). The black dot represents a stable equilibrium point of the dynamical system. Every state in a level set converges to the equilibrium point on condition that V̇(x) < 0.

In Lemma 1, the dynamics ẋ = f(x) of a power system normally represent those of the generators, known as the swing equations [28]. Here we explicitly assume that the system dynamics are available, although our framework can be extended to incorporate uncertainties with a pre-defined complexity level, i.e., the smoothness of the unknown components, in the system's dynamical model. Property (1) implies that the function V(x) decays over time along trajectories. Equation (2) defines a stable trajectory φ(x, t) with initial state x. More importantly, the Lyapunov function V(x) in Lemma 1 has the useful property that, by increasing the level set of V(x), we can approach the real region of attraction (see Remark 7 for a formal explanation). Note that this property may not hold for an arbitrary analytical Lyapunov function.
Thus, we can leverage the converse Lyapunov function approach to obtain a better ROA by collecting more sampling points in order to enlarge the level sets. Due to the discrete nature of the sampled data, the converse Lyapunov function V(x) in Lemma 1 can be discretized and approximated as

V̂(x) = Σ_{i=1}^{n} α(‖φ(x, t_i)‖) Δt,   (3)

where Δt denotes the sampling time interval and t_i = (i − 1)Δt, i ∈ {1, 2, ..., n}. The error caused by this discretization can be estimated (see Lemma 2 in Appendix VI-B). While the approximation V̂(x) can be calculated directly from samples, the explicit Lyapunov function V(x) is unknown. This work therefore learns the unknown Lyapunov function V(x) from its discretized counterpart. By treating V(x) as a GP and V̂(x) as its measurement, the estimation error V(x) − V̂(x) can be regarded as measurement noise. This enables us to learn the unknown Lyapunov function V(x) using the GP approach.

Fig. 2: Examples of basic kernel functions: (a) the squared exponential kernel k(x, x′) = e^{−‖x−x′‖²/2l²} with length-scale parameter l, and (b) the linear kernel k(x, x′) = xᵀx′ with x′ = 1. Each kernel approximates the unknown function with a certain "complexity".

III. GP FOR LEARNING UNKNOWN DYNAMICS

The previous section introduced the discretized, approximated Lyapunov function and the respective unknown Lyapunov function. In this section, we propose the GP approach for learning the unknown Lyapunov function (i.e., GP regression). In general, GP regression requires a prior distribution over the unknown functions, specified by a mean function and a covariance function, together with the probability of the observations and the sampled data, in order to obtain the posterior distribution.
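The discretized converse Lyapunov function (3) can be sketched numerically. The following minimal example uses the scalar test system ẋ = −x, which is illustrative and not from the paper: its trajectory is φ(x, t) = x e^{−t}, so with α(z) = z² the converse Lyapunov function has the closed form V(x) = x²/2, against which the Riemann-sum approximation (3) can be checked.

```python
import math

# Sketch of (3) on the illustrative test system dx/dt = -x, for which
# phi(x, t) = x * exp(-t) and, with alpha(z) = z^2, V(x) = x^2 / 2.
def V_hat(x, dt=1e-3, t_end=20.0, alpha=lambda z: z * z):
    """Approximate V(x) = integral of alpha(||phi(x, t)||) dt by the
    sum in (3), with t_i = (i - 1) * dt, using forward Euler steps."""
    n = round(t_end / dt)
    total = 0.0
    state = x
    for _ in range(n):
        total += alpha(abs(state)) * dt
        state += dt * (-state)   # one Euler step of dx/dt = -x
    return total

x0 = 1.5
approx = V_hat(x0)
exact = x0 * x0 / 2.0            # closed form for this test system
print(approx, exact)
```

For this linear system the approximation agrees with the closed form to within the discretization and truncation errors bounded later in Lemma 2.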
Without loss of generality, we consider an unknown function h(x) as a GP, which can be sequentially measured by

y(i) = h(x(i)) + ε, i ∈ ℤ₊,

where y(i) is the observed function value for the input x(i) at the i-th sampling step, and the measurement noise ε is zero-mean, independent, and bounded by σ. With the GP approach, we obtain the posterior distribution over h(x) using the sampled data in the training set {(x(1), y(1)), (x(2), y(2)), ..., (x(i), y(i))}. In this work, it is assumed that the unknown Lyapunov function has a GP prior or that its "complexity" can be measured by the RKHS norm. The sampled data are obtained by running the GP-UCB based sampling algorithm. Note that, for an existing Lyapunov function, the GP approach allows us to extend the certified ROA with a given confidence level.

A. Gaussian process and RKHS norm

By regarding the values of V(x) as random variables, any finite collection of them is jointly Gaussian in an overall consistent way, and the unknown Lyapunov function V(x) can be approximated by a GP. The covariance or kernel function k(x, x′) encodes the smoothness of V(x) under the GP (see Fig. 2). For samples V̂_N = [V̂(x(1)), ..., V̂(x(N))]ᵀ at the points A_N = {x(1), ..., x(N)}, with V̂(x(i)) = V(x(i)) + ε and ε ~ N(0, σ²), the mean μ_N(x), covariance k_N(x, x′) and variance σ²_N(x) of the posterior distribution have the analytic forms [23]

μ_N(x) = k_N(x)ᵀ (K_N + σ²I)⁻¹ V̂_N
k_N(x, x′) = k(x, x′) − k_N(x)ᵀ (K_N + σ²I)⁻¹ k_N(x′)
σ²_N(x) = k_N(x, x)   (4)

where k_N(x) = [k(x(1), x), ..., k(x(N), x)]ᵀ and K_N is the positive definite kernel matrix [k(x, x′)]_{x, x′ ∈ A_N}.
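The posterior formulas (4) can be sketched in a few lines. The data, noise level and length-scale below are illustrative placeholders, not values from the paper; the kernel is the squared exponential kernel of Fig. 2(a).

```python
import numpy as np

# Sketch of the GP posterior formulas (4) with a squared exponential
# kernel on scalar inputs; all data and hyper-parameters are illustrative.
def se_kernel(a, b, ell=1.0):
    """Squared exponential kernel k(x, x') = exp(-|x - x'|^2 / (2 l^2))."""
    return np.exp(-np.subtract.outer(a, b) ** 2 / (2.0 * ell ** 2))

def gp_posterior(X_train, y_train, X_query, noise=0.1, ell=1.0):
    """Posterior mean mu_N(x) and variance sigma_N^2(x) of a zero-mean GP."""
    K = se_kernel(X_train, X_train, ell)          # K_N
    k_star = se_kernel(X_train, X_query, ell)     # columns k_N(x)
    A = np.linalg.solve(K + noise ** 2 * np.eye(len(X_train)), k_star)
    mean = A.T @ y_train                          # mu_N(x)
    var = 1.0 - np.sum(k_star * A, axis=0)        # sigma_N^2(x) = k_N(x, x)
    return mean, var

X = np.array([-1.0, 0.0, 1.0])
y = X ** 2 / 2.0                                  # noiseless samples of x^2 / 2
mu, var = gp_posterior(X, y, np.array([0.0, 2.0]))
print(mu, var)
```

Near a training point the posterior variance is small, while far from the data it approaches the prior variance k(x, x) = 1, which is exactly the uncertainty signal exploited by the sampling rule of Section IV.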
If the prior distribution over V(x) is unknown, V(x) can be approximated by a linear combination of kernel functions in the RKHS, i.e., V(x) = Σ_i c_i k(x, x(i)). The RKHS H_k(X) with kernel k(x, x′) is a complete subspace of L²(X), and its inner product ⟨·, ·⟩_k is endowed with the reproducing property: ⟨V(x), k(x, ·)⟩_k = V(x), ∀V(x) ∈ H_k(X). Moreover, each kernel k(x, x′) has a uniquely associated RKHS H_k(X) [21]. The induced RKHS norm ‖V‖_k = √⟨V, V⟩_k quantifies the smoothness of the function V with respect to the kernel k(x, x′): in brief, V(x) gets smoother as ‖V‖_k decreases. In this work, we consider Lyapunov functions V(x) of relatively low "complexity", or high smoothness, as measured by the RKHS norm.

Remark 1. There is an interesting connection between the kernel function in an RKHS and the covariance function of a GP. If the kernel function in the RKHS is the same as the covariance function of the GP, the posterior mean of the GP is equivalent to the estimator of kernel ridge regression in the RKHS [29]. In addition, the posterior variance of the GP can be regarded as a worst-case error in the RKHS.

Within the GP approach, how one samples plays an important role because the sampling rule affects the confidence level of the estimated ROA, which is the probability at least with which a trajectory initiated from an inner state of the estimated ROA converges to the corresponding stable equilibrium. In other words, it is the probability that an estimated ROA is valid. In this work, we use the GP-UCB based sampling rule. Compared with other heuristics in GP optimization [30–32], the GP-UCB based sampling rule deals with the trade-off between exploitation, for optimizing the objective function, and exploration, for reducing uncertainty, with guaranteed theoretical performance.

B. GP-UCB based algorithm

For a given value of δ ∈ (0, 1) and a sampling domain X ⊂ ℝⁿ, which is a subset of the state space, our goal is to maximize the region of attraction wherein each point converges to the origin with probability at least the confidence level 1 − δ. Thus, a GP-UCB based algorithm is developed in Table I to select the sampling points that enlarge the ROA with a guaranteed confidence level. Specifically, a sampling point x(i) is selected in X by searching for the maximum of μ_{i−1}(x) + β_i^{1/2} σ_{i−1}(x), where the term μ_{i−1}(x) contributes to enlarging the level set of the Lyapunov function and the term σ_{i−1}(x) reduces the uncertainty of the sampling region. Essentially, the sampling rule reconciles the trade-off between exploitation, for enlarging the ROA, and exploration, for reducing the uncertainty of the sampling region.

TABLE I: GP-UCB based Algorithm.
Input: X ⊂ ℝⁿ, δ, ξ, t_n, μ₀, σ₀, k(x, x′), i = 1
Output: x(i), φ(x(i), t), V̂(x(i)), μ_i(x), σ_i(x), i ∈ {1, 2, ..., N}
1: while (i ≤ N)
2:   Choose x(i) = argmax_{x ∈ X} [μ_{i−1}(x) + β_i^{1/2} σ_{i−1}(x)]
3:   Generate a state trajectory φ(x(i), t), t ≥ 0
4:   if (‖φ(x(i), t_n)‖ < ξ)
5:     Sample V̂(x(i)) = V(x(i)) + ε with (3)
6:     Update μ_i(x) and σ_i(x) with (4)
7:     Update i = i + 1
8:   else
9:     Update X = X \ {x(i)}
10:  end if
11: end while

Fig. 3: Illustration of three different sampling rules. The blue line denotes the mean of the unknown function V(x), and the gray shade describes the 95% confidence interval. The three red circles A, B and C represent three different sampling rules. A prefers the point with the maximum posterior mean in order to achieve a large V(x), and B selects the point with the maximum posterior variance in order to reduce the uncertainty. C allows for both the exploitation of a large V(x) and the exploration that eliminates uncertainty.

Let the sampling point x(i) serve as the initial condition of the nonlinear system ẋ = f(x), generating a state trajectory φ(x(i), t), t ≥ 0. If this state trajectory converges to the origin, the point x(i) is called a stable sampling point, and the value of the Lyapunov function at x(i) is estimated by V̂(x(i)) with (3). By choosing {(x(1), V̂(x(1))), ..., (x(i), V̂(x(i)))} as the training set, μ_i(x) and σ_i(x) for the unknown Lyapunov function V(x) are updated according to (4). On the other hand, if the state trajectory φ(x(i), t), t ≥ 0 fails to converge to the origin, the sampling point x(i) is removed from the sampling region X. The process terminates once it reaches the specified number of stable sampling points.

Figure 3 illustrates three different sampling schemes, i.e., Scheme A, Scheme B and Scheme C, which correspond to sampling points A, B and C, respectively. Mathematically, Scheme A can be described as x(i) = argmax_{x ∈ X} μ_{i−1}(x), and Scheme B as x(i) = argmax_{x ∈ X} σ_{i−1}(x). In practice, Scheme A is too greedy and tends to get stuck in local optima, while Scheme B provides a good rule for exploring V(x) globally [23]. Scheme C is therefore designed to integrate A with B by adopting x(i) = argmax_{x ∈ X} J_i(x), where the reward function J_i(x) is given by J_i(x) = μ_{i−1}(x) + β_i^{1/2} σ_{i−1}(x). The parameter β_i depends on the "complexity" of the Lyapunov function, the sample size and the information gain (see Appendix VI-C).

Remark 2. Obtaining the global maximum of J_i(x) over the sampling region X is generally infeasible due to the non-convexity of J_i(x). In practice, there are multiple ways to heuristically search for local maxima. For instance, a local maximum can be readily identified from the finite historic dataset of sampling points instead of the region X. In addition, the local maxima can be approached along the gradient ascent of J_i(x), with the gradient approximated by the finite difference method [33].

Remark 3. For a certified ROA, it suffices to declare a sampling point stable if the corresponding state trajectory crosses the boundary of this ROA, without checking its convergence to the stable equilibrium point. This reduces the computation time for selecting stable sampling points.

IV. MAIN RESULTS

This section presents the main theoretical results on the construction and evaluation of the ROA with a given confidence level using the GP-UCB based algorithm.

Theorem 1. Let δ ∈ (0, 1), and let the measurement noise be bounded by σ. Then it holds with probability at least 1 − δ that

|V(x) − μ_{N−1}(x)| ≤ β_N^{1/2} σ_{N−1}(x), ∀x ∈ X, ∀N ∈ ℤ₊,

where β_N = 2‖V‖²_k + 300 γ_N ln³(N/δ).

Proof. The result follows directly from Theorem 6 in [23].

Remark 4. The RKHS norm ‖·‖_k characterizes the "complexity" or "smoothness" of a function in the RKHS. If the bound on ‖V‖²_k is unknown a priori, it can be computed using kernel ridge regression as ‖V‖²_k = V̂_Nᵀ Pᵀ diag(λ_i (λ_i + Nθ)^{−2}) P V̂_N, where K_N = Pᵀ diag(λ_i) P and diag(λ_i) ∈ ℝ^{N×N} denotes the diagonal matrix with diagonal elements λ_i, i = 1, 2, ..., N. Moreover, θ is a positive tunable parameter controlling the smoothness of V(x) to avoid overfitting: a larger value of θ leads to a smoother V(x) (see Appendix VI-D).
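The selection rule of Table I, restricted to a finite candidate set as suggested in Remark 2, can be sketched as follows. The posterior mean and standard deviation below are illustrative stand-ins, not a fitted GP, and the candidate set is random; the sketch only demonstrates how β trades off the three schemes of Fig. 3.

```python
import math
import random

# Sketch of the GP-UCB selection rule over a finite candidate set
# (Remark 2). mu and sigma are illustrative placeholder functions.
def ucb_select(candidates, mu, sigma, beta):
    """Pick x maximizing the reward J(x) = mu(x) + sqrt(beta) * sigma(x)."""
    return max(candidates, key=lambda x: mu(x) + math.sqrt(beta) * sigma(x))

random.seed(0)
candidates = [random.uniform(-2.0, 2.0) for _ in range(50)]

mu = lambda x: x * x / 2.0             # illustrative posterior mean
sigma = lambda x: 0.5 * abs(x - 1.0)   # illustrative posterior std dev

greedy = ucb_select(candidates, mu, sigma, beta=0.0)              # Scheme A
explore = ucb_select(candidates, lambda x: 0.0, sigma, beta=1.0)  # Scheme B
ucb = ucb_select(candidates, mu, sigma, beta=4.0)                 # Scheme C
print(greedy, explore, ucb)
```

With β = 0 the rule reduces to the greedy Scheme A (pure exploitation); ignoring the mean recovers Scheme B (pure exploration); an intermediate β gives the GP-UCB trade-off of Scheme C.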
For a stable equilibrium point, the Lyapunov function V(x) is viewed as a GP, and Theorem 1 allows us to estimate the ROA of the nonlinear system with a given confidence level. If an ROA Ω⋆ has already been established from an existing Lyapunov function V⋆(x), the proposed approach can be applied to evaluate the stability of the region outside Ω⋆ by treating the mismatch between V(x) and V⋆(x) as a GP. The theoretical results are summarized as follows.

Theorem 2. Let δ ∈ (0, 1) and let V⋆(x) be an existing Lyapunov function for the nonlinear system ẋ = f(x). With the GP-UCB based algorithm in Table I, it holds that

Prob(x ∈ S) ≥ 1 − δ, ∀x ∈ S_{δ,N},

where S_{δ,N} is given by

{x ∈ X | V⋆(x) + μ_{N−1}(x) + β̄_N^{1/2} σ_{N−1}(x) ≤ C_{max,N}}

with C_{max,N} = max_{x(i) ∈ A_N} V̂(x(i)) and β̄_N = 2‖V(x) − V⋆(x)‖²_k + 300 γ_N ln³(N/δ). For a stable equilibrium point, the above conclusion also holds with V⋆(x) ≡ 0.

Proof. See Appendix VI-E.

Remark 5. There are alternative schemes to characterize the mismatch between the converse Lyapunov function V(x) and the existing Lyapunov function V⋆(x) other than the difference scheme ΔV(x) = V(x) − V⋆(x). For example, it is also feasible to adopt the ratio scheme V(x)/V⋆(x) as the unknown function for GP learning. Essentially, a stable equilibrium point can be thought of as a special case of the ROA where the ROA shrinks to a point, so that the existing Lyapunov function becomes V⋆(x) ≡ 0.

V. NUMERICAL SIMULATIONS

This section presents simulation results obtained with the GP-UCB based algorithm for both the single machine infinite bus (SMIB) system and the IEEE 39-bus system. First, the swing dynamics of a power system are introduced.

A. Power system model

Consider an M-bus power system described by a set of swing equations [26, 35]

m_i ψ̈_i + d_i ψ̇_i = p_i − Σ_{j ∈ N_i} (1/B_ij) sin(θ⋆_ij + ψ_ij),   (5)

where ψ_i = θ_i − θ⋆_i, ψ_ij = ψ_i − ψ_j and θ⋆_ij = θ⋆_i − θ⋆_j. Here θ_i denotes the angle of the generator at bus i, and the superscript ⋆ represents the steady-state condition or the power flow solution. The swing equation (5) describes the evolution of the phase angle due to the mismatch between the mechanical power p_i and the electrical power. In addition, N_i is the neighbourhood set of bus i, and B_ij denotes the susceptance of branch ij. The parameters m_i and d_i refer to the inertia and damping coefficients of the machine, respectively. For a load bus i, we normally assume m_i = d_i = 0. In the steady state, the mechanical power p_i can be expressed as p_i = Σ_{j ∈ N_i} (1/B_ij) sin θ⋆_ij. This gives the following perturbed dynamic model

m_i ψ̈_i + d_i ψ̇_i = Σ_{j ∈ N_i} (1/B_ij) [sin θ⋆_ij − sin(θ⋆_ij + ψ_ij)].

Fig. 4: Sampling points and state trajectories of the SMIB system. Red dots represent the unstable sampling points that fail to converge to the origin, while green dots are stable sampling points that converge to the origin. Blue lines indicate the state trajectories, and the arrows point in the direction of motion. The red dashed ellipse denotes the certified ROA according to (6).

A number of approaches to assess the stability of these systems are presented in [2, 3, 5, 6, 13–15]. Here, we use an energy-like function for the above system of nonlinear differential equations:

V⋆(ψ̇, ψ) = (1/2) Σ_{i=1}^{M} Σ_{j ∈ N_i} (1/B_ij) ∫₀^{ψ_ij} [sin θ⋆_ij − sin(θ⋆_ij + τ)] dτ + (1/2) Σ_{i=1}^{M} m_i ψ̇²_i.

The origin (i.e., ψ = ψ̇ = 0) is locally asymptotically stable, and the certified ROA can be estimated by [26]

Ω⋆ = {(ψ, ψ̇) | ψᵀ L ψ + ψ̇ᵀ Λ ψ̇ ≤ C_λ},   (6)

where ψ = (ψ₁, ..., ψ_M)ᵀ and ψ̇ = (ψ̇₁, ..., ψ̇_M)ᵀ. Here L denotes the Laplacian matrix of the power network with weights 1/B_ij, and Λ is the diagonal matrix Λ = diag(m) with m = (m₁, ..., m_M)ᵀ. Moreover, C_λ is given by C_λ = min_{ij} (1/B_ij)(2 cos λ − (π − 2λ) sin λ) with λ = max_{ij} |θ⋆_ij|. For simplicity, the class Γ function α(z) in Definition 1 is chosen as α(z) = z² in order to estimate the value of the converse Lyapunov function.

B. Single machine infinite bus system

In the SMIB system, bus 1 is the generator bus, connected to the infinite bus 2 where θ₂ = ψ₂ = 0. The swing equation of the SMIB system is

m₁ ψ̈₁ + d₁ ψ̇₁ = (1/B₁₂) [sin θ⋆₁ − sin(θ⋆₁ + ψ₁)],

where m₁ = 12, d₁ = 20, p₁ = 0.5 and B₁₂ = 0.1, and thus sin(θ⋆₁₂) = 0.05 and θ⋆₁₂ = arcsin(0.05). All values are in p.u. The ROA can then be estimated by {(ψ₁, ψ̇₁) | 10ψ₁² + 12ψ̇₁² ≤ 18.45} according to (6); it is depicted by the red dashed ellipse in Fig. 4. The GP-UCB based algorithm in Table I is adopted to assess the confidence level of operating states around the certified ROA. The parameter setting is as follows: N = 100, δ = 0.05, ξ = 0.01, Δt = 0.01 and t_n = 100.

Fig. 5: Confidence evaluation for the ROA of the SMIB system with 100 sampling points. The green dots denote the stable sampling points, and the red ellipse is the certified ROA from an existing Lyapunov function. The yellow region indicates the states that converge to the origin with probability at least 95%.
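The SMIB setup above can be checked numerically: with the stated parameters, a state inside the certified ROA {10ψ₁² + 12ψ̇₁² ≤ 18.45} should produce a trajectory converging to the origin. The sketch below uses forward Euler with the paper's Δt = 0.01 for simplicity; any standard ODE solver would do, and the chosen initial state is illustrative.

```python
import math

# Sketch: simulate the SMIB swing equation with the paper's parameters
# (m1 = 12, d1 = 20, B12 = 0.1, sin(theta*) = 0.05) and check that a state
# inside the certified ROA converges to the origin.
M1, D1, INV_B12 = 12.0, 20.0, 10.0            # 1 / B12 = 10
THETA_STAR = math.asin(0.05)

def simulate(psi, dpsi, dt=0.01, t_end=100.0):
    """Integrate m1*psi'' + d1*psi' = (1/B12)(sin t* - sin(t* + psi))."""
    for _ in range(round(t_end / dt)):
        acc = (INV_B12 * (math.sin(THETA_STAR) - math.sin(THETA_STAR + psi))
               - D1 * dpsi) / M1
        psi, dpsi = psi + dt * dpsi, dpsi + dt * acc
    return psi, dpsi

def in_certified_roa(psi, dpsi):
    return 10.0 * psi ** 2 + 12.0 * dpsi ** 2 <= 18.45

x0 = (1.0, 0.5)                                # illustrative state inside the ROA
assert in_certified_roa(*x0)
psi_T, dpsi_T = simulate(*x0)
print(abs(psi_T), abs(dpsi_T))
```

After t_n = 100 s both states are numerically indistinguishable from zero, consistent with the point lying inside the certified ROA; the converse is not implied, since points outside Ω⋆ may still converge, which is precisely the gap the GP-based estimate exploits.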
In addition, the squared exponential kernel or cov ariance function is employed to learn the unknown L yapunov function with the unit characteristic length-scale and µ 0 = 0. Figure 5 presents the simulation result for the SMIB system. The sampling points are denoted by small green dots, and the region of operating states with confidence lev el at least 95% has been marked in yello w . This implies that each state point in the yellow region conv erges to the origin with the probability that is not less than 95%. It is observed that the red dashed ellipse is closely surrounded by the yellow region. Notably , the yello w region co vers most sampling points outside the certified R O A except for four sampling points that are far aw ay from the origin. C. IEEE 39 b us system In order to validate the scalability of the proposed approach, numerical simulations are also conducted on the IEEE 39-bus 10-machine system (see Fig. 6), and the parameters are the same as those in [36]. Bus 31 with a generator is assigned as the swing b us. W e then consider the dynamics for the remaining 9 machines. The detailed parameter setting and the algorithm can be found in our MA TLAB code made av ailable in GitHub [37]. Figure 7 shows the assessment result for the R OA of IEEE 39 bus system with 9 machines. T o facilitate the visualization, we project the stable sampling points, the certified RO A and the confidence region onto 9 distinct two- dimensional planes, respectiv ely . Essentially , each plane acts as a cross section to sho wcase the profile of confidence region and certified R OA with respect to a dif ferent machine. In each panel of Fig. 7, the state points inside the red dashed ellipse is Fig. 6: IEEE 39 Bus System guaranteed to con ver ge to the origin, while those in the yellow region are asymptotically con vergent with the probability at least 95%. By Monte Carlo like estimate, the volume of yello w region is about 2 . 
3 × 10 4 times larger than that of the certified R OA. This achiev ement is due to the large number of state dimensions or the size of the considered dynamical system. Note that this comparison is not entirely fair as the certified R OA can ensure that the system state will always conv erge to the stable equilibrium if it starts from the inside, while our estimated yellow region is only 95% confident. Similar to the case of the SMIB system, the yello w region does not cov er the state points that are relativ ely far aw ay from the origin. Actually , the profile of yello w regions largely depends on the distribution and size of sampling points as well as the choice of kernel functions for GP learning. D. Discussions on computational cost The computational burden associated with GP-UCB algo- rithm is mostly due to two processes: solving the dynam- ical equation or swing equation using ODE solvers, and GP optimization presented in (4). While the former depends heavily on the system size, the dimension of differential equation for power systems does not result in the visible increase of computational cost in GP optimization. Indeed, the computational cost is essentially immune to the dimension of power system dynamics. This is because the computation burden largely results from the operation of matrix inv erse in (4), where the size of kernel matrix K N mainly depends on the number of sampling points N rather than the dimension of power system dynamics. W ith Matlab 2017b in the desktop (Intel i7-3770 CPU 3.40GHz and installed RAM 8GB), it takes around 10 minutes for IEEE 39-b us system and 50 seconds for the SMIB system in our simulations. For the large- size sampling data, many ef ficient approaches for training the GP with the desirable performance [38] are av ailable. F or instance, the sparse representation of GP model is de veloped to 7 Fig. 7: Confidence ev aluation for the R O A of IEEE 39 bus system with 400 sampling points. 
In each panel, the green dots denote the stable sampling points, and the red dashed ellipse represents the boundary of a certified RO A. The yello w region indicates the state that con ver ges to the origin with the probability at least 95%. ov ercome the limitations for large data sets via the sequential construction of a sub-sample of the entire sampling data [39]. V I . C O N C L U S I O N S A N D F U T U R E W O R K In this paper , we in vestigated the problem of estimating the R OA for power systems. By treating the unkno wn L yapunov function as a Gaussian Process, we assessed state stability of power systems with the aid of the con verse L yapunov function. For an existing L yapunov function, our approach allows for extending the pre-existing R O A with a prov able confidence lev el. In addition, a GP-UCB based algorithm was developed to deal with the trade-off between exploration and exploitation in selecting stable sampling points. Numerical simulations are conducted to validate the proposed approach on the IEEE test cases. In the next step, we will consider the online learning appli- cations of unknown dynamics in practical power systems and the real-time prediction and stability assessment. This requires the creation of an efficient numerical algorithm with the aid of Gaussian Process and the con verse L yapunov function, which introduces the second part of this work. Another improv ement is to optimally rescale state variables to be aligned to the shape of the real R O A which is not equal in all dimensions. A C K N O W L E D G E M E N T The work of Chao Zhai and Hung Nguyen is supported by NTU SUG. A P P E N D I X This section provides a mathematical definition of the class Γ function and theoretical proofs for Lemma 2, Lemma 3 and Theorem 2, respectiv ely . First of all, the definition is presented as follows. A. The class Γ function Definition 1. 
The class Γ function consists of all continuous functions $\alpha: [0, a) \to [0, \infty)$ that satisfy the following conditions:
1) $\alpha(z) \in C^2$ for all $z > 0$.
2) $\alpha(z) > \alpha(y)$ for all $z > y \ge 0$, and $\alpha(0) = 0$.
3) For each $z \ge 0$, there exists $m > 0$ such that $\alpha(z) \le z^m$.

Remark 6. Condition 1 in Definition 1 indicates that the first two derivatives of the class Γ function $\alpha(z)$ exist and are continuous. Condition 2 implies that $\alpha(z)$ is strictly increasing. Condition 3 restricts the growth rate of $\alpha(z)$.

Remark 7. The construction of $V(x)$ in Lemma 1 yields an important property of the region of attraction $S$, namely $\lim_{c \to +\infty} \Omega_c = S$. Specifically, for all $x \in \Omega_c$ we have $\int_0^{\infty} \alpha(\|\phi(x,t)\|)\,dt \le c$, which implies $\lim_{t \to +\infty} \alpha(\|\phi(x,t)\|) = 0$ and $\lim_{t \to +\infty} \phi(x,t) = 0$. Thus $x \in S$ and $\Omega_c \subseteq S$. In addition, it is guaranteed that $V(x) < \infty$ for all $x \in S$, which indicates $x \in \Omega_\infty$ and thus $S \subseteq \Omega_\infty$. It follows that $\Omega_c \subseteq S \subseteq \Omega_\infty$. Since the constant $c$ can be taken arbitrarily large, we have $\lim_{c \to +\infty} \Omega_c = S$.

B. An upper bound of the discretization error

An upper bound of the estimation error due to discretizing the converse Lyapunov function $V(x)$ is given below.

Lemma 2. Let $[\partial f / \partial x](0)$ be Hurwitz. Then there exist positive constants $\kappa$, $\eta$, $\lambda$ and $m$ such that the following inequality holds:
$$\left| V(x) - \hat{V}(x) \right| \le \frac{\kappa n (\Delta t)^3}{12} + \frac{\eta^m \|\phi(x, t_n)\|^m}{m \lambda}. \quad (7)$$

The proof is given below.
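Before turning to the proof, the bound (7) can be checked numerically in a case where every quantity has a closed form (a sketch, not part of the paper: it assumes the scalar system $\dot{x} = -x$ with flow $\phi(x,t) = x e^{-t}$ and $\alpha(z) = z^2$, so that $V(x) = \int_0^{\infty} x^2 e^{-2t}\,dt = x^2/2$, with $\eta = \lambda = 1$, $m = 2$ and $\kappa = \max_t |\partial_{tt}\alpha| = 4x^2$):

```python
import numpy as np

# Sketch: trapezoidal discretization of the converse Lyapunov integral
# for xdot = -x, phi(x, t) = x * exp(-t), alpha(z) = z**2.
x0, t_n, n = 1.5, 8.0, 400
dt = t_n / n
ts = np.linspace(0.0, t_n, n + 1)
alpha_vals = (x0 * np.exp(-ts)) ** 2           # alpha(||phi(x, t_i)||)

# \hat V(x): trapezoidal rule on [0, t_n]
V_hat = dt * (alpha_vals[0] / 2 + alpha_vals[1:-1].sum() + alpha_vals[-1] / 2)
V_exact = x0 ** 2 / 2                          # V(x) = x^2 / 2 in closed form

# Right-hand side of (7): kappa*n*dt^3/12 + eta^m ||phi(x,t_n)||^m / (m*lambda)
kappa = 4 * x0 ** 2                            # max |d^2 alpha / dt^2| on [0, t_n]
bound = kappa * n * dt ** 3 / 12 + (x0 * np.exp(-t_n)) ** 2 / 2

assert abs(V_hat - V_exact) <= bound           # |V - V_hat| within the bound (7)
```

Both error sources behave as Remark 8 predicts: refining $\Delta t$ shrinks the first term, while extending $t_n$ shrinks the second.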
Proof. Since
$$V(x) - \hat{V}(x) = \left[ V(x) - \int_0^{t_n} \alpha(\|\phi(x,t)\|)\,dt \right] + \left[ \int_0^{t_n} \alpha(\|\phi(x,t)\|)\,dt - \sum_{i=1}^{n} \alpha(\|\phi(x,t_i)\|) \Delta t \right],$$
it follows that
$$\left| V(x) - \sum_{i=1}^{n} \alpha(\|\phi(x,t_i)\|) \Delta t \right| \le \left| V(x) - \int_0^{t_n} \alpha(\|\phi(x,t)\|)\,dt \right| + \left| \int_0^{t_n} \alpha(\|\phi(x,t)\|)\,dt - \sum_{i=1}^{n} \alpha(\|\phi(x,t_i)\|) \Delta t \right| = \int_{t_n}^{\infty} \alpha(\|\phi(x,t)\|)\,dt + \left| \int_0^{t_n} \alpha(\|\phi(x,t)\|)\,dt - \sum_{i=1}^{n} \alpha(\|\phi(x,t_i)\|) \Delta t \right|.$$
By the error bound of the trapezoidal rule for numerical integration on the interval $[0, t_n]$ [40],
$$\left| \int_0^{t_n} \alpha(\|\phi(x,t)\|)\,dt - \sum_{i=1}^{n} \alpha(\|\phi(x,t_i)\|) \Delta t \right| \le \frac{\kappa n (\Delta t)^3}{12},$$
where $|\partial_{tt} \alpha| \le \kappa$ for $t \in [0, t_n]$. Since $[\partial f / \partial x](0)$ is Hurwitz, $x = 0$ is an exponentially stable equilibrium point by Corollary 4.3 in [17]. This implies that there exist positive constants $\eta$ and $\lambda$ such that $\|\phi(x,t)\| \le \eta \|\phi(x,t_n)\| e^{-\lambda (t - t_n)}$ for all $t \ge t_n$. By Condition 3 of the class Γ function in Definition 1, there exists a positive constant $m$ such that
$$\alpha(\|\phi(x,t)\|) \le \alpha\!\left( \eta \|\phi(x,t_n)\| e^{-\lambda (t - t_n)} \right) \le \eta^m \|\phi(x,t_n)\|^m e^{-m \lambda (t - t_n)}, \quad \forall t \ge t_n.$$
Therefore, we obtain
$$\int_{t_n}^{\infty} \alpha(\|\phi(x,t)\|)\,dt \le \int_{t_n}^{\infty} \eta^m \|\phi(x,t_n)\|^m e^{-m \lambda (t - t_n)}\,dt = \frac{\eta^m \|\phi(x,t_n)\|^m}{m \lambda}.$$
This completes the proof.

We make some remarks on Inequality (7) below.

Remark 8. For a fixed total sampling time $t_n$, the first term on the right-hand side of Inequality (7) converges to 0 as the sampling interval $\Delta t$ goes to 0. Since $\lim_{n \to \infty} \|\phi(x, t_n)\| = 0$, the second term on the right-hand side of Inequality (7) converges to 0 as $n$ goes to infinity.

Remark 9.
The first partial derivative of $\alpha(\|\phi(x,t)\|)$ with respect to time $t$ is given by
$$\partial_t \alpha = \frac{\partial \alpha(\|\phi(x,t)\|)}{\partial t} = \alpha'(\|\phi(x,t)\|) \cdot \frac{\phi(x,t)^T}{\|\phi(x,t)\|} \cdot \frac{d\phi(x,t)}{dt} = \alpha'(\|\phi(x,t)\|) \cdot \frac{\phi(x,t)^T f(\phi(x,t))}{\|\phi(x,t)\|},$$
where $\alpha'(z) = d\alpha(z)/dz$. The second partial derivative of $\alpha(\|\phi(x,t)\|)$ with respect to time $t$ is given by
$$\partial_{tt} \alpha = \frac{\partial^2 \alpha(\|\phi(x,t)\|)}{\partial t^2} = \frac{\partial}{\partial t} \left[ \alpha'(\|\phi(x,t)\|) \cdot \frac{\phi(x,t)^T f(\phi(x,t))}{\|\phi(x,t)\|} \right] = \Big( \alpha''(\|\phi(x,t)\|) \|\phi(x,t)\| - \alpha'(\|\phi(x,t)\|) \Big) \cdot \frac{[\phi(x,t)^T f(\phi(x,t))]^2}{\|\phi(x,t)\|^3} + \alpha'(\|\phi(x,t)\|) \cdot \frac{\|f(\phi(x,t))\|^2 + \phi(x,t)^T \partial_t f}{\|\phi(x,t)\|}, \quad (8)$$
where $\alpha''(z) = d^2 \alpha(z)/dz^2$ and
$$\partial_t f = \frac{\partial f(\phi(x,t))}{\partial t} = f'(\phi(x,t)) f(\phi(x,t)).$$
Equation (8) enables us to estimate the upper bound of $|\partial_{tt} \alpha|$ on $[0, t_n]$ and thus to obtain the constant $\kappa$.

C. Information gain

To quantify the reduction in uncertainty about $V(x)$ from the observations $\hat{V}(x)$, the information gain of a Gaussian process is introduced as
$$I(\hat{V}_A; V_A) = \frac{1}{2} \log |I + \sigma^{-2} K_A|,$$
where $K_A = [k(x, x')]_{x, x' \in A}$ is the covariance matrix of $V_A = [V(x)]_{x \in A}$ over the sample set $A$. Let $\gamma_N$ denote the upper bound of $I(\hat{V}_A; V_A)$ over sample sets $A$ with $|A| = N$, that is,
$$\gamma_N = \max_{A \subset X : |A| = N} I(\hat{V}_A; V_A).$$
In fact, $\gamma_N$ depends on the choice of the kernel function $k(x, x')$.

Lemma 3. For $A \subseteq X \subset \mathbb{R}^n$, it holds that $\gamma_N \le \frac{N}{2\sigma^2}$ when $k(x, x') \le 1$.

Proof. Since $K_A$ is a positive definite kernel matrix, it follows that
$$|I + \sigma^{-2} K_A| = \prod_{i=1}^{N} (1 + \sigma^{-2} \lambda_i),$$
where $\lambda_i$, $i \in I_N = \{1, \ldots, N\}$, are the (positive) eigenvalues of $K_A$. Considering that $\log(1 + x) \le x$ for all $x \ge 0$, we have
$$\log |I + \sigma^{-2} K_A| = \log \left( \prod_{i=1}^{N} (1 + \sigma^{-2} \lambda_i) \right)$$
$$= \sum_{i=1}^{N} \log(1 + \sigma^{-2} \lambda_i) \le \sigma^{-2} \sum_{i=1}^{N} \lambda_i.$$
Since $\sum_{i=1}^{N} \lambda_i = \mathrm{tr}(K_A)$ and $k(x, x') \le 1$, we obtain
$$\sum_{i=1}^{N} \lambda_i = \mathrm{tr}(K_A) = \sum_{i=1}^{N} k(x_i, x_i) \le N.$$
Therefore,
$$\log |I + \sigma^{-2} K_A| \le \sigma^{-2} N \quad \text{for all } A \subset X \text{ with } |A| = N,$$
and hence
$$\gamma_N \le \frac{N}{2 \sigma^2}.$$
This completes the proof.

Remark 10. The upper bound for $\gamma_N$ in Lemma 3 applies to all kernel functions satisfying $k(x, x') \le 1$. For specific kernels (e.g., the finite-dimensional linear kernel, the squared exponential kernel, and the Matérn kernel), tighter upper bounds are available [23].

D. Computation of the RKHS norm

Kernel ridge regression can be formulated as a regularized empirical risk minimization problem over the RKHS $\mathcal{H}_k(X)$:
$$\min_{V \in \mathcal{H}_k(X)} \frac{1}{N} \sum_{i=1}^{N} \left( \hat{V}(x^{(i)}) - V(x^{(i)}) \right)^2 + \theta \|V\|_k^2.$$
It follows from Theorem 3.4 in [21] that this minimization problem has a unique solution, given by $V(x) = \sum_{i=1}^{N} c_i k(x, x^{(i)})$ with the coefficient vector $c = (c_1, c_2, \ldots, c_N)^T = (K_N + N\theta I)^{-1} \hat{V}_N$. Thus, $\|V\|_k^2$ can be computed as
$$\|V\|_k^2 = \langle V, V \rangle_k = \sum_{i=1}^{N} \sum_{j=1}^{N} c_i c_j k(x^{(i)}, x^{(j)}) = c^T K_N c.$$
Substituting $c = (K_N + N\theta I)^{-1} \hat{V}_N$ and the positive semidefinite matrix $K_N = P^T \mathrm{diag}(\lambda_i) P$ yields
$$\|V\|_k^2 = c^T K_N c = \hat{V}_N^T (K_N + N\theta I)^{-1} K_N (K_N + N\theta I)^{-1} \hat{V}_N = \hat{V}_N^T P^T \mathrm{diag}\big( \lambda_i (\lambda_i + N\theta)^{-2} \big) P \hat{V}_N.$$

E. Proof of Theorem 2

The proof consists of two parts. The first part estimates the ROA of a stable equilibrium point, and the second part addresses the ROA in the case of an existing Lyapunov function $V^\star(x)$.

(1) Since $x = 0$ is an asymptotically stable equilibrium point of the nonlinear system $\dot{x} = f(x)$, Lemma 1 guarantees the existence of the Lyapunov function $V(x)$.
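The two ingredients of the constant $\beta_N$ used below, the RKHS-norm estimate $\|V\|_k^2 = c^T K_N c$ from Appendix D and the information-gain bound $\gamma_N \le N/(2\sigma^2)$ from Lemma 3, can be checked numerically (a sketch with synthetic data; the kernel, sample size and noise level are illustrative choices, not from the paper):

```python
import numpy as np

# Sketch (synthetic data): RKHS-norm estimate ||V||_k^2 = c^T K_N c from
# Appendix D and the information-gain bound gamma_N <= N/(2 sigma^2) of Lemma 3.

def rbf_kernel(X, ell=1.0):
    # Squared-exponential kernel; satisfies k(x, x') <= 1, as Lemma 3 requires.
    d = X[:, None] - X[None, :]
    return np.exp(-0.5 * d**2 / ell**2)

rng = np.random.default_rng(1)
N, sigma, theta = 30, 0.1, 1e-3
X = rng.uniform(-2.0, 2.0, N)                    # sampling points x^(i)
V_obs = X**2 + 0.01 * rng.standard_normal(N)     # noisy observations of V(x^(i))

K_N = rbf_kernel(X)

# Appendix D: kernel ridge coefficients c = (K_N + N*theta*I)^{-1} \hat V_N,
# giving the RKHS-norm estimate ||V||_k^2 = c^T K_N c.
c = np.linalg.solve(K_N + N * theta * np.eye(N), V_obs)
rkhs_norm_sq = c @ K_N @ c

# Information gain I = (1/2) log|I + sigma^{-2} K_N| via a stable log-det.
sign, logdet = np.linalg.slogdet(np.eye(N) + K_N / sigma**2)
info_gain = 0.5 * logdet

# Lemma 3: the gain never exceeds N / (2 sigma^2) when k(x, x') <= 1.
assert info_gain <= N / (2 * sigma**2)
```

Since the kernel diagonal is at most 1, $\mathrm{tr}(K_N) \le N$, so the gain can never exceed $N/(2\sigma^2)$ regardless of where the samples lie.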
Considering a sequence of sampling points $x^{(1)}, x^{(2)}, \ldots, x^{(N)}$ in $A \subset X$ that generate stable state trajectories, the values $V(x^{(i)})$, $i \in \{1, \ldots, N\}$, can be estimated as $\hat{V}(x^{(i)})$ according to Equation (3). Treating $\hat{V}(x^{(i)})$ as observations of a GP, the measurement noise $\varepsilon$ is uniformly bounded by $\sigma$, which satisfies
$$\sigma \le \frac{\kappa n (\Delta t)^3}{12} + \frac{\eta^m \|\phi(x, t_n)\|^m}{m \lambda}$$
by Lemma 2. In addition, it follows from Theorem 1 that, with probability at least $1 - \delta$, the inequality $|V(x) - \mu_{N-1}(x)| \le \beta_N^{1/2} \sigma_{N-1}(x)$ holds, where $\beta_N = 2\|V\|_k^2 + 300\, \gamma_N \ln^3(N/\delta)$ and $\gamma_N \le \frac{N}{2\sigma^2}$ by Lemma 3. This allows us to deduce that the inequality
$$V(x) \le \mu_{N-1}(x) + \beta_N^{1/2} \sigma_{N-1}(x), \quad \forall x \in X,$$
holds with probability at least $1 - \delta$. We then define a level set of the converse Lyapunov function $V(x)$ as $W_N = \{x \in X \mid V(x) \le C_{\max,N}\}$, which includes all the stable state trajectories of the sampling points $x^{(i)}$, $i \in \{1, \ldots, N\}$, and is a compact subset of the region of attraction $S$ (i.e., $W_N \subseteq S$). Thus, if $\mu_{N-1}(x) + \beta_N^{1/2} \sigma_{N-1}(x) \le C_{\max,N}$, the inequality $V(x) \le C_{\max,N}$ holds with probability at least $1 - \delta$, which implies $\mathrm{Prob}(x \in W_N) \ge 1 - \delta$ for all $x \in \Omega_{\delta,N}$, with
$$\Omega_{\delta,N} = \left\{ x \in X \;\middle|\; \mu_{N-1}(x) + \beta_N^{1/2} \sigma_{N-1}(x) \le C_{\max,N} \right\}.$$
Considering that $W_N \subseteq S$, we obtain $\mathrm{Prob}(x \in S) \ge 1 - \delta$ for all $x \in \Omega_{\delta,N}$.

(2) Since $V^\star(x)$ is an existing Lyapunov function for the nonlinear system $\dot{x} = f(x)$ with the certified ROA $\Omega^\star$, the origin is an asymptotically stable equilibrium point. This ensures the existence of a converse Lyapunov function $V(x)$ according to Lemma 1. Define $\Delta V(x) = V(x) - V^\star(x)$ as the unknown function for GP learning. By treating the measurements $\Delta \hat{V}(x^{(i)}) = \hat{V}(x^{(i)}) - V^\star(x^{(i)})$ as observations of a GP at the sampling points $x^{(i)}$, $i \in \{1, \ldots, N\}$, according to the GP-ROA based algorithm in Table I, it follows from Theorem 1 that the inequality $|\Delta V(x) - \mu_{N-1}(x)| \le \bar{\beta}_N^{1/2} \sigma_{N-1}(x)$ holds with probability at least $1 - \delta$. Replacing $\Delta V(x)$ with $V(x) - V^\star(x)$ gives $|V(x) - V^\star(x) - \mu_{N-1}(x)| \le \bar{\beta}_N^{1/2} \sigma_{N-1}(x)$. In light of the proof for a stable equilibrium point, it is concluded that for any $x \in S_{\delta,N}$, the inequality $\mathrm{Prob}(x \in S) \ge 1 - \delta$ holds. This completes the proof.

REFERENCES

[1] Nguyen, H. D., Vu, T. L., Slotine, J.-J. and Turitsyn, K., 2017. Contraction analysis of nonlinear DAE systems. arXiv preprint.
[2] Pareek, P., Turitsyn, K., Dvijotham, K. and Nguyen, H. D., 2018. A sufficient condition for small-signal stability and construction of robust stability region. arXiv preprint.
[3] Lee, D., Nguyen, H. D., Dvijotham, K. and Turitsyn, K., 2018. Convex restriction of power flow feasibility sets. arXiv preprint arXiv:1803.00818.
[4] Mehta, D., Nguyen, H. D. and Turitsyn, K., 2016. Numerical polynomial homotopy continuation method to locate all the power flow solutions. IET Generation, Transmission & Distribution, 10(12), pp. 2972-2980.
[5] Nguyen, H. D. and Turitsyn, K., 2015. Robust stability assessment in the presence of load dynamics uncertainty. IEEE Transactions on Power Systems, 31(2), pp. 1579-1594.
[6] Nguyen, H. D. and Turitsyn, K., 2015. Voltage multistability and pulse emergency control for distribution system with power flow reversal. IEEE Transactions on Smart Grid, 6(6), pp. 2985-2996.
[7] Bogodorova, T., Vanfretti, L. and Turitsyn, K., 2015. Bayesian parameter estimation of power system primary frequency controls under modeling uncertainties. IFAC-PapersOnLine, 48(28), pp. 461-465.
[8] Vu, T. L. and Turitsyn, K., 2016. Lyapunov functions family approach to transient stability assessment. IEEE Transactions on Power Systems, 31(2), pp. 1269-1277.
[9] Zhang, Z., Nguyen, H. D., Turitsyn, K. and Daniel, L., 2015. Probabilistic power flow computation via low-rank and sparse tensor recovery. arXiv preprint.
[10] Lesieutre, B. and Wu, D., 2015. An efficient method to locate all the load flow solutions - revisited. 53rd Annual Allerton Conference on Communication, Control, and Computing (Allerton), pp. 381-388.
[11] Liu, J., Cui, B., Molzahn, D. K., Chen, C. and Lu, X., 2019. Optimal power flow for DC networks with robust feasibility and stability guarantees.
[12] Aolaritei, L., Bolognani, S. and Dörfler, F., 2017. A distributed voltage stability margin for power distribution networks. IFAC-PapersOnLine, 50(1), pp. 13240-13245.
[13] Vorobev, P., Huang, P., Al Hosani, M., Kirtley, J. L. and Turitsyn, K., 2018. High-fidelity model order reduction for microgrids stability assessment. IEEE Transactions on Power Systems, 33(1), pp. 874-887.
[14] Tuyen, N. D., Fujita, G., Funabashi, T. and Nomura, M., 2017. Analysis of transient-to-island mode of power electronic interface with conventional dq-current controller and proposed droop-based controller. Electrical Engineering, 99(1), pp. 47-57.
[15] Tran, T. S., Nguyen, D. T. and Fujita, G., 2019. The analysis of technical trend in islanding operation, harmonic distortion, stabilizing frequency, and voltage of islanded entities. Resources, 8(1).
[16] Ali, M., Dymarsky, A. and Turitsyn, K., 2017. Transversality enforced Newton-Raphson algorithm for fast calculation of maximum loadability. IET Generation, Transmission & Distribution.
[17] Khalil, H. K. and Grizzle, J. W., 2002. Nonlinear Systems (Vol. 3). Upper Saddle River, NJ: Prentice Hall.
[18] Izumi, S., Somekawa, H., Xin, X. and Yamasaki, T., 2018. Estimation of regions of attraction of power systems by using sum of squares programming. Electrical Engineering, 100(4), pp. 2205-2216.
[19] Chesi, G., 2011. Domain of Attraction: Analysis and Control via SOS Programming (Vol. 415). Springer Science & Business Media.
[20] Bobiti, R. and Lazar, M., 2018. Automated-sampling-based stability verification and DOA estimation for nonlinear systems. IEEE Transactions on Automatic Control, 63(11), pp. 3659-3674.
[21] Kanagawa, M., Hennig, P., Sejdinovic, D. and Sriperumbudur, B. K., 2018. Gaussian processes and kernel methods: A review on connections and equivalences. arXiv preprint.
[22] Rasmussen, C. E. and Nickisch, H., 2010. Gaussian processes for machine learning (GPML) toolbox. Journal of Machine Learning Research, 11(Nov), pp. 3011-3015.
[23] Srinivas, N., Krause, A., Kakade, S. M. and Seeger, M. W., 2012. Information-theoretic regret bounds for Gaussian process optimization in the bandit setting. IEEE Transactions on Information Theory, 58(5), pp. 3250-3265.
[24] Berkenkamp, F., Krause, A. and Schoellig, A. P., 2016. Bayesian optimization with safety constraints: Safe and automatic parameter tuning in robotics. arXiv preprint.
[25] Gibbs, M. N. and MacKay, D. J., 2000. Variational Gaussian process classifiers. IEEE Transactions on Neural Networks, 11(6), pp. 1458-1464.
[26] Münz, U. and Romeres, D., 2013. Region of attraction of power systems. IFAC Proceedings Volumes, 46(27), pp. 49-54.
[27] Jones, M., Mohammadi, H. and Peet, M. M., 2017. Estimating the region of attraction using polynomial optimization: A converse Lyapunov result. In 2017 IEEE 56th Annual Conference on Decision and Control (CDC), pp. 1796-1802.
[28] Kundur, P., Balu, N. J. and Lauby, M. G., 1994. Power System Stability and Control (Vol. 7). New York: McGraw-Hill.
[29] Kimeldorf, G. S. and Wahba, G., 1970. A correspondence between Bayesian estimation on stochastic processes and smoothing by splines. The Annals of Mathematical Statistics, 41(2), pp. 495-502.
[30] Mockus, J., 1975. On Bayesian methods for seeking the extremum. In Optimization Techniques IFIP Technical Conference, pp. 400-404. Springer, Berlin, Heidelberg.
[31] Mockus, J., 2012. Bayesian Approach to Global Optimization: Theory and Applications (Vol. 37). Springer Science & Business Media.
[32] Brochu, E., Cora, V. M. and De Freitas, N., 2010. A tutorial on Bayesian optimization of expensive cost functions, with application to active user modeling and hierarchical reinforcement learning. arXiv preprint arXiv:1012.2599.
[33] Stoer, J. and Bulirsch, R., 2013. Introduction to Numerical Analysis (Vol. 12). Springer Science & Business Media.
[34] Zimmerman, R. D., Murillo-Sánchez, C. E. and Thomas, R. J., 2011. MATPOWER: Steady-state operations, planning and analysis tools for power systems research and education. IEEE Transactions on Power Systems, 26(1), pp. 12-19.
[35] Sauer, P. W. and Pai, M. A., 1998. Power System Dynamics and Stability (Vol. 101). Upper Saddle River, NJ: Prentice Hall.
[36] Dörfler, F., Jovanović, M. R., Chertkov, M. and Bullo, F., 2014. Sparsity-promoting optimal wide-area control of power networks. IEEE Transactions on Power Systems, 29(5), pp. 2281-2291.
[37] https://github.com/Chaocas/ROA-for-Power-Systems
[38] Liu, H., Ong, Y. S., Shen, X. and Cai, J., 2018. When Gaussian process meets big data: A review of scalable GPs. arXiv preprint.
[39] Csató, L. and Opper, M., 2002. Sparse on-line Gaussian processes. Neural Computation, 14(3), pp. 641-668.
[40] Atkinson, K. E., 2008. An Introduction to Numerical Analysis. John Wiley & Sons.
