A Barrier Function Approach to Finite-Time Stochastic System Verification and Control

This paper studies the problem of enforcing safety of a stochastic dynamical system over a finite-time horizon. We use stochastic control barrier functions as a means to quantify the probability that a system exits a given safe region of the state sp…

Authors: Cesar Santoyo, Maxence Dutreix, Samuel Coogan

Abstract

This paper studies the problem of enforcing safety of a stochastic dynamical system over a finite-time horizon. We use stochastic control barrier functions as a means to quantify the probability that a system exits a given safe region of the state space in finite time. A barrier certificate condition that bounds the expected value of the barrier function over the time horizon is recast as a sum-of-squares optimization problem for efficient numerical computation. Unlike prior works, the proposed certificate condition includes a state-dependent upper bound on the evolution of the expectation. We present formulations for both continuous-time and discrete-time systems. Moreover, for systems for which the drift dynamics are affine-in-control, we propose a method for synthesizing polynomial state feedback controllers that achieve a specified probability of safety. Several case studies are presented which benchmark and illustrate the performance of our verification and control method in the continuous-time and discrete-time domains.

1 Introduction

Reliance on complex, safety-critical systems is increasing, which has made safety verification of such systems of utmost importance. For example, environments populated by both humans and autonomous systems (e.g., fulfillment centers and autonomous vehicles) require rigorous safety verification to ensure desired behavior is achieved. From a practical standpoint, safety verification can translate directly to ensuring qualitative guidelines such as collision avoidance are maintained. Safety-critical systems are often analyzed in a purely deterministic framework; however, many real-world applications are subject to stochastic disturbances and are better modeled as stochastic systems.
A common approach to safety verification in deterministic systems is via barrier functions, which provide Lyapunov-like guarantees regarding system behavior. The existence of a barrier function which satisfies a barrier certificate can often be enough to certify the safe operation of a system [17]. Recent work has modified and improved the deterministic form of barrier functions and expanded their application. In particular, control barrier functions have been introduced to guarantee safety of affine-in-control systems [5, 24]. They have been demonstrated in applications such as cruise control [4, 5], collision avoidance in robotic swarms [23], and walking robots [6], and have recently been extended to allow for input-to-state safe control barrier functions [10] and to guarantee finite-time convergence to a safe region [14].

¹ C. Santoyo (csantoyo@gatech.edu) and M. Dutreix (maxdutreix@gatech.edu) are with the School of Electrical & Computer Engineering, Georgia Institute of Technology, Atlanta, GA, 30318, USA.
² S. Coogan (sam.coogan@gatech.edu) is with the School of Electrical & Computer Engineering and the School of Civil and Environmental Engineering, Georgia Institute of Technology, Atlanta, GA, 30318, USA.
³ This work was partially supported by NSF under Grant #1749357. C. Santoyo was supported by the NSF Graduate Research Fellowship Program under Grant No. DGE-1650044.

In the stochastic setting, continuous-time (CT) safety verification via barrier certificates for infinite time horizons was introduced in [17] alongside the deterministic counterpart. The work presented in [17] provides a framework for bounding the probability a system will ever exit a safe region based on a non-negative barrier function defined on the system state space.
To obtain probabilistic guarantees over infinite time horizons, [18] requires the infinitesimal generator, which dictates the expected value evolution of a stochastic process, to be non-positive; i.e., the barrier function is required to be a supermartingale. The paper [20] relaxes the supermartingale condition for finite-time safety verification and instead provides a barrier certificate which only requires the infinitesimal generator of the barrier process to be upper bounded by a constant. Such processes are called c-martingales and allow the expected value of the barrier function to increase over time. This approach results in a safety probability bound for finite-time horizons.

Preprint submitted to Automatica, 12 September 2019

Recently, discrete-time (DT) control barrier functions have been used to certify safety for bipedal robots [1], safe policy synthesis for multi-agent systems [2] and for temporal logic verification of discrete-time systems [8, 9]. The work presented in [1] mirrors, in discrete-time, the formulation of deterministic continuous-time control barrier functions initially presented in [4] whose overarching theory and applications are summarized in [3]. In [1], the formulation for discrete-time barrier functions presents a significant distinction from the continuous-time counterpart resulting in a nonlinear optimization problem which is not necessarily convex. This poses challenges in solving the stochastic discrete-time controller synthesis problem in a similar manner to that of stochastic continuous systems shown in [19]. There exist few publications related to verification and control of stochastic discrete-time systems.

The present paper studies the problem of verifying safety of stochastic systems on finite time horizons for both continuous-time and discrete-time domains, and the contributions are as follows.
First, we build on the approaches proposed in [17, 20] and propose a barrier certificate constraint that imposes a state-dependent bound on the expected value for both continuous-time and discrete-time systems. This bound was originally proposed and studied by Kushner in [11, 12, 13] in the context of stochastic stability. The proposed barrier certificate allows the expected value of the barrier to increase and covers the c-martingale condition of [20] as a special case. However, our formulation also accounts for the system dynamics in the expectation constraint. This allows for probability bounds that are no worse than those of the c-martingale condition and that, in many cases, especially for large values of $\sigma$, are better.

Second, as in [17, 20], we compute barrier functions using sum-of-squares (SOS) optimization. As in [17], but unlike [20], we utilize polynomial barrier functions. This provides a simpler formulation of the probability of failure on a finite time horizon when compared to the approach in [20], which uses exponential barrier functions, and, empirically, provides tighter probability bounds.

Third, we extend our formulation to allow for control inputs and provide a method for synthesizing a safe controller. In particular, we consider affine-in-control systems and the proposed approach searches for a polynomial state feedback controller which ensures a system's failure probability achieves a predetermined criterion via a stochastic control barrier function.

Our preliminary work on continuous-time verification and control synthesis is published in [19] with two case studies. The paper [19] only focused on continuous-time. In this paper, we also consider stochastic system verification and control law synthesis in the discrete-time setting.
This paper is organized as follows: Section 2 covers the background information of stochastic differential and difference equations, barrier functions and SOS optimization. Section 3 presents the problem formulation. Section 4 highlights the methodology we utilize to solve the SOS optimization and stochastic control problem. Section 5 and Section 6 present numerical case studies which illustrate our results and conclusions, respectively.

2 Preliminaries

In this section, we introduce background information regarding stochastic systems, stochastic processes, and SOS polynomials.

2.1 Stochastic Differential Equations

Consider a complete probability space $(\Omega, \mathcal{F}, P)$ and a standard Wiener process $w(t)$ taking values in $\mathbb{R}^m$. We consider continuous-time stochastic processes $x(t)$ satisfying a stochastic differential equation of the form

$$dx = F(x)\,dt + \sigma(x)\,dw \tag{1}$$

where the compact set $X \subset \mathbb{R}^n$ is the system state space, $F : X \to \mathbb{R}^n$ is the drift rate and $\sigma : X \to \mathbb{R}^{n \times m}$ is the diffusion term. We assume the functions $F(x)$ and $\sigma(x)$ are Lipschitz continuous. We now introduce the infinitesimal generator, which extends the usual definition of a time derivative to instead consider the expectation of a function of a random process [15].

Definition 1 Let $x(t)$ be a stochastic process in $\mathbb{R}^n$. The infinitesimal generator $\mathcal{A}$ of $x(t)$ acts on functions of the state space and is defined as

$$\mathcal{A}B(x_0) = \lim_{t \downarrow 0} \frac{E[B(x(t)) \mid x_0] - B(x_0)}{t}$$

where $B : X \to \mathbb{R}$ is such that the limit exists for all $x_0 = x(0)$.

In particular, the infinitesimal generator for any process as in (1) is of the form shown in Fact 1.

Fact 1 (Ch. 7, Theorem 7.3.3 of [15]) Let $x(t)$ be a stochastic process satisfying (1). Then the infinitesimal generator $\mathcal{A}$ of some twice differentiable function $B(x)$ is given by

$$\mathcal{A}B(x) = \sum_{i=1}^{n} F_i(x) \frac{\partial B}{\partial x_i} + \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \left( \sigma(x)\sigma^T(x) \right)_{i,j} \frac{\partial^2 B}{\partial x_i \partial x_j}.$$

The stochastic process $x(t)$ is not guaranteed to lie in $X$ at all times, which leads us to define the stopped process $\tilde{x}$.

Definition 2 ([17], Definition 12) Suppose that $\tau$ is the first time of exit of $x(t)$ from the open set $\mathrm{Int}(X)$. Then the stopped process $\tilde{x}(t)$ is defined by

$$\tilde{x}(t) = \begin{cases} x(t) & \text{for } t < \tau \\ x(\tau) & \text{for } t \geq \tau. \end{cases}$$

The stopped process $\tilde{x}(t)$ inherits the strong Markovian property of $x(t)$ and shares the same infinitesimal generator [13].

2.2 Stochastic Difference Equations

Consider now a discrete-time stochastic process of the form [12]

$$x[k+1] = F(x[k]) + \sigma(x[k])\,\xi[k] \tag{2}$$

where $X \subset \mathbb{R}^n$, $\xi[k] \in \mathbb{R}^p$, $F : \mathbb{R}^n \to \mathbb{R}^n$, and $\sigma : \mathbb{R}^n \to \mathbb{R}^{n \times p}$. Here, $\xi[k]$ is a random disturbance whose value is governed by some distribution at each time step $k$. For the discrete-time setting, a stopped process is defined analogously to Definition 2 and denoted by $\tilde{x}[k]$.

2.3 Sum-of-Squares

Definition 3 Define $\mathbb{R}[x]$ as the set of all polynomials in $x \in \mathbb{R}^n$. Then

$$\Sigma[x] \triangleq \left\{ s(x) \in \mathbb{R}[x] : s(x) = \sum_{i=1}^{m} g_i(x)^2,\ g_i(x) \in \mathbb{R}[x] \right\}$$

is the set of sum-of-squares polynomials. Note that if $s(x) \in \Sigma[x]$ then $s(x) \geq 0\ \forall x$.

Definition 4 Given $p_i(x) \in \mathbb{R}[x]$ for $i = 0, \ldots, m$, the problem of finding $q_i(x) \in \Sigma[x]$ for $i = 1, \ldots, \hat{m}$ and $q_i(x) \in \mathbb{R}[x]$ for $i = \hat{m}+1, \ldots, m$ such that

$$p_0(x) + \sum_{i=1}^{m} p_i(x) q_i(x) \in \Sigma[x]$$

is a sum-of-squares program (SOSP).

SOSPs can be efficiently converted to semidefinite programs using tools such as SOSTOOLS [16].
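As a quick check of Fact 1, the sketch below evaluates the generator for a scalar system with polynomial drift and diffusion. It is a minimal illustration, not part of the toolchain used in this paper: polynomials are stored as coefficient lists in ascending powers, and the helper names (`poly_mul`, `generator_1d`, and so on) as well as the example choices $B(x) = x^2$, $F(x) = -x$, and $\sigma = 0.5$ are assumptions made for this example.

```python
def poly_derivative(coeffs):
    """Derivative of c0 + c1*x + c2*x^2 + ... given as [c0, c1, c2, ...]."""
    return [k * c for k, c in enumerate(coeffs)][1:] or [0.0]


def poly_mul(a, b):
    """Product of two polynomials in ascending-power coefficient form."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def poly_add(a, b):
    """Sum of two polynomials, padding the shorter one with zeros."""
    n = max(len(a), len(b))
    a = a + [0.0] * (n - len(a))
    b = b + [0.0] * (n - len(b))
    return [x + y for x, y in zip(a, b)]


def generator_1d(B, F, sigma):
    """Coefficients of A B(x) = F(x) B'(x) + 0.5 * sigma(x)^2 * B''(x).

    This is Fact 1 specialised to n = m = 1 (scalar state, scalar noise).
    """
    dB = poly_derivative(B)
    d2B = poly_derivative(dB)
    drift_term = poly_mul(F, dB)
    diffusion_term = [0.5 * c for c in poly_mul(poly_mul(sigma, sigma), d2B)]
    return poly_add(drift_term, diffusion_term)


# Illustrative choice (an assumption): dx = -x dt + 0.5 dw and B(x) = x^2.
AB = generator_1d(B=[0.0, 0.0, 1.0], F=[0.0, -1.0], sigma=[0.5])
print(AB)  # coefficients of A B(x) in ascending powers
```

For this choice the generator works out to $\mathcal{A}B(x) = \sigma^2 - 2x^2$, that is, $\mathcal{A}B(x) = -2B(x) + \sigma^2$.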
3 Problem Formulation

We address the problem of creating a bound on the probability a stochastic system of form (1) or (2) exits a safe region during a finite-time horizon. Additionally, we present an algorithmic approach for control synthesis based on a system's probability of becoming unsafe. With this, we achieve the following objectives for both continuous-time and discrete-time systems.

Objectives (CT & DT):

(Verification) First, given a continuous-time or discrete-time stochastic system of the form (1) or (2) and a fixed time horizon, upper bound the probability of failure, i.e., the probability that the system's state reaches a set of unsafe conditions within the finite time horizon.

(Synthesis) Second, given a continuous-time or discrete-time stochastic system with input, synthesize a feedback control law to achieve a desired maximum probability of failure.

3.1 Continuous Time Systems

Consider the stochastic process $x(t)$ which satisfies the stochastic differential equation

$$dx = (f(x) + g(x)u(x))\,dt + \sigma(x)\,dw \tag{3}$$

where $f : X \to \mathbb{R}^n$, $g : X \to \mathbb{R}^{n \times p}$, $\sigma : X \to \mathbb{R}^{n \times m}$ and $w$ is an $m$-dimensional Wiener process. Additionally, $u : X \to \mathbb{R}^p$ where $u$ is a state feedback control law. We define $F(x) = f(x) + g(x)u(x)$. In the derivation below, we consider $u(x)$ given and fixed and hence $F$ is a function only of $x$. In Section 4.2, when we address the problem of synthesizing a feedback control law $u(x)$, it is then implicit that $F$ depends on this choice of feedback.

The following theorem is an immediate corollary of [13, Chapter 3, Theorem 1] and recovers the supermartingale condition [17, Theorem 15] and c-martingale condition [20, Theorem 2.4] as special cases.
Theorem 1 Given the stochastic differential equation (3) and the sets $X \subset \mathbb{R}^n$, $X_u \subseteq X$, $X_0 \subseteq X \setminus X_u$ with $F(x) = f(x) + g(x)u(x)$ and $\sigma(x)$ locally Lipschitz continuous, where $u(x)$ is some feedback control law. Consider the stopped process $\tilde{x}(t)$. Suppose there exists a twice differentiable function $B$ such that

$$B(x) \leq \gamma \quad \forall x \in X_0 \tag{4}$$
$$B(x) \geq 1 \quad \forall x \in X_u \tag{5}$$
$$B(x) \geq 0 \quad \forall x \in X \tag{6}$$
$$\frac{\partial B}{\partial x} F(x) + \frac{1}{2} \mathrm{Trace}\left( \sigma^T(x) \frac{\partial^2 B}{\partial x^2} \sigma(x) \right) \leq -\alpha B(x) + \beta \quad \forall x \in X \setminus X_u \tag{7}$$

for some $\alpha \geq 0$, $\beta \geq 0$ and $\gamma \in [0, 1)$. Define

$$\rho_u := P\{ \tilde{x}(t) \in X_u \text{ for } 0 \leq t \leq T \mid \tilde{x}(0) \in X_0 \} \tag{8}$$
$$\rho_B := P\left\{ \sup_{0 \leq t \leq T} B(\tilde{x}(t)) \geq 1 \mid \tilde{x}(0) \in X_0 \right\}. \tag{9}$$

Then

• If $\alpha > 0$ and $\frac{\beta}{\alpha} \leq 1$,
$$\rho_u \leq \rho_B \leq 1 - (1 - \gamma) e^{-\beta T}. \tag{10}$$

• If $\alpha > 0$ and $\frac{\beta}{\alpha} \geq 1$,
$$\rho_u \leq \rho_B \leq \frac{\gamma + (e^{\beta T} - 1) \frac{\beta}{\alpha}}{e^{\beta T}}. \tag{11}$$

• If $\alpha = 0$,
$$\rho_u \leq \rho_B \leq \gamma + \beta T. \tag{12}$$

The bound (12) is characterized in [8] and [20] as the upper bound on the probability of being unsafe for a c-martingale.

If $B(x)$ satisfies the conditions of Theorem 1, then $B(x)$ is called a stochastic control barrier function for a given control policy $u(x)$. Relaxing the supermartingale condition on the infinitesimal generator in the fashion of Theorem 1 gives three case-dependent finite time probability bounds on a system's likelihood of entering an unsafe region in the form of (10), (11), and (12).

Remark 1 If the initial state $x_0$ is known exactly, then $B(x_0)$ can be substituted for $\gamma$ in the probability bounds of Theorem 1. This provides an upper bound on the probability of failure over a particular initial point rather than on an initial set, $X_0$.
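The three cases of Theorem 1 can be collected into a single helper. The following Python sketch transcribes (10)–(12) as stated above; the function name `ct_failure_bound`, the input validation, and the clipping of the result at one are illustrative choices rather than part of the theorem. Bound (11) is evaluated in the algebraically equivalent form $\gamma e^{-\beta T} + (1 - e^{-\beta T})\beta/\alpha$ to avoid overflow for large $\beta T$.

```python
import math


def ct_failure_bound(alpha, beta, gamma, T):
    """Upper bound on the failure probability from Theorem 1, eqs. (10)-(12).

    alpha, beta >= 0 are the constants in condition (7), gamma in [0, 1) bounds
    B on the initial set, and T >= 0 is the time horizon.
    """
    if alpha < 0 or beta < 0 or T < 0 or not (0 <= gamma < 1):
        raise ValueError("need alpha >= 0, beta >= 0, T >= 0 and 0 <= gamma < 1")

    if alpha == 0:
        bound = gamma + beta * T                      # (12), c-martingale case
    elif beta / alpha <= 1:
        bound = 1 - (1 - gamma) * math.exp(-beta * T)  # (10)
    else:
        decay = math.exp(-beta * T)
        # (11), rewritten by dividing numerator and denominator by e^{beta T}
        bound = gamma * decay + (1 - decay) * beta / alpha

    # A probability bound above one carries no information, so clip it
    # (our choice, not part of the theorem).
    return min(bound, 1.0)


# Example values are arbitrary and only exercise the three branches.
for a, b in [(0.0, 0.1), (2.0, 0.25), (0.5, 1.0)]:
    print(a, b, ct_failure_bound(a, b, gamma=0.05, T=1.0))
```

By Remark 1, passing $B(x_0)$ as `gamma` gives the bound for a single known initial state.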
3.2 Discrete Time Systems

Consider the stochastic discrete-time system

$$x[k+1] = f(x[k]) + g(x[k])u(x[k]) + \sigma(x[k])\,\xi[k] \tag{13}$$

where $f : X \to \mathbb{R}^n$, $g : X \to \mathbb{R}^{n \times p}$, $\sigma : X \to \mathbb{R}^{n \times m}$ and $\xi$ is a stochastic process whose value is governed by some probability distribution. Additionally, $u : X \to \mathbb{R}^p$ where $u(x)$ is a polynomial control law. We define $F(x, \xi) = f(x) + g(x)u(x) + \sigma(x)\xi$. The following theorem is an immediate corollary of [13, Chapter 3, Theorem 3].

Theorem 2 Given the stochastic difference equation (13) and the sets $X \subset \mathbb{R}^n$, $X_u \subseteq X$, $X_0 \subseteq X \setminus X_u$ with $F(x, \xi) = f(x) + g(x)u(x) + \sigma(x)\xi$ where $u(x)$ is some feedback control law. Consider the stopped process $\tilde{x}[k]$. Suppose there exists a twice differentiable function $B$ such that

$$B(x) \leq \gamma \quad \forall x \in X_0 \tag{14}$$
$$B(x) \geq 1 \quad \forall x \in X_u \tag{15}$$
$$B(x) \geq 0 \quad \forall x \in X \tag{16}$$
$$E[B(F(x, \xi)) \mid x] \leq \frac{B(x)}{\tilde{\alpha}} + \tilde{\beta} \quad \forall x \in X \setminus X_u \tag{17}$$

for some $\tilde{\alpha} \geq 1$, $0 \leq \tilde{\beta} < 1$ and $\gamma \in [0, 1)$. Define

$$\rho_u := P\{ \tilde{x}[k] \in X_u \text{ for } 0 \leq k \leq N \mid \tilde{x}[0] \in X_0 \} \tag{18}$$
$$\rho_B := P\left\{ \sup_{0 \leq k \leq N} B(\tilde{x}[k]) \geq 1 \mid \tilde{x}[0] \in X_0 \right\}. \tag{19}$$

Then

• If $\tilde{\alpha} > 1$ and $\frac{\tilde{\beta}\tilde{\alpha}}{\tilde{\alpha} - 1} \leq 1$,
$$\rho_u \leq \rho_B \leq 1 - (1 - \gamma) \prod_{i=0}^{N-1} (1 - \tilde{\beta}). \tag{20}$$

• If $\tilde{\alpha} > 1$ and $\frac{\tilde{\beta}\tilde{\alpha}}{\tilde{\alpha} - 1} > 1$,
$$\rho_u \leq \rho_B \leq \gamma \tilde{\alpha}^{-N} + (1 - \tilde{\alpha}^{-N}) \frac{\tilde{\alpha}\tilde{\beta}}{\tilde{\alpha} - 1}. \tag{21}$$

• If $\tilde{\alpha} = 1$,
$$\rho_u \leq \rho_B \leq \gamma + \tilde{\beta} N. \tag{22}$$

As in continuous time, if $B(x)$ satisfies the conditions of Theorem 2, then $B(x)$ is called a stochastic control barrier function for a given control policy $u(x)$. Remark 1 also applies in the discrete-time setting.

4 SOS Formulations & Numerical Procedures

In this section we present our approach to construct both continuous-time and discrete-time stochastic control barrier functions based on the problem formulations of Section 3.
First, we adapt the inequality constraints given in Theorems 1 and 2 to be formulated as an SOSP when $\alpha$ and $u(x)$ are known. Second, we present the algorithms which construct barrier functions and present our method for computing a control policy.

4.1 SOS Formulation for Safety Verification

For continuous-time system verification, the conditions in Theorem 1 can be recast as SOS constraints.

Theorem 3 Consider a system of the form of (3) and the sets $X$, $X_0$, and $X_u$ and assume these sets are described as $X = \{x \in \mathbb{R}^n : s_X(x) \geq 0\}$, $X_0 = \{x \in \mathbb{R}^n : s_{X_0}(x) \geq 0\}$, and $X_u = \{x \in \mathbb{R}^n : s_{X_u}(x) \geq 0\}$ for some polynomials $s_X$, $s_{X_0}$, and $s_{X_u}$. Suppose there exists a polynomial $B(x)$, and SOS polynomials $\lambda_X(x)$, $\lambda_{X_0}(x)$, and $\lambda_{X_u}(x)$ that satisfy

$$B(x) - \lambda_X(x) s_X(x) \in \Sigma[x] \tag{23}$$
$$B(x) - \lambda_{X_u}(x) s_{X_u}(x) - 1 \in \Sigma[x] \tag{24}$$
$$-B(x) - \lambda_{X_0}(x) s_{X_0}(x) + \gamma \in \Sigma[x] \tag{25}$$
$$-\frac{\partial B(x)}{\partial x} F(x) - \frac{1}{2} \mathrm{Trace}\left( \sigma^T(x) \frac{\partial^2 B}{\partial x^2} \sigma(x) \right) - \alpha B(x) + \beta - \lambda_{X_u}(x) s_{X_u}(x) - \lambda_X(x) s_X(x) \in \Sigma[x] \tag{26}$$

where $F(x) = f(x) + g(x)u(x)$. Then, the probability of failure, depending on the values of $\alpha$ and $\beta$, satisfies (10), (11) or (12).

Theorem 2 for discrete-time verification for systems of the form of (13) can also be recast as SOS constraints.

Theorem 4 Consider a system of the form of (13) and the sets $X$, $X_0$, and $X_u$ and assume these sets can be described as $X = \{x \in \mathbb{R}^n : s_X(x) \geq 0\}$, $X_0 = \{x \in \mathbb{R}^n : s_{X_0}(x) \geq 0\}$, and $X_u = \{x \in \mathbb{R}^n : s_{X_u}(x) \geq 0\}$ for some polynomials $s_X$, $s_{X_0}$, and $s_{X_u}$.
Suppose there exists a polynomial $B(x)$, and SOS polynomials $\lambda_X(x)$, $\lambda_{X_0}(x)$, and $\lambda_{X_u}(x)$ that satisfy the following

$$B(x) - \lambda_X(x) s_X(x) \in \Sigma[x] \tag{27}$$
$$B(x) - \lambda_{X_u}(x) s_{X_u}(x) - 1 \in \Sigma[x] \tag{28}$$
$$-B(x) - \lambda_{X_0}(x) s_{X_0}(x) + \gamma \in \Sigma[x] \tag{29}$$
$$-E[B(F(x, \xi)) \mid x] + \frac{B(x)}{\tilde{\alpha}} + \tilde{\beta} - \lambda_{X_u}(x) s_{X_u}(x) - \lambda_X(x) s_X(x) \in \Sigma[x] \tag{30}$$

where $F(x, \xi) = f(x) + g(x)u(x) + \sigma(x)\xi$. Then, the probability of failure, depending on the values of $\tilde{\alpha}$ and $\tilde{\beta}$, satisfies (20), (21) or (22).

We omit the proofs for Theorems 3 and 4, which follow the general approach for relaxing set constraints to SOS programs using the Positivstellensatz condition; see the documentation of [16] for details.

For Theorem 4, the expectation $E[B(F(x, \xi)) \mid x]$ in (30) is encoded using the moments of the random disturbance. In the case studies in Section 5, we model the system noise as a random variable with a zero-mean normal distribution. The $n$-th moment of a zero-mean normally distributed random variable $z$ with standard deviation $\sigma$ is

$$E[z^n] = \begin{cases} 0 & \text{if } n \text{ is odd} \\ 1 \cdot 3 \cdots (n-1)\,\sigma^n & \text{if } n \text{ is even.} \end{cases} \tag{31}$$

Using (31) allows for a closed-form expression for $E[B(F(x, \xi)) \mid x]$.

4.2 Verification & Control Law Synthesis Algorithms

The problems in Theorems 3 and 4 are not SOSPs (in fact, they are nonconvex) when all of the relevant parameters, i.e., $\alpha$, $\beta$, and $u(x)$, are considered variables. As a result, we present algorithms to numerically compute barrier functions that circumvent the nonconvex problem. Since the algorithms we present are valid for discrete-time and continuous-time systems, we use $x$ to represent the continuous-time and discrete-time state instead of $x(t)$ or $x[k]$, respectively.

First, we assume that $u(x)$ is fixed, and thus we solve the verification problem via Algorithm 1 which computes a barrier function $B(x)$ satisfying the conditions in Theorem 3.
These conditions are nonconvex in $\alpha$, so we perform a line search on $\alpha$. The barrier function is evaluated over the set $X_0$ and utilized to compute the probability, $P$, using (10), (11), or (12) for continuous-time systems. The polynomial degree $n_B$ of $B(x)$ is a design parameter; however, higher-order polynomials tend to produce tighter bounds. This refinement comes at a cost: higher-order polynomials require longer computation times.

The objective of the SOSP in Algorithm 1 is set to minimize the value $\gamma + \beta$. This objective was chosen to avoid creating bilinear programs, where initialization of the variables can become complex. In other words, minimizing $\gamma + \beta$ is a heuristic which may not be the best but empirically provides reliable performance.

Remark 2 As in Remark 1, if $x_0$ is known exactly, $B(x_0)$ can be substituted for $\gamma$ to provide a bound for all initial conditions $x_0 \in X_0$.

The discrete-time procedure follows the general idea of the continuous-time approach and is also presented in Algorithm 1, but the optimization program is instead constrained by (27)–(30). Additionally, even though $\tilde{\alpha}$ and $\tilde{\beta}$ appear in Theorem 2, for clarity, the notation $\alpha$ and $\beta$ is used in Algorithm 1. As in continuous-time systems, the discrete-time probability bound of becoming unsafe is a function of $\tilde{\alpha}$ and $\tilde{\beta}$ and is computed using (20), (21) or (22).

4.3 Controller Synthesis Procedure

So far, we have assumed a given feedback control policy $u(x)$. In this section, we consider the case of solving for $u(x)$ to achieve a desired probability of safety. In general, we synthesize a polynomial feedback control law of the same or lower order than $B(x)$ such that the upper bound on the probability of failure reduces to a designer-specified value.
First, the polynomial $u(x)$ is written in quadratic form as

$$u(x) = z^T Q z \tag{32}$$

where $z$ is a vector of monomials in $x$ of a specified order and $Q$ is a coefficient matrix of appropriate dimensions. Because there likely exist many feasible controllers ensuring the desired probability of failure, we introduce a cost criterion to choose among them. We approximate the energy of a particular control policy via a proxy measure. In this case, the proxy is the non-negative scalar, $c$, such that the following vector element-wise constraints

$$c\mathbf{1} - \mathrm{vec}(Q) \geq 0$$
$$\mathrm{vec}(Q) + c\mathbf{1} \geq 0$$

hold where $\mathrm{vec}(Q)$ is the vector form of matrix $Q$ and $\mathbf{1}$ is the vector of ones of appropriate dimension. Constraining the individual values of the polynomial coefficients provides a means of upper-bounding and lower-bounding the control effort applied at each particular state. We choose the cost $\min c$ to minimize the coefficients appearing in the polynomial controller to encourage lower control effort.

Algorithm 1 Compute $B(x)$
    procedure Compute-B($l_\alpha$, $u_\alpha$, $d$, $\sigma$, $u(x)$, $n_B$)    ▷ $\tilde{\alpha}$ & $\tilde{\beta}$ used for discrete-time
        $\alpha$ ← Range($l_\alpha$, $u_\alpha$, $d$)    ▷ Assign $\alpha$ values $d$ apart
        $P^*$ ← 1
        $P$ ← ∅
        for $\alpha_i \in \alpha$ do
            Continuous-time:
                min $\gamma + \beta$
                subject to (23)–(26)
                Compute $P$ using (10), (11) or (12)
            Discrete-time:
                min $\gamma + \beta$
                subject to (27)–(30)
                Compute $P$ using (20), (21) or (22)
            if $P < P^*$ then
                $\alpha^* := \alpha_i$
                $\beta^* := \beta$
                $P^* := P$
            end if
        end for
        return $\alpha^*$, $\beta^*$, $P^*$
    end procedure

Algorithm 2 Initialize $u(x)$
    procedure Compute-u($B(x)$, $\alpha$, $\beta$, $n_u$)    ▷ $\tilde{\alpha}$ & $\tilde{\beta}$ used for discrete-time
        $u(x) = z^T Q z$    ▷ $u(x)$ is an $n_u$ power polynomial; $z$ is a vector of state monomials
        min $c$
        subject to $c\mathbf{1} - \mathrm{vec}(Q) \geq 0$
                   $\mathrm{vec}(Q) + c\mathbf{1} \geq 0$
                   Continuous-time: (26)
                   Discrete-time: (30)
        return $u(x)$, $c$, $Q$
    end procedure
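The control flow of Algorithm 1 can be sketched in Python as follows, shown for the discrete-time branch. The SOS program itself is not solved here: `solve_sosp` is a hypothetical callback standing in for an SOSTOOLS/SDPT3 call that returns the optimal $(\gamma, \beta)$ for a fixed $\alpha$ (or `None` if infeasible), and `toy_solver` returns made-up numbers purely to exercise the loop. The bound function transcribes (20)–(22); clipping at one is an illustrative choice.

```python
def dt_failure_bound(alpha, beta, gamma, N):
    """Bounds (20)-(22) of Theorem 2 for alpha >= 1, 0 <= beta < 1, gamma in [0, 1)."""
    if alpha < 1 or not (0 <= beta < 1) or not (0 <= gamma < 1) or N < 0:
        raise ValueError("need alpha >= 1, 0 <= beta < 1, 0 <= gamma < 1, N >= 0")
    if alpha == 1:
        return min(gamma + beta * N, 1.0)                      # (22)
    ratio = beta * alpha / (alpha - 1)
    if ratio <= 1:
        return 1 - (1 - gamma) * (1 - beta) ** N               # (20)
    decay = alpha ** (-N)
    return min(gamma * decay + (1 - decay) * ratio, 1.0)       # (21)


def compute_barrier(alphas, solve_sosp, failure_bound):
    """Line search over alpha, mirroring the loop of Algorithm 1.

    solve_sosp(alpha) is a hypothetical callback returning (gamma, beta),
    or None when the SOS program is infeasible for that alpha.
    """
    best_alpha, best_beta, best_p = None, None, 1.0   # P* <- 1
    for alpha in alphas:
        solution = solve_sosp(alpha)
        if solution is None:
            continue
        gamma, beta = solution
        p = failure_bound(alpha, beta, gamma)
        if p < best_p:                                 # keep the tightest bound
            best_alpha, best_beta, best_p = alpha, beta, p
    return best_alpha, best_beta, best_p


def toy_solver(alpha):
    """Made-up stand-in for the SOS solve; the numbers carry no meaning."""
    return 0.05, 0.02 * alpha


alphas = [1.0 + 0.5 * i for i in range(5)]             # Range(l, u, d) with d = 0.5
result = compute_barrier(
    alphas, toy_solver, lambda a, b, g: dt_failure_bound(a, b, g, N=2)
)
print(result)
```

Passing the bound as a callable keeps the loop identical for the continuous-time branch, where (10)–(12) would be used instead.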
The objective $\min c$ and the associated procedure are summarized in Algorithm 2.

Control synthesis is performed using Algorithm 3, which utilizes the verification approach from Algorithm 1 and interleaves it with the controller search in Algorithm 2. Similar to the verification procedure, Algorithm 3 initially computes a polynomial barrier given a fixed control policy (i.e., $u(x) = 0$). Following this, Algorithm 3 iteratively synthesizes a feedback control law by adjusting the parameter $\beta$. Generally speaking, as in our case studies, we are interested in systems where the probability of failure with no control action is above the goal probability and thus control action is required to achieve the desired probability of safety.

Algorithm 3 Search for control polynomial $u(x)$
    procedure Compute-$u_{goal}$($P_{goal}$, $\sigma$, $\alpha$, $n_B$, $n_u$, $\epsilon$)    ▷ $\tilde{\alpha}$ & $\tilde{\beta}$ used for discrete-time
        $i_{count} = 1$    ▷ Initialize counting variable
        while $|P^* - P_{goal}| > \epsilon$ do
            if $i_{count} = 1$ then
                $\beta$, $P$ ← COMPUTE-B($l_\alpha$, $u_\alpha$, $d$, $\sigma$, $u(x)$, $n_B$)    ▷ Since $\alpha$ fixed, $l_\alpha = u_\alpha$; $u(x) = 0$
                $i_{count} := i_{count} + 1$
            else
                $u(x)$, $c$, $Q$ ← COMPUTE-u($B(x)$, $\alpha$, $\beta$, $n_u$)
                $\beta$, $P$ ← COMPUTE-B($l_\alpha$, $u_\alpha$, $d$, $\sigma$, $u(x)$, $n_B$)
            end if
            if $P < P_{goal}$ and $c < c^*$ then    ▷ $c^*$ is initialized as a large number
                $\beta^* := \beta$
                $P^* := P$
                $c^* := c$
            end if
            if $P > P_{goal}$ then    ▷ $a_{inc} > 1$ and $a_{dec} < 1$ are scaling factors
                $\beta := a_{dec}\,\beta$
            else
                $\beta := a_{inc}\,\beta$
            end if
        end while
        return $u^*(x)$, $c^*$, $Q$
    end procedure

The discrete-time procedure for controller synthesis is also given by Algorithms 2 and 3, where $\tilde{\alpha}$ and $\tilde{\beta}$ are utilized instead of $\alpha$ and $\beta$. The objective of the approach we present is to find a control polynomial based on a system's probability of failure.
In continuous-time, the condition (7) is affine-in-control; however, the same is not always true for condition (17) of discrete-time systems. In continuous-time systems, the evolution of the expected value is governed by the infinitesimal generator presented in Fact 1. In discrete-time, the evolution is governed by the difference between the expected value of the barrier function at $x[k+1]$ and $x[k]$. Since we are considering polynomial barrier functions, the search for control polynomials becomes complex due to the $E[B(F(x, \xi)) \mid x]$ term in (30). Because of this, the sum-of-squares program becomes nonlinear and is not necessarily convex; however, if the chosen barrier function is linear then the optimization problem remains convex.

5 Case Studies

In this section, we first present a simple continuous-time example to illustrate the advantages and limitations of our technique. Second, a nonlinear continuous-time example is presented to demonstrate the versatility of our approach. Lastly, a discrete-time population growth model is considered. For all case studies, we conduct Monte Carlo simulations to establish ground truth probability bounds. We utilize SOSTOOLS [16], which converts the SOSP into semidefinite programs. Our choice of solver is the semidefinite program solver SDPT3 [21, 22]. The noise terms in both the continuous-time and discrete-time systems are modeled as draws from a standard normal distribution, $\mathcal{N}(0, 1)$. These case studies were conducted on a 2.3 GHz Intel Core i5 computer with 8GB of memory.⁴

5.1 1-D Stochastic System

Consider a 1-D stochastic affine-in-control system of the form

$$dx = (-x + u(x))\,dt + \sigma\,dw. \tag{33}$$

This is of the same form as (3) where $f(x) = -x$, $g(x) = 1$, and constant $\sigma(x) \equiv \sigma$. We define the state space as $X = \{x : -2 \leq x \leq 2\}$, $X_u = \{x : x^2 \geq 1\}$, and $X_0 = \{x : x^2 \leq 0.2^2\}$.
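The system (33) with $u(x) = 0$ can be simulated directly to estimate how often the state reaches $X_u$ within a horizon $T$. The sketch below uses an Euler–Maruyama discretization in Python; it is an illustrative stand-in rather than the code used by the authors, and the step size `dt`, the initial state `x0 = 0` (one point of $X_0$), the seed, and the function names are all assumptions.

```python
import math
import random


def exits_safe_set(sigma, T=1.0, dt=1e-3, x0=0.0, rng=random):
    """Simulate one Euler-Maruyama path of dx = -x dt + sigma dw.

    Returns True if the path reaches X_u = {x : x^2 >= 1} at some step
    before the horizon T, and False otherwise.
    """
    x = x0
    for _ in range(int(round(T / dt))):
        dw = rng.gauss(0.0, math.sqrt(dt))   # Wiener increment over one step
        x += -x * dt + sigma * dw
        if x * x >= 1.0:
            return True
    return False


def monte_carlo_failure(sigma, draws=5000, seed=0, **kwargs):
    """Fraction of simulated paths that enter the unsafe set."""
    rng = random.Random(seed)                # fixed seed for repeatability
    hits = sum(exits_safe_set(sigma, rng=rng, **kwargs) for _ in range(draws))
    return hits / draws


# A coarse run with few draws, only to show the call pattern.
print(monte_carlo_failure(1.0, draws=200, dt=1e-2))
```

Because the path is only checked at the grid points, the estimate can miss excursions between steps, so a smaller `dt` gives a more faithful estimate at higher cost.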
First, we benchmark the probability of failure without a control input (i.e., $u(x) = 0$) for a finite time horizon of $T = 1$ s. To do so, the procedure outlined in Algorithm 1 is utilized. We grid search over a defined range of values for the constant $\alpha$. In this particular example, $\alpha \in [0, 5]$ with $d = 0.05$ in Algorithm 1. We search for a 16th degree $B(x)$. Additionally, the c-martingale bound presented in [20, Algorithm 3] is reproduced. Lastly, the results are benchmarked against the true probability of failure estimated via a 5000 draw Monte Carlo simulation. The results are presented in Fig. 1. In Fig. 1, the polynomial bound on the probability of failure performs better than the bound from [20] generated using the c-martingale condition that is not state-dependent. The difference is particularly notable at higher values of $\sigma$ where the exponential bound from [20] becomes trivial, i.e., greater than or equal to one.

Fig. 1. The probability of failure bounds for (33) are presented here. A 16th degree polynomial barrier function is considered. The Monte Carlo simulation results illustrate the true probability of failure for this system.

Next, the control problem of achieving a particular bound on the probability of failure of this system is addressed. We consider a desired failure probability of $P_{goal} = 0.30$. We restrict our attention to a linear controller of the form $u(x) = -kx$. The search for a low-energy controller which successfully fulfills the design requirement follows a modified binary search version of Algorithm 3. This enables a simple search for the $k$ necessary to achieve the desired criterion.

Fig. 2 plots $k^*$ achieving the desired failure probability bound for $\sigma \in [1, 2]$. Here, note that the degree of the barrier function for which we search greatly affects the control gain needed to achieve the control objective. In some sense, searching for a higher-order polynomial refines the probability of failure bound, requiring lower control effort; however, these high-order polynomials require more computation time. Eventually, the degree of the polynomial reaches a saturation point where it does not further decrease the $k^*$ required.

Fig. 2. An illustration of (33) demonstrating the trade-off between required control gain and the degree of the barrier function, $B(x)$, needed to successfully attain the desired probability of failure threshold. Using higher-order polynomials allows us to guarantee that the desired probability bound is satisfied for a smaller control gain up until some point. Eventually, the order of the polynomial will not improve the bound, as is happening from the 12th to 14th order polynomial.

⁴ The MATLAB source code for the four case studies is contained at https://github.com/gtfactslab/stochasticbarrierfunctions

5.2 Nonlinear Dynamics

Consider the stochastic nonlinear dynamics

$$dx_1 = x_2\,dt \tag{34}$$
$$dx_2 = (-x_1 - x_2 - x_1^3 + u(x))\,dt + \sigma\,dw. \tag{35}$$

This system is studied in [18] without the input term $u(x)$ and with constant $\sigma(x) \equiv \sigma$. We define the state space as $X = \{(x_1, x_2) \mid -3 \leq x_1 \leq 2,\ -2 \leq x_2 \leq 3\}$, $X_u = \{x_2 \mid x_2 \geq 2.25\}$, and $X_0 = \{(x_1, x_2) \mid (x_1 + 2)^2 + x_2^2 \leq 0.1^2\}$.

A sample trajectory of (34)–(35) is illustrated in Fig. 3. Additionally, level sets of $B(x)$ are projected onto the state space. In this illustration, $B(x)$ is computed with $u(x) = 0$ solely using Algorithm 1.

Fig. 3. Given the initial conditions $x_0 = [-2, 0]$, the single trajectory dynamics of (34)–(35) for a time horizon of $T = 2$ and $\sigma = 1.0$ are illustrated. The unsafe region is $X_u = \{x_2 \mid x_2 \geq 2.25\}$. Additionally, the level sets of $B(x)$ and their respective values are labeled and given as dashed blue lines.
In the trajectory illustrated in Fig. 3, the evolution of system noise is enough for the system to enter the predefined unsafe set; however, this is not always the case. To illustrate this, we compute a Monte Carlo simulation of the system dynamics shown. Additionally, an upper bound is computed on the probability of becoming unsafe given our initial condition and illustrated in Fig. 4. While a set of initial conditions is encoded into the SOSP, the probability bound is evaluated at the same initial point, $x_0 \in X_0$, as the Monte Carlo simulation.

Fig. 4. Computing a 14th order polynomial barrier function for the nonlinear dynamics (34)–(35), we are able to bound the probability of failure of the 5000 draw Monte Carlo dynamics for constant $\sigma \in [0.5, 1.5]$.

The feedback control law design specification for this system is to reduce the probability of failure bound to $P_{\mathrm{goal}} = 0.10$ for specified $\sigma$ values. For this example, a 2nd order polynomial controller of the form of (32) is synthesized. The constant, $c$, highlighted in Algorithm 2 is minimized. Algorithm 3 produces the results in Table 1 for select values of $\sigma$ and specific $\alpha$ values. The $\alpha$ values in Table 1 originate from the initial (i.e., $u(x) = 0$) probability bound computation. Here, 10th order $B(x)$ are considered due to the computational limitations of SOSTOOLS.

Table 1. The results from the search for a control polynomial $u(x)$ which reduces the probability of failure to $P_{\mathrm{goal}} = 0.10$ for (34)–(35). The upper bound on the probability of failure without a given control input is presented here for comparison.

$\sigma$    $P_{u(x)=0}$    $\alpha$    min $c$
0.6         0.860           1.4         2.1821
0.9         0.919           1.3         0.5251
1.0         0.912           1.3         0.6396
1.3         0.949           1.5         1.1488
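The Monte Carlo benchmark described in this section can be sketched with an Euler–Maruyama discretization of (34)–(35) for $u(x) = 0$. This is a minimal illustration, not the authors' MATLAB implementation. The drift is copied as printed in (34)–(35). The function name, the step size `dt`, the default horizon (borrowed from Fig. 3), and the choice to start every draw at $x_0 = [-2, 0]$ are assumptions, and the sketch only checks the unsafe set $x_2 \ge 2.25$ at grid times without stopping trajectories that leave $X$.

```python
import numpy as np


def estimate_failure_probability(sigma, horizon=2.0, dt=1e-3, n_draws=5000,
                                 x0=(-2.0, 0.0), x2_unsafe=2.25, seed=0):
    """Estimate P(x2 >= x2_unsafe at some grid time in [0, horizon]).

    Hypothetical helper: simulates n_draws independent Euler-Maruyama
    paths of (34)-(35) with u(x) = 0 and returns the failing fraction.
    """
    rng = np.random.default_rng(seed)
    n_steps = int(round(horizon / dt))
    x1 = np.full(n_draws, x0[0], dtype=float)
    x2 = np.full(n_draws, x0[1], dtype=float)
    # A draw that starts inside the unsafe set already counts as a failure.
    failed = x2 >= x2_unsafe
    for _ in range(n_steps):
        # Brownian increments: zero mean, variance dt.
        dw = rng.normal(0.0, np.sqrt(dt), size=n_draws)
        drift2 = -x1 - x2 - x1**3  # drift of x2 as printed in (35), u = 0
        # Simultaneous update so both states use the previous time step.
        x1, x2 = x1 + x2 * dt, x2 + drift2 * dt + sigma * dw
        failed |= x2 >= x2_unsafe
    return float(failed.mean())
```

Calling `estimate_failure_probability(sigma)` over a grid of `sigma` values gives an empirical curve of the kind a barrier-based upper bound would be compared against. Because crossings between grid times are missed, the estimate is biased slightly low for coarse `dt`.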
5.3 Discrete-Time Population Model

Consider the stochastic version of the discrete-time population growth model from [7]

$$x_1[k+1] = m_3 x_2[k] + u(x[k]) \tag{36}$$
$$x_2[k+1] = m_1 x_1[k] + m_2 x_2[k] + \sigma \xi[k] \tag{37}$$

where $m_1 = 0.5$, $m_2 = 0.95$, and $m_3 = 0.5$. For the discrete-time system in (36)–(37), we first perform verification via a polynomial barrier function, followed by control synthesis using 1st order barrier functions.

For verification via polynomial barrier functions, we take $X = \{x_1, x_2 \mid -3 \le x_1 \le 3,\ -3 \le x_2 \le 3\}$, $X_u = \{x_1, x_2 \mid x_1^2 + x_2^2 \ge 2\}$, and $X_0 = \{x_1, x_2 \mid x_1^2 + x_2^2 \le 1.5\}$. An illustrative trajectory of the discrete-time dynamics (36)–(37) is displayed in Fig. 5 with the barrier function level sets displayed on the state space. Table 2 presents the verification results of Algorithm 1 when $N = 2$ and compares $P_{u(x[k])=0}$ to the true probability of failure obtained via Monte Carlo simulation for several values of constant $\sigma$.

Fig. 5. The population dynamics (36)–(37) for $\sigma = 0.5$. The 8th order barrier function, $B(x)$, level sets are superimposed on the state space.

Table 2. Monte Carlo results for the system (36)–(37) and the computed upper bound $P_{u(x[k])=0}$ on the probability of failure using an 8th order polynomial. Additionally, the associated $\gamma$ value used to compute the set-wise probability of failure is provided.

$\sigma$    $P_{u(x[k])=0}$    Monte Carlo    $\gamma$
0.1         0.069              0.006          0.075
0.2         0.342              0.051          0.216
0.3         0.574              0.118          0.261

As highlighted in Section 4.3, in the discrete-time case, evaluating $E[B(F(x, \xi)) \mid B(x)]$ results in a nonconvex constraint unless $B(x)$ is affine. Thus, we consider the case when $B(x)$ is affine. We now consider the domain $X = \{x_1, x_2 \mid 0 \le x_1 \le 4,\ 0 \le x_2 \le 4\}$ such that 1st order barrier functions are a viable approach. In the 1st order barrier function case, we take $N = 3$, $P_{\mathrm{goal}} = 0.10$, and $X_u = \{x_1 \mid 2 \le x_1 \le 4\}$. The level sets of a linear barrier function for $X$ are shown in Fig. 6. Next, control synthesis for the system is performed using the discrete-time version of Algorithm 3. The results of control synthesis are presented in Table 3.

Fig. 6. The population dynamics (36)–(37) over a time horizon of $N = 3$ and $\sigma = 1.5$. The 1st order barrier function, $B(x)$, level sets are superimposed on the state space. Here, we see that $B(x) \ge 1$ from $x_1 = 2$ to $x_1 = 4$.

Table 3. The $c$ value derived from implementing Algorithm 3 for the system presented in (36)–(37) using a 1st order barrier function for $P_{\mathrm{goal}} = 0.10$. The last column gives the value of $c$ which encourages a low-energy control effort for a 2nd order $u(x)$.

$\sigma$    $P_{u(x[k])=0}$    $\tilde{\alpha}$    min $c$
1.0         0.499              2                   1.44
1.5         0.512              2.05                2.074
2.0         0.523              2.10                2.488
2.5         0.544              2.20                2.986

6 Conclusion

We consider both continuous-time and discrete-time stochastic control barrier functions whose existence provides a means of quantifying an upper bound on a system's probability of failure. Additionally, we present a novel approach to the problem of finite-time verification by constraining the evolution of the expectation by a non-negative barrier function. This approach includes the supermartingale and c-martingale conditions proposed in prior literature as special cases. Lastly, we synthesize a feedback control strategy $u(x)$ such that a certain probability of failure criterion is met. We illustrate the methods with three case studies which demonstrate our ability to quantify system failure probabilities. For discrete-time systems, we perform verification leveraging polynomial barrier functions; however, controller synthesis in discrete-time systems gives rise to nonconvexities.
The discrete-time nonconvexities are mitigated by only considering a region of the state space such that linear barrier functions are a viable approach using the presented numerical methods. In these case studies, stochastic control barrier functions are synthesized using SOS optimization, which enable control synthesis based on the upper bound on the probability a system will enter an unsafe region of the state space.

References

[1] Agrawal, A. and Sreenath, K. (2017). Discrete control barrier functions for safety-critical control of discrete systems with application to bipedal robot navigation. In Robotics: Science and Systems.
[2] Ahmadi, M., Singletary, A., Burdick, J. W., and Ames, A. D. (2019). Safe policy synthesis in multi-agent POMDPs via discrete-time barrier functions. arXiv preprint arXiv:1903.07823.
[3] Ames, A. D., Coogan, S., Egerstedt, M., Notomista, G., Sreenath, K., and Tabuada, P. (2019). Control barrier functions: Theory and applications. arXiv preprint arXiv:1903.11199.
[4] Ames, A. D., Grizzle, J. W., and Tabuada, P. (2014). Control barrier function based quadratic programs with application to adaptive cruise control. In IEEE Conference on Decision and Control (CDC), pages 6271–6278. IEEE.
[5] Ames, A. D., Xu, X., Grizzle, J. W., and Tabuada, P. (2017). Control barrier function based quadratic programs for safety critical systems. Automatic Control, IEEE Transactions on, 62(8):3861–3876.
[6] Hsu, S.-C., Xu, X., and Ames, A. D. (2015). Control barrier function based quadratic programs with application to bipedal robotic walking. In American Control Conference (ACC), pages 4542–4548. IEEE.
[7] Iannelli, M. and Pugliese, A. (2015). An Introduction to Mathematical Population Dynamics: Along the Trail of Volterra and Lotka, volume 79. Springer.
[8] Jagtap, P., Soudjani, S., and Zamani, M. (2018). Temporal logic verification of stochastic systems using barrier certificates. CoRR, abs/1807.00064.
[9] Jagtap, P., Soudjani, S., and Zamani, M. (2019). Formal synthesis of stochastic systems via control barrier certificates. arXiv preprint arXiv:1905.04585.
[10] Kolathaya, S. and Ames, A. D. (2018). Input-to-state safety with control barrier functions. 3(1).
[11] Kushner, H. (1966). Finite time stochastic stability and the analysis of tracking systems. Automatic Control, IEEE Transactions on, 11(2):219–227.
[12] Kushner, H. (1971). Introduction to stochastic control. Technical report, Brown University Providence, RI Division of Applied Mathematics.
[13] Kushner, H. J. (1967). Stochastic stability and control. Mathematics in science and engineering, v.33. Academic Press, New York.
[14] Li, A., Wang, L., Pierpaoli, P., and Egerstedt, M. (2018). Formally correct composition of coordinated behaviors using control barrier certificates. IEEE/RSJ International Conference on Intelligent Robots and Systems.
[15] Øksendal, B. (1998). Stochastic differential equations: an introduction with applications. Universitext. Springer, Berlin; New York, 5th edition.
[16] Papachristodoulou, A., Anderson, J., Valmorbida, G., Prajna, S., Seiler, P., and Parrilo, P. A. (2013). SOSTOOLS: Sum of squares optimization toolbox for MATLAB.
[17] Prajna, S., Jadbabaie, A., and Pappas, G. (2007). A framework for worst-case and stochastic safety verification using barrier certificates. Automatic Control, IEEE Transactions on, 52(8):1415–1428.
[18] Prajna, S., Jadbabaie, A., and Pappas, G. J. (2004). Stochastic safety verification using barrier certificates. In IEEE Conference on Decision and Control, 2004, pages 929–934. IEEE.
[19] Santoyo, C., Dutreix, M., and Coogan, S. D. (2019). Verification and control for finite-time safety of stochastic systems via barrier functions.
[20] Steinhardt, J. and Tedrake, R. (2012). Finite-time regional verification of stochastic non-linear systems. The International Journal of Robotics Research, 31(7):901–923.
[21] Toh, K., Todd, M., and Tutuncu, R. (1999). SDPT3 - a MATLAB software package for semidefinite programming, version 1.3. Optimization Methods & Software, 11-2(1–4):545–581.
[22] Toh, K. C., Todd, M. J., and Tutuncu, R. (2003). Solving semidefinite-quadratic-linear programs using SDPT3. Mathematical Programming, 95(2):189–217.
[23] Wang, L., Ames, A. D., and Egerstedt, M. (2017). Safety barrier certificates for collisions-free multirobot systems. Robotics, IEEE Transactions on, 33(3):661–674.
[24] Wieland, P. and Allgöwer, F. (2007). Constructive safety using control barrier functions. IFAC Proceedings Volumes, 40(12):462–467.
