Optimal strategies in the average consensus problem

Jean-Charles Delvenne, Ruggero Carli, Sandro Zampieri

Abstract: We prove that for a set of communicating agents to compute the average of their initial positions (average consensus problem), the optimal topology of communication is given by a de Bruijn's graph. Consensus is then reached in finitely many steps. A more general family of strategies, constructed by block Kronecker products, is investigated and compared to Cayley strategies.

I. INTRODUCTION

Coordination algorithms for multiple autonomous vehicles and decentralized estimation techniques for handling data coming from distributed sensor networks have attracted considerable attention in recent years. This is mainly motivated by the fact that both coordinated control and distributed estimation have applications in many areas, such as coordinated flocking of mobile vehicles [26], [27], cooperative control of unmanned air and underwater vehicles [4], [3], multi-vehicle tracking with limited sensor information [19], monitoring very large scale areas with fine resolution, and collaborative estimation over wireless sensor networks [24].

Typically, both in coordinated control and in distributed estimation the agents need to communicate data in order to execute a task. In particular, they may need to agree on the value of certain coordination state variables. One expects that, in order to achieve coordination, the variables shared by the agents converge asymptotically to a common value. The problem of designing controllers that lead to such asymptotic coordination is called coordinated consensus; see for example [12], [20], [15] and references therein. Generalisations to high-order consensus [22] and nonholonomic agents [18], [11], [28] have also been explored.
One of the simplest and most studied consensus problems starts from systems described by an integrator and seeks a feedback control yielding consensus, namely driving all the states to the same value [20]. The information exchange is modeled by a directed graph describing between which pairs of agents data transmission is allowed. The situation mostly treated in the literature is when each agent can communicate its state to the other agents positioned inside a neighborhood [26], [15] and the communication network is time-varying [27], [15]. Robustness to communication link failure [8] and the effects of time delays [20] have been considered recently. Randomly time-varying networks have also been analyzed in [14]. Moreover, a first analysis involving quantized data transmission has been proposed in [7], [16].

J.-C. Delvenne is with the Institute for Mathematical Sciences, Imperial College, 53 Prince's Gate, London SW7 2PG, United Kingdom, jc.delvenne@imperial.ac.uk. R. Carli and S. Zampieri are with the Department of Information Engineering, Università di Padova, Via Gradenigo 6/a, 35131 Padova, Italy, {carlirug|zampi}@dei.unipd.it.

In this paper we consider the consensus problem from a different perspective. We are interested in characterizing the relationship between the amount of information exchanged by the agents and the achievable control performance. More precisely, we assume that $N$ agents are given initial positions in the Euclidean space and move in discrete time in order to reach the average of their initial positions. This problem is also called average coordinated consensus. Every agent asks several agents for their positions before deciding how to modify its own position.
We impose that, in order to limit the cost of communication, every agent communicates with only $\nu$ agents (including itself), where $\nu < N$. This means that in the graph describing the communications between agents, the maximal in-degree is at most $\nu$.

In this paper, we exhibit a family of strategies for solving this problem based on de Bruijn's graphs, and we prove that, according to a suitable criterion, this is the best that one can do. Precisely, we compute its performance according to two criteria: the rate of convergence to the average of the initial positions and an LQR criterion. We find that a deadbeat strategy is optimal according to the rate of convergence, and nearly optimal according to the LQR criterion. Finally, we compare it with another family of strategies having limited communication and exhibiting symmetries: the Cayley strategies [6], [5].

It should be noted however that our strategy is limited to the case where the number of agents is an exact power of $\nu$. Whether it is possible to build a linear time-invariant deadbeat strategy for any number of agents (for a given $\nu$) remains an open problem.

The paper is organized as follows. In Section II we provide some basic notions of graph theory and some notational conventions. In Section III we formally define the average consensus problem. In Section IV we introduce the block Kronecker strategy. In Section V we show that the block Kronecker strategy is the quickest possible strategy and we compare it with the Cayley strategy. In Section VI we evaluate the performance of the block Kronecker strategy according to suitable quadratic criteria. Finally, we gather our conclusions in Section VII.

II. PRELIMINARIES ON GRAPH THEORY

Before defining the problem we want to solve, we summarize some notions of graph theory that will be useful throughout the rest of the paper.

Let $G = (V, E)$ be a directed graph, where $V = \{1, \ldots, N\}$ is the set of vertices and $E \subset V \times V$ is the set of arcs or edges. If $(i, j) \in E$ we say that the arc $(i, j)$ is outgoing from $i$ and incoming in $j$. The adjacency matrix $A$ is a $\{0, 1\}$-valued square matrix indexed by the elements of $V$, defined by letting $A_{ij} = 1$ if and only if $(i, j) \in E$. Define the in-degree of a vertex $j$ as $\sum_i A_{ij}$ and the out-degree of a vertex $i$ as $\sum_j A_{ij}$. In our setup we admit the presence of self-loops. A graph is called in-regular (out-regular) of degree $k$ if each vertex has in-degree (out-degree) equal to $k$. A path in $G$ consists of a sequence of vertices $i_1 i_2 \ldots i_r$ such that $(i_\ell, i_{\ell+1}) \in E$ for every $\ell = 1, \ldots, r-1$; $i_1$ (resp. $i_r$) is said to be the initial (resp. terminal) vertex of the path. A cycle is a path in which the initial and the terminal vertices coincide. A vertex $i$ is said to be connected to a vertex $j$ if there exists a path with initial vertex $i$ and terminal vertex $j$. A directed graph is said to be connected if, given any pair of vertices $i$ and $j$, either $i$ is connected to $j$ or $j$ is connected to $i$. A directed graph is said to be strongly connected if, given any pair of vertices $i$ and $j$, $i$ is connected to $j$.

Finally, some notational conventions. Let $A$ be any matrix belonging to $\mathbb{R}^{N \times N}$. With $\operatorname{Tr} A$ we denote the trace of $A$, i.e. the sum of the diagonal entries. We say that $A$ is nonnegative, denoted $A \ge 0$, or positive, denoted $A > 0$, if the entries of $A$ are respectively nonnegative or positive.

III. PROBLEM FORMULATION

We suppose that the positions of all $N$ agents are listed in one vector of dimension $N$. If the agents move, say, in $\mathbb{R}^3$, it seems that we would need a $3N$-dimensional vector.
However, we will suppose that the positions are scalar, as every linear strategy on scalar positions, if applied separately to every component of the position, trivially extends to a strategy for higher dimensions.

More precisely, the problem of interest can be formalized in the following way. Consider $N > 1$ identical systems whose dynamics are described by the following discrete-time state equations
\[ x_i^+ = x_i + u_i, \qquad i = 1, \ldots, N, \]
where $x_i \in \mathbb{R}$ is the state of the $i$-th system, $x_i^+$ represents the updated state and $u_i \in \mathbb{R}$ is the control input. More compactly we can write
\[ x^+ = x + u, \tag{1} \]
where $x, u \in \mathbb{R}^N$. The goal is to design a feedback control law
\[ u = Kx, \qquad K \in \mathbb{R}^{N \times N}, \]
yielding the average consensus, namely a control such that all the $x_i$'s become asymptotically equal to the average of the initial condition. More precisely, our objective is to obtain $K$ such that, for any initial condition $x(0) \in \mathbb{R}^N$, the closed-loop system
\[ x^+ = (I + K)x \]
yields
\[ \lim_{t \to \infty} x(t) = \alpha \mathbf{1}, \tag{2} \]
where $\mathbf{1} := [1, \ldots, 1]^T$ and
\[ \alpha = \frac{1}{N} \mathbf{1}^T x(0). \tag{3} \]

Writing $x(t)$ as a linear combination of the eigenvectors of $I + K$, it is almost immediate to see that the average consensus problem is solved if and only if the following three conditions hold:

(A) Every row and every column of $I + K$ sums to one. Hence it has eigenvalue 1 with $\mathbf{1}$ as left and right eigenvector.
(B) The eigenvalue 1 of $I + K$ has algebraic multiplicity one (namely, it is a simple root of the characteristic polynomial of $I + K$).
(C) All the other eigenvalues are strictly inside the unit circle.

For nonnegative matrices, namely for matrices having all components nonnegative, condition (A) is called double stochasticity, condition (B) is ergodicity and condition (C) is a consequence of double stochasticity. We do not require our matrices to be nonnegative, even though it will appear that the optimal matrices are.
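As a small illustration of why condition (A) matters, the sketch below iterates the closed loop $x^+ = (I+K)x$ and tracks the average of the states. The 3-agent matrix `P` and the initial vector `x0` are hypothetical values chosen for this example only; exact rational arithmetic is used so that the invariance of the average can be checked without rounding.

```python
from fractions import Fraction

def step(P, x):
    """One closed-loop update x+ = P x, where P plays the role of I + K."""
    return [sum(p * v for p, v in zip(row, x)) for row in P]

def rows_and_columns_sum_to_one(P):
    """Condition (A): every row and every column of P sums to one."""
    size = len(P)
    rows_ok = all(sum(row) == 1 for row in P)
    cols_ok = all(sum(P[i][j] for i in range(size)) == 1 for j in range(size))
    return rows_ok and cols_ok

# Hypothetical 3-agent example: each agent averages itself with one neighbour.
half = Fraction(1, 2)
P = [[half, half, 0],
     [0, half, half],
     [half, 0, half]]

x0 = [Fraction(3), Fraction(-1), Fraction(7)]
alpha = sum(x0) / len(x0)            # target value alpha of eq. (3)

x = x0
averages = []
for _ in range(60):
    x = step(P, x)
    averages.append(sum(x) / len(x))

average_is_invariant = all(a == alpha for a in averages)
max_gap = max(abs(v - alpha) for v in x)   # distance to consensus after 60 steps
print(rows_and_columns_sum_to_one(P), average_is_invariant, float(max_gap))
```

Because the columns of `P` sum to one, the average never moves; because its other eigenvalues lie strictly inside the unit circle, the gap to the average shrinks geometrically.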
Observe now that the fact that the element in position $(i, j)$ of the matrix $I + K$ is different from zero means that system $i$ needs to know exactly the state of system $j$ in order to compute its feedback action. This implies that the $j$-th agent must communicate its state $x_j$ to the $i$-th agent. In this context a good description of the communication effort required by a specific feedback $K$ is given by the directed graph $G_{I+K}$ with set of vertices $\{1, \ldots, N\}$ in which there is an arc from $j$ to $i$ whenever $(I + K)_{ij} \neq 0$. The graph $G_{I+K}$ is said to be the communication graph associated with $K$. Conversely, given any directed graph $G$ with set of vertices $\{1, \ldots, N\}$, a feedback $K$ is said to be compatible with $G$ if $G_{I+K}$ is a subgraph of $G$ (we will use the notation $G_{I+K} \subseteq G$).

In the sequel, we will impose the following constraint on the communication graph: the maximal in-degree of the nodes is $\nu$. This models the fact that communication lines are costly to establish or operate, and every agent is allowed to talk to only a limited number of other agents. Note that, for compatibility with usual conventions, we consider that $\nu$ counts all arcs entering a node, including self-loops (which could be considered as 'free communication' in most technological situations). Without this constraint, the problem becomes trivial: choose the complete graph, and consensus is reached in one step. We therefore add the following constraint on $I + K$:

(D) Every row of $I + K$ contains at most $\nu$ non-zero elements.

From this point of view we would like to obtain a matrix $I + K$ satisfying (A), (B), (C), (D) and minimizing a suitable performance index. The simplest control performance index is the exponential rate of convergence to the average consensus.
When we are dealing with average consensus controllers it is meaningful to consider the displacement from the average of the initial condition
\[ \Delta(t) := x(t) - \left( \frac{1}{N} \mathbf{1}^T x(0) \right) \mathbf{1}. \]
It is immediate to check that
\[ \Delta(t) = x(t) - \left( \frac{1}{N} \mathbf{1}^T x(t) \right) \mathbf{1} \]
(since the average position $\frac{1}{N} \mathbf{1}^T x(t)$ is the same at all times $t$) and that it satisfies the closed-loop equation
\[ \Delta^+ = (I + K)\Delta. \tag{4} \]
Notice moreover that the initial conditions $\Delta(0)$ are such that
\[ \mathbf{1}^T \Delta(0) = 0. \tag{5} \]
Hence the asymptotic behavior of our consensus problem can equivalently be studied by looking at the evolution (4) on the hyperplane characterized by condition (5).

The speed of convergence toward the average of the initial condition can be defined as follows. Let $P$ be any matrix satisfying conditions (A), (B), (C). Define
\[ \rho(P) = \begin{cases} 1 & \text{if } \dim \ker(P - I) > 1, \\ \max_{\lambda \in \sigma(P) \setminus \{1\}} |\lambda| & \text{if } \dim \ker(P - I) = 1, \end{cases} \]
which is called the essential spectral radius of $P$. As the dominant eigenvalue of $P^t$ is one and the others are no larger in magnitude than $\rho(P)^t$, the essential spectral radius says how quickly $P^t$ converges to the rank-one matrix $\frac{1}{N} \mathbf{1}\mathbf{1}^T$, where $N$ is the dimension of $P$. In this context the index $\rho(I + K)$ seems quite appropriate for analyzing how performance is related to the communication effort associated with a graph. The smaller the essential spectral radius, the quicker the system converges to the average of the initial condition.

However, in control theory, strategies that converge in finite time or very quickly are sometimes dismissed on the ground that they lead to large values of the updates $u(t) = x(t+1) - x(t)$, which can be physically impossible or very costly to implement. Hence a strategy is often required to optimize an LQR cost, taking into account both the quickness of convergence and the norm of the update values.
Therefore another suitable measure of performance could be the following quantity:
\[ J = \mathbb{E} \left( \sum_{t \ge 0} \Big( \|x(t) - x(\infty)\|^2 + \gamma \|u(t)\|^2 \Big) \right), \tag{6} \]
where $x(t)$ is the vector of positions at time $t$, $x(\infty) = \lim_{t \to \infty} x(t)$ is the vector whose every entry is the average of the initial positions, $u(t) = x(t+1) - x(t)$ is the update vector at time $t$, the initial positions are supposed to be uncorrelated random variables with unit variance, $\mathbb{E}$ denotes the expectation, $\|x\|^2 = x^T x$ is the squared Euclidean norm and $\gamma$ is a nonnegative real.

We will prove that the optimal topology of communication (in the sense of speed of convergence) is given by a de Bruijn's graph. We will call the control strategies based on such a graph block Kronecker strategies, as explained in the next section. For these strategies we will evaluate (6) and compare them to another family of strategies based on a regular communication graph having the same degree $\nu$: the Cayley strategies [6], [5].

IV. BLOCK KRONECKER STRATEGIES

In this section, we define block Kronecker strategies. Let $A$ be an $n \times n$ matrix satisfying (A), (B), (C), (D) and $k$ be a nonnegative integer. Note that if $A$ is full then $n \le \nu$ (since the number of non-zero elements in a row cannot exceed $\nu$). Then we build an $n^k \times n^k$ matrix $M$ in the following way. Let
\[ A = \begin{bmatrix} a_0 \\ a_1 \\ \vdots \\ a_{n-1} \end{bmatrix} \]
be a row-partition of the matrix $A$, where $a_i \in \mathbb{R}^{1 \times n}$. Then $M$ is the matrix
\[ M = \begin{bmatrix} I_{n^{k-1}} \otimes a_0 \\ I_{n^{k-1}} \otimes a_1 \\ \vdots \\ I_{n^{k-1}} \otimes a_{n-1} \end{bmatrix}. \tag{7} \]
For example, if $A = \begin{bmatrix} \alpha & \beta \\ \beta & \alpha \end{bmatrix}$ (with $\alpha + \beta = 1$) and $k = 3$, then
\[ M = \begin{bmatrix}
\alpha & \beta & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & \alpha & \beta & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & \alpha & \beta & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & \alpha & \beta \\
\beta & \alpha & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & \beta & \alpha & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & \beta & \alpha & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & \beta & \alpha
\end{bmatrix}. \]
This is a kind of block Kronecker product. A general theory of block Kronecker products is built in [17]. We only need a more restricted definition, detailed below.
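The construction of Equation (7) can be sketched in a few lines. The helper name `build_M` and the numerical values of $\alpha$ and $\beta$ are choices made for this illustration, not part of the original text; the code simply places row $a_i$ of $A$ along the "diagonal" of the $i$-th block $I_{n^{k-1}} \otimes a_i$.

```python
from fractions import Fraction

def build_M(A, k):
    """Stack the blocks I_{n^(k-1)} (x) a_i of eq. (7), one block per row a_i of A."""
    n = len(A)
    block = n ** (k - 1)           # size of the identity I_{n^(k-1)}
    N = n * block                  # N = n^k
    M = [[0] * N for _ in range(N)]
    for i in range(n):             # block row i uses the row a_i of A
        for q in range(block):     # row q of the identity inside that block
            for j in range(n):
                M[i * block + q][q * n + j] = A[i][j]
    return M

# Illustrative values only: any pair with alpha + beta = 1 would do.
alpha, beta = Fraction(1, 3), Fraction(2, 3)
A = [[alpha, beta],
     [beta, alpha]]
M = build_M(A, 3)

for row in M:
    print(" ".join(f"{str(v):>3}" for v in row))
```

The printed pattern reproduces the $8 \times 8$ example above: every row carries one row of $A$, so $M$ has at most $n$ non-zero entries per row, and its rows and columns sum to one whenever those of $A$ do.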
The new matrix $M$ is of larger dimension than $A$ and satisfies conditions (A), (B), (C), (D): (A) and (D) follow from the definition, while (B) and (C) are proved below. Hence it can play the role of the matrix $I + K$ in Section III. We start with some reminders on the Kronecker product, define the block Kronecker product and explore the properties of the latter.

A. Kronecker product

We recall that the Kronecker product $A \otimes B$ of the matrices $A$ and $B$ is the matrix $[a_{ij} B]_{i,j}$, whose dimensions are the products of the dimensions of $A$ and $B$. Some useful properties of the Kronecker product are the following:
• $AB \otimes CD = (A \otimes C)(B \otimes D)$;
• $\operatorname{Tr}(A \otimes B) = \operatorname{Tr} A \, \operatorname{Tr} B$;
• the eigenvalues of $A \otimes B$ are all possible products of an eigenvalue of $A$ with an eigenvalue of $B$;
• the eigenvectors of $A \otimes B$ are all possible Kronecker products of an eigenvector of $A$ with an eigenvector of $B$.

The Kronecker product is sometimes called tensor product. Let us see why. For instance, consider the matrices $B, C, D$ of sizes $m_B \times n_B$, $m_C \times n_C$, $m_D \times n_D$. The Kronecker product $B \otimes C \otimes D$ has size $m_B m_C m_D \times n_B n_C n_D$, and an arbitrary element of it can be denoted
\[ (B \otimes C \otimes D)_{abc,def} = B_{ad} C_{be} D_{cf}, \]
where the index written as $abc$ denotes the number $c + b\,m_D + a\,m_C m_D$ and the index $def$ is the number $f + e\,n_D + d\,n_C n_D$; we suppose that the indices start from zero: $a = 0, \ldots, m_B - 1$, etc. If $B, C, D$ happen to be square matrices of size $n$, this notation coincides with the usual notation in base $n$ of an index running from $0$ to $n^3 - 1$. This notation of the Kronecker product is very close to the tensor product used in algebra and differential geometry. The only difference is that $B \otimes C \otimes D$, viewed as a tensor product, is considered as a 6-dimensional array with $a, b, c, d, e, f$ as separate indices, instead of a matrix (i.e., a 2-dimensional array).
All this immediately extends to more than three matrices.

B. Block Kronecker product

Let us now consider the following variant of the Kronecker product, which we call block Kronecker product. Consider for instance two matrices $B$ (of size $n^3 \times n^3$) and $C$ (of size $n^2 \times n^2$). The block Kronecker product of $B$ and $C$ is defined as follows: its element of index $abcde, ghijk$ is the element $B_{cde,ghi} C_{ab,jk}$ (notice the shift of the first indices by two places). We will denote this matrix by $B \odot C$. This definition applies to any two square matrices whose dimensions are powers of $n$. In general, we can write
\[ (B \odot C)_{p,q} = (B \otimes C)_{\sigma^t(p),q}, \]
where $\sigma$ operates a cyclic permutation by one place to the left on the digits of $p$ in base $n$, and $C$ is of size $n^t$.

The matrix $M$ defined by Equation (7) can be expressed as $M = (I \otimes \cdots \otimes I) \odot A$ (where the $n \times n$ identity matrix $I$ is repeated $k-1$ times). If we write the indices of $M$ in base $n$, then
\[ M_{i_1 \ldots i_k,\, j_1 \ldots j_k} = I_{i_2,j_1} I_{i_3,j_2} \cdots I_{i_k,j_{k-1}} A_{i_1,j_k}. \]
This form is particularly useful to compute the behavior of $M$ from the properties of the block Kronecker product, which we now explore. As a first property, we can easily see that
\[ (B \odot C)^T = C^T \odot B^T. \tag{8} \]
We can also prove the following lemma.

Lemma 4.1: For any matrices $A, B, C, D, E, F$ for which all the products below are meaningful, we have
\[ ((A \otimes B) \odot C)((D \otimes E) \odot F) = BD \odot (CE \otimes AF). \tag{9} \]

Proof: We write, using Einstein's convention (indices repeated twice in an expression are implicitly summed over),
\[ \begin{aligned} [((A \otimes B) \odot C)((D \otimes E) \odot F)]_{u,w} &= ((A \otimes B) \odot C)_{u,v} \, ((D \otimes E) \odot F)_{v,w} \\ &= A_{u_2,v_1} B_{u_3,v_2} C_{u_1,v_3} D_{v_2,w_1} E_{v_3,w_2} F_{v_1,w_3}, \end{aligned} \]
where $u$, $v$, and $w$, interpreted as sequences of digits in base $n$, have been partitioned into $u_1 u_2 u_3$, $v_1 v_2 v_3$, and $w_1 w_2 w_3$ in an appropriate way.
This is possible if $B$ and $D$ have the same size, as well as $C$ and $E$, and $A$ and $F$. Then the expression above can be regrouped as
\[ (BD)_{u_3,w_1} (CE)_{u_1,w_2} (AF)_{u_2,w_3} = (BD \odot (CE \otimes AF))_{u,w}, \]
which ends the proof.

In particular, if $B = D = 1$ we have
\[ (A \odot C)(E \odot F) = CE \otimes AF. \tag{10} \]
If we choose $C = E = 1$ instead, we have
\[ (A \otimes B)(D \odot F) = BD \odot AF. \tag{11} \]
The following proposition provides an interesting characterization of the powers of any order of the matrix $M$.

Proposition 4.1: For $A$ a square matrix, $M$ defined by Equation (7), and any integers $r \ge 0$ and $0 \le s < k$,
\[ M^{rk+s} = \underbrace{(A^r \otimes \cdots \otimes A^r)}_{k-s} \odot \underbrace{(A^{r+1} \otimes \cdots \otimes A^{r+1})}_{s}, \]
where the exponents in the right-hand side sum to $rk + s$.

Proof: We prove the claim by induction on $rk + s$. It is true by definition for $rk + s = 1$. The induction step is easily proved by applying Equation (9). Indeed,
\[ \big[ (A^r \otimes (A^r \otimes \cdots \otimes A^r)) \odot (A^{r+1} \otimes \cdots \otimes A^{r+1}) \big] \, \big[ ((I \otimes \cdots \otimes I) \otimes (I \otimes \cdots \otimes I)) \odot A \big] \]
can be written as
\[ (A^r \otimes \cdots \otimes A^r) \odot (A^{r+1} \otimes \cdots \otimes A^{r+1} \otimes (A^r A)). \]
The argument is also correct for the limit cases $s = 0$ and $s = k-1$.

In particular we have the following.

Corollary 4.1: For $A$ a square matrix and $M$ defined by Equation (7),
\[ M^k = A \otimes \cdots \otimes A. \]
Moreover, if $A$ satisfies (A), (B), (C), the essential spectral radius of $M$ is the $k$th root of the essential spectral radius of $A$.

Proof: The first part is a particular case of Proposition 4.1. From the properties of the Kronecker product, we know that the spectrum of $M^k$ is composed of all possible products of $k$ eigenvalues of $A$. Hence the largest eigenvalue in absolute value, different from 1, of the matrix $M^k$ turns out to be $1^{k-1} \lambda$, where $\lambda$ denotes the largest eigenvalue in absolute value, different from 1, of the matrix $A$. This also proves that conditions (B) and (C) are verified for $M$ when they are for $A$.
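The index manipulations above are easy to get wrong, so a numerical check can be reassuring. The sketch below implements the Kronecker and block Kronecker products directly from their index definitions and tests Lemma 4.1 and Corollary 4.1 on small matrices. The function names and the test matrices are assumptions made for this illustration; exact rational arithmetic avoids any rounding issue.

```python
from fractions import Fraction

def kron(B, C):
    """Kronecker product: row index (a, b), column index (d, e), entry B[a][d] * C[b][e]."""
    return [[B[a][d] * C[b][e]
             for d in range(len(B[0])) for e in range(len(C[0]))]
            for a in range(len(B)) for b in range(len(C))]

def block_kron(B, C):
    """Block Kronecker product of square matrices B and C.

    The leading digits of the row index select the row of C and the trailing
    digits select the row of B; columns are indexed as in kron(B, C).
    """
    size_b, size_c = len(B), len(C)
    size = size_b * size_c
    out = [[0] * size for _ in range(size)]
    for p in range(size):
        p_lead, p_trail = divmod(p, size_b)
        for q in range(size):
            q_lead, q_trail = divmod(q, size_c)
            out[p][q] = B[p_trail][q_lead] * C[p_lead][q_trail]
    return out

def matmul(X, Y):
    cols = list(zip(*Y))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in X]

def identity(m):
    return [[1 if i == j else 0 for j in range(m)] for i in range(m)]

def power(X, e):
    out = identity(len(X))
    for _ in range(e):
        out = matmul(out, X)
    return out

n, k = 2, 3
alpha, beta = Fraction(1, 3), Fraction(2, 3)      # illustrative values
A = [[alpha, beta], [beta, alpha]]
M = block_kron(identity(n ** (k - 1)), A)         # M = (I (x) I) (.) A

# Corollary 4.1: M^k equals the k-fold Kronecker power of A.
corollary_ok = power(M, k) == kron(kron(A, A), A)

# Lemma 4.1, eq. (9), on arbitrary 2 x 2 integer matrices.
A1, B1, C1 = [[1, 2], [3, 4]], [[0, 1], [1, 1]], [[2, 0], [1, 3]]
D1, E1, F1 = [[1, 1], [0, 2]], [[1, 0], [2, 1]], [[3, 1], [1, 0]]
lhs = matmul(block_kron(kron(A1, B1), C1), block_kron(kron(D1, E1), F1))
rhs = block_kron(matmul(B1, D1), kron(matmul(C1, E1), matmul(A1, F1)))
lemma_ok = lhs == rhs

# With all entries of A equal to 1/n, M^k is the flat matrix with entries 1/n^k.
U = [[Fraction(1, 2)] * 2 for _ in range(2)]
M_u = block_kron(identity(4), U)
flat = [[Fraction(1, 8)] * 8 for _ in range(8)]
deadbeat_ok = power(M_u, 3) == flat and power(M_u, 2) != flat

print(corollary_ok, lemma_ok, deadbeat_ok)
```

The last check anticipates the deadbeat behaviour: with a uniform $A$, the flat matrix is reached at step $k$ and not before.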
If we take
\[ A = \frac{1}{n} \mathbf{1}\mathbf{1}^T, \tag{12} \]
of size $n$, then $M^k$ is the matrix $\frac{1}{n^k} \mathbf{1}\mathbf{1}^T$ of size $n^k$ with all identical elements. Thus we have a strategy converging exactly in $k$ steps. We comment further on this example in the next section.

Another property of $M$ that will prove useful is stated in the next proposition.

Proposition 4.2: For $A$ a square matrix, $M$ defined by Equation (7), and any integers $r \ge 0$ and $0 \le s < k$,
\[ (M^T)^{rk+s} M^{rk+s} = \underbrace{(A^T)^r A^r \otimes \cdots \otimes (A^T)^r A^r}_{k-s} \otimes \underbrace{(A^T)^{r+1} A^{r+1} \otimes \cdots \otimes (A^T)^{r+1} A^{r+1}}_{s}, \]
where the sum of the exponents of $A$ is $rk + s$.

Proof: From Proposition 4.1, we know that
\[ M^{rk+s} = (A^r \otimes \cdots \otimes A^r) \odot (A^{r+1} \otimes \cdots \otimes A^{r+1}). \]
Hence, by Equation (8),
\[ (M^T)^{rk+s} = ((A^T)^{r+1} \otimes \cdots \otimes (A^T)^{r+1}) \odot ((A^T)^r \otimes \cdots \otimes (A^T)^r). \]
These two expressions are multiplied using Equation (10).

Now we would like to compute $\operatorname{Tr} (M^T)^t M^{t+1}$. This will be useful later when we evaluate the performance of the block Kronecker strategy. We first need the following lemma.

Lemma 4.2: Let $B_0, B_1, \ldots, B_{k-1}$ be $k$ square matrices of the same dimensions. If $l \le k$ is relatively prime to $k$, then
\[ \operatorname{Tr} \big[ (B_0 \otimes B_1 \otimes \cdots \otimes B_{l-1}) \odot (B_l \otimes \cdots \otimes B_{k-1}) \big] = \operatorname{Tr} \big( B_0 B_l B_{2l} B_{3l} \cdots B_{(k-1)l} \big), \]
where the indices are understood modulo $k$.

Proof: If we use Einstein's convention (repeated indices are summed over), we can write
\[ \begin{aligned} \operatorname{Tr} \big[ (B_0 \otimes \cdots \otimes B_{l-1}) \odot (B_l \otimes \cdots \otimes B_{k-1}) \big] &= \big[ (B_0 \otimes B_1 \otimes \cdots \otimes B_{l-1}) \odot (B_l \otimes \cdots \otimes B_{k-1}) \big]_{p,p} \\ &= (B_0)_{p_{k-l},p_0} (B_1)_{p_{k-l+1},p_1} \cdots (B_{l-1})_{p_{k-1},p_{l-1}} (B_l)_{p_0,p_l} \cdots (B_{k-1})_{p_{k-l-1},p_{k-1}} \\ &= (B_0)_{p_{k-l},p_0} (B_l)_{p_0,p_l} (B_{2l})_{p_l,p_{2l}} (B_{3l})_{p_{2l},p_{3l}} \cdots (B_{(k-1)l})_{p_{(k-2)l},p_{(k-1)l}} \\ &= \operatorname{Tr} \big( B_0 B_l B_{2l} \cdots B_{(k-1)l} \big), \end{aligned} \]
where $p = p_0 p_1 \ldots p_{k-1}$.

Proposition 4.3: For $A$ and $M$ as defined above, if $A$ is normal (i.e., $A^T A = A A^T$) then
\[ \operatorname{Tr} (M^T)^t M^{t+1} = \operatorname{Tr} (A^T)^t A^{t+1}. \]

Proof: We know that
\[ (M^T)^t M^t = (A^T)^r A^r \otimes \cdots \otimes (A^T)^{r+1} A^{r+1}, \]
if $t = rk + s$ for some $0 \le s < k$. Hence
\[ (M^T)^t M^{t+1} = \big( (A^T)^r A^r \otimes \cdots \otimes (A^T)^{r+1} A^{r+1} \big) \big( (I \otimes \cdots \otimes I) \odot A \big), \]
which by Equation (11) is equal to
\[ \big( (A^T)^r A^r \otimes \cdots \otimes (A^T)^{r+1} A^{r+1} \big) \odot (A^T)^r A^{r+1}. \]
By Lemma 4.2, this matrix has the same trace as
\[ (A^T)^r A^r \cdots (A^T)^{r+1} A^{r+1} (A^T)^r A^{r+1}. \]
As $A^T$ and $A$ commute, this is also the trace of $(A^T)^t A^{t+1}$.

C. De Bruijn's graph

The communication graph of $M$ is (a subgraph of) a de Bruijn graph, which has $n^k$ vertices and arcs from any $i$ to $ni, ni+1, ni+2, \ldots$ and $ni+n-1$ (all modulo $n^k$). In particular, if $A$ is given by Equation (12), then $M$ is the adjacency matrix of a de Bruijn graph, normalized so that every row sums to one. This graph was introduced by de Bruijn [10] in 1946 and has been considered for efficient distribution of information in different contexts such as parallel computing [23] and peer-to-peer networks [13]. This paper can be seen as an extension of this idea to consensus problems.

D. Design decentralisation

The process of convergence to consensus is itself decentralised, in the sense that every agent acts on its own. However, the communication strategy (who talks to whom?) must be designed once and for all beforehand. This can be done in a centralised way, where a new external agent dispatches to every other agent its own strategy. This can also be done in a decentralised way, where every agent is attributed a number $i$ between $0$ and $N-1$ and then finds the agents of number $\nu i, \nu i + 1, \ldots, \nu i + \nu - 1$. Achieving this in the most effective way is a problem of its own, and is not treated in this paper.

V. THE QUICKEST POSSIBLE STRATEGY

We have seen that, starting from $A$ with all identical entries, we get arbitrarily large matrices $M$ converging in finite time $k$. If we write $N = n^k$ for the dimension of $M$, this convergence time is $\log N / \log n = \log N / \log \nu$, where $\nu$ is the maximal in-degree of the communication graph of $M$. We can see that no strategy, whether linear or not, whether time-invariant or not, can converge more rapidly. Indeed, to reach the average of the initial conditions, every agent must have information about all other agents, but it can only know $\nu$ positions after one time step, $\nu^2$ after two time steps, etc. Hence the propagation of information needs around $\log N / \log \nu$ steps to connect all agents. This reasoning is made rigorous in the following proposition.

Proposition 5.1: Let $M \in \mathbb{R}^{N \times N}$ be such that $M \ge 0$. Let $\nu$ be defined as above. Then $M^k > 0$ implies $\nu^k \ge N$.

Proof: The fact that $M^k > 0$ implies that for any pair of nodes $(i, j)$ there exists in the graph $G_M$ a path of length $k$ connecting $i$ to $j$. Hence there are at least $N^2$ paths of length $k$. Let now $P_i$ denote the number of paths having length $i$. The previous consideration implies that $P_k \ge N^2$. On the other hand, it is easy to see that $P_1 \le \nu N$ and in general that $P_i \le \nu^i N$, from which we get $P_k \le \nu^k N$. Hence $\nu^k N \ge N^2$, from which it follows that $\nu^k \ge N$.

The above proposition considers only the time-invariant case. An identical result can be found for the time-varying case, showing that there is no difference, in terms of speed of convergence toward the meeting point, between the time-invariant and the time-varying cases. This can be seen as an a posteriori justification of our interest in the class of time-invariant strategies.

A linear time-invariant strategy converges in finite time iff its essential spectral radius is 0.
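The counting argument behind Proposition 5.1 can be sketched as follows. The code computes the smallest $k$ with $\nu^k \ge N$ and compares it with the number of steps the de Bruijn listening pattern ($\nu i + j$ modulo $N$) needs before every agent's state depends on all initial positions. The function names are hypothetical, and the sketch tracks only which initial positions can influence each agent, not the averaging itself; it assumes $\nu \ge 2$.

```python
def min_steps(N, nu):
    """Smallest k such that nu**k >= N, the bound of Proposition 5.1 (needs nu >= 2)."""
    k, reach = 0, 1
    while reach < N:
        reach *= nu
        k += 1
    return k

def heard_by(i, N, nu):
    """Agents whose state agent i reads: nu*i, nu*i + 1, ..., nu*i + nu - 1 (mod N)."""
    return [(nu * i + j) % N for j in range(nu)]

def propagation_time(N, nu):
    """Steps until every agent's state depends on all N initial positions."""
    depends_on = [{i} for i in range(N)]      # at time 0 each agent knows itself
    t = 0
    while any(len(s) < N for s in depends_on):
        depends_on = [set().union(*(depends_on[j] for j in heard_by(i, N, nu)))
                      for i in range(N)]
        t += 1
    return t

for N, nu in [(8, 2), (27, 3), (81, 3)]:
    print(N, nu, min_steps(N, nu), propagation_time(N, nu))
```

For these sizes the two numbers coincide, which is the sense in which the de Bruijn pattern spreads information as fast as the in-degree constraint allows.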
For a strategy converging in infinite time, the essential spectral radius is a good measure of the speed of convergence to the average of the initial conditions, as already mentioned.

A. Comparison between block Kronecker strategy and Cayley strategy

In this subsection we propose a comparison of the block Kronecker strategy with another strategy whose underlying communication graph has limited maximal in-degree and exhibits strong symmetries: the Cayley strategy. First we recall the concept of Cayley graph defined on Abelian groups [2], [1]. Let $G$ be any finite Abelian group (the internal operation will always be denoted $+$) of order $|G| = N$, and let $S$ be a subset of $G$ containing zero. The Cayley graph $\mathcal{G}(G, S)$ is the directed graph with vertex set $G$ and arc set
\[ E = \{ (g, h) : h - g \in S \}. \]
Notice that a Cayley graph is always in-regular, namely the in-degree of each vertex is equal to $|S|$. Notice also that strong connectivity can be checked algebraically. Indeed, it can be seen that a Cayley graph $\mathcal{G}(G, S)$ is strongly connected if and only if the set $S$ generates the group $G$, which means that any element in $G$ can be expressed as a finite sum of (not necessarily distinct) elements in $S$. If $S$ is such that $-S = S$ we say that $S$ is inverse-closed. In this case the graph obtained is undirected.

A notion of Cayley structure can also be introduced for matrices. Let $G$ be any finite Abelian group of order $|G| = N$. A matrix $P \in \mathbb{R}^{G \times G}$ is said to be a Cayley matrix over the group $G$ if
\[ P_{i,j} = P_{i+h,j+h} \qquad \forall\, i, j, h \in G. \]
It is clear that for a Cayley matrix $P$ there exists a $\pi : G \to \mathbb{R}$ such that $P_{i,j} = \pi(i - j)$. The function $\pi$ is called the generator of the Cayley matrix $P$. Notice that, if $P$ is a Cayley matrix generated by $\pi$, then $G_P$ is a Cayley graph with $S = \{ h \in G : \pi(h) \neq 0 \}$. Moreover, it is easy to see that for any Cayley matrix $P$ we have that $P \mathbf{1} = \mathbf{1}$ if and only if $\mathbf{1}^T P = \mathbf{1}^T$.
This implies that a Cayley stochastic matrix is automatically doubly stochastic. In this case the function $\pi$ associated with the matrix $P$ is a probability distribution on the group $G$. Among the multiple possible choices of the probability distribution $\pi$, one is particularly simple, namely $\pi(g) = 1/|S|$ for every $g \in S$.

Example 1: Let us consider the group $\mathbb{Z}_N$ of integers modulo $N$ and the Cayley graph $\mathcal{G}(\mathbb{Z}_N, S)$ where $S = \{-1, 0, 1\}$. Notice that in this case $S$ is inverse-closed. Consider the uniform probability distribution
\[ \pi(0) = \pi(1) = \pi(-1) = 1/3. \]
The corresponding Cayley stochastic matrix is given by
\[ P = \begin{bmatrix}
1/3 & 1/3 & 0 & 0 & \cdots & 0 & 1/3 \\
1/3 & 1/3 & 1/3 & 0 & \cdots & 0 & 0 \\
0 & 1/3 & 1/3 & 1/3 & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & \vdots & \cdots & \vdots & \vdots \\
1/3 & 0 & 0 & 0 & \cdots & 1/3 & 1/3
\end{bmatrix}. \tag{13} \]
Notice that in this case we have two symmetries. The first is that the graph is undirected and the second that the graph is circulant. These symmetries can be seen in the structure of the transition matrix $P$ that, indeed, turns out to be both symmetric and circulant [9].

Example 2: Let us now consider the group $\mathbb{Z}_N \times \mathbb{Z}_N$ and the Cayley graph $\mathcal{G}(\mathbb{Z}_N \times \mathbb{Z}_N, S)$ where $S = \{ (1, 0); (0, 0); (0, 1) \}$. Again consider the uniform probability distribution
\[ \pi((0, 0)) = \pi((1, 0)) = \pi((0, 1)) = 1/3. \]
The corresponding Cayley stochastic matrix is given by the following block circulant matrix belonging to $\mathbb{R}^{N^2 \times N^2}$
\[ P = \begin{bmatrix}
P_1 & P_2 & 0 & 0 & \cdots & 0 & 0 & 0 \\
0 & P_1 & P_2 & 0 & \cdots & 0 & 0 & 0 \\
0 & 0 & P_1 & P_2 & \cdots & 0 & 0 & 0 \\
\vdots & \vdots & \vdots & \vdots & \cdots & \vdots & \vdots & \vdots \\
P_2 & 0 & 0 & 0 & \cdots & 0 & 0 & P_1
\end{bmatrix}, \tag{14} \]
where $P_1, P_2 \in \mathbb{R}^{N \times N}$ are such that
\[ P_1 = \begin{bmatrix}
1/3 & 1/3 & 0 & \cdots & 0 & 0 \\
0 & 1/3 & 1/3 & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & \cdots & \vdots & \vdots \\
1/3 & 0 & 0 & \cdots & 0 & 1/3
\end{bmatrix}, \qquad P_2 = \frac{1}{3} I. \tag{15} \]
This example can be generalized to the discrete $d$-dimensional tori $\mathbb{Z}_N^d$, extensively studied in the literature on peer-to-peer networks [21], [25].

Now we recall an interesting result regarding the essential spectral radius of Cayley stochastic matrices. Assume that $P \in \mathbb{R}^{N \times N}$ is a Cayley stochastic matrix generated by a suitable $\pi$ and assume that $|S| = \nu$, where $S$ is as previously defined. Moreover assume that $0 \in S$. Notice that this last fact implies that $P_{ii} > 0$ for all $i$, $1 \le i \le N$. Then it follows that
\[ \rho \ge 1 - C / N^{2/(\nu - 1)}, \]
where $C > 0$ is a constant independent of $S$ and of the number of agents $N$. This result was proved in [6].

On the other hand, the block Kronecker strategy constructed from any matrix $A$ has essential spectral radius $|\lambda|^{1/k}$, where $|\lambda|$ is the essential spectral radius of $A$, as stated in Corollary 4.1. If $0 < |\lambda| < 1$, then $|\lambda|^{1/k}$ behaves like $1 - \mu/k$ for large $k$ and some $\mu$. Recall that $k$ is $\log N / \log n$. Hence this is better than Abelian Cayley strategies.

In conclusion, block Kronecker strategies have a better essential spectral radius, hence a quicker convergence speed, than Cayley strategies. For the particular choice given by Equation (12), we converge in finite time, and this time is the smallest possible over all linear strategies with the same constrained degree.

B. Simulation result

As an illustration, we present a simulation comparing the Cayley strategy and the block Kronecker strategy. The network considered consists of $N = 81$ agents. The matrix $P$ for the Cayley strategy is the matrix (13), whereas the matrix $M$ for the block Kronecker strategy is built starting from
\[ A = \begin{bmatrix} 1/3 & 1/3 & 1/3 \\ 1/3 & 1/3 & 1/3 \\ 1/3 & 1/3 & 1/3 \end{bmatrix} \]
with $n = 3$ and $k = 4$. The initial conditions have been chosen randomly inside the interval $[-50, 50]$. In both cases the in-degree is 3.
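The experiment just described can be approximated with the following sketch. It is not the authors' simulation code: the random seed, the helper names and the use of the largest deviation from the initial average as a progress measure are all assumptions made here. The essential spectral radius of the circulant matrix (13) is obtained from the standard closed form for circulant eigenvalues, $(1 + 2\cos(2\pi m/N))/3$.

```python
import math
import random

def build_M(A, k):
    """Block Kronecker matrix of eq. (7) built from the rows of A."""
    n = len(A)
    block = n ** (k - 1)
    M = [[0.0] * (n * block) for _ in range(n * block)]
    for i in range(n):
        for q in range(block):
            for j in range(n):
                M[i * block + q][q * n + j] = A[i][j]
    return M

def cayley_P(N):
    """Circulant matrix (13): each agent averages itself and its two ring neighbours."""
    P = [[0.0] * N for _ in range(N)]
    for i in range(N):
        for h in (-1, 0, 1):
            P[i][(i + h) % N] = 1.0 / 3.0
    return P

def step(P, x):
    return [sum(p * v for p, v in zip(row, x)) for row in P]

def max_deviation(x, target):
    return max(abs(v - target) for v in x)

def cayley_rho(N):
    """Essential spectral radius of (13) from the circulant eigenvalue formula."""
    return max(abs(1.0 + 2.0 * math.cos(2.0 * math.pi * m / N)) / 3.0
               for m in range(1, N))

random.seed(0)                      # arbitrary seed, only for repeatability
n, k = 3, 4
N = n ** k                          # 81 agents
x0 = [random.uniform(-50.0, 50.0) for _ in range(N)]
target = sum(x0) / N                # average of the initial conditions

M = build_M([[1.0 / 3.0] * n for _ in range(n)], k)
P = cayley_P(N)

x_kron, x_cayley = x0, x0
history = []
for t in range(1, 11):
    x_kron, x_cayley = step(M, x_kron), step(P, x_cayley)
    history.append((t, max_deviation(x_kron, target), max_deviation(x_cayley, target)))

for t, d_kron, d_cayley in history:
    print(f"t={t:2d}  block Kronecker: {d_kron:9.2e}   Cayley: {d_cayley:9.2e}")
print("essential spectral radius of (13) for N = 81:", cayley_rho(N))
```

With floating-point arithmetic the block Kronecker deviation drops to rounding-error level at step $k = 4$, while the Cayley deviation decays slowly, in line with an essential spectral radius close to one.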
Notice that, as depicted in Figure 1, the block Kronecker strategy reaches the average of the initial conditions in a finite number of steps, whereas the Cayley strategy, after the same number of steps, is still far from converging toward the meeting point.

[Figure 1: two panels titled "Block Kronecker Strategy" and "Cayley Strategy"; horizontal axis from 1 to 10, vertical axis from −50 to 50.]

Fig. 1. The block Kronecker strategy (left) converges in finite time, while the Cayley strategy (right) has a relatively slow convergence.

VI. LQR COST

In this section we want to evaluate the performance of the block Kronecker strategy according to the quadratic cost

\[
J = J_1 + \gamma J_2 ,
\]

where

\[
J_1 = \mathbb{E} \sum_{t \geq 0} (x(t) - x(\infty))^T (x(t) - x(\infty))
\]

accounts for the quickness of convergence,

\[
J_2 = \mathbb{E} \sum_{t \geq 0} (x(t+1) - x(t))^T (x(t+1) - x(t))
\]

limits the norm of the updates, and $\gamma$ weights the respective importance of those two factors. More precisely, we evaluate $J$ for any block Kronecker strategy derived from a normal matrix $A$. Recall that the initial state $x(0)$ is assumed to have an identity covariance matrix.

We start with a lemma which provides an upper and a lower bound for $J_1$.

Lemma 6.1: If $A$ is a normal $n \times n$ matrix satisfying conditions (A), (B), (C), and $\rho$ is the essential spectral radius of $A$, then the $J_1$ cost of the corresponding block Kronecker strategy of size $n^k$ satisfies

\[
J_L \leq J_1 \leq J_U ,
\]

where

\[
J_L = N \, \frac{1 - (\operatorname{Tr}(A^T A)/n)^k}{1 - \operatorname{Tr}(A^T A)/n} - k
\qquad \text{and} \qquad
J_U = J_L + \frac{k}{1 - \rho^2} \left( \operatorname{Tr} A^T A - 1 \right) .
\]
Proof: Classical arguments lead to

\[
\begin{aligned}
J_1 &= \mathbb{E} \sum_{t \geq 0} (x(t) - x(\infty))^T (x(t) - x(\infty)) \\
&= \sum_{t \geq 0} \mathbb{E} \, (x(t) - x(\infty))^T (x(t) - x(\infty)) \\
&= \sum_{t \geq 0} \mathbb{E} \operatorname{Tr} \, (x(t) - x(\infty))^T (x(t) - x(\infty)) \\
&= \sum_{t \geq 0} \mathbb{E} \operatorname{Tr} \, (x(t) - x(\infty)) (x(t) - x(\infty))^T \\
&= \sum_{t \geq 0} \operatorname{Tr} \, (M^t - E) \, \mathbb{E}\big(x(0) x(0)^T\big) \, (M^t - E)^T \\
&= \sum_{t \geq 0} \operatorname{Tr} \, (M^t - E)(M^t - E)^T ,
\end{aligned}
\]

with $E = \frac{1}{n^k} \mathbf{1}\mathbf{1}^T$ (here $\mathbb{E}$ denotes expectation, while $E$ is a matrix). Now,

\[
\operatorname{Tr} \, (M^t - E)(M^t - E)^T = \operatorname{Tr} M^t (M^t)^T - \operatorname{Tr} E = \operatorname{Tr} M^t (M^t)^T - 1 .
\]

When $M$ is derived by block Kronecker product from a normal matrix $A$, the trace $\operatorname{Tr} M^t (M^t)^T$ is equal to

\[
\big( \operatorname{Tr} (A^T A)^r \big)^{k-s} \big( \operatorname{Tr} (A^T A)^{r+1} \big)^{s}
\]

if $t = rk + s$, according to Proposition 4.2. We get a lower bound on $J_1$ by summing only the first $k$ terms:

\[
\begin{aligned}
J_1 &= \sum_{r \geq 0} \sum_{s=0}^{k-1} \Big( \big( \operatorname{Tr} (A^T A)^r \big)^{k-s} \big( \operatorname{Tr} (A^T A)^{r+1} \big)^{s} - 1 \Big) \\
&\geq \sum_{s=0}^{k-1} \big( \operatorname{Tr} (A^T A)^0 \big)^{k-s} \big( \operatorname{Tr} (A^T A)^1 \big)^{s} - k \qquad (16) \\
&= \sum_{s=0}^{k-1} n^{k-s} \big( \operatorname{Tr} A^T A \big)^{s} - k .
\end{aligned}
\]

The last summation is a geometric series that can be evaluated, leading to the bound

\[
J_1 \geq N \, \frac{1 - (\operatorname{Tr} A^T A / n)^k}{1 - \operatorname{Tr} A^T A / n} - k .
\]

This proves the left inequality in the claim.

For the right inequality, we find an upper bound on the terms neglected in the lower bound (16). As normal matrices can be diagonalized by a unitary transformation, the eigenvalues of $A^T A$, which we denote $1, \lambda_1, \lambda_2, \dots, \lambda_{n-1}$, are precisely the squared moduli of the eigenvalues of $A$. In particular, $\rho^2 = \lambda_1$, and the trace of $(A^T A)^t$ is $1 + \sum_i \lambda_i^t$. The terms neglected in the lower bound (16) are

\[
\sum_{r \geq 1} \sum_{s=0}^{k-1} \Big( \big( 1 + \sum_i \lambda_i^r \big)^{k-s} \big( 1 + \sum_i \lambda_i^{r+1} \big)^{s} - 1 \Big) .
\]

For every $r$, we bound each of the $k$ terms by the largest one (for which $s = 0$). Hence the neglected terms are bounded from above by

\[
\sum_{r \geq 1} \Big( \big( 1 + \sum_i \lambda_i^r \big)^{k} - 1 \Big) = k \sum_{r \geq 1} P(\lambda_1^r, \dots, \lambda_{n-1}^r) ,
\]

where $P$ is a polynomial in the variables $\lambda_1, \dots, \lambda_{n-1}$ with no independent term: all monomials have degree at least one.
Now we can sum all corresponding monomials for $r = 1, 2, \dots$: this is a geometric series of ratio at most $\lambda_1$. Hence $\sum_{r \geq 1} P(\lambda_1^r, \dots, \lambda_{n-1}^r)$ is at most

\[
\frac{1}{1 - \lambda_1} P(\lambda_1, \dots, \lambda_{n-1}) = \frac{1}{1 - \lambda_1} \left( \operatorname{Tr} A^T A - 1 \right) .
\]

Hence $J_1$ differs from our lower bound by at most $\frac{k}{1 - \lambda_1} \left( \operatorname{Tr} A^T A - 1 \right)$.

Thus

\[
J_1 = N \, \frac{1 - (\operatorname{Tr}(A^T A)/n)^k}{1 - \operatorname{Tr}(A^T A)/n} + O(\log N) .
\]

Now we estimate $J_2$.

Lemma 6.2: Under the assumptions of Lemma 6.1, if $\rho_i$ denote the eigenvalues of $A$ different from one,

\[
2 J_1 - N - \sum_i \frac{1}{1 - |\rho_i|^2} \leq J_2 \leq 2 J_1 - N .
\]

Proof: First we notice, adapting the first steps of the proof of Lemma 6.1, that

\[
J_2 = \sum_{t \geq 0} \operatorname{Tr} \, (M - I)^T (M^T)^t M^t (M - I) ,
\]

with $I$ the identity. This involves terms of the form $(M^T)^{t+1} M^t$. More precisely,

\[
\begin{aligned}
J_2 &= \sum_{t \geq 0} \operatorname{Tr} \Big( (M^T)^{t+1} M^{t+1} - (M^T)^{t+1} M^{t} - (M^T)^{t} M^{t+1} + (M^T)^{t} M^{t} \Big) \\
&= 2 \sum_{t \geq 0} \big( \operatorname{Tr} (M^T)^{t} M^{t} - 1 \big) - N - 2 \sum_{t \geq 0} \big( \operatorname{Tr} (M^T)^{t} M^{t+1} - 1 \big) .
\end{aligned}
\]

The first term of the last member is precisely $2 J_1$; the last term is, thanks to Proposition 4.3, $2 \sum_{t \geq 0} \big( \operatorname{Tr} (A^T)^{t} A^{t+1} - 1 \big)$. From the Cauchy-Schwarz inequality applied to the Frobenius norm,

\[
\operatorname{Tr} (A^T)^{t} A^{t+1} \leq \sqrt{ \operatorname{Tr} (A^T)^{t+1} A^{t+1} \; \operatorname{Tr} (A^T)^{t} A^{t} } \leq \operatorname{Tr} (A^T)^{t} A^{t} .
\]

Hence

\[
\sum_{t \geq 0} \big( \operatorname{Tr} (A^T)^{t} A^{t+1} - 1 \big) \leq \sum_{t \geq 0} \sum_i \lambda_i^t = \sum_i \frac{1}{1 - \lambda_i} ,
\]

where, as argued in the proof of Lemma 6.1, $\lambda_i = |\rho_i|^2$.

Hence,

\[
J = N \left[ (1 + 2\gamma) \, \frac{1 - (\operatorname{Tr}(A^T A)/n)^k}{1 - \operatorname{Tr}(A^T A)/n} - \gamma + O(\log N / N) \right] .
\]

Since the trace of $A^T A$ is the sum of the squares of the elements of $A$, we see that the coefficient of $N$ (neglecting the $O(\log N / N)$ term) is optimized by the matrix $A = \frac{1}{n} \mathbf{1}\mathbf{1}^T$, whatever the value of $\gamma$ is. In this case, the lower bound obtained on $J_1$ is exact, since only $k$ terms are non-zero. The optimal cost is then

\[
J = N \left[ (1 + 2\gamma) \, \frac{1 - 1/N}{1 - 1/n} - \gamma + O(\log N / N) \right] ,
\]

with $\nu = n$.
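The claim that the lower bound on $J_1$ is exact for $A = \frac{1}{n} \mathbf{1}\mathbf{1}^T$ can be checked numerically. The following is a minimal sketch rather than the authors' code: the helper name `gram_traces` is hypothetical, and $M$ is assumed to be the uniform-weight de Bruijn matrix with $M_{i,\,(ni+j) \bmod N} = 1/n$, which may differ from the paper's block Kronecker construction by a relabelling of the agents. Exact rational arithmetic avoids any rounding issue.

```python
from fractions import Fraction

def gram_traces(n, k, steps):
    """Return [Tr(M^t (M^t)^T) for t = 0..steps-1] for the assumed
    uniform de Bruijn matrix M of size N = n**k."""
    N = n ** k
    # rows[i] is row i of M^t, starting from M^0 = I
    rows = [[Fraction(int(i == j)) for j in range(N)] for i in range(N)]
    traces = []
    for _ in range(steps):
        # Tr(M^t (M^t)^T) is the sum of the squared entries of M^t
        traces.append(sum(v * v for row in rows for v in row))
        # M^{t+1} = M M^t: row i is the mean of rows n*i + j (mod N) of M^t
        rows = [
            [sum(rows[(n * i + j) % N][c] for j in range(n)) / n
             for c in range(N)]
            for i in range(N)
        ]
    return traces

n, k = 3, 4
N = n ** k
# Terms with t >= k vanish here because M^k = (1/N) 1 1^T,
# so a few steps beyond k are enough to evaluate the infinite sum.
traces = gram_traces(n, k, steps=k + 2)
J1 = sum(tr - 1 for tr in traces)

tr_AtA = Fraction(1)  # Tr(A^T A) for A = (1/n) 1 1^T
closed_form = N * (1 - (tr_AtA / n) ** k) / (1 - tr_AtA / n) - k

print(J1, closed_form)  # both equal 116 for n = 3, k = 4
```

For $n = 3$ and $k = 4$ the traces are $81, 27, 9, 3, 1, 1$, that is $n^{k-s}$ for $s < k$ and $1$ afterwards, so that $J_1 = 80 + 26 + 8 + 2 = 116$, in agreement with $N \frac{1 - 1/N}{1 - 1/n} - k$.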
There is therefore no trade-off between $J_1$ and $J_2$ among the family of block Kronecker strategies, in contrast with general LQR theory.

Note that the optimal control strategy for unconstrained degree (every agent knows every position) is easily obtained by solving a scalar algebraic Riccati equation, leading to the optimal cost $J = N (1 + \sqrt{1 + 4\gamma}) / 2$. If $\gamma$ is small and $n$ is large, then the optimal finite-time block Kronecker strategy approaches the unbounded-degree optimal solution, with a cost approximately equal to $(1 + \gamma) N$.

VII. CONCLUSIONS

We have introduced a family of strategies for a consensus problem, whose graph of communication is de Bruijn's graph. We have shown that they can converge in finite logarithmic time, which is optimal. We have evaluated the LQR cost of these strategies, proving their quasi-optimality if the cost of update is small and the degree of the graph is not too low.

This work can be extended in several directions, including:
• designing strategies valid for any $N$, not only exact powers of $n$;
• tackling the continuous-time case, where no deadbeat strategy can exist;
• estimating the LQR cost for Cayley strategies;
• finding strategies that minimize the LQR cost for any cost $\gamma$ of the update.

VIII. ACKNOWLEDGEMENTS

This work was partly developed during the stay of one of the authors (J.-C. D.) at the Department of Information Engineering of Università di Padova. Stimulating discussions with J. Hendrickx are gratefully acknowledged.

REFERENCES

[1] N. Alon and Y. Roichman. Random Cayley graphs and expanders. Random Structures and Algorithms, 5:271–284, 1994.
[2] L. Babai. Spectra of Cayley graphs. Journal of Combinatorial Theory, Series B, 27:180–189, 1979.
[3] R. W. Beard, J. Lawton, and F. Y. Hadaegh. A coordination architecture for spacecraft formation control. IEEE Transactions on Control Systems Technology, 9:777–790, 2001.
[4] P. Bhatta and N. E. Leonard. Stabilization and coordination of underwater gliders. In 41st IEEE Conference on Decision and Control, 2002.
[5] R. Carli, F. Fagnani, A. Speranzon, and S. Zampieri. Communication constraints in coordinated consensus problem. In American Control Conference (ACC '06), June 2006.
[6] R. Carli, F. Fagnani, A. Speranzon, and S. Zampieri. Communication constraints in the average consensus problem. Accepted for publication to Automatica.
[7] R. Carli, F. Fagnani, and S. Zampieri. On the state agreement with quantized information. In Proceedings of the 17th International Symposium on Mathematical Theory of Networks and Systems (MTNS), Kyoto, Japan, pages 1500–1508, July 2006.
[8] J. Cortes, S. Martinez, and F. Bullo. Robust rendezvous for mobile autonomous agents via proximity graphs in arbitrary dimensions. IEEE Trans. Automat. Control.
[9] P. J. Davis. Circulant matrices. A Wiley-Interscience Publication, Pure and Applied Mathematics. John Wiley & Sons, New York-Chichester-Brisbane, 1979.
[10] N. G. de Bruijn. A combinatorial problem. Koninklijke Nederlandse Akademie v. Wetenschappen, 49:758–764, 1946.
[11] D. V. Dimarogonas and K. J. Kyriakopoulos. On the rendezvous problem for multiple nonholonomic agents. IEEE Transactions on Automatic Control, 52(5):916–922, 2007.
[12] J. A. Fax and R. M. Murray. Information flow and cooperative control of vehicle formations. IEEE Trans. Automat. Control, 49(9):1465–1476, 2004.
[13] P. Fraigniaud and P. Gauron. D2B: A de Bruijn based content-addressable network. Theor. Comput. Sci., 355(1):65–79, 2006.
[14] Y. Hatano and M. Meshabi. Agreement of random networks. In IEEE Conference on Decision and Control, 2004.
[15] A. Jadbabaie, J. Lin, and A. S. Morse. Coordination of groups of mobile autonomous agents using nearest neighbor rules. IEEE Trans. Automat. Control, 48(6):988–1001, 2003.
[16] A. Kashyap, T. Basar, and R. Srikant. Consensus with quantized information updates. In Proceedings of CDC Conference, San Diego, 2006.
[17] R. H. Koning, H. Neudecker, and T. Wansbeek. Block Kronecker products and the vecb operator. Linear Algebra and its Applications, 149:165–184, 1991.
[18] Z. Lin, B. Francis, and M. Maggiore. Necessary and sufficient graphical conditions for formation control of unicycles. IEEE Transactions on Automatic Control, 50(1):121–127, 2005.
[19] M. Mazo, A. Speranzon, K. H. Johansson, and X. Hu. Multi-robot tracking of a moving object using directional sensors. In Proceedings of the International Conference on Robotics and Automation (ICRA), 2004.
[20] R. Olfati-Saber and R. M. Murray. Consensus problems in networks of agents with switching topology and time-delays. IEEE Trans. Automat. Control, 49(9):1520–1533, 2004.
[21] S. Ratnasamy, P. Francis, M. Handley, R. M. Karp, and S. Shenker. A scalable content-addressable network. In SIGCOMM, pages 161–172, 2001.
[22] W. Ren, K. L. Moore, and Y. Chen. Necessary and sufficient graphical conditions for formation control of unicycles. ASME Journal of Dynamic Systems, Measurement, and Control, to appear, 2007.
[23] M. R. Samatham and D. K. Pradhan. The de Bruijn multiprocessor network: A versatile parallel processing and sorting network for VLSI. IEEE Trans. Comput., 38(4):567–581, 1989.
[24] A. Speranzon, C. Fischione, and K. Johansson. Distributed and collaborative estimation over wireless sensor networks. In Proceedings of the IEEE Conference on Decision and Control (CDC'06), pages 1025–1030, December 2006.
[25] I. Stoica, R. Morris, D. R. Karger, M. F. Kaashoek, and H. Balakrishnan. Chord: A scalable peer-to-peer lookup service for internet applications. In SIGCOMM, pages 161–172, 2001.
[26] H. G. Tanner, A. Jadbabaie, and G. J. Pappas. Stable flocking of mobile agents, part I: fixed topology. In IEEE Conference on Decision and Control, 2003.
[27] H. G. Tanner, A. Jadbabaie, and G. J. Pappas. Stable flocking of mobile agents, part II: dynamic topology. In IEEE Conference on Decision and Control, 2003.
[28] H. G. Tanner, S. G. Loizou, and K. J. Kyriakopoulos. Nonholonomic navigation and control of cooperating mobile manipulators. IEEE Transactions on Robotics and Automation, 19(1):53–64, 2003.
