An Efficient Algorithm for Maximum-Entropy Extension of Block-Circulant Covariance Matrices

An Eﬃcien t Algorithm for Maxim um-En trop y Extension of Blo c k–Circulan t Co v ariance Matrices F rancesca P . Carli ∗ Augusto F erran te ∗ Mic hele Pa v on † Giorgio Picci ∗ ‡ Decem b er 28, 2012 Abstract This pap er deals with maxim um entrop y completion of partially speciﬁed blo ck– circulan t matrices. Since positive deﬁnite symmetric circulan ts happen to b e co v ari- ance matrices of stationary perio dic processes, in particular of stationary recipro cal pro cesses, this problem has applications in signal pro cessing, in particular to image mo deling. In fact it is strictly related to maxim um likelihoo d estimation of bilateral AR–t yp e represen tations of acausal signals sub ject to certain conditional indep endence constrain ts. The maxim um en tropy completion problem for blo c k–circulant matrices has recen tly been solv ed b y the authors, although leaving op en the problem of an eﬃcien t computation of the solution. In this pap er, we provide an eﬀcient algorithm for com- puting its solution which compares very fav ourably with existing algorithms designed for positive deﬁnite matrix extension problems. The prop osed algorithm beneﬁts from the analysis of the relationship b et ween our problem and the band–extension problem for blo c k–T o eplitz matrices also developed in this paper. 1 In tro duction W e consider the problem of completing a partially sp eciﬁed blo c k–circulant matrix under the constrain t that the completed matrix should b e p ositive deﬁnite and blo c k-circulant with an in v erse of banded structure. As shown in [6], a blo ck–circulan t completion problem of this kind is a crucial to ol for the identiﬁcation of a class of recipro cal pro cesses. These pro cesses ([24], [32], [34]) are a generalization of Marko v pro cesses which are particularly useful for modeling random signals whic h live in a ﬁnite region of time or of the space line, for example images. In this pap er we consider stationary recipro cal pro cesses for whic h w e refer the reader to [31, 14] and references therein. In particular, stationary recipro cal pro cesses of the autoregressive type can b e describ ed by linear mo dels inv olving a banded blo c k–circulan t concentration matrix 1 whose blo c ks are the (matrix–v alued) parameters of the mo del. ∗ Departmen t of Information Engineering, Univ ersity of Pado v a, Italy . carlifra@dei.unipd.it , augusto@dei.unipd.it , picci@dei.unipd.it † Departmen t of Pure and Applied Mathematics, Univ ersity of Pado v a, Italy . pavon@math.unipd.it ‡ W ork partially supp orted by the Italian Ministry for Education and Resarch (MIUR) under PRIN grant “Iden tiﬁcation and Adaptive Con trol of Industrial Systems”. 1 i.e. the inv erse cov ariance matrix, also known as the precision matrix. 1 This problem ﬁts in the general framework of co v ariance extension problems introduced b y A. P . Dempster [12] and studied by man y authors (see [5], [13], [22], [11], [35], [20], [3], [1], [2], [26], [19], [28], [25], [18], [29], [9], [15] and references therein). A key disco very by Dempster is that the in verse of the maxim um entrop y completion of a partially assigned co v ariance matrix has zeros exactly in the p ositions corresp onding to the unspeciﬁed entries in the giv en matrix, a prop ert y whic h, from now on, will be referred to as the Dempster pr op erty (an alternative, concise pro of of this statemen t can for example b e found in [7]). A relev ant fact is that, even when the c onstr aint of a cir culant structur e is imp ose d , the in verse of the maxim um entrop y completion maintains the Dempster prop ert y . This fact has b een ﬁrst noticed in [6] for a banded structure and then pro v ed in complete generality , i.e. for arbitrary given elements within a block–circulan t structure, in [7]. Otherwise stated, this means that the solution of the Maxim um Entrop y block–Circulan t Extension Problem (CME) and of the Dempster Maxim um En tropy Extension Problem (DME) with data con- sisten t with a blo c k–circulant structure, coincide. Note that this prop ert y do es not hold, for example, for arbitrary missing elements in a blo c k–T o eplitz structure: if we ask the comple- tion to b e T oeplitz, the maximum en tropy extension fails to satisfy the Dempster prop ert y unless the giv en data lie on consecutive bands cen tered along the main diagonal (see [13] and [19] for a general formulation of matrix extension problems in terms of so–called banded– algebra techniques and for a thorough discussion of the so–called band–extension problem for block–T oeplitz matrices). Moreov er, the block–T oeplitz band extension problem can be solv ed b y factorization techniques and is essentially a linear problem. This is unfortunately no longer true when a blo c k–circulant structure is imp osed [8] to the extension. The main con tribution of this pap er is to propose a new algorithm for solving the CME problem. A straightforw ard application of standard optimization algorithms would b e to o exp ensiv e for large sized problems like those w e ha ve in mind for, sa y applications to image pro cessing. Here we prop ose a new pro cedure which rests on duality theory and exploits information on the structure of the problem as well as the circulant structure for computing the solution of the CME. Since the solutions of the CME and of the DME with circulant–compatible data coincide, metho ds a v ailable in the literature for the DME can, in principle, b e emplo yed to compute the solution of CME. In this resp ect, it has been shown that if the graph asso ciated with the sp eciﬁed entries is c hordal ([21]), the solution of the DME can b e expressed in closed form in terms of the principal minors of the cov ariance matrix, see [3], [16], [33]. In our problem how ev er the sparsity pattern asso ciated with the given entries is not chordal and the maxim um entrop y completion has to be computed n umerically . A n umber of sp ecialized algorithms ha ve been prop osed in the graphical mo dels literature; see [12, 35, 36, 27]. These algorithms deal with the general unstructured setting of Dempster and are not esp ecially tailored to the circulant structure. A detailed comparison of our pro cedure with the best algorithms a v ailable so far is presen ted in Section 5. W e sho w that the prop osed algorithm outp erforms the algorithms prop osed in the graphical mo dels literature for the solution of the DME, b eing especially suitable to deal with very large sized instances of the problem. W e shall ﬁrst relate our work to the solution of the band extension problem for blo ck– T o eplitz matrices and show that the maximum en trop y circulant extension approximates arbitrarily closely the blo ck–T oeplitz band extension with the same starting data, when the dimension of the circulan t extension b ecomes large. This result is in the spirit of the relation b et w een stationary Mark ov and recipro cal pro cesses on an inﬁnite interv al established by 2 Levy in [30] and will b e useful to pro vide an eﬃcient initialization for the prop osed algorithm. In this con text, w e shall brieﬂy touch up on feasibilit y of the CME problem. The feasibilit y problem for generic blo c ks size and bandwidth has b een addressed in [6] and [7], where a suﬃcient condition on the data for a p ositiv e deﬁnite blo ck –circulan t completion to exist has b een deriv ed. Here we shall deriv e a ne c essary and suﬃcient condition for feasibility of the CME problem v alid for the scalar case with unitary bandwidth. The outline of the pap er is as follo ws. In Section 2 we introduce some notation and state the en tropy maximization problem. In Section 3 the relation b et ween the maximum en tropy extension for banded T o eplitz and banded circulant matrices is inv estigated. A necessary an suﬃcient condition for feasibility is also derived in this Section. In Section 4 w e describe the prop osed procedure for the solution of the CME problem. Section 5 con tains a brief review and discussion of some of the most p opular metho ds for the solution of the DME. A comparison of the proposed algorithm and the metho ds a v ailable in the literature is presented in Section 6. Section 7 concludes the pap er. 2 Notation and preliminaries All random v ariables in this pap er, denoted by b oldface characters, hav e zero mean and ﬁnite second order moments. It is sho wn in [6] that a wide–sense stationary R m –v alued pro cess y is stationary on { 0 , 1 , . . . , N } if and only if its cov ariance matrix, sa y Σ N , has a blo c k–circulan t symmetric structure, i.e. Σ N is of the form Σ N =                  Σ 0 Σ > 1 . . . Σ > τ . . . Σ τ . . . Σ 1 Σ 1 Σ 0 Σ > 1 . . . Σ > τ . . . . . . . . . . . . . . . . . . . . . Σ τ Σ τ . . . Σ 1 Σ 0 Σ > 1 . . . . . . . . . Σ τ . . . Σ 0 . . . Σ > τ Σ > τ . . . . . . . . . . . . . . . . . . . . . Σ > 1 Σ > 1 . . . Σ > τ . . . Σ τ Σ 1 Σ 0                  where the k –th blo ck, Σ k , is giv en by Σ k = E y ( t + k ) y ( t ) > . W e refer the reader to [10] for an in tro duction to circulan ts; an extension of some relev ant results for the blo ck –case can b e found, for example, in [7]. Here we just recall that the class of blo ck–circulan ts is closed un- der sum, pro duct, in verse and transp ose. Moreov er, all block–circulan ts are simultaneously diagonalized by the F ourier blo c k–matrix of suitable size (see (9)–(11) b elo w). The diﬀer ential entr opy H ( p ) of a probability densit y function p on R n is deﬁned by H ( p ) = − Z R n log( p ( x )) p ( x ) dx. (1) In case of a zero-mean Gaussian distribution p with cov ariance matrix Σ N , it results H ( p ) = 1 2 log(det Σ N ) + 1 2 n (1 + log (2 π )) . (2) 3 Let S N denote the vector space of real symmetric matrices with N × N square blo cks of dimension m × m . Moreov er, let U N denote the blo ck–circulan t shift matrix with N × N blo c ks, U N =        0 I m 0 . . . 0 0 0 I m . . . 0 . . . . . . . . . . . . 0 0 0 . . . I m I m 0 0 . . . 0        , E n the N × ( n + 1) blo ck matrix E n =             I m 0 . . . 0 0 I m 0 . . . . . . . . . 0 . . . . . . I m 0 . . . 0 . . . . . . . . . 0 . . . 0             . and T n ∈ S n +1 the blo c k–T o eplitz matrix made of the ﬁrst n + 1, m × m cov ariance lags { Σ 0 , Σ 1 , . . . , Σ n } , T n :=          Σ 0 Σ > 1 . . . . . . Σ > n Σ 1 Σ 0 Σ > 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . Σ > 1 Σ n . . . . . . Σ 1 Σ 0          . (3) The symmetric blo ck–T oeplitz matrix T n is completely sp eciﬁed by its ﬁrst blo ck–ro w, so, with obvious notation, it will b e also denoted as T n = T o epl  Σ 0 , Σ > 1 , . . . , Σ > n  . The maximum entrop y cov ariance extension problem for blo c k–circulan t matrices (CME) can b e stated as follows. max { log det Σ N | Σ N ∈ S N , Σ N > 0 } (4a) sub ject to : E > n Σ N E n = T n , (4b) U > N Σ N U N = Σ N . (4c) where we hav e exploited the fact that a matrix C N with N × N blo c ks is blo c k–circulant if and only if it commutes with U N , namely if and ony if C N = U > N C N U N . Problem (4) is a c onvex optimization pr oblem since we are minimizing a strictly con vex function on the in tersection of a con vex cone (minus the zero matrix) with a linear manifold. If we do not imp ose the completion to b e block–circulan t, w e obtain the cov ariance selection problem studied by A. P . Dempster (DME) in [12]. 4 Notice that, although in Problem 4 we are maximizing the entrop y functional ov er zero– mean Gaussian densities, w e are not actually restricting ourselves to the case of Gaussian distributions. Indeed, the Gaussian distribution with (zero mean and) co v ariance matrix solving (4) maximizes the entrop y functional (1) o ver the larger family of (zero mean) probabilit y densities whose cov ariance matrix satisﬁes the b oundary conditions (4b), (4c), see [6, Theorem 7.2]. 3 Relation with the blo c k-T o eplitz co v ariance extension prob- lem In this Section, w e shall p oint out a relation betw een the solutions of the maxim um entrop y band extension problem for blo c k–circulan t and blo c k–T o eplitz matrices. Let T n and E n b e the blo c k matrices deﬁned in Section 2. Moreov er let A N − 1 and B N − 1 b e the ( N − 1) × N blo c k shift matrices A N − 1 =        I m 0 0 . . . 0 0 0 I m 0 0 0 0 0 I m 0 0 . . . . . . . . . . . . 0 0 0 . . . I m 0        , B N − 1 =        0 I m 0 0 . . . 0 0 0 I m 0 0 0 0 0 I m . . . 0 . . . . . . . . . 0 0 0 0 . . . I m        The maxim um entrop y band extension problem for blo c k–T o eplitz matrices (TME) can b e stated as follows. max { log det Σ N | Σ N ∈ S N , Σ N > 0 } (5a) sub ject to : E > n Σ N E n = T n , (5b) A N − 1 Σ N A > N − 1 = B N − 1 Σ N B > N − 1 . (5c) This problem has a long history , and w as probably the ﬁrst matrix completion problem studied in the literature ([13], [19]). As men tioned in the In tro duction, it can b e solved b y factorization tec hniques, in fact, by the celebrated Levinson–Whittle algorithm [37] and is essen tially a linear problem. Belo w we shall sho w that for N → ∞ , the solution of the CME problem can b e appro ximated arbitrarly closely in terms of the solution of the T o eplitz band extension problem. The Theorem reads as follows. Theorem 3.1. Let T n b e p ositiv e deﬁnite and let { ˆ Σ k , k = n +1 , n +2 , . . . } b e the maximum en tropy blo ck– T o eplitz extension of { Σ 0 , Σ 1 , . . . , Σ n } solution of the TME problem 5. Then, for N large enough, the symmetric blo c k–circulant matrix Σ ( c ) N giv en by T o epl  Σ 0 , Σ > 1 , . . . , Σ > n , ˆ Σ > n +1 , . . . , ˆ Σ > N 2 − 1 , ˆ Σ > N 2 + ˆ Σ N 2 , ˆ Σ N 2 − 1 , . . . , ˆ Σ n +1 , Σ n , . . . Σ 1  , (6) for N even, and T o epl  Σ 0 , Σ > 1 , . . . , Σ > n , ˆ Σ > n +1 , . . . , ˆ Σ > N − 1 2 , ˆ Σ N − 1 2 , . . . , ˆ Σ n +1 , Σ n , . . . , Σ 1  , (7) for N o dd, is a cov ariance matrix whic h for N → ∞ is arbitrarily close to the mN × mN maxim um entrop y blo c k– cir culant extension of T n solution of the CME 4. 5 Pr o of. That Σ ( c ) N is a v alid co v ariance matrix for N large enough follo ws from [6, Theorem 5.1]. It remains to show that Σ ( c ) N giv en by (6), (7) tends to the maxim um entrop y blo c k– circulan t extension of T n , say Σ o N , i.e. that lim N →∞    Σ ( c ) N − Σ o N    = 0 . (8) T o this aim, w e recall that the maximum entrop y completion Σ ( c ) N is the unique completion of the giv en data whose in verse has the prop ert y to b e zero in the complemen tary positions of those assigned ([6], [7], [12]). Thus, (8) holds if and only if for N → ∞ the inv erse of Σ ( c ) N tends to b e banded blo c k–circulant lim N →∞      Σ ( c ) N  − 1 − ( Σ o N ) − 1     = 0 , i.e. if and only if its oﬀ–diagonal blo cks tend uniformly to zero (faster than N 2 ). T o sho w this, recall that Σ ( c ) N can b e blo c k–diagonalized as Σ ( c ) N = VΨ N V ∗ (9) where V is the F ourier blo ck-matrix whose ( k , l )-th blo c k is V kl := 1 / √ N exp [ − j2 π ( k − 1)( l − 1) / N ] I m (10) and Ψ N is the blo c k–diagonal matrix Ψ N := diag (Ψ 0 , Ψ 1 , . . . , Ψ N − 1 ) , (11) whose diagonal blo c ks Ψ ` , are the co eﬃcien ts of the ﬁnite F ourier transform of the ﬁrst blo c k row of Σ ( c ) N Ψ ` = ˆ Σ 0 + e j ϑ ` ˆ Σ > 1 +  e j ϑ `  2 ˆ Σ > 2 + · · · +  e j ϑ `  N − 2 ˆ Σ 2 +  e j ϑ `  N − 1 ˆ Σ 1 , (12) with ϑ ` := − 2 π `/ N . Thus in particular  Σ ( c ) N  − 1 = VΨ − 1 N V ∗ where Ψ − 1 N := diag  Ψ − 1 0 , Ψ − 1 1 , . . . , Ψ − 1 N − 1  . No w, let us consider the blo ck–T o eplitz band extension of the given data T n , { ˆ Σ k , k = 0 , 1 , 2 , . . . } , and the asso ciated sp ectral densit y matrix Φ( z ) := ˆ Σ 0 + ∞ X i =1 ˆ Σ i z − i + ∞ X i =1 ˆ Σ i z − i ! ∗ . (13) It is well–kno wn [37] that Φ( z ) can b e expressed in factored form as Φ( z ) =  L n ( z − 1 )  − 1 Λ n  L n ( z − 1 )  −∗ (14) 6 where L n ( z − 1 ) is the n –th Levinson–Whittle matrix polynomial asso ciated with the blo ck– T o eplitz matrix T n L n ( z − 1 ) = n X k =0 A n ( k ) z − k (15) with the A n ( k )’s and Λ n = Λ > n > 0 b eing the solutions of the Y ule–W alker t yp e equation  A n (0) A n (1) . . . A n ( n )  T > n =  Λ n 0 . . . 0  . (16) Note that Φ( z ) − 1 = L n ( z − 1 ) ∗ Λ − 1 n L n ( z ) is a Laurent p olynomial, that can b e written as Φ( z ) − 1 = M 0 +  M 1 z + M 2 z 2 + · · · + M n z n  +  M 1 z + M 2 z 2 + · · · + M n z n  ∗ Moreo ver, the Ψ ` ’s in (12) can b e written as 2 Ψ ` = ˆ Σ 0 + e j ϑ ` ˆ Σ > 1 + · · · +  e j ϑ `  h ˆ Σ > h + e − j ϑ ` ˆ Σ 1 + · · · +  e − j ϑ `  h ˆ Σ h (17) where h :=  N − 1 2 , N o dd N / 2 , N ev en No w, comparing expression (17) with (13), w e can write Ψ ` = Φ  e j ϑ `  − h ∆Φ N  e j ϑ `  + ∆Φ ∗ N  e j ϑ ` i (18) where ∆Φ N ( z ) := ∞ X i = h +1 ˆ Σ i z − i . Since the causal part of Φ( z ) is a rational function with p oles inside the unit circle, sup l =0 , ..., N − 1   ∆Φ N  e j ϑ `  + ∆Φ ∗ N  e j ϑ `    → 0 exp onen tially fast for N → ∞ . It follo ws that, for N → ∞ , Ψ − 1 ` tends to  Φ( e j θ ` )  − 1 , whic h is giv en by  Φ( e j θ ` )  − 1 = M 0 + M 1 e j ϑ ` + · · · + M n  e j ϑ `  n + M > 1  e j ϑ `  N − 1 + · · · + M > n  e j ϑ `  N − n , (19) for ev ery ` = 0 , 1 , . . . , N − 1. In other words, Ψ − 1 ` tends to the ﬁnite F ourier transform of a sequence of the form M 0 , M > 1 , M > 2 , . . . , M > n , 0 , . . . , 0 , M n , . . . , M 1 , i.e.  Σ ( c ) N  − 1 tends to b e banded blo c k–circulant, as claimed. 2 F or N even e j ϑ ` h = e − j ϑ ` h = − 1, so that  e j ϑ `  h ˆ Σ > h +  e − j ϑ `  h ˆ Σ h = −  ˆ Σ h + ˆ Σ > h  . 7 This result is very m uch in the spirit of the ﬁndings by Levy [30], whic h establish a relation b etw een stationary Mark o v and recipro cal pro cesses on an inﬁnite interv al and will b e used in Section 4.1 to pro vide an eﬃcien t inizialization for the prop osed algorithm. F easibilit y of the C ME problem has b een addressed in [6], where a suﬃcient condition on the data for a p ositiv e deﬁnite blo c k-circulant com pletion to exist has b een derived. There is a simple, y et, to the b est of our knowledge, still unnoticed, necessary and suﬃcient condition for the existence of a p ositiv e deﬁnite circulan t completion for scalar (blo cks of size 1 × 1) en tries and bandwidth n = 1 whic h can b e derived by combining the results in [1], [12], [7]. It reads as follows. Prop osition 3.1. L et N ≥ 4 . The p artial ly sp e ciﬁe d cir culant matrix             σ 0 σ 1 ? . . . . . . ? σ 1 σ 1 σ 0 σ 1 ? . . . . . . ? ? σ 1 σ 0 σ 1 ? . . . ? . . . . . . . . . . . . ? ? . . . . . . ? σ 1 σ 0 σ 1 σ 1 ? . . . . . . ? σ 1 σ 0             (20) admits a p ositive deﬁnite circulan t c ompletion if and only if | σ 1 | < σ 0 , for N even, and cos  N − 1 N π  σ 0 < σ 1 < σ 0 , for N o dd. Pr o of. In [1, Corollary 5] it is sho wn that the partially sp eciﬁed circulant matrix (20) admits a positive deﬁnite (but not necessarily circulant) completion if and only if | σ 1 | < σ 0 , for N even, and cos  N − 1 N π  σ 0 < σ 1 < σ 0 , for N o dd. On the other hand, Dempster [12] sho ws that if there is any p ositiv e deﬁnite symmetric matrix which agrees with the partially sp eciﬁed one in the giv en positions, then there exists exactly one suc h a matrix with the additional property that its inv erse has zeros in the complemen tary positions of those sp eciﬁed and this same matrix is the one which maximizes the entrop y functional among all the normal mo dels whose co v ariance matrix agrees with the giv en data. But, accordingly to the ﬁndings in [7], if the giv en data are consistent with a circulan t structure, the maximum en trop y completion is necessarily circulant, whic h concludes the pro of. Prop osition 3.1 provides an explicit condition on the oﬀ–diagonal en tries for the CME to b e feasible. Moreov er, for given σ 0 and σ 1 , it states that feasibility dep ends on the size N of the asked completion. This conﬁrms, by a completely indep enden t argument, the ﬁndings in [6, Theorem 5.1], where the dep endency of feasibility on the completion size N has b een ﬁrst noticed (and prov ed for a generic block–size and bandwidth). A clarifying example, whic h also mak es use of the c haracterization of the set of the p ositiv e deﬁnite completions in [7], is presented in App endix A. 4 A new algorithm for the solution of the CME problem In this Section we shall derive our new algorithm to solv e the CME problem. The deriv ation rests up on dualit y theory for the CME problem developed in [6, Section VI] and proﬁts by 8 the structure of our CME along with the prop erties of block–circulan t matrices to devise a computationally adv an tageous pro cedure for the computation of its solution. Consider the CME as deﬁned in (4) and deﬁne the linear map A : S n +1 × S N → S N (Λ , Θ) 7→ E n Λ E > n + U N Θ U > N − Θ (21) and the set L + := { (Λ , Θ) ∈ ( S n +1 × S N ) | (Λ , Θ) ∈ (k er( A )) ⊥ ,  E n Λ E > n + U N Θ U > N − Θ  > 0 } . (22) L + is an op en, conv ex subset of (k er( A )) ⊥ . Letting h A, B i := tr AB > , the L agr angian function results L ( Σ N , Λ , Θ) := − tr log Σ N + D Λ ,  E > n Σ N E n − T n E , + D Θ ,  U > N Σ N U N − Σ N E = − tr log Σ N + tr  E n Λ E > n Σ N  − tr (Λ T n ) + tr  U N Θ U > N Σ N  − tr (Θ Σ N ) and its ﬁrst v ariation (at Σ N in direction δ Σ N ∈ S N ) is δ L ( Σ N , Λ , Θ; δ Σ N ) = − tr  Σ − 1 N δ Σ N  + tr  E n Λ E > n δ Σ N  + tr  U N Θ U > N − Θ  δ Σ N  . Th us δ L ( Σ N , Λ , Θ; δ Σ N ) = 0 , ∀ δ Σ N ∈ S N if and only if Σ − 1 N = E n Λ E > n + U N Θ U > N − Θ . It follows that, for each ﬁxed pair (Λ , Θ) ∈ L + , the unique Σ o N minimizing the Lagrangian o ver S N , + := { Σ N ∈ S N , Σ N > 0 } is Σ o N =  E n Λ E > n + U N Θ U > N − Θ  − 1 . (23) Moreo ver, computing the Lagrangian at Σ N = Σ o N results in L ( Σ o N , Λ , Θ) = − tr log   E n Λ E > n + U N Θ U > N − Θ  − 1  + tr h  E n Λ E > n + U N Θ U > N − Θ   E n Λ E > n + U N Θ U > N − Θ  − 1 i − tr(Λ T n ) = tr log  E n Λ E > n + U N Θ U > N − Θ  + tr I mN − tr (Λ T n ) . This is a strictly conca v e function on L + whose maximization is the dual pr oblem of (CME). W e can equiv alen tly consider the con vex problem min { J (Λ , Θ) , (Λ , Θ) ∈ L + } , (24) 9 where J is given by J (Λ , Θ) = tr (Λ T n ) − tr log  E n Λ E > n + U N Θ U > N − Θ  . (25) It can b e sho wn ([6, Theorem 6.1]) that the function J admits a unique minim um p oin t  ¯ Λ , ¯ Θ  in L + . The gradient of J is ∇ Λ J (Λ , Θ) = − E > n  E n Λ E > n + U N Θ U > N − Θ  − 1 E n + T n (26a) ∇ Θ J (Λ , Θ) = − U > N  E n Λ E > n + U N Θ U > N − Θ  − 1 U N +  E n Λ E > n + U N Θ U > N − Θ  − 1 (26b) Th us the application of whatev er ﬁrst–order iterative metho d for the minimization of J w ould inv olv e rep eated inv ersions of the mN × mN blo c k matrix ( E n Λ E > n + U N Θ U > N − Θ), whic h could b e a prohibitive task for N large. Nev erthless, by exploting our kno wlwdge of the problem, we can devise the follo wing alternativ e. Let ( ¯ Λ , ¯ Θ) b e the unique minimum p oin t of the functional J on L + . W e know that ( ¯ Λ , ¯ Θ) are such that Σ o = E n ¯ Λ E > n + U N ¯ Θ U > N − ¯ Θ is circulant. Th us, one can think of restricting the searc h for the solution of the optimization problem to the set n (Λ , Θ) |  E n Λ E > n + U N Θ U > N − Θ  is circulant o . (27) If w e denote by C N the linear subspace of symmetric, block–circulan t matrices and b y Π C N the orthogonal pro jection on C N , the set (27) can b e written as n (Λ , Θ) | Π C ⊥ N  E n Λ E > n + U N Θ U > N − Θ  = 0 o . (28) W e can now exploit the characterization of the matrices b elonging to the orthogonal com- plemen t of C N in [6, Lemma 6.1], which states that a symmetric matrix M b elongs to the orthogonal complemen t of C N , say C N ⊥ , if and only if, for some N ∈ S N , it can b e ex- pressed as M = U N N U > N − N . Th us ( U N Θ U > N − Θ) ∈ C N ⊥ and set (28) can b e written as n (Λ , Θ) | Π C ⊥ N  E n Λ E > n  = −  U N Θ U > N − Θ o . (29) If we compute the dual function J on the set (29), w e obtain J (Λ , Θ) |  (Λ , Θ) | Π C ⊥ N ( E n Λ E > n ) = − ( U N Θ U > N − Θ )  = tr (Λ T n ) − tr log  E n Λ E > n + U N Θ U > N − Θ >  = tr (Λ T n ) − tr log  E n Λ E > n − Π C ⊥ N  E n Λ E > n  = tr (Λ T n ) − tr log  Π C N  E n Λ E > n  (30) where an explicit formula for the orthogonal pro jection of E n Λ E > n on the subspace of sym- metric, blo c k–circulant matrices is giv en by Theorem 7.1 in [6]. In fact, if we denote with Λ =      Λ 00 Λ 01 . . . Λ 0 n Λ > 01 Λ 11 . . . Λ 1 n . . . . . . . . . Λ > 0 n Λ > 1 n . . . Λ nn      , 10 it can b e shown that the orthogonal pro jection of E n Λ E > n on to C N , say Π Λ , is the banded blo c k–circulan t matrix giv en by Π Λ := Π C N  E n Λ E > n  =                       Π 0 Π > 1 . . . Π > n 0 . . . 0 Π n . . . Π 1 Π 1 Π 0 Π > 1 . . . Π > n 0 . . . 0 . . . . . . . . . . . . . . . . . . . . . . . . . . . Π n Π n . . . . . . . . . . . . . . . 0 0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 0 0 . . . . . . . . . . . . . . . Π > n Π > n . . . . . . . . . . . . . . . . . . . . . . . . Π > n 0 . . . 0 Π n . . . Π 1 Π 0 Π > 1 Π > 1 . . . Π > n 0 . . . 0 Π n . . . Π 1 Π 0                       with Π 0 = 1 N (Λ 00 + Λ 11 + . . . + Λ nn ) , (31a) Π 1 = 1 N (Λ 01 + Λ 12 + . . . + Λ n − 1 ,n ) > , (31b) . . . Π n = 1 N Λ > 0 n , (31c) and Π i = 0, forall i in the interv al n + 1 ≤ i ≤ N − n − 1 . Let us denote with ¯ J the restriction of J on (29) ¯ J (Λ) := tr (Λ T n ) − tr log n Π C N  E n Λ E > n o . (32) The gradient of the mo diﬁed functional ¯ J is ∇ Λ ¯ J (Λ) = − E > n Π − 1 Λ E n + T n . (33) Again, the computation of the gradient matrix inv olv e the inv ersion of an mN × mN matrix, namely the pro jection on the subspace of symmetric blo c k–circulant matrices of E n Λ E > n , Π Λ . Nev erthless, notice that this time the mN × mN matrix to b e inv erted is blo ck–cir culant , which implies that its inv erse can b e eﬃciently computed by exploting the blo c k–diagonalization Π Λ = VΩ N V ∗ , (34) where V is the blo c k–F ourier matrix (10) and Ω N is the blo c k–diagonal matrix whose diagonal blo c ks are the co eﬃcien ts of the ﬁnite F ourier transform of the ﬁrst blo c k ro w of Π Λ . In fact, (34) yields Π − 1 Λ = VΩ − 1 N V ∗ , so that the cost of computing Π − 1 Λ reduces to the cost of singularly inv erting the m × m diagonal blo cks of Ω N and indeed, b y exploiting the Hermitian symmetry of the diagonal 11 blo c ks of Ω N , to the cost of in verting only the ﬁrst  N +1 2  m × m blo c ks of Ω N . As a ﬁnal impro vemen t, notice that due to the ﬁnal left and righ t multiplication by E > n and E n , only the ﬁrst n + 1 blo c ks of Π − 1 Λ are needed to compute the gradient. T o recap, the prop osed pro cedure reduces the computational cost of eac h iteration of a ge neric ﬁrst–order descent metho d to O ( m 3 ) ﬂops, in place of the O ( N 3 ) op erations p er iteration which would hav e b een required b y a straighforward application of duality theory . In the follo wing, w e apply a gradien t descen t method to the optimization of the mo diﬁed functional ¯ J . The ov erall prop osed pro cedure is as follows. Algorithm 1 Matricial gradien t descent algorithm Giv en a starting p oin t Λ ∈ dom ¯ J , α ∈ (0 , 0 . 5), β ∈ (0 , 1) while   ∇ Λ ¯ J (Λ)   2 > η do ∆Λ := −∇ Λ ¯ J (Λ) while ¯ J (Λ + t ∆Λ) > ¯ J (Λ) + αt tr  ∇ ¯ J (Λ) > ∆Λ  do t := β t end while Λ := Λ + t ∆Λ end while In the next subsection w e provide an eﬃcient initialization for Algorithm 1. A comparison of the proposed procedure with state of the art algorithms for DME from the literature will b e presented in Section 6. 4.1 Algorithm initialization In this Section we exploit the asymptotic result in Theorem 3.1 to provide a go od starting p oin t for the iterative pro cedure of Algorithm 1. T o this aim, recall that the maximum en tropy completion of a partially sp eciﬁed blo c k– T o eplitz matrix can b e computed via the form ula Φ( z ) =  G ∗ ( z ) T − 1 n ˜ B  ˜ B ∗ T − 1 n ˜ B  − 1 ˜ B ∗ T − 1 n G ( z )  − 1 (35) (see [17] for details), where G ( z ) =  z I − ˜ A  − 1 ˜ B (36) with ˜ B =        0 0 . . . 0 I        , ˜ A =        0 I 0 . . . 0 0 0 I . . . 0 . . . . . . 0 I 0 . . . . . . . . . 0        . (37) It follows that the sp ectral factor W ( z ) :=  L n ( z − 1 )  − 1 Λ 1 2 n has a realization W ( z ) = C ( z I − A ) − 1 B + D 12 50 100 150 200 250 0 0.5 1 1.5 2 2.5 CP U time [sec. ] N · m (com pletion size) Id enti ty T o eplitz Figure 1: CPU time [in sec.] for the matricial gradient descen t algorithm with diﬀeren t initializations (iden tity in green and as in Section 4.1 in blue). The rep orted times ha ve b een computed for N = [10 , 20 , 30 , 40 , 50], m = 5 and bandwidth n = 3. Iden tity T o eplitz N m # of itz. CPU time # of itz. CPU time 10 5 99 0.1455 61 0.0767 20 5 212 0.4143 65 0.1270 30 5 322 0.8355 97 0.2504 40 5 432 1.4233 130 0.4285 50 5 541 2.1937 163 0.6603 T able 1: CPU time [in sec.] for the matricial gradien t descen t algorithm with diﬀeren t initializations (iden tity on the left and as in Section 4.1 on the right). The rep orted times ha ve been computed for N = [10 , 20 , 30 , 40 , 50], m = 5 and bandwidth n = 3. with D = Λ 1 2 n , C = −  A n ( n ) A n ( n − 1) . . . . . . A n (1)  and A =           0 I m 0 0 . . . 0 0 0 I m 0 . . . 0 . . . . . . . . . . . . . . . 0 0 . . . . . . . . . 0 I m − A n ( n ) − A n ( n − 1) . . . . . . . . . − A n (1)           , B =         0 0 . . . 0 Λ 1 2 n         , The p ositiv e real part of the maximum en tropy sp ectrum is given b y Φ + ( z ) = C ( z I − A ) − 1 ¯ C > + 1 2 Σ 0 (38) 13 where ¯ C > = AP C > + B D > , with P = AP A > + B B > and the maximum en tropy co v ariance extension results ˆ Σ k = C A k − 1 ¯ C > , k > n. With this extension at hand, we can compute an approximation for the maxim um en tropy blo c k–circulan t extension as suggested by Theorem 3.1. A go od starting p oin t for our algorithm can then b e obtained from (31) assuming for Λ a T o eplitz structure. As an example, we ha ve compared the execution time of the prop osed algorithm ini- tialized with the iden tity matrix and initialized with the solution of the asso ciated matrix extension problem for T o eplitz matrices as describ ed ab o v e for blo c ks of size m = 5, band- width n = 3 and N v arying b et ween 10 and 50. The computational times are rep orted in Figure 1 along with T able 1. The sim ulation results conﬁrm that the prop osed initializa- tion acts eﬀectively to reduce the num b er of iterations (and th us the computational time) required to reach the minim um. 5 Algorithms for the unstructured co v ariance selection prob- lem In this Section we in tro duce and discuss some of the main algorithms in the literature for the p ositiv e deﬁnite matrix completion problem with the aim of comparison with our newly prop osed algorithm. In the literature concerning matrix completion problems, it is common practice to describ e the pattern of the sp eciﬁed entries of an mN × mN partial symmetric matrix M = ( m ij ) by an undirected graph of mN v ertices which has an edge joining vertex i and v ertex j if and only if the entry m ij is sp eciﬁed. Since the diagonal en tries are all assumed to b e sp eciﬁed, w e ignore lo ops at the vertices. 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 nz = 34 (a) Titolo 1 2 4 3 5 6 7 8 Figure 1: Caption of this w onderful Ti k Z ﬁgure. 1 (b) Figure 2: Banded pattern of the given entries for the TME problem with N = 8, n = 2, m = 1 (on the left) and asso ciated graph (on the right). 14 0 2 4 6 8 10 12 0 2 4 6 8 10 12 nz = 84 (a) 1 1 2 3 4 5 6 7 8 9 10 Figure 1: Graph asso ciated to the arro w sparsit y pattern of Figure ?? . 1 2 3 4 5 6 7 8 9 10 11 12 Figure 2: Graph asso ciated to the arro w sparsit y pattern of Figure ?? . (b) Figure 3: Banded pattern of the giv en en tries for the CME with N = 12, n = 3, m = 1 (on the left) and asso ciated graph (on the right). The graph is not c hordal since, for example, the cycle { 1 , 4 , 7 , 10 } do es not ha ve a c hord. As anticipated in the In tro duction, if the graph of the speciﬁed en tries is chor dal (i.e., a graph in which ev ery cycle of length greater than three has an edge connecting nonconsec- utiv e no des, see e.g.[21]), the maxim um determinan t matrix completion problem admits a closed form solution in terms of the principal minors of the sample cov ariance matrix (see [3], [16], [33]). An example of chordal sparsity pattern along with the asso ciated graph is sho wn in Figure 2. Ho wev er, the graph asso ciated with a banded circulant sparsit y pat- terns is not chordal, as it is apparent from the example of Figure 3b. Therefore we ha ve to resort to iter ative algorithms . F or the applications w e ha ve in mind, we are dealing with v ector–v alued pro cesses possibly deﬁned on a quite large interv al. A straigh tforw ard ap- plication of standard optimization algorithms is to o exp ensiv e for problems of such a size, and an umber of sp ecialized algorithms hav e b een proposed in the graphical mo dels liter- ature ([12, 35, 36, 27]). In his early work ([12]), Dempster himself prop osed tw o iterative algorithms which how ev er are very demanding from a computational point of view. Two p opular methods are those prop osed b y T. P . Sp eed and H. T. Kiiv eri in [35], that we no w brieﬂy discuss. Sp eed and Kiiveri’s algorithms W e will denote an undirected graph b y G = ( V , E ), where V is the vertex set and E the edge set which consists of unordered pairs of distinct v ertices. In an y undirected graph we say that 2 vertices u , v ∈ V are adjac ent if ( u, v ) ∈ E . F or an y v ertex set S ⊆ V , consider the edge set E ( S ) ⊆ E given b y E ( S ) := { ( u, v ) ∈ E | u, v ∈ S } The graph G ( S ) = ( S, E ( S )) is called sub gr aph of G induc e d by S . An induced subgraph G ( S ) is c omplete if the v ertices in S are pairwise adjacent in G . A clique is a complete subgraph that is not contained within another complete subgraph. Finally , we deﬁne the complemen tary graph of G = ( V , E ) as the graph ˜ G with vertex set V and edge set ˜ E with the prop ert y that ( u, v ) ∈ ˜ E if and only if u 6 = v and ( u, v ) / ∈ E . 15 Let I b b e the set of pairs of indices consistent with a banded, symmetric blo c k–circulan t structure of bandwidth n , i.e. the set of the ( i, j )’s which satisﬁes the follo wing rules set for i ∈ { 1 , . . . , m } , j ∈ { i, . . . , mN } , if | i − j | ≤ m ( n + 1) − i ⇒ ( i, j ) ∈ I b if ( i, j ) ∈ I b ⇒  ( i + m ) mod mN , ( j + m ) mod mN  ∈ I b if ( i, j ) ∈ I b ⇒ ( j, i ) ∈ I b (an example of this structure is shown in Figure 3a). Moreo ver, w e will denote by I C b the complemen t of I b with respect to { 1 , . . . , mN } × { 1 , . . . , mN } and b y G = ( { 1 , . . . , mN } , I b ) the graph asso ciated with the given en tries. As mentioned in the In tro duction, for the class of problems studied by Dempster, the in verse of the unique completion whic h maximizes the entrop y functional has the prop erty to b e zero in the complementary p ositions of those ﬁxed in Σ N . Thus, a rather natural pro cedure to compute the solution of the cov ariance selection problem for blo c k–circulant matrices seems to b e the follo wing: iterate maintaing the elemen ts of Σ N indexed b y I b at the desired v alue (i.e. equal to the corresp onding elements in the sample cov ariance matrix) while forcing the elemen ts of Σ − 1 N in I C b to zero. T o this aim, the following pro cedure can b e devised. Algorithm 2 First algorithm (Sp eed and Kiiveri [35]) Compute all the cliques ˜ c t in the complementary graph ˜ G , sa y { ˜ c t , t = 1 , . . . , n ˜ c t } ; Initialize Σ (0) N = R N ; while some stopping criterion is satisﬁed do for all the cliques ˜ c t in ˜ G do Σ ( t ) N = Σ ( t − 1) N + φ  Σ ( t − 1) N  end for end while where φ  Σ ( t − 1) N  is the mN × mN zero matrix whic h equals ( diag   ( Σ ( t − 1) N ) − 1  ˜ c t  − 1 ) − 1 −   ( Σ ( t − 1) N ) − 1  ˜ c t  − 1 in the p ositions corresponding to the curren t clique ˜ c t (giv en a mN × mN matrix M and a set a ⊂ { 1 , . . . , N m } , M a denotes the submatrix with en tries { m ij : i, j ∈ a } ). Every cycle consists of as man y steps as the cliques in the complemen tary graph ˜ G (the graph associated to the elements indexed by I C b ). At eac h step, only the elements in Σ N corresp onding to the current clique ˜ c t (i.e. only a subset of the entries indexed by I C b ) are mo diﬁed in such a wa y to set the elements of Σ − 1 N in the corresp onding p ositions to the desired zero–v alue. Throughout the iterations, the elemen ts in Σ N are ﬁxed o ver I b , while the elements of ( Σ N ) − 1 v ary o v er I C b . The role of Σ N and Σ − 1 N can also b e sw app ed, yielding an alternative procedure, which is the analog of iterativ e proportional scaling (IPS) for con tingency tables [23]. Let ϕ  Σ ( t − 1) N  b e the mN × mN zero matrix which equals (( R N ) c t ) − 1 −   Σ ( t − 1) N  c t  − 1 (39) 16 in the p ositions corresp onding to the current clique c t in G (the graph asso ciated with the giv en entries). The second algorithm reads as follows. Algorithm 3 Second algorithm (Sp eed and Kiiveri [35]) Compute all the cliques c t in G , say { c t , t = 1 , . . . , n c t } ; Initialize Σ (0) N = I mN ; while some stopping criterion is satisﬁed do for all the cliques c t in G do  Σ ( t ) N  − 1 =  Σ ( t − 1) N  − 1 + ϕ  Σ ( t − 1) N  (40) end for end while Ev ery cycle consists of as many steps as the cliques in the graph of the speciﬁed entries G . A t each step, only the elements in Σ − 1 N corresp onding to the current clique c t (i.e. only a subset of the en tries indexed by I b ) are mo diﬁed in such a wa y to set the elements of Σ N in the corresp onding p ositions to the desired v alue, namely equal to the sample cov ariance R N . Through the iterations the elements in ( Σ N ) − 1 are ﬁxed ov er I C b while the elements of Σ N v ary o v er I b . The c hoice of which algorithm is to be preferred depends on the application and is very m uch dep enden t on the num b er and size of the cliques in G and ˜ G . In our setting, the complexit y of the graph asso ciated with the given en tries dep ends on the bandwidth n . In particular, for a bandwidth n not to o large with resp ect to the completion size (which is the case we are in terested in) the complexit y of the graph asso ciated with the given data G is far lo wer than the complexity of its complemen tary (whic h, for small n , is almost complete), see Figure 4. The execution time of the tw o algorithms has b een compared for a completion size N = 30 and a bandwidth n v arying b et w een 2 and 8. The results are sho wn in Figure 5 and T able 2. It turns out that for n small the second algorithm (whic h, from no w on, will b e referred to as IPS) runs faster than the ﬁrst, and thus has to b e preferred. First algorithm Second algorithm n # of cl. (max. cl. size) CPU time [s] # of cl. (max. cl. size) CPU time [s] 2 4608(10) 9 . 7877 30(3) 0 . 4109 3 2406(7) 4 . 1515 30(4) 0 . 1783 4 1241(6) 1 . 9419 30(5) 0 . 3153 5 706(5) 1 . 0525 30(6) 0 . 5535 6 445(4) 0 . 6258 30(7) 0 . 9854 7 295(3) 0 . 4145 30(8) 1 . 7477 8 175(3) 0 . 2480 30(9) 3 . 0665 T able 2: Execution time of the ﬁrst and second algorithm for N = 30, m = 1, bandwidth n = { 2 , . . . , 8 } . Co v ariance selection via c hordal em b edding Recen tly , Dahl, V anderb erghe and Ro y- c howdh ury [9] hav e prop osed a new technique to impro ve the eﬃciency of the Newton’s metho d for the cov ariance selection problem based on c hordal em b edding: the giv en spar- sit y pattern is em b edded in a chordal one for whic h they pro vide eﬃcient tec hniques for 17 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 (a) G for n = 2 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 (b) ˜ G for n = 2 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 (c) G for n = 5 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 (d) ˜ G for n = 5 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 (e) G for n = 8 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 (f ) ˜ G for n = 8 Figure 4: Graph G asso ciated with the given data (on the right) and its complemen tary ˜ G (on the left) for N = 20 and bandwidth n = 2 , 5 , 8. 18 2 3 4 5 6 7 8 0 1 2 3 4 5 6 7 8 9 10 Executio n time CP U time [sec.] ba n dw idth n Algo rit h m 1 Algo rit h m 2 Figure 5: Comparison b et ween the execution time of the ﬁrst and second algorithm for N = 30, m = 1, n = { 2 , . . . , 8 } . computing the gradient and the Hess ian. The complexit y of the method is dominated b y the cost of forming and solving a system of linear equations in whic h the num ber of un- kno wns dep ends on the num ber of nonzero entries added in the chordal embedding. F or a circulan t sparsit y pattern, it is easy to c heck that the n umber of nonzero elements added in the c hordal embedding is quite large. Hence, their metho d do es not seem to b e eﬀectiv e for our problem. 6 Comparison of the prop osed algorithm and the IPS algo- rithm The prop osed gradient descent algorithm (GD) applied to the mo diﬁed dual functional ¯ J has b een compared to the iterative prop ortional scaling pro cedure (IPS) by Speed and Kiiveri. Both algorithms are implemented in Matlab. The Bron–Kerb osc h algorithm [4] has been emplo yed for ﬁnding the cliques in the graph for IPS. W e recall (see Section 4) that the n umber of op erations p er iteration required by our mo diﬁed gradien t descen t algorithm is cubic in the block–size m , as opp osed to the O ([ m ( N − ( n + 1))] 3 ) op erations p er iteration of the IPS algorithm (see equations (39) and (40)). It follo ws that for large instances of the CME our newly prop osed algorithm is exp ected to run considerably faster than the IPS algorithm. The execution times for diﬀeren t c ompletion size N and block size m are plotted in Figures 6 and 7. The sim ulation study conﬁrms that our gradien t descent algorithm applied to the mo diﬁed dual functional ¯ J outp erforms the iterative prop ortional scaling and the gap b et w een the tw o increases as N increases. Moreov er, the gap b ecomes m uch more eviden t as m grows, making the gradien t descent algorithm more attractiv e for applications where the pro cess under observ ation is vector–v alued ( m > 1). 19 50 100 150 200 250 0 10 20 30 40 50 60 70 CP U time [se c.] N · m (com pleti o n s ize) Mo d iﬁ ed g rad ie nt d es cent Iterativ e p r o p o r tio na l scalin g N m IPS GD 10 5 4.7048 0.0767 20 5 16.4981 0.1270 30 5 29.2779 0.2504 40 5 43.8072 0.4285 50 5 63.8069 0.6603 Figure 6: Matricial gradient descen t algorithm vs. iterativ e prop ortional scaling: CPU time [in sec.] for N = [10 , 20 , 30 , 40 , 50], m = 5, bandwidth n = 3. 100 150 200 250 300 350 400 450 500 0 200 400 600 800 1000 1200 1400 CP U time [s ec.] N · m (com pleti o n size) Mo diﬁ ed g rad ie nt descent Iterativ e p r o p o r tion a l scaling N m IPS GD 10 10 69.7671 0.1516 20 10 307.9596 0.4459 30 10 597.3791 0.8988 40 10 924.6431 1.4798 50 10 1341.0976 2.2052 Figure 7: Matricial gradient descen t algorithm vs. iterativ e prop ortional scaling: CPU time [in sec.] for N = [10 , 20 , 30 , 40 , 50], m = 10, bandwidth n = 3. 7 Conclusions The main con tribution of the presen t pap er is an eﬃcien t algorithm to solv e the maximum en tropy band extension problem for blo c k–circulant matrices. This problem has many applications in signal pro cessing since it arises in connection with maximum likelihoo d estimation of p erio dic, and in particular quasi–Mark ov (or recipro cal), pro cesses. Ev en if matrix completion problems ha ve gained considerable atten tion in the past (think for 20 example to the cov ariance extension problem for stationary pro cesses on the in teger line, i.e. for T o eplitz matrices), the maxim um en trop y band extension problem for blo ck–circulan t matrices has b een addressed for the ﬁrst time in [6]. The prop osed algorithm exploits the circulan t structure and relies on the v ariational analysis brought forth in [6]. An eﬃcien t initialization for the prop osed algorithm is provided thanks to the established relationship b et w een the solutions of the maxim um entrop y problem for blo c k–circulant and block– T o eplitz matrices. F urther light is also shed on the feasibilit y issue for the CME problem. A F easibilit y of the CME: an example In Section 3 we hav e shown that, for giv en σ 0 and σ 1 , feasibility of the CME depends on the completion size N . The follo wing example, aims at clarifying the in terplay b et ween feasibilit y and completion size N in the simple case of unitary bandwidth and blo c k–size using the characterization of the set of all p ositive deﬁnite completions deriv ed in [7]. Example A.1. L et σ 0 = 1 , σ 1 = − 0 . 91 . We want to investigate the fe asibility of Pr oblem 4 for N = 7 and N = 9 , i.e. we want to determine if, for N = 7 and N = 9 , ther e exist a p ositive deﬁnite cir culant c ompletion for the p artial ly sp e ciﬁe d matric es Σ 7 = Circ ( σ 0 , σ 1 , x, y , y , x, σ 1 ) , Σ 9 = Circ ( σ 0 , σ 1 , x, y , z , z , y , x, σ 1 ) , wher e Circ ( a ) denotes the cir culant symmetric matrix sp e ciﬁe d by its ﬁrst r ow a , and x , y and z denote the unsp e ciﬁe d entries. Sinc e cos  ( N − 1) N π  = ( − 0 . 9010 for N = 7 − 0 . 9397 for N = 9 , by The or em 3.1, we exp e ct that for N = 7 the pr oblem is unfe asible while for N ≥ 9 it is exp e cte d to b e c ome fe asible. F or N = 7 the set of al l p ositive deﬁnite c ompletions is delimite d by the interse ction of the half–planes indentiﬁe d by the eigenvalues Ψ( w k ) , k = 0 , . . . , 6 of Σ 7 Ψ( w 0 ) = − 0 . 82 + 2 x + 2 y Ψ( w 1 ) = Ψ( w 6 ) = − 0 . 134751 − 0 . 445042 x − 1 . 80194 y Ψ( w 2 ) = Ψ( w 5 ) = 1 . 40499 − 1 . 80194 x + 1 . 24698 y Ψ( w 3 ) = Ψ( w 4 ) = 2 . 63976 + 1 . 24698 x − 0 . 445042 y . (se e [7] for details). In Figur e 8 the interse ction Γ of the half–planes Ψ( w 0 ) > 0 and Ψ( w 1 ) > 0 is shown, to gether with the half–plane Ψ( w 2 ) > 0 . The interse ction of these two r e gions is empty. It fol lows that the interse ction of the four half–planes Ψ( w k ) > 0 , k = 0 , . . . , 3 is also empty, as claime d. On the other hand, if N = 9 , the eigenvalues of Σ 9 ar e Ψ( w 0 ) = − 0 . 82 + 2 x + 2 y + 2 z Ψ( w 1 ) = Ψ( w 8 ) = − 0 . 394201 + 0 . 347296 x − y − 1 . 87939 z Ψ( w 2 ) = Ψ( w 7 ) = 0 . 68396 − 1 . 87939 x − y + 1 . 53209 z Ψ( w 3 ) = Ψ( w 6 ) = 1 . 91 − x + 2 y − z Ψ( w 4 ) = Ψ( w 5 ) = 2 . 71024 + 1 . 53209 x − y + 0 . 347296 z 21 and the fe asible set is the nonempty r e gion shown in Figur e 9. − 0 . 5 0 . 5 1 1 . 5 2 − 0 . 8 − 0 . 6 − 0 . 4 − 0 . 2 0 . 2 0 Ψ  ω 0  Ψ  ω 1  Ψ  ω 2  Γ y x Figure 8: Half–plane Ψ( w 2 ) > 0 and intersection of the half–planes Ψ( w 0 ) > 0 and Ψ( w 1 ) > 0. The intersection of the t wo regions is empty . 0.73 0.74 0.75 0.76 - 0.49 - 0.48 - 0.47 0.16 0.17 0.18 Figure 9: F easible region { ( x, y , z ) | Σ N > 0 } for Σ N = Circ { 1 , − 0 . 91 , x, y , z , z , y , x, − 0 . 91 } . References [1] W. Barrett, C.R. Johnson, and P . T arazaga. The real p ositiv e deﬁnite completion problem for a simple cycle. Line ar algebr a and its applic ations , 192:3–31, 1993. 22 [2] W.W. Barrett, C.R. Johnson, and R. Lo ewy . The real p ositiv e deﬁnite completion problem: cycle completability . Mem. Am. Math. So c. , 122, 1996. [3] W.W. Barrett, C.R. Johnson, and M. Lundquist. Determinantal formulation for matrix completions associated with chordal graphs. Line ar Algebr a and Applic ations , 121:265– 289, 1989. [4] C. Bron and J. Kerb osc h. Algorithm 475: ﬁnding all cliques of an undirected graph. Commun. ACM , 16(9):575–577, 1973. [5] J.P . Burg. Maximum entr opy sp e ctr al analysis. PhD thesis, Dept. of Geophysiscs, Stanford Universit y , Stanford, CA, 1967. [6] F.P . Carli, A. F errante, M. P av on, and G. Picci. A maxim um entrop y solution of the co- v ariance extension problem for reciprocal pro cesses. IEEE T r ansactions on Automatic Contr ol , 56(9):1999–2012, 2011. [7] F.P . Carli and T.T. Georgiou. On the co v ariance completion problem under a circulan t structure. IEEE T r ansactions on A utomatic Contr ol , 56(4):918 – 922, 2011. [8] F.P . Carli and G. Picci. On the factorization approach to band extension of blo c k- circulan t matrices. In Pr o c e e dings of the 19th Int. symp osium on the mathematic al the ory of networks and systems (MTNS) , pages 907–914, Budap est, Hungary , July 2010. [9] J. Dahl, L. V anderb erghe, and V. Royc ho wdhury . Cov ariance selection for non–chordal graphs via chordal embedding. Optimization Metho ds and Softwar e , 23:501–520, 2008. [10] P . Da vis. Cir culant Matric es . John Wiley & Sons, 1979. [11] A. Dembo, C.L. Mallows, and L. A. Shepp. Em b edding nonnegative deﬁnite to eplitz matrices in nonnegative deﬁnite circulant matrices, with application to cov ariance es- timation. IEEE T r ansactions on Information The ory , 35(6):1206 –1212, Nov 1989. [12] A.P . Dempster. Cov ariance selection. Biometrics , 28:157–175, 1972. [13] H. Dym and I. Goh berg. Extension of band matrices with band inv erses. Line ar A lgebr a and Applic ations , 36:1–24, 1981. [14] A. F erran te and B.C. Levy . Canonical form of symplectic matrix pencils. Line ar Algebr a and its Applic ations , 274(1–3):259–300, April 1998. [15] A. F errante and M. Pa v on. Matrix completion ` a la Dempster by the principle of parsimon y . IEEE T r ansactions on Information The ory , 57(6):3925–3931, 2011. [16] M. F ukuda, M. Ko jima, K. Murota, and K. Nak ata. Exploiting sparsity in semideﬁnite programming via matrix completion i: general framework. SIAM Journal in Optimiza- tion , 11:647–674, 2000. [17] T.T. Georgiou. Spectral analysis based on the state cov ariance: The maxmum entrop y sp ectrum and linear fractional parametrization. IEEE T r ansactions on Information The ory , 52:1052–1066, 2006. 23 [18] W. Glunt, T.L. Hayden, C.R. Johnson, and P . T arazaga. P ositive deﬁnite completions and determinant maximization. Line ar algebr a and its applic ations , 288:1–10, 1999. [19] I. Gohberg, S. Goldb erg, and M. Kaasho ek. Classes of Line ar Op er ators vol II . Birkhauser, Boston, 1994. [20] I. Goh b erg, M.A. Kaasho ek, and H.J. W oerdeman. The band method for p ositiv e and strictly contractiv e extension problems: An alternative version and new applications. Inte gr al Equations and Op er ator The ory , 12:343–382, 1989. [21] M. Golum bic. Algorithmic Gr aph The ory and Perfe ct Gr aphs . Academic Press, New Y ork, 1980. [22] R. Grone, C.R. Johnson, E.M. Sa, and H. W olk owicz. Positiv e Deﬁnite Completions of Partial Hermitian Matrices. Line ar A lgebr a and Its Applic ations , 58:109–124, 1984. [23] S.J. Hab erman. The A nalysis of fr e quancy data. Univ. Chicago Press, 1974. [24] B. Jamison. Recipro cal pro cesses: The stationary gaussian case. Ann. Math. Stat. , 41:1624–1630, 1970. [25] C.R. Johnson, B. Kroschel, and H. W olko wicz. An interior-point metho d for approxi- mate p ositiv e semideﬁnite completions. Computational optimization and applic ations , 9(2):175–190, 1998. [26] C.R. Johnson and T.A. McKee. Structural conditions for cycle completable graphs. Discr ete Mathematics , 159(1):155–160, 1996. [27] S. Kullback. Probability densities with giv en marginals. The Annals of Mathematic al Statistics , 39(4):1236–1243, 1968. [28] M. Laurent. The real p ositiv e semideﬁnite completion problem for series-parallel graphs. Line ar algebr a and its applic ations , 252(1-3):347–366, 1997. [29] M. Laurent. Polynomial instances of the p ositiv e semideﬁnite and euclidean distance matrix completion problems. SIAM Journal on Matrix Analysis and Applic ations , 22(3):874–894, 2001. [30] B.C. Levy . Regular and recipro cal multiv ariate stationary Gaussian recipro cal pro cesses o ver Z are necessarily Marko v. J. Math. Systems, Estimation and Contr ol , 2:133–154, 1992. [31] B.C. Levy and A. F errante. Characterization of stationary discrete-time gaussian recip- ro cal processes o ver a ﬁnite interv al. SIAM J. Matrix Analysis Applic ations , 24(2):334– 355, 2002. [32] B.C. Levy , R. F rezza, and A.J. Krener. Mo deling and estimation of discrete-time Gaussian recipro cal pro cesses. IEEE T r ans. Automatic Contr ol , AC-35(9):1013–1023, 1990. 24 [33] K. Nak ata, K. F ujitsa w a, M. F ukuda, M. Ko jima, and K. Murota. Exploiting sparsity in semideﬁnite programming via matrix completion ii: implemen tation and numerical details. Mathematic al Pr o gr amming Series B , 95:303–327, 2003. [34] J.A. Sand. F our p ap ers in Sto chastic R e alization The ory . PhD thesis, Dept. of Mathe- matics, Roy al Institute of T echnology (KTH), Stockholm, Sw eden, 1994. [35] T.P . Sp eed and H.T. Kiiv eri. Gaussian mark ov distribution o ver ﬁnite graphs. The A nnals of Statistics , 14(1):138–150, 1986. [36] N. W erm ut and E. Scheidt. Fitting a cov ariance selection mo del to a matrix. algorithm as105. Appl. Statist. , 26:88–92, 1977. [37] P . Whittle. On the ﬁtting of multiv ariate autoregressions and the appro ximate sp ectral factorization of a sp ectral density matrix. Biometric a , 50:129–134, 1963. 25

An Efficient Algorithm for Maximum-Entropy Extension of Block-Circulant Covariance Matrices

Original Paper

Comments & Academic Discussion

Leave a Comment

Original Paper

Related Papers

Comments & Academic Discussion

Leave a Comment