Twice-Ramanujan Sparsifiers*

Joshua Batson†    Daniel A. Spielman‡    Nikhil Srivastava§

June 1, 2024

Abstract

We prove that every graph has a spectral sparsifier with a number of edges linear in its number of vertices. As linear-sized spectral sparsifiers of complete graphs are expanders, our sparsifiers of arbitrary graphs can be viewed as generalizations of expander graphs. In particular, we prove that for every d > 1 and every undirected, weighted graph G = (V, E, w) on n vertices, there exists a weighted graph H = (V, F, w̃) with at most ⌈d(n − 1)⌉ edges such that for every x ∈ R^V,

    x^T L_G x ≤ x^T L_H x ≤ ((d + 1 + 2√d)/(d + 1 − 2√d)) · x^T L_G x,

where L_G and L_H are the Laplacian matrices of G and H, respectively. Thus, H approximates G spectrally at least as well as a Ramanujan expander with dn/2 edges approximates the complete graph. We give an elementary deterministic polynomial time algorithm for constructing H.

1 Introduction

A sparsifier of a graph G = (V, E, w) is a sparse graph H that is similar to G in some useful way. Many notions of similarity have been considered. For example, Chew's [6] spanners have the property that the distance between every pair of vertices in H is approximately the same as in G. Benczur and Karger's [3] cut-sparsifiers have the property that the weight of the boundary of every set of vertices is approximately the same in G as in H. We consider

* This material is based upon work supported by the National Science Foundation under Grant CCF-0634957. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.
† Department of Mathematics, MIT. Work on this paper performed while at Yale College.
‡ Program in Applied Mathematics and Department of Computer Science, Yale University.
§ Department of Computer Science, Yale University.

the spectral notion of similarity introduced by Spielman and Teng [17, 19]: we say that H is a κ-approximation of G if for all x ∈ R^V,

    x^T L_G x ≤ x^T L_H x ≤ κ · x^T L_G x,    (1)

where L_G and L_H are the Laplacian matrices of G and H. We recall that

    x^T L_G x = Σ_{(u,v)∈E} w_{u,v} (x_u − x_v)^2,

where w_{u,v} is the weight of edge (u, v) in G. By considering vectors x that are the characteristic vectors of sets, one can see that condition (1) is strictly stronger than the cut condition of Benczur and Karger.

In the case where G is the complete graph, excellent spectral sparsifiers are supplied by Ramanujan graphs [12, 13]. These are d-regular graphs H all of whose non-zero Laplacian eigenvalues lie between d − 2√(d − 1) and d + 2√(d − 1). Thus, if we take a Ramanujan graph on n vertices and multiply the weight of every edge by n/(d − 2√(d − 1)), we obtain a graph that κ-approximates the complete graph, for

    κ = (d + 2√(d − 1))/(d − 2√(d − 1)).

In this paper, we prove that every graph can be approximated at least this well¹ by a graph with only twice as many edges as the Ramanujan graph (as a d-regular graph has dn/2 edges).

Theorem 1.1. For every d > 1, every undirected weighted graph G = (V, E, w) on n vertices contains a weighted subgraph H = (V, F, w̃) with ⌈d(n − 1)⌉ edges (i.e., average degree at most 2d) that satisfies:

    x^T L_G x ≤ x^T L_H x ≤ ((d + 1 + 2√d)/(d + 1 − 2√d)) · x^T L_G x    for all x ∈ R^V.

Our proof provides a deterministic greedy algorithm for computing the graph H in time O(dn^3 m). We remark that while the edges of H are a subset of the edges of G, the weights of edges in H and G will typically be different. In fact, there exist unweighted graphs G for which every good spectral sparsifier H must contain edges of widely varying weights [19].
¹ Strictly speaking, our approximation constant is only better than the Ramanujan bound κ = (d + 2√(d − 1))/(d − 2√(d − 1)) in the regime d ≥ (1 + √5)/2. This includes the actual Ramanujan graphs, for which d is an integer greater than 2.

1.1 Expanders: Sparsifiers of the Complete Graph

In the case that G is a complete graph, our construction produces expanders. However, these expanders are slightly unusual in that their edges have weights, they may be irregular, and the weighted degrees of vertices can vary slightly. This may lead one to ask whether they should really be considered expanders. In Section 4 we argue that they should be.

As the graphs we produce are irregular and weighted, it is also not immediately clear that we should be comparing κ with the Ramanujan bound of

    (d + 2√(d − 1))/(d − 2√(d − 1)) = 1 + 4/√d + O(1/d).    (2)

It is known² that no d-regular graph of uniform weight can κ-approximate a complete graph for κ asymptotically better than (2) [14]. While we believe that no graph of average degree d can be a κ-approximation of a complete graph for κ asymptotically better than (2), we are unable to show this at the moment, and prove instead the weaker claim that no such graph can achieve κ less than

    1 + 2/√d − O(√d / n).

1.2 Prior Work

Spielman and Teng [17, 19] introduced the notion of sparsification that we consider, and proved that (1 + ε)-approximations with Õ(n/ε^2) edges could be constructed in Õ(m) time. They used these sparsifiers to obtain a nearly-linear time algorithm for solving diagonally dominant systems of linear equations [17, 18]. Spielman and Teng were inspired by the notion of sparsification introduced by Benczur and Karger [3] for cut problems, which only required inequality (1) to hold for all x ∈ {0, 1}^V.
Benczur and Karger showed how to construct graphs H meeting this guarantee with O(n log n/ε^2) edges in O(m log^3 n) time; their cut sparsifiers have been used to obtain faster algorithms for cut problems [3, 11]. Spielman and Srivastava [16] proved the existence of spectral sparsifiers with O(n log n/ε^2) edges, and showed how to construct them in Õ(m) time. They conjectured that it should be possible to find such sparsifiers with only O(n/ε^2) edges. We affirmatively resolve this conjecture. Recently, partial progress was made towards this conjecture by Goyal, Rademacher and Vempala [9], who showed how to find graphs H with only 2n edges that O(log n)-approximate bounded degree graphs G under the cut notion of Benczur and Karger. We remark that all of these constructions were randomized. Ours is the first deterministic algorithm to achieve the guarantees of any of these papers.

² While lower bounds on the spectral gap of d-regular graphs focus on showing that the second-smallest eigenvalue is asymptotically at most d − 2√(d − 1), the same proofs by test functions can be used to show that the largest eigenvalue is asymptotically at least d + 2√(d − 1).

2 Preliminaries

2.1 The Incidence Matrix and the Laplacian

Let G = (V, E, w) be a connected weighted undirected graph with n vertices, m edges, and edge weights w_e > 0. If we orient the edges of G arbitrarily, we can write its Laplacian as L = B^T W B, where B_{m×n} is the signed edge-vertex incidence matrix, given by

    B(e, v) = 1 if v is e's head, −1 if v is e's tail, and 0 otherwise,

and W_{m×m} is the diagonal matrix with W(e, e) = w_e. It is immediate that L is positive semidefinite, since for every x ∈ R^n:

    x^T L x = x^T B^T W B x = ‖W^{1/2} B x‖_2^2 = Σ_{(u,v)∈E} w_{u,v} (x_u − x_v)^2 ≥ 0,

and that G is connected if and only if ker(L) = ker(W^{1/2} B) = span(1).
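As a concrete check of the definitions above, the following minimal numpy sketch builds B, W, and L = B^T W B for a small weighted graph (the graph and variable names are our illustrative choices, not from the paper) and verifies the quadratic-form identity and the kernel condition.

```python
import numpy as np

# A small weighted 4-cycle; each edge is (head, tail, weight) under an
# arbitrary orientation, as in Section 2.1.
edges = [(0, 1, 1.0), (1, 2, 2.0), (2, 3, 1.0), (3, 0, 3.0)]
n, m = 4, len(edges)

B = np.zeros((m, n))          # signed edge-vertex incidence matrix
W = np.zeros((m, m))          # diagonal weight matrix
for e, (head, tail, w) in enumerate(edges):
    B[e, head] = 1.0
    B[e, tail] = -1.0
    W[e, e] = w

L = B.T @ W @ B               # the Laplacian L = B^T W B

# x^T L x equals the weighted sum of squared differences across edges.
rng = np.random.default_rng(0)
x = rng.standard_normal(n)
quad = sum(w * (x[u] - x[v]) ** 2 for (u, v, w) in edges)
assert np.isclose(x @ L @ x, quad)

# L is PSD, and since the cycle is connected, ker(L) = span(1).
eigs = np.linalg.eigvalsh(L)
assert abs(eigs[0]) < 1e-9 and eigs[1] > 1e-9
assert np.allclose(L @ np.ones(n), 0.0)
```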
2.2 The Pseudoinverse

Since L is symmetric, we can diagonalize it and write

    L = Σ_{i=1}^{n−1} λ_i u_i u_i^T,

where λ_1, …, λ_{n−1} are the nonzero eigenvalues of L and u_1, …, u_{n−1} are a corresponding set of orthonormal eigenvectors. The Moore-Penrose pseudoinverse of L is then defined as

    L^+ = Σ_{i=1}^{n−1} (1/λ_i) u_i u_i^T.

Notice that ker(L) = ker(L^+) and that

    L L^+ = L^+ L = Σ_{i=1}^{n−1} u_i u_i^T,

which is simply the projection onto the span of the nonzero eigenvectors of L (which are also the eigenvectors of L^+). Thus, L L^+ = L^+ L is the identity on im(L) = ker(L)^⊥.

2.3 Formulas for Rank-one Updates

We use the following well-known theorem from linear algebra, which describes the behavior of the inverse of a matrix under rank-one updates (see [8, Section 2.1.3]).

Lemma 2.1 (Sherman-Morrison Formula). If A is a nonsingular n × n matrix and v is a vector, then

    (A + vv^T)^{−1} = A^{−1} − (A^{−1} vv^T A^{−1}) / (1 + v^T A^{−1} v).

There is a related formula describing the change in the determinant of a matrix under the same update:

Lemma 2.2 (Matrix Determinant Lemma). If A is nonsingular and v is a vector, then

    det(A + vv^T) = det(A) (1 + v^T A^{−1} v).

3 The Main Result

At the heart of this work is the following purely linear algebraic theorem. We use the notation A ⪯ B to mean that B − A is positive semidefinite, and id_S to denote the identity operator on a vector space S.

Theorem 3.1. Suppose d > 1 and v_1, v_2, …, v_m are vectors in R^n with

    Σ_{i≤m} v_i v_i^T = id_{R^n}.

Then there exist scalars s_i ≥ 0 with |{i : s_i ≠ 0}| ≤ dn so that

    id_{R^n} ⪯ Σ_{i≤m} s_i v_i v_i^T ⪯ ((d + 1 + 2√d)/(d + 1 − 2√d)) · id_{R^n}.

The sparsification result for graphs follows quickly from this theorem, as shown below.

Proof of Theorem 1.1. Assume without loss of generality that G is connected. Write L_G = B^T W B as in Section 2.1 and fix d > 1.
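Both rank-one update formulas are easy to sanity-check numerically. The sketch below (a check we add for illustration, not part of the paper) verifies Lemmas 2.1 and 2.2 on a random well-conditioned matrix.

```python
import numpy as np

# Numerical check of the Sherman-Morrison formula and the matrix
# determinant lemma on a random nonsingular A and random vector v.
rng = np.random.default_rng(1)
n = 6
A = rng.standard_normal((n, n)) + n * np.eye(n)   # diagonally shifted: nonsingular
v = rng.standard_normal(n)
Ainv = np.linalg.inv(A)

# Lemma 2.1: (A + vv^T)^{-1} = A^{-1} - A^{-1} v v^T A^{-1} / (1 + v^T A^{-1} v)
lhs = np.linalg.inv(A + np.outer(v, v))
rhs = Ainv - (Ainv @ np.outer(v, v) @ Ainv) / (1.0 + v @ Ainv @ v)
assert np.allclose(lhs, rhs)

# Lemma 2.2: det(A + vv^T) = det(A) (1 + v^T A^{-1} v)
assert np.isclose(np.linalg.det(A + np.outer(v, v)),
                  np.linalg.det(A) * (1.0 + v @ Ainv @ v))
```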
Restrict attention to im(L_G) ≅ R^{n−1} and apply Theorem 3.1 to the columns {v_i}_{i≤m} of

    V_{n×m} = (L_G^+)^{1/2} B^T W^{1/2},

which are indexed by the edges of G and satisfy

    Σ_{i≤m} v_i v_i^T = V V^T = (L_G^+)^{1/2} B^T W B (L_G^+)^{1/2} = (L_G^+)^{1/2} L_G (L_G^+)^{1/2} = id_{im(L_G)}.

Write the scalars s_i ≥ 0 guaranteed by the theorem in the m × m diagonal matrix S(i, i) = s_i, and set

    L_H = B^T W^{1/2} S W^{1/2} B.

Then L_H is the Laplacian of the subgraph H of G with edge weights {w̃_i = w_i s_i}_{i∈E}, and H has at most d(n − 1) edges since at most that many of the s_i are nonzero. Also,

    id_{im(L_G)} ⪯ Σ_{i≤m} s_i v_i v_i^T = V S V^T ⪯ κ · id_{im(L_G)}    for κ = (d + 1 + 2√d)/(d + 1 − 2√d).

By the Courant-Fischer Theorem, this is equivalent to:

    1 ≤ (y^T V S V^T y)/(y^T y) ≤ κ    for all y ∈ im(L_G^{1/2}) = im(L_G)
    ⟺ 1 ≤ (y^T (L_G^+)^{1/2} L_H (L_G^+)^{1/2} y)/(y^T y) ≤ κ    for all y ∈ im(L_G^{1/2})
    ⟺ 1 ≤ (x^T L_G^{1/2} (L_G^+)^{1/2} L_H (L_G^+)^{1/2} L_G^{1/2} x)/(x^T L_G^{1/2} L_G^{1/2} x) ≤ κ    for all x ⊥ 1
    ⟺ 1 ≤ (x^T L_H x)/(x^T L_G x) ≤ κ    for all x ⊥ 1,

as desired.

It is worth mentioning that the above reduction is essentially the same as the one in [16]. In that paper, the authors consider the symmetric projection matrix Π = B L_G^+ B^T whose columns {Π_e}_{e∈E} correspond to the edges of G. They show, by a concentration lemma of Rudelson [15], that randomly sampling O(n log n) of the columns with probabilities proportional to ‖Π_e‖^2 = w_e R_eff(e) (where R_eff is the effective resistance) gives a matrix Π̃ that approximates Π in the spectral norm and corresponds to a graph sparsifier, with high probability.
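The reduction above can be traced through on a small example. The sketch below (our construction for illustration; the graph is arbitrary) forms the columns of V = (L_G^+)^{1/2} B^T W^{1/2} and checks that they sum to the projection onto im(L_G), and that the squared column norms (the leverage scores w_e R_eff(e)) sum to n − 1.

```python
import numpy as np

# A small weighted graph on 4 vertices; B and W as in Section 2.1.
edges = [(0, 1, 1.0), (1, 2, 2.0), (2, 3, 1.0), (3, 0, 3.0), (0, 2, 1.5)]
n, m = 4, len(edges)
B = np.zeros((m, n)); W = np.zeros((m, m))
for e, (h, t, w) in enumerate(edges):
    B[e, h], B[e, t], W[e, e] = 1.0, -1.0, w
L = B.T @ W @ B

# (L^+)^{1/2} via the spectral decomposition, inverting only nonzero eigenvalues.
lam, U = np.linalg.eigh(L)
nz = lam > 1e-9
Lp_half = (U[:, nz] * lam[nz] ** -0.5) @ U[:, nz].T

V = Lp_half @ B.T @ np.sqrt(W)      # column i corresponds to edge i
P = U[:, nz] @ U[:, nz].T           # projection onto im(L_G)
assert np.allclose(V @ V.T, P)      # sum_i v_i v_i^T = id on im(L_G)

# ||v_i||^2 = w_e * R_eff(e); for a connected graph these sum to n - 1.
lev = np.sum(V * V, axis=0)
assert np.isclose(lev.sum(), n - 1)
```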
In this paper, we do essentially the same thing with two modifications: we eliminate Π in order to simplify notation, since we are no longer following the intuition of sampling by effective resistances; and, instead of Rudelson's sampling lemma, we use Theorem 3.1 to deterministically select O(n) edges (equivalently, columns of Π).

The rest of this section is devoted to proving Theorem 3.1. The proof is constructive and yields a deterministic polynomial time algorithm for finding the scalars s_i, which can then be used to sparsify graphs, as advertised. Given vectors {v_i}, our goal is to choose a small set of coefficients s_i so that A = Σ_i s_i v_i v_i^T is well-conditioned. We will build the matrix A in steps, starting with A = 0 and adding one vector s_i v_i v_i^T at a time.

Before beginning the proof, it will be instructive to study how the eigenvalues and characteristic polynomial of a matrix evolve upon the addition of a vector. This discussion should provide some intuition for the structure of the proof, and demystify the origin of the 'Twice-Ramanujan' number (d + 1 + 2√d)/(d + 1 − 2√d) which appears in our final result.

3.1 Intuition for the Proof

It is well known that the eigenvalues of A + vv^T interlace those of A. In fact, the new eigenvalues can be determined exactly by looking at the characteristic polynomial of A + vv^T, which is computed using Lemma 2.2 as follows:

    p_{A+vv^T}(x) = det(xI − A − vv^T) = p_A(x) (1 − Σ_j ⟨v, u_j⟩^2 / (x − λ_j)),

where λ_j are the eigenvalues of A and u_j are the corresponding eigenvectors. The polynomial p_{A+vv^T}(x) has two kinds of zeros λ:

1. Those for which p_A(λ) = 0. These are equal to the eigenvalues λ_j of A for which the added vector v is orthogonal to the corresponding eigenvector u_j, and which therefore do not 'move' upon adding vv^T.

2. Those for which p_A(λ) ≠ 0 and

    f(λ) = 1 − Σ_j ⟨v, u_j⟩^2 / (λ − λ_j)
= 0. These are the eigenvalues which have moved, and they strictly interlace the old eigenvalues. The above equation immediately suggests a simple physical model which gives intuition as to where these new eigenvalues are located.

[Figure 1: Physical model of interlacing eigenvalues. Barriers sit at the old eigenvalues λ_1, …, λ_n; in the example shown, they carry charges ⟨v, u_1⟩^2 = 1/2, ⟨v, u_2⟩^2 = 1/4, ⟨v, u_3⟩^2 = 0, …, ⟨v, u_n⟩^2 = 1/4.]

Physical Model. We interpret the eigenvalues λ as charged particles lying on a slope. On the slope are n fixed, chargeless barriers located at the initial eigenvalues λ_j, and each particle is resting against one of the barriers under the influence of gravity. Adding the vector vv^T corresponds to placing a charge of ⟨v, u_j⟩^2 on the barrier corresponding to λ_j. The charges on the barriers repel those on the eigenvalues with a force that is proportional to the charge on the barrier and inversely proportional to the distance from the barrier; i.e., the force from barrier j is given by

    ⟨v, u_j⟩^2 / (λ − λ_j),

a quantity which is positive for barriers λ_j 'below' λ, which push the particle 'upward', and negative otherwise. The eigenvalues move up the slope until they reach an equilibrium in which the repulsive forces from the barriers cancel the effect of gravity, which we take to be +1 in the downward direction. Thus the equilibrium condition corresponds exactly to having the total 'downward pull' f(λ) equal to zero.

With this physical model in mind, we begin to consider what happens to the eigenvalues of A when we add a random vector from our set {v_i}. The first observation is that for any eigenvector u_j (in fact, for any unit vector at all), the expected projection of a randomly chosen v ∈ {v_i}_{i≤m} is

    E_v ⟨v, u_j⟩^2 = (1/m) Σ_i ⟨v_i, u_j⟩^2 = (1/m) u_j^T (Σ_i v_i v_i^T) u_j = ‖u_j‖^2 / m = 1/m.
Of course, this does not mean that there is any single vector v_i in our set that realizes this 'expected behavior' of equal projections on the eigenvectors. But if we were to add such a vector³ in our physical model, we would add equal charges of 1/m to each of the barriers, and we would expect all of the eigenvalues of A to drift forward 'steadily'. One might expect that after sufficiently many iterations of this process, the eigenvalues would all march forward together, with no eigenvalue too far ahead or too far behind, and we would end up in a position where λ_max/λ_min is bounded.

This intuition turns out to be correct. Adding a vector with equal projections changes the characteristic polynomial in the following manner:

    p_{A + v_avg v_avg^T}(x) = p_A(x) (1 − Σ_j (1/m)/(x − λ_j)) = p_A(x) − (1/m) p'_A(x),

since p'_A(x) = Σ_j Π_{i≠j} (x − λ_i). If we start with A = 0, which has characteristic polynomial p_0(x) = x^n, then after k iterations of this process we obtain the polynomial

    p_k(x) = (I − (1/m) D)^k x^n,

where D is the derivative with respect to x. Fortunately, iterating the operator (I − αD) for any α > 0 generates a standard family of orthogonal polynomials: the associated Laguerre polynomials [7]. These polynomials are very well studied and the locations of their zeros are known; in particular, after k = dn iterations the ratio of the largest to the smallest zero is known [7] to be

    (d + 1 + 2√d)/(d + 1 − 2√d),

which is exactly what we want.

³ For concreteness, we remark that this 'average' vector would be precisely v_avg = (1/√m) Σ_j u_j.

To prove the theorem, we will show that we can choose a sequence of actual vectors that realizes the expected behavior (i.e., the behavior of repeatedly adding v_avg), as long as we are allowed to add arbitrary fractional amounts of the v_i v_i^T via the weights s_i ≥ 0.
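The polynomial iteration above can be run directly. The sketch below (our numerical experiment; the values of n and m are arbitrary choices, and the target ratio is attained only asymptotically as n grows) applies p ← p − (1/m)p' for k = dn steps starting from x^n and inspects the ratio of the extreme zeros.

```python
import numpy as np

# Iterate p <- (I - D/m) p starting from p(x) = x^n, as in the 'expected
# behavior' heuristic.  After k = dn steps the ratio of extreme zeros should
# approach (d + 1 + 2*sqrt(d)) / (d + 1 - 2*sqrt(d)) as n grows.
n, d = 20, 4
m = n                                   # m only sets the scale of the zeros
p = np.polynomial.Polynomial([0.0] * n + [1.0])   # x^n

for _ in range(d * n):
    p = p - p.deriv() / m               # one application of (I - D/m)

roots = np.sort(p.roots().real)         # zeros are real and positive here
ratio = roots[-1] / roots[0]
target = (d + 1 + 2 * np.sqrt(d)) / (d + 1 - 2 * np.sqrt(d))   # = 9 for d = 4
# For finite n the ratio sits somewhat below the asymptotic target.
assert roots.min() > 0 and 1.0 < ratio < target + 0.5
```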
We will control the eigenvalues of our matrix by maintaining two barriers, as in the physical model, and keeping the eigenvalues between them. The lower barrier will 'repel' the eigenvalues forward; the upper one will make sure they do not go too far. The barriers will move forward at a steady pace. By maintaining that the total 'repulsion' at every step of this process is bounded, we will be able to guarantee that there is always some multiple of a vector to add that allows us to continue the process.

3.2 Proof by Barrier Functions

We begin by defining two 'barrier' potential functions which measure the quality of the eigenvalues of a matrix. These potential functions are inspired by the inverse law of repulsion in the physical model discussed in the last section.

Definition 3.2. For u, l ∈ R and A a symmetric matrix with eigenvalues λ_1, λ_2, …, λ_n, define:

    Φ^u(A) := Tr(uI − A)^{−1} = Σ_i 1/(u − λ_i)    (Upper potential)
    Φ_l(A) := Tr(A − lI)^{−1} = Σ_i 1/(λ_i − l)    (Lower potential)

As long as A ≺ uI and A ≻ lI (i.e., λ_max(A) < u and λ_min(A) > l), these potential functions measure how far the eigenvalues of A are from the barriers u and l. In particular, they blow up as any eigenvalue approaches a barrier, since then uI − A (or A − lI) approaches a singular matrix. Their strength lies in that they reflect the locations of all the eigenvalues simultaneously: for instance, Φ^u(A) ≤ 1 implies that no λ_i is within distance one of u, no two λ_i's are within distance two, no k of them are within distance k, and so on. In terms of the physical model, the upper potential Φ^u(A) is equal to the total repulsion of the eigenvalues of A from the upper barrier u, while Φ_l(A) is the analogous quantity for the lower barrier.

To prove the theorem, we will build the sum Σ_i s_i v_i v_i^T iteratively, adding one vector at a time.
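The two potentials translate directly into code. As a sketch (function names are ours), the snippet below implements Definition 3.2 and checks that at the starting point A = 0, with barriers at u_0 = n/ε_U and l_0 = −n/ε_L as chosen later in the proof of Theorem 3.1, the potentials equal ε_U and ε_L exactly.

```python
import numpy as np

def upper_potential(A, u):
    """Phi^u(A) = Tr (uI - A)^{-1}; requires lambda_max(A) < u."""
    lam = np.linalg.eigvalsh(A)
    assert lam.max() < u
    return np.sum(1.0 / (u - lam))

def lower_potential(A, l):
    """Phi_l(A) = Tr (A - lI)^{-1}; requires lambda_min(A) > l."""
    lam = np.linalg.eigvalsh(A)
    assert lam.min() > l
    return np.sum(1.0 / (lam - l))

# At A = 0 with the barrier placement used in the proof of Theorem 3.1,
# the potentials are exactly eps_U and eps_L.
n, d = 5, 4.0
eps_U = (np.sqrt(d) - 1) / (d + np.sqrt(d))
eps_L = 1.0 / np.sqrt(d)
A0 = np.zeros((n, n))
assert np.isclose(upper_potential(A0, n / eps_U), eps_U)
assert np.isclose(lower_potential(A0, -n / eps_L), eps_L)
```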
Specifically, we will construct a sequence of matrices

    0 = A^(0), A^(1), …, A^(Q),

along with positive constants⁴ u_0, l_0, δ_U, δ_L, ε_U and ε_L which satisfy the following conditions:

(a) Initially, the barriers are at u = u_0 and l = l_0, and the potentials are Φ^{u_0}(A^(0)) = ε_U and Φ_{l_0}(A^(0)) = ε_L.

(b) Each matrix is obtained by a rank-one update of the previous one, specifically by adding a positive multiple of an outer product of some v_i:

    A^(q+1) = A^(q) + t vv^T    for some v ∈ {v_i} and t ≥ 0.

(c) If we increment the barriers u and l by δ_U and δ_L respectively at each step, then the upper and lower potentials do not increase. For every q = 0, 1, …, Q:

    Φ^{u+δ_U}(A^(q+1)) ≤ Φ^u(A^(q)) ≤ ε_U    for u = u_0 + qδ_U,
    Φ_{l+δ_L}(A^(q+1)) ≤ Φ_l(A^(q)) ≤ ε_L    for l = l_0 + qδ_L.

(d) No eigenvalue ever jumps across a barrier. For every q = 0, 1, …, Q:

    λ_max(A^(q)) < u_0 + qδ_U    and    λ_min(A^(q)) > l_0 + qδ_L.

To complete the proof, we will choose u_0, l_0, δ_U, δ_L, ε_U and ε_L so that after Q = dn steps, the condition number of A^(Q) is bounded by

    λ_max(A^(Q)) / λ_min(A^(Q)) ≤ (u_0 + dnδ_U)/(l_0 + dnδ_L) = (d + 1 + 2√d)/(d + 1 − 2√d).

By construction, A^(Q) is a weighted sum of at most dn of the vectors, as desired.

The main technical challenge is to show that conditions (b) and (c) can be satisfied simultaneously, i.e., that there is always a choice of vv^T to add to the current matrix which allows us to shift both barriers up by a constant without increasing either potential. We achieve this in the following three lemmas.

The first lemma concerns shifting the upper barrier. If we shift u forward to u + δ_U without changing the matrix A, then the upper potential Φ^u(A) decreases, since the eigenvalues λ_i do not move and u moves away from them.
This gives us room to add some multiple of a vector, t vv^T, which will move the λ_i towards u and increase the potential, counteracting the initial decrease due to shifting. The following lemma quantifies exactly how much of a given vv^T we can add without increasing the potential beyond its original value before shifting.

⁴ On first reading the paper, we suggest the reader follow the proof with the assignment ε_U = ε_L = 1, u_0 = n, l_0 = −n, δ_U = 2, δ_L = 1/3. This will provide the bound (6d + 1)/(d − 1), and eliminates the need to use Claim 3.6.

Lemma 3.3 (Upper Barrier Shift). Suppose λ_max(A) < u, and v is any vector. If

    1/t ≥ v^T((u + δ_U)I − A)^{−2} v / (Φ^u(A) − Φ^{u+δ_U}(A)) + v^T((u + δ_U)I − A)^{−1} v =: U_A(v),

then Φ^{u+δ_U}(A + t vv^T) ≤ Φ^u(A) and λ_max(A + t vv^T) < u + δ_U. That is, if we add t times vv^T to A and shift the upper barrier by δ_U, then we do not increase the upper potential.

We remark that U_A(v) is linear in the outer product vv^T.

Proof. Let u' = u + δ_U. By the Sherman-Morrison formula, we can write the updated potential as:

    Φ^{u+δ_U}(A + t vv^T) = Tr(u'I − A − t vv^T)^{−1}
      = Tr[(u'I − A)^{−1} + t (u'I − A)^{−1} vv^T (u'I − A)^{−1} / (1 − t v^T(u'I − A)^{−1} v)]
      = Tr(u'I − A)^{−1} + t Tr(v^T(u'I − A)^{−1}(u'I − A)^{−1} v) / (1 − t v^T(u'I − A)^{−1} v)
        (since Tr is linear and Tr(XY) = Tr(YX))
      = Φ^{u+δ_U}(A) + t v^T(u'I − A)^{−2} v / (1 − t v^T(u'I − A)^{−1} v)
      = Φ^u(A) − (Φ^u(A) − Φ^{u+δ_U}(A)) + v^T(u'I − A)^{−2} v / (1/t − v^T(u'I − A)^{−1} v).

As U_A(v) > v^T(u'I − A)^{−1} v, the last term is finite for 1/t ≥ U_A(v). By now substituting any 1/t ≥ U_A(v), we find Φ^{u+δ_U}(A + t vv^T) ≤ Φ^u(A). This also tells us that λ_max(A + t vv^T) < u + δ_U: if this were not the case, then there would be some positive t' ≤ t for which λ_max(A + t' vv^T) = u + δ_U.
But at such a t', Φ^{u+δ_U}(A + t' vv^T) would blow up, and we have just established that it is finite.

The second lemma is about shifting the lower barrier. Here, shifting l forward to l + δ_L while keeping A fixed has the opposite effect: it increases the lower potential Φ_l(A), since the barrier l moves towards the eigenvalues λ_i. Adding a multiple of a vector, t vv^T, will move the λ_i forward and away from the barrier, decreasing the potential. Here we quantify exactly how much of a given vv^T we need to add to compensate for the initial increase from shifting l, and return the potential to its original value before the shift.

Lemma 3.4 (Lower Barrier Shift). Suppose λ_min(A) > l, Φ_l(A) ≤ 1/δ_L, and v is any vector. If

    0 < 1/t ≤ v^T(A − (l + δ_L)I)^{−2} v / (Φ_{l+δ_L}(A) − Φ_l(A)) − v^T(A − (l + δ_L)I)^{−1} v =: L_A(v),

then Φ_{l+δ_L}(A + t vv^T) ≤ Φ_l(A) and λ_min(A + t vv^T) > l + δ_L. That is, if we add t times vv^T to A and shift the lower barrier by δ_L, then we do not increase the lower potential.

Proof. First, observe that λ_min(A) > l and Φ_l(A) ≤ 1/δ_L imply that λ_min(A) > l + δ_L. So, for every t > 0, λ_min(A + t vv^T) > l + δ_L. Now proceed as in the proof for the upper potential. Let l' = l + δ_L. By Sherman-Morrison, we have:

    Φ_{l+δ_L}(A + t vv^T) = Tr(A + t vv^T − l'I)^{−1}
      = Tr[(A − l'I)^{−1} − t (A − l'I)^{−1} vv^T (A − l'I)^{−1} / (1 + t v^T(A − l'I)^{−1} v)]
      = Tr(A − l'I)^{−1} − t Tr(v^T(A − l'I)^{−1}(A − l'I)^{−1} v) / (1 + t v^T(A − l'I)^{−1} v)
      = Φ_{l+δ_L}(A) − t v^T(A − l'I)^{−2} v / (1 + t v^T(A − l'I)^{−1} v)
      = Φ_l(A) + (Φ_{l+δ_L}(A) − Φ_l(A)) − v^T(A − l'I)^{−2} v / (1/t + v^T(A − l'I)^{−1} v).

Rearranging shows that Φ_{l+δ_L}(A + t vv^T) ≤ Φ_l(A) when 1/t ≤ L_A(v).
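The quantities U_A(v) and L_A(v) from the two lemmas can be written down verbatim. Below is a minimal sketch (function names are ours) that implements both and numerically checks the conclusion of Lemma 3.3: adding t vv^T with 1/t = U_A(v) and shifting the upper barrier by δ_U does not increase the upper potential.

```python
import numpy as np

def U_A(A, v, u, dU):
    """U_A(v) from Lemma 3.3, with shifted barrier u' = u + dU."""
    n = len(A)
    M = np.linalg.inv((u + dU) * np.eye(n) - A)         # (u'I - A)^{-1}
    dphi = np.trace(np.linalg.inv(u * np.eye(n) - A)) - np.trace(M)
    return (v @ M @ M @ v) / dphi + v @ M @ v

def L_A(A, v, l, dL):
    """L_A(v) from Lemma 3.4, with shifted barrier l' = l + dL."""
    n = len(A)
    M = np.linalg.inv(A - (l + dL) * np.eye(n))         # (A - l'I)^{-1}
    dphi = np.trace(M) - np.trace(np.linalg.inv(A - l * np.eye(n)))
    return (v @ M @ M @ v) / dphi - v @ M @ v

# Check Lemma 3.3 on a random PSD matrix with lambda_max well below u.
rng = np.random.default_rng(2)
n, u, dU = 5, 10.0, 1.0
X = rng.standard_normal((n, n))
A = X @ X.T / n
v = rng.standard_normal(n); v /= np.linalg.norm(v)

t = 1.0 / U_A(A, v, u, dU)
A2 = A + t * np.outer(v, v)
phi_before = np.trace(np.linalg.inv(u * np.eye(n) - A))
phi_after = np.trace(np.linalg.inv((u + dU) * np.eye(n) - A2))
assert phi_after <= phi_before + 1e-9
assert np.linalg.eigvalsh(A2).max() < u + dU
```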
The third lemma identifies the conditions under which w e can find a single t vv T whic h allo ws us to main tain b oth potentials while shifting barriers, and thereby contin ue the pro- cess. The pro of that suc h a v ector exists is by an av eraging argumen t, so this can b e seen as the step in whic h we relate the b eha vior of actual vectors to the b eha vior of the exp ected v ector v avg . Notice that the use of v ariable w eights t , from which the ev entual s i arise, is crucial to this part of the pro of. Lemma 3.5 (Both Barriers) . If λ max ( A ) < u , λ min ( A ) > l , Φ u ( A ) ≤ U , Φ l ( A ) ≤ L , and U , L , δ U and δ L satisfy 0 ≤ 1 δ U + U ≤ 1 δ L − L (3) then ther e exists an i and p ositive t for which L A ( v i ) ≥ 1 /t ≥ U A ( v i ) , λ max ( A + t v i v T i ) < u + δ U , and λ min ( A + t v i v T i ) > l + δ L . Pr o of. W e will show that X i L A ( v i ) ≥ X i U A ( v i ) , 12 from whic h the claim will follow b y Lemmas 3.3 and 3.4. W e begin b y b ounding X i U A ( v i ) = P i v T i (( u + δ U ) I − A ) − 2 v i Φ u ( A ) − Φ u + δ U ( A ) + X i v T i (( u + δ U ) I − A ) − 1 v i = (( u + δ U ) I − A ) − 2 • ( P i v i v T i ) Φ u ( A ) − Φ u + δ U ( A ) + (( u + δ U ) I − A ) − 1 • X i v i v T i ! = T r(( u + δ U ) I − A ) − 2 Φ u ( A ) − Φ u + δ U ( A ) + T r(( u + δ U ) I − A ) − 1 since X i v i v T i = I and X • I = T r( X ) = P i ( u + δ U − λ i ) − 2 P i ( u − λ i ) − 1 − P i ( u + δ U − λ i ) − 1 + Φ u + δ U ( A ) = P i ( u + δ U − λ i ) − 2 δ U P i ( u − λ i ) − 1 ( u + δ U − λ i ) − 1 + Φ u + δ U ( A ) ≤ 1 /δ U + Φ u + δ U ( A ) , as X i ( u − λ i ) − 1 ( u + δ U − λ i ) − 1 ≥ X i ( u + δ U − λ i ) − 2 ≤ 1 /δ U + Φ u ( A ) ≤ 1 /δ U + U . On the other hand, we ha ve X i L A ( v i ) = P i v T i (( A − ( l + δ L )) − 2 v i Φ l + δ L ( A ) − Φ l ( A ) − X i v T i ( A − ( l + δ L ) I ) − 1 v i = ( A − ( l + δ L ) I ) − 2 • ( P i v i v T i ) Φ l + δ L ( A ) − Φ l ( A ) − ( A − ( l + δ L ) I ) − 1 • X i v i v T i ! 
      = Tr(A − (l + δ_L)I)^{−2} / (Φ_{l+δ_L}(A) − Φ_l(A)) − Tr(A − (l + δ_L)I)^{−1}
        (since Σ_i v_i v_i^T = I and X • I = Tr(X))
      = Σ_i (λ_i − l − δ_L)^{−2} / (Σ_i (λ_i − l − δ_L)^{−1} − Σ_i (λ_i − l)^{−1}) − Σ_i (λ_i − l − δ_L)^{−1}
      ≥ 1/δ_L − Σ_i (λ_i − l)^{−1}    (by Claim 3.6)
      ≥ 1/δ_L − ε_L.

Putting these together, we find that

    Σ_i U_A(v_i) ≤ 1/δ_U + ε_U ≤ 1/δ_L − ε_L ≤ Σ_i L_A(v_i),

as desired.

Claim 3.6. If λ_i > l for all i, 0 ≤ Σ_i (λ_i − l)^{−1} ≤ ε_L, and 1/δ_L − ε_L ≥ 0, then

    Σ_i (λ_i − l − δ_L)^{−2} / (Σ_i (λ_i − l − δ_L)^{−1} − Σ_i (λ_i − l)^{−1}) − Σ_i 1/(λ_i − l − δ_L) ≥ 1/δ_L − Σ_i 1/(λ_i − l).    (4)

Proof. We have δ_L ≤ 1/ε_L ≤ λ_i − l for every i. So the denominator of the left-most term on the left-hand side is positive, and the claimed inequality is equivalent to

    Σ_i (λ_i − l − δ_L)^{−2} ≥ (Σ_i 1/(λ_i − l − δ_L) − Σ_i 1/(λ_i − l)) · (1/δ_L + Σ_i 1/(λ_i − l − δ_L) − Σ_i 1/(λ_i − l))
      = (δ_L Σ_i 1/((λ_i − l − δ_L)(λ_i − l))) · (1/δ_L + δ_L Σ_i 1/((λ_i − l − δ_L)(λ_i − l)))
      = Σ_i 1/((λ_i − l − δ_L)(λ_i − l)) + (δ_L Σ_i 1/((λ_i − l − δ_L)(λ_i − l)))^2,

which, by moving the first term on the RHS to the LHS, is just

    δ_L Σ_i 1/((λ_i − l − δ_L)^2 (λ_i − l)) ≥ (δ_L Σ_i 1/((λ_i − l − δ_L)(λ_i − l)))^2.

By Cauchy-Schwarz,

    (δ_L Σ_i 1/((λ_i − l − δ_L)(λ_i − l)))^2
      ≤ (δ_L Σ_i 1/(λ_i − l)) (δ_L Σ_i 1/((λ_i − l − δ_L)^2 (λ_i − l)))
      ≤ (δ_L ε_L) (δ_L Σ_i 1/((λ_i − l − δ_L)^2 (λ_i − l)))    (since Σ_i (λ_i − l)^{−1} ≤ ε_L)
      ≤ δ_L Σ_i 1/((λ_i − l − δ_L)^2 (λ_i − l))    (since 1/δ_L − ε_L ≥ 0),

and so (4) is established.

Proof of Theorem 3.1. All we need to do now is set ε_U, ε_L, δ_U, and δ_L in a manner that satisfies Lemma 3.5 and gives a good bound on the condition number.
Then we can take A^(0) = 0 and construct A^(q+1) from A^(q) by choosing any vector v_i with L_{A^(q)}(v_i) ≥ U_{A^(q)}(v_i) (such a vector is guaranteed to exist by Lemma 3.5) and setting A^(q+1) = A^(q) + t v_i v_i^T for any t ≥ 0 satisfying

    L_{A^(q)}(v_i) ≥ 1/t ≥ U_{A^(q)}(v_i).

It is sufficient to take

    δ_L = 1,    ε_L = 1/√d,    l_0 = −n/ε_L,
    δ_U = (√d + 1)/(√d − 1),    ε_U = (√d − 1)/(d + √d),    u_0 = n/ε_U.

We can check that

    1/δ_U + ε_U = (√d − 1)/(√d + 1) + (√d − 1)/(√d(√d + 1)) = 1 − 1/√d = 1/δ_L − ε_L,

so that (3) is satisfied. The initial potentials are Φ^{u_0}(0) = ε_U and Φ_{l_0}(0) = ε_L. After dn steps, we have

    λ_max(A^(dn)) / λ_min(A^(dn)) ≤ (n/ε_U + dnδ_U) / (−n/ε_L + dnδ_L)
      = [(d + √d)/(√d − 1) + d(√d + 1)/(√d − 1)] / (d − √d)
      = (d + 2√d + 1)/(d − 2√d + 1),

as desired.

To turn this proof into an algorithm, one must first compute the vectors v_i, which can be done in time O(n^2 m). For each iteration of the algorithm, we must compute ((u + δ_U)I − A)^{−1} and ((u + δ_U)I − A)^{−2}, and the same matrices for the lower potential function. This computation can be performed in time O(n^3). Finally, we can decide which edge to add in each iteration by computing U_A(v_i) and L_A(v_i) for each edge, which can be done in time O(n^2 m). As we run for dn iterations, the total time of the algorithm is O(dn^3 m).

4 Sparsifiers of the Complete Graph

Let G = (V, E) be the complete graph on n vertices, and let H = (V, F, w) be a weighted graph of average degree d that (1 + ε)-approximates G. As x^T L_G x = n‖x‖^2 for every x orthogonal to 1, it is immediate that every vertex of H has weighted degree between n and (1 + ε)n. Thus, one should think of H as being an expander graph in which each edge weight has been multiplied by n/d. As H is weighted and can be irregular, it may at first seem strange to view it as an expander.
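The whole construction fits in a short, unoptimized sketch. The code below (our illustrative implementation; variable and function names are ours) runs Q = dn barrier steps with the parameter settings from the proof, picking at each step a vector with L_A(v_i) ≥ U_A(v_i) and a weight t with 1/t between the two bounds, and then checks the final condition-number guarantee on random isotropic vectors.

```python
import numpy as np

def bss_sparsify(V, d):
    """V: m x n matrix whose rows v_i satisfy sum_i v_i v_i^T = I_n.
    Returns nonnegative weights s with at most d*n nonzeros."""
    m, n = V.shape
    sd = np.sqrt(d)
    dL, eL, l = 1.0, 1.0 / sd, -n * sd          # delta_L, eps_L, l0 = -n/eps_L
    dU = (sd + 1) / (sd - 1)                    # delta_U
    eU = (sd - 1) / (d + sd)                    # eps_U
    u = n / eU                                  # u0
    A = np.zeros((n, n))
    s = np.zeros(m)
    I = np.eye(n)
    for _ in range(int(d * n)):
        Mu = np.linalg.inv((u + dU) * I - A)    # (u'I - A)^{-1}
        Ml = np.linalg.inv(A - (l + dL) * I)    # (A - l'I)^{-1}
        dphi_u = np.trace(np.linalg.inv(u * I - A)) - np.trace(Mu)
        dphi_l = np.trace(Ml) - np.trace(np.linalg.inv(A - l * I))
        VMu, VMl = V @ Mu, V @ Ml               # rows: v_i^T Mu, v_i^T Ml
        UA = (VMu * VMu).sum(1) / dphi_u + (VMu * V).sum(1)
        LA = (VMl * VMl).sum(1) / dphi_l - (VMl * V).sum(1)
        i = int(np.argmax(LA - UA))             # Lemma 3.5: LA[i] >= UA[i]
        t = 2.0 / (UA[i] + LA[i])               # 1/t lies between UA[i] and LA[i]
        A += t * np.outer(V[i], V[i])
        s[i] += t
        u += dU
        l += dL
    return s

# Rows of a matrix with orthonormal columns are isotropic: sum_i v_i v_i^T = I.
rng = np.random.default_rng(3)
n, m, d = 8, 60, 4
Q, _ = np.linalg.qr(rng.standard_normal((m, n)))
s = bss_sparsify(Q, d)
A = Q.T @ (s[:, None] * Q)                      # sum_i s_i v_i v_i^T
lam = np.linalg.eigvalsh(A)
kappa = (d + 1 + 2 * np.sqrt(d)) / (d + 1 - 2 * np.sqrt(d))   # = 9 for d = 4
assert np.count_nonzero(s) <= d * n
assert lam.min() > 0 and lam.max() / lam.min() <= kappa + 1e-6
```

Rescaling A by 1/λ_min then gives the sandwiching id ⪯ A ⪯ κ·id of Theorem 3.1.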
However, it may easily be shown to have the properties that define expanders: it has high edge-conductance, random walks mix rapidly on H and converge to an almost-uniform distribution, and it satisfies the Expander Mixing Property (see [2] or [10, Lemma 2.5]). High edge-conductance and rapid mixing would not be so interesting if the weighted degrees were not nearly uniform: for example, the star graph has both of these properties, but the random walk on the star graph converges to a very non-uniform distribution, and the star does not satisfy the Expander Mixing Property. For the convenience of the reader, we include a proof that H has the Expander Mixing Property below.

Lemma 4.1. Let H = (V, E, w) be a graph that (1 + ε)-approximates the complete graph G on V. Then, for every pair of disjoint sets S and T,

    |w(S, T) − (1 + ε/2)|S||T|| ≤ n(ε/2)√(|S||T|),

where w(S, T) denotes the sum of the weights of edges between S and T.

Proof. We have

    −(ε/2) L_G ⪯ L_H − (1 + ε/2) L_G ⪯ (ε/2) L_G,

so we can write L_H = (1 + ε/2) L_G + M, where M is a matrix of norm at most (ε/2)‖L_G‖ ≤ nε/2. Let x be the characteristic vector of S, and let y be the characteristic vector of T. We have −w(S, T) = x^T L_H y. As G is the complete graph and S and T are disjoint, we also know x^T L_G y = −|S||T|. Thus,

    x^T L_H y = (1 + ε/2) x^T L_G y + x^T M y = −(1 + ε/2)|S||T| + x^T M y.

The lemma now follows by observing that

    |x^T M y| ≤ ‖M‖ ‖x‖ ‖y‖ ≤ n(ε/2)√(|S||T|).

Using the proof of the lower bound on the spectral gap of Alon and Boppana (see [14]), one can show that a d-regular unweighted graph cannot κ-approximate a complete graph for κ asymptotically better than (2). We conjecture that this bound also holds for weighted graphs of average degree d. Presently, we prove the following weaker result for such graphs.

Proposition 4.2.
Let $G$ be the complete graph on vertex set $V$, and let $H = (V, E, w)$ be a weighted graph with $n$ vertices and a vertex of degree $d$. If $H$ $\kappa$-approximates $G$, then
\[
\kappa \geq 1 + \frac{2}{\sqrt{d}} - O\!\left(\frac{\sqrt{d}}{n}\right).
\]

Proof. We use a standard approach. Suppose $H$ is a $\kappa$-approximation of the complete graph. We will construct vectors $x^*$ and $y^*$ orthogonal to the $\mathbf{1}$ vector so that
\[
\frac{y^{*T} L_H y^*}{x^{*T} L_H x^*} \cdot \frac{\|x^*\|^2}{\|y^*\|^2}
\]
is large, and this will give us a lower bound on $\kappa$. Let $v_0$ be the vertex of degree $d$, and let its neighbors be $v_1, \ldots, v_d$. Suppose $v_i$ is connected to $v_0$ by an edge of weight $w_i$, and the total weight of the edges between $v_i$ and vertices other than $v_0, v_1, \ldots, v_d$ is $\delta_i$. We begin by considering vectors $x$ and $y$ with
\[
x(u) = \begin{cases} 1 & \text{for } u = v_0, \\ 1/\sqrt{d} & \text{for } u = v_i,\ i \geq 1, \\ 0 & \text{for } u \notin \{v_0, \ldots, v_d\}, \end{cases}
\qquad
y(u) = \begin{cases} 1 & \text{for } u = v_0, \\ -1/\sqrt{d} & \text{for } u = v_i,\ i \geq 1, \\ 0 & \text{for } u \notin \{v_0, \ldots, v_d\}. \end{cases}
\]
These vectors are not orthogonal to $\mathbf{1}$, but we will take care of that later. It is easy to compute the values taken by the quadratic form at $x$ and $y$:
\[
x^T L_H x = \sum_{i=1}^d w_i (1 - 1/\sqrt{d})^2 + \sum_{i=1}^d \delta_i (1/\sqrt{d} - 0)^2
= \sum_{i=1}^d w_i + \sum_{i=1}^d (\delta_i + w_i)/d - 2\sum_{i=1}^d w_i/\sqrt{d}
\]
and
\[
y^T L_H y = \sum_{i=1}^d w_i (1 + 1/\sqrt{d})^2 + \sum_{i=1}^d \delta_i (-1/\sqrt{d} - 0)^2
= \sum_{i=1}^d w_i + \sum_{i=1}^d (\delta_i + w_i)/d + 2\sum_{i=1}^d w_i/\sqrt{d}.
\]
The ratio in question is thus
\[
\frac{y^T L_H y}{x^T L_H x}
= \frac{\sum_i w_i + \sum_i (\delta_i + w_i)/d + 2\sum_i w_i/\sqrt{d}}{\sum_i w_i + \sum_i (\delta_i + w_i)/d - 2\sum_i w_i/\sqrt{d}}
= \frac{1 + \frac{1}{\sqrt{d}} \cdot \frac{2\sum_i w_i}{\sum_i w_i + \sum_i (\delta_i + w_i)/d}}{1 - \frac{1}{\sqrt{d}} \cdot \frac{2\sum_i w_i}{\sum_i w_i + \sum_i (\delta_i + w_i)/d}}.
\]
Since $H$ is a $\kappa$-approximation, all weighted degrees must lie between $n$ and $n\kappa$, which gives
\[
\frac{2\sum_i w_i}{\sum_i w_i + \sum_i (\delta_i + w_i)/d}
= \frac{2}{1 + \frac{\sum_i (\delta_i + w_i)/d}{\sum_i w_i}}
\geq \frac{2}{1 + \kappa}.
\]
Therefore,
\[
\frac{y^T L_H y}{x^T L_H x} \geq \frac{1 + \frac{1}{\sqrt{d}} \cdot \frac{2}{1+\kappa}}{1 - \frac{1}{\sqrt{d}} \cdot \frac{2}{1+\kappa}}.
\]
(5)

Let $x^*$ and $y^*$ be the projections of $x$ and $y$, respectively, orthogonal to the $\mathbf{1}$ vector. Then
\[
\|x^*\|^2 = \|x\|^2 - \langle x, \mathbf{1}/\sqrt{n} \rangle^2 = 2 - \frac{(1+\sqrt{d})^2}{n}
\]
and
\[
\|y^*\|^2 = \|y\|^2 - \langle y, \mathbf{1}/\sqrt{n} \rangle^2 = 2 - \frac{(1-\sqrt{d})^2}{n},
\]
so that as $n \to \infty$,
\[
\frac{\|x^*\|^2}{\|y^*\|^2} = 1 - O\!\left(\frac{\sqrt{d}}{n}\right). \qquad (6)
\]
Combining (5) and (6), we conclude that asymptotically
\[
\frac{y^{*T} L_H y^*}{x^{*T} L_H x^*} \cdot \frac{\|x^*\|^2}{\|y^*\|^2}
\geq \frac{1 + \frac{1}{\sqrt{d}} \cdot \frac{2}{1+\kappa}}{1 - \frac{1}{\sqrt{d}} \cdot \frac{2}{1+\kappa}}
\cdot \left(1 - O\!\left(\frac{\sqrt{d}}{n}\right)\right).
\]
But by our assumption the left-hand side is at most $\kappa$, so we have
\[
\kappa \geq \frac{1 + \frac{1}{\sqrt{d}} \cdot \frac{2}{1+\kappa}}{1 - \frac{1}{\sqrt{d}} \cdot \frac{2}{1+\kappa}}
\cdot \left(1 - O\!\left(\frac{\sqrt{d}}{n}\right)\right),
\]
which on rearranging gives
\[
\kappa \geq 1 + \frac{2}{\sqrt{d}} - O\!\left(\frac{\sqrt{d}}{n}\right),
\]
as desired.

5 Conclusion

We conclude by drawing a connection between Theorem 3.1 and an outstanding open problem in mathematics, the Kadison-Singer conjecture. This conjecture, which dates back to 1959, is equivalent to the well-known Paving Conjecture [1, 5] as well as to a stronger form of the restricted invertibility theorem of Bourgain and Tzafriri [4, 5]. The following formulation is due to Nik Weaver [20].

Conjecture 5.1. There are universal constants $\epsilon > 0$, $\delta > 0$, and $r \in \mathbb{N}$ for which the following statement holds. If $v_1, \ldots, v_m \in \mathbb{R}^n$ satisfy $\|v_i\| \leq \delta$ for all $i$ and
\[
\sum_{i \leq m} v_i v_i^T = I,
\]
then there is a partition $X_1, \ldots, X_r$ of $\{1, \ldots, m\}$ for which
\[
\left\| \sum_{i \in X_j} v_i v_i^T \right\| \leq 1 - \epsilon \quad \text{for every } j = 1, \ldots, r.
\]

Suppose we had a version of Theorem 3.1 which, assuming $\|v_i\| \leq \delta$, guaranteed that the scalars $s_i$ were all either $0$ or some constant $\beta > 0$, and gave a constant approximation factor $\kappa < \beta$. Then we would have
\[
I \preceq \beta \sum_{i \in S} v_i v_i^T \preceq \kappa \cdot I, \qquad \text{for } S = \{i : s_i \neq 0\},
\]
yielding a proof of Conjecture 5.1 with $r = 2$ and $\epsilon = \min\{1 - \kappa/\beta,\ 1/\beta\}$, since
\[
\left\| \sum_{i \in S} v_i v_i^T \right\| \leq \frac{\kappa}{\beta} \leq 1 - \epsilon
\]
and
\[
\left\| \sum_{i \notin S} v_i v_i^T \right\| = 1 - \lambda_{\min}\!\left(\sum_{i \in S} v_i v_i^T\right) \leq 1 - \frac{1}{\beta} \leq 1 - \epsilon.
\]
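The last display uses the identity $\|\sum_{i \notin S} v_i v_i^T\| = 1 - \lambda_{\min}(\sum_{i \in S} v_i v_i^T)$, which holds because the two partial sums add up to the identity. A quick numerical sanity check of this identity (our own illustration; the random instance and variable names are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 30, 6
B = rng.standard_normal((m, n))
w, Q = np.linalg.eigh(B.T @ B)
V = B @ (Q * w**-0.5) @ Q.T        # rows of V satisfy sum_i v_i v_i^T = I_n

S = rng.random(m) < 0.5            # an arbitrary subset S of {1, ..., m}
inS = V[S].T @ V[S]                # sum over i in S of v_i v_i^T
outS = V[~S].T @ V[~S]             # sum over i not in S of v_i v_i^T

# The two partial sums add to I, so the spectrum of one determines the other.
assert np.allclose(inS + outS, np.eye(n))
assert np.isclose(np.linalg.norm(outS, 2),
                  1 - np.linalg.eigvalsh(inS)[0])
```

Since the two partial sums add to $I$, they are simultaneously diagonalizable, and each eigenvalue of one is $1$ minus an eigenvalue of the other; the largest eigenvalue of the complement is therefore $1$ minus the smallest eigenvalue of the selected sum.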
As a special case, such a theorem would also imply the existence of unweighted sparsifiers for the complete graph and other (sufficiently dense) edge-transitive graphs. It is also worth noting that the $\|v_i\| \leq \delta$ condition, when applied to the vectors $\{\Pi_e\}_{e \in E}$ arising from a graph, simply means that the effective resistances of all edges are bounded; thus, we would be able to conclude that any graph with sufficiently small resistances can be split into two graphs that approximate it spectrally.

References

[1] C. A. Akemann and J. Anderson. Lyapunov theorems for operator algebras. Mem. Amer. Math. Soc., 94, 1991.
[2] Noga Alon and Fan Chung. Explicit construction of linear sized tolerant networks. Discrete Mathematics, 72:15–19, 1988.
[3] András A. Benczúr and David R. Karger. Approximating s-t minimum cuts in Õ(n²) time. In STOC '96, pages 47–55, 1996.
[4] J. Bourgain and L. Tzafriri. On a problem of Kadison and Singer. J. Reine Angew. Math., 420:1–43, 1991.
[5] Peter G. Casazza and Janet C. Tremain. The Kadison-Singer problem in mathematics and engineering. Proceedings of the National Academy of Sciences of the United States of America, 103(7):2032–2039, 2006.
[6] P. Chew. There is a planar graph almost as good as the complete graph. In SoCG '86, pages 169–177, 1986.
[7] H. Dette and W. J. Studden. Some new asymptotic properties for the zeros of Jacobi, Laguerre, and Hermite polynomials. Constructive Approximation, 11(2):227–238, 1995.
[8] G. H. Golub and C. F. Van Loan. Matrix Computations, 3rd Edition. The Johns Hopkins University Press, Baltimore, MD, 1996.
[9] Navin Goyal, Luis Rademacher, and Santosh Vempala. Expanders via random spanning trees. In SODA '09, pages 576–585, 2009.
[10] Shlomo Hoory, Nathan Linial, and Avi Wigderson. Expander graphs and their applications. Bulletin of the American Mathematical Society, 43(4):439–561, 2006.
[11] Rohit Khandekar, Satish Rao, and Umesh Vazirani. Graph partitioning using single commodity flows. In STOC '06, pages 385–390, 2006.
[12] A. Lubotzky, R. Phillips, and P. Sarnak. Ramanujan graphs. Combinatorica, 8(3):261–277, 1988.
[13] G. A. Margulis. Explicit group theoretical constructions of combinatorial schemes and their application to the design of expanders and concentrators. Problems of Information Transmission, 24(1):39–46, 1988.
[14] Alon Nilli. On the second eigenvalue of a graph. Discrete Mathematics, 91(2):207–210, 1991.
[15] Mark Rudelson. Random vectors in the isotropic position. J. of Functional Analysis, 163(1):60–72, 1999.
[16] Daniel A. Spielman and Nikhil Srivastava. Graph sparsification by effective resistances. In STOC '08, pages 563–568, 2008.
[17] Daniel A. Spielman and Shang-Hua Teng. Nearly-linear time algorithms for graph partitioning, graph sparsification, and solving linear systems. In STOC '04, pages 81–90, 2004.
[18] Daniel A. Spielman and Shang-Hua Teng. Nearly-linear time algorithms for preconditioning and solving symmetric, diagonally dominant linear systems. CoRR, abs/cs/0607105, 2008.
[19] Daniel A. Spielman and Shang-Hua Teng. Spectral sparsification of graphs. CoRR, abs/0808.4134, 2008.
[20] Nik Weaver. The Kadison-Singer problem in discrepancy theory. Discrete Mathematics, 278(1-3):227–239, 2004.