Grothendieck-type inequalities in combinatorial optimization

We survey connections of the Grothendieck inequality and its variants to combinatorial optimization and computational complexity.

Authors: Subhash Khot, Assaf Naor

Contents

1. Introduction
1.1. Assumptions from computational complexity
1.2. Convex and semidefinite programming
2. Applications of the classical Grothendieck inequality
2.1. Cut norm estimation
2.1.1. Szemerédi partitions
2.1.2. Frieze-Kannan matrix decomposition
2.1.3. Maximum acyclic subgraph
2.1.4. Linear equations modulo 2
2.2. Rounding
3. The Grothendieck constant of a graph
3.1. Algorithmic consequences
3.1.1. Spin glasses
3.1.2. Correlation clustering
4. Kernel clustering and the propeller conjecture
5. The L_p Grothendieck problem
6. Higher rank Grothendieck inequalities
7. Hardness of approximation
References

S. K. was partially supported by NSF CAREER grant CCF-0833228, NSF Expeditions grant CCF-0832795, an NSF Waterman award, and BSF grant 2008059. A. N. was partially supported by NSF Expeditions grant CCF-0832795, BSF grant 2006009, and the Packard Foundation.

1. Introduction

The Grothendieck inequality asserts that there exists a universal constant K ∈ (0, ∞) such that for every m, n ∈ N and every m × n matrix A = (a_ij) with real entries we have

\[
\max\bigg\{ \sum_{i=1}^{m}\sum_{j=1}^{n} a_{ij}\langle x_i, y_j\rangle : \{x_i\}_{i=1}^{m}, \{y_j\}_{j=1}^{n} \subseteq S^{n+m-1} \bigg\}
\le K \max\bigg\{ \sum_{i=1}^{m}\sum_{j=1}^{n} a_{ij}\varepsilon_i\delta_j : \{\varepsilon_i\}_{i=1}^{m}, \{\delta_j\}_{j=1}^{n} \subseteq \{-1,1\} \bigg\}. \tag{1}
\]

Here, and in what follows, the standard scalar product on R^k is denoted ⟨x, y⟩ = Σ_{i=1}^k x_i y_i, and the Euclidean sphere in R^k is denoted S^{k−1} = {x ∈ R^k : Σ_{i=1}^k x_i² = 1}. We refer to [34, 56] for the simplest known proofs of the Grothendieck inequality; see Section 2.2 for a proof of (1) yielding the best known bound on K.
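As a small numerical illustration (not part of the survey), the two maxima in (1) can be compared on a random matrix: the right-hand maximum is computed by brute force over signs, and random spherical configurations are checked against the bound that (1) provides. The constant 1.783 below is any number exceeding the upper bound on K_G quoted in (2); all function names are ours.

```python
import itertools
import math
import random

def sign_max(a):
    """Right-hand maximum in (1): max over signs eps_i, delta_j in {-1, 1}
    of sum_ij a_ij * eps_i * delta_j (brute force, exponential time)."""
    m, n = len(a), len(a[0])
    return max(sum(a[i][j] * e[i] * d[j] for i in range(m) for j in range(n))
               for e in itertools.product((-1, 1), repeat=m)
               for d in itertools.product((-1, 1), repeat=n))

def unit_vector(k, rng):
    """A uniformly random point of S^{k-1} (normalized Gaussian vector)."""
    v = [rng.gauss(0.0, 1.0) for _ in range(k)]
    s = math.sqrt(sum(x * x for x in v))
    return [x / s for x in v]

rng = random.Random(0)
m, n = 3, 3
a = [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(m)]
rhs = sign_max(a)

K_UPPER = 1.783  # any constant above the bound (2) on K_G works here
for _ in range(200):
    xs = [unit_vector(m + n, rng) for _ in range(m)]
    ys = [unit_vector(m + n, rng) for _ in range(n)]
    val = sum(a[i][j] * sum(p * q for p, q in zip(xs[i], ys[j]))
              for i in range(m) for j in range(n))
    assert val <= K_UPPER * rhs + 1e-9  # a consequence of (1)
```

Note that the random configurations only sample the left-hand maximum in (1); computing that maximum exactly is precisely the optimization problem discussed in Section 2.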
Grothendieck proved the inequality (1) in [45], though it was stated there in a different, but equivalent, form. The formulation of the Grothendieck inequality appearing in (1) is due to Lindenstrauss and Pełczyński [83]. The Grothendieck inequality is of major importance to several areas, ranging from Banach space theory to C*-algebras and quantum information theory. We will not attempt to indicate here this wide range of applications of (1), and refer instead to [83, 114, 100, 55, 37, 34, 19, 1, 40, 33, 102, 101] and the references therein. The purpose of this survey is to focus solely on applications of the Grothendieck inequality and its variants to combinatorial optimization, and to explain their connections to computational complexity.

The infimum over those K ∈ (0, ∞) for which (1) holds for all m, n ∈ N and all m × n matrices A = (a_ij) is called the Grothendieck constant, and is denoted K_G. Evaluating the exact value of K_G remains a long-standing open problem, posed by Grothendieck in [45]. In fact, even the second digit of K_G is currently unknown, though clearly this is of lesser importance than the issue of understanding the structure of matrices A and spherical configurations {x_i}_{i=1}^m, {y_j}_{j=1}^n ⊆ S^{n+m−1} which make the inequality (1) "most difficult". Following a series of investigations [45, 83, 107, 77, 78], the best known upper bound [21] on K_G is

\[ K_G < \frac{\pi}{2\log\big(1+\sqrt{2}\,\big)} = 1.782\ldots, \tag{2} \]

and the best known lower bound [105] on K_G is

\[ K_G > \frac{\pi}{2}\, e^{\eta_0^2} = 1.676\ldots, \tag{3} \]

where η_0 = 0.25573... is the unique solution of the equation

\[ 1 - 2\sqrt{\frac{2}{\pi}} \int_0^{\eta} e^{-z^2/2}\, dz = \frac{2}{\pi}\, e^{-\eta^2}. \]
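The numerical values in (2) and (3) can be reproduced directly from the closed-form expressions; the sketch below takes the stated value η_0 = 0.25573 as given rather than re-solving its defining equation.

```python
import math

# Upper bound (2): K_G < pi / (2 log(1 + sqrt(2))) = 1.782...
upper = math.pi / (2.0 * math.log(1.0 + math.sqrt(2.0)))

# Lower bound (3): K_G > (pi/2) * exp(eta_0^2) = 1.676...,
# with eta_0 = 0.25573... taken from the text.
eta0 = 0.25573
lower = (math.pi / 2.0) * math.exp(eta0 ** 2)

assert abs(upper - 1.782) < 1e-3
assert abs(lower - 1.677) < 1e-3
assert lower < upper  # the two bounds are consistent
```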
In [104] the problem of estimating K_G up to an additive error of ε ∈ (0, 1) was reduced to an optimization over a compact space, and by exhaustive search over an appropriate net it was shown that there exists an algorithm that computes K_G up to an additive error of ε ∈ (0, 1) in time exp(exp(O(1/ε³))). It does not seem likely that this approach can yield computer-assisted proofs of estimates such as (2) and (3), though to the best of our knowledge this has not been attempted.

In the above discussion we focused on the classical Grothendieck inequality (1). However, the literature contains several variants and extensions of (1) that have been introduced for various purposes and applications in the decades following Grothendieck's original work. In this survey we describe some of these variants, emphasizing relatively recent developments that yielded Grothendieck-type inequalities that are a useful tool in the design of polynomial time algorithms for computing approximate solutions of computationally hard optimization problems. In doing so, we omit some important topics, including applications of the Grothendieck inequality to communication complexity and quantum information theory. While these research directions can be viewed as dealing with a type of optimization problem, they are of a different nature than the applications described here, which belong to classical optimization theory. Connections to communication complexity have already been covered in the survey of Lee and Shraibman [81]; we refer in addition to [84, 80, 85, 86] for more information on this topic. An explanation of the relation of the Grothendieck inequality to quantum mechanics is contained in Section 19 of Pisier's survey [101], the pioneering work in this direction being that of Tsirelson [114].
An investigation of these questions from a computational complexity point of view was initiated in [28], where it was shown, for example, how to obtain a polynomial time algorithm for computing the entangled value of an XOR game based on Tsirelson's work. We hope that the developments surrounding applications of the Grothendieck inequality in quantum information theory will eventually be surveyed separately by experts in this area. Interested readers are referred to [114, 37, 28, 1, 54, 98, 102, 61, 22, 80, 86, 106, 101].

Perhaps the most influential variants of the Grothendieck inequality are its noncommutative generalizations. The noncommutative versions in [99, 49] were conjectured by Grothendieck himself [45]; additional extensions to operator spaces are extensively discussed in Pisier's survey [101]. We will not describe these developments here, even though we believe that they might have applications to optimization theory. Finally, multi-linear extensions of the Grothendieck inequality have also been investigated in the literature; see for example [115, 112, 20, 109] and especially Blei's book [19]. We will not cover this research direction since its relation to classical combinatorial optimization has not (yet?) been established, though there are recent investigations of multi-linear Grothendieck inequalities in the context of quantum information theory [98, 80].

Being a mainstay of functional analysis, the Grothendieck inequality might attract to this survey readers who are not familiar with approximation algorithms and computational complexity. We wish to encourage such readers to persist beyond this introduction so that they will be exposed to, and hopefully eventually contribute to, the use of analytic tools in combinatorial optimization.
For this reason we include Sections 1.1 and 1.2 below: two very basic introductory sections intended to quickly provide background on computational complexity and convex programming for non-experts.

1.1. Assumptions from computational complexity. At present there are few unconditional results on the limitations of polynomial time computation. The standard practice in this field is to frame an impossibility result in computational complexity by asserting that the polynomial time solvability of a certain algorithmic task would contradict a benchmark hypothesis. We briefly describe below two key hypotheses of this type.

A graph G = (V, E) is 3-colorable if there exists a partition {C_1, C_2, C_3} of V such that for every i ∈ {1, 2, 3} and u, v ∈ C_i we have {u, v} ∉ E. The P ≠ NP hypothesis asserts that there is no polynomial time algorithm that takes an n-vertex graph as input and determines whether or not it is 3-colorable. We are doing an injustice to this important question by stating it this way, since it has many far-reaching equivalent formulations. We refer to [39, 108, 31] for more information, but for non-experts it suffices to keep the above simple formulation in mind.

When we say that assuming P ≠ NP no polynomial time algorithm can perform a certain task T (e.g., evaluating the maximum of a certain function up to a predetermined error) we mean that given an algorithm ALG that performs the task T one can design an algorithm ALG′ that determines whether or not any input graph is 3-colorable while making at most polynomially many calls to the algorithm ALG, with at most polynomially many additional Turing machine steps. Thus, if ALG were a polynomial time algorithm then the same would be true for ALG′, contradicting the P ≠ NP hypothesis. Such results are called hardness results.
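As a concrete reference point, the 3-colorability predicate underlying the above formulation is straightforward to state in code. The brute-force test below takes 3^n time; the P ≠ NP hypothesis asserts precisely that no polynomial time algorithm computes the same predicate. (An illustrative sketch; not from the survey.)

```python
import itertools

def is_3_colorable(n, edges):
    """Brute-force 3-colorability: does some assignment of vertices to three
    classes C1, C2, C3 leave every edge with endpoints in different classes?
    Runs in time 3^n, so it is only usable on tiny graphs."""
    return any(all(c[u] != c[v] for (u, v) in edges)
               for c in itertools.product(range(3), repeat=n))

assert is_3_colorable(3, [(0, 1), (1, 2), (2, 0)])       # a triangle: 3 colors suffice
assert not is_3_colorable(4, [(u, v) for u in range(4)   # K4 needs 4 colors
                              for v in range(u + 1, 4)])
```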
The message that non-experts should keep in mind is that a hardness result is nothing more than the design of a new algorithm for 3-colorability, and if one accepts the P ≠ NP hypothesis then it implies that there must exist inputs on which ALG takes super-polynomial time to terminate.

The Unique Games Conjecture (UGC) asserts that for every ε ∈ (0, 1) there exists a prime p = p(ε) ∈ N such that no polynomial time algorithm can perform the following task. The input is a system of m linear equations in n variables x_1, ..., x_n, each of which has the form x_i − x_j ≡ c_ij mod p (thus the input is S ⊆ {1,...,n} × {1,...,n} and {c_ij}_{(i,j)∈S} ⊆ N). The algorithm must determine whether there exists an assignment of an integer value to each variable x_i such that at least (1 − ε)m of the equations are satisfied, or whether no assignment of such values can satisfy more than εm of the equations. If neither of these possibilities occurs, then an arbitrary output is allowed.

As in the case of P ≠ NP, saying that assuming the UGC no polynomial time algorithm can perform a certain task T is the same as designing a polynomial time algorithm that solves the above linear equations problem while making at most polynomially many calls to a "black box" that can perform the task T. The UGC was introduced in [62], though the above formulation of it, which is equivalent to the original one, is due to [64]. The use of the UGC as a hardness hypothesis has become popular over the past decade; we refer to the survey [63] for more information on this topic.

To simplify matters (while describing all the essential ideas), we allow polynomial time algorithms to be randomized. Most (if not all) of the algorithms described here can be turned into deterministic algorithms, and corresponding hardness results can be stated equally well in the context of randomized or deterministic algorithms.
We will ignore these distinctions, even though they are important. Moreover, it is widely believed that in our context these distinctions do not exist, i.e., randomness does not add computational power to polynomial time algorithms; see for example the discussion of the NP ⊄ BPP hypothesis in [11].

1.2. Convex and semidefinite programming. An important paradigm of optimization theory is that one can efficiently optimize linear functionals over compact convex sets that have a "membership oracle". A detailed exposition of this statement is contained in [46], but for the sake of completeness we now quote the precise formulation of the results that will be used in this article.

Let K ⊆ R^n be a compact convex set. We are also given a point z ∈ Q^n and two radii r, R ∈ (0, ∞) ∩ Q such that B(z, r) ⊆ K ⊆ B(z, R), where B(z, t) = {x ∈ R^n : ‖x − z‖_2 ≤ t}. In what follows, stating that an algorithm is polynomial means that we allow the running time to grow at most polynomially in the number of bits required to represent the data (z, r, R). Thus, if, say, z = 0, r = 2^{−n} and R = 2^n, then the running time will be polynomial in the dimension n.

Assume that there exists an algorithm ALG with the following properties. The input of ALG is a vector y ∈ Q^n and ε ∈ (0, 1) ∩ Q. The running time of ALG is polynomial in n and the number of bits required to represent the data (ε, y). The output of ALG is the assertion that either the distance of y from K is at most ε, or that the distance of y from the complement of K is at most ε. Then there exists an algorithm ALG′ that takes as input a vector c = (c_1, ..., c_n) ∈ Q^n and ε ∈ (0, 1) ∩ Q and outputs a vector y = (y_1, ..., y_n) ∈ R^n that is at distance at most ε from K, and such that for every x = (x_1, ..., x_n) ∈ K that is at distance greater than ε from the complement of K we have Σ_{i=1}^n c_i y_i ≥ Σ_{i=1}^n c_i x_i − ε.
The running time of ALG′ is allowed to grow at most polynomially in n and the number of bits required to represent the data (z, r, R, c, ε). This important result is due to [57]; we refer to [46] for an excellent account of this theory.

The above statement is a key tool in optimization, as it yields a polynomial time method to compute the maximum of linear functionals on a given convex body with arbitrarily good precision. We note the following special case of this method, known as semidefinite programming. Assume that n = k² and think of R^n as the space of all k × k matrices. Assume that we are given a compact convex set K ⊆ R^n that satisfies the above assumptions, and that for a given k × k matrix (c_ij) we wish to compute in polynomial time (up to a specified additive error) the maximum of Σ_{i=1}^k Σ_{j=1}^k c_ij x_ij over the set of symmetric positive semidefinite matrices (x_ij) that belong to K. This can indeed be done, since determining whether a given symmetric matrix is (approximately) positive semidefinite is an eigenvalue computation and hence can be performed in polynomial time.

The use of semidefinite programming to design approximation algorithms is by now a deep theory of fundamental importance to several areas of theoretical computer science. The Goemans-Williamson MAX-CUT algorithm [42] was a key breakthrough in this context. It is safe to say that after the discovery of this algorithm the field of approximation algorithms was transformed, and many subsequent results, including those presented in the present article, can be described as attempts to mimic the success of the Goemans-Williamson approach in other contexts.

2. Applications of the classical Grothendieck inequality

The classical Grothendieck inequality (1) has applications to algorithmic questions of central interest. These applications will be described here in some detail.
In Section 2.1 we discuss the cut norm estimation problem, whose relation to the Grothendieck inequality was first noted in [8]. This is a generic combinatorial optimization problem that contains well-studied questions as subproblems. Examples of its usefulness are presented in Sections 2.1.1, 2.1.2, 2.1.3, 2.1.4. Section 2.2 is devoted to the rounding problem, including the (algorithmic) method behind the proof of the best known upper bound on the Grothendieck constant.

2.1. Cut norm estimation. Let A = (a_ij) be an m × n matrix with real entries. The cut norm of A is defined as follows:

\[ \|A\|_{\mathrm{cut}} = \max_{\substack{S\subseteq\{1,\ldots,m\}\\ T\subseteq\{1,\ldots,n\}}} \bigg|\sum_{\substack{i\in S\\ j\in T}} a_{ij}\bigg|. \tag{4} \]

We will now explain how the Grothendieck inequality can be used to obtain a polynomial time algorithm for the following problem. The input is an m × n matrix A = (a_ij) with real entries, and the goal of the algorithm is to output in polynomial time a number α that is guaranteed to satisfy

\[ \|A\|_{\mathrm{cut}} \le \alpha \le C\|A\|_{\mathrm{cut}}, \tag{5} \]

where C is a (hopefully not too large) universal constant. A closely related algorithmic goal is to output in polynomial time two subsets S_0 ⊆ {1,...,m} and T_0 ⊆ {1,...,n} satisfying

\[ \bigg|\sum_{\substack{i\in S_0\\ j\in T_0}} a_{ij}\bigg| \ge \frac{1}{C}\|A\|_{\mathrm{cut}}. \tag{6} \]

The link to the Grothendieck inequality is made via two simple transformations. First, define an (m+1) × (n+1) matrix B = (b_ij) as follows:

\[ B = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} & -\sum_{k=1}^{n} a_{1k}\\ a_{21} & a_{22} & \cdots & a_{2n} & -\sum_{k=1}^{n} a_{2k}\\ \vdots & \vdots & \ddots & \vdots & \vdots\\ a_{m1} & a_{m2} & \cdots & a_{mn} & -\sum_{k=1}^{n} a_{mk}\\ -\sum_{\ell=1}^{m} a_{\ell 1} & -\sum_{\ell=1}^{m} a_{\ell 2} & \cdots & -\sum_{\ell=1}^{m} a_{\ell n} & \sum_{k=1}^{n}\sum_{\ell=1}^{m} a_{\ell k} \end{pmatrix}. \tag{7} \]

Observe that

\[ \|A\|_{\mathrm{cut}} = \|B\|_{\mathrm{cut}}. \tag{8} \]

Indeed, for every S ⊆ {1,...,m+1} and T ⊆ {1,...,n+1} define S* ⊆ {1,...,m} and T* ⊆ {1,...,n} by

\[ S^* = \begin{cases} S & \text{if } m+1\notin S,\\ \{1,\ldots,m\}\smallsetminus S & \text{if } m+1\in S, \end{cases} \qquad\text{and}\qquad T^* = \begin{cases} T & \text{if } n+1\notin T,\\ \{1,\ldots,n\}\smallsetminus T & \text{if } n+1\in T. \end{cases} \]

One checks that for all S ⊆ {1,...,m+1} and T ⊆ {1,...,n+1} we have

\[ \bigg|\sum_{\substack{i\in S\\ j\in T}} b_{ij}\bigg| = \bigg|\sum_{\substack{i\in S^*\\ j\in T^*}} a_{ij}\bigg|, \]

implying (8). We next claim that

\[ \|B\|_{\mathrm{cut}} = \frac{1}{4}\|B\|_{\infty\to 1}, \tag{9} \]

where

\[ \|B\|_{\infty\to 1} = \max\bigg\{ \sum_{i=1}^{m+1}\sum_{j=1}^{n+1} b_{ij}\varepsilon_i\delta_j : \{\varepsilon_i\}_{i=1}^{m+1}, \{\delta_j\}_{j=1}^{n+1} \subseteq \{-1,1\} \bigg\}. \tag{10} \]

To explain this notation, observe that ‖B‖_{∞→1} is the norm of B when viewed as a linear operator from ℓ_∞^{n+1} to ℓ_1^{m+1}. Here, and in what follows, for p ∈ [1, ∞] and k ∈ N the space ℓ_p^k is R^k equipped with the ℓ_p norm ‖·‖_p, where ‖x‖_p^p = Σ_{ℓ=1}^k |x_ℓ|^p for x = (x_1,...,x_k) ∈ R^k (for p = ∞ we set as usual ‖x‖_∞ = max_{i∈{1,...,k}} |x_i|). Though it is important, this operator theoretic interpretation of the quantity ‖B‖_{∞→1} will not have any role in this survey, so it may be harmlessly ignored at first reading.

The proof of (9) is simple: for {ε_i}_{i=1}^{m+1}, {δ_j}_{j=1}^{n+1} ⊆ {−1, 1} define S_+, S_− ⊆ {1,...,m+1} and T_+, T_− ⊆ {1,...,n+1} by setting S_± = {i ∈ {1,...,m+1} : ε_i = ±1} and T_± = {j ∈ {1,...,n+1} : δ_j = ±1}. Then

\[ \sum_{i=1}^{m+1}\sum_{j=1}^{n+1} b_{ij}\varepsilon_i\delta_j = \sum_{\substack{i\in S_+\\ j\in T_+}} b_{ij} + \sum_{\substack{i\in S_-\\ j\in T_-}} b_{ij} - \sum_{\substack{i\in S_+\\ j\in T_-}} b_{ij} - \sum_{\substack{i\in S_-\\ j\in T_+}} b_{ij} \le 4\|B\|_{\mathrm{cut}}. \tag{11} \]

This shows that ‖B‖_{∞→1} ≤ 4‖B‖_cut (for any matrix B, actually, not just the specific choice in (7); we will use this observation later, in Section 2.1.3). In the reverse direction, given S ⊆ {1,...,m+1} and T ⊆ {1,...,n+1}, define for i ∈ {1,...,m+1} and j ∈ {1,...,n+1},

\[ \varepsilon_i = \begin{cases} 1 & \text{if } i\in S,\\ -1 & \text{if } i\notin S, \end{cases} \qquad\text{and}\qquad \delta_j = \begin{cases} 1 & \text{if } j\in T,\\ -1 & \text{if } j\notin T. \end{cases} \]
Then, since the sum of each row and each column of B vanishes,

\[ \sum_{\substack{i\in S\\ j\in T}} b_{ij} = \sum_{i=1}^{m+1}\sum_{j=1}^{n+1} b_{ij}\,\frac{1+\varepsilon_i}{2}\cdot\frac{1+\delta_j}{2} = \frac{1}{4}\sum_{i=1}^{m+1}\sum_{j=1}^{n+1} b_{ij}\varepsilon_i\delta_j \le \frac{1}{4}\|B\|_{\infty\to 1}. \]

This completes the proof of (9). We summarize the above simple transformations in the following lemma.

Lemma 2.1. Let A = (a_ij) be an m × n matrix with real entries and let B = (b_ij) be the (m+1) × (n+1) matrix given in (7). Then ‖A‖_cut = ¼‖B‖_{∞→1}.

A consequence of Lemma 2.1 is that the problem of approximating ‖A‖_cut in polynomial time is equivalent to the problem of approximating ‖A‖_{∞→1} in polynomial time, in the sense that any algorithm for one of these problems can be used to obtain an algorithm for the other problem with the same running time (up to constant factors) and the same (multiplicative) approximation guarantee. Given an m × n matrix A = (a_ij), consider the following quantity:

\[ \mathrm{SDP}(A) = \max\bigg\{ \sum_{i=1}^{m}\sum_{j=1}^{n} a_{ij}\langle x_i, y_j\rangle : \{x_i\}_{i=1}^{m}, \{y_j\}_{j=1}^{n} \subseteq S^{n+m-1} \bigg\}. \tag{12} \]

The maximization problem in (12) falls into the framework of semidefinite programming as discussed in Section 1.2. Therefore SDP(A) can be computed in polynomial time with arbitrarily good precision. It is clear that SDP(A) ≥ ‖A‖_{∞→1}, because the maximum in (12) is over a bigger set than the maximum in (10). The Grothendieck inequality says that SDP(A) ≤ K_G‖A‖_{∞→1}, so we have

\[ \|A\|_{\infty\to 1} \le \mathrm{SDP}(A) \le K_G\|A\|_{\infty\to 1}. \]

Thus, the polynomial time algorithm that outputs the number SDP(A) is guaranteed to be within a factor of K_G of ‖A‖_{∞→1}. By Lemma 2.1, the algorithm that outputs the number α = ¼SDP(B), where the matrix B is as in (7), satisfies (5) with C = K_G.

Section 7 is devoted to algorithmic impossibility results, but it is worthwhile to make at this juncture two comments regarding hardness of approximation.
First of all, unless P = NP, we need to introduce an error C > 1 in our requirement (5). This was observed in [8]: the classical MAX-CUT problem from algorithmic graph theory was shown in [8] to be a special case of the problem of computing ‖A‖_cut, and therefore by [51] we know that unless P = NP there does not exist a polynomial time algorithm that outputs a number α satisfying (5) with C strictly smaller than 17/16. In fact, by a reduction to the MAX DICUT problem one can show that C must be at least 13/12, unless P = NP; we refer to Section 7 and [8] for more information on this topic.

Another (more striking) algorithmic impossibility result is based on the Unique Games Conjecture (UGC). Clearly the above algorithm cannot yield an approximation guarantee strictly smaller than K_G (this is the definition of K_G). In fact, it was shown in [104] that unless the UGC is false, for every ε ∈ (0, 1) any polynomial time algorithm for estimating ‖A‖_cut whatsoever, and not only the specific algorithm described above, must make an error of at least K_G − ε on some input matrix A. Thus, if we assume the UGC then the classical Grothendieck constant has a complexity theoretic interpretation: it equals the best approximation ratio of polynomial time algorithms for the cut norm problem. Note that [104] manages to prove this statement despite the fact that the value of K_G is unknown.

We have thus far ignored the issue of finding in polynomial time the subsets S_0, T_0 satisfying (6), i.e., we only explained how the Grothendieck inequality can be used for polynomial time estimation of the quantity ‖A‖_cut without actually finding efficiently subsets at which ‖A‖_cut is approximately attained.
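On small matrices everything in this subsection can be checked by exhaustive search. The sketch below (an illustration only; exponential time, with function names of our choosing) computes ‖A‖_cut and ‖B‖_{∞→1} directly for the matrix B of (7) and verifies the identity of Lemma 2.1.

```python
import itertools

def cut_norm(a):
    """||A||_cut of (4): max over subsets S, T (encoded as bitmasks) of
    |sum_{i in S, j in T} a_ij|. Brute force: 2^(m+n) subset pairs."""
    m, n = len(a), len(a[0])
    return max(abs(sum(a[i][j] for i in range(m) if S >> i & 1
                       for j in range(n) if T >> j & 1))
               for S in range(1 << m) for T in range(1 << n))

def inf_to_one(b):
    """||B||_{infinity -> 1} of (10): max over sign vectors of
    sum_ij b_ij * eps_i * delta_j."""
    m, n = len(b), len(b[0])
    return max(sum(b[i][j] * e[i] * d[j] for i in range(m) for j in range(n))
               for e in itertools.product((-1, 1), repeat=m)
               for d in itertools.product((-1, 1), repeat=n))

def extend(a):
    """The (m+1) x (n+1) matrix B of (7): append negated row and column sums,
    so that every row and column of B sums to zero."""
    b = [row + [-sum(row)] for row in a]
    b.append([-sum(col) for col in zip(*a)] + [sum(map(sum, a))])
    return b

a = [[1.0, -2.0], [3.0, 0.5]]
b = extend(a)
assert all(abs(sum(row)) < 1e-9 for row in b)        # rows of B sum to 0
assert all(abs(sum(col)) < 1e-9 for col in zip(*b))  # columns of B sum to 0
assert abs(cut_norm(a) - inf_to_one(b) / 4) < 1e-9   # Lemma 2.1
```

The last assertion is exactly ‖A‖_cut = ¼‖B‖_{∞→1}; replacing `a` by `b` in it also verifies (9) for this B.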
In order to do this we cannot use the Grothendieck inequality as a black box: we need to look into its proof and argue that it yields a polynomial time procedure that converts vectors {x_i}_{i=1}^m, {y_j}_{j=1}^n ⊆ S^{n+m−1} into signs {ε_i}_{i=1}^m, {δ_j}_{j=1}^n ⊆ {−1, 1} (this is known as a rounding procedure). It is indeed possible to do so, as explained in Section 2.2. We postpone the explanation of the rounding procedure that hides behind the Grothendieck inequality in order to first give examples why one might want to efficiently compute the cut norm of a matrix.

2.1.1. Szemerédi partitions. The Szemerédi regularity lemma [111] (see also [72]) is a general and very useful structure theorem for graphs, asserting (informally) that any graph can be partitioned into a controlled number of pieces that interact with each other in a pseudo-random way. The Grothendieck inequality, via the cut norm estimation algorithm, yields a polynomial time algorithm that, when given a graph G = (V, E) as input, outputs a partition of V that satisfies the conclusion of the Szemerédi regularity lemma.

To make the above statements formal, we need to recall some definitions. Let G = (V, E) be a graph. For every disjoint X, Y ⊆ V denote the number of edges joining X and Y by e(X, Y) = |{(u, v) ∈ X × Y : {u, v} ∈ E}|. Let X, Y ⊆ V be disjoint and nonempty, and fix ε, δ ∈ (0, 1). The pair of vertex sets (X, Y) is called (ε, δ)-regular if for every S ⊆ X and T ⊆ Y that are not too small, the quantity e(S,T)/(|S|·|T|) (the density of edges between S and T) is essentially independent of the pair (S, T) itself. Formally, we require that for every S ⊆ X with |S| ≥ δ|X| and every T ⊆ Y with |T| ≥ δ|Y| we have

\[ \bigg| \frac{e(S,T)}{|S|\cdot|T|} - \frac{e(X,Y)}{|X|\cdot|Y|} \bigg| \le \varepsilon. \tag{13} \]

The almost uniformity of the numbers e(S,T)/(|S|·|T|) as exhibited in (13) says that the pair (X, Y) is "pseudo-random", i.e., it is similar to a random bipartite graph where each (x, y) ∈ X × Y is joined by an edge independently with probability e(X,Y)/(|X|·|Y|).

The Szemerédi regularity lemma says that for all ε, δ, η ∈ (0, 1) and k ∈ N there exists K = K(ε, δ, η, k) ∈ N such that for all n ∈ N, any n-vertex graph G = (V, E) can be partitioned into sets S_1, ..., S_m ⊆ V with the following properties:
• k ≤ m ≤ K,
• ||S_i| − |S_j|| ≤ 1 for all i, j ∈ {1,...,m},
• the number of i, j ∈ {1,...,m} with i < j such that the pair (S_i, S_j) is (ε, δ)-regular is at least (1 − η)m(m − 1)/2.

Thus every graph is almost a superposition of a bounded number of pseudo-random graphs, the key point being that K is independent of n and the specific combinatorial structure of the graph in question.

It would be of interest to have a way to produce a Szemerédi partition in polynomial time with K independent of n (this is a good example of an approximation algorithm: one might care to find such a partition into the minimum possible number of pieces, but producing any partition into boundedly many pieces is already a significant achievement). Such a polynomial time algorithm was designed in [5] (see also [73]). We refer to [5, 73] for applications of algorithms for constructing Szemerédi partitions, and to [5] for a discussion of the computational complexity of this algorithmic task. We shall now explain how the Grothendieck inequality yields a different approach to this problem, which has some advantages over [5, 73] that will be described later. The argument below is due to [8].

Assume that X, Y are disjoint n-point subsets of a graph G = (V, E).
How can we determine in polynomial time whether or not the pair (X, Y) is close to being (ε, δ)-regular? It turns out that this is the main "bottleneck" towards our goal to construct Szemerédi partitions in polynomial time. To this end consider the following n × n matrix A = (a_xy)_{(x,y)∈X×Y}:

\[ a_{xy} = \begin{cases} 1 - \dfrac{e(X,Y)}{|X|\cdot|Y|} & \text{if } \{x,y\}\in E,\\[8pt] -\dfrac{e(X,Y)}{|X|\cdot|Y|} & \text{if } \{x,y\}\notin E. \end{cases} \tag{14} \]

By the definition of A, if S ⊆ X and T ⊆ Y then

\[ \bigg|\sum_{\substack{x\in S\\ y\in T}} a_{xy}\bigg| = |S|\cdot|T|\cdot\bigg|\frac{e(S,T)}{|S|\cdot|T|} - \frac{e(X,Y)}{|X|\cdot|Y|}\bigg|. \tag{15} \]

Hence if (X, Y) is not (ε, δ)-regular then ‖A‖_cut ≥ εδ²n². The approximate cut norm algorithm based on the Grothendieck inequality, together with the rounding procedure in Section 2.2, finds in polynomial time subsets S ⊆ X and T ⊆ Y such that

\[ \min\bigg\{ n|S|,\ n|T|,\ n^2\bigg|\frac{e(S,T)}{|S|\cdot|T|} - \frac{e(X,Y)}{|X|\cdot|Y|}\bigg| \bigg\} \overset{(15)}{\ge} \bigg|\sum_{\substack{x\in S\\ y\in T}} a_{xy}\bigg| \ge \frac{1}{K_G}\,\varepsilon\delta^2 n^2 \ge \frac{1}{2}\,\varepsilon\delta^2 n^2. \]

This establishes the following lemma.

Lemma 2.2. There exists a polynomial time algorithm that takes as input two disjoint n-point subsets X, Y of a graph, and either decides that (X, Y) is (ε, δ)-regular or finds S ⊆ X and T ⊆ Y with |S|, |T| ≥ ½εδ²n and

\[ \bigg|\frac{e(S,T)}{|S|\cdot|T|} - \frac{e(X,Y)}{|X|\cdot|Y|}\bigg| \ge \frac{1}{2}\,\varepsilon\delta^2. \]

From Lemma 2.2 it is quite simple to design a polynomial time algorithm that constructs a Szemerédi partition with bounded cardinality; compare Lemma 2.2 to Corollary 3.3 in [5] and Theorem 1.5 in [73]. We will not explain this deduction here since it is identical to the argument in [5].
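The identity (15), which drives Lemma 2.2, is easy to verify directly. The sketch below (ours, for illustration) builds the matrix of (14) for a small random bipartite pair (X, Y) and checks (15) on a sample pair of subsets.

```python
import random

def regularity_matrix(X, Y, edges):
    """The matrix A = (a_xy) of (14): a_xy = 1 - d if (x, y) is an edge and -d
    otherwise, where d = e(X, Y)/(|X||Y|) is the overall edge density."""
    d = sum((x, y) in edges for x in X for y in Y) / (len(X) * len(Y))
    return {(x, y): (1 - d if (x, y) in edges else -d) for x in X for y in Y}

def density(U, V, edges):
    """Edge density e(U, V)/(|U||V|) between vertex sets U and V."""
    return sum((u, v) in edges for u in U for v in V) / (len(U) * len(V))

rng = random.Random(1)
X, Y = [0, 1, 2, 3], [4, 5, 6, 7]
edges = {(x, y) for x in X for y in Y if rng.random() < 0.5}
A = regularity_matrix(X, Y, edges)

# Identity (15) on a sample pair of subsets S, T:
# |sum_{x in S, y in T} a_xy| = |S||T| * |e(S,T)/(|S||T|) - e(X,Y)/(|X||Y|)|.
S, T = [0, 2], [4, 5, 7]
lhs = abs(sum(A[x, y] for x in S for y in T))
rhs = len(S) * len(T) * abs(density(S, T, edges) - density(X, Y, edges))
assert abs(lhs - rhs) < 1e-9
```

A large value of the left-hand side for some not-too-small S, T is exactly a witness that (X, Y) fails to be (ε, δ)-regular, which is why estimating ‖A‖_cut suffices.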
We note that the quantitative bounds in Lemma 2.2 improve over the corresponding bounds in [5, 73], yielding, say, when ε = δ = η, an algorithm with the best known bound on K as a function of ε (this bound is nevertheless still huge, as must be the case due to [44]; see also [30]). See [8] for a precise statement of these bounds. In addition, the algorithms of [5, 73] worked only in the "dense case", i.e., when ‖A‖_cut, for A as in (14), is of order n², while the above algorithm does not have this requirement. This observation can be used to design the only known polynomial time algorithm for sparse versions of the Szemerédi regularity lemma [4] (see also [41]). We will not discuss the sparse version of the regularity lemma here, and refer instead to [71, 72] for a discussion of this topic. We also refer to [4] for additional applications of the Grothendieck inequality in sparse settings.

2.1.2. Frieze-Kannan matrix decomposition. The cut norm estimation problem was originally raised in the work of Frieze and Kannan [38], which introduced a method to design polynomial time approximation schemes for dense constraint satisfaction problems. The key tool for this purpose is a decomposition theorem for matrices that we now describe. An m × n matrix D = (d_ij) is called a cut matrix if there exist subsets S ⊆ {1,...,m} and T ⊆ {1,...,n}, and d ∈ R, such that for all (i, j) ∈ {1,...,m} × {1,...,n} we have

\[ d_{ij} = \begin{cases} d & \text{if } (i,j)\in S\times T,\\ 0 & \text{if } (i,j)\notin S\times T. \end{cases} \tag{16} \]

Denote the matrix D defined in (16) by CUT(S, T, d). In [38] it is proved that for every ε > 0 there exists an integer s = O(1/ε²) such that for any m × n matrix A = (a_ij) with entries bounded in absolute value by 1, there are cut matrices D_1, ..., D_s satisfying

\[ \bigg\| A - \sum_{k=1}^{s} D_k \bigg\|_{\mathrm{cut}} \le \varepsilon mn. \tag{17} \]

Moreover, these cut matrices D_1, ..., D_s can be found in time C(ε)(mn)^{O(1)}. We shall now explain how this is done using the cut norm approximation algorithm of Section 2.1.

The argument is iterative. Set A_0 = A, and assuming that the cut matrices D_1, ..., D_r have already been defined, write A_r = (a_ij(r)) = A − Σ_{k=1}^r D_k. We are done if ‖A_r‖_cut ≤ εmn, so we may assume that ‖A_r‖_cut > εmn. By the cut norm approximation algorithm we can find in polynomial time S ⊆ {1,...,m} and T ⊆ {1,...,n} satisfying

\[ \bigg|\sum_{\substack{i\in S\\ j\in T}} a_{ij}(r)\bigg| \ge c\|A_r\|_{\mathrm{cut}} > c\varepsilon mn, \tag{18} \]

where c > 0 is a universal constant. Set

\[ d = \frac{1}{|S|\cdot|T|}\sum_{\substack{i\in S\\ j\in T}} a_{ij}(r). \]

Define D_{r+1} = CUT(S, T, d) and A_{r+1} = (a_ij(r+1)) = A_r − D_{r+1}. Then, by expanding the squares, we have

\[ \sum_{i=1}^{m}\sum_{j=1}^{n} a_{ij}(r+1)^2 = \sum_{i=1}^{m}\sum_{j=1}^{n} a_{ij}(r)^2 - \frac{1}{|S|\cdot|T|}\bigg(\sum_{\substack{i\in S\\ j\in T}} a_{ij}(r)\bigg)^2 \overset{(18)}{\le} \sum_{i=1}^{m}\sum_{j=1}^{n} a_{ij}(r)^2 - c^2\varepsilon^2 mn. \]

It follows inductively that if we can carry out this procedure r times then

\[ 0 \le \sum_{i=1}^{m}\sum_{j=1}^{n} a_{ij}(r)^2 \le \sum_{i=1}^{m}\sum_{j=1}^{n} a_{ij}^2 - rc^2\varepsilon^2 mn \le mn - rc^2\varepsilon^2 mn, \]

where we used the assumption that |a_ij| ≤ 1. Therefore the above iteration must terminate after at most ⌈1/(c²ε²)⌉ steps, yielding (17). We note that the bound s = O(1/ε²) in (17) cannot be improved [6]; see also [89, 30] for related lower bounds.

The key step in the above algorithm was finding sets S, T as in (18). In [38] an algorithm was designed that, given an m × n matrix A = (a_ij) and ε > 0 as input, produces in time 2^{1/ε^{O(1)}}(mn)^{O(1)} subsets S ⊆ {1,...,m} and T ⊆ {1,...,n} satisfying

\[ \bigg|\sum_{\substack{i\in S\\ j\in T}} a_{ij}\bigg| \ge \|A\|_{\mathrm{cut}} - \varepsilon mn. \tag{19} \]

The additive approximation guarantee in (19) implies (18) only if ‖A‖_cut ≥ ε(c+1)mn, and similarly the running time is not polynomial if, say, ε = n^{−Ω(1)}.
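The iteration just described can be sketched end-to-end on a small matrix, with exhaustive search standing in for the polynomial time approximate cut-norm algorithm (so the constant c of (18) is effectively 1). Function names are ours, for illustration only.

```python
def best_cut(a):
    """Exhaustive search for (value, S, T) attaining ||A||_cut; a stand-in for
    the polynomial time approximate cut-norm algorithm of Section 2.1."""
    m, n = len(a), len(a[0])
    best = (0.0, (), ())
    for S in range(1 << m):
        rows = [i for i in range(m) if S >> i & 1]
        for T in range(1 << n):
            cols = [j for j in range(n) if T >> j & 1]
            v = abs(sum(a[i][j] for i in rows for j in cols))
            if v > best[0]:
                best = (v, tuple(rows), tuple(cols))
    return best

def frieze_kannan(a, eps):
    """Greedy decomposition into cut matrices: repeatedly subtract CUT(S, T, d),
    with d the average of the residual over S x T, until the residual has
    cut norm at most eps * m * n, as in (17)."""
    m, n = len(a), len(a[0])
    resid = [row[:] for row in a]
    cuts = []
    while True:
        v, S, T = best_cut(resid)
        if v <= eps * m * n:
            return cuts, resid
        d = sum(resid[i][j] for i in S for j in T) / (len(S) * len(T))
        cuts.append((S, T, d))
        for i in S:
            for j in T:
                resid[i][j] -= d

a = [[1, 1, -1], [1, 1, -1], [-1, -1, 1]]
cuts, resid = frieze_kannan(a, 0.05)
assert best_cut(resid)[0] <= 0.05 * 9  # the residual has small cut norm, as in (17)
```

The Frobenius-norm argument above guarantees termination: each subtracted cut matrix decreases Σ a_ij(r)² by at least (value)²/(|S||T|).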
Thus the Kannan-Frieze method is relevant only to "dense" instances, while the cut norm algorithm based on the Grothendieck inequality applies equally well for all values of $\|A\|_{\mathrm{cut}}$. This fact, combined with more work (and, necessarily, additional assumptions on the matrix $A$), was used in [29] to obtain a sparse version of (17): with $\varepsilon mn$ in the right hand side of (17) replaced by $\varepsilon\|A\|_{\mathrm{cut}}$ and $s=O(1/\varepsilon^2)$ (importantly, here $s$ is independent of $m,n$). We have indicated above how the cut norm approximation problem is relevant to Kannan-Frieze matrix decompositions, but we did not indicate the uses of such decompositions, since this is beyond the scope of the current survey. We refer to [38, 6, 15, 29] for a variety of applications of this methodology to combinatorial optimization problems.

2.1.3. Maximum acyclic subgraph. In the maximum acyclic subgraph problem we are given as input an $n$-vertex directed graph $G=(\{1,\dots,n\},E)$. Thus $E$ consists of a family of ordered pairs of distinct elements in $\{1,\dots,n\}$. We are interested in the maximum of
\[
\big|\{(i,j)\in\{1,\dots,n\}^2:\ \sigma(i)<\sigma(j)\}\cap E\big| - \big|\{(i,j)\in\{1,\dots,n\}^2:\ \sigma(i)>\sigma(j)\}\cap E\big|
\]
over all possible permutations $\sigma\in S_n$ ($S_n$ denotes the group of permutations of $\{1,\dots,n\}$). In words, the quantity of interest is the maximum over all orderings of the vertices of the number of edges going "forward" minus the number of edges going "backward". The best known approximation algorithm for this problem was discovered in [26] as an application of the cut norm approximation algorithm.

It is most natural to explain the algorithm of [26] for a weighted version of the maximum acyclic subgraph problem. Let $W:\{1,\dots,n\}\times\{1,\dots,n\}\to\mathbb{R}$ be skew symmetric, i.e., $W(u,v)=-W(v,u)$ for all $u,v\in\{1,\dots,n\}$.
For $\sigma\in S_n$ define
\[
W(\sigma)=\sum_{\substack{u,v\in\{1,\dots,n\}\\ \sigma(u)<\sigma(v)}} W(u,v),
\]
and denote $M_W=\max_{\sigma\in S_n} W(\sigma)$, the quantity that we wish to approximate. Applying the cut norm approximation algorithm to $W$, we can find in polynomial time subsets $S,T\subseteq\{1,\dots,n\}$ satisfying
\[
\sum_{u\in S}\sum_{v\in T} W(u,v) \ge c\,\|W\|_{\mathrm{cut}}, \qquad (20)
\]
where $c\in(0,\infty)$ is a universal constant. Note that we do not need to take the absolute value of the left hand side of (20) because $W$ is skew symmetric. Observe also that, since $W$ is skew symmetric, we have $\sum_{u,v\in S\cap T} W(u,v)=0$ and therefore
\[
\sum_{\substack{u\in S\\ v\in T}} W(u,v) = \sum_{\substack{u\in S\smallsetminus T\\ v\in T\smallsetminus S}} W(u,v) + \sum_{\substack{u\in S\smallsetminus T\\ v\in S\cap T}} W(u,v) + \sum_{\substack{u\in S\cap T\\ v\in T\smallsetminus S}} W(u,v).
\]
By replacing the pair of subsets $(S,T)$ by one of $\{(S\smallsetminus T, T\smallsetminus S),\ (S\smallsetminus T, S\cap T),\ (S\cap T, T\smallsetminus S)\}$, and replacing the constant $c$ in (20) by $c/3$, we may assume without loss of generality that (20) holds with $S$ and $T$ disjoint.

Denote $R=\{1,\dots,n\}\smallsetminus(S\cup T)$ and write $S=\{s_1,\dots,s_{|S|}\}$, $T=\{t_1,\dots,t_{|T|}\}$ and $R=\{r_1,\dots,r_{|R|}\}$, where $s_1<\dots<s_{|S|}$, $t_1<\dots<t_{|T|}$ and $r_1<\dots<r_{|R|}$. Define two permutations $\sigma_1,\sigma_2\in S_n$ as follows:
\[
\sigma_1(u)=\begin{cases} s_u & \text{if } u\in\{1,\dots,|S|\},\\ t_{u-|S|} & \text{if } u\in\{|S|+1,\dots,|S|+|T|\},\\ r_{u-|S|-|T|} & \text{if } u\in\{|S|+|T|+1,\dots,n\},\end{cases}
\]
and
\[
\sigma_2(u)=\begin{cases} r_{|R|-u+1} & \text{if } u\in\{1,\dots,|R|\},\\ s_{|R|+|S|-u+1} & \text{if } u\in\{|R|+1,\dots,|R|+|S|\},\\ t_{n-u+1} & \text{if } u\in\{|R|+|S|+1,\dots,n\}.\end{cases}
\]
In words, $\sigma_1$ orders $\{1,\dots,n\}$ by starting with the elements of $S$ in increasing order, then the elements of $T$ in increasing order, and finally the elements of $R$ in increasing order. At the same time, $\sigma_2$ orders $\{1,\dots,n\}$ by starting with the elements of $R$ in decreasing order, then the elements of $S$ in decreasing order, and finally the elements of $T$ in decreasing order.

The quantity $W(\sigma_1)+W(\sigma_2)$ consists of a sum of terms of the form $W(u,v)$ for $u,v\in\{1,\dots,n\}$, where if $(u,v)\in (S\times S)\cup(T\times T)\cup(R\times\{1,\dots,n\})$ then both $W(u,v)$ and $W(v,u)$ appear exactly once in this sum, and if $(u,v)\in S\times T$ then $W(u,v)$ appears twice in this sum and $W(v,u)$ does not appear in this sum at all. Therefore, using the fact that $W$ is skew symmetric, we have the following identity:
\[
W(\sigma_1)+W(\sigma_2) = 2\sum_{u\in S}\sum_{v\in T} W(u,v).
\]
It follows that for some $\ell\in\{1,2\}$ we have
\[
W(\sigma_\ell) \ge \sum_{u\in S}\sum_{v\in T} W(u,v) \overset{(20)}{\ge} c\,\|W\|_{\mathrm{cut}}.
\]
The output of the algorithm will be the permutation $\sigma_\ell$, so it suffices to prove that
\[
\|W\|_{\mathrm{cut}} \gtrsim \frac{M_W}{\log n}. \qquad (21)
\]
(Here, and in what follows, the relations $\gtrsim,\lesssim$ indicate the corresponding inequalities up to an absolute factor; the relation $\asymp$ stands for $\gtrsim\wedge\lesssim$.)

We will prove below that
\[
\|W\|_{\mathrm{cut}} \gtrsim \frac{1}{\log n}\sum_{\substack{u,v\in\{1,\dots,n\}\\ u<v}} \frac{W(u,v)}{v-u}. \qquad (22)
\]
Recall that $\|W\|_{\mathrm{cut}}\ge \frac14\|W\|_{\infty\to 1}$; we have already proved this inequality as a consequence of the simple identity (11). Moreover, we have
\[
\|W\|_{\infty\to 1} \gtrsim \max\Big\{\sum_{u=1}^n\sum_{v=1}^n W(u,v)\sin(\alpha_u-\beta_v):\ \{\alpha_u\}_{u=1}^n,\{\beta_v\}_{v=1}^n\subseteq\mathbb{R}\Big\}. \qquad (23)
\]
Inequality (23) is a special case of (1) with the choice of vectors $x_u=(\sin\alpha_u,\cos\alpha_u)\in\mathbb{R}^2$ and $y_v=(\cos\beta_v,-\sin\beta_v)\in\mathbb{R}^2$. We note that this two-dimensional version of the Grothendieck inequality is trivial with the constant in the right hand side of (23) being $\frac12$, and it is shown in [78] that the best constant in the right hand side of (23) is actually $\frac{1}{\sqrt 2}$. For every $\theta_1,\dots,\theta_n\in\mathbb{R}$, an application of (23) when $\alpha_u=\beta_u=\theta_u$ and when $\alpha_u=\beta_u=-\theta_u$ yields the inequality
\[
\|W\|_{\mathrm{cut}} \gtrsim \Big|\sum_{u=1}^n\sum_{v=1}^n W(u,v)\sin(\theta_u-\theta_v)\Big| = 2\,\Big|\sum_{\substack{u,v\in\{1,\dots,n\}\\ u<v}} W(u,v)\sin(\theta_u-\theta_v)\Big|.
\]

2.1.4. Linear equations modulo 2.
\[
\Pr\Big[\alpha \ge \frac{1}{20 K_G}\sqrt{\frac{\log n}{n}}\, M\Big] \ge 1 - e^{-cm/\sqrt[4]{n}}. \qquad (31)
\]
Once (31) is established, it would follow that for $m\asymp \sqrt[4]{n}$ we have $\alpha\ge \frac{1}{20K_G}\sqrt{\frac{\log n}{n}}\,M$ with probability at least $\frac12$. This, combined with (30), would complete the proof of (28), since $\alpha$ as defined in (29) can be computed in polynomial time, being the maximum of $O(\sqrt[4]{n})$ semidefinite programs.
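The construction of $\sigma_1,\sigma_2$ and the identity $W(\sigma_1)+W(\sigma_2)=2\sum_{u\in S,v\in T}W(u,v)$ can be verified mechanically. The following is a minimal Python sketch (0-indexed; the function names are ours):

```python
def order_value(W, order):
    """W(sigma): total weight W(u, v) over pairs with u placed before v."""
    n = len(order)
    return sum(W[order[a]][order[b]]
               for a in range(n) for b in range(a + 1, n))

def two_orderings(n, S, T):
    """Build sigma_1 (S increasing, T increasing, R increasing) and
    sigma_2 (R decreasing, S decreasing, T decreasing) for disjoint
    S, T subsets of {0, ..., n-1}; R is the rest.  Each ordering is
    returned as a list of vertices, first to last."""
    S, T = sorted(S), sorted(T)
    R = sorted(set(range(n)) - set(S) - set(T))
    sigma1 = S + T + R
    sigma2 = R[::-1] + S[::-1] + T[::-1]
    return sigma1, sigma2
```

On a random skew-symmetric $W$ one can check that $W(\sigma_1)+W(\sigma_2)$ equals twice the $(S,T)$ cut sum, so one of the two orderings attains at least the guarantee in (20).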
To check (31), let $\|\cdot\|$ be the norm on $\mathbb{R}^n$ defined for every $x=(x_1,\dots,x_n)\in\mathbb{R}^n$ by
\[
\|x\| = \max\Big\{\sum_{i=1}^n\sum_{j=1}^n\sum_{k=1}^n a_{ijk}\, x_i\, \langle y_j,z_k\rangle:\ \{y_j\}_{j=1}^n,\{z_k\}_{k=1}^n\subseteq S^{2n-1}\Big\}.
\]
Define $K=\{x\in\mathbb{R}^n:\ \|x\|\le 1\}$ and let $K^\circ=\{y\in\mathbb{R}^n:\ \sup_{x\in K}\langle x,y\rangle\le 1\}$ be the polar of $K$. Then
\[
\max\{\|y\|_1:\ y\in K^\circ\} = \max\{\|x\|:\ \|x\|_\infty\le 1\} \ge M,
\]
where the first equality is straightforward duality and the final inequality is a consequence of the definition of $\|\cdot\|$ and $M$. It follows that there exists $y\in K^\circ$ with $\|y\|_1\ge M$. Hence,
\[
\Pr\Big[\alpha\ge \frac{1}{20K_G}\sqrt{\frac{\log n}{n}}\,M\Big] \overset{(29)}{=} 1-\prod_{\ell=1}^m \Pr\Big[\|\varepsilon^\ell\| < \frac12\sqrt{\frac{\log n}{n}}\,M\Big] \ge 1-\Big(\Pr\Big[\sum_{i=1}^n \varepsilon_i^1 y_i < \frac12\sqrt{\frac{\log n}{n}}\sum_{i=1}^n |y_i|\Big]\Big)^m.
\]
In order to prove (31) it therefore suffices to prove that if $\varepsilon$ is chosen uniformly at random from $\{-1,1\}^n$ and $a\in\mathbb{R}^n$ satisfies $\|a\|_1=1$, then $\Pr\big[\sum_{i=1}^n \varepsilon_i a_i \ge \sqrt{\log n/(4n)}\big] \ge c/\sqrt[4]{n}$, where $c\in(0,\infty)$ is a universal constant. This probabilistic estimate for i.i.d. Bernoulli sums can be proved directly; see [65, Lem. 3.2].

2.2. Rounding. Let $A=(a_{ij})$ be an $m\times n$ matrix. In Section 2.1 we described a polynomial time algorithm for approximating $\|A\|_{\mathrm{cut}}$ and $\|A\|_{\infty\to 1}$. For applications it is also important to find in polynomial time signs $\varepsilon_1,\dots,\varepsilon_m,\delta_1,\dots,\delta_n\in\{-1,1\}$ for which $\sum_{i=1}^m\sum_{j=1}^n a_{ij}\varepsilon_i\delta_j$ is at least a constant multiple of $\|A\|_{\infty\to 1}$. This amounts to a "rounding problem": we need to find a procedure that, given vectors $x_1,\dots,x_m,y_1,\dots,y_n\in S^{m+n-1}$, produces signs $\varepsilon_1,\dots,\varepsilon_m,\delta_1,\dots,\delta_n\in\{-1,1\}$ whose existence is ensured by the Grothendieck inequality, i.e., such that $\sum_{i=1}^m\sum_{j=1}^n a_{ij}\varepsilon_i\delta_j$ is at least a constant multiple of $\sum_{i=1}^m\sum_{j=1}^n a_{ij}\langle x_i,y_j\rangle$. For this purpose one needs to examine proofs of the Grothendieck inequality, as done in [8].
We will now describe the rounding procedure that gives the best known approximation guarantee. This procedure yields a randomized algorithm that produces the desired signs; it is also possible to obtain a deterministic algorithm, as explained in [8]. The argument below is based on a clever two-step rounding method due to Krivine [77].

Fix $k\in\mathbb{N}$ and assume that we are given two centrally symmetric measurable partitions of $\mathbb{R}^k$, or, equivalently, two odd measurable functions $f,g:\mathbb{R}^k\to\{-1,1\}$. Let $G_1,G_2\in\mathbb{R}^k$ be independent random vectors that are distributed according to the standard Gaussian measure on $\mathbb{R}^k$, i.e., the measure with density $x\mapsto e^{-\|x\|_2^2/2}/(2\pi)^{k/2}$. For $t\in(-1,1)$ define
\[
H_{f,g}(t) \stackrel{\mathrm{def}}{=} \mathbb{E}\Big[ f\Big(\frac{G_1}{\sqrt 2}\Big)\, g\Big(\frac{t}{\sqrt 2}G_1 + \frac{\sqrt{1-t^2}}{\sqrt 2}G_2\Big)\Big] = \frac{1}{\pi^k (1-t^2)^{k/2}} \int_{\mathbb{R}^k}\int_{\mathbb{R}^k} f(x)g(y) \exp\Big(\frac{-\|x\|_2^2 - \|y\|_2^2 + 2t\langle x,y\rangle}{1-t^2}\Big)\, dx\, dy. \qquad (32)
\]
Then $H_{f,g}$ extends to an analytic function on the strip $\{z\in\mathbb{C}:\ \Re(z)\in(-1,1)\}$. The pair of functions $\{f,g\}$ is called a Krivine rounding scheme if $H_{f,g}$ is invertible on a neighborhood of the origin, and if, writing the Taylor expansion of the inverse as $H_{f,g}^{-1}(z)=\sum_{j=0}^\infty a_{2j+1} z^{2j+1}$, there exists $c=c(f,g)\in(0,\infty)$ satisfying $\sum_{j=0}^\infty |a_{2j+1}|\, c^{2j+1}=1$.

For $(f,g)$ as above and unit vectors $\{x_i\}_{i=1}^m,\{y_j\}_{j=1}^n\subseteq S^{m+n-1}$, one can find new unit vectors $\{u_i\}_{i=1}^m,\{v_j\}_{j=1}^n\subseteq S^{m+n-1}$ satisfying the identities
\[
\forall (i,j)\in\{1,\dots,m\}\times\{1,\dots,n\},\qquad \langle u_i,v_j\rangle = H_{f,g}^{-1}\big(c(f,g)\,\langle x_i,y_j\rangle\big). \qquad (33)
\]
We refer to [21] for the proof that $\{u_i\}_{i=1}^m,\{v_j\}_{j=1}^n$ exist.
This existence proof is not via an efficient algorithm, but, as explained in [8], once we know that these vectors exist they can be computed efficiently provided $H_{f,g}^{-1}$ can be computed efficiently; this simply amounts to computing a Cholesky decomposition or, alternatively, solving a semidefinite program corresponding to (33). This completes the first (preprocessing) step of a generalized Krivine rounding procedure.

The next step is to apply a random projection to the new vectors thus obtained, as in Grothendieck's original proof [45] or the Goemans-Williamson algorithm [42]. Let $G:\mathbb{R}^{m+n}\to\mathbb{R}^k$ be a random $k\times(m+n)$ matrix whose entries are i.i.d. standard Gaussian random variables. Define random signs $\{\varepsilon_i\}_{i=1}^m,\{\delta_j\}_{j=1}^n\subseteq\{-1,1\}$ by
\[
\forall (i,j)\in\{1,\dots,m\}\times\{1,\dots,n\},\qquad \varepsilon_i \stackrel{\mathrm{def}}{=} f\Big(\frac{1}{\sqrt 2}Gu_i\Big) \quad\text{and}\quad \delta_j \stackrel{\mathrm{def}}{=} g\Big(\frac{1}{\sqrt 2}Gv_j\Big). \qquad (34)
\]
Now,
\[
\mathbb{E}\Big[\sum_{i=1}^m\sum_{j=1}^n a_{ij}\varepsilon_i\delta_j\Big] \overset{(*)}{=} \mathbb{E}\Big[\sum_{i=1}^m\sum_{j=1}^n a_{ij} H_{f,g}(\langle u_i,v_j\rangle)\Big] \overset{(33)}{=} c(f,g)\sum_{i=1}^m\sum_{j=1}^n a_{ij}\langle x_i,y_j\rangle, \qquad (35)
\]
where $(*)$ follows by rotation invariance from (34) and (32). The identity (35) yields the desired polynomial time randomized rounding algorithm, provided one can bound $c(f,g)$ from below. It also gives a systematic way to bound the Grothendieck constant from above: for every Krivine rounding scheme $f,g:\mathbb{R}^k\to\{-1,1\}$ we have $K_G\le 1/c(f,g)$. Krivine used this reasoning to obtain the bound $K_G\le \pi/\big(2\log(1+\sqrt 2)\big)$ by considering the case $k=1$ and $f_0(x)=g_0(x)=\mathrm{sign}(x)$. One checks that $\{f_0,g_0\}$ is a Krivine rounding scheme with $H_{f_0,g_0}(t)=\frac{2}{\pi}\arcsin(t)$ (Grothendieck's identity) and $c(f_0,g_0)=\frac{2}{\pi}\log\big(1+\sqrt 2\big)$.
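For Krivine's one-dimensional scheme ($k=1$, $f=g=\mathrm{sign}$), the projection step (34) is one line per vector, and Grothendieck's identity $\mathbb{E}[\varepsilon_i\delta_j]=\frac{2}{\pi}\arcsin\langle u_i,v_j\rangle$ can be checked by Monte Carlo. A sketch under our own naming conventions (the $1/\sqrt2$ factor is dropped since it does not affect signs):

```python
import math, random

def sign_round(us, vs, rng):
    """One draw of the k = 1 rounding step (34) with f = g = sign:
    project every vector onto a common standard Gaussian vector
    and keep the sign of each projection."""
    dim = len(us[0])
    g = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    sgn = lambda t: 1 if t >= 0 else -1
    eps = [sgn(sum(gi * ui for gi, ui in zip(g, u))) for u in us]
    dlt = [sgn(sum(gi * vi for gi, vi in zip(g, v))) for v in vs]
    return eps, dlt
```

Averaging $\varepsilon_1\delta_1$ over many draws for $u=(1,0)$ and $v=(\tfrac12,\tfrac{\sqrt3}{2})$ should approach $\frac{2}{\pi}\arcsin\frac12=\frac13$.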
Since the goal of the above discussion is to round vectors $\{x_i\}_{i=1}^m,\{y_j\}_{j=1}^n\subseteq S^{m+n-1}$ to signs $\{\varepsilon_i\}_{i=1}^m,\{\delta_j\}_{j=1}^n\subseteq\{-1,1\}$, it seems natural to expect that the best possible Krivine rounding scheme occurs when $k=1$ and $f(x)=g(x)=\mathrm{sign}(x)$. If true, this would imply that $K_G=\pi/\big(2\log(1+\sqrt 2)\big)$; this was a long-standing conjecture of Krivine [77]. Over the years additional evidence supporting Krivine's conjecture was discovered, and a natural analytic conjecture was made in [76] as a step towards proving it. We will not discuss these topics here, since in [21] it was shown that actually $K_G\le \pi/\big(2\log(1+\sqrt 2)\big)-\varepsilon_0$ for some effective constant $\varepsilon_0>0$.

It is known [21, Lem. 2.4] that among all one-dimensional Krivine rounding schemes $f,g:\mathbb{R}\to\{-1,1\}$ we indeed have $c(f,g)\le \frac{2}{\pi}\log\big(1+\sqrt 2\big)$, i.e., it does not pay off to take partitions of $\mathbb{R}$ which are more complicated than the half-line partitions. Somewhat unexpectedly, it was shown in [21] that a certain two-dimensional Krivine rounding scheme $f,g:\mathbb{R}^2\to\{-1,1\}$ satisfies $c(f,g)>\frac{2}{\pi}\log\big(1+\sqrt 2\big)$. The proof of [21] uses a Krivine rounding scheme $f,g:\mathbb{R}^2\to\{-1,1\}$ where $f=g$ corresponds to the partition of $\mathbb{R}^2$ into the sub-graph and super-graph of the polynomial $y=c(x^5-10x^3+15x)$, where $c>0$ is an appropriately chosen constant. This partition is depicted in Figure 1. As explained in [21, Sec. 3], there is a natural guess for the "best" two-dimensional Krivine rounding scheme, based on a certain numerical computation which we will not discuss here. For this (conjectural) scheme we have $f\ne g$, and the planar partition corresponding to $f$ is depicted in Figure 2.
Of course, once Krivine's conjecture has been disproved and the usefulness of higher-dimensional rounding schemes has been established, there is no reason to expect that the situation won't improve as we consider $k$-dimensional Krivine rounding schemes for $k\ge 3$. A positive solution to an analytic question presented in [21] might even lead to an exact computation of $K_G$; see [21, Sec. 3] for the details.

Figure 1. The partition of $\mathbb{R}^2$ used in [21] to show that $K_G$ is smaller than Krivine's bound; the shaded regions are separated by the graph $y=c(x^5-10x^3+15x)$.

Figure 2. The "tiger partition" restricted to the square $[-20,20]^2$. This is the conjectured [21] optimal partition of $\mathbb{R}^2$ for the purpose of Krivine-type rounding.

3. The Grothendieck constant of a graph

Fix $n\in\mathbb{N}$ and let $G=(\{1,\dots,n\},E)$ be a graph on the vertices $\{1,\dots,n\}$. We assume throughout that $G$ does not contain any self loops, i.e., $E\subseteq\{S\subseteq\{1,\dots,n\}:\ |S|=2\}$. Following [7], define the Grothendieck constant of $G$, denoted $K(G)$, to be the smallest constant $K\in(0,\infty)$ such that every $n\times n$ matrix $(a_{ij})$ satisfies
\[
\max_{x_1,\dots,x_n\in S^{n-1}} \sum_{\substack{i,j\in\{1,\dots,n\}\\ \{i,j\}\in E}} a_{ij}\langle x_i,x_j\rangle \le K \max_{\varepsilon_1,\dots,\varepsilon_n\in\{-1,1\}} \sum_{\substack{i,j\in\{1,\dots,n\}\\ \{i,j\}\in E}} a_{ij}\varepsilon_i\varepsilon_j. \qquad (36)
\]
Inequality (36) is an extension of the Grothendieck inequality, since (1) is the special case of (36) when $G$ is a bipartite graph. Thus
\[
K_G = \sup_{n\in\mathbb{N}}\{K(G):\ G \text{ is an } n\text{-vertex bipartite graph}\}. \qquad (37)
\]
The opposite extreme of bipartite graphs is $G=K_n$, the $n$-vertex complete graph. In this case (36) boils down to the following inequality:
\[
\max_{x_1,\dots,x_n\in S^{n-1}} \sum_{\substack{i,j\in\{1,\dots,n\}\\ i\ne j}} a_{ij}\langle x_i,x_j\rangle \le K(K_n) \max_{\varepsilon_1,\dots,\varepsilon_n\in\{-1,1\}} \sum_{\substack{i,j\in\{1,\dots,n\}\\ i\ne j}} a_{ij}\varepsilon_i\varepsilon_j. \qquad (38)
\]
It turns out that $K(K_n)\asymp \log n$. The estimate $K(K_n)\lesssim \log n$ was proved in [94, 91, 60, 27].
In fact, as shown in [7, Thm. 3.7], the following stronger inequality holds true for every $n\times n$ matrix $(a_{ij})$; it implies that $K(K_n)\lesssim\log n$ by the Cauchy-Schwarz inequality:
\[
\max_{x_1,\dots,x_n\in S^{n-1}} \sum_{\substack{i,j\in\{1,\dots,n\}\\ i\ne j}} a_{ij}\langle x_i,x_j\rangle \lesssim \log\Bigg(\frac{\sum_{i\in\{1,\dots,n\}}\sum_{j\in\{1,\dots,n\}\smallsetminus\{i\}} |a_{ij}|}{\sqrt{\sum_{i\in\{1,\dots,n\}}\sum_{j\in\{1,\dots,n\}\smallsetminus\{i\}} a_{ij}^2}}\Bigg) \max_{\varepsilon_1,\dots,\varepsilon_n\in\{-1,1\}} \sum_{\substack{i,j\in\{1,\dots,n\}\\ i\ne j}} a_{ij}\varepsilon_i\varepsilon_j.
\]
The matching lower bound $K(K_n)\gtrsim\log n$ is due to [7], improving over a result of [60].

How can we interpolate between the two extremes (37) and (38)? The Grothendieck constant $K(G)$ depends on the combinatorial structure of the graph $G$, but at present our understanding of this dependence is incomplete. The following general bounds are known:
\[
\log\omega \lesssim K(G) \lesssim \log\vartheta, \qquad (39)
\]
and
\[
K(G) \le \frac{\pi}{2\log\Big(\frac{1+\sqrt{(\vartheta-1)^2+1}}{\vartheta-1}\Big)}, \qquad (40)
\]
where (39) is due to [7] and (40) is due to [23]. Here $\omega$ is the clique number of $G$, i.e., the largest $k\in\{2,\dots,n\}$ such that there exists $S\subseteq\{1,\dots,n\}$ of cardinality $k$ satisfying $\{i,j\}\in E$ for all distinct $i,j\in S$, and
\[
\vartheta = \min\Big\{\max_{i\in\{1,\dots,n\}} \frac{1}{\langle x_i,y\rangle^2}:\ x_1,\dots,x_n,y\in S^n \ \wedge\ \forall\{i,j\}\in E,\ \langle x_i,x_j\rangle=0\Big\}. \qquad (41)
\]
The parameter $\vartheta$ is known as the Lovász theta function of the complement of $G$; it is an important graph parameter that was introduced in [87]. We refer to [59] and [7, Thm. 3.5] for alternative characterizations of $\vartheta$. It suffices to say here that it was shown in [87] that $\vartheta\le\chi$, where $\chi$ is the chromatic number of $G$, i.e., the smallest integer $k$ such that there exists a partition $\{A_1,\dots,A_k\}$ of $\{1,\dots,n\}$ with $\{i,j\}\notin E$ for all $(i,j)\in\bigcup_{\ell=1}^k A_\ell\times A_\ell$. Note that the upper bound in (39) is superior to (40) when $\vartheta$ is large, but when $\vartheta=2$ the bound (40) implies Krivine's classical bound [77] $K_G\le \pi/\big(2\log(1+\sqrt 2)\big)$.
The upper and lower bounds in (39) are known to match up to absolute constants for a variety of graph classes. Several such sharp Grothendieck-type inequalities are presented in Sections 5.2 and 5.3 of [7]. For example, as explained in [7], it follows from (39), combined with combinatorial results of [87, 9], that for every $n\times n\times n$ 3-tensor $(a_{ijk})$ we have
\[
\max_{\{x_{ij}\}_{i,j=1}^n\subseteq S^{n^2-1}} \sum_{\substack{i,j,k\in\{1,\dots,n\}\\ i\ne j\ne k}} a_{ijk}\langle x_{ij},x_{jk}\rangle \lesssim \max_{\{\varepsilon_{ij}\}_{i,j=1}^n\subseteq\{-1,1\}} \sum_{\substack{i,j,k\in\{1,\dots,n\}\\ i\ne j\ne k}} a_{ijk}\,\varepsilon_{ij}\varepsilon_{jk}.
\]
While (39) is often a satisfactory asymptotic evaluation of $K(G)$, this isn't always the case. In particular, it is unknown whether $K(G)$ can be bounded from below by a function of $\vartheta$ that tends to $\infty$ as $\vartheta\to\infty$. An instance in which (39) is not sharp is the case of Erdős-Rényi [36] random graphs $G(n,1/2)$. For such graphs we have $\omega\asymp\log n$ almost surely as $n\to\infty$; see [90] and [10, Sec. 4.5]. At the same time, for $G(n,1/2)$ we have [58] $\vartheta\asymp\sqrt n$ almost surely as $n\to\infty$. Thus (39) becomes in this case the rather weak estimate $\log\log n\lesssim K(G(n,1/2))\lesssim \log n$. It turns out [3] that $K(G(n,1/2))\asymp\log n$ almost surely as $n\to\infty$; we refer to [3] for additional computations of this type of the Grothendieck constant of random and pseudo-random graphs. An explicit evaluation of the Grothendieck constant of certain graph families can be found in [79]; for example, if $G$ is a graph of girth $g$ that is not a forest and does not admit $K_5$ as a minor, then $K(G)=\frac{g\cos(\pi/g)}{g-2}$.

3.1. Algorithmic consequences. Other than being a natural variant of the Grothendieck inequality, and hence of intrinsic mathematical interest, (36) has ramifications for discrete optimization problems, which we now describe.

3.1.1. Spin glasses.
Perhaps the most natural interpretation of (36) is in the context of solid state physics, specifically the problem of efficient computation of ground states of Ising spin glasses. The graph $G$ represents the interaction pattern of $n$ particles; thus $\{i,j\}\notin E$ if and only if the particles $i$ and $j$ cannot interact with each other. Let $a_{ij}$ be the magnitude of the interaction of $i$ and $j$ (the sign of $a_{ij}$ corresponds to attraction/repulsion). In the Ising model each particle $i\in\{1,\dots,n\}$ has a spin $\varepsilon_i\in\{-1,1\}$, and the total energy of the system is given by the quantity $-\sum_{\{i,j\}\in E} a_{ij}\varepsilon_i\varepsilon_j$. A spin configuration $(\varepsilon_1,\dots,\varepsilon_n)\in\{-1,1\}^n$ is called a ground state if it minimizes the total energy. Thus the problem of finding a ground state is precisely that of computing the maximum appearing in the right hand side of (36). For more information on this topic see [88, pp. 352-355].

Physical systems seek to settle at a ground state, and it is therefore natural to ask whether it is computationally efficient (i.e., polynomial time computable) to find such a ground state, at least approximately. Such questions have been studied in the physics literature for several decades; see [18, 16, 13, 22]. In particular, it was shown in [16] that if $G$ is a planar graph then one can find a ground state in polynomial time, but in [13] it was shown that when $G$ is the three-dimensional grid then this computational task is NP-hard. Since the quantity in the left hand side of (36) is a semidefinite program and can therefore be computed in polynomial time with arbitrarily good precision, a good bound on $K(G)$ yields a polynomial time algorithm that computes the energy of a ground state with correspondingly good approximation guarantee. Moreover, as explained in [7], the proof of the upper bound in (39) yields a polynomial time algorithm that finds a spin configuration $(\sigma_1,\dots,\sigma_n)\in\{-1,1\}^n$ for which
\[
\sum_{\substack{i,j\in\{1,\dots,n\}\\ \{i,j\}\in E}} a_{ij}\sigma_i\sigma_j \gtrsim \frac{1}{\log\vartheta}\cdot \max_{\{\varepsilon_i\}_{i=1}^n\subseteq\{-1,1\}} \sum_{\substack{i,j\in\{1,\dots,n\}\\ \{i,j\}\in E}} a_{ij}\varepsilon_i\varepsilon_j. \qquad (42)
\]
An analogous polynomial time algorithm corresponds to the bound (40). These algorithms yield the best known efficient methods for computing a ground state of Ising spin glasses on a variety of interaction graphs.

3.1.2. Correlation clustering. A different interpretation of (36) yields the best known polynomial time approximation algorithm for the correlation clustering problem [14, 25]; this connection is due to [27]. Interpret the graph $G=(\{1,\dots,n\},E)$ as the "similarity/dissimilarity graph" for the items $\{1,\dots,n\}$, in the following sense. For $\{i,j\}\in E$ we are given a sign $a_{ij}\in\{-1,1\}$ with the following meaning: if $a_{ij}=1$ then $i$ and $j$ are deemed similar, and if $a_{ij}=-1$ then $i$ and $j$ are deemed different. If $\{i,j\}\notin E$ then we do not express any judgement on the similarity or dissimilarity of $i$ and $j$.

Assume that $A_1,\dots,A_k$ is a partition (or "clustering") of $\{1,\dots,n\}$. An agreement between this clustering and our similarity/dissimilarity judgements is a pair $i,j\in\{1,\dots,n\}$ such that $a_{ij}=1$ and $i,j\in A_r$ for some $r\in\{1,\dots,k\}$, or $a_{ij}=-1$ and $i\in A_r$, $j\in A_s$ for distinct $r,s\in\{1,\dots,k\}$. A disagreement between this clustering and our similarity/dissimilarity judgements is a pair $i,j\in\{1,\dots,n\}$ such that $a_{ij}=1$ and $i\in A_r$, $j\in A_s$ for distinct $r,s\in\{1,\dots,k\}$, or $a_{ij}=-1$ and $i,j\in A_r$ for some $r\in\{1,\dots,k\}$. Our goal is to cluster the items while encouraging agreements and penalizing disagreements. Thus, we wish to find a clustering of $\{1,\dots,n\}$ into an unspecified number of clusters which maximizes the total number of agreements minus the total number of disagreements.
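The objective just described, agreements minus disagreements for a given clustering, is straightforward to compute. A small sketch with a data layout of our own choosing: the judgements are $\pm 1$ values on the edges of $G$, and a clustering is a map from items to cluster labels.

```python
def agreements_minus_disagreements(judgements, cluster_of):
    """judgements: dict mapping a frozenset {i, j} (an edge of G) to
    +1 (similar) or -1 (different); cluster_of: dict item -> label.
    Pairs absent from `judgements` express no opinion and contribute 0."""
    score = 0
    for pair, a in judgements.items():
        i, j = tuple(pair)
        same = cluster_of[i] == cluster_of[j]
        agreed = (a == 1 and same) or (a == -1 and not same)
        score += 1 if agreed else -1
    return score
```

For instance, if items 0 and 1 are judged similar and both pairs involving item 2 are judged different, the clustering $\{0,1\},\{2\}$ agrees with all three judgements.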
It was proved in [27] that the case of clustering into two parts is the "bottleneck" for this problem: if there were a polynomial time algorithm that finds a clustering into two parts for which the total number of agreements minus the total number of disagreements is at least a fraction $\alpha\in(0,1)$ of the maximum possible (over all bi-partitions) total number of agreements minus the total number of disagreements, then one could find in polynomial time a clustering which achieves at least a fraction $\alpha/(2+\alpha)$ of the analogous maximum that is defined without specifying the number of clusters. One checks that the problem of finding a partition into two clusters that maximizes the total number of agreements minus the total number of disagreements is the same as the problem of computing the maximum in the right hand side of (36). Thus the upper bound in (39) yields a polynomial time algorithm for correlation clustering with approximation guarantee $O(\log\vartheta)$, which is the best known approximation algorithm for this problem. Note that when $G$ is the complete graph, the approximation ratio is $O(\log n)$. As will be explained in Section 7, it is known [69] that for every $\gamma\in(0,1/6)$, if there were a polynomial time algorithm for correlation clustering that yields an approximation guarantee of $(\log n)^\gamma$, then there would be an algorithm for 3-colorability that runs in time $2^{(\log n)^{O(1)}}$, a conclusion which is widely believed to be impossible.

4. Kernel clustering and the propeller conjecture

Here we describe a large class of Grothendieck-type inequalities that is motivated by algorithmic applications to a combinatorial optimization problem called Kernel Clustering. This problem originates in machine learning [110], and its only known rigorous approximation algorithms follow from Grothendieck inequalities (these algorithms are sharp assuming the UGC).
We will first describe the inequalities and then the algorithmic application. Consider the special case of the Grothendieck inequality (1) where $A=(a_{ij})$ is an $n\times n$ positive semidefinite matrix. In this case we may assume without loss of generality that in (1) $x_i=y_i$ and $\varepsilon_i=\delta_i$ for every $i\in\{1,\dots,n\}$, since this holds for the maxima on either side of (1) (see also the explanation in [8, Sec. 5.2]). It follows from [45, 107] (see also [95]) that for every $n\times n$ symmetric positive semidefinite matrix $A=(a_{ij})$ we have
\[
\max_{x_1,\dots,x_n\in S^{n-1}} \sum_{i=1}^n\sum_{j=1}^n a_{ij}\langle x_i,x_j\rangle \le \frac{\pi}{2}\cdot \max_{\varepsilon_1,\dots,\varepsilon_n\in\{-1,1\}} \sum_{i=1}^n\sum_{j=1}^n a_{ij}\varepsilon_i\varepsilon_j, \qquad (43)
\]
and that $\frac{\pi}{2}$ is the best possible constant in (43).

A natural variant of (43) is to replace the numbers $-1,1$ by general vectors $v_1,\dots,v_k\in\mathbb{R}^k$; namely, one might ask for the smallest constant $K\in(0,\infty)$ such that for every symmetric positive semidefinite $n\times n$ matrix $(a_{ij})$ we have
\[
\max_{x_1,\dots,x_n\in S^{n-1}} \sum_{i=1}^n\sum_{j=1}^n a_{ij}\langle x_i,x_j\rangle \le K \max_{u_1,\dots,u_n\in\{v_1,\dots,v_k\}} \sum_{i=1}^n\sum_{j=1}^n a_{ij}\langle u_i,u_j\rangle. \qquad (44)
\]
The best constant $K$ in (44) can be characterized as follows. Let $B=(b_{ij}=\langle v_i,v_j\rangle)$ be the Gram matrix of $v_1,\dots,v_k$. Let $C(B)$ be the maximum, over all partitions $\{A_1,\dots,A_k\}$ of $\mathbb{R}^{k-1}$ into measurable sets, of the quantity $\sum_{i=1}^k\sum_{j=1}^k b_{ij}\langle z_i,z_j\rangle$, where for $i\in\{1,\dots,k\}$ the vector $z_i\in\mathbb{R}^{k-1}$ is the Gaussian moment of $A_i$, i.e.,
\[
z_i = \frac{1}{(2\pi)^{(k-1)/2}} \int_{A_i} x\, e^{-\|x\|_2^2/2}\, dx.
\]
It was proved in [67] that (44) holds with $K=1/C(B)$ and that this constant is sharp.

Inequality (44) with $K=1/C(B)$ is proved via the following rounding procedure. Fix unit vectors $x_1,\dots,x_n\in S^{n-1}$. Let $G=(g_{ij})$ be a $(k-1)\times n$ random matrix whose entries are i.i.d. standard Gaussian random variables. Let $A_1,\dots,A_k\subseteq\mathbb{R}^{k-1}$ be a measurable partition of $\mathbb{R}^{k-1}$ at which $C(B)$ is attained (for a proof that the maximum defining $C(B)$ is indeed attained, see [67]). Define a random choice of $u_i\in\{v_1,\dots,v_k\}$ by setting $u_i=v_\ell$ for the unique $\ell\in\{1,\dots,k\}$ such that $Gx_i\in A_\ell$. The fact that (44) holds with $K=1/C(B)$ is a consequence of the following fact, whose proof we skip (the full details are in [67]):
\[
\mathbb{E}\Big[\sum_{i=1}^n\sum_{j=1}^n a_{ij}\langle u_i,u_j\rangle\Big] \ge C(B)\sum_{i=1}^n\sum_{j=1}^n a_{ij}\langle x_i,x_j\rangle. \qquad (45)
\]
Determining the partition of $\mathbb{R}^{k-1}$ that achieves the value $C(B)$ is a nontrivial problem in general, even in the special case when $B=I_k$ is the $k\times k$ identity matrix. Note that in this case one seeks a partition $\{A_1,\dots,A_k\}$ of $\mathbb{R}^{k-1}$ into measurable sets so as to maximize the quantity
\[
\sum_{i=1}^k \Big\| \frac{1}{(2\pi)^{(k-1)/2}} \int_{A_i} x\, e^{-\|x\|_2^2/2}\, dx \Big\|_2^2.
\]
As shown in [66, 67], the optimal partition is given by simplicial cones centered at the origin. When $B=I_2$ we have $C(I_2)=\frac{1}{\pi}$, and the optimal partition of $\mathbb{R}$ into two cones is into the positive and the negative axes. When $B=I_3$ it was shown in [66] that $C(I_3)=\frac{9}{8\pi}$, and the optimal partition of $\mathbb{R}^2$ into three cones is the propeller partition, i.e., into three cones with angular measure $120^\circ$ each. Though it might be surprising at first sight, the authors posed in [66] the propeller conjecture: for any $k\ge 4$, the optimal partition of $\mathbb{R}^{k-1}$ into $k$ parts is $P\times\mathbb{R}^{k-3}$, where $P$ is the propeller partition of $\mathbb{R}^2$. In other words, even if one is allowed to use $k$ parts, the propeller conjecture asserts that the best partition consists of only three nonempty parts. Recently, this conjecture was solved positively [53] for $k=4$, i.e., for partitions of $\mathbb{R}^3$ into four measurable parts.
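For $B=I_3$ the rounding procedure above projects each unit vector to the plane via a $2\times n$ Gaussian matrix and assigns it the $120^\circ$ cone (propeller part) containing its image. A sketch under conventions of our own choosing (the three cones are delimited by the angles $0^\circ$, $120^\circ$, $240^\circ$; any rotation of the propeller works equally well):

```python
import math, random

def propeller_round(xs, seed=0):
    """Assign each vector in xs (all of dimension n) a label in
    {0, 1, 2}: project with a random 2 x n Gaussian matrix G and
    take the index of the 120-degree cone containing the image,
    as in the B = I_3 case of the rounding procedure for (44)-(45)."""
    rng = random.Random(seed)
    n = len(xs[0])
    G = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(2)]
    labels = []
    for x in xs:
        p = [sum(G[r][c] * x[c] for c in range(n)) for r in range(2)]
        angle = math.atan2(p[1], p[0]) % (2.0 * math.pi)
        labels.append(int(angle // (2.0 * math.pi / 3.0)))
    return labels
```

Identical input vectors always land in the same cone, and every label is one of the three propeller parts.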
The proof of [53] reduces the problem to a concrete finite set of numerical inequalities, which are then verified with full rigor in a computer-assisted fashion. Note that this is the first nontrivial (surprising?) case of the propeller conjecture, i.e., this is the first case in which we indeed drop one of the four allowed parts in the optimal partition.

We now describe an application of (44) to the Kernel Clustering problem, a general framework for clustering massive statistical data so as to uncover a certain hypothesized structure [110]. The problem is defined as follows. Let $A=(a_{ij})$ be an $n\times n$ symmetric positive semidefinite matrix, which is usually normalized to be centered, i.e., $\sum_{i=1}^n\sum_{j=1}^n a_{ij}=0$. The matrix $A$ is often thought of as the correlation matrix of random variables $(X_1,\dots,X_n)$ that measure attributes of certain empirical data, i.e., $a_{ij}=\mathbb{E}[X_iX_j]$. We are also given another symmetric positive semidefinite $k\times k$ matrix $B=(b_{ij})$, which functions as a hypothesis, or test matrix. Think of $n$ as huge and $k$ as a small constant. The goal is to cluster $A$ so as to obtain a smaller matrix which most resembles $B$. Formally, we wish to find a partition $\{S_1,\dots,S_k\}$ of $\{1,\dots,n\}$ so that, writing $c_{ij}=\sum_{(p,q)\in S_i\times S_j} a_{pq}$, the resulting clustered version of $A$ has the maximum correlation $\sum_{i=1}^k\sum_{j=1}^k c_{ij}b_{ij}$ with the hypothesis matrix $B$. In words, we form a $k\times k$ matrix $C=(c_{ij})$ by summing the entries of $A$ over the blocks induced by the given partition, and we wish to produce in this way a matrix that is most correlated with $B$. Equivalently, the goal is to evaluate the number
\[
\mathrm{Clust}(A|B) = \max_{\sigma:\{1,\dots,n\}\to\{1,\dots,k\}} \sum_{i=1}^n\sum_{j=1}^n a_{ij}\, b_{\sigma(i)\sigma(j)}. \qquad (46)
\]
The strength of this generic clustering framework is based in part on the flexibility of adapting the matrix $B$ to the problem at hand.
Various particular choices of $B$ lead to well-studied optimization problems, while other specialized choices of $B$ are based on statistical hypotheses which have been applied with some empirical success. We refer to [110, 66] for additional background and a discussion of specific examples.

In [66] it was shown that there exists a randomized polynomial time algorithm that takes as input two positive semidefinite matrices $A,B$ and outputs a number $\alpha$ that satisfies
\[
\mathrm{Clust}(A|B) \le \mathbb{E}[\alpha] \le \Big(1+\frac{3\pi}{2}\Big)\mathrm{Clust}(A|B).
\]
There is no reason to believe that the approximation factor of $1+\frac{3\pi}{2}$ is sharp, but nevertheless, prior to this result, which is based on (44), no constant factor polynomial time approximation algorithm for this problem was known. Sharper results can be obtained if we assume that the input matrices are normalized appropriately. Specifically, assume that $k\ge 3$ and restrict only to inputs $A$ that are centered, i.e., $\sum_{i=1}^n\sum_{j=1}^n a_{ij}=0$, and inputs $B$ that are either the identity matrix $I_k$, or satisfy $\sum_{i=1}^k\sum_{j=1}^k b_{ij}=0$ ($B$ is centered as well) and $b_{ii}=1$ for all $i\in\{1,\dots,k\}$ ($B$ is "spherical"). Under these assumptions the output of the algorithm of [66] satisfies
\[
\mathrm{Clust}(A|B) \le \mathbb{E}[\alpha] \le \frac{8\pi}{9}\Big(1-\frac{1}{k}\Big)\mathrm{Clust}(A|B).
\]
Moreover, it was shown in [66] that, assuming the propeller conjecture and the UGC, no polynomial time algorithm can achieve an approximation guarantee that is strictly smaller than $\frac{8\pi}{9}\big(1-\frac1k\big)$ (for input matrices normalized as above). Since the propeller conjecture is known to hold true for $k=3$ [66] and $k=4$ [53], we know that the UGC hardness threshold for the above problem is exactly $\frac{16\pi}{27}$ when $k=3$ and $\frac{2\pi}{3}$ when $k=4$.
A finer, and perhaps more natural, analysis of the kernel clustering problem can be obtained if we fix the matrix $B$ and let the input be only the matrix $A$, the goal being, as before, to approximate the quantity $\mathrm{Clust}(A|B)$ in polynomial time. Since $B$ is symmetric and positive semidefinite, we can find vectors $v_1,\ldots,v_k\in\mathbb{R}^k$ such that $B$ is their Gram matrix, i.e., $b_{ij}=\langle v_i,v_j\rangle$ for all $i,j\in\{1,\ldots,k\}$. Let $R(B)$ be the smallest possible radius of a Euclidean ball in $\mathbb{R}^k$ which contains $\{v_1,\ldots,v_k\}$, and let $w(B)$ be the center of this ball. We note that both $R(B)$ and $w(B)$ can be computed efficiently by solving an appropriate semidefinite program. Let $C(B)$ be the parameter defined above. It is shown in [67] that for every fixed symmetric positive semidefinite $k\times k$ matrix $B$ there exists a randomized polynomial time algorithm which, given an $n\times n$ symmetric positive semidefinite centered matrix $A$, outputs a number $\mathrm{Alg}(A)$ such that
$$\mathrm{Clust}(A|B)\le\mathbb{E}[\mathrm{Alg}(A)]\le \frac{R(B)^2}{C(B)}\,\mathrm{Clust}(A|B).$$
As we will explain in Section 7, assuming the UGC no polynomial time algorithm can achieve an approximation guarantee strictly smaller than $R(B)^2/C(B)$. The algorithm of [67] uses semidefinite programming to compute the value
$$\mathrm{SDP}(A|B)=\max\Big\{\sum_{i=1}^n\sum_{j=1}^n a_{ij}\langle x_i,x_j\rangle:\ x_1,\ldots,x_n\in\mathbb{R}^n\ \wedge\ \|x_i\|_2\le 1\ \ \forall i\in\{1,\ldots,n\}\Big\}=\max\Big\{\sum_{i=1}^n\sum_{j=1}^n a_{ij}\langle x_i,x_j\rangle:\ x_1,\ldots,x_n\in S^{n-1}\Big\},\tag{47}$$
where the last equality in (47) holds since the function $(x_1,\ldots,x_n)\mapsto\sum_{i=1}^n\sum_{j=1}^n a_{ij}\langle x_i,x_j\rangle$ is convex (by virtue of the fact that $A$ is positive semidefinite).
We claim that
$$\frac{\mathrm{Clust}(A|B)}{R(B)^2}\le \mathrm{SDP}(A|B)\le \frac{\mathrm{Clust}(A|B)}{C(B)},\tag{48}$$
which implies that if we output the number $R(B)^2\,\mathrm{SDP}(A|B)$ we obtain a polynomial time algorithm which approximates $\mathrm{Clust}(A|B)$ up to a factor of $\frac{R(B)^2}{C(B)}$. To verify (48), let $x_1^*,\ldots,x_n^*\in S^{n-1}$ and $\sigma^*:\{1,\ldots,n\}\to\{1,\ldots,k\}$ be such that
$$\mathrm{SDP}(A|B)=\sum_{i=1}^n\sum_{j=1}^n a_{ij}\langle x_i^*,x_j^*\rangle\qquad\text{and}\qquad \mathrm{Clust}(A|B)=\sum_{i=1}^n\sum_{j=1}^n a_{ij}b_{\sigma^*(i)\sigma^*(j)}.$$
Write $(a_{ij})_{i,j=1}^n=(\langle u_i,u_j\rangle)_{i,j=1}^n$ for some $u_1,\ldots,u_n\in\mathbb{R}^n$. The assumption that $A$ is centered means that $\sum_{i=1}^n u_i=0$. The rightmost inequality in (48) is just the Grothendieck inequality (44). The leftmost inequality in (48) follows from the fact that $\frac{v_{\sigma^*(i)}-w(B)}{R(B)}$ has norm at most $1$ for all $i\in\{1,\ldots,n\}$. Indeed, these norm bounds imply that
$$\mathrm{SDP}(A|B)\ \ge\ \sum_{i=1}^n\sum_{j=1}^n a_{ij}\Big\langle \frac{v_{\sigma^*(i)}-w(B)}{R(B)},\frac{v_{\sigma^*(j)}-w(B)}{R(B)}\Big\rangle
=\frac{1}{R(B)^2}\sum_{i=1}^n\sum_{j=1}^n a_{ij}\big\langle v_{\sigma^*(i)},v_{\sigma^*(j)}\big\rangle-\frac{2}{R(B)^2}\sum_{i=1}^n\big\langle w(B),v_{\sigma^*(i)}\big\rangle\Big\langle u_i,\sum_{j=1}^n u_j\Big\rangle+\frac{\|w(B)\|_2^2}{R(B)^2}\sum_{i=1}^n\sum_{j=1}^n a_{ij}=\frac{\mathrm{Clust}(A|B)}{R(B)^2},$$
where the final equality holds because the last two terms vanish: $\sum_{j=1}^n u_j=0$ and $A$ is centered.

This completes the proof that the above algorithm efficiently approximates the number $\mathrm{Clust}(A|B)$, but it does not address the issue of how to efficiently compute an assignment $\sigma:\{1,\ldots,n\}\to\{1,\ldots,k\}$ for which the induced clustering of $A$ has the required value. The issue here is to find efficiently a conical simplicial partition $A_1,\ldots,A_k$ of $\mathbb{R}^{k-1}$ at which $C(B)$ is attained. Such a partition exists and may be assumed to be hardwired into the description of the algorithm. Alternatively, a partition that achieves $C(B)$ up to a desired degree of accuracy can be found by brute force for fixed $k$ (or $k=k(n)$ growing sufficiently slowly as a function of $n$); see [67].
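The vanishing of the two correction terms in the chain of equalities above is easy to check numerically: for a centered Gram matrix $A$, translating all the vectors $v_{\sigma(i)}$ by a fixed vector $w$ does not change the quadratic form. The sketch below (with randomly generated toy data, purely illustrative) verifies this.

```python
import numpy as np

rng = np.random.default_rng(0)

# Build a centered PSD matrix A = U U^T with the rows of U summing to zero.
n, k = 6, 3
U = rng.standard_normal((n, n))
U -= U.mean(axis=0)                # now sum_i u_i = 0, so A is centered
A = U @ U.T

V = rng.standard_normal((k, k))    # Gram vectors of some hypothesis matrix B
sigma = rng.integers(0, k, size=n)
w = rng.standard_normal(k)         # an arbitrary translation, e.g. the ball center w(B)

def form(shift):
    """sum_{i,j} a_ij <v_sigma(i) - shift, v_sigma(j) - shift>."""
    S = V[sigma] - shift
    return float(np.sum(A * (S @ S.T)))

# Translating by w changes nothing because the row sums of A vanish.
assert abs(form(np.zeros(k)) - form(w)) < 1e-8
```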
For large values of $k$ the problem of computing $C(B)$ efficiently remains open.

5. The $L_p$ Grothendieck problem

Fix $p\in[1,\infty]$ and consider the following algorithmic problem. The input is an $n\times n$ matrix $A=(a_{ij})$ whose diagonal entries vanish, and the goal is to compute (or estimate) in polynomial time the quantity
$$M_p(A)=\max_{\substack{t_1,\ldots,t_n\in\mathbb{R}\\ \sum_{k=1}^n|t_k|^p\le 1}}\sum_{i=1}^n\sum_{j=1}^n a_{ij}t_it_j=\max_{\substack{t_1,\ldots,t_n\in\mathbb{R}\\ \sum_{k=1}^n|t_k|^p=1}}\sum_{i=1}^n\sum_{j=1}^n a_{ij}t_it_j.\tag{49}$$
The second equality in (49) follows from a straightforward convexity argument since the diagonal entries of $A$ vanish. Some of the results described below hold true without the vanishing diagonal assumption, but we will tacitly make this assumption here since the second equality in (49) makes the problem become purely combinatorial when $p=\infty$. Specifically, if $G=(\{1,\ldots,n\},E)$ is the complete graph then $M_\infty(A)=\max_{\varepsilon_1,\ldots,\varepsilon_n\in\{-1,1\}}\sum_{\{i,j\}\in E}a_{ij}\varepsilon_i\varepsilon_j$. The results described in Section 3 therefore imply that there is a polynomial time algorithm that approximates $M_\infty(A)$ up to a $O(\log n)$ factor, and that it is computationally hard to achieve an approximation guarantee smaller than $(\log n)^\gamma$ for all $\gamma\in(0,1/6)$.

There are values of $p$ for which the above problem can be solved in polynomial time. When $p=2$ the quantity $M_2(A)$ is the largest eigenvalue of $A$, and hence can be computed in polynomial time [43, 82]. When $p=1$ it was shown in [2] that it is possible to approximate $M_1(A)$ up to a factor of $1+\varepsilon$ in time $n^{O(1/\varepsilon)}$. It is also shown in [2] that the problem of $(1+\varepsilon)$-approximately computing $M_1(A)$ is W[1]-complete; we refer to [35] for the definition of this type of hardness result and just say here that it indicates that a running time of $c(\varepsilon)n^{O(1)}$ is impossible. The algorithm of [2] proceeds by showing that for every $m\in\mathbb{N}$ there exist $y_1,\ldots,y_n\in\frac1m\mathbb{Z}$ with $\sum_{i=1}^n|y_i|\le 1$ and $\sum_{i=1}^n\sum_{j=1}^n a_{ij}y_iy_j\ge\left(1-\frac1m\right)M_1(A)$. The number of such vectors $y$ is $1+\sum_{k=1}^m\sum_{\ell=1}^k 2^\ell\binom{n}{\ell}\binom{k-1}{\ell-1}\le (4n)^m$. An exhaustive search over all such vectors will then approximate $M_1(A)$ to within a factor of $m/(m-1)$ in time $O(n^m)$.

To prove the existence of $y$, fix $t_1,\ldots,t_n\in\mathbb{R}$ with $\sum_{k=1}^n|t_k|=1$ and $\sum_{i=1}^n\sum_{j=1}^n a_{ij}t_it_j=M_1(A)$. Let $X\in\mathbb{R}^n$ be a random vector given by $\Pr[X=\mathrm{sign}(t_j)e_j]=|t_j|$ for every $j\in\{1,\ldots,n\}$. Here $e_1,\ldots,e_n$ is the standard basis of $\mathbb{R}^n$. Let $\{X_s=(X_{s1},\ldots,X_{sn})\}_{s=1}^m$ be independent copies of $X$ and set $Y=(Y_1,\ldots,Y_n)=\frac1m\sum_{s=1}^m X_s$. Note that if $s,t\in\{1,\ldots,m\}$ are distinct then for all $i,j\in\{1,\ldots,n\}$ we have $\mathbb{E}[X_{si}X_{tj}]=\mathrm{sign}(t_i)\,\mathrm{sign}(t_j)\,|t_i|\cdot|t_j|=t_it_j$. Also, for every $s\in\{1,\ldots,m\}$ and every distinct $i,j\in\{1,\ldots,n\}$ we have $X_{si}X_{sj}=0$. Since the diagonal entries of $A$ vanish, it follows that
$$\mathbb{E}\Big[\sum_{i=1}^n\sum_{j=1}^n a_{ij}Y_iY_j\Big]=\frac{1}{m^2}\sum_{\substack{s,t\in\{1,\ldots,m\}\\ s\ne t}}\ \sum_{\substack{i,j\in\{1,\ldots,n\}\\ i\ne j}}a_{ij}\,\mathbb{E}[X_{si}X_{tj}]=\Big(1-\frac1m\Big)M_1(A).\tag{50}$$
Noting that the vector $Y$ has $\ell_1$ norm at most $1$ and all of its entries are integer multiples of $1/m$, it follows from (50) that with positive probability $Y$ has the desired properties.

How can we interpolate between the above results for $p\in\{1,2,\infty\}$? It turns out that there is a satisfactory answer for $p\in(2,\infty)$, but the range $p\in(1,2)$ remains a mystery. To explain this, write $\gamma_p=(\mathbb{E}[|G|^p])^{1/p}$, where $G$ is a standard Gaussian random variable. One computes that
$$\gamma_p=\left(\frac{2^{p/2}\,\Gamma\left(\frac{p+1}{2}\right)}{\sqrt\pi}\right)^{1/p}.\tag{51}$$
Also, Stirling's formula implies that $\gamma_p^2=\frac{p}{e}+O(1)$ as $p\to\infty$.
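The exhaustive search over the grid $\frac1m\mathbb{Z}^n\cap B_{\ell_1^n}$ described above is easy to implement for small $n$ and $m$. The sketch below is illustrative toy code (it enumerates all grid points rather than only the sparse ones, so it does not attain the stated running time) demonstrating the $m/(m-1)$ guarantee on a $2\times 2$ example.

```python
import itertools
from fractions import Fraction

def m1_grid(A, m):
    """Approximate M_1(A) by exhaustive search over all y in (1/m)Z^n
    with ||y||_1 <= 1, in the spirit of the algorithm of [2]."""
    n = len(A)
    best = Fraction(0)  # y = 0 is always feasible and gives value 0
    for z in itertools.product(range(-m, m + 1), repeat=n):
        if sum(abs(c) for c in z) <= m:
            y = [Fraction(c, m) for c in z]
            val = sum(A[i][j] * y[i] * y[j] for i in range(n) for j in range(n))
            best = max(best, val)
    return best

# For A = [[0,1],[1,0]] (zero diagonal) one has M_1(A) = 1/2,
# attained at t = (1/2, 1/2); the grid with even m contains this point.
A = [[0, 1], [1, 0]]
print(m1_grid(A, 4))  # 1/2
```

With $m=3$ the grid misses the optimizer but still returns $4/9\ge\left(1-\frac13\right)\cdot\frac12$, as the guarantee promises.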
It follows from [92, 48] that for every fixed $p\in[2,\infty)$ there exists a polynomial time algorithm that approximates $M_p(A)$ to within a factor of $\gamma_p^2$, and that for every $\varepsilon\in(0,1)$ the existence of a polynomial time algorithm that approximates $M_p(A)$ to within a factor $\gamma_p^2-\varepsilon$ would imply that $P=NP$. These results improve over the earlier work [70], which designed a polynomial time algorithm for $M_p(A)$ whose approximation guarantee is $(1+o(1))\gamma_p^2$ as $p\to\infty$, and which proved a $\gamma_p^2-\varepsilon$ hardness result assuming the UGC rather than $P\ne NP$.

The following Grothendieck-type inequality was proved in [92] and independently in [48]. For every $n\times n$ matrix $A=(a_{ij})$ and every $p\in[2,\infty)$ we have
$$\max_{\substack{x_1,\ldots,x_n\in\mathbb{R}^n\\ \sum_{k=1}^n\|x_k\|_2^p\le 1}}\sum_{i=1}^n\sum_{j=1}^n a_{ij}\langle x_i,x_j\rangle\ \le\ \gamma_p^2\max_{\substack{t_1,\ldots,t_n\in\mathbb{R}\\ \sum_{k=1}^n|t_k|^p\le 1}}\sum_{i=1}^n\sum_{j=1}^n a_{ij}t_it_j.\tag{52}$$
The constant $\gamma_p^2$ in (52) is sharp. The validity of (52) implies that $M_p(A)$ can be computed in polynomial time to within a factor $\gamma_p^2$. This follows since the left hand side of (52) is the maximum of $\sum_{i=1}^n\sum_{j=1}^n a_{ij}X_{ij}$, which is a linear functional in the variables $(X_{ij})$, given the constraint that $(X_{ij})$ is a symmetric positive semidefinite matrix and $\sum_{i=1}^n X_{ii}^{p/2}\le 1$. The latter constraint is convex since $p\ge 2$, and therefore this problem falls into the framework of convex programming described in Section 1.2. Thus the left hand side of (52) can be computed in polynomial time with arbitrarily good precision. Choosing the specific value $p=3$ in order to illustrate the current satisfactory state of affairs concretely, the $NP$-hardness threshold of computing $\max_{\sum_{i=1}^n|x_i|^3\le 1}\sum_{i=1}^n\sum_{j=1}^n a_{ij}x_ix_j$ equals $2/\sqrt[3]{\pi}$.
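The closed form (51) is easy to evaluate numerically. The snippet below is a quick illustrative check, using only the standard library, confirming the normalization $\gamma_2=1$ and the value $\gamma_3^2=2/\sqrt[3]{\pi}$ quoted above.

```python
import math

def gamma_p(p):
    """gamma_p = (E|G|^p)^(1/p) for a standard Gaussian G, via eq. (51)."""
    return (2 ** (p / 2) * math.gamma((p + 1) / 2) / math.sqrt(math.pi)) ** (1 / p)

print(gamma_p(2))              # ~1.0: the second moment of a standard Gaussian
print(gamma_p(3) ** 2)         # ~1.3656
print(2 / math.pi ** (1 / 3))  # the same number: the p = 3 hardness threshold
```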
Such a sharp $NP$-hardness result (with a transcendental hardness ratio) is quite remarkable, since it shows that the geometric algorithm presented above probably yields the best possible approximation guarantee even when one allows any polynomial time algorithm whatsoever. Results of this type had been known to hold under the UGC, but this $NP$-hardness result of [48] seems to be the first time that such an algorithm for a simple-to-state problem was shown to be optimal assuming $P\ne NP$.

When $p\in[1,2]$ one can easily show [92] that
$$\max_{\substack{x_1,\ldots,x_n\in\mathbb{R}^n\\ \sum_{k=1}^n\|x_k\|_2^p\le 1}}\sum_{i=1}^n\sum_{j=1}^n a_{ij}\langle x_i,x_j\rangle\ =\ \max_{\substack{t_1,\ldots,t_n\in\mathbb{R}\\ \sum_{k=1}^n|t_k|^p\le 1}}\sum_{i=1}^n\sum_{j=1}^n a_{ij}t_it_j.\tag{53}$$
While the identity (53) seems to indicate that the problem of computing $M_p(A)$ in polynomial time might be easy for $p\in(1,2)$, the above argument fails since the constraint $\sum_{i=1}^n X_{ii}^{p/2}\le 1$ is no longer convex. This is reflected by the fact that, despite (53), the problem of $(1+\varepsilon)$-approximately computing $M_1(A)$ is W[1]-complete [2]. It remains open whether for $p\in(1,2)$ one can approximate $M_p(A)$ in polynomial time up to a factor $O(1)$, and no hardness of approximation result is known for this problem either.

Remark 5.1. If $p\in[2,\infty]$ then for positive semidefinite matrices $(a_{ij})$ the constant $\gamma_p^2$ on the right hand side of (52) can be improved [92] to $\gamma_{p^*}^{-2}$, where here and in what follows $p^*=p/(p-1)$. For $p=\infty$ this estimate coincides with the classical bound [45, 107] that we have already encountered in (43), and it is sharp in the entire range $p\in[2,\infty]$. Moreover, this bound shows that there exists a polynomial time algorithm that takes as input a positive semidefinite matrix $A$ and outputs a number that is guaranteed to be within a factor $\gamma_{p^*}^{-2}$ of $M_p(A)$.
Conversely, the existence of a polynomial time algorithm for this problem whose approximation guarantee is strictly smaller than $\gamma_{p^*}^{-2}$ would contradict the UGC [92].

Remark 5.2. The bilinear variant of (52) is an immediate consequence of the Grothendieck inequality (1). Specifically, assume that $p,q\in[1,\infty]$ and $x_1,\ldots,x_m,y_1,\ldots,y_n\in\mathbb{R}^{m+n}$ satisfy $\sum_{i=1}^m\|x_i\|_2^p\le 1$ and $\sum_{j=1}^n\|y_j\|_2^q\le 1$. Write $\alpha_i=\|x_i\|_2$ and $\beta_j=\|y_j\|_2$. For an $m\times n$ matrix $(a_{ij})$ the Grothendieck inequality provides $\varepsilon_1,\ldots,\varepsilon_m,\delta_1,\ldots,\delta_n\in\{-1,1\}$ such that $\sum_{i=1}^m\sum_{j=1}^n a_{ij}\langle x_i,y_j\rangle\le K_G\sum_{i=1}^m\sum_{j=1}^n a_{ij}\alpha_i\beta_j\varepsilon_i\delta_j$. This establishes the following inequality:
$$\max_{\substack{\{x_i\}_{i=1}^m,\{y_j\}_{j=1}^n\subseteq\mathbb{R}^{n+m}\\ \sum_{i=1}^m\|x_i\|_2^p\le 1\\ \sum_{j=1}^n\|y_j\|_2^q\le 1}}\sum_{i=1}^m\sum_{j=1}^n a_{ij}\langle x_i,y_j\rangle\ \le\ K_G\cdot\max_{\substack{\{s_i\}_{i=1}^m,\{t_j\}_{j=1}^n\subseteq\mathbb{R}\\ \sum_{i=1}^m|s_i|^p\le 1\\ \sum_{j=1}^n|t_j|^q\le 1}}\sum_{i=1}^m\sum_{j=1}^n a_{ij}s_it_j.\tag{54}$$
Observe that the maximum on the right hand side of (54) is $\|A\|_{p\to q^*}$: the operator norm of $A$ acting as a linear operator from $(\mathbb{R}^m,\|\cdot\|_p)$ to $(\mathbb{R}^n,\|\cdot\|_{q^*})$. Moreover, if $p,q\ge 2$ then the left hand side of (54) can be computed in polynomial time. Thus, for $p\ge 2\ge r\ge 1$, the generalized Grothendieck inequality (54) yields a polynomial time algorithm that takes as input an $m\times n$ matrix $A=(a_{ij})$ and outputs a number that is guaranteed to be within a factor $K_G$ of $\|A\|_{p\to r}$. This algorithmic task was previously studied in [96] (see also [93, Sec. 4.3.2]), where for $p\ge 2\ge r\ge 1$ a polynomial time algorithm was designed that approximates $\|A\|_{p\to r}$ up to a factor $3\pi/\big(6\sqrt3-2\pi\big)\in[2.293,2.294]$. The above argument yields the approximation factor $K_G<1.783$ as a formal consequence of the Grothendieck inequality.
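As a concrete (and purely heuristic) illustration of the quantity $\|A\|_{p\to r}$, the sketch below estimates it from below by random search over the unit $\ell_p$ ball. This is toy code for building intuition; it is not the semidefinite-programming algorithm discussed above.

```python
import numpy as np

def p_to_r_norm_lower(A, p, r, samples=20000, seed=0):
    """Heuristic lower bound on ||A||_{p->r} = max_{||x||_p <= 1} ||A x||_r,
    obtained by sampling random directions and normalizing in l_p."""
    rng = np.random.default_rng(seed)
    A = np.asarray(A, dtype=float)
    best = 0.0
    for _ in range(samples):
        x = rng.standard_normal(A.shape[1])
        x /= np.linalg.norm(x, ord=p)      # project onto the l_p sphere
        best = max(best, np.linalg.norm(A @ x, ord=r))
    return best

# Sanity check: for p = r = 2 the exact value is the largest singular value.
A = np.array([[2.0, 0.0], [0.0, 1.0]])
est = p_to_r_norm_lower(A, 2, 2)
exact = np.linalg.norm(A, 2)               # = 2
print(est <= exact + 1e-9, est > 1.99)     # True True
```

Random search degrades quickly in high dimension, which is one way to appreciate why the convex-programming upper bound provided by (54) is valuable.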
The complexity of the problem of approximating $\|A\|_{p\to r}$ has been studied in [17], where it is shown that if either $p\ge r>2$ or $2>p\ge r$ then it is $NP$-hard to approximate $\|A\|_{p\to r}$ up to any constant factor, and that, unless 3-colorability can be solved in time $2^{(\log n)^{O(1)}}$, for any $\varepsilon\in(0,1)$ no polynomial time algorithm can approximate $\|A\|_{p\to r}$ up to a factor $2^{(\log n)^{1-\varepsilon}}$.

Remark 5.3. Let $K\subseteq\mathbb{R}^n$ be a compact and convex set which is invariant under reflections with respect to the coordinate hyperplanes. Denote by $C_K$ the smallest $C\in(0,\infty)$ such that for every $n\times n$ matrix $(a_{ij})$ we have
$$\max_{\substack{x_1,\ldots,x_n\in\mathbb{R}^n\\ (\|x_1\|_2,\ldots,\|x_n\|_2)\in K}}\sum_{i=1}^n\sum_{j=1}^n a_{ij}\langle x_i,x_j\rangle\ \le\ C\max_{\substack{t_1,\ldots,t_n\in\mathbb{R}\\ (t_1,\ldots,t_n)\in K}}\sum_{i=1}^n\sum_{j=1}^n a_{ij}t_it_j.\tag{55}$$
Such generalized Grothendieck inequalities are investigated in [92], where bounds on $C_K$ are obtained under certain geometric assumptions on $K$. These assumptions are easy to verify when $K=\{x\in\mathbb{R}^n:\|x\|_p\le 1\}$, yielding (52). More subtle inequalities of this type for other convex bodies $K$ are discussed in [92], but we will not describe them here. The natural bilinear version of (55) is as follows: if $K\subseteq\mathbb{R}^m$ and $L\subseteq\mathbb{R}^n$ are compact and convex sets that are invariant under reflections with respect to the coordinate hyperplanes, then let $C_{K,L}$ denote the smallest constant $C\in(0,\infty)$ such that for every $m\times n$ matrix $(a_{ij})$ we have
$$\max_{\substack{\{x_i\}_{i=1}^m,\{y_j\}_{j=1}^n\subseteq\mathbb{R}^{n+m}\\ (\|x_1\|_2,\ldots,\|x_m\|_2)\in K\\ (\|y_1\|_2,\ldots,\|y_n\|_2)\in L}}\sum_{i=1}^m\sum_{j=1}^n a_{ij}\langle x_i,y_j\rangle\ \le\ C\max_{\substack{\{s_i\}_{i=1}^m,\{t_j\}_{j=1}^n\subseteq\mathbb{R}\\ (s_1,\ldots,s_m)\in K\\ (t_1,\ldots,t_n)\in L}}\sum_{i=1}^m\sum_{j=1}^n a_{ij}s_it_j.\tag{56}$$
The argument in Remark 5.2 shows that $C_{K,L}\le K_G$. Under certain geometric assumptions on $K,L$ this bound can be improved [92].

6. Higher rank Grothendieck inequalities

We have already seen several variants of the classical Grothendieck inequality (1), including the Grothendieck inequality for graphs (36), the variant of the positive semidefinite Grothendieck inequality arising from the Kernel Clustering problem (44), and Grothendieck inequalities for convex bodies other than the cube (52), (54), (55), (56). The literature contains additional variants of the Grothendieck inequality, some of which will be described in this section.

Let $G=(\{1,\ldots,n\},E)$ be a graph and fix $q,r\in\mathbb{N}$. Following [23], let $K(q\to r,G)$ be the smallest constant $K\in(0,\infty)$ such that for every $n\times n$ matrix $A=(a_{ij})$ we have
$$\max_{x_1,\ldots,x_n\in S^{q-1}}\sum_{\substack{i,j\in\{1,\ldots,n\}\\ \{i,j\}\in E}}a_{ij}\langle x_i,x_j\rangle\ \le\ K\max_{y_1,\ldots,y_n\in S^{r-1}}\sum_{\substack{i,j\in\{1,\ldots,n\}\\ \{i,j\}\in E}}a_{ij}\langle y_i,y_j\rangle.\tag{57}$$
Set also $K(r,G)=\sup_{q\in\mathbb{N}}K(q\to r,G)$. We similarly define $K_+(q\to r,G)$ to be the smallest constant $K\in(0,\infty)$ satisfying (57) for all positive semidefinite matrices $A$, and correspondingly $K_+(r,G)=\sup_{q\in\mathbb{N}}K_+(q\to r,G)$. To link these definitions to what we have already seen in this article, observe that $K_G$ is the supremum of $K(1,G)$ over all finite bipartite graphs $G$, and due to the results described in Section 4 we have
$$\sup_{n\in\mathbb{N}}K_+(r,K_n^\circ)=\sup_{n\in\mathbb{N}}\ \sup_{x_1,\ldots,x_n\in S^{r-1}}\frac{1}{C\big((\langle x_i,x_j\rangle)_{i,j=1}^n\big)},\tag{58}$$
where $K_n^\circ$ is the complete graph on $n$ vertices with self loops. Recall that the definition of $C(B)$ for a positive semidefinite matrix $B$ is given in the paragraph following (44).

The most important special case of (57) is when $r=2$, since the supremum of $K(2,G)$ over all finite bipartite graphs $G$, denoted $K_G^{\mathbb{C}}$, is the complex Grothendieck constant, a fundamental quantity whose value has been investigated in [45, 83, 99, 50, 74]. The best known bounds on $K_G^{\mathbb{C}}$ are $1.338<K_G^{\mathbb{C}}<1.4049$; see [101, Sec. 4] for more information on this topic. We also refer to [32, 113] for information on the constants $K(2q\to 2,G)$ where $G$ is a bipartite graph. The supremum of $K(q\to r,G)$ over all bipartite graphs $G$ was investigated in [78] for $r=1$ and in [74] for $r=2$; see also [75] for a unified treatment of these cases. The higher rank constants $K(q\to r,G)$ for bipartite $G$ were introduced in [22]. Definition (57) in full generality is due to [23], where several estimates on $K(q\to r,G)$ are given. One of the motivations of [23] is the case $r=3$ (and $G$ a subgraph of the grid $\mathbb{Z}^3$), based on the connection to the polynomial time approximation of ground states of spin glasses described in Section 3.1.1; the case $r=1$ was discussed in Section 3.1.1 in connection with the Ising model, but the case $r=3$ corresponds to the more physically realistic Heisenberg model of vector-valued spins. The parameter $\sup_{n\in\mathbb{N}}K_+(r,K_n^\circ)$ (recall (58)) was studied in [22] in the context of quantum information theory, and in [24] it was shown that
$$K_+(1,K_n^\circ)\le \frac{\pi}{n}\left(\frac{\Gamma((n+1)/2)}{\Gamma(n/2)}\right)^2=\frac{\pi}{2}-\frac{\pi}{4n}+O\Big(\frac{1}{n^2}\Big),\tag{59}$$
and
$$\sup_{n\in\mathbb{N}}K_+(r,K_n^\circ)=\frac{r}{2}\left(\frac{\Gamma(r/2)}{\Gamma((r+1)/2)}\right)^2=1+\frac{1}{2r}+O\Big(\frac{1}{r^2}\Big).$$
We refer to [24] for a corresponding UGC hardness result. Note that (59) improves over (43) for fixed $n\in\mathbb{N}$.

7. Hardness of approximation

We have seen examples of how Grothendieck-type inequalities yield upper bounds on the best possible polynomial time approximation ratio of certain optimization problems. From the algorithmic and computational complexity viewpoint it is interesting to prove computational lower bounds as well, i.e., results that rule out the existence of efficient algorithms achieving a certain approximation guarantee.
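The Gamma-function bound in (59) and its $\frac{\pi}{2}-\frac{\pi}{4n}$ expansion can be checked numerically with the standard library (an illustrative verification only):

```python
import math

def rank_one_bound(n):
    """The upper bound on K_+(1, K_n with self loops) from eq. (59)."""
    return (math.pi / n) * (math.gamma((n + 1) / 2) / math.gamma(n / 2)) ** 2

for n in (2, 10, 100):
    approx = math.pi / 2 - math.pi / (4 * n)   # first-order expansion in (59)
    print(n, rank_one_bound(n), approx)

# The bound increases toward its limit pi/2 as n grows.
```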
Such results are known as hardness, or inapproximability, results, and as explained in Section 1.1, at present the state of the art allows one to prove such results only by relying on complexity theoretic assumptions such as $P\ne NP$ or the Unique Games Conjecture. A nice feature of the known hardness results for problems to which a Grothendieck-type inequality has been applied is that the hardness results (lower bounds) often exactly match the approximation ratios (upper bounds). In this section we briefly review the known hardness results for optimization problems associated with Grothendieck-type inequalities.

Let $K_{n,n}$-QP denote the optimization problem associated with the classical Grothendieck inequality (the acronym QP stands for “quadratic programming”). Thus, in the problem $K_{n,n}$-QP we are given an $n\times n$ real matrix $(a_{ij})$ and the goal is to determine the quantity
$$\max\Big\{\sum_{i=1}^n\sum_{j=1}^n a_{ij}\varepsilon_i\delta_j:\ \{\varepsilon_i\}_{i=1}^n,\{\delta_j\}_{j=1}^n\subseteq\{-1,1\}\Big\}.$$
As explained in [8], the MAX DICUT problem can be framed as a special case of the problem $K_{n,n}$-QP. Hence, as a consequence of [51], we know that for every $\varepsilon\in(0,1)$, assuming $P\ne NP$ there is no polynomial time algorithm that approximates the $K_{n,n}$-QP problem within ratio $\frac{13}{12}-\varepsilon$. In [68] it is shown that the lower bound (3) on the Grothendieck constant can be translated into a hardness result, albeit one relying on the Unique Games Conjecture. Namely, letting $\eta_0$ be as in (3), for every $\varepsilon\in(0,1)$, assuming the UGC there is no polynomial time algorithm that approximates the $K_{n,n}$-QP problem within a ratio $\frac{\pi}{2}e^{\eta_0^2}-\varepsilon$. We note that all the hardness results cited here rely on the well-known paradigm of dictatorship testing.
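For intuition, the bilinear maximum defining $K_{n,n}$-QP can be simplified: for any fixed $\varepsilon$ the optimal choice is $\delta_j=\mathrm{sign}\big(\sum_i a_{ij}\varepsilon_i\big)$, so the quantity equals $\max_{\varepsilon\in\{-1,1\}^n}\sum_{j=1}^n\big|\sum_{i=1}^n a_{ij}\varepsilon_i\big|$. The toy sketch below (illustrative only; the problem is hard in general) uses this observation.

```python
import itertools

def knn_qp(A):
    """Exact value of the K_{n,n}-QP objective for a small n x n matrix A,
    enumerating eps in {-1,1}^n and choosing each delta_j greedily."""
    n = len(A)
    best = float("-inf")
    for eps in itertools.product((-1, 1), repeat=n):
        # For fixed eps, the optimal delta_j is the sign of the j-th column sum.
        val = sum(abs(sum(A[i][j] * eps[i] for i in range(n))) for j in range(n))
        best = max(best, val)
    return best

print(knn_qp([[1, -1], [-1, 1]]))  # 4, attained at eps = (1, -1), delta = (1, -1)
```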
A lower bound on the integrality gap of a semidefinite program, such as the estimate $K_G\ge\frac{\pi}{2}e^{\eta_0^2}$, can be translated into a probabilistic test which checks whether a function $f:\{-1,1\}^n\to\{-1,1\}$ is a dictatorship, i.e., of the form $f(x)=x_i$ for some fixed $i\in\{1,\ldots,n\}$. If $f$ is indeed a dictatorship then the test passes with probability $c$, and if $f$ is “far from a dictator” (in a formal sense that we do not describe here) then the test passes with probability at most $s$. The ratio $c/s$ corresponds exactly to the UGC-based hardness lower bound. It is well known how to prove a UGC-based hardness result once one has the appropriate dictatorship test; see the survey [63].

The above quoted result of [68] relied on explicitly knowing the lower bound construction [105] leading to the estimate $K_G\ge\frac{\pi}{2}e^{\eta_0^2}$. On the other hand, in [104], building on the earlier work [103], it is shown that any lower bound on the Grothendieck constant can be translated into a UGC-based hardness result, even without explicitly knowing the construction! Thus, modulo the UGC, the best polynomial time algorithm to approximate the $K_{n,n}$-QP problem is via the Grothendieck inequality, even though we do not know the precise value of $K_G$. Formally, for every $\varepsilon\in(0,1)$, assuming the UGC there is no polynomial time algorithm that approximates the $K_{n,n}$-QP problem within a factor $K_G-\varepsilon$.

Let $K_{n,n}$-QP$_{\mathrm{PSD}}$ be the special case of the $K_{n,n}$-QP problem in which the input matrix $(a_{ij})$ is assumed to be positive semidefinite. By considering matrices that are Laplacians of graphs, one sees that the MAX CUT problem is a special case of the problem $K_{n,n}$-QP$_{\mathrm{PSD}}$ (see [66]). Hence, due to [51], we know that for every $\varepsilon\in(0,1)$, assuming $P\ne NP$ there is no polynomial time algorithm that approximates the $K_{n,n}$-QP$_{\mathrm{PSD}}$ problem within ratio $\frac{17}{16}-\varepsilon$.
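The reduction from MAX CUT mentioned above is concrete enough to state in a few lines: if $L$ is the Laplacian of a graph, then $\varepsilon^{\mathsf T}L\varepsilon=\sum_{\{i,j\}\in E}(\varepsilon_i-\varepsilon_j)^2=4\cdot\mathrm{cut}(\varepsilon)$, so maximizing the quadratic form of the positive semidefinite matrix $L$ over signs recovers MAX CUT. A toy check (illustrative code, not taken from [66]):

```python
import itertools

def laplacian(n, edges):
    """Graph Laplacian L = D - A; always positive semidefinite."""
    L = [[0] * n for _ in range(n)]
    for i, j in edges:
        L[i][i] += 1
        L[j][j] += 1
        L[i][j] -= 1
        L[j][i] -= 1
    return L

def max_quadratic_form(L):
    """max_{eps in {-1,1}^n} eps^T L eps, by brute force."""
    n = len(L)
    return max(
        sum(L[i][j] * e[i] * e[j] for i in range(n) for j in range(n))
        for e in itertools.product((-1, 1), repeat=n)
    )

# Triangle graph: MAX CUT = 2, so the maximum of the quadratic form is 4 * 2 = 8.
triangle = [(0, 1), (1, 2), (0, 2)]
print(max_quadratic_form(laplacian(3, triangle)))  # 8
```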
Moreover, it is proved in [66] that for every $\varepsilon\in(0,1)$, assuming the UGC there is no polynomial time algorithm that approximates the $K_{n,n}$-QP$_{\mathrm{PSD}}$ problem within ratio $\frac{\pi}{2}-\varepsilon$, an optimal hardness result due to the positive semidefinite Grothendieck inequality (43). This follows from the more general results for the Kernel Clustering problem described later.

Let $(a_{ij})$ be an $n\times n$ real matrix with zeros on the diagonal. The $K_n$-QP problem seeks to determine the quantity
$$\max\Big\{\sum_{i=1}^n\sum_{j=1}^n a_{ij}\varepsilon_i\varepsilon_j:\ \{\varepsilon_i\}_{i=1}^n\subseteq\{-1,1\}\Big\}.$$
In [69] it is proved that for every $\gamma\in(0,1/6)$, assuming that $NP$ does not have a $2^{(\log n)^{O(1)}}$ time deterministic algorithm, there is no polynomial time algorithm that approximates the $K_n$-QP problem within ratio $(\log n)^\gamma$. This improves over [12], where a hardness factor of $(\log n)^c$ was proved, under the same complexity assumption, for an unspecified universal constant $c>0$. Recall that, as explained in Section 3, there is an algorithm for $K_n$-QP that achieves a ratio of $O(\log n)$, so there remains an asymptotic gap in our understanding of the complexity of the $K_n$-QP problem.

For the maximum acyclic subgraph problem, discussed in Section 2.1.3, the gap between the upper and lower bounds is even larger. We have already seen that an approximation factor of $O(\log n)$ is achievable, but from the hardness perspective we know, due to [97], that there exists $\varepsilon_0>0$ such that, assuming $P\ne NP$, there is no polynomial time algorithm for the maximum acyclic subgraph problem that achieves an approximation ratio less than $1+\varepsilon_0$. In [47] it was shown that, assuming the UGC, there is no polynomial time algorithm for the maximum acyclic subgraph problem that achieves any constant approximation ratio.

Fix $p\in(0,\infty)$. As discussed in Section 5, the $L_p$ Grothendieck problem is as follows.
Given an $n\times n$ real matrix $A=(a_{ij})$ with zeros on the diagonal, the goal is to determine the quantity $M_p(A)$ defined in (49). For $p\in(2,\infty)$ it was shown in [48] that for every $\varepsilon\in(0,1)$, assuming $P\ne NP$ there is no polynomial time algorithm that approximates the $L_p$ Grothendieck problem within a ratio $\gamma_p^2-\varepsilon$. Here $\gamma_p$ is defined as in (51). This result (nontrivially) builds on the previous result of [70], which obtained the same conclusion while assuming the UGC rather than $P\ne NP$.

For the Kernel Clustering problem with a $k\times k$ hypothesis matrix $B$, an optimal hardness result is obtained in [67] in terms of the parameters $R(B)$ and $C(B)$ described in Section 4. Specifically, for a fixed $k\times k$ symmetric positive semidefinite matrix $B$ and for every $\varepsilon\in(0,1)$, assuming the UGC there is no polynomial time algorithm that, given an $n\times n$ matrix $A$, approximates the quantity $\mathrm{Clust}(A|B)$ within ratio $\frac{R(B)^2}{C(B)}-\varepsilon$. When $B=I_k$ is the $k\times k$ identity matrix, the following hardness results are obtained in [66]. Let $\varepsilon>0$ be an arbitrarily small constant. Assuming the UGC, there is no polynomial time algorithm that approximates $\mathrm{Clust}(A|I_2)$ within ratio $\frac{\pi}{2}-\varepsilon$. Similarly, assuming the UGC there is no polynomial time algorithm that approximates $\mathrm{Clust}(A|I_3)$ within ratio $\frac{16\pi}{27}-\varepsilon$, and, using also the solution of the propeller conjecture in $\mathbb{R}^3$ given in [53], there is no polynomial time algorithm that approximates $\mathrm{Clust}(A|I_4)$ within ratio $\frac{2\pi}{3}-\varepsilon$. Furthermore, for $k\ge 5$, assuming the propeller conjecture and the UGC, there is no polynomial time algorithm that approximates $\mathrm{Clust}(A|I_k)$ within ratio $\frac{8\pi}{9}\left(1-\frac1k\right)-\varepsilon$.

Acknowledgements. We are grateful to Oded Regev for many helpful suggestions.

References

[1] A. Acín, N. Gisin, and B. Toner. Grothendieck's constant and local models for noisy entangled quantum states. Phys. Rev. A (3), 73(6, part A):062105, 5, 2006.
[2] N. Alon. Maximizing a quadratic form on the $\ell_1^n$ unit ball. Unpublished manuscript, 2006.
[3] N. Alon and E. Berger. The Grothendieck constant of random and pseudo-random graphs. Discrete Optim., 5(2):323–327, 2008.
[4] N. Alon, A. Coja-Oghlan, H. Hàn, M. Kang, V. Rödl, and M. Schacht. Quasi-randomness and algorithmic regularity for graphs with general degree distributions. SIAM J. Comput., 39(6):2336–2362, 2010.
[5] N. Alon, R. A. Duke, H. Lefmann, V. Rödl, and R. Yuster. The algorithmic aspects of the regularity lemma. J. Algorithms, 16(1):80–109, 1994.
[6] N. Alon, W. Fernandez de la Vega, R. Kannan, and M. Karpinski. Random sampling and approximation of MAX-CSPs. J. Comput. System Sci., 67(2):212–243, 2003. Special issue on STOC2002 (Montreal, QC).
[7] N. Alon, K. Makarychev, Y. Makarychev, and A. Naor. Quadratic forms on graphs. Invent. Math., 163(3):499–522, 2006.
[8] N. Alon and A. Naor. Approximating the cut-norm via Grothendieck's inequality. SIAM J. Comput., 35(4):787–803 (electronic), 2006.
[9] N. Alon and A. Orlitsky. Repeated communication and Ramsey graphs. IEEE Trans. Inform. Theory, 41(5):1276–1289, 1995.
[10] N. Alon and J. H. Spencer. The probabilistic method. Wiley-Interscience Series in Discrete Mathematics and Optimization. Wiley-Interscience [John Wiley & Sons], New York, second edition, 2000. With an appendix on the life and work of Paul Erdős.
[11] S. Arora and B. Barak. Computational complexity. Cambridge University Press, Cambridge, 2009. A modern approach.
[12] S. Arora, E. Berger, G. Kindler, E. Hazan, and S. Safra. On non-approximability for quadratic programs. In 46th Annual Symposium on Foundations of Computer Science, pages 206–215. IEEE Computer Society, 2005.
[13] C. P. Bachas. Computer-intractability of the frustration model of a spin glass. J. Phys. A, 17(13):L709–L712, 1984.
[14] N. Bansal, A. Blum, and S. Chawla. Correlation clustering.
In 43rd Annual IEEE Symposium on Foundations of Computer Science, pages 238–247, 2002.
[15] N. Bansal and R. Williams. Regularity lemmas and combinatorial algorithms. In 2009 50th Annual IEEE Symposium on Foundations of Computer Science (FOCS 2009), pages 745–754. IEEE Computer Soc., Los Alamitos, CA, 2009.
[16] F. Barahona. On the computational complexity of Ising spin glass models. J. Phys. A, 15(10):3241–3253, 1982.
[17] A. Bhaskara and A. Vijayaraghavan. Approximating matrix $p$-norms. In Proceedings of the Twenty-Second Annual ACM-SIAM Symposium on Discrete Algorithms, pages 497–511, 2011.
[18] I. Bieche, R. Maynard, R. Rammal, and J.-P. Uhry. On the ground states of the frustration model of a spin glass by a matching method of graph theory. J. Phys. A, 13(8):2553–2576, 1980.
[19] R. Blei. Analysis in integer and fractional dimensions, volume 71 of Cambridge Studies in Advanced Mathematics. Cambridge University Press, Cambridge, 2001.
[20] R. C. Blei. Multidimensional extensions of the Grothendieck inequality and applications. Ark. Mat., 17(1):51–68, 1979.
[21] M. Braverman, K. Makarychev, Y. Makarychev, and A. Naor. The Grothendieck constant is strictly smaller than Krivine's bound. An extended abstract will appear in 52nd Annual IEEE Symposium on Foundations of Computer Science. Preprint, 2011.
[22] J. Briët, H. Buhrman, and B. Toner. A generalized Grothendieck inequality and entanglement in XOR games. Preprint, 2009.
[23] J. Briët, F. M. de Oliveira Filho, and F. Vallentin. Grothendieck inequalities for semidefinite programs with rank constraint. Preprint, 2010.
[24] J. Briët, F. M. de Oliveira Filho, and F. Vallentin. The positive semidefinite Grothendieck problem with rank constraint. In Automata, Languages and Programming, 37th International Colloquium, Part I, pages 31–42, 2010.
[25] M. Charikar, V. Guruswami, and A. Wirth.
Clustering with qualitative information. J. Comput. System Sci., 71(3):360–383, 2005.
[26] M. Charikar, K. Makarychev, and Y. Makarychev. On the advantage over random for maximum acyclic subgraph. In 48th Annual IEEE Symposium on Foundations of Computer Science, pages 625–633, 2007.
[27] M. Charikar and A. Wirth. Maximizing quadratic programs: extending Grothendieck's inequality. In 45th Annual Symposium on Foundations of Computer Science, pages 54–60. IEEE Computer Society, 2004.
[28] R. Cleve, P. Høyer, B. Toner, and J. Watrous. Consequences and limits of nonlocal strategies. In 19th Annual IEEE Conference on Computational Complexity, pages 236–249, 2004.
[29] A. Coja-Oghlan, C. Cooper, and A. Frieze. An efficient sparse regularity concept. SIAM J. Discrete Math., 23(4):2000–2034, 2009/10.
[30] D. Conlon and J. Fox. Bounds for graph regularity and removal lemmas. Preprint available at http://arxiv.org/abs/1107.4829, 2011.
[31] S. Cook. The P versus NP problem. In The millennium prize problems, pages 87–104. Clay Math. Inst., Cambridge, MA, 2006.
[32] A. M. Davie. Matrix norms related to Grothendieck's inequality. In Banach spaces (Columbia, Mo., 1984), volume 1166 of Lecture Notes in Math., pages 22–26. Springer, Berlin, 1985.
[33] J. Diestel, J. H. Fourie, and J. Swart. The metric theory of tensor products. American Mathematical Society, Providence, RI, 2008. Grothendieck's résumé revisited.
[34] J. Diestel, H. Jarchow, and A. Tonge. Absolutely summing operators, volume 43 of Cambridge Studies in Advanced Mathematics. Cambridge University Press, Cambridge, 1995.
[35] R. G. Downey and M. R. Fellows. Parameterized complexity. Monographs in Computer Science. Springer-Verlag, New York, 1999.
[36] P. Erdős and A. Rényi. On the evolution of random graphs. Magyar Tud. Akad. Mat. Kutató Int. Közl., 5:17–61, 1960.
[37] P. C. Fishburn and J. A. Reeds.
Bell inequalities, Grothendieck's constant, and root two. SIAM J. Discrete Math., 7(1):48–56, 1994.
[38] A. Frieze and R. Kannan. Quick approximation to matrices and applications. Combinatorica, 19(2):175–220, 1999.
[39] M. R. Garey and D. S. Johnson. Computers and intractability. W. H. Freeman and Co., San Francisco, Calif., 1979. A guide to the theory of NP-completeness, A Series of Books in the Mathematical Sciences.
[40] D. J. H. Garling. Inequalities: a journey into linear analysis. Cambridge University Press, Cambridge, 2007.
[41] S. Gerke and A. Steger. A characterization for sparse ε-regular pairs. Electron. J. Combin., 14(1):Research Paper 4, 12 pp. (electronic), 2007.
[42] M. X. Goemans and D. P. Williamson. Improved approximation algorithms for maximum cut and satisfiability problems using semidefinite programming. J. Assoc. Comput. Mach., 42(6):1115–1145, 1995.
[43] G. H. Golub and C. F. Van Loan. Matrix computations. Johns Hopkins Studies in the Mathematical Sciences. Johns Hopkins University Press, Baltimore, MD, third edition, 1996.
[44] W. T. Gowers. Lower bounds of tower type for Szemerédi's uniformity lemma. Geom. Funct. Anal., 7(2):322–337, 1997.
[45] A. Grothendieck. Résumé de la théorie métrique des produits tensoriels topologiques. Bol. Soc. Mat. São Paulo, 8:1–79, 1953.
[46] M. Grötschel, L. Lovász, and A. Schrijver. Geometric algorithms and combinatorial optimization, volume 2 of Algorithms and Combinatorics. Springer-Verlag, Berlin, second edition, 1993.
[47] V. Guruswami, R. Manokaran, and P. Raghavendra. Beating the random ordering is hard: Inapproximability of maximum acyclic subgraph. In 49th Annual IEEE Symposium on Foundations of Computer Science, pages 573–582, 2008.
[48] V. Guruswami, P. Raghavendra, R. Saket, and Y. Wu. Bypassing UGC from some optimal geometric inapproximability results.
Electronic Colloquium on Computational Complexity (ECCC), 17:177, 2010.
[49] U. Haagerup. The Grothendieck inequality for bilinear forms on C*-algebras. Adv. in Math., 56(2):93–116, 1985.
[50] U. Haagerup. A new upper bound for the complex Grothendieck constant. Israel J. Math., 60(2):199–224, 1987.
[51] J. Håstad. Some optimal inapproximability results. J. ACM, 48(4):798–859 (electronic), 2001.
[52] J. Håstad and S. Venkatesh. On the advantage over a random assignment. Random Structures Algorithms, 25(2):117–149, 2004.
[53] S. Heilman, A. Jagannath, and A. Naor. Solution of the propeller conjecture in R^3. Manuscript, 2011.
[54] H. Heydari. Quantum correlation and Grothendieck's constant. J. Phys. A, 39(38):11869–11875, 2006.
[55] G. J. O. Jameson. Summing and nuclear norms in Banach space theory, volume 8 of London Mathematical Society Student Texts. Cambridge University Press, Cambridge, 1987.
[56] W. B. Johnson and J. Lindenstrauss. Basic concepts in the geometry of Banach spaces. In Handbook of the geometry of Banach spaces, Vol. I, pages 1–84. North-Holland, Amsterdam, 2001.
[57] D. B. Judin and A. S. Nemirovskiĭ. Informational complexity and effective methods for the solution of convex extremal problems. Èkonom. i Mat. Metody, 12(2):357–369, 1976.
[58] F. Juhász. The asymptotic behaviour of Lovász' θ function for random graphs. Combinatorica, 2(2):153–155, 1982.
[59] D. Karger, R. Motwani, and M. Sudan. Approximate graph coloring by semidefinite programming. J. ACM, 45(2):246–265, 1998.
[60] B. S. Kashin and S. Ĭ. Sharek. On the Gram matrices of systems of uniformly bounded functions. Tr. Mat. Inst. Steklova, 243(Funkts. Prostran., Priblizh., Differ. Uravn.):237–243, 2003.
[61] J. Kempe, H. Kobayashi, K. Matsumoto, B. Toner, and T. Vidick. Entangled games are hard to approximate.
In 49th Annual IEEE Symposium on Foundations of Computer Science, pages 447–456, 2008.
[62] S. Khot. On the power of unique 2-prover 1-round games. In Proceedings of the Thirty-Fourth Annual ACM Symposium on Theory of Computing, pages 767–775 (electronic), New York, 2002. ACM.
[63] S. Khot. On the unique games conjecture (invited survey). In Proceedings of the 25th Annual IEEE Conference on Computational Complexity, pages 99–121, 2010.
[64] S. Khot, G. Kindler, E. Mossel, and R. O'Donnell. Optimal inapproximability results for MAX-CUT and other 2-variable CSPs? SIAM J. Comput., 37(1):319–357 (electronic), 2007.
[65] S. Khot and A. Naor. Linear equations modulo 2 and the L_1 diameter of convex bodies. SIAM J. Comput., 38(4):1448–1463, 2008.
[66] S. Khot and A. Naor. Approximate kernel clustering. Mathematika, 55(1-2):129–165, 2009.
[67] S. Khot and A. Naor. Sharp kernel clustering algorithms and their associated Grothendieck inequalities. In Proceedings of the Twenty-First Annual ACM-SIAM Symposium on Discrete Algorithms, pages 664–683, 2010. Full version to appear in Random Structures and Algorithms.
[68] S. Khot and R. O'Donnell. SDP gaps and UGC-hardness for max-cut-gain. Theory Comput., 5:83–117, 2009.
[69] S. Khot and S. Safra. A two prover one round game with strong soundness. To appear in 52nd Annual IEEE Symposium on Foundations of Computer Science, 2011.
[70] G. Kindler, A. Naor, and G. Schechtman. The UGC hardness threshold of the L_p Grothendieck problem. Math. Oper. Res., 35(2):267–283, 2010.
[71] Y. Kohayakawa. Szemerédi's regularity lemma for sparse graphs. In Foundations of computational mathematics (Rio de Janeiro, 1997), pages 216–230. Springer, Berlin, 1997.
[72] Y. Kohayakawa and V. Rödl. Szemerédi's regularity lemma and quasi-randomness. In Recent advances in algorithms and combinatorics, volume 11 of CMS Books Math./Ouvrages Math. SMC, pages 289–351.
Springer, New York, 2003.
[73] Y. Kohayakawa, V. Rödl, and L. Thoma. An optimal algorithm for checking regularity. SIAM J. Comput., 32(5):1210–1235 (electronic), 2003.
[74] H. König. On the complex Grothendieck constant in the n-dimensional case. In Geometry of Banach spaces (Strobl, 1989), volume 158 of London Math. Soc. Lecture Note Ser., pages 181–198. Cambridge Univ. Press, Cambridge, 1990.
[75] H. König. Some remarks on the Grothendieck inequality. In General inequalities, 6 (Oberwolfach, 1990), volume 103 of Internat. Ser. Numer. Math., pages 201–206. Birkhäuser, Basel, 1992.
[76] H. König. On an extremal problem originating in questions of unconditional convergence. In Recent progress in multivariate approximation (Witten-Bommerholz, 2000), volume 137 of Internat. Ser. Numer. Math., pages 185–192. Birkhäuser, Basel, 2001.
[77] J.-L. Krivine. Sur la constante de Grothendieck. C. R. Acad. Sci. Paris Sér. A-B, 284(8):A445–A446, 1977.
[78] J.-L. Krivine. Constantes de Grothendieck et fonctions de type positif sur les sphères. Adv. in Math., 31(1):16–30, 1979.
[79] M. Laurent and A. Varvitsiotis. Computing the Grothendieck constant of some graph classes. Preprint, 2011.
[80] T. Lee, G. Schechtman, and A. Shraibman. Lower bounds on quantum multiparty communication complexity. In Proceedings of the 24th Annual IEEE Conference on Computational Complexity, pages 254–262, 2009.
[81] T. Lee and A. Shraibman. Lower bounds in communication complexity. Found. Trends Theor. Comput. Sci., 3(4):front matter, 263–399 (2009), 2007.
[82] A. S. Lewis and M. L. Overton. Eigenvalue optimization. In Acta numerica, 1996, volume 5 of Acta Numer., pages 149–190. Cambridge Univ. Press, Cambridge, 1996.
[83] J. Lindenstrauss and A. Pełczyński. Absolutely summing operators in L_p-spaces and their applications. Studia Math., 29:275–326, 1968.
[84] N. Linial, S.
Mendelson, G. Schechtman, and A. Shraibman. Complexity measures of sign matrices. Combinatorica, 27(4):439–463, 2007.
[85] N. Linial and A. Shraibman. Learning complexity vs. communication complexity. Combin. Probab. Comput., 18(1-2):227–245, 2009.
[86] N. Linial and A. Shraibman. Lower bounds in communication complexity based on factorization norms. Random Structures Algorithms, 34(3):368–394, 2009.
[87] L. Lovász. On the Shannon capacity of a graph. IEEE Trans. Inform. Theory, 25(1):1–7, 1979.
[88] L. Lovász and M. D. Plummer. Matching theory, volume 121 of North-Holland Mathematics Studies. North-Holland Publishing Co., Amsterdam, 1986. Annals of Discrete Mathematics, 29.
[89] L. Lovász and B. Szegedy. Szemerédi's lemma for the analyst. Geom. Funct. Anal., 17(1):252–270, 2007.
[90] D. W. Matula. On the complete subgraphs of a random graph. In Proc. Second Chapel Hill Conf. on Combinatorial Mathematics and its Applications (Univ. North Carolina, Chapel Hill, N.C., 1970), pages 356–369. Univ. North Carolina, Chapel Hill, N.C., 1970.
[91] A. Megretski. Relaxations of quadratic programs in operator theory and system analysis. In Systems, approximation, singular integral operators, and related topics (Bordeaux, 2000), volume 129 of Oper. Theory Adv. Appl., pages 365–392. Birkhäuser, Basel, 2001.
[92] A. Naor and G. Schechtman. An approximation scheme for quadratic form maximization on convex bodies. Manuscript, 2009.
[93] A. Nemirovski. Advances in convex optimization: conic programming. In International Congress of Mathematicians. Vol. I, pages 413–444. Eur. Math. Soc., Zürich, 2007.
[94] A. Nemirovski, C. Roos, and T. Terlaky. On maximization of quadratic form over intersection of ellipsoids with common center. Math. Program., 86(3, Ser. A):463–473, 1999.
[95] Y. Nesterov. Semidefinite relaxation and nonconvex quadratic optimization. Optim. Methods Softw.
, 9(1-3):141–160, 1998.
[96] Y. Nesterov, H. Wolkowicz, and Y. Ye. Semidefinite programming relaxations of nonconvex quadratic optimization. In Handbook of semidefinite programming, volume 27 of Internat. Ser. Oper. Res. Management Sci., pages 361–419. Kluwer Acad. Publ., Boston, MA, 2000.
[97] C. H. Papadimitriou and M. Yannakakis. Optimization, approximation, and complexity classes. J. Comput. System Sci., 43(3):425–440, 1991.
[98] D. Pérez-García, M. M. Wolf, C. Palazuelos, I. Villanueva, and M. Junge. Unbounded violation of tripartite Bell inequalities. Comm. Math. Phys., 279(2):455–486, 2008.
[99] G. Pisier. Grothendieck's theorem for noncommutative C*-algebras, with an appendix on Grothendieck's constants. J. Funct. Anal., 29(3):397–415, 1978.
[100] G. Pisier. Factorization of linear operators and geometry of Banach spaces, volume 60 of CBMS Regional Conference Series in Mathematics. Published for the Conference Board of the Mathematical Sciences, Washington, DC, 1986.
[101] G. Pisier. Grothendieck's theorem, past and present. Preprint available at http://arxiv.org/abs/1101.4195, 2011.
[102] I. Pitowsky. New Bell inequalities for the singlet state: going beyond the Grothendieck bound. J. Math. Phys., 49(1):012101, 11, 2008.
[103] P. Raghavendra. Optimal algorithms and inapproximability results for every CSP? In Proceedings of the 40th Annual ACM Symposium on Theory of Computing, pages 245–254, 2008.
[104] P. Raghavendra and D. Steurer. Towards computing the Grothendieck constant. In Proceedings of the Twentieth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 525–534, 2009.
[105] J. A. Reeds. A new lower bound on the real Grothendieck constant. Unpublished manuscript, available at http://www.dtc.umn.edu/reedsj/bound2.dvi, 1991.
[106] O. Regev and B. Toner. Simulating quantum correlations with finite communication. SIAM J. Comput., 39(4):1562–1580, 2009/10.
[107] R. E. Rietz.
A proof of the Grothendieck inequality. Israel J. Math., 19:271–276, 1974.
[108] M. Sipser. Introduction to the theory of computation. PWS Publishing Company, 1997.
[109] R. R. Smith. Completely bounded multilinear maps and Grothendieck's inequality. Bull. London Math. Soc., 20(6):606–612, 1988.
[110] L. Song, A. Smola, A. Gretton, and K. A. Borgwardt. A dependence maximization view of clustering. In Proceedings of the 24th international conference on Machine learning, pages 815–822, 2007.
[111] E. Szemerédi. Regular partitions of graphs. In Problèmes combinatoires et théorie des graphes (Colloq. Internat. CNRS, Univ. Orsay, Orsay, 1976), volume 260 of Colloq. Internat. CNRS, pages 399–401. CNRS, Paris, 1978.
[112] A. Tonge. The von Neumann inequality for polynomials in several Hilbert-Schmidt operators. J. London Math. Soc. (2), 18(3):519–526, 1978.
[113] A. Tonge. The complex Grothendieck inequality for 2 × 2 matrices. Bull. Soc. Math. Grèce (N.S.), 27:133–136, 1986.
[114] B. S. Tsirelson. Quantum analogues of Bell's inequalities. The case of two spatially divided domains. Zap. Nauchn. Sem. Leningrad. Otdel. Mat. Inst. Steklov. (LOMI), 142:174–194, 200, 1985. Problems of the theory of probability distributions, IX.
[115] N. T. Varopoulos. On an inequality of von Neumann and an application of the metric theory of tensor products to operators theory. J. Functional Analysis, 16:83–100, 1974.

Courant Institute, New York University, 251 Mercer Street, New York, NY 10012, USA
E-mail address: khot@cims.nyu.edu

Courant Institute, New York University, 251 Mercer Street, New York, NY 10012, USA
E-mail address: naor@cims.nyu.edu
