Applying Practice to Theory
Ryan Williams∗

Abstract. How can complexity theory and algorithms benefit from practical advances in computing? We give a short overview of some prior work using practical computing to attack problems in computational complexity and algorithms, informally describe how linear program solvers may be used to help prove new lower bounds for satisfiability, and suggest a research program for developing new understanding in circuit complexity.

1 Introduction

As hardware becomes steadily more powerful, computer scientists should repeatedly ask themselves: what can we do with all the spare computing power at our disposal? There have been many inventive and exciting answers to this question in the form of distributed computing projects, from the study of protein folding (Folding@Home) to the improvement of climate models (Climateprediction.net) to the search for extraterrestrial intelligence (SETI@Home) to the mathematically intriguing (the Great Internet Mersenne Prime Search).

I would like to suggest that some spare computing power should be devoted towards achieving a better understanding of computation itself: solving significant open problems in theoretical computer science. I would hope that this suggestion is indisputable. Considering all that we do not understand in the theory of computing, exploiting extraneous computation for improving basic knowledge should be a priority. An attractive property of this suggestion is its potential for self-improvement. Ideally, improvements in basic knowledge can lead to more spare cycles in the future, leading to further advances in basic knowledge, and so on. While the above looks splendid in writing, it is of course not clear how to better the causes of theory in this way. In fact this problem may be exceptionally difficult in most cases of interest.
My hope in this survey is to encourage readers to think more seriously about the problem.

Proofs reliant on computer calculation have become increasingly common in mathematics. Examples include the proof of the four color theorem by Appel and Haken [AH77a, AH77b], which has since been greatly simplified [RSST97]; Hales' proof of the Kepler Conjecture [Hal05]; Hass and Schlafly's proof of the double bubble conjecture [HS00]; Lam's proof that there is no finite projective plane of order 10 [LTS89]; and McCune's proof of the Robbins Conjecture [McC97]. In general, computers can play a much greater role than simply discovering or verifying proofs, the end products of mathematical research. To prove theorems we need to formalize a proof system, and in doing so we may have to finagle messy details. Computers are possibly more useful in helping us discover new computational artifacts suggestive of a deeper paradigm, such as an ingeniously tiny circuit for 7 × 7 matrix multiplication which inspires us to look for something better.

At this point I should note a distinction between two usages of computers in proofs. For convenience, I will designate them infeasibly checkable and feasibly checkable. In the former, a computer is needed to generate the proof of the theorem, and the "proof" is the code of some program, the trace of that program's execution, and our observation that the program output the appropriate answers. I personally have no problem with such proofs so long as they are carefully reviewed, but acknowledge that they stretch the bounds of what one typically calls a proof.

∗ School of Mathematics, Institute for Advanced Study, Princeton, NJ 08540, USA. Email: ryanw@math.ias.edu. This material is based on work supported by NSF grants CCF-0832797 and DMS-0835373. An alternative version of this article will appear in SIGACT News.
In contrast, feasibly checkable proofs may require strenuous computing effort, but once obtained they can be checked by mathematicians within a reasonable time frame. Restricting ourselves to feasibly checkable proofs keeps the task of proof-finding roughly within NP, and I would like to strongly promote this usage of computers whenever possible.

First I will survey a few ideas in this vein that have been introduced in algorithms and complexity theory. Then I will discuss some recent work using computers to find proofs of time lower bounds in restricted models and small circuits. I will not give prescriptions for solving your major open problems via distributed computing with spare desktop cycles. But I do wish that this article helps you consider the possibility.

2 Some Prior Work

Let me begin by saying that I cannot hope to cover all the innovative uses of computers in algorithms and complexity. My goal is merely to point out a few theory topics I know of where practical computing has made a noteworthy impact.

2.1 Moderately Exponential Algorithms

In the area of moderately exponential algorithms, the goal is to develop faster algorithms for solving NP-hard problems exactly. Of course we do not expect these faster algorithms to be remotely close to polynomial time. Instead we settle for exponential algorithms that are still significant improvements over exhaustive search. For example, an O(1.9^n · poly(m)) time algorithm for Boolean Circuit Satisfiability on n variables and m gates would be very interesting, but not necessarily for any practical reason. The ability to avoid brute-force search in such a general case would be an amazing discovery in itself. Currently we have no idea how to construct such an algorithm, or even whether its existence would imply something unexpected. However, we can solve the 3-SAT problem in O(1.33^n) time with a randomized algorithm, by taking advantage of the structure of 3-CNF. This is due to Iwama and Tamaki [IT04] and is just the latest of a long line of results on the problem.

2.1.1 Analyzing Exponential Algorithms

Typically, one of the central problems in an exponential algorithms paper is to prove good upper bounds on the running time of some algorithm which solves some hard problem. (Usually the correctness of the algorithm is straightforward.) Much effort has been undertaken to better understand the behavior of backtracking algorithms. To aid the discussion, let us work with a toy example. For a node v in a graph, let N(v) denote the set of v's neighbors. Consider the following algorithm for solving Minimum Vertex Cover:

  If all nodes have degree at most two, solve in polytime. If a node has degree 0, remove it from the graph and recurse on the remaining graph. If a node has degree 1, then remove it and its neighbor u, recurse on the resulting graph getting a cover C, and return C ∪ {u}. Otherwise, take a node v of highest degree. Recurse on the graph with v removed, getting a cover C1. Recurse on the graph with v and N(v) removed, getting a cover C2. Return the minimum of C1 ∪ {v} and C2 ∪ N(v).

We won't explain how to solve the degree-two case here, but leave it to the reader. How do we analyze the runtime of such an algorithm? A natural approach is to try writing a recurrence. Let T(n) be the runtime on an n-node graph. In the two recursive calls of the algorithm, we remove at least one node and at least four nodes, respectively. (We remove at least four nodes because |N(v)| ≥ 3.) This gives us the recurrence

  T(n) ≤ T(n − 1) + T(n − 4) + poly(n).
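The toy algorithm can be written down directly. Here is a minimal Python sketch (my own illustration, not from the paper); for simplicity, the promised polytime routine for maximum-degree-2 instances is replaced by brute force, which does not affect the branching behavior being analyzed.

```python
from itertools import combinations

def min_vertex_cover(adj):
    """Toy branching algorithm for Minimum Vertex Cover.
    adj: dict mapping each node to the set of its neighbors (undirected)."""
    adj = {v: set(ns) for v, ns in adj.items()}  # work on a copy

    def remove(g, nodes):
        """Return g with the given nodes (and incident edges) deleted."""
        return {v: ns - set(nodes) for v, ns in g.items() if v not in nodes}

    def solve(g):
        if not g:
            return set()
        # Degree-0 rule: drop isolated nodes.
        iso = [v for v, ns in g.items() if not ns]
        if iso:
            return solve(remove(g, iso))
        # Degree-1 rule: put the unique neighbor u into the cover.
        for v, ns in g.items():
            if len(ns) == 1:
                u = next(iter(ns))
                return solve(remove(g, [v, u])) | {u}
        v = max(g, key=lambda x: len(g[x]))
        if len(g[v]) <= 2:
            # Max degree 2: brute force stands in for the polytime case.
            nodes = list(g)
            for k in range(len(nodes) + 1):
                for cand in combinations(nodes, k):
                    s = set(cand)
                    if all(a in s or b in s for a in g for b in g[a]):
                        return s
        # Branch: either v is in the cover, or all of N(v) is.
        c1 = solve(remove(g, [v])) | {v}
        c2 = solve(remove(g, [v] + list(g[v]))) | g[v]
        return c1 if len(c1) <= len(c2) else c2

    return solve(adj)
```

On a 5-cycle (all degrees two) the brute-force base case fires; on denser graphs the two recursive branches drive the running time, exactly as in the recurrence above.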
Solving this recurrence in the usual way (by finding a real root of f(x) = 1 − 1/x − 1/x^4) we find that T(n) ≤ 1.39^n. While exponential, this is still better than the obvious 2^n algorithm, since we can now handle instances of more than double the size in the same running time.

Our analysis certainly has much slack. One can imagine cases where the number of nodes removed is much larger, and we have not taken them into account. More generally, it is not clear that n is the best "progress measure" for the algorithm. Perhaps if we count the number of edges in the graph instead, we may find a better runtime bound for sparse graphs. Indeed, in the first recursive call at least three edges are removed (since we chose a v of largest degree), and in the second call we remove at least five edges: at least three edges for the neighbors of v, and at least two additional edges since the neighbors have degree at least two. We have

  T(m) ≤ T(m − 3) + T(m − 5) + O(poly(n)),

leading to T(m) ≤ O(1.19^m). For sufficiently sparse graphs, this improves on 1.39^n. More ambitiously, we could try to capture both observations with the double recurrence

  T(m, n) ≤ T(m − 3, n − 1) + T(m − 5, n − 4) + O(poly(n)).

Now how do we deal with this? One way is to convert the double recurrence into a single one, by letting k = α1·m + α2·n. Then

  T(k) ≤ T(k − 3α1 − α2) + T(k − 5α1 − 4α2) + O(poly(k)).

So T(k) ≤ O(c^k), where c is a real solution to 1 − 1/x^{3α1+α2} − 1/x^{5α1+4α2} = 0. For example, when α1 = 1/2 and α2 = 1, the runtime bound is O(1.21^{m/2+n}). In this way, we can interpolate between the two time bounds. In general, optimizing this sort of analysis can become terribly complicated.
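Finding the base of the exponential is a one-dimensional root-finding task, easy to script. Below is a small sketch (my own, not from the paper), under the assumption that each recursive call decreases the chosen measure by at least d_i, giving T(n) ≤ Σ_i T(n − d_i) + poly(n):

```python
def branching_root(decreases, lo=1.0 + 1e-9, hi=4.0, iters=100):
    """Real root x > 1 of 1 - sum(x**(-d) for d in decreases) = 0,
    i.e. the base of the O(x^n) bound for the recurrence
    T(n) <= sum_i T(n - d_i) + poly(n).  Found by bisection:
    the left-hand side is increasing in x on (1, infinity)."""
    f = lambda x: 1.0 - sum(x ** (-d) for d in decreases)
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0
```

Here branching_root([1, 4]) ≈ 1.3803 recovers the 1.39^n bound above, and branching_root([3, 5]) ≈ 1.1939 recovers the edge-measure bound.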
When there are many possible cases in the algorithm, and different variable measures are decreasing at different rates, the analysis becomes intractable to carry out by hand. However, researchers have found ways to apply computers to the problem. Eppstein [Epp06] showed that multivariate recurrences similar to the above can be approximately solved efficiently, by expressing the problem as a quasi-convex program. This has become a very useful tool. For example, we could keep track of the number of nodes n_i of degree i in the time recurrence expressions, for i ≥ 2 (and for sufficiently large k, we lump the number of degree-≥k nodes into a single quantity n_{≥k}). This can also be converted into a single-variable recurrence, introducing weights α_i for each n_i. For some algorithms we can get surprisingly good time bounds in terms of the total number of nodes: quasiconvex optimization uncovers interesting α_i's. Intuitively, this makes sense, because removing high-degree nodes should contribute more to the progress of the algorithm than removing those of low degree. Fomin, Grandoni, and Kratsch [FGK05a, FGK06] have found O(1.52^n) and O(1.23^n) algorithms for Dominating Set and Maximum Independent Set/Minimum Vertex Cover respectively, by performing analyses of the above kind on simple new algorithms, using Eppstein's computer approach to determine optimal settings of the α_i. For more details, see the survey [FGK05b]. Scott and Sorkin [SS07] have analyzed algorithms for Max 2-CSP and related graph problems with a somewhat similar approach, keeping track of the degrees of neighbors in the recurrence as well. Their approach formulates the analysis with linear programming instead.
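This is far cruder than Eppstein's quasiconvex programming, but even a simple grid search over the weights illustrates the idea. The sketch below (my own illustration; the function names and the restriction to the single toy recurrence above are assumptions, not from the paper) picks α1, with α2 normalized to 1, so as to minimize the bound c^{α1·m + α2·n} for graphs with a given ratio m/n:

```python
import math

def root(decreases, lo=1.0 + 1e-9, hi=8.0):
    """Base c > 1 with sum(c**(-d) for d in decreases) = 1, by bisection."""
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if sum(mid ** (-d) for d in decreases) > 1.0:
            lo = mid
        else:
            hi = mid
    return hi

def best_bound_exponent(edges_per_node, steps=400):
    """Grid-search alpha1 >= 0 (alpha2 normalized to 1) to minimize the
    per-node exponent (alpha1 * m/n + 1) * ln(c) for the toy double
    recurrence T(k) <= T(k - 3*a1 - 1) + T(k - 5*a1 - 4) + poly.
    Returns (alpha1, exponent); the runtime bound is exp(exponent * n)."""
    best = (0.0, float("inf"))
    for i in range(steps + 1):
        a1 = 4.0 * i / steps                  # search alpha1 in [0, 4]
        c = root([3 * a1 + 1, 5 * a1 + 4])
        expo = (a1 * edges_per_node + 1.0) * math.log(c)
        if expo < best[1]:
            best = (a1, expo)
    return best
```

For instance, on graphs with m = 1.8n the optimized exponent comes out slightly below both the pure node measure (α1 = 0, base ≈ 1.3803) and the pure edge measure, showing how a weighted measure interpolates between the two analyses.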
Provided the original recurrences are reasonably sized, the above approaches can generate feasibly checkable proofs: after the optimization has found appropriate weights, one can often manually check that the recurrence works out.

2.1.2 Case Analysis of Exponential Algorithms

Another approach is to have a computer check that a recursive algorithm admits an efficient time recurrence, over all possible inputs up to a certain size. Since many recursive backtracking algorithms work very locally (they only look at a subgraph of finite size around a specially chosen node), this sort of case analysis is sometimes enough to ensure a good upper bound on the running time. However, this style of approach typically does not lead to feasibly checkable proofs of upper bounds, since the case analysis itself is done by computer. Robson [Rob01], in an unpublished technical report, has written a program to do this for a Maximum Independent Set algorithm, proving that the algorithm runs in O(2^{n/4}) time. In particular, a lengthy and very complex extension of the above toy algorithm is presented and analyzed case-by-case, using a computer to enumerate many of the possible cases.

In the toy algorithm, there were some special cases handled prior to backtracking: removing nodes of degree less than two, and solving instances with only degree-two nodes. Rules similar to degree-0 and degree-1 node removal are called simplification rules. In general, these are short polytime rules that allow one to reduce the size of an instance practically for free, provided that a certain substructure exists.
Fedin, Kojevnikov, and Kulikov [FK06, KK06] developed a natural formalism for expressing special cases in SAT problems, which made it possible for a computer not only to perform case analyses, but also to search for new simplification rules for MAX SAT and SAT on its own, resulting in faster new algorithms. For earlier work of this kind, see [NS03, GGHN04].

Could it be possible to search over (or reason about) all recursive backtracking algorithms in some sense, and show exponential limitations on solving problems like SAT? Here I am using "all" very loosely; without clever simplification rules, exponential lower bounds on treelike resolution already give lower bounds on simple backtracking. Perhaps there is a 2^{εn} algorithm for every ε > 0, obtained by using a sufficiently complicated backtracking algorithm. Many seem to disbelieve this possibility, and some work has articulated this belief, in some sense. For instance, Alekhnovich et al. [ABBIMP05] formalized a model for backtracking algorithms, proving that 3-SAT requires 2^{Ω(n)} time in their model. However, their proof uses a SAT instance that encodes a linear system of equations over GF(2), which can be solved trivially in polynomial time. Such results teach us that we should not be myopically focused on one specific algorithmic technique.

2.2 Approximability and Inapproximability

Since the early 90's there has been significant progress in the study of hard-to-approximate problems, aided by the celebrated PCP theorem [AS98, ALMSS98]. An algorithm is a ρ-approximation for a minimization problem Π if on all inputs the algorithm outputs a solution that has value at most ρ times the minimum value of any solution. In the maximization case, the output solution must have value at least ρ times the maximum.
Note that when Π is a minimization (maximization) problem, we have ρ ≥ 1 (ρ ≤ 1), respectively. A prime objective in the study of approximability is to determine for which ρ a problem can be ρ-approximated in polytime, and for which ρ a problem is NP-hard to ρ-approximate. Several surprisingly tight results are known; for instance, a random assignment satisfies at least 7/8 of the clauses in any 3-CNF formula (with three distinct variables in each clause), yet Håstad showed [Has01] that it is NP-hard to satisfy a number of clauses that is at least (7/8 + ε) of the optimum, for any ε > 0. That is, a polytime (7/8 + ε)-approximation would imply P = NP. Here I will briefly survey a couple of works in the study of approximation that rely heavily on computer power to achieve their results.

2.2.1 Gadgets via Computer

Everyone who has seen an NP-completeness reduction knows what a gadget is. Treating one problem Π as a programming language, you try to express pieces of an instance of Π′ by constructing gadgets, simple components that can be used over and over to express instances of Π′ as instances of Π. To illustrate, consider the standard reduction from 3-SAT to MAX 2-SAT due to Garey, Johnson, and Stockmeyer [GJS76]. One can transform any 3-CNF formula F into a 2-CNF formula F′ by replacing each clause of F, such as c_i = (ℓ1 ∨ ℓ2 ∨ ℓ3) where ℓ1, ℓ2, ℓ3 are literals, with the "gadget" of 2-CNF clauses

  (ℓ1), (ℓ2), (ℓ3), (y_i), (¬ℓ1 ∨ ¬ℓ2), (¬ℓ2 ∨ ¬ℓ3), (¬ℓ1 ∨ ¬ℓ3), (ℓ1 ∨ ¬y_i), (ℓ2 ∨ ¬y_i), (ℓ3 ∨ ¬y_i),

where y_i is a new variable. If an assignment satisfies c_i, then exactly 7 of the 10 clauses in the gadget can be satisfied by setting y_i appropriately. If an assignment does not satisfy c_i, then exactly 6 of the 10 can be satisfied.
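The 7-versus-6 claim is exactly the kind of statement a computer can confirm by exhaustive enumeration; here is a short check (my own, not from the paper):

```python
from itertools import product

def check_gjs_gadget():
    """Exhaustively verify the Garey-Johnson-Stockmeyer gadget: for each
    assignment to (l1, l2, l3), the best setting of y satisfies exactly 7
    of the 10 clauses when (l1 or l2 or l3) holds, and exactly 6 otherwise."""
    for l1, l2, l3 in product([False, True], repeat=3):
        best = max(
            sum([l1, l2, l3, y,
                 not l1 or not l2, not l2 or not l3, not l1 or not l3,
                 l1 or not y, l2 or not y, l3 or not y])
            for y in (False, True))
        assert best == (7 if (l1 or l2 or l3) else 6)
    return True
```

Sixteen assignments suffice: eight choices for the literals times two for the auxiliary variable.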
Therefore F is satisfiable if and only if 7/10 of the clauses in F′ can be satisfied. This reduction also says something about the approximability of MAX 3-SAT:

Proposition 2.1 If an algorithm is a (1 − ε)-approximation for MAX 2-SAT, then by applying the reduction one can obtain a (1 − 7ε)-approximation to MAX 3-SAT.¹

Recall we mentioned that MAX 3-SAT does not have a polytime (7/8 + ε)-approximation unless P = NP. Hence the proposition implies that MAX 2-SAT cannot be (55/56 + ε)-approximated in polytime unless P = NP. So gadgets can be used to extend inapproximability results from one problem to another.²

How good is the above gadget from 3-SAT to 2-SAT? Could we find a gadget that implies stronger inapproximability for MAX 2-SAT? To address these kinds of questions, Trevisan, Sorkin, Sudan, and Williamson [TSSW00] formalized gadgets, following [BGS98]:

¹ To see this, let m3 be the number of clauses in the original 3-CNF F, let m∗3 ≤ m3 be the optimal number of clauses that can be satisfied in F, and let b·m3 be the number of clauses in F satisfied by running the MAX 2-SAT approximation on F′ and translating the output back to an assignment on the variables of F. By our assumption we have (1 − ε) ≤ (6m3 + b·m3)/(6m3 + m∗3). By algebraic manipulation and the fact that m∗3 ≤ m3, we derive that b·m3/m∗3 ≥ 1 − 7ε.

² Note that the best known inapproximability result for MAX 2-SAT uses a different gadget reduction, cf. [Has01].

Definition 2.2 Let α, ℓ, n ≥ 1, let f : {0,1}^k → {0,1}, and let F be a family of functions from {0,1}^{k+n} to {0,1}. An α-gadget reducing f to F is given by a set of auxiliary variables y1, ..., yn and weights wj ≥ 0 coupled with constraints Cj ∈ F, where j = 1, ..., ℓ.
For every a ∈ {0,1}^k:

• If f(a) = 1, then (∀ b ∈ {0,1}^n) Σ_j wj·Cj(a, b) ≤ α, and (∃ b ∈ {0,1}^n) Σ_j wj·Cj(a, b) = α.

• If f(a) = 0, then (∀ b ∈ {0,1}^n) Σ_j wj·Cj(a, b) ≤ α − 1.

Note it is fine to place weights on constraints: to "unweight" them, we can simply make a number of copies of each constraint in the instance, proportional to the weights. Observe that the reduction from 3-SAT to MAX 2-SAT is a 7-gadget reducing f(x1, x2, x3) = x1 ∨ x2 ∨ x3 to the family of functions representable by 2-variable clauses, where n = 1, ℓ = 10, and wj = 1 for all j.

First, [TSSW00] showed that if we fix the number of auxiliary variables n, then the requirements in the gadget definition can be described by a large number (|F|^ℓ) of linear programs, with a large number of inequalities in each linear program. That is, by specifying n and a tuple (C1, ..., Cℓ) ∈ F^ℓ, the problem of setting the wj's to minimize α and satisfy the gadget definition boils down to solving a large linear program which has inequalities dealing with every possible a ∈ {0,1}^k. This is almost obvious, except that the definition has elements of a 0-1 integer program: when f(a) = 1, we must ensure that there is an assignment b making the sum equal α. To circumvent this, [TSSW00] also try all possible functions B from the set of satisfying assignments of f to {0,1}^n. Then, for every a such that f(a) = 1, we simply set b = B(a) in the "Σ_j wj·Cj(a, b) = α" constraints of the linear program. As one might expect, this can lead to some very large linear programs, but for small constraint functions they are manageable. Still, we had to fix n to get a finite search space, and it is entirely possible that gadgets keep improving as n increases.
[TSSW00] prove that, for F satisfying very natural conditions, it suffices to set n ≤ 2^s, where s is the number of satisfying assignments of f. These conditions are satisfied by 2-CNF and many other well-studied constraint families.

Using the computer search for gadgets, the authors proved several interesting results in approximation which are still the best known to date; for example, (16/17 + ε)-approximating MAX CUT is NP-hard. Incidentally, their search also uncovered an optimal 3.5-gadget reducing 3-SAT to MAX 2-SAT: taking c_i as before, the 2-CNF gadget is

  (ℓ1 ∨ ℓ3), (¬ℓ1 ∨ ¬ℓ3), (ℓ1 ∨ ¬y_i), (¬ℓ1 ∨ y_i), (ℓ3 ∨ ¬y_i), (¬ℓ3 ∨ y_i), (ℓ2 ∨ y_i),

where the weights are 1/2 for every clause, except for the last one which has weight 1. In the unweighted case, this amounts to having two copies of the last clause. The results here are feasibly checkable, for small constraints with a small number of auxiliary variables.

2.2.2 Analyzing Approximation Algorithms

Earlier, we noted that any 3-CNF formula can be approximated within 7/8 by choosing a random assignment. But if some of the clauses are 2-CNF or 1-CNF, this no longer holds. However, it would be strange if we could not 7/8-approximate general MAX 3-SAT because of this. Karloff and Zwick [KZ97] proposed a possible 7/8-approximation based on semidefinite programming (SDP).³

³ For the purposes of this article, just think of semidefinite programming as a generalization of linear programming where the inequalities are between linear combinations of inner products of unknown vectors, and the task is to find vectors satisfying the inequalities. Such systems are approximately solvable in polynomial time.
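As with the 7-gadget, this 3.5-gadget is feasibly checkable by brute force; the following short script (my own, not from the paper) confirms the definition's two conditions with α = 3.5:

```python
from itertools import product

def check_tssw_gadget():
    """Exhaustively verify the TSSW 3.5-gadget for (l1 or l2 or l3):
    the maximum weighted sum over y is exactly 3.5 when the clause is
    satisfied, and at most 2.5 (= alpha - 1) when it is not."""
    def clauses_and_weights(l1, l2, l3, y):
        return [(l1 or l3, 0.5), (not l1 or not l3, 0.5),
                (l1 or not y, 0.5), (not l1 or y, 0.5),
                (l3 or not y, 0.5), (not l3 or y, 0.5),
                (l2 or y, 1.0)]
    for l1, l2, l3 in product([False, True], repeat=3):
        best = max(sum(w for sat, w in clauses_and_weights(l1, l2, l3, y) if sat)
                   for y in (False, True))
        if l1 or l2 or l3:
            assert best == 3.5
        else:
            assert best <= 2.5
    return True
```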
Their algorithm is a fairly direct translation of MAX 3-SAT to an SDP, similar to the MAX CUT algorithm of Goemans and Williamson [GW95], where a vector v_i in the solution corresponds to the variable x_i in the formula, and one vector v_t corresponds to TRUE. Given the vectors returned by the SDP solver, one obtains an assignment to the formula by picking a random hyperplane that passes through the origin and setting x_i to TRUE if and only if v_i and v_t lie on different sides of the hyperplane. Such a hyperplane can be chosen by picking a normal vector r uniformly at random from the unit sphere in R^n.

Analyzing the Karloff-Zwick algorithm is very difficult. In order to prove that the algorithm is a 7/8-approximation, one needs to prove sharp bounds on the probability that four vectors v_i, v_j, v_k, v_t from the SDP lie on the same side of a random hyperplane (i.e., the probability that the clause (x_i ∨ x_j ∨ x_k) is falsified by the random assignment). This amounts to proving bounds on the volume of certain objects whose corners are chosen uniformly at random from the unit sphere, which Karloff and Zwick call "volume inequalities." Karloff and Zwick were unable to prove strong enough inequalities to get a 7/8-approximation, but they did obtain some partial results and gave a conjectured inequality that, if true, would imply that the algorithm is a 7/8-approximation.

Zwick [Zwi02] proved this inequality, along with others, by writing a program that used interval arithmetic. Interval arithmetic is a method for computing over real numbers on a computer in a controlled way, so that all errors are accounted for. The proofs of the Kepler conjecture and double-bubble conjecture mentioned earlier also utilize interval arithmetic in a critical way.
In interval arithmetic, one represents a real number r by an interval [r0, r1], where r0 and r1 are machine-representable and r0 ≤ r ≤ r1. Ideally, one wants r0 (r1) to be as large (small) as possible. For a real number r, write r̄ = r1 and r̲ = r0. The basic operations are defined by

  [r0, r1] + [s0, s1] = [r0 + s0, r1 + s1],
  [r0, r1] · [s0, s1] = [min{r0·s0, r0·s1, r1·s0, r1·s1}, max{r0·s0, r0·s1, r1·s0, r1·s1}].

One can define more complicated functions similarly. The point is that by doing numerical computations in interval arithmetic, the resulting interval must contain the correct value, even if that value cannot be machine-represented.

But how can we use interval arithmetic to prove an inequality? The key step in Zwick's work is a technical reduction from the desired volume inequality to the task of proving that a certain system of constraints has no solution over the reals. (Most of these constraints are inequalities, but some are disjunctions of inequalities.) Zwick then wrote a program, called RealSearch, which takes any system of constraints over bounded variables, of the form

  f1(x1, ..., xn) ≥ 0, ..., fk(x1, ..., xn) ≥ 0 ∨ fk+1(x1, ..., xn) ≥ 0, ...,

and tries to prove that they have no solution. Let the bounds on x_i be a_i ≤ x_i ≤ b_i for some machine-representable a_i, b_i. The program starts by letting the interval X_i = [a_i, b_i] denote x_i, and evaluates the f-functions with interval arithmetic. If any of the constraints must fail on this assignment (e.g., the entire interval for f1(X1, ..., Xn) lies below 0), then the system has no solution over these intervals.
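The two defining operations, together with the branch-and-prune refutation step just described, fit in a few lines of Python. This is a minimal sketch of the idea only (my own illustration): a serious implementation would also control the direction of floating-point rounding, which plain Python floats do not.

```python
class Interval:
    """Closed interval [lo, hi]; arithmetic returns enclosing intervals."""
    def __init__(self, lo, hi):
        assert lo <= hi
        self.lo, self.hi = lo, hi

    def __add__(self, other):
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __sub__(self, other):
        return Interval(self.lo - other.hi, self.hi - other.lo)

    def __mul__(self, other):
        ps = [self.lo * other.lo, self.lo * other.hi,
              self.hi * other.lo, self.hi * other.hi]
        return Interval(min(ps), max(ps))

def prove_no_solution(f, box, depth):
    """Try to certify that the constraint f(x) >= 0 has no solution for
    x in the given box, by interval evaluation plus bisection.  Returns
    True only if f is provably negative on every subinterval explored
    down to the given depth; False means the search gave up."""
    val = f(box)
    if val.hi < 0:                 # f is certainly negative on this box
        return True
    if depth == 0:
        return False
    mid = (box.lo + box.hi) / 2.0
    return (prove_no_solution(f, Interval(box.lo, mid), depth - 1) and
            prove_no_solution(f, Interval(mid, box.hi), depth - 1))
```

For example, since x^2 + (1 − x)^2 has minimum 0.5 on [0, 1], the system "0.4 − x^2 − (1 − x)^2 ≥ 0" is refutable by a few levels of bisection, while the analogous system with 0.6 in place of 0.4 has solutions and is never certified.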
Otherwise, the program breaks some X_i into subintervals X′_i = [a_i, (a_i + b_i)/2] and X″_i = [(a_i + b_i)/2, b_i], and recursively tries to verify that the system of constraints fails with both X_i := X′_i and X_i := X″_i. Of course, such a procedure may not terminate, so the program is instructed to quit after some time. But surprisingly, this simple program can verify the necessary inequality, as well as several others that arise in SDP approximations! Even though we are not working explicitly over the reals, if we find that the system of constraints fails over all appropriately chosen intervals, then it follows that the system fails over all reals. In principle, this strategy could work for any functions f definable in interval arithmetic.

Of course, the resulting proof of the volume inequality is of the infeasibly checkable variety, relying on the correctness of the program and the correctness of the floating-point operations. Even greater issues arise with Tom Hales' proof of the Kepler conjecture, which requires that his programs be run on a processor that strictly conforms to the IEEE 754 floating point standard. However, Zwick's strategy does not require that much stringency, and I believe it should be better known as a general method for attacking difficult inequalities.

3 Time Lower Bounds

I have recently found a nice domain in complexity theory where computer searches help perform the "hard work" in the proofs of theorems: namely, in proving time lower bounds for hard problems such as SAT on restricted computational models. In this case, the computer generates feasibly checkable proofs of lower bounds. Since this style of time lower bounds has been surveyed thoroughly by Van Melkebeek [vM04, vM07], I will not provide substantial background here.
Instead I will focus more on describing how the reduction to a computer search works. For more details, please consult the available draft [Wil08].

All the lower bounds amenable to computer search have one unifying property: the restricted model in which a lower bound is proved can be simulated asymptotically faster on an alternating machine. We call such a phenomenon a speed-up property. This property is crucial for the arguments to work. Here we will work with time lower bounds for SAT on random access machines that use only n^{o(1)} workspace. Define DTS[t(n)] to be the class of problems solvable by such machines in t(n) ≥ n time (the acronym stands for "deterministic time with small space"). Here is one example of the speed-up property in this setting.

Theorem 3.1 DTS[t(n)] ⊆ Σ2TIME[t(n)^{1/2} · n^{o(1)}] ∩ Π2TIME[t(n)^{1/2} · n^{o(1)}].

(The classes Σ2TIME and Π2TIME are defined in the usual way.) That is, we can simulate a small-space computation with a square-root speedup using alternations. The proof is due to Kannan [Kan84], but the basic idea goes back to Savitch [Sav70]. The idea is to guess snapshots of the n^{o(1)}-space algorithm at t^{1/2} points during its computation, then verify in parallel that the guesses are correct. Given an algorithm A that runs in time t and uses space n^{o(1)}, the corresponding Σ2 algorithm B(x) existentially writes t^{1/2} configurations C_0, ..., C_{t^{1/2}} of A(x), where C_0 is the initial configuration of A(x) and C_{t^{1/2}} is an accepting configuration. Since A uses only n^{o(1)} space, these configurations can be written down with n^{o(1)} bits each. Next, B(x) universally writes i ∈ {0, ..., t^{1/2} − 1} and jumps to the configuration C_i. Then it simulates A(x) from C_i for t^{1/2} steps, accepting if and only if A(x) ends up in configuration C_{i+1}.
The Π2 simulation can be defined analogously. Theorem 3.1 is already enough to prove a non-trivial lower bound for SAT, after applying a few more observations from the literature. The first observation is that if SAT is in DTS[n^c], then NTIME[n] ⊆ DTS[n^c · poly(log n)]. This follows from the fact that SAT is very strongly NP-complete:

Theorem 3.2 ([Coo88, Sch78, FLvMV05]) For every L ∈ NTIME[n], there is a reduction from L to SAT that maps strings of length n to formulas of size n·poly(log n), where an arbitrary bit of the reduction can be computed in poly(log n) time.

The proof is a very technical version of Cook's theorem which we will not describe here, but let us note in passing that other problems such as Vertex Cover also enjoy a similar property. By padding and the fact that DTS classes are closed under complement, we have the following.

Theorem 3.3 If SAT is in DTS[n^c], then for all k and t(n) ≥ n, Σ_k TIME[t(n)] ⊆ Σ_{k−1} TIME[t(n)^c] and Π_k TIME[t(n)] ⊆ Π_{k−1} TIME[t(n)^c].

Theorem 3.3 says we can remove alternations from a computation, with a small slowdown in runtime. (For this reason, I like to call it a "slow-down theorem.") Theorem 3.1 says we can add alternations to a DTS computation, with a speedup in running time. Naturally, one's inclination is to pit these two results against one another and see what we can derive. Assuming SAT is in DTS[n^c], we find

  NTIME[n^2] ⊆ DTS[n^{2c}] ⊆ Σ2TIME[n^c] ⊆ NTIME[n^{c^2}],

where the first and third containments follow from Theorem 3.3, and the second containment follows from Theorem 3.1. When c < 2^{1/2}, the above contradicts the nondeterministic time hierarchy [Coo72].
We have proved the following:

Theorem 3.4 ([FLvMV05]) SAT cannot be solved by an algorithm that runs in n^{√2 − ε} time and n^{o(1)} space, for every ε > 0.

With a more complicated argument involving the same tools, [FLvMV05] proved that SAT cannot be in DTS[n^{φ−ε}], where φ = 1.618... is the golden ratio. We can prove a simple n^{1.6} lower bound by generalizing Theorem 3.1. At this point it will be helpful to introduce some new notation, and this notational shift is crucial for the automated approach. Letting t(n) be a polynomial and letting b ≥ (log t(n))/(log n), define the class (∃ t(n))_b C to be the class of problems solvable by a machine that existentially guesses t(n) bits, then selects O(n^b) of those bits (along with the input) and feeds them as input to a representative machine from class C. (The selection procedure is required to take only linear time and logarithmic space, so it does not interfere with any of the time/space constraints of the class.) We define (∀ t(n))_b C similarly. By properties of nondeterminism and co-nondeterminism, note that we can "combine" adjacent quantifiers in a class:

Proposition 3.5 (∃ t_1(n))_{b_1} (∃ t_2(n))_{b_2} C = (∃ t_1(n) + t_2(n))_{b_2} C, and the analogous statement with ∀ also holds.

Theorem 3.1 can now be stated more generally:

Theorem 3.6 (Speedup Rule) For all x such that n ≤ n^x ≤ t(n), DTS[t(n)] ⊆ (∃ n^{x+o(1)})_x (∀ log n)_1 DTS[t(n)/n^x]. The theorem also holds when we interchange ∀ and ∃.

Theorem 3.6 holds because we can just guess n^x + 1 configurations (instead of t(n)^{1/2} + 1 as before), universally pick i ∈ {0, ..., n^x − 1}, and the input to the final DTS computation will simply be the original input along with the pair of configurations (C_i, C_{i+1}), which has size n^{o(1)}.
Hence we have n^{x+o(1)} in the ∃-quantifier, and O(n) bits of input to the final DTS class. Our new notation also lets us state the "slow-down theorem" in a more precise way.

Theorem 3.7 (Slowdown Rule) If SAT is in DTS[n^c] then for all a_1, b_1, ..., a_k, b_k, a_{k+1} ≥ 1, and Q_i ∈ {∃, ∀}, the class

(Q_1 n^{a_1})_{b_1} · · · (Q_{k−1} n^{a_{k−1}})_{b_{k−1}} (Q_k n^{a_k})_{b_k} DTS[n^{a_{k+1}}]

is contained in the class

(Q_1 n^{a_1})_{b_1} · · · (Q_{k−1} n^{a_{k−1}})_{b_{k−1}} DTS[n^{c · max{b_{k−1}, a_k, a_{k+1}} + o(1)}].

Again, the result holds by Theorem 3.2 and a standard padding argument. In particular, (Q_k n^{a_k})_{b_k} DTS[n^{a_{k+1}}] (whose input has size O(n^{b_{k−1}})) is contained in either NTIME[n^{max{b_{k−1}, a_k, a_{k+1}}}] or coNTIME[n^{max{b_{k−1}, a_k, a_{k+1}}}], and both of these are in DTS[n^{c · max{b_{k−1}, a_k, a_{k+1}} + o(1)}]. We are now ready to prove a stronger time lower bound for SAT.

Theorem 3.8 SAT cannot be solved by an algorithm running in n^{1.6} time and n^{o(1)} space.

Proof. Suppose SAT ∈ DTS[n^c] where √2 ≤ c ≤ 1.6. By Theorem 3.2, NTIME[n] ⊆ DTS[n^{c+o(1)}]. We can derive NTIME[n^{c/2+2/c}] ⊆ NTIME[n^{c^3/2 + o(1)}], ignoring o(1) factors for simplicity:

NTIME[n^{c/2+2/c}]
  ⊆ DTS[n^{c^2/2+2}]                                                (Slowdown)
  ⊆ (∃ n^{c^2/2})_{c^2/2} (∀ log n)_1 DTS[n^2]                      (Speedup, with x = c^2/2)
  ⊆ (∃ n^{c^2/2})_{c^2/2} (∀ log n)_1 (∀ n)_1 (∃ log n)_1 DTS[n]    (Speedup, with x = 1)
  = (∃ n^{c^2/2})_{c^2/2} (∀ n)_1 (∃ log n)_1 DTS[n]                (Proposition 3.5)
  ⊆ (∃ n^{c^2/2})_{c^2/2} (∀ n)_1 DTS[n^c]                          (Slowdown)
  ⊆ (∃ n^{c^2/2})_{c^2/2} DTS[n^{c^2}]                              (Slowdown)
  ⊆ (∃ n^{c^2/2}) (∃ n^{c^2/2})_{c^2/2} (∀ log n)_{c^2/2} DTS[n^{c^2/2}]   (Speedup, x = c^2/2)
  = (∃ n^{c^2/2})_{c^2/2} (∀ log n)_{c^2/2} DTS[n^{c^2/2}]          (Proposition 3.5)
  ⊆ (∃ n^{c^2/2})_{c^2/2} DTS[n^{c^3/2}]                            (Slowdown)
  ⊆ NTIME[n^{c^3/2}].
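The chain above can be replayed mechanically. The sketch below uses my own simplified encoding of proof lines (a list of quantifier blocks plus a DTS exponent, not the paper's LP variables) to apply the rules in the stated order and confirm that the derivation ends in NTIME[n^{c^3/2}]:

```python
import math

# A proof line is (quants, d): quants lists blocks (Q, a, b) standing
# for (Q n^a)_b, outermost first, and d is the DTS exponent.  Guessing
# log n bits is recorded as a = 0.

def speedup(quants, d, x):
    # Theorem 3.6: DTS[n^d] <= (Q n^x)_x (Q' log n)_1 DTS[n^(d-x)],
    # choosing Q to match its neighboring quantifier.
    q = quants[-1][0] if quants else 'E'
    q2 = 'A' if q == 'E' else 'E'
    return quants + [(q, x, x), (q2, 0.0, 1.0)], d - x

def merge(quants, d):
    # Proposition 3.5: combine adjacent like quantifiers, keeping the
    # inner block's selection bound b.
    out = []
    for q, a, b in quants:
        if out and out[-1][0] == q:
            out[-1] = (q, max(out[-1][1], a), b)
        else:
            out.append((q, a, b))
    return out, d

def slowdown(quants, d, c):
    # Theorem 3.7: remove the innermost block (Q n^a)_b at cost
    # c * max(b_outer, a, d).
    _, a, _ = quants[-1]
    b_outer = quants[-2][2] if len(quants) > 1 else 1.0
    return quants[:-1], c * max(b_outer, a, d)

c = 1.6
start = c / 2 + 2 / c                     # NTIME[n^(c/2 + 2/c)]
quants, d = [], c * start                 # Slowdown: DTS[n^(c^2/2 + 2)]
quants, d = speedup(quants, d, c**2 / 2)  # Speedup, x = c^2/2
quants, d = speedup(quants, d, 1.0)       # Speedup, x = 1
quants, d = merge(quants, d)              # Proposition 3.5
quants, d = slowdown(quants, d, c)        # Slowdown
quants, d = slowdown(quants, d, c)        # Slowdown
quants, d = speedup(quants, d, c**2 / 2)  # Speedup, x = c^2/2
quants, d = merge(quants, d)              # Proposition 3.5
quants, d = slowdown(quants, d, c)        # Slowdown
final = max(quants[0][1], d)              # now inside NTIME[n^final]
assert abs(final - c**3 / 2) < 1e-9       # ends in NTIME[n^(c^3/2)]
assert final < start                      # contradiction at c = 1.6

# The largest c for which c/2 + 2/c > c^3/2 is the positive root of
# c^4 - c^2 - 4 = 0, i.e. sqrt((1 + sqrt(17))/2) ~ 1.6004.
threshold = math.sqrt((1 + math.sqrt(17)) / 2)
assert abs(threshold**4 - threshold**2 - 4) < 1e-9
```

This is only a numeric replay at a fixed c; the real automation treats the exponents as LP variables, as described below.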
When c/2 + 2/c > c^3/2 (which happens for c < sqrt((1+√17)/2) ≈ 1.6004) we have a contradiction to the nondeterministic time hierarchy. ✷

The best known SAT time lower bound (with n^{o(1)} space) is from an earlier paper of ours [Wil07], and it achieves n^{2cos(π/7)} ≥ n^{1.8}. It is a fairly elaborate inductive proof that builds on the same ideas. A natural question is whether we can do even better than this. It is clear that we have a very specific type of proof system on our hands; it is also powerful, in that all known time-space lower bounds for SAT (and QBF) on random access machines work over it. There is nothing too complicated about the proof of Theorem 3.8: we are just applying the Speedup and Slowdown Rules in clever ways. More interestingly, the proof was discovered by a computer program. Furthermore, it is the best lower bound one can prove with only 7 applications of Speedup and Slowdown Rules, and we know this because a computer program tried all the cases.

How did it try the cases? It would seem that the space of possibilities is too large: how could a computer try all possible expressions for the exponents? One can show that once we have specified the sequence in which the Speedup and Slowdown Rules are applied, the task of finding the optimal lower bound argument can be formulated as a linear program. This makes our job of finding good lower bound proofs much easier. Let me sketch how a linear program can be constructed, for a fixed sequence of rules to apply. One can show that the sequence of rules completely determines the number of quantifiers in each class in the chain of inclusions of the proof.
(There are actually two ways to apply the Speedup Rule, one where we introduce a Σ_2 computation and the other a Π_2 computation, but we can prove that one of these applications is always superior.) Suppose we have a sequence of inclusions such as those in the proof of Theorem 3.8, but all exponents in the polynomials are replaced with variables. So for example, the second inclusion (or "line") in the proof of Theorem 3.8 becomes

(∃ n^{a_{2,3}})_{b_{2,2}} (∀ n^{a_{2,2}})_{b_{2,1}} DTS[n^{a_{2,1}}],

the initial class NTIME[n^{c/2+2/c}] is replaced with NTIME[n^{a_{0,1}}], and the final class NTIME[n^{c^3/2}] becomes NTIME[n^{a_{8,1}}]. In general, we replace the exponents in the ith line with variables a_{i,j}, b_{i,j}. Now we want to write a linear program in terms of these variables that expresses the applications of the two rules and captures the fact that we want a contradiction. To do the latter is very easy: we simply require a_{8,1} < a_{0,1}, or a_{8,1} ≤ a_{0,1} − ε for some ε > 0. To express a Speedup Rule on the ith line, we introduce a parameter x_i ≥ 0 and include the following inequalities:

a_{i,1} ≥ 1,  a_{i,1} ≥ a_{i−1,1} − x_i,  b_{i,1} = b_{i−1,1},
a_{i,2} = 1,  b_{i,2} ≥ x_i,  b_{i,2} ≥ b_{i−1,1},
a_{i,3} ≥ a_{i−1,2},  a_{i,3} ≥ x_i,
(∀k : 4 ≤ k ≤ m) a_{i,k} = a_{i−1,k−1},
(∀k : 4 ≤ k ≤ m−1) b_{i,k} = b_{i−1,k−1}.

Intuitively, these constraints express that the class

(Q_m n^{a_m})_{b_{m−1}} · · · (Q_2 n^{a_2})_{b_1} DTS[n^{a_1}]

on the (i−1)th line is replaced with

(Q_m n^{a_m})_{b_{m−1}} · · · (Q_2 n^{max{a_2, x_i}})_{max{x_i, b_1}} (Q_1 n)_{b_1} DTS[n^{max{a_1 − x_i, 1}}]

on the ith line, where Q_1 is the quantifier opposite to Q_2. (One can check that this indeed simulates the Speedup Rule faithfully.) We can express the Slowdown Rule in a similar way, by treating the desired lower bound exponent c as a constant.
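The constraint list can be sanity-checked by transcribing it directly. The sketch below uses my own transcription and variable layout (dicts indexed like a_{i,j}, b_{i,j}; it may differ from the LP in [Wil08]): it builds line i from line i−1 via the max-formulas and confirms that every listed inequality is then satisfied.

```python
# Build line i from line i-1 using the max-formulas for the Speedup
# Rule, then check the listed LP inequalities hold for the result.
# (My own transcription of the constraints above, not the paper's code.)

def speedup_line(a, b, x):
    # a, b: exponents of line i-1 (a[1] is the DTS exponent); x >= 0.
    m = max(a)                                    # positions in line i-1
    na = {1: max(a[1] - x, 1.0), 2: 1.0, 3: max(a[2], x)}
    nb = {1: b[1], 2: max(x, b[1])}
    for k in range(4, m + 2):                     # shifted outer blocks
        na[k] = a[k - 1]
    for k in range(4, m + 1):
        nb[k] = b[k - 1]
    return na, nb

def satisfies_constraints(a, b, na, nb, x):
    m = max(na)                                   # positions in line i
    checks = [
        na[1] >= 1, na[1] >= a[1] - x, nb[1] == b[1],
        na[2] == 1, nb[2] >= x, nb[2] >= b[1],
        na[3] >= a[2], na[3] >= x,
    ]
    checks += [na[k] == a[k - 1] for k in range(4, m + 1)]
    checks += [nb[k] == b[k - 1] for k in range(4, m)]
    return all(checks)

# Example: the first Speedup in Theorem 3.8 (c = 1.6), applied to
# DTS[n^{c^2/2 + 2}] = DTS[n^3.28] with x = c^2/2 = 1.28.
a, b, x = {1: 3.28, 2: 0.0}, {1: 1.0}, 1.28
na, nb = speedup_line(a, b, x)
assert satisfies_constraints(a, b, na, nb, x)
```

In the real LP these equalities become the inequalities shown above, and minimizing the sum of all variables recovers the max operations, as noted next.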
Note that if we minimize the sum of a_{i,j} + b_{i,j} over all i, j, then the above inequalities faithfully simulate the max operations. Given that we can take a sequence of rules and turn it into an LP that can then be solved, what sequences are good to try? The number of sequences to search can be reduced by establishing several properties of the proof system that hold without loss of generality. For example, we may always start with DTS[n^k] for some k, and if we derive DTS[n^k] ⊆ DTS[n^{k−ε}] then we have a contradiction; every proof that works otherwise can be rewritten to work like this. There are several simplifications of this type, and while their proofs are not very enlightening, as a whole they let us identify the relevant parts of proofs. They also help us prove limitations on the proof system.

The chart below gives a graph of experimental results from a search for short proofs of time-space lower bounds for SAT. The x-coordinate is the number of lines in a proof (the number of Speedup/Slowdown applications) and the y-coordinate is the exponent of the best lower bound attained with that number of lines.

[Figure: "SAT Time Lower Bounds With n^{o(1)} Space Algorithms" — the exponent of the best time lower bound found (about 1.45 to 1.8) plotted against the number of lines in the proof (10 to 80).]

Up to 25 lines, the search was completely exhaustive. Beyond that, I used a heuristic search that takes a queue of best-found proofs for small lengths and tries to locally improve them by inserting new rule applications. When a better lower bound is found, the new proof is added to the queue. This heuristic search was run up to about 50+ lines. Now, all of the best proofs found up to 50+ lines have a certain pattern to them, resembling the structure of the 2cos(π/7) lower bound. Restricting the search to work only within this pattern, we can get annotations for 70+ lines which still exhibit the pattern. Checking a 383-line proof of similar form, the lower bound attained was n^{1.8017}, very close to n^{2cos(π/7)} ≈ n^{1.8019}. These experimental results lead us to:

Conjecture 3.9 The best time lower bound for SAT (in n^{o(1)} space) that can be proved with the above proof system is the n^{2cos(π/7)} bound of [Wil07].

Given the scale at which the conjecture has been verified, I am fairly confident in its truth, although I do not know how to prove it. Unlike [TSSW00], we do not know how to place a finite upper bound on all the parameters, namely the lengths of proofs. The conjecture is indeed surprising, if true. The general sentiment among researchers I have talked to (and anonymous referees from the past) was that a quadratic time lower bound (or more precisely, n^{2−ε} for all ε > 0) should be possible with the ingredients we already have. We can show formally that a better lower bound than this cannot be established with the current approach.

Theorem 3.10 In the above proof system, one cannot prove that SAT requires n^2 time with n^{o(1)} space algorithms.

This theorem is proven by minimal counterexample: we take a minimum proof of a quadratic lower bound, and show that there is a subsequence of rules that can be removed such that the underlying LP remains feasible with the same parameters as before. So one possible strategy for proving the conjecture is to find some subsequence of rules that must arise in any optimal proof, and show that if one assumes c > 2cos(π/7) then this subsequence can be removed from the proof without weakening it. However, this strategy appears to be difficult to carry out.
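The queue-based heuristic search used for the longer proofs above can be sketched as follows. All names here are mine, and the scoring function is only a stand-in: in the real search, scoring an annotation means solving its linear program and returning the lower-bound exponent it proves (or nothing if the LP is infeasible).

```python
from heapq import heappush, heappop

RULES = 'US'  # 'U' = Speedup application, 'S' = Slowdown application

def score(annotation):
    # Toy objective rewarding alternation between the two rules;
    # the real objective is the LP-optimal lower-bound exponent.
    return sum(a != b for a, b in zip(annotation, annotation[1:]))

def local_search(seed, rounds=100):
    # Keep a queue of best-found annotations; repeatedly pop one and
    # try to improve it by inserting a single rule application.
    best = seed
    queue = [(-score(seed), seed)]
    while queue and rounds > 0:
        _, ann = heappop(queue)
        for i in range(len(ann) + 1):
            for r in RULES:
                cand = ann[:i] + r + ann[i:]
                if score(cand) > score(best):
                    best = cand
                    heappush(queue, (-score(cand), cand))
        rounds -= 1
    return best

best = local_search('US', rounds=3)
assert score(best) >= 2   # strictly improves on the two-rule seed
```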
4 Finding Small Circuits

In this last section, I will speculate about a computational approach to understanding Boolean circuit complexity. The following has been a joint effort with Maverick Woo.

Our knowledge of Boolean circuit complexity is quite poor. (For concreteness, let us concentrate on circuits comprised of AND, OR, and NOT gates.) We do not know how to prove strong circuit lower bounds for problems in P; the best known is 5n [LR01, IM02]. One good reason why we don't know much about the true power of circuits is that we don't have many examples of minimum circuits. We don't know, for example, what an optimal circuit for 3 × 3 Boolean matrix multiplication looks like. It is possible that we could make progress in understanding circuits by cataloging the smallest circuits we know for basic functions, on small input sizes (such as n = 1, ..., 10). This suggestion makes more sense for some problems than others. For SAT, the circuit complexity can depend on the encoding of Boolean formulas; for matrix operations, the encoding is clear.

Sloane and Plouffe first published [SP95] and now maintain the Online Encyclopedia of Integer Sequences, an exhaustive catalog of interesting sequences that arise in mathematics and the sciences. Might we benefit from an Encyclopedia of Minimum Circuits? For example, what do the smallest Boolean circuits for 10 × 10 Boolean matrix multiplication look like? Are they regular in structure? It is likely that the answers would give valuable insight into the complexity of the problem. The best algorithms we know of reduce the problem to matrix multiplication over a ring, which is then solved by a highly regular, recursive construction (such as Strassen's [Str69]).
Even if the cataloged circuits are not truly minimal but close to it, concrete examples for small inputs could be useful for theoreticians to mine for inspiration, or perhaps for computers to mine for patterns via machine learning techniques. The power of small examples should not be underestimated.

How can we get small examples of minimum circuits? One potential approach is to reduce this task to the task of developing good solvers for quantified Boolean formulas, an area in AI that has seen much technical progress lately. We can pose the problem of finding a small circuit as a quantified Boolean formula (QBF), and feed the QBF to one of many recently developed QBF solvers. A QBF Φ_{s,n} for size-s n × n matrix multiplication circuits can be stated roughly as:

Φ_{s,n} = (∃ circuit C of s gates, 2n^2 inputs, n^2 outputs)(∀ n × n matrices X, Y)[X · Y = C(X, Y)],

where the predicate can be easily encoded as a SAT instance. In our encoding, we allow the circuits to have unbounded fan-in. For simplicity, we searched for circuits made up of only NOR gates. Experiments with QBF solvers have not yet revealed significant new insight. So far, they have discovered one fact: the optimal size circuit for 2 × 2 Boolean matrix multiplication is the obvious one. Well, duh. What about the 3 × 3 case? This is already difficult! The sKizzo QBF solver [Ben05] can prove that there is no circuit for 3 × 3 that has 10 gates, but nothing beyond that. Even when we restrict the gates to have fan-in two, the solver crashes on larger instances. I do not see this limited progress as a substantial deterrent. On the practical side, QBF solvers have only seen serious scientific attention in the last several years, and huge developmental strides have already been made.
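For intuition on why small cases are easy to verify (if not to search), note that the ∀ part of Φ_{s,n} for n = 2 ranges over only 2^8 matrix pairs. The brute-force check below is my own encoding (with the "obvious" circuit written directly as AND/OR expressions in Python rather than as NOR gates): a candidate circuit is correct iff it agrees with the Boolean matrix product on all 256 pairs.

```python
from itertools import product

def bool_matmul(X, Y):
    # Boolean 2x2 matrix product: OR over k of (X[i][k] AND Y[k][j]).
    return [[max(min(X[i][k], Y[k][j]) for k in range(2))
             for j in range(2)] for i in range(2)]

def obvious_circuit(X, Y):
    # The "obvious" circuit: (X.Y)[i][j] = (x_i0 & y_0j) | (x_i1 & y_1j).
    return [[(X[i][0] & Y[0][j]) | (X[i][1] & Y[1][j])
             for j in range(2)] for i in range(2)]

def matrices():
    # All 16 Boolean 2x2 matrices.
    for bits in product((0, 1), repeat=4):
        yield [list(bits[:2]), list(bits[2:])]

# The universal phase of Phi_{s,2}, checked exhaustively (256 pairs).
assert all(bool_matmul(X, Y) == obvious_circuit(X, Y)
           for X in matrices() for Y in matrices())
```

The hard part, of course, is the existential phase: searching over all candidate circuits of s gates, which is where the QBF solver comes in.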
On the theoretical side, if we look at matrix multiplication over a field, a better way to approach the problem would be to phrase the QBF for small circuits as something like a "Merlin-Arthur formula": that is, we guess the circuit to be used and verify that it computes the product by evaluating it on random matrices. In that case, the instances should be much easier to solve. In general, we can find approximately minimum circuits for problems with polysize circuits in ZPP^NP [BCGKT96]. By using a high-quality SAT solver in place of the NP oracle, the idea of building an approximate circuit encyclopedia does not seem too implausible. However, some effort will be needed to adapt the results to work in practice.

I do believe that in the near future, the general problem of finding small minimal circuits for problems in P will be within the reach of practice. Analyzing these new gadgets should inject a fresh dose of ideas into the area of circuit complexity.

References

[ABBIMP05] M. Alekhnovich, A. Borodin, J. Buresh-Oppenheim, R. Impagliazzo, A. Magen, and T. Pitassi. Toward a model for backtracking and dynamic programming. Proc. IEEE Conference on Computational Complexity, 308–322, 2005.

[AH77a] K. Appel and W. Haken. Every planar map is four colorable. Part I. Discharging. Illinois J. Math. 21:429–490, 1977.

[AH77b] K. Appel, W. Haken, and J. Koch. Every planar map is four colorable. Part II. Reducibility. Illinois J. Math. 21:491–567, 1977.

[AS98] S. Arora and S. Safra. Probabilistic checking of proofs: A new characterization of NP. J. ACM 45(1):70–122, 1998.

[ALMSS98] S. Arora, C. Lund, R. Motwani, M. Sudan, and M. Szegedy. Proof verification and the hardness of approximation problems. J. ACM 45(3):501–555, 1998.

[BGS98] M. Bellare, O. Goldreich, and M. Sudan. Free bits, PCPs, and non-approximability: towards tight results. SIAM J.
Comput. 27(3):804–915, 1998.

[Ben05] M. Benedetti. sKizzo: A suite to evaluate and certify QBFs. Proc. Int'l Conf. on Automated Deduction, 369–376, 2005.

[BCGKT96] N. H. Bshouty, R. Cleve, R. Gavaldà, S. Kannan, and C. Tamon. Oracles and queries that are sufficient for exact learning. J. Comput. Syst. Sci. 52(3):421–433, 1996.

[Coo72] S. A. Cook. A hierarchy for nondeterministic time complexity. Proc. ACM STOC, 187–192, 1972.

[Coo88] S. A. Cook. Short propositional formulas represent nondeterministic computations. Information Processing Letters 26(5):269–270, 1988.

[Epp06] D. Eppstein. Quasiconvex analysis of multivariate recurrence equations for backtracking algorithms. ACM Trans. on Algorithms 2(4):492–509, 2006.

[FK06] S. S. Fedin and A. S. Kulikov. Automated proofs of upper bounds on the running time of splitting algorithms. J. Math. Sciences 134(5):2383–2391, 2006.

[FGK05a] F. V. Fomin, F. Grandoni, and D. Kratsch. Measure and conquer: domination – a case study. Proc. ICALP, 191–203, 2005.

[FGK05b] F. V. Fomin, F. Grandoni, and D. Kratsch. Some new techniques in design and analysis of exact (exponential) algorithms. Bulletin of the EATCS 87:47–77, 2005.

[FGK06] F. V. Fomin, F. Grandoni, and D. Kratsch. Measure and conquer: a simple O(2^{0.288n}) independent set algorithm. Proc. ACM-SIAM SODA, 18–25, 2006.

[FLvMV05] L. Fortnow, R. Lipton, D. van Melkebeek, and A. Viglas. Time-space lower bounds for satisfiability. Journal of the ACM 52(6):835–865, 2005.

[GJS76] M. Garey, D. Johnson, and L. Stockmeyer. Some simplified NP-complete graph problems. Theor. Comput. Sci. 1:237–267, 1976.

[GW95] M. Goemans and D. Williamson. Improved approximation algorithms for maximum cut and satisfiability problems using semidefinite programming. J. ACM 42:1115–1145, 1995.

[GGHN04] J. Gramm, J. Guo, F. Hüffner, and R. Niedermeier.
Automated generation of search tree algorithms for hard graph modification problems. Algorithmica 39:321–347, 2004.

[Hal05] T. C. Hales. A proof of the Kepler conjecture. Annals of Math. 162:1065–1185, 2005.

[HS00] J. Hass and R. Schlafly. Double bubbles minimize. Annals of Math. 151:459–515, 2000.

[Has01] J. Håstad. Some optimal inapproximability results. J. ACM 48:798–859, 2001.

[IM02] K. Iwama and H. Morizumi. An explicit lower bound of 5n − o(n) for Boolean circuits. Proc. MFCS, 353–364, 2002.

[IT04] K. Iwama and S. Tamaki. Improved upper bounds for 3-SAT. Proc. ACM-SIAM SODA, 321–322, 2004.

[Kan84] R. Kannan. Towards separating nondeterminism from determinism. Mathematical Systems Theory 17(1):29–45, 1984.

[KZ97] H. J. Karloff and U. Zwick. A 7/8-approximation algorithm for MAX 3SAT? Proc. IEEE FOCS, 406–415, 1997.

[KK06] A. Kojevnikov and A. S. Kulikov. A new approach to proving upper bounds for MAX-2-SAT. Proc. ACM-SIAM SODA, 11–17, 2006.

[LR01] O. Lachish and R. Raz. Explicit lower bound of 4.5n − o(n) for Boolean circuits. Proc. ACM STOC, 399–408, 2001.

[LTS89] C. W. H. Lam, L. Thiel, and S. Swiercz. The nonexistence of finite projective planes of order 10. Canad. J. Math. 41:1117–1123, 1989.

[McC97] W. McCune. Solution of the Robbins problem. JAR 19(3):263–276, 1997.

[vM04] D. van Melkebeek. Time-space lower bounds for NP-complete problems. In Current Trends in Theoretical Computer Science, 265–291, World Scientific, 2004.

[vM07] D. van Melkebeek. A survey of lower bounds for satisfiability and related problems. Foundations and Trends in TCS 2:197–303, 2007.

[NS03] S. I. Nikolenko and A. V. Sirotkin. Worst-case upper bounds for SAT: automated proof. Proc. ESSLLI, 225–232, 2003. URL: http://logic.pdmi.ras.ru/~sergey/

[RSST97] N. Robertson, D. P. Sanders, P. D. Seymour, and R. Thomas.
A new proof of the four colour theorem. J. Combinatorial Theory B 70:2–44, 1997.

[Rob01] M. Robson. Finding a maximum independent set in time O(2^{n/4}). Technical Report 1251-01, LaBRI, Université de Bordeaux I, 2001. URL: http://www.labri.fr/perso/robson/mis/techrep.html

[Sav70] W. J. Savitch. Relationships between nondeterministic and deterministic tape classes. J. Comp. Sys. Sci. 4:177–192, 1970.

[Sch78] C. Schnorr. Satisfiability is quasilinear complete in NQL. Journal of the ACM 25(1):136–145, 1978.

[SS07] A. D. Scott and G. B. Sorkin. Linear-programming design and analysis of fast algorithms for Max 2-CSP. Discr. Optimization 4(3–4):260–287, 2007.

[SP95] N. J. A. Sloane and S. Plouffe. The encyclopedia of integer sequences. Academic Press, 1995.

[Str69] V. Strassen. Gaussian elimination is not optimal. Numer. Math. 13:354–356, 1969.

[TSSW00] L. Trevisan, G. Sorkin, M. Sudan, and D. P. Williamson. Gadgets, approximation, and linear programming. SIAM J. Computing 29(6):2074–2097, 2000.

[Wil07] R. Williams. Time-space tradeoffs for counting NP solutions modulo integers. Computational Complexity 17(2):179–219, 2008.

[Wil08] R. Williams. Automated proofs of time lower bounds. Manuscript available at http://www.cs.cmu.edu/~ryanw/projects.html.

[Zwi02] U. Zwick. Computer assisted proof of optimal approximability results. Proc. ACM-SIAM SODA, 496–505, 2002.