On the freezing of variables in random constraint satisfaction problems

The set of solutions of random constraint satisfaction problems (zero energy groundstates of mean-field diluted spin glasses) undergoes several structural phase transitions as the amount of constraints is increased. This set first breaks down into a …

Authors: Guilhem Semerjian

Guilhem Semerjian
LPTENS, Unité Mixte de Recherche (UMR 8549) du CNRS et de l'ENS associée à l'université Pierre et Marie Curie, 24 Rue Lhomond, 75231 Paris Cedex 05, France
(Dated: May 29, 2018)

The set of solutions of random constraint satisfaction problems (zero energy groundstates of mean-field diluted spin glasses) undergoes several structural phase transitions as the amount of constraints is increased. This set first breaks down into a large number of well separated clusters. At the freezing transition, which is in general distinct from the clustering one, some variables (spins) take the same value in all solutions of a given cluster. In this paper we study the critical behavior around the freezing transition, which appears in the unfrozen phase as the divergence of the sizes of the rearrangements induced in response to the modification of a variable. The formalism is developed on generic constraint satisfaction problems and applied in particular to the random satisfiability of boolean formulas and to the coloring of random graphs. The computation is first performed in random tree ensembles, for which we underline a connection with percolation models and with the reconstruction problem of information theory. The validity of these results for the original random ensembles is then discussed in the framework of the cavity method.

I. INTRODUCTION

The theory of computational complexity [1] establishes a classification of constraint satisfaction problems (CSP) according to their difficulty in the worst case. For concreteness let us introduce the three problems we shall use as running examples in the paper:

• $k$-XORSAT. Find a vector $\vec{x}$ of boolean variables satisfying the linear equations $A\vec{x} = \vec{b} \pmod 2$, where each row of the 0/1 matrix $A$ contains exactly $k$ non-null elements, and $\vec{b}$ is a given boolean vector.

• $q$-coloring ($q$-COL). Given a graph, assign one of $q$ colors to each of its vertices, without giving the same color to the two extremities of an edge.

• $k$-satisfiability ($k$-SAT). Find a solution of a boolean formula made of the conjunction (logical AND) of clauses, each made of the disjunction (logical OR) of $k$ literals (a variable or its logical negation).

Each of these problems admits several variants. In the decision version one has to assert the existence or not of a solution, for instance a proper coloring of a given graph. More elaborate questions are the estimation of the number of such solutions, or, in the absence of solution, the discovery of optimal configurations, for instance colorings minimizing the number of monochromatic edges. The decision variants of the three examples stated above fall into two distinct complexity classes: $k$-XORSAT is in the P class, while the two others are NP-complete for $k, q \ge 3$ (see [2] for a classification of generic boolean CSPs). This means that the existence of a solution of the XORSAT problem can be decided in a time growing polynomially with the number of variables, for any instance of the problem; one can indeed use the Gaussian elimination algorithm. On the contrary no fast algorithm capable of solving every coloring or satisfiability problem is known, and the existence of such a polynomial time algorithm is considered highly improbable.

This notion of computational complexity, being based on worst-case considerations, could overlook the possibility that "most" of the instances of an NP problem are in fact easy and that the difficult cases are very rare.
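The remark on Gaussian elimination can be made concrete. The following is a minimal sketch, not taken from the paper, of how a XORSAT instance $A\vec{x} = \vec{b} \pmod 2$ might be decided in polynomial time; the function name `solve_xorsat` and the bitmask encoding of the equations are illustrative assumptions.

```python
def solve_xorsat(rows, rhs, n_vars):
    """Solve A x = b (mod 2) by Gaussian elimination over GF(2).

    rows[a] lists the variable indices appearing in equation a, rhs[a] is the
    bit b_a. Returns one solution as a list of bits, or None if the system is
    inconsistent. (Hypothetical helper, written for illustration only.)
    """
    # Each equation is an integer bitmask; bit n_vars stores the right-hand side.
    eqs = [sum(1 << i for i in set(r)) | (b << n_vars) for r, b in zip(rows, rhs)]
    var_mask = (1 << n_vars) - 1
    pivots = {}  # pivot column -> fully reduced equation
    for eq in eqs:
        # Remove every existing pivot column from the new equation.
        for col, pivot_eq in pivots.items():
            if (eq >> col) & 1:
                eq ^= pivot_eq
        if eq & var_mask == 0:
            if eq >> n_vars:      # the equation reduced to 0 = 1
                return None
            continue              # redundant equation
        col = (eq & var_mask).bit_length() - 1
        # Keep the system in reduced form: clear the new pivot column elsewhere.
        for other in pivots:
            if (pivots[other] >> col) & 1:
                pivots[other] ^= eq
        pivots[col] = eq
    x = [0] * n_vars              # free variables are set to 0
    for col, pivot_eq in pivots.items():
        x[col] = (pivot_eq >> n_vars) & 1
    return x


# Three equations on four boolean variables (k = 3).
rows = [[0, 1, 2], [1, 2, 3], [0, 2, 3]]
rhs = [1, 0, 1]
x = solve_xorsat(rows, rhs, 4)
```

The returned assignment can be checked by evaluating each parity constraint directly; no comparably simple procedure is known for coloring or satisfiability.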
Random ensembles of problems have thus been introduced in order to give a quantitative content to this notion of typical instances; a property of a problem will be considered as typical if its probability (with respect to the random choice of the instance) goes to one in the limit of large problem sizes. Most random ensembles depend on an external parameter that can be varied continuously. In the coloring problem one can for instance consider the traditional Erdős-Rényi random graphs [3] which are parameterized by their mean connectivity $c$. For (XOR)SAT instances this role is played by the ratio $\alpha$ of the number of constraints (clauses for SAT or rows in the matrix for XORSAT) to the number of variables. A remarkable threshold phenomenon, first observed numerically [4], occurs when this parameter is varied: when a particular value $c_s$, $\alpha_s$ is crossed from below, the instances go from typically satisfiable to typically unsatisfiable. This statement has been rigorously proven for XORSAT [5, 6] and for 2-SAT [7]; in the other cases it is only a largely accepted conjecture, with sharpness condition on the width of the transition window [8] and bounds on its possible location [9, 10].

Threshold phenomena are largely studied in statistical mechanics under the name of phase transitions. There is moreover a natural analogy between optimization problems and statistical mechanics; if one defines the energy as the number of violated constraints, for instance the number of monochromatic edges, the optimal configurations of a problem coincide with the groundstates of the associated physical system, an antiferromagnetic Potts model in the coloring case. This analogy triggered a large amount of research, relying on methods of statistical mechanics of disordered systems originally devised for the study of mean-field spin-glasses [11].
Early examples of this approach for the satisfiability and coloring problems can be found in [12, 13].

One of the most interesting outcomes of this line of research [14, 15] has been the suggestion that other structural threshold phenomena take place before the satisfiability one [footnote 1]. The set of solutions of a random CSP, viewed as a subset of the whole configuration space, is smooth at low values of the constraint ratio but becomes fragmented into clusters of solutions for intermediate values of the control parameter, $\alpha \in [\alpha_d, \alpha_s]$. This clustering transition has been rigorously demonstrated in the XORSAT case [5, 6], for which it has a simple geometric interpretation. $\alpha_d$ is indeed the threshold for the percolation of the 2-core of the hypergraph underlying the CSP; between $\alpha_d$ and $\alpha_s$ there is typically a finite fraction of the variables and constraints in a peculiar sub-formula known as the backbone. Every solution of the backbone gives birth to a cluster of the complete formula. The variables of the backbone are said to be frozen in a given cluster, i.e. they take the same value in all the solutions belonging to a cluster; this is merely a consequence of the definition of a cluster in this case.

Establishing a precise and generic definition of the clusters is not an easy task, not to mention proving tight rigorous results on their existence or properties (for recent results in this direction see [16–19]). Even at the heuristic level, it was recently argued [20–22] that the computation of $\alpha_d$ for random satisfiability (or $c_d$ for coloring) by previous statistical mechanics studies [23, 24] was incorrect. Roughly speaking, in these two models, the sizes of the clusters can have large fluctuations [25] that must be taken into consideration.
In [20] the existence of yet another threshold (for $k, q \ge 4$) $\alpha_c \in [\alpha_d, \alpha_s]$ was also pointed out; this condensation threshold separates two clustered regimes, one where the relevant clusters are exponentially numerous (for smaller values of $\alpha$) and the other where there is only a sub-exponential number of them.

The clustering transition of XORSAT, because of its geometric interpretation, is certainly a good example on which to develop one's intuition of the clustering phenomenon. There are however at least two aspects in which XORSAT departs from other CSPs and where the intuitive picture must be taken with a grain of salt. The first is that the clusters of XORSAT all have the same size, because of the linear algebra structure of its set of solutions. For this reason the condensation phenomenon is not present in XORSAT. The second point is that clusters of XORSAT have frozen variables, by definition. There is however no obvious reason that this should be true for any CSP. On the contrary we shall argue in this paper that in general frozen variables appear at another value $\alpha_f$ of the control parameter, with generically $\alpha_f \in [\alpha_d, \alpha_s]$. This was one of the results of [21, 22]; here we shall develop this point and quantify the precursors of the transition before $\alpha_f$. For this we build upon the study of XORSAT presented in [26] and extend it to generic CSPs, in particular satisfiability and coloring.

The central notion studied here is the one of rearrangement (to some extent related to the long-range frustration of [27]): given an initial solution of a CSP and a variable $i$ that one would like to modify, a rearrangement is a path in configuration space that starts from the initial solution and leads to another solution where the value of the $i$'th variable is changed with respect to the initial one.
The minimal length of such a path is a measure of how constrained the variable $i$ was in the initial configuration. In intuitive terms this length diverges with the system size when the variable was frozen in the initial cluster.

The paper is organized as follows. In Sec. II we introduce a generic class of CSPs and make precise the definition of the rearrangements. Sections III and IV are devoted to modified (tree) random ensembles in which the approach is essentially rigorous; the former presents detailed computations in a rather generic setting and its application to the three selected examples, while the latter presents the numerical results and discusses the generic phenomenology at the approach of the freezing transition in the tree ensembles, with some more technical details deferred to App. A. The computation is reconsidered in the perspective of the reconstruction problem in Sec. V. The applicability of these results to the original ensembles is discussed in Sec. VI, through a precise statement of the hypotheses of the cavity method. Conclusions and perspectives for future work are presented in Sec. VII.

Footnote 1: It was of course already known that the algorithms rigorously studied to derive lower bounds on the satisfiability threshold work only up to values of $\alpha$ smaller than $\alpha_s$ [9]. These values are however largely algorithm-dependent and not directly related to a change of structure in the configuration space.

II. DEFINITIONS

[FIG. 1: An example of factor graph. The neighborhoods are for instance $\partial i = \{a, b, c, d\}$ and $\partial i \setminus a = \{b, c, d\}$.]

We introduce here some notations and definitions for a class of problems that encompasses the three examples we shall treat in more detail. The degrees of freedom of the CSP will be $N$ variables $\sigma_i$ taking values in a discrete alphabet $\mathcal{X}$; global configurations are denoted $\underline{\sigma} = (\sigma_1, \dots, \sigma_N)$.
An instance (or formula) $F$ of the CSP is a set of $M$ constraints between the variables $\sigma_i$. The $a$'th constraint is defined by a function $\psi_a(\underline{\sigma}_a) \to \{0, 1\}$, which depends on the configuration of a subset of the variables $\underline{\sigma}_a$ and is equal to 1 if the constraint is satisfied, 0 otherwise. The set $S_F \subset \mathcal{X}^N$ of solutions of $F$ is composed of the configurations satisfying simultaneously all the constraints. It can thus be formally defined as $S_F = \{\underline{\sigma} \,|\, \psi_F(\underline{\sigma}) = 1\}$, where the indicator function $\psi_F$ is

$$\psi_F(\underline{\sigma}) = \prod_{a=1}^{M} \psi_a(\underline{\sigma}_a) . \tag{1}$$

When the formula admits a positive number of solutions, call it $Z_F$, the uniform measure over the solutions is denoted $\mu_F(\underline{\sigma}) = \psi_F(\underline{\sigma}) / Z_F$.

Factor graphs [28] provide a useful representation of a CSP. These graphs (see Fig. 1 for an example) have two kinds of nodes. Variable nodes (filled circles on the figure) are associated to the degrees of freedom $\sigma_i$, while constraint nodes (empty squares) represent the clauses $\psi_a$. An edge between constraint $a$ and variable $i$ is drawn whenever $\psi_a$ depends on $\sigma_i$. The neighborhood $\partial a$ of a constraint node is the set of variable nodes that appear in $\underline{\sigma}_a$. Conversely $\partial i$ is the set of constraints that depend on $\sigma_i$. We shall conventionally use the indices $i, j, \dots$ for the variable nodes, $a, b, \dots$ for the constraints, and denote $\setminus$ the subtraction from a set. Two variable nodes are called adjacent if they appear in a common constraint. The graph distance between two variable nodes $i$ and $j$ is the number of constraint nodes encountered on a shortest path linking $i$ and $j$ (formally infinite if the two variables are not in the same connected component of the graph).

The three illustrative examples presented above admit a simple representation in this formalism:

• $k$-XORSAT. The degrees of freedom of this CSP are boolean variables that we shall represent, following the physics conventions, by Ising spins, $\mathcal{X} = \{-1, +1\}$. Each constraint involves a subset of $k$ variables, $\underline{\sigma}_a = (\sigma_{i_1^a}, \dots, \sigma_{i_k^a})$, and reads $\psi_a(\underline{\sigma}_a) = \mathbb{I}(\sigma_{i_1^a} \cdots \sigma_{i_k^a} = J_a)$, where here and in the following $\mathbb{I}(\cdot)$ denotes the indicator function of an event and $J_a \in \{-1, +1\}$ is a given constant. This is equivalent to the definition given in the introduction: defining $x_i, b_a \in \{0, 1\}$ such that $\sigma_i = (-1)^{x_i}$ and $J_a = (-1)^{b_a}$, the constraint imposed by $\psi_a$ reads $x_{i_1^a} + \dots + x_{i_k^a} = b_a \pmod 2$, which is nothing but the $a$'th row of the matrix equation $A\vec{x} = \vec{b}$. The addition modulo 2 of boolean variables can also be read as the binary exclusive OR operation, hence the name XORSAT used for this problem.

• $q$-COL. Here $\mathcal{X} = \{1, \dots, q\}$ is the set of allowed colors on the $N$ vertices of a graph. Each edge $a$ connecting the vertices $i$ and $j$ prevents them from being of the same color: $\psi_a(\sigma_i, \sigma_j) = \mathbb{I}(\sigma_i \neq \sigma_j)$.

• $k$-SAT. As in the XORSAT problem one deals with Ising represented boolean variables, but in each clause the XOR operation between variables is replaced by an OR between literals (i.e. a variable or its negation). In other words a constraint $a$ is unsatisfied only when all literals evaluate to false, or in Ising terms when all spins $\sigma_i$ involved in the constraint take their wrong value that we denote $J_i^a$: $\psi_a(\underline{\sigma}_a) = 1 - \mathbb{I}(\sigma_i = J_i^a \ \forall i \in \partial a)$.

The random ensembles of CSP instances we shall use are defined as follows:

• $k$-XORSAT. For each of the $M$ clauses $a$, a $k$-uplet of distinct variable indices $(i_1^a, \dots, i_k^a)$ is chosen uniformly at random among the $\binom{N}{k}$ possible ones, and the constant $J_a$ is taken to be $\pm 1$ with probability one-half.

• $q$-coloring. A set of $M$ among the $\binom{N}{2}$ possible edges $a = \{i, j\}$ is chosen uniformly at random.

• $k$-SAT. The variables $i_j^a$ are chosen as in the XORSAT ensemble, and the $J_i^a$ are independently taken to be $\pm 1$ with equal probability.

For the coloring problem this construction is the classical Erdős-Rényi random graph $G(N, M)$; the two other cases are its random hypergraph generalization.

We are interested in the thermodynamic limit of large instances where $N$ and $M$ both diverge with a fixed ratio $\alpha = M/N$ [footnote 2]. Random (hyper)graphs have many interesting properties in this limit [3]. For instance the degree of a variable node of the factor graph converges to a Poisson law of average $\alpha k$ for the XORSAT and SAT cases, and $2\alpha$ for the coloring ensemble. For clarity in the latter case we shall use the notation $c = 2\alpha$ for the average connectivity. Moreover, picking at random one variable node $i$ and isolating the subgraph induced by the variable nodes at a graph distance smaller than a given constant $L$ yields, with a probability going to one in the thermodynamic limit, a (random) tree. This tree can be described by a Galton-Watson branching process: the root $i$ belongs to $l$ constraints, where $l$ is a Poisson random variable of parameter $\alpha k$ ($c$ in the coloring case). The variable nodes adjacent to $i$ give themselves birth to new constraints, in numbers which are independently Poisson distributed with the same parameter. This reproduction process is iterated on $L$ generations, until the variable nodes at graph distance $L$ from the initial root $i$ have been generated.

We now define the main object of our study. First recall the well-known definition of the Hamming distance between two configurations, $d(\underline{\sigma}, \underline{\tau}) = \sum_{i=1}^{N} \mathbb{I}(\sigma_i \neq \tau_i)$. Consider an initial solution of the formula, $\underline{\sigma} \in S_F$, and imagine one wants to modify the value of the variable $i$.
A rearranged solution is a new configuration $\underline{\tau} \in S_F$ such that $\tau_i \neq \sigma_i$. The minimal size of a rearrangement (m.s.r.) for variable $i$ starting from $\underline{\sigma} \in S_F$ is defined as

$$n_i(\underline{\sigma}, F) = \min_{\underline{\tau}} \{ d(\underline{\sigma}, \underline{\tau}) \,|\, \underline{\tau} \in S_F , \ \tau_i \neq \sigma_i \} , \tag{2}$$

and measures how costly (in terms of Hamming distance) it is to perturb the solution at variable $i$ [footnote 3]. It can also be viewed as the minimal length of a path in configuration space, modifying one variable at a time, between $\underline{\sigma}$ and another solution with a different value of variable $i$, thus providing a quantification of how strongly this variable was initially constrained. We shall also speak of the support of a rearrangement as the set of variables which differ in the initial and final configurations, the size of the rearrangement being the cardinality of its support.

In general the m.s.r. will depend on the starting configuration; we thus define its distribution with respect to a uniform choice of $\underline{\sigma}$ (in abbreviation m.s.r.d.),

$$q_n^{(i,F)} = \sum_{\underline{\sigma}} \mu_F(\underline{\sigma}) \, \delta_{n, n_i(\underline{\sigma}, F)} . \tag{3}$$

There should be no possibility of confusion between the distribution $q_n$ and the number $q$ of allowed colors in the $q$-COL problem. When dealing with random CSPs we shall study the average of this distribution,

$$q_n = \mathbb{E}\, q_n^{(i,F)} , \tag{4}$$

where the expectation is taken with respect to the instance ensemble (in the cases considered here all variable nodes are equivalent on average). Its behavior in the thermodynamic limit will drastically change with the connectivity parameter $\alpha$ (or $c$ for the coloring). We shall indeed define the threshold $\alpha_f$ ($c_f$) as the value above which a finite fraction of the distribution $q_n$ is supported on sizes $n$ that diverge with the number of variables. In pictorial terms clusters acquire frozen variables at this point, their rearrangements must be of diverging size and thus lead to a final solution outside the initial cluster.
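Definitions (2) and (3) can be checked by exhaustive enumeration on very small instances. The sketch below is only an illustration under stated assumptions: the helper names (`solutions`, `msr`, `msr_distribution`) and the representation of a formula as a list of `(indices, psi)` pairs are hypothetical, and the enumeration is exponential in $N$, so it is not a method for large instances.

```python
from itertools import product


def solutions(alphabet, n_vars, constraints):
    """All configurations satisfying every constraint (brute force).

    constraints is a list of (variable_indices, psi) pairs, where psi maps the
    tuple of values taken by those variables to 0 or 1.
    """
    return [s for s in product(alphabet, repeat=n_vars)
            if all(psi(tuple(s[i] for i in idx)) for idx, psi in constraints)]


def hamming(sigma, tau):
    """Hamming distance d(sigma, tau)."""
    return sum(a != b for a, b in zip(sigma, tau))


def msr(i, sigma, sols):
    """n_i(sigma, F) of Eq. (2); N + 1 if sigma_i is the same in all solutions."""
    sizes = [hamming(sigma, tau) for tau in sols if tau[i] != sigma[i]]
    return min(sizes) if sizes else len(sigma) + 1


def msr_distribution(i, sols):
    """q_n^{(i,F)} of Eq. (3) under the uniform measure over solutions."""
    dist = {}
    for sigma in sols:
        n = msr(i, sigma, sols)
        dist[n] = dist.get(n, 0.0) + 1.0 / len(sols)
    return dist


# 3-coloring of the path 0 - 1 - 2: two edge constraints sigma_i != sigma_j.
neq = lambda values: int(values[0] != values[1])
path = [((0, 1), neq), ((1, 2), neq)]
sols = solutions((1, 2, 3), 3, path)
dist_end = msr_distribution(0, sols)     # an extremity of the path
dist_mid = msr_distribution(1, sols)     # the middle vertex
```

On this example an extremity can always be recolored alone, while the middle vertex needs a rearrangement of size 2 whenever its two neighbors carry different colors.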
The computation of the average m.s.r.d. will be first undertaken in a random tree ensemble, mimicking the tree neighborhoods of the random graphs. The threshold for the freezing transition in these tree instances will be computed, along with a set of exponents characterizing the behavior of the average m.s.r.d. when the transition is approached from the unfrozen phase. For clarity we shall denote $\alpha_p$ instead of $\alpha_f$ the thresholds in the tree ensembles. We shall then argue in Sec. VI, on the basis of the non-rigorous cavity method, that for some values of $\alpha$ and $k$ the properties of the random graph instances are correctly described by the computations in the tree ensemble. In particular for large enough values of $k$ we shall conjecture that $\alpha_p = \alpha_f$. We will also explain how the computation has to be amended to handle the more elaborate version of the cavity method (with replica-symmetry breaking), and what are the expectedly universal characteristics of the critical behavior at the freezing transition.

Footnote 2: In this limit the quantities studied in this paper are not affected by some variations around these models. For instance in the coloring case $G(N, M)$ can be replaced by the ensemble $G(N, p)$ where each edge is present independently with probability $p = 2\alpha/N$, such that the average number of edges is close to $M$. The choice of the (hyper)edges with or without replacement is also irrelevant.

Footnote 3: If $\sigma_i$ takes the same value in every solution we formally define $n_i = N + 1$.

III. MINIMAL SIZE REARRANGEMENTS IN RANDOM TREE ENSEMBLES

[FIG. 2: The cavity graphs $F_{a \to i}$ and $F_{i \to a}$ obtained from the example of Fig. 1.]

In this and the next Section all the instances of CSP encountered have an underlying factor graph which is a finite tree.
Given such a formula $F$ (or equivalently its factor graph) and an edge $i - a$ between a variable node $i$ and an adjacent constraint node $a$, we define two sub-formulas (cavity graphs) $F_{i \to a}$ and $F_{a \to i}$. $F_{i \to a}$ is obtained from $F$ by deleting the branch of the formula rooted at $i$ starting with constraint $a$. Conversely $F_{a \to i}$ is obtained by keeping only this branch (see Fig. 2). We also decompose the configuration $\underline{\sigma}$ as $(\underline{\sigma}_{a \to i}, \sigma_i, \underline{\sigma}_{i \to a})$, where $\underline{\sigma}_{a \to i}$ (resp. $\underline{\sigma}_{i \to a}$) is the configuration of the variable nodes in $F_{a \to i}$ (resp. $F_{i \to a}$) distinct from $i$. The notation $\underline{\sigma}_{\setminus i}$ will be used for the configuration of all variables except $i$.

The computation, based on the natural recursive structure of trees, will be performed in three steps: we shall first see how to obtain $n_i(\underline{\sigma}, F)$, then its distribution with respect to $\underline{\sigma}$, $q_n^{(i,F)}$, which shall finally be averaged over a random tree ensemble. For notational simplicity $F$ will often be kept implicit. This approach is presented in a general setting before the three specific cases of XORSAT, COL and SAT are treated.

A. General case

1. Given tree, given σ

The computation of the m.s.r. $n_i$ on a tree factor graph can be performed in a recursive way. One has to determine, for each value of $\tau_i \neq \sigma_i$, the cost, in terms of Hamming distance, of the modification $\sigma_i \to \tau_i$. This can be done by computing separately these costs in the factor graphs $F_{a \to i}$ for all the constraint nodes $a$ around $i$ and then patching together the rearrangements of the sub-formulae. Rearranging a factor graph $F_{a \to i}$ amounts to looking for a configuration of the variables $j \in \partial a \setminus i$ which satisfies the interaction $a$ and which provokes a minimal propagation of the rearrangement in the branches $F_{j \to a}$.
To formalize this reasoning we introduce a $q$-component vectorial notation, $\vec{n}$, where the rows of the vectors are indexed by a spin value in $\mathcal{X}$, and we shall denote $[\vec{n}]_\tau$ the $\tau$'th component of $\vec{n}$. We define $\vec{n}_i(\underline{\sigma})$ as the m.s.r. for $i$ starting from the initial configuration $\underline{\sigma}$, and with the final value $\tau_i$ encoded in the row of the vector:

$$[\vec{n}_i(\underline{\sigma})]_{\tau_i} = \min_{\underline{\tau}_{\setminus i}} \{ d(\underline{\sigma}, \underline{\tau} = (\tau_i, \underline{\tau}_{\setminus i})) \,|\, \underline{\tau} \in S_F \} . \tag{5}$$

The original quantity $n_i(\underline{\sigma})$ is obtained from this more detailed one as $n_i(\underline{\sigma}) = \min_{\tau_i \neq \sigma_i} [\vec{n}_i(\underline{\sigma})]_{\tau_i}$. The recursive computation of $\vec{n}_i$ is performed in terms of vectorial messages on the directed edges of the factor graph, $\vec{n}_{i \to a}$ and $\vec{n}_{a \to i}$. The former, $\vec{n}_{i \to a}(\sigma_i, \underline{\sigma}_{i \to a})$, is defined exactly as $\vec{n}_i$ with the cavity graph $F_{i \to a}$ replacing the original formula $F$. The latter reads

$$[\vec{n}_{a \to i}(\underline{\sigma}_{a \to i})]_{\tau_i} = \min_{\underline{\tau}_{a \to i}} \{ d(\underline{\sigma}_{a \to i}, \underline{\tau}_{a \to i}) \,|\, (\tau_i, \underline{\tau}_{a \to i}) \in S_{F_{a \to i}} \} . \tag{6}$$

Note that here one does not count the cost of flipping the root variable, which avoids overcounting when gluing together the cavity graphs. A moment of thought reveals that these messages obey the following recursive equations:

$$\vec{n}_{a \to i}(\underline{\sigma}_{a \to i}) = \tilde{f}(\{\vec{n}_{j \to a}(\sigma_j, \underline{\sigma}_{j \to a})\}_{j \in \partial a \setminus i}) , \qquad \vec{n}_{i \to a}(\sigma_i, \underline{\sigma}_{i \to a}) = \tilde{g}_{\sigma_i}(\{\vec{n}_{b \to i}(\underline{\sigma}_{b \to i})\}_{b \in \partial i \setminus a}) , \tag{7}$$

where the functions $\tilde{f}$ and $\tilde{g}$ are given by

$$\left[ \tilde{f}(\{\vec{n}_{j \to a}\}_{j \in \partial a \setminus i}) \right]_{\tau_i} \equiv \min_{\underline{\tau}_{a \setminus i}} \left\{ \sum_{j \in \partial a \setminus i} [\vec{n}_{j \to a}]_{\tau_j} \ \middle|\ \psi_a(\tau_i, \underline{\tau}_{a \setminus i}) = 1 \right\} , \tag{8}$$

$$[\tilde{g}_\sigma(\vec{n}_1, \dots, \vec{n}_l)]_\tau \equiv \mathbb{I}(\tau \neq \sigma) + [\vec{n}_1]_\tau + \dots + [\vec{n}_l]_\tau . \tag{9}$$

To lighten the notations we keep implicit the dependence of the functions $\tilde{f}$ and $\tilde{g}$ on the edges of the factor graph.
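The two update rules (8) and (9) translate almost literally into code. What follows is a minimal sketch under assumptions that are not in the paper: messages are stored as Python dictionaries mapping a spin value to a cost, the constraint is passed as a function `psi(tau_i, tau_rest)`, and the names `f_tilde`, `g_tilde` and `leaf` are illustrative. The leaf message used in the example is the indicator vector $[\vec{o}(\sigma)]_\tau = \mathbb{I}(\sigma \neq \tau)$, and the minimization in `f_tilde` is done by brute force over the $k-1$ other variables of the constraint.

```python
from itertools import product

INF = float("inf")


def f_tilde(psi, incoming, alphabet):
    """Eq. (8): message from constraint a to variable i.

    psi(tau_i, tau_rest) returns 0 or 1; incoming is the list of vectors
    n_{j->a} (dicts value -> cost) for the variables j of a other than i.
    """
    out = {}
    for tau_i in alphabet:
        best = INF
        for tau_rest in product(alphabet, repeat=len(incoming)):
            if psi(tau_i, tau_rest):
                cost = sum(n[t] for n, t in zip(incoming, tau_rest))
                best = min(best, cost)
        out[tau_i] = best
    return out


def g_tilde(sigma, incoming, alphabet):
    """Eq. (9): sigma is the initial value of the variable emitting the message."""
    return {tau: int(tau != sigma) + sum(n[tau] for n in incoming)
            for tau in alphabet}


def leaf(sigma, alphabet):
    """Leaf message: cost 0 to keep the initial value, 1 to change it."""
    return {tau: int(tau != sigma) for tau in alphabet}


# 3-coloring of the path 0 - 1 - 2 with initial solution (1, 2, 3);
# the middle vertex 1 is taken as the root.
colors = (1, 2, 3)
neq = lambda tau_i, tau_rest: int(tau_i != tau_rest[0])
n_a = f_tilde(neq, [leaf(1, colors)], colors)   # branch containing vertex 0
n_b = f_tilde(neq, [leaf(3, colors)], colors)   # branch containing vertex 2
n_root = g_tilde(2, [n_a, n_b], colors)         # all branches gathered at the root
msr_root = min(cost for tau, cost in n_root.items() if tau != 2)
```

Here both alternative colors of the middle vertex are blocked by one neighbor, so the minimal rearrangement has size 2, in agreement with a direct enumeration of the colorings.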
These equations can be easily solved, for a given initial satisfying assignment $\underline{\sigma}$, noting that the messages from the leaf variable nodes $i$ satisfy the boundary condition $\vec{n}_{i \to a}(\sigma_i) = \vec{o}(\sigma_i)$, where we define $[\vec{o}(\sigma)]_\tau = \mathbb{I}(\sigma \neq \tau)$. The recursions (7) can then be successively applied to determine the value of all messages in a single sweep from the exterior of the graph towards its center. When this is done the m.s.r. for a variable $i$ is obtained from

$$\vec{n}_i(\underline{\sigma}) = \tilde{g}_{\sigma_i}(\{\vec{n}_{a \to i}(\sigma_i, \underline{\sigma}_{a \to i})\}_{a \in \partial i}) . \tag{10}$$

Note that this recursive approach provides not only the size of a minimal rearrangement, but also a final configuration achieving this bound. One just has to keep track, along with the size information encoded in the messages $\vec{n}$, of the configuration reaching the minimum in Eq. (8) (if there are several of them one is chosen arbitrarily). By construction the support of these optimal rearrangements is connected.

2. Given tree, distribution with respect to σ

Following the program sketched above, we introduce now a probability distribution $\mu$ for the initial solution $\underline{\sigma}$ of the formula:

$$\mu(\underline{\sigma}) = \frac{1}{Z} \prod_a \psi_a(\underline{\sigma}_a) \prod_{i \in B} \eta_{\mathrm{ext},i}(\sigma_i) , \tag{11}$$

where $Z$ is a normalization constant, $B$ is a subset of the leaves of the factor graph, and the $\eta_{\mathrm{ext}}$ are probability laws on $\mathcal{X}$ that, by analogy with magnetic systems, we shall call fields. $\mu$ vanishes for configurations which do not satisfy the formula; if $B = \emptyset$ it is uniform on the set of solutions, otherwise the external fields $\eta_{\mathrm{ext}}$ can introduce a bias in the law (this possibility will prove useful in the following). We assume that the expression above remains well defined in the presence of the external fields, i.e. that they do not put a vanishing weight on the solutions of the formula.
The absence of cycles in the factor graph induces a Markovian property of the measure $\mu$ which greatly simplifies its characterization. One can indeed compute recursively the marginals of the law on any subset of variable nodes, introducing on each directed edge of the factor graph another family of messages (cavity measures) $\nu_{a \to i}(\sigma_i)$ (resp. $\eta_{i \to a}(\sigma_i)$). These are the law of $\sigma_i$ in the measure associated to the cavity factor graph $F_{a \to i}$ (resp. $F_{i \to a}$), and are solutions of

$$\nu_{a \to i} = f(\{\eta_{j \to a}\}_{j \in \partial a \setminus i}) , \qquad f(\{\eta_{j \to a}\}_{j \in \partial a \setminus i})(\sigma_i) = \frac{1}{z(\{\eta_{j \to a}\}_{j \in \partial a \setminus i})} \sum_{\underline{\sigma}_{a \setminus i}} \psi_a(\sigma_i, \underline{\sigma}_{a \setminus i}) \prod_{j \in \partial a \setminus i} \eta_{j \to a}(\sigma_j) , \tag{12}$$

$$\eta_{i \to a} = g(\{\nu_{b \to i}\}_{b \in \partial i \setminus a}) , \qquad g(\{\nu_{b \to i}\}_{b \in \partial i \setminus a})(\sigma_i) = \frac{1}{z(\{\nu_{b \to i}\}_{b \in \partial i \setminus a})} \prod_{b \in \partial i \setminus a} \nu_{b \to i}(\sigma_i) , \tag{13}$$

where the functions $z$ are defined by normalization. Again for clarity we do not indicate explicitly the dependence of the functions $f$, $g$ and $z$ on the edges. The boundary conditions are $\eta_{i \to a} = \eta_{\mathrm{ext},i}$ when $i$ is a leaf in $B$, $\eta_{i \to a} = \overline{\eta}$ (the uniform law on $\mathcal{X}$) if $i$ is a leaf not in $B$. This set of equations enjoys the same structure as the one on the $\vec{n}$'s (see Eq. (7)), and can also be solved in a sweep from the leaves of the factor graph.

The marginals of $\mu$ for any connected subset of variables can be easily expressed in terms of the solution of this set of equations. For instance the marginal of a single variable reads

$$\mu(\sigma_i) = g(\{\nu_{a \to i}\}_{a \in \partial i})(\sigma_i) , \tag{14}$$

while the variables of a constraint, conditioned to the value of one of them, are drawn according to

$$\mu(\underline{\sigma}_{a \setminus i} \,|\, \sigma_i ; \{\eta_{j \to a}\}_{j \in \partial a \setminus i}) = \frac{1}{z(\sigma_i, \{\eta_{j \to a}\}_{j \in \partial a \setminus i})} \, \psi_a(\sigma_i, \underline{\sigma}_{a \setminus i}) \prod_{j \in \partial a \setminus i} \eta_{j \to a}(\sigma_j) , \tag{15}$$

where again $z$ is a normalizing factor.
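Equations (12) to (14) can be illustrated on a tiny tree. The sketch below is only indicative and rests on assumptions of its own: cavity measures are stored as dictionaries from spin values to probabilities, the sum in Eq. (12) is carried out by brute force, and the names `f_bp` and `g_bp` are hypothetical.

```python
from itertools import product


def f_bp(psi, etas, alphabet):
    """Eq. (12): nu_{a->i} computed from the eta_{j->a}, j in the constraint except i."""
    weights = {}
    for s_i in alphabet:
        total = 0.0
        for rest in product(alphabet, repeat=len(etas)):
            if psi(s_i, rest):
                term = 1.0
                for eta, s_j in zip(etas, rest):
                    term *= eta[s_j]
                total += term
        weights[s_i] = total
    z = sum(weights.values())            # normalization z of Eq. (12)
    return {s: w / z for s, w in weights.items()}


def g_bp(nus, alphabet):
    """Eqs. (13) and (14): normalized product of the incoming nu's."""
    weights = {s: 1.0 for s in alphabet}
    for nu in nus:
        for s in alphabet:
            weights[s] *= nu[s]
    z = sum(weights.values())
    return {s: w / z for s, w in weights.items()}


# Variable 0 belongs to two 2-SAT clauses, each shared with one leaf variable;
# a clause is violated only when both of its spins equal -1 (their "wrong" value).
spins = (-1, +1)
clause = lambda s_i, rest: 0 if (s_i == -1 and rest[0] == -1) else 1
uniform = {-1: 0.5, +1: 0.5}             # unbiased leaves (B empty)
nu = f_bp(clause, [uniform], spins)      # same message from both clauses
marginal = g_bp([nu, nu], spins)         # Eq. (14) at the root
```

For this formula five of the eight configurations are solutions, four of them with the root equal to $+1$, so the marginal $\mu(\sigma_0 = +1) = 4/5$ can be compared with the output of `g_bp`.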
We have now to compute the distribution of the minimal size rearrangements when the starting configuration $\underline{\sigma}$ is drawn from $\mu$. The generation of $\underline{\sigma}$ can be performed in a recursive broadcasting way: one first draws an arbitrarily chosen root variable $\sigma_i$ according to its marginal $\mu(\sigma_i)$. Because the factor graph is a tree, the law of the remaining variables factorizes on the different branches around $i$,

$$\mu(\underline{\sigma}_{\setminus i} \,|\, \sigma_i) = \prod_{a \in \partial i} \mu(\underline{\sigma}_{a \to i} \,|\, \sigma_i) . \tag{16}$$

For each branch $F_{a \to i}$ one proceeds by drawing the variables of $\underline{\sigma}_{a \setminus i}$, conditioned on $\sigma_i$ (see Eq. (15)). Then the value of $\sigma_j$ for each $j \in \partial a \setminus i$ conditions the generation of $\underline{\sigma}_{j \to a}$, which can itself be broken in subtrees as in Eq. (16). This process is repeated outwards until the leaves of the tree are reached.

This observation leads us to introduce the distribution of the $\vec{n}$'s messages with respect to the conditional distributions of the initial configuration,

$$q^{(i \to a, \sigma_i)}_{\vec{n}} = \sum_{\underline{\sigma}_{i \to a}} \mu(\underline{\sigma}_{i \to a} \,|\, \sigma_i) \, \delta_{\vec{n}, \vec{n}_{i \to a}(\sigma_i, \underline{\sigma}_{i \to a})} , \qquad \hat{q}^{(a \to i, \sigma_i)}_{\vec{n}} = \sum_{\underline{\sigma}_{a \to i}} \mu(\underline{\sigma}_{a \to i} \,|\, \sigma_i) \, \delta_{\vec{n}, \vec{n}_{a \to i}(\underline{\sigma}_{a \to i})} . \tag{17}$$

Combining the recursive computations of the messages $\vec{n}$ expressed in Eq. (7) and the recursive generation of the initial configuration $\underline{\sigma}$ leads to

$$\hat{q}^{(a \to i, \sigma_i)}_{\vec{n}} = \sum_{\underline{\sigma}_{a \setminus i}} \mu(\underline{\sigma}_{a \setminus i} \,|\, \sigma_i ; \{\eta_{j \to a}\}) \prod_{j \in \partial a \setminus i} \sum_{\vec{n}_{j \to a}} q^{(j \to a, \sigma_j)}_{\vec{n}_{j \to a}} \ \delta_{\vec{n}, \tilde{f}(\{\vec{n}_{j \to a}\})} , \tag{18}$$

$$q^{(i \to a, \sigma_i)}_{\vec{n}} = \prod_{b \in \partial i \setminus a} \sum_{\vec{n}_{b \to i}} \hat{q}^{(b \to i, \sigma_i)}_{\vec{n}_{b \to i}} \ \delta_{\vec{n}, \tilde{g}_{\sigma_i}(\{\vec{n}_{b \to i}\})} , \tag{19}$$

with the boundary condition given by $q^{(i \to a, \sigma_i)}_{\vec{n}} = \delta_{\vec{n}, \vec{o}(\sigma_i)}$ for the leaves $i$. The distribution of the m.s.r. for $i$ when $\underline{\sigma}$ is drawn from $\mu$ can then be obtained from the distributions on the edges neighboring $i$,

$$q^{(i)}_n = \sum_{\sigma_i} \mu(\sigma_i) \sum_{\vec{n}} q^{(i, \sigma_i)}_{\vec{n}} \ \delta_{n, \min_{\tau_i \neq \sigma_i} [\vec{n}]_{\tau_i}} , \qquad q^{(i, \sigma_i)}_{\vec{n}} = \prod_{a \in \partial i} \sum_{\vec{n}_{a \to i}} \hat{q}^{(a \to i, \sigma_i)}_{\vec{n}_{a \to i}} \ \delta_{\vec{n}, \tilde{g}_{\sigma_i}(\{\vec{n}_{a \to i}\})} . \tag{20}$$

3. Average over the choice of the tree

At this point we define an ensemble of random rooted tree factor graphs on which we shall perform the average of the m.s.r. distribution. The ingredients of the definition are $p_l$, a distribution on the positive integers, $\rho(\psi)$, a distribution on the 0/1 constraint functions (with possibly a random degree $k$), and a distribution of fields $P(\eta)$. Let us denote $T_L$ a random tree of the ensemble of depth $L$, and for notational simplicity $\hat{T}_L$ the elements of this ensemble conditioned on their root being of degree one. $T_L$ is defined by induction on $L$ as a (Galton-Watson like) branching process. $T_0$ is made of a single variable node (the root) to which is applied an external field $\eta$ drawn from $P$. $\hat{T}_L$ is generated by introducing a root variable node $i$, connected to a single interaction node $a$ whose constraint function $\psi_a$ is drawn from $\rho$. Then each variable node in $\partial a \setminus i$ is taken to be the root of an independently generated $T_L$. Conversely $T_{L+1}$ is made by identifying the roots of $l$ (a random integer drawn from $p_l$) independent copies of $\hat{T}_L$.

For each tree drawn from this ensemble the two recursive computations yield a set of messages on each edge of the factor graph directed towards the root, $(\eta, \{q^{(\sigma)}_{\vec{n}}\}_{\sigma=1}^{q})$ for an edge from a variable to a constraint, $(\nu, \{\hat{q}^{(\sigma)}_{\vec{n}}\}_{\sigma=1}^{q})$ from a constraint to a variable. The randomness in the definition of the tree turns these objects into random variables, whose distribution depends only on the distance between the considered edge and the leaves.
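The inductive definition of $T_L$ and $\hat{T}_L$ can be mirrored by two mutually recursive functions. The sketch below keeps only the shape of the tree and makes several simplifying assumptions that are not part of the general definition: $p_l$ is taken to be a Poisson law (as in the random graph ensembles of Sec. II), every constraint has the same degree $k$, and the constraint functions $\psi_a$ and the leaf fields $\eta$ are not drawn. The names `tree`, `hat_tree`, `poisson` and `shape_ok` are illustrative.

```python
import math
import random


def poisson(mean, rng):
    """Poisson sample by Knuth's multiplication method (fine for small means)."""
    threshold = math.exp(-mean)
    count, prod = 0, rng.random()
    while prod > threshold:
        count += 1
        prod *= rng.random()
    return count


def tree(depth, mean_degree, k, rng):
    """T_L: a variable node, represented by the list of its constraint branches."""
    if depth == 0:
        return []                       # T_0 is a single variable node
    n_branches = poisson(mean_degree, rng)
    return [hat_tree(depth - 1, mean_degree, k, rng) for _ in range(n_branches)]


def hat_tree(depth, mean_degree, k, rng):
    """Branch of T-hat_L: one constraint whose k - 1 other variables root a T_L."""
    return [tree(depth, mean_degree, k, rng) for _ in range(k - 1)]


def shape_ok(variable, depth, k):
    """Check that every constraint has k - 1 children and the depth is at most `depth`."""
    if depth == 0:
        return variable == []
    return all(len(constraint) == k - 1
               and all(shape_ok(child, depth - 1, k) for child in constraint)
               for constraint in variable)


rng = random.Random(0)
t = tree(3, mean_degree=2.0, k=3, rng=rng)   # depth 3, Poisson(2) branches, k = 3
```

Attaching a constraint function to each inner list and a field to each leaf would give the full ensemble; this is left out here to keep the recursion visible.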
To be more precise, let us call $P_L(\eta, \{q^{(\sigma)}_{\vec{n}}\})$ the distribution of $(\mu(\sigma_i), \{q^{(i, \sigma_i)}_{\vec{n}}\})$ when $i$ is the root of a random $T_L$ tree, and similarly $\widehat{P}_L(\nu, \{\widehat{q}^{(\sigma)}_{\vec{n}}\})$ for the distribution of the messages directed to the root variable node of $\widehat{T}_L$. One can first notice that the recursion between the messages $\eta, \nu$ does not involve the size distributions $q_{\vec{n}}$ and $\widehat{q}_{\vec{n}}$, and thus define $P_L(\eta)$ as the marginal of $P_L$ disregarding the $q_{\vec{n}}$'s, and similarly $\widehat{P}_L(\nu)$ from $\widehat{P}_L$. $P_L$ and $\widehat{P}_L$ obey functional equations of the form $\widehat{P}_L = F[P_L]$, $P_{L+1} = G[\widehat{P}_L]$, with $P_{L=0} = P$, and where the functionals $F$ and $G$ have a compact distributional writing,

$$\nu \overset{d}{=} f(\eta_1, \dots, \eta_{k-1}, \psi) , \qquad \eta \overset{d}{=} g(\nu_1, \dots, \nu_l) . \tag{21}$$

The first equation means that drawing a variable $\nu$ from $\widehat{P}_L$ amounts to drawing a constraint function $\psi$ from $\rho$, $k-1$ i.i.d. variables $\eta_i$ from $P_L$, and computing $\nu$ from Eq. (12). Similarly $P_{L+1}$ is obtained from $\widehat{P}_L$ thanks to Eq. (13), with the branching number $l$ drawn from $p_l$. In the following we shall assume that the distribution $P$ on the boundary of the tree is a solution of the fixed point functional equation $P = G[F[P]]$. This implies a stationarity property with respect to the number of generations $L$: $P_L = P$, $\widehat{P}_L = \widehat{P} = F[P]$. This justifies a posteriori the choice we made of including non-trivial biases at the boundary in the law (11): in generic models unbiased boundary conditions represented by $P(\eta) = \delta(\eta - \overline{\eta})$ do not satisfy this stationarity property; this will be in particular the case for the random $k$-SAT problem studied below.

The evolution of the size distributions when iterating the tree construction is coupled, through the term $\mu(\underline{\sigma}_{a \setminus i} \mid \sigma_i)$ of Eq. (18), to the $\eta, \nu$ messages. We are however interested in a rather simple quantity, the average of the m.s.r.
distribution of the root (see Eq. (20)) with respect to the random tree. It is thus possible to compute an average of the $q^{(i \to a, \sigma_i)}_{\vec{n}}$ on an edge of depth $L$, provided this average is conditioned on the value of the associated message $\eta_{i \to a}$. This conditional average, denoted $q^{(\sigma, L)}_{\vec{n}}(\eta)$, and its counterpart $\widehat{q}^{(\sigma, L)}_{\vec{n}}(\nu)$, are then found to obey the following equations,

$$\begin{aligned}
\widehat{q}^{(\sigma, L)}_{\vec{n}}(\nu) \, \widehat{P}(\nu) = \; & \mathbb{E}_{\psi} \int dP(\eta_1) \dots dP(\eta_{k-1}) \, \delta(\nu - f(\eta_1, \dots, \eta_{k-1}, \psi)) \sum_{\sigma_1, \dots, \sigma_{k-1}} \mu(\sigma_1, \dots, \sigma_{k-1} \mid \sigma, \eta_1, \dots, \eta_{k-1}, \psi) \\
& \sum_{\vec{n}_1, \dots, \vec{n}_{k-1}} q^{(\sigma_1, L)}_{\vec{n}_1}(\eta_1) \dots q^{(\sigma_{k-1}, L)}_{\vec{n}_{k-1}}(\eta_{k-1}) \, \delta_{\vec{n}, \widetilde{f}(\vec{n}_1, \dots, \vec{n}_{k-1}, \psi)} ,
\end{aligned} \tag{22}$$

$$q^{(\sigma, L+1)}_{\vec{n}}(\eta) \, P(\eta) = \sum_{l} p_l \int d\widehat{P}(\nu_1) \dots d\widehat{P}(\nu_l) \, \delta(\eta - g(\nu_1, \dots, \nu_l)) \sum_{\vec{n}_1, \dots, \vec{n}_l} \widehat{q}^{(\sigma, L)}_{\vec{n}_1}(\nu_1) \dots \widehat{q}^{(\sigma, L)}_{\vec{n}_l}(\nu_l) \, \delta_{\vec{n}, \widetilde{g}_{\sigma}(\vec{n}_1, \dots, \vec{n}_l)} , \tag{23}$$

with the boundary condition $q^{(\sigma, L=0)}_{\vec{n}}(\eta) = \delta_{\vec{n}, \vec{o}(\sigma)}$. Finally the sought-for average m.s.r.d. for the root of a random tree of depth $L$ reads:

$$q^{(L)}_n = \int dP(\eta) \sum_{\sigma} \eta(\sigma) \sum_{\vec{n}} q^{(\sigma, L)}_{\vec{n}}(\eta) \, \delta_{n, \min_{\tau \neq \sigma} [\vec{n}]_{\tau}} . \tag{24}$$

The numerical resolution of Eqs. (22,23) could at first sight seem rather difficult, as they involve, for each value of the random variable $\eta$ (or $\nu$), $q$ distributions of vectors $\vec{n}$. One can however devise a simple method, generalizing the population dynamics algorithm of [29]. The important point is to notice that for a given value of $\sigma$, $q^{(\sigma, L)}_{\vec{n}}(\eta) P(\eta)$ can be viewed as a joint distribution of variables $(\eta, \vec{n}^{(\sigma)})$, which can be numerically represented by a population of a large number $N$ of couples $\{(\eta_i, \vec{n}^{(\sigma)}_i)\}_{i=1}^{N}$. The empirical distribution of these couples is taken as an approximation (known as a particle approximation in the statistics literature) of $q^{(\sigma, L)}_{\vec{n}}(\eta) P(\eta)$.
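To make the particle representation concrete, the following sketch shows how a population of couples could be turned into an estimate of Eq. (24), simply replacing the integral over $P(\eta)$ and the sum over $\vec{n}$ by an empirical average over the population. The function name `msr_histogram` and the data layout are assumptions made for this illustration: spin values are indexed from 0 to $q-1$ instead of 1 to $q$, `eta[sigma]` stores $\eta(\sigma)$, and `n_vecs[sigma][tau]` stores the component $[\vec{n}^{(\sigma)}]_{\tau}$.

```python
from collections import defaultdict

def msr_histogram(population, q):
    """Empirical estimate of q_n from a population of (eta, n_vecs) elements.

    population : list of (eta, n_vecs) with
        eta[sigma]          probability of the value sigma, sigma in 0..q-1
        n_vecs[sigma][tau]  component tau of the size vector attached to sigma
    Returns a dict n -> estimated probability of a m.s.r. of size n.
    """
    hist = defaultdict(float)
    N = len(population)
    for eta, n_vecs in population:
        for sigma in range(q):
            # size of the smallest rearrangement moving the root away from sigma
            n = min(n_vecs[sigma][tau] for tau in range(q) if tau != sigma)
            # each element carries the weight eta(sigma) / N
            hist[n] += eta[sigma] / N
    return dict(hist)

# Toy population with q = 2 (numbers chosen arbitrarily for illustration)
toy = [
    ([0.5, 0.5], [[0, 1], [2, 0]]),
    ([1.0, 0.0], [[0, 3], [1, 0]]),
]
hist = msr_histogram(toy, q=2)
```

Since each $\eta_i$ is normalized, the weights of the histogram sum to one by construction. The remaining task is to produce a population with the right law at depth $L$.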
This suggests the following algorithm. Initialize a population $\{\eta_i\}_{i=1}^{N}$ drawn i.i.d. from $P$ (this shall itself be performed by a standard population dynamics approach), and associate to each of them $q$ vectors, $\vec{n}^{(\sigma)}_i = \vec{o}(\sigma)$. We thus have, for trees of depth $L = 0$, a population $\{(\eta_i, \vec{n}^{(1)}_i, \dots, \vec{n}^{(q)}_i)\}_{i=1}^{N}$. To take this population from depth $L$ to depth $L+1$ one has to

- generate in an i.i.d. way $N$ elements $(\nu_j, \vec{n}^{(1)}_j, \dots, \vec{n}^{(q)}_j)$, with $j \in [N+1, 2N]$ to avoid notational confusion, by:
  • choosing randomly a constraint function $\psi$ from $\rho$, and $k-1$ indices $i_1, \dots, i_{k-1}$ uniformly at random in $[1, N]$.
  • computing $\nu_j = f(\eta_{i_1}, \dots, \eta_{i_{k-1}}, \psi)$.
  • for each $\sigma \in [1, q]$,
    ∗ generating a configuration $(\sigma_1, \dots, \sigma_{k-1})$ according to the law $\mu(\cdot \mid \sigma, \eta_{i_1}, \dots, \eta_{i_{k-1}}, \psi)$.
    ∗ computing $\vec{n}^{(\sigma)}_j = \widetilde{f}(\vec{n}^{(\sigma_1)}_{i_1}, \dots, \vec{n}^{(\sigma_{k-1})}_{i_{k-1}}, \psi)$.
- then generate a new population $\{(\eta_i, \vec{n}^{(1)}_i, \dots, \vec{n}^{(q)}_i)\}_{i=1}^{N}$, repeating for each $i \in [1, N]$ independently the following steps:
  • Choose randomly a degree $l$ from $p_l$ and $l$ indices $j_1, \dots, j_l$ uniformly at random in $[N+1, 2N]$.
  • Compute $\eta_i = g(\nu_{j_1}, \dots, \nu_{j_l})$.
  • For each $\sigma \in [1, q]$, compute $\vec{n}^{(\sigma)}_i = \widetilde{g}_{\sigma}(\vec{n}^{(\sigma)}_{j_1}, \dots, \vec{n}^{(\sigma)}_{j_l})$.

After $L$ iterations of these two steps, for a given value of $\sigma$, an element $(\eta_i, \vec{n}^{(\sigma)}_i)$ with $i$ uniformly chosen in $[1, N]$ is distributed with the joint law $q^{(\sigma, L)}_{\vec{n}}(\eta) P(\eta)$ ⁴. We can thus complete the computation of $q^{(L)}_n$ in terms of a weighted histogram,

$$q^{(L)}_n = \frac{1}{N} \sum_{i=1}^{N} \sum_{\sigma=1}^{q} \eta_i(\sigma) \, \delta_{n, \min_{\tau \neq \sigma} [\vec{n}^{(\sigma)}_i]_{\tau}} . \tag{25}$$

⁴ We do not claim that $(\eta_i, \vec{n}^{(1)}_i, \dots, \vec{n}^{(q)}_i)$ is drawn according to $P(\eta) \, q^{(1, L)}_{\vec{n}^{(1)}} \dots q^{(q, L)}_{\vec{n}^{(q)}}$, i.e. that the $\vec{n}^{(\sigma)}_i$ are independent conditionally on $\eta_i$, which is not true. The algorithm induces correlations between the various values of $\sigma$, yet these are irrelevant for the linear averages we compute.

We shall now examine how this general formalism can be applied to the three exemplar problems of XORSAT, COL and SAT.

B. k-XORSAT

1. On a given tree factor graph

Let us recall the factor graph representation of a $k$-XORSAT formula we use: the variables are Ising spins $\sigma_i = \pm 1$, and each constraint node $a$ is satisfied if and only if the product of its $k$ neighboring variables $\prod_{i \in \partial a} \sigma_i$ is equal to a given constant $J_a = \pm 1$. The computation of the m.s.r., already performed in [26], is much simpler than the general case presented above. Note first that for any CSP where variables can only take two values, a rearrangement $\underline{\sigma} \to \underline{\tau}$ is completely specified by its support, the set $R$ of variables which are different in the initial and final configurations. A second simplification is specific to the XORSAT problem. Consider an initial solution $\underline{\sigma}$ and the configuration $\underline{\tau}$ obtained by flipping the variables in $R$. This second configuration is also a solution if and only if for each constraint $a$, an even (possibly null) number of variables of $\partial a$ are in $R$. A rearrangement for the variable $i$ is hence a set $R$ verifying this condition and containing $i$. The m.s.r. $n_i$ is the minimal cardinality of such a set of variables; on a tree this minimum can always be achieved by requiring that each $a$ contains either zero or two (and not a higher even value) variables of $R$.

The recursive strategy for the computation of $n_i$ and the construction of a rearrangement of this size amounts to constructing a m.s.r. $R_{a \to i}$ for all the branches $F_{a \to i}$ around $i$ (their sizes being denoted $1 + n_{a \to i}$) and to combining the rearrangements of the sub-factor graphs, $R = \{i\} \cup_{a \in \partial i} R_{a \to i}$. To construct $R_{a \to i}$ one has to choose exactly one variable $j \in \partial a \setminus i$ that minimizes the cost $n_{j \to a}$ of the rearrangement in the branch $F_{j \to a}$. Summarizing this reasoning in formulas, we obtain:

$$n_{a \to i} = \min_{j \in \partial a \setminus i} n_{j \to a} , \qquad n_{i \to a} = 1 + \sum_{b \in \partial i \setminus a} n_{b \to i} , \qquad n_i = 1 + \sum_{a \in \partial i} n_{a \to i} . \tag{26}$$

The reader will easily verify that the equations (7,8,9,10) of the general formalism reduce indeed to this simple form, noting in particular that the m.s.r. is here independent of the initial configuration, as appears clearly from the geometric characterization of the optimal supports $R$.

2. Random tree

This independence with respect to the initial configuration allows us to skip the second step of the general formalism, as for a given tree the distribution of the m.s.r. is trivially concentrated on a single integer, and to study directly the ensemble of random tree formulas. We shall follow the general definition of $T_L$ given above, with a Poisson law of parameter $\alpha k$ for the branching probability $p_l$, and all constraint nodes of degree $k$. For definiteness one can assume that the boundary condition is free (no bias on the leaves of the tree) and that $J_a = \pm 1$ with probability one half; these last two choices are in fact irrelevant, as the m.s.r. depends only on the geometry of the factor graph. This random ensemble induces a probability law $q^{(L)}_n$ for the m.s.r. of the root of $T_L$, and an associated law $\widehat{q}^{(L)}_n$ for the message sent to the root of $\widehat{T}_L$. Simplifying the equations (22,23,24) of the general formalism, or interpreting the specific ones (26) in a distributional sense, leads to

$$\widehat{q}^{(L)}_n = \sum_{n_1, \dots, n_{k-1}} q^{(L)}_{n_1} \dots q^{(L)}_{n_{k-1}} \, \delta_{n, \min[n_1, \dots, n_{k-1}]} , \tag{27}$$

$$q^{(L+1)}_n = \sum_{l=0}^{\infty} \frac{e^{-\alpha k} (\alpha k)^l}{l!} \sum_{n_1, \dots, n_l} \widehat{q}^{(L)}_{n_1} \dots \widehat{q}^{(L)}_{n_l} \, \delta_{n, 1 + n_1 + \dots + n_l} , \tag{28}$$

with the initial condition $q^{(L=0)}_n = \delta_{n,1}$. These equations can be solved by a simplified version of the population dynamics algorithm introduced in the general case. The distributions $q^{(L)}_n$ and $\widehat{q}^{(L)}_n$ are represented by samples of integers $\{n_i\}$: each element of the population associated to $q^{(L+1)}_n$ is generated by drawing a Poisson distributed integer $l$, extracting at random $l$ elements of the sample representing $\widehat{q}^{(L)}_n$ and computing their sum plus one. In turn, the elements of $\widehat{q}^{(L)}_n$ are the minimum of $k-1$ randomly chosen integers drawn from the population encoding $q^{(L)}_n$.

In the following we shall be interested in the $L \to \infty$ limit, which is the counterpart of the $N \to \infty$ thermodynamic limit of the original random graph ensembles. One could reach it numerically by repeated iterations of the population dynamics step. There is however a simpler numerical method which allows one to take this limit analytically. Let us first define the integrated version of the m.s.r.d.,

$$Q^{(L)}_n = \sum_{n' \geq n} q^{(L)}_{n'} , \tag{29}$$

which gives the probability of a m.s.r. being at least $n$. A few simple properties follow from this definition, $q^{(L)}_n = Q^{(L)}_n - Q^{(L)}_{n+1}$, Q ( L ) n = 1 − X n ′
