Susceptibility Propagation for Constraint Satisfaction Problems

We study the susceptibility propagation, a message-passing algorithm to compute correlation functions. It is applied to constraint satisfaction problems and its accuracy is examined. As a heuristic method to find a satisfying assignment, we propose s…

Authors: Saburo Higuchi and Marc Mézard

Saburo Higuchi (1, 2) and Marc Mézard (1)

(1) Laboratoire de Physique Théorique et Modèles Statistiques, Université Paris-Sud, Bât 100, 91405 Orsay Cedex
(2) Department of Applied Mathematics and Informatics, Ryukoku University, Otsu, Shiga, 520-2194, Japan

We study susceptibility propagation, a message-passing algorithm to compute correlation functions. It is applied to constraint satisfaction problems and its accuracy is examined. As a heuristic method to find a satisfying assignment, we propose susceptibility-guided decimation, where correlations among the variables play an important role. We apply this novel decimation to locked occupation problems, a class of hard constraint satisfaction problems exhibited recently. It is shown that the present method performs better than the standard belief-guided decimation.

I. INTRODUCTION

Message-passing algorithms have been shown to be effective in helping to find solutions of some hard constraint satisfaction problems (CSPs) like K-satisfiability and coloring. The simplest application consists in using belief propagation (BP), when it converges, in order to get some estimate of the marginals of each of the variables. One must then exploit the information obtained in this way (which is in general only approximate in a CSP described by a loopy factor graph). So far only two methods have been explored thoroughly: decimation [1, 2] and reinforcement [3]. Decimation consists in identifying from some criterion the most "polarized" variable (e.g. the one with the smallest entropy), and in fixing it to its most probable value. After this variable has been fixed, one obtains a new, smaller CSP, to which one can apply recursively the whole procedure (BP followed by identifying and fixing the most polarized variable).
In reinforcement, one finds from the BP marginals the most probable value of each variable, and one adds, in the local measure of each variable, an extra bias in this preferred direction. The new CSP therefore has the same number of variables as the original one, but the local measure on each variable has been changed. One iterates this reinforcement procedure until the variables are infinitely polarized. If the algorithm is successful, this returns a configuration of variables which satisfies all constraints.

These two procedures, BP+decimation and BP+reinforcement, are remarkably efficient in random CSPs like K-satisfiability [2], graph colouring [2], and perceptron learning [4]. When one approaches the SAT-UNSAT threshold of these problems, a more elaborate version which uses the information on marginals from survey propagation (SP) is more effective [1, 3, 5], and at present the SP-based decimation and reinforcement methods are the most efficient incomplete SAT solvers for random 3-satisfiability.

Recently, a class of problems has been described [6][7] where these procedures are much less efficient. These are the locked occupation problems (LOPs), a class of CSPs where the set of solutions consists of isolated configurations, far away from each other. Apart from the XORSAT problem [8], which can be solved by Gaussian elimination, the random LOPs are very hard to solve in a broad region of the density of constraints, below their SAT-UNSAT transition. For these LOPs, it is known that SP is equivalent to BP. The BP+decimation method has been found to give rather poor results, and BP+reinforcement, which works better, is still rather limited. One reason for this hardness is the fact that local marginals often convey little information on the solution.
This has motivated us to explore some extensions of the message-passing approaches, in which one uses, on top of local marginals, some correlation properties of the variables. Several possibilities to obtain information on the correlations from message-passing procedures have been explored recently [4, 9, 10, 11]. Here we use the susceptibility propagation initially introduced in [4]. We show that some of the hard LOPs that could not be solved by previous methods can now be solved by a mixture of the single-variable decimation with a new pair-decimation procedure which makes use of the knowledge of correlation. In the case of binary variables which we study here, this new procedure amounts to identifying a strongly correlated pair of variables, and fixing the relative orientation of the two variables.

The paper is organised as follows. In Section II, we introduce the susceptibility propagation, derived as a linear response to belief propagation. This method is examined analytically in Section III, where it is applied to simple systems for which exact fixed points of the iteration are determined. In Section IV, it is applied numerically to locked occupation problems and the accuracy of the method is examined: we measure the performance of the decimation process which makes use of the correlations obtained with this method. The final Section V is devoted to conclusion and discussions.

II. SUSCEPTIBILITY PROPAGATION

A. Occupation Problems

Let us consider an occupation problem, which consists of $|V| = N$ binary variables $x_i \in \{0, 1\}$ ($i \in V$) and $|F| = M$ constraints $\psi_a(x_{i_{a,1}}, \ldots, x_{i_{a,k}}) = 1$ ($a \in F$). Each constraint involves exactly $k$ variables and is parameterized by a $(k+1)$-component "constraint-vector" $A = (A(0), \ldots, A(k))$ with binary entries defined as follows. We say a variable $x_i$ is occupied if $x_i = 1$.
Let $r_a = \sum_{i \in \partial a} x_i$ be the number of occupied variables that are involved in the constraint $\psi_a$. By definition, the constraint $a$ is satisfied ($\psi_a = 1$) if and only if $A(r_a) = 1$. An occupation problem is locked if the following three conditions are met [6][7][12]:

• $A(0) = A(k) = 0$.
• $A(r) A(r+1) = 0$ for $r = 0, \ldots, k-1$.
• Each variable appears in at least two constraints.

Standard examples of locked occupation problems include positive 1-in-K satisfiability [13] and parity checks [14].

As can be done for a general constraint satisfaction problem, a factor graph $G = (V, F; E)$ can be associated with an instance of the occupation problems [15]. The set of vertices of this bipartite graph $G$ is $V$ and $F$, while the set of edges is $E = \{ (i, a) \mid i \in V, a \in F, x_i \text{ is involved in } \psi_a \}$. The notion of neighborhood is naturally introduced: $\partial a = \{ i \in V \mid (i, a) \in E \}$, $\partial i = \{ a \in F \mid (i, a) \in E \}$. For a collection of variables in $S \subset V$, we shall write $x_S = \{ x_i \mid i \in S \}$. We also use the short-hand notation $x = x_V$.

B. Belief Propagation Update Rules

Consider an occupation problem described by a factor graph $G = (V, F; E)$ and a constraint-vector $A$. For later use, we introduce local 'external fields' $h^x_\ell$ ($x \in \{0, 1\}$, $\ell \in V$), which will be sent to zero at the end, and consider a joint probability distribution

$$ p(x \mid h^x) = \frac{1}{Z(h^x)} \prod_{a=1}^{M} \psi_a(x_{\partial a}) \times \prod_{\ell=1}^{N} \prod_{x} e^{h^x_\ell \delta_{x_\ell, x}} . \tag{1} $$

This probability distribution is well defined as soon as there exists at least one ("SAT") configuration satisfying all the constraints. The constant $Z(h^x)$ is a normalization factor. Our final aim is to extract solutions from the uniform measure $p(x \mid 0)$ over solutions satisfying all constraints (when there exists at least one solution). The marginal distribution $p_i(x_i \mid h)$ can be estimated by the BP algorithm.
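The definitions of this subsection can be made concrete with a short sketch. The Python snippet below checks a single constraint against a constraint-vector $A$ and tests the three locking conditions. The function names `constraint_satisfied` and `is_locked` are illustrative and do not come from the paper, and the factor graph is assumed to be summarised simply by the list of variable degrees.

```python
from typing import Sequence


def constraint_satisfied(A: Sequence[int], xs: Sequence[int]) -> bool:
    """psi_a = 1 iff A(r_a) = 1, where r_a counts the occupied variables."""
    assert len(xs) == len(A) - 1  # a constraint involves exactly k variables
    return A[sum(xs)] == 1


def is_locked(A: Sequence[int], degrees: Sequence[int]) -> bool:
    """Check the three locking conditions listed in the text."""
    k = len(A) - 1
    ends_forbidden = A[0] == 0 and A[k] == 0
    no_adjacent_ones = all(A[r] * A[r + 1] == 0 for r in range(k))
    min_degree_two = all(d >= 2 for d in degrees)
    return ends_forbidden and no_adjacent_ones and min_degree_two


one_in_four = (0, 1, 0, 0, 0)             # 1-in-4 satisfiability
one_or_four_in_five = (0, 1, 0, 0, 1, 0)  # 1-or-4-in-5 satisfiability

print(constraint_satisfied(one_in_four, [0, 1, 0, 0]))
print(is_locked(one_in_four, [2, 3, 2]))
print(is_locked((0, 1, 1, 0), [2, 3, 2]))  # A(1)A(2) = 1, so not locked
```

The two constraint-vectors used here are the ones studied numerically later in the paper; the degree lists are arbitrary toy values.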
The BP update rules for two families of messages, namely cavity fields and cavity biases, are given by [16, 17]

$$ \nu^{(t+1)}_{i \to a}(x_i \mid h^x) = \frac{1}{Z^{(t)}_{i \to a}(h^x)} \prod_{b \in \partial i \setminus a} \hat\nu^{(t)}_{b \to i}(x_i \mid h^x) \times \prod_{x} e^{h^x_i \delta_{x_i, x}} , \tag{2} $$

$$ \hat\nu^{(t)}_{a \to i}(x_i \mid h^x) = \sum_{x'_{\partial a}} \delta_{x_i, x'_i} \, \psi_a(x'_{\partial a}) \prod_{\ell \in \partial a \setminus i} \nu^{(t)}_{\ell \to a}(x'_\ell \mid h^x) . \tag{3} $$

Here, we have decided to introduce a normalization factor $Z^{(t)}_{i \to a}(h^x)$ for $\nu^{(t)}_{i \to a}(x_i \mid h^x)$ and to avoid the normalization for $\hat\nu^{(t)}_{a \to i}(x_i \mid h^x)$. This choice is perfectly valid for BP, and it helps to get relatively simple susceptibility propagation update rules (8), (9). Assuming convergence to a fixed point, the BP estimate for the marginal distribution of variable $i$ is:

$$ p_i(x_i \mid h^x) = \frac{1}{Z_i(h^x)} \prod_{b \in \partial i} \hat\nu^{(*)}_{b \to i}(x_i \mid h^x) , \tag{4} $$

where $\hat\nu^{(*)}_{a \to i}(x_i \mid h^x)$ is the fixed point of the BP iteration.

C. Susceptibility Propagation Update Rules

The 2-point connected correlation function at $h = 0$ is obtained as

$$ p^{\rm conn}_{ij}(x_i, x_j) \equiv p_{ij}(x_i, x_j) - p_i(x_i) p_j(x_j) = \left. \frac{\partial p_i(x_i \mid h^x)}{\partial h^{x_j}_j} \right|_{h=0} . \tag{5} $$

To have a message-passing algorithm to calculate this quantity, we introduce the cavity susceptibility and its companion by

$$ \nu_{i \to a, j}(x_i, x_j) = \left. \frac{\partial \nu_{i \to a}(x_i \mid h^x)}{\partial h^{x_j}_j} \right|_{h=0} , \tag{6} $$

$$ \hat\nu_{a \to i, j}(x_i, x_j) = \left. \frac{\partial \hat\nu_{a \to i}(x_i \mid h^x)}{\partial h^{x_j}_j} \right|_{h=0} . \tag{7} $$

Note that the roles of variables $x_i$ and $x_j$ are asymmetric; $j$ can be an arbitrary variable while $i$ is a neighbor of the constraint $a$.

The cavity susceptibility and its companion can be calculated by a message-passing method [9]. The susceptibility propagation update rules can be obtained by differentiating the belief propagation update rules (2) and (3) with respect to $h^{x_j}_j$.
They read [4][18]

$$ \nu^{(t+1)}_{i \to a, j}(x_i, x_j) = \frac{1}{Z^{(t)}_{i \to a}(h^x)} \prod_{b \in \partial i \setminus a} \hat\nu^{(t)}_{b \to i}(x_i) \left[ \delta_{i,j} \delta_{x_i, x_j} + \sum_{b \in \partial i \setminus a} \frac{\hat\nu^{(t)}_{b \to i, j}(x_i, x_j)}{\hat\nu^{(t)}_{b \to i}(x_i)} + C^{(t)}_{i \to a, j}(x_j) \right] , \tag{8} $$

$$ \hat\nu^{(t)}_{a \to i, j}(x_i, x_j) = \sum_{x'_{\partial a}} \delta_{x_i, x'_i} \, \psi_a(x'_{\partial a}) \times \left[ \prod_{\ell \in \partial a \setminus i} \nu^{(t)}_{\ell \to a}(x'_\ell) \right] \sum_{m \in \partial a \setminus i} \frac{\nu^{(t)}_{m \to a, j}(x'_m, x_j)}{\nu^{(t)}_{m \to a}(x'_m)} , \tag{9} $$

where

$$ \nu^{(t)}_{i \to a}(x_i) = \nu^{(t)}_{i \to a}(x_i \mid h^x = 0) , \qquad \hat\nu^{(t)}_{a \to i}(x_i) = \hat\nu^{(t)}_{a \to i}(x_i \mid h^x = 0) . \tag{10} $$

The function $C^{(t)}_{i \to a, j}(x_j)$ originates from the derivative of $Z^{(t)}_{i \to a}$ and can be determined by requiring the normalization

$$ \sum_{x_i} \nu^{(t)}_{i \to a, j}(x_i, x_j) = 0 . \tag{11} $$

Let us suppose that we have found a fixed point of BP and the susceptibility propagation. By differentiating (4) with respect to the external fields, we can express the 2-point connected correlation function in terms of the messages at the fixed point as

$$ p^{\rm conn}_{ij}(x_i, x_j) = p_i(x_i) \left[ \delta_{i,j} \delta_{x_i, x_j} + C_{ij}(x_j) \right] + \frac{1}{Z_i(0)} \sum_{b \in \partial i} \hat\nu^{(*)}_{b \to i, j}(x_i, x_j) \prod_{c \in \partial i \setminus b} \hat\nu^{(*)}_{c \to i}(x_i) . \tag{12} $$

The constant $C_{ij}(x_j)$ is related to the derivative of $Z_i(h)$ and is conveniently fixed by the condition $\sum_{x_j} p^{\rm conn}_{ij}(x_i, x_j) = 0$.

D. Log-likelihood representation

The rules (8), (9) apply to all types of CSPs with discrete variables. When dealing with binary variables, it is helpful to rewrite the belief and susceptibility update equations in terms of log-likelihood variables.
We introduce the cavity field and cavity bias in the log-likelihood representation, $n_{i \to a}$ and $\hat n_{a \to i}$, as (we omit the time superscript $(t)$ where it is obvious):

$$ \nu_{i \to a}(x_i \mid h) = A_{i \to a} \, e^{n_{i \to a}(h) s_i} , \tag{13} $$

$$ \hat\nu_{a \to i}(x_i \mid h) = B_{a \to i} \, e^{\hat n_{a \to i}(h) s_i} , \tag{14} $$

where $s_i$ is the spin variable $s_i = 2 x_i - 1 = \pm 1$ and the external fields in the two representations are related by $h_j = (h^1_j - h^0_j)/2$. Naturally we define the cavity susceptibility in the log-likelihood representation as

$$ \eta_{i \to a, j} = \left. \frac{\partial n_{i \to a}(h)}{\partial h_j} \right|_{h=0} , \qquad \hat\eta_{a \to i, j} = \left. \frac{\partial \hat n_{a \to i}(h)}{\partial h_j} \right|_{h=0} . \tag{15} $$

The belief propagation update rules read

$$ n^{(t+1)}_{i \to a} = \sum_{b \in \partial i \setminus a} \hat n^{(t)}_{b \to i} + h_i , \tag{16} $$

$$ \hat n^{(t)}_{a \to i} = f_{a \to i}\big( \{ n^{(t)}_{j \to a} \}_{j \in \partial a \setminus i} \big) , \tag{17} $$

where

$$ f_{a \to i}\big( \{ n_{j \to a} \}_{j \in \partial a \setminus i} \big) = \frac{1}{2} \log \frac{F(+1)}{F(-1)} , \tag{18} $$

$$ F(\sigma) = \sum_{s_{\partial a}} \delta_{s_i, \sigma} \, \psi_a(s_{\partial a}) \prod_{j \in \partial a \setminus i} e^{n_{j \to a} s_j} . \tag{19} $$

By differentiating both sides of (16), (17), we obtain

$$ \eta^{(t+1)}_{i \to a, j} = \sum_{b \in \partial i \setminus a} \hat\eta^{(t)}_{b \to i, j} + \delta_{i,j} , \tag{20} $$

$$ \hat\eta^{(t)}_{a \to i, j} = \sum_{m \in \partial a \setminus i} \frac{\partial f_{a \to i}\big( \{ n^{(t)}_{j \to a} \}_{j \in \partial a \setminus i} \big)}{\partial n_{m \to a}} \times \eta^{(t)}_{m \to a, j} . \tag{21} $$

Assuming that a solution $n^{(t)}_{j \to a}$ of the BP equations (16), (17) is used, one sees that the susceptibility propagation update rule (20), (21) is an inhomogeneous linear system in $\eta$ and $\hat\eta$. The coefficient matrix takes the following form:

$$ \frac{\partial f_{a \to i}\big( \{ n_{j \to a} \}_{j \in \partial a \setminus i} \big)}{\partial n_{m \to a}} = \frac{\langle s_m s_i \rangle - \langle s_m \rangle \langle s_i \rangle}{1 - \langle s_i \rangle^2} , \tag{22} $$

where $i, m \in \partial a$ and $\langle \cdot \rangle$ means the expectation value with respect to the joint probability distribution for variables that are neighbors of a constraint, obtained solely from beliefs [17, Sec. 14.2.3].
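The identity (22) can be checked numerically on a single constraint. The sketch below is an illustration under stated assumptions rather than code from the paper: the helper names (`joint_weights`, `cavity_bias`, `coefficient`) and the cavity-field values are invented for the example, and the joint distribution over $\partial a$ is taken to be $\psi_a(s_{\partial a}) \prod_{j \in \partial a} e^{n_{j \to a} s_j}$. It compares a central finite difference of $f_{a \to i}$ from (18), (19) with the right-hand side of (22) for a 1-in-4 constraint.

```python
import itertools
import math


def joint_weights(A, fields):
    """Unnormalised weights psi_a(s) * prod_j exp(n_j s_j) over all spin
    configurations of the k variables attached to one constraint."""
    weights = {}
    for s in itertools.product((-1, 1), repeat=len(fields)):
        r = sum((sj + 1) // 2 for sj in s)  # number of occupied variables
        if A[r] == 1:
            weights[s] = math.exp(sum(n * sj for n, sj in zip(fields, s)))
    return weights


def cavity_bias(A, fields, i):
    """Eqs. (18)-(19): f = (1/2) log F(+1)/F(-1); the field on i is excluded."""
    cavity = list(fields)
    cavity[i] = 0.0
    w = joint_weights(A, cavity)
    f_plus = sum(v for s, v in w.items() if s[i] == 1)
    f_minus = sum(v for s, v in w.items() if s[i] == -1)
    return 0.5 * math.log(f_plus / f_minus)


def coefficient(A, fields, i, m):
    """Right-hand side of eq. (22) from expectations under the joint weights."""
    w = joint_weights(A, fields)
    z = sum(w.values())
    s_i = sum(s[i] * v for s, v in w.items()) / z
    s_m = sum(s[m] * v for s, v in w.items()) / z
    s_ms_i = sum(s[m] * s[i] * v for s, v in w.items()) / z
    return (s_ms_i - s_m * s_i) / (1.0 - s_i * s_i)


A = (0, 1, 0, 0, 0)             # 1-in-4 satisfiability
fields = [0.3, -0.2, 0.5, 0.1]  # arbitrary illustrative cavity fields
i, m, eps = 0, 2, 1e-6
up, down = list(fields), list(fields)
up[m] += eps
down[m] -= eps
numeric = (cavity_bias(A, up, i) - cavity_bias(A, down, i)) / (2 * eps)
analytic = coefficient(A, fields, i, m)
print(numeric, analytic)
```

The two printed numbers should agree up to finite-difference error. The ratio in (22) does not depend on the field acting on $i$ itself, which is why `coefficient` can use the full list of fields while `cavity_bias` sets the entry for $i$ to zero.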
In the log-likelihood representation, the magnetization and the pair correlation are given in terms of the fixed-point messages by

$$ \langle s_i \rangle = \tanh\Big( \sum_{b \in \partial i} \hat n^{(*)}_{b \to i} \Big) , \tag{23} $$

$$ \langle s_i s_j \rangle_{\rm conn} \equiv \langle s_i s_j \rangle - \langle s_i \rangle \langle s_j \rangle = \Big[ 1 - \tanh^2\Big( \sum_{b \in \partial i} \hat n^{(*)}_{b \to i} \Big) \Big] \times \Big[ \sum_{c \in \partial i} \hat\eta^{(*)}_{c \to i, j} + \delta_{i,j} \Big] . \tag{24} $$

In the above expression, $i$ and $j$ can be arbitrary variables on the factor graph.

III. PROPERTIES

A. Linear Equation

In order to study the structure of susceptibility propagation update rules (20), (21), we construct a $kMN$-component column vector

$$ y^{(t)} = \big( \eta^{(t)}_{i \to a, j} , \hat\eta^{(t)}_{a \to i, j} \big)^{\mathsf T}_{(i,a) \in E, \, j \in V} . \tag{25} $$

Then the fixed point condition associated with (20), (21) can be written as a linear equation

$$ y^{(*)} = \mathcal{M} y^{(*)} + b , \tag{26} $$

with the inhomogeneous term

$$ b = ( \delta_{i,j} , 0 )^{\mathsf T}_{(i,a) \in E, \, j \in V} . \tag{27} $$

The coefficient matrix is block-diagonal in $j$:

$$ \mathcal{M}_{(iaj),(i'a'j')} = \delta_{j,j'} M_{(ia),(i'a')} , \tag{28} $$

$$ M = \begin{pmatrix} 0 & \mathbb{1}(a' \in \partial i \setminus a) \, \delta_{i,i'} \\ \mathbb{1}(i' \in \partial a \setminus i) \, \delta_{a,a'} \, \dfrac{\partial f_{a \to i}\big( \{ n^{(*)}_{j \to a} \}_{j \in \partial a \setminus i} \big)}{\partial n_{i' \to a'}} & 0 \end{pmatrix} , \tag{29} $$

where the block $M$ is independent of the block index $j$. Thus we obtain the unique fixed point

$$ y^{(*)} = ( 1 - \mathcal{M} )^{-1} b \tag{30} $$

if $(1 - \mathcal{M})$ is invertible, which is equivalent to the invertibility of $(1 - M)$.

The susceptibility propagation update rules (20), (21) can be regarded as an iterative method to obtain the solution (30) of the linear equation (26). It converges to a value irrespective of the initial vector if all the eigenvalues of $\mathcal{M}$ have moduli smaller than unity. Because the block $M$ does not depend on $j$, the existence of the fixed points and convergence to them are solely determined by $M$ and do not depend on $j$.

B. Application to simple problems

When the factor graph is a tree, even in presence of the external fields $h^x$, the exact marginals are obtained by (4) on a fixed point $\nu^{(*)}_{i \to a}$, $\hat\nu^{(*)}_{a \to i}$ [15]. Therefore, by differentiation with respect to $h^x$, there exists a susceptibility fixed point which gives the exact 2-point correlation function. In the examples which we have considered, the iteration of susceptibility propagation converges to this fixed point. On the other hand, if the graph has more than one loop, there is no guarantee either that the fixed point exists or that the iteration leads to that fixed point.

In order to test these statements, we have studied a simple problem, the 1-in-2 satisfiability problem, or anti-ferromagnetic Ising model. We first study this problem on a chain of length $N$. Namely, we take $k = 2$ and $A = (0, 1, 0)$, and $V = \{ 0, 1, 2, \ldots, N-1 \}$, $F = \{ 0 + \tfrac12, 1 + \tfrac12, \ldots, N - 1 - \tfrac12 \}$. This gives a simple case of a tree factor graph with $E = \{ (i, i + \tfrac12) \mid i = 0, 1, \ldots, N-2 \} \sqcup \{ (i, i - \tfrac12) \mid i = 1, \ldots, N-1 \}$. Away from the boundaries, since $\partial a$ and $\partial i$ consist of only two variables and constraints, respectively, (16), (17), (20), (21) are simplified to yield

$$ n^{(t+1)}_{i \to i \pm \frac12} = \hat n^{(t)}_{i \mp \frac12 \to i} ; \qquad \hat n^{(t)}_{i \pm \frac12 \to i} = - n^{(t)}_{i \pm 1 \to i \pm \frac12} , \tag{31} $$

$$ \eta^{(t+1)}_{i \to i \pm \frac12, j} = \hat\eta^{(t)}_{i \mp \frac12 \to i, j} + \delta_{i,j} ; \qquad \hat\eta^{(t)}_{i \pm \frac12 \to i, j} = - \eta^{(t)}_{i \pm 1 \to i \pm \frac12, j} . \tag{32} $$

On the boundary, on the other hand, one has

$$ n_{0 \to \frac12} = n_{N-1 \to N - \frac32} = 0 , \qquad \eta_{0 \to \frac12, j} = \delta_{j,0} , \qquad \eta_{N-1 \to N - \frac32, j} = \delta_{j, N-1} . \tag{33} $$

This in turn implies that

$$ n^{(*)}_{i \to i \pm \frac12} = \hat n^{(*)}_{i \pm \frac12 \to i} = 0 ; \qquad \eta_{i \to i \pm \frac12, j} = \begin{cases} (-1)^{i-j} & ( \pm (i - j) \ge 0 ) \\ 0 & (\text{otherwise}) \end{cases} \tag{34} $$

which gives:

$$ \langle s_i \rangle = 0 , \qquad \langle s_i s_j \rangle_{\rm conn} = \Big( - \eta^{(*)}_{i-1 \to i - \frac12, j} - \eta^{(*)}_{i+1 \to i + \frac12, j} \Big) + \delta_{ij} = (-1)^{i-j} . \tag{35} $$

In summary, for 1-in-2 satisfiability on a chain, which is a simple XORSAT problem [19] with a tree factor graph, the belief propagation and susceptibility propagation give the correct magnetization and susceptibility.

Consider now the same problem on the simplest graph with one loop, a ring. Namely, let $G$ be a 1-dimensional ring, which is defined by identifying variable $i = 0$ with $N$ and adding a factor $a = N - \tfrac12$ as well as two incident edges $(N, N - \tfrac12)$ and $(N-1, N - \tfrac12)$. Moreover, we assume that $N$ is an even integer so that there is no frustration. BP has a continuous family of fixed points:

$$ n^{(*)}_{i \to i \pm \frac12} = \hat n^{(*)}_{i \mp \frac12 \to i} = (-1)^i A_\pm , \tag{36} $$

where $A_\pm$ is a constant [17]. As a consequence of the existence of this family of fixed points, $(1 - M)$ is not invertible; in fact it has an eigenvector with zero eigenvalue, $y_0 = (\mathbf{1}, -\mathbf{1})$, where $\mathbf{1}$ corresponds to the $\eta$-block and $-\mathbf{1}$ corresponds to the $\hat\eta$-block. In agreement with the existence of this dangerous eigenvector, one finds that the susceptibility propagation update rule does not converge. As the susceptibility messages are updated, $\eta_{i \to i + \frac12, j}$ picks up the constant shift $\delta_{i,j} = 1$. This effect is accumulated as the messages go around the ring, and the consequence is that the messages diverge as $t \to \infty$.

In summary, for 1-in-2 satisfiability on a ring, which is an XORSAT problem on a graph with a loop, the belief propagation can converge to a family of solutions for the magnetization among which only one solution is exact. On the other hand, the susceptibility propagation update does not have a fixed point; it diverges. In the simple case of a ring, this behaviour can be cured by using the finite temperature version of the BP and susceptibility propagation update equations.
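The chain and ring calculations above can be reproduced with a few lines of code. The following Python sketch is an illustration, not the authors' implementation: it iterates the simplified rule (32) with boundary condition (33), starting from zero messages and with all BP fields set to zero, and then evaluates the connected correlation through (24). The function names and the parallel update schedule are assumptions made for the example. On the chain the result is $(-1)^{i-j}$; on the ring the same rule makes the messages grow by one unit each time they travel around the loop.

```python
def chain_correlations(n_vars, sweeps):
    """Iterate eqs. (32)-(33) on an open chain and return <s_i s_j>_conn."""
    # right[i][j] = eta_{i -> i+1/2, j},  left[i][j] = eta_{i -> i-1/2, j}
    right = [[0.0] * n_vars for _ in range(n_vars)]
    left = [[0.0] * n_vars for _ in range(n_vars)]

    def incoming(i, j):
        # eta-hat from the left and right factors; absent at the boundaries
        from_left = -right[i - 1][j] if i > 0 else 0.0
        from_right = -left[i + 1][j] if i < n_vars - 1 else 0.0
        return from_left, from_right

    for _ in range(sweeps):
        new_right = [[0.0] * n_vars for _ in range(n_vars)]
        new_left = [[0.0] * n_vars for _ in range(n_vars)]
        for i in range(n_vars):
            for j in range(n_vars):
                delta = 1.0 if i == j else 0.0
                from_left, from_right = incoming(i, j)
                new_right[i][j] = from_left + delta
                new_left[i][j] = from_right + delta
        right, left = new_right, new_left

    # Eq. (24) with zero magnetization: sum of incoming eta-hat plus delta
    return [[sum(incoming(i, j)) + (1.0 if i == j else 0.0)
             for j in range(n_vars)] for i in range(n_vars)]


def ring_max_message(n_vars, sweeps):
    """Same rule on a ring (indices mod n_vars, n_vars even); returns the
    largest |eta| among the right-going messages after `sweeps` updates."""
    right = [[0.0] * n_vars for _ in range(n_vars)]
    for _ in range(sweeps):
        right = [[-right[(i - 1) % n_vars][j] + (1.0 if i == j else 0.0)
                  for j in range(n_vars)] for i in range(n_vars)]
    return max(abs(v) for row in right for v in row)


corr = chain_correlations(6, 6)
print(corr[0])
print(ring_max_message(6, 30), ring_max_message(6, 60))
```

On the chain, `n_vars` sweeps are enough for the messages to become stationary, because each message depends only on its upstream neighbor. On the ring, doubling the number of sweeps doubles the largest message, which is the linear divergence described in the text.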
But in general there is no guarantee of convergence of loopy BP and loopy susceptibility propagation, and when they converge the quality of their results cannot be assessed a priori. Fig. 1 gives an example of analysis of a small instance of 1-in-4 satisfiability, giving an idea of the errors made by susceptibility propagation on small factor graphs. On the other hand, as for standard BP, one may hope that the method becomes better for large instances when the factor graph is locally tree-like.

[Figure 1: scatter plot of $p^{\rm conn}_{ij}(0,0)$; axes: Susceptibility Propagation versus Exact (Enumeration).]

FIG. 1: Comparison between the 2-point connected correlation function calculated exactly and that estimated with susceptibility propagation. A 1-in-4 satisfiability instance on a randomly generated factor graph with $N = 27$ variables and $M = 16$ constraints with Poisson degree distribution with average degree $\ell = 2.4856$.

IV. NUMERICAL INVESTIGATION OF SUSCEPTIBILITY PROPAGATION IN LOCKED OCCUPATION MODELS

In this section we study the use of susceptibility propagation, together with decimation, in some locked occupation models. Specifically, we shall study random instances of a locked occupation problem, where the factor graph is uniformly chosen among the graphs with the following degree distribution. All function nodes have degree $K$ and the variables have random degrees chosen from the truncated Poisson degree distribution

$$ q(\ell) = \begin{cases} 0 & (\ell = 0, 1) \\ \dfrac{e^{-c} c^{\ell}}{\ell! \, (1 - (1 + c) e^{-c})} & (\ell \ge 2) , \end{cases} \tag{37} $$

for which the average degree is

$$ \ell = \sum_{\ell=0}^{\infty} \ell \, q(\ell) = \frac{c \, (1 - e^{-c})}{1 - (1 + c) e^{-c}} . \tag{38} $$

The basic message-passing algorithm that we use is described by the following pseudocode:

Input: Factor graph, constraint-vector, convergence criterion, initial messages
Output: Estimate for 2-point connected correlation functions (or ERROR-NOT-CONVERGED)

• Initialize messages
• Repeat until everything converges
  – Update cavity fields and cavity biases $\nu^{(t)}_{i \to a}(x_i)$ and $\hat\nu^{(t)}_{a \to i}(x_i)$ with (2), (3)
  – Update cavity susceptibilities $\nu^{(t)}_{i \to a, j}(x_i, x_j)$ and $\hat\nu^{(t)}_{a \to i, j}(x_i, x_j)$ with (8), (9) with the help of $\nu^{(t)}_{i \to a}(x_i)$ and $\hat\nu^{(t)}_{a \to i}(x_i)$ obtained above
• Compute 1-variable marginals $p_i(x_i)$ from the fixed-point messages $\hat\nu^{(*)}_{a \to i}(x_i)$ by (4)
• Compute 2-point connected correlation functions $p^{\rm conn}_{ij}(x_i, x_j)$ from the fixed-point messages $\hat\nu^{(*)}_{a \to i}(x_i)$ and $\hat\nu^{(*)}_{a \to i, j}(x_i, x_j)$ by (12)

This algorithm requires a memory proportional to $kMN$, and each step of iteration requires a computation of $O(N^2)$ for fixed $k$.

A. Decay of correlations

Fig. 2 shows the distribution of magnitude of the 2-point connected correlation function computed with susceptibility propagation for all pairs of points in a graph for a fixed distance between the points. One observes a broad dispersion of correlations, and an approximate exponential decay with the distance. Here we measure the distance $d$ with the convention that each edge connecting a variable to a constraint is of length 1. Because of this exponential decay, it is possible to use in some cases approximate versions of susceptibility propagation which are faster and use less memory. This is done by truncating to zero the cavity susceptibilities $\nu_{i \to a, j}$, $\hat\nu_{a \to i, j}$ beyond some prescribed distance, ${\rm dist}(a, j) > d$ or ${\rm dist}(i, j) > d$, and keeping only the correlation functions between pairs of variables not far from each other.
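The distance convention used for the truncation can be illustrated with a breadth-first search on the bipartite factor graph. The sketch below is a hypothetical helper, not part of the paper's algorithm: it assumes the factor graph is given as a list `constraints`, where `constraints[a]` lists the variables involved in constraint `a`, and it returns the set of variable pairs whose susceptibilities would be kept for a cutoff `d_max`.

```python
from collections import deque


def variable_distances(constraints, num_vars, source):
    """Graph distance from `source` to every reachable variable.
    Each variable-constraint edge has length 1, so two variables that share
    a constraint are at distance 2."""
    var_to_cons = [[] for _ in range(num_vars)]
    for a, members in enumerate(constraints):
        for i in members:
            var_to_cons[i].append(a)

    dist = {source: 0}
    seen_constraints = set()
    queue = deque([source])
    while queue:
        i = queue.popleft()
        for a in var_to_cons[i]:
            if a in seen_constraints:
                continue
            seen_constraints.add(a)
            for j in constraints[a]:
                if j not in dist:
                    dist[j] = dist[i] + 2
                    queue.append(j)
    return dist  # unreachable variables are simply absent


def pairs_to_keep(constraints, num_vars, d_max):
    """Ordered pairs (i, j) with dist(i, j) <= d_max."""
    return {(i, j)
            for i in range(num_vars)
            for j, d in variable_distances(constraints, num_vars, i).items()
            if d <= d_max}


toy = [[0, 1, 2], [2, 3, 4], [4, 5, 6]]  # three constraints in a row
print(variable_distances(toy, 7, 0))
print(sorted(pairs_to_keep(toy, 7, 2))[:5])
```

Storing only the pairs returned by `pairs_to_keep` is one way to realise the memory saving mentioned above; how the cutoff is applied to the messages themselves is left open here.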
Although one can estimate the 2-variable marginal distribution solely from the knowledge of cavity fields [17, Sec. 14.2.3], this truncation provides us with a more efficient practical method to compute the 2-variable correlations between variables with $d \ge 4$.

[Figure 2: distribution of pairs $(i, j)$ versus $|p^{\rm conn}_{ij}(0,0)|$, one curve for each distance d = 2, 4, 6, 8, 10, 12, 14. Inset ("decay of susceptibility"): $\log \langle |p^{\rm conn}_{ij}| \rangle$ versus distance between $(i, j)$.]

FIG. 2: This graph shows how the 2-point connected correlation $p^{\rm conn}_{ij}(x_i, x_j)$ decays as the distance $d$ between $x_i$ and $x_j$ increases. At each distance $d$, the distribution of $|p^{\rm conn}_{ij}(0,0)|$ is plotted. In the inset, the logarithm of the average of that quantity is plotted against the distance. The instance is 1-in-4 Satisfiability on a random factor graph with $N = 1618$ variables and $M = 1000$ factors with the truncated Poisson degree distribution with average degree $\ell = 2.4856$.

B. Pair Decimation Algorithm

As we mentioned in the introduction, decimation consists in finding a variable with the smallest entropy and fixing it to the most probable value. Assuming that the susceptibility propagation provides us with a good estimate for the 2-point connected correlation, we can think of decimation which acts on a pair of variables instead of a single variable. Let $x_i$ and $x_j$ be variables. If one defines a random variable $y_{ij} = \mathbb{1}(x_i = x_j)$, one can compute the entropy for $y_{ij}$ once one knows the 2-variable marginal $p_{ij}(x_i, x_j) = p_i(x_i) p_j(x_j) + p^{\rm conn}_{ij}(x_i, x_j)$. In pair decimation, one identifies the pair $(i, j)$ with the smallest entropy of $y_{ij}$ and one fixes either $x_i = x_j$ or $x_i + x_j = 1$, depending on which event is the most probable according to the measured correlation. This results in a reduced, smaller CSP, which is still an occupation problem. The efficiency of this novel decimation process depends on whether we can find a pair with less entropy than the single variable with the smallest entropy. It is easy to see that, in the absence of correlations, namely if $p_{ij}(x_i, x_j) = p_i(x_i) p_j(x_j)$, then the entropy of $y_{ij}$ is larger than the one of $x_i$ or $x_j$. So the whole procedure relies on being able to detect correlations. Fig. 3 shows that strongly correlated pairs can be found.

FIG. 3: Comparison between the minimum entropy $\min(S_i, S_j)$ (where $S_i$ and $S_j$ are the entropies of $x_i$, $x_j$) and $S_{ij}$, that of $y_{ij} = \mathbb{1}(x_i = x_j)$. The instance is 1-in-4 Satisfiability on a random factor graph with $N = 1618$ variables and $M = 1000$ factors with the truncated Poisson degree distribution with average degree $\ell = 2.4856$.
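The entropy comparison behind pair decimation can be sketched as follows. This is a minimal illustration with invented function names, assuming the marginals are summarised by $P(x_i = 1)$, $P(x_j = 1)$ and the single number $c = p^{\rm conn}_{ij}(0, 0)$. For binary variables the connected correlation sums to zero over either argument, so $p^{\rm conn}_{ij}(1,1) = c$ and $p^{\rm conn}_{ij}(0,1) = p^{\rm conn}_{ij}(1,0) = -c$, which gives $P(x_i = x_j) = p_i(0) p_j(0) + p_i(1) p_j(1) + 2c$.

```python
import math


def binary_entropy(p):
    """Entropy in nats of a binary variable with P(1) = p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log(p) - (1.0 - p) * math.log(1.0 - p)


def pair_entropy(p_i, p_j, c):
    """Entropy of y_ij = 1(x_i = x_j) and the probability that x_i = x_j.
    p_i, p_j: P(x_i = 1), P(x_j = 1);  c: connected correlation p_conn(0, 0)."""
    p_equal = p_i * p_j + (1.0 - p_i) * (1.0 - p_j) + 2.0 * c
    return binary_entropy(p_equal), p_equal


# Without correlation the pair is never better than either single variable.
s_pair, _ = pair_entropy(0.3, 0.1, 0.0)
print(s_pair, binary_entropy(0.3), binary_entropy(0.1))

# Two unpolarized but strongly correlated variables: the pair is nearly fixed.
s_pair, p_equal = pair_entropy(0.5, 0.5, 0.24)
print(s_pair, binary_entropy(0.5), p_equal)
```

In the second case each variable alone carries the maximal entropy $\log 2$, yet the relative orientation is almost determined, which is the situation pair decimation is designed to exploit.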
In practice, we have used the following decimation algorithm which mixes the two strategies of single-variable decimation and pair decimation:

Input: Factor graph, constraint-vector, convergence criterion, initial messages
Output: A satisfying assignment (or FAIL-NOT-FOUND)

• While graph has more than R variables:
  – Compute local entropy estimates for the 1-variable marginals
  – Compute local entropy estimates for the 2-variable marginals
  – If 'heuristic criterion finds that single-variable decimation is better',
    ∗ then fix the value of the variable.
    ∗ else identify a variable in the pair with the other (or its negation)
  – Locate completely locked nearest neighbor pairs
  – Clean the graph
    ∗ Fix the value of isolated variables
  – Do warning propagation.
  – Identify local locked pairs
• When the number of variables is equal to or smaller than R: perform an exhaustive search for satisfying assignments. If found
  – Then return the satisfying assignment
  – Else return FAIL-NOT-FOUND

The heuristic criterion that we use in order to decide between the two types of decimation is the following. We locate a variable with the least entropy and a pair of variables with the least entropy for $y_{ij}$. When the former is less than $S_{\rm th}$ or is smaller than the latter, we choose to do single-variable decimation. For the optimal reduction of the entropy within a decimation step, it is reasonable to set $S_{\rm th} = 0$. However, we find that $S_{\rm th} > 0$ performs better for finding a satisfying assignment. The optimal value of $S_{\rm th}$ depends on the type of locked occupation model and the average degree. This fact can be interpreted as follows: the estimation of 1-variable marginals is more precise than the 2-variable ones within given computational resource, thus it is advantageous to respect the former if it is decisively small.
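The heuristic criterion can be written as a small selection function. The sketch below is an assumption-laden illustration: the name `choose_decimation`, the dictionary inputs and the toy entropy values are invented for the example, and only the comparison rule itself is taken from the text.

```python
def choose_decimation(single_entropy, pair_entropy, s_th):
    """Decide between single-variable and pair decimation.

    single_entropy: dict mapping variable i -> entropy S_i
    pair_entropy:   dict mapping pair (i, j) -> entropy S_ij of y_ij
    s_th:           threshold S_th below which a single variable is trusted
    Returns ("single", i) or ("pair", (i, j)).
    """
    best_var = min(single_entropy, key=single_entropy.get)
    best_pair = min(pair_entropy, key=pair_entropy.get)
    s_var = single_entropy[best_var]
    s_pair = pair_entropy[best_pair]
    # Single-variable decimation if the variable is decisively polarized
    # (below S_th) or simply more polarized than the best pair.
    if s_var < s_th or s_var < s_pair:
        return ("single", best_var)
    return ("pair", best_pair)


singles = {0: 0.50, 1: 0.30, 2: 0.65}
pairs = {(0, 1): 0.10, (1, 2): 0.40}
print(choose_decimation(singles, pairs, s_th=0.0))
print(choose_decimation(singles, pairs, s_th=0.35))
```

With `s_th = 0.0` the best pair wins whenever its entropy is lower, while a positive threshold lets a sufficiently polarized single variable take precedence, mirroring the observation that $S_{\rm th} > 0$ performs better.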
Warning propagation is a message-passing algorithm described in [8, 20]. It logically infers the value of variables one by one from the local structure of the factor graph. In the identification of local locked pairs, we look at each degree-2 constraint and see if the constraint enforces $y_{ij} = 0$ or $y_{ij} = 1$. If it is the case, we identify this pair. The threshold for exhaustive search has been fixed in our simulations to $R = 16$.

The performance of this algorithm is shown for 1-in-4 satisfiability $A = (0, 1, 0, 0, 0)$ and 1-or-4-in-5 satisfiability $A = (0, 1, 0, 0, 1, 0)$ in Fig. 4. For 1-in-4 satisfiability, data with randomization is presented: instead of fixing the most polarized variable or pair, we fix a variable or pair randomly chosen among a fixed number (here we adopt 8) of most polarized variables/pairs. The figure also shows the two important thresholds for these problems, which are values of the average degree (a measure of the number of constraints) separating qualitatively distinct phases. The probability that a satisfying assignment exists drops from 1 to 0 at the 'satisfiability threshold' $\ell_s$ in the large factor graph limit. Between the 'clustering threshold' [6] and the satisfiability threshold, although the satisfying assignments still exist with probability one, it is very difficult to find one by the algorithms known so far, because of the splitting of the set of solutions into clusters. In both LOPs the performance is improved compared to the simple belief-guided decimation employed in [6]. Especially for 1-or-4-in-5, the present algorithm works well above the clustering threshold, a region of $\ell$ where all known algorithms are reported to perform poorly [6].

[Figure 4: probability of finding SAT assignments versus average degree $\ell$, with vertical lines at the clustering threshold $\ell_d$ and the satisfiability threshold $\ell_s$. Left panel: 1-in-4 Satisfiability (Belief-guided, M = 200, 1000, 2000; Pair, M = 100, 200, 500, 1000). Right panel: 1-or-4-in-5 Satisfiability (Belief-guided, M = 100, 200, 500, 1000; Pair, M = 100, 200, 500, 1000).]

FIG. 4: Success probability of the pair decimation process for 1-in-4 satisfiability $A = (0, 1, 0, 0, 0)$ (left) and 1-or-4-in-5 satisfiability $A = (0, 1, 0, 0, 1, 0)$ (right) on a random factor graph with $M$ constraints and average degree $\ell$, plotted versus $\ell$. For comparison, the performance of the simple belief-guided decimation process is shown. The vertical lines show the clustering and satisfiability thresholds.

V. CONCLUSION AND DISCUSSION

We have shown how to find satisfying assignments for locked occupation problems based on the measurement of correlation among variables. This is in contrast with the conventional method which is guided by 1-variable marginals only. Since flipping a variable in a LOP forces another variable far apart to be flipped, the performance of the algorithm is improved when we take the correlations into account.

We have calculated correlations with the susceptibility propagation. In this method, the correlations between variables which are far apart can be calculated as well as between those which are neighbors. Namely, the convergence property is controlled by a single matrix $M$.

The susceptibility propagation, however, requires more computational time and memory resource than the simple belief propagation. Therefore, as the problem becomes larger, we face a (polynomial) increase of computation time. The truncation introduced in subsection IV A might give a remedy since it reduces by a factor of $N$ the computation time as well as the memory use. The decay of correlation suggests that this is a reasonable approximation. We have performed preliminary experiments to find the performance of this approximate algorithm. As expected, it behaves similarly to that without truncation, the performance being only slightly degraded.

Acknowledgment

S.H. was supported by Ryukoku University Research Fellowship (2008).

[1] M. Mézard and R. Zecchina, Physical Review E 66, 56126 (2002).
[2] F. Krzakala, A. Montanari, F. Ricci-Tersenghi, G. Semerjian, and L. Zdeborová, Proceedings of the National Academy of Sciences 104, 10318 (2007).
[3] J. Chavas, C. Furtlehner, M. Mézard, and R. Zecchina, Journal of Statistical Mechanics 11016 (2005).
[4] M. Mézard and T. Mora (2008), arXiv:0803.3061.
[5] M. Mézard, G. Parisi, and R. Zecchina, Science 297, 812 (2002).
[6] L. Zdeborová and M. Mézard, Physical Review Letters 101, 078702 (2008).
[7] L. Zdeborová and M. Mézard, Journal of Statistical Mechanics p. P12004 (2008).
[8] M. Mézard, F. Ricci-Tersenghi, and R. Zecchina, Journal of Statistical Physics 111, 505 (2003).
[9] A. Montanari and T. Rizzo, Journal of Statistical Mechanics: Theory and Experiment 10, P10011 (2005).
[10] M. Chertkov and V. Chernyak, Physical Review E 73, 065102 (2006).
[11] G. Parisi and F. Slanina, Journal of Statistical Mechanics 602 (2006).
[12] L. Zdeborová, Ph.D. thesis, Université Paris-Sud (2008), arXiv:0806.4112.
[13] J. Raymond, A. Sportiello, and L. Zdeborová, Physical Review E 76, 11101 (2007).
[14] R. Gallager, Information Theory, IEEE Transactions on 8, 21 (1962).
[15] F. Kschischang, B. Frey, and H. Loeliger, IEEE Transactions on Information Theory 47, 498 (2001).
[16] J. Yedidia, W. Freeman, and Y. Weiss, Understanding belief propagation and its generalizations (Science & Technology Books, 2003), pp. 239–236.
[17] M. Mézard and A. Montanari, Information, Physics and Computation (Oxford University Press, 2009).
[18] T. Mora, Ph.D. thesis, Université Paris-Sud (2007), http://tel.archives-ouvertes.fr/tel-00175221/en/.
[19] N. Creignou and H. Daude, Discrete Applied Mathematics 96, 41 (1999).
[20] A. Montanari, F. Ricci-Tersenghi, and G. Semerjian (2007), arXiv:0709.1667.
