Characterizing and Improving Generalized Belief Propagation Algorithms on the 2D Edwards-Anderson Model

Eduardo Domínguez, Alejandro Lage-Castellanos, Roberto Mulet
Department of Theoretical Physics and "Henri-Poincaré-Group" of Complex Systems, Physics Faculty, University of Havana, La Habana, CP 10400, Cuba.

Federico Ricci-Tersenghi
Dipartimento di Fisica, INFN – Sezione di Roma 1 and CNR – IPCF, UOS di Roma, Università La Sapienza, P.le A. Moro 5, 00185 Roma, Italy

Tommaso Rizzo
Dipartimento di Fisica and CNR – IPCF, UOS di Roma, Università La Sapienza, P.le A. Moro 5, 00185 Roma, Italy

(Dated: September 20, 2018)

We study the performance of different message passing algorithms in the two-dimensional Edwards-Anderson model. We show that the standard Belief Propagation (BP) algorithm converges only at high temperature to a paramagnetic solution. Then, we test a Generalized Belief Propagation (GBP) algorithm, derived from a Cluster Variational Method (CVM) at the plaquette level. We compare its performance with BP and with other algorithms derived under the same approximation: Double Loop (DL) and a two-ways message passing algorithm (HAK). The plaquette-CVM approximation improves BP in at least three ways: the quality of the paramagnetic solution at high temperatures, a better (lower) estimate for the critical temperature, and the fact that the GBP message passing algorithm converges also to non-paramagnetic solutions. The lack of convergence of the standard GBP message passing algorithm at low temperatures seems to be related to the implementation details and not to the appearance of long range order. In fact, we prove that a gauge invariance of the constrained CVM free energy can be exploited to derive a new message passing algorithm which converges at even lower temperatures.
In its whole region of convergence this new algorithm is faster than HAK and DL by some orders of magnitude.

I. INTRODUCTION

The 2D Edwards-Anderson (EA) model in statistical mechanics is defined by a set $\sigma = \{s_1, \ldots, s_N\}$ of $N$ Ising spins $s_i = \pm 1$ placed on the nodes of a 2D square lattice, and random interactions $J_{i,j}$ at the edges, with a Hamiltonian

$$H(\sigma) = -\sum_{\langle i,j \rangle} J_{i,j}\, s_i s_j$$

where $\langle i,j \rangle$ runs over all pairs of neighboring spins (first neighbors on the lattice). The $J_{i,j}$ are the magnetic exchange constants between spins and are supposed fixed for any given instance of the system, and the spins $s_i$ are the dynamic variables. We will focus on one of the most common disorder types, the bimodal interactions $J = \pm 1$ with equal probabilities.

The statistical mechanics of the EA model, at a temperature $T = 1/\beta$, is given by the Gibbs-Boltzmann distribution

$$P(\sigma) = \frac{e^{-\beta H(\sigma)}}{Z} \qquad \text{where} \qquad Z = \sum_{\sigma} e^{-\beta H(\sigma)}$$

The direct computation of the partition function $Z$, or of any marginal probability distribution like $p(s_i, s_j) = \sum_{\sigma \setminus s_i, s_j} P(\sigma)$, is a time consuming task, unattainable in general, and therefore an approximation is required. We are interested in fast algorithms for inferring such marginal distributions.

Actually, for the 2D EA model, thanks to the graph planarity, algorithms computing $Z$ in a time polynomial in $N$ exist. However, we are interested in very fast (i.e. linear in $N$) algorithms that can be used also for more general models, e.g. the EA model in a field or defined on a 3D cubic lattice. For these more general cases a polynomial algorithm is very unlikely to exist and some approximations are required.

A simple and effective mean field approximation is the one due to Bethe [1], in which the marginals over the dynamic variables, like $p(s_i)$, are obtained from the minimization of a variational free energy in a self-consistent way.
The Bethe approximation is exact for a model without loops in the interaction network, which unfortunately is far from being the usual case in physics. In the context of finite dimensional lattices, Kikuchi [2] derived an extension of this approximation to larger groups of variables, which accounts for short loops exactly, and is usually referred to as the Cluster Variational Method (CVM).

The interest in spin glasses, with quenched random disorder, brought a new testing ground for both approximations. In particular the Bethe approximation (exact on trees) has been the starting point of many useful theoretical and applied developments. It is at the basis of the cavity method, which allows a restatement of replica theory in probabilistic terms for finite connectivity systems [3]. The Bethe approximation is connected to well known algorithms in computer science, namely Belief Propagation [4] and the sum-product algorithm [5].

A major achievement of this confluence between computer science and statistical mechanics has been the conception of the Survey Propagation algorithm [6, 7], inspired by the cavity method and the replica symmetry breaking [3, 8, 9], which shows great performance on hard optimization problems [6, 7, 10, 11]. Statistical mechanics clarified the relation between phase transitions and easy-hard transitions in optimization problems, and allowed the statistical characterization of the onset of the hard phase [12–14], as well as the analytical description of search algorithms based on BP [15, 16].

The correctness of the Bethe approximation and the related algorithms is, however, linked to the lack of topological correlations in the interactions (random graphs are locally tree-like), since the approximation is exact only on tree topologies. This is a strong limitation for physical purposes, since tree topologies or random graphs are not the common situation.
The Bethe approximation performs poorly in finite dimensional lattices, and the associated algorithms are usually non-convergent at low temperatures.

Recently the Cluster Variational Method (CVM) has been reformulated in a broader probabilistic framework called region-based approximations to free energy [17] and connected to a Generalized Belief Propagation (GBP) algorithm to find the stationary points of the free energy. It extends the Bethe approximation by considering correlations in larger regions, allowing, in principle, to take into account short loops accurately. In [17] it was shown that stable fixed points of the GBP message passing algorithm correspond to stationary points of the approximated CVM free energy, while the converse is not necessarily true. Furthermore, the GBP message passing is not guaranteed to converge at all. Prompted by this lack of convergence, a new kind of provably convergent algorithms for minimizing the CVM approximated free energy, known as Double Loop (DL) algorithms [18, 19], has been developed, at the cost of a drastic drop in speed.

GBP has been applied in the last decade to inference problems [20–22], consistently outperforming BP. In particular, image reconstruction problems [20, 23] are based on a 2D lattice structure but, at variance with the 2D EA model, the interactions among nearby spins (pixels) are ferromagnetic, and the damaged image is used as an external field. Both factors help the convergence of GBP algorithms. An analysis of the CVM approximation using GBP algorithms on single instances of finite dimensional disordered models of physical interest, like the EA model, has not been done so far.
The Edwards-Anderson model in 2D has been largely studied by other methods (see [24, 25] and references therein), suggesting that it remains paramagnetic all the way down to zero temperature, lacking any thermodynamic transition at any finite $T$, although at low $T$ there are metastable states of very long lifetime, leading to very slow dynamics. Based on this fact, a paramagnetic version of the GBP on the 2D EA model was studied recently in [26].

The connection of CVM with the replica trick and a Generalized Survey Propagation has been presented recently [27]. However, the implementation of the latter algorithm on finite dimensional lattices is computationally very demanding, and should be preceded by the study of the original CVM approximation and GBP algorithm.

In this paper we study the convergence properties of the GBP message passing algorithm and the performance of the CVM approximation on the 2D EA model. After the introduction of the region-based free energy in Sec. II and the message passing algorithm in terms of cavity fields, we compute the critical (inverse) temperature $T_{\mathrm{CVM}} \simeq 0.82$ ($\beta_{\mathrm{CVM}} \simeq 1.22$) of the plaquette-CVM approximation in Sec. III, improving the Bethe estimate $T_{\mathrm{Bethe}} = 1.51$ ($\beta_{\mathrm{Bethe}} \simeq 0.66$) by roughly a factor 2. The CVM average case temperature, however, does not clearly correspond to the single instance behavior of the GBP message passing algorithm, as is shown in Sec. IV. At variance with Belief Propagation, GBP converges to spin glass solutions (below $T_{\mathrm{SG}} \simeq 1.27$, above $\beta_{\mathrm{SG}} \simeq 0.79$), and stops converging near $T \simeq 1.0$, before the average case prediction $T_{\mathrm{CVM}}$. In Sec. V we show that this convergence problem depends on the implementation details of the message passing algorithm, and can be improved by a simultaneous update of messages.
In order to do so, the gauge invariance of the message passing equations has to be fixed. In Sec. VI we compare the solutions and the performance of GBP with 3 other algorithms for the minimization of the CVM free energy: Double Loop [19], Two-Ways Message Passing [19], and the Dual algorithm [26]. In terms of the CVM free energy, the paramagnetic solution is in general the one to be chosen, except for a small interval in temperatures where the spin glass solution has a lower free energy. Our results are summarized in Sec. VII.

II. GENERALIZED BELIEF PROPAGATION ON EA 2D

Given that a detailed derivation of the plaquette-GBP message passing equations for the 2D Edwards-Anderson model was presented in [26], here we only summarize such derivation, skipping unnecessary details.

The idea of the region-based free energy approximation [17, 28] is to mimic the exact (Boltzmann-Gibbs) distribution $P(\sigma)$ by a reduced set of its marginals. A hierarchy of approximations is given by the size of such marginals, starting with the set of all single spin marginals $p_i(s_i)$ (mean field), then moving to all neighboring site marginals $p(s_i, s_j)$ (Bethe approximation), then to all square plaquette marginals $p(s_i, s_j, s_k, s_l)$, and so on. Since the only way of knowing such marginals exactly is the unattainable computation of $Z$, the method aims to approximate them by a set of beliefs $b_i(s_i)$, $b_L(s_i, s_j)$, $b_{\mathcal{P}}(s_i, s_j, s_k, s_l)$, etc., obtained from the minimization of a region-based free energy.
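To make concrete what the beliefs are meant to approximate, the following minimal sketch computes the exact single-spin and nearest-neighbour marginals of a very small lattice by brute-force enumeration. It is only an illustration: the function name `exact_marginals`, the open boundary conditions and the seeding of the couplings are assumptions made here, not part of the original study.

```python
import itertools
import math
import random


def exact_marginals(side, beta, seed=0):
    """Enumerate all 2**(side*side) states of a small open-boundary EA lattice.

    Returns (Z, p_i, p_ij): the partition function, the single-spin marginals
    p_i[i][s] and the nearest-neighbour marginals p_ij[(i, j)][(s_i, s_j)].
    """
    rng = random.Random(seed)
    n = side * side
    # Bimodal couplings J = +/-1 on the horizontal and vertical lattice edges.
    couplings = {}
    for row in range(side):
        for col in range(side):
            i = row * side + col
            if col + 1 < side:
                couplings[(i, i + 1)] = rng.choice((-1, 1))
            if row + 1 < side:
                couplings[(i, i + side)] = rng.choice((-1, 1))

    Z = 0.0
    p_i = [{-1: 0.0, 1: 0.0} for _ in range(n)]
    p_ij = {edge: {pair: 0.0 for pair in itertools.product((-1, 1), repeat=2)}
            for edge in couplings}
    for sigma in itertools.product((-1, 1), repeat=n):
        energy = -sum(J * sigma[i] * sigma[j] for (i, j), J in couplings.items())
        weight = math.exp(-beta * energy)
        Z += weight
        for i in range(n):
            p_i[i][sigma[i]] += weight
        for (i, j) in couplings:
            p_ij[(i, j)][(sigma[i], sigma[j])] += weight
    # Normalise the accumulated Boltzmann weights into probabilities.
    for i in range(n):
        for s in p_i[i]:
            p_i[i][s] /= Z
    for edge in p_ij:
        for pair in p_ij[edge]:
            p_ij[edge][pair] /= Z
    return Z, p_i, p_ij
```

The loop visits $2^N$ configurations, so this is only usable for a handful of spins; the exact marginals it returns satisfy the marginalization relations that the beliefs are later required to obey.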
Following the derivation done in [26], the plaquette level approximated free energy for the 2D EA model is given as a contribution of all Plaquettes, Links and Spins in the 2D lattice:

$$
\begin{aligned}
-\beta F = {} & \sum_{\mathcal{P}} \sum_{\sigma_{\mathcal{P}}} b_{\mathcal{P}}(\sigma_{\mathcal{P}}) \log \frac{b_{\mathcal{P}}(\sigma_{\mathcal{P}})}{\exp(-\beta E_{\mathcal{P}}(\sigma_{\mathcal{P}}))} && \text{(Plaquettes)} \\
& - \sum_{L} \sum_{\sigma_L} b_L(\sigma_L) \log \frac{b_L(\sigma_L)}{\exp(-\beta E_L(\sigma_L))} && \text{(Links)} \\
& + \sum_{i} \sum_{s_i} b_i(s_i) \log \frac{b_i(s_i)}{\exp(-\beta E_i(s_i))} && \text{(Spins)}
\end{aligned}
\tag{1}
$$

where the symbol $\sigma_R = (s_1, \ldots, s_k)$ stands for the set of spins in region $R$, while $E_R(\sigma_R) = -\sum_{\langle i,j \rangle \in R} J_{i,j} s_i s_j$ stands for the energy contribution in that region. The energy term $E_i(s_i)$ in the spins contribution is only relevant when an external field acts over spins, and will be neglected from now on.

FIG. 1. Schematic representation of belief equations (2). Lagrange multipliers are depicted as arrows, going from parent regions to children regions. Italic capital letters are used to denote Plaquettes, simple capital letters denote Links, and lower case letters denote Spins.

An unrestricted minimization of the free energy (1) in terms of its beliefs produces incongruent results. Beliefs are only meaningful as an approximation to the correct marginals if they obey the marginalization constraints $b_i(s_i) = \sum_{s_j} b_L(s_i, s_j)$ and $b_L(s_i, s_j) = \sum_{s_k, s_l} b_{\mathcal{P}}(s_i, s_j, s_k, s_l)$. This marginalization is enforced by the introduction of Lagrange multipliers (see [17] for a general introduction, and [26] for this particular case) in the free energy expression. There is one Lagrange multiplier $\mu_{L \to i}(s_i)$ for every link $L$ and spin $i \in L$, and a Lagrange multiplier $\nu_{\mathcal{P} \to L}(s_i, s_j)$ for each plaquette $\mathcal{P}$ and link $L \in \mathcal{P}$. In terms of these Lagrange multipliers, the stationary condition of the approximated free energy is achieved with

$$
\begin{aligned}
b_i(s_i) &= \frac{1}{Z_i} \exp\left( -\beta E_i(s_i) - \sum_{L \supset i}^{4} \mu_{L \to i}(s_i) \right), \\
b_L(\sigma_L) &= \frac{1}{Z_L} \exp\left( -\beta E_L(\sigma_L) - \sum_{\mathcal{P} \supset L}^{2} \nu_{\mathcal{P} \to L}(\sigma_L) - \sum_{i \subset L}^{2} \sum_{\substack{L' \supset i \\ L' \neq L}}^{3} \mu_{L' \to i}(s_i) \right), \\
b_{\mathcal{P}}(\sigma_{\mathcal{P}}) &= \frac{1}{Z_{\mathcal{P}}} \exp\left( -\beta E_{\mathcal{P}}(\sigma_{\mathcal{P}}) - \sum_{L \subset \mathcal{P}}^{4} \sum_{\substack{\mathcal{P}' \supset L \\ \mathcal{P}' \neq \mathcal{P}}}^{1} \nu_{\mathcal{P}' \to L}(\sigma_L) - \sum_{i \subset \mathcal{P}}^{4} \sum_{\substack{L \supset i \\ L \not\subset \mathcal{P}}}^{2} \mu_{L \to i}(s_i) \right).
\end{aligned}
\tag{2}
$$

A graphical representation of these equations is given in figure 1. Lagrange multipliers are shown as arrows going from parent regions to children. Take, for instance, the middle equation for the belief in link regions $b_L(\sigma_L) = b_L(s_i, s_j)$. The sum of the two Lagrange multipliers $\nu_{\mathcal{P} \to L}(s_i, s_j)$ corresponds to the triple arrows on both sides of the link in the central panel of figure 1, while the two sums over three messages $\mu_{L' \to i}(s_i)$ correspond to the three arrows acting over the top ($j$) and bottom ($i$) spins, respectively. In equations (2), the $Z_R$ are normalization constants. The terms $E_{\mathcal{P}}(\sigma_{\mathcal{P}}) = E_{\mathcal{P}}(s_i, s_j, s_k, s_l) = -(J_{i,j} s_i s_j + J_{j,k} s_j s_k + J_{k,l} s_k s_l + J_{l,i} s_l s_i)$ and $E_L(s_i, s_j) = -J_{i,j} s_i s_j$ are the corresponding energies in plaquettes and links respectively, and are represented in the diagram by the lines (interactions) between circles (spins). The spin energies $E_i(s_i)$ are zero since no field is acting upon spins.

FIG. 2. Message passing equations (5) and (6), shown schematically. Messages are depicted as arrows, going from parent regions to children regions. On any link $J_{i,j}$, represented as bold lines between spins (circles), a Boltzmann factor $e^{\beta J_{i,j} s_i s_j}$ exists. Dark circles represent spins to be traced over. Messages from plaquettes to links $\nu_{\mathcal{P} \to L}(s_i, s_j)$ are represented by a triple arrow, because they can be written in terms of three parameters $U$, $u_i$ and $u_j$, defining the correlation $\langle s_i s_j \rangle$ and magnetizations $\langle s_i \rangle$ and $\langle s_j \rangle$, respectively.
The Lagrange multipliers can be parametrized in terms of cavity fields $u$ and $(U, u_a, u_b)$ as

$$-\mu_{L \to i}(s_i) = \beta\, u_{L \to i}\, s_i \tag{3}$$

$$-\nu_{\mathcal{P} \to L}(s_i, s_j) = \beta \left( U_{\mathcal{P} \to L}\, s_i s_j + u_{\mathcal{P} \to i}\, s_i + u_{\mathcal{P} \to j}\, s_j \right) \tag{4}$$

In particular, the field $u_{L \to i}$ corresponds to the cavity field in the Bethe approximation [17]. The choice of this parametrization is the reason for the use of single and triple arrows in figures 1 and 2. In particular, the messages going from plaquettes to links are characterized by three fields $(U_{\mathcal{P} \to L}, u_{\mathcal{P} \to i}, u_{\mathcal{P} \to j})$, and the capital $U_{\mathcal{P} \to L}$ acts as an effective interaction term.

The Lagrange multipliers are related among them by the constraints they are supposed to impose (see [26]). In terms of the cavity fields and using the notation in figure 2, Link-to-Spin cavity fields shall be related by

$$u_{L \to i} = \hat{u}\left( u_{\mathcal{P} \to i} + u_{\mathcal{L} \to i},\; U_{\mathcal{P} \to L} + U_{\mathcal{L} \to L} + J_{ij},\; u_{\mathcal{P} \to j} + u_{\mathcal{L} \to j} + u_{A \to j} + u_{B \to j} + u_{U \to j} \right), \tag{5}$$

where

$$\hat{u}(u, U, h) \equiv u + \frac{1}{2\beta} \log \frac{\cosh \beta (U + h)}{\cosh \beta (U - h)}$$

Note that the usual cavity equation for fields in the Bethe approximation [3] is recovered if all contributions from plaquettes $\mathcal{P}$ and $\mathcal{L}$ are set to zero.
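The function $\hat{u}(u, U, h)$ and its Bethe limit can be sketched in a few lines. This is an illustrative sketch only: the helper names `log_cosh`, `u_hat` and `bethe_cavity_field` are chosen here for convenience, and the stable evaluation of $\log\cosh$ is an implementation choice, not something prescribed by the derivation.

```python
import math


def log_cosh(x):
    """Numerically stable log(cosh(x)), valid for large |x|."""
    ax = abs(x)
    return ax + math.log1p(math.exp(-2.0 * ax)) - math.log(2.0)


def u_hat(u, U, h, beta):
    """Kernel of equation (5): u + (1 / 2 beta) log[cosh beta(U+h) / cosh beta(U-h)]."""
    return u + (log_cosh(beta * (U + h)) - log_cosh(beta * (U - h))) / (2.0 * beta)


def bethe_cavity_field(J, h, beta):
    """Bethe limit: plaquette contributions set to zero, so u = 0 and U = J."""
    return u_hat(0.0, J, h, beta)
```

In the Bethe limit the expression coincides with the familiar form $\frac{1}{\beta}\,\mathrm{arctanh}\left[\tanh(\beta J)\tanh(\beta h)\right]$, which follows from the identity $\mathrm{arctanh}(\tanh a \tanh b) = \frac{1}{2}\log\frac{\cosh(a+b)}{\cosh(a-b)}$.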
Similarly, by imposing the marginalization of the beliefs at Plaquettes onto their children Links, we find the self-consistent expression for the Plaquette-to-Link cavity fields:

$$
\begin{aligned}
U_{\mathcal{P} \to L} &= \hat{U}(\#) = \frac{1}{4\beta} \log \frac{K(1,1)\, K(-1,-1)}{K(1,-1)\, K(-1,1)} \\
u_{\mathcal{P} \to i} &= -u_{D \to i} + \hat{u}_i(\#) = u_{\mathcal{D} \to i} - u_{D \to i} + \frac{1}{4\beta} \log \frac{K(1,1)\, K(1,-1)}{K(-1,1)\, K(-1,-1)} \\
u_{\mathcal{P} \to j} &= -u_{U \to j} + \hat{u}_j(\#) = u_{\mathcal{U} \to j} - u_{U \to j} + \frac{1}{4\beta} \log \frac{K(1,1)\, K(-1,1)}{K(1,-1)\, K(-1,-1)}
\end{aligned}
\tag{6}
$$

where

$$
\begin{aligned}
K(s_i, s_j) = \sum_{s_k, s_l} \exp \Big[ \beta \Big( & (U_{\mathcal{U} \to U} + J_{jk})\, s_j s_k + (U_{\mathcal{R} \to R} + J_{kl})\, s_k s_l + (U_{\mathcal{D} \to D} + J_{li})\, s_l s_i \\
& + (u_{\mathcal{U} \to k} + u_{C \to k} + u_{E \to k} + u_{\mathcal{R} \to k})\, s_k + (u_{\mathcal{R} \to l} + u_{F \to l} + u_{G \to l} + u_{\mathcal{D} \to l})\, s_l \Big) \Big]
\end{aligned}
$$

and the symbol $\#$ stands for all incoming fields in the right hand side of the equations. The functions $\hat{u}(u, U, h)$ and $[\hat{U}(\#), \hat{u}_i(\#), \hat{u}_j(\#)]$ will be used in the next section for the average case calculation.

For a given system of size $N$ (number of spins) there are $2N$ Links and $N$ square plaquettes, and therefore there are $4N$ Plaquette-to-Link fields $[U_{\mathcal{P} \to L}, u_{\mathcal{P} \to i}, u_{\mathcal{P} \to j}]$, and $4N$ Link-to-Spin fields $u_{L \to i}$. At the stationary points of the free energy their values are related by the set of $4N + 4N$ equations (5) and (6). The set of $4N + 4N$ self-consistent equations are also called message-passing equations when they are used as update rules for fields in the message passing algorithm, or cavity iteration equations in the context of cavity calculations. The field notation is more comprehensible than the original Lagrange multipliers notation, and has a clear physical meaning: each plaquette is telling its children links that they should add an effective interaction term $U_{\mathcal{P} \to L}$ to the direct interaction $J_{i,j}$, due to the fact that spins $s_i$ and $s_j$ are also interacting through the other three links in the plaquette.
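The logarithmic terms of equation (6) can be evaluated directly by tracing over the two spins $s_k$ and $s_l$. The sketch below is a hedged illustration: the function name `plaquette_log_ratios` and the way the arguments are grouped (three effective couplings plus the two total fields on the traced spins) are assumptions made here; the additive field combinations that complete $u_{\mathcal{P} \to i}$ and $u_{\mathcal{P} \to j}$ in equation (6) are left to the caller.

```python
import itertools
import math


def plaquette_log_ratios(beta, bonds, h_k=0.0, h_l=0.0):
    """Log-ratio terms of equation (6).

    bonds = (a, b, c) are the effective couplings on the three traced links:
    a acts on (j, k), b on (k, l) and c on (l, i), each being the sum of the
    bare J and the incoming capital-U field.
    h_k and h_l are the sums of the four small-u fields acting on spins k and l.
    Returns (U_hat, log_i, log_j).
    """
    a, b, c = bonds
    K = {}
    for s_i, s_j in itertools.product((-1, 1), repeat=2):
        # Trace over the two spins of the plaquette that are not on the link.
        K[(s_i, s_j)] = sum(
            math.exp(beta * (a * s_j * s_k + b * s_k * s_l + c * s_l * s_i
                             + h_k * s_k + h_l * s_l))
            for s_k, s_l in itertools.product((-1, 1), repeat=2)
        )
    pp, pm, mp, mm = K[(1, 1)], K[(1, -1)], K[(-1, 1)], K[(-1, -1)]
    U_hat = math.log(pp * mm / (pm * mp)) / (4.0 * beta)
    log_i = math.log(pp * pm / (mp * mm)) / (4.0 * beta)
    log_j = math.log(pp * mp / (pm * mm)) / (4.0 * beta)
    return U_hat, log_i, log_j
```

When the fields on the traced spins vanish, the two small-$u$ terms are zero and the effective interaction reduces to that of a chain of three bonds, $\frac{1}{\beta}\,\mathrm{arctanh}\left[\tanh(\beta a)\tanh(\beta b)\tanh(\beta c)\right]$, which is a quick consistency check on any implementation.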
Terms $u_i$ act like magnetic fields upon spins, and the complete $\nu(s_i, s_j)$ message is characterized by the triplet $(U_{i,j}, u_i, u_j)$.

III. CRITICAL TEMPERATURE OF PLAQUETTE-CVM APPROXIMATION

In this section we revisit the method used in [27] to compute the critical temperature at which the CVM approximation develops a spin glass phase. By spin glass phase we mean a phase characterized by nonzero local magnetizations $m_i = \tanh\left( \beta \sum_{L}^{4} u_{L \to i} \right)$ and nearly zero total magnetization $m = \frac{1}{N} \sum_i m_i \simeq 0$ (remember we work with no external field). The 2D EA model is paramagnetic down to zero temperature, but spin glass like solutions can appear in the CVM approximation due to its mean field character. We correct one of the conclusions reached in [27], where we failed to observe the appearance of the spin glass phase in the CVM approximation to the 2D Edwards-Anderson model.

We follow an average case approach, which is similar in spirit to, but different from, the single instance stability analysis done in [29] for the Bethe approximation (Belief Propagation). The average case calculation is a mathematical technique developed in [3] to study the typical solutions of cavity equations in disordered systems, with a deep and fundamental connection to the replica trick [9]. When applied to the plaquette-CVM approximation [27], we end up with two equations, in which fields (messages) are now replaced by functions of fields $q(u)$ and $Q(U, u_1, u_2)$, and the interactions are averaged out. As a consequence of the homogeneity of the 2D lattice and the averaging over local disorder $J_{i,j}$, all plaquettes, links, and spins in the graph are now equivalent, and we only need to study one of them to characterize the whole system.
More precisely, the average case self-consistent equation for the distribution $q(u)$ is given by

$$
\begin{aligned}
q(u_i) = \mathbb{E}_J \int & \mathrm{d}q(u_{A \to j})\, \mathrm{d}q(u_{B \to j})\, \mathrm{d}q(u_{U \to j}) \\
& \mathrm{d}Q(U_{\mathcal{P} \to L}, u_{\mathcal{P} \to i}, u_{\mathcal{P} \to j})\, \mathrm{d}Q(U_{\mathcal{L} \to L}, u_{\mathcal{L} \to i}, u_{\mathcal{L} \to j})\; \delta\big( u_i - \hat{u}(\#) \big)
\end{aligned}
\tag{7}
$$

with $\hat{u}(\#)$ as defined in the right hand side of equation (5), and

$$\mathrm{d}f(x) \equiv f(x)\, \mathrm{d}x$$

The corresponding self-consistent equation for $Q(U, u_1, u_2)$ is

$$
\begin{aligned}
\iint Q(U, u_a, u_b)\, & q(u_i - u_a)\, q(u_j - u_b)\, \mathrm{d}u_a\, \mathrm{d}u_b = \\
= \mathbb{E}_J \int & \mathrm{d}q(u_{C \to k})\, \mathrm{d}q(u_{E \to k})\, \mathrm{d}q(u_{F \to l})\, \mathrm{d}q(u_{G \to l}) \\
& \mathrm{d}Q(U_{\mathcal{U} \to U}, u_{\mathcal{U} \to j}, u_{\mathcal{U} \to k})\, \mathrm{d}Q(U_{\mathcal{R} \to R}, u_{\mathcal{R} \to k}, u_{\mathcal{R} \to l})\, \mathrm{d}Q(U_{\mathcal{D} \to D}, u_{\mathcal{D} \to l}, u_{\mathcal{D} \to i}) \\
& \delta\big( U - \hat{U}(\#) \big)\, \delta\big( u_i - \hat{u}_i(\#) \big)\, \delta\big( u_j - \hat{u}_j(\#) \big)
\end{aligned}
\tag{8}
$$

where the notation corresponds to equation (6). In both equations (7) and (8) the expression $\mathbb{E}_J = \int \mathrm{d}J\, P(J) \ldots$ stands for the average over the quenched randomness.

At high temperatures we expect the fixed point equations (5) and (6) to yield a paramagnetic solution. Such a solution is characterized by Link-to-Site messages $u = 0$, and Plaquette-to-Link messages $(U, u_1, u_2) = (U, 0, 0)$. If we impose this ansatz on the fields, we recover the paramagnetic or dual algorithm of [26] for the single instance message passing, and the paramagnetic average case study of [27] for the average case. Let us remember that the 2D EA model is expected to have no thermodynamic transition at any finite temperature, and hence to remain paramagnetic all the way down to $T = 0$.

Following [27], in the average case the paramagnetic solution has the form

$$q(u) = \delta(u) \qquad Q(U, u_1, u_2) = Q(U)\, \delta(u_1)\, \delta(u_2)$$

Equation (7) is always satisfied when $q(u) = \delta(u)$, for whatever $Q(U)$.
Equation (8) can be solved self-consistently for $Q(U)$:

$$
Q(U) = \mathbb{E}_J \int \mathrm{d}Q(U_U)\, \mathrm{d}Q(U_R)\, \mathrm{d}Q(U_D)\; \delta\left( U - \frac{1}{\beta} \operatorname{arctanh}\Big[ \tanh \beta (J_U + U_U)\, \tanh \beta (J_R + U_R)\, \tanh \beta (J_D + U_D) \Big] \right)
\tag{9}
$$

and the average free energy and all other relevant functions can be derived in terms of it (see [27]).

On the other hand, a general (not paramagnetic) solution of the average case equations (7) and (8) is very difficult, since it involves the deconvolution of the distributions $q(u)$ in the left hand side of eq. (8) in order to update $Q(U, u_1, u_2)$ by an iterative method. A critical temperature can be found, however, by an expansion in small $u$ around the paramagnetic solution. We can focus on the second moments of the distributions

$$a = \int q(u)\, u^2\, \mathrm{d}u \qquad a_{ij}(U) = \iint Q(U, u_1, u_2)\, u_i u_j\, \mathrm{d}u_1\, \mathrm{d}u_2$$

where $i, j \in \{1, 2\}$, and check whether the paramagnetic solution ($a = 0$ and $a_{ij}(U) = 0$) is locally stable. To do this we expand equations (7) and (8) to second order, and we obtain the following linearized equations:

$$
\begin{aligned}
a &= K_{a,a}\, a + \int \mathrm{d}U'\, K_{a,a_{11}}(U')\, a_{11}(U') + \int \mathrm{d}U'\, K_{a,a_{12}}(U')\, a_{12}(U') \\
a\, Q(U) + a_{11}(U) &= K_{a_{11},a}(U)\, a + \int \mathrm{d}U'\, K_{a_{11},a_{11}}(U, U')\, a_{11}(U') + \int \mathrm{d}U'\, K_{a_{11},a_{12}}(U, U')\, a_{12}(U') \\
a_{12}(U) &= K_{a_{12},a}(U)\, a + \int \mathrm{d}U'\, K_{a_{12},a_{11}}(U, U')\, a_{11}(U') + \int \mathrm{d}U'\, K_{a_{12},a_{12}}(U, U')\, a_{12}(U')
\end{aligned}
$$

The actual values of the $K_{a_x, a_y}$ come from the expansion in small $u$ of the original equations (see equation 90 in [27] for an example).

We cannot solve these equations analytically because we do not have an analytical expression of $Q(U)$ for the paramagnetic solution at all temperatures. By discretizing the values of $U$ uniformly in $(-U_{\max}, U_{\max})$, i.e.
$U = i\, \Delta U$ with $i \in [-I_{\max}, I_{\max}]$, we can transform the continuous set of equations to a system of the form

$$\vec{a} = K(\beta) \cdot \vec{a} \tag{10}$$

where the vector of the second moments $\vec{a} = (a, a_{11}(U), a_{12}(U))$ has the form

$$
\vec{a} = \big( a,\; a_{11}(-U_{\max}),\; a_{11}(-U_{\max} + \Delta U),\; \ldots,\; a_{11}(U_{\max} - \Delta U),\; a_{11}(U_{\max}),\; a_{12}(-U_{\max}),\; a_{12}(-U_{\max} + \Delta U),\; \ldots,\; a_{12}(U_{\max} - \Delta U),\; a_{12}(U_{\max}) \big)
$$

$K(\beta)$ is a $(2 I_{\max} + 1) \times (2 I_{\max} + 1)$ matrix that stands for the discrete representation of the integrals in the right hand side of the linearized equations, and depends on the inverse temperature via the solution $Q(U)$ of eq. (9). The paramagnetic solution $\vec{a} = 0$ always satisfies the homogeneous eq. (10). The stability criterion for the paramagnetic solution is the singularity of the Jacobian, $\det(I - K(\beta)) = 0$. When such a condition is satisfied, a non-paramagnetic solution continuously arises from the paramagnetic one, since a flat direction appears in the free energy.

FIG. 3. Determinant of the Jacobian $J = I - K(\beta)$ as a function of inverse temperature $\beta$. The critical inverse temperature is $\beta_{\mathrm{CVM}} \simeq 1.22$.

Numerically, we worked with a discretization of $2 I_{\max} + 1 = 41$ points between $(-U_{\max} = -3.5,\; U_{\max} = 3.5)$. The paramagnetic solution $Q(U)$ is found by solving eq. (9) by an iterative method at every temperature, and is then used to compute the elements of the $K(\beta)$ matrix. In figure 3 we show the determinant of the Jacobian matrix $J = I - K(\beta)$. The critical inverse temperature derived from this analysis is $\beta_{\mathrm{CVM}} \simeq 1.22$ for the appearance of a flat direction in the free energy. In [27] $\beta_{\mathrm{CVM}}$ was thought to be infinite (zero temperature) because an incomplete range of the values of $\beta$ was examined.

The critical temperature found here is below the Bethe critical temperature ($\beta_{\mathrm{Bethe}} \simeq 0.66$), and therefore improves the Bethe approximation by roughly a factor 2, since the 2D EA model is likely to remain paramagnetic at all finite temperatures. At variance with the Bethe approximation, the single instance behavior of the message passing is not so clearly related to the average case critical temperature, as we show in the next section.

IV. PERFORMANCE OF GBP ON 2D EA MODEL

Before studying GBP message passing for the plaquette-CVM approximation, let us check what happens to the simpler Bethe approximation and the corresponding message passing algorithm, known as Belief Propagation (BP), in the 2D EA model. When running BP at high temperatures (above $T_{\mathrm{Bethe}} = 1/\beta_{\mathrm{Bethe}} \simeq 1.51$) in a typical instance of the model with bimodal interactions, we find the paramagnetic solution (given by all fields $u = 0$), and therefore the system is equivalent to a set of independent interacting pairs of spins, which is only correct at infinite temperature. The Bethe temperature $T_{\mathrm{Bethe}}$ (computed in the average case and exact on acyclic graphs [30]) seems to mark precisely the point where BP stops converging (see Fig. 4). Indeed, messages flow away from zero below $T_{\mathrm{Bethe}}$, and convergence of the BP message passing algorithm is not achieved anymore. So, the Bethe approximation is disappointing when applied to single instances of the Edwards-Anderson model: either it converges to a paramagnetic solution at high temperatures, or it does not converge at all below $T_{\mathrm{Bethe}}$.

FIG. 4. Probability of convergence of BP and GBP on a 2D EA model, with random bimodal interactions, as a function of inverse temperature $\beta = 1/T$ (curves: GBP on $16 \times 16$, $32 \times 32$, $64 \times 64$, $128 \times 128$ and $256 \times 256$ lattices; BP on $32 \times 32$ and $128 \times 128$ lattices). The Bethe spin glass transition is expected to occur at $\beta_{\mathrm{Bethe}} \simeq 0.66$ on a random graph with the same connectivity. The BP message passing algorithm on the 2D EA model stops converging very close to that point. Above that temperature, BP equations converge to the paramagnetic solution, i.e. all messages are trivial, $u = 0$. Below the Bethe temperature (nearly) the Bethe instability takes messages away from the paramagnetic solution, and the presence of short loops is thought to be responsible for the lack of convergence. On the other hand, the GBP equations converge at lower temperatures, but eventually stop converging as well.

The natural question arises as to what extent the GBP message passing algorithm for the plaquette-CVM approximation is also non-convergent below its critical temperature, and whether this temperature coincides with the average case one. To check this we used the GBP message passing equations (5) and (6), with a damping factor $0.5$ in the Link-to-Site fields $u$:

$$u^{\mathrm{new}}_{L \to i} = 0.5\, u^{\mathrm{old}}_{L \to i} + 0.5\, \hat{u}(\#)$$

We will make the distinction between two types of solutions for the GBP algorithm. The high temperature or paramagnetic solution is characterized by zero local magnetization of spins, $m_i = \sum_{s_i} s_i\, b_i(s_i) = \tanh\left( \beta \sum_{L}^{4} u_{L \to i} \right) = 0$. At low temperatures, following the average case analysis, a non-paramagnetic or spin glass solution should appear, characterized by nonzero local magnetizations, but roughly null global magnetization. The temperature at which nonzero local magnetizations appear will be called $T_{\mathrm{SG}} = 1/\beta_{\mathrm{SG}}$.
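The damped update and the distinction between the two types of solutions can be sketched as follows. This is a minimal illustration under stated assumptions: the function names and, in particular, the numerical threshold `tol` used to decide when a local magnetization counts as zero are choices made here and are not specified in the text.

```python
import math


def damped_update(u_old, u_target, damping=0.5):
    """u_new = damping * u_old + (1 - damping) * u_target (damping 0.5 in the text)."""
    return damping * u_old + (1.0 - damping) * u_target


def classify_solution(beta, link_fields, tol=1e-6):
    """Label a fixed point as paramagnetic or spin-glass.

    link_fields[i] holds the four Link-to-Spin fields u_{L->i} entering spin i.
    Returns (label, global_magnetization). The threshold tol is an assumption.
    """
    mags = [math.tanh(beta * sum(fields)) for fields in link_fields]
    label = "paramagnetic" if max(abs(m) for m in mags) < tol else "spin-glass"
    return label, sum(mags) / len(mags)
```

A fixed point with all fields equal to zero is labelled paramagnetic, while one with nonzero local fields of both signs is labelled spin-glass even though its global magnetization may vanish.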
Figure 4 shows that GBP is able to converge below the Bethe critical temperature, but stops converging before the CVM average case critical temperature $\beta_{\mathrm{CVM}} \simeq 1.22$. Furthermore, figure 5 shows that even before it stops converging, GBP finds a spin glass solution in most instances. The inset of figure 5 shows a collapse of the data points for different system sizes using the scaling variable $L^{0.9} (\beta - 0.79)$, which gives an estimate $\beta_{\mathrm{SG}} \simeq 0.79$ (the exponent $0.9$ is obtained from the best data collapse). Since $\beta_{\mathrm{SG}} \simeq 0.79$ is well below the average case inverse critical temperature $\beta_{\mathrm{CVM}} \simeq 1.22$, the relevance of the latter for the behavior of GBP on single samples is questionable.

FIG. 5. Fraction of SG solutions in GBP as a function of $\beta$, with $\beta_{\mathrm{SG}} = 0.79$ and $\beta_{\mathrm{CVM}} = 1.22$ marked. Data points correspond to the fraction of SG solutions in a population of 100 systems of sizes $16^2$, $32^2$, $64^2$, $128^2$, $256^2$ respectively. At high temperatures (low $\beta$) GBP message passing always converges to the paramagnetic solution. The average case critical inverse temperature $\beta_{\mathrm{CVM}} \simeq 1.22$ does not correspond to the single instance behavior, as the spin glass solutions in GBP appear around $\beta_{\mathrm{SG}} \simeq 0.79$. The inset shows that all data collapse if plotted as a function of the scaling variable $L^{0.9} (\beta - 0.79)$, where the exponent $0.9$ and the critical inverse temperature $\beta_{\mathrm{SG}} \simeq 0.79$ are obtained from the best data collapse.

By a similar data collapse procedure, we estimate the non-convergence temperature for the GBP message passing algorithm to be $\beta_{\mathrm{conv}} \simeq 0.96$ (see Fig. 9), which is again far away from the average case prediction $\beta_{\mathrm{CVM}}$. So, beyond the simple Bethe approximation, we found three different temperatures in the CVM approximation:

$$\beta_{\mathrm{SG}} \simeq 0.79 \;<\; \beta_{\mathrm{conv}} \simeq 0.96 \;<\; \beta_{\mathrm{CVM}} \simeq 1.22$$

corresponding respectively to the appearance of spin glass solutions, to the lack of convergence on single instances, and to the average case prediction for the critical temperature.

We can summarize three main differences between the properties of BP and GBP. At high temperatures (below $\beta_{\mathrm{SG}} \simeq 0.79$) GBP gives a quite good approximation of the marginals [26], namely the paramagnetic solution with nontrivial correlation fields $U \neq 0$, while BP treats the system as a set of independent pairs of linked spins. Furthermore, this naive approach is all that BP can do for us, since above $\beta_{\mathrm{Bethe}} \simeq 0.66$ it no longer converges. GBP, on the other hand, is not only able to converge beyond $\beta_{\mathrm{Bethe}}$, but is also able to find spin glass solutions above $\beta_{\mathrm{SG}}$. The third difference between both algorithms is that the non-convergence of BP seems to occur exactly at the same temperature where a spin glass phase should appear (and arguably because of it), while the GBP convergence problems appear deep into the spin glass phase. The lack of convergence of GBP, however, seems to depend strongly on implementation details, as we show next.

V. GAUGE INVARIANCE OF GBP EQUATIONS

The convergence properties of the GBP message passing are sensitive to implementation details, e.g. the damping value in the update equations, and this is not an inherent property of the CVM (or region-graph) approximation. We might try, for instance, to update simultaneously all small-$u$ fields pointing towards a given spin, hoping to gain some more stability in the message passing algorithm. When trying to do this we find out that there is a freedom in the choice of these fields that has no effect over the fixed point solutions.
This freedom (similar to the one noticed in [31]) is the result of having introduced unnecessary Lagrange multipliers to enforce marginalization constraints that were already indirectly enforced.

FIG. 6. Null modes of the plaquette CVM free energy in terms of fields. The small-u fields that act on a given spin $i$ inside a plaquette can be shifted by an arbitrary amount $\delta$ as in equation (11) without changing the self-consistent (message passing) equations.

Consider, for instance, the messages shown in figure 6. If the belief on a plaquette $b_P(s_i, s_j, s_k, s_l)$ correctly marginalizes to the beliefs of two of its children links $b_L(s_i, s_j)$ and $b_D(s_l, s_i)$, and one of those beliefs marginalizes to the common spin, $b_i(s_i) = \sum_{s_j} b_L(s_i, s_j)$, it is inevitable that the second link $D$ also marginalizes to the same belief on $s_i$, since

$$b_i(s_i) = \sum_{s_j} b_L(s_i, s_j) = \sum_{s_j, s_l, s_k} b_P(s_i, s_j, s_k, s_l) = \sum_{s_l} b_D(s_l, s_i) .$$

Therefore the Lagrange multiplier that was introduced to force this last marginalization is not needed. This redundancy is a general feature of GBP equations when there are more than two levels of regions (Plaquettes, Links, and Spins, in our case).

Having introduced unnecessary multipliers leads to a gauge invariance of the field (message) values. Such invariance can be better understood by looking at the GBP equations at infinite temperature: for $\beta = 0$ the non-linear parts of the message passing equations (5) and (6) disappear, but there is still a set of linear equations to be satisfied by the small-u messages, with infinitely many non-trivial solutions. These solutions correspond, however, to the same physical paramagnetic solution, since the total field $h_i = \sum_{L}^{4} u_{L \to i}$ and the magnetizations $m_i = \tanh(\beta h_i)$ are always zero.

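The redundancy argument above can be checked numerically. The following is a minimal sketch in plain Python, not part of the original work: it draws an arbitrary normalized plaquette belief, builds the two link beliefs by marginalization, and confirms that both links then marginalize to the same single-spin belief. The helper `marginal` and the random belief are illustrative assumptions.

```python
import itertools
import random

random.seed(0)
spins = (-1, +1)

# Arbitrary normalized plaquette belief b_P(s_i, s_j, s_k, s_l) (illustrative data).
weights = {cfg: random.random() for cfg in itertools.product(spins, repeat=4)}
total = sum(weights.values())
b_P = {cfg: w / total for cfg, w in weights.items()}


def marginal(belief, keep):
    """Sum `belief` over every position not listed in `keep`.

    The order of `keep` fixes the order of the variables in the result.
    """
    out = {}
    for cfg, p in belief.items():
        key = tuple(cfg[k] for k in keep)
        out[key] = out.get(key, 0.0) + p
    return out


# Plaquette-to-link marginalization constraints.
b_L = marginal(b_P, keep=(0, 1))  # b_L(s_i, s_j)
b_D = marginal(b_P, keep=(3, 0))  # b_D(s_l, s_i)

# Link-to-spin marginalization, done through each link separately.
b_i_from_L = marginal(b_L, keep=(0,))  # sum over s_j
b_i_from_D = marginal(b_D, keep=(1,))  # sum over s_l

# Largest disagreement between the two single-spin beliefs.
max_diff = max(abs(b_i_from_L[k] - b_i_from_D[k]) for k in b_i_from_L)
print(max_diff)
```

Because both link beliefs descend from the same plaquette belief, the constraint through the second link carries no new information, which is the origin of the redundant multiplier discussed in the text.
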
It is easy to check that once we have a solution of the message passing equations (5) and (6) at any temperature, we can change by an arbitrary amount $\delta$ any group of 4 u-messages inside a plaquette (figure 6) pointing to the same spin as

$$u_{L \to i} \to u_{L \to i} + \delta , \qquad u_{PL \to i} \to u_{PL \to i} + \delta ,$$
$$u_{D \to i} \to u_{D \to i} - \delta , \qquad u_{PD \to i} \to u_{PD \to i} - \delta , \qquad (11)$$

and all self-consistent equations are still satisfied. This local null mode of the standard GBP equations can be avoided by arbitrarily setting to zero one of the four small-u fields entering equation (11). We choose to fix the gauge by removing the right small-u field in every Plaquette-to-Link field $(U, u_{\rm left}, u_{\rm right})$, as shown in figure 7. Once the gauge is fixed, the fields are uniquely determined, and we can try to implement the simultaneous updating of all small-u fields around a given spin, hopefully improving convergence.

FIG. 7. In the left diagram, all 8 small-u messages pointing to the central spin are highlighted in boldface. They are 4 Link-to-Site u-messages and 4 Plaquette-to-Link $u_{\rm left}$-messages. They depend linearly on one another. The right diagram shows four plaquettes around a spin, and the messages that contribute in a non-linear way to the aforementioned 8 messages. The idea of GBP+GF is to compute the non-linear contributions to the message passing equations, and then assign the values of the u-messages in order to satisfy their linear relations.

In the left diagram of figure 7 all messages involving the central spin are represented, with those that act precisely upon that spin in boldface. These messages enter linearly in the message passing equations of each other (see equations (5) and (6)).
Therefore, the self-consistent equations they should satisfy at the fixed points can be written as (using the notation of figure 7)

$$\begin{aligned}
u_1 &= u_a + \mathrm{NL}_1 , & u_a &= u_b - u_2 + \mathrm{NL}_a , \\
u_2 &= u_b + \mathrm{NL}_2 , & u_b &= u_c - u_3 + \mathrm{NL}_b , \\
u_3 &= u_c + \mathrm{NL}_3 , & u_c &= u_d - u_4 + \mathrm{NL}_c , \\
u_4 &= u_d + \mathrm{NL}_4 , & u_d &= u_a - u_1 + \mathrm{NL}_d ,
\end{aligned} \qquad (12)$$

where the NL stand for the non-linear contributions to the corresponding equation. As a consequence, the values of the 8 u-messages pointing to the central spin can be assigned precisely by a linear transformation for any given values of the non-linear contributions. This gauge fixed updating method, which we will call GBP+GF, updates all u-messages around a spin simultaneously and in such a way that they are consistent with each other via the message passing equations.

The right diagram in figure 7 shows the messages entering the non-linear parts. Taking the 8 u-messages as zero, the non-linear contributions are the right hand sides of the message passing equations involved. With the non-linear parts computed, the system of equations (12) is solved for the u-variables by multiplying the vector of non-linearities by the corresponding matrix. The 8 u-messages are then updated, usually with a damping factor. The update of the $U$ correlation fields is done as in the original GBP method, via equation (6), since it does not depend on the u-messages that are being updated.

Figure 8 shows the probability of convergence versus inverse temperature for GBP and GBP+GF, and also the fraction of the solutions found that correspond to a spin glass phase. Let us emphasize here that GBP and GBP+GF are not different approximations, but different methods to find the same fixed point solution by message passing. They are expected to find the same solutions, and in fact they do.

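The linear step of the gauge fixed update can be sketched as follows. This is a minimal illustration in plain Python, taking system (12) exactly as written above and treating the non-linear terms as given numbers; it is not the authors' implementation. Substituting $u_2 = u_b + \mathrm{NL}_2$ into the equation for $u_a$ gives $u_a = \mathrm{NL}_a - \mathrm{NL}_2$, and similarly for the others, so the solution is an explicit linear map of the NL vector. The function names and the `damping` parameter are hypothetical.

```python
import random

NAMES = ["1", "2", "3", "4", "a", "b", "c", "d"]


def gauge_fixed_solution(nl):
    """Solve system (12) for the 8 u-messages, given the non-linear terms nl[name]."""
    u = {}
    # Plaquette-to-Link fields: obtained by eliminating u_1..u_4 from the last four equations.
    u["a"] = nl["a"] - nl["2"]
    u["b"] = nl["b"] - nl["3"]
    u["c"] = nl["c"] - nl["4"]
    u["d"] = nl["d"] - nl["1"]
    # Link-to-Site fields: the first four equations of (12).
    u["1"] = u["a"] + nl["1"]
    u["2"] = u["b"] + nl["2"]
    u["3"] = u["c"] + nl["3"]
    u["4"] = u["d"] + nl["4"]
    return u


def residuals(u, nl):
    """Left-hand side minus right-hand side of each equation in (12)."""
    return [
        u["1"] - (u["a"] + nl["1"]),
        u["2"] - (u["b"] + nl["2"]),
        u["3"] - (u["c"] + nl["3"]),
        u["4"] - (u["d"] + nl["4"]),
        u["a"] - (u["b"] - u["2"] + nl["a"]),
        u["b"] - (u["c"] - u["3"] + nl["b"]),
        u["c"] - (u["d"] - u["4"] + nl["c"]),
        u["d"] - (u["a"] - u["1"] + nl["d"]),
    ]


def damped(old, new, damping=0.5):
    """Convex mix of old and new messages (one common, assumed form of damping)."""
    return {k: damping * old[k] + (1.0 - damping) * new[k] for k in new}


random.seed(1)
nl = {name: random.uniform(-1.0, 1.0) for name in NAMES}  # stand-in non-linear terms
u_new = gauge_fixed_solution(nl)
max_residual = max(abs(r) for r in residuals(u_new, nl))

u_old = {name: 0.0 for name in NAMES}
u_next = damped(u_old, u_new, damping=0.5)
print(max_residual)
```

In an actual GBP+GF sweep the entries of `nl` would come from the right hand sides of equations (5) and (6) evaluated with the 8 u-messages set to zero, which are not reproduced here.
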
At high temperatures both methods converge to the paramagnetic solution, with all local magnetizations vanishing, $m_i = \tanh\left(\beta \sum_{L}^{4} u_{L \to i}\right) = 0$. The standard message passing update of the GBP equations hardly converges above $\beta_{\rm conv} \simeq 0.96$, while the GBP+GF method reaches lower temperatures, $\beta_{\rm conv\text{-}GF} \simeq 1.2$, as can be seen in Fig. 9. Furthermore, GBP+GF allows us to work in a range of temperatures where most solutions are spin glass like. This proves that the non-convergence temperature found for GBP, $\beta_{\rm conv} \simeq 0.96$, is not a feature of the CVM approximation, but a characteristic of the message passing method used, and can be outperformed by other message passing schemes, like GBP+GF.

FIG. 8. [Plot: convergence probability and SG fraction versus $\beta$ for the 2D EA model with L = 64; curves for GBP and GBP-GF.] Convergence probability of GBP and GBP+GF as a function of $\beta$. The solution found by either iteration method is always the same (when both converge), but GBP+GF reaches lower temperatures while converging. The fraction of spin glass solutions found by either algorithm shows that GBP+GF sees the same spin glass transition temperature. The fraction of spin glass solutions is always given with respect to the number of convergent solutions.

FIG. 9. [Plot: $\beta_{\rm conv}$ versus $1/L$, for $1/L$ between 1/512 and 1/64; curves for GBP and GBP-GF.] Estimate of the non-convergence temperature for different system sizes using the standard GBP (squares) and the Gauge Fixed GBP (circles). As shown, with the gauge fixed procedure the extrapolated non-convergence temperature is quite close to the average case prediction $\beta_{\rm CVM} \simeq 1.22$. Each data point corresponds to the average of the non-convergence temperature over many realizations of the disorder: 10 realizations for the 512 × 512 systems, 20 for the 256 × 256 and 100 for the others.

Note in figure 9 that the non-convergence inverse temperature of GBP+GF, $\beta_{\rm conv\text{-}GF} \simeq 1.2$, is quite close to the average case prediction for the critical temperature, $\beta_{\rm CVM} \simeq 1.22$. Whether this is accidental or not is still unclear. Since the average case instability should describe the breakdown of the paramagnetic phase, and the lack of convergence in single instances occurs while already in a non-paramagnetic phase, it seems far-fetched to assume that both critical behaviors are related.

A. Gauge fixed average case stability

The disagreement between the average case critical temperature $\beta_{\rm CVM}$ and the one observed in the single instance, $\beta_{\rm SG}$, can be due to a number of reasons. First, the average case calculation assumes that cavity fields are uncorrelated. But, in our case, messages participating in the cavity iteration are very close to each other in the lattice, and thus correlated. Furthermore, GBP does not have the equivalent of a Bethe lattice for BP, i.e. a model in which the correlation between cavity messages is close to zero by construction. The second reason for a failure of the average case prediction is that the transition we observe in single instances might be due to the almost inevitable appearance of ferromagnetic domains in large systems (Griffiths instability). The third, and the most obvious, reason is that the gauge invariance was not accounted for in the average case calculation.

Reproducing the method of Sec. III to obtain an average case prediction of the critical temperature for the Gauge Fixed GBP is not straightforward.
The reason is that Link-to-Spin messages $u$ should fulfill two different equations: their own original equation (5), and the implicit equation derived from the fact that the gauge is fixed and one of the fields in the Plaquette-to-Link message $(U, u, u)$ is set to zero.

FIG. 10. Left: the set of four messages that we compute jointly by population dynamics. Right: the population dynamics step consists in taking four quadruplets at random from the population (those in black), and computing a new quadruplet (the one in gray inside the plaquette) using randomly selected interactions $J_{ij}$ on the plaquette.

However, a different average case calculation is possible. We can represent the messages flowing in the lattice by a population of quadruplets $(u_{L_l \to l}, u_{P \to l}, U_{P \to lr}, u_{L_r \to r})$, where one of the original messages is absent because the gauge has been fixed (see left panel in Fig. 10). Given any four of these quadruplets of messages around a plaquette, we can compute, using the message passing equations, the new messages inside the plaquette (see right panel in Fig. 10). The new population dynamics consists in picking four of these quadruplets out of the population at random, then computing the new quadruplet (using also random interactions in the plaquette) and finally putting it back in the population. After several steps, the population stabilizes either to a paramagnetic solution (where all $u = 0$ and only $U \neq 0$) or to a non-paramagnetic one (where also $u \neq 0$).

FIG. 11. [Plot: $q_{\rm EA}$ versus $\beta$ from population dynamics.] Edwards-Anderson order parameter, see eq. (13), obtained using a population of $N = 10^3$ messages, and running the population dynamics step $10^3 \times N$ times. In agreement with the single instance behavior, the transition between paramagnetic ($q_{\rm EA} = 0$) and non-paramagnetic (spin glass) phases is found at $\beta \simeq 0.78$.

In Fig. 11 we show the Edwards-Anderson order parameter $q_{\rm EA} = \sum_i m_i^2 / N$ obtained at different temperatures using this population dynamics average case method. We find that $q_{\rm EA}$ becomes larger than zero at $\beta_{\rm CVM\text{-}GF} \simeq 0.78$, which is quite close to the inverse temperature $\beta_{\rm SG} \simeq 0.79$ where single instances develop non-zero local magnetizations and a spin glass phase. The correspondence between this average case result and the single instance behaviour is very enlightening: indeed the average case computation does not take into account correlations among quadruplets of messages and it is not sensitive to Griffiths singularities. So, the simplest explanation for the GBP+GF behaviour on single samples of the 2D EA model is that quadruplets of messages arriving on any given plaquette are mostly uncorrelated and that at $\beta_{\rm SG}$ a true spin glass instability takes place (which is an artifact of the mean-field like approximation). Note that under the Bethe approximation the SG instability happens at $\beta_{\rm Bethe} \simeq 0.66$, while the CVM approximation improves the estimate of the SG critical boundary to $\beta_{\rm SG} \simeq 0.79$ (on single instances) and to $\beta_{\rm CVM\text{-}GF} \simeq 0.78$ (on the average case).

VI. SAME APPROXIMATION, FOUR ALGORITHMS

It can be proved [17] that stable fixed points of the message passing equations correspond to stationary points of the region graph approximated free energy (or CVM free energy). The converse is not necessarily true, and some of the stationary points of the free energy might not be stable under the message passing heuristic. As we have seen, the message passing might not even converge at all. For a given free energy approximation (eq. (1) in our case), there are other algorithms to search for stationary points, including other types of message passing and provably convergent algorithms.
In this section we study two of these algorithms and show that they do find the same spin glass like transition at $\beta_{\rm SG}$, but have a different behavior at lower temperatures.

The one presented so far is the so-called Parent-to-Child (PTC) message passing algorithm, in which Lagrange multipliers are introduced to force marginalization of bigger (parent) regions onto their children. Other choices of Lagrange multipliers are possible [17], leading to the so-called Child-to-Parent and Two-Ways algorithms. Next we test the following four algorithms for minimizing the plaquette-CVM free energy in typical instances of 2D EA:

• Double-Loop algorithm of Heskes et al. [19]. It is a provably convergent algorithm that guarantees a step by step minimization of the free energy functional. It consists of two loops, the inner of which is a Two-Ways message passing algorithm that we will call HAK. We use the implementation in the libDAI public library [32].

• HAK message passing algorithm. It is a Two-Ways message passing algorithm [19]. When it converges, it is usually faster than Double-Loop.

• GBP Parent-to-Child is the message passing algorithm we have presented so far in this paper, and for which the simultaneous updating of cavity fields was introduced to help convergence. Nevertheless the following results were obtained using standard GBP PTC.

• Dual algorithm of [26]. It is the same GBP PTC with all small fields set to $u = 0$, doing message passing only in terms of correlation fields $U$ (first equation in eq. (6)).

FIG. 12. [Plots for two systems: $f - f_{\rm dual}$ versus $\beta$ (top) and $q_{\rm EA}$ versus $\beta$ (bottom) for Double Loop, HAK and GBP.] Free energy of the solutions found by the Double Loop algorithm, HAK and the GBP PTC algorithm relative to the free energy of the paramagnetic solution (Dual approximation), in a typical system in which GBP PTC finds a spin glass solution. At high temperatures all algorithms find the same paramagnetic solution. Interestingly, there is a small range of temperatures where the spin glass solution found by GBP is actually the one that minimizes the free energy. But at even lower temperatures the paramagnetic solution becomes again the correct one. While Double Loop and HAK switch back to the paramagnetic solution (even if at a wrong T), GBP PTC gets stuck in the spin glass solution (and for this reason, it eventually stops converging).

For the last three algorithms we use our own implementation in terms of cavity fields $u$ and $(U, u_a, u_b)$. The Dual algorithm forces the solution of GBP to remain paramagnetic since all $u = 0$. This paramagnetic ansatz is specially suited for the 2D EA model since it is expected to be paramagnetic at any finite temperature (in the thermodynamic limit).

FIG. 13. [Plots for two systems: convergence time $t_{\rm conv}$ in seconds (log scale) versus $\beta$ for Double Loop, HAK, GBP and Dual.] Convergence time in seconds for the Double Loop algorithm (full points) and standard message passing algorithms (empty points) for the plaquette-GBP approximation in two different realizations of a $16^2$ Edwards-Anderson system. Message passing algorithms are typically faster, but not always convergent. The first cusp is related to the appearance of the spin glass solution, while the second cusp in the Double Loop algorithm is related to the switching back to the paramagnetic solution (see Fig. 12).

As shown in the previous section, the GBP PTC message passing equations find a paramagnetic solution in the 2D EA model at high temperatures, while below $T_{\rm SG} = 1/\beta_{\rm SG} \simeq 1.27$ they find a spin glass like solution. By spin glass like we mean that the total field $h_i = \sum_{L}^{4} u_{L \to i}$ and the magnetization $m_i = \tanh(\beta h_i)$ are non-zero and change from spin to spin.

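As a minimal sketch of how the total fields $h_i$, the magnetizations $m_i = \tanh(\beta h_i)$, and the Edwards-Anderson order parameter $q_{\rm EA} = \sum_i m_i^2 / N$ used throughout the paper could be evaluated from a set of converged Link-to-Spin fields, consider the following plain Python fragment. The data layout (one list of four $u$ values per spin) and the synthetic field values are assumptions for illustration only; they are not output of GBP.

```python
import math
import random


def edwards_anderson_q(u_fields, beta):
    """Return q_EA = (1/N) * sum_i m_i^2 with m_i = tanh(beta * h_i).

    u_fields: list with one entry per spin, each holding the 4 fields u_{L->i}
    arriving from the links around that spin (assumed layout).
    """
    magnetizations = [math.tanh(beta * sum(u_i)) for u_i in u_fields]
    return sum(m * m for m in magnetizations) / len(magnetizations)


n_spins = 16 * 16
beta = 1.0

# Paramagnetic-like configuration: every small-u field vanishes.
u_para = [[0.0, 0.0, 0.0, 0.0] for _ in range(n_spins)]

# Spin-glass-like configuration: non-zero fields that change from spin to spin
# (synthetic Gaussian numbers, purely illustrative).
random.seed(2)
u_sg = [[random.gauss(0.0, 0.3) for _ in range(4)] for _ in range(n_spins)]

q_para = edwards_anderson_q(u_para, beta)
q_sg = edwards_anderson_q(u_sg, beta)
print(q_para, q_sg)
```

A vanishing value signals the paramagnetic solution, while a strictly positive value signals non-zero local magnetizations.
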
The order parameter

$$q_{\rm EA} = \frac{1}{N} \sum_i m_i^2 \qquad (13)$$

is used to locate this phase. The critical temperature $T_{\rm SG}$, where $q_{\rm EA}$ becomes larger than zero, seems to be independent of message passing details, like damping or the use of gauge fixing for simultaneous updates of fields.

In figure 12 we show the free energy and the $q_{\rm EA}$ parameter of the solutions found by Double Loop, HAK and GBP PTC for two typical realizations of an $N = 16 \times 16$ EA system with bimodal interactions. The free energy of the dual approximation is subtracted to highlight the differences with respect to the paramagnetic solution. The figure shows that HAK and Double Loop do find the same spin glass solution that GBP PTC finds when going down in temperature. This solution is actually lower in free energy when it appears, but at even lower temperatures it becomes subdominant compared to the paramagnetic one. GBP PTC keeps finding the spin glass solutions while Double Loop and HAK switch back to the paramagnetic one. This is an interesting feature of Double Loop and in particular of HAK, which is a fast message passing algorithm. By returning to the dual (paramagnetic) solution, HAK also ensures its convergence at low temperature [26], while GBP PTC gets lost in the irrelevant (and physically wrong) spin glass solution, and eventually stops converging. However, note that DL and HAK may stop finding the SG solution when this solution is still the one with lower free energy. Moreover, the lack of convergence of GBP can be used as a warning that something wrong is happening with the CVM approximation, something that is impossible to understand by looking at the behavior of provably convergent algorithms.

In figure 13 we compare the running times of Double Loop (libDAI [32]), HAK and GBP PTC (our implementation) for the two systems of figure 12.
As expected, Double Loop is much slower than the message passing heuristics of HAK and GBP (note the log scale in the time axis). The peaks in running times correspond to the transition points from paramagnetic to spin glass solution. Double Loop and HAK have two peaks, the second corresponding to the transition back to the paramagnetic solution, while GBP PTC has only the first peak.

VII. SUMMARY AND CONCLUSIONS

We studied the properties of the Generalized Belief Propagation algorithm derived from a Cluster Variational Method approximation to the free energy of the Edwards-Anderson model in 2D at the level of plaquettes. We compared the results obtained by Parent-to-Child GBP with the ones obtained by the Dual (paramagnetic) algorithm [26], by the HAK Two-Ways algorithm [19] and by the provably convergent Double-Loop algorithm [19].

We found that the plaquette-CVM approximation (using Parent-to-Child GBP) is far richer than the Bethe (BP) approximation in the 2D EA model. BP converges only at high temperatures (above $T_{\rm Bethe} = 1/\beta_{\rm Bethe} = 1.51$), and in such a case it treats the system as a set of independent pairs of linked spins. GBP, on the other hand, makes a better prediction of the paramagnetic behavior of the model at high T, since it implements a message passing of correlation fields flowing from plaquettes to links in the graph. Furthermore, with GBP the paramagnetic phase is extended to temperatures below $T_{\rm Bethe} = 1.51$, until $T_{\rm SG} = 1/\beta_{\rm SG} \simeq 1.27$, where spin glass solutions appear in the single instance implementation of the message passing algorithm. In contrast to the Bethe approximation, GBP is able to find spin glass solutions, and the standard message passing stops converging near $T_{\rm conv} \simeq 1$.

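Since the text quotes some thresholds as inverse temperatures $\beta$ and others as temperatures $T = 1/\beta$, a short arithmetic check may help when comparing them. The sketch below, in plain Python, only converts the inverse temperatures quoted in this paper; the dictionary labels are informal names chosen here. Small differences in the last digit with respect to the quoted temperatures presumably come from rounding of the quoted $\beta$ values.

```python
# Inverse temperatures as quoted in the text.
inverse_temperatures = {
    "Bethe": 0.66,    # BP stops converging
    "CVM-GF": 0.78,   # gauge fixed average case (population dynamics)
    "SG": 0.79,       # spin glass solutions appear on single instances
    "conv": 0.96,     # standard GBP stops converging
    "conv-GF": 1.2,   # gauge fixed GBP stops converging
    "CVM": 1.22,      # average case prediction without gauge fixing
}

# Corresponding temperatures T = 1 / beta.
temperatures = {name: 1.0 / beta for name, beta in inverse_temperatures.items()}

# Print thresholds from the highest temperature to the lowest.
for name, beta in sorted(inverse_temperatures.items(), key=lambda item: item[1]):
    print(f"{name:8s} beta = {beta:4.2f}   T = {temperatures[name]:5.3f}")
```

In temperature units the ordering is reversed: the gauge fixed non-convergence point and the average case prediction both sit near $T \approx 0.82$ to $0.83$, well below $T_{\rm SG} \simeq 1.27$.
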
The average case calculation of the stability of the paramagnetic solution in the CVM approximation predicted that non-paramagnetic (spin glass) solutions should appear at a lower temperature, $T_{\rm CVM} = 1/\beta_{\rm CVM} \simeq 0.82$. This average case result does not coincide with the single instance behavior of the standard GBP, since it fails to mark both the point where GBP starts finding spin glass solutions, $T_{\rm SG}$, and the point where GBP stops converging, $T_{\rm conv}$.

However, the non-convergence of GBP is not a feature of the CVM approximation, and is liable to change from one implementation of the message passing to another. We showed that, by fixing a hidden gauge invariance in the message passing equations, a simultaneous update of all cavity fields pointing to a single spin in the lattice improves the convergence of the algorithm, without drastically changing its speed. Using the gauge fixed GBP, the non-convergence inverse temperature is moved to $\beta_{\rm conv\text{-}GF} \simeq 1.2$ (that is, $T_{\rm conv\text{-}GF} \simeq 0.83$), quite close to the average case prediction $T_{\rm CVM} \simeq 0.82$ (whether this is only a coincidence is still not clear). Most importantly, the average case computation (population dynamics) with the gauge fixed identifies the same SG critical temperature, $T_{\rm CVM\text{-}GF} \simeq 1.28$, as measured on single samples (where $T_{\rm SG} \simeq 1.27$).

Finally, we compared the fixed point solutions found by the GBP message passing with those found by the provably convergent Double-Loop algorithm and the message passing heuristic of the Two-Ways algorithm of [19]. All the algorithms find the same paramagnetic solutions at high T, while below $T_{\rm SG}$ they find a spin glass solution, in the sense that local magnetizations are non-zero, while the global magnetization is null.
Decreasing the temperature, Double-Loop and HAK switch back from the spin glass to the paramagnetic solution, at the cost of a factor of $10^2$ to $10^3$ and of $10$ to $10^2$ respectively in running time, compared to GBP. Furthermore, the paramagnetic solution can always be found fast by the Dual algorithm of [26], making these two algorithms (Double-Loop and HAK) unnecessarily slow.

Although the thermodynamics of the 2D EA model is paramagnetic, at low temperatures the correlation length grows until it eventually surpasses $L/2$, therefore being effectively infinite for any finite size 2D system. In such a situation the non-paramagnetic solutions obtained by GBP can account for long range correlations, and presumably give better estimates for the correlations among spins than the paramagnetic solution obtained by HAK and Double Loop. Establishing the previous claim requires a detailed study of the quality of the CVM approximation at low temperatures (in the non-paramagnetic range) and its connections to the statics and dynamics of the 2D Edwards-Anderson model, which is already under study. Application of CVM and GBP message passing to the Edwards-Anderson model in 3D is also appealing, since this model does have a spin glass behavior at low temperature.

[1] H. A. Bethe, Proc. R. Soc. A 150, 552 (1935).
[2] R. Kikuchi, Phys. Rev. 81, 988 (1951).
[3] M. Mézard and G. Parisi, Eur. Phys. J. B 20, 217 (2001).
[4] J. Pearl, Probabilistic Reasoning in Intelligent Systems (2nd ed.) (Morgan Kaufmann, San Francisco, CA, 1988).
[5] F. R. Kschischang, B. J. Frey, and H. Loeliger, IEEE Trans. Inform. Theory 47, 498 (1998).
[6] M. Mézard and R. Zecchina, Phys. Rev. E 66, 056126 (2002).
[7] M. Mézard, G. Parisi, and R. Zecchina, Science 297, 812 (2002).
[8] M. Mézard and G. Parisi, J. Stat. Phys. 111, 1 (2003).
[9] G. Parisi, M. Mézard, and M. A. Virasoro, Spin Glass Theory and Beyond (World Scientific, Singapore, 1987).
[10] R. Mulet, A. Pagnani, M. Weigt, and R. Zecchina, Phys. Rev. Lett. 89, 268701 (2002).
[11] A. Braunstein, R. Mulet, A. Pagnani, M. Weigt, and R. Zecchina, Phys. Rev. E 68, 036702 (2003).
[12] D. Achlioptas, A. Naor, and Y. Peres, Nature 435, 759 (2005).
[13] F. Krzakala, A. Montanari, F. Ricci-Tersenghi, G. Semerjian, and L. Zdeborova, Proc. Nat. Acad. Sci. 104, 10318 (2007).
[14] A. Montanari, F. Ricci-Tersenghi, and G. Semerjian, J. Stat. Mech., P04004 (2008).
[15] A. Montanari, F. Ricci-Tersenghi, and G. Semerjian, in Proceedings of the 45th Annual Allerton Conference on Communication, Control, and Computing (2007) pp. 352–359.
[16] F. Ricci-Tersenghi and G. Semerjian, J. Stat. Mech., P09001 (2009).
[17] J. Yedidia, W. T. Freeman, and Y. Weiss, IEEE Trans. Inform. Theory 51, 2282 (2005).
[18] Y. S.-K. Eye and A. L. Yuille, Neural Comput. 14, 2002 (2001).
[19] T. Heskes, C. A. Albers, and H. J. Kappen, UAI-03, 313 (2003).
[20] K. Tanaka, J. Inoue, and D. M. Titterington, in XIII Proceedings of the 2003 IEEE Signal Processing Society Workshop (17-19 September, 2003) (IEEE Computer Society Press, 2003) pp. 329–338.
[21] C. A. Albers, T. Heskes, and H. J. Kappen, Genetics 177, 1101 (2007).
[22] H. J. Kappen, in Modeling Bio-medical signals (World Scientific, 2002) pp. 3–16.
[23] K. Tanaka and T. Morita, Phys. Lett. A 203, 122 (1995).
[24] T. Jorg, J. Lukic, E. Marinari, and O. Martin, Phys. Rev. Lett. 96, 237205 (2006).
[25] C. K. Thomas and A. A. Middleton, Phys. Rev. E 80, 046708 (2009).
[26] A. Lage-Castellanos, R. Mulet, F. Ricci-Tersenghi, and T. Rizzo, Phys. Rev. E, to appear (2011).
[27] T. Rizzo, A. Lage-Castellanos, R. Mulet, and F. Ricci-Tersenghi, J. Stat. Phys. 139, 375 (2010).
[28] A. Pelizzola, J. Phys. A 38, R309 (2005).
[29] J. M. Mooij and H. J. Kappen, J. Stat. Mech., P11012 (2005).
[30] The Bethe temperature $T_{\rm Bethe}$ is the one at which a non-trivial spin glass solution appears for a random regular Bethe lattice with connectivity K = 4. The Bethe lattice looks locally like a tree.
[31] M. Weigt, R. A. White, H. Szurmant, J. A. Hoch, and T. Hwa, Proc. Nat. Acad. Sci. 36, 11096 (2003).
[32] J. M. Mooij, J. Mach. Learn. Res. 11, 2169 (2010).
