Characterizing and Improving Generalized Belief Propagation Algorithms on the 2D Edwards-Anderson Model
We study the performance of different message passing algorithms in the two dimensional Edwards Anderson model. We show that the standard Belief Propagation (BP) algorithm converges only at high temperature to a paramagnetic solution. Then, we test a…
Authors: E. Dominguez, A. Lage-Castellanos, R. Mulet
Eduardo Domínguez, Alejandro Lage-Castellanos, Roberto Mulet
Department of Theoretical Physics and "Henri-Poincaré-Group" of Complex Systems, Physics Faculty, University of Havana, La Habana, CP 10400, Cuba.

Federico Ricci-Tersenghi
Dipartimento di Fisica, INFN – Sezione di Roma 1 and CNR – IPCF, UOS di Roma, Università La Sapienza, P.le A. Moro 5, 00185 Roma, Italy

Tommaso Rizzo
Dipartimento di Fisica and CNR – IPCF, UOS di Roma, Università La Sapienza, P.le A. Moro 5, 00185 Roma, Italy

(Dated: September 20, 2018)

We study the performance of different message passing algorithms in the two dimensional Edwards Anderson model. We show that the standard Belief Propagation (BP) algorithm converges only at high temperature to a paramagnetic solution. Then, we test a Generalized Belief Propagation (GBP) algorithm, derived from a Cluster Variational Method (CVM) at the plaquette level. We compare its performance with BP and with other algorithms derived under the same approximation: Double Loop (DL) and a two-ways message passing algorithm (HAK). The plaquette-CVM approximation improves BP in at least three ways: the quality of the paramagnetic solution at high temperatures, a better (lower) estimate for the critical temperature, and the fact that the GBP message passing algorithm converges also to non paramagnetic solutions. The lack of convergence of the standard GBP message passing algorithm at low temperatures seems to be related to the implementation details and not to the appearance of long range order. In fact, we prove that a gauge invariance of the constrained CVM free energy can be exploited to derive a new message passing algorithm which converges at even lower temperatures.
Throughout its region of convergence this new algorithm is faster than HAK and DL by some orders of magnitude.

I. INTRODUCTION

The 2D Edwards-Anderson (EA) model in statistical mechanics is defined by a set $\sigma = \{s_1 \ldots s_N\}$ of $N$ Ising spins $s_i = \pm 1$ placed on the nodes of a 2D square lattice, and random interactions $J_{i,j}$ at the edges, with a Hamiltonian

$$ H(\sigma) = -\sum_{\langle i,j \rangle} J_{i,j}\, s_i s_j $$

where $\langle i,j \rangle$ runs over all couples of neighboring spins (first neighbors on the lattice). The $J_{i,j}$ are the magnetic exchange couplings between spins and are supposed fixed for any given instance of the system, and the spins $s_i$ are the dynamic variables. We will focus on one of the most common disorder types, the bimodal interactions $J = \pm 1$ with equal probabilities.

The statistical mechanics of the EA model, at a temperature $T = 1/\beta$, is given by the Gibbs-Boltzmann distribution

$$ P(\sigma) = \frac{e^{-\beta H(\sigma)}}{Z} \qquad \text{where} \qquad Z = \sum_{\sigma} e^{-\beta H(\sigma)} $$

The direct computation of the partition function $Z$, or of any marginal probability distribution like $p(s_i, s_j) = \sum_{\sigma \setminus s_i, s_j} P(\sigma)$, is a time consuming task, unattainable in general, and therefore an approximation is required. We are interested in fast algorithms for inferring such marginal distributions.

Actually, for the 2D EA model, thanks to the graph planarity, algorithms computing $Z$ in a time polynomial in $N$ exist. However, we are interested in very fast (i.e. linear in $N$) algorithms that can also be used for more general models, e.g. the EA model in a field or defined on a 3D cubic lattice. For these more general cases a polynomial algorithm is very unlikely to exist and some approximations are required.

A simple and effective mean field approximation is the one due to Bethe [1], in which the marginals over the dynamic variables, like $p(s_i)$, are obtained from the minimization of a variational free energy in a self consistent way.
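To make the cost of the direct computation concrete, the following is a minimal sketch, not taken from the paper, that evaluates $Z$ and one pair marginal by brute-force enumeration on a tiny lattice. The lattice size, the open boundary conditions, the random seed and all function names are illustrative assumptions; the loop over $2^N$ configurations is exactly what becomes unattainable as $N$ grows.

```python
import itertools
import math
import random


def make_couplings(L, rng):
    """Bimodal couplings J = +/-1 on an L x L square lattice (open boundaries, assumed)."""
    J = {}
    for x in range(L):
        for y in range(L):
            i = x * L + y
            if x + 1 < L:
                J[(i, (x + 1) * L + y)] = rng.choice((-1, 1))
            if y + 1 < L:
                J[(i, x * L + y + 1)] = rng.choice((-1, 1))
    return J


def energy(spins, J):
    """H(sigma) = - sum over links <i,j> of J_ij s_i s_j."""
    return -sum(Jij * spins[i] * spins[j] for (i, j), Jij in J.items())


def exact_pair_marginal(L, J, beta, i, j):
    """Return Z and p(s_i, s_j) by summing over all 2^(L*L) configurations."""
    Z = 0.0
    p = {(a, b): 0.0 for a in (-1, 1) for b in (-1, 1)}
    for spins in itertools.product((-1, 1), repeat=L * L):
        w = math.exp(-beta * energy(spins, J))
        Z += w
        p[(spins[i], spins[j])] += w
    return Z, {key: value / Z for key, value in p.items()}


rng = random.Random(0)  # fixed seed, only for reproducibility of the sketch
L = 3                   # 2^9 = 512 configurations; the cost doubles with every added spin
J = make_couplings(L, rng)
Z, p01 = exact_pair_marginal(L, J, beta=0.5, i=0, j=1)
```

Without an external field the result is symmetric under a global spin flip, so `p01[(1, 1)]` and `p01[(-1, -1)]` coincide up to rounding.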
The Bethe approximation is exact for a model without loops in the interaction network, which unfortunately is far from being the usual case in physics. In the context of finite dimensional lattices, Kikuchi [2] derived an extension of this approximation to larger groups of variables, which accounts for short loops exactly, and is usually referred to as the Cluster Variational Method (CVM).

The interest in spin glasses, with quenched random disorder, brought a new testing ground for both approximations. In particular, the Bethe approximation (exact on trees) has been the starting point of many useful theoretical and applied developments. It is at the basis of the cavity method, which allows a restatement of replica theory in probabilistic terms for finite connectivity systems [3]. The Bethe approximation is connected to well known algorithms in computer science, namely Belief Propagation [4] and the sum-product algorithm [5]. A major achievement of this confluence between computer science and statistical mechanics has been the conception of the Survey Propagation algorithm [6, 7], inspired by the cavity method and the replica symmetry breaking [3, 8, 9], which shows great performance on hard optimization problems [6, 7, 10, 11]. Statistical mechanics clarified the relation between phase transitions and easy-hard transitions in optimization problems, and allowed the statistical characterization of the onset of the hard phase [12–14], as well as the analytical description of search algorithms based on BP [15, 16].

The correctness of the Bethe approximation and the related algorithms is, however, linked to the lack of topological correlations in the interactions (random graphs are locally tree-like), since the approximation is exact only on tree topologies.
This is a strong limitation for physical purposes, since tree topologies or random graphs are not the common situation. The Bethe approximation performs poorly in finite dimensional lattices, and the associated algorithms are usually non convergent at low temperatures.

Recently the Cluster Variational Method (CVM) has been reformulated in a broader probabilistic framework called region-based approximations to free energy [17] and connected to a Generalized Belief Propagation (GBP) algorithm to find the stationary points of the free energy. It extends the Bethe approximation by considering correlations in larger regions, allowing, in principle, short loops to be taken into account accurately. In [17] it was shown that stable fixed points of the GBP message passing algorithm correspond to stationary points of the approximated CVM free energy, while the converse is not necessarily true. Furthermore, the GBP message passing is not guaranteed to converge at all. Prompted by this lack of convergence, a new kind of provably convergent algorithms for minimizing the CVM approximated free energy, known as Double Loop (DL) algorithms [18, 19], has been developed, at the cost of a drastic drop in speed.

GBP has been applied in the last decade to inference problems [20–22], consistently outperforming BP. In particular, the image reconstruction problems [20, 23] are based on a 2D lattice structure but, at variance with the 2D EA model, the interactions among nearby spins (pixels) are ferromagnetic, and the damaged image is used as an external field. Both factors help the convergence of GBP algorithms. An analysis of the CVM approximation using GBP algorithms on single instances of finite dimensional disordered models of physical interest, like the EA model, has not been done so far.
The Edwards Anderson model in 2D has been extensively studied by other methods (see [24, 25] and references therein), suggesting that it remains paramagnetic all the way down to zero temperature, lacking any thermodynamic transition at any finite $T$, although at low $T$ there are metastable states of very long lifetime, leading to very slow dynamics. Based on this fact, a paramagnetic version of GBP on the 2D EA model was studied recently in [26]. The connection of CVM with the replica trick and a Generalized Survey Propagation have been presented recently [27]. However, the implementation of the latter algorithm on finite dimensional lattices is computationally very demanding, and should be preceded by the study of the original CVM approximation and GBP algorithm.

In this paper we study the convergence properties of the GBP message passing algorithm and the performance of the CVM approximation on the 2D EA model. After the introduction of the region-based free energy in Sec. II and the message passing algorithm in terms of cavity fields, we compute the critical temperature $T_{\rm CVM} \simeq 0.82$ (inverse temperature $\beta_{\rm CVM} \simeq 1.22$) of the plaquette-CVM approximation in Sec. III, improving the Bethe estimate $T_{\rm Bethe} \simeq 1.51$ ($\beta_{\rm Bethe} \simeq 0.66$) by roughly a factor 2. The CVM average case temperature, however, does not clearly correspond to the single instance behavior of the GBP message passing algorithm, as is shown in Sec. IV. At variance with Belief Propagation, GBP converges to spin glass solutions (below $T_{\rm SG} \simeq 1.27$, i.e. above $\beta_{\rm SG} \simeq 0.79$), and stops converging near $T \simeq 1.0$, before the average case prediction $T_{\rm CVM}$. In Sec. V we show that this convergence problem depends on the implementation details of the message passing algorithm, and can be improved by a simultaneous update of messages.
In order to do so the gauge invariance of the message passing equations has to be fixed. In Sec. VI we compare the solutions and the performance of GBP with 3 other algorithms for the minimization of the CVM free energy: Double Loop [19], Two-Ways Message Passing [19], and the Dual algorithm [26]. In terms of the CVM free energy, the paramagnetic solution is in general the one to be chosen, except for a small interval in temperatures where the spin glass solution has a lower free energy. Our results are summarized in Sec. VII.

II. GENERALIZED BELIEF PROPAGATION ON EA 2D

Given that a detailed derivation of the plaquette-GBP message passing equations for the 2D Edwards Anderson model was presented in [26], here we only summarize such derivation, skipping unnecessary details.

The idea of the region-based free energy approximation [17, 28] is to mimic the exact (Boltzmann-Gibbs) distribution $P(\sigma)$ by a reduced set of its marginals. A hierarchy of approximations is given by the size of such marginals, starting with the set of all single spin marginals $p_i(s_i)$ (mean field), then moving on to all neighboring site marginals $p(s_i, s_j)$ (Bethe approximation), then to all square plaquette marginals $p(s_i, s_j, s_k, s_l)$, and so on. Since the only way of knowing such marginals exactly is the unattainable computation of $Z$, the method aims to approximate them by a set of beliefs $b_i(s_i)$, $b_L(s_i, s_j)$, $b_{\mathcal{P}}(s_i, s_j, s_k, s_l)$, etc. obtained from a minimization of a region based free energy.
Following the derivation done in [26], the plaquette level approximated free energy for the 2D EA model is given as a contribution of all Plaquettes, Links and Spins in the 2D lattice:

$$
\begin{aligned}
-\beta F ={}& \sum_{\mathcal{P}} \sum_{\sigma_{\mathcal{P}}} b_{\mathcal{P}}(\sigma_{\mathcal{P}}) \log \frac{b_{\mathcal{P}}(\sigma_{\mathcal{P}})}{\exp(-\beta E_{\mathcal{P}}(\sigma_{\mathcal{P}}))} && \text{(Plaquettes)} \\
&- \sum_{L} \sum_{\sigma_L} b_L(\sigma_L) \log \frac{b_L(\sigma_L)}{\exp(-\beta E_L(\sigma_L))} && \text{(Links)} \\
&+ \sum_{i} \sum_{s_i} b_i(s_i) \log \frac{b_i(s_i)}{\exp(-\beta E_i(s_i))} && \text{(Spins)}
\end{aligned}
\tag{1}
$$

where the symbol $\sigma_R = (s_1, \ldots, s_k)$ stands for the set of spins in region $R$, while $E_R(\sigma_R) = -\sum_{\langle i,j \rangle \in R} J_{i,j}\, s_i s_j$ stands for the energy contribution in that region. The energy term $E_i(s_i)$ in the spins contribution is only relevant when an external field acts over spins, and will be neglected from now on.

FIG. 1. Schematic representation of belief equations (2). Lagrange multipliers are depicted as arrows, going from parent regions to children regions. Italic capital letters are used to denote Plaquettes, simple capital letters denote Links, and lower case letters denote Spins.

An unrestricted minimization of the free energy (1) in terms of its beliefs produces incongruent results. Beliefs are only meaningful as an approximation to the correct marginals if they obey the marginalization constraints $b_i(s_i) = \sum_{s_j} b_L(s_i, s_j)$ and $b_L(s_i, s_j) = \sum_{s_k, s_l} b_{\mathcal{P}}(s_i, s_j, s_k, s_l)$. This marginalization is enforced by the introduction of Lagrange multipliers (see [17] for a general introduction, and [26] for this particular case) in the free energy expression. There is one Lagrange multiplier $\mu_{L \to i}(s_i)$ for every link $L$ and spin $i \in L$, and a Lagrange multiplier $\nu_{\mathcal{P} \to L}(s_i, s_j)$ for each plaquette $\mathcal{P}$ and link $L \in \mathcal{P}$. In terms of these Lagrange multipliers, the stationary condition of the approximated free energy is achieved with

$$
\begin{aligned}
b_i(s_i) &= \frac{1}{Z_i} \exp\Big( -\beta E_i(s_i) - \sum_{L \supset i}^{4} \mu_{L \to i}(s_i) \Big), \\
b_L(\sigma_L) &= \frac{1}{Z_L} \exp\Big( -\beta E_L(\sigma_L) - \sum_{\mathcal{P} \supset L}^{2} \nu_{\mathcal{P} \to L}(\sigma_L) - \sum_{i \subset L}^{2} \sum_{\substack{L' \supset i \\ L' \neq L}}^{3} \mu_{L' \to i}(s_i) \Big), \\
b_{\mathcal{P}}(\sigma_{\mathcal{P}}) &= \frac{1}{Z_{\mathcal{P}}} \exp\Big( -\beta E_{\mathcal{P}}(\sigma_{\mathcal{P}}) - \sum_{L \subset \mathcal{P}}^{4} \sum_{\substack{\mathcal{P}' \supset L \\ \mathcal{P}' \neq \mathcal{P}}}^{1} \nu_{\mathcal{P}' \to L}(\sigma_L) - \sum_{i \subset \mathcal{P}}^{4} \sum_{\substack{L \supset i \\ L \not\subset \mathcal{P}}}^{2} \mu_{L \to i}(s_i) \Big).
\end{aligned}
\tag{2}
$$

A graphical representation of these equations is given in figure 1. Lagrange multipliers are shown as arrows going from parent regions to children. Take, for instance, the middle equation for the belief in link regions $b_L(\sigma_L) = b_L(s_i, s_j)$. The sum of the two Lagrange multipliers $\nu_{\mathcal{P} \to L}(s_i, s_j)$ corresponds to the triple arrows on both sides of the link in the central diagram of figure 1, while the two sums over three messages $\mu_{L' \to i}(s_i)$ correspond to the three arrows acting over the top ($j$) and bottom ($i$) spins, respectively.

In equations (2), the $Z_R$ are normalization constants. The terms $E_{\mathcal{P}}(\sigma_{\mathcal{P}}) = E_{\mathcal{P}}(s_i, s_j, s_k, s_l) = -(J_{i,j} s_i s_j + J_{j,k} s_j s_k + J_{k,l} s_k s_l + J_{l,i} s_l s_i)$ and $E_L(s_i, s_j) = -J_{i,j} s_i s_j$ are the corresponding energies in plaquettes and links respectively, and are represented in the diagram by the lines (interactions) between circles (spins). The single-spin energies $E_i(s_i)$ are zero since no field is acting upon spins.

FIG. 2. Message passing equations (5) and (6), shown schematically. Messages are depicted as arrows, going from parent regions to children regions. On any link $J_{i,j}$, represented as bold lines between spins (circles), a Boltzmann factor $e^{\beta J_{i,j} s_i s_j}$ exists. Dark circles represent spins to be traced over. Messages from plaquettes to links $\nu_{\mathcal{P} \to L}(s_i, s_j)$ are represented by a triple arrow, because they can be written in terms of three parameters $U$, $u_i$ and $u_j$, defining the correlation $\langle s_i s_j \rangle$ and magnetizations $\langle s_i \rangle$ and $\langle s_j \rangle$, respectively.
The Lagrange multipliers can be parametrized in terms of cavity fields $u$ and $(U, u_a, u_b)$ as

$$ -\mu_{L \to i}(s_i) = \beta\, u_{L \to i}\, s_i \tag{3} $$

$$ -\nu_{\mathcal{P} \to L}(s_i, s_j) = \beta \left( U_{\mathcal{P} \to L}\, s_i s_j + u_{\mathcal{P} \to i}\, s_i + u_{\mathcal{P} \to j}\, s_j \right) \tag{4} $$

In particular, the field $u_{L \to i}$ corresponds to the cavity field in the Bethe approximation [17]. The choice of this parametrization is the reason for the use of single and triple arrows in figures 1 and 2: the messages going from plaquettes to links are characterized by three fields $(U_{\mathcal{P} \to L}, u_{\mathcal{P} \to i}, u_{\mathcal{P} \to j})$, and the capital $U_{\mathcal{P} \to L}$ acts as an effective interaction term.

The Lagrange multipliers are related to each other by the constraints they are supposed to impose (see [26]). In terms of the cavity fields and using the notation in figure 2, Link-to-Spin cavity fields shall be related by

$$ u_{L \to i} = \hat u\big( u_{\mathcal{P} \to i} + u_{\mathcal{L} \to i},\; U_{\mathcal{P} \to L} + U_{\mathcal{L} \to L} + J_{ij},\; u_{\mathcal{P} \to j} + u_{\mathcal{L} \to j} + u_{A \to j} + u_{B \to j} + u_{U \to j} \big), \tag{5} $$

where

$$ \hat u(u, U, h) \equiv u + \frac{1}{2\beta} \log \frac{\cosh \beta (U + h)}{\cosh \beta (U - h)} $$

Note that the usual cavity equation for fields in the Bethe approximation [3] is recovered if all contributions from plaquettes $\mathcal{P}$ and $\mathcal{L}$ are set to zero.
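The Bethe limit mentioned above can be checked numerically. The sketch below is only an illustration with hypothetical function names: it implements $\hat u(u, U, h)$ as defined after equation (5) and compares it, with the plaquette contributions set to zero ($u = 0$, $U = J$), against the familiar cavity form $\frac{1}{\beta}\,\mathrm{arctanh}[\tanh(\beta J)\tanh(\beta h)]$, which follows from the identity $\cosh(a+b)/\cosh(a-b) = (1 + \tanh a \tanh b)/(1 - \tanh a \tanh b)$.

```python
import math


def u_hat(u, U, h, beta):
    """u + (1 / (2 beta)) * log[cosh(beta (U + h)) / cosh(beta (U - h))], as after Eq. (5)."""
    log_ratio = (math.log(math.cosh(beta * (U + h)))
                 - math.log(math.cosh(beta * (U - h))))
    return u + log_ratio / (2.0 * beta)


def bethe_cavity(J, h, beta):
    """Standard Bethe cavity field: (1 / beta) * atanh(tanh(beta J) * tanh(beta h))."""
    return math.atanh(math.tanh(beta * J) * math.tanh(beta * h)) / beta


# With no plaquette contributions (u = 0, U = J) both expressions agree.
beta = 0.7
difference = u_hat(0.0, 1.0, 0.3, beta) - bethe_cavity(1.0, 0.3, beta)
```

The direct use of `math.cosh` is adequate for moderate arguments; a production implementation would need an overflow-safe form for large $\beta (U \pm h)$.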
Similarly, by imposing the marginalization of the beliefs at Plaquettes onto their children Links, we find the self consistent expression for the Plaquette-to-Link cavity fields:

$$
\begin{aligned}
U_{\mathcal{P} \to L} &= \hat U(\#) = \frac{1}{4\beta} \log \frac{K(1,1)\, K(-1,-1)}{K(1,-1)\, K(-1,1)} \\
u_{\mathcal{P} \to i} &= -u_{D \to i} + \hat u_i(\#) = u_{\mathcal{D} \to i} - u_{D \to i} + \frac{1}{4\beta} \log \frac{K(1,1)\, K(1,-1)}{K(-1,1)\, K(-1,-1)} \\
u_{\mathcal{P} \to j} &= -u_{U \to j} + \hat u_j(\#) = u_{\mathcal{U} \to j} - u_{U \to j} + \frac{1}{4\beta} \log \frac{K(1,1)\, K(-1,1)}{K(1,-1)\, K(-1,-1)}
\end{aligned}
\tag{6}
$$

where

$$
\begin{aligned}
K(s_i, s_j) = \sum_{s_k, s_l} \exp \beta \Big[ & (U_{\mathcal{U} \to U} + J_{jk})\, s_j s_k + (U_{\mathcal{R} \to R} + J_{kl})\, s_k s_l + (U_{\mathcal{D} \to D} + J_{li})\, s_l s_i \\
& + (u_{\mathcal{U} \to k} + u_{C \to k} + u_{E \to k} + u_{\mathcal{R} \to k})\, s_k + (u_{\mathcal{R} \to l} + u_{F \to l} + u_{G \to l} + u_{\mathcal{D} \to l})\, s_l \Big]
\end{aligned}
$$

and the symbol $\#$ stands for all incoming fields in the right hand side of the equations. The functions $\hat u(u, U, h)$ and $[\hat U(\#), \hat u_i(\#), \hat u_j(\#)]$ will be used in the next section for the average case calculation.

For a given system of size $N$ (number of spins) there are $2N$ Links and $N$ square plaquettes, and therefore there are $4N$ Plaquette-to-Link fields $[U_{\mathcal{P} \to L}, u_{\mathcal{P} \to i}, u_{\mathcal{P} \to j}]$, and $4N$ Link-to-Spin fields $u_{L \to i}$. At the stationary points of the free energy their values are related by the set of $4N + 4N$ equations (5) and (6). These $4N + 4N$ self-consistent equations are also called message-passing equations when they are used as update rules for fields in the message passing algorithm, or cavity iteration equations in the context of cavity calculations. The field notation is more comprehensible than the original Lagrange multipliers notation, and has a clear physical meaning: each plaquette is telling its children links that they should add an effective interaction term $U_{\mathcal{P} \to L}$ to the direct interaction $J_{i,j}$, due to the fact that spins $s_i$ and $s_j$ are also interacting through the other three links in the plaquette.
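As an illustration of equation (6), the following sketch (with assumed names and argument conventions, not the authors' implementation) evaluates $K(s_i, s_j)$ and the three logarithmic terms for one plaquette. The caller is assumed to have already summed the incoming fields: `Ueff` holds the effective couplings $U + J$ on links $U$, $R$ and $D$, while `hk` and `hl` are the total small-$u$ fields acting on spins $k$ and $l$. The linear terms of equation (6), such as $u_{\mathcal{D} \to i} - u_{D \to i}$, are not included and must be added separately.

```python
import itertools
import math


def K(si, sj, beta, Ueff, hk, hl):
    """K(s_i, s_j) of Eq. (6): trace over the two spins s_k, s_l opposite to the link."""
    Ujk, Ukl, Uli = Ueff  # effective couplings (incoming U plus J) on links U, R, D
    total = 0.0
    for sk, sl in itertools.product((-1, 1), repeat=2):
        total += math.exp(beta * (Ujk * sj * sk + Ukl * sk * sl + Uli * sl * si
                                  + hk * sk + hl * sl))
    return total


def plaquette_to_link(beta, Ueff, hk, hl):
    """Return (U_hat, du_i, du_j): only the log-ratio terms of Eq. (6)."""
    k = {(a, b): K(a, b, beta, Ueff, hk, hl) for a in (-1, 1) for b in (-1, 1)}
    U_hat = math.log(k[1, 1] * k[-1, -1] / (k[1, -1] * k[-1, 1])) / (4 * beta)
    du_i = math.log(k[1, 1] * k[1, -1] / (k[-1, 1] * k[-1, -1])) / (4 * beta)
    du_j = math.log(k[1, 1] * k[-1, 1] / (k[1, -1] * k[-1, -1])) / (4 * beta)
    return U_hat, du_i, du_j


# With all small-u fields set to zero, U_hat reduces to
# (1 / beta) * atanh of the product of the three tanh(beta * Ueff) factors.
beta = 0.8
Ueff = (1.2, -0.7, 0.9)
U_hat, du_i, du_j = plaquette_to_link(beta, Ueff, 0.0, 0.0)
```

When `hk = hl = 0` the two field terms vanish and only the effective interaction survives, which is the paramagnetic ansatz used later in the average case analysis.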
The terms $u_i$ act like magnetic fields upon spins, and the complete $\nu(s_i, s_j)$-message is characterized by the triplet $(U_{i,j}, u_i, u_j)$.

III. CRITICAL TEMPERATURE OF PLAQUETTE-CVM APPROXIMATION

In this section we revisit the method used in [27] to compute the critical temperature at which the CVM approximation develops a spin glass phase. By spin glass phase we mean a phase characterized by non zero local magnetizations $m_i = \tanh\big(\beta \sum_{L}^{4} u_{L \to i}\big)$ and nearly zero total magnetization $m = \frac{1}{N} \sum_i m_i \simeq 0$ (remember that we work with no external field). The 2D EA model is paramagnetic down to zero temperature, but spin glass like solutions can appear in the CVM approximation due to its mean field character. We correct one of the conclusions reached in [27], where we failed to observe the appearance of the spin glass phase in the CVM approximation to the 2D Edwards Anderson model.

We follow an average case approach, which is similar in spirit to, but different from, the single instance stability analysis done in [29] for the Bethe approximation (Belief Propagation). The average case calculation is a mathematical technique developed in [3] to study the typical solutions of cavity equations in disordered systems, with a deep and fundamental connection to the replica trick [9]. When applied to the plaquette-CVM approximation [27], we end up with two equations, in which fields (messages) are now replaced by functions of fields $q(u)$ and $Q(U, u_1, u_2)$, and the interactions are averaged out. As a consequence of the homogeneity of the 2D lattice and the averaging over local disorder $J_{i,j}$, all plaquettes, links, and spins in the graph are now equivalent, and we only need to study one of them to characterize the whole system.
More precisely, the average case self consistent equation for the distribution $q(u)$ is given by

$$
\begin{aligned}
q(u_i) = \mathbb{E}_J \int & \mathrm{d}q(u_{A \to j})\, \mathrm{d}q(u_{B \to j})\, \mathrm{d}q(u_{U \to j}) \\
& \mathrm{d}Q(U_{\mathcal{P} \to L}, u_{\mathcal{P} \to i}, u_{\mathcal{P} \to j})\, \mathrm{d}Q(U_{\mathcal{L} \to L}, u_{\mathcal{L} \to i}, u_{\mathcal{L} \to j})\; \delta\big( u_i - \hat u(\#) \big)
\end{aligned}
\tag{7}
$$

with $\hat u(\#)$ as defined in the right hand side of equation (5), and

$$ \mathrm{d}f(x) \equiv f(x)\, \mathrm{d}x $$

The corresponding self-consistent equation for $Q(U, u_1, u_2)$ is

$$
\begin{aligned}
\iint Q(U, u_a, u_b)\, & q(u_i - u_a)\, q(u_j - u_b)\, \mathrm{d}u_a\, \mathrm{d}u_b = \\
= \mathbb{E}_J \int & \mathrm{d}q(u_{C \to k})\, \mathrm{d}q(u_{E \to k})\, \mathrm{d}q(u_{F \to l})\, \mathrm{d}q(u_{G \to l}) \\
& \mathrm{d}Q(U_{\mathcal{U} \to U}, u_{\mathcal{U} \to j}, u_{\mathcal{U} \to k})\, \mathrm{d}Q(U_{\mathcal{R} \to R}, u_{\mathcal{R} \to k}, u_{\mathcal{R} \to l})\, \mathrm{d}Q(U_{\mathcal{D} \to D}, u_{\mathcal{D} \to l}, u_{\mathcal{D} \to i}) \\
& \delta\big( U - \hat U(\#) \big)\, \delta\big( u_i - \hat u_i(\#) \big)\, \delta\big( u_j - \hat u_j(\#) \big)
\end{aligned}
\tag{8}
$$

where the notation corresponds to equation (6). In both equations (7) and (8) the expression $\mathbb{E}_J = \int \mathrm{d}J\, P(J) \ldots$ stands for the average over the quenched randomness.

At high temperatures we expect the fixed point equations (5) and (6) to yield a paramagnetic solution. Such a solution is characterized by Link-to-Site messages $u = 0$, and Plaquette-to-Link messages $(U, u_1, u_2) = (U, 0, 0)$. If we impose this ansatz on the fields, we recover the paramagnetic or dual algorithm of [26] for the single instance message passing, and the paramagnetic average case study of [27] for the average case. Let us remember that the 2D EA model is expected to have no thermodynamic transition at any finite temperature, and hence to remain paramagnetic all the way down to $T = 0$.

Following [27], in the average case the paramagnetic solution has the form

$$ q(u) = \delta(u) \qquad\qquad Q(U, u_1, u_2) = Q(U)\, \delta(u_1)\, \delta(u_2) $$

Equation (7) is always satisfied when $q(u) = \delta(u)$, for whatever $Q(U)$.
Equation (8) can be solved self-consistently for $Q(U)$:

$$
Q(U) = \mathbb{E}_J \int \mathrm{d}Q(U_U)\, \mathrm{d}Q(U_R)\, \mathrm{d}Q(U_D)\; \delta\Big( U - \frac{1}{\beta} \operatorname{arctanh}\big[ \tanh \beta (J_U + U_U)\, \tanh \beta (J_R + U_R)\, \tanh \beta (J_D + U_D) \big] \Big)
\tag{9}
$$

and the average free energy and all other relevant functions can be derived in terms of it (see [27]).

On the other hand, a general (not paramagnetic) solution of the average case equations (7) and (8) is very difficult to obtain, since it involves the deconvolution of the distributions $q(u)$ in the left hand side of eq. (8) in order to update $Q(U, u_1, u_2)$ by an iterative method. A critical temperature can be found, however, by an expansion in small $u$ around the paramagnetic solution. We can focus on the second moments of the distributions

$$ a = \int q(u)\, u^2\, \mathrm{d}u \qquad\qquad a_{ij}(U) = \iint Q(U, u_1, u_2)\, u_i u_j\, \mathrm{d}u_1\, \mathrm{d}u_2 $$

where $i, j \in \{1, 2\}$, and check whether the paramagnetic solution ($a = 0$ and $a_{ij}(U) = 0$) is locally stable. To do this we expand equations (7) and (8) to second order, and we obtain the following linearized equations:

$$
\begin{aligned}
a &= K_{a,a}\, a + \int \mathrm{d}U'\, K_{a,a_{11}}(U')\, a_{11}(U') + \int \mathrm{d}U'\, K_{a,a_{12}}(U')\, a_{12}(U') \\
a\, Q(U) + a_{11}(U) &= K_{a_{11},a}(U)\, a + \int \mathrm{d}U'\, K_{a_{11},a_{11}}(U, U')\, a_{11}(U') + \int \mathrm{d}U'\, K_{a_{11},a_{12}}(U, U')\, a_{12}(U') \\
a_{12}(U) &= K_{a_{12},a}(U)\, a + \int \mathrm{d}U'\, K_{a_{12},a_{11}}(U, U')\, a_{11}(U') + \int \mathrm{d}U'\, K_{a_{12},a_{12}}(U, U')\, a_{12}(U')
\end{aligned}
$$

The actual values of the $K_{a_x, a_y}$ come from the expansion in small $u$ of the original equations (see equation 90 in [27] for an example).

We cannot solve these equations analytically because we do not have an analytical expression of $Q(U)$ for the paramagnetic solution at all temperatures. By discretizing the values of $U$ uniformly in $(-U_{\max}, U_{\max})$, i.e.
$U = i\, \Delta U$ with $i \in [-I_{\max}, I_{\max}]$, we can transform the continuous set of equations to a system of the form

$$ \vec a = K(\beta) \cdot \vec a \tag{10} $$

where the vector of the second moments $\vec a = (a, a_{11}(U), a_{12}(U))$ has the form

$$
\vec a = \big( a,\; a_{11}(-U_{\max}),\, a_{11}(-U_{\max} + \Delta U),\, \ldots,\, a_{11}(U_{\max} - \Delta U),\, a_{11}(U_{\max}),\; a_{12}(-U_{\max}),\, a_{12}(-U_{\max} + \Delta U),\, \ldots,\, a_{12}(U_{\max} - \Delta U),\, a_{12}(U_{\max}) \big)
$$

$K(\beta)$ is the square matrix, with one row and one column per component of $\vec a$ (its largest blocks have size $(2 I_{\max} + 1) \times (2 I_{\max} + 1)$), that stands for the discrete representation of the integrals in the right hand side of the linearized equations, and depends on the inverse temperature via the solution $Q(U)$ of eq. (9). The paramagnetic solution $\vec a = 0$ always satisfies the homogeneous eq. (10). The stability criterion for the paramagnetic solution is the singularity of the Jacobian, $\det(I - K(\beta)) = 0$. When this condition is satisfied, a non paramagnetic solution continuously arises from the paramagnetic one, since a flat direction appears in the free energy.

FIG. 3. Determinant of the Jacobian $J = I - K(\beta)$ as a function of inverse temperature $\beta$ (horizontal axis: $\beta$; vertical axis: $\mathrm{Det}[J]$). The critical inverse temperature is $\beta_{\rm CVM} \simeq 1.22$.

Numerically, we worked with a discretization of $2 I_{\max} + 1 = 41$ points between $-U_{\max} = -3.5$ and $U_{\max} = 3.5$. The paramagnetic solution $Q(U)$ is found by solving eq. (9) with an iterative method at every temperature, and is then used to compute the elements of the $K(\beta)$ matrix. In figure 3 we show the determinant of the Jacobian matrix $J = I - K(\beta)$. The critical inverse temperature derived from this analysis is $\beta_{\rm CVM} \simeq 1.22$ for the appearance of a flat direction in the free energy. In [27] $\beta_{\rm CVM}$ was thought to be infinite (zero temperature) because an incomplete range of the values of $\beta$ was examined.

The critical temperature found here, $T_{\rm CVM} \simeq 0.82$, is below the Bethe critical temperature $T_{\rm Bethe} \simeq 1.51$ ($\beta_{\rm Bethe} \simeq 0.66$), and therefore improves the Bethe approximation by roughly a factor 2, since the 2D EA model is likely to remain paramagnetic at all finite temperatures. At variance with the Bethe approximation, the single instance behavior of the message passing is not so clearly related to the average case critical temperature, as we show in the next section.

IV. PERFORMANCE OF GBP ON 2D EA MODEL

Before studying GBP message passing for the plaquette-CVM approximation, let us check what happens to the simpler Bethe approximation and the corresponding message passing algorithm known as Belief Propagation (BP) in the 2D EA model. When running BP at high temperatures (above $T_{\rm Bethe} = 1/\beta_{\rm Bethe} \simeq 1.51$) in a typical instance of the model with bimodal interactions, we find the paramagnetic solution (given by all fields $u = 0$), and therefore the system is equivalent to a set of independent interacting pairs of spins, which is only correct at infinite temperature. The Bethe temperature $T_{\rm Bethe}$ (computed in the average case and exact on acyclic graphs [30]) seems to mark precisely the point where BP stops converging (see Fig. 4). Indeed, messages flow away from zero below $T_{\rm Bethe}$, and convergence of the BP message passing algorithm is not achieved anymore. So, the Bethe approximation is disappointing when applied to single instances of the Edwards Anderson model: either it converges to a paramagnetic solution at high temperatures, or it does not converge at all below $T_{\rm Bethe}$.

FIG. 4. Probability of convergence of BP and GBP on a 2D EA model, with random bimodal interactions, as a function of inverse temperature $\beta = 1/T$ (curves: GBP for sizes 16 x 16, 32 x 32, 64 x 64, 128 x 128 and 256 x 256; BP for sizes 32 x 32 and 128 x 128; $\beta_{\rm Bethe} = 0.66$ is marked). The Bethe spin glass transition is expected to occur at $\beta_{\rm Bethe} \simeq 0.66$ on a random graph with the same connectivity. The BP message passing algorithm on the 2D EA model stops converging very close to that point. Above that temperature, BP equations converge to the paramagnetic solution, i.e. all messages are trivial, $u = 0$. Below the Bethe temperature (nearly) the Bethe instability takes messages away from the paramagnetic solution, and the presence of short loops is thought to be responsible for the lack of convergence. On the other hand, the GBP equations converge at lower temperatures, but eventually stop converging as well.

The natural question arises as to what extent the GBP message passing algorithm for the plaquette-CVM approximation is also non convergent below its critical temperature, and whether this temperature coincides with the average case one. To check this we used the GBP message passing equations (5) and (6), with a damping factor 0.5 in the Link-to-Site fields $u$:

$$ u^{\rm new}_{L \to i} = 0.5\, u^{\rm old}_{L \to i} + 0.5\, \hat u(\#) $$

We will make the distinction between two types of solutions for the GBP algorithm. The high temperature or paramagnetic solution is characterized by zero local magnetization of spins, $m_i = \sum_{s_i} s_i\, b_i(s_i) = \tanh\big(\beta \sum_{L}^{4} u_{L \to i}\big) = 0$. At low temperatures, following the average case analysis, a non paramagnetic or spin-glass solution should appear, characterized by non zero local magnetizations, but roughly null global magnetization. The temperature at which non zero local magnetizations appear will be called $T_{\rm SG} = 1/\beta_{\rm SG}$.
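The high temperature behavior of BP described above, together with the damped update, can be reproduced with a short sketch. It is not the authors' code: it assumes a periodic $L \times L$ lattice, a synchronous update of all messages, small random initial fields and illustrative names. It iterates the Bethe limit of equation (5), $u_{L \to i} \leftarrow \frac{1}{\beta}\,\mathrm{arctanh}[\tanh(\beta J_{ij}) \tanh(\beta h)]$, where $h$ is the sum of the three other Link-to-Spin fields acting on $j$, with damping factor 0.5.

```python
import math
import random


def u_hat_bethe(J, h, beta):
    """Bethe cavity update: Eq. (5) with all plaquette fields set to zero."""
    return math.atanh(math.tanh(beta * J) * math.tanh(beta * h)) / beta


def run_bp(L, beta, sweeps=300, damping=0.5, seed=0):
    """Damped BP on an L x L periodic lattice (L >= 3) with bimodal couplings.

    u[(j, i)] is the cavity field sent to spin i through the link (i, j).
    Returns the largest |u| after the last sweep.
    """
    rng = random.Random(seed)
    N = L * L

    def neighbours(i):
        x, y = divmod(i, L)
        return [((x + 1) % L) * L + y, ((x - 1) % L) * L + y,
                x * L + (y + 1) % L, x * L + (y - 1) % L]

    # One coupling per link, stored under both orientations.
    J = {}
    for i in range(N):
        for j in neighbours(i):
            if (i, j) not in J:
                J[(i, j)] = J[(j, i)] = rng.choice((-1, 1))

    u = {key: rng.uniform(-0.1, 0.1) for key in J}
    for _ in range(sweeps):
        new_u = {}
        for (j, i) in u:
            # Total field on spin j coming from its three links other than (i, j).
            h = sum(u[(k, j)] for k in neighbours(j) if k != i)
            new_u[(j, i)] = (damping * u[(j, i)]
                             + (1.0 - damping) * u_hat_bethe(J[(j, i)], h, beta))
        u = new_u
    return max(abs(value) for value in u.values())


# Well inside the high temperature region all fields relax to u = 0.
max_field = run_bp(8, beta=0.2)
```

At $\beta = 0.2$ each update shrinks the largest field by a factor of at most $3 \tanh \beta < 1$, so the relaxation to the paramagnetic fixed point is guaranteed; rerunning the sketch at larger $\beta$ is a simple way to explore how that guarantee is lost.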
Figure 4 shows that GBP is able to converge below the Bethe critical temperature, but stops converging before the CVM average case critical temperature $\beta_{\rm CVM} \simeq 1.22$. Furthermore, figure 5 shows that even before it stops converging, GBP finds a spin glass solution in most instances.

FIG. 5. Fraction of SG solutions in GBP as a function of $\beta$ ($\beta_{\rm SG} = 0.79$ and $\beta_{\rm CVM} = 1.22$ are marked). Data points correspond to the fraction of SG solutions in a population of 100 systems of sizes $16^2$, $32^2$, $64^2$, $128^2$, $256^2$ respectively. At high temperatures (low $\beta$) GBP message passing always converges to the paramagnetic solution. The average case critical inverse temperature $\beta_{\rm CVM} \simeq 1.22$ does not correspond to the single instance behavior, as the spin glass solutions in GBP appear around $\beta_{\rm SG} \simeq 0.79$. The inset shows that all data collapse if plotted as a function of the scaling variable $L^{0.9} (\beta - 0.79)$, where the exponent 0.9 and the critical inverse temperature $\beta_{\rm SG} \simeq 0.79$ are obtained from the best data collapse.

The inset of figure 5 shows a collapse of the data points for different system sizes using the scaling variable $L^{0.9} (\beta - 0.79)$, which gives an estimate $\beta_{\rm SG} \simeq 0.79$ (the exponent 0.9 is obtained from the best data collapse). Since $\beta_{\rm SG} \simeq 0.79$ is well below the average case inverse critical temperature $\beta_{\rm CVM} \simeq 1.22$, the relevance of the latter for the behavior of GBP on single samples is questionable. By a similar data collapse procedure, we estimate the non-convergence temperature for the GBP message passing algorithm to be $\beta_{\rm conv} \simeq 0.96$ (see Fig. 9), which is again far away from the average case prediction $\beta_{\rm CVM}$. So, beyond the simple Bethe approximation, we found three different temperatures in the CVM approximation:

$$ \beta_{\rm SG} \simeq 0.79 \;<\; \beta_{\rm conv} \simeq 0.96 \;<\; \beta_{\rm CVM} \simeq 1.22 $$

corresponding respectively to the appearance of spin glass solutions, to the lack of convergence on single instances, and to the average case prediction for the critical temperature.

We can summarize three main differences between the properties of BP and GBP. At high temperatures (below $\beta_{\rm SG} \simeq 0.79$) GBP gives a quite good approximation of the marginals [26], namely the paramagnetic solution with non trivial correlation fields $U \neq 0$, while BP treats the system as a set of independent pairs of linked spins. Furthermore, this naive approach is all that BP can do for us, since above $\beta_{\rm Bethe} \simeq 0.66$ it no longer converges. GBP, on the other hand, is not only able to converge beyond $\beta_{\rm Bethe}$, but it is also able to find spin glass solutions above $\beta_{\rm SG}$. The third difference between both algorithms is that the non convergence of BP seems to occur exactly at the same temperature where a spin glass phase should appear (and arguably because of it), while the GBP convergence problems appear deep into the spin glass phase. The lack of convergence of GBP, however, seems to depend strongly on implementation details, as we show next.

V. GAUGE INVARIANCE OF GBP EQUATIONS

The convergence properties of the GBP message passing are sensitive to implementation details, e.g. the damping value in the update equations, and this is not an inherent property of the CVM (or region-graph) approximation. We might try, for instance, to update simultaneously all small-$u$ fields pointing towards a given spin, hoping to gain some more stability in the message passing algorithm. When trying to do this we find out that there is a freedom in the choice of these fields that has no effect over the fixed point solutions.
This freedom (similar to the one noticed in [31]) is the result of having introduced unnecessary Lagrange multipliers to enforce marginalization constraints that were already indirectly enforced.

FIG. 6. Null modes of the plaquette CVM free energy in terms of fields. The small-$u$ fields that act over a given spin $i$ inside a plaquette can be shifted by an arbitrary amount $\delta$ as in equation (11) without changing the self consistent (message passing) equations.

Consider, for instance, the messages shown in figure 6. If the belief on a plaquette $b_P(s_i, s_j, s_k, s_l)$ correctly marginalizes to the beliefs of two of its children links $b_L(s_i, s_j)$ and $b_D(s_l, s_i)$, and one of those beliefs marginalizes to the common spin, $b_i(s_i) = \sum_{s_j} b_L(s_i, s_j)$, it is inevitable that the second link $D$ also marginalizes to the same belief on $s_i$, since

$$b_i(s_i) = \sum_{s_j} b_L(s_i, s_j) = \sum_{s_j, s_l, s_k} b_P(s_i, s_j, s_k, s_l) = \sum_{s_l} b_D(s_l, s_i).$$

Therefore the Lagrange multiplier that was introduced to force this last marginalization is not needed. This redundancy is a general feature of GBP equations when there are more than two levels of regions (Plaquettes, Links, and Spins, in our case).

Having introduced unnecessary multipliers leads to a gauge invariance in the values of the fields (messages). Such invariance can be better understood by looking at the GBP equations at infinite temperature: for $\beta = 0$ the nonlinear parts of the message passing equations (5) and (6) disappear, but there is still a set of linear equations to be satisfied by the small-$u$ messages, with infinitely many non trivial solutions. These solutions correspond, however, to the same physical paramagnetic solution, since the total field $h_i = \sum_{L}^{4} u_{L \to i}$ and the magnetizations $m_i = \tanh(\beta h_i)$ are always zero.
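The redundancy argument for the marginalization constraints (a plaquette belief that marginalizes onto two of its links automatically gives both links the same single-spin marginal) can be checked numerically. The following is only a minimal sketch: the random belief, the spin ordering $(s_i, s_j, s_k, s_l)$ and the helper names `random_plaquette_belief` and `marginal` are assumptions made for illustration and are not part of the GBP implementation discussed in the text.

```python
import itertools
import random

SPIN = (-1, +1)


def random_plaquette_belief(seed=0):
    """Return a normalized random belief b_P(s_i, s_j, s_k, s_l) (illustrative only)."""
    rng = random.Random(seed)
    weights = {s: rng.random() for s in itertools.product(SPIN, repeat=4)}
    z = sum(weights.values())
    return {s: w / z for s, w in weights.items()}


def marginal(belief, keep):
    """Sum a belief over every variable whose position is not listed in `keep`."""
    out = {}
    for state, p in belief.items():
        key = tuple(state[k] for k in keep)
        out[key] = out.get(key, 0.0) + p
    return out


# Assumed ordering of the plaquette spins: (s_i, s_j, s_k, s_l) -> positions 0..3.
b_P = random_plaquette_belief()
b_L = marginal(b_P, keep=(0, 1))       # link L = (s_i, s_j)
b_D = marginal(b_P, keep=(3, 0))       # link D = (s_l, s_i)
b_i_from_L = marginal(b_L, keep=(0,))  # sum over s_j
b_i_from_D = marginal(b_D, keep=(1,))  # sum over s_l (s_i is the second entry of D)

for s in SPIN:
    # Both links give the same single-spin belief, with no extra constraint imposed.
    print(s, b_i_from_L[(s,)], b_i_from_D[(s,)])
```

Since the two single-spin marginals agree for any plaquette belief, a Lagrange multiplier enforcing the second one adds no new condition, which is the origin of the null mode.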
It is easy to check that once we have a solution of the message passing equations (5) and (6) at any temperature, we can change by an arbitrary amount $\delta$ any group of 4 $u$-messages inside a plaquette (figure 6) pointing to the same spin as

$$
\begin{aligned}
u_{L \to i} &\to u_{L \to i} + \delta, & u_{PL \to i} &\to u_{PL \to i} + \delta, \\
u_{D \to i} &\to u_{D \to i} - \delta, & u_{PD \to i} &\to u_{PD \to i} - \delta,
\end{aligned}
\tag{11}
$$

and still all self-consistent equations are satisfied.

This local null mode of the standard GBP equations can be avoided by arbitrarily setting to zero one of the four small-$u$ fields entering equation (11). We choose to fix the gauge by removing the right small-$u$ field in every Plaquette-to-Link field $(U, u_{\mathrm{left}}, u_{\mathrm{right}})$, as shown in figure 7. Once the gauge is fixed, the fields are uniquely determined, and we can try to implement the simultaneous updating of all small-$u$ fields around a given spin, hopefully improving convergence.

FIG. 7. In the left diagram, all 8 small-$u$ messages pointing to the central spin are highlighted in boldface. They are 4 Link-Site $u$-messages and 4 Plaquette-Link $u_{\mathrm{left}}$-messages, and they depend linearly on each other. The right diagram shows four plaquettes around a spin, and the messages that contribute in a nonlinear way to the aforementioned 8 messages. The idea of GBP+GF is to compute the nonlinear contributions to the message passing equations, and then assign the values of the $u$-messages in order to satisfy their linear relations.

In the left diagram of figure 7 all messages involving the central spin are represented, with those that act precisely upon that spin in boldface. These messages enter linearly in the message passing equations of each other (see equations (5) and (6)).
Therefore, the self consistent equations they should satisfy at the fixed points can be written as (using the notation of figure 7)

$$
\begin{aligned}
u_1 &= u_a + NL_1, & u_a &= u_b - u_2 + NL_a, \\
u_2 &= u_b + NL_2, & u_b &= u_c - u_3 + NL_b, \\
u_3 &= u_c + NL_3, & u_c &= u_d - u_4 + NL_c, \\
u_4 &= u_d + NL_4, & u_d &= u_a - u_1 + NL_d,
\end{aligned}
\tag{12}
$$

where the $NL$ stand for the nonlinear contributions to the corresponding equation. As a consequence, the values of the 8 $u$-messages pointing to the central spin can be assigned precisely by a linear transformation for any given values of the nonlinear contributions. This gauge fixed updating method, which we will call GBP+GF, updates all $u$-messages around a spin simultaneously and in such a way that they are consistent with each other via the message passing equations.

The right diagram in figure 7 shows the messages entering the nonlinear parts. Taking the 8 $u$-messages as zero, the nonlinear contributions are the right hand sides of the message passing equations involved. With the nonlinear parts computed, the system of equations (12) is solved for the $u$-variables by multiplying the vector of nonlinearities by the corresponding matrix. The 8 $u$-messages are then updated, usually with a damping factor. The update of the $U$ correlation fields is done as in the original GBP method, via equation (6), since it does not depend on the $u$-messages that are being updated.

Figure 8 shows the probability of convergence versus inverse temperature for GBP and GBP+GF, and also the fraction of the solutions found that correspond to a spin glass phase. Let us emphasize here that GBP and GBP+GF are not different approximations, but different methods to find the same fixed point solution by message passing. They are expected to find the same solutions, and in fact they do.
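The linear step of the gauge fixed update can be sketched as follows, assuming the system (12) reads $u_k = u_x + NL_k$ for the four numbered messages and $u_a = u_b - u_2 + NL_a$ (and its cyclic analogues) for the four lettered ones. Under that reading the system has a simple closed-form solution, which plays the role of the matrix multiplication mentioned in the text. The function names, the dictionary layout and the damping rule (a convex mix of old and new values) are illustrative assumptions, not the authors' implementation.

```python
def solve_u_messages(nl):
    """Solve the linear system (12) for the 8 u-messages around one spin.

    `nl` maps the labels 1, 2, 3, 4, 'a', 'b', 'c', 'd' to the nonlinear
    contributions NL_x, assumed already computed with the 8 u's set to zero.
    """
    # Substituting u_2 = u_b + NL_2 into u_a = u_b - u_2 + NL_a cancels u_b;
    # the same cancellation happens for the other three lettered messages.
    u = {
        "a": nl["a"] - nl[2],
        "b": nl["b"] - nl[3],
        "c": nl["c"] - nl[4],
        "d": nl["d"] - nl[1],
    }
    u[1] = u["a"] + nl[1]
    u[2] = u["b"] + nl[2]
    u[3] = u["c"] + nl[3]
    u[4] = u["d"] + nl[4]
    return u


def residuals(u, nl):
    """Left-hand side minus right-hand side of each of the 8 equations in (12)."""
    return [
        u[1] - (u["a"] + nl[1]),
        u[2] - (u["b"] + nl[2]),
        u[3] - (u["c"] + nl[3]),
        u[4] - (u["d"] + nl[4]),
        u["a"] - (u["b"] - u[2] + nl["a"]),
        u["b"] - (u["c"] - u[3] + nl["b"]),
        u["c"] - (u["d"] - u[4] + nl["c"]),
        u["d"] - (u["a"] - u[1] + nl["d"]),
    ]


def damped_update(u_old, u_new, damping=0.5):
    """Assumed damping rule: move a fraction `damping` of the way to the new values."""
    return {k: (1.0 - damping) * u_old[k] + damping * u_new[k] for k in u_new}


# Arbitrary illustrative values for the nonlinear contributions.
nl = {1: 0.3, 2: -0.1, 3: 0.25, 4: 0.05, "a": -0.2, "b": 0.4, "c": 0.1, "d": -0.35}
u = solve_u_messages(nl)
print(max(abs(r) for r in residuals(u, nl)))
```

In an actual GBP+GF sweep, `nl` would be recomputed from the current $U$ fields and neighbouring messages before each call, and the damped values would replace the 8 stored $u$-messages of that spin.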
At high temperatures both methods converge to the paramagnetic solution, with all local magnetizations null, $m_i = \tanh\left(\beta \sum_{L}^{4} u_{L \to i}\right) = 0$. The standard message passing update of GBP equations hardly converges above $\beta_{\mathrm{conv}} \simeq 0.96$, while the GBP+GF method reaches lower temperatures, $\beta_{\mathrm{conv\text{-}GF}} \simeq 1.2$, as can be seen in Fig. 9. Furthermore, GBP+GF allows us to work in a range of temperatures where most solutions are spin glass like. This shows that the non-convergence temperature found for GBP, $\beta_{\mathrm{conv}} \simeq 0.96$, is not a feature of the CVM approximation, but a characteristic of the message passing method used, and can be outperformed by other message passing schemes, like GBP+GF.

FIG. 8. Convergence probability of GBP and GBP+GF as a function of $\beta$ (2D EA model, $L = 64$). The solution found by either iteration method is always the same (when both converge), but GBP+GF reaches lower temperatures while converging. The fraction of spin glass solutions found by either algorithm shows that GBP+GF sees the same spin glass transition temperature. The fraction of spin glass solutions is always given with respect to the number of convergent solutions.

FIG. 9. Estimate of the non-convergence temperature for different system sizes using the standard GBP (squares) and the Gauge Fixed GBP (circles), plotted as $\beta_{\mathrm{conv}}$ versus $1/L$. As shown, with the gauge fixed procedure the extrapolated non-convergence temperature is quite close to the average case prediction $\beta_{\mathrm{CVM}} \simeq 1.22$. Each data point corresponds to the average of the non-convergence temperature over many realizations of the disorder: 10 realizations for the $512 \times 512$ systems, 20 for the $256 \times 256$ and 100 for the others.
Note in figure 9 that the non-convergence inverse temperature of GBP+GF, $\beta_{\mathrm{conv\text{-}GF}} \simeq 1.2$, is quite close to the average case prediction for the critical temperature, $\beta_{\mathrm{CVM}} \simeq 1.22$. Whether this is accidental or not is still unclear. Since the average case instability should describe the breakdown of the paramagnetic phase, and the lack of convergence in single instances occurs while already in a non paramagnetic phase, it seems far-fetched to assume that both critical behaviors are related.

A. Gauge fixed average case stability

The disagreement between the average case critical temperature $\beta_{\mathrm{CVM}}$ and the one observed in the single instance, $\beta_{\mathrm{SG}}$, can be due to a number of reasons. First, the average case calculation assumes that cavity fields are uncorrelated. But, in our case, messages participating in the cavity iteration are very close to each other in the lattice, and thus correlated. Furthermore, GBP does not have the equivalent of a Bethe lattice for BP, i.e. a model in which the correlation between cavity messages is close to zero by construction. The second reason for a failure of the average case prediction is that the transition we observe in single instances might be due to the almost inevitable appearance of ferromagnetic domains in large systems (Griffiths instability). The third, and the most obvious reason, is that the gauge invariance was not accounted for in the average case calculation.

Reproducing the method of Sec. III to obtain an average case prediction of the critical temperature for the Gauge Fixed GBP is not straightforward. The reason is that Link-to-Spin messages $u$ should fulfill two different equations: their own original equation (5), and the implicit equation derived from the fact that the gauge is fixed and one of the fields in the Plaquette-to-Link message $(U, u, u)$ is set to zero.

FIG. 10.
Left: The set of four messages that we compute jointly by population dynamics. Right: the population dynamics step consists of taking four quadruplets at random from the population (those in black), and computing a new quadruplet (the one in gray inside the plaquette) using randomly selected interactions $J_{ij}$ on the plaquette.

However, a different average case calculation is possible. We can represent the messages flowing in the lattice by a population of quadruplets $(u_{L_l \to l}, u_{P \to l}, U_{P \to lr}, u_{L_r \to r})$, where one of the original messages is absent because the gauge has been fixed (see left panel in Fig. 10). Given any four of these quadruplets of messages around a plaquette, we can compute, using the message passing equations, the new messages inside the plaquette (see right panel in Fig. 10). The new population dynamics consists of picking four of these quadruplets out of the population at random, then computing the new quadruplet (using also random interactions in the plaquette) and finally putting it back in the population. After several steps, the population stabilizes either to a paramagnetic solution (where all $u = 0$ and only $U \neq 0$) or to a non paramagnetic one (where also $u \neq 0$).

FIG. 11. Edwards Anderson order parameter $q_{EA}$ versus $\beta$, see eq. (13), obtained using a population of $N = 10^3$ messages, and running the population dynamics step $10^3 \times N$ times. In agreement with the single instance behavior, the transition between paramagnetic ($q_{EA} = 0$) and non paramagnetic (spin glass) phases is found at $\beta \simeq 0.78$.

In Fig. 11 we show the Edwards Anderson order parameter $q_{EA} = \sum_i m_i^2 / N$ obtained at different temperatures using this population dynamics average case method. We find that $q_{EA}$ becomes larger than zero at $\beta_{\mathrm{CVM\text{-}GF}} \simeq 0.78$, which is quite close to the inverse temperature $\beta_{\mathrm{SG}} \simeq 0.79$ where single instances develop non-zero local magnetizations and a spin glass phase. The correspondence between this average case result and the single instance behaviour is very enlightening: indeed the average case computation does not take into account correlations among quadruplets of messages and it is not sensitive to Griffiths singularities. So, the simplest explanation for the GBP+GF behaviour on single samples of the 2D EA model is that quadruplets of messages arriving on any given plaquette are mostly uncorrelated and that at $\beta_{\mathrm{SG}}$ a true spin glass instability takes place (which is an artifact of the mean-field like approximation). Note that under the Bethe approximation the SG instability happens at $\beta_{\mathrm{Bethe}} \simeq 0.66$, while the CVM approximation improves the estimate of the SG critical boundary to $\beta_{\mathrm{SG}} \simeq 0.79$ (on single instances) and to $\beta_{\mathrm{CVM\text{-}GF}} \simeq 0.78$ (on the average case).

VI. SAME APPROXIMATION, FOUR ALGORITHMS

It can be proved [17] that stable fixed points of the message passing equations correspond to stationary points of the region graph approximated free energy (or CVM free energy). The converse is not necessarily true, and some of the stationary points of the free energy might not be stable under the message passing heuristic. As we have seen, the message passing might not even converge at all. For a given free energy approximation (eq. (1) in our case), there are other algorithms to search for stationary points, including other types of message passing and provably convergent algorithms. In this section we study two of these algorithms and show that they do find the same spin glass like transition at $\beta_m$, but have a different behavior at lower temperatures.
The one presented so far is the so-called Parent-to-Child (PTC) message passing algorithm, in which Lagrange multipliers are introduced to force marginalization of bigger (parent) regions onto their children. Other choices of Lagrange multipliers are possible [17], leading to the so-called Child-to-Parent and Two-Ways algorithms.

FIG. 12. Free energy of the solutions found by the Double Loop algorithm, HAK and the GBP PTC algorithm relative to the free energy of the paramagnetic solution (Dual approximation), in a typical system in which GBP PTC finds a spin glass solution (panels show $f - f_{\mathrm{dual}}$ and $q_{EA}$ versus $\beta$). At high temperatures all algorithms find the same paramagnetic solution. Interestingly, there is a small range of temperatures where the spin glass solution found by GBP is actually the one that minimizes the free energy. But at even lower temperatures the paramagnetic solution becomes again the correct one. While Double Loop and HAK switch back to the paramagnetic solution (even if at a wrong $T$), GBP PTC gets stuck in the spin glass solution (and for this reason, it eventually stops converging).

Next we test the following four algorithms for minimizing the plaquette-CVM free energy in typical instances of 2D EA:

• Double-Loop algorithm of Heskes et al. [19]. It is a provably convergent algorithm that guarantees a step by step minimization of the free energy functional. It consists of two loops, the inner of which is a Two-Ways message passing algorithm that we will call HAK. We use the implementation in the libDAI public library [32].
• HAK message passing algorithm. It is a Two-Ways message passing algorithm [19]. When it converges, it is usually faster than Double-Loop.
• GBP Parent-to-Child is the message passing algorithm we have presented so far in this paper, and for which the simultaneous updating of cavity fields was introduced to help convergence. Nevertheless, the following results were obtained using standard GBP PTC.
• Dual algorithm of [26]. It is the same as GBP PTC with all small fields set to $u = 0$, doing message passing only in terms of the correlation fields $U$ (first equation in eq. (6)).

For the last three algorithms we use our own implementation in terms of cavity fields $u$ and $(U, u_a, u_b)$. The dual algorithm forces the solution of GBP to remain paramagnetic since all $u = 0$. This paramagnetic ansatz is specially suited for the 2D EA model since it is expected to be paramagnetic at any finite temperature (in the thermodynamic limit).

FIG. 13. Convergence time in seconds (log scale) versus $\beta$ for the Double Loop algorithm (full points) and standard message passing algorithms (empty points) for the plaquette-GBP approximation in two different realizations of a $16^2$ Edwards Anderson system. Message passing algorithms are typically faster, but not always convergent. The first cusp is related to the appearance of the spin glass solution, while the second cusp in the Double Loop algorithm is related to the switching back to the paramagnetic solution (see Fig. 12).

As shown in the previous section, the GBP PTC message passing equations find a paramagnetic solution in the 2D EA model at high temperatures, while below $T_{\mathrm{SG}} = 1/\beta_{\mathrm{SG}} \simeq 1.27$ they find a spin glass like solution. By spin glass like we mean that the total field $h_i = \sum_{L}^{4} u_{L \to i}$ and the magnetization $m_i = \tanh(\beta h_i)$ are non zero and change from spin to spin. The order parameter

$$q_{EA} = \frac{1}{N} \sum_i m_i^2 \tag{13}$$

is used to locate this phase.
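As a small illustration of how the order parameter of eq. (13) separates the two kinds of fixed point, the sketch below computes $h_i$, $m_i = \tanh(\beta h_i)$ and $q_{EA}$ from a list of Link-to-Spin fields. The data layout (`u_fields[i]` holding the four $u_{L \to i}$ acting on spin $i$), the function names and the numerical tolerance are assumptions chosen for the example, not details taken from the authors' code.

```python
import math


def local_magnetizations(beta, u_fields):
    """m_i = tanh(beta * h_i), with h_i the sum of the 4 fields u_{L->i} on spin i."""
    return [math.tanh(beta * sum(us)) for us in u_fields]


def q_ea(magnetizations):
    """Edwards-Anderson order parameter of eq. (13): (1/N) * sum_i m_i^2."""
    return sum(m * m for m in magnetizations) / len(magnetizations)


def is_spin_glass_like(magnetizations, tol=1e-6):
    """Hypothetical classifier: flag a fixed point as spin glass like if q_EA > tol."""
    return q_ea(magnetizations) > tol


beta = 1.0

# Paramagnetic fixed point: every u_{L->i} vanishes, so all m_i = 0 and q_EA = 0.
u_para = [[0.0, 0.0, 0.0, 0.0] for _ in range(4)]
m_para = local_magnetizations(beta, u_para)

# Toy spin glass like configuration: local fields of opposite sign on different
# spins, so the global magnetization cancels while q_EA stays positive.
u_sg = [[0.5, 0.0, 0.0, 0.0], [-0.5, 0.0, 0.0, 0.0]]
m_sg = local_magnetizations(beta, u_sg)

print(q_ea(m_para), is_spin_glass_like(m_para))
print(q_ea(m_sg), is_spin_glass_like(m_sg), sum(m_sg))
```

The second configuration is the situation described in the text: non zero local magnetizations that change from spin to spin, with a null global magnetization.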
The critical temperature $T_{\mathrm{SG}}$, where $q_{EA}$ becomes larger than zero, seems to be independent of message passing details, like damping or the use of gauge fixing for simultaneous updates of fields.

In figure 12 we show the free energy and the $q_{EA}$ parameter of the solutions found by Double Loop, HAK and GBP PTC for two typical realizations of an $N = 16 \times 16$ EA system with bimodal interactions. The free energy of the dual approximation is subtracted to highlight the differences with respect to the paramagnetic solution. The figure shows that HAK and Double Loop do find the same spin glass solution that GBP PTC finds when going down in temperature. This solution is actually lower in free energy when it appears, but at even lower temperatures becomes subdominant compared to the paramagnetic one. GBP PTC keeps finding the spin glass solutions while Double Loop and HAK switch back to the paramagnetic one. This is an interesting feature of Double Loop and in particular of HAK, which is a fast message passing algorithm. By returning to the dual (paramagnetic) solution, HAK also ensures its convergence at low temperature [26], while GBP PTC gets lost in the irrelevant (and physically wrong) spin glass solution, and eventually stops converging. Note, however, that DL and HAK may stop finding the SG solution when this solution is still the one with lower free energy. Moreover, the lack of convergence of GBP can be used as a warning that something is going wrong with the CVM approximation, something that is impossible to detect by looking at the behavior of provably convergent algorithms.

In figure 13 we compare the running times of Double Loop (libDAI [32]), HAK and GBP PTC (our implementation) for the two systems of figure 12.
As expected, Double Loop is much slower than the message passing heuristics of HAK and GBP (notice the log scale in the time axis). The peaks in running times correspond to the transition points from paramagnetic to spin glass solution. Double Loop and HAK have two peaks, the second corresponding to the transition back to the paramagnetic solution, while GBP PTC has only the first peak.

VII. SUMMARY AND CONCLUSIONS

We studied the properties of the Generalized Belief Propagation algorithm derived from a Cluster Variational Method approximation to the free energy of the Edwards Anderson model in 2D at the level of plaquettes. We compared the results obtained by Parent-to-Child GBP with the ones obtained by the Dual (paramagnetic) algorithm [26], by the HAK Two-Ways algorithm [19] and by the provably convergent Double-Loop algorithm [19].

We found that the plaquette-CVM approximation (using Parent-to-Child GBP) is far richer than the Bethe (BP) approximation in the 2D EA model. BP converges only at high temperatures (above $T_{\mathrm{Bethe}} = 1/\beta_{\mathrm{Bethe}} = 1.51$), and in such case it treats the system as a set of independent pairs of linked spins. GBP, on the other hand, makes a better prediction of the paramagnetic behavior of the model at high $T$, since it implements a message passing of correlation fields flowing from plaquettes to links in the graph. Furthermore, with GBP the paramagnetic phase is extended to temperatures below $T_{\mathrm{Bethe}} = 1.51$ down to $T_{\mathrm{SG}} = 1/\beta_{\mathrm{SG}} \simeq 1.27$, where spin glass solutions appear in the single instance implementation of the message passing algorithm. In contrast to the Bethe approximation, GBP is able to find spin glass solutions, and the standard message passing stops converging near $T_{\mathrm{conv}} \simeq 1$.
The average case calculation of the stability of the paramagnetic solution in the CVM approximation predicted that non paramagnetic (spin glass) solutions should appear at lower temperatures, $T_{\mathrm{CVM}} = 1/\beta_{\mathrm{CVM}} \simeq 0.82$. This average case result does not coincide with the single instance behavior of the standard GBP, since it fails to mark both the point where GBP starts finding spin glass solutions, $T_{\mathrm{SG}}$, and the point where GBP stops converging, $T_{\mathrm{conv}}$.

However, the non-convergence of GBP is not a feature of the CVM approximation, and is susceptible to change from one implementation of the message passing to another. We showed that by fixing a hidden gauge invariance in the message passing equations, a simultaneous update of all cavity fields pointing to a single spin in the lattice improves the convergence of the algorithm, without drastically changing its speed. Using the gauge fixed GBP, the non-convergence inverse temperature is moved to $\beta_{\mathrm{conv\text{-}GF}} \simeq 1.2$ (that is, $T_{\mathrm{conv\text{-}GF}} \simeq 0.83$), quite close to the average case prediction $\beta_{\mathrm{CVM}} \simeq 1.22$ (whether this is only a coincidence is still not clear). Most importantly, the average case computation (population dynamics) with the gauge fixed identifies the same SG critical temperature, $T_{\mathrm{CVM\text{-}GF}} \simeq 1.28$, measured on single samples (where $T_{\mathrm{SG}} \simeq 1.27$).

Finally we compared the fixed point solutions found by the GBP message passing with those found by the provably convergent Double-Loop algorithm and the message passing heuristic of the Two-Ways algorithm of [19]. All the algorithms find the same paramagnetic solutions at high $T$, while below $T_{\mathrm{SG}}$ they find a spin glass solution, in the sense that local magnetizations are non zero, while the global magnetization is null.
Decreasing the temperature, Double-Loop and HAK switch back from the spin glass to the paramagnetic solution, at the cost of a factor $10^2$ to $10^3$ and $10$ to $10^2$ respectively in running time, compared to GBP. Furthermore, the paramagnetic solution can always be found fast by the Dual algorithm of [26], making these two algorithms (Double-Loop and HAK) unnecessarily slow.

Although the thermodynamics of the 2D EA model is paramagnetic, at low temperatures the correlation length grows until eventually surpassing $L/2$ and therefore being effectively infinite for any finite size 2D system. In such a situation the non paramagnetic solutions obtained by GBP can account for long range correlations, and presumably give better estimates for the correlations among spins than the paramagnetic solution obtained by HAK and Double Loop. Establishing the previous claim requires a detailed study of the quality of the CVM approximation at low temperatures (in the non paramagnetic range) and its connections to the statics and dynamics of the 2D Edwards Anderson model, which is already under study. Application of CVM and GBP message passing to the Edwards Anderson model in 3D is also appealing, since this model does have a spin glass behavior at low temperature.

[1] H. A. Bethe, Proc. R. Soc. A 150, 552 (1935).
[2] R. Kikuchi, Phys. Rev. 81, 988 (1951).
[3] M. Mézard and G. Parisi, Eur. Phys. J. B 20, 217 (2001).
[4] J. Pearl, Probabilistic Reasoning in Intelligent Systems (2nd ed.) (Morgan Kaufmann, San Francisco, CA, 1988).
[5] F. R. Kschischang, B. J. Frey, and H. Loeliger, IEEE Trans. Inform. Theory 47, 498 (1998).
[6] M. Mézard and R. Zecchina, Phys. Rev. E 66, 056126 (2002).
[7] M. Mézard, G. Parisi, and R. Zecchina, Science 297, 812 (2002).
[8] M. Mézard and G. Parisi, J. Stat. Phys. 111, 1 (2003).
[9] G. Parisi, M. Mézard, and M. A. Virasoro, Spin Glass Theory and Beyond (World Scientific, Singapore, 1987).
[10] R. Mulet, A. Pagnani, M. Weigt, and R. Zecchina, Phys. Rev. Lett. 89, 268701 (2002).
[11] A. Braunstein, R. Mulet, A. Pagnani, M. Weigt, and R. Zecchina, Phys. Rev. E 68, 036702 (2003).
[12] D. Achlioptas, A. Naor, and Y. Peres, Nature 435, 759 (2005).
[13] F. Krzakala, A. Montanari, F. Ricci-Tersenghi, G. Semerjian, and L. Zdeborova, Proc. Nat. Acad. Sci. 104, 10318 (2007).
[14] A. Montanari, F. Ricci-Tersenghi, and G. Semerjian, J. Stat. Mech., P04004 (2008).
[15] A. Montanari, F. Ricci-Tersenghi, and G. Semerjian, in Proceedings of the 45th Annual Allerton Conference on Communication, Control, and Computing (2007) pp. 352–359.
[16] F. Ricci-Tersenghi and G. Semerjian, J. Stat. Mech., P09001 (2009).
[17] J. Yedidia, W. T. Freeman, and Y. Weiss, IEEE Trans. Inform. Theory 51, 2282 (2005).
[18] Y. S.-K. Eye and A. L. Yuille, Neural Comput. 14, 2002 (2001).
[19] T. Heskes, C. A. Albers, and H. J. Kappen, UAI-03, 313 (2003).
[20] K. Tanaka, J. Inoue, and D. M. Titterington, in XIII Proceedings of the 2003 IEEE Signal Processing Society Workshop (17-19 September 2003) (IEEE Computer Society Press, 2003) pp. 329–338.
[21] C. A. Albers, T. Heskes, and H. J. Kappen, Genetics 177, 1101 (2007).
[22] H. J. Kappen, in Modeling Bio-medical signals (World Scientific, 2002) pp. 3–16.
[23] K. Tanaka and T. Morita, Phys. Lett. A 203, 122 (1995).
[24] T. Jorg, J. Lukic, E. Marinari, and O. Martin, Phys. Rev. Lett. 96, 237205 (2006).
[25] C. K. Thomas and A. A. Middleton, Phys. Rev. E 80, 046708 (2009).
[26] A. Lage-Castellanos, R. Mulet, F. Ricci-Tersenghi, and T. Rizzo, Phys. Rev. E, to appear (2011).
[27] T. Rizzo, A. Lage-Castellanos, R. Mulet, and F. Ricci-Tersenghi, J. Stat. Phys. 139, 375 (2010).
[28] A. Pelizzola, J. Phys. A 38, R309 (2005).
[29] J. M. Mooij and H. J. Kappen, J. Stat. Mech., P11012 (2005).
[30] The Bethe temperature $T_{\mathrm{Bethe}}$ is the one at which a non trivial spin glass solution appears for a random regular Bethe lattice with connectivity $K = 4$. The Bethe lattice looks locally like a tree.
[31] M. Weigt, R. A. White, H. Szurmant, J. A. Hoch, and T. Hwa, Proc. Nat. Acad. Sci. 36, 11096 (2003).
[32] J. M. Mooij, J. Mach. Learn. Res. 11, 2169 (2010).