Approximation algorithms for two-state anti-ferromagnetic spin systems on bounded degree graphs
In a seminal paper (Weitz, 2006), Weitz gave a deterministic fully polynomial approximation scheme for counting exponentially weighted independent sets (equivalently, approximating the partition function of the hard-core model from statistical phys…
Authors: Alistair Sinclair, Piyush Srivastava, Marc Thurley
Spin systems are a general framework for modeling nearest-neighbor interactions on graphs, and are widely studied in both statistical physics and applied probability. A spin system consists of a large collection of nodes, each of which may be in one of a fixed number of states called spins. A neighborhood structure is specified by edges between the nodes. Interactions between neighboring nodes are determined by edge potentials, which assign an energy value to each edge based on the spin values of its endpoints. In addition, there are vertex potentials which assign an energy value to each node based on the value of its spin. For any configuration σ of spins on the nodes, the energy H(σ) is just the sum of its edge and vertex energies. Based on the Gibbs formalism from statistical physics, the probability of finding the system in configuration σ is then proportional to the weight w(σ) = exp(-H(σ)).
In this paper, we concentrate on two-state spin systems, where each vertex can be in one of two states, referred to as "+" and "-". Such a system can be defined by specifying a (+, +) edge activity β, a (-, -) edge activity γ, and a vertex activity λ, where β, γ and λ are non-negative parameters. For a graph G = (V, E), a configuration σ : V → {+, -} is an assignment of + and - spins to the vertices of G. The weight w(σ) of the configuration σ is given by
$$w(\sigma) = \beta^{n_+(\sigma)}\,\gamma^{n_-(\sigma)}\,\lambda^{m(\sigma)}, \tag{1}$$
where m(σ) denotes the number of vertices assigned spin -, and n_+(σ) (respectively, n_-(σ)) denotes the number of edges for which both endpoints are assigned spin + (respectively, -). The partition function of the model is defined as
$$Z = \sum_{\sigma \in \{+,-\}^V} w(\sigma). \tag{2}$$
The partition function, in addition to being a natural weighted generalization of the notion of counting, is a fundamental quantity in statistical physics. For example, it is the normalizing factor in the Gibbs distribution: the probability of occurrence of configuration σ is given by µ(σ) = w(σ)/Z. In addition, many other properties of the model can be deduced by studying the partition function [3]. As a simple concrete example of a two-state spin system, consider the setting β = 1 and γ = 0 (so that configurations with adjacent "-" spins are assigned weight zero, and thus prohibited), known as the hard-core model. The associated Gibbs distribution is a weighted measure on independent sets in the graph G, in which any independent set U has weight λ^{|U|}. Another important class of examples, known as the Ising model,¹ is obtained by setting β = γ > 0. There is a significant qualitative difference between the Ising model with β = γ > 1 (the ferromagnetic case) and with β = γ < 1 (the anti-ferromagnetic case). The latter is an example of a "repulsive" model, which means that the edge potentials assign higher weights to edges with different spins at their endpoints, while the ferromagnetic case is "attractive" (higher weights are assigned to edges with the same spin at their endpoints). The parameter λ can be identified with an "external field", i.e., a bias associated with each spin. The case λ = 1 corresponds to zero field, while λ < 1 and λ > 1 correspond to positive and negative fields respectively. More generally, we will refer to any two-state system satisfying βγ > 1 as ferromagnetic, and any satisfying βγ < 1 as anti-ferromagnetic. Also, a model satisfying βγ > 0 is said to have soft constraints (in the sense that no combination of spin values at adjacent vertices is prohibited).
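To make the definitions concrete, the following minimal sketch computes the partition function by brute force on a small graph. The function name `partition_function` and the example graph are illustrative choices, not part of the paper; the enumeration is exponential in the number of vertices and is only meant to illustrate the weights, not to serve as an algorithm.

```python
from itertools import product


def partition_function(n, edges, beta, gamma, lam):
    """Brute-force Z: sum over all 2^n configurations of
    beta^{n_+} * gamma^{n_-} * lam^{m} (illustrative helper)."""
    total = 0.0
    for sigma in product((+1, -1), repeat=n):
        # n_plus / n_minus: edges whose endpoints are both + / both -
        n_plus = sum(1 for u, v in edges if sigma[u] == +1 and sigma[v] == +1)
        n_minus = sum(1 for u, v in edges if sigma[u] == -1 and sigma[v] == -1)
        # m: number of vertices assigned spin -
        m = sum(1 for s in sigma if s == -1)
        total += beta ** n_plus * gamma ** n_minus * lam ** m
    return total


# Hard-core setting (beta = 1, gamma = 0) on the 4-cycle: the "-" vertices
# must form an independent set, and each such set U has weight lam^{|U|}.
cycle4 = [(0, 1), (1, 2), (2, 3), (3, 0)]
z_hardcore = partition_function(4, cycle4, beta=1.0, gamma=0.0, lam=1.0)
```

With λ = 1 the hard-core partition function simply counts independent sets; the 4-cycle has the empty set, four singletons and two opposite pairs.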
In a sense to be made precise later (see Appendix A), Ising models capture arbitrary two-state spin systems with soft constraints; in particular, the two descriptions are equivalent on regular graphs. For this reason we will henceforth focus mainly on Ising models.
The theory of spin systems derives in large part from considering the limiting behavior of the Gibbs distribution as the size of the underlying graph goes to infinity. Based on the above formalism for finite graphs, one may define a Gibbs measure µ on an infinite graph G by requiring that the marginal distribution on any finite subgraph H, conditional on the configuration on G\H, is given by equation (1). (Here the spins in G\H act as a fixed boundary condition in (1).) It is a well known result in the statistical physics literature (see, for example, [3]) that at least one such measure µ can always be defined. However, for certain values of the parameters of the spin system there may be multiple solutions for µ, in which case the Gibbs measure is said to be non-unique.
We will now look at the phenomenon of non-uniqueness more closely in the special case when the infinite graph G is a d-ary tree.² As noted above, the anti-ferromagnetic Ising model captures all two-state anti-ferromagnetic spin systems with soft constraints on regular graphs, and hence it is sufficient to consider the Ising case. Consider an anti-ferromagnetic Ising model on the d-ary tree with edge activity β (= γ) and vertex activity λ. It turns out that if β ≥ (d-1)/(d+1) then the Gibbs measure is unique for all values of λ. In particular, in the zero-field case λ = 1, the Gibbs measure is unique if and only if β ≥ (d-1)/(d+1). However, when β < (d-1)/(d+1), the Gibbs measure is no longer unique for all values of the vertex activity λ. In this case, there exists a critical activity λ_c(β, d) ≥ 1 such that the Gibbs measure is unique if and only if |log λ| ≥ log λ_c(β, d). We sketch the curves of log λ_c(β, d) in Figure 1, for d = 5 and d = 13. The area below the curves is the non-uniqueness region. We note here that the non-uniqueness region is monotonically increasing with degree, so the curve for d = 5 lies strictly below that for d = 13. Also, note that the curves intersect the β-axis at β = (d-1)/(d+1).
¹ The description of the Ising model given here differs slightly from the more popular description in terms of edge and vertex potentials outlined in the first paragraph. However, translating between the two descriptions is easy; see Appendix A.
² We remark here that the infinite (d + 1)-regular tree and the infinite d-ary tree show exactly the same behavior with respect to the uniqueness of the Gibbs measure. This follows immediately from the fact that the (d + 1)-regular tree can be viewed as a root attached to the roots of d + 1 infinite d-ary trees. We shall thus move freely between these two objects for ease of exposition throughout the paper.

The phenomenon of non-uniqueness of the Gibbs measure can also be described in terms of the more algorithmic notion of decay of correlations. We stick to our example of the infinite d-ary tree. Fix a vertex v in the tree, and let S_l be the set of vertices in the tree at distance at least l from v. Let q_v(l, σ) be the probability of having spin + at v conditional on the configuration on S_l being σ. It turns out that uniqueness of the Gibbs measure is equivalent to the condition that the inequality
$$|q_v(l,\sigma) - q_v(l,\tau)| \le \exp(-\Omega(l)) \tag{3}$$
holds for any two configurations σ and τ on S_l. The above condition is referred to in the literature as weak spatial mixing.
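The decay of correlations described above can be observed numerically on the d-ary tree. The sketch below assumes the one-parameter tree recurrence f(x) = 1/(1 + λ h(x)^d) with h(x) = (β + (1-β)x)/(1 - (1-β)x), which follows from the weights for β = γ > 0; the function names are illustrative. It compares the probability of spin + at the root under the all-plus and all-minus boundary conditions at depth l.

```python
def h(x, beta):
    # Assumed per-child factor of the Ising tree recurrence (requires beta > 0).
    return (beta + (1.0 - beta) * x) / (1.0 - (1.0 - beta) * x)


def f(x, beta, lam, d):
    # One step of the recurrence when all d children have occupation probability x.
    return 1.0 / (1.0 + lam * h(x, beta) ** d)


def root_gap(beta, lam, d, depth):
    """|q(all +) - q(all -)| at the root, with the boundary fixed at distance `depth`."""
    p_plus, p_minus = 1.0, 0.0  # boundary vertices fixed to + or to -
    for _ in range(depth):
        p_plus = f(p_plus, beta, lam, d)
        p_minus = f(p_minus, beta, lam, d)
    return abs(p_plus - p_minus)


# Zero field on the binary tree (d = 2): the threshold is (d-1)/(d+1) = 1/3.
gap_unique = root_gap(beta=0.8, lam=1.0, d=2, depth=40)      # above the threshold
gap_non_unique = root_gap(beta=0.2, lam=1.0, d=2, depth=40)  # below the threshold
```

Above the threshold the two boundary conditions become indistinguishable at the root as the depth grows, whereas below it the gap does not vanish.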
It has been believed for a long time (and proved in various manifestations) that there is an intimate relationship between weak spatial mixing and the running time of algorithms for approximating the associated partition function: roughly speaking, in the uniqueness region (where there is decay of correlations), the system should be amenable to local algorithms and thus be computationally tractable. A spectacular result in this direction was Weitz's fully polynomial deterministic approximation scheme (FPTAS) for the partition function of the hard-core model, which works on all graphs of degree at most d + 1 for all activities λ less than the critical activity λ c (d) for the uniqueness of the Gibbs measure on the infinite d-ary tree [12]. This is even more interesting in light of a recent breakthrough due to Sly [10] (see also [2]), who showed that the existence of an FPRAS for the partition function of the hard-core model on graphs of degree at most d + 1 for activities larger than λ c (d) would imply that NP = RP. Thus the range of validity of Weitz's algorithm is optimal.
Weitz [12] gave a general two-step framework for designing deterministic algorithms for approximating partition functions of two-state spin systems. To describe this framework, we begin with the standard observation that in order to get an FPTAS for the partition function, it is sufficient to give an FPTAS for the probability of having spin + at any given vertex v. The first component of the framework is a combinatorial reduction, which shows that the problem of approximating this probability for a general two-state spin system on a graph G of maximum degree d + 1 can be reduced to the problem of approximating the same probability on a related finite subtree of the infinite (d + 1)-regular tree rooted at v, in which the spins of some of the vertices are fixed to certain values (this is the so-called self-avoiding walk tree of the graph G). We emphasize that this is a model-independent reduction, and depends only upon the fact that the number of spin values is two. The associated self-avoiding walk tree, however, may be exponential in the size of the original graph G, and thus one needs to show that it is sufficient to truncate the tree at a depth logarithmic in the size of G in order to obtain a good approximation. However, since some of the fixed vertices in the tree might be very close to the root v, it is not possible to argue using weak spatial mixing that a logarithmic depth of recursion suffices for approximating the partition function (because the parameter l in equation (3) must be taken to be the minimum distance of a fixed vertex from the root).
Accordingly, the second component of Weitz's framework is to establish that, for the spin system in question, weak spatial mixing on the infinite d-ary tree is in fact equivalent to strong spatial mixing, which roughly states that the exponential decay of point-to-set correlations (3) guaranteed by weak spatial mixing holds also when the spins at an arbitrary set of vertices are fixed to arbitrarily chosen values (see Section 2 for a precise definition). Weitz [12] established this fact for the hard-core model, using a step-by-step comparison of ratios of occupation probabilities on the standard d-ary tree and on the modified tree with fixed vertices. It was claimed in [12] that such a result holds also for the anti-ferromagnetic Ising model, but to the best of our knowledge no proof of this fact (except in the special zero-field case where λ = 1; see [9,13]) has so far been published.
In this paper, we give a proof of the fact that for the anti-ferromagnetic Ising model with any field, weak spatial mixing implies strong spatial mixing on the d-ary tree. Formally, we have the following theorem. Notice that it is easy to see that this holds also for the infinite (d + 1)-regular tree, since the (d + 1)-regular tree and the d-ary tree differ only in the degree of the root. We also note that by the translation described in Appendix A, Theorem 1.1 holds also for arbitrary anti-ferromagnetic two-state spin systems with soft constraints.
Given Weitz's general reduction described above, we obtain as an almost immediate consequence of Theorem 1.1 an FPTAS for the partition function of the anti-ferromagnetic Ising model on graphs of maximum degree at most d + 1 throughout the uniqueness region of the Gibbs measure on the d-ary tree.
Corollary 1.2 Let d ≥ 2. Consider an anti-ferromagnetic Ising model with parameters β and λ. For β and λ in the interior of the uniqueness region of the d-ary tree, there is a deterministic polynomial time approximation scheme for the partition function of the associated spin system on graphs of degree at most d + 1.
By the translation described in Appendix A, we can extend this result to general two-state anti-ferromagnetic spin systems. The difference is that the critical activity may now differ for vertices of different degrees. Let λ_c(β, d) be the critical activity for the anti-ferromagnetic Ising model described above (and defined formally in Section 2.2.1). Then, we have the following corollary.
Corollary 1.3 Consider an anti-ferromagnetic two-state spin system with parameters β, γ and λ. Let β' = √(βγ) be the edge potential for the equivalent anti-ferromagnetic Ising model. Let G be the class of graphs with maximum degree d + 1 in which every vertex v satisfies the condition
Then there is a deterministic polynomial time approximation scheme for the partition function of the associated spin system on graphs in the class G. In particular, the class G includes all (d + 1)-regular graphs when β, γ and λ are in the interior of the uniqueness region of the d-ary tree.
We briefly sketch the approach we use to prove our main technical result, namely that weak spatial mixing implies strong spatial mixing (Theorem 1.1). Inspired by recent work of Restrepo et al [9], we design a "message" (i.e., an invertible function of the probability of a vertex having spin + ) such that "disagreements" in the message decay by a constant factor at each vertex of the tree. The challenge is to ensure that such a message can be designed for all points in the uniqueness region of the d-ary tree. For the special zero-field case of the anti-ferromagnetic Ising model (when λ = 1), such a message is well known [13]. However, this message does not work up to the threshold for general vertex potentials λ. Restrepo et al [9] recently derived a message which works up to the tree uniqueness threshold for the hard-core model. For the general anti-ferromagnetic Ising model, such a message turns out to be more complex than those known for the zero-field case and for the hard-core model. Our message is defined at the beginning of Section 3, and the requisite decay property is established in Section 4.
We conjecture that our proof of strong spatial mixing based on stepwise decay of messages may lead to further consequences. For example, as shown by Restrepo et al [9], the message decay property can be used to extend Weitz's algorithm by exploiting the structure of special classes of graphs to obtain approximation algorithms beyond the tree threshold for those graphs. In addition, our proof demonstrates the versatility of the message approach.
Finally, we point out a byproduct of our approach that may be of independent interest. The exact value of the critical field λ_c(β, d) as a function of (β, d) is apparently widely accepted folklore, but the only derivation we could find for it in the literature [3, p. 255] does not provide a formal proof, appealing instead to numerical evidence. We point out that our proof of strong spatial mixing does not assume knowledge of λ_c(β, d), and in fact derives its value as a byproduct (see the Remark following Theorem 2.5 for more details). Thus our approach gives a proof of the location of the uniqueness threshold λ_c(β, d) of two-state anti-ferromagnetic spin systems.
Remark: After obtaining our message-decay proof, we received a sketch of Weitz's original unpublished proof [11]. It is interesting to note that that proof is quite different from ours, and employs a delicate two-step analysis of the tree recursion described in Section 2. For reasons mentioned above, we believe that our message-decay proof, in addition to being the first published version of this result, is potentially more robust and flexible than Weitz's approach; for example, it is not clear how to adapt Weitz's analysis to obtain stronger results for special classes of graphs such as lattices, as is done in [9].
Our work is mainly motivated by the deterministic counting algorithm of Weitz [12], which was the first to show an interesting connection between the running time of an algorithm not related to Markov chain Monte Carlo and the phase transition phenomenon for spin systems. On the complexity side, using a randomized gadget first proposed by Dyer, Frieze and Jerrum [1] and analysed further by Mossel, Weitz and Wormald [8], Sly [10] proved that if there is an FPRAS for the partition function of the hard-core model on graphs of degree at most d in the non-uniqueness region of the d-regular tree, then NP = RP, thus showing that the range of validity of Weitz's algorithm is optimal. Technically Sly's result holds only sufficiently close to the boundary of the uniqueness region; this restriction was mostly removed in a recent paper of Galanis et al [2]. For the case of unbounded degree graphs, Goldberg, Jerrum and Paterson [5] showed that approximating the partition function for the zero-field case (λ = 1) is NP-hard in the interior of the square 0 ≤ β, γ ≤ 1.
A related problem is to get exponential lower bounds on the mixing time of any local Markov chain (Glauber dynamics) that samples from the hard-core and anti-ferromagnetic Ising models. Mossel, Weitz and Wormald [8] and Gerschenfeld and Montanari [4] showed that beyond the uniqueness threshold for d-regular trees, Glauber dynamics for these models can take exponential time to mix on d-regular graphs. Gerschenfeld and Montanari also showed that for these models on random regular graphs, this threshold for slow mixing is also the threshold beyond which the reconstruction problem is solvable, and pointed out that these results therefore establish the existence of d-regular graphs on which the reconstruction problem is solvable beyond the uniqueness threshold for the d-regular tree [4].
On the algorithmic side, an analysis of Weitz's algorithm for the zero-field case of the anti-ferromagnetic Ising model appears in [13]. There has been some subsequent progress on the hard-core model on special classes of graphs too: recently, Restrepo et al [9] used a message-decay proof to get improved strong spatial mixing thresholds on the 2D integer lattice for the hard-core model. They achieved this by exploiting the special structure of self-avoiding walk trees obtained when Weitz's reduction is applied to the lattice. The message-decay proof turns out to be crucial in tightening the analysis to obtain strong spatial mixing over a wider range of parameters for these special trees.

Much more is known about algorithms for the ferromagnetic case: Jerrum and Sinclair [6] gave an FPRAS for the Ising model with arbitrary field on graphs of arbitrary degree, while Goldberg, Jerrum and Paterson [5] showed how to extend this to the whole of the ferromagnetic region βγ > 1 with λ = 1. The latter paper [5] also gave an FPRAS for the partition function on graphs of arbitrary degree for parts of the anti-ferromagnetic region βγ < 1. However, the results of [5] when restricted to bounded degree graphs do not hold throughout the uniqueness region and hence are incomparable to ours. In a recent paper, Li, Lu and Yin [7] improve upon the work of [5]. They consider two-state anti-ferromagnetic spin systems with zero field (λ = 1) on general graphs, and derive a condition under which an FPTAS exists for approximating the partition function. Their condition requires that (β, γ) lies in the intersection of the uniqueness regions for all possible degrees d. (Note that in the (β, γ) parameterization, this is a non-trivial region of the βγ-plane.) However, for any fixed degree d, this region is smaller than the uniqueness region for d, which is the range of validity obtained in our present paper.
We also point out an important qualitative difference between the parameterization we use (edge potential β and vertex field λ) and the parameterization via two edge potentials (β and γ) used by Li et al: while the (β, γ) parameterization is not monotonic, in the sense that uniqueness on a d-regular tree does not imply uniqueness on a (d - 1)-regular tree, the (β, λ) parameterization is monotonic. In fact, as our results show, uniqueness on the d-regular tree implies strong spatial mixing, and hence uniqueness, on all d'-regular trees for d' ≤ d in the (β, λ) parameterization.
We will mostly follow the notational conventions of [5]. Given a graph G = (V, E), a two-state spin configuration is defined as an assignment σ : V → {+, -} of spins to the vertices. Weights for different configurations are computed in terms of the (+, +) edge activity β, the (-, -) edge activity γ and a vertex activity λ, and are given by
$$w(\sigma) = \beta^{n_+(\sigma)}\,\gamma^{n_-(\sigma)}\,\lambda^{m(\sigma)}, \tag{4}$$
where, given the configuration σ, m(σ) denotes the number of vertices assigned spin -, and n_+(σ) (respectively, n_-(σ)) denotes the number of edges for which both endpoints are assigned spin + (respectively, -).
The partition function is defined as
$$Z = \sum_{\sigma \in \{+,-\}^V} w(\sigma).$$
We remark that this representation can be easily translated to the usual description in terms of edge potentials and vertex field: for completeness we give the translation in Appendix A.
Definition 2.1 (Occupation probability). Given a vertex v in the graph G, the occupation probability p v is the probability that v is assigned spin + in a random configuration σ sampled according to the weights defined in equation (4).
The Ising model corresponds to the case β = γ. The model is ferromagnetic when β > 1 and antiferromagnetic when β < 1 (the case β = 1 is trivial). The zero-field case corresponds to λ = 1, the positive field case to λ < 1 and the negative field case to λ > 1. As shown in Appendix A, on d-regular graphs the Ising model is equivalent to general two-state spin systems. Thus, in the rest of this paper, we will concentrate mostly on the Ising case. On non-regular graphs the equivalence still holds; however, the vertex activity λ in the Ising model may then be different on different vertices. The adaptation of our results to this setting is described in Corollary 1.3.
The anti-ferromagnetic Ising model exhibits a uniqueness phase transition on the d-ary tree for d ≥ 2. In particular, one can define a critical activity λ_c(β, d) as follows.

A consequence of uniqueness of the Gibbs measure is weak spatial mixing, which captures a weak notion of decay of point-to-set correlations. Let p_ρ(σ, S) be the probability of occupation of the root ρ of an infinite d-ary tree when the spins of a set S of nodes are fixed according to the configuration σ. Let δ(ρ, S) denote the distance of ρ from the set S.

Definition 2.3 (Weak spatial mixing). Given any two-state spin system, weak spatial mixing is said to hold if for any set S whose distance δ(ρ, S) from the root ρ of the tree is finite, and any two configurations σ_1 and σ_2, we have
$$|p_\rho(\sigma_1, S) - p_\rho(\sigma_2, S)| \le \exp(-\Omega(\delta(\rho, S))).$$
Notice that weak spatial mixing does not guarantee exponential decay of correlations when the set S contains vertices which are very close to the root ρ, even when σ 1 and σ 2 differ only on vertices which are very far away from ρ. A related but, as the name suggests, stronger notion is that of strong spatial mixing, which captures the idea that fixing vertices near the root to the same spin should not affect the exponential decay of point-to-set correlations. We note that strong spatial mixing is not in general implied by weak spatial mixing for arbitrary spin systems; see Appendix B for a counterexample involving the ferromagnetic Ising model with appropriate parameters.
Definition 2.4 (Strong spatial mixing). Given any two-state spin system, strong spatial mixing is said to hold if for any set S whose distance δ(ρ, S) from the root ρ of the tree is finite, and any two configurations σ_1 and σ_2 which differ only on a set T ⊆ S of vertices, we have
$$|p_\rho(\sigma_1, S) - p_\rho(\sigma_2, S)| \le \exp(-\Omega(\delta(\rho, T))).$$
It is well known (see, for example, [3]) that the uniqueness condition for two-state spin systems on d-ary trees can be written in terms of the number of fixed points of the recursion for occupation probabilities. Consider a subtree rooted at a vertex v in the d-ary tree, and let v_i, i = 1, 2, ..., d, be its children. Let p_v be the occupation probability at vertex v and define R_v = (1 - p_v)/p_v. One can then write the following recurrence for R_v:
This can easily be converted to a recurrence for occupation probabilities. Define
We can then write the recurrence as
We will find it useful in what follows to consider the tree recurrence with the special boundary condition in which all vertices at some distance l from the root are fixed to the same spin. In this case, by symmetry, the tree recurrence outputs the same value p v at all vertices v which are at the same distance from the root. Thus, the recurrence can be simplified to a one-parameter recurrence as follows:
Note that in the anti-ferromagnetic case, h is an increasing function, and hence F and f are decreasing in each of their arguments. We also note that since f is strictly decreasing in [0, 1], it has a unique fixed point.
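For concreteness, the recurrences discussed above can be written out directly from the definition R_v = (1 - p_v)/p_v and the weights in equation (4) with β = γ. The following is a sketch under that reading: the symbols h, F and f are matched to the text by assumption, and the particular normalization of h is chosen so that it agrees with the monotonicity just stated.

```latex
\begin{align*}
R_v &= \lambda \prod_{i=1}^{d} \frac{\beta R_{v_i} + 1}{R_{v_i} + \beta}, \\
h(x) &= \frac{\beta + (1-\beta)x}{1 - (1-\beta)x}, \qquad
h'(x) = \frac{1-\beta^2}{\bigl(1-(1-\beta)x\bigr)^2}, \\
F(p_1,\dots,p_d) &= \frac{1}{1 + \lambda \prod_{i=1}^{d} h(p_i)}, \\
f(x) &= F(x,\dots,x) = \frac{1}{1 + \lambda\, h(x)^d}.
\end{align*}
```

Substituting R_{v_i} = (1 - p_{v_i})/p_{v_i} into the first line turns each factor into h(p_{v_i}), and p_v = 1/(1 + R_v) then gives F. With this choice h'(x) > 0 whenever β < 1, so F and f are decreasing in each argument, and h(1/2) = 1.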
In terms of the recurrence function f , the condition for uniqueness can be stated as follows.
Theorem 2.5 ([3]) For given values of β and λ, the infinite d-ary tree has a unique Gibbs measure if and only if the two-step recurrence function f ∘ f has a unique fixed point. In particular, if the Gibbs measure is unique, and (β, λ) are not on the boundary of the uniqueness region, then the unique fixed point x* of f satisfies
$$f'(x^*) > -1. \tag{6}$$
Remark: In [3], it is claimed (implicitly) on the basis of numerical simulations that the condition (6) is also sufficient for uniqueness. To be precise, the expression for the critical activity λ_c(β, d) given in [3, p. 255] is exactly the same as that obtained by assuming that (6) is also a sufficient condition for uniqueness. While we believe this fact to be folklore, we have not been able to find a rigorous proof of it in the literature. With a slight abuse of terminology, we will henceforth refer to the set of (β, λ) for which the fixed point x* satisfies f'(x*) > -1 as the "uniqueness region". We will justify this terminology later (see the Remark following the proof of Theorem 1.1 in Section 4) by proving that condition (6) does indeed imply uniqueness. Thus we will obtain a rigorous proof of the expression for the critical activity appearing in [3].
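The criterion f'(x*) > -1 is easy to evaluate numerically. The sketch below assumes the one-parameter recurrence f(x) = 1/(1 + λ h(x)^d) with h(x) = (β + (1-β)x)/(1 - (1-β)x); the helper names `fixed_point`, `f_prime` and `in_uniqueness_region` are illustrative. Since f is strictly decreasing on [0, 1], the fixed point can be located by bisection.

```python
def h(x, beta):
    # Assumed per-child factor of the Ising tree recurrence (requires beta > 0).
    return (beta + (1.0 - beta) * x) / (1.0 - (1.0 - beta) * x)


def f(x, beta, lam, d):
    return 1.0 / (1.0 + lam * h(x, beta) ** d)


def f_prime(x, beta, lam, d):
    # Derivative of f via the chain rule: f' = -lam * d * h^(d-1) * h' * f^2.
    h_prime = (1.0 - beta ** 2) / (1.0 - (1.0 - beta) * x) ** 2
    return -lam * d * h(x, beta) ** (d - 1) * h_prime * f(x, beta, lam, d) ** 2


def fixed_point(beta, lam, d, tol=1e-14):
    # f is decreasing, so f(x) - x changes sign exactly once on [0, 1].
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid, beta, lam, d) > mid:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def in_uniqueness_region(beta, lam, d):
    """Check the condition f'(x*) > -1 at the unique fixed point x* of f."""
    x_star = fixed_point(beta, lam, d)
    return f_prime(x_star, beta, lam, d) > -1.0
```

In the zero-field case λ = 1 the fixed point is x* = 1/2 and f'(1/2) = -d(1-β)/(1+β), so the check reduces to β > (d-1)/(d+1); for d = 5 this places β = 0.7 inside the region and β = 0.6 outside it, while a sufficiently strong field restores the condition for β = 0.6.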
Definition 2.6 (Message). A message is a continuously differentiable function φ : [0, 1] → R with positive derivative.
Note that a message is strictly increasing and hence invertible on its range. Moreover, the inverse function φ^{-1} is also a continuously differentiable function with positive derivative. Given a recurrence function f : [0, 1] → R_+, and a message φ, we denote by f_φ the function φ ∘ f ∘ φ^{-1}. The function f_φ, which will play a crucial role in this paper, describes the evolution of the message φ under the recurrence, in the sense that f_φ(φ(x)) = φ(f(x)). We will also need the following fact.

Proof. Notice that since φ is strictly increasing, and f has a unique fixed point x*, f_φ also has a unique fixed point p* = φ(x*). Now, we notice that f_φ'(p*) = f'(x*), because
$$\begin{aligned}
f_\phi'(p^*) &= \phi'\bigl(f(\phi^{-1}(p^*))\bigr)\, f'\bigl(\phi^{-1}(p^*)\bigr)\, (\phi^{-1})'(p^*) \\
&= \phi'\bigl(f(x^*)\bigr)\, f'(x^*)\, \frac{1}{\phi'(x^*)} \\
&= f'(x^*),
\end{aligned}$$
where in the second line we used the facts that φ^{-1}(p*) = x* and (φ^{-1})'(y) = 1/φ'(φ^{-1}(y)), and in the third line the fact that f(x*) = x*. Thus, (β, λ) are in the uniqueness region (as defined in the Remark following Theorem 2.5) if and only if f_φ'(p*) = f'(x*) > -1.
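The invariance of the derivative at the fixed point under a change of message can be checked numerically. In the sketch below the message φ(x) = log(1 + x) is a hypothetical choice made purely for illustration (it is not the message constructed in this paper), and the recurrence f is the assumed form f(x) = 1/(1 + λ h(x)^d).

```python
import math


def h(x, beta):
    return (beta + (1.0 - beta) * x) / (1.0 - (1.0 - beta) * x)


def f(x, beta, lam, d):
    return 1.0 / (1.0 + lam * h(x, beta) ** d)


def f_prime(x, beta, lam, d):
    h_prime = (1.0 - beta ** 2) / (1.0 - (1.0 - beta) * x) ** 2
    return -lam * d * h(x, beta) ** (d - 1) * h_prime * f(x, beta, lam, d) ** 2


def fixed_point(beta, lam, d, tol=1e-14):
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid, beta, lam, d) > mid:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def phi(x):
    # Hypothetical message: continuously differentiable, phi'(x) = 1/(1+x) > 0.
    return math.log1p(x)


def phi_inv(y):
    return math.expm1(y)


def f_phi(y, beta, lam, d):
    # Conjugated recurrence phi o f o phi^{-1}.
    return phi(f(phi_inv(y), beta, lam, d))


def f_phi_prime(y, beta, lam, d, eps=1e-6):
    # Central finite difference, used only as a numerical check.
    return (f_phi(y + eps, beta, lam, d) - f_phi(y - eps, beta, lam, d)) / (2 * eps)


beta, lam, d = 0.7, 2.0, 3
x_star = fixed_point(beta, lam, d)
p_star = phi(x_star)
lhs = f_phi_prime(p_star, beta, lam, d)  # derivative of f_phi at p*
rhs = f_prime(x_star, beta, lam, d)      # derivative of f at x*
```

The two derivatives agree up to discretization error, which is why the choice of message only matters away from the fixed point.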
As indicated in the introduction, Weitz [12] proved the following combinatorial reduction.
Theorem 2.8 For any two-state spin system, strong spatial mixing on the d-ary tree implies that there exists a deterministic fully polynomial approximation scheme for the partition function of the spin system on graphs of degree at most d + 1.
In this section, we will prove the main technical ingredient of our result, which is expressed in the following theorem.
Theorem 3.1 Given d, β and λ, there exists a message φ and a constant c < 1, such that the tree recurrence g = f_φ for the quantity φ(p_v) satisfies ‖g'‖_∞ ≤ c < 1, whenever (β, λ) is in the uniqueness region for the d-ary tree.
The above theorem says that in the uniqueness region, the single-parameter recurrence g = f_φ for the message φ(p_v) contracts at every step. (Without the message, the function f itself is not contractive.) This stepwise contraction is easily seen by standard arguments to imply weak spatial mixing; for completeness, we give a proof in Appendix C.1. To extend the argument to strong spatial mixing, as required for Theorem 1.1, we need to consider a multi-parameter (vectorized) version of the message recurrence g, since under arbitrary boundary conditions the occupation probabilities need not be uniform. We will show in Section 4 that for our message φ in Theorem 3.1, the analysis of the vectorized version can in fact be reduced to an application of Theorem 3.1.

Remark: For ease of notation, in the rest of the paper we will prove our results in terms of the uniqueness threshold of the d-ary tree, relating it to algorithms on graphs of degree at most d + 1. As already noted, the uniqueness thresholds on the (d + 1)-regular tree and the d-ary tree coincide, and hence our results apply equally to the infinite (d + 1)-regular tree.
We begin by setting up some notation for the proof of Theorem 3.1. Notice that in the light of Fact 2.7 the main technical challenge is to come up with a message φ such that the quantity |f_φ'| is maximized at the unique fixed point of f_φ. Let us fix constants
Notice that D > 0, so φ is a continuously differentiable function with positive derivative on the interval [0, 1].
Using this message we are able to prove the following.
Lemma 3.2 Consider the anti-ferromagnetic Ising model on a d-ary tree with edge activity β and vertex activity λ. Then, defining g = f_φ, ψ = φ^{-1}, α = ψ(x) and η = f(α), we have
Proof of Corollary 1.2. As observed earlier, in order to obtain an FPTAS for the partition function of the associated spin system, it is sufficient to give an FPTAS for approximating the occupation probability p_ρ of a vertex ρ, under an arbitrary fixing of spin values for an arbitrary subset of vertices. Given a vertex ρ in a graph G of maximum degree d + 1, we start by constructing Weitz's self-avoiding walk (SAW) tree rooted at ρ. For non-leaf vertices (apart from ρ) in this tree which do not have d children, we can create dummy children (so as to make the arity of the vertex d) all of which independently have occupation probabilities of 1/2. It is easy to see that this does not change the output of the tree recurrence (equation (5)) at any vertex of the tree. As we saw in the proof of Theorem 1.1, we have strong spatial mixing on this SAW tree whenever (β, λ) are in the uniqueness region of the d-ary tree. The corollary now follows using Weitz's reduction (Theorem 2.8).
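The padding step in the proof above can be checked directly. The sketch below assumes the multi-parameter recurrence F(p_1, ..., p_k) = 1/(1 + λ ∏ h(p_i)) with h(x) = (β + (1-β)x)/(1 - (1-β)x); under this form h(1/2) = 1, so a dummy child with occupation probability 1/2 contributes a factor of 1 to the product. The function names are illustrative.

```python
from math import prod


def h(x, beta):
    # Assumed per-child factor of the Ising tree recurrence (requires beta > 0).
    return (beta + (1.0 - beta) * x) / (1.0 - (1.0 - beta) * x)


def F(children, beta, lam):
    """Occupation probability of a vertex given its children's probabilities."""
    return 1.0 / (1.0 + lam * prod(h(p, beta) for p in children))


beta, lam, d = 0.4, 1.7, 5
real_children = [0.2, 0.9]                                  # a vertex with only 2 children
padded = real_children + [0.5] * (d - len(real_children))   # pad up to arity d

p_original = F(real_children, beta, lam)
p_padded = F(padded, beta, lam)
```

Both calls return the same value up to floating-point rounding, which is what allows every internal vertex of the SAW tree to be treated as having exactly d children.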
Finally, we will see how to use Lemmas 3.2 and 4.2 to prove Corollary 1.3, which extends the FPTAS to general two-state anti-ferromagnetic spin systems.
Proof of Corollary 1.3. Given a two-state spin system with parameters β, γ and λ on a graph G of degree at most d + 1, we can use the translation given in Appendix A to come up with an equivalent Ising model with edge potential β' = √(βγ) and vertex-dependent potentials λ_v = λ(γ/β)^{d_v/2}. Now, as before, in order to estimate the occupation probability p_ρ for a given vertex ρ, we construct Weitz's self-avoiding walk (SAW) tree rooted at ρ, and complete the degree of any non-leaf vertex (apart from ρ) in the tree which does not have d children by attaching dummy children which are fixed to have occupation probability 1/2. We now use the message φ constructed above for d-ary trees for the parameter β'. By the hypotheses of the corollary, the parameters (β', λ_u) at each vertex u of the SAW tree are in the uniqueness region of the d-ary tree. Since the message φ does not depend upon λ_u, Theorem 3.1 and Lemma 4.2 apply at each vertex u of the tree. Thus, as in the proof of Theorem 1.1, we get contractive spatial mixing and, hence, strong spatial mixing on the SAW tree. Employing Weitz's reduction (Theorem 2.8), we have the first part of the corollary.
The claim that the class G in the corollary includes (d + 1)-regular graphs when β, γ and λ are in the uniqueness region of the d-ary tree follows by noticing that in this case the parameters $\lambda' = \lambda'_v$ obtained by the translation are the same at each vertex v, and that $\beta'$ and $\lambda'$ are in the uniqueness region of the d-ary tree by the hypotheses of the corollary. Thus, we can complete the proof for this case in the same manner as in the proof of Corollary 1.2.
A general two-state spin system can be represented in terms of the Ising model (this translation can also be found, e.g., in [5]). Consider a general two-state spin system with parameters β, γ > 0 and λ. Then the equivalent Ising model has edge activity $\beta' = \sqrt{\beta\gamma}$, and a degree-dependent vertex activity given by
$$\lambda'_v = \lambda \left(\frac{\gamma}{\beta}\right)^{d_v/2},$$
where $d_v$ denotes the degree of vertex v. Now, denote the weight of a configuration σ in the Ising model just defined by $w'(\sigma)$ and its partition function by $Z'$. Then one calculates straightforwardly that
$$w(\sigma) = \left(\frac{\beta}{\gamma}\right)^{|E|/2} w'(\sigma),$$
and hence
$$Z = \left(\frac{\beta}{\gamma}\right)^{|E|/2} Z'.$$
Thus we have translated the original spin system with parameters (β, γ, λ) into an Ising model with locally changing field. Note that on regular graphs the resulting field is in fact constant at all vertices. Furthermore, the Ising model is anti-ferromagnetic if and only if βγ < 1. This justifies our use of the term "anti-ferromagnetic" for general spin systems based on the value of βγ. We also observe that in the special case of d-regular trees, this implies that weak (strong) spatial mixing in the original spin system (β, γ, λ) is equivalent to weak (strong) spatial mixing in the Ising model given by the translation. A little thought shows that since all vertices except the root have the same degree in the d-ary tree, the last observation holds also for d-ary trees.
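The translation can be sanity-checked by brute force on a small graph. The sketch below assumes the weight $w(\sigma) = \beta^{n_+(\sigma)} \gamma^{n_-(\sigma)} \lambda^{m(\sigma)}$ with $m(\sigma)$ counting vertices of spin −, and the Ising parameters $\beta' = \sqrt{\beta\gamma}$ and $\lambda'_v = \lambda(\gamma/\beta)^{d_v/2}$; under these assumptions the two weights should differ by the configuration-independent factor $(\beta/\gamma)^{|E|/2}$. The graph, the parameter values and the helper name `weights` are hypothetical.

```python
from itertools import product
from math import isclose, sqrt

def weights(edges, n, beta, gamma, lam, sigma):
    """Return (w, w_ising) for one configuration sigma in {+1, -1}^n."""
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    n_plus = sum(1 for u, v in edges if sigma[u] == sigma[v] == +1)
    n_minus = sum(1 for u, v in edges if sigma[u] == sigma[v] == -1)
    minus = [v for v in range(n) if sigma[v] == -1]

    # Weight in the original two-state system (assumed form).
    w = beta ** n_plus * gamma ** n_minus * lam ** len(minus)

    # Weight in the translated Ising model with degree-dependent field.
    beta_p = sqrt(beta * gamma)
    w_ising = beta_p ** (n_plus + n_minus)
    for v in minus:
        w_ising *= lam * (gamma / beta) ** (deg[v] / 2)
    return w, w_ising

# A small irregular graph: a triangle with a pendant vertex.
edges = [(0, 1), (1, 2), (2, 0), (2, 3)]
n = 4
beta, gamma, lam = 0.5, 0.8, 1.7          # illustrative values with beta*gamma < 1
factor = (beta / gamma) ** (len(edges) / 2)

Z = Z_ising = 0.0
all_match = True
for sigma in product([+1, -1], repeat=n):
    w, w_i = weights(edges, n, beta, gamma, lam, sigma)
    all_match = all_match and isclose(w, factor * w_i)
    Z += w
    Z_ising += w_i

print(all_match, Z, factor * Z_ising)
```

Because the factor does not depend on σ, the two models induce the same Gibbs distribution, which is the sense in which they are equivalent.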
We construct a counterexample as follows: given a degree d ≥ 3, consider the infinite rooted d-ary tree, with the fixed boundary condition where each vertex in the tree has one of its children fixed to +. Notice that if the original parameters are β ≥ 1 and λ, the effect of this fixed boundary condition can be simulated by changing the vertex field to λ/β. Therefore, strong spatial mixing on this subgraph of the d-ary tree with parameters (β, λ) holds only if weak spatial mixing holds on the (d − 1)-ary tree with parameters (β, λ/β). It is therefore sufficient to choose β and λ satisfying both the conditions $\log \lambda > \log \lambda_c(\beta, d)$, and
in order to construct a counterexample. To see that such a choice of parameters is possible, we consider the exact form of $\lambda_c(\beta, d)$. We note that $P(d)$ is an increasing function of d. Thus, the required conditions (11) and (12) become $\log \lambda > \log \beta$,
We show in this section that a stepwise contraction in the recurrence for φ(p v ) implies weak spatial mixing.
As before, we denote $f^{\varphi}$ by g, and assume that for any x, y in the range of φ, it holds that
$$|g(x) - g(y)| \le c\,|x - y| \qquad (16)$$
for some c < 1. To show that this implies weak spatial mixing, we consider boundary conditions $\sigma_1$ and $\sigma_2$ on a set S whose distance from the root ρ is l. Using the monotonicity of the tree recurrence F (defined in equation (5)) in all its arguments, it can be verified that $|\varphi(p_\rho(\sigma_1, S)) - \varphi(p_\rho(\sigma_2, S))|$ is maximized when S is the set of all leaves at distance l from ρ, $\sigma_1$ assigns all vertices in S to +, and $\sigma_2$ assigns all vertices in S to −. With this definition of $\sigma_1$ and $\sigma_2$, we notice that the tree recurrence for $\varphi(p_v)$ outputs the same value for all vertices v at the same distance from the root. For a vertex v at distance l − i from the root ρ, we denote by $q_{i,j}$ the quantity $\varphi(p_v(\sigma_j, S))$, for j ∈ {1, 2}. Since $q_{i+1,j} = g(q_{i,j})$, condition (16) gives
$$|q_{i+1,1} - q_{i+1,2}| \le c\,|q_{i,1} - q_{i,2}|.$$
Since both φ and $\varphi^{-1}$ are continuously differentiable functions defined over compact sets, they are Lipschitz continuous, say with parameters $L_1$ and $L_2$ respectively. We therefore have weak spatial mixing, since
$$|p_\rho(\sigma_1, S) - p_\rho(\sigma_2, S)| \le L_2\,|q_{l,1} - q_{l,2}| \le L_2\,c^l\,|q_{0,1} - q_{0,2}| \le L_1 L_2\,c^l.$$
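The exponential decay under the two extremal boundary conditions can be observed numerically. The sketch below is a hedged illustration, not the paper's computation: it iterates the assumed symmetric Ising recurrence $R \mapsto \lambda\,((\beta R + 1)/(R + \beta))^d$ on ratios $R = p/(1-p)$ directly, without any message φ, starting from leaves fixed to all + ($R = 0$) and all − ($R = \infty$). The function names and the parameters β = 0.9, λ = 1, d = 2 are hypothetical, chosen close to β = 1 so that the map is visibly contractive.

```python
def step(beta, lam, d, r):
    """One level of the assumed d-ary Ising recurrence on ratios R = p / (1 - p)."""
    return lam * ((beta * r + 1.0) / (r + beta)) ** d

def boundary_gap(beta, lam, d, depth):
    """|p_root(all +) - p_root(all -)| with the boundary at distance `depth` >= 1."""
    # Parents of the boundary: the limits of `step` at R = 0 and R = infinity.
    r_plus = lam * (1.0 / beta) ** d   # all leaves fixed to +
    r_minus = lam * beta ** d          # all leaves fixed to -
    for _ in range(depth - 1):
        r_plus = step(beta, lam, d, r_plus)
        r_minus = step(beta, lam, d, r_minus)
    to_prob = lambda r: r / (1.0 + r)
    return abs(to_prob(r_plus) - to_prob(r_minus))

# Gap at the root as the boundary recedes from distance 1 to distance 10.
gaps = [boundary_gap(0.9, 1.0, 2, l) for l in range(1, 11)]
for l, gap in enumerate(gaps, start=1):
    print(l, gap)
```

For these values each additional level shrinks the gap by roughly a constant factor, which is the behaviour the bound $L_1 L_2 c^l$ describes.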
In this section, we show that contractive spatial mixing, as defined in Definition 4.1, implies strong spatial mixing. We again consider boundary conditions $\sigma_1$ and $\sigma_2$ on a set S which differ only on a subset T which is at distance l from the root ρ. Again, since both φ and $\varphi^{-1}$ are continuously differentiable functions defined over compact sets, they are Lipschitz continuous, say with parameters $L_1$ and $L_2$ respectively. We define the quantity $q_i$ as
$$q_i := \max_{v \,:\, \delta(\rho, v) = l - i} |\varphi(p_v(\sigma_1, S)) - \varphi(p_v(\sigma_2, S))|.$$
Notice that $q_0 \le |\varphi(1) - \varphi(0)| \le L_1$. Also, since G is the tree recurrence for $\varphi(p_v)$, contractive spatial mixing for G implies $q_{i+1} \le c\,q_i$ for c < 1. Thus, we get strong spatial mixing since
$$|p_\rho(\sigma_1, S) - p_\rho(\sigma_2, S)| \le L_2\,q_l \le L_2\,c^l\,q_0 \le L_1 L_2\,c^l.$$
In this section, we prove Lemma 3.2. The proof involves a few somewhat lengthy derivative computations, which we isolate in the following lemma.
Lemma D.1 With the notation used in Lemma 3.2 above, we have
Proof (sketch). Most of these identities are easily verified by direct computation. In proving equation (17), one needs to keep in mind the definition of the constant D.
Proof of Lemma 3.2. To ease notation, we will suppress the dependence of the quantities η and α on x.
Using the chain rule, we have
Here, we used the fact that since $\psi = \varphi^{-1}$, we have $\psi'(x) = 1/\varphi'(\psi(x))$. After taking the logarithm, and noticing that the right-hand side is more easily expressed as a function of α rather than of x, one can write the second derivative of g as $\frac{1}{\psi'(x)}$
We now consider each of the terms involved above. Recalling that η = f(α), and using equations (20) and (21) to expand the first and last terms in equation (22) above, we get
where $T_1$ and $T_2$ are defined as
Notice that all terms containing η are isolated in $T_2$. We now consider each of the terms separately. For $T_1$, we have
Here, we used equations (19) and (18) in the first line. Now using equation (17), we have
Here, we use $A = d(1 - \beta^2) + (1 - \beta)^2$, followed by equation (18) in the last line.
We now consider $T_2$. Again using equation (17), we have
Notice that modulo the $h'(\alpha)/h(\alpha)$ factor, $T_1$ and $T_2$ have the same functional form as functions of α and η respectively. In fact, the message φ is designed so as to make this possible. We can now substitute these values into equation (23) to get
where in the last step we used equation (18).
To be precise, this condition does not hold on the boundary of the uniqueness region, that is, for $|\log \lambda| = \log \lambda_c(\beta, d)$; at this critical value, the left-hand side of equation (3) still decays to 0 with l, but not at an exponential rate. We will focus on the interior of this region, and by a slight abuse of terminology refer to it as the "uniqueness region".
As stated in the introduction, we exclude the boundary of the uniqueness region here.