Sequential Monte Carlo samplers: error bounds and insensitivity to initial conditions
This paper addresses finite sample stability properties of sequential Monte Carlo methods for approximating sequences of probability distributions. The results presented herein are applicable in the scenario where the start and end distributions in t…
Authors: Nick Whiteley
Sequential Monte Carlo samplers: error bounds and insensitivity to initial conditions

Nick Whiteley∗

Abstract

This paper addresses finite sample stability properties of sequential Monte Carlo methods for approximating sequences of probability distributions. The results presented herein are applicable in the scenario where the start and end distributions in the sequence are fixed and the number of intermediate steps is a parameter of the algorithm. Under assumptions which hold on non-compact spaces, it is shown that the effect of the initial distribution decays exponentially fast in the number of intermediate steps and the corresponding stochastic error is stable in $L_p$ norm.

Keywords: non-compact spaces; unbounded functions; sequential Monte Carlo

AMS Subject Classification: 82C80; 60J05

1 Introduction

Sequential Monte Carlo (SMC) methods are a class of stochastic algorithms for approximating sequences of probability measures using a population of $N$ particles. They have been adopted in a variety of application domains, including rare event analysis [8], statistical physics [27], optimal filtering [18, 2] and computational statistics [3, 11]. Various theoretical properties of SMC methods have been studied, and in various contexts, see amongst others [4, 23, 13, 28], and the seminal work of Del Moral [6]. Existing stability results for SMC methods rely on very strong assumptions which typically do not hold on non-compact spaces. This article is concerned with establishing stability properties for a class of SMC algorithms primarily motivated by those of Del Moral et al. [11], under assumptions which do hold on non-compact spaces. The following example indicates a scenario of interest.
∗ School of Mathematics, University of Bristol, University Walk, Bristol, BS8 1TW, UK

1.1 A motivating example

Let $\pi$ be a distribution on some space $\mathsf{X}$, admitting a strictly positive and bounded density with respect to some dominating measure. For ease of presentation, let $\pi$ also denote this density and let $\bar\pi$ denote the corresponding unnormalised density, i.e. for some $Z > 0$, $\pi(x) = \bar\pi(x)/Z$. For some $\underline{\gamma} \in (0,1)$, let $\gamma : [0,1] \to [\underline{\gamma}, 1]$ be a non-decreasing Lipschitz function with $\gamma(0) = \underline{\gamma}$ and $\gamma(1) = 1$. For $n \in \mathbb{N}$ let $\{G_{n,k};\ 0 \le k < n\}$ be the collection of potential functions on $\mathsf{X}$ defined by $G_{n,k}(x) = \bar\pi^{\gamma((k+1)/n)}(x) / \bar\pi^{\gamma(k/n)}(x)$. Let $\{M_{n,k};\ 1 \le k \le n\}$ be a collection of ergodic Markov kernels also on $\mathsf{X}$, where $M_{n,k}$ admits as its invariant distribution the probability with density proportional to $\bar\pi^{\gamma(k/n)}(x)$ (denoted by $\pi_{\gamma(k/n)}$). For some initial distribution $\mu$, consider the sequence of probability distributions $\{\eta_{n,k};\ 1 \le k \le n\}$ defined, for a test function $f$, by

$$\eta_{n,k}(f) := \frac{\mathbb{E}_\mu\left[\prod_{j=0}^{k-1} G_{n,j}(X_{n,j})\, f(X_{n,k})\right]}{\mathbb{E}_\mu\left[\prod_{j=0}^{k-1} G_{n,j}(X_{n,j})\right]}, \tag{1}$$

where $\mathbb{E}_\mu$ denotes expectation with respect to the law of the inhomogeneous Markov chain $\{X_{n,k};\ 0 \le k \le n\}$, such that $X_{n,0} \sim \mu$ and $X_{n,k+1} \mid X_{n,k} = x_{n,k} \sim M_{n,k+1}(x_{n,k}, \cdot)$. The special interest in (1) is that when $\mu = \pi_{\underline{\gamma}}$, then $\eta_{n,n} = \pi$, for any $n \ge 1$. Furthermore, in applications $Z$ is unknown, the distributions of the form (1) cannot be computed exactly, and one aims to obtain an approximation of $\pi$ by fixing $n$ then approximating each of the $\{\eta_{n,k};\ 0 \le k \le n\}$ in turn, as follows. Let $N \in \mathbb{N}$ and let $\{\zeta_{n,k};\ 0 \le k \le n\}$ be an inhomogeneous Markov chain, with each $\zeta_{n,k} = \{\xi^i_{n,k};\ 1 \le i \le N\}$ an $N$-tuplet and with each $\xi^i_{n,k}$ valued in $\mathsf{X}$, with $\{\xi^i_{n,0};\ 1 \le i \le N\}$ independent and of common distribution $\mu$.
Given $\zeta_{n,k}$, the $\{\xi^i_{n,k+1};\ 1 \le i \le N\}$ are independent, with $\xi^i_{n,k+1}$ drawn from

$$\frac{\sum_{j=1}^N G_{n,k}\big(\xi^j_{n,k}\big)\, M_{n,k+1}\big(\xi^j_{n,k}, \cdot\big)}{\sum_{j=1}^N G_{n,k}\big(\xi^j_{n,k}\big)}.$$

The particle approximation measure at time step $k$ is $\eta^N_{n,k} := \frac{1}{N}\sum_{i=1}^N \delta_{\xi^i_{n,k}}$, and one takes $\eta^N_{n,n}(f) := \frac{1}{N}\sum_{i=1}^N f(\xi^i_{n,n})$ as an approximation of $\pi(f)$.

The expectation terms in (1) are in the shape of Feynman-Kac formulae, and adopting the terminology of Del Moral [6] throughout the following, a general collection of Markov kernels, potential functions, initial distribution and associated $\{\eta_{n,k}\}$ are referred to as constituting a Feynman-Kac (FK) model. The FK model described above has a notable structural characteristic: due to the fact that $\underline{\gamma}$ and $\pi$ are fixed and $\gamma(\cdot)$ is continuous, the potential functions each become flat as $n \to \infty$ (note that this is not an essential feature of the general FK models considered by Del Moral [6], nor is it the regime usually considered in the filtering scenario). One might then conjecture, due to this “flattening” property, that the sequence of measures $\{\eta_{n,k};\ 0 \le k \le n\}$ inherits ergodicity properties from the Markov kernels when $\mu \neq \pi_{\underline{\gamma}}$, and that $\eta_{n,n}(f) - \pi(f)$ goes to zero at some rate as $n \to \infty$. Perhaps more adventurously, one might further conjecture that this stability property is inherited by the corresponding particle system so that $\bar{\mathbb{E}}_\mu\left[\left|\eta^N_{n,n}(f) - \pi(f)\right|^p\right]^{1/p}$ is controlled uniformly in $n$, and diminishes at the usual $\sqrt{N}$ rate, where $\bar{\mathbb{E}}_\mu$ denotes expectation w.r.t. the law of the particle process initialised using $\mu$. The results presented in this paper allow it to be shown that this is indeed the case under assumptions which are realistic in the context of applications such as [11] on non-compact spaces.
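The selection and mutation recursion described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the algorithm of [11] verbatim: the target is taken to be a one-dimensional standard Gaussian, the schedule is the linear map $\gamma(t) = \underline{\gamma} + (1 - \underline{\gamma})t$, and each $M_{n,k+1}$ is taken to be a single Gaussian random walk Metropolis step. The names `smc_sampler`, `log_pi_bar`, `sample_mu`, `step` and `GAMMA_LOW` are hypothetical and do not come from the paper.

```python
import math
import random


def smc_sampler(log_pi_bar, sample_mu, gamma, n, N, step, rng):
    """Return N particles approximating eta_{n,n} (illustrative sketch)."""
    particles = [sample_mu(rng) for _ in range(N)]  # xi_{n,0}^i i.i.d. from mu
    for k in range(n):
        g_now, g_next = gamma(k / n), gamma((k + 1) / n)
        # log G_{n,k}(x) = (gamma((k+1)/n) - gamma(k/n)) * log pi_bar(x)
        log_g = [(g_next - g_now) * log_pi_bar(x) for x in particles]
        shift = max(log_g)  # subtract the max for numerical stability
        weights = [math.exp(v - shift) for v in log_g]
        # Selection: pick N ancestors with probability proportional to G_{n,k}.
        ancestors = rng.choices(particles, weights=weights, k=N)
        # Mutation: one random walk Metropolis step that leaves
        # pi_{gamma((k+1)/n)} invariant (an assumed choice of M_{n,k+1}).
        particles = []
        for x in ancestors:
            y = x + rng.gauss(0.0, step)
            log_ratio = g_next * (log_pi_bar(y) - log_pi_bar(x))
            accept = rng.random() < math.exp(min(0.0, log_ratio))
            particles.append(y if accept else x)
    return particles


GAMMA_LOW = 0.1  # assumed value for the lower end of the schedule


def gamma(t):
    return GAMMA_LOW + (1.0 - GAMMA_LOW) * t  # non-decreasing and Lipschitz


def log_pi_bar(x):
    return -0.5 * x * x  # unnormalised standard Gaussian log-density


def sample_mu(rng):
    # mu = pi_{gamma(0)}, i.e. N(0, 1 / GAMMA_LOW), the correctly specified start
    return rng.gauss(0.0, 1.0 / math.sqrt(GAMMA_LOW))


rng = random.Random(1)
final = smc_sampler(log_pi_bar, sample_mu, gamma, n=20, N=2000, step=1.0, rng=rng)
estimate = sum(x * x for x in final) / len(final)  # approximates pi(f), f(x) = x^2
print(estimate)
```

With the correctly specified start $\mu = \pi_{\underline{\gamma}}$ the exact value of $\pi(f)$ for $f(x) = x^2$ is 1, so the printed estimate should be close to 1 up to Monte Carlo error. Replacing `sample_mu` by another distribution gives a simple way to experiment with the effect of a mis-specified initial distribution as `n` grows.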
The contributions are to establish deterministic stability results for a broad class of FK models, of which the example above is one instance, and to provide $L_p$ bounds for the corresponding particle errors.

1.2 Summary of results

The present work is built upon generic assumptions about the FK model structure, which can be loosely summarized as follows:

• the Markov kernels $\{M_{n,k}\}$ are geometrically ergodic, with common Foster-Lyapunov drift function $V$ and associated constants, and with a common minorization condition
• the potential functions $\{G_{n,k}\}$ are uniformly bounded above, and are of the form $G_{n,k}(x) \propto \exp\left(-\frac{1}{n} U_{n,k}(x)\right)$, for $U_{n,k}$ positive and bounded uniformly over $n$ and $k$ in $V$-norm
• $f$ is bounded in $V$-norm

Stability properties of general FK semigroups are established in Theorem 1 (section 3). Note here that, in contrast to the above example, no specific form is assumed for $\{U_{n,k}\}$ or their relations to the invariant distributions of $\{M_{n,k}\}$. Theorem 2 (section 4) provides $L_p$ error bounds for the corresponding particle systems, under additional assumptions on the drift properties of the particle process. For the reader's convenience Theorem 3 is now summarized (for the precise statement see section 5), which is an application to the example sketched above. Let $\mathsf{X} = \mathbb{R}^d$. Then when $\pi$ has a sub-exponential density w.r.t. Lebesgue measure with asymptotically regular contours, and when each $M_{n,k}$ is a random walk Metropolis kernel of invariant distribution $\pi_{\gamma(k/n)}$, under a suitable trade-off between $p \ge 1$, $\alpha \in [0,1)$ and $\underline{\gamma}$, there exists $\rho < 1$ and constants $C_1$ and $C_2$ such that for any $f$ with

$$\|f\|_{V^\alpha} := \sup_x \frac{|f(x)|}{V^\alpha(x)} < +\infty,$$

any $N \ge 1$ and $n \ge 1$,

$$\bar{\mathbb{E}}_\mu\left[\left|\pi^N_n(f) - \pi(f)\right|^p\right]^{1/p} \le \|f\|_{V^\alpha}\left[\frac{C_1}{\sqrt{N}}\,\frac{1-\rho^n}{1-\rho} + \rho^n C_2\, \mathbb{I}\big[\mu \neq \pi_{\underline{\gamma}}\big]\right], \tag{2}$$

where $\pi^N_n := \eta^N_{n,n}$ is the particle approximation as in section 1. The first term on the r.h.s. of (2) corresponds to the stochastic errors from the population of size $N$ interacting and mutating over $n$ time steps. The second term on the r.h.s. is a deterministic bias which arises if the initial distribution is mis-specified, and this bias decays exponentially quickly in $n$.

1.3 Existing work

Despite the fact that the focus here is on models with the “flattening” property, existing SMC stability results cannot be transferred directly to the present scenario under realistic assumptions. They are reviewed below for completeness. Del Moral [6, Theorem 7.4.4] proved time-uniform $L_p$ error bounds, under assumptions which in the present scenario would take the form:

• there exist $\epsilon_M > 0$, $m \ge 0$ and $\epsilon_G > 0$ such that for all $n$, $k$ and $x, y \in \mathsf{X}$,

$$M_{n,k} \cdots M_{n,k+m}(x, \cdot) \ge \epsilon_M\, M_{n,k} \cdots M_{n,k+m}(y, \cdot), \tag{3}$$

$$G_{n,k}(x) \ge \epsilon_G\, G_{n,k}(y), \tag{4}$$

• $f$ is bounded.

Note that these results hold for fully general FK models, not necessarily having the “flattening” property (see also [10, 9, 7, 24, 23, 5, 12] for various results under the same type of assumption as one or both of (3)-(4)). However, these assumptions are very strong. Equation (3) is stronger than uniform ergodicity of the $m$-step kernels, and typically is not satisfied for the kernels of interest in [11], such as Metropolis-Hastings kernels on $\mathbb{R}^d$. For a toy example which highlights the issue, consider the case that $\mathsf{X} = \mathbb{R}$ and for some probability measure $\nu$ on $\mathsf{X}$ dominated by Lebesgue measure, take $M_{n,k}(x, \cdot) = a\delta_x(\cdot) + (1-a)\nu(\cdot)$. It is easy to check that (3) is violated. Similarly (4) is typically not satisfied in the applications of interest on non-compact spaces and the assumption that $f$ is bounded is then also rather restrictive.
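The check for the toy example can be written out as follows. This is a short worked calculation under the assumption $a \in (0,1)$, which the text above does not state explicitly, and writing $M$ for the common kernel $M_{n,k}$ and $\epsilon_M$ for the constant in (3). Since $\nu M = \nu$, induction on $j$ gives

$$M^{j}(x, \cdot) = a^{j}\,\delta_x(\cdot) + \big(1 - a^{j}\big)\,\nu(\cdot), \qquad j \ge 1.$$

Because $\nu$ is dominated by Lebesgue measure it has no atoms, so for $x \neq y$ and $A = \{y\}$,

$$M^{m+1}(x, \{y\}) = 0, \qquad M^{m+1}(y, \{y\}) = a^{m+1} > 0.$$

Condition (3) would therefore require $0 \ge \epsilon_M\, a^{m+1}$, which fails for every $\epsilon_M > 0$ and every $m \ge 0$.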
Oudjane and Rubenthaler [25] and Heine and Crisan [19] used truncation approaches to obtain stability results for particle filters in expectation over the observation process, without mixing assumptions, but they respectively introduce a rejection step into the particle algorithm (making its computational cost random) and use restrictive assumptions about the state-spaces involved, the hidden Markov model (HMM) and the particle mutation kernels which are not realistic in the scenario of interest. van Handel [28] proved uniform time-average consistency of particle filters under tightness assumptions for a class of bounded functions. Very recently, Del Moral et al. [12] have studied SMC methods in which the resampling step is applied adaptively over time.

The stability of SMC methods has also been studied in the asymptotic regime $N \to \infty$. Chopin [4] established a CLT for a broad class of SMC algorithms, and showed that under the same type of strong mixing assumptions as in (3)-(4), the asymptotic (in $N$) variance associated with the rescaled stochastic error can be bounded uniformly in $n$. Jasra and Doucet [21] focused on the asymptotic variance corresponding to the algorithms proposed by Del Moral et al. [11] for unbounded functions. They obtained a bound on the asymptotic variance under realistic geometric ergodicity assumptions which are the same as those considered in the present work. They do not consider the “flattening” regime in their assumptions, and their bounds on the asymptotic variance are not time uniform. Further details of the relationship between the approach of Jasra and Doucet [21] and some of the ideas in this paper are postponed until section 3.

It is relevant also to mention the recent results of Kleptsyna and Veretennikov [22], Douc et al. [15] on forgetting of initial conditions in HMM's (i.e. without particle approximation) assuming a local Doeblin condition on the Markov kernels: this also does not hold in the Metropolis-on-$\mathbb{R}^d$ scenario of interest and so these results cannot be transferred directly (the results of Douc et al. [17] do not require ergodicity of the kernel, but their assumptions do involve the same minorization/majorization structure). Douc et al. [16] proposed a generic approach to optimal filter stability without particle approximation using ideas of coupling inhomogeneous Markov chains and a similar approach is exploited here; further comment is delayed until the later sections.

The remainder of the paper is structured in the following manner. Section 2 specifies the general form of the FK models in question, associated semigroups and particle systems. Section 3 deals with the deterministic stability of the sequences of measures arising from the FK models. $L_p$ error bounds for the stochastic errors of the particle approximations are derived in Section 4. Section 5 applies the results to the case where $\mathsf{X} = \mathbb{R}^d$ and when the Markov kernels are of the random walk Metropolis variety. The appendix contains a discussion of drift properties of the particle system.

2 Definitions

Consider a state space $\mathsf{X}$ and an associated countably generated $\sigma$-algebra $\mathcal{B}(\mathsf{X})$. Let $\mathcal{P}(\mathsf{X})$ be the collection of probability measures on $(\mathsf{X}, \mathcal{B}(\mathsf{X}))$. For a measure $\mu$ on $(\mathsf{X}, \mathcal{B}(\mathsf{X}))$, an integral kernel $M : \mathsf{X} \times \mathcal{B}(\mathsf{X}) \to [0, \infty)$ and a function $f : \mathsf{X} \to \mathbb{R}$, define $\mu f := \int_{\mathsf{X}} f(x)\,\mu(dx)$, $Mf(x) := \int_{\mathsf{X}} M(x, dy) f(y)$ and $\mu M(\cdot) := \int_{\mathsf{X}} \mu(dx) M(x, \cdot)$. The function which assigns 1 to every point in $\mathsf{X}$ is also denoted by $1$, and the indicator function on a set $A$ is denoted by $\mathbb{I}_A$. Let $W : \mathsf{X} \to [1, \infty)$ and $f : \mathsf{X} \to \mathbb{R}$ be two measurable functions, then define the norm

$$\|f\|_W = \sup_{x \in \mathsf{X}} \frac{|f(x)|}{W(x)},$$

and let $\mathcal{L}_W = \{f : \|f\|_W < \infty\}$.
Let $\mu$ be a signed measure on $(\mathsf{X}, \mathcal{B}(\mathsf{X}))$, then define the norm $\|\mu\|_W = \sup_{|\phi| \le W} |\mu(\phi)|$.

2.1 Feynman-Kac models and associated semigroups

Let $\mu \in \mathcal{P}(\mathsf{X})$ and for each $n \in \mathbb{N}$ let $\{M_{n,k};\ 1 \le k \le n\}$ be a collection of Markov kernels, each kernel acting $\mathsf{X} \times \mathcal{B}(\mathsf{X}) \to [0,1]$. Let $\{G_{n,k};\ 0 \le k \le n-1\}$ be a collection of $\mathcal{B}(\mathsf{X})$-measurable, real-valued, strictly positive and bounded functions on $\mathsf{X}$. The notation employed below is directly inspired by that of Del Moral [6], with some important modifications, primarily to the indexing, reflecting the scenario of interest in which there is a different FK model for each $n$.

Next, for each $n \in \mathbb{N}$, let $\{Q_{n,k};\ 1 \le k \le n\}$ be the collection of integral kernels defined by

$$Q_{n,k}(x, dy) = G_{n,k-1}(x)\, M_{n,k}(x, dy).$$

For each $n$ and $0 \le k \le \ell \le n$, let $M_{n,k:\ell}$ and $Q_{n,k:\ell}$ be the semigroups associated with the Markov kernels $\{M_{n,k}\}$ and the kernels $\{Q_{n,k}\}$. These semigroups are defined by

$$M_{n,k:\ell} = M_{n,k+1} M_{n,k+2} \cdots M_{n,\ell}, \qquad Q_{n,k:\ell} = Q_{n,k+1} Q_{n,k+2} \cdots Q_{n,\ell}, \qquad k < \ell \le n, \tag{5}$$

and $M_{n,k:k} = Q_{n,k:k} = Id$, $0 \le k \le n$. Next define the collection of probability measures $\{\eta_{n,k};\ 0 \le k \le n\}$ by

$$\eta_{n,k}(A) = \frac{\mu Q_{n,0:k}(A)}{\mu Q_{n,0:k}(1)}, \qquad A \in \mathcal{B}(\mathsf{X}).$$

Let $\{\Psi_{n,k};\ 0 \le k < n\}$ and $\{\Phi_{n,k};\ 1 \le k \le n\}$ be the collections of mappings, each mapping acting from $\mathcal{P}(\mathsf{X})$ into $\mathcal{P}(\mathsf{X})$, defined for any $\eta \in \mathcal{P}(\mathsf{X})$ by

$$\Psi_{n,k}(\eta)(dx) = \frac{G_{n,k}(x)}{\eta(G_{n,k})}\,\eta(dx), \qquad \Phi_{n,k}(\eta) = \Psi_{n,k-1}(\eta)\, M_{n,k},$$

and for $0 \le k \le \ell \le n$ denote by $\Phi_{n,k:\ell}$ the semigroup associated with the mappings $\{\Phi_{n,k}\}$, defined by

$$\Phi_{n,k:\ell} = \Phi_{n,\ell} \circ \Phi_{n,\ell-1} \circ \cdots \circ \Phi_{n,k+1}, \qquad k < \ell \le n,$$

and with the convention $\Phi_{n,k:k} = Id$.
It is straightforward to check that under these definitions, for any $0 \le k \le \ell \le n$, $\eta \in \mathcal{P}(\mathsf{X})$ and $A \in \mathcal{B}(\mathsf{X})$,

$$\Phi_{n,k:\ell}(\eta)(A) = \frac{\eta Q_{n,k:\ell}(A)}{\eta Q_{n,k:\ell}(1)}, \tag{6}$$

and in particular, $\Phi_{n,k:\ell}(\eta_{n,k}) = \eta_{n,\ell}$. Lastly, let $\{S_{n,k};\ 1 \le k \le n\}$ be the collection of Markov kernels, each kernel acting $\mathsf{X} \times \mathcal{B}(\mathsf{X}) \to [0,1]$, defined by

$$S_{n,k}(x, A) = \frac{M_{n,k}\big(Q_{n,k:n}(1)\,\mathbb{I}_A\big)(x)}{M_{n,k}\big(Q_{n,k:n}(1)\big)(x)}.$$

Under these definitions it is straightforward to check that, in line with (6), we have the alternative description of the mapping $\Phi_{n,k:n}$ in terms of the Markov kernels $\{S_{n,k}\}$: for any $0 \le k \le n$ and any $\eta \in \mathcal{P}(\mathsf{X})$,

$$\Phi_{n,k:n}(\eta)(A) = \frac{\eta\big(Q_{n,k:n}(1)\, S_{n,k+1} \cdots S_{n,n}(A)\big)}{\eta\big(Q_{n,k:n}(1)\big)}. \tag{7}$$

Consider the following assumption.

(A1) There exists a finite constant $G_{\mathsf{X}}$ such that for all $n \in \mathbb{N}$ and $0 \le k \le n-1$,

$$G_{n,k}(x) \le G_{\mathsf{X}}, \qquad \forall x \in \mathsf{X}.$$

When assumption (A1) holds, for $0 \le k \le n-1$ define $\tilde{G}_{n,k} : \mathsf{X} \to (0,1]$ by

$$\tilde{G}_{n,k}(x) = \frac{G_{n,k}(x)}{G_{\mathsf{X}}},$$

and correspondingly,

$$\tilde{Q}_{n,k}(x, dy) = \tilde{G}_{n,k-1}(x)\, M_{n,k}(x, dy).$$

Also let $\tilde{Q}_{n,k:\ell}$ denote the semigroup associated with the kernels $\{\tilde{Q}_{n,k}\}$ in the same manner as (5), with the same convention $\tilde{Q}_{n,k:k} = Id$. Furthermore under (A1), let $U_{n,k} : \mathsf{X} \to [0, \infty)$ be defined by

$$U_{n,k}(x) = -n \log \tilde{G}_{n,k}(x).$$

As stated in section 1, the work presented here is primarily motivated by the models and algorithms considered in [11]. However, the “backward” kernel structure which they consider is not introduced here as it is not essential for our purposes. A specific example is given in section 5 and at that point comment on how this fits with the framework of Del Moral et al. [11] is provided.
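The identities (6) and (7) can be checked numerically on a finite state space, where kernels are row-stochastic matrices and measures and functions are vectors. The following is a minimal sketch under that assumption; the three-point state space, the particular matrices `M`, the potentials `G`, the measure `eta` and all function names are hypothetical choices made purely for illustration.

```python
def apply_kernel(K, f):
    """(K f)(x) = sum_y K(x, y) f(y)."""
    return [sum(row[y] * f[y] for y in range(len(f))) for row in K]


def push_measure(mu, K):
    """(mu K)(y) = sum_x mu(x) K(x, y)."""
    return [sum(mu[x] * K[x][y] for x in range(len(mu))) for y in range(len(K[0]))]


def normalise(mu):
    total = sum(mu)
    return [m / total for m in mu]


n = 3
# M[k] for 1 <= k <= n (row-stochastic); G[k] for 0 <= k <= n - 1 (strictly positive).
M = {
    1: [[0.5, 0.3, 0.2], [0.1, 0.6, 0.3], [0.2, 0.2, 0.6]],
    2: [[0.7, 0.2, 0.1], [0.3, 0.4, 0.3], [0.25, 0.25, 0.5]],
    3: [[0.4, 0.4, 0.2], [0.2, 0.5, 0.3], [0.1, 0.3, 0.6]],
}
G = {0: [1.0, 0.8, 0.5], 1: [0.9, 1.0, 0.7], 2: [0.6, 0.9, 1.0]}


def Q_fun(k, ell, f):
    """The function Q_{n,k:ell}(f); Q_{n,ell} is applied first, Q_{n,k+1} last."""
    h = list(f)
    for j in range(ell, k, -1):
        Mh = apply_kernel(M[j], h)
        h = [G[j - 1][x] * Mh[x] for x in range(len(h))]
    return h


def Q_meas(eta, k, ell):
    """The unnormalised measure eta Q_{n,k:ell}."""
    nu = list(eta)
    for j in range(k + 1, ell + 1):
        nu = push_measure([nu[x] * G[j - 1][x] for x in range(len(nu))], M[j])
    return nu


def Phi(eta, k, ell):
    """Phi_{n,k:ell}(eta), built by composing the one-step maps Phi_{n,j}."""
    nu = list(eta)
    for j in range(k + 1, ell + 1):
        weighted = normalise([nu[x] * G[j - 1][x] for x in range(len(nu))])
        nu = push_measure(weighted, M[j])
    return nu


def S_kernel(j):
    """S_{n,j}(x, y) = M_{n,j}(x, y) h(y) / (M_{n,j} h)(x), with h = Q_{n,j:n}(1)."""
    h = Q_fun(j, n, [1.0] * 3)
    Mh = apply_kernel(M[j], h)
    return [[M[j][x][y] * h[y] / Mh[x] for y in range(3)] for x in range(3)]


eta = [0.2, 0.5, 0.3]

# Identity (6): largest discrepancy over all 0 <= k <= ell <= n.
err6 = max(
    abs(a - b)
    for k in range(n + 1)
    for ell in range(k, n + 1)
    for a, b in zip(Phi(eta, k, ell), normalise(Q_meas(eta, k, ell)))
)

# Identity (7) with A = {a}: largest discrepancy over all k and a.
err7 = 0.0
for k in range(n + 1):
    h = Q_fun(k, n, [1.0] * 3)
    for a in range(3):
        g = [1.0 if y == a else 0.0 for y in range(3)]
        for j in range(n, k, -1):  # apply S_{n,n} first, S_{n,k+1} last
            g = apply_kernel(S_kernel(j), g)
        num = sum(eta[x] * h[x] * g[x] for x in range(3))
        den = sum(eta[x] * h[x] for x in range(3))
        err7 = max(err7, abs(Phi(eta, k, n)[a] - num / den))

print(err6, err7)
```

Both printed discrepancies should be at the level of floating-point rounding. The same helpers can be reused to confirm that each `S_kernel(j)` has rows summing to one, i.e. that the $S_{n,k}$ are indeed Markov kernels.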
2.2 Particle systems

An explicit construction of the probability space for the particle systems is not provided here, but this can be carried out by canonical methods, see for example [6, Chapter 3], and should be clear from the following symbolic description. Fix $N \in \mathbb{N}$, $n \in \mathbb{N}$, and for $0 \le k \le n$ let $\zeta^{(N)}_{n,k} := \{\xi^{(N,i)}_{n,k};\ 1 \le i \le N\}$, where each $\xi^{(N,i)}_{n,k}$ is valued in $\mathsf{X}$. Denote $\eta^N_{n,k} := \frac{1}{N}\sum_{i=1}^N \delta_{\xi^{(N,i)}_{n,k}}$. For $1 \le i \le N$ and $1 \le k \le n$ let $\mathcal{F}^{(N,i)}_{n,k} := \sigma\big(\zeta^{(N)}_{n,0}, \ldots, \zeta^{(N)}_{n,k-1}, \xi^{(N,1)}_{n,k}, \ldots, \xi^{(N,i)}_{n,k}\big)$ and $\mathcal{F}^{(N,i)}_{n,0} := \sigma\big(\xi^{(N,1)}_{n,0}, \ldots, \xi^{(N,i)}_{n,0}\big)$. The generations of the particle system $\{\zeta^{(N)}_{n,k};\ 0 \le k \le n\}$ form a non-homogeneous Markov chain: for $\mu \in \mathcal{P}(\mathsf{X})$, the law of this chain is denoted by $\mathbb{P}_\mu$ and has transitions given in integral form by:

$$\mathbb{P}_\mu\big(\zeta^{(N)}_{n,0} \in dx\big) = \prod_{i=1}^N \mu(dx^i), \qquad \mathbb{P}_\mu\big(\zeta^{(N)}_{n,k} \in dx \,\big|\, \zeta^{(N)}_{n,k-1}\big) = \prod_{i=1}^N \Phi_{n,k}\big(\eta^N_{n,k-1}\big)(dx^i), \quad 1 \le k \le n,$$

where $dx = d(x^1, \ldots, x^N)$. Denote by $\bar{\mathbb{E}}_\mu$ the expectation corresponding to $\mathbb{P}_\mu$. It is easy to check that for any $0 \le k \le n$ and $1 \le i \le N$,

$$\bar{\mathbb{E}}_\mu\big[\mathbb{I}_A\big(\xi^{(N,i)}_{n,k}\big)\big] = \bar{\mathbb{E}}_\mu\big[\eta^N_{n,k}(A)\big], \qquad A \in \mathcal{B}(\mathsf{X}). \tag{8}$$

3 Stability of the deterministic measures

This section is concerned with stability properties of the sequences of Markov kernels $\{S_{n,k}\}$ and operators $\{\Phi_{n,k}\}$. The approach is to identify non-homogeneous Foster-Lyapunov drift functions and minorization conditions which arise quite naturally from the structure of the FK model and then to employ the quantitative bounds of Douc et al. [14]. Compared to [21], in the present work the general structure of FK models is exploited more directly and in a way which is fruitful when the models satisfy assumption (A4) below. Douc et al. [16, Section 5] identified drift functions and coupling sets for related operators in some specific HMM's; the present work is concerned with a general FK model structure. The first main idea of this section is illustrated by the following assumption and lemma.

(A2) There exists $\lambda \in [0,1)$, a function $V : \mathsf{X} \to [1, \infty)$, $\varepsilon \in (0,1]$, $b \in (0, \infty)$, $C \in \mathcal{B}(\mathsf{X})$ and a probability measure $\nu \in \mathcal{P}(\mathsf{X})$, such that

$$\inf_{n \ge 1}\ \inf_{1 \le k \le n} M_{n,k}(x, A) \ge \varepsilon\, \nu(A), \qquad \forall A \in \mathcal{B}(\mathsf{X}),\ \forall x \in C, \tag{9}$$

$$\sup_{n \ge 1}\ \sup_{1 \le k \le n} M_{n,k}V(x) \le \lambda V(x) + b\,\mathbb{I}_C(x), \qquad \forall x \in \mathsf{X}. \tag{10}$$

Lemma 1. Assume (A1)-(A2). Then for each $n \in \mathbb{N}$ and $1 \le k \le n$,

$$S_{n,k}(x, \cdot) \ge \epsilon_{n,k}\, \nu_{n,k}(\cdot), \qquad \forall x \in C, \tag{11}$$

$$S_{n,k}V_{n,k}(x) \le \lambda V_{n,k-1}(x) + b_{n,k-1}\,\mathbb{I}_C(x), \qquad \forall x \in \mathsf{X}, \tag{12}$$

where

$$\epsilon_{n,k} := \varepsilon\, \nu\big(\tilde{Q}_{n,k:n}(1)\big), \qquad b_{n,k-1} := \frac{b}{\varepsilon\, \nu\big(\tilde{Q}_{n,k:n}(1)\big)}, \tag{13}$$

and $\nu_{n,k} \in \mathcal{P}(\mathsf{X})$ and $V_{n,k} : \mathsf{X} \to [1, \infty)$ are defined by

$$\nu_{n,k}(A) := \frac{\nu\big(\tilde{Q}_{n,k:n}(1)\,\mathbb{I}_A\big)}{\nu\big(\tilde{Q}_{n,k:n}(1)\big)}, \qquad A \in \mathcal{B}(\mathsf{X}), \tag{14}$$

$$V_{n,k}(x) := \frac{V(x)}{M_{n,k+1}\big(\tilde{Q}_{n,k+1:n}(1)\big)(x)}, \qquad x \in \mathsf{X},\ 0 \le k < n, \tag{15}$$

$V_{n,n} := V$, with $\lambda$, $V$, $b$, $\nu$, $\varepsilon$ and $C$ as in (A2).

Proof. Noting that $0 < \tilde{Q}_{n,k:n}(1)(x) \le 1$ for all $x \in \mathsf{X}$, we have

$$S_{n,k}(x, A) = \frac{M_{n,k}\big(\tilde{Q}_{n,k:n}(1)\,\mathbb{I}_A\big)(x)}{M_{n,k}\big(\tilde{Q}_{n,k:n}(1)\big)(x)} \ge M_{n,k}\big(\tilde{Q}_{n,k:n}(1)\,\mathbb{I}_A\big)(x) \ge \varepsilon\, \nu\big(\tilde{Q}_{n,k:n}(1)\big)\, \frac{\nu\big(\tilde{Q}_{n,k:n}(1)\,\mathbb{I}_A\big)}{\nu\big(\tilde{Q}_{n,k:n}(1)\big)}, \qquad x \in C,$$

and

$$S_{n,k}V_{n,k}(x) = \frac{M_{n,k}\big(\tilde{G}_{n,k} V\big)(x)}{M_{n,k}\big(\tilde{Q}_{n,k:n}(1)\big)(x)} \le \frac{\lambda V(x)}{M_{n,k}\big(\tilde{Q}_{n,k:n}(1)\big)(x)} + \frac{b\,\mathbb{I}_C(x)}{M_{n,k}\big(\tilde{Q}_{n,k:n}(1)\big)(x)} \le \lambda V_{n,k-1}(x) + \frac{b\,\mathbb{I}_C(x)}{\varepsilon\, \nu\big(\tilde{Q}_{n,k:n}(1)\big)}, \qquad \forall x \in \mathsf{X}, \tag{16}$$

recalling the convention $\tilde{Q}_{n,n:n}(1) = 1$.

The minorization and drift conditions (11)-(12) pave the way to establishing the stability properties of the operators $\{\Phi_{n,k}\}$. In order to move further we introduce assumption (A3) below, which is a stricter version of (A2).
The extra structure of (A3) allows the construction in Proposition 1 of bi-variate drift and minorization conditions. Consideration of the case in which the small set arises as a sub-level set of the drift function $V$ is a standard and generic approach to constructing bi-variate drift conditions from their uni-variate counterparts, see for example [1, 14]. Furthermore, assumption (A3) allows the level in question to be chosen in a very flexible way, and this property is exploited in Proposition 1 when dealing with the specific structure arising from the kernels $\{S_{n,k}\}$. Verification of (A3) in a particular application is provided in section 5. Assumption (A4) allows the bounding of non-homogeneous minorization and drift constants. A generic approach to verifying (A4) is presented at the end of this section, where the connection with the flattening property mentioned in the introduction is made more explicit.

(A3) There exists $d_0 \ge 1$, $\lambda \in [0,1)$, a function $V : \mathsf{X} \to [1, \infty)$, and for all $d \in [d_0, +\infty)$, there exist $\varepsilon_d \in (0,1]$, $b_d \in (0, \infty)$ and $\nu_d \in \mathcal{P}(\mathsf{X})$ such that $\nu_d(C_d) > 0$, $\nu_d(V) < +\infty$,

$$\inf_{n \ge 1}\ \inf_{1 \le k \le n} M_{n,k}(x, A) \ge \varepsilon_d\, \nu_d(A), \qquad \forall A \in \mathcal{B}(\mathsf{X}),\ \forall x \in C_d, \tag{17}$$

$$\sup_{n \ge 1}\ \sup_{1 \le k \le n} M_{n,k}V(x) \le \lambda V(x) + b_d\,\mathbb{I}_{C_d}(x), \qquad \forall x \in \mathsf{X}, \tag{18}$$

where $C_d := \{x : V(x) \le d\}$.

(A4) Whenever (A1) and (A3) hold, for any $\mu \in \mathcal{P}(\mathsf{X})$ with $\mu(V) < +\infty$, and $d \in [d_0, +\infty)$ there exists a positive and finite constant $K(\mu, \lambda, V, b_d)$ such that

$$\inf_{n \ge 1}\ \inf_{0 \le k \le n} \mu\big(\tilde{Q}_{n,k:n}(1)\big) \ge K(\mu, \lambda, V, b_d),$$

where $\lambda$, $V$ and $d_0$ are as in (A3).

The main result of this section is now presented.

Theorem 1. Assume (A1), (A3) and (A4). Then for $\alpha \in (0,1]$, there exists $\rho \in (\lambda, 1)$ and a finite constant $M$ such that for any $\mu, \mu' \in \mathcal{P}(\mathsf{X})$, any $n \in \mathbb{N}$ and any $1 \le k \le n$,

$$\big\|\mu S_{n,k} \cdots S_{n,n} - \mu' S_{n,k} \cdots S_{n,n}\big\|_{V^\alpha} \le M \rho^{\,n-k+1}\left[\mu\big(V^\alpha_{n,k-1}\big) + \mu'\big(V^\alpha_{n,k-1}\big)\right], \tag{19}$$

where $V_{n,k-1}$ is as given in equation (15), and consequently, for $0 \le k \le n$,

$$\big\|\Phi_{n,k:n}(\mu) - \Phi_{n,k:n}(\mu')\big\|_{V^\alpha} \le M \rho^{\,n-k}\left[\frac{\mu\big(\tilde{G}_{n,k} V^\alpha\big)}{\mu\big(\tilde{Q}_{n,k:n}(1)\big)} + \frac{\mu'\big(\tilde{G}_{n,k} V^\alpha\big)}{\mu'\big(\tilde{Q}_{n,k:n}(1)\big)}\right]. \tag{20}$$

The proof of Theorem 1 is postponed. It involves the bi-variate drift functions identified in the following proposition.

Proposition 1. Assume (A1) and (A3) and let $V$, $d_0$ and $\lambda$ be as in (A3). Then for all $d \ge d_0$, and all $\bar\lambda \in (\lambda, 1)$ there exists $\bar d \ge d$, and for each $n \in \mathbb{N}$, there exist

1) a collection of functions $\{\bar V_{n,k};\ 0 \le k \le n\}$, with each $\bar V_{n,k} : \mathsf{X} \times \mathsf{X} \to [1, \infty)$ and $\bar V_{n,n}(x, x') = \frac{1}{2}[V(x) + V(x')]$;
2) collections $\{\bar\epsilon_{n,k};\ 1 \le k \le n\}$ and $\{\bar b_{n,k};\ 0 \le k \le n-1\}$ depending on $d$ and $\bar d$, with each $\bar\epsilon_{n,k} \in (0,1]$ and each $\bar b_{n,k} \in (0, \infty)$;
3) a collection of probability measures $\{\bar\nu_{n,k};\ 1 \le k \le n\}$, depending on $\bar d$, with each $\bar\nu_{n,k} \in \mathcal{P}(\mathsf{X})$;

such that for $1 \le k \le n$,

$$S_{n,k}(x, \cdot) \wedge S_{n,k}(x', \cdot) \ge \bar\epsilon_{n,k}\, \bar\nu_{n,k}(\cdot), \qquad \forall (x, x') \in \bar C_{\bar d}, \tag{21}$$

$$S^*_{n,k} \bar V_{n,k}(x, x') \le \bar\lambda\, \bar V_{n,k-1}(x, x') + \bar b_{n,k-1}\,\mathbb{I}_{\bar C_{\bar d}}(x, x'), \qquad \forall (x, x') \in \mathsf{X} \times \mathsf{X}, \tag{22}$$

where $\bar C_{\bar d} := \{(x, x') : V(x) \le \bar d,\ V(x') \le \bar d\}$, $S^*_{n,k} : \mathsf{X} \times \mathsf{X} \times \mathcal{B}(\mathsf{X} \times \mathsf{X}) \to [0,1]$ is defined by

$$S^*_{n,k}\big((x, x'), d(y, y')\big) = \begin{cases} S_{n,k}(x, dy)\, S_{n,k}(x', dy'), & (x, x') \notin \bar C_{\bar d}, \\ \bar R_{n,k}\big((x, x'), d(y, y')\big), & (x, x') \in \bar C_{\bar d}, \end{cases}$$

and $\bar R_{n,k} : \mathsf{X} \times \mathsf{X} \times \mathcal{B}(\mathsf{X} \times \mathsf{X}) \to [0,1]$ is defined by

$$\bar R_{n,k}\big((x, x'), d(y, y')\big) = \frac{1}{(1 - \bar\epsilon_{n,k})^2}\big(S_{n,k}(x, dy) - \bar\epsilon_{n,k}\,\bar\nu_{n,k}(dy)\big)\big(S_{n,k}(x', dy') - \bar\epsilon_{n,k}\,\bar\nu_{n,k}(dy')\big).$$

Furthermore,

$$\inf_{n \ge 1}\ \inf_{1 \le k \le n} \bar\epsilon_{n,k} > 0, \qquad \text{and} \qquad \sup_{n \ge 1}\ \sup_{0 \le k \le n-1} \bar b_{n,k} < +\infty. \tag{23}$$

Proof. Throughout the proof, expressions featuring indices $n$ and $k$ hold for all $n \in \mathbb{N}$ and $1 \le k \le n$, unless stated otherwise.
Fix $d \ge d_0$ and $\bar\lambda \in (\lambda, 1)$. Then let $\varepsilon_d$, $b_d$ and $\nu_d$ be the corresponding constants and minorizing measure from (A3). Let $K(\nu_d, \lambda, V, b_d)$ be the constant of assumption (A4) corresponding to $\nu_d$ and $b_d$, and set

$$\bar d := \left[\frac{b_d}{\varepsilon_d\,(\bar\lambda - \lambda)\, K(\nu_d, \lambda, V, b_d)} - 1\right] \vee d.$$

Then for equation (21), under assumption (A3), there exist $\varepsilon_{\bar d}$ and $\nu_{\bar d}$ such that

$$S_{n,k}(x, A) \ge \varepsilon_{\bar d}\, \nu_{\bar d}\big(\tilde{Q}_{n,k:n}(1)\,\mathbb{I}_A\big) = \varepsilon_{\bar d}\, \nu_{\bar d}\big(\tilde{Q}_{n,k:n}(1)\big)\, \frac{\nu_{\bar d}\big(\tilde{Q}_{n,k:n}(1)\,\mathbb{I}_A\big)}{\nu_{\bar d}\big(\tilde{Q}_{n,k:n}(1)\big)},$$

for all $x \in C_{\bar d}$. Then setting

$$\bar\epsilon_{n,k} := \varepsilon_{\bar d}\, \nu_{\bar d}\big(\tilde{Q}_{n,k:n}(1)\big), \qquad \bar\nu_{n,k}(A) := \frac{\nu_{\bar d}\big(\tilde{Q}_{n,k:n}(1)\,\mathbb{I}_A\big)}{\nu_{\bar d}\big(\tilde{Q}_{n,k:n}(1)\big)}, \quad A \in \mathcal{B}(\mathsf{X}), \tag{24}$$

establishes (21). The first part of (23) is an immediate consequence of (A4) and (24):

$$\varepsilon_{\bar d}\, \nu_{\bar d}\big(\tilde{Q}_{n,k:n}(1)\big) \ge \varepsilon_{\bar d}\, K(\nu_{\bar d}, \lambda, V, b_{\bar d}). \tag{25}$$

Let $\{V_{n,k}\}$ be as defined in (15). Consider the collection of bi-variate drift functions $\{\bar V_{n,k};\ 0 \le k \le n\}$, with each $\bar V_{n,k} : \mathsf{X} \times \mathsf{X} \to [1, \infty)$ defined by

$$\bar V_{n,k}(x, x') := \frac{1}{2}\left[V_{n,k}(x) + V_{n,k}(x')\right]. \tag{26}$$

We now proceed to establish the bi-variate drift condition of equation (22). First, following the same arguments as in the proof of Lemma 1, under (A3),

$$S_{n,k}V_{n,k}(x) \le \lambda V_{n,k-1}(x) + \frac{b_d\,\mathbb{I}_{C_d}(x)}{\varepsilon_d\, \nu_d\big(\tilde{Q}_{n,k:n}(1)\big)}, \qquad \forall x \in \mathsf{X}. \tag{27}$$

From (27), for $(x, x') \notin \bar C_{\bar d}$ we have

$$\begin{aligned}
S^*_{n,k}\bar V_{n,k}(x, x') &\le \frac{\lambda}{2}\left[V_{n,k-1}(x) + V_{n,k-1}(x')\right] + \frac{b_d}{\varepsilon_d\, K(\nu_d, \lambda, V, b_d)}\,\frac{1}{2}\left[\mathbb{I}_{C_d}(x) + \mathbb{I}_{C_d}(x')\right] \\
&\le \frac{\lambda}{2}\left[V_{n,k-1}(x) + V_{n,k-1}(x')\right] + (\bar\lambda - \lambda)(\bar d + 1)\,\frac{1}{2}\left[\mathbb{I}_{C_{\bar d}}(x) + \mathbb{I}_{C_{\bar d}}(x')\right] \\
&\le \frac{\lambda}{2}\left[V_{n,k-1}(x) + V_{n,k-1}(x')\right] + (\bar\lambda - \lambda)\,\frac{1}{2}\left[V(x) + V(x')\right]\left[\mathbb{I}_{C_{\bar d}}(x) + \mathbb{I}_{C_{\bar d}}(x')\right] \\
&\le \frac{\lambda}{2}\left[V_{n,k-1}(x) + V_{n,k-1}(x')\right] + (\bar\lambda - \lambda)\,\frac{1}{2}\left[V_{n,k-1}(x) + V_{n,k-1}(x')\right]\left[\mathbb{I}_{C_{\bar d}}(x) + \mathbb{I}_{C_{\bar d}}(x')\right] \\
&\le \bar\lambda\, \bar V_{n,k-1}(x, x'),
\end{aligned}$$

where for the first inequality (A4) has been applied, the second inequality is due to the definition of $\bar d$ and the penultimate inequality is due to the definition of $\bar V_{n,k-1}$. For all $(x, x') \in \bar C_{\bar d}$,

$$\begin{aligned}
S^*_{n,k}\bar V_{n,k}(x, x') &= \bar R_{n,k}\bar V_{n,k}(x, x') \\
&= \frac{1}{2(1 - \bar\epsilon_{n,k})}\left[S_{n,k}V_{n,k}(x) + S_{n,k}V_{n,k}(x') - 2\bar\epsilon_{n,k}\,\bar\nu_{n,k}(V_{n,k})\right] \\
&\le \frac{\lambda}{2(1 - \bar\epsilon_{n,k})}\left[V_{n,k-1}(x) + V_{n,k-1}(x')\right] + \frac{b_d}{2(1 - \bar\epsilon_{n,k})\,\varepsilon_d\, \nu_d\big(\tilde{Q}_{n,k:n}(1)\big)}\left[\mathbb{I}_{C_d}(x) + \mathbb{I}_{C_d}(x')\right] \\
&\le \frac{\lambda \bar d}{1 - \bar\epsilon_{n,k}}\,\frac{1}{\inf_{x : V(x) \le \bar d} M_{n,k}\big(\tilde{Q}_{n,k:n}(1)\big)(x)} + \frac{b_d}{(1 - \bar\epsilon_{n,k})\,\varepsilon_d\, \nu_d\big(\tilde{Q}_{n,k:n}(1)\big)} =: \bar b_{n,k-1},
\end{aligned} \tag{28}$$

where equation (27) has been used. This concludes the proof of equation (22). Applying (A4) to the denominator terms in (28) and using (24)-(25) establishes the remaining part of equation (23).

Proof. (Theorem 1). Let $d_0$ and $\lambda$ be as in (A3). Set $d \ge d_0$, $\bar\lambda \in (\lambda, 1)$ and let $\bar d$, $\{\bar\epsilon_{n,k}\}$, $\{\bar b_{n,k}\}$, $\{\bar R_{n,k}\}$, and $\{\bar V_{n,k}\}$ be as in Proposition 1. The latter verifies conditions (NS1) and (NS2) of Douc et al. [14]. Consequently [14, Theorem 8] may be applied. For $\alpha = 1$, the uniform bounds in equation (23) of Proposition 1, combined with standard manipulations of the bounds of Douc et al. [14, Theorem 8] (details omitted for brevity) show that there exists a finite constant $M$ and $\rho < 1$ such that

$$\big\|\mu S_{n,k} \cdots S_{n,n} - \mu' S_{n,k} \cdots S_{n,n}\big\|_{V} \le M \rho^{\,n-k+1}\left[\mu(V_{n,k-1}) + \mu'(V_{n,k-1})\right]. \tag{29}$$

Noting that from equation (7),

$$\Phi_{n,k:n}(\mu)(A) = \frac{\mu\big(\tilde{G}_{n,k}\, M_{n,k+1}\big(\tilde{Q}_{n,k+1:n}(1)\big)\, S_{n,k+1} \cdots S_{n,n}(A)\big)}{\mu\big(\tilde{Q}_{n,k:n}(1)\big)},$$

equation (20) holds due to (29) and the definition of $V_{n,k-1}$ given in equation (15). For the case $\alpha \in (0,1)$, due to Jensen's inequality and the fact that for any two non-negative reals $a, b$ and $\alpha \in [0,1]$, $(a + b)^\alpha \le a^\alpha + b^\alpha$, we have that whenever equation (22) of Proposition 1 holds,

$$S^*_{n,k}\bar V^\alpha_{n,k}(x, x') \le \left[\bar\lambda\, \bar V_{n,k-1}(x, x') + \bar b_{n,k-1}\,\mathbb{I}_{\bar C_{\bar d}}(x, x')\right]^\alpha \le \bar\lambda^\alpha\, \bar V^\alpha_{n,k-1}(x, x') + \bar b^\alpha_{n,k-1}\,\mathbb{I}_{\bar C_{\bar d}}(x, x'), \qquad \forall (x, x') \in \mathsf{X} \times \mathsf{X},$$

and for $\bar V_{n,k-1}$ given in equation (26),

$$\bar V^\alpha_{n,k-1}(x, x') \le \frac{1}{2^\alpha}\left[V^\alpha_{n,k-1}(x) + V^\alpha_{n,k-1}(x')\right].$$

The arguments as for the case $\alpha = 1$ are then repeated, essentially replacing $\bar V_{n,k}$, $\bar\lambda$, $\bar b_{n,k-1}$ by $\bar V^\alpha_{n,k}$, $\bar\lambda^\alpha$ and $\bar b^\alpha_{n,k-1}$ respectively, in order to establish equation (19) and thus (20). The details are omitted for brevity.

3.1 Verifying assumption (A4)

The following lemma illustrates that (A4) can be verified under a generic condition on the decay in $x$ of the potential functions $\{G_{n,k}\}$ specified via $\{U_{n,k}\}$, relative to the drift function $V$ of assumption (A3).

Lemma 2. Assume (A1) and (A3). Let $V$, $d_0$ and $\lambda$ be as in (A3) and assume

$$\sup_{n \ge 1}\ \sup_{0 \le k \le n-1}\ \sup_{x \in \mathsf{X}} \frac{U_{n,k}(x)}{V(x)} < +\infty.$$

For any $d \ge d_0$ let $b_d$ be the corresponding constant of (A3). Then there exists a positive, finite constant $C$ depending only on $\lambda$ and $b_d$ such that for any $\mu \in \mathcal{P}(\mathsf{X})$ with $\mu(V) < +\infty$,

$$\inf_{n \ge 1}\ \inf_{0 \le k \le n} \mu\big(\tilde{Q}_{n,k:n}(1)\big) \ge \exp\left[-C \mu(V)\right]. \tag{30}$$

Proof. Firstly, $\mu \tilde{Q}_{n,n:n}(1) = \mu(1)$ and by Jensen's inequality,

$$\mu \tilde{Q}_{n,n-1:n}(1) = \mu\big(\tilde{G}_{n,n-1}\big) \ge \exp\left[-\frac{1}{n}\mu(U_{n,n-1})\right] \ge \exp\left[-\mu(V)\,\|U_{n,n-1}\|_V\right].$$
For $1 \le k < n-1$, by Jensen's inequality,

$$\begin{aligned}
\mu \tilde{Q}_{n,k:n}(1) &= \int_{\mathsf{X}^{n-k+1}} \exp\left(-\frac{1}{n}\sum_{\ell=k}^{n-1} U_{n,\ell}(x_\ell)\right) \mu(dx_k) \prod_{\ell=k+1}^{n} M_{n,\ell}(x_{\ell-1}, dx_\ell) \\
&\ge \exp\left[-\frac{1}{n}\int_{\mathsf{X}^{n-k+1}} \left(\sum_{\ell=k}^{n-1} U_{n,\ell}(x_\ell)\right) \mu(dx_k) \prod_{\ell=k+1}^{n} M_{n,\ell}(x_{\ell-1}, dx_\ell)\right] \\
&= \exp\left[-\frac{1}{n}\mu(U_{n,k}) - \frac{1}{n}\sum_{\ell=k+1}^{n-1} \int_{\mathsf{X}} U_{n,\ell}(x_\ell)\, \mu M_{n,k:\ell}(dx_\ell)\right].
\end{aligned} \tag{31}$$

Iteration of the drift inequality in (A3) shows that for any $1 \le k < \ell < n$,

$$\int_{\mathsf{X}} V(x_\ell)\, M_{n,k:\ell}(x_k, dx_\ell) \le \lambda^{\ell-k} V(x_k) + b_d \sum_{j=0}^{\ell-k-1} \lambda^j. \tag{32}$$

It follows from (32) that

$$\int_{\mathsf{X}} U_{n,\ell}(x_\ell)\, \mu M_{n,k:\ell}(dx_\ell) \le \|U_{n,\ell}\|_V \int_{\mathsf{X}} V(x_\ell)\, \mu M_{n,k:\ell}(dx_\ell) \le \|U_{n,\ell}\|_V \left[\mu(V) + \frac{b_d}{1-\lambda}\right],$$

which combined with (31) implies the desired result.

4 $L_p$ error bounds for the particle measures

Making use of the results of section 3, the following theorem presents an $L_p$ bound on the error $\left[\eta^N_{n,n} - \eta_{n,n}\right](f)$, for some possibly unbounded $f$. This theorem rests on assumptions about the moments of the mean particle drift, $\eta^N_{n,k}(V)$, and a related normalization quantity, which are used in the proof to bound the moments of martingale increments associated with the particle approximation. Discussion of (34) is given in the appendix and both assumptions are verified in the application of section 5.

Theorem 2. Assume (A1), (A3) and (A4). Let $V$ be as in (A3) and for $s > 0$ an independent parameter let $t := \frac{1+s}{s}$. Let $p \ge 1$ and $\alpha \in [0,1]$ be such that $\alpha t p \le 1$ and $(1+s)p \le 1$, and for $\mu \in \mathcal{P}(\mathsf{X})$ assume

$$\sup_{N \ge 1}\ \sup_{n \ge 1}\ \sup_{1 \le k \le n} \bar{\mathbb{E}}_\mu\left[\eta^N_{n,k}\big(\tilde{Q}_{n,k:n}(1)\big)^{-(1+s)p}\right] < +\infty, \tag{33}$$

$$\sup_{N \ge 1}\ \sup_{n \ge 1}\ \sup_{1 \le k \le n} \bar{\mathbb{E}}_\mu\left[\eta^N_{n,k}\big(V^{\alpha t p}\big)\right] < +\infty. \tag{34}$$

Then there exists $\rho < 1$ and a finite constant $C$ depending on $\alpha$, $\mu$, $V$, and the constants in (A1), (A3), and (A4) such that for any $f \in \mathcal{L}_{V^\alpha}$, $n \in \mathbb{N}$ and $N \in \mathbb{N}$,

$$\bar{\mathbb{E}}_\mu\left[\left|\left[\eta^N_{n,n} - \eta_{n,n}\right](f)\right|^p\right]^{1/p} \le C\, \|f\|_{V^\alpha}\, \frac{1-\rho^n}{1-\rho}\, \frac{1}{\sqrt{N}}. \tag{35}$$

Proof.
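Throughout the proof $C$ denotes a constant whose value may change on each appearance. Consider the telescoping decomposition

$$\left[\eta^N_{n,n} - \eta_{n,n}\right](f) = \sum_{k=0}^{n} \left[\Phi_{n,k:n}\big(\eta^N_{n,k}\big) - \Phi_{n,k:n}\big(\Phi_{n,k}\big(\eta^N_{n,k-1}\big)\big)\right](f), \tag{36}$$

with the convention $\Phi_{n,0}\big(\eta^N_{n,-1}\big) := \mu$. For any of the terms in the summation of equation (36), following the approach of Del Moral [6, page 245], we have

$$\begin{aligned}
&\left[\Phi_{n,k:n}\big(\eta^N_{n,k}\big) - \Phi_{n,k:n}\big(\Phi_{n,k}\big(\eta^N_{n,k-1}\big)\big)\right](f) \\
&\quad= \left[\frac{\eta^N_{n,k}\tilde{Q}_{n,k:n}}{\eta^N_{n,k}\tilde{Q}_{n,k:n}(1)} - \frac{\Phi_{n,k}\big(\eta^N_{n,k-1}\big)\tilde{Q}_{n,k:n}}{\Phi_{n,k}\big(\eta^N_{n,k-1}\big)\tilde{Q}_{n,k:n}(1)}\right](f) \\
&\quad= \frac{1}{\eta^N_{n,k}\tilde{Q}_{n,k:n}(1)}\left[\eta^N_{n,k}\tilde{Q}_{n,k:n} - \eta^N_{n,k}\tilde{Q}_{n,k:n}(1)\,\frac{\Phi_{n,k}\big(\eta^N_{n,k-1}\big)\tilde{Q}_{n,k:n}}{\Phi_{n,k}\big(\eta^N_{n,k-1}\big)\tilde{Q}_{n,k:n}(1)}\right](f) \\
&\quad= \frac{1}{\eta^N_{n,k}\tilde{Q}_{n,k:n}(1)}\left[\eta^N_{n,k} - \Phi_{n,k}\big(\eta^N_{n,k-1}\big)\right]\tilde{Q}^N_{n,k:n}(f),
\end{aligned} \tag{37}$$

where

$$\tilde{Q}^N_{n,k:n}(f)(x) := \tilde{Q}_{n,k:n}(f)(x) - \tilde{Q}_{n,k:n}(1)(x)\,\frac{\Phi_{n,k}\big(\eta^N_{n,k-1}\big)\tilde{Q}_{n,k:n}(f)}{\Phi_{n,k}\big(\eta^N_{n,k-1}\big)\tilde{Q}_{n,k:n}(1)}.$$

From equations (36) and (37), for $p \ge 1$,

$$\begin{aligned}
\bar{\mathbb{E}}_\mu\left[\left|\left[\eta^N_{n,n} - \eta_{n,n}\right](f)\right|^p\right]^{1/p}
&\le \sum_{k=0}^{n} \bar{\mathbb{E}}_\mu\left[\left|\frac{1}{\eta^N_{n,k}\tilde{Q}_{n,k:n}(1)}\left[\eta^N_{n,k} - \Phi_{n,k}\big(\eta^N_{n,k-1}\big)\right]\tilde{Q}^N_{n,k:n}(f)\right|^p\right]^{1/p} \\
&\le \sum_{k=0}^{n} \bar{\mathbb{E}}_\mu\left[\left|\frac{1}{\eta^N_{n,k}\tilde{Q}_{n,k:n}(1)}\right|^{(1+s)p}\right]^{1/[(1+s)p]} \bar{\mathbb{E}}_\mu\left[\left|\left[\eta^N_{n,k} - \Phi_{n,k}\big(\eta^N_{n,k-1}\big)\right]\tilde{Q}^N_{n,k:n}(f)\right|^{tp}\right]^{1/(tp)},
\end{aligned} \tag{38}$$

where Minkowski's and Hölder's inequalities have been applied. We next proceed to bound each of the factors in the summands of equation (38). Denoting

$$\left[\eta^N_{n,k} - \Phi_{n,k}\big(\eta^N_{n,k-1}\big)\right]\tilde{Q}^N_{n,k:n}(f) = \frac{1}{N}\sum_{i=1}^{N} T^{(i)}_{n,k},$$

where

$$T^{(i)}_{n,k} := \tilde{Q}^N_{n,k:n}(f)\big(\xi^{(N,i)}_{n,k}\big) - \Phi_{n,k}\big(\eta^N_{n,k-1}\big)\tilde{Q}^N_{n,k:n}(f),$$

(with the dependence of $T^{(i)}_{n,k}$ on $N$ suppressed) we have that for any $n \in \mathbb{N}$, $1 \le k \le n$ and $1 \le i \le N$,

$$\bar{\mathbb{E}}_\mu\left[T^{(i)}_{n,k} \,\middle|\, \mathcal{F}^{(N,i-1)}_{n,k}\right] = 0,$$

with the convention that $\mathcal{F}^{(N,0)}_{n,k} = \mathcal{F}^{(N,N)}_{n,k-1}$. Next, there exists a constant $C$ such that for any $p \ge 1$, writing $\xi := \xi^{(N,i)}_{n,k}$ and $\Phi := \Phi_{n,k}\big(\eta^N_{n,k-1}\big)$ within the following display only,

$$\begin{aligned}
\bar{\mathbb{E}}_\mu\left[\left|T^{(i)}_{n,k}\right|^p\right]^{1/p}
&= \bar{\mathbb{E}}_\mu\left[\left|\tilde{Q}_{n,k:n}(1)(\xi)\left[S_{n,k+1} \cdots S_{n,n}(f)(\xi) - \frac{\Phi\big(\tilde{Q}_{n,k:n}(1)\, S_{n,k+1} \cdots S_{n,n}(f)\big)}{\Phi\big(\tilde{Q}_{n,k:n}(1)\big)}\right]\right|^p\right]^{1/p} \\
&\le \rho^{\,n-k} M \|f\|_{V^\alpha}\, \bar{\mathbb{E}}_\mu\left[\left|\tilde{Q}_{n,k:n}(1)(\xi)\left[V^\alpha_{n,k}(\xi) + \frac{\Phi\big(\tilde{Q}_{n,k:n}(1)\, V^\alpha_{n,k}\big)}{\Phi\big(\tilde{Q}_{n,k:n}(1)\big)}\right]\right|^p\right]^{1/p} \\
&\le \rho^{\,n-k} M \|f\|_{V^\alpha}\, \bar{\mathbb{E}}_\mu\left[V^{\alpha p}(\xi)\right]^{1/p} + \rho^{\,n-k} M \|f\|_{V^\alpha}\, \bar{\mathbb{E}}_\mu\left[\left|\tilde{Q}_{n,k:n}(1)(\xi)\, \frac{\Phi\big(\tilde{Q}_{n,k:n}(1)\, V^\alpha_{n,k}\big)}{\Phi\big(\tilde{Q}_{n,k:n}(1)\big)}\right|^p\right]^{1/p} \\
&\le \rho^{\,n-k} M \|f\|_{V^\alpha}\, \bar{\mathbb{E}}_\mu\left[V^{\alpha p}(\xi)\right]^{1/p} + \rho^{\,n-k} M \|f\|_{V^\alpha}\, \bar{\mathbb{E}}_\mu\left[\frac{\Phi\big(\tilde{Q}_{n,k:n}(1)\, V^{\alpha p}_{n,k}\big)}{\Phi\big(\tilde{Q}_{n,k:n}(1)\big)}\, \bar{\mathbb{E}}_\mu\left[\tilde{Q}_{n,k:n}(1)(\xi) \,\middle|\, \mathcal{F}^{(N,i-1)}_{n,k}\right]\right]^{1/p} \\
&\le \rho^{\,n-k} M \|f\|_{V^\alpha}\left[\bar{\mathbb{E}}_\mu\left[\eta^N_{n,k}\big(V^{\alpha p}\big)\right]^{1/p} + \bar{\mathbb{E}}_\mu\left[\Phi\big(V^{\alpha p}\big)\right]^{1/p}\right] \\
&\le \rho^{\,n-k} C \|f\|_{V^\alpha},
\end{aligned} \tag{39}$$

where Theorem 1, followed by Minkowski's inequality, Jensen's inequality, the exchangeability property of equation (8), Jensen's inequality again and the assumption of equation (34) have been applied. Thus for fixed $N$, $\big\{\big(\sum_{j=1}^{i} T^{(j)}_{n,k},\ \mathcal{F}^{(N,i)}_{n,k}\big);\ 1 \le i \le N\big\}$ is a martingale sequence with increments bounded in $L_p$. It follows that when $tp \ge 2$, by the Burkholder-Davis inequality and Minkowski's inequality, there exists a constant $C$ such that

$$\bar{\mathbb{E}}_\mu\left[\left|\left[\eta^N_{n,k} - \Phi_{n,k}\big(\eta^N_{n,k-1}\big)\right]\tilde{Q}^N_{n,k:n}(f)\right|^{tp}\right]^{1/(tp)} \le C N^{-1}\, \bar{\mathbb{E}}_\mu\left[\left|\sum_{i=1}^{N}\big(T^{(i)}_{n,k}\big)^2\right|^{tp/2}\right]^{1/(tp)} \le C N^{-1}\left(\sum_{i=1}^{N} \bar{\mathbb{E}}_\mu\left[\left|T^{(i)}_{n,k}\right|^{tp}\right]^{2/(tp)}\right)^{1/2},$$

and when $1 < tp < 2$, using the fact that for any $a, b \ge 0$ and $0 \le r \le 1$, $(a + b)^r \le a^r + b^r$,

$$\bar{\mathbb{E}}_\mu\left[\left|\left[\eta^N_{n,k} - \Phi_{n,k}\big(\eta^N_{n,k-1}\big)\right]\tilde{Q}^N_{n,k:n}(f)\right|^{tp}\right]^{1/(tp)} \le C N^{-1}\, \bar{\mathbb{E}}_\mu\left[\left|\sum_{i=1}^{N}\big(T^{(i)}_{n,k}\big)^2\right|^{tp/2}\right]^{1/(tp)} \le C N^{-1}\left(\sum_{i=1}^{N} \bar{\mathbb{E}}_\mu\left[\left|T^{(i)}_{n,k}\right|^{tp}\right]\right)^{1/(tp)}.$$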
Combining with (39), we conclude that there exists a constant $C$ such that
\[
\bar{\mathbb{E}}_\mu\left[\left|\left[\eta^N_{n,k} - \Phi_{n,k}\left(\eta^N_{n,k-1}\right)\right]\widetilde{Q}^N_{n,k:n}(f)\right|^{tp}\right]^{1/(tp)} \le \rho^{n-k}\, C\, \|f\|_{V^\alpha}\, N^{-\{tp/2\, \wedge\, (tp-1)\}/(tp)},
\]
for all $n \ge 1$ and $1 \le k \le n$. The remaining terms in (38) are treated directly by the assumption of equation (33), and therefore upon returning to (38) we conclude that there exists a constant $C$ such that
\[
\bar{\mathbb{E}}_\mu\left[\left|\left[\eta^N_{n,n} - \eta_{n,n}\right](f)\right|^p\right]^{1/p} \le C\,\|f\|_{V^\alpha}\,\frac{1}{\sqrt{N}}\sum_{k=0}^{n}\rho^{n-k} < C\,\|f\|_{V^\alpha}\,\frac{1-\rho^n}{1-\rho}\,\frac{1}{\sqrt{N}},
\]
and the result holds.

5 Application

This section is concerned with the case in which $X = \mathbb{R}^d$ and $\mathcal{B}(\mathbb{R}^d)$ is the corresponding Borel $\sigma$-algebra. Throughout, we consider the following structural definitions and assumptions.

• Let $\pi \in \mathcal{P}(X)$ be a target distribution admitting a density with respect to Lebesgue measure, also denoted by $\pi$. In applications of interest, this density will only be known up to a multiplicative constant, $Z$; denote by $\bar{\pi}$ the unnormalised density, i.e. $\pi(x) = \bar{\pi}(x)/Z$, $x \in \mathbb{R}^d$.

• For $\underline{\gamma} \in (0,1]$ a constant, let $\gamma : [0,1] \to [\underline{\gamma}, 1]$ be a non-decreasing, Lipschitz function.

• Let $\left\{\pi_\gamma;\ \gamma \in [\underline{\gamma}, 1]\right\}$ be the family of probability measures defined by
\[
\pi_\gamma(A) := \frac{\int_A \bar{\pi}^{\gamma}(x)\,dx}{\int_X \bar{\pi}^{\gamma}(x)\,dx}, \quad A \in \mathcal{B}(X).
\]

• Let $q \in \mathcal{P}(X)$ be an increment distribution admitting a density with respect to Lebesgue measure, also denoted by $q$. For each $n \ge 1$ and $1 \le k \le n$, let $M_{n,k}$ be a random walk Metropolis (RWM) kernel of invariant distribution $\pi_{\gamma(k/n)}$ and proposal kernel $q$, i.e.
\[
M_{n,k}(x,A) = \int_{A-x}\left[1 \wedge \frac{\bar{\pi}^{\gamma(k/n)}(x+y)}{\bar{\pi}^{\gamma(k/n)}(x)}\right] q(y)\,\lambda^{\mathrm{Leb}}(dy) + \delta_x(A)\int_{X-x}\left[1 - 1 \wedge \frac{\bar{\pi}^{\gamma(k/n)}(x+y)}{\bar{\pi}^{\gamma(k/n)}(x)}\right] q(y)\,\lambda^{\mathrm{Leb}}(dy), \quad A \in \mathcal{B}(X),
\]
where for any set $C$, $C - x := \{z \in X;\ z + x \in C\}$ (note that in applications it may be of interest to allow $q$ to depend on $n$ and $k$, for example via $\gamma(k/n)$, but for simplicity this issue is not pursued further here).

• Let $\{G_{n,k};\ n \ge 1,\ 0 \le k \le n-1\}$ be a collection of potential functions defined by
\[
G_{n,k}(x) = \exp\left(\frac{1}{n}\,\log\bar{\pi}(x)\,\frac{\gamma((k+1)/n) - \gamma(k/n)}{1/n}\right).
\]

Consider the following assumptions on the target density $\pi$ and increment density $q$.

• The density $\pi$ is strictly positive, bounded and has continuous first derivatives such that
\[
\lim_{r \to \infty}\ \sup_{|x| \ge r}\ n(x)\cdot\nabla\log\pi(x) = -\infty, \qquad \lim_{r \to \infty}\ \sup_{|x| \ge r}\ n(x)\cdot\frac{\nabla\pi(x)}{|\nabla\pi(x)|} < 0. \tag{40}
\]

• For all $r > 0$ there exists $\epsilon_r > 0$ such that
\[
|x| \le r \ \Rightarrow\ q(x) \ge \epsilon_r. \tag{41}
\]

The assumptions of equations (40)-(41) are standard types of assumptions ensuring geometric ergodicity of RWM kernels [26, 20]. The assumption of equation (41) is stronger than the standard one in [20], but is flexible enough to verify (A3), which involves a family of minorization measures/constants indexed over a range of levels of $V$. The interest in the specific FK models of this section arises from the choice of the initial distribution $\mu$ addressed in the following lemma. This FK model corresponds to a particular choice of the "backwards" kernels in [11], and in the corresponding SMC algorithm the order of the weighting and resampling steps is reversed.

Lemma 3. Consider the operators $\{\Phi_{n,k}\}$ associated with $\{G_{n,k}\}$ and $\{M_{n,k}\}$ of section 5. Then for all $n \ge 1$ and $0 \le k \le n$,
\[
\Phi_{n,0:k}\left(\pi_{\underline{\gamma}}\right) = \pi_{\gamma(k/n)}.
\]

Proof. Fix $n$ arbitrarily and suppose the result holds at rank $0 < k < n$.
Then
\[
\Phi_{n,0:k+1}\left(\pi_{\underline{\gamma}}\right)(A) = \Phi_{n,k+1}\circ\Phi_{n,0:k}\left(\pi_{\underline{\gamma}}\right)(A)
= \frac{\displaystyle\int \frac{\bar{\pi}^{\gamma((k+1)/n)}(x)}{\bar{\pi}^{\gamma(k/n)}(x)}\, M_{n,k+1}(A)(x)\,\pi_{\gamma(k/n)}(dx)}{\displaystyle\int \frac{\bar{\pi}^{\gamma((k+1)/n)}(x)}{\bar{\pi}^{\gamma(k/n)}(x)}\,\pi_{\gamma(k/n)}(dx)}
= \pi_{\gamma((k+1)/n)}(A), \quad A \in \mathcal{B}(X),
\]
due to the property that $M_{n,k}$ is invariant for $\pi_{\gamma(k/n)}(dx)$. The proof is complete upon noting that for all $n$, $\Phi_{n,0:0} = Id$ by convention.

We have the following result.

Theorem 3. Consider the collection of FK models specified in section 5. Let $s > 0$ be an independent parameter and set $t = \frac{1+s}{s}$. Let $\alpha \in [0,1]$ and $p \ge 1$ be such that $\alpha p t \le 1$ and $(1+s)p\,(1-\underline{\gamma})/\underline{\gamma} < 1$. Then there exist finite constants $C_1(p,\mu)$ and $C_2\left(\mu,\pi_{\underline{\gamma}}\right)$ (depending implicitly on $\pi$ and $\gamma(\cdot)$), and constants $\beta \in (0,1)$ and $\rho \in [0,1)$, such that for any $f \in \mathcal{L}_{V^\alpha}$, $n \ge 1$ and $N \ge 1$,
\[
\bar{\mathbb{E}}_\mu\left[\left|\left(\pi^N_n - \pi\right)(f)\right|^p\right]^{1/p} \le \|f\|_{V^\alpha}\left[\frac{C_1(p,\mu)}{\sqrt{N}} + \rho^n\, C_2\left(\mu,\pi_{\underline{\gamma}}\right)\mathbb{I}\left[\mu \ne \pi_{\underline{\gamma}}\right]\right],
\]
where $V(x) \propto \pi^{-\beta\underline{\gamma}}(x)$ and for each $n$, $\pi^N_n := \eta^N_{n,n}$.

The proof of Theorem 3 is postponed until after the following proposition regarding the verification of assumptions.

Proposition 2. Consider the setting of section 5. Then (A1), (A3) and (A4) hold.

Proof. As the density $\pi$ is bounded and $\gamma(\cdot)$ is Lipschitz, (A1) holds by the mean value theorem. We now turn to the verification of (A3). Various arguments are adopted from Andrieu et al. [1] and the manipulations are fairly standard, but are included here for completeness. The main difference is that we need to explicitly verify the drift and minorization conditions of (A3), which hold over a range of sub-levels for $V$, and the proof below involves verification of some assumptions taken as given in [1, Lemma 4].
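Firstly, due to the definition of $\pi_{\underline{\gamma}}$ we have that $\nabla\log\pi_{\underline{\gamma}}(x) = \underline{\gamma}\,\nabla\log\bar{\pi}(x) = \underline{\gamma}\,\nabla\log\pi(x)$ and
\[
\frac{\nabla\pi(x)}{|\nabla\pi(x)|} = \frac{\bar{\pi}(x)\nabla\log\bar{\pi}(x)}{|\bar{\pi}(x)\nabla\log\bar{\pi}(x)|}\cdot\frac{\underline{\gamma}}{\underline{\gamma}}\cdot\frac{\bar{\pi}^{\underline{\gamma}-1}(x)}{\bar{\pi}^{\underline{\gamma}-1}(x)} = \frac{\nabla\pi_{\underline{\gamma}}(x)}{\left|\nabla\pi_{\underline{\gamma}}(x)\right|},
\]
and so
\[
\lim_{r \to \infty}\ \sup_{|x| \ge r}\ n(x)\cdot\nabla\log\pi_{\underline{\gamma}}(x) = -\infty, \qquad \lim_{r \to \infty}\ \sup_{|x| \ge r}\ n(x)\cdot\frac{\nabla\pi_{\underline{\gamma}}(x)}{\left|\nabla\pi_{\underline{\gamma}}(x)\right|} < 0. \tag{42}
\]
In order to verify (A3) we first verify a drift condition for $M_0(x,dy)$, defined to be the RWM kernel reversible w.r.t. $\pi_{\underline{\gamma}}$ with increment density $q$ as above, i.e.
\[
M_0(x,A) = \int_{A-x}\left[1 \wedge \frac{\bar{\pi}^{\underline{\gamma}}(x+y)}{\bar{\pi}^{\underline{\gamma}}(x)}\right] q(y)\,\lambda^{\mathrm{Leb}}(dy) + \delta_x(A)\int_{X-x}\left[1 - 1 \wedge \frac{\bar{\pi}^{\underline{\gamma}}(x+y)}{\bar{\pi}^{\underline{\gamma}}(x)}\right] q(y)\,\lambda^{\mathrm{Leb}}(dy), \quad A \in \mathcal{B}(X).
\]
Let $\beta \in (0,1)$ and define $V : X \to [1,+\infty)$ by
\[
V(x) := \frac{\pi^{-\underline{\gamma}\beta}(x)}{\inf_{x}\pi^{-\underline{\gamma}\beta}(x)}. \tag{43}
\]
The results of Jarner and Hansen [20] show that when (42) holds, then for $M_0$ with increment density $q$ satisfying equation (41), it holds that
\[
\lim_{r \to \infty}\ \sup_{|x| \ge r}\ \frac{M_0 V(x)}{V(x)} < 1.
\]
Thus there exist $\lambda < 1$ and $\rho_\lambda < +\infty$ such that
\[
|x| \ge \rho_\lambda \ \Rightarrow\ \frac{M_0 V(x)}{V(x)} \le \lambda. \tag{44}
\]
Due to (40), there exist $\epsilon > 0$ and $\rho_\epsilon > 0$ such that
\[
|x| \ge \rho_\epsilon \ \Rightarrow\ n(x)\cdot\nabla\log\pi_{\underline{\gamma}}(x) \le -\epsilon.
\]
Now set $r_0 = \rho_\lambda \vee \rho_\epsilon$ and $d_0 := \sup_{|x| \le r_0} V(x)$. Note that $d_0 < +\infty$ due to the definition of $V$ and as the density $\pi$ is continuous and positive. We now proceed to verify the drift part of (A3).

For any $d \ge d_0$ let $C_d := \{x : V(x) \le d\}$. We then have
\[
\begin{aligned}
\sup_{x \in C_d} M_0 V(x) &\le d\,\sup_{x \in C_d}\frac{M_0 V(x)}{V(x)} \\
&= d\,\sup_{x \in C_d}\left\{\int_{A(x)}\frac{\bar{\pi}^{-\underline{\gamma}\beta}(x+y)}{\bar{\pi}^{-\underline{\gamma}\beta}(x)}\,q(y)\,\lambda^{\mathrm{Leb}}(dy) + \int_{R(x)}\left[1 - \frac{\bar{\pi}^{\underline{\gamma}}(x+y)}{\bar{\pi}^{\underline{\gamma}}(x)} + \frac{\bar{\pi}^{\underline{\gamma}(1-\beta)}(x+y)}{\bar{\pi}^{\underline{\gamma}(1-\beta)}(x)}\right] q(y)\,\lambda^{\mathrm{Leb}}(dy)\right\},
\end{aligned}
\tag{45}
\]
where
\[
A(x) := \{y \in X : \pi(x+y) \ge \pi(x)\}, \qquad R(x) := \{y \in X : \pi(x+y) < \pi(x)\}.
\]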
As each of the ratios in (45) is less than or equal to $1$, we conclude that there exists a constant $C_b < +\infty$ such that
\[
\sup_{x \in C_d} M_0 V(x) \le d\,C_b =: b_d. \tag{46}
\]
Noting the definition of $d_0$, and combining (44) and (46) we obtain
\[
\begin{aligned}
M_0 V(x) &= M_0 V(x)\,\mathbb{I}[|x| > r_0] + M_0 V(x)\,\mathbb{I}[|x| \le r_0] \\
&\le \lambda V(x) + M_0 V(x)\,\mathbb{I}[V(x) \le d] \\
&\le \lambda V(x) + b_d\,\mathbb{I}_{C_d}(x),
\end{aligned}
\tag{47}
\]
for any $d \ge d_0$ and $x \in X$. The arguments of Andrieu et al. [1, Lemma 5] then give $M_{n,k}V(x) \le M_0 V(x)$ and from this, (47) and (46), we obtain for any $d \ge d_0$,
\[
\sup_{n \ge 1}\ \sup_{1 \le k \le n} M_{n,k}V(x) \le \lambda V(x) + b_d\,\mathbb{I}_{C_d}(x),
\]
which establishes the drift part of (A3).

It remains to show the minorization part. To this end we first show that for any $d \ge d_0$, $C_d$ is bounded. Recalling the definition of $r_0$, we have that for any $x$ such that $|x| - r_0 \ge 0$,
\[
\frac{V(x)}{V(n(x)r_0)} = \left[\frac{\pi(x)}{\pi(n(x)r_0)}\right]^{-\underline{\gamma}\beta} = \exp\left(-\underline{\gamma}\beta\,(|x| - r_0)\int_0^1 n(x)\cdot\nabla\log\pi\big(tx + (1-t)\,n(x)r_0\big)\,dt\right) \ge \exp\big(\underline{\gamma}\beta\,\epsilon\,(|x| - r_0)\big),
\]
from which we see that $\lim_{r \to \infty}\inf_{|x| \ge r} V(x) = +\infty$, which in turn implies that for all $d \ge d_0$ there exists $r_d \ge 0$ such that $V(x) \le d \Rightarrow |x| \le r_d$. Then for any $n \ge 1$, $0 \le k \le n$ and $r_d \ge 0$, whenever $x \in C_d$,
\[
\begin{aligned}
M_{n,k}(x,A) &\ge \int_{A-x}\left[1 \wedge \frac{\pi(x+y)}{\pi(x)}\right] q(y)\,\lambda^{\mathrm{Leb}}(dy) \\
&\ge \int_{(A \cap B(0,r_d)) - x}\left[1 \wedge \frac{\pi(x+y)}{\pi(x)}\right] q(y)\,\lambda^{\mathrm{Leb}}(dy) \\
&\ge \epsilon_{2r_d}\,\frac{\inf_{y \in B(0,3r_d)}\pi(y)}{\sup_{y \in B(0,3r_d)}\pi(y)}\int_{(A \cap B(0,r_d)) - x}\lambda^{\mathrm{Leb}}(dy) \\
&= \epsilon_{2r_d}\,\frac{\inf_{y \in B(0,3r_d)}\pi(y)}{\sup_{y \in B(0,3r_d)}\pi(y)}\int_{B(0,r_d)}\lambda^{\mathrm{Leb}}(dy)\ \frac{\int_{A \cap B(0,r_d)}\lambda^{\mathrm{Leb}}(dy)}{\int_{B(0,r_d)}\lambda^{\mathrm{Leb}}(dy)} \\
&=: \varepsilon_d \cdot \nu_d(A),
\end{aligned}
\]
where the third inequality holds due to the properties of $q$ in (41) and because $A \cap B(0,r_d) \cap B(x,2r_d) = A \cap B(0,r_d)$ whenever $x \in B(0,r_d)$. As $\pi$ is strictly positive and continuous, $V$ is bounded on compact sets and therefore $\nu_d(V) < +\infty$.
Also, $C_d \supseteq C_{d_0}$ and then due to the definition of $d_0$, $\nu_d(C_d) \ge \nu_d(C_{d_0}) \ge \nu_d(B(0,r_0)) > 0$. This concludes the verification of (A3).

For (A4), from the definition of $G_{n,k}$, we observe that $U_{n,k}$ is defined by
\[
U_{n,k}(x) = -\left[\log\bar{\pi}(x)\,\frac{\gamma((k+1)/n) - \gamma(k/n)}{1/n} - C_\gamma\,\sup_{y}\log\bar{\pi}(y)\right],
\]
where $C_\gamma$ is the Lipschitz constant for $\gamma(\cdot)$, and we observe that
\[
\sup_{n \ge 1}\ \sup_{0 \le k \le n-1}\ \sup_{x \in X}\ \frac{U_{n,k}(x)}{V(x)} < +\infty.
\]
Assumption (A4) is then satisfied upon application of Lemma 2.

Proof. (Theorem 3) Throughout the proof, we denote by $C$ a constant whose value may change upon each appearance. Consider the error decomposition
\[
\bar{\mathbb{E}}_\mu\left[\left|\left(\pi^N_n - \pi\right)(f)\right|^p\right]^{1/p} \le \bar{\mathbb{E}}_\mu\left[\left|\left(\eta^N_{n,n} - \eta_{n,n}\right)(f)\right|^p\right]^{1/p} + \left|\left(\eta_{n,n} - \pi\right)(f)\right|. \tag{48}
\]
Choose $\beta$ such that $(1+s)p\,(1-\underline{\gamma})/(\underline{\gamma}\beta) \le 1$ and take $V$ to be defined as in equation (43). By proposition 2, the FK model of section 5 satisfies assumptions (A1), (A3) and (A4). The second term on the r.h.s. of (48) is treated by application of Theorem 1. Noting that by definition $\eta_{n,n} = \Phi_{n,0:n}(\mu)$ and, by Lemma 3, $\pi = \Phi_{n,0:n}\left(\pi_{\underline{\gamma}}\right)$, we obtain from Theorem 1 that there exist constants $\rho$, $M$ and $C_2$ such that
\[
\left\|\eta_{n,n} - \pi\right\|_{V^\alpha} \le M\rho^n\left[\frac{\mu\left(\widetilde{G}_{n,k}V^\alpha\right)}{\mu\widetilde{Q}_{n,k:n}(1)} + \frac{\pi_{\underline{\gamma}}\left(\widetilde{G}_{n,k}V^\alpha\right)}{\pi_{\underline{\gamma}}\widetilde{Q}_{n,k:n}(1)}\right]\mathbb{I}\left[\mu \ne \pi_{\underline{\gamma}}\right] \le \rho^n\, C_2\left(\mu,\pi_{\underline{\gamma}}\right)\mathbb{I}\left[\mu \ne \pi_{\underline{\gamma}}\right],
\]
where the constant $C_2$ arises from assumption (A4) applied to the denominator terms and implicitly depends on $\pi_{\underline{\gamma}}$, $V$, and the constants in (A3).

In order to apply Theorem 2 to the first term on the r.h.s. of (48) it remains to verify the assumptions of equations (33)-(34). We start by addressing the latter. From the definition of $V$ and due to the assumption that $\gamma(\cdot)$ is non-decreasing we observe that for all $x, x' \in X$,
\[
\left[G_{n,k}(x) - G_{n,k}(x')\right]\left[V(x) - V(x')\right] \le 0,
\]
and therefore by Lemma 4, for all (possibly random) $\eta \in \mathcal{P}(X)$,
\[
\eta(G_{n,k}V) \le \eta(G_{n,k})\,\eta(V).
\]
Then for any $n \ge 1$ and $1 \le k \le n$,
\[
\bar{\mathbb{E}}_\mu\left[\eta^N_{n,k}(V)\,\middle|\,\mathcal{F}^{(N,N)}_{n,k-1}\right] \le \lambda\,\frac{\eta^N_{n,k-1}(G_{n,k-1}V)}{\eta^N_{n,k-1}(G_{n,k-1})} + b_{d_0} \le \lambda\,\eta^N_{n,k-1}(V) + b_{d_0}, \tag{49}
\]
where (A3) has been applied with $d_0$ as defined below equation (44) in the proof of proposition 2. Standard iteration of the particle drift inequality (49) (details omitted for brevity), combined with the fact that $\left\{\xi^{(N,i)}_{n,0};\ i = 1,\ldots,N\right\}$ are independent and each distributed according to $\mu$, shows that
\[
\sup_{N \ge 1}\ \sup_{n \ge 1}\ \sup_{1 \le k \le n}\ \bar{\mathbb{E}}_\mu\left[\eta^N_{n,k}(V)\right] < +\infty, \tag{50}
\]
and noting that $\alpha p t \le 1$, equation (34) then holds by two applications of Jensen's inequality.

We now turn to the verification of equation (33). From previous considerations we notice that for some finite constant $C$,
\[
U_{n,k}(x) = \frac{1}{\beta\underline{\gamma}}\,\frac{\gamma((k+1)/n) - \gamma(k/n)}{1/n}\,\log V(x) + C,
\]
and therefore, for any $n$ and $k$,
\[
\begin{aligned}
\left(\eta^N_{n,k}\widetilde{Q}_{n,k:n}(1)\right)^{-1}
&= \left[\int \exp\left(-\frac{1}{n}\sum_{j=k}^{n-1} U_{n,j}(x_j)\right)\eta^N_{n,k}(dx_k)\prod_{j=k+1}^{n} M_{n,j}(x_{j-1},dx_j)\right]^{-1} \\
&\le \exp\left(\frac{1}{n}\sum_{j=k}^{n-1}\int U_{n,j}(x_j)\,\eta^N_{n,k}M_{n,k:j}(dx_j)\right) \\
&= \exp\left(C + \frac{1}{n}\,\frac{1}{\beta\underline{\gamma}}\sum_{j=k}^{n-1}\frac{\gamma((j+1)/n) - \gamma(j/n)}{1/n}\int \log V(x_j)\,\eta^N_{n,k}M_{n,k:j}(dx_j)\right) \\
&\le \exp\left(C + \frac{1}{n}\,\frac{1}{\beta\underline{\gamma}}\sum_{j=k}^{n-1}\frac{\gamma((j+1)/n) - \gamma(j/n)}{1/n}\,\log\int V(x_j)\,\eta^N_{n,k}M_{n,k:j}(dx_j)\right) \\
&\le \exp\left(C + \frac{1 - \gamma(k/n)}{\beta\underline{\gamma}}\,\log\eta^N_{n,k}(V)\right) \\
&\le \exp(C)\,\eta^N_{n,k}(V)^{\frac{1-\underline{\gamma}}{\beta\underline{\gamma}}},
\end{aligned}
\]
where the penultimate inequality holds due to standard iteration of the drift inequality in (A3). Therefore
\[
\bar{\mathbb{E}}_\mu\left[\left(\eta^N_{n,k}\widetilde{Q}_{n,k:n}(1)\right)^{-(1+s)p}\right] \le C\,\bar{\mathbb{E}}_\mu\left[\eta^N_{n,k}(V)\right]^{q}
\]
due to Jensen's inequality, where $q := \frac{(1-\underline{\gamma})}{\beta\underline{\gamma}}(1+s)p \le 1$ by assumption of the theorem. Equation (33) then follows upon combining this with equation (50). This completes the proof.

Acknowledgments

The author thanks Christophe Andrieu for discussions which led to the consideration of this work.
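As an informal companion to the FK model of section 5, the following Python sketch runs the corresponding particle algorithm on a toy problem: at step $k$ the particles are weighted by $G_{n,k-1}$, resampled, and then moved by an RWM kernel targeting $\pi_{\gamma(k/n)}$. It is an illustration under stated assumptions, not the author's implementation: the one-dimensional standard Gaussian target, the linear schedule with $\gamma(0) = \underline{\gamma}$ and $\gamma(1) = 1$, the Gaussian increment density, multinomial resampling, and all function names (`log_pi_bar`, `gamma`, `rwm_step`, `smc_sampler`) are choices made here.

```python
import math
import random


def log_pi_bar(x: float) -> float:
    """Unnormalised log target; a standard Gaussian is an illustrative choice."""
    return -0.5 * x * x


def gamma(t: float, gamma_low: float = 0.2) -> float:
    """Non-decreasing Lipschitz schedule from [0, 1] into [gamma_low, 1] (assumed linear)."""
    return gamma_low + (1.0 - gamma_low) * t


def rwm_step(x: float, temp: float, step: float, rng: random.Random) -> float:
    """One random walk Metropolis move leaving the density pi_bar**temp invariant."""
    y = x + rng.gauss(0.0, step)  # Gaussian increment density q (assumption)
    log_ratio = temp * (log_pi_bar(y) - log_pi_bar(x))
    # 1 - rng.random() lies in (0, 1], so the logarithm is always finite.
    if math.log(1.0 - rng.random()) < log_ratio:
        return y
    return x


def smc_sampler(n_particles: int, n_steps: int, rng: random.Random,
                gamma_low: float = 0.2, step: float = 1.0) -> list:
    # Initial particles drawn i.i.d. from mu = pi_{gamma_low} = N(0, 1/gamma_low).
    xs = [rng.gauss(0.0, 1.0 / math.sqrt(gamma_low)) for _ in range(n_particles)]
    for k in range(1, n_steps + 1):
        g_prev = gamma((k - 1) / n_steps, gamma_low)
        g_next = gamma(k / n_steps, gamma_low)
        # Potential G_{n,k-1}(x) = pi_bar(x) ** (gamma(k/n) - gamma((k-1)/n)).
        log_w = [(g_next - g_prev) * log_pi_bar(x) for x in xs]
        m = max(log_w)
        w = [math.exp(lw - m) for lw in log_w]
        # Multinomial resampling according to the weights, then an RWM move.
        xs = rng.choices(xs, weights=w, k=n_particles)
        xs = [rwm_step(x, g_next, step, rng) for x in xs]
    return xs


rng = random.Random(0)
particles = smc_sampler(n_particles=2000, n_steps=20, rng=rng)
# Particle estimate of the second moment of pi, which equals 1 for this toy target.
second_moment = sum(x * x for x in particles) / len(particles)
```

Starting the particles from $\mu = \pi_{\underline{\gamma}}$ corresponds to the case in which the indicator term in Theorem 3 vanishes; drawing them from a different $\mu$ gives a way to observe the effect of the initial distribution as the number of steps grows.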
6 Appendix

When (A3) or even more simply (A2) holds, it is natural to ask under what further conditions, if any, does the non-homogeneous Markov chain $\left\{\zeta^{(N)}_{n,k};\ 0 \le k \le n\right\}$ also satisfy a geometric drift condition, as this would be one natural route to verifying (34). Defining $V^N : X^N \to [1,\infty)$ by
\[
V^N(\zeta) := \frac{1}{N}\sum_{i=1}^{N} V\left(\xi^i\right),
\]
where $V$ is as in (A2) and $\zeta = \left(\xi^1,\ldots,\xi^N\right)$, we see that
\[
\begin{aligned}
\bar{\mathbb{E}}_\mu\left[V^N\left(\zeta^N_{n,k}\right)\,\middle|\,\mathcal{F}^{(N,N)}_{n,k-1}\right]
&\le \lambda\,\frac{\eta^N_{n,k-1}\left(\widetilde{G}_{n,k-1}V\right)}{\eta^N_{n,k-1}\left(\widetilde{G}_{n,k-1}\right)} + b\,\frac{\eta^N_{n,k-1}\left(\widetilde{G}_{n,k-1}\mathbb{I}_C\right)}{\eta^N_{n,k-1}\left(\widetilde{G}_{n,k-1}\right)} \\
&\le \lambda\,\frac{\eta^N_{n,k-1}\left(\widetilde{G}_{n,k-1}V\right)}{\eta^N_{n,k-1}\left(\widetilde{G}_{n,k-1}\right)} + b\,\mathbb{I}_{C^{(N)}}\left(\zeta^{(N)}_{n,k}\right),
\end{aligned}
\tag{51}
\]
where
\[
C^{(N)} := \left\{\zeta = \left(\xi^1,\ldots,\xi^N\right) \in X^N : \exists\, j \in \{1,\ldots,N\};\ \xi^j \in C\right\}.
\]
We are then faced with the issue of whether the re-weighting of $\eta^N_{n,k-1}$ by the potential function destroys the geometric drift of $M_{n,k}$: for example one may ask when is it true that for some fixed $\delta \ge 0$,
\[
\frac{\eta^N_{n,k-1}\left(\widetilde{G}_{n,k-1}V\right)}{\eta^N_{n,k-1}\left(\widetilde{G}_{n,k-1}\right)} \le (1+\delta)\,\eta^N_{n,k-1}(V). \tag{52}
\]
The remainder of this section examines this question and goes on to look at some particular issues when $X = \mathbb{R}^d$. First, we have the following Lemma, which addresses a general scenario.

Lemma 4. For $f : X \to (0,\infty)$, $g : X \to (0,\infty)$, two measurable functions and $\delta \in [0,\infty)$,
\[
\eta(fg) \le (1+\delta)\,\eta(f)\,\eta(g)
\]
for any $\eta \in \mathcal{P}(X)$ such that $|\eta(fg)| < +\infty$, $|\eta(f)| < +\infty$, $|\eta(g)| < +\infty$, if
\[
\frac{[f(x) - f(x')]\,[g(x) - g(x')]}{[f(x) + f(x')]\,[g(x) + g(x')]} \le \frac{\delta}{2+\delta}, \quad \forall (x,x') \in X^2, \tag{53}
\]
and only if
\[
\frac{[f(x) - f(x')]\,[g(x) - g(x')]}{[f(x) + f(x')]\,[g(x) + g(x')]} \le \frac{3\delta}{2+\delta}, \quad \forall (x,x') \in X^2.
\]

Proof.
For any $\eta$, $f$, $g$ as specified in the statement and $\epsilon \in (0,1)$, consider the identity:
\[
\begin{aligned}
(1-\epsilon)\,\eta(fg) - (1+\epsilon)\,\eta(f)\,\eta(g)
&= \frac{1}{2}\int_X\int_X \Big([f(x) - f(x')][g(x) - g(x')] - \epsilon\,[f(x) + f(x')][g(x) + g(x')]\Big)\,\eta(dx)\,\eta(dx') \\
&= \frac{1}{2}\int_{X^2} \Big([f(x) - f(x')][g(x) - g(x')] - \epsilon\,[f(x) + f(x')][g(x) + g(x')]\Big)\,\eta(dx)\otimes\eta(dx'),
\end{aligned}
\]
where the final equality is due to Fubini's theorem, which is applicable under the hypotheses of the lemma. The sufficiency part then follows directly upon setting
\[
\epsilon = \frac{\delta}{2+\delta} \quad\Leftrightarrow\quad (1+\delta) = \frac{1+\epsilon}{1-\epsilon}.
\]
For the necessity part, suppose on the contrary that there exists $(y,y') \in X^2$ such that
\[
[f(y) - f(y')]\,[g(y) - g(y')] > \frac{3\delta}{2+\delta}\,[f(y) + f(y')]\,[g(y) + g(y')],
\]
then setting $\eta = \frac{1}{2}\left[\delta_y + \delta_{y'}\right]$ and $\epsilon = \frac{\delta}{2+\delta} \ge 0$, we obtain
\[
\begin{aligned}
&\int_X\int_X \Big([f(x) - f(x')][g(x) - g(x')] - \epsilon\,[f(x) + f(x')][g(x) + g(x')]\Big)\,\eta(dx)\,\eta(dx') \\
&\quad= \int_{X^2} \Big([f(x) - f(x')][g(x) - g(x')] - \epsilon\,[f(x) + f(x')][g(x) + g(x')]\Big)\,\frac{1}{4}\big[\delta_y(dx)\otimes\delta_y(dx') + \delta_{y'}(dx)\otimes\delta_{y'}(dx')\big] \\
&\qquad+ \int_{X^2} \Big([f(x) - f(x')][g(x) - g(x')] - \epsilon\,[f(x) + f(x')][g(x) + g(x')]\Big)\,\frac{1}{4}\big[\delta_y(dx)\otimes\delta_{y'}(dx') + \delta_{y'}(dx)\otimes\delta_y(dx')\big] \\
&\quad= -\epsilon\,[f(y)g(y) + f(y')g(y')] + \frac{1}{2}\Big([f(y) - f(y')][g(y) - g(y')] - \epsilon\,[f(y) + f(y')][g(y) + g(y')]\Big) \\
&\quad\ge \frac{1}{2}\Big([f(y) - f(y')][g(y) - g(y')] - 3\epsilon\,[f(y) + f(y')][g(y) + g(y')]\Big) > 0,
\end{aligned}
\]
which completes the proof.

Note that the sufficient condition is always met, for example, when $g = \varphi \circ f$ for some positive, strictly decreasing and invertible function $\varphi$. The factor of $3$ in the necessity part of Lemma 4 arises from considering a suitable $\eta$ with $N = 2$ support points and is illustrative in the case of interest where each of the measures $\eta^N_{n,k}$ is atomic.
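The $\delta = 0$ case of Lemma 4 is easy to check numerically for atomic measures. The sketch below, with hypothetical helper names (`eta`, `covariance_gap`, `double_sum`) and an arbitrary illustrative choice of atoms and functions, verifies the identity $\eta(fg) - \eta(f)\eta(g) = \frac{1}{2}\sum_{i,j} w_i w_j [f_i - f_j][g_i - g_j]$ used in the proof, confirms that the gap is non-positive when $g = \varphi \circ f$ with $\varphi$ decreasing, and shows a two-atom measure for which the gap is positive when $f$ and $g$ increase together.

```python
import math
import random


def eta(values, weights):
    """Integral of a function, given by its values at the atoms, against an atomic measure."""
    return sum(w * v for w, v in zip(weights, values))


def covariance_gap(f_vals, g_vals, weights):
    """eta(fg) - eta(f) * eta(g), computed directly."""
    fg = [a * b for a, b in zip(f_vals, g_vals)]
    return eta(fg, weights) - eta(f_vals, weights) * eta(g_vals, weights)


def double_sum(f_vals, g_vals, weights):
    """(1/2) * sum over pairs of w_i * w_j * [f_i - f_j] * [g_i - g_j]."""
    total = 0.0
    for wi, fi, gi in zip(weights, f_vals, g_vals):
        for wj, fj, gj in zip(weights, f_vals, g_vals):
            total += wi * wj * (fi - fj) * (gi - gj)
    return 0.5 * total


rng = random.Random(1)
n_atoms = 50
atoms = [rng.uniform(-3.0, 3.0) for _ in range(n_atoms)]
weights = [1.0 / n_atoms] * n_atoms

f_vals = [math.exp(x * x) for x in atoms]  # plays the role of a drift function V
g_vals = [1.0 / v for v in f_vals]         # g = phi(f) with phi(u) = 1/u decreasing

gap = covariance_gap(f_vals, g_vals, weights)
identity_rhs = double_sum(f_vals, g_vals, weights)

# Two atoms with f and g increasing together: eta(fg) exceeds eta(f) * eta(g).
two_atom_gap = covariance_gap([1.0, 10.0], [1.0, 10.0], [0.5, 0.5])  # 50.5 - 30.25 = 20.25
```

In the first configuration every pairwise product $[f_i - f_j][g_i - g_j]$ is non-positive, which is the situation exploited in section 5, where the potential decreases as $V$ increases.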
The following section further explores the necessity part of Lemma 4 in the case of $X = \mathbb{R}^d$, which is of interest in the context of [11]. The purpose of this next section is to show that in order for (52) to hold for any $\eta^N_{n,k-1}$, it is necessary that a very specific relationship holds between $G$ and $V$.

6.1 When $X = \mathbb{R}^d$

Let $X = \mathbb{R}^d$ and let $\mathcal{B}(X)$ be the corresponding Borel $\sigma$-algebra. We denote by $S^{d-1}$ the unit sphere in $\mathbb{R}^d$ and for $z$ in the image of $V$ (resp. $G$), define the contour manifolds $\mathcal{V}_z := \left\{x \in \mathbb{R}^d : V(x) = z\right\}$ and $\mathcal{G}_z := \left\{x \in \mathbb{R}^d : G(x) = z\right\}$. We denote by $B(c,r)$ the closed ball of radius $r$ centered at $c$. For $x, x' \in \mathbb{R}^d$, let $n(x) := \frac{x}{|x|}$ and let $\ell_x^{x'}$ be the line segment with end points $x$ and $x'$. Consider the following collection of assumptions.

(A5)
– $G : \mathbb{R}^d \to (0,+\infty)$ and $V : \mathbb{R}^d \to [1,+\infty)$ have continuous first derivatives.
– There exist $r_0 > 0$ and strictly positive and non-decreasing functions $\phi_V : (0,\infty) \to (0,\infty]$ and $\phi_G : (0,\infty) \to (0,\infty]$ such that whenever $|x| \ge r_0$,
\[
n(x)\cdot\nabla\log V(x) \ge \phi_V(|x|) \quad\text{and}\quad n(x)\cdot\nabla\log G(x) \le -\phi_G(|x|). \tag{54}
\]

The conditions of equation (54) imply that for all $\delta > 0$ and whenever $|x| \ge r_0$,
\[
\frac{V(x + \delta n(x))}{V(x)} \ge \exp\left[\phi_V(|x|)\,\delta\right], \quad\text{and}\quad \frac{G(x + \delta n(x))}{G(x)} \le \exp\left[-\phi_G(|x|)\,\delta\right]. \tag{55}
\]
The assumptions of (54) are largely inspired by assumptions on probability densities used to verify geometric ergodicity of certain Metropolis-Hastings kernels [26, 20] and are realistic in the context of [11].
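The passage from (54) to (55) is an integration along the ray through $x$. The following derivation spells it out for $V$; the integration variable $u$ is introduced here for illustration only. For $u \ge 0$ and $|x| \ge r_0$ we have $n(x + u\,n(x)) = n(x)$ and $|x + u\,n(x)| = |x| + u \ge r_0$, so (54) applies at every point of the ray:

```latex
\begin{align*}
\log\frac{V(x+\delta n(x))}{V(x)}
 &= \int_0^{\delta} \frac{d}{du}\,\log V\big(x + u\,n(x)\big)\,du
  = \int_0^{\delta} n(x)\cdot\nabla\log V\big(x + u\,n(x)\big)\,du \\
 &\ge \int_0^{\delta} \phi_V\big(|x| + u\big)\,du
  \;\ge\; \delta\,\phi_V(|x|).
\end{align*}
```

The last step uses only that $\phi_V$ is non-decreasing. The bound for $G$ follows in the same way, with the inequalities reversed and $-\phi_G$ in place of $\phi_V$.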
It follows immediately from the arguments of Roberts and Tweedie [26, proof of theorem 2.1] transferred to $G$ and $V$, that for $|x|$ large enough, the contour manifolds of $G$ and $V$ which contain $x$ are parameterizable by the unit sphere in the sense that:

• For each $z > \sup_{|y| \le r_0} V(y)$ there exists a bijection between $S^{d-1}$ and $\mathcal{V}_z$, and for each $z' < \inf_{|y| \le r_0} G(y)$ there exists a bijection between $S^{d-1}$ and $\mathcal{G}_{z'}$, such that
\[
\mathcal{V}_z = \left\{w_z(\zeta)\,\zeta : \zeta \in S^{d-1}\right\}, \quad\text{and}\quad \mathcal{G}_{z'} = \left\{h_{z'}(\zeta)\,\zeta : \zeta \in S^{d-1}\right\}, \tag{56}
\]
where $w_z(\cdot)$ and $h_z(\cdot)$ are positive and continuous functions on $S^{d-1}$. Furthermore, $\mathcal{V}_z \cap B(0,r_0) = \mathcal{G}_{z'} \cap B(0,r_0) = \emptyset$.

In order to describe the relationship between contour manifolds of $V$ and $G$ which intersect at some point, we introduce the function $\psi : \mathbb{R}^d \to [0,\infty]$ defined by
\[
\psi(x) := \sup_{\zeta \in S^{d-1}}\left[h_{G(x)}(\zeta) - w_{V(x)}(\zeta)\right], \tag{57}
\]
which implicitly depends on $G$ and $V$. We have the following proposition.

Proposition 3. Assume $G : X \to (0,\infty)$ and $V : X \to [1,\infty)$ satisfy (A1) and (A5) with
\[
\lim_{r \to \infty}\left[\inf_{s \ge r}\phi_V(s) \wedge \inf_{s \ge r}\phi_G(s)\right]\inf_{|x| \ge r}\psi(x) = +\infty. \tag{58}
\]
Then for any $0 \le \delta < 1$, there exists an atomic $\eta \in \mathcal{P}(X)$ such that
\[
\frac{\eta(GV)}{\eta(G)} > (1+\delta)\,\eta(V).
\]

Proof. In outline, the proof involves showing that if the condition in (58) holds, then for all $\epsilon \in [0,1)$, there exists $(y,y') \in X^2$ such that
\[
\frac{G(y) - G(y')}{G(y) + G(y')}\cdot\frac{V(y) - V(y')}{V(y) + V(y')} > \epsilon,
\]
and then employing Lemma 4. Firstly, suppose (58) holds. Then fix arbitrarily $\epsilon \in [0,1)$ and let $r \ge r_0$ be large enough that
\[
\left[\inf_{s \ge r}\phi_V(s) \wedge \inf_{s \ge r}\phi_G(s)\right]\inf_{|x| \ge r}\psi(x) \ge 2\log\left(\frac{1+\sqrt{\epsilon}}{1-\sqrt{\epsilon}}\right),
\]
and then let $y$ be any point such that $|y| > r$, $V(y) > \sup_{|u| \le r} V(u)$ and $\psi(y) > 0$ (such $r$ and $y$ exist due to the hypotheses of equation (54) and the assumption (58)).
Then recalling the definition of $\psi$ in (57), and as, for fixed $y$, $h_{G(y)}(\cdot)$ and $w_{V(y)}(\cdot)$ are continuous functions, there exists $\zeta \in S^{d-1}$ such that $h_{G(y)}(\zeta) - w_{V(y)}(\zeta) = \psi(y) > 0$. With a slight abuse, let $x := h_{G(y)}(\zeta)\,\zeta$, $x' := w_{V(y)}(\zeta)\,\zeta$, and let $y' := \frac{1}{2}(x + x')$. It follows that the line segment $\ell_x^{x'}$ lies on a ray and $|x| - |y'| = |y'| - |x'| = \psi(y)/2 > 0$. Observe that under the implications of (54) stated before the proposition, by construction $V(x) > V(y') > V(x') = V(y)$ and $G(x') > G(y') > G(x) = G(y)$. It must also be the case that $|x'| > r$, as otherwise $V(y) = V(x') \le \sup_{|u| \le r} V(u)$, contradicting the definition of $y$. The situation is illustrated in Figure 1. Due to the hypothesis of equation (55) and the definition of $r$, we then have
\[
\frac{V(y)}{V(y')} = \frac{V(x')}{V(y')} \le \exp\left[-\phi_V(|x'|)\,\frac{\psi(y)}{2}\right] \le \exp\left[-\frac{1}{2}\inf_{s \ge r}\phi_V(s)\inf_{|u| \ge r}\psi(u)\right] \le \frac{1-\sqrt{\epsilon}}{1+\sqrt{\epsilon}}, \tag{59}
\]
and similarly,
\[
\frac{G(y)}{G(y')} = \frac{G(x)}{G(y')} \le \exp\left[-\phi_G(|y'|)\,\frac{\psi(y)}{2}\right] \le \exp\left[-\frac{1}{2}\inf_{s \ge r}\phi_G(s)\inf_{|u| \ge r}\psi(u)\right] \le \frac{1-\sqrt{\epsilon}}{1+\sqrt{\epsilon}}. \tag{60}
\]
It follows from (59)-(60) that
\[
\frac{1 - \frac{V(y)}{V(y')}}{1 + \frac{V(y)}{V(y')}} \ge \sqrt{\epsilon} \quad\text{and}\quad \frac{1 - \frac{G(y)}{G(y')}}{1 + \frac{G(y)}{G(y')}} \ge \sqrt{\epsilon}.
\]
The proof is complete upon noting that $\epsilon$ was chosen arbitrarily in $[0,1)$ and applying the necessity part of Lemma 4.

Figure 1: Solid line: $\mathcal{G}_{G(y)}$. Dashed line: $\mathcal{V}_{V(y)}$. The radial distance $\psi(y)$ outwards from $\mathcal{G}_{G(y)}$ is indicated.

We may then formulate the following example.

Corollary 1. Let $X = \mathbb{R}^2$. For any $\epsilon > 0$, let $G$ and $V$ be defined by
\[
V(x) = \exp\left[(x_1 - \epsilon)^2 + x_2^2\right] \quad\text{and}\quad G(x) = \exp\left\{-\left[(x_1 + \epsilon)^2 + x_2^2\right]\right\},
\]
where $x = (x_1, x_2)$. Then for any $0 \le \delta < 1$, there exists an atomic $\eta \in \mathcal{P}(X)$ such that
\[
\frac{\eta(GV)}{\eta(G)} > (1+\delta)\,\eta(V). \tag{61}
\]

Proof.
Elementary manipulations show that (A5) holds with $r_0 = 2\epsilon$ and taking $\phi_G(|x|) = \phi_V(|x|) = 2(|x| - \epsilon)$ for $|x| \ge r_0$. For any $r \ge r_0$, consider the contour manifolds $\mathcal{V}_z$ and $\mathcal{G}_{z'}$ for $z = V\left(0, \sqrt{r^2 - \epsilon^2}\right)$ and $z' = G\left(0, \sqrt{r^2 - \epsilon^2}\right)$. It is straightforward to check that $\psi\left(0, \sqrt{r^2 - \epsilon^2}\right) = 2\epsilon$ and the result follows from Proposition 3.

This example serves to highlight that a very specific relationship between the potential functions and the drift function is required if inequalities of the form (52) are to hold for $(1+\delta)\lambda < 1$ and for all probability measures. The application of section 5 is one situation where such a relationship holds. We note that it is possible to bound $\bar{\mathbb{E}}_\mu\left[\eta^N_{n,k}(V)\right]$ without confirming (52), by appealing to convexity and the flattening property of the potential functions combined with the geometric drift; but the application of section 5 will not need such an approach and so we do not report these details here. On the other hand, if inequalities of the form (52) do hold with $(1+\delta)\lambda < 1$, one might then pursue accompanying minorization conditions for the chain $\left\{\zeta^{(N)}_{n,k};\ 0 \le k \le n\right\}$, but it seems difficult to achieve this in such a way that the minorizing constants do not degrade as $N$ increases.

References

[1] C. Andrieu, L. A. Breyer, and A. Doucet. Convergence of simulated annealing using Foster-Lyapunov criteria. Journal of Applied Probability, 38(4), 2001.
[2] O. Cappé, E. Moulines, and T. Ryden. Inference in hidden Markov models. Springer Series in Statistics. Springer, New York, 2005.
[3] N. Chopin. A sequential particle filter method for static models. Biometrika, 89(3), 2002.
[4] N. Chopin. Central limit theorem for sequential Monte Carlo methods and its application to Bayesian inference. Annals of Statistics, 32(6), 2004.
[5] N. Chopin, P. Del Moral, and S. Rubenthaler. Stability of Feynman Kac formulae with path-dependent potentials. Stochastic Processes and their Applications, 121(1):38–60, 2011.
[6] P. Del Moral. Feynman-Kac Formulae. Genealogical and interacting particle systems with applications. Probability and its Applications. Springer Verlag, New York, 2004.
[7] P. Del Moral and A. Doucet. Particle motions in absorbing medium with hard and soft obstacles. Stochastic Analysis and Applications, 22:1175–1207, 2004.
[8] P. Del Moral and J. Garnier. Genealogical particle analysis of rare events. The Annals of Applied Probability, 15(4), 2005.
[9] P. Del Moral and A. Guionnet. On the stability of interacting processes with applications to filtering and genetic algorithms. Annales de l'Institut Henri Poincaré (B) Probability and Statistics, 37(2), 2001.
[10] P. Del Moral and L. Miclo. Branching and interacting particle systems approximation of Feynman-Kac formulae with applications to non-linear filtering. Séminaire de Probabilités XXXIV. Lecture Notes in Mathematics, 1729:1–145, 2000.
[11] P. Del Moral, A. Doucet, and A. Jasra. Sequential Monte Carlo samplers. Journal of the Royal Statistical Society, Series B, 68(3):411–436, 2006.
[12] P. Del Moral, A. Doucet, and A. Jasra. On adaptive resampling strategies for sequential Monte Carlo methods. Bernoulli, 2011. To appear.
[13] R. Douc and E. Moulines. Limit theorems for weighted samples with applications to sequential Monte Carlo methods. Annals of Statistics, 36(5), 2008.
[14] R. Douc, E. Moulines, and J.S. Rosenthal. Quantitative bounds on convergence of time-inhomogeneous Markov chains. The Annals of Applied Probability, 14(4):1643–1665, 2004.
[15] R. Douc, G. Fort, E. Moulines, and P. Priouret. Forgetting the initial distribution for hidden Markov models. Stochastic Processes and their Applications, 119:1235–1256, 2009.
[16] R. Douc, E. Moulines, and Y. Ritov. Forgetting of the initial condition in general state-space hidden Markov chain: a coupling approach. Electronic Journal of Probability, 14:27–49, 2009.
[17] R. Douc, E. Gassiat, B. Landelle, and E. Moulines. Forgetting of the initial distribution for nonergodic hidden Markov models. The Annals of Applied Probability, 20(5), 2010.
[18] A. Doucet, N. De Freitas, and N. Gordon, editors. Sequential Monte Carlo methods in practice. Springer, New York, 2001.
[19] K. Heine and D. Crisan. Uniform approximations of discrete-time filters. Advances in Applied Probability, 40(4):979–1001, 2008.
[20] S.F. Jarner and E. Hansen. Geometric ergodicity of Metropolis algorithms. Stochastic Processes and their Applications, 85(2):341–361, February 2000.
[21] A. Jasra and A. Doucet. Stability of sequential Monte Carlo samplers via the Foster Lyapunov condition. Statistics and Probability Letters, 78(17), 2008.
[22] M. L. Kleptsyna and A. Y. Veretennikov. On discrete time ergodic filters with wrong initial data. Probability Theory and Related Fields, 141(3-4), 2008.
[23] H.R. Künsch. Recursive Monte Carlo filters: algorithms and theoretical analysis. Annals of Statistics, 33(5), 2005.
[24] F. Le Gland and N. Oudjane. Stability and uniform approximation of nonlinear filters using the Hilbert metric and application to particle filter. The Annals of Applied Probability, 14(1):144–187, 2004.
[25] N. Oudjane and S. Rubenthaler. Stability and uniform particle approximation of nonlinear filters in case of non ergodic signals. Stochastic Analysis and Applications, 23(3):421–448, 2005.
[26] G.O. Roberts and R. L. Tweedie. Geometric convergence and central limit theorems for multidimensional Hastings and Metropolis algorithms. Biometrika, 83(1):95–120, March 1996.
[27] M. Rousset and G. Stoltz. Nonequilibrium sampling from equilibrium dynamics. Journal of Statistical Physics, 123(6):1251–1272, 2006.
[28] R. van Handel. Uniform time average consistency of Monte Carlo particle filters. Stochastic Processes and their Applications, 119(11), 2009.