Sequential Monte Carlo samplers: error bounds and insensitivity to initial conditions

Nick Whiteley∗

Abstract

This paper addresses finite sample stability properties of sequential Monte Carlo methods for approximating sequences of probability distributions. The results presented herein are applicable in the scenario where the start and end distributions in the sequence are fixed and the number of intermediate steps is a parameter of the algorithm. Under assumptions which hold on non-compact spaces, it is shown that the effect of the initial distribution decays exponentially fast in the number of intermediate steps and the corresponding stochastic error is stable in $L_p$ norm.

Keywords: non-compact spaces; unbounded functions; sequential Monte Carlo

AMS Subject Classification: 82C80; 60J05

1 Introduction

Sequential Monte Carlo (SMC) methods are a class of stochastic algorithms for approximating sequences of probability measures using a population of $N$ particles. They have been adopted in a variety of application domains, including rare event analysis [8], statistical physics [27], optimal filtering [18, 2] and computational statistics [3, 11]. Various theoretical properties of SMC methods have been studied, and in various contexts, see amongst others [4, 23, 13, 28], and the seminal work of Del Moral [6].

Existing stability results for SMC methods rely on very strong assumptions which typically do not hold on non-compact spaces. This article is concerned with establishing stability properties for a class of SMC algorithms primarily motivated by those of Del Moral et al. [11], under assumptions which do hold on non-compact spaces. The following example indicates a scenario of interest.
∗ School of Mathematics, University of Bristol, University Walk, Bristol, BS8 1TW, UK

1.1 A motivating example

Let $\pi$ be a distribution on some space $\mathsf{X}$, admitting a strictly positive and bounded density with respect to some dominating measure. For ease of presentation, let $\pi$ also denote this density and let $\bar{\pi}$ denote the corresponding unnormalised density, i.e. for some $Z > 0$, $\pi(x) = \bar{\pi}(x)/Z$. For some $\underline{\gamma} \in (0,1)$, let $\gamma : [0,1] \to [\underline{\gamma}, 1]$ be a non-decreasing Lipschitz function with $\gamma(0) = \underline{\gamma}$ and $\gamma(1) = 1$. For $n \in \mathbb{N}$ let $\{G_{n,k};\ 0 \le k < n\}$ be the collection of potential functions on $\mathsf{X}$ defined by
\[
G_{n,k}(x) = \frac{\bar{\pi}^{\gamma((k+1)/n)}(x)}{\bar{\pi}^{\gamma(k/n)}(x)}.
\]
Let $\{M_{n,k};\ 1 \le k \le n\}$ be a collection of ergodic Markov kernels also on $\mathsf{X}$, where $M_{n,k}$ admits as its invariant distribution the probability with density proportional to $\bar{\pi}^{\gamma(k/n)}(x)$ (denoted by $\pi^{\gamma(k/n)}$). For some initial distribution $\mu$, consider the sequence of probability distributions $\{\eta_{n,k};\ 1 \le k \le n\}$ defined, for a test function $f$, by
\[
\eta_{n,k}(f) := \frac{\mathbb{E}_\mu\left[\prod_{j=0}^{k-1} G_{n,j}(X_{n,j})\, f(X_{n,k})\right]}{\mathbb{E}_\mu\left[\prod_{j=0}^{k-1} G_{n,j}(X_{n,j})\right]}, \tag{1}
\]
where $\mathbb{E}_\mu$ denotes expectation with respect to the law of the inhomogeneous Markov chain $\{X_{n,k};\ 0 \le k \le n\}$, such that $X_{n,0} \sim \mu$ and $X_{n,k+1} \mid X_{n,k} = x_{n,k} \sim M_{n,k+1}(x_{n,k}, \cdot)$. The special interest in (1) is that when $\mu = \pi^{\underline{\gamma}}$, then $\eta_{n,n} = \pi$, for any $n \ge 1$. Furthermore, in applications $Z$ is unknown, the distributions of the form (1) cannot be computed exactly, and one aims to obtain an approximation of $\pi$ by fixing $n$ then approximating each of the $\{\eta_{n,k};\ 0 \le k \le n\}$ in turn, as follows.

Let $N \in \mathbb{N}$ and let $\{\zeta_{n,k};\ 0 \le k \le n\}$ be an inhomogeneous Markov chain, with each $\zeta_{n,k} = \{\xi^i_{n,k};\ 1 \le i \le N\}$ an $N$-tuplet and with each $\xi^i_{n,k}$ valued in $\mathsf{X}$, with $\{\xi^i_{n,0};\ 1 \le i \le N\}$ independent and of common distribution $\mu$.
Given $\zeta_{n,k}$, the $\{\xi^i_{n,k+1};\ 1 \le i \le N\}$ are independent, with $\xi^i_{n,k+1}$ drawn from
\[
\frac{\sum_{j=1}^N G_{n,k}\left(\xi^j_{n,k}\right) M_{n,k+1}\left(\xi^j_{n,k}, \cdot\right)}{\sum_{j=1}^N G_{n,k}\left(\xi^j_{n,k}\right)}.
\]
The particle approximation measure at time step $k$ is $\eta^N_{n,k} := \frac{1}{N}\sum_{i=1}^N \delta_{\xi^i_{n,k}}$, and one takes $\eta^N_{n,n}(f) := \frac{1}{N}\sum_{i=1}^N f(\xi^i_{n,n})$ as an approximation of $\pi(f)$.

The expectation terms in (1) are in the shape of Feynman-Kac formulae, and adopting the terminology of Del Moral [6] throughout the following, a general collection of Markov kernels, potential functions, initial distribution and associated $\{\eta_{n,k}\}$ are referred to as constituting a Feynman-Kac (FK) model. The FK model described above has a notable structural characteristic: due to the fact that $\underline{\gamma}$ and $\pi$ are fixed and $\gamma(\cdot)$ is continuous, the potential functions each become flat as $n \to \infty$ (note that this is not an essential feature of the general FK models considered by Del Moral [6], nor is it the regime usually considered in the filtering scenario).

One might then conjecture, due to this "flattening" property, that the sequence of measures $\{\eta_{n,k};\ 0 \le k \le n\}$ inherits ergodicity properties from the Markov kernels when $\mu \neq \pi^{\underline{\gamma}}$, and that $\eta_{n,n}(f) - \pi(f)$ goes to zero at some rate as $n \to \infty$. Perhaps more adventurously, one might further conjecture that this stability property is inherited by the corresponding particle system so that $\bar{\mathbb{E}}_\mu\left[\left|\eta^N_{n,n}(f) - \pi(f)\right|^p\right]^{1/p}$ is controlled uniformly in $n$, and diminishes at the usual $1/\sqrt{N}$ rate, where $\bar{\mathbb{E}}_\mu$ denotes expectation w.r.t. the law of the particle process initialised using $\mu$. The results presented in this paper allow it to be shown that this is indeed the case under assumptions which are realistic in the context of applications such as [11] on non-compact spaces.
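The selection and mutation steps described above can be sketched in a few lines of Python. This is only an illustrative sketch under assumptions that are not in the text: the target is taken to be a one-dimensional standard Gaussian, the schedule `gamma` is linear, each $M_{n,k}$ is a single random walk Metropolis step with a fixed step size, and all function names are hypothetical.

```python
import numpy as np

def log_pi_bar(x):
    # Unnormalised log-density of the target (assumed: standard Gaussian).
    return -0.5 * x**2

def gamma(t, gamma_low=0.1):
    # Assumed linear schedule: non-decreasing, Lipschitz, gamma(0)=gamma_low, gamma(1)=1.
    return gamma_low + (1.0 - gamma_low) * t

def rwm_step(x, temp, rng, step=1.0):
    # One random walk Metropolis step per particle, invariant for pi_bar**temp.
    prop = x + step * rng.standard_normal(x.shape)
    log_acc = temp * (log_pi_bar(prop) - log_pi_bar(x))
    accept = np.log(rng.uniform(size=x.shape)) < log_acc
    return np.where(accept, prop, x)

def smc_sampler(N, n, rng, gamma_low=0.1):
    # Initial particles from mu = pi^{gamma_low}, which is N(0, 1/gamma_low) here.
    xi = rng.standard_normal(N) / np.sqrt(gamma_low)
    for k in range(n):
        # log G_{n,k} = (gamma((k+1)/n) - gamma(k/n)) * log pi_bar
        delta = gamma((k + 1) / n, gamma_low) - gamma(k / n, gamma_low)
        log_g = delta * log_pi_bar(xi)
        w = np.exp(log_g - log_g.max())
        w /= w.sum()
        # Selection: ancestors drawn with probability proportional to G_{n,k}.
        ancestors = rng.choice(N, size=N, p=w)
        # Mutation: move each selected particle with M_{n,k+1}.
        xi = rwm_step(xi[ancestors], gamma((k + 1) / n, gamma_low), rng)
    return xi  # empirical measure of xi approximates pi
```

For instance, `smc_sampler(2000, 20, np.random.default_rng(0))` returns 2000 particles whose empirical second moment should be close to 1 under the Gaussian assumption made here.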
The contributions are to establish deterministic stability results for a broad class of FK models, of which the example above is one instance, and to provide $L_p$ bounds for the corresponding particle errors.

1.2 Summary of results

The present work is built upon generic assumptions about the FK model structure, which can be loosely summarized as follows:

• the Markov kernels $\{M_{n,k}\}$ are geometrically ergodic, with common Foster-Lyapunov drift function $V$ and associated constants, and with a common minorization condition

• the potential functions $\{G_{n,k}\}$ are uniformly bounded above, and are of the form $G_{n,k}(x) \propto \exp\left(-\frac{1}{n} U_{n,k}(x)\right)$, for $U_{n,k}$ positive and bounded uniformly over $n$ and $k$ in $V$-norm

• $f$ is bounded in $V$-norm

Stability properties of general FK semigroups are established in Theorem 1 (section 3). Note here that, in contrast to the above example, no specific form is assumed for $\{U_{n,k}\}$ or their relations to the invariant distributions of $\{M_{n,k}\}$. Theorem 2 (section 4) provides $L_p$ error bounds for the corresponding particle systems, under additional assumptions on the drift properties of the particle process.

For the reader's convenience Theorem 3 is now summarized (for the precise statement see section 5), which is an application to the example sketched above. Let $\mathsf{X} = \mathbb{R}^d$. Then when $\pi$ has a sub-exponential density w.r.t. Lebesgue measure with asymptotically regular contours, and when each $M_{n,k}$ is a random walk Metropolis kernel of invariant distribution $\pi^{\gamma(k/n)}$, under a suitable trade-off between $p \ge 1$, $\alpha \in [0,1)$ and $\underline{\gamma}$, there exists $\rho < 1$ and constants $C_1$ and $C_2$ such that for any $f$ with
\[
\|f\|_{V^\alpha} := \sup_x \frac{|f(x)|}{V^\alpha(x)} < +\infty,
\]
any $N \ge 1$ and $n \ge 1$,
\[
\bar{\mathbb{E}}_\mu\left[\left|\pi^N_n(f) - \pi(f)\right|^p\right]^{1/p} \le \|f\|_{V^\alpha}\left[\frac{C_1}{\sqrt{N}}\left(\frac{1-\rho^n}{1-\rho}\right) + \rho^n C_2\, \mathbb{I}\left[\mu \neq \pi^{\underline{\gamma}}\right]\right], \tag{2}
\]
where $\pi^N_n := \eta^N_{n,n}$ is the particle approximation as in section 1.
The first term on the r.h.s. of (2) corresponds to the stochastic errors from the population of size $N$ interacting and mutating over $n$ time steps. The second term on the r.h.s. is a deterministic bias which arises if the initial distribution is mis-specified, and this bias decays exponentially quickly in $n$.

1.3 Existing work

Despite the fact that the focus here is on models with the "flattening" property, existing SMC stability results cannot be transferred directly to the present scenario under realistic assumptions. They are reviewed below for completeness.

Del Moral [6, Theorem 7.4.4] proved time-uniform $L_p$ error bounds, under assumptions which in the present scenario would take the form:

• there exist $\epsilon_M > 0$, $m \ge 0$ and $\epsilon_G > 0$ such that for all $n$, $k$ and $x, y \in \mathsf{X}$,
\[
M_{n,k} \cdots M_{n,k+m}(x, \cdot) \ge \epsilon_M\, M_{n,k} \cdots M_{n,k+m}(y, \cdot), \tag{3}
\]
\[
G_{n,k}(x) \ge \epsilon_G\, G_{n,k}(y), \tag{4}
\]

• $f$ is bounded.

Note that these results hold for fully general FK models, not necessarily having the "flattening" property (see also [10, 9, 7, 24, 23, 5, 12] for various results under the same type of assumption as one or both of (3)-(4)). However, these assumptions are very strong. Equation (3) is stronger than uniform ergodicity of the $m$-step kernels, and typically is not satisfied for the kernels of interest in [11], such as Metropolis-Hastings kernels on $\mathbb{R}^d$. For a toy example which highlights the issue, consider the case that $\mathsf{X} = \mathbb{R}$ and, for some probability measure $\nu$ on $\mathsf{X}$ dominated by Lebesgue measure, take $M_{n,k}(x, \cdot) = a\delta_x(\cdot) + (1-a)\nu(\cdot)$. It is easy to check that (3) is violated. Similarly (4) is typically not satisfied in the applications of interest on non-compact spaces and the assumption that $f$ is bounded is then also rather restrictive.
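The check that the toy kernel violates (3) can be spelled out as follows. This is a short worked sketch assuming $a \in (0,1)$, a restriction the text does not state explicitly, and writing $M$ for the common kernel $M_{n,k}$.

```latex
% Since \nu M = a\nu + (1-a)\nu = \nu, composing the kernel with itself
% again gives a mixture of a point mass and \nu:
\[
M^{m+1}(x,\cdot) = a^{m+1}\,\delta_x(\cdot) + \bigl(1 - a^{m+1}\bigr)\,\nu(\cdot),
\qquad m \ge 0 .
\]
% Take x \neq y and A = \{y\}. Because \nu is dominated by Lebesgue measure,
% \nu(\{y\}) = 0, and therefore
\[
M^{m+1}(x,\{y\}) = 0,
\qquad
M^{m+1}(y,\{y\}) = a^{m+1} > 0 .
\]
% Hence no \epsilon_M > 0 can satisfy
% M^{m+1}(x,\{y\}) \ge \epsilon_M \, M^{m+1}(y,\{y\}),
% whatever the value of m, so (3) fails.
```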
Oudjane and Rubenthaler [25] and Heine and Crisan [19] used truncation approaches to obtain stability results for particle filters in expectation over the observation process, without mixing assumptions, but they respectively introduce a rejection step into the particle algorithm (making its computational cost random) and use restrictive assumptions about the state-spaces involved, the hidden Markov model (HMM) and the particle mutation kernels which are not realistic in the scenario of interest. van Handel [28] proved uniform time-average consistency of particle filters under tightness assumptions for a class of bounded functions. Very recently, Del Moral et al. [12] have studied SMC methods in which the resampling step is applied adaptively over time.

The stability of SMC methods has also been studied in the asymptotic regime $N \to \infty$. Chopin [4] established a CLT for a broad class of SMC algorithms, and showed that under the same type of strong mixing assumptions as in (3)-(4), the asymptotic (in $N$) variance associated with the rescaled stochastic error can be bounded uniformly in $n$. Jasra and Doucet [21] focused on the asymptotic variance corresponding to the algorithms proposed by Del Moral et al. [11] for unbounded functions. They obtained a bound on the asymptotic variance under realistic geometric ergodicity assumptions which are the same as those considered in the present work. They do not consider the "flattening" regime in their assumptions, and their bounds on the asymptotic variance are not time uniform. Further details of the relationship between the approach of Jasra and Doucet [21] and some of the ideas in this paper are postponed until section 3.

It is relevant also to mention the recent results of Kleptsyna and Veretennikov [22], Douc et al. [15] on forgetting of initial conditions in HMMs (i.e.
without particle approximation) assuming a local Doeblin condition on the Markov kernels: this also does not hold in the Metropolis-on-$\mathbb{R}^d$ scenario of interest and so these results cannot be transferred directly (the results of Douc et al. [17] do not require ergodicity of the kernel, but their assumptions do involve the same minorization/majorization structure). Douc et al. [16] proposed a generic approach to optimal filter stability without particle approximation using ideas of coupling inhomogeneous Markov chains, and a similar approach is exploited here; further comment is delayed until the later sections.

The remainder of the paper is structured in the following manner. Section 2 specifies the general form of the FK models in question, associated semigroups and particle systems. Section 3 deals with the deterministic stability of the sequences of measures arising from the FK models. $L_p$ error bounds for the stochastic errors of the particle approximations are derived in Section 4. Section 5 applies the results to the case where $\mathsf{X} = \mathbb{R}^d$ and when the Markov kernels are of the random walk Metropolis variety. The appendix contains a discussion of drift properties of the particle system.

2 Definitions

Consider a state space $\mathsf{X}$ and an associated countably generated $\sigma$-algebra $\mathcal{B}(\mathsf{X})$. Let $\mathcal{P}(\mathsf{X})$ be the collection of probability measures on $(\mathsf{X}, \mathcal{B}(\mathsf{X}))$. For a measure $\mu$ on $(\mathsf{X}, \mathcal{B}(\mathsf{X}))$, an integral kernel $M : \mathsf{X} \times \mathcal{B}(\mathsf{X}) \to [0, \infty)$ and a function $f : \mathsf{X} \to \mathbb{R}$, define $\mu f := \int_{\mathsf{X}} f(x)\, \mu(dx)$, $M f(x) := \int_{\mathsf{X}} M(x, dy) f(y)$ and $\mu M(\cdot) := \int_{\mathsf{X}} \mu(dx) M(x, \cdot)$. The function which assigns 1 to every point in $\mathsf{X}$ is also denoted by $1$, and the indicator function on a set $A$ is denoted by $\mathbb{I}_A$.

Let $W : \mathsf{X} \to [1, \infty)$ and $f : \mathsf{X} \to \mathbb{R}$ be two measurable functions, then define the norm
\[
\|f\|_W = \sup_{x \in \mathsf{X}} \frac{|f(x)|}{W(x)},
\]
and let $\mathcal{L}_W = \{f : \|f\|_W < \infty\}$.
Let $\mu$ be a signed measure on $(\mathsf{X}, \mathcal{B}(\mathsf{X}))$, then define the norm
\[
\|\mu\|_W = \sup_{|\phi| \le W} |\mu(\phi)|.
\]

2.1 Feynman-Kac models and associated semigroups

Let $\mu \in \mathcal{P}(\mathsf{X})$ and for each $n \in \mathbb{N}$ let $\{M_{n,k};\ 1 \le k \le n\}$ be a collection of Markov kernels, each kernel acting $\mathsf{X} \times \mathcal{B}(\mathsf{X}) \to [0,1]$. Let $\{G_{n,k};\ 0 \le k \le n-1\}$ be a collection of $\mathcal{B}(\mathsf{X})$-measurable, real-valued, strictly positive and bounded functions on $\mathsf{X}$.

The notation employed below is directly inspired by that of Del Moral [6], with some important modifications, primarily to the indexing, reflecting the scenario of interest in which there is a different FK model for each $n$. Next, for each $n \in \mathbb{N}$, let $\{Q_{n,k};\ 1 \le k \le n\}$ be the collection of integral kernels defined by
\[
Q_{n,k}(x, dy) = G_{n,k-1}(x)\, M_{n,k}(x, dy).
\]
For each $n$ and $0 \le k \le \ell \le n$, let $M_{n,k:\ell}$ and $Q_{n,k:\ell}$ be the semigroups associated with the Markov kernels $\{M_{n,k}\}$ and the kernels $\{Q_{n,k}\}$. These semigroups are defined by
\[
M_{n,k:\ell} = M_{n,k+1} M_{n,k+2} \cdots M_{n,\ell}, \quad k < \ell \le n,
\qquad
Q_{n,k:\ell} = Q_{n,k+1} Q_{n,k+2} \cdots Q_{n,\ell}, \quad k < \ell \le n, \tag{5}
\]
and $M_{n,k:k} = Q_{n,k:k} = Id$, $0 \le k \le n$. Next define the collection of probability measures $\{\eta_{n,k};\ 0 \le k \le n\}$ by
\[
\eta_{n,k}(A) = \frac{\mu Q_{n,0:k}(A)}{\mu Q_{n,0:k}(1)}, \quad A \in \mathcal{B}(\mathsf{X}).
\]
Let $\{\Psi_{n,k};\ 0 \le k < n\}$ and $\{\Phi_{n,k};\ 1 \le k \le n\}$ be the collections of mappings, each mapping acting from $\mathcal{P}(\mathsf{X})$ into $\mathcal{P}(\mathsf{X})$, defined for any $\eta \in \mathcal{P}(\mathsf{X})$ by
\[
\Psi_{n,k}(\eta)(dx) = \frac{G_{n,k}(x)}{\eta(G_{n,k})}\, \eta(dx),
\qquad
\Phi_{n,k}(\eta) = \Psi_{n,k-1}(\eta) M_{n,k},
\]
and for $0 \le k \le \ell \le n$ denote by $\Phi_{n,k:\ell}$ the semigroup associated with the mappings $\{\Phi_{n,k}\}$, defined by
\[
\Phi_{n,k:\ell} = \Phi_{n,\ell} \circ \Phi_{n,\ell-1} \circ \cdots \circ \Phi_{n,k+1}, \quad k < \ell \le n,
\]
and with the convention $\Phi_{n,k:k} = Id$. It is straightforward to check that under these definitions, for any $0 \le k \le \ell \le n$, $\eta \in \mathcal{P}(\mathsf{X})$ and $A \in \mathcal{B}(\mathsf{X})$,
\[
\Phi_{n,k:\ell}(\eta)(A) = \frac{\eta Q_{n,k:\ell}(A)}{\eta Q_{n,k:\ell}(1)} \tag{6}
\]
and in particular, $\Phi_{n,k:\ell}(\eta_{n,k}) = \eta_{n,\ell}$.
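Identity (6) can be checked numerically on a finite state space, where kernels are matrices and measures are row vectors. The sketch below is an illustration only: the state space size, the randomly generated kernels and potentials, and all function names are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
d, n = 4, 5  # number of states and of time steps (illustrative values)

def random_kernel(rng, d):
    # A strictly positive row-stochastic matrix, standing in for a Markov kernel.
    P = rng.uniform(0.1, 1.0, size=(d, d))
    return P / P.sum(axis=1, keepdims=True)

M = [None] + [random_kernel(rng, d) for _ in range(n)]  # M[k], k = 1..n
G = [rng.uniform(0.5, 1.5, size=d) for _ in range(n)]   # G[k], k = 0..n-1

def Q(k):
    # Q_{n,k}(x, y) = G_{n,k-1}(x) M_{n,k}(x, y)
    return G[k - 1][:, None] * M[k]

def Q_semigroup(k, l):
    # Q_{n,k:l} = Q_{n,k+1} ... Q_{n,l}, with the identity when k == l.
    out = np.eye(d)
    for j in range(k + 1, l + 1):
        out = out @ Q(j)
    return out

def Phi(k, eta):
    # Phi_{n,k}(eta) = Psi_{n,k-1}(eta) M_{n,k}
    psi = G[k - 1] * eta / (G[k - 1] @ eta)
    return psi @ M[k]

def Phi_semigroup(k, l, eta):
    # Phi_{n,k:l} = Phi_{n,l} o ... o Phi_{n,k+1}: apply Phi_{n,k+1} first.
    for j in range(k + 1, l + 1):
        eta = Phi(j, eta)
    return eta

eta0 = np.full(d, 1.0 / d)
lhs = Phi_semigroup(1, 4, eta0)     # left-hand side of (6)
unnorm = eta0 @ Q_semigroup(1, 4)
rhs = unnorm / unnorm.sum()         # right-hand side of (6)
```

The two vectors `lhs` and `rhs` agree up to floating-point error, which reflects the fact that the normalising constants of the successive updates telescope.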
Lastly, let $\{S_{n,k};\ 1 \le k \le n\}$ be the collection of Markov kernels, each kernel acting $\mathsf{X} \times \mathcal{B}(\mathsf{X}) \to [0,1]$, defined by
\[
S_{n,k}(x, A) = \frac{M_{n,k}\left(Q_{n,k:n}(1)\, \mathbb{I}_A\right)(x)}{M_{n,k}\left(Q_{n,k:n}(1)\right)(x)}.
\]
Under these definitions it is straightforward to check that, in line with (6), we have the alternative description of the mapping $\Phi_{n,k:n}$ in terms of the Markov kernels $\{S_{n,k}\}$: for any $0 \le k \le n$ and any $\eta \in \mathcal{P}(\mathsf{X})$,
\[
\Phi_{n,k:n}(\eta)(A) = \frac{\eta\left(Q_{n,k:n}(1)\, S_{n,k+1} \cdots S_{n,n}(A)\right)}{\eta\left(Q_{n,k:n}(1)\right)}. \tag{7}
\]
Consider the following assumption.

(A1) There exists a finite constant $G_{\mathsf{X}}$ such that for all $n \in \mathbb{N}$ and $0 \le k \le n-1$,
\[
G_{n,k}(x) \le G_{\mathsf{X}}, \quad \forall x \in \mathsf{X}.
\]
When assumption (A1) holds, for $0 \le k \le n-1$ define $\widetilde{G}_{n,k} : \mathsf{X} \to (0,1]$ by
\[
\widetilde{G}_{n,k}(x) = \frac{G_{n,k}(x)}{G_{\mathsf{X}}},
\]
and correspondingly,
\[
\widetilde{Q}_{n,k}(x, dy) = \widetilde{G}_{n,k-1}(x)\, M_{n,k}(x, dy).
\]
Also let $\widetilde{Q}_{n,k:\ell}$ denote the semigroup associated with the kernels $\widetilde{Q}_{n,k}$ in the same manner as (5), with the same convention $\widetilde{Q}_{n,k:k} = Id$. Furthermore under (A1), let $U_{n,k} : \mathsf{X} \to [0, \infty)$ be defined by
\[
U_{n,k}(x) = -n \log \widetilde{G}_{n,k}(x).
\]
As stated in section 1, the work presented here is primarily motivated by the models and algorithms considered in [11]. However, the "backward" kernel structure which they consider is not introduced here as it is not essential for our purposes. A specific example is given in section 5 and at that point comment on how this fits with the framework of Del Moral et al. [11] is provided.

2.2 Particle systems

An explicit construction of the probability space for the particle systems is not provided here, but this can be carried out by canonical methods, see for example [6, Chapter 3], and should be clear from the following symbolic description. Fix $N \in \mathbb{N}$, $n \in \mathbb{N}$, and for $0 \le k \le n$ let $\zeta^{(N)}_{n,k} := \left\{\xi^{(N,i)}_{n,k};\ 1 \le i \le N\right\}$, where each $\xi^{(N,i)}_{n,k}$ is valued in $\mathsf{X}$. Denote $\eta^N_{n,k} := \frac{1}{N}\sum_{i=1}^N \delta_{\xi^{(N,i)}_{n,k}}$.
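As a numerical sanity check of the kernels $\{S_{n,k}\}$ and identity (7) from section 2.1, the following sketch works on a finite state space. The randomly generated kernels and potentials, the dimensions and all function names are assumptions made purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
d, n = 3, 4  # number of states and of time steps (illustrative values)

def random_kernel(rng, d):
    # A strictly positive row-stochastic matrix, standing in for a Markov kernel.
    P = rng.uniform(0.1, 1.0, size=(d, d))
    return P / P.sum(axis=1, keepdims=True)

M = [None] + [random_kernel(rng, d) for _ in range(n)]  # M[k], k = 1..n
G = [rng.uniform(0.5, 1.5, size=d) for _ in range(n)]   # G[k], k = 0..n-1

def Q_one(k, l):
    # The function x -> Q_{n,k:l}(1)(x), by backward recursion over j = l, ..., k+1.
    h = np.ones(d)
    for j in range(l, k, -1):
        h = G[j - 1] * (M[j] @ h)
    return h

def S(k):
    # S_{n,k}(x, y) = M_{n,k}(x, y) Q_{n,k:n}(1)(y) / M_{n,k}(Q_{n,k:n}(1))(x)
    h = Q_one(k, n)
    K = M[k] * h[None, :]
    return K / K.sum(axis=1, keepdims=True)

k = 1
eta = np.full(d, 1.0 / d)

# Left-hand side of (7): eta Q_{n,k:n}, normalised.
unnorm = eta.copy()
for j in range(k + 1, n + 1):
    unnorm = unnorm @ (G[j - 1][:, None] * M[j])
lhs = unnorm / unnorm.sum()

# Right-hand side of (7): eta(Q_{n,k:n}(1) S_{n,k+1} ... S_{n,n}) / eta(Q_{n,k:n}(1)).
h = Q_one(k, n)
rhs = eta * h
for j in range(k + 1, n + 1):
    rhs = rhs @ S(j)
rhs = rhs / (eta @ h)
```

Each `S(j)` is row-stochastic by construction, and `lhs` and `rhs` coincide up to rounding, which is the content of (7) in this finite setting.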
For $1 \le i \le N$ and $1 \le k \le n$ let $\mathcal{F}^{(N,i)}_{n,k} := \sigma\left(\zeta^{(N)}_{n,0}, \ldots, \zeta^{(N)}_{n,k-1}, \xi^{(N,1)}_{n,k}, \ldots, \xi^{(N,i)}_{n,k}\right)$ and $\mathcal{F}^{(N,i)}_{n,0} := \sigma\left(\xi^{(N,1)}_{n,0}, \ldots, \xi^{(N,i)}_{n,0}\right)$. The generations of the particle system $\left\{\zeta^{(N)}_{n,k};\ 0 \le k \le n\right\}$ form a non-homogeneous Markov chain: for $\mu \in \mathcal{P}(\mathsf{X})$, the law of this chain is denoted by $\mathbb{P}_\mu$ and has transitions given in integral form by:
\[
\mathbb{P}_\mu\left(\zeta^{(N)}_{n,0} \in dx\right) = \prod_{i=1}^N \mu\left(dx^i\right),
\qquad
\mathbb{P}_\mu\left(\zeta^{(N)}_{n,k} \in dx \,\middle|\, \zeta^{(N)}_{n,k-1}\right) = \prod_{i=1}^N \Phi_{n,k}\left(\eta^N_{n,k-1}\right)\left(dx^i\right), \quad 1 \le k \le n,
\]
where $dx = d\left(x^1, \ldots, x^N\right)$. Denote by $\bar{\mathbb{E}}_\mu$ the expectation corresponding to $\mathbb{P}_\mu$. It is easy to check that for any $0 \le k \le n$ and $1 \le i \le N$,
\[
\bar{\mathbb{E}}_\mu\left[\mathbb{I}_A\left(\xi^{(N,i)}_{n,k}\right)\right] = \bar{\mathbb{E}}_\mu\left[\eta^N_{n,k}(A)\right], \quad A \in \mathcal{B}(\mathsf{X}). \tag{8}
\]

3 Stability of the deterministic measures

This section is concerned with stability properties of the sequences of Markov kernels $\{S_{n,k}\}$ and operators $\{\Phi_{n,k}\}$. The approach is to identify non-homogeneous Foster-Lyapunov drift functions and minorization conditions which arise quite naturally from the structure of the FK model and then to employ the quantitative bounds of Douc et al. [14]. Compared to [21], in the present work the general structure of FK models is exploited more directly and in a way which is fruitful when the models satisfy assumption (A4) below. Douc et al. [16, Section 5] identified drift functions and coupling sets for related operators in some specific HMMs; the present work is concerned with a general FK model structure.

The first main idea of this section is illustrated by the following assumption and lemma.

(A2) There exists $\lambda \in [0,1)$, a function $V : \mathsf{X} \to [1, \infty)$, $\varepsilon \in (0,1]$, $b \in (0, \infty)$, $C \in \mathcal{B}(\mathsf{X})$ and a probability measure $\nu \in \mathcal{P}(\mathsf{X})$, such that
\[
\inf_{n \ge 1}\ \inf_{1 \le k \le n} M_{n,k}(x, A) \ge \varepsilon \cdot \nu(A), \quad \forall A \in \mathcal{B}(\mathsf{X}),\ \forall x \in C, \tag{9}
\]
\[
\sup_{n \ge 1}\ \sup_{1 \le k \le n} M_{n,k} V(x) \le \lambda V(x) + b\, \mathbb{I}_C(x), \quad \forall x \in \mathsf{X}. \tag{10}
\]

Lemma 1.
Assume (A1)-(A2). Then for each $n \in \mathbb{N}$ and $1 \le k \le n$,
\[
S_{n,k}(x, \cdot) \ge \epsilon_{n,k}\, \nu_{n,k}(\cdot), \quad \forall x \in C, \tag{11}
\]
\[
S_{n,k} V_{n,k}(x) \le \lambda V_{n,k-1}(x) + b_{n,k-1}\, \mathbb{I}_C(x), \quad \forall x \in \mathsf{X}, \tag{12}
\]
where
\[
\epsilon_{n,k} := \varepsilon \cdot \nu\left(\widetilde{Q}_{n,k:n}(1)\right),
\qquad
b_{n,k-1} := \frac{b}{\varepsilon \cdot \nu\left(\widetilde{Q}_{n,k:n}(1)\right)}, \tag{13}
\]
and $\nu_{n,k} \in \mathcal{P}(\mathsf{X})$ and $V_{n,k} : \mathsf{X} \to [1, \infty)$ are defined by
\[
\nu_{n,k}(A) := \frac{\nu\left(\widetilde{Q}_{n,k:n}(1)\, \mathbb{I}_A\right)}{\nu\left(\widetilde{Q}_{n,k:n}(1)\right)}, \quad A \in \mathcal{B}(\mathsf{X}), \tag{14}
\]
\[
V_{n,k}(x) := \frac{V(x)}{M_{n,k+1}\left(\widetilde{Q}_{n,k+1:n}(1)\right)(x)}, \quad x \in \mathsf{X},\ 0 \le k < n, \tag{15}
\]
$V_{n,n} := V$, with $\lambda$, $V$, $b$, $\nu$, $\varepsilon$ and $C$ as in (A2).

Proof. Noting that $0 < \widetilde{Q}_{n,k:n}(1)(x) \le 1$ for all $x \in \mathsf{X}$, we have
\[
S_{n,k}(x, A) = \frac{M_{n,k}\left(\widetilde{Q}_{n,k:n}(1)\, \mathbb{I}_A\right)(x)}{M_{n,k}\left(\widetilde{Q}_{n,k:n}(1)\right)(x)}
\ge M_{n,k}\left(\widetilde{Q}_{n,k:n}(1)\, \mathbb{I}_A\right)(x)
\ge \varepsilon \cdot \nu\left(\widetilde{Q}_{n,k:n}(1)\right) \frac{\nu\left(\widetilde{Q}_{n,k:n}(1)\, \mathbb{I}_A\right)}{\nu\left(\widetilde{Q}_{n,k:n}(1)\right)}, \quad x \in C,
\]
and, since $\widetilde{Q}_{n,k:n}(1)\, V_{n,k} = \widetilde{G}_{n,k} V \le V$ by (15),
\[
\begin{aligned}
S_{n,k} V_{n,k}(x)
&= \frac{M_{n,k}\left(\widetilde{G}_{n,k} V\right)(x)}{M_{n,k}\left(\widetilde{Q}_{n,k:n}(1)\right)(x)} \\
&\le \lambda \frac{V(x)}{M_{n,k}\left(\widetilde{Q}_{n,k:n}(1)\right)(x)} + \frac{b\, \mathbb{I}_C(x)}{M_{n,k}\left(\widetilde{Q}_{n,k:n}(1)\right)(x)} \\
&\le \lambda V_{n,k-1}(x) + \frac{b\, \mathbb{I}_C(x)}{\varepsilon \cdot \nu\left(\widetilde{Q}_{n,k:n}(1)\right)}, \quad \forall x \in \mathsf{X},
\end{aligned} \tag{16}
\]
recalling the convention $\widetilde{Q}_{n,n:n}(1) = 1$.

The minorization and drift conditions (11)-(12) pave the way to establishing the stability properties of the operators $\{\Phi_{n,k}\}$. In order to move further we introduce assumption (A3) below, which is a stricter version of (A2). The extra structure of (A3) allows the construction in Proposition 1 of bi-variate drift and minorization conditions. Consideration of the case in which the small set arises as a sub-level set of the drift function $V$ is a standard and generic approach to constructing bi-variate drift conditions from their uni-variate counterparts, see for example [1, 14]. Furthermore, assumption (A3) allows the level in question to be chosen in a very flexible way, and this property is exploited in Proposition 1 when dealing with the specific structure arising from the kernels $\{S_{n,k}\}$.
Verification of (A3) in a particular application is provided in section 5. Assumption (A4) allows the bounding of non-homogeneous minorization and drift constants. A generic approach to verifying (A4) is presented at the end of this section, where the connection with the flattening property mentioned in the introduction is made more explicit.

(A3) There exists $d_0 \ge 1$, $\lambda \in [0,1)$, a function $V : \mathsf{X} \to [1, \infty)$, and for all $d \in [d_0, +\infty)$, there exists $\varepsilon_d \in (0,1]$, $b_d \in (0, \infty)$ and $\nu_d \in \mathcal{P}(\mathsf{X})$ such that $\nu_d(C_d) > 0$, $\nu_d(V) < +\infty$,
\[
\inf_{n \ge 1}\ \inf_{1 \le k \le n} M_{n,k}(x, A) \ge \varepsilon_d \cdot \nu_d(A), \quad \forall A \in \mathcal{B}(\mathsf{X}),\ \forall x \in C_d, \tag{17}
\]
\[
\sup_{n \ge 1}\ \sup_{1 \le k \le n} M_{n,k} V(x) \le \lambda V(x) + b_d\, \mathbb{I}_{C_d}(x), \quad \forall x \in \mathsf{X}, \tag{18}
\]
where $C_d := \{x : V(x) \le d\}$.

(A4) Whenever (A1) and (A3) hold, for any $\mu \in \mathcal{P}(\mathsf{X})$ with $\mu(V) < +\infty$, and $d \in [d_0, +\infty)$ there exists a positive and finite constant $K(\mu, \lambda, V, b_d)$ such that
\[
\inf_{n \ge 1}\ \inf_{0 \le k \le n} \mu\left(\widetilde{Q}_{n,k:n}(1)\right) \ge K(\mu, \lambda, V, b_d),
\]
where $\lambda$, $V$ and $d_0$ are as in (A3).

The main result of this section is now presented.

Theorem 1. Assume (A1), (A3) and (A4). Then for $\alpha \in (0,1]$, there exists $\rho \in (\lambda, 1)$ and a finite constant $M$ such that for any $\mu, \mu' \in \mathcal{P}(\mathsf{X})$, any $n \in \mathbb{N}$ and any $1 \le k \le n$,
\[
\left\|\mu S_{n,k} \cdots S_{n,n} - \mu' S_{n,k} \cdots S_{n,n}\right\|_{V^\alpha} \le M \rho^{n-k+1}\left[\mu\left(V^\alpha_{n,k-1}\right) + \mu'\left(V^\alpha_{n,k-1}\right)\right], \tag{19}
\]
where $V_{n,k-1}$ is as given in equation (15), and consequently, for $0 \le k \le n$,
\[
\left\|\Phi_{n,k:n}(\mu) - \Phi_{n,k:n}(\mu')\right\|_{V^\alpha} \le M \rho^{n-k}\left[\frac{\mu\left(\widetilde{G}_{n,k} V^\alpha\right)}{\mu\left(\widetilde{Q}_{n,k:n}(1)\right)} + \frac{\mu'\left(\widetilde{G}_{n,k} V^\alpha\right)}{\mu'\left(\widetilde{Q}_{n,k:n}(1)\right)}\right]. \tag{20}
\]

The proof of Theorem 1 is postponed. It involves the bi-variate drift functions identified in the following proposition.

Proposition 1. Assume (A1) and (A3) and let $V$, $d_0$ and $\lambda$ be as in (A3).
Then for all $d \ge d_0$, and all $\bar{\lambda} \in (\lambda, 1)$ there exists $\bar{d} \ge d$, and for each $n \in \mathbb{N}$, there exist

1) a collection of functions $\left\{\bar{V}_{n,k};\ 0 \le k \le n\right\}$, with each $\bar{V}_{n,k} : \mathsf{X} \times \mathsf{X} \to [1, \infty)$ and $\bar{V}_{n,n}(x, x') = \frac{1}{2}\left[V(x) + V(x')\right]$;

2) collections $\{\bar{\epsilon}_{n,k};\ 1 \le k \le n\}$ and $\left\{\bar{b}_{n,k};\ 0 \le k \le n-1\right\}$ depending on $d$ and $\bar{d}$, with each $\bar{\epsilon}_{n,k} \in (0,1]$ and each $\bar{b}_{n,k} \in (0, \infty)$;

3) a collection of probability measures $\{\bar{\nu}_{n,k};\ 1 \le k \le n\}$, depending on $\bar{d}$, with each $\bar{\nu}_{n,k} \in \mathcal{P}(\mathsf{X})$;

such that for $1 \le k \le n$,
\[
S_{n,k}(x, \cdot) \wedge S_{n,k}(x', \cdot) \ge \bar{\epsilon}_{n,k}\, \bar{\nu}_{n,k}(\cdot), \quad \forall (x, x') \in \bar{C}_{\bar{d}}, \tag{21}
\]
\[
S^*_{n,k} \bar{V}_{n,k}(x, x') \le \bar{\lambda}\, \bar{V}_{n,k-1}(x, x') + \bar{b}_{n,k-1}\, \mathbb{I}_{\bar{C}_{\bar{d}}}(x, x'), \quad \forall (x, x') \in \mathsf{X} \times \mathsf{X}, \tag{22}
\]
where $\bar{C}_{\bar{d}} := \left\{(x, x') : V(x) \le \bar{d},\ V(x') \le \bar{d}\right\}$, $S^*_{n,k} : \mathsf{X} \times \mathsf{X} \times \mathcal{B}(\mathsf{X} \times \mathsf{X}) \to [0,1]$ is defined by
\[
S^*_{n,k}\left((x, x'), d(y, y')\right) =
\begin{cases}
S_{n,k}(x, dy)\, S_{n,k}(x', dy'), & (x, x') \notin \bar{C}_{\bar{d}}, \\
\bar{R}_{n,k}\left((x, x'), d(y, y')\right), & (x, x') \in \bar{C}_{\bar{d}},
\end{cases}
\]
and $\bar{R}_{n,k} : \mathsf{X} \times \mathsf{X} \times \mathcal{B}(\mathsf{X} \times \mathsf{X}) \to [0,1]$ is defined by
\[
\bar{R}_{n,k}\left((x, x'), d(y, y')\right) = \frac{1}{(1 - \bar{\epsilon}_{n,k})^2}\left(S_{n,k}(x, dy) - \bar{\epsilon}_{n,k}\, \bar{\nu}_{n,k}(dy)\right)\left(S_{n,k}(x', dy') - \bar{\epsilon}_{n,k}\, \bar{\nu}_{n,k}(dy')\right).
\]
Furthermore,
\[
\inf_{n \ge 1}\ \inf_{1 \le k \le n} \bar{\epsilon}_{n,k} > 0, \quad \text{and} \quad \sup_{n \ge 1}\ \sup_{0 \le k \le n-1} \bar{b}_{n,k} < +\infty. \tag{23}
\]

Proof. Throughout the proof, expressions featuring indices $n$ and $k$ hold for all $n \in \mathbb{N}$ and $1 \le k \le n$, unless stated otherwise. Fix $d \ge d_0$ and $\bar{\lambda} \in (\lambda, 1)$. Then let $\varepsilon_d$, $b_d$ and $\nu_d$ be the corresponding constants and minorizing measure from (A3). Let $K(\nu_d, \lambda, V, b_d)$ be the constant of assumption (A4) corresponding to $\nu_d$ and $b_d$, and set
\[
\bar{d} := \left[\frac{b_d}{\varepsilon_d\left(\bar{\lambda} - \lambda\right) K(\nu_d, \lambda, V, b_d)} - 1\right] \vee d.
\]
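Then for equation (21), under assumption (A3), there exists $\varepsilon_{\bar{d}}$ and $\nu_{\bar{d}}$ such that
\[
S_{n,k}(x, A) \ge \varepsilon_{\bar{d}} \cdot \nu_{\bar{d}}\left(\widetilde{Q}_{n,k:n}(1)\, \mathbb{I}_A\right) = \varepsilon_{\bar{d}} \cdot \nu_{\bar{d}}\left(\widetilde{Q}_{n,k:n}(1)\right) \frac{\nu_{\bar{d}}\left(\widetilde{Q}_{n,k:n}(1)\, \mathbb{I}_A\right)}{\nu_{\bar{d}}\left(\widetilde{Q}_{n,k:n}(1)\right)},
\]
for all $x \in C_{\bar{d}}$. Then setting
\[
\bar{\epsilon}_{n,k} := \varepsilon_{\bar{d}} \cdot \nu_{\bar{d}}\left(\widetilde{Q}_{n,k:n}(1)\right),
\qquad
\bar{\nu}_{n,k}(A) := \frac{\nu_{\bar{d}}\left(\widetilde{Q}_{n,k:n}(1)\, \mathbb{I}_A\right)}{\nu_{\bar{d}}\left(\widetilde{Q}_{n,k:n}(1)\right)}, \quad A \in \mathcal{B}(\mathsf{X}), \tag{24}
\]
establishes (21). The first part of (23) is an immediate consequence of (A4) and (24):
\[
\varepsilon_{\bar{d}} \cdot \nu_{\bar{d}}\left(\widetilde{Q}_{n,k:n}(1)\right) \ge \varepsilon_{\bar{d}}\, K(\nu_{\bar{d}}, \lambda, V, b_{\bar{d}}). \tag{25}
\]
Let $\{V_{n,k}\}$ be as defined in (15). Consider the collection of bi-variate drift functions $\left\{\bar{V}_{n,k};\ 0 \le k \le n\right\}$, with each $\bar{V}_{n,k} : \mathsf{X} \times \mathsf{X} \to [1, \infty)$ defined by
\[
\bar{V}_{n,k}(x, x') := \frac{1}{2}\left[V_{n,k}(x) + V_{n,k}(x')\right]. \tag{26}
\]
We now proceed to establish the bi-variate drift condition of equation (22). First, following the same arguments as in the proof of Lemma 1, under (A3),
\[
S_{n,k} V_{n,k}(x) \le \lambda V_{n,k-1}(x) + \frac{b_d\, \mathbb{I}_{C_d}(x)}{\varepsilon_d \cdot \nu_d\left(\widetilde{Q}_{n,k:n}(1)\right)}, \quad \forall x \in \mathsf{X}. \tag{27}
\]
From (27), for $(x, x') \notin \bar{C}_{\bar{d}}$ we have
\[
\begin{aligned}
S^*_{n,k} \bar{V}_{n,k}(x, x')
&\le \frac{\lambda}{2}\left[V_{n,k-1}(x) + V_{n,k-1}(x')\right] + \frac{b_d}{\varepsilon_d\, K(\nu_d, \lambda, V, b_d)}\, \frac{1}{2}\left[\mathbb{I}_{C_d}(x) + \mathbb{I}_{C_d}(x')\right] \\
&\le \frac{\lambda}{2}\left[V_{n,k-1}(x) + V_{n,k-1}(x')\right] + \left(\bar{\lambda} - \lambda\right)\left(\bar{d} + 1\right) \frac{1}{2}\left[\mathbb{I}_{C_{\bar{d}}}(x) + \mathbb{I}_{C_{\bar{d}}}(x')\right] \\
&\le \frac{\lambda}{2}\left[V_{n,k-1}(x) + V_{n,k-1}(x')\right] + \left(\bar{\lambda} - \lambda\right) \frac{1}{2}\left[V(x) + V(x')\right]\left[\mathbb{I}_{C_{\bar{d}}}(x) + \mathbb{I}_{C_{\bar{d}}}(x')\right] \\
&\le \frac{\lambda}{2}\left[V_{n,k-1}(x) + V_{n,k-1}(x')\right] + \left(\bar{\lambda} - \lambda\right) \frac{1}{2}\left[V_{n,k-1}(x) + V_{n,k-1}(x')\right]\left[\mathbb{I}_{C_{\bar{d}}}(x) + \mathbb{I}_{C_{\bar{d}}}(x')\right] \\
&\le \bar{\lambda}\, \bar{V}_{n,k-1}(x, x'),
\end{aligned}
\]
where for the first inequality (A4) has been applied, the second inequality is due to the definition of $\bar{d}$ and the penultimate inequality is due to the definition of $\bar{V}_{n,k-1}$.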
For all $(x, x') \in \bar{C}_{\bar{d}}$,
\[
\begin{aligned}
S^*_{n,k} \bar{V}_{n,k}(x, x')
&= \bar{R}_{n,k} \bar{V}_{n,k}(x, x') \\
&= \frac{1}{2(1 - \bar{\epsilon}_{n,k})}\left[S_{n,k} V_{n,k}(x) + S_{n,k} V_{n,k}(x') - 2\bar{\epsilon}_{n,k}\, \bar{\nu}_{n,k}(V_{n,k})\right] \\
&\le \frac{\lambda}{2(1 - \bar{\epsilon}_{n,k})}\left[V_{n,k-1}(x) + V_{n,k-1}(x')\right] + \frac{b_d}{2(1 - \bar{\epsilon}_{n,k})\, \varepsilon_d\, \nu_d\left(\widetilde{Q}_{n,k:n}(1)\right)}\left[\mathbb{I}_{C_d}(x) + \mathbb{I}_{C_d}(x')\right] \\
&\le \frac{\lambda \bar{d}}{(1 - \bar{\epsilon}_{n,k})}\, \frac{1}{\inf_{x : V(x) \le \bar{d}} M_{n,k}\left(\widetilde{Q}_{n,k:n}(1)\right)(x)} + \frac{b_d}{(1 - \bar{\epsilon}_{n,k})\, \varepsilon_d\, \nu_d\left(\widetilde{Q}_{n,k:n}(1)\right)} =: \bar{b}_{n,k-1},
\end{aligned} \tag{28}
\]
where equation (27) has been used. This concludes the proof of equation (22). Applying (A4) to the denominator terms in (28) and using (24)-(25) establishes the remaining part of equation (23).

Proof. (Theorem 1). Let $d_0$ and $\lambda$ be as in (A3). Set $d \ge d_0$, $\bar{\lambda} \in (\lambda, 1)$ and let $\bar{d}$, $\{\bar{\epsilon}_{n,k}\}$, $\left\{\bar{b}_{n,k}\right\}$, $\left\{\bar{R}_{n,k}\right\}$, and $\left\{\bar{V}_{n,k}\right\}$ be as in Proposition 1. The latter verifies conditions (NS1) and (NS2) of Douc et al. [14]. Consequently [14, Theorem 8] may be applied. For $\alpha = 1$, the uniform bounds in equation (23) of Proposition 1, combined with standard manipulations of the bounds of Douc et al. [14, Theorem 8] (details omitted for brevity) show that there exists a finite constant $M$ and $\rho < 1$ such that
\[
\left\|\mu S_{n,k} \cdots S_{n,n} - \mu' S_{n,k} \cdots S_{n,n}\right\|_V \le M \rho^{n-k+1}\left[\mu(V_{n,k-1}) + \mu'(V_{n,k-1})\right]. \tag{29}
\]
Noting that from equation (7),
\[
\Phi_{n,k:n}(\mu)(A) = \frac{\mu\left(\widetilde{G}_{n,k}\, M_{n,k+1}\left(\widetilde{Q}_{n,k+1:n}(1)\right) S_{n,k+1} \cdots S_{n,n}(A)\right)}{\mu\left(\widetilde{Q}_{n,k:n}(1)\right)},
\]
equation (20) holds due to (29) and the definition of $V_{n,k-1}$ given in equation (15).

For the case $\alpha \in (0,1)$, due to Jensen's inequality and the fact that for any two non-negative reals $a, b$ and $\alpha \in [0,1]$, $(a + b)^\alpha \le a^\alpha + b^\alpha$, we have that whenever equation (22) of Proposition 1 holds,
\[
S^*_{n,k}\left(\bar{V}^\alpha_{n,k}\right)(x, x') \le \left[\bar{\lambda}\, \bar{V}_{n,k-1}(x, x') + \bar{b}_{n,k-1}\, \mathbb{I}_{\bar{C}_{\bar{d}}}(x, x')\right]^\alpha \le \bar{\lambda}^\alpha\, \bar{V}^\alpha_{n,k-1}(x, x') + \bar{b}^\alpha_{n,k-1}\, \mathbb{I}_{\bar{C}_{\bar{d}}}(x, x'), \quad \forall (x, x') \in \mathsf{X} \times \mathsf{X},
\]
and for $\bar{V}_{n,k-1}$ given in equation (26),
\[
\bar{V}^\alpha_{n,k-1}(x, x') \le \frac{1}{2^\alpha}\left[V^\alpha_{n,k-1}(x) + V^\alpha_{n,k-1}(x')\right].
\]
The arguments as for the case $\alpha = 1$ are then repeated, essentially replacing $\bar{V}_{n,k}$, $\bar{\lambda}$, $\bar{b}_{n,k-1}$ by $\bar{V}^\alpha_{n,k}$, $\bar{\lambda}^\alpha$ and $\bar{b}^\alpha_{n,k-1}$ respectively, in order to establish equation (19) and thus (20). The details are omitted for brevity.

3.1 Verifying assumption (A4)

The following lemma illustrates that (A4) can be verified under a generic condition on the decay in $x$ of the potential functions $\{G_{n,k}\}$ specified via $\{U_{n,k}\}$, relative to the drift function $V$ of assumption (A3).

Lemma 2. Assume (A1) and (A3). Let $V$, $d_0$ and $\lambda$ be as in (A3) and assume
\[
\sup_{n \ge 1}\ \sup_{0 \le k \le n-1}\ \sup_{x \in \mathsf{X}} \frac{U_{n,k}(x)}{V(x)} < +\infty.
\]
For any $d \ge d_0$ let $b_d$ be the corresponding constant of (A3). Then there exists a positive, finite constant $C$ depending only on $\lambda$ and $b_d$ such that for any $\mu \in \mathcal{P}(\mathsf{X})$ with $\mu(V) < +\infty$,
\[
\inf_{n \ge 1}\ \inf_{0 \le k \le n} \mu\left(\widetilde{Q}_{n,k:n}(1)\right) \ge \exp\left[-C \mu(V)\right]. \tag{30}
\]

Proof. Firstly, $\mu\left(\widetilde{Q}_{n,n:n}(1)\right) = \mu(1)$ and by Jensen's inequality,
\[
\mu\left(\widetilde{Q}_{n,n-1:n}(1)\right) = \mu\left(\widetilde{G}_{n,n-1}\right) \ge \exp\left[-\frac{1}{n} \mu(U_{n,n-1})\right] \ge \exp\left[-\mu(V) \left\|U_{n,n-1}\right\|_V\right].
\]
For $1 \le k < n-1$, by Jensen's inequality,
\[
\begin{aligned}
\mu\left(\widetilde{Q}_{n,k:n}(1)\right)
&= \int_{\mathsf{X}^{n-k+1}} \exp\left(-\frac{1}{n} \sum_{\ell=k}^{n-1} U_{n,\ell}(x_\ell)\right) \mu(dx_k) \prod_{\ell=k+1}^{n} M_{n,\ell}(x_{\ell-1}, dx_\ell) \\
&\ge \exp\left[-\frac{1}{n} \int_{\mathsf{X}^{n-k+1}} \left(\sum_{\ell=k}^{n-1} U_{n,\ell}(x_\ell)\right) \mu(dx_k) \prod_{\ell=k+1}^{n} M_{n,\ell}(x_{\ell-1}, dx_\ell)\right] \\
&= \exp\left[-\frac{1}{n} \mu(U_{n,k}) - \frac{1}{n} \sum_{\ell=k+1}^{n-1} \int_{\mathsf{X}} U_{n,\ell}(x_\ell)\, \mu M_{n,k:\ell}(dx_\ell)\right].
\end{aligned} \tag{31}
\]
Iteration of the drift inequality in (A3) shows that for any $1 \le k < \ell < n$,
\[
\int_{\mathsf{X}} V(x_\ell)\, M_{n,k:\ell}(x_k, dx_\ell) \le \lambda^{\ell-k} V(x_k) + b_d \sum_{j=0}^{\ell-k-1} \lambda^j. \tag{32}
\]
It follows from (32) that
\[
\int_{\mathsf{X}} U_{n,\ell}(x_\ell)\, \mu M_{n,k:\ell}(dx_\ell) \le \left\|U_{n,\ell}\right\|_V \int_{\mathsf{X}} V(x_\ell)\, \mu M_{n,k:\ell}(dx_\ell) \le \left\|U_{n,\ell}\right\|_V \left[\mu(V) + b_d \frac{1}{1 - \lambda}\right],
\]
which combined with (31), and the fact that $\mu(V) \ge 1$, implies the desired result.
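The Jensen step leading to (31) can be illustrated numerically on a finite state space. The following sketch uses randomly generated kernels and non-negative functions $U_{n,\ell}$; these, together with the dimensions and variable names, are assumptions made for the example and play no role in the paper.

```python
import numpy as np

rng = np.random.default_rng(3)
d, n, k = 4, 6, 2  # states, time steps and starting index (illustrative values)

def random_kernel(rng, d):
    # A strictly positive row-stochastic matrix, standing in for a Markov kernel.
    P = rng.uniform(0.1, 1.0, size=(d, d))
    return P / P.sum(axis=1, keepdims=True)

M = [None] + [random_kernel(rng, d) for _ in range(n)]  # M[l], l = 1..n
U = [rng.uniform(0.0, 3.0, size=d) for _ in range(n)]   # U[l] >= 0, l = 0..n-1
G_tilde = [np.exp(-u / n) for u in U]                   # G~_{n,l} = exp(-U_{n,l}/n)

mu = np.full(d, 1.0 / d)

# Left-hand side: mu(Q~_{n,k:n}(1)) by backward recursion over j = n, ..., k+1.
h = np.ones(d)
for j in range(n, k, -1):
    h = G_tilde[j - 1] * (M[j] @ h)
lhs = mu @ h

# Right-hand side of (31): exp(-(1/n) sum_{l=k}^{n-1} mu M_{n,k:l}(U_{n,l})).
law = mu.copy()   # law of X_k
total = 0.0
for l in range(k, n):
    total += law @ U[l]
    law = law @ M[l + 1]  # push forward to the law of X_{l+1}
rhs = np.exp(-total / n)
```

In this setting `rhs <= lhs <= 1`, in agreement with (31); the gap between the two reflects the convexity of the exponential.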
4 $L_p$ error bounds for the particle measures

Making use of the results of section 3, the following theorem presents an $L_p$ bound on the error $\eta^N_{n,n}(f) - \eta_{n,n}(f)$, for some possibly unbounded $f$. This theorem rests on assumptions about the moments of the mean particle drift, $\eta^N_{n,k}(V)$, and a related normalization quantity, which are used in the proof to bound the moments of martingale increments associated with the particle approximation. Discussion of (34) is given in the appendix and both assumptions are verified in the application of section 5.

Theorem 2. Assume (A1), (A3) and (A4). Let $V$ be as in (A3) and for $s > 0$ an independent parameter let $t := \frac{1+s}{s}$. Let $p \ge 1$ and $\alpha \in [0,1]$ be such that $\alpha t p \le 1$ and $(1+s)p \le 1$, and for $\mu \in \mathcal{P}(\mathsf{X})$ assume
\[
\sup_{N \ge 1}\ \sup_{n \ge 1}\ \sup_{1 \le k \le n} \bar{\mathbb{E}}_\mu\left[\eta^N_{n,k}\left(\widetilde{Q}_{n,k:n}(1)\right)^{-(1+s)p}\right] < +\infty, \tag{33}
\]
\[
\sup_{N \ge 1}\ \sup_{n \ge 1}\ \sup_{1 \le k \le n} \bar{\mathbb{E}}_\mu\left[\eta^N_{n,k}\left(V^{\alpha t p}\right)\right] < +\infty. \tag{34}
\]
Then there exists $\rho < 1$ and a finite constant $C$ depending on $\alpha$, $\mu$, $V$, and the constants in (A1), (A3), and (A4) such that for any $f \in \mathcal{L}_{V^\alpha}$, $n \in \mathbb{N}$ and $N \in \mathbb{N}$,
\[
\bar{\mathbb{E}}_\mu\left[\left|\left(\eta^N_{n,n} - \eta_{n,n}\right)(f)\right|^p\right]^{1/p} \le C \|f\|_{V^\alpha} \left(\frac{1 - \rho^n}{1 - \rho}\right) \frac{1}{\sqrt{N}}. \tag{35}
\]

Proof. Throughout the proof $C$ denotes a constant whose value may change on each appearance. Consider the telescoping decomposition
\[
\left(\eta^N_{n,n} - \eta_{n,n}\right)(f) = \sum_{k=0}^{n} \left[\Phi_{n,k:n}\left(\eta^N_{n,k}\right) - \Phi_{n,k:n}\left(\Phi_{n,k}\left(\eta^N_{n,k-1}\right)\right)\right](f), \tag{36}
\]
with the convention $\Phi_{n,0}\left(\eta^N_{n,-1}\right) := \mu$.
For any of the terms in the summation of equation (36), following the approach of Del Moral [6, page 245], we have

$$\begin{aligned}
&\left[\Phi_{n,k:n}\left(\eta^N_{n,k}\right)-\Phi_{n,k:n}\left(\Phi_{n,k}\left(\eta^N_{n,k-1}\right)\right)\right](f)\\
&\quad=\left[\frac{\eta^N_{n,k}\widetilde Q_{n,k:n}}{\eta^N_{n,k}\widetilde Q_{n,k:n}(1)}-\frac{\Phi_{n,k}\left(\eta^N_{n,k-1}\right)\widetilde Q_{n,k:n}}{\Phi_{n,k}\left(\eta^N_{n,k-1}\right)\widetilde Q_{n,k:n}(1)}\right](f)\\
&\quad=\frac{1}{\eta^N_{n,k}\widetilde Q_{n,k:n}(1)}\left[\eta^N_{n,k}\widetilde Q_{n,k:n}-\eta^N_{n,k}\widetilde Q_{n,k:n}(1)\,\frac{\Phi_{n,k}\left(\eta^N_{n,k-1}\right)\widetilde Q_{n,k:n}}{\Phi_{n,k}\left(\eta^N_{n,k-1}\right)\widetilde Q_{n,k:n}(1)}\right](f)\\
&\quad=\frac{1}{\eta^N_{n,k}\widetilde Q_{n,k:n}(1)}\left[\eta^N_{n,k}-\Phi_{n,k}\left(\eta^N_{n,k-1}\right)\right]\widetilde Q^{N}_{n,k:n}(f),
\end{aligned}\tag{37}$$

where

$$\widetilde Q^{N}_{n,k:n}(f)(x):=\widetilde Q_{n,k:n}(f)(x)-\widetilde Q_{n,k:n}(1)(x)\,\frac{\Phi_{n,k}\left(\eta^N_{n,k-1}\right)\widetilde Q_{n,k:n}(f)}{\Phi_{n,k}\left(\eta^N_{n,k-1}\right)\widetilde Q_{n,k:n}(1)}.$$

From equations (36) and (37), for $p\ge1$,

$$\begin{aligned}
\bar{\mathbb E}_\mu\left[\left|\left(\eta^N_{n,n}-\eta_{n,n}\right)(f)\right|^p\right]^{1/p}
&\le\sum_{k=0}^{n}\bar{\mathbb E}_\mu\left[\left|\frac{1}{\eta^N_{n,k}\widetilde Q_{n,k:n}(1)}\left[\eta^N_{n,k}-\Phi_{n,k}\left(\eta^N_{n,k-1}\right)\right]\widetilde Q^{N}_{n,k:n}(f)\right|^{p}\right]^{1/p}\\
&\le\sum_{k=0}^{n}\bar{\mathbb E}_\mu\left[\left|\frac{1}{\eta^N_{n,k}\widetilde Q_{n,k:n}(1)}\right|^{(1+s)p}\right]^{1/[(1+s)p]}\bar{\mathbb E}_\mu\left[\left|\left[\eta^N_{n,k}-\Phi_{n,k}\left(\eta^N_{n,k-1}\right)\right]\widetilde Q^{N}_{n,k:n}(f)\right|^{pt}\right]^{1/(tp)},
\end{aligned}\tag{38}$$

where Minkowski's and Hölder's inequalities have been applied. We next proceed to bound each of the factors in the summands of equation (38). Denoting

$$\left[\eta^N_{n,k}-\Phi_{n,k}\left(\eta^N_{n,k-1}\right)\right]\widetilde Q^{N}_{n,k:n}(f)=\frac1N\sum_{i=1}^{N}T^{(i)}_{n,k},$$

where

$$T^{(i)}_{n,k}:=\widetilde Q^{N}_{n,k:n}(f)\left(\xi^{(N,i)}_{n,k}\right)-\Phi_{n,k}\left(\eta^N_{n,k-1}\right)\widetilde Q^{N}_{n,k:n}(f)$$

(with the dependence of $T^{(i)}_{n,k}$ on $N$ suppressed), we have that for any $n\in\mathbb N$, $1\le k\le n$ and $1\le i\le N$,

$$\bar{\mathbb E}_\mu\left[\left.T^{(i)}_{n,k}\,\right|\,\mathcal F^{(N,i-1)}_{n,k}\right]=0,$$

with the convention that $\mathcal F^{(N,0)}_{n,k}=\mathcal F^{(N,N)}_{n,k-1}$. Next, there exists a constant $C$ such that for any $p\ge1$,

$$\begin{aligned}
\bar{\mathbb E}_\mu\left[\left|T^{(i)}_{n,k}\right|^{p}\right]^{1/p}
&=\bar{\mathbb E}_\mu\left[\left|\widetilde Q_{n,k:n}(1)\left(\xi^{(N,i)}_{n,k}\right)\left[S_{n,k+1}\cdots S_{n,n}(f)\left(\xi^{(N,i)}_{n,k}\right)-\frac{\Phi_{n,k}\left(\eta^N_{n,k-1}\right)\left(\widetilde Q_{n,k:n}(1)\,S_{n,k+1}\cdots S_{n,n}(f)\right)}{\Phi_{n,k}\left(\eta^N_{n,k-1}\right)\left(\widetilde Q_{n,k:n}(1)\right)}\right]\right|^{p}\right]^{1/p}\\
&\le\rho^{n-k}M\|f\|_{V^\alpha}\,\bar{\mathbb E}_\mu\left[\left|\widetilde Q_{n,k:n}(1)\left(\xi^{(N,i)}_{n,k}\right)\left[V^{\alpha}_{n,k}\left(\xi^{(N,i)}_{n,k}\right)+\frac{\Phi_{n,k}\left(\eta^N_{n,k-1}\right)\left(\widetilde Q_{n,k:n}(1)\,V^{\alpha}_{n,k}\right)}{\Phi_{n,k}\left(\eta^N_{n,k-1}\right)\left(\widetilde Q_{n,k:n}(1)\right)}\right]\right|^{p}\right]^{1/p}\\
&\le\rho^{n-k}M\|f\|_{V^\alpha}\,\bar{\mathbb E}_\mu\left[V^{\alpha p}\left(\xi^{(N,i)}_{n,k}\right)\right]^{1/p}\\
&\qquad+\rho^{n-k}M\|f\|_{V^\alpha}\,\bar{\mathbb E}_\mu\left[\left|\widetilde Q_{n,k:n}(1)\left(\xi^{(N,i)}_{n,k}\right)\frac{\Phi_{n,k}\left(\eta^N_{n,k-1}\right)\left(\widetilde Q_{n,k:n}(1)\,V^{\alpha}_{n,k}\right)}{\Phi_{n,k}\left(\eta^N_{n,k-1}\right)\left(\widetilde Q_{n,k:n}(1)\right)}\right|^{p}\right]^{1/p}\\
&\le\rho^{n-k}M\|f\|_{V^\alpha}\,\bar{\mathbb E}_\mu\left[V^{\alpha p}\left(\xi^{(N,i)}_{n,k}\right)\right]^{1/p}\\
&\qquad+\rho^{n-k}M\|f\|_{V^\alpha}\,\bar{\mathbb E}_\mu\left[\frac{\Phi_{n,k}\left(\eta^N_{n,k-1}\right)\left(\widetilde Q_{n,k:n}(1)\,V^{\alpha p}_{n,k}\right)}{\Phi_{n,k}\left(\eta^N_{n,k-1}\right)\left(\widetilde Q_{n,k:n}(1)\right)}\,\bar{\mathbb E}_\mu\left[\left.\widetilde Q_{n,k:n}(1)\left(\xi^{(N,i)}_{n,k}\right)\right|\mathcal F^{(N,i-1)}_{n,k}\right]\right]^{1/p}\\
&\le\rho^{n-k}M\|f\|_{V^\alpha}\left(\bar{\mathbb E}_\mu\left[\eta^N_{n,k}\left(V^{\alpha p}\right)\right]^{1/p}+\bar{\mathbb E}_\mu\left[\Phi_{n,k}\left(\eta^N_{n,k-1}\right)\left(V^{\alpha p}\right)\right]^{1/p}\right)\\
&\le\rho^{n-k}\,C\,\|f\|_{V^\alpha},
\end{aligned}\tag{39}$$

where Theorem 1, followed by Minkowski's inequality, Jensen's inequality, the exchangeability property of equation (8), Jensen's inequality again and the assumption of equation (34) have been applied. Thus for fixed $N$, $\left\{\left(\sum_{j=1}^{i}T^{(j)}_{n,k},\ \mathcal F^{(N,i)}_{n,k}\right);\ 1\le i\le N\right\}$ is a martingale sequence with increments bounded in $L_p$. It follows that when $tp\ge2$, by the Burkholder-Davis inequality and Minkowski's inequality, there exists a constant $C$ such that

$$\begin{aligned}
\bar{\mathbb E}_\mu\left[\left|\left[\eta^N_{n,k}-\Phi_{n,k}\left(\eta^N_{n,k-1}\right)\right]\widetilde Q^{N}_{n,k:n}(f)\right|^{tp}\right]^{1/(tp)}
&\le C N^{-1}\,\bar{\mathbb E}_\mu\left[\left|\sum_{i=1}^{N}\left(T^{(i)}_{n,k}\right)^{2}\right|^{tp/2}\right]^{1/(tp)}\\
&\le C N^{-1}\left(\sum_{i=1}^{N}\bar{\mathbb E}_\mu\left[\left|T^{(i)}_{n,k}\right|^{tp}\right]^{2/(tp)}\right)^{1/2},
\end{aligned}$$

and when $1<tp<2$, using the fact that for any $a,b\ge0$ and $0\le r\le1$, $(a+b)^r\le a^r+b^r$,

$$\begin{aligned}
\bar{\mathbb E}_\mu\left[\left|\left[\eta^N_{n,k}-\Phi_{n,k}\left(\eta^N_{n,k-1}\right)\right]\widetilde Q^{N}_{n,k:n}(f)\right|^{tp}\right]^{1/(tp)}
&\le C N^{-1}\,\bar{\mathbb E}_\mu\left[\left|\sum_{i=1}^{N}\left(T^{(i)}_{n,k}\right)^{2}\right|^{tp/2}\right]^{1/(tp)}\\
&\le C N^{-1}\left(\sum_{i=1}^{N}\bar{\mathbb E}_\mu\left[\left|T^{(i)}_{n,k}\right|^{tp}\right]\right)^{1/(tp)}.
\end{aligned}$$

Combining with (39), we conclude that there exists a constant $C$ such that

$$\bar{\mathbb E}_\mu\left[\left|\left[\eta^N_{n,k}-\Phi_{n,k}\left(\eta^N_{n,k-1}\right)\right]\widetilde Q^{N}_{n,k:n}(f)\right|^{tp}\right]^{1/(tp)}\le\rho^{n-k}\,C\,\|f\|_{V^\alpha}\,N^{-\{tp/2\,\wedge\,(tp-1)\}/(tp)},$$

for all $n\ge1$ and $1\le k\le n$. The remaining terms in (38) are treated directly by the assumption of equation (33), and therefore upon returning to (38) we conclude that there exists a constant $C$ such that

$$\bar{\mathbb E}_\mu\left[\left|\left(\eta^N_{n,n}-\eta_{n,n}\right)(f)\right|^p\right]^{1/p}\le C\,\|f\|_{V^\alpha}\frac{1}{\sqrt N}\sum_{k=0}^{n}\rho^{n-k}<C\,\|f\|_{V^\alpha}\left(\frac{1-\rho^{n}}{1-\rho}\right)\frac{1}{\sqrt N},$$

and the result holds.

5 Application

This section is concerned with the case in which $\mathsf X=\mathbb R^d$ and $\mathcal B(\mathbb R^d)$ is the corresponding Borel $\sigma$-algebra. Throughout, we consider the following structural definitions and assumptions.

• Let $\pi\in\mathcal P(\mathsf X)$ be a target distribution admitting a density with respect to Lebesgue measure. Also denote by $\pi$ its density. In applications of interest, this density will only be known up to a multiplicative constant, $Z$; denote by $\bar\pi$ the unnormalised density, i.e. $\pi(x)=\bar\pi(x)/Z$, $x\in\mathbb R^d$.

• For $\underline{\gamma}\in(0,1]$ a constant, let $\gamma:[0,1]\to[\underline{\gamma},1]$ be a non-decreasing, Lipschitz function.

• Let $\left\{\pi_\gamma;\ \gamma\in[\underline{\gamma},1]\right\}$ be the family of probability measures defined by

$$\pi_\gamma(A):=\frac{\int_A\bar\pi^{\gamma}(x)\,dx}{\int_{\mathsf X}\bar\pi^{\gamma}(x)\,dx},\qquad A\in\mathcal B(\mathsf X).$$

• Let $q\in\mathcal P(\mathsf X)$ be an increment distribution admitting a density with respect to Lebesgue measure, also denoted by $q$. For each $n\ge1$ and $1\le k\le n$, let $M_{n,k}$ be a random walk Metropolis (RWM) kernel of invariant distribution $\pi_{\gamma(k/n)}$ and proposal kernel $q$, i.e.
$$\begin{aligned}
M_{n,k}(x,A)&=\int_{A-x}\left[1\wedge\frac{\bar\pi^{\gamma(k/n)}(x+y)}{\bar\pi^{\gamma(k/n)}(x)}\right]q(y)\,\lambda^{\mathrm{Leb}}(dy)\\
&\quad+\delta_x(A)\int_{\mathsf X-x}\left[1-\left(1\wedge\frac{\bar\pi^{\gamma(k/n)}(x+y)}{\bar\pi^{\gamma(k/n)}(x)}\right)\right]q(y)\,\lambda^{\mathrm{Leb}}(dy),\qquad A\in\mathcal B(\mathsf X),
\end{aligned}$$

where for any set $C$, $C-x:=\{z\in\mathsf X;\ z+x\in C\}$ (note that in applications it may be of interest to allow $q$ to depend on $n$ and $k$, for example via $\gamma(k/n)$, but for simplicity this issue is not pursued further here).

• Let $\{G_{n,k};\ n\ge1,\ 0\le k\le n-1\}$ be a collection of potential functions defined by

$$G_{n,k}(x)=\exp\left(\frac1n\,\log\bar\pi(x)\left[\frac{\gamma((k+1)/n)-\gamma(k/n)}{1/n}\right]\right).$$

Consider the following assumptions on the target density $\pi$ and increment density $q$.

• The density $\pi$ is strictly positive, bounded and has continuous first derivatives such that

$$\lim_{r\to\infty}\sup_{|x|\ge r}n(x)\cdot\nabla\log\pi(x)=-\infty,\qquad\lim_{r\to\infty}\sup_{|x|\ge r}n(x)\cdot\frac{\nabla\pi(x)}{|\nabla\pi(x)|}<0. \tag{40}$$

• For all $r>0$ there exists $\epsilon_r>0$ such that

$$|x|\le r\ \Rightarrow\ q(x)\ge\epsilon_r. \tag{41}$$

The assumptions of equations (40)-(41) are standard types of assumptions ensuring geometric ergodicity of RWM kernels [26, 20]. The assumption of equation (41) is stronger than the standard one in [20], but is flexible enough to verify (A3), which involves a family of minorization measures and constants indexed over a range of levels of $V$.

The interest in the specific FK models of this section arises from the choice of the initial distribution $\mu$ addressed in the following lemma. This FK model corresponds to a particular choice of the "backwards" kernels in [11], and in the corresponding SMC algorithm the order of the weighting and resampling steps is reversed.

Lemma 3. Consider the operators $\{\Phi_{n,k}\}$ associated with $\{G_{n,k}\}$ and $\{M_{n,k}\}$ of section 5. Then for all $n\ge1$ and $0\le k\le n$,

$$\Phi_{n,0:k}\left(\pi_{\underline{\gamma}}\right)=\pi_{\gamma(k/n)}.$$

Proof. Fix $n$ arbitrarily and suppose the result holds at rank $0\le k<n$.
Then

$$\begin{aligned}
\Phi_{n,0:k+1}\left(\pi_{\underline{\gamma}}\right)(A)
&=\Phi_{n,k+1}\circ\Phi_{n,0:k}\left(\pi_{\underline{\gamma}}\right)(A)\\
&=\frac{\displaystyle\int\frac{\bar\pi^{\gamma((k+1)/n)}(x)}{\bar\pi^{\gamma(k/n)}(x)}\,M_{n,k+1}(A)(x)\,\pi_{\gamma(k/n)}(dx)}{\displaystyle\int\frac{\bar\pi^{\gamma((k+1)/n)}(x)}{\bar\pi^{\gamma(k/n)}(x)}\,\pi_{\gamma(k/n)}(dx)}\\
&=\pi_{\gamma((k+1)/n)}(A),\qquad A\in\mathcal B(\mathsf X),
\end{aligned}$$

due to the property that $M_{n,k}$ is invariant for $\pi_{\gamma(k/n)}(dx)$. The proof is complete upon noting that for all $n$, $\Phi_{n,0:0}=\mathrm{Id}$ by convention.

We have the following result.

Theorem 3. Consider the collection of FK models specified in section 5. Let $s>0$ be an independent parameter and set $t=\frac{1+s}{s}$. Let $\alpha\in[0,1]$ and $p\ge1$ be such that $\alpha p t\le1$ and $(1+s)p(1-\underline{\gamma})/\underline{\gamma}<1$. Then there exist finite constants $C_1(p,\mu)$ and $C_2\left(\mu,\pi_{\underline{\gamma}}\right)$ (depending implicitly on $\pi$ and $\gamma(\cdot)$), and constants $\beta\in(0,1)$ and $\rho\in[0,1)$, such that for any $f\in\mathcal L_{V^\alpha}$, $n\ge1$ and $N\ge1$,

$$\bar{\mathbb E}_\mu\left[\left|\left(\pi^N_n-\pi\right)(f)\right|^p\right]^{1/p}\le\|f\|_{V^\alpha}\left[\frac{C_1(p,\mu)}{\sqrt N}+\rho^{n}\,C_2\left(\mu,\pi_{\underline{\gamma}}\right)\mathbb I\left[\mu\ne\pi_{\underline{\gamma}}\right]\right],$$

where $V(x)\propto\pi_{\underline{\gamma}}^{-\beta}(x)$ and for each $n$, $\pi^N_n:=\eta^N_{n,n}$.

The proof of Theorem 3 is postponed until after the following proposition regarding the verification of assumptions.

Proposition 2. Consider the setting of section 5. Then (A1), (A3) and (A4) hold.

Proof. As the density $\pi$ is bounded and $\gamma(\cdot)$ is Lipschitz, (A1) holds by the mean value theorem. We now turn to the verification of (A3). Various arguments are adopted from Andrieu et al. [1] and the manipulations are fairly standard, but are included here for completeness. The main difference is that we need to explicitly verify the drift and minorization conditions of (A3), which hold over a range of sub-levels for $V$, and the proof below involves verification of some assumptions taken as given in [1, Lemma 4].
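Firstly, due to the definition of $\pi_{\underline{\gamma}}$ we have that

$$\nabla\log\pi_{\underline{\gamma}}(x)=\underline{\gamma}\,\nabla\log\bar\pi(x)=\underline{\gamma}\,\nabla\log\pi(x)$$

and

$$\frac{\nabla\pi(x)}{|\nabla\pi(x)|}=\frac{\bar\pi(x)\nabla\log\bar\pi(x)}{\left|\bar\pi(x)\nabla\log\bar\pi(x)\right|}\cdot\frac{\underline{\gamma}\,\bar\pi^{\underline{\gamma}-1}(x)}{\underline{\gamma}\,\bar\pi^{\underline{\gamma}-1}(x)}=\frac{\nabla\pi_{\underline{\gamma}}(x)}{\left|\nabla\pi_{\underline{\gamma}}(x)\right|},$$

and so

$$\lim_{r\to\infty}\sup_{|x|\ge r}n(x)\cdot\nabla\log\pi_{\underline{\gamma}}(x)=-\infty,\qquad\lim_{r\to\infty}\sup_{|x|\ge r}n(x)\cdot\frac{\nabla\pi_{\underline{\gamma}}(x)}{\left|\nabla\pi_{\underline{\gamma}}(x)\right|}<0. \tag{42}$$

In order to verify (A3) we first verify a drift condition for $M_0(x,dy)$, defined to be the RWM kernel reversible with respect to $\pi_{\underline{\gamma}}$ with increment density $q$ as above, i.e.

$$\begin{aligned}
M_0(x,A)&=\int_{A-x}\left[1\wedge\frac{\bar\pi^{\underline{\gamma}}(x+y)}{\bar\pi^{\underline{\gamma}}(x)}\right]q(y)\,\lambda^{\mathrm{Leb}}(dy)\\
&\quad+\delta_x(A)\int_{\mathsf X-x}\left[1-\left(1\wedge\frac{\bar\pi^{\underline{\gamma}}(x+y)}{\bar\pi^{\underline{\gamma}}(x)}\right)\right]q(y)\,\lambda^{\mathrm{Leb}}(dy),\qquad A\in\mathcal B(\mathsf X).
\end{aligned}$$

Let $\beta\in(0,1)$ and define $V:\mathsf X\to[1,+\infty)$ by

$$V(x):=\frac{\pi^{-\underline{\gamma}\beta}(x)}{\inf_{x}\pi^{-\underline{\gamma}\beta}(x)}. \tag{43}$$

The results of Jarner and Hansen [20] show that when (42) holds, then for $M_0$ with increment density $q$ satisfying equation (41), it holds that

$$\lim_{r\to\infty}\sup_{|x|\ge r}\frac{M_0V(x)}{V(x)}<1.$$

Thus there exist $\lambda<1$ and $\rho_\lambda<+\infty$ such that

$$|x|\ge\rho_\lambda\ \Rightarrow\ \frac{M_0V(x)}{V(x)}\le\lambda. \tag{44}$$

Due to (40), there exist $\epsilon>0$ and $\rho_\epsilon>0$ such that

$$|x|\ge\rho_\epsilon\ \Rightarrow\ n(x)\cdot\nabla\log\pi_{\underline{\gamma}}(x)\le-\epsilon.$$

Now set $r_0=\rho_\lambda\vee\rho_\epsilon$ and $d_0:=\sup_{|x|\le r_0}V(x)$. Note that $d_0<+\infty$ due to the definition of $V$ and as the density $\pi$ is continuous and positive. We now proceed to verify the drift part of (A3).

For any $d\ge d_0$ let $C_d:=\{x:V(x)\le d\}$. We then have

$$\begin{aligned}
\sup_{x\in C_d}M_0V(x)&\le d\,\sup_{x\in C_d}\frac{M_0V(x)}{V(x)}\\
&=d\,\sup_{x\in C_d}\left\{\int_{A(x)}\frac{\bar\pi^{-\underline{\gamma}\beta}(x+y)}{\bar\pi^{-\underline{\gamma}\beta}(x)}\,q(y)\,\lambda^{\mathrm{Leb}}(dy)\right.\\
&\qquad\left.+\int_{R(x)}\left[1-\frac{\bar\pi^{\underline{\gamma}}(x+y)}{\bar\pi^{\underline{\gamma}}(x)}+\frac{\bar\pi^{\underline{\gamma}(1-\beta)}(x+y)}{\bar\pi^{\underline{\gamma}(1-\beta)}(x)}\right]q(y)\,\lambda^{\mathrm{Leb}}(dy)\right\},
\end{aligned}\tag{45}$$

where

$$A(x):=\{y\in\mathsf X:\pi(x+y)\ge\pi(x)\},\qquad R(x):=\{y\in\mathsf X:\pi(x+y)<\pi(x)\}.$$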
As each of the ratios in (45) is less than or equal to 1, we conclude that there exists a constant $C_b<+\infty$ such that

$$\sup_{x\in C_d}M_0V(x)\le d\,C_b=:b_d. \tag{46}$$

Noting the definition of $d_0$, and combining (44) and (46), we obtain

$$\begin{aligned}
M_0V(x)&=M_0V(x)\,\mathbb I[|x|>r_0]+M_0V(x)\,\mathbb I[|x|\le r_0]\\
&\le\lambda V(x)+M_0V(x)\,\mathbb I[V(x)\le d]\\
&\le\lambda V(x)+b_d\,\mathbb I_{C_d}(x),
\end{aligned}\tag{47}$$

for any $d\ge d_0$ and $x\in\mathsf X$. The arguments of Andrieu et al. [1, Lemma 5] then give

$$M_{n,k}V(x)\le M_0V(x),$$

and from this, (47) and (46), we obtain for any $d\ge d_0$,

$$\sup_{n\ge1}\sup_{1\le k\le n}M_{n,k}V(x)\le\lambda V(x)+b_d\,\mathbb I_{C_d}(x),$$

which establishes the drift part of (A3). It remains to show the minorization part. To this end we first show that for any $d\ge d_0$, $C_d$ is bounded. Recalling the definition of $r_0$, we have that for any $x$ such that $|x|-r_0\ge0$,

$$\begin{aligned}
\frac{V(x)}{V(n(x)r_0)}&=\left[\frac{\pi(x)}{\pi(n(x)r_0)}\right]^{-\underline{\gamma}\beta}\\
&=\exp\left[-\underline{\gamma}\beta\,(|x|-r_0)\int_0^1n(x)\cdot\nabla\log\pi\left(tx+(1-t)\,n(x)r_0\right)dt\right]\\
&\ge\exp\left[\underline{\gamma}\beta\,(|x|-r_0)\,\epsilon\right],
\end{aligned}$$

from which we see that $\lim_{r\to\infty}\inf_{|x|\ge r}V(x)=+\infty$, which in turn implies that for all $d\ge d_0$ there exists $r_d\ge0$ such that $V(x)\le d\Rightarrow|x|\le r_d$. Then for any $n\ge1$, $0\le k\le n$ and $r_d\ge0$, whenever $x\in C_d$,

$$\begin{aligned}
M_{n,k}(x,A)&\ge\int_{A-x}\left[1\wedge\frac{\pi(x+y)}{\pi(x)}\right]q(y)\,\lambda^{\mathrm{Leb}}(dy)\\
&\ge\int_{(A\cap B(0,r_d))-x}\left[1\wedge\frac{\pi(x+y)}{\pi(x)}\right]q(y)\,\lambda^{\mathrm{Leb}}(dy)\\
&\ge\epsilon_{2r_d}\,\frac{\inf_{y\in B(0,3r_d)}\pi(y)}{\sup_{y\in B(0,3r_d)}\pi(y)}\int_{(A\cap B(0,r_d))-x}\lambda^{\mathrm{Leb}}(dy)\\
&=\epsilon_{2r_d}\,\frac{\inf_{y\in B(0,3r_d)}\pi(y)}{\sup_{y\in B(0,3r_d)}\pi(y)}\int_{B(0,r_d)}\lambda^{\mathrm{Leb}}(dy)\cdot\frac{\int_{A\cap B(0,r_d)}\lambda^{\mathrm{Leb}}(dy)}{\int_{B(0,r_d)}\lambda^{\mathrm{Leb}}(dy)}\\
&=:\varepsilon_d\cdot\nu_d(A),
\end{aligned}$$

where the third inequality holds due to the properties of $q$ in (41) and because $A\cap B(0,r_d)\cap B(x,2r_d)=A\cap B(0,r_d)$ whenever $x\in B(0,r_d)$. As $\pi$ is strictly positive and continuous, $V$ is bounded on compact sets and therefore $\nu_d(V)<+\infty$.
Also, $C_d\supseteq C_{d_0}$ and then, due to the definition of $d_0$,

$$\nu_d(C_d)\ge\nu_d(C_{d_0})\ge\nu_d(B(0,r_0))>0.$$

This concludes the verification of (A3). For (A4), from the definition of $G_{n,k}$, we observe that $U_{n,k}$ is defined by

$$U_{n,k}(x)=\log\bar\pi(x)\left[\frac{\gamma((k+1)/n)-\gamma(k/n)}{1/n}\right]-C_\gamma\sup_{y}\log\bar\pi(y),$$

where $C_\gamma$ is the Lipschitz constant for $\gamma(\cdot)$, and we observe that

$$\sup_{n\ge1}\ \sup_{0\le k\le n-1}\ \sup_{x\in\mathsf X}\frac{U_{n,k}(x)}{V(x)}<+\infty.$$

Assumption (A4) is then satisfied upon application of Lemma 2.

Proof (Theorem 3). Throughout the proof, we denote by $C$ a constant whose value may change upon each appearance. By Minkowski's inequality, the error satisfies

$$\bar{\mathbb E}_\mu\left[\left|\left(\pi^N_n-\pi\right)(f)\right|^p\right]^{1/p}\le\bar{\mathbb E}_\mu\left[\left|\left(\eta^N_{n,n}-\eta_{n,n}\right)(f)\right|^p\right]^{1/p}+\left|\left(\eta_{n,n}-\pi\right)(f)\right|. \tag{48}$$

Choose $\beta$ such that $(1+s)p(1-\underline{\gamma})/(\underline{\gamma}\beta)\le1$ and take $V$ to be defined as in equation (43). By Proposition 2, the FK model of section 5 satisfies assumptions (A1), (A3) and (A4). The second term on the r.h.s. of (48) is treated by application of Theorem 1. Noting that by definition $\eta_{n,n}=\Phi_{n,0:n}(\mu)$, and that by Lemma 3 $\pi=\Phi_{n,0:n}\left(\pi_{\underline{\gamma}}\right)$, we obtain from Theorem 1 that there exist constants $\rho$, $M$ and $C_2$ such that

$$\begin{aligned}
\left\|\eta_{n,n}-\pi\right\|_{V^\alpha}&\le M\rho^{n}\left[\frac{\mu\left(\widetilde G_{n,k}V^\alpha\right)}{\mu\left(\widetilde Q_{n,k:n}(1)\right)}+\frac{\pi_{\underline{\gamma}}\left(\widetilde G_{n,k}V^\alpha\right)}{\pi_{\underline{\gamma}}\left(\widetilde Q_{n,k:n}(1)\right)}\right]\mathbb I\left[\mu\ne\pi_{\underline{\gamma}}\right]\\
&\le\rho^{n}\,C_2\left(\mu,\pi_{\underline{\gamma}}\right)\mathbb I\left[\mu\ne\pi_{\underline{\gamma}}\right],
\end{aligned}$$

where the constant $C_2$ arises from assumption (A4) applied to the denominator terms and implicitly depends on $\pi_{\underline{\gamma}}$, $V$, and the constants in (A3).

In order to apply Theorem 2 to the first term on the r.h.s. of (48), it remains to verify the assumptions of equations (33)-(34). We start by addressing the latter.
From the definition of $V$ and due to the assumption that $\gamma(\cdot)$ is non-decreasing, we observe that for all $x,x'\in\mathsf X$,

$$\left[G_{n,k}(x)-G_{n,k}(x')\right]\left[V(x)-V(x')\right]\le0,$$

and therefore by Lemma 4, for all (possibly random) $\eta\in\mathcal P(\mathsf X)$, $\eta(G_{n,k}V)\le\eta(G_{n,k})\,\eta(V)$. Then for any $n\ge1$ and $1\le k\le n$,

$$\bar{\mathbb E}_\mu\left[\left.\eta^N_{n,k}(V)\,\right|\,\mathcal F^{(N,N)}_{n,k-1}\right]\le\lambda\,\frac{\eta^N_{n,k-1}\left(G_{n,k-1}V\right)}{\eta^N_{n,k-1}\left(G_{n,k-1}\right)}+b_{d_0}\le\lambda\,\eta^N_{n,k-1}(V)+b_{d_0}, \tag{49}$$

where (A3) has been applied with $d_0$ as defined below equation (44) in the proof of Proposition 2. Standard iteration of the particle drift inequality (49) (details omitted for brevity), combined with the fact that $\left\{\xi^{(N,i)}_{n,0};\ i=1,\ldots,N\right\}$ are independent and each distributed according to $\mu$, shows that

$$\sup_{N\ge1}\ \sup_{n\ge1}\ \sup_{1\le k\le n}\ \bar{\mathbb E}_\mu\left[\eta^N_{n,k}(V)\right]<+\infty, \tag{50}$$

and, noting that $\alpha p t\le1$, equation (34) then holds by two applications of Jensen's inequality.

We now turn to the verification of equation (33). From previous considerations we notice that for some finite constant $C$,

$$U_{n,k}(x)=\frac{1}{\beta\underline{\gamma}}\left[\frac{\gamma((k+1)/n)-\gamma(k/n)}{1/n}\right]\log V(x)+C,$$

and therefore for any

$$\begin{aligned}
\left[\eta^N_{n,k}\left(\widetilde Q_{n,k:n}(1)\right)\right]^{-1}
&=\left[\int\exp\left(-\frac1n\sum_{j=k}^{n-1}U_{n,j}(x_j)\right)\eta^N_{n,k}(dx_k)\prod_{j=k+1}^{n}M_{n,j}(x_{j-1},dx_j)\right]^{-1}\\
&\le\exp\left[\frac1n\sum_{j=k}^{n-1}\int U_{n,j}(x_j)\,\eta^N_{n,k}M_{n,k:j}(dx_j)\right]\\
&=\exp\left[C+\frac1n\frac{1}{\beta\underline{\gamma}}\sum_{j=k}^{n-1}\left[\frac{\gamma((j+1)/n)-\gamma(j/n)}{1/n}\right]\int\log V(x_j)\,\eta^N_{n,k}M_{n,k:j}(dx_j)\right]\\
&\le\exp\left[C+\frac1n\frac{1}{\beta\underline{\gamma}}\sum_{j=k}^{n-1}\left[\frac{\gamma((j+1)/n)-\gamma(j/n)}{1/n}\right]\log\left(\int V(x_j)\,\eta^N_{n,k}M_{n,k:j}(dx_j)\right)\right]\\
&\le\exp\left[C+\frac{1-\gamma(k/n)}{\beta\underline{\gamma}}\log\left(\eta^N_{n,k}(V)\right)\right]\\
&\le\exp(C)\left[\eta^N_{n,k}(V)\right]^{\frac{1-\underline{\gamma}}{\beta\underline{\gamma}}},
\end{aligned}$$

where the penultimate inequality holds due to standard iteration of the drift inequality in (A3).
Therefore

$$\bar{\mathbb E}_\mu\left[\eta^N_{n,k}\left(\widetilde Q_{n,k:n}(1)\right)^{-(1+s)p}\right]\le C\,\bar{\mathbb E}_\mu\left[\eta^N_{n,k}(V)\right]^{q},$$

due to Jensen's inequality, where $q:=\frac{(1-\underline{\gamma})}{\beta\underline{\gamma}}(1+s)p\le1$ by assumption of the theorem. Equation (33) then follows upon combining this with equation (50). This completes the proof.

Acknowledgments

The author thanks Christophe Andrieu for discussions which led to the consideration of this work.

6 Appendix

When (A3), or even more simply (A2), holds, it is natural to ask under what further conditions, if any, the non-homogeneous Markov chain $\left\{\zeta^{(N)}_{n,k};\ 0\le k\le n\right\}$ also satisfies a geometric drift condition, as this would be one natural route to verifying (34). Defining $V_N:\mathsf X^N\to[1,\infty)$ by

$$V_N(\zeta):=\frac1N\sum_{i=1}^{N}V\left(\xi^{i}\right),$$

where $V$ is as in (A2) and $\zeta=\left(\xi^1,\ldots,\xi^N\right)$, we see that

$$\begin{aligned}
\bar{\mathbb E}_\mu\left[\left.V_N\left(\zeta^{N}_{n,k}\right)\right|\mathcal F^{(N,N)}_{n,k-1}\right]
&\le\lambda\,\frac{\eta^N_{n,k-1}\left(\widetilde G_{n,k-1}V\right)}{\eta^N_{n,k-1}\left(\widetilde G_{n,k-1}\right)}+b\,\frac{\eta^N_{n,k-1}\left(\widetilde G_{n,k-1}\mathbb I_C\right)}{\eta^N_{n,k-1}\left(\widetilde G_{n,k-1}\right)}\\
&\le\lambda\,\frac{\eta^N_{n,k-1}\left(\widetilde G_{n,k-1}V\right)}{\eta^N_{n,k-1}\left(\widetilde G_{n,k-1}\right)}+b\,\mathbb I_{C^{(N)}}\left(\zeta^{(N)}_{n,k}\right),
\end{aligned}\tag{51}$$

where $C^{(N)}:=\left\{\zeta=\left(\xi^1,\ldots,\xi^N\right)\in\mathsf X^N:\ \exists\,j\in\{1,\ldots,N\};\ \xi^j\in C\right\}$. We are then faced with the issue of whether the re-weighting of $\eta^N_{n,k-1}$ by the potential function destroys the geometric drift of $M_{n,k}$: for example, one may ask when it is true that for some fixed $\delta\ge0$,

$$\frac{\eta^N_{n,k-1}\left(\widetilde G_{n,k-1}V\right)}{\eta^N_{n,k-1}\left(\widetilde G_{n,k-1}\right)}\le(1+\delta)\,\eta^N_{n,k-1}(V). \tag{52}$$

The remainder of this section examines this question and goes on to look at some particular issues when $\mathsf X=\mathbb R^d$. First, we have the following lemma, which addresses a general scenario.

Lemma 4.
For $f:\mathsf X\to(0,\infty)$ and $g:\mathsf X\to(0,\infty)$ two measurable functions and $\delta\in[0,\infty)$,

$$\eta(fg)\le(1+\delta)\,\eta(f)\,\eta(g)$$

for any $\eta\in\mathcal P(\mathsf X)$ such that $|\eta(fg)|<+\infty$, $|\eta(f)|<+\infty$, $|\eta(g)|<+\infty$, if

$$\frac{[f(x)-f(x')]\,[g(x)-g(x')]}{[f(x)+f(x')]\,[g(x)+g(x')]}\le\frac{\delta}{2+\delta},\qquad\forall(x,x')\in\mathsf X^2, \tag{53}$$

and only if

$$\frac{[f(x)-f(x')]\,[g(x)-g(x')]}{[f(x)+f(x')]\,[g(x)+g(x')]}\le\frac{3\delta}{2+\delta},\qquad\forall(x,x')\in\mathsf X^2.$$

Proof. For any $\eta$, $f$, $g$ as specified in the statement and $\epsilon\in(0,1)$, consider the identity

$$\begin{aligned}
&(1-\epsilon)\,\eta(fg)-(1+\epsilon)\,\eta(f)\,\eta(g)\\
&\quad=\frac12\int_{\mathsf X}\int_{\mathsf X}\Big([f(x)-f(x')]\,[g(x)-g(x')]-\epsilon\,[f(x)+f(x')]\,[g(x)+g(x')]\Big)\,\eta(dx)\,\eta(dx')\\
&\quad=\frac12\int_{\mathsf X^2}\Big([f(x)-f(x')]\,[g(x)-g(x')]-\epsilon\,[f(x)+f(x')]\,[g(x)+g(x')]\Big)\,\eta(dx)\otimes\eta(dx'),
\end{aligned}$$

where the final equality is due to Fubini's theorem, which is applicable under the hypotheses of the lemma. The sufficiency part then follows directly upon setting

$$\epsilon=\frac{\delta}{2+\delta}\quad\Longleftrightarrow\quad(1+\delta)=\frac{1+\epsilon}{1-\epsilon}.$$
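The identity at the start of the proof can be checked numerically for an atomic measure. The following is a minimal sketch, assuming a randomly generated discrete $\eta$ and the illustrative choice $f(x)=e^{x}$, $g=1/f$ (so that $g=\varphi\circ f$ with $\varphi$ decreasing); the function names and the test measure are not from the paper.

```python
import math
import random


def eta(points, weights, h):
    """Integral of h against the atomic measure sum_i weights[i] * delta_{points[i]}."""
    return sum(w * h(x) for x, w in zip(points, weights))


def identity_sides(points, weights, f, g, eps):
    """Return both sides of the identity used in the proof of Lemma 4."""
    lhs = ((1 - eps) * eta(points, weights, lambda x: f(x) * g(x))
           - (1 + eps) * eta(points, weights, f) * eta(points, weights, g))
    rhs = 0.0
    for x, wx in zip(points, weights):
        for y, wy in zip(points, weights):
            diff = (f(x) - f(y)) * (g(x) - g(y))
            summ = (f(x) + f(y)) * (g(x) + g(y))
            rhs += 0.5 * (diff - eps * summ) * wx * wy
    return lhs, rhs


rng = random.Random(1)
points = [rng.uniform(-2.0, 2.0) for _ in range(8)]
raw = [rng.random() for _ in range(8)]
weights = [r / sum(raw) for r in raw]      # normalise to a probability measure

f = math.exp
g = lambda x: 1.0 / math.exp(x)            # g = phi(f) with phi(u) = 1/u decreasing

lhs, rhs = identity_sides(points, weights, f, g, eps=0.25)
print(lhs, rhs)

# With g decreasing in f, condition (53) holds with delta = 0, so eta(fg) <= eta(f) eta(g).
eta_fg = eta(points, weights, lambda x: f(x) * g(x))
eta_f_eta_g = eta(points, weights, f) * eta(points, weights, g)
print(eta_fg <= eta_f_eta_g)
```

The two sides agree up to floating-point rounding, and the final comparison illustrates the sufficiency part in the case $\delta=0$.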
For the necessity part, suppose on the contrary that there exists $(y,y')\in\mathsf X^2$ such that

$$[f(y)-f(y')]\,[g(y)-g(y')]>\frac{3\delta}{2+\delta}\,[f(y)+f(y')]\,[g(y)+g(y')].$$

Then setting $\eta=\frac12\left[\delta_y+\delta_{y'}\right]$ and $\epsilon=\frac{\delta}{2+\delta}\ge0$, we obtain

$$\begin{aligned}
&\int_{\mathsf X}\int_{\mathsf X}\Big([f(x)-f(x')]\,[g(x)-g(x')]-\epsilon\,[f(x)+f(x')]\,[g(x)+g(x')]\Big)\,\eta(dx)\,\eta(dx')\\
&\quad=\int_{\mathsf X^2}\Big([f(x)-f(x')]\,[g(x)-g(x')]-\epsilon\,[f(x)+f(x')]\,[g(x)+g(x')]\Big)\,\frac14\left[\delta_y(dx)\otimes\delta_y(dx')+\delta_{y'}(dx)\otimes\delta_{y'}(dx')\right]\\
&\qquad+\int_{\mathsf X^2}\Big([f(x)-f(x')]\,[g(x)-g(x')]-\epsilon\,[f(x)+f(x')]\,[g(x)+g(x')]\Big)\,\frac14\left[\delta_y(dx)\otimes\delta_{y'}(dx')+\delta_{y'}(dx)\otimes\delta_y(dx')\right]\\
&\quad=-\epsilon\left[f(y)g(y)+f(y')g(y')\right]+\frac12\Big([f(y)-f(y')]\,[g(y)-g(y')]-\epsilon\,[f(y)+f(y')]\,[g(y)+g(y')]\Big)\\
&\quad\ge\frac12\Big([f(y)-f(y')]\,[g(y)-g(y')]-3\epsilon\,[f(y)+f(y')]\,[g(y)+g(y')]\Big)\\
&\quad>0,
\end{aligned}$$

which completes the proof.

Note that the sufficient condition is always met, for example, when $g=\varphi\circ f$ for some positive, strictly decreasing and invertible function $\varphi$. The factor of 3 in the necessity part of Lemma 4 arises from considering a suitable $\eta$ with $N=2$ support points and is illustrative in the case of interest where each of the measures $\eta^N_{n,k}$ is atomic. The following section further explores the necessity part of Lemma 4 in the case of $\mathsf X=\mathbb R^d$, which is of interest in the context of [11]. The purpose of this next section is to show that in order for (52) to hold for any $\eta^N_{n,k-1}$, it is necessary that a very specific relationship holds between $G$ and $V$.

6.1 When $\mathsf X=\mathbb R^d$

Let $\mathsf X=\mathbb R^d$ and let $\mathcal B(\mathsf X)$ be the corresponding Borel $\sigma$-algebra. We denote by $S^{d-1}$ the unit sphere in $\mathbb R^d$ and, for $z$ in the image of $V$ (resp. $G$), define the contour manifolds $\mathcal V_z:=\left\{x\in\mathbb R^d:V(x)=z\right\}$ and $\mathcal G_z:=\left\{x\in\mathbb R^d:G(x)=z\right\}$.
We denote by $B(c,r)$ the closed ball of radius $r$ centered at $c$. For $x,x'\in\mathbb R^d$, let $n(x):=\frac{x}{|x|}$ and let $\ell_x^{x'}$ be the line segment with end points $x$ and $x'$. Consider the following collection of assumptions.

(A5)
– $G:\mathbb R^d\to(0,+\infty)$ and $V:\mathbb R^d\to[1,+\infty)$ have continuous first derivatives.
– There exist $r_0>0$ and strictly positive and non-decreasing functions $\phi_V:(0,\infty)\to(0,\infty]$ and $\phi_G:(0,\infty)\to(0,\infty]$ such that whenever $|x|\ge r_0$,

$$n(x)\cdot\nabla\log V(x)\ge\phi_V(|x|)\quad\text{and}\quad n(x)\cdot\nabla\log G(x)\le-\phi_G(|x|). \tag{54}$$

The conditions of equation (54) imply that for all $\delta>0$ and whenever $|x|\ge r_0$,

$$\frac{V(x+\delta n(x))}{V(x)}\ge\exp\left[\phi_V(|x|)\,\delta\right]\quad\text{and}\quad\frac{G(x+\delta n(x))}{G(x)}\le\exp\left[-\phi_G(|x|)\,\delta\right]. \tag{55}$$

The assumptions of (54) are largely inspired by assumptions on probability densities used to verify geometric ergodicity of certain Metropolis-Hastings kernels [26, 20] and are realistic in the context of [11]. It follows immediately from the arguments of Roberts and Tweedie [26, proof of theorem 2.1], transferred to $G$ and $V$, that for $|x|$ large enough, the contour manifolds of $G$ and $V$ which contain $x$ are parameterizable by the unit sphere in the sense that:

• For each $z>\sup_{|y|\le r_0}V(y)$ there exists a bijection between $S^{d-1}$ and $\mathcal V_z$, and for each $z'<\inf_{|y|\le r_0}G(y)$ there exists a bijection between $S^{d-1}$ and $\mathcal G_{z'}$, such that

$$\mathcal V_z=\left\{w_z(\zeta)\,\zeta:\ \zeta\in S^{d-1}\right\}\quad\text{and}\quad\mathcal G_{z'}=\left\{h_{z'}(\zeta)\,\zeta:\ \zeta\in S^{d-1}\right\}, \tag{56}$$

where $w_z(\cdot)$ and $h_{z'}(\cdot)$ are positive and continuous functions on $S^{d-1}$. Furthermore, $\mathcal V_z\cap B(0,r_0)=\mathcal G_{z'}\cap B(0,r_0)=\emptyset$.

In order to describe the relationship between contour manifolds of $V$ and $G$ which intersect at some point, we introduce the function $\psi:\mathbb R^d\to[0,\infty]$ defined by

$$\psi(x):=\sup_{\zeta\in S^{d-1}}\Big|\,\left|h_{G(x)}(\zeta)\right|-\left|w_{V(x)}(\zeta)\right|\,\Big|, \tag{57}$$

which implicitly depends on $G$ and $V$.
We have the following proposition.

Proposition 3. Assume $G:\mathsf X\to(0,\infty)$ and $V:\mathsf X\to[1,\infty)$ satisfy (A1) and (A5) with

$$\lim_{r\to\infty}\left[\inf_{s\ge r}\phi_V(s)\right]\wedge\left[\inf_{s\ge r}\phi_G(s)\right]\inf_{|x|\ge r}\psi(x)=+\infty. \tag{58}$$

Then for any $0\le\delta<1$, there exists an atomic $\eta\in\mathcal P(\mathsf X)$ such that

$$\frac{\eta(GV)}{\eta(G)}>(1+\delta)\,\eta(V).$$

Proof. In outline, the proof involves showing that if the condition in (58) holds, then for all $\epsilon\in[0,1)$ there exists $(y,y')\in\mathsf X^2$ such that

$$\left[\frac{G(y)-G(y')}{G(y)+G(y')}\right]\left[\frac{V(y)-V(y')}{V(y)+V(y')}\right]>\epsilon,$$

and then employing Lemma 4.

Firstly, suppose (58) holds. Then fix arbitrarily $\epsilon\in[0,1)$ and let $r\ge r_0$ be large enough that

$$\left[\inf_{s\ge r}\phi_V(s)\right]\wedge\left[\inf_{s\ge r}\phi_G(s)\right]\inf_{|x|\ge r}\psi(x)\ \ge\ 2\log\left(\frac{1+\sqrt\epsilon}{1-\sqrt\epsilon}\right),$$

and then let $y$ be any point such that $|y|>r$, $V(y)>\sup_{|u|\le r}V(u)$ and $\psi(y)>0$ (such $r$ and $y$ exist due to the hypotheses of equation (54) and the assumed condition (58)). Then recalling the definition of $\psi$ in (57), and as, for fixed $y$, $h_{G(y)}(\cdot)$ and $w_{V(y)}(\cdot)$ are continuous functions, there exists $\zeta\in S^{d-1}$ such that

$$\Big|\,\left|h_{G(y)}(\zeta)\right|-\left|w_{V(y)}(\zeta)\right|\,\Big|=\psi(y)>0.$$

With a slight abuse of notation, let $x:=h_{G(y)}(\zeta)\,\zeta$, $x':=w_{V(y)}(\zeta)\,\zeta$, and let $y':=\frac12(x+x')$. It follows that the line segment $\ell_x^{x'}$ lies on a ray and

$$|x|-|y'|=|y'|-|x'|=\psi(y)/2>0.$$

Observe that, under the implications of (54) stated before the proposition, by construction $V(x)>V(y')>V(x')=V(y)$ and $G(x')>G(y')>G(x)=G(y)$. It must also be the case that $|x'|>r$, as otherwise $V(y)=V(x')\le\sup_{|u|\le r}V(u)$, contradicting the definition of $y$. The situation is illustrated in Figure 1.
Due to the hypothesis of equation (55) and the definition of $r$, we then have

$$\frac{V(y)}{V(y')}=\frac{V(x')}{V(y')}\le\exp\left[-\phi_V(|x'|)\,\frac{\psi(y)}{2}\right]\le\exp\left[-\frac12\left(\inf_{s\ge r}\phi_V(s)\,\inf_{|u|\ge r}\psi(u)\right)\right]\le\frac{1-\sqrt\epsilon}{1+\sqrt\epsilon}, \tag{59}$$

and similarly,

$$\frac{G(y)}{G(y')}=\frac{G(x)}{G(y')}\le\exp\left[-\phi_G(|y'|)\,\frac{\psi(y)}{2}\right]\le\exp\left[-\frac12\left(\inf_{s\ge r}\phi_G(s)\,\inf_{|u|\ge r}\psi(u)\right)\right]\le\frac{1-\sqrt\epsilon}{1+\sqrt\epsilon}. \tag{60}$$

It follows from (59)-(60) that

$$\frac{1-\frac{V(y)}{V(y')}}{1+\frac{V(y)}{V(y')}}\ge\sqrt\epsilon\quad\text{and}\quad\frac{1-\frac{G(y)}{G(y')}}{1+\frac{G(y)}{G(y')}}\ge\sqrt\epsilon.$$

The proof is complete upon noting that $\epsilon$ was chosen arbitrarily in $[0,1)$ and applying the necessity part of Lemma 4.

Figure 1: Solid line: $\mathcal G_{G(y)}$. Dashed line: $\mathcal V_{V(y)}$. The points $y$, $y'$, $x$, $x'$ are marked, and the radial distance $\psi(y)$ outwards from $\mathcal G_{G(y)}$ is indicated.

We may then formulate the following example.

Corollary 1. Let $\mathsf X=\mathbb R^2$. For any $\epsilon>0$, let $G$ and $V$ be defined by

$$V(x)=\exp\left[(x_1-\epsilon)^2+x_2^2\right]\quad\text{and}\quad G(x)=\exp\left[-\left((x_1+\epsilon)^2+x_2^2\right)\right],$$

where $x=(x_1,x_2)$. Then for any $0\le\delta<1$, there exists an atomic $\eta\in\mathcal P(\mathsf X)$ such that

$$\frac{\eta(GV)}{\eta(G)}>(1+\delta)\,\eta(V). \tag{61}$$

Proof. Elementary manipulations show that (A5) holds with $r_0=2\epsilon$ and taking $\phi_G(|x|)=\phi_V(|x|)=2(|x|-\epsilon)$ for $|x|\ge r_0$. For any $r\ge r_0$, consider the contour manifolds $\mathcal V_z$ and $\mathcal G_{z'}$ for $z=V\left(0,\sqrt{r^2-\epsilon^2}\right)$ and $z'=G\left(0,\sqrt{r^2-\epsilon^2}\right)$. It is straightforward to check that $\psi\left(0,\sqrt{r^2-\epsilon^2}\right)=2\epsilon$ and the result follows from Proposition 3.

This example serves to highlight that a very specific relationship between the potential functions and the drift function is required if inequalities of the form (52) are to hold for $(1+\delta)\lambda<1$ and for all probability measures. The application of section 5 is one situation where such a relationship holds.
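The conclusion of Corollary 1 can be illustrated with an explicit two-atom measure. The following is a minimal sketch under assumptions not taken from the paper: the atoms are placed at $(R,0)$ and $(-R,0)$, which is one convenient choice rather than the construction used in the proof of Proposition 3, and the values $\epsilon=0.5$, $\delta=0.9$ and the function names are illustrative.

```python
import math


def V(x1, x2, eps):
    """Drift function of Corollary 1."""
    return math.exp((x1 - eps) ** 2 + x2 ** 2)


def G(x1, x2, eps):
    """Potential function of Corollary 1."""
    return math.exp(-((x1 + eps) ** 2 + x2 ** 2))


def two_atom_ratio(y, y_prime, eps):
    """eta(GV) / (eta(G) * eta(V)) for eta = (delta_y + delta_y') / 2."""
    atoms = [y, y_prime]
    eta_G = sum(G(a[0], a[1], eps) for a in atoms) / 2.0
    eta_V = sum(V(a[0], a[1], eps) for a in atoms) / 2.0
    eta_GV = sum(G(a[0], a[1], eps) * V(a[0], a[1], eps) for a in atoms) / 2.0
    return eta_GV / (eta_G * eta_V)


eps, delta = 0.5, 0.9
ratios = {R: two_atom_ratio((R, 0.0), (-R, 0.0), eps) for R in (1.0, 2.0, 3.0)}
for R, ratio in ratios.items():
    print(f"R = {R}: eta(GV) / (eta(G) eta(V)) = {ratio:.5f}")

# Inequality (61) holds for this eta whenever the ratio exceeds 1 + delta.
violates = ratios[3.0] > 1.0 + delta
print(violates)
```

For these atoms the ratio equals $2(1+u^2)/(1+u)^2$ with $u=e^{-4R\epsilon}$, so it increases towards 2 as $R$ grows, which is why any $\delta<1$ can be exceeded by taking $R$ large enough.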
We note that it is possible to bound $\bar{\mathbb E}_\mu\left[\eta^N_{n,k}(V)\right]$ without confirming (52), by appealing to convexity and the flattening property of the potential functions combined with the geometric drift; but the application of section 5 will not need such an approach and so we do not report these details here. On the other hand, if inequalities of the form (52) do hold with $(1+\delta)\lambda<1$, one might then pursue accompanying minorization conditions for the chain $\left\{\zeta^{(N)}_{n,k};\ 0\le k\le n\right\}$, but it seems difficult to achieve this in such a way that the minorizing constants do not degrade as $N$ increases.

References

[1] C. Andrieu, L. A. Breyer, and A. Doucet. Convergence of simulated annealing using Foster-Lyapunov criteria. Journal of Applied Probability, 38(4), 2001.
[2] O. Cappé, E. Moulines, and T. Rydén. Inference in hidden Markov models. Springer Series in Statistics. Springer, New York, 2005.
[3] N. Chopin. A sequential particle filter method for static models. Biometrika, 89(3), 2002.
[4] N. Chopin. Central limit theorem for sequential Monte Carlo methods and its application to Bayesian inference. Annals of Statistics, 32(6), 2004.
[5] N. Chopin, P. Del Moral, and S. Rubenthaler. Stability of Feynman Kac formulae with path-dependent potentials. Stochastic Processes and their Applications, 121(1):38–60, 2011.
[6] P. Del Moral. Feynman-Kac Formulae. Genealogical and interacting particle systems with applications. Probability and its Applications. Springer Verlag, New York, 2004.
[7] P. Del Moral and A. Doucet. Particle motions in absorbing medium with hard and soft obstacles. Stochastic Analysis and Applications, 22:1175–1207, 2004.
[8] P. Del Moral and J. Garnier. Genealogical particle analysis of rare events. The Annals of Applied Probability, 15(4), 2005.
[9] P. Del Moral and A. Guionnet. On the stability of interacting processes with applications to filtering and genetic algorithms. Annales de l'Institut Henri Poincaré (B) Probability and Statistics, 37(2), 2001.
[10] P. Del Moral and L. Miclo. Branching and interacting particle systems approximation of Feynman-Kac formulae with applications to non-linear filtering. Séminaire de Probabilités XXXIV. Lecture Notes in Mathematics, 1729:1–145, 2000.
[11] P. Del Moral, A. Doucet, and A. Jasra. Sequential Monte Carlo samplers. Journal of the Royal Statistical Society, Series B, 68(3):411–436, 2006.
[12] P. Del Moral, A. Doucet, and A. Jasra. On adaptive resampling strategies for sequential Monte Carlo methods. Bernoulli, 2011. To appear.
[13] R. Douc and E. Moulines. Limit theorems for weighted samples with applications to sequential Monte Carlo methods. Annals of Statistics, 36(5), 2008.
[14] R. Douc, E. Moulines, and J.S. Rosenthal. Quantitative bounds on convergence of time-inhomogeneous Markov chains. The Annals of Applied Probability, 14(4):1643–1665, 2004.
[15] R. Douc, G. Fort, E. Moulines, and P. Priouret. Forgetting the initial distribution for hidden Markov models. Stochastic Processes and their Applications, 119:1235–1256, 2009.
[16] R. Douc, E. Moulines, and Y. Ritov. Forgetting of the initial condition in general state-space hidden Markov chain: a coupling approach. Electronic Journal of Probability, 14:27–49, 2009.
[17] R. Douc, E. Gassiat, B. Landelle, and E. Moulines. Forgetting of the initial distribution for nonergodic hidden Markov models. The Annals of Applied Probability, 20(5), 2010.
[18] A. Doucet, N. De Freitas, and N. Gordon, editors. Sequential Monte Carlo methods in practice. Springer, New York, 2001.
[19] K. Heine and D. Crisan. Uniform approximations of discrete-time filters. Advances in Applied Probability, 40(4):979–1001, 2008.
[20] S.F. Jarner and E. Hansen. Geometric ergodicity of Metropolis algorithms. Stochastic Processes and their Applications, 85(2):341–361, February 2000.
[21] A. Jasra and A. Doucet. Stability of sequential Monte Carlo samplers via the Foster Lyapunov condition. Statistics and Probability Letters, 78(17), 2008.
[22] M. L. Kleptsyna and A. Y. Veretennikov. On discrete time ergodic filters with wrong initial data. Probability Theory and Related Fields, 141(3-4), 2008.
[23] H.R. Künsch. Recursive Monte Carlo filters: algorithms and theoretical analysis. Annals of Statistics, 33(5), 2005.
[24] F. Le Gland and N. Oudjane. Stability and uniform approximation of nonlinear filters using the Hilbert metric and application to particle filter. The Annals of Applied Probability, 14(1):144–187, 2004.
[25] N. Oudjane and S. Rubenthaler. Stability and uniform particle approximation of nonlinear filters in case of non ergodic signals. Stochastic Analysis and Applications, 23(3):421–448, 2005.
[26] G.O. Roberts and R. L. Tweedie. Geometric convergence and central limit theorems for multidimensional Hastings and Metropolis algorithms. Biometrika, 83(1):95–120, March 1996.
[27] M. Rousset and G. Stoltz. Nonequilibrium sampling from equilibrium dynamics. Journal of Statistical Physics, 123(6):1251–1272, 2006.
[28] R. van Handel. Uniform time average consistency of Monte Carlo particle filters. Stochastic Processes and their Applications, 119(11), 2009.
