CLTs and asymptotic variance of time-sampled Markov chains
For a Markov transition kernel $P$ and a probability distribution $\mu$ on the nonnegative integers, a time-sampled Markov chain evolves according to the transition kernel $P_\mu = \sum_k \mu(k) P^k$. In this note we obtain CLT conditions for time-sampled Markov chains and derive a spectral formula for the asymptotic variance. Using these results we compare the efficiency of Barker's and Metropolis algorithms in terms of asymptotic variance.
Authors: Krzysztof Latuszynski, Gareth O. Roberts
Keywords: time-sampled Markov chains · Barker's algorithm · Metropolis algorithm · Central Limit Theorem · asymptotic variance · variance bounding Markov chains · MCMC estimation

Supported by EPSRC grants EP/G026521/1 and EP/D002060/1 and by CRiSM.

1 Introduction

Let $P$ be an ergodic transition kernel of a Markov chain $(X_n)_{n \geq 0}$ with limiting distribution $\pi$ on $(\mathcal{X}, \mathcal{B}(\mathcal{X}))$ and let $f: \mathcal{X} \to \mathbb{R}$ be in $L^2(\pi)$. A typical MCMC procedure for estimating $I = \pi f := \int_{\mathcal{X}} f(x)\,\pi(dx)$ would use
$$\hat{I}_n := \frac{1}{n} \sum_{i=0}^{n-1} f(X_i).$$
Under appropriate assumptions on $P$ and $f$, a CLT holds for $\hat{I}_n$, i.e.
$$\sqrt{n}\,(\hat{I}_n - I) \to \mathcal{N}(0, \sigma^2_{f,P}), \qquad (1)$$
where the constant $\sigma^2_{f,P} < \infty$ is called the asymptotic variance and depends only on $f$ and $P$. The following theorem from [15] is a fundamental result on conditions that guarantee (1) for reversible Markov chains.

Theorem 1 ([15]) For a reversible and ergodic Markov chain, and a function $f \in L^2(\pi)$, if
$$\operatorname{Var}(f,P) := \lim_{n \to \infty} n \operatorname{Var}_\pi(\hat{I}_n) < \infty, \qquad (2)$$
then (1) holds with
$$\sigma^2_{f,P} = \operatorname{Var}(f,P) = \int_{[-1,1]} \frac{1+x}{1-x}\, E_{f,P}(dx), \qquad (3)$$
where $E_{f,P}$ is the spectral measure associated with $f$ and $P$.

We refer to (2) as the Kipnis-Varadhan condition. Assuming that (2) holds and $P$ is reversible, in Section 2 we obtain conditions for the CLT and derive a spectral formula for the asymptotic variance $\sigma^2_{f,P_\mu}$ of a time-sampled Markov chain of the form
$$P_\mu := \sum_{k=0}^{\infty} \mu(k) P^k, \qquad (4)$$
where $\mu$ is a probability distribution on the nonnegative integers. Time-sampled Markov chains are of theoretical interest in the context of petite sets (cf. Chapter 5 of [20]), and also in the context of computational algorithms [27, 28].

Next we proceed to analyze the efficiency of Barker's algorithm [2]. Barker's algorithm, like Metropolis, uses an irreducible transition kernel $Q$ to draw proposals. A move from $X_n = x$ to a proposal $Y_{n+1} = y$ is then accepted with probability
$$\alpha^{(B)}(x,y) = \frac{\pi(y) q(y,x)}{\pi(y) q(y,x) + \pi(x) q(x,y)}, \qquad (5)$$
where $q(x,\cdot)$ is the transition density of $Q(x,\cdot)$. It is well known that with the same proposal kernel $Q$, the Metropolis acceptance ratio results in a smaller asymptotic variance than Barker's. In Section 3 we show that the asymptotic variance of Barker's algorithm is, roughly speaking, at most twice that of Metropolis. We also motivate our considerations by recent advances in exact MCMC for diffusion models. The theoretical results are illustrated by a simulation study in Section 4.
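To make the objects above concrete, here is a minimal Python sketch (an illustration added for this presentation, not part of the original paper) that simulates a time-sampled chain of the form (4) and computes the estimator $\hat{I}_n$. The underlying kernel is the contracting normals kernel studied later in Section 4.1, and $\mu = 1 + \mathrm{Pois}(1)$ is one of the sampling distributions used there; both are merely convenient choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def P_step(x, theta=0.9):
    # One step of the contracting normals kernel N(theta*x, 1 - theta^2)
    # (revisited in Section 4.1); any reversible kernel could be plugged in.
    return theta * x + np.sqrt(1 - theta**2) * rng.standard_normal()

def P_mu_step(x, mu_sampler):
    # One step of the time-sampled kernel (4): draw K ~ mu, apply P K times.
    K = mu_sampler()
    for _ in range(K):
        x = P_step(x)
    return x

def ergodic_average(f, x0, n, mu_sampler):
    # The usual MCMC estimator: (1/n) * sum of f(X_i) along the P_mu chain.
    x, total = x0, 0.0
    for _ in range(n):
        total += f(x)
        x = P_mu_step(x, mu_sampler)
    return total / n

# Example: mu = 1 + Pois(1), f(x) = x; the true value is pi(f) = 0.
mu_sampler = lambda: 1 + rng.poisson(1.0)
print(ergodic_average(lambda x: x, 0.0, 100_000, mu_sampler))
```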
2 Time-sampled Markov chains

In this section we work under the assumptions of Theorem 1, which imply that the asymptotic variance $\sigma^2_{f,P}$ equals $\operatorname{Var}(f,P)$ defined in (2) and satisfies (3). For other Markov chain CLT conditions we refer to [13, 25, 20, 4, 26].

Theorem 2 Let $P$ be a reversible and ergodic transition kernel with stationary measure $\pi$, and let $f \in L^2(\pi)$. Assume that the Kipnis-Varadhan condition (2) holds for $f$ and $P$. For a probability distribution $\mu$ on the nonnegative integers, let the time-sampled kernel $P_\mu$ be defined by (4). Then, if either of the following conditions holds,
(i) $\mu_{\mathrm{odd}} := \mu(\{1,3,5,\dots\}) > 0$,
(ii) $\mu(0) < 1$ and $P$ is geometrically ergodic,
the CLT holds for $f$ and $P_\mu$; moreover
$$\sigma^2_{f,P_\mu} = \int_{[-1,1]} \frac{1+G_\mu(x)}{1-G_\mu(x)}\, E_{f,P}(dx) < \infty, \qquad (6)$$
where $G_\mu$ is the probability generating function of $\mu$, i.e. $G_\mu(z) := E_\mu z^K$, $|z| \leq 1$, $K \sim \mu$, and $E_{f,P}$ is the spectral measure associated with $f$ and $P$.

Remark 1 In the absence of geometric ergodicity, the condition $\mu_{\mathrm{odd}} > 0$ in the above result is necessary, which we show below by means of a counterexample.

Proof The proof is based on the functional analytic approach (see e.g. [15, 24]). Without loss of generality assume that $\pi f = 0$. A reversible transition kernel $P$ with invariant distribution $\pi$ is a self-adjoint operator on $L^2_0(\pi) := \{f \in L^2(\pi): \pi f = 0\}$ with spectral radius bounded by 1. By the spectral decomposition theorem for self-adjoint operators, for each $f \in L^2_0(\pi)$ there exists a finite positive measure $E_{f,P}$ on $[-1,1]$ such that
$$\langle f, P^n f \rangle = \int_{[-1,1]} x^n\, E_{f,P}(dx) \quad \text{for all integers } n \geq 0.$$
Thus in particular
$$\sigma^2_f = \pi f^2 = \int_{[-1,1]} 1\, E_{f,P}(dx) < \infty, \qquad (7)$$
and by [15] (cf. also Theorem 4 of [11]) one obtains
$$\sigma^2_{f,P} = \int_{[-1,1]} \frac{1+x}{1-x}\, E_{f,P}(dx) < \infty. \qquad (8)$$
Since $P_\mu = \sum_k \mu(k) P^k$, by the spectral mapping theorem [9] we have
$$\langle f, P_\mu^n f \rangle = \int_{[-1,1]} x^n\, E_{f,P_\mu}(dx) = \int_{[-1,1]} \Big(\sum_k \mu(k) x^k\Big)^n E_{f,P}(dx) = \int_{[-1,1]} G_\mu(x)^n\, E_{f,P}(dx),$$
and consequently, applying the same argument as [15, 11], we obtain
$$\sigma^2_{f,P_\mu} = \int_{[-1,1]} \frac{1+x}{1-x}\, E_{f,P_\mu}(dx) = \int_{[-1,1]} \frac{1+G_\mu(x)}{1-G_\mu(x)}\, E_{f,P}(dx) =: \clubsuit. \qquad (9)$$
Now (9) gives the claimed formula, but we need to prove that (9) is finite: by [15], finiteness of the integral in (9) implies a CLT for $f$ and $P_\mu$. Observe that
$$|G_\mu(x)| \leq 1 \text{ for all } x \in [-1,1], \qquad G_\mu(x) \leq \mu(0) + x(1-\mu(0)) \text{ for } x \geq 0.$$
Moreover, if (i) holds, then
$$G_\mu(x) \leq \sum_{k \text{ even}} \mu(k) x^k \leq 1 - \mu_{\mathrm{odd}} \quad \text{for } x \leq 0,$$
hence we can write
$$\clubsuit = \int_{[-1,0)} \frac{1+G_\mu(x)}{1-G_\mu(x)}\, E_{f,P}(dx) + \int_{[0,1]} \frac{1+G_\mu(x)}{1-G_\mu(x)}\, E_{f,P}(dx) \leq \frac{1}{\mu_{\mathrm{odd}}} \int_{[-1,0)} 2\, E_{f,P}(dx) + \frac{1}{1-\mu(0)} \int_{[0,1]} \frac{2}{1-x}\, E_{f,P}(dx). \qquad (10)$$
The first integral in (10) is finite by (7) and the second by (8), and we are done with (i).

Next assume that (ii) holds. Denote by $S(P)$ the spectrum of $P$ and let $s_P := \sup\{|\lambda| : \lambda \in S(P)\}$ be its spectral radius. From [24] we know that since $P$ is reversible and geometrically ergodic, it has a spectral gap, i.e. $s_P < 1$. Hence for $x \in [-s_P, 0]$ we can write
$$G_\mu(x) \leq \mu(0) + \sum_{k \text{ even}} \mu(k) x^k \leq \mu(0) + s_P (1-\mu(0)).$$
Consequently
$$\clubsuit = \int_{[-s_P,0)} \frac{1+G_\mu(x)}{1-G_\mu(x)}\, E_{f,P}(dx) + \int_{[0,s_P]} \frac{1+G_\mu(x)}{1-G_\mu(x)}\, E_{f,P}(dx) \leq \frac{1}{1-\mu(0)} \int_{[-s_P,0)} \frac{2}{1-s_P}\, E_{f,P}(dx) + \frac{1}{1-\mu(0)} \int_{[0,s_P]} \frac{2}{1-x}\, E_{f,P}(dx). \qquad (11)$$
The first integral in (11) is finite by (7) and the second by (8).
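For a chain on finitely many states the spectral measure $E_{f,P}$ reduces to point masses at the eigenvalues of $P$, so formula (6) can be checked numerically. The Python sketch below is our illustration, not part of the paper; the 3-state reversible chain, the function $f$, and the distribution $\mu$ are arbitrary choices. It computes $\sigma^2_{f,P_\mu}$ twice: once via (6) using the generating function $G_\mu$, and once by assembling $P_\mu$ explicitly and applying the Kipnis-Varadhan formula (3) to the pair $(f, P_\mu)$.

```python
import numpy as np

# A small reversible chain: Metropolis dynamics on {0, 1, 2} targeting pi,
# so reversibility holds by construction.
pi = np.array([0.2, 0.3, 0.5])
P = np.array([[1/2, 1/2, 0.0],
              [1/3, 1/6, 1/2],
              [0.0, 0.3, 0.7]])

f = np.array([1.0, -2.0, 0.5])
f0 = f - pi @ f                        # center f: work in L^2_0(pi)

# For reversible P, A = D^{1/2} P D^{-1/2} is symmetric (D = diag(pi));
# its eigenvalues carry the spectral measure E_{f,P} as point masses.
d = np.sqrt(pi)
lam, V = np.linalg.eigh(P * d[:, None] / d[None, :])
w = (V.T @ (d * f0)) ** 2              # mass of E_{f,P} at each eigenvalue

mu = {0: 0.3, 1: 0.4, 2: 0.2, 3: 0.1}  # a sampling distribution with mu_odd > 0

def G(x, mu):                          # probability generating function of mu
    return sum(p * x ** k for k, p in mu.items())

keep = w > 1e-12                       # eigenvalue 1 has zero weight for centered f
via_6 = np.sum(w[keep] * (1 + G(lam[keep], mu)) / (1 - G(lam[keep], mu)))

# Cross-check: assemble P_mu explicitly and apply formula (3) to (f, P_mu).
P_mu = sum(p * np.linalg.matrix_power(P, k) for k, p in mu.items())
lam2, V2 = np.linalg.eigh(P_mu * d[:, None] / d[None, :])
w2 = (V2.T @ (d * f0)) ** 2
keep2 = w2 > 1e-12
via_3 = np.sum(w2[keep2] * (1 + lam2[keep2]) / (1 - lam2[keep2]))

print(via_6, via_3)                    # agree up to floating point error
```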
The most important special case of Theorem 2 is singled out and computed explicitly in the next corollary.

Corollary 1 Let $P$ be a reversible and ergodic transition kernel with stationary measure $\pi$, and assume that for $f$ and $P$ the CLT (1) holds. For $\varepsilon \in (0,1)$ let the lazy version of $P$ be defined as $P_\varepsilon := \varepsilon\,\mathrm{Id} + (1-\varepsilon) P$. Then the CLT holds for $f$ and $P_\varepsilon$ and
$$\sigma^2_{f,P_\varepsilon} = \frac{1}{1-\varepsilon}\,\sigma^2_{f,P} + \frac{\varepsilon}{1-\varepsilon}\,\sigma^2_f. \qquad (12)$$

Proof We use Theorem 2 with $\mu(0) = \varepsilon$, $\mu(1) = 1-\varepsilon$. Hence $G_\mu(x) = \varepsilon + (1-\varepsilon)x$, and consequently
$$\sigma^2_{f,P_\varepsilon} = \int_{[-1,1]} \frac{1+\varepsilon+(1-\varepsilon)x}{1-\varepsilon-(1-\varepsilon)x}\, E_{f,P}(dx) = \int_{[-1,1]} \Big(\frac{1}{1-\varepsilon}\,\frac{1+x}{1-x} + \frac{\varepsilon}{1-\varepsilon}\Big)\, E_{f,P}(dx) = \frac{1}{1-\varepsilon}\,\sigma^2_{f,P} + \frac{\varepsilon}{1-\varepsilon}\,\sigma^2_f.$$

The efficiency of time-sampled Markov chains can be compared using the following corollary to Theorem 2.

Corollary 2 Let $P$ and $f$ be as in Theorem 2. If $P$ is positive as an operator on $L^2(\pi)$ and $\mu_1$ stochastically dominates $\mu_2$ (i.e. $\mu_1 \geq_{st} \mu_2$), then $P_{\mu_1}$ dominates $P_{\mu_2}$ in the efficiency ordering, i.e.
$$\sigma^2_{f,P_{\mu_1}} \leq \sigma^2_{f,P_{\mu_2}}.$$

Proof If $P$ is positive self-adjoint then $\operatorname{supp} E_{f,P} \subseteq [0,1]$. Moreover, $\mu_1 \geq_{st} \mu_2$ implies $G_{\mu_1}(x) \leq G_{\mu_2}(x)$ for $x \in [0,1]$. The conclusion follows from (6).
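The positivity assumption in Corollary 2 is essential because stochastic dominance orders the generating functions only on $[0,1]$: for $x < 0$ the comparison can reverse, and if $E_{f,P}$ charges negative eigenvalues, so can the efficiency ordering (Section 4.1 illustrates exactly this with $\theta = -0.9$). A quick numerical check of the generating-function ordering, added here as our own illustration and using the shifted Poisson distributions that reappear in Section 4.1:

```python
import numpy as np

# pgf of mu = 1 + Pois(lam): G(x) = x * exp(lam * (x - 1)).
G = lambda x, lam: x * np.exp(lam * (x - 1.0))

xs_pos = np.linspace(0.0, 0.999, 1000)
xs_neg = np.linspace(-0.999, -0.001, 1000)

# mu1 = 1 + Pois(5) stochastically dominates mu2 = 1 + Pois(1).
print(np.all(G(xs_pos, 5.0) <= G(xs_pos, 1.0)))  # True: ordering holds on [0, 1]
print(np.all(G(xs_neg, 5.0) <= G(xs_neg, 1.0)))  # False: it reverses for x < 0
```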
In another direction of studying CLTs, the variance bounding property of Markov chains was introduced in [26] and is defined as follows: $P$ is variance bounding if there exists $K < \infty$ such that $\operatorname{Var}(f,P) \leq K \operatorname{Var}_\pi(f)$ for all $f$. Here $\operatorname{Var}(f,P)$ is defined in (2) and $\operatorname{Var}_\pi(f) = \pi f^2 - (\pi f)^2$. We prove that for time-sampled Markov chains the variance bounding property propagates the same way the CLT does.

Theorem 3 Assume $P$ is reversible and variance bounding. Then $P_\mu$ is variance bounding if either of the following conditions holds:
(i) $\mu_{\mathrm{odd}} := \mu(\{1,3,5,\dots\}) > 0$;
(ii) $\mu(0) < 1$ and $P$ is geometrically ergodic.

Proof For any $f$ such that $\operatorname{Var}_\pi f < \infty$, the Kipnis-Varadhan condition holds due to the variance bounding property of $P$, and thus the assumptions of Theorem 2 are met. Hence for every $f \in L^2(\pi)$ there is a CLT for $f$ and $P_\mu$. Therefore $P_\mu$ is variance bounding by Theorem 7 of [26].

The next example shows that for Markov chains that are not geometrically ergodic, the condition $\mu_{\mathrm{odd}} > 0$ is necessary.

Example 1 We set $f(x) = x$ and give an example of an ergodic and reversible transition kernel $P$ on $\mathcal{X} = [-1,1]$ such that there is a CLT for $P$ and $f$, but not for $P^2$ and $f$. We shall rely on Theorem 4.1 of [4], which provides if-and-only-if conditions for Markov chain CLTs in terms of regenerations. It will be apparent that the condition $\mu_{\mathrm{odd}} > 0$ in Theorem 2 is necessary. Set $s(x) := \sqrt{1-|x|}$, let $U(\cdot)$ be the uniform distribution on $[-1,1]$, and let the kernel $P$ be of the form
$$P(x,\cdot) = (1-s(x))\,\delta_{-x}(\cdot) + s(x)\,U(\cdot), \qquad (13)$$
hence
$$P^2(x,\cdot) = (1-s(x))^2\,\delta_x(\cdot) + \big(2s(x)-s(x)^2\big)\,U(\cdot). \qquad (14)$$
To find the stationary distribution of $P$ (and also of $P^2$), we verify reversibility with $\pi(x) \propto 1/s(x)$:
$$\pi(dx)\,P(x,dy) \propto \Big(\frac{1}{s(x)}-1\Big)\delta_{-x}(dy)\,dx + U(dy)\,dx = \Big(\frac{1}{s(y)}-1\Big)\delta_{-y}(dx)\,dy + U(dx)\,dy \propto \pi(dy)\,P(y,dx),$$
where the middle equality uses $s(-x) = s(x)$. Hence $\pi$ is a reflected Beta$(1, \frac{1}{2})$ distribution. Clearly $\pi(f^2) < \infty$.

Recall now the split chain construction [22, 1] of the bivariate Markov chain $\{X_n, \Gamma_n\}$ on $\{0,1\} \times \mathcal{X} = \{0,1\} \times [-1,1]$. If $(X_n)_{n \geq 0}$ evolves according to $P$ defined in (13), we have the following transition rule from $\{X_{n-1}, \Gamma_{n-1}\}$ to $\{X_n, \Gamma_n\}$ for the split chain:
$$\check{P}(X_n \in \cdot \mid \Gamma_{n-1} = 1, X_{n-1} = x) = U(\cdot),$$
$$\check{P}(X_n \in \cdot \mid \Gamma_{n-1} = 0, X_{n-1} = x) = \delta_{-x}(\cdot),$$
$$\check{P}(\Gamma_n = 1 \mid \Gamma_{n-1}, X_n = x) = s(x),$$
$$\check{P}(\Gamma_n = 0 \mid \Gamma_{n-1}, X_n = x) = 1 - s(x).$$
The notation $\check{P}$ above indicates that we consider the extended probability space for $(X_n, \Gamma_n)$, not the original one of $X_n$. The appropriate modification of the above holds if the dynamics of $X_n$ is $P^2$, namely
$$\check{P}(X_n \in \cdot \mid \Gamma_{n-1} = 1, X_{n-1} = x) = U(\cdot),$$
$$\check{P}(X_n \in \cdot \mid \Gamma_{n-1} = 0, X_{n-1} = x) = \delta_x(\cdot),$$
$$\check{P}(\Gamma_n = 1 \mid \Gamma_{n-1}, X_n = x) = 2s(x) - s^2(x),$$
$$\check{P}(\Gamma_n = 0 \mid \Gamma_{n-1}, X_n = x) = (1-s(x))^2.$$
We refer to the original papers for more details on the split chain construction, and to [4, 25] for central limit theorems in this context. Denote
$$\tau := \min\{k \geq 0 : \Gamma_k = 1\}. \qquad (15)$$
By Theorem 4.1 of [4], the CLT for $P$ and $f$ holds if and only if the following expression for the asymptotic variance is finite:
$$\sigma^2_{f,P} = \int_{[-1,1]} s(x)\pi(x)\,dx\; \check{E}_U \Big(\sum_{k=0}^{\tau} f(X_k)\Big)^2, \qquad (16)$$
where $(X_n, \Gamma_n)$ follow the dynamics of $P$. Respectively, the CLT for $P^2$ and $f$ holds in our setting if and only if
$$\sigma^2_{f,P^2} = \int_{[-1,1]} \big(2s(x)-s^2(x)\big)\pi(x)\,dx\; \check{E}_U \Big(\sum_{k=0}^{\tau} f(X_k)\Big)^2 \qquad (17)$$
is finite, where $(X_n, \Gamma_n)$ follow the dynamics of $P^2$.

Now observe that if $(X_n)_{n \geq 0}$ evolves according to $P$, then before regeneration the chain alternates between $X_0$ and $-X_0$, so $\big(\sum_{k=0}^{\tau} f(X_k)\big)^2$ equals $0$ if $\tau$ is odd and $X_0^2$ if $\tau$ is even. Consequently (16) is finite. However, if $(X_n)_{n \geq 0}$ evolves according to $P^2$, then $\big(\sum_{k=0}^{\tau} f(X_k)\big)^2 = (\tau+1)^2 X_0^2$, and given $X_0$ the distribution of $\tau$ is geometric with success parameter $2s(X_0) - s^2(X_0) = 1 - (1-s(X_0))^2$. Therefore, using $\check{E}(\tau+1)^2 = (2-p)/p^2$ for a geometric distribution with success parameter $p$, we compute $\sigma^2_{f,P^2}$ in (17) as
$$\sigma^2_{f,P^2} = \int_{[-1,1]} \big(2s(x)-s^2(x)\big)\pi(x)\,dx \int_{[-1,1]} \frac{1+(1-s(x))^2}{\big(1-(1-s(x))^2\big)^2}\, x^2\, U(dx) = C \int_{[-1,1]} \frac{\big(1+(1-s(x))^2\big)\,x^2}{\big(2\sqrt{1-|x|}-(1-|x|)\big)^2}\, dx \geq C \int_{[-1,1]} \frac{x^2}{8(1-|x|)}\, dx = \infty.$$

3 Barker's algorithm

When assessing the efficiency of Markov chain Monte Carlo algorithms, the asymptotic variance criterion is one of the natural choices. Peskun ordering [23] (see also [29, 21]) provides a tool to compare two reversible transition kernels $P_1, P_2$ with the same limiting distribution $\pi$ and is defined as follows:
$$P_1 \succ P_2 \iff P_1(x, A \setminus \{x\}) \geq P_2(x, A \setminus \{x\}) \text{ for } \pi\text{-almost every } x \in \mathcal{X} \text{ and all } A \in \mathcal{B}(\mathcal{X}).$$
If $P_1 \succ P_2$, then $\sigma^2_{f,P_1} \leq \sigma^2_{f,P_2}$ for every $f \in L^2(\pi)$.

Consider now a class of algorithms where the transition kernel $P$ is defined by applying an irreducible proposal kernel $Q$ and an acceptance rule $\alpha$; i.e., given $X_n = x$, the value of $X_{n+1}$ is the result of performing the following two steps:
1. Draw a proposal $y \sim Q(x,\cdot)$.
2. Set $X_{n+1} := y$ with probability $\alpha(x,y)$ and $X_{n+1} := x$ otherwise,
where $\alpha(x,y)$ is such that the resulting kernel $P$ is reversible with stationary distribution $\pi$. A generic sketch of this accept/reject step is given after this paragraph.
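The following Python sketch (our illustration; the helper names are ours) implements the two-step scheme with the acceptance rule left as a plug-in. Both the Metropolis-Hastings rule, which appears as (18) below, and Barker's rule (5)/(19) are functions of the ratio $r(x,y) = \pi(y)q(y,x)/(\pi(x)q(x,y))$ alone.

```python
import numpy as np

rng = np.random.default_rng(1)

def mh_ratio(x, y, log_pi, log_q):
    # r(x, y) = pi(y) q(y, x) / (pi(x) q(x, y)), computed on the log scale.
    return np.exp(log_pi(y) + log_q(y, x) - log_pi(x) - log_q(x, y))

def step(x, propose, log_pi, log_q, acceptance):
    # Generic transition: draw y ~ Q(x, .), accept with probability alpha(x, y).
    y = propose(x)
    r = mh_ratio(x, y, log_pi, log_q)
    return y if rng.uniform() < acceptance(r) else x

alpha_mh     = lambda r: min(1.0, r)    # Metropolis-Hastings rule, (18) below
alpha_barker = lambda r: r / (1.0 + r)  # Barker's rule (5)/(19)

# Example use with a N(0,1) target and a symmetric random walk proposal,
# for which log_q is constant and cancels in the ratio:
log_pi  = lambda x: -0.5 * x * x
log_q   = lambda x, y: 0.0
propose = lambda x: x + rng.uniform(-2.0, 2.0)
x = 0.0
for _ in range(1000):
    x = step(x, propose, log_pi, log_q, alpha_barker)
```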
It follows [23, 29] that for a given proposal kernel $Q$ the standard Metropolis-Hastings [19, 12] acceptance rule
$$\alpha^{(MH)}(x,y) = \min\Big\{1, \frac{\pi(y) q(y,x)}{\pi(x) q(x,y)}\Big\} \qquad (18)$$
yields a transition kernel $P^{(MH)}$ that is maximal with respect to Peskun ordering and thus minimal with respect to asymptotic variance. In particular, Barker's algorithm [2], which uses the acceptance rule
$$\alpha^{(B)}(x,y) = \frac{\pi(y) q(y,x)}{\pi(y) q(y,x) + \pi(x) q(x,y)}, \qquad (19)$$
is inferior to Metropolis-Hastings when the asymptotic variance is considered. In the above notation we assume that all the involved distributions have a common dominating measure and that $q(x,\cdot)$ is the transition density of $Q(x,\cdot)$. See [29] for a more general statement and discussion.

Exact Algorithms introduced in [7, 8, 5, 6] allow for inference in diffusion models without Euler discretization error. In recent advances in exact MCMC inference for complex diffusion models a particular setting keeps recurring, in which the Metropolis-Hastings acceptance step requires a specific Bernoulli Factory and is not possible to execute. However, in this diffusion context Barker's algorithm (19) is feasible, as is the 'lazy' version of the Metropolis-Hastings kernel,
$$P^{(MH)}_\varepsilon := \varepsilon\,\mathrm{Id} + (1-\varepsilon) P^{(MH)}. \qquad (20)$$
We refer to [10, 18, 16] for background on exact MCMC inference for diffusions and the Bernoulli Factory problem. This motivates us to investigate the performance of these alternatives in comparison to the standard Metropolis-Hastings.

Theorem 4 Let $P^{(B)}$ denote the transition kernel of Barker's algorithm, and let $P^{(MH)}$ and $P^{(MH)}_\varepsilon$ be as defined in (18) and (20). If the CLT (1) holds for $f$ and $P^{(MH)}$, then it holds also for
(i) $f$ and $P^{(MH)}_\varepsilon$, with
$$\sigma^2_{f,P^{(MH)}_\varepsilon} = \frac{1}{1-\varepsilon}\,\sigma^2_{f,P^{(MH)}} + \frac{\varepsilon}{1-\varepsilon}\,\sigma^2_f; \qquad (21)$$
(ii) $f$ and $P^{(B)}$, with
$$\sigma^2_{f,P^{(MH)}} \leq \sigma^2_{f,P^{(B)}} \leq \sigma^2_{f,P^{(MH)}_{1/2}} = 2\,\sigma^2_{f,P^{(MH)}} + \sigma^2_f. \qquad (22)$$

Proof The first claim (i) is a restatement of Corollary 1 for Metropolis-Hastings chains. To obtain the second claim (ii), note that $P^{(MH)}_{1/2}$ can be viewed as an algorithm that uses proposals from $Q$ and the acceptance rule
$$\alpha(x,y) = \min\Big\{\frac{1}{2}, \frac{\pi(y) q(y,x)}{2\,\pi(x) q(x,y)}\Big\}.$$
Now since
$$\min\Big\{1, \frac{\pi(y) q(y,x)}{\pi(x) q(x,y)}\Big\} \;\geq\; \frac{\pi(y) q(y,x)}{\pi(y) q(y,x) + \pi(x) q(x,y)} \;\geq\; \min\Big\{\frac{1}{2}, \frac{\pi(y) q(y,x)}{2\,\pi(x) q(x,y)}\Big\},$$
the result follows from Peskun ordering and Corollary 1.
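The sandwich of acceptance rules at the heart of the proof, written in terms of the ratio $r = \pi(y)q(y,x)/(\pi(x)q(x,y))$, can also be checked numerically on a grid of ratio values; a one-screen sketch (ours):

```python
import numpy as np

r = np.logspace(-6, 6, 2_000_000)   # grid of likelihood ratios r > 0

mh      = np.minimum(1.0, r)        # Metropolis-Hastings acceptance (18)
barker  = r / (1.0 + r)             # Barker acceptance (19)
lazy_mh = np.minimum(0.5, r / 2.0)  # acceptance rule of the lazy kernel with eps = 1/2

# The pointwise sandwich used in the proof of Theorem 4 (ii):
assert np.all(mh >= barker) and np.all(barker >= lazy_mh)
```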
4 Numerical Examples

To illustrate the theoretical findings, we consider two numerical examples. The first focuses on time sampling, the second on the efficiency of Barker's algorithm.

4.1 Time-sampled contracting normals

Consider the contracting normals example, i.e. a Markov chain with transition probabilities
$$P(x,\cdot) = \mathcal{N}(\theta x, 1-\theta^2) \qquad (23)$$
for some $\theta \in (-1,1)$. It is easy to check that the stationary distribution is $\pi(\cdot) = \mathcal{N}(0,1)$. Moreover, the transition kernel is geometrically ergodic and reversible for all $\theta \in (-1,1)$, and also positive for $\theta \in [0,1)$ [3, 17]. For the target function we take $f(x) = x$ and estimate the asymptotic variance using the batch means estimator of [14], based on trajectories of length $10^7$. We set $\theta = 0.9$ and $\theta = -0.9$ in the following settings:
– CN: contracting normals;
– LCN: lazy contracting normals with $\varepsilon = 0.5$;
– TSCN1: time-sampled contracting normals with sampling distribution $\mu = 1 + \mathrm{Pois}(1)$;
– TSCN2: time-sampled contracting normals with sampling distribution $\mu = 1 + \mathrm{Pois}(5)$.

              CN       LCN      TSCN1    TSCN2
θ = 0.9       19.1     38.5     9.28     3.43
θ = −0.9      0.053    1.14     0.80     0.96

Table 1. Estimated asymptotic variance of the contracting normals Markov chain for different sampling scenarios.

The first two columns of Table 1 report how laziness increases the asymptotic variance and illustrate Corollary 1. Note that the stationary variance $\sigma^2_f = 1$ is substantial compared to the asymptotic variance of contracting normals for $\theta = -0.9$, and thus the lazy version LCN becomes severely inefficient compared to CN. The stochastic ordering of the sampling distributions in the above scenarios is
$$\mathrm{LCN} <_{st} \mathrm{CN} <_{st} \mathrm{TSCN1} <_{st} \mathrm{TSCN2},$$
therefore the simulation shows how the asymptotic variance decreases for stochastically bigger sampling distributions (Corollary 2) in the case of positive operators ($\theta = 0.9$), and how this property fails if the operator is not positive, i.e. for $\theta = -0.9$.

4.2 Efficiency of Barker's algorithm

We compare the estimated asymptotic variance of the random walk Metropolis algorithm, Barker's algorithm, and the lazy version of the random walk Metropolis with $\varepsilon = 0.5$, to illustrate the bounds of Theorem 4. For the stationary distribution we take $\mathcal{N}(0,1)$ and the increment proposal is $U([-2,2])$. The results, based on simulations of length $10^7$, are reported in Table 2.

                        Metropolis    Barker's    lazy Metropolis
asymptotic variance     3.69          5.67        8.32

Table 2. Estimated asymptotic variance of the Metropolis, Barker's and lazy Metropolis algorithms.
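This experiment is easy to reproduce. The sketch below is ours: it uses a bare-bones batch means estimator (not necessarily the exact estimator of [14]) and a shorter run of $10^6$ steps instead of $10^7$ to keep it fast. Up to Monte Carlo error it should reproduce the ordering in Table 2 and the bound (22), with the Barker value falling between $\sigma^2_{f,P^{(MH)}}$ and $2\,\sigma^2_{f,P^{(MH)}} + \sigma^2_f$.

```python
import numpy as np

rng = np.random.default_rng(2)

def run_chain(n, acceptance, eps=0.0):
    # Random walk chain: target N(0,1), increment proposal U([-2, 2])
    # (symmetric, so q cancels); eps > 0 adds laziness as in (20).
    x = 0.0
    out = np.empty(n)
    for i in range(n):
        if rng.uniform() >= eps:               # with prob. 1 - eps attempt a move
            y = x + rng.uniform(-2.0, 2.0)
            r = np.exp(0.5 * (x * x - y * y))  # pi(y) / pi(x) for N(0,1)
            if rng.uniform() < acceptance(r):
                x = y
        out[i] = x
    return out

def batch_means_var(xs, n_batches=1000):
    # Simple batch means estimate of the asymptotic variance of f(x) = x.
    b = len(xs) // n_batches
    means = xs[: b * n_batches].reshape(n_batches, b).mean(axis=1)
    return b * means.var(ddof=1)

n = 10**6  # the paper uses 10^7; shortened here to keep the sketch fast
for name, acc, eps in [("Metropolis",      lambda r: min(1.0, r), 0.0),
                       ("Barker",          lambda r: r / (1 + r), 0.0),
                       ("lazy Metropolis", lambda r: min(1.0, r), 0.5)]:
    print(name, batch_means_var(run_chain(n, acc, eps)))
```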
5 Acknowledgements

We thank Jeffrey S. Rosenthal for a helpful discussion.

References

1. Athreya, K., Ney, P.: A new approach to the limit theory of recurrent Markov chains. Transactions of the American Mathematical Society 245, 493–501 (1978)
2. Barker, A.: Monte Carlo calculations of the radial distribution functions for a proton-electron plasma. Australian Journal of Physics 18, 119 (1965)
3. Baxendale, P.: Renewal theory and computable convergence rates for geometrically ergodic Markov chains. The Annals of Applied Probability 15(1B), 700–738 (2005)
4. Bednorz, W., Łatuszyński, K., Latała, R.: A regeneration proof of the central limit theorem for uniformly ergodic Markov chains. Electronic Communications in Probability 13, 85–98 (2008)
5. Beskos, A., Papaspiliopoulos, O., Roberts, G.: Retrospective exact simulation of diffusion sample paths with applications. Bernoulli 12(6), 1077 (2006)
6. Beskos, A., Papaspiliopoulos, O., Roberts, G.: A factorisation of diffusion measure and finite sample path constructions. Methodology and Computing in Applied Probability 10(1), 85–104 (2008)
7. Beskos, A., Papaspiliopoulos, O., Roberts, G., Fearnhead, P.: Exact and computationally efficient likelihood-based estimation for discretely observed diffusion processes (with discussion). Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(3), 333–382 (2006)
8. Beskos, A., Roberts, G.: Exact simulation of diffusions. Annals of Applied Probability 15(4), 2422–2444 (2005)
9. Conway, J.: A Course in Functional Analysis. Springer (1990)
10. Gonçalves, F., Roberts, G., Łatuszyński, K.: Exact MCMC inference for jump diffusion models with stochastic jump rate (2011)
11. Häggström, O., Rosenthal, J.: On variance conditions for Markov chain CLTs. Electronic Communications in Probability 12, 454–464 (2007)
12. Hastings, W.: Monte Carlo sampling methods using Markov chains and their applications. Biometrika 57(1), 97 (1970)
13. Jones, G.: On the Markov chain central limit theorem. Probability Surveys 1, 299–320 (2004)
14. Jones, G., Haran, M., Caffo, B., Neath, R.: Fixed-width output analysis for Markov chain Monte Carlo. Journal of the American Statistical Association 101(476), 1537–1547 (2006)
15. Kipnis, C., Varadhan, S.: Central limit theorem for additive functionals of reversible Markov processes and applications to simple exclusions. Communications in Mathematical Physics 104(1), 1–19 (1986)
16. Łatuszyński, K., Kosmidis, I., Papaspiliopoulos, O., Roberts, G.: Simulating events of unknown probabilities via reverse time martingales. Random Structures & Algorithms (2011)
17. Łatuszyński, K., Niemiro, W.: Rigorous confidence bounds for MCMC under a geometric drift condition. Journal of Complexity 27(1), 23–38 (2011)
18. Łatuszyński, K., Palczewski, J., Roberts, G.: Exact inference for a Markov switching diffusion model with discretely observed data (2011)
19. Metropolis, N., Rosenbluth, A., Rosenbluth, M., Teller, A., Teller, E.: Equation of state calculations by fast computing machines. Journal of Chemical Physics 21(6), 1087–1091 (1953)
20. Meyn, S., Tweedie, R.: Markov Chains and Stochastic Stability. Springer, London (1993)
21. Mira, A., Geyer, C.: Ordering Monte Carlo Markov chains. Technical report, School of Statistics, University of Minnesota (1999)
22. Nummelin, E.: A splitting technique for Harris recurrent Markov chains. Probability Theory and Related Fields 43(4), 309–318 (1978)
23. Peskun, P.: Optimum Monte-Carlo sampling using Markov chains. Biometrika 60(3), 607 (1973)
24. Roberts, G., Rosenthal, J.: Geometric ergodicity and hybrid Markov chains. Electronic Communications in Probability 2(2), 13–25 (1997)
25. Roberts, G., Rosenthal, J.: General state space Markov chains and MCMC algorithms. Probability Surveys 1, 20–71 (2004)
26. Roberts, G., Rosenthal, J.: Variance bounding Markov chains. Annals of Applied Probability 18(3), 1201 (2008)
27. Rosenthal, J.: Asymptotic variance and convergence rates of nearly-periodic Markov chain Monte Carlo algorithms. Journal of the American Statistical Association 98(461), 169–177 (2003)
28. Rosenthal, J.: Geometric convergence rates for time-sampled Markov chains. Journal of Theoretical Probability 16(3), 671–688 (2003)
29. Tierney, L.: A note on Metropolis-Hastings kernels for general state spaces. Annals of Applied Probability 8(1), 1–9 (1998)