Optimal sequential procedures with Bayes decision rules

In this article, a general problem of sequential statistical inference for general discrete-time stochastic processes is considered. The problem is to minimize an average sample number given that the Bayesian risk due to incorrect decision does not exceed some given bound. We characterize the form of optimal sequential stopping rules in this problem. In particular, we have a characterization of the form of optimal sequential decision procedures when the Bayesian risk includes both the loss due to incorrect decision and the cost of observations.

Author: Andrey Novikov

KYBERNETIKA — VOLUME 45 (2009), NUMBER X, PAGES xxx–xxx

Keywords: sequential analysis, discrete-time stochastic process, dependent observations, statistical decision problem, Bayes decision, randomized stopping time, optimal stopping rule, existence and uniqueness of optimal sequential decision procedure

AMS Subject Classification: 62L10, 62L15, 62C10, 60G40

1 INTRODUCTION

Let $X_1, X_2, \dots, X_n, \dots$ be a discrete-time stochastic process whose distribution depends on an unknown parameter $\theta \in \Theta$. In this article, we consider a general problem of sequential statistical decision making based on the observations of this process.

Let us suppose that, for any $n = 1, 2, \dots$, the vector $(X_1, X_2, \dots, X_n)$ has a probability "density" function
$$f_\theta^n = f_\theta^n(x_1, x_2, \dots, x_n) \quad (1)$$
(the Radon–Nikodym derivative of its distribution) with respect to a product measure $\mu^n = \mu \otimes \mu \otimes \dots \otimes \mu$, where $\mu$ is some $\sigma$-finite measure on the respective space. As usual in the Bayesian context, we suppose that $f_\theta^n(x_1, x_2, \dots, x_n)$ is measurable with respect to $(\theta, x_1, \dots, x_n)$, for any $n = 1, 2, \dots$.
Let us define a sequential statistical procedure as a pair $(\psi, \delta)$, where $\psi$ is a (randomized) stopping rule, $\psi = (\psi_1, \psi_2, \dots, \psi_n, \dots)$, and $\delta$ is a decision rule, $\delta = (\delta_1, \delta_2, \dots, \delta_n, \dots)$, supposing that $\psi_n = \psi_n(x_1, x_2, \dots, x_n)$ and $\delta_n = \delta_n(x_1, x_2, \dots, x_n)$ are measurable functions, with $\psi_n(x_1, \dots, x_n) \in [0, 1]$ and $\delta_n(x_1, \dots, x_n) \in D$ (a decision space), for any observation vector $(x_1, \dots, x_n)$ and any $n = 1, 2, \dots$ (see, for example, [19], [7], [8], [1], [9]).

The interpretation of these elements is as follows. The value of $\psi_n(x_1, \dots, x_n)$ is interpreted as the conditional probability to stop and proceed to decision making, given that we came to stage $n$ of the experiment and that the observations up to stage $n$ were $(x_1, x_2, \dots, x_n)$. If there is no stop, the experiment continues to the next stage, an additional observation $x_{n+1}$ is taken, the rule $\psi_{n+1}$ is applied to $x_1, \dots, x_n, x_{n+1}$ in the same way as above, etc., until the experiment eventually stops. When the experiment stops at stage $n$, with $(x_1, \dots, x_n)$ being the data vector observed, the decision specified by $\delta_n(x_1, \dots, x_n)$ is taken, and the sequential statistical experiment stops.

The stopping rule $\psi$ generates, by the above process, a random variable $\tau_\psi$ (a randomized stopping time), which may be defined as follows. Let $U_1, U_2, \dots, U_n, \dots$ be a sequence of independent and identically distributed (i.i.d.) random variables, uniformly distributed on $[0, 1]$ (randomization variables), such that the process $(U_1, U_2, \dots)$ is independent of the process of observations $(X_1, X_2, \dots)$. Then let us say that $\tau_\psi = n$ if, and only if,
$$U_1 > \psi_1(X_1), \;\dots,\; U_{n-1} > \psi_{n-1}(X_1, \dots, X_{n-1}), \;\text{ and }\; U_n \le \psi_n(X_1, \dots, X_n), \quad n = 1, 2, \dots$$
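The randomization mechanism defining $\tau_\psi$ can be sketched in code. The following is a minimal simulation, not from the paper: the stopping rule is passed in as a function `psi(n, prefix)` standing in for $\psi_n(x_1, \dots, x_n)$, and the uniform draws play exactly the role of the $U_n$ above.

```python
import random

def sample_stopping_time(psi, xs, rng=random.Random(0)):
    """Sample tau_psi along one observation path xs = (x_1, x_2, ...).

    psi(n, prefix) returns psi_n(x_1, ..., x_n) in [0, 1]; at each stage
    an independent uniform U_n decides whether the experiment stops.
    """
    for n in range(1, len(xs) + 1):
        u_n = rng.random()            # randomization variable U_n ~ U[0, 1]
        if u_n <= psi(n, xs[:n]):     # stop iff U_n <= psi_n(x_1, ..., x_n)
            return n
    return None                       # never stopped on this finite path

# Hypothetical rule: stop with probability 1/2 at every stage, so tau_psi
# is geometric with mean 2; here applied to a path of 200 dummy values.
tau = sample_stopping_time(lambda n, prefix: 0.5, [0.0] * 200)
```

With $\psi_n \equiv 1$ the rule stops at the first stage, and with $\psi_n \equiv 0$ it never stops, matching the interpretation of $\psi_n$ as a conditional stopping probability.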
It is easy to see that the distribution of $\tau_\psi$ is given by
$$P_\theta(\tau_\psi = n) = E_\theta (1 - \psi_1)(1 - \psi_2) \cdots (1 - \psi_{n-1}) \psi_n, \quad n = 1, 2, \dots \quad (2)$$
In (2), $\psi_n$ stands for $\psi_n(X_1, \dots, X_n)$, unlike its previous definition as $\psi_n = \psi_n(x_1, \dots, x_n)$. We use this "duality" throughout the paper, applying, for any $F_n = F_n(x_1, \dots, x_n)$ or $F_n = F_n(X_1, \dots, X_n)$, the following general rule: when $F_n$ is under the probability or expectation sign, it is $F_n(X_1, \dots, X_n)$; otherwise it is $F_n(x_1, \dots, x_n)$.

Let $w(\theta, d)$ be a non-negative loss function (measurable with respect to $(\theta, d)$, $\theta \in \Theta$, $d \in D$) and $\pi_1$ any probability measure on $\Theta$. We define the average loss of the sequential statistical procedure $(\psi, \delta)$ as
$$W(\psi, \delta) = \sum_{n=1}^\infty \int \left[ E_\theta (1 - \psi_1) \cdots (1 - \psi_{n-1}) \psi_n w(\theta, \delta_n) \right] d\pi_1(\theta), \quad (3)$$
and its average sample number, given $\theta$, as
$$N(\theta; \psi) = E_\theta \tau_\psi \quad (4)$$
(we suppose that $N(\theta; \psi) = \infty$ if $\sum_{n=1}^\infty P_\theta(\tau_\psi = n) < 1$ in (2)). Let us also define its "weighted" value
$$N(\psi) = \int N(\theta; \psi) \, d\pi_2(\theta), \quad (5)$$
where $\pi_2$ is some probability measure on $\Theta$, giving "weights" to the particular values of $\theta$.

Our main goal is minimizing $N(\psi)$ over all sequential decision procedures $(\psi, \delta)$ subject to
$$W(\psi, \delta) \le w, \quad (6)$$
where $w$ is some positive constant, supposing that $\pi_1$ in (3) and $\pi_2$ in (5) are, generally speaking, two different probability measures. We only consider the cases when there exist procedures $(\psi, \delta)$ satisfying (6).

Sometimes it is necessary to put the risk under control in a more detailed way. Let $\Theta_1, \dots, \Theta_k$ be some subsets of the parametric space such that $\Theta_i \cap \Theta_j = \emptyset$ if $i \ne j$, $i, j = 1, \dots, k$. Then, instead of (6), we may want to guarantee that
$$W_i(\psi, \delta) = \sum_{n=1}^\infty \int_{\Theta_i} E_\theta (1 - \psi_1) \cdots (1 - \psi_{n-1}) \psi_n w(\theta, \delta_n) \, d\pi_1(\theta) \le w_i, \quad (7)$$
with some $w_i > 0$, for any $i = 1, \dots, k$, when minimizing $N(\psi)$.

To advocate restricting the sequential procedures by (7), let us see a particular case of hypothesis testing. Let $H_1: \theta = \theta_1$ and $H_2: \theta = \theta_2$ be two simple hypotheses about the parameter value, and let
$$w(\theta, d) = \begin{cases} 1 & \text{if } \theta = \theta_1 \text{ and } d = 2, \\ 1 & \text{if } \theta = \theta_2 \text{ and } d = 1, \\ 0 & \text{otherwise}, \end{cases}$$
and $\pi_1(\{\theta_1\}) = \pi$, $\pi_1(\{\theta_2\}) = 1 - \pi$, with some $0 < \pi < 1$. Then, letting $\Theta_i = \{\theta_i\}$, $i = 1, 2$, in (7), we have that
$$W_1(\psi, \delta) = \pi P_{\theta_1}(\text{reject } H_1) = \pi \alpha(\psi, \delta) \quad \text{and} \quad W_2(\psi, \delta) = (1 - \pi) P_{\theta_2}(\text{accept } H_1) = (1 - \pi) \beta(\psi, \delta),$$
where $\alpha(\psi, \delta)$ and $\beta(\psi, \delta)$ are the type I and type II error probabilities. Thus, taking in (7) $w_1 = \pi \alpha$ and $w_2 = (1 - \pi) \beta$, with some $\alpha, \beta \in (0, 1)$, we see that (7) is equivalent to
$$\alpha(\psi, \delta) \le \alpha \quad \text{and} \quad \beta(\psi, \delta) \le \beta. \quad (8)$$

Let now $\pi_2(\{\theta_0\}) = 1$ and suppose that the observations are i.i.d. Then our problem of minimizing $N(\psi) = N(\theta_0; \psi)$ under restrictions (8) is the classical Wald and Wolfowitz problem of minimizing the expected sample size (see [20]). It is well known that its solution is given by the sequential probability ratio test (SPRT), and that it minimizes the expected sample size under the alternative hypothesis as well (see [20], [12]). On the other hand, if $\pi_2(\{\theta\}) = 1$ with $\theta \ne \theta_1$ and $\theta \ne \theta_2$, we have the problem known as the modified Kiefer–Weiss problem: the problem of minimizing the expected sample size, under $\theta$, among all sequential tests subject to (8) (see [21], [10]). The general structure of the optimal sequential test in this problem is given by Lorden [12] for i.i.d. observations.
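As an illustration of the Wald–Wolfowitz setting, the following sketch simulates an SPRT for two simple hypotheses about a Bernoulli success probability. It is a hedged illustration only: the stopping thresholds use Wald's classical approximations $A \approx (1-\beta)/\alpha$ and $B \approx \beta/(1-\alpha)$, which are standard but not part of this paper's development.

```python
import math
import random

def sprt_bernoulli(p1, p2, alpha, beta, sample, rng):
    """Run one SPRT path for H1: p = p1 versus H2: p = p2 on Bernoulli data.

    Continues while the log-likelihood ratio of H2 vs H1 stays between
    Wald's approximate log-thresholds; sample(rng) draws an observation.
    """
    log_A = math.log((1 - beta) / alpha)
    log_B = math.log(beta / (1 - alpha))
    llr, n = 0.0, 0                      # log-likelihood ratio, sample size
    while log_B < llr < log_A:
        x = sample(rng)                  # next observation x_{n+1} in {0, 1}
        n += 1
        llr += math.log((p2 if x else 1 - p2) / (p1 if x else 1 - p1))
    return ("reject H1" if llr >= log_A else "accept H1"), n

# Simulating under H1 (p = 0.3), the rejection frequency over many runs
# estimates the type I error probability alpha(psi, delta).
rng = random.Random(2024)
decision, n_obs = sprt_bernoulli(0.3, 0.7, 0.05, 0.05,
                                 lambda g: g.random() < 0.3, rng)
```

Averaging `n_obs` over repeated runs under either hypothesis estimates the expected sample size $N(\theta_i; \psi)$ that the SPRT minimizes in the Wald–Wolfowitz problem.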
So, we see that, considering natural particular cases of sequential procedures subject to (7) and using different choices of $\pi_1$ in (3) and $\pi_2$ in (5), we extend known problems for i.i.d. observations to the case of general discrete-time stochastic processes.

The method we use in this article was originally developed for testing two hypotheses [17], then extended to multiple hypothesis testing problems [15]. An extension of the same method to hypothesis testing problems where control variables are present can be found in [14]. A more general setting than that used in this article, for Bayes-type decision problems where both the cost of observations and the loss function depend on the true value of the parameter and on the observations, is considered in [16].

From this point on, our aim will be minimizing $N(\psi)$, defined by (5), in the class of sequential statistical procedures subject to (7). In Section 2, we reduce the problem to an optimal stopping problem. In Section 3, we give a solution to the optimal stopping problem in the class of truncated stopping rules, and in Section 4 in some natural class of non-truncated stopping rules. In particular, in Section 4 we give a solution to the problem of minimizing $N(\psi)$ in the class of all statistical procedures satisfying $W_i(\psi, \delta) \le w_i$, $i = 1, \dots, k$ (see Remark 8).

2 REDUCTION TO AN OPTIMAL STOPPING PROBLEM

In this section, the problem of minimizing the average sample number (5) over all sequential procedures subject to (7) will be reduced to an optimal stopping problem. This is a usual treatment of conditional problems in sequential hypothesis testing (see, for example, [2], [12], [3], [13]). We will use the same ideas to treat the general statistical decision problem described above.

Let us define the following Lagrange-multiplier function:
$$L(\psi, \delta) = L(\psi, \delta; \lambda_1, \dots, \lambda_k) = N(\psi) + \sum_{i=1}^k \lambda_i W_i(\psi, \delta), \quad (9)$$
where $\lambda_i \ge 0$, $i = 1, \dots, k$, are some constant multipliers. Let $\Delta$ be a class of sequential statistical procedures.

The following theorem is a direct application of the method of Lagrange multipliers to the above optimization problem.

Theorem 1. Let there exist $\lambda_i > 0$, $i = 1, \dots, k$, and a procedure $(\psi^*, \delta^*) \in \Delta$ such that for any procedure $(\psi, \delta) \in \Delta$
$$L(\psi^*, \delta^*; \lambda_1, \dots, \lambda_k) \le L(\psi, \delta; \lambda_1, \dots, \lambda_k) \quad (10)$$
holds, and such that
$$W_i(\psi^*, \delta^*) = w_i, \quad i = 1, \dots, k. \quad (11)$$
Then for any test $(\psi, \delta) \in \Delta$ satisfying
$$W_i(\psi, \delta) \le w_i, \quad i = 1, 2, \dots, k, \quad (12)$$
it holds that
$$N(\psi^*) \le N(\psi). \quad (13)$$
The inequality in (13) is strict if at least one of the inequalities in (12) is strict.

Proof. Let $(\psi, \delta) \in \Delta$ be any procedure satisfying (12). Because of (10),
$$L(\psi^*, \delta^*; \lambda_1, \dots, \lambda_k) = N(\psi^*) + \sum_{i=1}^k \lambda_i W_i(\psi^*, \delta^*) \le L(\psi, \delta; \lambda_1, \dots, \lambda_k) \quad (14)$$
$$= N(\psi) + \sum_{i=1}^k \lambda_i W_i(\psi, \delta) \le N(\psi) + \sum_{i=1}^k \lambda_i w_i, \quad (15)$$
where to get the last inequality we used (12). Taking into account conditions (11), we get from this that $N(\psi^*) \le N(\psi)$.

To get the last statement of the theorem, we note that if $N(\psi^*) = N(\psi)$, then there are equalities in (14)–(15) instead of the inequalities, which is only possible if $W_i(\psi, \delta) = w_i$ for any $i = 1, \dots, k$. □

Remark 1. It is easy to see that, defining a new loss function $w'(\theta, d)$ which is equal to $\lambda_i w(\theta, d)$ whenever $\theta \in \Theta_i$, $i = 1, \dots, k$, we have that the weighted average loss $W(\psi, \delta)$, defined by (3) with $w(\theta, d) = w'(\theta, d)$, coincides with the second summand in (9). Because of this, in what follows we treat only the case of one summand ($k = 1$) in (9), with the Lagrange-multiplier function defined as
$$L(\psi, \delta; \lambda) = N(\psi) + \lambda W(\psi, \delta). \quad (16)$$
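The logic of Theorem 1 (with $k = 1$) can be illustrated numerically. In the toy sketch below, which is not from the paper, each candidate procedure is summarized by a hypothetical pair $(N, W)$ of its average sample number and average loss; minimizing the Lagrange function $L = N + \lambda W$ with a $\lambda$ that makes the constraint bind recovers the constrained optimum.

```python
def lagrange_minimizer(procedures, lam):
    """Return the (N, W) pair minimizing L = N + lam * W over a finite
    family of procedures, as in (10) with k = 1."""
    return min(procedures, key=lambda p: p[0] + lam * p[1])

# Hypothetical family of procedures, each reduced to its (N, W) values.
procs = [(10, 0.01), (5, 0.05), (3, 0.10), (2, 0.20)]

star = lagrange_minimizer(procs, lam=100.0)
# star attains W = 0.05 exactly (condition (11)), so by the argument of
# Theorem 1 it also minimizes N among all procedures with W <= 0.05.
```

Here $\lambda = 100$ plays the role of the multiplier in (9): the unconstrained $L$-minimizer happens to satisfy the constraint with equality, which is exactly the situation in which Theorem 1 certifies it as the solution of the conditional problem.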
It is obvious that the problem of minimization of (16) is equivalent to that of minimization of
$$R(\psi, \delta; c) = c N(\psi) + W(\psi, \delta), \quad (17)$$
where $c > 0$ is any constant, and, in the rest of the article, we will solve the problem of minimizing (17) instead of (16). This is because the problem of minimization of (17) is interesting by itself, without its relation to the conditional problem above. For example, if $\pi_2 = \pi_1 = \pi$, it is easy to see that it is equivalent to the problem of Bayesian sequential decision making, with the prior distribution $\pi$ and a fixed cost $c$ per observation. The latter set-up is fundamental in sequential analysis (see [19], [8], [7], [22], [9], among many others).

Because of Theorem 1, from this point on, our main focus will be on the unrestricted minimization of $R(\psi, \delta; c)$ over all sequential decision procedures.

Let us suppose, additionally to the assumptions of the Introduction, that for any $n = 1, 2, \dots$ there exists a decision function $\delta_n^B = \delta_n^B(x_1, \dots, x_n)$ such that for any $d \in D$
$$\int w(\theta, d) f_\theta^n(x_1, \dots, x_n) \, d\pi_1(\theta) \ge \int w(\theta, \delta_n^B(x_1, \dots, x_n)) f_\theta^n(x_1, \dots, x_n) \, d\pi_1(\theta) \quad (18)$$
for $\mu^n$-almost all $(x_1, \dots, x_n)$. Then $\delta_n^B$ is called the Bayesian decision function based on $n$ observations. We do not discuss in this article the questions of the existence of Bayesian decision functions; we just suppose that they exist for any $n = 1, 2, \dots$, referring, e.g., to [19] for an extensive underlying theory.

Let us denote by $l_n = l_n(x_1, \dots, x_n)$ the right-hand side of (18). It easily follows from (18) that
$$\int l_n \, d\mu^n = \inf_{\delta_n} \int E_\theta w(\theta, \delta_n) \, d\pi_1(\theta), \quad (19)$$
thus
$$\int l_1 \, d\mu^1 \ge \int l_2 \, d\mu^2 \ge \cdots.$$
Because of that, we suppose that
$$\int l_1(x) \, d\mu(x) < \infty,$$
which makes all the Bayesian risks (19) finite, for any $n = 1, 2, \dots$.

Let $\delta^B = (\delta_1^B, \delta_2^B, \dots)$.
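For a finite parameter set and a finite decision space, a Bayesian decision function satisfying (18) can be computed by direct minimization of the integrated loss. A minimal sketch, with hypothetical names, in which discrete sums stand in for the integrals over $\Theta$:

```python
def bayes_decision(prior, likelihood, loss, decisions):
    """Pick the d minimizing sum_theta w(theta, d) f^n_theta pi_1(theta),
    i.e. the right-hand side of (18), for finite Theta and D.

    prior:      dict theta -> pi_1(theta)
    likelihood: dict theta -> f^n_theta(x_1, ..., x_n) at the observed data
    loss:       function w(theta, d)
    decisions:  iterable over the decision space D
    """
    def integrated_loss(d):
        return sum(prior[t] * likelihood[t] * loss(t, d) for t in prior)
    return min(decisions, key=integrated_loss)

# Two simple hypotheses with the 0-1 loss of the testing example above:
# deciding d = 2 under theta_1, or d = 1 under theta_2, costs 1.
w = lambda theta, d: 0.0 if theta == d else 1.0
d_star = bayes_decision({1: 0.5, 2: 0.5}, {1: 0.2, 2: 0.8}, w, [1, 2])
```

With 0-1 loss this reduces to picking the hypothesis with the larger value of $\pi_1(\theta_i) f_{\theta_i}^n$, i.e. the larger posterior weight, which is the familiar Bayes test.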
The following theorem shows that the only decision rules worth our attention are the Bayesian ones. Its "if"-part is, in essence, Theorem 5.2.1 of [9].

For any $n = 1, 2, \dots$ and any stopping rule $\psi$, let
$$s_n^\psi = (1 - \psi_1) \cdots (1 - \psi_{n-1}) \psi_n,$$
and let $S_n^\psi = \{(x_1, \dots, x_n) : s_n^\psi(x_1, \dots, x_n) > 0\}$, for all $n = 1, 2, \dots$.

Theorem 2. For any sequential procedure $(\psi, \delta)$,
$$W(\psi, \delta) \ge W(\psi, \delta^B) = \sum_{n=1}^\infty \int s_n^\psi l_n \, d\mu^n. \quad (20)$$
Supposing that the right-hand side of (20) is finite, the equality in (20) is only possible if
$$\int w(\theta, \delta_n) f_\theta^n \, d\pi_1(\theta) = \int w(\theta, \delta_n^B) f_\theta^n \, d\pi_1(\theta)$$
$\mu^n$-almost everywhere on $S_n^\psi$, for all $n = 1, 2, \dots$.

Proof. It is easy to see that $W(\psi, \delta)$ on the left-hand side of (20) has the following equivalent form:
$$W(\psi, \delta) = \sum_{n=1}^\infty \int s_n^\psi \int w(\theta, \delta_n) f_\theta^n \, d\pi_1(\theta) \, d\mu^n. \quad (21)$$
Applying (18) under the integral sign in each summand of (21), we immediately have
$$W(\psi, \delta) \ge \sum_{n=1}^\infty \int s_n^\psi \int w(\theta, \delta_n^B) f_\theta^n \, d\pi_1(\theta) \, d\mu^n = W(\psi, \delta^B). \quad (22)$$
If $W(\psi, \delta^B) < \infty$, then (22) is equivalent to
$$\sum_{n=1}^\infty \int s_n^\psi \Delta_n \, d\mu^n \ge 0, \quad \text{where} \quad \Delta_n = \int w(\theta, \delta_n) f_\theta^n \, d\pi_1(\theta) - \int w(\theta, \delta_n^B) f_\theta^n \, d\pi_1(\theta),$$
which is, due to (18), non-negative $\mu^n$-almost everywhere for all $n = 1, 2, \dots$. Thus, there is an equality in (22) if and only if $\Delta_n = 0$ $\mu^n$-almost everywhere on $S_n^\psi = \{s_n^\psi > 0\}$ for all $n = 1, 2, \dots$. □

Because of (17), it follows from Theorem 2 that for any sequential decision procedure $(\psi, \delta)$,
$$R(\psi, \delta; c) \ge R(\psi, \delta^B; c). \quad (23)$$
The following lemma gives the right-hand side of (23) a more convenient form.

For any probability measure $\pi$ on $\Theta$, let us denote
$$P_\pi(\tau_\psi = n) \equiv \int P_\theta(\tau_\psi = n) \, d\pi(\theta) = \int E_\theta s_n^\psi \, d\pi(\theta), \quad n = 1, 2, \dots
$$
Respectively, $P_\pi(\tau_\psi < \infty) = \sum_{n=1}^\infty P_\pi(\tau_\psi = n)$ and $E_\pi \tau_\psi = \int E_\theta \tau_\psi \, d\pi(\theta)$.

Lemma 1. If
$$P_{\pi_2}(\tau_\psi < \infty) = 1, \quad (24)$$
then
$$R(\psi, \delta^B; c) = \sum_{n=1}^\infty \int s_n^\psi (c n f_n + l_n) \, d\mu^n, \quad (25)$$
where, by definition,
$$f_n = f_n(x_1, \dots, x_n) = \int f_\theta^n(x_1, \dots, x_n) \, d\pi_2(\theta). \quad (26)$$

Proof. By Theorem 2,
$$R(\psi, \delta^B; c) = c N(\psi) + W(\psi, \delta^B) = c N(\psi) + \sum_{n=1}^\infty \int s_n^\psi l_n \, d\mu^n. \quad (27)$$
If now (24) is fulfilled, then, by the Fubini theorem,
$$N(\psi) = \int \sum_{n=1}^\infty n E_\theta s_n^\psi \, d\pi_2(\theta) = \sum_{n=1}^\infty \int E_\theta n s_n^\psi \, d\pi_2(\theta) = \sum_{n=1}^\infty \int s_n^\psi \left( n \int f_\theta^n \, d\pi_2(\theta) \right) d\mu^n = \sum_{n=1}^\infty \int s_n^\psi n f_n \, d\mu^n,$$
so, combining this with (27), we get (25). □

Let us denote
$$R(\psi) = R(\psi; c) = R(\psi, \delta^B; c). \quad (28)$$
By Lemma 1,
$$R(\psi) = \begin{cases} \sum_{n=1}^\infty \int s_n^\psi (c n f_n + l_n) \, d\mu^n, & \text{if } P_{\pi_2}(\tau_\psi < \infty) = 1, \\ \infty, & \text{otherwise}. \end{cases} \quad (29)$$
The aim of what follows is to minimize $R(\psi)$ over all stopping rules. In this way, our problem of minimization of $R(\psi, \delta)$ is reduced to an optimal stopping problem.

3 OPTIMAL TRUNCATED STOPPING RULES

In this section, as a first step, we characterize the structure of optimal stopping rules in the class $\mathcal{F}^N$, $N \ge 2$, of all truncated stopping rules, i.e., those of the form
$$\psi = (\psi_1, \psi_2, \dots, \psi_{N-1}, 1, \dots) \quad (30)$$
(if $(1 - \psi_1) \cdots (1 - \psi_n) = 0$ $\mu^n$-almost everywhere for some $n < N$, we suppose that $\psi_k \equiv 1$ for any $k > n$, so that $\mathcal{F}^N \subset \mathcal{F}^{N+1}$, $N = 1, 2, \dots$). Obviously, for any $\psi \in \mathcal{F}^N$,
$$R(\psi) = R_N(\psi) = \sum_{n=1}^{N-1} \int s_n^\psi (c n f_n + l_n) \, d\mu^n + \int t_N^\psi \left( c N f_N + l_N \right) d\mu^N,$$
where, for any $n = 1, 2, \dots$,
$$t_n^\psi = t_n^\psi(x_1, \dots, x_n) = (1 - \psi_1(x_1))(1 - \psi_2(x_1, x_2)) \cdots (1 - \psi_{n-1}(x_1, \dots, x_{n-1}))$$
(we suppose, by definition, that $t_1^\psi \equiv 1$).

Let us introduce a sequence of functions $V_n^N$, $n = 1, \dots, N$, which will define the optimal stopping rules.
Let $V_N^N \equiv l_N$, and recursively, for $n = N-1, N-2, \dots, 1$,
$$V_n^N = \min\{l_n, Q_n^N\}, \quad (31)$$
where
$$Q_n^N = Q_n^N(x_1, \dots, x_n) = c f_n(x_1, \dots, x_n) + \int V_{n+1}^N(x_1, \dots, x_n, x_{n+1}) \, d\mu(x_{n+1}), \quad (32)$$
$n = 0, 1, \dots, N-1$ (we assume that $f_0 \equiv 1$). Please remember that all $V_n^N$ and $Q_n^N$ implicitly depend on the "unitary observation cost" $c$.

The following theorem characterizes the structure of optimal stopping rules in $\mathcal{F}^N$.

Theorem 3. For all $\psi \in \mathcal{F}^N$,
$$R_N(\psi) \ge Q_0^N. \quad (33)$$
The lower bound in (33) is attained by $\psi \in \mathcal{F}^N$ if and only if
$$I_{\{l_n < Q_n^N\}} \le \psi_n \le I_{\{l_n \le Q_n^N\}} \quad (34)$$
$\mu^n$-almost everywhere on $\{t_n^\psi f_n > 0\}$, for all $n = 1, 2, \dots, N-1$.

The proof of Theorem 3 can be conducted following the lines of the proof of Theorem 3.1 in [17] (in a less formal way, the same routine is used to obtain Theorem 4 in [15]). In fact, both of these theorems are particular cases of Theorem 3.

Remark 2. Although a $\psi$ satisfying (34) is optimal among all truncated stopping rules in $\mathcal{F}^N$, it only makes practical sense if
$$l_0 = \inf_d \int w(\theta, d) \, d\pi_1(\theta) \ge Q_0^N. \quad (35)$$
Indeed, if (35) does not hold, we can, without taking any observation, make a decision $d_0$ such that $\int w(\theta, d_0) \, d\pi_1(\theta) < Q_0^N$, and this guarantees that this trivial procedure (something like "$(\psi_0, d_0)$" with $R(\psi_0, d_0) = \int w(\theta, d_0) \, d\pi_1(\theta) < Q_0^N$) performs better than the best procedure with the optimal stopping time in $\mathcal{F}^N$. Because of this, $V_0^N$, defined by (31) for $n = 0$, may be considered the "minimum value of $R(\psi)$" when taking no observations is allowed.
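The recursion (31)–(32) can be carried out explicitly in small discrete models. The sketch below is a hypothetical instance, not from the paper: Bernoulli observations, two simple hypotheses with the 0-1 loss of the testing example in the Introduction, $\pi_1 = \pi_2$ uniform, and $\mu$ the counting measure on $\{0, 1\}$, so the integral in (32) becomes a two-term sum. It computes $Q_0^N$, the optimal truncated risk appearing in (33).

```python
# Hypothetical instance: H1: p = 0.3, H2: p = 0.7, pi_1 = pi_2 = (1/2, 1/2),
# 0-1 loss, observation cost c, horizon N, mu = counting measure on {0, 1}.
P = (0.3, 0.7)          # success probabilities under theta_1, theta_2
PRIOR = (0.5, 0.5)      # pi_1 = pi_2
C, N = 0.05, 10         # cost per observation and truncation level

def f_theta(i, xs):     # joint density f^n_theta_i(x_1, ..., x_n)
    out = 1.0
    for x in xs:
        out *= P[i] if x else 1 - P[i]
    return out

def l(xs):              # l_n: minimal integrated loss in (18) under 0-1 loss
    return min(PRIOR[i] * f_theta(i, xs) for i in (0, 1))

def V(xs):              # V^N_n of (31), computed by backward recursion
    if len(xs) == N:
        return l(xs)    # V^N_N = l_N
    f_n = sum(PRIOR[i] * f_theta(i, xs) for i in (0, 1))
    Q = C * f_n + V(xs + (0,)) + V(xs + (1,))   # Q^N_n of (32)
    return min(l(xs), Q)

Q0 = C + V((0,)) + V((1,))   # Q^N_0, with f_0 = 1: the bound in (33)
```

The optimal rule of Theorem 3 then stops at the first $n$ with $l_n \le Q_n^N$; in this instance $Q_0^N$ is below $l_0 = 1/2$, so condition (35) holds and observing is worthwhile.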
Remark 3. When $\pi_2$ in (5) coincides with $\pi_1$ in (3) (the Bayesian setting), an optimal truncated (non-randomized) stopping rule for minimizing (17) is provided by Theorem 5.2.2 in [9]. Theorem 3 describes the class of all randomized optimal stopping rules for the same problem in this particular case. This may be irrelevant if one is interested in the purely Bayesian problem, because any of these stopping rules provides the same minimum value of the risk. Nevertheless, this extension of the class of optimal procedures may be useful for complying with (11) in Theorem 1 when seeking optimal sequential procedures for the original conditional problem (minimization of $N(\psi)$ given that $W_i(\psi, \delta) \le w_i$, $i = 1, \dots, k$; see the Introduction and the discussion therein). This is very much like in non-sequential hypothesis testing, where randomization is crucial for finding the optimal level-$\alpha$ test in the Neyman–Pearson problem (see, for example, [11]).

4 OPTIMAL NON-TRUNCATED STOPPING RULES

In this section, we solve the problem of minimization of $R(\psi)$ in natural classes of non-truncated stopping rules $\psi$.

Let $\psi$ be any stopping rule. Define
$$R_N(\psi) = R_N(\psi; c) = \sum_{n=1}^{N-1} \int s_n^\psi (c n f_n + l_n) \, d\mu^n + \int t_N^\psi \left( c N f_N + l_N \right) d\mu^N. \quad (36)$$
This is the "risk" (17) for $\psi$ truncated at $N$, i.e., the rule with the components $\psi^N = (\psi_1, \psi_2, \dots, \psi_{N-1}, 1, \dots)$: $R_N(\psi) = R(\psi^N)$. Because $\psi^N$ is truncated, the results of the preceding section apply, in particular, the lower bound of (33).

Very much like in [17] and [15], our aim is to pass to the limit, as $N \to \infty$, in order to obtain a lower bound for $R(\psi)$, and conditions for attaining this bound.

It is easy to see that $V_n^N(x_1, \dots, x_n) \ge V_n^{N+1}(x_1, \dots, x_n)$ for all $N \ge n$ and for all $(x_1, \dots, x_n)$, $n \ge 1$ (see, for example, Lemma 3.3 in [17]). Thus, for any $n \ge 1$ there exists
$$V_n = V_n(x_1, \dots, x_n) = \lim_{N \to \infty} V_n^N(x_1, \dots, x_n)$$
($V_n$ implicitly depends on $c$, as the $V_n^N$ do). It immediately follows from the monotone convergence theorem that, for all $n \ge 1$,
$$\lim_{N \to \infty} Q_n^N(x_1, \dots, x_n) = c f_n(x_1, \dots, x_n) + \int V_{n+1}(x_1, \dots, x_n, x_{n+1}) \, d\mu(x_{n+1}) \quad (37)$$
(see (32)). Let $Q_n = Q_n(x_1, \dots, x_n) = \lim_{N \to \infty} Q_n^N(x_1, \dots, x_n)$. In addition, passing to the limit, as $N \to \infty$, in (31), we obtain
$$V_n = \min\{l_n, Q_n\}, \quad n = 1, 2, \dots$$

Let now $\mathcal{F}$ be any class of stopping rules such that $\psi \in \mathcal{F}$ entails $R_N(\psi) \to R(\psi)$ as $N \to \infty$. It is easy to see that such classes exist; for example, any $\mathcal{F}^N$ has this property. Moreover, we will assume that all truncated stopping rules are included in $\mathcal{F}$, i.e., that $\bigcup_{N \ge 1} \mathcal{F}^N \subset \mathcal{F}$.

It now follows from Theorem 3 that, for all $\psi \in \mathcal{F}$,
$$R(\psi) \ge Q_0. \quad (38)$$
The following lemma states that, in fact, the lower bound in (38) is the infimum of the risk $R(\psi)$ over $\psi \in \mathcal{F}$.

Lemma 2. $Q_0 = \inf_{\psi \in \mathcal{F}} R(\psi)$.

The proof of Lemma 2 is very close to that of Lemma 3.5 in [17] (see also Lemma 6 in [15]) and is omitted here.

Remark 4. Again (see Remark 3), if $\pi_1 = \pi_2$, Lemma 2 is essentially Theorem 5.2.3 in [9] (see also Section 7.2 of [8]).

The following theorem gives the structure of optimal stopping rules in $\mathcal{F}$.

Theorem 4. If there exists $\psi \in \mathcal{F}$ such that
$$R(\psi) = \inf_{\psi' \in \mathcal{F}} R(\psi'), \quad (39)$$
then $I_{\{l_n}$…
