A Theory of Truncated Inverse Sampling
Authors: Xinjia Chen
November 2008

Abstract

In this paper, we have established a new framework of truncated inverse sampling for estimating the mean values of non-negative random variables such as binomial, Poisson, hypergeometric, and bounded variables. We have derived explicit formulas and computational methods for designing sampling schemes to ensure prescribed levels of precision and confidence for point estimators. Moreover, we have developed interval estimation methods.

1 Introduction

Parametric estimation based on sampling is an important branch of mathematical statistics with ubiquitous applications across many fields: operations research, biology and medical science, agricultural science, computer science, social science, telecommunication engineering, and control engineering, to name a few. A wide class of estimation problems of both theoretical and practical significance can be put into the setting of estimating the mean value of a random variable via sampling. Familiar examples include the estimation of binomial parameters, Poisson parameters, finite population proportions, the mean of a bounded variable, and so on.

A simple yet frequently used sampling scheme for estimating the mean value of a random variable $X$ is to draw samples of $X$ until the sample sum is no less than a prescribed threshold and then take the empirical mean as an estimate for the true mean value. This sampling scheme, referred to as inverse sampling, was first studied by Haldane [8, 9] in the context of estimating a binomial parameter. Recently, inverse sampling has been studied by Chen [1, 2], Dagum et al. [5], and Cheng [3] for estimation of the mean of a bounded variable. Mendo and Hernando [10] have revisited inverse sampling for estimating binomial parameters. Theoretically, there is no limit on the number of samples for inverse sampling.
However, the practical situation is quite the contrary. Due to the limitation of resources, almost every practitioner would specify a maximum sample size for the sampling. This means that the frequently used method is actually a truncated inverse sampling scheme, in the sense that sampling is continued until the sample sum is no less than a prescribed threshold or the number of samples reaches the maximum sample size. While ideal inverse sampling has drawn extensive research effort, little attention has been paid to the theoretical issues of the truly useful truncated inverse sampling scheme.

∗ The author was previously with Louisiana State University, Baton Rouge, LA 70803, USA, and is now with the Department of Electrical Engineering, Southern University and A&M College, Baton Rouge, LA 70813, USA. Email: chenxinjia@gmail.com

In this paper, we investigate the essential theory of truncated inverse sampling with a prevailing theme of error control. We answer two equally central problems regarding pre-experimental planning and post-experimental analysis. The first problem is the determination of the threshold value and the maximum sample size for guaranteeing prescribed levels of precision and confidence of an estimator. The second problem is interval estimation of the parameter based on the observed data when the truncated inverse sampling is completed.

The remainder of the paper is organized as follows. In Section 2, we present our general results for truncated inverse sampling. In Section 3, we consider the problem of estimating binomial parameters. In Section 4, we discuss the estimation of the proportion of a finite population. In Section 5, we discuss the estimation of Poisson parameters. The estimation of the mean of a bounded variable is investigated in Section 6. Section 7 concludes the paper.
Throughout this paper, we shall use the following notation. The expectation of a random variable is denoted by $\mathbb{E}[\cdot]$. The set of positive integers is denoted by $\mathbb{N}$. The ceiling and floor functions are denoted respectively by $\lceil \cdot \rceil$ and $\lfloor \cdot \rfloor$ (i.e., $\lceil x \rceil$ represents the smallest integer no less than $x$; $\lfloor x \rfloor$ represents the largest integer no greater than $x$). The gamma function is denoted by $\Gamma(\cdot)$. For any integer $m$, the combinatorial function $\binom{m}{z}$ with respect to integer $z$ takes value $\frac{\Gamma(m+1)}{\Gamma(z+1)\,\Gamma(m-z+1)}$ for $z \le m$ and value $0$ otherwise. The limit as $\epsilon$ decreases to $0$ is denoted by $\lim_{\epsilon \downarrow 0}$. The notation "$\iff$" means "if and only if". We use the notation $\Pr\{\cdot \mid \theta\}$ to indicate that the associated random samples $X_1, X_2, \cdots$ are parameterized by $\theta$. The parameter $\theta$ in $\Pr\{\cdot \mid \theta\}$ may be dropped whenever this can be done without introducing confusion. Other notation will be made clear as we proceed.

2 General Theory

In this section, we develop some general results on truncated inverse sampling. Let $X$ be a non-negative random variable defined in a probability space $(\Omega, \mathscr{F}, \Pr)$. Our problem is to estimate the mean, $\mu = \mathbb{E}[X]$, of $X$ based on i.i.d. random samples $X_1, X_2, \cdots$ of $X$. To this end, we adopt the following truncated inverse sampling scheme: continue sampling until the sample sum is no less than a threshold value $\gamma > 0$ or the number of samples reaches an integer $n$.

Let $\mathbf{n}$ be the total number of samples when the sampling is stopped. By the definition of the truncated inverse sampling scheme, $\mathbf{n}$ is a random variable such that
\[
\mathbf{n}(\omega) = \min\left\{ n,\; \min\left\{ \ell \in \mathbb{N} : \sum_{i=1}^{\ell} X_i(\omega) \ge \gamma \right\} \right\}
\]
for any $\omega \in \Omega$. Define $\mathbf{k} = \sum_{i=1}^{\mathbf{n}} X_i$. Then, we can take
\[
\widehat{\mu} = \frac{\min\{\mathbf{k}, \gamma\}}{\mathbf{n}}
\]
as the estimator for $\mu = \mathbb{E}[X]$.
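As a concrete illustration, the scheme and the estimator $\widehat{\mu}$ can be simulated in a few lines. This is a minimal sketch, not part of the paper; the function names and the Bernoulli example are illustrative only.

```python
import random

def truncated_inverse_sample(draw, gamma, n_max, rng):
    """Sample until the running sum reaches gamma or n_max draws are made.

    Returns (n, k): the number of samples taken and the sample sum."""
    k, n = 0.0, 0
    while n < n_max and k < gamma:
        k += draw(rng)
        n += 1
    return n, k

def estimate_mean(draw, gamma, n_max, rng):
    """Point estimator mu_hat = min(k, gamma) / n."""
    n, k = truncated_inverse_sample(draw, gamma, n_max, rng)
    return min(k, gamma) / n

# Example: X ~ Bernoulli(p) with p = 0.3; threshold gamma = 30, maximum size n = 500.
rng = random.Random(0)
bernoulli = lambda r: 1.0 if r.random() < 0.3 else 0.0
estimates = [estimate_mean(bernoulli, 30, 500, rng) for _ in range(2000)]
# The empirical distribution of the estimates concentrates near the true mean 0.3.
```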
With regard to the distribution of $\widehat{\mu}$, we have

Theorem 1  For any $z > 0$,
\[
\Pr\{\widehat{\mu} \le z\} = \begin{cases} \Pr\left\{ \sum_{i=1}^{\lceil \gamma/z \rceil - 1} X_i < \gamma \right\} & \text{for } \gamma \le nz, \\[4pt] \Pr\left\{ \sum_{i=1}^{n} X_i \le nz \right\} & \text{for } \gamma > nz, \end{cases}
\]
\[
\Pr\{\widehat{\mu} \ge z\} = \begin{cases} \Pr\left\{ \sum_{i=1}^{\lfloor \gamma/z \rfloor} X_i \ge \gamma \right\} & \text{for } \gamma \le nz, \\[4pt] \Pr\left\{ \sum_{i=1}^{n} X_i \ge nz \right\} & \text{for } \gamma > nz. \end{cases}
\]

With regard to the average sample number $\mathbb{E}[\mathbf{n}]$ of the truncated inverse sampling associated with random variable $X$, we have

Theorem 2  For any non-negative random variable $X$ with positive mean and finite variance,
\[
\mathbb{E}[\mathbf{n}] < \min\left\{ n,\; \frac{\gamma}{\mu} + 1 \right\}.
\]
Specially, if $\gamma$ is a positive integer and $X$ is a Bernoulli random variable such that $\Pr\{X = 1\} = 1 - \Pr\{X = 0\} = p \in (0, 1)$, then $\mathbb{E}[\mathbf{n}] < \min\left\{ n,\; \frac{\gamma}{p} \right\}$.

3 Estimation of Binomial Parameters

In this section, we consider the estimation of a binomial parameter based on truncated inverse sampling. Let $X$ be a Bernoulli random variable such that $\Pr\{X = 1\} = 1 - \Pr\{X = 0\} = p \in (0, 1)$. Our goal is to estimate $p$ based on i.i.d. random samples $X_1, X_2, \cdots$ of $X$. Since $X_i$ assumes only the two possible values $0$ and $1$, the threshold value $\gamma$ shall be restricted to an integer. The estimator for $p$ can be taken as $\widehat{p} = \frac{\min\{\mathbf{k}, \gamma\}}{\mathbf{n}} = \frac{\mathbf{k}}{\mathbf{n}}$, where $\mathbf{k}$ and $\mathbf{n}$ have been defined in Section 2.

In order to estimate $p$ via truncated inverse sampling, a critical problem is the determination of the threshold value $\gamma$ and the maximum sample size $n$. By making use of the functions
\[
M_B(z, \mu) = z \ln\frac{\mu}{z} + (1 - z) \ln\frac{1 - \mu}{1 - z}, \qquad M_I(z, \mu) = \frac{1}{z}\, M_B(z, \mu)
\]
for $0 < z < 1$ and $0 < \mu < 1$, we have derived the following result.

Theorem 3  Let $0 < \delta < 1$. Let $0 < \varepsilon_a < \varepsilon_r < 1$ be respectively the margins of absolute and relative errors such that $\frac{\varepsilon_a}{\varepsilon_r} + \varepsilon_a \le \frac{1}{2}$. Then,
\[
\Pr\left\{ |\widehat{p} - p| < \varepsilon_a \;\text{ or }\; \left| \frac{\widehat{p} - p}{p} \right| < \varepsilon_r \right\} > 1 - \delta
\]
provided that
\[
n > \frac{\ln(\delta/2)}{M_B(p^\star + \varepsilon_a,\, p^\star)} \quad \text{and} \quad \gamma > \frac{\ln(\delta/2)}{M_I(p^\star + \varepsilon_a,\, p^\star)},
\]
where $p^\star = \frac{\varepsilon_a}{\varepsilon_r}$.
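The bounds of Theorem 3 are straightforward to evaluate numerically. The following is a sketch (the helper `sample_sizes` is illustrative, not from the paper); the smallest admissible integers are obtained by flooring the bound and adding one, and the chosen margins satisfy the condition $\varepsilon_a/\varepsilon_r + \varepsilon_a \le 1/2$.

```python
import math

def M_B(z, mu):
    """M_B(z, mu) = z ln(mu/z) + (1 - z) ln((1 - mu)/(1 - z)), for 0 < z, mu < 1."""
    return z * math.log(mu / z) + (1 - z) * math.log((1 - mu) / (1 - z))

def M_I(z, mu):
    """M_I(z, mu) = M_B(z, mu) / z."""
    return M_B(z, mu) / z

def sample_sizes(eps_a, eps_r, delta):
    """Smallest integers n and gamma exceeding the bounds of Theorem 3."""
    p_star = eps_a / eps_r
    n = math.floor(math.log(delta / 2) / M_B(p_star + eps_a, p_star)) + 1
    gamma = math.floor(math.log(delta / 2) / M_I(p_star + eps_a, p_star)) + 1
    return n, gamma

# Margins eps_a = 0.02, eps_r = 0.05 (so p_star = 0.4) at confidence level 95%.
n, gamma = sample_sizes(0.02, 0.05, 0.05)
```

Note that both $\ln(\delta/2)$ and $M_B$, $M_I$ are negative, so the quotients are positive; also $\gamma < n$ here, since $M_I$ divides $M_B$ by $p^\star + \varepsilon_a < 1$.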
Theorem 3 provides explicit formulas for determining the threshold value $\gamma$ and the maximum sample size $n$. To reduce conservatism, we can take a computational approach to obtain smaller $\gamma$ and $n$. In this direction, the following theorem is of fundamental importance.

Theorem 4  Let $L(\cdot)$ and $U(\cdot)$ be monotone functions. Let the supports of $L(\widehat{p})$ and $U(\widehat{p})$ be denoted by $I_L$ and $I_U$ respectively. Then, the maximum of $\Pr\{p \le L(\widehat{p}) \mid p\}$ with respect to $p \in [a, b] \subseteq [0, 1]$ is achieved at $(I_L \cap [a, b]) \cup \{a, b\}$, provided that $I_L$ has no closure point in $[a, b]$. Similarly, the maximum of $\Pr\{p \ge U(\widehat{p}) \mid p\}$ with respect to $p \in [a, b] \subseteq [0, 1]$ is achieved at $(I_U \cap [a, b]) \cup \{a, b\}$, provided that $I_U$ has no closure point in $[a, b]$.

In Theorem 4, we have used the concept of support. The support of a random variable refers to the set of all possible values that the random variable can assume. By virtue of Theorem 4, we have obtained the following results.

Theorem 5  Let $0 < \delta < 1$ and $\zeta > 0$. Let $0 < \varepsilon_a < \varepsilon_r < 1$ be respectively the margins of absolute and relative errors. Define
\[
n = \left\lfloor \frac{\ln(\zeta\delta)}{M_B(p^\star + \varepsilon_a,\, p^\star)} \right\rfloor \quad \text{and} \quad \gamma = \left\lfloor \frac{\ln(\zeta\delta)}{M_I(p^\star + \varepsilon_a,\, p^\star)} \right\rfloor
\]
with $p^\star = \frac{\varepsilon_a}{\varepsilon_r}$. Define $Q_a^-$ as the support of $\widehat{p} - \varepsilon_a$, $Q_a^+$ as the support of $\widehat{p} + \varepsilon_a$, $Q_r^+$ as the support of $\widehat{p}/(1 + \varepsilon_r)$, and $Q_r^-$ as the support of $\widehat{p}/(1 - \varepsilon_r)$. Then,
\[
\Pr\left\{ |\widehat{p} - p| < \varepsilon_a \;\text{ or }\; \left| \frac{\widehat{p} - p}{p} \right| < \varepsilon_r \right\} > 1 - \delta
\]
provided that
\[
\Pr\{\widehat{p} \ge p + \varepsilon_a \mid p\} \le \frac{\delta}{2}, \quad \forall p \in \left( Q_a^- \cup \{p^\star\} \right) \cap (0, p^\star], \tag{1}
\]
\[
\Pr\{\widehat{p} \le p - \varepsilon_a \mid p\} \le \frac{\delta}{2}, \quad \forall p \in \left( Q_a^+ \cup \{p^\star\} \right) \cap (0, p^\star], \tag{2}
\]
\[
\Pr\{\widehat{p} \ge p(1 + \varepsilon_r) \mid p\} \le \frac{\delta}{2}, \quad \forall p \in Q_r^+ \cap (p^\star, 1), \tag{3}
\]
\[
\Pr\{\widehat{p} \le p(1 - \varepsilon_r) \mid p\} \le \frac{\delta}{2}, \quad \forall p \in Q_r^- \cap (p^\star, 1), \tag{4}
\]
where these conditions are satisfied when $\zeta$ is smaller than $\frac{1}{2}$.
Clearly, the support of $\widehat{p}$ is
\[
\left\{ \frac{j}{n} : j = 0, 1, \cdots, \gamma \right\} \cup \left\{ \frac{\gamma}{m} : m = \gamma, \gamma + 1, \cdots, n - 1 \right\}.
\]
Theorem 5 asserts that the prescribed levels of precision and confidence can be guaranteed if $\zeta$ is small enough. Hence, we can determine an appropriate value of $\zeta$ by a bisection search method.

When the sampling is terminated, it is desirable to construct a confidence interval for $p$. For this purpose, we have

Theorem 6  Let $0 < \delta < 1$. Define the lower confidence limit $\underline{p} \in [0, 1)$ such that $\underline{p} = 0$ for $\mathbf{k} = 0$ and that $\sum_{i=\mathbf{k}}^{\mathbf{n}} \binom{\mathbf{n}}{i} \underline{p}^{\,i} (1 - \underline{p})^{\mathbf{n} - i} = \frac{\delta}{2}$ for $\mathbf{k} > 0$. Define the upper confidence limit $\overline{p} \in (0, 1]$ such that $\overline{p} = 1$ for $\mathbf{k} = \mathbf{n}$ and that $\sum_{i=0}^{\mathbf{k}} \binom{\mathbf{n}}{i} \overline{p}^{\,i} (1 - \overline{p})^{\mathbf{n} - i} = \frac{\delta}{2}$ for $\mathbf{k} < \mathbf{n}$. Then, $\Pr\{\underline{p} < p < \overline{p}\} \ge 1 - \delta$.

It should be noted that this approach of constructing a confidence interval for $p$ can be considered a generalization of Clopper and Pearson's method [4] of interval estimation.

4 Estimation of Finite Population Proportion

In the last section, we investigated the estimation of a binomial parameter $p$, which can be considered the proportion of an infinite population. In many situations, the population size is finite, and we devote this section to the estimation of the proportion of a finite population. Consider a population of $N$ units, among which there are $M$ units having a certain attribute. It is a frequent problem to estimate the population proportion $p = \frac{M}{N}$ by sampling without replacement. The procedure of sampling without replacement can be precisely described as follows: each time, a single unit is drawn without replacement from the remaining population, so that every unit of the remaining population has an equal chance of being selected.
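Looking back at Theorem 6, both confidence limits are roots of monotone binomial tail probabilities, so they can be found by bisection. The following is a self-contained sketch (the function names are illustrative, not from the paper):

```python
import math

def binom_upper_tail(n, k, p):
    """Pr{X >= k} for X ~ Binomial(n, p); increasing in p."""
    return sum(math.comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))

def binom_lower_tail(n, k, p):
    """Pr{X <= k} for X ~ Binomial(n, p); decreasing in p."""
    return sum(math.comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(0, k + 1))

def clopper_pearson_limits(n, k, delta):
    """Limits of Theorem 6: the lower limit solves Pr{X >= k} = delta/2 and
    the upper limit solves Pr{X <= k} = delta/2, with the boundary conventions."""
    def bisect(f, target, increasing):
        lo, hi = 0.0, 1.0
        for _ in range(60):  # 60 halvings give far more precision than needed
            mid = (lo + hi) / 2
            if (f(mid) < target) == increasing:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2
    p_low = 0.0 if k == 0 else bisect(lambda p: binom_upper_tail(n, k, p), delta / 2, True)
    p_up = 1.0 if k == n else bisect(lambda p: binom_lower_tail(n, k, p), delta / 2, False)
    return p_low, p_up

# e.g. sampling stopped after 50 samples with sample sum 20:
p_low, p_up = clopper_pearson_limits(n=50, k=20, delta=0.05)
```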
Such a sampling process can be exactly characterized by random variables $X_1, \cdots, X_N$ defined in a probability space $(\Omega, \mathscr{F}, \Pr)$ such that $X_i$ denotes the characteristic of the $i$-th sample, in the sense that $X_i = 1$ if the $i$-th sample has the attribute and $X_i = 0$ otherwise. By the nature of the sampling procedure, it can be shown that
\[
\Pr\{X_i = x_i,\; i = 1, \cdots, n\} = \frac{\binom{M}{\sum_{i=1}^{n} x_i}\, \binom{N - M}{\,n - \sum_{i=1}^{n} x_i\,}}{\binom{n}{\sum_{i=1}^{n} x_i}\, \binom{N}{n}} \tag{5}
\]
for any $n \in \{1, \cdots, N\}$ and any $x_i \in \{0, 1\}$, $i = 1, \cdots, n$. Moreover, if the proportion $p = \frac{M}{N}$ is fixed and the population size $N$ tends to infinity, the sequence $X_1, X_2, \cdots, X_N$ tends to i.i.d. random samples of a Bernoulli variable.

To estimate the population proportion $p$, we can use a sampling scheme defined by positive integers $\gamma$ and $n$ as follows: continue sampling without replacement until $\gamma$ units are found to have the attribute or the number of samples reaches $n$.

Despite the lack of independence in the sequence $X_1, X_2, \cdots, X_N$ with joint distribution (5), such a sampling method is also referred to as truncated inverse sampling, due to the fact that, when the sampling is terminated, the number of units having the attribute, denoted by $\mathbf{k}$, is actually equal to $\sum_{i=1}^{\mathbf{n}} X_i$, where $\mathbf{n}$ is the sample size when the sampling is terminated. This implies that, by relaxing the independence assumption, we can put such a sampling scheme into the general framework of truncated inverse sampling described in Section 2. It can be seen that, as the population size tends to infinity while the proportion $p$ is fixed, such a sampling scheme reduces to the truncated inverse sampling for the estimation of a binomial parameter as discussed in Section 3.
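For intuition, this scheme can be simulated by shuffling an urn of 0/1 indicators; a uniformly random order of the urn realizes the drawing procedure exactly. A minimal sketch (illustrative only, not part of the paper):

```python
import random

def truncated_inverse_sample_finite(N, M, gamma, n_max, rng):
    """Draw units without replacement until gamma units with the attribute are
    found or n_max units have been drawn; return (sample size n, count k)."""
    urn = [1] * M + [0] * (N - M)
    rng.shuffle(urn)  # uniform random order = sequential draws without replacement
    n = k = 0
    for x in urn[:n_max]:
        n += 1
        k += x
        if k >= gamma:
            break
    return n, k

# Population of N = 1000 units with M = 300 marked units; gamma = 30, n = 200.
rng = random.Random(1)
estimates = []
for _ in range(2000):
    n, k = truncated_inverse_sample_finite(1000, 300, 30, 200, rng)
    estimates.append(min(k, 30) / n)
# The estimates concentrate near the true proportion p = M/N = 0.3.
```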
As in the case of estimating a binomial parameter in Section 3, the estimator for the proportion of a finite population can be taken as $\widehat{p} = \frac{\min\{\mathbf{k}, \gamma\}}{\mathbf{n}} = \frac{\mathbf{k}}{\mathbf{n}}$. In order to determine $n$ and $\gamma$ to guarantee prescribed levels of precision and confidence, we have the following result.

Theorem 7  Let $0 < \delta < 1$. Let $0 < \varepsilon_a < \varepsilon_r < 1$ be respectively the margins of absolute and relative errors such that $\frac{\varepsilon_a}{\varepsilon_r} + \varepsilon_a \le \frac{1}{2}$. Then,
\[
\Pr\{ |\widehat{p} - p| < \varepsilon_a \;\text{ or }\; |\widehat{p} - p| < p\varepsilon_r \} > 1 - \delta
\]
provided that
\[
n > \frac{\ln(\delta/2)}{M_B(p^\star + \varepsilon_a,\, p^\star)} \quad \text{and} \quad \gamma > \frac{\ln(\delta/2)}{M_I(p^\star + \varepsilon_a,\, p^\star)},
\]
where $p^\star = \frac{\varepsilon_a}{\varepsilon_r}$.

Theorem 7 provides explicit formulas for determining the threshold value $\gamma$ and the maximum sample size $n$. To reduce conservatism, we can take a computational approach to obtain smaller $\gamma$ and $n$. In this direction, the following theorem is useful.

Theorem 8  Let $L(\cdot)$ and $U(\cdot)$ be non-decreasing integer-valued functions. Let the supports of $L(\widehat{p})$ and $U(\widehat{p})$ be denoted by $I_L$ and $I_U$ respectively. Then, the maximum of $\Pr\{M \le L(\widehat{p}) \mid M\}$ with respect to $M \in [a, b] \subseteq [0, N]$, where $a$ and $b$ are integers, is achieved at $(I_L \cap [a, b]) \cup \{a, b\}$. Similarly, the maximum of $\Pr\{M \ge U(\widehat{p}) \mid M\}$ with respect to $M \in [a, b]$ is achieved at $(I_U \cap [a, b]) \cup \{a, b\}$.

By virtue of Theorem 8, we have obtained the following results.

Theorem 9  Let $0 < \delta < 1$ and $\zeta > 0$. Let $0 < \varepsilon_a < \varepsilon_r < 1$ be respectively the margins of absolute and relative errors. Define
\[
n = \left\lfloor \frac{\ln(\zeta\delta)}{M_B(p^\star + \varepsilon_a,\, p^\star)} \right\rfloor \quad \text{and} \quad \gamma = \left\lfloor \frac{\ln(\zeta\delta)}{M_I(p^\star + \varepsilon_a,\, p^\star)} \right\rfloor
\]
with $p^\star = \frac{\varepsilon_a}{\varepsilon_r}$. Define $Q_a^-$ as the support of $\lfloor N(\widehat{p} - \varepsilon_a) \rfloor$, $Q_a^+$ as the support of $\lceil N(\widehat{p} + \varepsilon_a) \rceil$, $Q_r^+$ as the support of $\lfloor N\widehat{p}/(1 + \varepsilon_r) \rfloor$, and $Q_r^-$ as the support of $\lceil N\widehat{p}/(1 - \varepsilon_r) \rceil$.
Then,
\[
\Pr\{ |\widehat{p} - p| < \varepsilon_a \;\text{ or }\; |\widehat{p} - p| < p\varepsilon_r \} > 1 - \delta
\]
provided that
\[
\Pr\{\widehat{p} \ge p + \varepsilon_a \mid M\} \le \frac{\delta}{2}, \quad \forall M \in \left( Q_a^- \cup \{\lfloor N p^\star \rfloor\} \right) \cap (0, N p^\star], \tag{6}
\]
\[
\Pr\{\widehat{p} \le p - \varepsilon_a \mid M\} \le \frac{\delta}{2}, \quad \forall M \in \left( Q_a^+ \cup \{\lfloor N p^\star \rfloor\} \right) \cap (0, N p^\star], \tag{7}
\]
\[
\Pr\{\widehat{p} \ge p(1 + \varepsilon_r) \mid M\} \le \frac{\delta}{2}, \quad \forall M \in \left( Q_r^+ \cup \{\lfloor N p^\star \rfloor + 1\} \right) \cap (N p^\star, N), \tag{8}
\]
\[
\Pr\{\widehat{p} \le p(1 - \varepsilon_r) \mid M\} \le \frac{\delta}{2}, \quad \forall M \in \left( Q_r^- \cup \{\lfloor N p^\star \rfloor + 1\} \right) \cap (N p^\star, N), \tag{9}
\]
where these conditions are satisfied when $\zeta$ is smaller than $\frac{1}{2}$.

Clearly, the support of $\widehat{p}$ is $\left\{ \frac{j}{n} : j = 0, 1, \cdots, \gamma \right\} \cup \left\{ \frac{\gamma}{m} : m = \gamma, \gamma + 1, \cdots, n - 1 \right\}$. It is asserted by Theorem 9 that the prescribed levels of precision and confidence can be guaranteed if $\zeta$ is small enough. Therefore, an appropriate value of $\zeta$ can be determined by a bisection search method.

In order to construct a confidence interval for $M$, we have

Theorem 10  Let $M_l$ be the smallest integer such that $\sum_{i=\mathbf{k}}^{\mathbf{n}} \binom{M_l}{i} \binom{N - M_l}{\mathbf{n} - i} / \binom{N}{\mathbf{n}} > \frac{\delta}{2}$. Let $M_u$ be the largest integer such that $\sum_{i=0}^{\mathbf{k}} \binom{M_u}{i} \binom{N - M_u}{\mathbf{n} - i} / \binom{N}{\mathbf{n}} > \frac{\delta}{2}$. Then, $\Pr\{M_l \le M \le M_u\} \ge 1 - \delta$.

With regard to the average sample number $\mathbb{E}[\mathbf{n}]$, we have

Theorem 11  If the population proportion $p$ is positive, then $\mathbb{E}[\mathbf{n}] < \min\left\{ n,\; \frac{\gamma}{p} \right\}$.

5 Estimation of Poisson Parameters

Let $X$ be a Poisson random variable with mean $\lambda > 0$. It is a frequent problem to estimate $\lambda$ based on i.i.d. random samples $X_1, X_2, \cdots$ of $X$. This can be accomplished by using the truncated inverse sampling scheme described in Section 2. Since $X_i$ is an integer-valued random variable, we shall restrict the threshold $\gamma$ to be a positive integer. We take $\widehat{\lambda} = \frac{\min\{\mathbf{k}, \gamma\}}{\mathbf{n}}$ as the estimator for $\lambda$, where $\mathbf{k}$ and $\mathbf{n}$ have been defined in Section 2. To determine the threshold $\gamma$ and the maximum sample size $n$, one would need an upper bound for $\lambda$; we do not pursue results along this line.
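The scheme of Section 2 applies verbatim to Poisson samples. A quick simulation sketch follows (illustrative only; since the Python standard library has no Poisson sampler, Knuth's product-of-uniforms generator is used):

```python
import math
import random

def poisson_draw(lam, rng):
    """Knuth's method: multiply uniforms until the product drops below exp(-lam);
    the number of multiplications minus one is Poisson(lam)."""
    limit, k, prod = math.exp(-lam), 0, 1.0
    while True:
        prod *= rng.random()
        if prod <= limit:
            return k
        k += 1

def estimate_poisson_mean(lam, gamma, n_max, rng):
    """Truncated inverse sampling estimate lambda_hat = min(k, gamma) / n."""
    k = n = 0
    while n < n_max and k < gamma:
        k += poisson_draw(lam, rng)
        n += 1
    return min(k, gamma) / n

# True mean lambda = 2.0; threshold gamma = 50, maximum sample size n = 100.
rng = random.Random(2)
estimates = [estimate_poisson_mean(2.0, 50, 100, rng) for _ in range(1000)]
```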
We are more interested in the construction of a confidence interval when the sampling is completed. For this purpose, we have

Theorem 12  Let $0 < \delta < 1$. Define the lower confidence limit $\underline{\lambda}$ such that $\underline{\lambda} = 0$ for $\widehat{\lambda} = 0$; $\sum_{i=\gamma}^{\infty} \frac{(\mathbf{n}\underline{\lambda})^i}{i!} \exp(-\mathbf{n}\underline{\lambda}) = \frac{\delta}{2}$ for $\widehat{\lambda} \ge \frac{\gamma}{\mathbf{n}}$; and $\sum_{i=\mathbf{k}}^{\infty} \frac{(\mathbf{n}\underline{\lambda})^i}{i!} \exp(-\mathbf{n}\underline{\lambda}) = \frac{\delta}{2}$ for $0 < \widehat{\lambda} < \frac{\gamma}{\mathbf{n}}$, where $\mathbf{k} = \sum_{i=1}^{\mathbf{n}} X_i$. Define the upper confidence limit $\overline{\lambda}$ such that $\overline{\lambda} = \infty$ for $\mathbf{n} = 1$; $\sum_{i=0}^{\gamma - 1} \frac{[(\mathbf{n} - 1)\overline{\lambda}]^i}{i!} \exp(-(\mathbf{n} - 1)\overline{\lambda}) = \frac{\delta}{2}$ for $\frac{\gamma}{\mathbf{n}} \le \widehat{\lambda} < \gamma$; and $\sum_{i=0}^{\mathbf{k}} \frac{(\mathbf{n}\overline{\lambda})^i}{i!} \exp(-\mathbf{n}\overline{\lambda}) = \frac{\delta}{2}$ for $\widehat{\lambda} < \frac{\gamma}{\mathbf{n}}$. Then, $\Pr\{\underline{\lambda} < \lambda < \overline{\lambda}\} \ge 1 - \delta$.

It should be noted that the interval estimation method described in Theorem 12 is a generalization of Garwood's interval estimation method [6].

6 Estimation of Bounded-Variable Means

Let $X$ be a random variable bounded in $[0, 1]$ with mean $\mu = \mathbb{E}[X]$. In many situations, it is desirable to estimate $\mu$ based on i.i.d. random samples $X_1, X_2, \cdots$ of $X$ (see, e.g., [5] and the references therein). To fulfill this goal, we shall make use of the truncated inverse sampling scheme described in Section 2. In order to determine the threshold $\gamma$ and the maximum sample size $n$ to guarantee prescribed levels of precision and confidence, we have

Theorem 13  Let $0 < \delta < 1$. Let $0 < \varepsilon_a < \varepsilon_r < 1$ be respectively the margins of absolute and relative errors such that $p^\star + \varepsilon_a \le \frac{1}{2}$ with $p^\star = \frac{\varepsilon_a}{\varepsilon_r}$. Then,
\[
\Pr\left\{ |\widehat{\mu} - \mu| < \varepsilon_a \;\text{ or }\; \left| \frac{\widehat{\mu} - \mu}{\mu} \right| < \varepsilon_r \right\} > 1 - \delta
\]
provided that
\[
\gamma > \frac{1 - \varepsilon_r}{\varepsilon_r}, \qquad \gamma > \frac{\ln\frac{\delta}{2}}{M_I\!\left( \frac{\gamma(p^\star - \varepsilon_a)}{\gamma - 1 + \varepsilon_r},\; p^\star \right)}, \qquad \gamma > \frac{\ln\frac{\delta}{2}}{M_I(p^\star + \varepsilon_a,\, p^\star)}, \qquad n > \frac{\ln\frac{\delta}{2}}{M_B(p^\star + \varepsilon_a,\, p^\star)}.
\]

With regard to the interval estimation of $\mu$, we have

Theorem 14  Let $0 < \delta < 1$. Define the lower confidence limit $\underline{\mu} \in [0, \widehat{\mu}]$ such that $\underline{\mu} = 0$ for $\widehat{\mu} = 0$; $M_B(\widehat{\mu}, \underline{\mu}) = \frac{\ln\frac{\delta}{2}}{n}$ for $0 < \widehat{\mu} < \frac{\gamma}{n}$; and $M_I(\widehat{\mu}, \underline{\mu}) = \frac{\ln\frac{\delta}{2}}{\gamma}$ for $\widehat{\mu} \ge \frac{\gamma}{n}$.
Define the upper confidence limit $\overline{\mu} \in [\widehat{\mu}, 1]$ such that $\overline{\mu} = 1$ for $\widehat{\mu} \ge \frac{\gamma}{\gamma + 1}$; $M_B(\widehat{\mu}, \overline{\mu}) = \frac{\ln\frac{\delta}{2}}{n}$ for $\widehat{\mu} < \frac{\gamma}{n}$; and $M_I\!\left( \frac{\widehat{\mu}\gamma}{\gamma - \widehat{\mu}},\; \overline{\mu} \right) = \frac{\ln\frac{\delta}{2}}{\gamma}$ with $\frac{\widehat{\mu}\gamma}{\gamma - \widehat{\mu}} < \overline{\mu} < 1$ for $\frac{\gamma}{\gamma + 1} > \widehat{\mu} \ge \frac{\gamma}{n}$. Then, $\Pr\{\underline{\mu} < \mu < \overline{\mu}\} \ge 1 - \delta$.

7 Conclusion

In this paper, we have established a general theory of truncated inverse sampling for estimating the mean value of a large class of random variables. We have applied this theory to common important variables such as binomial, Poisson, hypergeometric, and bounded variables. Rigorous methods have been derived for determining the thresholds and maximum sample sizes to ensure statistical accuracy. Interval estimation methods have also been developed.

A Proof of Theorem 1

The theorem can be shown by establishing Lemmas 1 to 4 as follows.

Lemma 1  Suppose $z \ge \frac{\gamma}{n}$. Then, $\Pr\{\widehat{\mu} \le z\} = \Pr\left\{ \sum_{i=1}^{m} X_i < \gamma \right\}$, where $m = \lceil \gamma/z \rceil - 1$.

Proof. By the assumption that $z \ge \frac{\gamma}{n}$ and the definition of the sampling scheme,
\[
\{\widehat{\mu} > z,\; \mathbf{k} < \gamma\} = \left\{ \frac{\mathbf{k}}{\mathbf{n}} > z,\; \mathbf{k} < \gamma \right\} \subseteq \left\{ \frac{\mathbf{k}}{\mathbf{n}} > \frac{\gamma}{n},\; \mathbf{k} < \gamma \right\} = \left\{ \frac{\mathbf{k}}{\mathbf{n}} > \frac{\gamma}{n},\; \mathbf{k} < \gamma,\; \mathbf{n} = n \right\} = \emptyset.
\]
Therefore,
\[
\{\widehat{\mu} > z\} = \{\widehat{\mu} > z,\; \mathbf{k} < \gamma\} \cup \{\widehat{\mu} > z,\; \mathbf{k} \ge \gamma\} = \{\widehat{\mu} > z,\; \mathbf{k} \ge \gamma\} = \left\{ \frac{\gamma}{\mathbf{n}} > z,\; \mathbf{k} \ge \gamma \right\}.
\]
To show the lemma, it remains to show that $\left\{ \frac{\gamma}{\mathbf{n}} > z,\; \mathbf{k} \ge \gamma \right\} = \left\{ \sum_{i=1}^{m} X_i \ge \gamma \right\}$. Since all $X_i$ are non-negative, we have
\[
\left\{ \frac{\gamma}{\mathbf{n}} > z,\; \mathbf{k} \ge \gamma \right\} = \{\mathbf{n} \le m,\; \mathbf{k} \ge \gamma\} \subseteq \left\{ \mathbf{n} \le m,\; \sum_{i=1}^{m} X_i \ge \gamma \right\} \subseteq \left\{ \sum_{i=1}^{m} X_i \ge \gamma \right\}.
\]
On the other hand, by the assumption that $z \ge \frac{\gamma}{n}$, we have $m = \lceil \gamma/z \rceil - 1 \le n - 1$. Hence, by the definition of the sampling scheme, we have
\[
\left\{ \sum_{i=1}^{m} X_i \ge \gamma \right\} \subseteq \left\{ \mathbf{n} \le m,\; \mathbf{k} \ge \gamma,\; \sum_{i=1}^{m} X_i \ge \gamma \right\} \subseteq \{\mathbf{n} \le m,\; \mathbf{k} \ge \gamma\}.
\]
It follows that $\left\{ \sum_{i=1}^{m} X_i \ge \gamma \right\} = \{\mathbf{n} \le m,\; \mathbf{k} \ge \gamma\} = \left\{ \frac{\gamma}{\mathbf{n}} > z,\; \mathbf{k} \ge \gamma \right\} = \{\widehat{\mu} > z\}$. This completes the proof of the lemma. □

Lemma 2  Suppose $z < \frac{\gamma}{n}$. Then, $\Pr\{\widehat{\mu} \le z\} = \Pr\left\{ \frac{\sum_{i=1}^{n} X_i}{n} \le z \right\}$.

Proof.
By the assumption that $z < \frac{\gamma}{n}$ and the definition of the sampling scheme,
\[
\{\widehat{\mu} \le z,\; \mathbf{k} \ge \gamma\} = \left\{ \frac{\gamma}{\mathbf{n}} \le z,\; \mathbf{k} \ge \gamma \right\} \subseteq \left\{ \frac{\gamma}{\mathbf{n}} < \frac{\gamma}{n},\; \mathbf{k} \ge \gamma \right\} = \{\mathbf{n} > n,\; \mathbf{k} \ge \gamma\} = \emptyset.
\]
Therefore,
\[
\{\widehat{\mu} \le z\} = \{\widehat{\mu} \le z,\; \mathbf{k} < \gamma\} = \left\{ \frac{\mathbf{k}}{\mathbf{n}} \le z,\; \mathbf{k} < \gamma \right\} = \left\{ \frac{\mathbf{k}}{\mathbf{n}} \le z,\; \mathbf{k} < \gamma,\; \mathbf{n} = n \right\} = \left\{ \frac{\sum_{i=1}^{n} X_i}{n} \le z,\; \sum_{i=1}^{n} X_i < \gamma,\; \mathbf{n} = n \right\} \subseteq \left\{ \frac{\sum_{i=1}^{n} X_i}{n} \le z \right\}.
\]
On the other hand, by the definition of the sampling scheme and the assumption that $z < \frac{\gamma}{n}$, we have
\[
\left\{ \frac{\sum_{i=1}^{n} X_i}{n} \le z \right\} \subseteq \left\{ \frac{\sum_{i=1}^{n} X_i}{n} \le z,\; \mathbf{n} = n \right\} = \left\{ \frac{\sum_{i=1}^{n} X_i}{n} \le z,\; \sum_{i=1}^{n} X_i < \gamma,\; \mathbf{n} = n \right\} = \{\widehat{\mu} \le z\}.
\]
It follows that $\{\widehat{\mu} \le z\} = \left\{ \frac{\sum_{i=1}^{n} X_i}{n} \le z \right\}$. This completes the proof of the lemma. □

Lemma 3  Suppose $z \ge \frac{\gamma}{n}$. Then, $\Pr\{\widehat{\mu} \ge z\} = \Pr\left\{ \sum_{i=1}^{m} X_i \ge \gamma \right\}$, where $m = \lfloor \gamma/z \rfloor$.

Proof. By the assumption that $z \ge \frac{\gamma}{n}$ and the definition of the sampling scheme,
\[
\{\widehat{\mu} \ge z,\; \mathbf{k} < \gamma\} = \left\{ \frac{\mathbf{k}}{\mathbf{n}} \ge z,\; \mathbf{k} < \gamma \right\} = \left\{ \frac{\mathbf{k}}{\mathbf{n}} \ge z,\; \mathbf{k} < \gamma,\; \mathbf{n} = n \right\} = \emptyset.
\]
Therefore,
\[
\{\widehat{\mu} \ge z\} = \{\widehat{\mu} \ge z,\; \mathbf{k} < \gamma\} \cup \{\widehat{\mu} \ge z,\; \mathbf{k} \ge \gamma\} = \{\widehat{\mu} \ge z,\; \mathbf{k} \ge \gamma\} = \left\{ \frac{\gamma}{\mathbf{n}} \ge z,\; \mathbf{k} \ge \gamma \right\}.
\]
To show the lemma, it remains to show that $\left\{ \frac{\gamma}{\mathbf{n}} \ge z,\; \mathbf{k} \ge \gamma \right\} = \left\{ \sum_{i=1}^{m} X_i \ge \gamma \right\}$. Since all $X_i$ are non-negative, we have
\[
\left\{ \frac{\gamma}{\mathbf{n}} \ge z,\; \mathbf{k} \ge \gamma \right\} = \{\mathbf{n} \le m,\; \mathbf{k} \ge \gamma\} \subseteq \left\{ \mathbf{n} \le m,\; \sum_{i=1}^{m} X_i \ge \gamma \right\} \subseteq \left\{ \sum_{i=1}^{m} X_i \ge \gamma \right\}.
\]
On the other hand, by the assumption that $z \ge \frac{\gamma}{n}$, we have $m = \lfloor \gamma/z \rfloor \le n$. By the definition of the sampling scheme and the fact that all $X_i$ are non-negative,
\[
\left\{ \sum_{i=1}^{m} X_i \ge \gamma \right\} \subseteq \left\{ \mathbf{n} \le m,\; \sum_{i=1}^{m} X_i \ge \gamma \right\} \subseteq \{\mathbf{n} \le m,\; \mathbf{k} \ge \gamma\}.
\]
Hence, $\left\{ \sum_{i=1}^{m} X_i \ge \gamma \right\} = \{\mathbf{n} \le m,\; \mathbf{k} \ge \gamma\} = \left\{ \frac{\gamma}{\mathbf{n}} \ge z,\; \mathbf{k} \ge \gamma \right\} = \{\widehat{\mu} \ge z\}$. This completes the proof of the lemma. □

Lemma 4  Suppose $z < \frac{\gamma}{n}$. Then, $\Pr\{\widehat{\mu} \ge z\} = \Pr\left\{ \frac{\sum_{i=1}^{n} X_i}{n} \ge z \right\}$.

Proof. By the assumption that $z < \frac{\gamma}{n}$ and the definition of the sampling scheme,
\[
\{\widehat{\mu} < z,\; \mathbf{k} \ge \gamma\} = \left\{ \frac{\gamma}{\mathbf{n}} < z,\; \mathbf{k} \ge \gamma \right\} \subseteq \left\{ \frac{\gamma}{\mathbf{n}} < \frac{\gamma}{n},\; \mathbf{k} \ge \gamma \right\} = \{\mathbf{n} > n,\; \mathbf{k} \ge \gamma\} = \emptyset.
\]
Therefore,
\[
\{\widehat{\mu} < z\} = \{\widehat{\mu} < z,\; \mathbf{k} < \gamma\} = \left\{ \frac{\mathbf{k}}{\mathbf{n}} < z,\; \mathbf{k} < \gamma \right\} = \left\{ \frac{\mathbf{k}}{\mathbf{n}} < z,\; \mathbf{k} < \gamma,\; \mathbf{n} = n \right\} = \left\{ \frac{\sum_{i=1}^{n} X_i}{n} < z,\; \sum_{i=1}^{n} X_i < \gamma,\; \mathbf{n} = n \right\} \subseteq \left\{ \frac{\sum_{i=1}^{n} X_i}{n} < z \right\}.
\]
On the other hand, by the definition of the sampling scheme and the assumption that $z < \frac{\gamma}{n}$,
\[
\left\{ \frac{\sum_{i=1}^{n} X_i}{n} < z \right\} \subseteq \left\{ \frac{\sum_{i=1}^{n} X_i}{n} < z,\; \mathbf{n} = n \right\} = \left\{ \frac{\sum_{i=1}^{n} X_i}{n} < z,\; \sum_{i=1}^{n} X_i < \gamma,\; \mathbf{n} = n \right\} = \{\widehat{\mu} < z\}.
\]
It follows that $\{\widehat{\mu} < z\} = \left\{ \frac{\sum_{i=1}^{n} X_i}{n} < z \right\}$, i.e., $\{\widehat{\mu} \ge z\} = \left\{ \frac{\sum_{i=1}^{n} X_i}{n} \ge z \right\}$. This completes the proof of the lemma. □

B Proof of Theorem 2

By the definition of the truncated inverse sampling scheme,
\[
\mathbb{E}[\mathbf{n}] = n \Pr\left\{ \sum_{i=1}^{n} X_i < \gamma \right\} + \sum_{m=1}^{n} m \Pr\left\{ \sum_{i=1}^{m-1} X_i < \gamma,\; \sum_{i=1}^{m} X_i \ge \gamma \right\} < n \Pr\left\{ \sum_{i=1}^{n} X_i < \gamma \right\} + \sum_{m=1}^{n} n \Pr\left\{ \sum_{i=1}^{m-1} X_i < \gamma,\; \sum_{i=1}^{m} X_i \ge \gamma \right\} = n.
\]
By the fact that $X_i$ is non-negative,
\[
\left( \bigcup_{m=n+1}^{\infty} \left\{ \sum_{i=1}^{m-1} X_i < \gamma,\; \sum_{i=1}^{m} X_i \ge \gamma \right\} \right) \cup \left\{ \sum_{i=1}^{\infty} X_i < \gamma \right\} = \left\{ \sum_{i=1}^{n} X_i < \gamma \right\}.
\]
Since $\mathbb{E}[X] = \mu$ is positive and the corresponding variance $\sigma^2$ is finite, we have, by Chebyshev's inequality,
\[
0 \le \Pr\left\{ \sum_{i=1}^{\infty} X_i < \gamma \right\} = \lim_{k \to \infty} \Pr\left\{ \sum_{i=1}^{k} X_i < \gamma \right\} = \lim_{k \to \infty} \Pr\left\{ \frac{\sum_{i=1}^{k} X_i}{k} - \mu < \frac{\gamma}{k} - \mu \right\} \le \lim_{k \to \infty} \Pr\left\{ \left| \frac{\sum_{i=1}^{k} X_i}{k} - \mu \right| > \mu - \frac{\gamma}{k} \right\} \le \lim_{k \to \infty} \frac{\sigma^2}{k \left( \mu - \frac{\gamma}{k} \right)^2} = 0.
\]
Hence,
\[
\mathbb{E}[\mathbf{n}] = \sum_{m=n+1}^{\infty} n \Pr\left\{ \sum_{i=1}^{m-1} X_i < \gamma,\; \sum_{i=1}^{m} X_i \ge \gamma \right\} + \sum_{m=1}^{n} m \Pr\left\{ \sum_{i=1}^{m-1} X_i < \gamma,\; \sum_{i=1}^{m} X_i \ge \gamma \right\} < \sum_{m=1}^{\infty} m \Pr\left\{ \sum_{i=1}^{m-1} X_i < \gamma,\; \sum_{i=1}^{m} X_i \ge \gamma \right\} = \mathbb{E}[\mathbf{m}],
\]
where $\mathbf{m}$ is the sample number of the classical inverse sampling scheme with the following stopping rule: sampling is continued until the sample sum is no less than $\gamma$. By the definition of classical inverse sampling, we have $\sum_{i=1}^{\mathbf{m} - 1} X_i < \gamma$. Applying Wald's equation, we have $\mathbb{E}\left[ \sum_{i=1}^{\mathbf{m} - 1} X_i \right] = \mathbb{E}[\mathbf{m} - 1]\, \mathbb{E}[X] < \gamma$, which implies that $\mathbb{E}[\mathbf{m}] < \frac{\gamma}{\mathbb{E}[X]} + 1 = \frac{\gamma}{\mu} + 1$.
Since $\mathbb{E}[\mathbf{n}]$ is less than both $n$ and $\mathbb{E}[\mathbf{m}]$ as shown above, we have $\mathbb{E}[\mathbf{n}] < \min\left\{ n,\; \frac{\gamma}{\mu} + 1 \right\}$. In the special case that $\gamma$ is a positive integer and $X$ is a Bernoulli random variable such that $\mathbb{E}[X] = p \in (0, 1)$, we have $\sum_{i=1}^{\mathbf{m}} X_i = \gamma$ and consequently, by Wald's equation, $\mathbb{E}\left[ \sum_{i=1}^{\mathbf{m}} X_i \right] = \mathbb{E}[\mathbf{m}]\, \mathbb{E}[X] = \gamma$, from which we get $\mathbb{E}[\mathbf{m}] = \frac{\gamma}{\mathbb{E}[X]} = \frac{\gamma}{p}$, and it follows that $\mathbb{E}[\mathbf{n}] < \min\left\{ n,\; \frac{\gamma}{p} \right\}$. This completes the proof of the theorem.

C Proof of Theorem 3

We need some preliminary results. The following lemma is a slight modification of Hoeffding [7].

Lemma 5  Let $X_1, \cdots, X_n$ be i.i.d. random variables bounded in $[0, 1]$ with common mean value $\mu \in (0, 1)$. Then,
\[
\Pr\left\{ \frac{\sum_{i=1}^{n} X_i}{n} \ge z \right\} \le \exp\left( n\, M_B(z, \mu) \right) \quad \text{for } \mu \le z \le 1.
\]
Similarly,
\[
\Pr\left\{ \frac{\sum_{i=1}^{n} X_i}{n} \le z \right\} \le \exp\left( n\, M_B(z, \mu) \right) \quad \text{for } 0 \le z \le \mu.
\]

Proof. For $z = \mu$, we have $\Pr\left\{ \frac{\sum_{i=1}^{n} X_i}{n} \ge z \right\} \le \exp\left( n\, M_B(z, \mu) \right) = 1$. For $\mu < z < 1$, it was shown by Hoeffding in [7] that $\Pr\left\{ \frac{\sum_{i=1}^{n} X_i}{n} \ge z \right\} \le \exp\left( n\, M_B(z, \mu) \right)$. For $z = 1$, we have $\Pr\left\{ \frac{\sum_{i=1}^{n} X_i}{n} \ge z \right\} = \prod_{i=1}^{n} \Pr\{X_i = 1\} \le \prod_{i=1}^{n} \mathbb{E}[X_i] = \mu^n = \exp\left( n\, M_B(1, \mu) \right)$. For $z = 0$, we have $\Pr\left\{ \frac{\sum_{i=1}^{n} X_i}{n} \le z \right\} = \prod_{i=1}^{n} \left( 1 - \Pr\{X_i \neq 0\} \right) \le \prod_{i=1}^{n} \left( 1 - \mathbb{E}[X_i] \right) = (1 - \mu)^n = \exp\left( n\, M_B(0, \mu) \right)$. For $0 < z < \mu$, it was shown by Hoeffding in [7] that $\Pr\left\{ \frac{\sum_{i=1}^{n} X_i}{n} \le z \right\} \le \exp\left( n\, M_B(z, \mu) \right)$. For $z = \mu$, we have $\Pr\left\{ \frac{\sum_{i=1}^{n} X_i}{n} \le z \right\} \le \exp\left( n\, M_B(z, \mu) \right) = 1$. □

Lemma 6  Let $0 < \varepsilon < 1$. Then, $M_I(\mu + \varepsilon\mu, \mu)$ is monotonically decreasing with respect to $\mu \in \left( 0, \frac{1}{1+\varepsilon} \right)$. Similarly, $M_I(\mu - \varepsilon\mu, \mu)$ is monotonically decreasing with respect to $\mu \in (0, 1)$.

Proof. Note that
\[
\frac{\partial M_I(\mu + \varepsilon\mu, \mu)}{\partial \mu} = -\frac{1}{\mu^2 (1 + \varepsilon)} \ln\left[ \frac{1 - \mu}{1 - \mu(1 + \varepsilon)} \right] + \frac{\varepsilon}{\mu (1 - \mu)(1 + \varepsilon)} \le 0
\]
if $\ln\left[ \frac{1 - \mu}{1 - \mu(1 + \varepsilon)} \right] \ge \frac{\varepsilon\mu}{1 - \mu}$, i.e.,
\[
\ln\left( 1 - \frac{\varepsilon\mu}{1 - \mu} \right) \le -\frac{\varepsilon\mu}{1 - \mu}. \tag{10}
\]
As a consequence of $0 < \mu < \frac{1}{1+\varepsilon}$, we have $0 < \frac{\varepsilon\mu}{1 - \mu} < 1$. Since $\ln(1 - x) < -x$ for any $x \in (0, 1)$, it follows that (10) holds and thus $M_I(\mu + \varepsilon\mu, \mu)$ is monotonically decreasing with respect to $\mu \in \left( 0, \frac{1}{1+\varepsilon} \right)$.

Similarly, to show that $M_I(\mu - \varepsilon\mu, \mu)$ is monotonically decreasing with respect to $\mu$, note that
\[
\frac{\partial M_I(\mu - \varepsilon\mu, \mu)}{\partial \mu} = -\frac{1}{\mu^2 (1 - \varepsilon)} \ln\left[ \frac{1 - \mu}{1 - \mu(1 - \varepsilon)} \right] - \frac{\varepsilon}{\mu (1 - \mu)(1 - \varepsilon)} \le 0
\]
if $\ln\left[ \frac{1 - \mu}{1 - \mu(1 - \varepsilon)} \right] \ge -\frac{\varepsilon\mu}{1 - \mu}$, i.e.,
\[
\ln\left( 1 + \frac{\varepsilon\mu}{1 - \mu} \right) \le \frac{\varepsilon\mu}{1 - \mu}. \tag{11}
\]
Since $\frac{\varepsilon\mu}{1 - \mu} > 0$ and $\ln(1 + x) < x$ for any $x \in (0, \infty)$, we have that (11) holds and thus $M_I(\mu - \varepsilon\mu, \mu)$ is monotonically decreasing with respect to $\mu \in (0, 1)$. □

Lemma 7  $M_I(\mu + \varepsilon\mu, \mu) > M_I(\mu - \varepsilon\mu, \mu)$ for $\mu \in \left( 0, \frac{1}{2} \right)$ and $0 < \varepsilon < 1$.

Proof. Direct computation shows that
\[
\frac{\partial M_I(\mu + \varepsilon\mu, \mu)}{\partial \varepsilon} = -\frac{1}{(1 + \varepsilon)^2 \mu} \ln\frac{1 - \mu}{1 - (1 + \varepsilon)\mu}, \qquad \frac{\partial M_I(\mu - \varepsilon\mu, \mu)}{\partial \varepsilon} = \frac{1}{(1 - \varepsilon)^2 \mu} \ln\frac{1 - \mu}{1 - (1 - \varepsilon)\mu}.
\]
Since $\ln\left[ \frac{1 - \mu}{1 - (1 + \varepsilon)\mu} \right] < \frac{\varepsilon\mu}{1 - (1 + \varepsilon)\mu}$ and $\ln\left[ \frac{1 - \mu}{1 - (1 - \varepsilon)\mu} \right] < -\frac{\varepsilon\mu}{1 - (1 - \varepsilon)\mu}$, we have
\[
\frac{\partial M_I(\mu + \varepsilon\mu, \mu)}{\partial \varepsilon} - \frac{\partial M_I(\mu - \varepsilon\mu, \mu)}{\partial \varepsilon} > -\frac{1}{(1 + \varepsilon)^2 \mu} \cdot \frac{\varepsilon\mu}{1 - (1 + \varepsilon)\mu} + \frac{1}{(1 - \varepsilon)^2 \mu} \cdot \frac{\varepsilon\mu}{1 - (1 - \varepsilon)\mu} > 0
\]
if $(1 + \varepsilon)^2 [1 - (1 + \varepsilon)\mu] - (1 - \varepsilon)^2 [1 - (1 - \varepsilon)\mu] > 0$, or equivalently, $4\varepsilon - 2\varepsilon(3 + \varepsilon^2)\mu > 0$, which is true because $4\varepsilon - 2\varepsilon(3 + \varepsilon^2)\mu > 4\varepsilon - 2\varepsilon(3 + 1) \times \frac{1}{2} = 0$ as a result of $0 < \varepsilon < 1$ and $0 < \mu < \frac{1}{2}$. The lemma immediately follows from the fact that $\frac{\partial M_I(\mu + \varepsilon\mu, \mu)}{\partial \varepsilon} > \frac{\partial M_I(\mu - \varepsilon\mu, \mu)}{\partial \varepsilon}$ for $0 < \varepsilon < 1$ and $0 < \mu < \frac{1}{2}$, and the fact that $M_I(\mu + \varepsilon\mu, \mu)$ is equal to $M_I(\mu - \varepsilon\mu, \mu)$ for $\varepsilon = 0$. □

Lemma 8  Let $0 < \varepsilon < \frac{1}{2}$. Then, $M_I(\mu + \varepsilon, \mu)$ is monotonically increasing with respect to $\mu \in \left( 0, \frac{1}{2} - \varepsilon \right)$. Similarly, $M_I(\mu - \varepsilon, \mu)$ is monotonically increasing with respect to $\mu \in \left( \varepsilon, \frac{1 + \varepsilon}{2} \right)$.

Proof.
It can be shown that
\[
\frac{\partial M_I(\mu + \varepsilon, \mu)}{\partial \mu} = -\frac{1}{(\mu + \varepsilon)^2} \ln\frac{1 - \mu}{1 - \mu - \varepsilon} + \frac{\varepsilon}{\mu (\mu + \varepsilon)(1 - \mu)} > 0
\]
if $\ln\frac{1 - \mu}{1 - \mu - \varepsilon} < \frac{\varepsilon(\mu + \varepsilon)}{\mu (1 - \mu)}$. Since $\ln\frac{1 - \mu}{1 - \mu - \varepsilon} < \frac{\varepsilon}{1 - \mu - \varepsilon}$, it suffices to have $\frac{\varepsilon}{1 - \mu - \varepsilon} < \frac{\varepsilon(\mu + \varepsilon)}{\mu (1 - \mu)}$, or equivalently, $\mu(1 - \mu) < (1 - \mu - \varepsilon)(\mu + \varepsilon) = \mu(1 - \mu) - \varepsilon\mu + (1 - \mu)\varepsilon - \varepsilon^2$, which can be ensured by $0 < \mu < \frac{1}{2} - \varepsilon$. This proves the first statement of the lemma.

Similarly,
\[
\frac{\partial M_I(\mu - \varepsilon, \mu)}{\partial \mu} = -\frac{1}{(\mu - \varepsilon)^2} \ln\frac{1 - \mu}{1 - \mu + \varepsilon} - \frac{\varepsilon}{\mu (\mu - \varepsilon)(1 - \mu)} > 0
\]
if $\ln\frac{1 - \mu}{1 - \mu + \varepsilon} < -\frac{\varepsilon(\mu - \varepsilon)}{\mu (1 - \mu)}$. Since $\ln\frac{1 - \mu}{1 - \mu + \varepsilon} < -\frac{\varepsilon}{1 - \mu + \varepsilon}$, to ensure $\frac{\partial M_I(\mu - \varepsilon, \mu)}{\partial \mu} > 0$, it suffices to have $-\frac{\varepsilon}{1 - \mu + \varepsilon} < -\frac{\varepsilon(\mu - \varepsilon)}{\mu (1 - \mu)}$, or equivalently, $\mu(1 - \mu) > (1 - \mu + \varepsilon)(\mu - \varepsilon) = \mu(1 - \mu) + \varepsilon\mu - \varepsilon(1 - \mu) - \varepsilon^2$, which can be guaranteed by $\varepsilon < \mu < \frac{1 + \varepsilon}{2}$. This proves the second statement of the lemma. □

Lemma 9  $M_I(\mu + \varepsilon, \mu) > M_I(\mu - \varepsilon, \mu)$ for $0 < \varepsilon < \mu < \frac{1}{2}$.

Proof. It can be verified that
\[
\frac{\partial M_I(\mu + \varepsilon, \mu)}{\partial \varepsilon} = -\frac{1}{(\mu + \varepsilon)^2} \ln\frac{1 - \mu}{1 - \mu - \varepsilon}, \qquad \frac{\partial M_I(\mu - \varepsilon, \mu)}{\partial \varepsilon} = \frac{1}{(\mu - \varepsilon)^2} \ln\frac{1 - \mu}{1 - \mu + \varepsilon}.
\]
Since $\ln\frac{1 - \mu}{1 - \mu - \varepsilon} < \frac{\varepsilon}{1 - \mu - \varepsilon}$ and $\ln\frac{1 - \mu}{1 - \mu + \varepsilon} < -\frac{\varepsilon}{1 - \mu + \varepsilon}$, to ensure $\frac{\partial M_I(\mu + \varepsilon, \mu)}{\partial \varepsilon} > \frac{\partial M_I(\mu - \varepsilon, \mu)}{\partial \varepsilon}$, it suffices to have
\[
-\frac{1}{(\mu + \varepsilon)^2} \cdot \frac{\varepsilon}{1 - \mu - \varepsilon} > -\frac{1}{(\mu - \varepsilon)^2} \cdot \frac{\varepsilon}{1 - \mu + \varepsilon},
\]
or equivalently, $2\mu - 3\mu^2 > \varepsilon^2$, which is true because $2\mu - 3\mu^2 - \varepsilon^2 > 2\mu - 3\mu^2 - \mu^2 = 2\mu(1 - 2\mu) > 0$ as a result of $0 < \varepsilon < \mu < \frac{1}{2}$. Therefore, the lemma is true since $M_I(\mu + \varepsilon, \mu) = M_I(\mu - \varepsilon, \mu)$ for $\varepsilon = 0$ and $\frac{\partial M_I(\mu + \varepsilon, \mu)}{\partial \varepsilon} > \frac{\partial M_I(\mu - \varepsilon, \mu)}{\partial \varepsilon}$ for $0 < \varepsilon < \mu < \frac{1}{2}$. This completes the proof of the lemma. □

Lemma 10  Let $0 < \varepsilon < \frac{1}{2}$. Then, $M_B(\mu + \varepsilon, \mu)$ is monotonically increasing with respect to $\mu \in \left( 0, \frac{1}{2} - \varepsilon \right)$. Similarly, $M_B(\mu - \varepsilon, \mu)$ is monotonically increasing with respect to $\mu \in \left( \varepsilon, \frac{1}{2} \right)$.

Proof.
Our computation shows that
\[
\frac{\partial M_B(\mu + \varepsilon, \mu)}{\partial \mu} = \ln\frac{\mu (1 - \mu - \varepsilon)}{(\mu + \varepsilon)(1 - \mu)} + \frac{\varepsilon}{\mu (1 - \mu)}, \qquad \frac{\partial^2 M_B(\mu + \varepsilon, \mu)}{\partial \mu\, \partial \varepsilon} = \frac{1}{\mu (1 - \mu)} - \frac{1}{(\mu + \varepsilon)(1 - \mu - \varepsilon)}.
\]
Since $\frac{\partial M_B(\mu + \varepsilon, \mu)}{\partial \mu} = 0$ for $\varepsilon = 0$ and $\frac{\partial^2 M_B(\mu + \varepsilon, \mu)}{\partial \mu\, \partial \varepsilon} > 0$ for $\varepsilon < \frac{1}{2} - \mu$, it must be true that $\frac{\partial M_B(\mu + \varepsilon, \mu)}{\partial \mu} > 0$ for $\mu \in \left( 0, \frac{1}{2} - \varepsilon \right)$. This proves the first statement of the lemma. Similarly, we can show that
\[
\frac{\partial M_B(\mu - \varepsilon, \mu)}{\partial \mu} = \ln\frac{\mu (1 - \mu + \varepsilon)}{(\mu - \varepsilon)(1 - \mu)} - \frac{\varepsilon}{\mu (1 - \mu)} \quad \text{and} \quad \frac{\partial^2 M_B(\mu - \varepsilon, \mu)}{\partial \mu\, \partial \varepsilon} = \frac{1}{(\mu - \varepsilon)(1 - \mu + \varepsilon)} - \frac{1}{\mu (1 - \mu)}.
\]
Since $\frac{\partial M_B(\mu - \varepsilon, \mu)}{\partial \mu} = 0$ for $\varepsilon = 0$ and $\frac{\partial^2 M_B(\mu - \varepsilon, \mu)}{\partial \mu\, \partial \varepsilon} > 0$ for $0 < \varepsilon < \mu < \frac{1}{2}$, it must be true that $\frac{\partial M_B(\mu - \varepsilon, \mu)}{\partial \mu} > 0$ for $0 < \varepsilon < \mu < \frac{1}{2}$. This proves the second statement of the lemma. □

Lemma 11  Let $0 < \varepsilon < \frac{1}{2}$. Then, $M_B(\mu + \varepsilon, \mu) > M_B(\mu - \varepsilon, \mu)$ for $\mu \in \left( \varepsilon, \frac{1}{2} \right)$.

Proof. Straightforward computation shows that
\[
\frac{\partial M_B(\mu + \varepsilon, \mu)}{\partial \varepsilon} = \ln\left( \frac{\mu}{1 - \mu} \cdot \frac{1 - \mu - \varepsilon}{\mu + \varepsilon} \right), \qquad \frac{\partial M_B(\mu - \varepsilon, \mu)}{\partial \varepsilon} = -\ln\left( \frac{\mu}{1 - \mu} \cdot \frac{1 - \mu + \varepsilon}{\mu - \varepsilon} \right).
\]
Thus,
\[
\frac{\partial M_B(\mu + \varepsilon, \mu)}{\partial \varepsilon} - \frac{\partial M_B(\mu - \varepsilon, \mu)}{\partial \varepsilon} = \ln\left( \frac{\mu^2}{(1 - \mu)^2} \cdot \frac{(1 - \mu)^2 - \varepsilon^2}{\mu^2 - \varepsilon^2} \right) > 0 \quad \text{if } \varepsilon < \mu < \frac{1}{2}.
\]
By virtue of this result and the fact that $M_B(\mu + \varepsilon, \mu) = M_B(\mu - \varepsilon, \mu)$ for $\varepsilon = 0$, we have $M_B(\mu + \varepsilon, \mu) > M_B(\mu - \varepsilon, \mu)$ for $\varepsilon < \mu < \frac{1}{2}$. This proves the lemma. □

Lemma 12  Let $0 < \varepsilon < 1$. Then, $M_B(\mu + \varepsilon\mu, \mu)$ is monotonically decreasing with respect to $\mu \in \left( 0, \frac{1}{1+\varepsilon} \right)$. Similarly, $M_B(\mu - \varepsilon\mu, \mu)$ is monotonically decreasing with respect to $\mu \in (0, 1)$.

Proof. The first statement of the lemma is true because
\[
\frac{\partial M_B(\mu + \varepsilon\mu, \mu)}{\partial \mu} = (1 + \varepsilon) \ln\left[ 1 - \frac{\varepsilon}{(1 + \varepsilon)(1 - \mu)} \right] + \frac{\varepsilon}{1 - \mu} < (1 + \varepsilon) \times \frac{-\varepsilon}{(1 + \varepsilon)(1 - \mu)} + \frac{\varepsilon}{1 - \mu} = 0
\]
for $0 < \mu < \frac{1}{1+\varepsilon}$. Similarly, the second statement of the lemma is true because
\[
\frac{\partial M_B(\mu - \varepsilon\mu, \mu)}{\partial \mu} = (1 - \varepsilon) \ln\left[ 1 + \frac{\varepsilon}{(1 - \varepsilon)(1 - \mu)} \right] - \frac{\varepsilon}{1 - \mu} < (1 - \varepsilon) \times \frac{\varepsilon}{(1 - \varepsilon)(1 - \mu)} - \frac{\varepsilon}{1 - \mu} = 0
\]
for $0 < \mu < 1$. □

Lemma 13  Let $0 < \varepsilon < 1$.
Then, $M_B(\mu+\varepsilon\mu, \mu) > M_B(\mu-\varepsilon\mu, \mu)$ for $\mu \in \left(0, \frac{1}{2}\right)$.

Proof. It can be shown by tedious computation that
$$\frac{\partial M_B(\mu+\varepsilon\mu, \mu)}{\partial \varepsilon} = \mu \ln \frac{1-\mu-\varepsilon\mu}{(1+\varepsilon)(1-\mu)}, \qquad \frac{\partial M_B(\mu-\varepsilon\mu, \mu)}{\partial \varepsilon} = -\mu \ln \frac{1-\mu+\varepsilon\mu}{(1-\varepsilon)(1-\mu)}.$$
Hence,
$$\frac{\partial M_B(\mu+\varepsilon\mu, \mu)}{\partial \varepsilon} - \frac{\partial M_B(\mu-\varepsilon\mu, \mu)}{\partial \varepsilon} = \mu \ln \left( \frac{1-\mu-\varepsilon\mu}{(1+\varepsilon)(1-\mu)} \, \frac{1-\mu+\varepsilon\mu}{(1-\varepsilon)(1-\mu)} \right).$$
Since
$$\frac{1-\mu-\varepsilon\mu}{(1+\varepsilon)(1-\mu)} \, \frac{1-\mu+\varepsilon\mu}{(1-\varepsilon)(1-\mu)} = \frac{(1-\mu)^2 - \varepsilon^2\mu^2}{(1-\mu)^2 - \varepsilon^2(1-\mu)^2} > 1$$
for $0 < \mu < \frac{1}{2}$, we have $\frac{\partial M_B(\mu+\varepsilon\mu,\mu)}{\partial \varepsilon} - \frac{\partial M_B(\mu-\varepsilon\mu,\mu)}{\partial \varepsilon} > 0$ for $0 < \mu < \frac{1}{2}$. Noting that $M_B(\mu+\varepsilon\mu, \mu) - M_B(\mu-\varepsilon\mu, \mu) = 0$ for $\varepsilon = 0$, we have $M_B(\mu+\varepsilon\mu, \mu) - M_B(\mu-\varepsilon\mu, \mu) > 0$ for $0 < \mu < \frac{1}{2}$. This completes the proof of the lemma. ✷

Lemma 14  $\Pr\{\hat{p} \ge (1+\varepsilon_r)p\} < \frac{\delta}{2}$ for any $p \in (p^\star, 1)$.

Proof. To prove the lemma, we shall consider the following three cases: Case (i): $(1+\varepsilon_r)p > 1$; Case (ii): $\frac{\gamma}{n} \le (1+\varepsilon_r)p \le 1$; Case (iii): $(1+\varepsilon_r)p < \frac{\gamma}{n}$.

For Case (i), it is obvious that $\Pr\{\hat{p} \ge (1+\varepsilon_r)p\} = 0 < \frac{\delta}{2}$.

For Case (ii), applying Theorem 1 with $z = (1+\varepsilon_r)p \ge \frac{\gamma}{n}$, we have
$$\Pr\{\hat{p} \ge (1+\varepsilon_r)p\} = \Pr\left\{ \sum_{i=1}^{\lfloor \gamma/z \rfloor} X_i \ge \gamma \right\} \le \exp\left( \lfloor \gamma/z \rfloor \, M_B\left( \frac{\gamma}{\lfloor \gamma/z \rfloor}, p \right) \right) \tag{12}$$
$$= \exp\left( \gamma \, M_I\left( \frac{\gamma}{\lfloor \gamma/z \rfloor}, p \right) \right) \le \exp(\gamma \, M_I(z, p)) \tag{13}$$
$$< \exp(\gamma \, M_I(p^\star + \varepsilon_r p^\star, p^\star)) \tag{14}$$
$$< \frac{\delta}{2} \tag{15}$$
where (12) follows from Lemma 5, (13) is due to the fact that $M_I(z, p)$ is monotonically decreasing with respect to $z \in (p, 1)$, (14) follows from Lemma 6, and (15) follows from the assumption about $\gamma$.

For Case (iii), applying Theorem 1 with $z = (1+\varepsilon_r)p < \frac{\gamma}{n}$, we have
$$\Pr\{\hat{p} \ge (1+\varepsilon_r)p\} = \Pr\left\{ \sum_{i=1}^{n} X_i \ge nz \right\} \le \exp(n \, M_B(p + \varepsilon_r p, p)) \tag{16}$$
$$< \exp(n \, M_B(p^\star + \varepsilon_r p^\star, p^\star)) \tag{17}$$
$$= \exp(n \, M_B(p^\star + \varepsilon_a, p^\star)) < \frac{\delta}{2} \tag{18}$$
where (16) follows from Lemma 5, (17) follows from the first statement of Lemma 12, and (18) follows from the assumption about $n$.
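The monotonicity and comparison properties established in Lemmas 8 through 13 can be spot-checked numerically. The following Python sketch is illustrative only: the helper names `m_b` and `m_i` are ours, for $M_B(z,p) = z\ln(p/z) + (1-z)\ln\frac{1-p}{1-z}$ and $M_I(z,p) = M_B(z,p)/z$, the forms consistent with the identities used in the derivations above.

```python
import math

def m_b(z, p):
    """M_B(z, p) = z*ln(p/z) + (1 - z)*ln((1 - p)/(1 - z)) for 0 < z < 1."""
    return z * math.log(p / z) + (1 - z) * math.log((1 - p) / (1 - z))

def m_i(z, p):
    """M_I(z, p) = M_B(z, p) / z."""
    return m_b(z, p) / z

# Lemma 8: M_I(mu + eps, mu) is increasing in mu on (0, 1/2 - eps); here eps = 0.05.
assert m_i(0.1 + 0.05, 0.1) < m_i(0.2 + 0.05, 0.2) < m_i(0.3 + 0.05, 0.3)

# Lemma 9:  M_I(mu + eps, mu) > M_I(mu - eps, mu) for 0 < eps < mu < 1/2.
# Lemma 11: M_B(mu + eps, mu) > M_B(mu - eps, mu) for eps < mu < 1/2.
# Lemma 13: M_B(mu*(1 + eps), mu) > M_B(mu*(1 - eps), mu) for 0 < mu < 1/2.
for mu in (0.1, 0.25, 0.4):
    for eps in (0.01, 0.05):
        assert m_i(mu + eps, mu) > m_i(mu - eps, mu)
        assert m_b(mu + eps, mu) > m_b(mu - eps, mu)
        assert m_b(mu * (1 + eps), mu) > m_b(mu * (1 - eps), mu)

# Lemma 12: M_B(mu*(1 + eps), mu) is decreasing in mu on (0, 1/(1 + eps)).
assert m_b(0.1 * 1.05, 0.1) > m_b(0.2 * 1.05, 0.2) > m_b(0.4 * 1.05, 0.4)
print("all lemma inequalities hold on the sampled grid")
```

Such a grid check is of course no substitute for the proofs; it merely guards against sign or transcription errors in the formulas.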
In summary, we have shown $\Pr\{\hat{p} \ge (1+\varepsilon_r)p\} < \frac{\delta}{2}$ for all cases. This completes the proof of the lemma. ✷

Lemma 15  $\Pr\{\hat{p} \le (1-\varepsilon_r)p\} < \frac{\delta}{2}$ for any $p \in (p^\star, 1)$.

Proof. To prove the lemma, we shall consider the following two cases: Case (i): $(1-\varepsilon_r)p \ge \frac{\gamma}{n}$; Case (ii): $(1-\varepsilon_r)p < \frac{\gamma}{n}$.

For Case (i), applying Theorem 1 with $z = (1-\varepsilon_r)p \ge \frac{\gamma}{n}$, we have
$$\Pr\{\hat{p} \le (1-\varepsilon_r)p\} = \Pr\left\{ \sum_{i=1}^{\lceil \gamma/z \rceil - 1} X_i < \gamma \right\} = \Pr\left\{ \sum_{i=1}^{\lceil \gamma/z \rceil - 1} X_i \le \gamma - 1 \right\} \le \Pr\left\{ \sum_{i=1}^{\lceil \gamma/z \rceil} X_i \le \gamma \right\} \le \exp\left( \lceil \gamma/z \rceil \, M_B\left( \frac{\gamma}{\lceil \gamma/z \rceil}, p \right) \right) \tag{19}$$
$$= \exp\left( \gamma \, M_I\left( \frac{\gamma}{\lceil \gamma/z \rceil}, p \right) \right) \le \exp(\gamma \, M_I(z, p)) \tag{20}$$
$$< \exp(\gamma \, M_I(p^\star - \varepsilon_r p^\star, p^\star)) \tag{21}$$
$$< \exp(\gamma \, M_I(p^\star + \varepsilon_r p^\star, p^\star)) \tag{22}$$
$$< \frac{\delta}{2},$$
where (19) follows from Lemma 5, (20) is due to the fact that $M_I(z, p)$ is monotonically increasing with respect to $z \in (0, p)$, (21) follows from Lemma 6, and (22) follows from Lemma 7.

For Case (ii), applying Theorem 1 with $z = (1-\varepsilon_r)p < \frac{\gamma}{n}$, we have
$$\Pr\{\hat{p} \le (1-\varepsilon_r)p\} \le \exp(n \, M_B(p - \varepsilon_r p, p)) \tag{23}$$
$$< \exp(n \, M_B(p^\star - \varepsilon_r p^\star, p^\star)) \tag{24}$$
$$< \exp(n \, M_B(p^\star + \varepsilon_r p^\star, p^\star)) \tag{25}$$
$$= \exp(n \, M_B(p^\star + \varepsilon_a, p^\star)) < \frac{\delta}{2}$$
where (23) follows from Lemma 5, (24) follows from the second statement of Lemma 12, and (25) follows from Lemma 13.

In summary, we have shown $\Pr\{\hat{p} \le (1-\varepsilon_r)p\} < \frac{\delta}{2}$ for both cases. The lemma is thus proved. ✷

Lemma 16  $\Pr\{\hat{p} \ge p + \varepsilon_a\} < \frac{\delta}{2}$ for any $p \in (0, p^\star]$.

Proof. To prove the lemma, we shall consider the following two cases: Case (i): $p + \varepsilon_a \ge \frac{\gamma}{n}$; Case (ii): $p + \varepsilon_a < \frac{\gamma}{n}$.

For Case (i), applying Theorem 1 with $z = p + \varepsilon_a \ge \frac{\gamma}{n}$, we have
$$\Pr\{\hat{p} \ge p + \varepsilon_a\} = \Pr\left\{ \sum_{i=1}^{\lfloor \gamma/z \rfloor} X_i \ge \gamma \right\} \le \exp\left( \lfloor \gamma/z \rfloor \, M_B\left( \frac{\gamma}{\lfloor \gamma/z \rfloor}, p \right) \right) = \exp\left( \gamma \, M_I\left( \frac{\gamma}{\lfloor \gamma/z \rfloor}, p \right) \right) \le \exp(\gamma \, M_I(z, p)) < \exp(\gamma \, M_I(p^\star + \varepsilon_r p^\star, p^\star)) \tag{26}$$
$$< \frac{\delta}{2},$$
where (26) follows from Lemma 8.
For Case (ii), applying Theorem 1 with $z = p + \varepsilon_a < \frac{\gamma}{n}$, we have
$$\Pr\{\hat{p} \ge p + \varepsilon_a\} = \Pr\left\{ \sum_{i=1}^{n} X_i \ge nz \right\} \le \exp(n \, M_B(p + \varepsilon_a, p)) \tag{27}$$
$$\le \exp(n \, M_B(p^\star + \varepsilon_a, p^\star)) \tag{28}$$
$$< \frac{\delta}{2}$$
where (27) follows from Lemma 5, and (28) follows from the first statement of Lemma 10.

In summary, we have shown $\Pr\{\hat{p} \ge p + \varepsilon_a\} < \frac{\delta}{2}$ for both cases. The lemma is thus proved. ✷

Lemma 17  $\Pr\{\hat{p} \le p - \varepsilon_a\} < \frac{\delta}{2}$ for any $p \in (0, p^\star]$.

Proof. To prove the lemma, we shall consider the following three cases: Case (i): $p < \varepsilon_a$; Case (ii): $p - \varepsilon_a \ge \frac{\gamma}{n}$; Case (iii): $0 \le p - \varepsilon_a < \frac{\gamma}{n}$.

For Case (i), it is obvious that $\Pr\{\hat{p} \le p - \varepsilon_a\} = 0 < \frac{\delta}{2}$.

For Case (ii), applying Theorem 1 with $z = p - \varepsilon_a \ge \frac{\gamma}{n}$, we have
$$\Pr\{\hat{p} \le p - \varepsilon_a\} = \Pr\left\{ \sum_{i=1}^{\lceil \gamma/z \rceil - 1} X_i < \gamma \right\} \le \Pr\left\{ \sum_{i=1}^{\lceil \gamma/z \rceil} X_i \le \gamma \right\} \le \exp\left( \lceil \gamma/z \rceil \, M_B\left( \frac{\gamma}{\lceil \gamma/z \rceil}, p \right) \right) = \exp\left( \gamma \, M_I\left( \frac{\gamma}{\lceil \gamma/z \rceil}, p \right) \right) \le \exp(\gamma \, M_I(z, p)) = \exp(\gamma \, M_I(p - \varepsilon_a, p)) \le \exp(\gamma \, M_I(p^\star - \varepsilon_a, p^\star)) \tag{29}$$
$$\le \exp(\gamma \, M_I(p^\star + \varepsilon_a, p^\star)) \tag{30}$$
$$< \frac{\delta}{2}$$
where (29) follows from the second statement of Lemma 8, and (30) follows from Lemma 9.

For Case (iii), applying Theorem 1 with $z = p - \varepsilon_a < \frac{\gamma}{n}$, we have
$$\Pr\{\hat{p} \le p - \varepsilon_a\} = \Pr\left\{ \sum_{i=1}^{n} X_i \le nz \right\} \le \exp(n \, M_B(p - \varepsilon_a, p)) \tag{31}$$
$$\le \exp(n \, M_B(p^\star - \varepsilon_a, p^\star)) \tag{32}$$
$$\le \exp(n \, M_B(p^\star + \varepsilon_a, p^\star)) \tag{33}$$
$$< \frac{\delta}{2}$$
where (31) follows from Lemma 5, (32) follows from the second statement of Lemma 10, and (33) follows from Lemma 11.

In summary, we have shown $\Pr\{\hat{p} \le p - \varepsilon_a\} < \frac{\delta}{2}$ for all cases. The lemma is thus proved. ✷

Now we are in a position to prove Theorem 3. To show $\Pr\{|\hat{p} - p| < \varepsilon_a \text{ or } |\hat{p} - p| < \varepsilon_r p\} > 1 - \delta$, it suffices to show $\Pr\{|\hat{p} - p| \ge \varepsilon_a, \, |\hat{p} - p| \ge \varepsilon_r p\} < \delta$ for $0 < p < 1$.
For $p \in (p^\star, 1)$, we have
$$\Pr\{|\hat{p} - p| \ge \varepsilon_a, \, |\hat{p} - p| \ge \varepsilon_r p\} = \Pr\{|\hat{p} - p| \ge \varepsilon_r p\} = \Pr\{\hat{p} \ge (1+\varepsilon_r)p\} + \Pr\{\hat{p} \le (1-\varepsilon_r)p\} < \frac{\delta}{2} + \frac{\delta}{2} = \delta \tag{34}$$
where (34) follows from Lemmas 14 and 15. Similarly, for $p \in (0, p^\star]$, we have
$$\Pr\{|\hat{p} - p| \ge \varepsilon_a, \, |\hat{p} - p| \ge \varepsilon_r p\} = \Pr\{|\hat{p} - p| \ge \varepsilon_a\} = \Pr\{\hat{p} \ge p + \varepsilon_a\} + \Pr\{\hat{p} \le p - \varepsilon_a\} < \frac{\delta}{2} + \frac{\delta}{2} = \delta \tag{35}$$
where (35) follows from Lemmas 16 and 17. This completes the proof of Theorem 3.

D  Proof of Theorem 4

Lemma 18  Let $I$ denote the support of $\hat{p}$. Suppose the intersection between the open interval $(p', p'')$ and the set $I_L$ is empty. Then, $\{\vartheta \in I : p \le L(\vartheta)\}$ is fixed with respect to $p \in (p', p'')$.

Proof. Let $p^*$ and $p^\diamond$ be two distinct real numbers included in the interval $(p', p'')$. To show the lemma, it suffices to show that $\{\vartheta \in I : p^* \le L(\vartheta)\} = \{\vartheta \in I : p^\diamond \le L(\vartheta)\}$. First, we shall show that $\{\vartheta \in I : p^* \le L(\vartheta)\} \subseteq \{\vartheta \in I : p^\diamond \le L(\vartheta)\}$. To this end, we let $\theta \in \{\vartheta \in I : p^* \le L(\vartheta)\}$ and proceed to show $\theta \in \{\vartheta \in I : p^\diamond \le L(\vartheta)\}$. Since $\theta \in I$ and $p^* \le L(\theta)$, it must be true that $\theta \in I$ and $p^\diamond \le L(\theta)$. If this is not the case, then we have $p'' > p^\diamond > L(\theta) \ge p^* > p'$. Consequently, $L(\theta)$ is included in both the interval $(p', p'')$ and the set $I_L$. This contradicts the assumption of the lemma. Hence, we have shown $\theta \in \{\vartheta \in I : p^\diamond \le L(\vartheta)\}$ and accordingly $\{\vartheta \in I : p^* \le L(\vartheta)\} \subseteq \{\vartheta \in I : p^\diamond \le L(\vartheta)\}$. Second, by a similar argument, we can show $\{\vartheta \in I : p^\diamond \le L(\vartheta)\} \subseteq \{\vartheta \in I : p^* \le L(\vartheta)\}$. It follows that $\{\vartheta \in I : p^* \le L(\vartheta)\} = \{\vartheta \in I : p^\diamond \le L(\vartheta)\}$. Finally, the proof of the lemma is completed by noting that the above argument holds for arbitrary $p^*$ and $p^\diamond$ included in the open interval $(p', p'')$. ✷

By virtue of Theorem 1, we can show the following lemma.
Lemma 19  $\Pr\{\hat{p} \le z \mid p\}$ is monotonically decreasing with respect to $p$. Similarly, $\Pr\{\hat{p} \ge z \mid p\}$ is monotonically increasing with respect to $p$.

Lemma 20  Let $p' < p''$ be two consecutive distinct elements of $(I_L \cap [a, b]) \cup \{a, b\}$. Then,
$$\lim_{\epsilon \downarrow 0} \Pr\{p' + \epsilon \le L(\hat{p}) \mid p' + \epsilon\} = \Pr\{p' < L(\hat{p}) \mid p'\}, \qquad \lim_{\epsilon \downarrow 0} \Pr\{p'' - \epsilon \le L(\hat{p}) \mid p'' - \epsilon\} = \Pr\{p'' \le L(\hat{p}) \mid p''\}.$$
Moreover, $\Pr\{p \le L(\hat{p}) \mid p\}$ is monotone with respect to $p \in (p', p'')$.

Proof. First, we shall show that $\lim_{\epsilon \downarrow 0} \Pr\{p' + \epsilon \le L(\hat{p}) \mid p' + \epsilon\} = \Pr\{p' < L(\hat{p}) \mid p'\}$. Let $m^+(\epsilon)$ be the number of elements of $\{\vartheta \in I : p' < L(\vartheta) < p' + \epsilon\}$, where $I$ denotes the support of $\hat{p}$ as in Lemma 18. We claim that $\lim_{\epsilon \downarrow 0} m^+(\epsilon) = 0$. It suffices to consider two cases as follows. In the case of $\{\vartheta \in I : p' < L(\vartheta)\} = \emptyset$, we have $m^+(\epsilon) = 0$ for any $\epsilon > 0$. In the case of $\{\vartheta \in I : p' < L(\vartheta)\} \neq \emptyset$, we have $m^+(\epsilon) = 0$ for $0 < \epsilon \le \epsilon^*$, where $\epsilon^* = \min\{L(\vartheta) - p' : p' < L(\vartheta), \, \vartheta \in I\}$ is positive because of the assumption that $I_L$ has no closure points in $[a, b]$. Hence, in both cases, $\lim_{\epsilon \downarrow 0} m^+(\epsilon) = 0$. This establishes the claim. Noting that $\Pr\{p' < L(\hat{p}) < p' + \epsilon \mid p' + \epsilon\} \le m^+(\epsilon)$ as a consequence of $\Pr\{\hat{p} = \vartheta \mid p' + \epsilon\} \le 1$ for any $\vartheta \in I$, we have that $\limsup_{\epsilon \downarrow 0} \Pr\{p' < L(\hat{p}) < p' + \epsilon \mid p' + \epsilon\} \le \lim_{\epsilon \downarrow 0} m^+(\epsilon) = 0$, which implies that $\lim_{\epsilon \downarrow 0} \Pr\{p' < L(\hat{p}) < p' + \epsilon \mid p' + \epsilon\} = 0$. Since $\{p' + \epsilon \le L(\hat{p})\} \cap \{p' < L(\hat{p}) < p' + \epsilon\} = \emptyset$ and $\{p' < L(\hat{p})\} = \{p' + \epsilon \le L(\hat{p})\} \cup \{p' < L(\hat{p}) < p' + \epsilon\}$, we have
$$\Pr\{p' < L(\hat{p}) \mid p' + \epsilon\} = \Pr\{p' + \epsilon \le L(\hat{p}) \mid p' + \epsilon\} + \Pr\{p' < L(\hat{p}) < p' + \epsilon \mid p' + \epsilon\}.$$
Observing that $\Pr\{p' < L(\hat{p}) \mid p' + \epsilon\}$ is continuous with respect to $\epsilon \in (0, 1 - p')$, we have $\lim_{\epsilon \downarrow 0} \Pr\{p' < L(\hat{p}) \mid p' + \epsilon\} = \Pr\{p' < L(\hat{p}) \mid p'\}$. It follows that
$$\lim_{\epsilon \downarrow 0} \Pr\{p' + \epsilon \le L(\hat{p}) \mid p' + \epsilon\} = \lim_{\epsilon \downarrow 0} \Pr\{p' < L(\hat{p}) \mid p' + \epsilon\} - \lim_{\epsilon \downarrow 0} \Pr\{p' < L(\hat{p}) < p' + \epsilon \mid p' + \epsilon\} = \lim_{\epsilon \downarrow 0} \Pr\{p' < L(\hat{p}) \mid p' + \epsilon\} = \Pr\{p' < L(\hat{p}) \mid p'\}.$$

Next, we shall show that $\lim_{\epsilon \downarrow 0} \Pr\{p'' - \epsilon \le L(\hat{p}) \mid p'' - \epsilon\} = \Pr\{p'' \le L(\hat{p}) \mid p''\}$. Let $m^-(\epsilon)$ be the number of elements of $\{\vartheta \in I : p'' - \epsilon \le L(\vartheta) < p''\}$. Then, we can show $\lim_{\epsilon \downarrow 0} m^-(\epsilon) = 0$ by considering two cases as follows. In the case of $\{\vartheta \in I : L(\vartheta) < p''\} = \emptyset$, we have $m^-(\epsilon) = 0$ for any $\epsilon > 0$. In the case of $\{\vartheta \in I : L(\vartheta) < p''\} \neq \emptyset$, we have $m^-(\epsilon) = 0$ for $0 < \epsilon < \epsilon^\star$, where $\epsilon^\star = \min\{p'' - L(\vartheta) : \vartheta \in I, \, L(\vartheta) < p''\}$ is positive because of the assumption that $I_L$ has no closure points in $[a, b]$. Hence, in both cases, $\lim_{\epsilon \downarrow 0} m^-(\epsilon) = 0$. It follows that $\limsup_{\epsilon \downarrow 0} \Pr\{p'' - \epsilon \le L(\hat{p}) < p'' \mid p'' - \epsilon\} \le \lim_{\epsilon \downarrow 0} m^-(\epsilon) = 0$ and consequently $\lim_{\epsilon \downarrow 0} \Pr\{p'' - \epsilon \le L(\hat{p}) < p'' \mid p'' - \epsilon\} = 0$. Since $\{p'' - \epsilon \le L(\hat{p})\} = \{p'' \le L(\hat{p})\} \cup \{p'' - \epsilon \le L(\hat{p}) < p''\}$ and $\{p'' \le L(\hat{p})\} \cap \{p'' - \epsilon \le L(\hat{p}) < p''\} = \emptyset$, we have
$$\Pr\{p'' - \epsilon \le L(\hat{p}) \mid p'' - \epsilon\} = \Pr\{p'' \le L(\hat{p}) \mid p'' - \epsilon\} + \Pr\{p'' - \epsilon \le L(\hat{p}) < p'' \mid p'' - \epsilon\}.$$
Observing that $\Pr\{p'' \le L(\hat{p}) \mid p'' - \epsilon\}$ is continuous with respect to $\epsilon \in (0, p'')$, we have $\lim_{\epsilon \downarrow 0} \Pr\{p'' \le L(\hat{p}) \mid p'' - \epsilon\} = \Pr\{p'' \le L(\hat{p}) \mid p''\}$. It follows that $\lim_{\epsilon \downarrow 0} \Pr\{p'' - \epsilon \le L(\hat{p}) \mid p'' - \epsilon\} = \Pr\{p'' \le L(\hat{p}) \mid p''\}$.

Now we turn to show that $\Pr\{p \le L(\hat{p}) \mid p\}$ is monotone with respect to $p \in (p', p'')$.
Without loss of generality, we assume that $L(.)$ is monotonically increasing. Since $p' < p''$ are two consecutive distinct elements of $(I_L \cap [a, b]) \cup \{a, b\}$, we have that the intersection between the open interval $(p', p'')$ and the set $I_L$ is empty. As a result of Lemma 18, we can write $\Pr\{p \le L(\hat{p}) \mid p\} = \Pr\{\hat{p} \ge \vartheta \mid p\}$, where $\vartheta \in [0, 1]$ is a constant independent of $p \in (p', p'')$. By Lemma 19, we have that $\Pr\{\hat{p} \ge \vartheta \mid p\}$ is monotonically increasing with respect to $p \in (p', p'')$. This proves the monotonicity of $\Pr\{p \le L(\hat{p}) \mid p\}$ with respect to $p \in (p', p'')$. The proof of the lemma is thus completed. ✷

By a similar method as that of Lemma 20, we can show the following lemma.

Lemma 21  Let $p' < p''$ be two consecutive distinct elements of $(I_U \cap [a, b]) \cup \{a, b\}$. Then,
$$\lim_{\epsilon \downarrow 0} \Pr\{p' + \epsilon \ge U(\hat{p}) \mid p' + \epsilon\} = \Pr\{p' \ge U(\hat{p}) \mid p'\}, \qquad \lim_{\epsilon \downarrow 0} \Pr\{p'' - \epsilon \ge U(\hat{p}) \mid p'' - \epsilon\} = \Pr\{p'' > U(\hat{p}) \mid p''\}.$$
Moreover, $\Pr\{p \ge U(\hat{p}) \mid p\}$ is monotone with respect to $p \in (p', p'')$.

Now we are in a position to prove Theorem 4. Let $C(p) = \Pr\{p \le L(\hat{p}) \mid p\}$. By Lemma 20, $C(p)$ is a monotone function of $p \in (p', p'')$, which implies that $C(p) \le \max\{C(p' + \epsilon), C(p'' - \epsilon)\}$ for any $p \in (p', p'')$ and any positive $\epsilon$ less than $\min\{p - p', p'' - p\}$. Consequently,
$$C(p) \le \lim_{\epsilon \downarrow 0} \max\{C(p' + \epsilon), C(p'' - \epsilon)\} = \max\left\{ \lim_{\epsilon \downarrow 0} C(p' + \epsilon), \, \lim_{\epsilon \downarrow 0} C(p'' - \epsilon) \right\} \le \max\{C(p'), C(p'')\}$$
for any $p \in (p', p'')$. Since the argument holds for arbitrary consecutive distinct elements of $\{L(\hat{p}) \in (a, b) : \hat{p} \in I\} \cup \{a, b\}$, we have established the statement regarding the maximum of $\Pr\{p \le L(\hat{p}) \mid p\}$ with respect to $p \in (a, b)$.
By a similar method, we can prove the statement regarding the maximum of $\Pr\{p \ge U(\hat{p}) \mid p\}$ with respect to $p \in (a, b)$. This concludes the proof of Theorem 4.

E  Proof of Theorem 6

The theorem can be established by showing the following lemmas.

Lemma 22  $\Pr\{p \ge \overline{p}\} \le \frac{\delta}{2}$.

Proof. By Theorem 1,
$$\Pr\{\hat{p} \le z\} = \begin{cases} \Pr\left\{ \sum_{i=1}^{\lceil \gamma/z \rceil - 1} X_i < \gamma \right\} & \text{for } \gamma \le nz, \\ \Pr\left\{ \sum_{i=1}^{n} X_i \le nz \right\} & \text{for } \gamma > nz. \end{cases}$$
Since $X_i$ must be either 0 or 1 and $\gamma$ is an integer, we have $\Pr\left\{ \sum_{i=1}^{\lceil \gamma/z \rceil - 1} X_i < \gamma \right\} \le \Pr\left\{ \sum_{i=1}^{\lceil \gamma/z \rceil} X_i \le \gamma \right\}$. Hence,
$$\Pr\{\hat{p} \le z\} \le \begin{cases} \Pr\left\{ \sum_{i=1}^{\lceil \gamma/z \rceil} X_i \le \gamma \right\} & \text{for } \gamma \le nz, \\ \Pr\left\{ \sum_{i=1}^{n} X_i \le nz \right\} & \text{for } \gamma > nz. \end{cases}$$
Since $X_1, X_2, \cdots$ are i.i.d. Bernoulli random variables, we have $\Pr\{\hat{p} \le z\} \le G(z, p)$, where
$$G(z, p) = \begin{cases} \sum_{i=0}^{\gamma} \binom{\lceil \gamma/z \rceil}{i} p^i (1-p)^{\lceil \gamma/z \rceil - i} & \text{for } \frac{\gamma}{n} \le z \le 1, \\ \sum_{i=0}^{\lfloor nz \rfloor} \binom{n}{i} p^i (1-p)^{n-i} & \text{for } 0 \le z < \frac{\gamma}{n}. \end{cases}$$
Let $z^* \in [0, 1]$ be the largest number such that $\Pr\{\hat{p} < z^*\} \le \frac{\delta}{2}$. Since $\hat{p}$ is a discrete random variable bounded in $[0, 1]$, it must be true that $\Pr\{\hat{p} \le z^*\} > \frac{\delta}{2}$. Observing that $G(z, p)$ is monotonically decreasing with respect to $p \in (0, 1)$, we have
$$\{p \ge \overline{p}\} = \{p \ge \overline{p}, \, k < n\} \subseteq \left\{ G(\hat{p}, p) \le G(\hat{p}, \overline{p}) = \tfrac{\delta}{2} \right\} \subseteq \left\{ G(\hat{p}, p) \le \tfrac{\delta}{2} \right\}.$$
Noting that $\frac{\delta}{2} < \Pr\{\hat{p} \le z^*\} \le G(z^*, p)$ and that $G(z, p)$ is non-decreasing with respect to $z \in (0, 1)$, we have $\{p \ge \overline{p}\} \subseteq \{G(\hat{p}, p) \le \frac{\delta}{2}\} \subseteq \{G(\hat{p}, p) < G(z^*, p)\} \subseteq \{\hat{p} < z^*\}$. It follows that $\Pr\{p \ge \overline{p}\} \le \Pr\{\hat{p} < z^*\} \le \frac{\delta}{2}$. ✷

Lemma 23  $\Pr\{p \le \underline{p}\} \le \frac{\delta}{2}$.

Proof. By Theorem 1,
$$\Pr\{\hat{p} \ge z\} = \begin{cases} \Pr\left\{ \sum_{i=1}^{\lfloor \gamma/z \rfloor} X_i \ge \gamma \right\} & \text{for } \gamma \le nz, \\ \Pr\left\{ \sum_{i=1}^{n} X_i \ge nz \right\} & \text{for } \gamma > nz. \end{cases}$$
Since $X_1, X_2, \cdots$ are i.i.d. Bernoulli random variables, we have $\Pr\{\hat{p} \ge z\} = H(z, p)$, where
$$H(z, p) = \begin{cases} \sum_{i=\gamma}^{\lfloor \gamma/z \rfloor} \binom{\lfloor \gamma/z \rfloor}{i} p^i (1-p)^{\lfloor \gamma/z \rfloor - i} & \text{for } \frac{\gamma}{n} \le z \le 1, \\ \sum_{i=\lceil nz \rceil}^{n} \binom{n}{i} p^i (1-p)^{n-i} & \text{for } 0 \le z < \frac{\gamma}{n}. \end{cases}$$
Let $z_* \in [0, 1]$ be the smallest number such that $\Pr\{\hat{p} > z_*\} \le \frac{\delta}{2}$. Since $\hat{p}$ is a discrete random variable bounded in $[0, 1]$, it must be true that $\Pr\{\hat{p} \ge z_*\} > \frac{\delta}{2}$. Observing that $H(z, p)$ is monotonically increasing with respect to $p \in (0, 1)$, we have
$$\{p \le \underline{p}\} = \{p \le \underline{p}, \, k > 0\} \subseteq \left\{ H(\hat{p}, p) \le H(\hat{p}, \underline{p}) = \tfrac{\delta}{2} \right\} \subseteq \left\{ H(\hat{p}, p) \le \tfrac{\delta}{2} \right\}.$$
Noting that $\frac{\delta}{2} < \Pr\{\hat{p} \ge z_*\} = H(z_*, p)$ and that $H(z, p)$ is non-increasing with respect to $z \in (0, 1)$, we have $\{p \le \underline{p}\} \subseteq \{H(\hat{p}, p) \le \frac{\delta}{2}\} \subseteq \{H(\hat{p}, p) < H(z_*, p)\} \subseteq \{\hat{p} > z_*\}$. It follows that $\Pr\{p \le \underline{p}\} \le \Pr\{\hat{p} > z_*\} \le \frac{\delta}{2}$. ✷

F  Proof of Theorem 7

Theorem 7 can be shown by using the following result (a slight modification of Hoeffding's inequality [7]) and a similar argument as that of Theorem 3.

Lemma 24  Let $X_1, \cdots, X_n$ be random variables with joint distribution given by (5). Then, $\Pr\left\{ \frac{\sum_{i=1}^n X_i}{n} \ge z \right\} \le \exp(n \, M_B(z, p))$ for $1 \ge z \ge p = \frac{M}{N}$. Similarly, $\Pr\left\{ \frac{\sum_{i=1}^n X_i}{n} \le z \right\} \le \exp(n \, M_B(z, p))$ for $0 \le z \le p$.

Proof. For $z = p$, we have $\Pr\left\{ \frac{\sum_{i=1}^n X_i}{n} \ge z \right\} \le \exp(n \, M_B(z, p)) = 1$. For $p < z < 1$, it was shown by Hoeffding in [7] that $\Pr\left\{ \frac{\sum_{i=1}^n X_i}{n} \ge z \right\} \le \exp(n \, M_B(z, p))$. For $z = 1$,
$$\Pr\left\{ \frac{\sum_{i=1}^n X_i}{n} \ge z \right\} = \Pr\{X_i = 1, \, i = 1, \cdots, n\} = \binom{M}{n} \Big/ \binom{N}{n} \le p^n = \exp(n \, M_B(1, p)).$$
For $z = 0$,
$$\Pr\left\{ \frac{\sum_{i=1}^n X_i}{n} \le z \right\} = \Pr\{X_i = 0, \, i = 1, \cdots, n\} = \binom{N-M}{n} \Big/ \binom{N}{n} \le (1-p)^n = \exp(n \, M_B(0, p)).$$
For $0 < z < p$, it was shown by Hoeffding in [7] that $\Pr\left\{ \frac{\sum_{i=1}^n X_i}{n} \le z \right\} \le \exp(n \, M_B(z, p))$. For $z = p$, we have $\Pr\left\{ \frac{\sum_{i=1}^n X_i}{n} \le z \right\} \le \exp(n \, M_B(z, p)) = 1$. ✷

G  Proof of Theorem 8

By the same argument as that of Theorem 1, we can show the following lemma.
Lemma 25  For any $z > 0$,
$$\Pr\{\hat{p} \le z\} = \begin{cases} \Pr\left\{ \sum_{i=1}^{\lceil \gamma/z \rceil - 1} X_i < \gamma \right\} & \text{for } \gamma \le nz, \\ \Pr\left\{ \sum_{i=1}^{n} X_i \le nz \right\} & \text{for } \gamma > nz, \end{cases} \qquad \Pr\{\hat{p} \ge z\} = \begin{cases} \Pr\left\{ \sum_{i=1}^{\lfloor \gamma/z \rfloor} X_i \ge \gamma \right\} & \text{for } \gamma \le nz, \\ \Pr\left\{ \sum_{i=1}^{n} X_i \ge nz \right\} & \text{for } \gamma > nz. \end{cases}$$

By applying Lemma 25, we can show the following lemma.

Lemma 26  $\Pr\{\hat{p} \le z \mid M\}$ is monotonically decreasing with respect to $M$. Similarly, $\Pr\{\hat{p} \ge z \mid M\}$ is monotonically increasing with respect to $M$.

Now we shall introduce some new functions. Let $p_0 < p_1 < \cdots < p_j$ be all possible values of $\hat{p}$. Define a random variable $R$ such that $\Pr\{R = r\} = \Pr\{\hat{p} = p_r\}$ for $r = 0, 1, \cdots, j$. Then, $U(\hat{p}) = U(p_R)$. We denote $U(p_R)$ as $\mathcal{U}(R)$. Clearly, $\mathcal{U}(.)$ is a non-decreasing function defined on the domain $\{0, 1, \cdots, j\}$. By a linear interpolation, we can extend $\mathcal{U}(.)$ as a continuous and non-decreasing function on $[0, j]$. Accordingly, we can define the inverse function $\mathcal{U}^{-1}(.)$ such that $\mathcal{U}^{-1}(\theta) = \max\{x \in [0, j] : \mathcal{U}(x) = \theta\}$ for $\mathcal{U}(0) \le \theta \le \mathcal{U}(j)$. Then,
$$\theta \ge \mathcal{U}(R) \iff R \le \mathcal{U}^{-1}(\theta) \iff R \le g(\theta) \quad \text{where } g(\theta) = \lfloor \mathcal{U}^{-1}(\theta) \rfloor.$$
Similarly, $L(\hat{p}) = L(p_R)$. We denote $L(p_R)$ as $\mathcal{L}(R)$. Clearly, $\mathcal{L}(.)$ is a non-decreasing function defined on the domain $\{0, 1, \cdots, j\}$. By a linear interpolation, we can extend $\mathcal{L}(.)$ as a continuous and non-decreasing function on $[0, j]$. Accordingly, we can define the inverse function $\mathcal{L}^{-1}(.)$ such that $\mathcal{L}^{-1}(\theta) = \min\{x \in [0, j] : \mathcal{L}(x) = \theta\}$ for $\mathcal{L}(0) \le \theta \le \mathcal{L}(j)$. Then,
$$\theta \le \mathcal{L}(R) \iff R \ge \mathcal{L}^{-1}(\theta) \iff R \ge h(\theta) \quad \text{where } h(\theta) = \lceil \mathcal{L}^{-1}(\theta) \rceil.$$

Lemma 27  Let $0 \le r < j$. Then, $h(m) = r + 1$ for $\mathcal{L}(r) < m \le \mathcal{L}(r+1)$.

Proof. Clearly, $h(m) = r + 1$ for $m = \mathcal{L}(r+1)$. It remains to evaluate $h(m)$ for $m$ satisfying $\mathcal{L}(r) < m < \mathcal{L}(r+1)$. For $m > \mathcal{L}(r)$, we have $r < \mathcal{L}^{-1}(m)$; otherwise $r \ge \mathcal{L}^{-1}(m)$, implying $\mathcal{L}(r) \ge m$, since $\mathcal{L}(.)$ is non-decreasing and $m \notin \{\mathcal{L}(r) : 0 \le r \le j\}$. For $m < \mathcal{L}(r+1)$, we have $r + 1 > \mathcal{L}^{-1}(m)$; otherwise $r + 1 \le \mathcal{L}^{-1}(m)$, implying $\mathcal{L}(r+1) \le m$, since $\mathcal{L}(.)$ is non-decreasing and $m \notin \{\mathcal{L}(r) : 0 \le r \le j\}$. Therefore, we have $r < \mathcal{L}^{-1}(m) < r + 1$ for $\mathcal{L}(r) < m < \mathcal{L}(r+1)$. Hence, $r < \lceil \mathcal{L}^{-1}(m) \rceil \le r + 1$, i.e., $r < h(m) \le r + 1$. Since $h(m)$ is an integer, we have $h(m) = r + 1$ for $\mathcal{L}(r) < m < \mathcal{L}(r+1)$. ✷

Lemma 28  Let $0 \le r < j$. Then, $g(m) = r$ for $\mathcal{U}(r) \le m < \mathcal{U}(r+1)$.

Proof. Clearly, $g(m) = r$ for $m = \mathcal{U}(r)$. It remains to evaluate $g(m)$ for $m$ satisfying $\mathcal{U}(r) < m < \mathcal{U}(r+1)$. For $m > \mathcal{U}(r)$, we have $r < \mathcal{U}^{-1}(m)$; otherwise $r \ge \mathcal{U}^{-1}(m)$, implying $\mathcal{U}(r) \ge m$, since $\mathcal{U}(.)$ is non-decreasing and $m \notin \{\mathcal{U}(r) : 0 \le r \le j\}$. For $m < \mathcal{U}(r+1)$, we have $r + 1 > \mathcal{U}^{-1}(m)$; otherwise $r + 1 \le \mathcal{U}^{-1}(m)$, implying $\mathcal{U}(r+1) \le m$, since $\mathcal{U}(.)$ is non-decreasing and $m \notin \{\mathcal{U}(r) : 0 \le r \le j\}$. Therefore, for $\mathcal{U}(r) < m < \mathcal{U}(r+1)$, we have $r < \mathcal{U}^{-1}(m) < r + 1$. Hence, $r \le \lfloor \mathcal{U}^{-1}(m) \rfloor < r + 1$, i.e., $r \le g(m) < r + 1$. Since $g(m)$ is an integer, we have $g(m) = r$ for $\mathcal{U}(r) < m < \mathcal{U}(r+1)$. ✷

Noting that $\Pr\{M \ge U(\hat{p}) \mid M\} = \Pr\{M \ge \mathcal{U}(R) \mid M\}$, we have $\Pr\{M \ge U(\hat{p}) \mid M\} = \Pr\{R \le g(M) \mid M\}$. Let $0 \le r < j$. By Lemma 28, we have that $g(m) = r$ for $\mathcal{U}(r) \le m < \mathcal{U}(r+1)$. Observing that $\Pr\{M \ge U(\hat{p}) \mid M\} = 0$ for $0 \le M < \mathcal{U}(0)$ and that $\Pr\{M \ge U(\hat{p}) \mid M\} = 1$ for $\mathcal{U}(j) \le M \le N$, we have that the maximum of $\Pr\{M \ge U(\hat{p}) \mid M\}$ with respect to $M \in [a, b]$ is achieved on $\bigcup_{r=0}^{j-1} \{m \in [a, b] : \mathcal{U}(r) \le m \le \mathcal{U}(r+1)\} \cup \{a, b\}$. Now consider the range $\{m \in [a, b] : \mathcal{U}(r) \le m \le \mathcal{U}(r+1)\}$ of $M$. We only consider the non-trivial situation that $\mathcal{U}(r) < \mathcal{U}(r+1)$.
For $\mathcal{U}(r) \le M < \mathcal{U}(r+1)$, we have
$$\Pr\{M \ge U(\hat{p}) \mid M\} = \Pr\{R \le g(M) \mid M\} = \Pr\{R \le r \mid M\} = \Pr\{\hat{p} \le p_r \mid M\},$$
which is non-increasing for this range of $M$ as can be seen from Lemma 26. By virtue of such monotonicity, we can characterize the maximizer of $\Pr\{M \ge U(\hat{p}) \mid M\}$ with respect to $M$ on the set $\{m \in [a, b] : \mathcal{U}(r) \le m \le \mathcal{U}(r+1)\}$ as follows. Case (i): $b < \mathcal{U}(r)$ or $a > \mathcal{U}(r+1)$. This is trivial. Case (ii): $a < \mathcal{U}(r) \le b \le \mathcal{U}(r+1)$. The maximizer must be among $\{\mathcal{U}(r), b\}$. Case (iii): $\mathcal{U}(r) \le a \le b \le \mathcal{U}(r+1)$. The maximizer must be among $\{a, b\}$. Case (iv): $\mathcal{U}(r) \le a \le \mathcal{U}(r+1) < b$. The maximizer must be among $\{a, \mathcal{U}(r+1)\}$. Case (v): $a < \mathcal{U}(r) \le \mathcal{U}(r+1) < b$. The maximizer must be among $\{\mathcal{U}(r), \mathcal{U}(r+1)\}$. In summary, the maximizer must be among $\{\mathcal{U}(r), \mathcal{U}(r+1), a, b\} \cap [a, b]$. It follows that the statement on $\Pr\{M \ge U(\hat{p}) \mid M\}$ is established.

Next, we consider $\Pr\{M \le L(\hat{p}) \mid M\}$. Noting that $\Pr\{M \le L(\hat{p}) \mid M\} = \Pr\{M \le \mathcal{L}(R) \mid M\}$, we have $\Pr\{M \le L(\hat{p}) \mid M\} = \Pr\{R \ge h(M) \mid M\}$. Let $0 \le r < j$. By Lemma 27, we have that $h(m) = r + 1$ for $\mathcal{L}(r) < m \le \mathcal{L}(r+1)$. Observing that $\Pr\{M \le L(\hat{p}) \mid M\} = 1$ for $0 \le M \le \mathcal{L}(0)$ and that $\Pr\{M \le L(\hat{p}) \mid M\} = 0$ for $\mathcal{L}(j) < M \le N$, we have that the maximum of $\Pr\{M \le L(\hat{p}) \mid M\}$ with respect to $M \in [a, b]$ is achieved on $\bigcup_{r=0}^{j-1} \{m \in [a, b] : \mathcal{L}(r) \le m \le \mathcal{L}(r+1)\} \cup \{a, b\}$. Now consider the range $\{m \in [a, b] : \mathcal{L}(r) \le m \le \mathcal{L}(r+1)\}$ of $M$. We only consider the non-trivial situation that $\mathcal{L}(r) < \mathcal{L}(r+1)$. For $\mathcal{L}(r) < M \le \mathcal{L}(r+1)$, we have
$$\Pr\{M \le L(\hat{p}) \mid M\} = \Pr\{R \ge h(M) \mid M\} = \Pr\{R \ge r + 1 \mid M\} = \Pr\{\hat{p} \ge p_{r+1} \mid M\},$$
which is non-decreasing for this range of $M$ as can be seen from Lemma 26.
By virtue of such monotonicity, we can characterize the maximizer of $\Pr\{M \le L(\hat{p}) \mid M\}$ with respect to $M$ on the set $\{m \in [a, b] : \mathcal{L}(r) \le m \le \mathcal{L}(r+1)\}$ as follows. Case (i): $b < \mathcal{L}(r)$ or $a > \mathcal{L}(r+1)$. This is trivial. Case (ii): $a < \mathcal{L}(r) \le b \le \mathcal{L}(r+1)$. The maximizer must be among $\{\mathcal{L}(r), b\}$. Case (iii): $\mathcal{L}(r) \le a \le b \le \mathcal{L}(r+1)$. The maximizer must be among $\{a, b\}$. Case (iv): $\mathcal{L}(r) \le a \le \mathcal{L}(r+1) < b$. The maximizer must be among $\{a, \mathcal{L}(r+1)\}$. Case (v): $a < \mathcal{L}(r) \le \mathcal{L}(r+1) < b$. The maximizer must be among $\{\mathcal{L}(r), \mathcal{L}(r+1)\}$. In summary, the maximizer must be among $\{\mathcal{L}(r), \mathcal{L}(r+1), a, b\} \cap [a, b]$. It follows that the statement on $\Pr\{M \le L(\hat{p}) \mid M\}$ is established. This concludes the proof of Theorem 8.

H  Proof of Theorem 10

The theorem can be established by showing the following lemmas.

Lemma 29  $\Pr\{M > M_u\} \le \frac{\delta}{2}$.

Proof. Since $X_i$ must be either 0 or 1 and $\gamma$ is an integer, we have $\Pr\left\{ \sum_{i=1}^{\lceil \gamma/z \rceil - 1} X_i < \gamma \right\} \le \Pr\left\{ \sum_{i=1}^{\lceil \gamma/z \rceil} X_i \le \gamma \right\}$. Hence, by Lemma 25,
$$\Pr\{\hat{p} \le z\} \le \begin{cases} \Pr\left\{ \sum_{i=1}^{\lceil \gamma/z \rceil} X_i \le \gamma \right\} & \text{for } \gamma \le nz, \\ \Pr\left\{ \sum_{i=1}^{n} X_i \le nz \right\} & \text{for } \gamma > nz \end{cases}$$
and $\Pr\{\hat{p} \le z\} \le G(z, p)$, where
$$G(z, p) = \begin{cases} \sum_{i=0}^{\gamma} \binom{M}{i} \binom{N-M}{\lceil \gamma/z \rceil - i} \Big/ \binom{N}{\lceil \gamma/z \rceil} & \text{for } \frac{\gamma}{n} \le z \le 1, \\ \sum_{i=0}^{\lfloor nz \rfloor} \binom{M}{i} \binom{N-M}{n-i} \Big/ \binom{N}{n} & \text{for } 0 \le z < \frac{\gamma}{n} \end{cases}$$
with $p = \frac{M}{N}$. Let $z^* \in [0, 1]$ be the largest number such that $\Pr\{\hat{p} < z^*\} \le \frac{\delta}{2}$. Since $\hat{p}$ is a discrete random variable bounded in $[0, 1]$, it must be true that $\Pr\{\hat{p} \le z^*\} > \frac{\delta}{2}$. Observing that $G(z, p)$ is monotonically decreasing with respect to $p \in \{\frac{i}{N} : i = 0, 1, \cdots, N\}$, we have
$$\{p \ge \overline{p}\} \subseteq \left\{ G(\hat{p}, p) \le G(\hat{p}, \overline{p}) \le \tfrac{\delta}{2} \right\} \subseteq \left\{ G(\hat{p}, p) \le \tfrac{\delta}{2} \right\}$$
where $\overline{p} = \frac{M_u + 1}{N}$.
Noting that $\frac{\delta}{2} < \Pr\{\hat{p} \le z^*\} \le G(z^*, p)$ and that $G(z, p)$ is non-decreasing with respect to $z \in (0, 1)$, we have $\{p \ge \overline{p}\} \subseteq \{G(\hat{p}, p) \le \frac{\delta}{2}\} \subseteq \{G(\hat{p}, p) < G(z^*, p)\} \subseteq \{\hat{p} < z^*\}$. It follows that $\Pr\{p \ge \overline{p}\} \le \Pr\{\hat{p} < z^*\} \le \frac{\delta}{2}$, which implies that $\Pr\{M > M_u\} \le \frac{\delta}{2}$. ✷

Lemma 30  $\Pr\{M < M_l\} \le \frac{\delta}{2}$.

Proof. By Lemma 25, we have $\Pr\{\hat{p} \ge z\} = H(z, p)$, where
$$H(z, p) = \begin{cases} \sum_{i=\gamma}^{\lfloor \gamma/z \rfloor} \binom{M}{i} \binom{N-M}{\lfloor \gamma/z \rfloor - i} \Big/ \binom{N}{\lfloor \gamma/z \rfloor} & \text{for } \frac{\gamma}{n} \le z \le 1, \\ \sum_{i=\lceil nz \rceil}^{n} \binom{M}{i} \binom{N-M}{n-i} \Big/ \binom{N}{n} & \text{for } 0 \le z < \frac{\gamma}{n} \end{cases}$$
with $p = \frac{M}{N}$. Let $z_* \in [0, 1]$ be the smallest number such that $\Pr\{\hat{p} > z_*\} \le \frac{\delta}{2}$. Since $\hat{p}$ is a discrete random variable bounded in $[0, 1]$, it must be true that $\Pr\{\hat{p} \ge z_*\} > \frac{\delta}{2}$. Observing that $H(z, p)$ is monotonically increasing with respect to $p \in \{\frac{i}{N} : i = 0, 1, \cdots, N\}$, we have
$$\{p \le \underline{p}\} \subseteq \left\{ H(\hat{p}, p) \le H(\hat{p}, \underline{p}) \le \tfrac{\delta}{2} \right\} \subseteq \left\{ H(\hat{p}, p) \le \tfrac{\delta}{2} \right\}$$
where $\underline{p} = \frac{M_l - 1}{N}$. Noting that $\frac{\delta}{2} < \Pr\{\hat{p} \ge z_*\} = H(z_*, p)$ and that $H(z, p)$ is non-increasing with respect to $z \in (0, 1)$, we have $\{p \le \underline{p}\} \subseteq \{H(\hat{p}, p) \le \frac{\delta}{2}\} \subseteq \{H(\hat{p}, p) < H(z_*, p)\} \subseteq \{\hat{p} > z_*\}$. It follows that $\Pr\{p \le \underline{p}\} \le \Pr\{\hat{p} > z_*\} \le \frac{\delta}{2}$, which implies $\Pr\{M < M_l\} \le \frac{\delta}{2}$. ✷

I  Proof of Theorem 11

In the case of $M < \gamma$, we have $\mathbf{n} = n$ and $\frac{\gamma}{p} = \frac{\gamma N}{M} > N$, from which the theorem immediately follows. It remains to show the theorem for the case of $M \ge \gamma$. Notice that
$$\mathbb{E}[\mathbf{n}] = n \Pr\left\{ \sum_{i=1}^{n} X_i < \gamma \right\} + \sum_{m=1}^{n} m \Pr\left\{ \sum_{i=1}^{m-1} X_i < \gamma, \, \sum_{i=1}^{m} X_i \ge \gamma \right\} < n \Pr\left\{ \sum_{i=1}^{n} X_i < \gamma \right\} + \sum_{m=1}^{n} n \Pr\left\{ \sum_{i=1}^{m-1} X_i < \gamma, \, \sum_{i=1}^{m} X_i \ge \gamma \right\} = n.$$
Since $M = \sum_{i=1}^{N} X_i \ge \gamma$ and $X_i$ is non-negative, we have
$$\bigcup_{m=n+1}^{N} \left\{ \sum_{i=1}^{m-1} X_i < \gamma, \, \sum_{i=1}^{m} X_i \ge \gamma \right\} = \left\{ \sum_{i=1}^{n} X_i < \gamma \right\}.$$
Hence,
$$\mathbb{E}[\mathbf{n}] = \sum_{m=n+1}^{N} n \Pr\left\{ \sum_{i=1}^{m-1} X_i < \gamma, \, \sum_{i=1}^{m} X_i \ge \gamma \right\} + \sum_{m=1}^{n} m \Pr\left\{ \sum_{i=1}^{m-1} X_i < \gamma, \, \sum_{i=1}^{m} X_i \ge \gamma \right\} < \sum_{m=1}^{N} m \Pr\left\{ \sum_{i=1}^{m-1} X_i < \gamma, \, \sum_{i=1}^{m} X_i \ge \gamma \right\} = \mathbb{E}[\mathbf{m}],$$
where $\mathbf{m}$ is the sample number of the classical inverse sampling scheme with the following stopping rule: sampling without replacement is continued until $\gamma$ units possessing the attribute have been observed. By the definition of the classical inverse sampling, we have $\sum_{i=1}^{\mathbf{m}} X_i = \gamma$. Noting that $X_1, X_2, \cdots, X_N$ are identical but dependent Bernoulli random variables with common mean $p$ and that $\{\mathbf{m} \ge k\}$ depends only on $X_1, \cdots, X_{k-1}$ for $1 \le k \le N$, we can conclude that Wald's equation still applies. Hence, $\mathbb{E}\left[\sum_{i=1}^{\mathbf{m}} X_i\right] = \mathbb{E}[\mathbf{m}] \, \mathbb{E}[X_i] = \gamma$, which implies that $\mathbb{E}[\mathbf{m}] = \frac{\gamma}{\mathbb{E}[X_i]} = \frac{\gamma}{p}$. Since $\mathbb{E}[\mathbf{n}]$ is less than both $n$ and $\mathbb{E}[\mathbf{m}]$ as shown above, we have $\mathbb{E}[\mathbf{n}] < \min\left\{n, \frac{\gamma}{p}\right\}$. This completes the proof of the theorem.

J  Proof of Theorem 12

The theorem can be established by showing the following lemmas.

Lemma 31  $\Pr\{\lambda \ge \overline{\lambda}\} \le \frac{\delta}{2}$.

Proof. By Theorem 1, we have
$$\Pr\{\hat{\lambda} \le z\} = \begin{cases} \Pr\left\{ \sum_{i=1}^{\lceil \gamma/z \rceil - 1} X_i < \gamma \right\} & \text{for } \gamma \le nz, \\ \Pr\left\{ \sum_{i=1}^{n} X_i \le nz \right\} & \text{for } \gamma > nz, \end{cases}$$
and thus $\Pr\{\hat{\lambda} \le z\} = G(z, \lambda)$, where
$$G(z, \lambda) = \begin{cases} \sum_{i=0}^{\gamma-1} \frac{[(\lceil \gamma/z \rceil - 1)\lambda]^i}{i!} \exp(-(\lceil \gamma/z \rceil - 1)\lambda) & \text{for } z \ge \frac{\gamma}{n}, \\ \sum_{i=0}^{\lfloor nz \rfloor} \frac{(n\lambda)^i}{i!} \exp(-n\lambda) & \text{for } 0 \le z < \frac{\gamma}{n}. \end{cases}$$
Let $z^* \ge 0$ be the largest number such that $\Pr\{\hat{\lambda} < z^*\} \le \frac{\delta}{2}$. Since $\hat{\lambda}$ is a non-negative discrete random variable, it must be true that $\Pr\{\hat{\lambda} \le z^*\} > \frac{\delta}{2}$. Observing that $G(z, \lambda)$ is monotonically decreasing with respect to $\lambda \in (0, \infty)$, we have
$$\{\lambda \ge \overline{\lambda}\} \subseteq \left\{ G(\hat{\lambda}, \lambda) \le G(\hat{\lambda}, \overline{\lambda}) = \tfrac{\delta}{2} \right\} \subseteq \left\{ G(\hat{\lambda}, \lambda) \le \tfrac{\delta}{2} \right\}.$$
Noting that $\frac{\delta}{2} < \Pr\{\hat{\lambda} \le z^*\} = G(z^*, \lambda)$ and that $G(z, \lambda)$ is non-decreasing with respect to $z \in (0, \infty)$, we have $\{\lambda \ge \overline{\lambda}\} \subseteq \{G(\hat{\lambda}, \lambda) \le \frac{\delta}{2}\} \subseteq \{G(\hat{\lambda}, \lambda) < G(z^*, \lambda)\} \subseteq \{\hat{\lambda} < z^*\}$. It follows that $\Pr\{\lambda \ge \overline{\lambda}\} \le \Pr\{\hat{\lambda} < z^*\} \le \frac{\delta}{2}$. ✷

Lemma 32  $\Pr\{\lambda \le \underline{\lambda}\} \le \frac{\delta}{2}$.

Proof. By Theorem 1, we have
$$\Pr\{\hat{\lambda} \ge z\} = \begin{cases} \Pr\left\{ \sum_{i=1}^{\lfloor \gamma/z \rfloor} X_i \ge \gamma \right\} & \text{for } \gamma \le nz, \\ \Pr\left\{ \sum_{i=1}^{n} X_i \ge nz \right\} & \text{for } \gamma > nz, \end{cases}$$
and thus $\Pr\{\hat{\lambda} \ge z\} = H(z, \lambda)$, where
$$H(z, \lambda) = \begin{cases} \sum_{i=\gamma}^{\infty} \frac{(\lfloor \gamma/z \rfloor \lambda)^i}{i!} \exp(-\lfloor \gamma/z \rfloor \lambda) & \text{for } z \ge \frac{\gamma}{n}, \\ \sum_{i=\lceil nz \rceil}^{\infty} \frac{(n\lambda)^i}{i!} \exp(-n\lambda) & \text{for } 0 \le z < \frac{\gamma}{n}. \end{cases}$$
Let $z_* \ge 0$ be the smallest number such that $\Pr\{\hat{\lambda} > z_*\} \le \frac{\delta}{2}$. Since $\hat{\lambda}$ is a non-negative discrete random variable, it must be true that $\Pr\{\hat{\lambda} \ge z_*\} > \frac{\delta}{2}$. Observing that $H(z, \lambda)$ is monotonically increasing with respect to $\lambda \in (0, \infty)$, we have
$$\{\lambda \le \underline{\lambda}\} = \{\lambda \le \underline{\lambda}, \, k > 0\} \subseteq \left\{ H(\hat{\lambda}, \lambda) \le H(\hat{\lambda}, \underline{\lambda}) = \tfrac{\delta}{2} \right\} \subseteq \left\{ H(\hat{\lambda}, \lambda) \le \tfrac{\delta}{2} \right\}.$$
Noting that $\frac{\delta}{2} < \Pr\{\hat{\lambda} \ge z_*\} = H(z_*, \lambda)$ and that $H(z, \lambda)$ is non-increasing with respect to $z \in (0, \infty)$, we have $\{\lambda \le \underline{\lambda}\} \subseteq \{H(\hat{\lambda}, \lambda) \le \frac{\delta}{2}\} \subseteq \{H(\hat{\lambda}, \lambda) < H(z_*, \lambda)\} \subseteq \{\hat{\lambda} > z_*\}$. It follows that $\Pr\{\lambda \le \underline{\lambda}\} \le \Pr\{\hat{\lambda} > z_*\} \le \frac{\delta}{2}$. ✷

K  Proof of Theorem 13

By the same method as that of Lemma 14, we have

Lemma 33  $\Pr\{\hat{\mu} \ge (1+\varepsilon_r)\mu\} < \frac{\delta}{2}$ for any $\mu \in (p^\star, 1)$.

Lemma 34  Let $0 < \varepsilon_r < 1$, $z = (1-\varepsilon_r)\mu$ and $\varepsilon' = 1 - \frac{\gamma(1-\varepsilon_r)}{\gamma + \varepsilon_r - 1}$. Suppose $\gamma > \frac{1-\varepsilon_r}{\varepsilon_r}$. Then, $M_I\left( \frac{z\gamma}{\gamma - z}, \mu \right) < M_I((1-\varepsilon')\mu, \mu)$ for any $\mu \in (0, 1)$.

Proof. As a consequence of $\gamma > \frac{1-\varepsilon_r}{\varepsilon_r}$, we have $0 < \frac{z\gamma}{\gamma - z} < \mu$ for any $\mu \in (0, 1)$. Since $M_I(w, \mu)$ is monotonically increasing with respect to $w \in (0, \mu)$, it suffices to show that $\frac{z\gamma}{\gamma - z} < (1-\varepsilon')\mu$ for any $\mu \in (0, 1)$. That is, to show
$$\frac{(1-\varepsilon_r)\mu\gamma}{\gamma - (1-\varepsilon_r)\mu} < (1-\varepsilon')\mu, \quad \forall \mu \in (0, 1), \qquad \text{i.e.,} \quad \frac{(1-\varepsilon_r)\gamma}{\gamma - (1-\varepsilon_r)\mu} < 1 - \varepsilon', \quad \forall \mu \in (0, 1).$$
This follows from the definition of $\varepsilon'$. ✷

Lemma 35  $\Pr\{\hat{\mu} \le (1-\varepsilon_r)\mu\} < \frac{\delta}{2}$ for any $\mu \in (p^\star, 1)$.

Proof. To prove the lemma, we shall consider the following two cases: Case (i): $(1-\varepsilon_r)\mu \ge \frac{\gamma}{n}$; Case (ii): $(1-\varepsilon_r)\mu < \frac{\gamma}{n}$.

For Case (i), applying Theorem 1 with $z = (1-\varepsilon_r)\mu \ge \frac{\gamma}{n}$, we have $\Pr\{\hat{\mu} \le (1-\varepsilon_r)\mu\} = \Pr\left\{ \sum_{i=1}^{\lceil \gamma/z \rceil - 1} X_i < \gamma \right\}$. As a consequence of $\gamma > \frac{1-\varepsilon_r}{\varepsilon_r}$, we have $0 < \frac{\gamma}{\lceil \gamma/z \rceil - 1} \le \frac{z\gamma}{\gamma - z} < \mu$ for any $\mu \in (0, 1)$. Hence, applying Lemma 5, we have
$$\Pr\{\hat{\mu} \le (1-\varepsilon_r)\mu\} \le \exp\left( (\lceil \gamma/z \rceil - 1) \, M_B\left( \frac{\gamma}{\lceil \gamma/z \rceil - 1}, \mu \right) \right) = \exp\left( \gamma \, M_I\left( \frac{\gamma}{\lceil \gamma/z \rceil - 1}, \mu \right) \right) \le \exp\left( \gamma \, M_I\left( \frac{z\gamma}{\gamma - z}, \mu \right) \right).$$
By Lemmas 34 and 6,
$$\Pr\{\hat{\mu} \le (1-\varepsilon_r)\mu\} \le \exp(\gamma \, M_I((1-\varepsilon')\mu, \mu)) \le \exp(\gamma \, M_I((1-\varepsilon')p^\star, p^\star)) = \exp\left( \gamma \, M_I\left( \frac{\gamma(1-\varepsilon_r)p^\star}{\gamma + \varepsilon_r - 1}, p^\star \right) \right) < \frac{\delta}{2}.$$
For Case (ii), applying Theorem 1 with $z = (1-\varepsilon_r)\mu < \frac{\gamma}{n}$ and by a similar argument as that of Lemma 15, we have $\Pr\{\hat{\mu} \le (1-\varepsilon_r)\mu\} < \frac{\delta}{2}$.

In summary, we have shown $\Pr\{\hat{\mu} \le (1-\varepsilon_r)\mu\} < \frac{\delta}{2}$ for all cases. The lemma is thus proved. ✷

By the same method as that of Lemma 16, we have

Lemma 36  $\Pr\{\hat{\mu} \ge \mu + \varepsilon_a\} < \frac{\delta}{2}$ for any $\mu \in (0, p^\star]$.

Lemma 37  Let $z = \mu - \varepsilon$. Suppose $0 < \frac{z\gamma}{\gamma - z} < \mu$. Then, $M_I\left( \frac{z\gamma}{\gamma - z}, \mu \right)$ is monotonically increasing with respect to $\mu \in \left(\varepsilon, \frac{1}{2}\right)$.

Proof. Note that
$$\frac{d M_I(w, \mu)}{d \mu} = \frac{1}{\mu} - \frac{1}{w^2} \ln \frac{1-\mu}{1-w} \, \frac{\partial w}{\partial \mu} - \left( \frac{1}{w} - 1 \right) \frac{1}{1-\mu},$$
where
$$w = \frac{z\gamma}{\gamma - z} = -\gamma + \frac{\gamma^2}{\gamma - z} \quad \text{and} \quad \frac{\partial w}{\partial \mu} = \frac{\gamma^2}{(\gamma - z)^2} = \frac{w^2}{z^2}.$$
Hence,
$$\frac{d M_I(w, \mu)}{d \mu} = \frac{1}{\mu} + \frac{1}{z^2} \ln \frac{1-w}{1-\mu} - \left( \frac{1}{w} - 1 \right) \frac{1}{1-\mu} > \frac{1}{\mu} + \frac{1}{z^2} \, \frac{\mu - w}{1-w} - \left( \frac{1}{w} - 1 \right) \frac{1}{1-\mu} = \frac{1}{z^2} \, \frac{\mu - w}{1-w} - \frac{\mu - w}{\mu(1-\mu)w} > 0$$
if $z^2(1-w) < \mu(1-\mu)w$, i.e., $z^2\left( \frac{1}{w} - 1 \right) < \mu(1-\mu)$, which, since $\frac{1}{w} = \frac{1}{z} - \frac{1}{\gamma}$, is equivalent to $z(1-z) - \frac{z^2}{\gamma} < \mu(1-\mu)$. The last inequality holds because $z < \mu < \frac{1}{2}$ implies $z(1-z) < \mu(1-\mu)$. ✷

Lemma 38  $\Pr\{\hat{\mu} \le \mu - \varepsilon_a\} < \frac{\delta}{2}$ for any $\mu \in (0, p^\star]$.

Proof.
To prove the lemma, we shall consider the following three cases: Case (i): $\mu < \varepsilon_a$; Case (ii): $\mu - \varepsilon_a \geq \gamma/n$; Case (iii): $0 \leq \mu - \varepsilon_a < \gamma/n$.

For Case (i), it is evident that $\Pr\{\hat{\mu} \leq \mu - \varepsilon_a\} = 0 < \delta/2$.

For Case (ii), applying Theorem 1 with $z = \mu - \varepsilon_a \geq \gamma/n$, we have
\[
\Pr\{\hat{\mu} \leq \mu - \varepsilon_a\} = \Pr\left\{ \sum_{i=1}^{\lceil \gamma/z \rceil - 1} X_i < \gamma \right\}.
\]
From the definition of $z$ and the assumption that $\gamma > \frac{1 - \varepsilon_r}{\varepsilon_r}$, we see that $0 < \frac{\gamma}{\lceil \gamma/z \rceil - 1} \leq \frac{z\gamma}{\gamma - z} < \mu$ for any $\mu \in (0, p^\star]$. Hence, it follows from Lemma 5 that
\[
\Pr\{\hat{\mu} \leq \mu - \varepsilon_a\} \leq \exp\left( \left( \left\lceil \frac{\gamma}{z} \right\rceil - 1 \right) M_B\left( \frac{\gamma}{\lceil \gamma/z \rceil - 1}, \mu \right) \right) = \exp\left( \gamma\, M_I\left( \frac{\gamma}{\lceil \gamma/z \rceil - 1}, \mu \right) \right) \leq \exp\left( \gamma\, M_I\left( \frac{z\gamma}{\gamma - z}, \mu \right) \right).
\]
Invoking Lemma 37, we have
\[
\Pr\{\hat{\mu} \leq \mu - \varepsilon_a\} \leq \exp\left( \gamma\, M_I\left( \frac{\gamma(p^\star - \varepsilon_a)}{\gamma - (p^\star - \varepsilon_a)}, p^\star \right) \right) = \exp\left( \gamma\, M_I\left( \frac{\gamma(1 - \varepsilon_r)p^\star}{\gamma - (1 - \varepsilon_r)p^\star}, p^\star \right) \right) < \exp\big( \gamma\, M_I((1 - \varepsilon')p^\star, p^\star) \big) = \exp\left( \gamma\, M_I\left( \frac{\gamma(1 - \varepsilon_r)p^\star}{\gamma + \varepsilon_r - 1}, p^\star \right) \right) < \delta/2,
\]
where $\varepsilon'$ is defined in Lemma 34.

For Case (iii), applying Theorem 1 with $z = \mu - \varepsilon_a < \gamma/n$ and by an argument similar to that of Lemma 17, we have $\Pr\{\hat{\mu} \leq \mu - \varepsilon_a\} < \delta/2$.

In summary, we have shown $\Pr\{\hat{\mu} \leq \mu - \varepsilon_a\} < \delta/2$ in all cases. The lemma is thus proved. $\Box$

Finally, the proof of Theorem 13 can be accomplished by a similar argument as that of Theorem 3.

L Proof of Theorem 14

The theorem can be established by showing the following lemmas.

Lemma 39. $\Pr\{\mu \geq \overline{\mu}\} \leq \delta/2$.

Proof. For $\mu > z \geq \gamma/n$, by Theorem 1 and Lemma 5, we have
\[
\Pr\{\hat{\mu} \leq z\} = \Pr\left\{ \sum_{i=1}^{\lceil \gamma/z \rceil - 1} X_i < \gamma \right\} \leq \exp\left( \left( \left\lceil \frac{\gamma}{z} \right\rceil - 1 \right) M_B\left( \frac{\gamma}{\lceil \gamma/z \rceil - 1}, \mu \right) \right) = \exp\left( \gamma\, M_I\left( \frac{\gamma}{\lceil \gamma/z \rceil - 1}, \mu \right) \right) \leq \exp\left( \gamma\, M_I\left( \frac{z\gamma}{\gamma - z}, \mu \right) \right),
\]
where the last inequality is due to $\frac{\gamma}{\lceil \gamma/z \rceil - 1} \leq \frac{z\gamma}{\gamma - z}$ and the fact that $M_I(z, \mu)$ is monotonically increasing with respect to $z \in (0, \mu)$.

For $\mu > z$ and $0 \leq z < \gamma/n$, by Theorem 1 and Lemma 5, we have
\[
\Pr\{\hat{\mu} \leq z\} = \Pr\left\{ \frac{\sum_{i=1}^{n} X_i}{n} \leq z \right\} \leq \exp\big( n\, M_B(z, \mu) \big).
\]
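As a rough numerical check, the exponential bound of the first case can be compared against simulation. The sketch below is not the paper's procedure: it assumes (our reading of Theorem 1, which is not restated in this excerpt) that the estimator equals $\gamma/k$ when the sample sum first reaches $\gamma$ at the $k$-th Bernoulli observation with $k \leq n$, and equals $\sum_{i=1}^{n} X_i / n$ otherwise; the form of $M_I$ used is our reconstruction, consistent with the partial derivatives appearing in Lemma 37. All numeric values are illustrative.

```python
import math
import random

def m_i(w, mu):
    # Assumed form of M_I, consistent with the derivatives used in Lemma 37
    # (M_I is defined earlier in the paper, not in this excerpt).
    return math.log(mu / w) + (1.0 / w - 1.0) * math.log((1.0 - mu) / (1.0 - w))

def mu_hat(mu, gamma, n, rng):
    # Truncated inverse sampling: stop when the sample sum reaches gamma,
    # or after n samples, whichever comes first (our reading of Theorem 1).
    s = 0
    for k in range(1, n + 1):
        s += rng.random() < mu
        if s >= gamma:
            return gamma / k          # stopped by the sum threshold
    return s / n                      # stopped by the truncation at n

rng = random.Random(12345)
mu, gamma, n, z = 0.5, 10, 200, 0.3   # illustrative values; z >= gamma/n, z < mu
trials = 20000
empirical = sum(mu_hat(mu, gamma, n, rng) <= z for _ in range(trials)) / trials
bound = math.exp(gamma * m_i(z * gamma / (gamma - z), mu))
print(empirical, bound)               # the empirical tail should lie below the bound
```

For these values the bound is conservative by roughly an order of magnitude, which is expected: it is a large-deviation (Chernoff-type) bound on the lower tail, not a tight tail probability.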
Therefore, $\Pr\{\hat{\mu} \leq z\} \leq G(z, \mu)$, where
\[
G(z, \mu) =
\begin{cases}
\exp\left( \gamma\, M_I\left( \frac{z\gamma}{\gamma - z}, \mu \right) \right) & \text{for } \frac{\gamma}{n} \leq z < \mu,\\[4pt]
\exp\big( n\, M_B(z, \mu) \big) & \text{for } 0 \leq z < \frac{\gamma}{n},\ z < \mu.
\end{cases}
\]
Let $z^* \in [0, 1]$ be the largest number such that $\Pr\{\hat{\mu} < z^*\} \leq \delta/2$. Then, it must be true that either $\Pr\{\hat{\mu} \leq z^*\} > \delta/2$ or $\Pr\{\hat{\mu} \leq z^*\} = \delta/2$. Observing that $G(z, \mu)$ is monotonically decreasing with respect to $\mu \in (z, 1)$, we have
\[
\{\mu \geq \overline{\mu}\} = \{\mu \geq \overline{\mu} \geq \hat{\mu},\ k < n\} \subseteq \left\{ G(\hat{\mu}, \mu) \leq G(\hat{\mu}, \overline{\mu}) = \delta/2,\ \mu \geq \overline{\mu} \geq \hat{\mu} \right\} \subseteq \left\{ G(\hat{\mu}, \mu) \leq \delta/2,\ \hat{\mu} \leq \mu \right\}.
\]
In the case of $\Pr\{\hat{\mu} \leq z^*\} > \delta/2$, we have $\delta/2 < \Pr\{\hat{\mu} \leq z^*\} \leq G(z^*, \mu)$. Since $G(z, \mu)$ is increasing with respect to $z \in (0, \mu)$, we have
\[
\{\mu \geq \overline{\mu}\} \subseteq \{G(\hat{\mu}, \mu) \leq \delta/2,\ \hat{\mu} \leq \mu\} \subseteq \{G(\hat{\mu}, \mu) < G(z^*, \mu),\ \hat{\mu} \leq \mu\} \subseteq \{\hat{\mu} < z^*\}.
\]
It follows that $\Pr\{\mu \geq \overline{\mu}\} \leq \Pr\{\hat{\mu} < z^*\} \leq \delta/2$.

In the case of $\Pr\{\hat{\mu} \leq z^*\} = \delta/2$, we have $\delta/2 = \Pr\{\hat{\mu} \leq z^*\} \leq G(z^*, \mu)$. Since $G(z, \mu)$ is increasing with respect to $z \in (0, \mu)$, we have
\[
\{\mu \geq \overline{\mu}\} \subseteq \{G(\hat{\mu}, \mu) \leq \delta/2,\ \hat{\mu} \leq \mu\} \subseteq \{G(\hat{\mu}, \mu) \leq G(z^*, \mu),\ \hat{\mu} \leq \mu\} \subseteq \{\hat{\mu} \leq z^*\}.
\]
It follows that $\Pr\{\mu \geq \overline{\mu}\} \leq \Pr\{\hat{\mu} \leq z^*\} = \delta/2$. $\Box$

Lemma 40. $\Pr\{\mu \leq \underline{\mu}\} \leq \delta/2$.

Proof. For $z > \mu$ and $1 \geq z \geq \gamma/n$, by Theorem 1 and Lemma 5, we have
\[
\Pr\{\hat{\mu} \geq z\} = \Pr\left\{ \sum_{i=1}^{\lfloor \gamma/z \rfloor} X_i \geq \gamma \right\} \leq \exp\left( \left\lfloor \frac{\gamma}{z} \right\rfloor M_B\left( \frac{\gamma}{\lfloor \gamma/z \rfloor}, \mu \right) \right) = \exp\left( \gamma\, M_I\left( \frac{\gamma}{\lfloor \gamma/z \rfloor}, \mu \right) \right) \leq \exp\big( \gamma\, M_I(z, \mu) \big),
\]
where the last inequality is due to $\frac{\gamma}{\lfloor \gamma/z \rfloor} \geq z$ and the fact that $M_I(z, \mu)$ is monotonically decreasing with respect to $z \in (\mu, 1)$.

For $\mu < z < \gamma/n$, by Theorem 1 and Lemma 5, we have
\[
\Pr\{\hat{\mu} \geq z\} = \Pr\left\{ \frac{\sum_{i=1}^{n} X_i}{n} \geq z \right\} \leq \exp\big( n\, M_B(z, \mu) \big).
\]
Therefore, $\Pr\{\hat{\mu} \geq z\} \leq H(z, \mu)$, where
\[
H(z, \mu) =
\begin{cases}
\exp\big( \gamma\, M_I(z, \mu) \big) & \text{for } 1 \geq z \geq \frac{\gamma}{n},\ z > \mu,\\[4pt]
\exp\big( n\, M_B(z, \mu) \big) & \text{for } \mu < z < \frac{\gamma}{n}.
\end{cases}
\]
Let $z^* \in [0, 1]$ be the smallest number such that $\Pr\{\hat{\mu} > z^*\} \leq \delta/2$.
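As an aside, the first branch of $H$ gives a concrete way to compute the lower confidence limit: since $\exp(\gamma M_I(\hat{\mu}, \mu))$ is increasing in $\mu$ on $(0, \hat{\mu})$, the limit solving $H(\hat{\mu}, \underline{\mu}) = \delta/2$ can be found by bisection. The sketch below assumes the form $M_I(w, \mu) = \ln(\mu/w) + (1/w - 1)\ln\frac{1-\mu}{1-w}$ (our reconstruction, consistent with the derivatives used in Lemma 37; $M_I$ is defined earlier in the paper), and the values $\hat{\mu} = 0.5$, $\gamma = 10$, $\delta = 0.05$ are illustrative.

```python
import math

def m_i(w, mu):
    # Assumed form of M_I, consistent with the derivatives used in Lemma 37.
    return math.log(mu / w) + (1.0 / w - 1.0) * math.log((1.0 - mu) / (1.0 - w))

def lower_limit(mu_hat, gamma, delta, iters=100):
    # Solve exp(gamma * M_I(mu_hat, mu)) = delta / 2 for mu in (0, mu_hat);
    # the left-hand side is increasing in mu on this interval, so bisect.
    lo, hi = 1e-12, mu_hat
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if math.exp(gamma * m_i(mu_hat, mid)) < delta / 2.0:
            lo = mid                  # value too small: the limit lies above mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

mu_low = lower_limit(mu_hat=0.5, gamma=10, delta=0.05)
print(mu_low)   # roughly 0.222 for these illustrative values
```

For $\hat{\mu} = 0.5$ the equation reduces to $\big(4\mu(1-\mu)\big)^{\gamma} = \delta/2$, which the bisection solves without needing that closed form.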
Then, it must be true that either $\Pr\{\hat{\mu} \geq z^*\} > \delta/2$ or $\Pr\{\hat{\mu} \geq z^*\} = \delta/2$. Observing that $H(z, \mu)$ is monotonically increasing with respect to $\mu \in (0, z)$, we have
\[
\{\mu \leq \underline{\mu}\} = \{\mu \leq \underline{\mu} \leq \hat{\mu},\ k > 0\} \subseteq \left\{ H(\hat{\mu}, \mu) \leq H(\hat{\mu}, \underline{\mu}) = \delta/2,\ \mu \leq \underline{\mu} \leq \hat{\mu} \right\} \subseteq \left\{ H(\hat{\mu}, \mu) \leq \delta/2,\ \hat{\mu} \geq \mu \right\}.
\]
In the case of $\Pr\{\hat{\mu} \geq z^*\} > \delta/2$, we have $\delta/2 < \Pr\{\hat{\mu} \geq z^*\} \leq H(z^*, \mu)$. Since $H(z, \mu)$ is decreasing with respect to $z \in (\mu, 1)$, we have
\[
\{\mu \leq \underline{\mu}\} \subseteq \{H(\hat{\mu}, \mu) \leq \delta/2,\ \hat{\mu} \geq \mu\} \subseteq \{H(\hat{\mu}, \mu) < H(z^*, \mu),\ \hat{\mu} \geq \mu\} \subseteq \{\hat{\mu} > z^*\}.
\]
It follows that $\Pr\{\mu \leq \underline{\mu}\} \leq \Pr\{\hat{\mu} > z^*\} \leq \delta/2$.

In the case of $\Pr\{\hat{\mu} \geq z^*\} = \delta/2$, we have $\delta/2 = \Pr\{\hat{\mu} \geq z^*\} \leq H(z^*, \mu)$. Since $H(z, \mu)$ is decreasing with respect to $z \in (\mu, 1)$, we have
\[
\{\mu \leq \underline{\mu}\} \subseteq \{H(\hat{\mu}, \mu) \leq \delta/2,\ \hat{\mu} \geq \mu\} \subseteq \{H(\hat{\mu}, \mu) \leq H(z^*, \mu),\ \hat{\mu} \geq \mu\} \subseteq \{\hat{\mu} \geq z^*\}.
\]
It follows that $\Pr\{\mu \leq \underline{\mu}\} \leq \Pr\{\hat{\mu} \geq z^*\} = \delta/2$. $\Box$

References

[1] X. Chen, "Inverse sampling for nonasymptotic sequential estimation of bounded variable means," arXiv:0711.2801, November 2007.

[2] X. Chen, "Interval estimation of bounded variable means via inverse sampling," arXiv:0802.3539, February 2008.

[3] J. Cheng, "Sampling algorithms for estimating the mean of bounded variables," Comput. Statist., vol. 16, pp. 1–23, 2001.

[4] C. J. Clopper and E. S. Pearson, "The use of confidence or fiducial limits illustrated in the case of the binomial," Biometrika, vol. 26, pp. 404–413, 1934.

[5] P. Dagum, R. Karp, M. Luby and S. Ross, "An optimal algorithm for Monte Carlo estimation," SIAM J. Comput., vol. 29, pp. 1484–1496, 2000.

[6] F. Garwood, "Fiducial limits for the Poisson distribution," Biometrika, vol. 28, pp. 437–442, 1936.

[7] W. Hoeffding, "Probability inequalities for sums of bounded variables," J. Amer. Statist. Assoc., vol. 58, pp. 13–29, 1963.

[8] J. B. S. Haldane, "A labour-saving method of sampling," Nature, vol. 155, pp. 49–50, January 13, 1945.

[9] J. B. S. Haldane, "On a method of estimating frequencies," Biometrika, vol. 33, pp. 222–225, 1945.

[10] L. Mendo and J. M. Hernando, "Estimation of a probability with optimum guaranteed confidence in inverse binomial sampling," arXiv:0809.2402, September 2008.