Exact Methods for Multistage Estimation of a Binomial Proportion

We first review existing sequential methods for estimating a binomial proportion. Afterward, we propose a new family of group sequential sampling schemes for estimating a binomial proportion with prescribed margin of error and confidence level. In pa…

Authors: Zhengjia Chen, Xinjia Chen

Zhengjia Chen and Xinjia Chen ∗

February 2013

Abstract

We first review existing sequential methods for estimating a binomial proportion. Afterward, we propose a new family of group sequential sampling schemes for estimating a binomial proportion with prescribed margin of error and confidence level. In particular, we establish the uniform controllability of coverage probability and the asymptotic optimality for such a family of sampling schemes. Our theoretical results establish the possibility that the parameters of this family of sampling schemes can be determined so that the prescribed level of confidence is guaranteed with little waste of samples. Analytic bounds for the cumulative distribution functions and expectations of sample numbers are derived. Moreover, we discuss the inherent connection of various sampling schemes. Numerical issues are addressed for improving the accuracy and efficiency of computation. Computational experiments are conducted for comparing sampling schemes. Illustrative examples are given for applications in clinical trials.

1 Introduction

Estimating a binomial proportion is a problem of ubiquitous significance in many areas of engineering and sciences. For economical reasons and other concerns, it is important to use as few samples as possible to guarantee the required reliability of estimation. To achieve this goal, sequential sampling schemes can be very useful. In a sequential sampling scheme, the total number of observations is not fixed in advance. The sampling process is continued stage by stage until a pre-specified stopping rule is satisfied. The stopping rule is evaluated with accumulated observations. In many applications, for administrative feasibility, the sampling experiment is performed in a group fashion.
Similar to group sequential tests [4, Section 8], [26], an estimation method based on taking samples by groups and evaluating them sequentially is referred to as a group sequential estimation method. It should be noted that group sequential estimation methods are general enough to include fixed-sample-size and fully sequential procedures as special cases. Particularly, a fixed-sample-size method can be viewed as a group sequential procedure of only one stage. If the increment between the sample sizes of consecutive stages is equal to 1, then the group sequential method is actually a fully sequential method.

∗ Dr. Zhengjia Chen is working with Department of Biostatistics and Bioinformatics, Emory University, Atlanta, GA 30322; Email: zchen38@emory.edu. Dr. Xinjia Chen is working with Department of Electrical Engineering, Southern University at Baton Rouge, LA 70813; Email: xinjiachen@engr.subr.edu.

It is a common contention that statistical inference, as a unique science to quantify the uncertainties of inferential statements, should avoid errors in the quantification of uncertainties, while minimizing the sampling cost. That is, a statistical inferential method is expected to be exact and efficient. The conventional notion of exactness is that no approximation is involved, except the roundoff error due to finite word length of computers. Existing sequential methods for estimating a binomial proportion are dominantly of asymptotic nature (see, e.g., [5, 23, 24, 27, 32] and the references therein). Undoubtedly, asymptotic techniques provide approximate solutions and important insights for the relevant problems. However, any asymptotic method inevitably introduces unknown error in the resultant approximate solution due to the necessary use of a finite number of samples.
In the direction of non-asymptotic sequential estimation, the primary goal is to ensure that the true coverage probability is above the pre-specified confidence level for any value of the associated parameter, while the required sample size is as low as possible. In this direction, Mendo and Hernando [30] developed an inverse binomial sampling scheme for estimating a binomial proportion with relative precision. Tanaka [33] developed a rigorous method for constructing fixed-width sequential confidence intervals for a binomial proportion. Although no approximation is involved, Tanaka's method is very conservative due to the bounding techniques employed in the derivation of sequential confidence intervals. Franzén [20] studied the construction of fixed-width sequential confidence intervals for a binomial proportion. However, no effective method for defining stopping rules is proposed in [20]. In his later paper [21], Franzén proposed to construct fixed-width confidence intervals based on sequential probability ratio tests (SPRTs) invented by Wald [34]. His method can generate fixed-sample-size confidence intervals based on SPRTs. Unfortunately, he made a fundamental error by assuming that if the width of the fixed-sample-size confidence interval decreases to be smaller than the pre-specified length as the number of samples is increasing, then the fixed-sample-size confidence interval at the termination of the sampling process is the desired fixed-width sequential confidence interval guaranteeing the prescribed confidence level. More recently, Jesse Frey published a paper [22] in The American Statistician (TAS) on the classical problem of sequentially estimating a binomial proportion with prescribed margin of error and confidence level.
Before Frey submitted his original manuscript to TAS in July 2009, a general framework of multistage parameter estimation had been established by Chen [6, 8, 10, 12, 13], which provides exact methods for estimating parameters of common distributions with various error criteria. This framework is also proposed in [14]. The approach of Frey [22] is similar to that of Chen [6, 8, 10, 12, 13] for the specific problem of estimating a binomial proportion with prescribed margin of error and confidence level.

In this paper, our primary interests are in the exact sequential methods for the estimation of a binomial proportion with prescribed margin of error and confidence level. We first introduce the exact approach established in [6, 8, 10, 12, 13]. In particular, we introduce the inclusion principle proposed in [13] and its applications to the construction of concrete stopping rules. We investigate the connection among various stopping rules. Afterward, we propose a new family of stopping rules which are extremely simple and accommodate some existing stopping rules as special cases. We provide rigorous justification for the feasibility and asymptotic optimality of such stopping rules. We prove that the prescribed confidence level can be guaranteed uniformly for all values of a binomial proportion by choosing appropriate parametric values for the stopping rule. We show that as the margin of error tends to zero, the sample size tends to the attainable minimum as if the binomial proportion were exactly known. We derive analytic bounds for distributions and expectations of sample numbers. In addition, we address some critical computational issues and propose methods to improve the accuracy and efficiency of numerical calculation. We conduct extensive numerical experiments to study the performance of various stopping rules.
We determine parametric values for the proposed stopping rules to achieve unprecedented efficiency while guaranteeing prescribed confidence levels. We attempt to make our proposed method as user-friendly as possible so that it can be immediately applicable even for laypersons.

The remainder of the paper is organized as follows. In Section 2, we introduce the exact approach proposed in [6, 8, 10, 12, 13]. In Section 3, we discuss the general principle of constructing stopping rules. In Section 4, we propose a new family of sampling schemes and investigate their feasibility, optimality and analytic bounds of the distribution and expectation of sample numbers. In Section 5, we compare various computational methods. In particular, we illustrate why the natural method of evaluating coverage probability based on gridding parameter space is neither rigorous nor efficient. In Section 6, we present numerical results for various sampling schemes. In Section 7, we illustrate the applications of our group sequential method in clinical trials. Section 8 is the conclusion. The proofs of theorems are given in appendices.

Throughout this paper, we shall use the following notations. The empty set is denoted by $\emptyset$. The set of positive integers is denoted by $\mathbb{N}$. The ceiling function is denoted by $\lceil \cdot \rceil$. The notation $\Pr\{E \mid \theta\}$ denotes the probability of the event $E$ associated with parameter $\theta$. The expectation of a random variable is denoted by $\mathbb{E}[\cdot]$. The standard normal distribution is denoted by $\Phi(\cdot)$. For $\alpha \in (0,1)$, the notation $Z_\alpha$ denotes the critical value such that $\Phi(Z_\alpha) = 1 - \alpha$. For $n \in \mathbb{N}$, in the case that $X_1, \cdots, X_n$ are i.i.d. samples of $X$, we denote the sample mean $\frac{\sum_{i=1}^n X_i}{n}$ by $\overline{X}_n$, which is also called the relative frequency when $X$ is a Bernoulli random variable. The other notations will be made clear as we proceed.
2 How Can It Be Exact?

In many areas of scientific investigation, the outcome of an experiment is of dichotomous nature and can be modeled as a Bernoulli random variable $X$, defined in probability space $(\Omega, \Pr, \mathscr{F})$, such that

$$\Pr\{X = 1\} = 1 - \Pr\{X = 0\} = p \in (0,1),$$

where $p$ is referred to as a binomial proportion. In general, there is no analytical method for evaluating the binomial proportion $p$. A frequently-used approach is to estimate $p$ based on i.i.d. samples $X_1, X_2, \cdots$ of $X$. To reduce the sampling cost, it is appropriate to estimate $p$ by a multistage sampling procedure. More formally, let $\varepsilon \in (0,1)$ and $1 - \delta$, with $\delta \in (0,1)$, be the pre-specified margin of error and confidence level respectively. The objective is to construct a sequential estimator $\hat{p}$ for $p$ based on a multistage sampling scheme such that

$$\Pr\{|\hat{p} - p| < \varepsilon \mid p\} \geq 1 - \delta \tag{1}$$

for any $p \in (0,1)$. Throughout this paper, the probability $\Pr\{|\hat{p} - p| < \varepsilon \mid p\}$ is referred to as the coverage probability. Accordingly, the probability $\Pr\{|\hat{p} - p| \geq \varepsilon \mid p\}$ is referred to as the complementary coverage probability. Clearly, a complete construction of a multistage estimation scheme needs to determine the number of stages, the sample sizes for all stages, the stopping rule, and the estimator for $p$. Throughout this paper, we let $s$ denote the number of stages and let $n_\ell$ denote the number of samples at the $\ell$-th stage. That is, the sampling process consists of $s$ stages with sample sizes $n_1 < n_2 < \cdots < n_s$. For $\ell = 1, 2, \cdots, s$, define $K_\ell = \sum_{i=1}^{n_\ell} X_i$ and $\hat{p}_\ell = \frac{K_\ell}{n_\ell}$. The stopping rule is to be defined in terms of $\hat{p}_\ell$, $\ell = 1, \cdots, s$. Of course, the index of stage at the termination of the sampling process, denoted by $\mathbf{l}$, is a random number.
Accordingly, the number of samples at the termination of the experiment, denoted by $\mathbf{n}$, is a random number which equals $n_{\mathbf{l}}$. Since for each $\ell$, $\hat{p}_\ell$ is a maximum-likelihood and minimum-variance unbiased estimator of $p$, the sequential estimator for $p$ is taken as

$$\hat{p} = \hat{p}_{\mathbf{l}} = \frac{\sum_{i=1}^{n_{\mathbf{l}}} X_i}{n_{\mathbf{l}}} = \frac{\sum_{i=1}^{\mathbf{n}} X_i}{\mathbf{n}}. \tag{2}$$

In the above discussion, we have outlined the general characteristics of a multistage sampling scheme for estimating a binomial proportion. It remains to determine the number of stages, the sample sizes for all stages, and the stopping rule so that the resultant estimator $\hat{p}$ satisfies (1) for any $p \in (0,1)$.

Actually, the problem of sequential estimation of a binomial proportion has been treated by Chen [6, 8, 10, 12, 13] in a general framework of multistage parameter estimation. The techniques of [6, 8, 10, 12, 13] are sufficient to offer exact solutions for a wide range of sequential estimation problems, including the estimation of a binomial proportion as a special case. The central idea of the approach in [6, 8, 10, 12, 13] is the control of coverage probability by a single parameter $\zeta$, referred to as the coverage tuning parameter, and the adaptive rigorous checking of coverage guarantee by virtue of bounds of coverage probabilities. It is recognized in [6, 8, 10, 12, 13] that, due to the discontinuity of the coverage probability on parameter space, the conventional method of evaluating the coverage probability for a finite number of parameter values is neither rigorous nor computationally efficient for checking the coverage probability guarantee.

As mentioned in the introduction, Frey published an article [22] in TAS on the sequential estimation of a binomial proportion with prescribed margin of error and confidence level.
For clarity of presentation, the comparison of the works of Chen and Frey is given in Section 5.4. In the remainder of this section, we shall only introduce the idea and techniques of [6, 8, 10, 12, 13], which had been developed by Chen before Frey submitted his original manuscript to TAS in July 2009. We will introduce the approach of [6, 8, 10, 12, 13] with a focus on the special problem of estimating a binomial proportion with prescribed margin of error and confidence level.

2.1 Four Components Suffice

The exact methods of [6, 8, 10, 12, 13] for multistage parameter estimation have four main components as follows:

(I) Stopping rules parameterized by the coverage tuning parameter $\zeta > 0$ such that the associated coverage probabilities can be made arbitrarily close to 1 by choosing $\zeta > 0$ to be a sufficiently small number.

(II) Recursively computable lower and upper bounds for the complementary coverage probability for a given $\zeta$ and an interval of parameter values.

(III) Adapted Branch and Bound Algorithm.

(IV) Bisection coverage tuning.

Without looking at the technical details, one can see that these four components are sufficient for constructing a sequential estimator so that the prescribed confidence level is guaranteed. The reason is as follows: As lower and upper bounds for the complementary coverage probability are available, the global optimization technique, Branch and Bound (B&B) Algorithm [28], can be used to compute exactly the maximum of complementary coverage probability on the whole parameter space. Thus, it is possible to check rigorously whether the coverage probability associated with a given $\zeta$ is no less than the pre-specified confidence level.
Since the coverage probability can be controlled by $\zeta$, it is possible to determine $\zeta$ as large as possible to guarantee the desired confidence level by a bisection search. This process is referred to as bisection coverage tuning in [6, 8, 10, 12, 13]. Since a critical subroutine needed for bisection coverage tuning is to check whether the coverage probability is no less than the pre-specified confidence level, it is not necessary to compute exactly the maximum of the complementary coverage probability. Therefore, Chen revised the standard B&B algorithm to reduce the computational complexity and called the improved algorithm the Adapted B&B Algorithm. The idea is to adaptively partition the parameter space into many subintervals. If for all subintervals, the upper bounds of the complementary coverage probability are no greater than $\delta$, then declare that the coverage probability is guaranteed. If there exists a subinterval for which the lower bound of the complementary coverage probability is greater than $\delta$, then declare that the coverage probability is not guaranteed. Continue partitioning the parameter space if no decision can be made. The four components are illustrated in the sequel under the headings of stopping rules, interval bounding, adapted branch and bound, and bisection coverage tuning.

2.2 Stopping Rules

The first component for the exact sequential estimation of a binomial proportion is the stopping rule for constructing a sequential estimator such that the coverage probability can be controlled by the coverage tuning parameter $\zeta$.
For convenience of describing some concrete stopping rules, define

$$M(z, \theta) = \begin{cases} z \ln \frac{\theta}{z} + (1 - z) \ln \frac{1 - \theta}{1 - z} & \text{for } z \in (0,1) \text{ and } \theta \in (0,1), \\ \ln(1 - \theta) & \text{for } z = 0 \text{ and } \theta \in (0,1), \\ \ln \theta & \text{for } z = 1 \text{ and } \theta \in (0,1), \\ -\infty & \text{for } z \in [0,1] \text{ and } \theta \notin (0,1) \end{cases}$$

and

$$S(k, l, n, p) = \begin{cases} \sum_{i=k}^{l} \binom{n}{i} p^i (1 - p)^{n-i} & \text{for } p \in (0,1), \\ 0 & \text{for } p \notin (0,1) \end{cases}$$

where $k$ and $l$ are integers such that $0 \leq k \leq l \leq n$. Assume that $0 < \zeta\delta < 1$. For the purpose of controlling the coverage probability $\Pr\{|\hat{p} - p| < \varepsilon \mid p\}$ by the coverage tuning parameter, Chen has proposed four stopping rules as follows:

Stopping Rule A: Continue sampling until $M\left(\frac{1}{2} - \left|\frac{1}{2} - \hat{p}_\ell\right|, \; \frac{1}{2} - \left|\frac{1}{2} - \hat{p}_\ell\right| + \varepsilon\right) \leq \frac{\ln(\zeta\delta)}{n_\ell}$ for some $\ell \in \{1, \cdots, s\}$.

Stopping Rule B: Continue sampling until $\left(\left|\hat{p}_\ell - \frac{1}{2}\right| - \frac{2}{3}\varepsilon\right)^2 \geq \frac{1}{4} + \frac{\varepsilon^2 n_\ell}{2\ln(\zeta\delta)}$ for some $\ell \in \{1, \cdots, s\}$.

Stopping Rule C: Continue sampling until $S(K_\ell, n_\ell, n_\ell, \hat{p}_\ell - \varepsilon) \leq \zeta\delta$ and $S(0, K_\ell, n_\ell, \hat{p}_\ell + \varepsilon) \leq \zeta\delta$ for some $\ell \in \{1, \cdots, s\}$.

Stopping Rule D: Continue sampling until $n_\ell \geq \hat{p}_\ell (1 - \hat{p}_\ell) \frac{2}{\varepsilon^2} \ln \frac{1}{\zeta\delta}$ for some $\ell \in \{1, \cdots, s\}$.

Stopping Rule A was first proposed in [6, Theorem 7] and restated in [8, Theorem 16]. Stopping Rule B was first proposed in [10, Theorem 1] and represented as the third stopping rule in [9, Section 4.1.1]. Stopping Rule C originated from [12, Theorem 1] and was restated as the first stopping rule in [9, Section 4.1.1]. Stopping Rule D was described in the remarks following Theorem 7 of [7]. All these stopping rules can be derived from the general principles proposed in [13, Section 3] and [14, Section 2.4].

Given that a stopping rule can be expressed in terms of $\hat{p}_\ell$ and $n_\ell$ for $\ell = 1, \cdots, s$, it is possible to find a bivariate function $\mathscr{D}(\cdot, \cdot)$ on $\{(z, n) : z \in [0,1], \; n \in \mathbb{N}\}$, taking values from $\{0, 1\}$, such that the stopping rule can be stated as: Continue sampling until $\mathscr{D}(\hat{p}_\ell, n_\ell) = 1$ for some $\ell \in \{1, \cdots, s\}$. It can be checked that such a representation applies to Stopping Rules A, B, C, and D. For example, Stopping Rule B can be expressed in this way by virtue of the function $\mathscr{D}(\cdot, \cdot)$ such that

$$\mathscr{D}(z, n) = \begin{cases} 1 & \text{if } \left(\left|z - \frac{1}{2}\right| - \frac{2}{3}\varepsilon\right)^2 \geq \frac{1}{4} + \frac{\varepsilon^2 n}{2\ln(\zeta\delta)}, \\ 0 & \text{otherwise.} \end{cases}$$

The motivation of introducing the function $\mathscr{D}(\cdot, \cdot)$ is to parameterize the stopping rule in terms of design parameters. The function $\mathscr{D}(\cdot, \cdot)$ determines the form of the stopping rule and consequently, the sample sizes for all stages can be chosen as functions of design parameters. Specifically, let

$$N_{\min} = \min\left\{ n \in \mathbb{N} : \mathscr{D}\left(\tfrac{k}{n}, n\right) = 1 \text{ for some nonnegative integer } k \text{ not exceeding } n \right\}, \tag{3}$$

$$N_{\max} = \min\left\{ n \in \mathbb{N} : \mathscr{D}\left(\tfrac{k}{n}, n\right) = 1 \text{ for all nonnegative integers } k \text{ not exceeding } n \right\}. \tag{4}$$

To avoid unnecessary checking of the stopping criterion and thus reduce administrative cost, there should be a possibility that the sampling process is terminated at the first stage. Hence, the minimum sample size $n_1$ should be chosen to ensure that $\{\mathbf{n} = n_1\} \neq \emptyset$. This implies that the sample size $n_1$ for the first stage can be taken as $N_{\min}$. On the other hand, since the sampling process must be terminated at or before the $s$-th stage, the maximum sample size $n_s$ should be chosen to guarantee that $\{\mathbf{n} > n_s\} = \emptyset$. This implies that the sample size $n_s$ for the last stage can be taken as $N_{\max}$. If the number of stages $s$ is given, then the sample sizes for stages in between 1 and $s$ can be chosen as $s - 2$ integers between $N_{\min}$ and $N_{\max}$. Specifically, if the group sizes are expected to be approximately equal, then the sample sizes can be taken as

$$n_\ell = \left\lceil N_{\min} + \frac{\ell - 1}{s - 1} (N_{\max} - N_{\min}) \right\rceil, \qquad \ell = 1, \cdots, s. \tag{5}$$

Since the stopping rule is associated with the coverage tuning parameter $\zeta$, it follows that the number of stages $s$ and the sample sizes $n_1, n_2, \cdots, n_s$ can be expressed as functions of $\zeta$. In this sense, it can be said that the stopping rule is parameterized by the coverage tuning parameter $\zeta$. The above method of parameterizing stopping rules has been used in [6, 8, 10, 12] and proposed in [9, Section 2.1, page 9].

2.3 Interval Bounding

The second component for the exact sequential estimation of a binomial proportion is the method of bounding the complementary coverage probability $\Pr\{|\hat{p} - p| \geq \varepsilon \mid p\}$ for $p$ in an interval $[a, b]$ contained in the interval $(0,1)$. Applying Theorem 8 of [8] to the special case of a Bernoulli distribution immediately yields

$$\Pr\{\hat{p} \leq a - \varepsilon \mid b\} + \Pr\{\hat{p} \geq b + \varepsilon \mid a\} \leq \Pr\{|\hat{p} - p| \geq \varepsilon \mid p\} \leq \Pr\{\hat{p} \leq b - \varepsilon \mid a\} + \Pr\{\hat{p} \geq a + \varepsilon \mid b\} \tag{6}$$

for all $p \in [a, b] \subseteq (0,1)$. The bounds of (6) can be shown as follows: Note that

$$\Pr\{\hat{p} \leq a - \varepsilon \mid p\} + \Pr\{\hat{p} \geq b + \varepsilon \mid p\} \leq \Pr\{|\hat{p} - p| \geq \varepsilon \mid p\} = \Pr\{\hat{p} \leq p - \varepsilon \mid p\} + \Pr\{\hat{p} \geq p + \varepsilon \mid p\} \leq \Pr\{\hat{p} \leq b - \varepsilon \mid p\} + \Pr\{\hat{p} \geq a + \varepsilon \mid p\}$$

for $p \in [a, b] \subseteq (0,1)$. As a consequence of the monotonicity of $\Pr\{\hat{p} \geq \vartheta \mid p\}$ and $\Pr\{\hat{p} \leq \vartheta \mid p\}$ with respect to $p$, where $\vartheta$ is a real number independent of $p$, the lower and upper bounds of $\Pr\{|\hat{p} - p| \geq \varepsilon \mid p\}$ for $p \in [a, b] \subseteq (0,1)$ can be given as $\Pr\{\hat{p} \leq a - \varepsilon \mid b\} + \Pr\{\hat{p} \geq b + \varepsilon \mid a\}$ and $\Pr\{\hat{p} \leq b - \varepsilon \mid a\} + \Pr\{\hat{p} \geq a + \varepsilon \mid b\}$ respectively. In page 15, equation (1) of [8], Chen proposed to apply the recursive method of Schultz [31, Section 2] to compute the lower and upper bounds of $\Pr\{|\hat{p} - p| \geq \varepsilon \mid p\}$ given by (6).
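The quantities of Sections 2.2 and 2.3 can be sketched in a few lines of Python. The block below is an illustrative sketch, not the authors' implementation: it uses Stopping Rule B as the example rule, a brute-force search for $N_{\min}$ and $N_{\max}$, and a plain forward recursion over the pairs $(n_\ell, K_\ell)$ in place of the recursive method of Schultz [31]. All function names and the values $\varepsilon = 0.1$, $\delta = 0.05$, $\zeta = 1$, $s = 5$ are assumptions chosen for illustration; $\zeta = 1$ is not claimed to guarantee the confidence level.

```python
import math

def rule_b(z, n, eps, delta, zeta):
    """D(z, n) for Stopping Rule B; True means 'stop'. Requires 0 < zeta*delta < 1."""
    lhs = (abs(z - 0.5) - 2.0 * eps / 3.0) ** 2
    rhs = 0.25 + eps ** 2 * n / (2.0 * math.log(zeta * delta))
    return lhs >= rhs

def stage_sizes(stop, s):
    """Sample sizes n_1 < ... < n_s from (3), (4) and (5); assumes s >= 2."""
    n = 1
    while not any(stop(k / n, n) for k in range(n + 1)):
        n += 1
    n_min = n                      # (3): some k triggers the rule
    while not all(stop(k / n, n) for k in range(n + 1)):
        n += 1
    n_max = n                      # (4): every k triggers the rule
    # Integer ceiling of N_min + (l-1)/(s-1) * (N_max - N_min), l = 1..s.
    return [-(-(n_min * (s - 1) + j * (n_max - n_min)) // (s - 1)) for j in range(s)]

def terminal_dist(sizes, stop, p):
    """Map (n, k) -> probability that sampling stops with n samples and K = k."""
    out, reach, prev = {}, {0: 1.0}, 0
    for idx, n in enumerate(sizes):
        m = n - prev               # number of new samples drawn at this stage
        pmf = [math.comb(m, j) * p ** j * (1 - p) ** (m - j) for j in range(m + 1)]
        nxt = {}
        for k, w in reach.items():
            for j, q in enumerate(pmf):
                nxt[k + j] = nxt.get(k + j, 0.0) + w * q
        reach, last = {}, idx == len(sizes) - 1
        for k, w in nxt.items():
            if last or stop(k / n, n):
                out[(n, k)] = w    # sampling terminates here
            else:
                reach[k] = w       # sampling continues to the next stage
        prev = n
    return out

def prob(dist, event):
    """Probability that the final estimate k/n satisfies `event`."""
    return sum(w for (n, k), w in dist.items() if event(k / n))

def complementary_bounds(sizes, stop, a, b, eps):
    """Lower and upper bounds (6) on Pr{|p_hat - p| >= eps | p} for p in [a, b]."""
    da, db = terminal_dist(sizes, stop, a), terminal_dist(sizes, stop, b)
    lower = prob(db, lambda z: z <= a - eps) + prob(da, lambda z: z >= b + eps)
    upper = prob(da, lambda z: z <= b - eps) + prob(db, lambda z: z >= a + eps)
    return lower, upper

eps, delta, zeta = 0.1, 0.05, 1.0          # illustrative values only
stop = lambda z, n: rule_b(z, n, eps, delta, zeta)
sizes = stage_sizes(stop, s=5)
lo, hi = complementary_bounds(sizes, stop, 0.30, 0.32, eps)
print(sizes, lo, hi)
```

The brute-force search in `stage_sizes` costs on the order of $N_{\max}^2$ rule evaluations, which is acceptable for a sketch but would be replaced by a closed-form or smarter search for small $\varepsilon$.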
It should be pointed out that such lower and upper bounds of $\Pr\{|\hat{p} - p| \geq \varepsilon \mid p\}$ can also be computed by the recursive path-counting method of Franzén [20, page 49].

2.4 Adapted Branch and Bound

The third component for the exact sequential estimation of a binomial proportion is the Adapted B&B Algorithm, which was proposed in [8, Section 2.8], for quick determination of whether the coverage probability is no less than $1 - \delta$ for any value of the associated parameter. Such a task of checking the coverage probability is also referred to as checking the coverage probability guarantee. Given that lower and upper bounds of the complementary coverage probability on an interval of parameter values can be obtained by the interval bounding techniques, this task can be accomplished by applying the B&B Algorithm [28] to compute exactly the maximum of the complementary coverage probability on the parameter space. However, in our applications, it suffices to determine whether the maximum of the complementary coverage probability $\Pr\{|\hat{p} - p| \geq \varepsilon \mid p\}$ with respect to $p \in (0,1)$ is greater than the confidence parameter $\delta$. For fast checking of whether the maximal complementary coverage probability exceeds $\delta$, Chen proposed to reduce the computational complexity by revising the standard B&B Algorithm as the Adapted B&B Algorithm in [8, Section 2.8]. To describe this algorithm, let $\mathcal{I}_{\mathrm{init}}$ denote the parameter space $(0,1)$. For an interval $\mathcal{I} \subseteq \mathcal{I}_{\mathrm{init}}$, let $\max \Psi(\mathcal{I})$ denote the maximum of the complementary coverage probability $\Pr\{|\hat{p} - p| \geq \varepsilon \mid p\}$ with respect to $p \in \mathcal{I}$. Let $\Psi_{\mathrm{lb}}(\mathcal{I})$ and $\Psi_{\mathrm{ub}}(\mathcal{I})$ be respectively the lower and upper bounds of $\Psi(\mathcal{I})$, which can be obtained by the interval bounding techniques introduced in Section 2.3. Let $\eta > 0$ be a pre-specified tolerance, which is much smaller than $\delta$.
The Adapted B&B Algorithm of [8] is represented with a slight modification as follows.

∇ Let $k \leftarrow 0$, $l_0 \leftarrow \Psi_{\mathrm{lb}}(\mathcal{I}_{\mathrm{init}})$ and $u_0 \leftarrow \Psi_{\mathrm{ub}}(\mathcal{I}_{\mathrm{init}})$.
∇ Let $S_0 \leftarrow \{\mathcal{I}_{\mathrm{init}}\}$ if $u_0 > \delta$. Otherwise, let $S_0$ be empty.
∇ While $S_k$ is nonempty, $l_k < \delta$ and $u_k$ is greater than $\max\{l_k + \eta, \delta\}$, do the following:
  ⋄ Split each interval in $S_k$ as two new intervals of equal length. Let $S_k$ denote the set of all new intervals obtained from this splitting procedure.
  ⋄ Eliminate any interval $\mathcal{I}$ from $S_k$ such that $\Psi_{\mathrm{ub}}(\mathcal{I}) \leq \delta$.
  ⋄ Let $S_{k+1}$ be the set $S_k$ processed by the above elimination procedure.
  ⋄ Let $l_{k+1} \leftarrow \max_{\mathcal{I} \in S_{k+1}} \Psi_{\mathrm{lb}}(\mathcal{I})$ and $u_{k+1} \leftarrow \max_{\mathcal{I} \in S_{k+1}} \Psi_{\mathrm{ub}}(\mathcal{I})$. Let $k \leftarrow k + 1$.
∇ If $S_k$ is empty and $l_k < \delta$, then declare $\max \Psi(\mathcal{I}_{\mathrm{init}}) \leq \delta$. Otherwise, declare $\max \Psi(\mathcal{I}_{\mathrm{init}}) > \delta$.

It should be noted that for a sampling scheme of symmetrical stopping boundary, the initial interval $\mathcal{I}_{\mathrm{init}}$ may be taken as $(0, \frac{1}{2})$ for the sake of efficiency. In Section 5.1, we will illustrate why the Adapted B&B Algorithm is superior to the direct evaluation based on gridding parameter space. As will be seen in Section 5.2, the objective of the Adapted B&B Algorithm can also be accomplished by the Adaptive Maximum Checking Algorithm due to Chen [9, Section 3.3] and rediscovered by Frey in the second revision of his manuscript submitted to TAS in April 2010 [22, Appendix]. An explanation is given in Section 5.3 for the advantage of working with the complementary coverage probability.

2.5 Bisection Coverage Tuning

The fourth component for the exact sequential estimation of a binomial proportion is Bisection Coverage Tuning.
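Because bisection coverage tuning repeatedly calls the coverage check of Section 2.4, the two pieces can be sketched together. The Python block below is a minimal sketch under stated assumptions, not the implementation of [8]: `adapted_bnb` follows the listed steps for user-supplied interval bound functions, `tune_zeta` brackets $\zeta$ by powers of 2 around a starting value and then bisects, and `toy_bounds` is a purely hypothetical stand-in for the bounds (6) so that the sketch runs on its own. The `max_depth` guard and the assumption that the check fails for all sufficiently large $\zeta$ are additions made for this illustration.

```python
def adapted_bnb(psi_lb, psi_ub, delta, eta, init=(0.0, 1.0), max_depth=40):
    """Return True if 'max Psi <= delta' can be declared, False otherwise.

    psi_lb(a, b) and psi_ub(a, b) must bound Psi(p) for every p in [a, b].
    """
    l, u = psi_lb(*init), psi_ub(*init)
    active = [init] if u > delta else []
    depth = 0
    while active and l < delta and u > max(l + eta, delta):
        if depth >= max_depth:         # guard added for this sketch
            return False               # conservative: guarantee not declared
        halves = []
        for a, b in active:            # split every interval into two halves
            mid = 0.5 * (a + b)
            halves += [(a, mid), (mid, b)]
        # Eliminate intervals whose upper bound is already <= delta.
        active = [iv for iv in halves if psi_ub(*iv) > delta]
        if active:
            l = max(psi_lb(*iv) for iv in active)
            u = max(psi_ub(*iv) for iv in active)
        depth += 1
    return (not active) and l < delta

def tune_zeta(check, zeta0, iters=30):
    """Largest zeta (up to bisection accuracy) with check(zeta) True.

    Assumes check(zeta) is False for all sufficiently large zeta and True for
    all sufficiently small zeta; the returned value always satisfies check.
    """
    i = 0
    if check(zeta0):
        while check(zeta0 * 2.0 ** (i + 1)):
            i += 1
    else:
        i = -1
        while not check(zeta0 * 2.0 ** i):
            i -= 1
    lo, hi = zeta0 * 2.0 ** i, zeta0 * 2.0 ** (i + 1)   # check(lo) True, check(hi) False
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if check(mid):
            lo = mid
        else:
            hi = mid
    return lo

def toy_bounds(c):
    """Hypothetical bounds for the toy function psi(p) = 4*c*p*(1-p), maximum c at p = 1/2."""
    psi = lambda p: 4.0 * c * p * (1.0 - p)
    lb = lambda a, b: min(psi(a), psi(b))   # concave: minimum at an endpoint
    ub = lambda a, b: c if a <= 0.5 <= b else max(psi(a), psi(b))
    return lb, ub

print(adapted_bnb(*toy_bounds(0.04), delta=0.05, eta=1e-4))
print(adapted_bnb(*toy_bounds(0.06), delta=0.05, eta=1e-4))
print(tune_zeta(lambda z: z <= 3.3, zeta0=1.0))
```

In an actual application, `psi_lb` and `psi_ub` would be the bounds (6) evaluated for the sampling scheme determined by $\zeta$, and `check` would be the call to `adapted_bnb` for that scheme.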
Based on the adaptive rigorous checking of coverage probability, Chen proposed in [6, Section 2.7] and [8, Section 2.6] to apply a bisection search method to determine the maximal $\zeta$ such that the coverage probability is no less than $1 - \delta$ for any value of the associated parameter. Moreover, Chen has developed asymptotic results in [8, page 21, Theorem 18] for determining the initial interval of $\zeta$ needed for the bisection search. Specifically, if the complementary coverage probability $\Pr\{|\hat{p} - p| \geq \varepsilon \mid p\}$ associated with $\zeta = \zeta_0$ tends to $\delta$ as $\varepsilon \to 0$, then the initial interval of $\zeta$ can be taken as $[\zeta_0 2^i, \zeta_0 2^{i+1}]$, where $i$ is the largest integer such that the complementary coverage probability associated with $\zeta = \zeta_0 2^i$ is no greater than $\delta$ for all $p \in (0,1)$. By virtue of a bisection search, it is possible to obtain $\zeta^* \in [\zeta_0 2^i, \zeta_0 2^{i+1}]$ such that the complementary coverage probability associated with $\zeta = \zeta^*$ is guaranteed to be no greater than $\delta$ for all $p \in (0,1)$.

3 Principle of Constructing Stopping Rules

In this section, we shall illustrate the inherent connection between various stopping rules. It will be demonstrated that many stopping rules can be derived by virtue of the inclusion principle proposed by Chen [13, Section 3].

3.1 Inclusion Principle

The problem of estimating a binomial proportion can be considered as a special case of parameter estimation for a random variable $X$ parameterized by $\theta \in \Theta$, where the objective is to construct a sequential estimator $\hat{\theta}$ for $\theta$ such that $\Pr\{|\hat{\theta} - \theta| < \varepsilon \mid \theta\} \geq 1 - \delta$ for any $\theta \in \Theta$. Assume that the sampling process consists of $s$ stages with sample sizes $n_1 < n_2 < \cdots < n_s$. For $\ell = 1, \cdots, s$, define an estimator $\hat{\theta}_\ell$ for $\theta$ in terms of samples $X_1, \cdots, X_{n_\ell}$ of $X$.
Let $[L_\ell, U_\ell]$, $\ell = 1, 2, \cdots, s$ be a sequence of confidence intervals such that for any $\ell$, $[L_\ell, U_\ell]$ is defined in terms of $X_1, \cdots, X_{n_\ell}$ and that the coverage probability $\Pr\{L_\ell \leq \theta \leq U_\ell \mid \theta\}$ can be made arbitrarily close to 1 by choosing $\zeta > 0$ to be a sufficiently small number. In Theorem 2 of [13], Chen proposed the following general stopping rule:

$$\text{Continue sampling until } U_\ell - \varepsilon \leq \hat{\theta}_\ell \leq L_\ell + \varepsilon \text{ for some } \ell \in \{1, \cdots, s\}. \tag{7}$$

At the termination of the sampling process, a sequential estimator for $\theta$ is taken as $\hat{\theta} = \hat{\theta}_{\mathbf{l}}$, where $\mathbf{l}$ is the index of stage at the termination of the sampling process.

Clearly, the general stopping rule (7) can be restated as follows: Continue sampling until the confidence interval $[L_\ell, U_\ell]$ is included by the interval $[\hat{\theta}_\ell - \varepsilon, \hat{\theta}_\ell + \varepsilon]$ for some $\ell \in \{1, \cdots, s\}$.

The sequence of confidence intervals is parameterized by $\zeta$ for the purpose of controlling the coverage probability $\Pr\{|\hat{\theta} - \theta| < \varepsilon \mid \theta\}$. Due to the inclusion relationship $[L_\ell, U_\ell] \subseteq [\hat{\theta}_\ell - \varepsilon, \hat{\theta}_\ell + \varepsilon]$, such a general methodology of using a sequence of confidence intervals to construct a stopping rule for controlling the coverage probability is referred to as the inclusion principle. It is asserted by Theorem 2 of [13] that

$$\Pr\{|\hat{\theta} - \theta| < \varepsilon \mid \theta\} \geq 1 - s\zeta\delta \qquad \forall \theta \in \Theta \tag{8}$$

provided that $\Pr\{L_\ell < \theta < U_\ell \mid \theta\} \geq 1 - \zeta\delta$ for $\ell = 1, \cdots, s$ and $\theta \in \Theta$. This demonstrates that if the number of stages $s$ is bounded with respect to $\zeta$, then the coverage probability $\Pr\{|\hat{\theta} - \theta| < \varepsilon \mid \theta\}$ associated with the stopping rule derived from the inclusion principle can be controlled by $\zeta$.
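A bound of the form (8) can be motivated by a short union-bound argument. The following is only a sketch under the assumptions stated above, with the additional assumption that sampling terminates at a stage $\mathbf{l}$ at which rule (7) holds; it is not the proof given in [13]. On the event $\{L_{\mathbf{l}} < \theta < U_{\mathbf{l}}\}$, rule (7) gives

$$\hat{\theta} - \varepsilon \leq L_{\mathbf{l}} < \theta < U_{\mathbf{l}} \leq \hat{\theta} + \varepsilon,$$

so that $|\hat{\theta} - \theta| < \varepsilon$. Consequently,

$$\Pr\{|\hat{\theta} - \theta| \geq \varepsilon \mid \theta\} \leq \Pr\{\theta \notin (L_{\mathbf{l}}, U_{\mathbf{l}}) \mid \theta\} \leq \sum_{\ell=1}^{s} \Pr\{\theta \notin (L_\ell, U_\ell) \mid \theta\} \leq s\zeta\delta.$$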
Actually, before explicitly proposing the inclusion principle in [13], Chen had extensively applied the inclusion principle in [6, 8, 10, 12] to construct stopping rules for estimating parameters of various distributions such as binomial, Poisson, geometric, hypergeometric, normal distributions, etc. A more general version of the inclusion principle is proposed in [14, Section 2.4]. For simplicity of the stopping rule, Chen had made an effort to eliminate the computation of confidence limits.

In the context of estimating a binomial proportion $p$, the inclusion principle immediately leads to the following general stopping rule:

$$\text{Continue sampling until } \hat{p}_\ell - \varepsilon \leq L_\ell \leq U_\ell \leq \hat{p}_\ell + \varepsilon \text{ for some } \ell \in \{1, \cdots, s\}. \tag{9}$$

Consequently, the sequential estimator for $p$ is taken as $\hat{p}$ according to (2). It should be pointed out that the stopping rule (9) had been rediscovered by Frey in Section 2, the 1st paragraph of [22]. The four stopping rules considered in his paper follow immediately from applying various confidence intervals to the general stopping rule (9). In the sequel, we will illustrate how to apply (9) to the derivation of Stopping Rules A, B, C, D introduced in Section 2.2 and other specific stopping rules.

3.2 Stopping Rule from Wald Intervals

By virtue of Wald's method of interval estimation for a binomial proportion $p$, a sequence of confidence intervals $[L_\ell, U_\ell]$, $\ell = 1, \cdots, s$ for $p$ can be constructed such that

$$L_\ell = \hat{p}_\ell - Z_{\zeta\delta} \sqrt{\frac{\hat{p}_\ell (1 - \hat{p}_\ell)}{n_\ell}}, \qquad U_\ell = \hat{p}_\ell + Z_{\zeta\delta} \sqrt{\frac{\hat{p}_\ell (1 - \hat{p}_\ell)}{n_\ell}}, \qquad \ell = 1, \cdots, s$$

and that $\Pr\{L_\ell \leq p \leq U_\ell \mid p\} \approx 1 - 2\zeta\delta$ for $\ell = 1, \cdots, s$ and $p \in (0,1)$. Note that, for $\ell = 1, \cdots, s$, the event $\{\hat{p}_\ell - \varepsilon \leq L_\ell \leq U_\ell \leq \hat{p}_\ell + \varepsilon\}$ is the same as the event $\left\{ \left(\hat{p}_\ell - \frac{1}{2}\right)^2 \geq \frac{1}{4} - n_\ell \left(\frac{\varepsilon}{Z_{\zeta\delta}}\right)^2 \right\}$.
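So, applying this sequence of confidence intervals to (9) results in the stopping rule "continue sampling until $\left( \widehat{p}_\ell - \frac{1}{2} \right)^2 \ge \frac{1}{4} - n_\ell \left( \frac{\varepsilon}{\mathcal{Z}_{\zeta\delta}} \right)^2$ for some $\ell \in \{1, \cdots, s\}$". Since for any $\zeta \in (0, \frac{1}{\delta})$, there exists a unique number $\zeta' \in (0, \frac{1}{\delta})$ such that $\mathcal{Z}_{\zeta\delta} = \sqrt{2 \ln \frac{1}{\zeta' \delta}}$, this stopping rule is equivalent (after relabeling $\zeta'$ as $\zeta$) to "Continue sampling until $\left( \widehat{p}_\ell - \frac{1}{2} \right)^2 \ge \frac{1}{4} + \frac{\varepsilon^2 n_\ell}{2 \ln(\zeta\delta)}$ for some $\ell \in \{1, \cdots, s\}$." This stopping rule is actually the same as Stopping Rule D, since

\[
\left\{ \left( \widehat{p}_\ell - \frac{1}{2} \right)^2 \ge \frac{1}{4} + \frac{\varepsilon^2 n_\ell}{2 \ln(\zeta\delta)} \right\}
= \left\{ n_\ell \ge \frac{2\, \widehat{p}_\ell (1 - \widehat{p}_\ell)}{\varepsilon^2} \ln \frac{1}{\zeta\delta} \right\}
\]

for $\ell \in \{1, \cdots, s\}$.

3.3 Stopping Rule from Revised Wald Intervals

Define $\widetilde{p}_\ell = \frac{n_\ell \widehat{p}_\ell + a}{n_\ell + 2a}$ for $\ell = 1, \cdots, s$, where $a$ is a positive number. Inspired by Wald's method of interval estimation for $p$, a sequence of confidence intervals $[L_\ell, U_\ell]$, $\ell = 1, \cdots, s$ can be constructed such that

\[
L_\ell = \widehat{p}_\ell - \mathcal{Z}_{\zeta\delta} \sqrt{\frac{\widetilde{p}_\ell (1 - \widetilde{p}_\ell)}{n_\ell}}, \qquad
U_\ell = \widehat{p}_\ell + \mathcal{Z}_{\zeta\delta} \sqrt{\frac{\widetilde{p}_\ell (1 - \widetilde{p}_\ell)}{n_\ell}}
\]

and that $\Pr\{L_\ell \le p \le U_\ell \mid p\} \approx 1 - 2\zeta\delta$ for $\ell = 1, \cdots, s$ and $p \in (0, 1)$. This sequence of confidence intervals was applied by Frey [22] to the general stopping rule (9). As a matter of fact, the idea of revising the Wald interval

\[
\left[ \overline{X}_n - \mathcal{Z}_{\zeta\delta} \sqrt{\frac{\overline{X}_n (1 - \overline{X}_n)}{n}}, \; \overline{X}_n + \mathcal{Z}_{\zeta\delta} \sqrt{\frac{\overline{X}_n (1 - \overline{X}_n)}{n}} \right]
\]

by replacing the relative frequency $\overline{X}_n = \frac{\sum_{i=1}^n X_i}{n}$ involved in the confidence limits with $\widetilde{p}_a = \frac{n \overline{X}_n + a}{n + 2a}$ had been proposed by H. Chen [3, Section 4]. As can be seen from Section 2, page 243, of Frey [22], applying (9) with the sequence of revised Wald intervals yields the stopping rule "Continue sampling until $\left( \widetilde{p}_\ell - \frac{1}{2} \right)^2 \ge \frac{1}{4} + \frac{\varepsilon^2 n_\ell}{2 \ln(\zeta\delta)}$ for some $\ell \in \{1, \cdots, s\}$." Clearly, replacing $\widehat{p}_\ell$ in Stopping Rule D with $\widetilde{p}_\ell = \frac{a + n_\ell \widehat{p}_\ell}{n_\ell + 2a}$ also leads to this stopping rule.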
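The equivalence used in Sections 3.2 and 3.3 can be checked numerically. The following is a minimal sketch rather than the authors' implementation: the function names are hypothetical, and `z` stands for the critical value $\mathcal{Z}_{\zeta\delta}$, which is simply passed in as a number.

```python
import math


def wald_inclusion_stop(p_hat, n, eps, z):
    """Direct test: is the Wald interval inside [p_hat - eps, p_hat + eps]?"""
    half_width = z * math.sqrt(p_hat * (1.0 - p_hat) / n)
    return half_width <= eps


def wald_closed_form_stop(p_hat, n, eps, z):
    """Equivalent test (p_hat - 1/2)^2 >= 1/4 - n * (eps / z)^2."""
    return (p_hat - 0.5) ** 2 >= 0.25 - n * (eps / z) ** 2


def revised_wald_stop(k, n, eps, z, a):
    """Same closed form with p_hat replaced by (k + a) / (n + 2a).

    k is the number of successes among n observations; a > 0 is the
    revision constant (its value here is the caller's choice).
    """
    p_tilde = (k + a) / (n + 2.0 * a)
    return wald_closed_form_stop(p_tilde, n, eps, z)
```

With zero successes out of 10 observations, the plain Wald interval has zero width, so the plain rule stops immediately, whereas the revised rule with, say, a = 2 (an arbitrary illustrative value) does not.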
3.4 Stopping Rule from Wilson's Confidence Intervals

Making use of the interval estimation method of Wilson [35], one can obtain a sequence of confidence intervals $[L_\ell, U_\ell]$, $\ell = 1, \cdots, s$ for $p$ such that

\[
L_\ell = \max \left\{ 0, \; \frac{ \widehat{p}_\ell + \frac{\mathcal{Z}_{\zeta\delta}^2}{2 n_\ell} - \mathcal{Z}_{\zeta\delta} \sqrt{ \frac{\widehat{p}_\ell (1 - \widehat{p}_\ell)}{n_\ell} + \left( \frac{\mathcal{Z}_{\zeta\delta}}{2 n_\ell} \right)^2 } }{ 1 + \frac{\mathcal{Z}_{\zeta\delta}^2}{n_\ell} } \right\}, \qquad
U_\ell = \min \left\{ 1, \; \frac{ \widehat{p}_\ell + \frac{\mathcal{Z}_{\zeta\delta}^2}{2 n_\ell} + \mathcal{Z}_{\zeta\delta} \sqrt{ \frac{\widehat{p}_\ell (1 - \widehat{p}_\ell)}{n_\ell} + \left( \frac{\mathcal{Z}_{\zeta\delta}}{2 n_\ell} \right)^2 } }{ 1 + \frac{\mathcal{Z}_{\zeta\delta}^2}{n_\ell} } \right\}
\]

and that $\Pr\{L_\ell \le p \le U_\ell \mid p\} \approx 1 - 2\zeta\delta$ for $\ell = 1, \cdots, s$ and $p \in (0, 1)$. It should be pointed out that the sequence of Wilson's confidence intervals has been applied by Frey [22, Section 2, page 243] to the general stopping rule (9) for estimating a binomial proportion.

Since a stopping rule that directly involves the sequence of Wilson's confidence intervals is cumbersome, it is desirable to eliminate the computation of Wilson's confidence intervals in the stopping rule. For this purpose, we need to use the following result.

Theorem 1 Assume that $0 < \zeta\delta < 1$ and $0 < \varepsilon < \frac{1}{2}$. Then, Wilson's confidence intervals satisfy

\[
\{ \widehat{p}_\ell - \varepsilon \le L_\ell \le U_\ell \le \widehat{p}_\ell + \varepsilon \}
= \left\{ \left( \left| \widehat{p}_\ell - \frac{1}{2} \right| - \varepsilon \right)^2 \ge \frac{1}{4} - n_\ell \left( \frac{\varepsilon}{\mathcal{Z}_{\zeta\delta}} \right)^2 \right\}
\]

for $\ell = 1, \cdots, s$.

See Appendix A for a proof. As a consequence of Theorem 1 and the fact that for any $\zeta \in (0, \frac{1}{\delta})$, there exists a unique number $\zeta' \in (0, \frac{1}{\delta})$ such that $\mathcal{Z}_{\zeta\delta} = \sqrt{2 \ln \frac{1}{\zeta' \delta}}$, applying the sequence of Wilson's confidence intervals to (9) leads to the following stopping rule: Continue sampling until

\[
\left( \left| \widehat{p}_\ell - \frac{1}{2} \right| - \varepsilon \right)^2 \ge \frac{1}{4} + \frac{\varepsilon^2 n_\ell}{2 \ln(\zeta\delta)} \tag{10}
\]

for some $\ell \in \{1, \cdots, s\}$.
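Theorem 1 can be sanity-checked numerically by comparing the direct inclusion test on Wilson's limits with the closed-form condition. The sketch below is illustrative only: the function names are hypothetical, and `z` denotes the critical value $\mathcal{Z}_{\zeta\delta}$ supplied as a plain number.

```python
import math


def wilson_limits(p_hat, n, z):
    """Wilson confidence limits, clipped to [0, 1] as in the text."""
    centre = p_hat + z * z / (2.0 * n)
    spread = z * math.sqrt(p_hat * (1.0 - p_hat) / n + (z / (2.0 * n)) ** 2)
    scale = 1.0 + z * z / n
    lower = max(0.0, (centre - spread) / scale)
    upper = min(1.0, (centre + spread) / scale)
    return lower, upper


def wilson_inclusion_stop(p_hat, n, eps, z):
    """Direct test of p_hat - eps <= L <= U <= p_hat + eps."""
    lower, upper = wilson_limits(p_hat, n, z)
    return p_hat - eps <= lower and upper <= p_hat + eps


def wilson_closed_form_stop(p_hat, n, eps, z):
    """Closed form of Theorem 1: (|p_hat - 1/2| - eps)^2 >= 1/4 - n (eps/z)^2."""
    return (abs(p_hat - 0.5) - eps) ** 2 >= 0.25 - n * (eps / z) ** 2
```

For eps = 0.1 and z = 1.96, both tests agree at every attainable sample mean for the sample sizes tried in the check below; the closed form needs no square root and no clipping.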
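3.5 Stopping Rule from Clopper-Pearson Confidence Intervals

Applying the interval estimation method of Clopper-Pearson [17], a sequence of confidence intervals $[L_\ell, U_\ell]$, $\ell = 1, \cdots, s$ for $p$ can be obtained such that $\Pr\{L_\ell \le p \le U_\ell \mid p\} \ge 1 - 2\zeta\delta$ for $\ell = 1, \cdots, s$ and $p \in (0, 1)$, where the upper confidence limit $U_\ell$ satisfies the equation $S(0, K_\ell, n_\ell, U_\ell) = \zeta\delta$ if $K_\ell < n_\ell$; and the lower confidence limit $L_\ell$ satisfies the equation $S(K_\ell, n_\ell, n_\ell, L_\ell) = \zeta\delta$ if $K_\ell > 0$. The well known equation (10.8) in [19, page 173] implies that $S(0, k, n, p)$, with $0 \le k < n$, is decreasing with respect to $p \in (0, 1)$ and that $S(k, n, n, p)$, with $0 < k \le n$, is increasing with respect to $p \in (0, 1)$. It follows that

\[
\begin{aligned}
\{ \widehat{p}_\ell - \varepsilon \le L_\ell \}
&= \{ 0 < \widehat{p}_\ell - \varepsilon \le L_\ell \} \cup \{ \widehat{p}_\ell \le \varepsilon \} \\
&= \{ \widehat{p}_\ell > \varepsilon, \; S(K_\ell, n_\ell, n_\ell, \widehat{p}_\ell - \varepsilon) \le \zeta\delta \} \cup \{ \widehat{p}_\ell \le \varepsilon \} \\
&= \{ \widehat{p}_\ell > \varepsilon, \; S(K_\ell, n_\ell, n_\ell, \widehat{p}_\ell - \varepsilon) \le \zeta\delta \} \cup \{ \widehat{p}_\ell \le \varepsilon, \; S(K_\ell, n_\ell, n_\ell, \widehat{p}_\ell - \varepsilon) \le \zeta\delta \} \\
&= \{ S(K_\ell, n_\ell, n_\ell, \widehat{p}_\ell - \varepsilon) \le \zeta\delta \}
\end{aligned}
\]

and

\[
\begin{aligned}
\{ \widehat{p}_\ell + \varepsilon \ge U_\ell \}
&= \{ 1 > \widehat{p}_\ell + \varepsilon \ge U_\ell \} \cup \{ \widehat{p}_\ell \ge 1 - \varepsilon \} \\
&= \{ \widehat{p}_\ell < 1 - \varepsilon, \; S(0, K_\ell, n_\ell, \widehat{p}_\ell + \varepsilon) \le \zeta\delta \} \cup \{ \widehat{p}_\ell \ge 1 - \varepsilon \} \\
&= \{ \widehat{p}_\ell < 1 - \varepsilon, \; S(0, K_\ell, n_\ell, \widehat{p}_\ell + \varepsilon) \le \zeta\delta \} \cup \{ \widehat{p}_\ell \ge 1 - \varepsilon, \; S(0, K_\ell, n_\ell, \widehat{p}_\ell + \varepsilon) \le \zeta\delta \} \\
&= \{ S(0, K_\ell, n_\ell, \widehat{p}_\ell + \varepsilon) \le \zeta\delta \}
\end{aligned}
\]

for $\ell = 1, \cdots, s$. Consequently,

\[
\{ \widehat{p}_\ell - \varepsilon \le L_\ell \le U_\ell \le \widehat{p}_\ell + \varepsilon \}
= \{ S(K_\ell, n_\ell, n_\ell, \widehat{p}_\ell - \varepsilon) \le \zeta\delta, \; S(0, K_\ell, n_\ell, \widehat{p}_\ell + \varepsilon) \le \zeta\delta \}
\]

for $\ell = 1, \cdots, s$. This demonstrates that applying the sequence of Clopper-Pearson confidence intervals to the general stopping rule (9) gives Stopping Rule C. It should be pointed out that Stopping Rule C was rediscovered by J. Frey as the third stopping rule in Section 2, page 243 of his paper [22].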
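The practical benefit of the reduction above is that Stopping Rule C can be evaluated from two binomial tail sums, without solving for the Clopper-Pearson limits. The sketch below assumes that $S(k, l, n, p)$ denotes $\sum_{i=k}^{l} \binom{n}{i} p^i (1-p)^{n-i}$, as defined earlier in the paper, and treats a tail condition as satisfied when its argument leaves $(0, 1)$, mirroring the case split in the derivation. The function names are hypothetical.

```python
import math


def binom_sum(k_lo, k_hi, n, p):
    """Assumed meaning of S(k_lo, k_hi, n, p): binomial mass on k_lo..k_hi."""
    return sum(math.comb(n, i) * p ** i * (1.0 - p) ** (n - i)
               for i in range(k_lo, k_hi + 1))


def rule_c_stop(k, n, eps, zeta_delta):
    """Stopping Rule C at a stage with k successes out of n observations."""
    p_hat = k / n
    # Lower side: automatically satisfied when p_hat - eps <= 0.
    lower_ok = (p_hat - eps <= 0.0
                or binom_sum(k, n, n, p_hat - eps) <= zeta_delta)
    # Upper side: automatically satisfied when p_hat + eps >= 1.
    upper_ok = (p_hat + eps >= 1.0
                or binom_sum(0, k, n, p_hat + eps) <= zeta_delta)
    return lower_ok and upper_ok
```

For example, with eps = 0.1 and zeta * delta = 0.025, the rule stops at 100 successes out of 200 observations but not at 25 successes out of 50.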
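3.6 Stopping Rule from Fishman's Confidence Intervals

By the interval estimation method of Fishman [18], a sequence of confidence intervals $[L_\ell, U_\ell]$, $\ell = 1, \cdots, s$ for $p$ can be obtained such that

\[
L_\ell = \begin{cases}
0 & \text{if } \widehat{p}_\ell = 0, \\
\left\{ \theta_\ell \in (0, \widehat{p}_\ell) : \mathcal{M}(\widehat{p}_\ell, \theta_\ell) = \frac{\ln(\zeta\delta)}{n_\ell} \right\} & \text{if } \widehat{p}_\ell > 0
\end{cases}
\qquad
U_\ell = \begin{cases}
1 & \text{if } \widehat{p}_\ell = 1, \\
\left\{ \theta_\ell \in (\widehat{p}_\ell, 1) : \mathcal{M}(\widehat{p}_\ell, \theta_\ell) = \frac{\ln(\zeta\delta)}{n_\ell} \right\} & \text{if } \widehat{p}_\ell < 1
\end{cases}
\]

Under the assumption that $0 < \zeta\delta < 1$ and $0 < \varepsilon < \frac{1}{2}$, by techniques similar to those in the proof of Theorem 7 of [7], it can be shown that

\[
\{ \widehat{p}_\ell - \varepsilon \le L_\ell \le U_\ell \le \widehat{p}_\ell + \varepsilon \}
= \left\{ \mathcal{M}\left( \frac{1}{2} - \left| \frac{1}{2} - \widehat{p}_\ell \right|, \; \frac{1}{2} - \left| \frac{1}{2} - \widehat{p}_\ell \right| + \varepsilon \right) \le \frac{\ln(\zeta\delta)}{n_\ell} \right\}
\]

for $\ell = 1, \cdots, s$. Therefore, applying the sequence of confidence intervals of Fishman to the general stopping rule (9) gives Stopping Rule A. It should be noted that Fishman's confidence intervals are actually derived from the Chernoff bounds on the tail probabilities of the sample mean of Bernoulli random variables. Hence, Stopping Rule A is also referred to as the stopping rule from Chernoff bounds in this paper.

3.7 Stopping Rule from Confidence Intervals of Chen et al.

Using the interval estimation method of Chen et al. [16], a sequence of confidence intervals $[L_\ell, U_\ell]$, $\ell = 1, \cdots, s$ for $p$ can be obtained such that

\[
L_\ell = \max \left\{ 0, \; \widehat{p}_\ell + \frac{3}{4} \, \frac{ 1 - 2 \widehat{p}_\ell - \sqrt{ 1 + \frac{9 n_\ell}{2 \ln \frac{1}{\zeta\delta}} \, \widehat{p}_\ell (1 - \widehat{p}_\ell) } }{ 1 + \frac{9 n_\ell}{8 \ln \frac{1}{\zeta\delta}} } \right\}, \qquad
U_\ell = \min \left\{ 1, \; \widehat{p}_\ell + \frac{3}{4} \, \frac{ 1 - 2 \widehat{p}_\ell + \sqrt{ 1 + \frac{9 n_\ell}{2 \ln \frac{1}{\zeta\delta}} \, \widehat{p}_\ell (1 - \widehat{p}_\ell) } }{ 1 + \frac{9 n_\ell}{8 \ln \frac{1}{\zeta\delta}} } \right\}
\]

and that $\Pr\{L_\ell \le p \le U_\ell \mid p\} \ge 1 - 2\zeta\delta$ for $\ell = 1, \cdots, s$ and $p \in (0, 1)$. Under the assumption that $0 < \zeta\delta < 1$ and $0 < \varepsilon < \frac{1}{2}$, by techniques similar to those in the proof of Theorem 1 of [11], it can be shown that

\[
\{ \widehat{p}_\ell - \varepsilon \le L_\ell \le U_\ell \le \widehat{p}_\ell + \varepsilon \}
= \left\{ \left( \left| \widehat{p}_\ell - \frac{1}{2} \right| - \frac{2}{3} \varepsilon \right)^2 \ge \frac{1}{4} + \frac{\varepsilon^2 n_\ell}{2 \ln(\zeta\delta)} \right\}
\]

for $\ell = 1, \cdots, s$.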
This implies that applying the sequence of confidence intervals of Chen et al. to the general stopping rule (9) leads to Stopping Rule B. Actually, the confidence intervals of Chen et al. [16] are derived from Massart's inequality [29] on the tail probabilities of the sample mean of Bernoulli random variables. For this reason, Stopping Rule B is also referred to as the stopping rule from Massart's inequality in [9, Section 4.1.1].

4 Double-Parabolic Sequential Estimation

From Sections 2.2, 3.2 and 3.7, it can be seen that, by introducing a new parameter $\rho \in [0, 1]$ and letting $\rho$ take values $\frac{2}{3}$ and $0$ respectively, Stopping Rules B and D can be accommodated as special cases of the following general stopping rule: Continue the sampling process until

\[
\left( \left| \widehat{p}_\ell - \frac{1}{2} \right| - \rho\varepsilon \right)^2 \ge \frac{1}{4} + \frac{\varepsilon^2 n_\ell}{2 \ln(\zeta\delta)} \tag{11}
\]

for some $\ell \in \{1, 2, \cdots, s\}$, where $\zeta \in (0, \frac{1}{\delta})$. Moreover, as can be seen from (10), the stopping rule derived from applying Wilson's confidence intervals to (9) can also be viewed as a special case of this general stopping rule with $\rho = 1$.

From the stopping condition (11), it can be seen that the stopping boundary is associated with the double-parabolic function

\[
f(x) = \frac{2}{\varepsilon^2} \ln \frac{1}{\zeta\delta} \left[ \frac{1}{4} - \left( \left| x - \frac{1}{2} \right| - \rho\varepsilon \right)^2 \right]
\]

such that $x$ and $f(x)$ correspond to the sample mean and sample size respectively. For $\varepsilon = 0.1$, $\delta = 0.05$ and $\zeta = 1$, stopping boundaries with various $\rho$ are shown in Figure 1.

[Figure 1: Double-parabolic sampling. Horizontal axis: $x$ (sample mean); vertical axis: $f(x)$ (sample size); curves shown for $\rho = 1$ and $\rho = 2/3$.]

For fixed $\varepsilon$ and $\delta$, the parameters $\rho$ and $\zeta$ affect the shape of the stopping boundary as follows. As $\rho$ increases, the span of the stopping boundary increases along the axis of sample mean. By decreasing $\zeta$, the stopping boundary can be dragged toward the direction of increasing sample size.
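As a small illustration of condition (11) and the boundary function $f(x)$, the sketch below evaluates both for given design parameters. The function names are hypothetical, and the code assumes $0 < \zeta\delta < 1$ so that $\ln(\zeta\delta)$ is negative.

```python
import math


def boundary(x, eps, delta, zeta, rho):
    """Double-parabolic boundary f(x): sample size at which a sample mean x
    first satisfies the stopping condition (11). Assumes 0 < zeta*delta < 1."""
    log_term = math.log(1.0 / (zeta * delta))
    return (2.0 / eps ** 2) * log_term * (0.25 - (abs(x - 0.5) - rho * eps) ** 2)


def should_stop(p_hat, n, eps, delta, zeta, rho):
    """Stopping condition (11) for sample mean p_hat at sample size n."""
    lhs = (abs(p_hat - 0.5) - rho * eps) ** 2
    rhs = 0.25 + eps ** 2 * n / (2.0 * math.log(zeta * delta))
    return lhs >= rhs
```

With the parameters of Figure 1 (eps = 0.1, delta = 0.05, zeta = 1) and rho = 1, the boundary at x = 0.5 is about 143.8, so a sample mean of 0.5 triggers stopping at n = 144 but not at n = 143; the peak value 50 ln 20 (about 149.8) is attained at x = 0.4 and x = 0.6.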
Hence, the parameter $\rho$ is referred to as the dilation coefficient. The parameter $\zeta$ is referred to as the coverage tuning parameter. Since the stopping boundary consists of two parabolas, this approach of estimating a binomial proportion is referred to as the double-parabolic sequential estimation method.

4.1 Parametrization of the Sampling Scheme

In this section, we shall parameterize the double-parabolic sequential sampling scheme by the method described in Section 2.2. From the stopping condition (11), the stopping rule can be restated as: Continue sampling until $D(\widehat{p}_\ell, n_\ell) = 1$ for some $\ell \in \{1, \cdots, s\}$, where the function $D(\cdot, \cdot)$ is defined by

\[
D(z, n) = \begin{cases}
1 & \text{if } \left( \left| z - \frac{1}{2} \right| - \rho\varepsilon \right)^2 \ge \frac{1}{4} + \frac{\varepsilon^2 n}{2 \ln(\zeta\delta)}, \\
0 & \text{otherwise.}
\end{cases} \tag{12}
\]

Clearly, the function $D(\cdot, \cdot)$ associated with the double-parabolic sequential sampling scheme depends on the design parameters $\rho$, $\zeta$, $\varepsilon$ and $\delta$. Applying the function $D(\cdot, \cdot)$ defined by (12) to (3) yields

\[
N_{\min} = \min \left\{ n \in \mathbb{N} : \left( \left| \frac{k}{n} - \frac{1}{2} \right| - \rho\varepsilon \right)^2 \ge \frac{1}{4} + \frac{\varepsilon^2 n}{2 \ln(\zeta\delta)} \text{ for some nonnegative integer } k \text{ not exceeding } n \right\}. \tag{13}
\]

Since $\varepsilon$ is usually small in practical applications, we restrict $\varepsilon$ to satisfy $0 < \rho\varepsilon \le \frac{1}{4}$. As a consequence of $0 \le \rho\varepsilon \le \frac{1}{4}$ and the fact that $\left| z - \frac{1}{2} \right| \le \frac{1}{2}$ for any $z \in [0, 1]$, it must be true that $\left( \left| z - \frac{1}{2} \right| - \rho\varepsilon \right)^2 \le \left( \frac{1}{2} - \rho\varepsilon \right)^2$ for any $z \in [0, 1]$. It follows from (13) that $\left( \frac{1}{2} - \rho\varepsilon \right)^2 \ge \frac{1}{4} + \frac{\varepsilon^2 N_{\min}}{2 \ln(\zeta\delta)}$, which implies that the minimum sample size can be taken as

\[
N_{\min} = \left\lceil 2 \rho \left( \frac{1}{\varepsilon} - \rho \right) \ln \frac{1}{\zeta\delta} \right\rceil. \tag{14}
\]

On the other hand, applying the function $D(\cdot, \cdot)$ defined by (12) to (4) gives

\[
N_{\max} = \min \left\{ n \in \mathbb{N} : \left( \left| \frac{k}{n} - \frac{1}{2} \right| - \rho\varepsilon \right)^2 \ge \frac{1}{4} + \frac{\varepsilon^2 n}{2 \ln(\zeta\delta)} \text{ for all nonnegative integers } k \text{ not exceeding } n \right\}. \tag{15}
\]

Since $\left( \left| z - \frac{1}{2} \right| - \rho\varepsilon \right)^2 \ge 0$ for any $z \in [0, 1]$, it follows from (15) that $\frac{1}{4} + \frac{\varepsilon^2 N_{\max}}{2 \ln(\zeta\delta)} \le 0$, which implies that the maximum sample size can be taken as

\[
N_{\max} = \left\lceil \frac{1}{2 \varepsilon^2} \ln \frac{1}{\zeta\delta} \right\rceil. \tag{16}
\]

Therefore, the sample sizes $n_1, \cdots, n_s$ can be chosen as functions of $\rho$, $\zeta$, $\varepsilon$ and $\delta$ which satisfy the following constraint:

\[
N_{\min} \le n_1 < \cdots < n_{s-1} < N_{\max} \le n_s. \tag{17}
\]

In particular, if the number of stages $s$ is given and the group sizes are expected to be approximately equal, then the sample sizes $n_1, \cdots, n_s$ for all stages can be obtained by substituting $N_{\min}$ defined by (14) and $N_{\max}$ defined by (16) into (5). For example, if the values of the design parameters are $\varepsilon = 0.05$, $\delta = 0.05$, $\rho = \frac{3}{4}$, $\zeta = 2.6759$ and $s = 7$, then the sample sizes of this sampling scheme are calculated as

\[
n_1 = 59, \; n_2 = 116, \; n_3 = 173, \; n_4 = 231, \; n_5 = 288, \; n_6 = 345, \; n_7 = 403.
\]

The stopping rule is completely determined by substituting the values of the design parameters into (11).

4.2 Uniform Controllability of Coverage Probability

Clearly, for pre-specified $\varepsilon$, $\delta$ and $\rho$, the coverage probability $\Pr\{|\widehat{\boldsymbol{p}} - p| < \varepsilon \mid p\}$ depends on the parameter $\zeta$, the number of stages $s$, and the sample sizes $n_1, \cdots, n_s$. As illustrated in Section 4.1, the number of stages $s$ and the sample sizes $n_1, \cdots, n_s$ can be defined as functions of $\zeta \in (0, \frac{1}{\delta})$. That is, the stopping rule can be parameterized by $\zeta$. Accordingly, for any $p \in (0, 1)$, the coverage probability $\Pr\{|\widehat{\boldsymbol{p}} - p| < \varepsilon \mid p\}$ becomes a function of $\zeta$. The following theorem shows that it suffices to choose $\zeta \in (0, \frac{1}{\delta})$ small enough to guarantee the pre-specified confidence level.

Theorem 2 Let $\varepsilon, \delta \in (0, 1)$ and $\rho \in (0, 1]$ be fixed. Assume that the number of stages $s$ and the sample sizes $n_1, \cdots, n_s$ are functions of $\zeta \in (0, \frac{1}{\delta})$ such that the constraint (17) is satisfied.
Then, $\Pr\{|\widehat{\boldsymbol{p}} - p| < \varepsilon \mid p\}$ is no less than $1 - \delta$ for any $p \in (0, 1)$ provided that

\[
0 < \zeta \le \frac{1}{\delta} \exp \left( \frac{ \ln \frac{\delta}{2} + \ln \left[ 1 - \exp(-2\varepsilon^2) \right] }{ 4 \varepsilon \rho (1 - \rho\varepsilon) } \right).
\]

See Appendix B for a proof. For Theorem 2 to be valid, the choice of sample sizes is very flexible. Specifically, the sample sizes can form an arithmetic progression, a geometric progression, or any other sequence, as long as the constraint (17) is satisfied. It can be seen that for the coverage probability to be uniformly controllable, the dilation coefficient $\rho$ must be greater than 0.

Theorem 2 asserts that there exists $\zeta > 0$ such that the coverage probability is no less than $1 - \delta$, regardless of the associated binomial proportion $p$. For the purpose of reducing sampling cost, we want to have a value of $\zeta$ as large as possible such that the pre-specified confidence level is guaranteed for any $p \in (0, 1)$. This can be accomplished by the technical components introduced in Sections 2.1, 2.3, 2.4 and 2.5. Clearly, for every value of $\rho$, we can obtain a corresponding value of $\zeta$ (as large as possible) to ensure the desired confidence level. However, the performance of the resultant stopping rules differs. Therefore, we can try a number of values of $\rho$ and pick the best resultant stopping rule for practical use.

4.3 Asymptotic Optimality of Sampling Schemes

Now we shall provide an important reason why we propose a sampling scheme of this structure by showing its asymptotic optimality. Since the performance of a group sampling scheme will be close to that of its fully sequential counterpart, we investigate the optimality of the fully sequential sampling scheme. In this scenario, the sample sizes $n_1, n_2, \cdots, n_s$ are consecutive integers such that

\[
\left\lceil 2 \rho \left( \frac{1}{\varepsilon} - \rho \right) \ln \frac{1}{\zeta\delta} \right\rceil = n_1 < n_2 < \cdots < n_{s-1} < n_s = \left\lceil \frac{1}{2 \varepsilon^2} \ln \frac{1}{\zeta\delta} \right\rceil. \tag{18}
\]

The fully sequential sampling scheme can be viewed as a special case of a group sampling scheme of $s = n_s - n_1 + 1$ stages and group size 1. Clearly, if $\delta$, $\zeta$ and $\rho$ are fixed, the sampling scheme is dependent only on $\varepsilon$. Hence, for any $p \in (0, 1)$, if we allow $\varepsilon$ to vary in $(0, 1)$, then the coverage probability $\Pr\{|\widehat{\boldsymbol{p}} - p| < \varepsilon \mid p\}$ and the average sample number $\mathbb{E}[\mathbf{n}]$ are functions of $\varepsilon$. We are interested in knowing the asymptotic behavior of these functions as $\varepsilon \to 0$, since $\varepsilon$ is usually small in practical situations. The following theorem provides the desired insights.

Theorem 3 Assume that $\delta \in (0, 1)$, $\zeta \in (0, \frac{1}{\delta})$ and $\rho \in (0, 1]$ are fixed. Define

\[
\mathcal{N}(p, \varepsilon, \delta, \zeta) = \frac{2 p (1 - p) \ln \frac{1}{\zeta\delta}}{\varepsilon^2}
\]

for $p \in (0, 1)$ and $\varepsilon \in (0, 1)$. Then,

\[
\Pr \left\{ \lim_{\varepsilon \to 0} \frac{\mathbf{n}}{\mathcal{N}(p, \varepsilon, \delta, \zeta)} = 1 \;\middle|\; p \right\} = 1,
\]

\[
\lim_{\varepsilon \to 0} \Pr\{|\widehat{\boldsymbol{p}} - p| < \varepsilon \mid p\} = 2 \Phi \left( \sqrt{2 \ln \frac{1}{\zeta\delta}} \right) - 1, \tag{19}
\]

\[
\lim_{\varepsilon \to 0} \frac{\mathbb{E}[\mathbf{n}]}{\mathcal{N}(p, \varepsilon, \delta, \zeta)} = 1 \tag{20}
\]

for any $p \in (0, 1)$.

See Appendix C for a proof. From (19), it can be seen that $\lim_{\varepsilon \to 0} \Pr\{|\widehat{\boldsymbol{p}} - p| < \varepsilon \mid p\} = 1 - \delta$ for any $p \in (0, 1)$ if $\zeta = \frac{1}{\delta} \exp\left( -\frac{1}{2} \mathcal{Z}_{\delta/2}^2 \right)$. Such a value can be taken as an initial value for the coverage tuning parameter $\zeta$. In addition to providing guidance on the coverage tuning techniques, Theorem 3 also establishes the optimality of the sampling scheme. To see this, let $\mathcal{N}(p, \varepsilon, \delta)$ denote the minimum sample size $n$ required for a fixed-sample-size procedure to guarantee that $\Pr\{|\overline{X}_n - p| < \varepsilon \mid p\} \ge 1 - \delta$ for any $p \in (0, 1)$, where $\overline{X}_n = \frac{\sum_{i=1}^n X_i}{n}$. It is well known from the central limit theorem that

\[
\lim_{\varepsilon \to 0} \frac{\mathcal{N}(p, \varepsilon, \delta)}{p (1 - p) \left( \frac{\mathcal{Z}_{\delta/2}}{\varepsilon} \right)^2} = 1. \tag{21}
\]

Making use of (20), (21) and letting $\zeta = \frac{1}{\delta} \exp\left( -\frac{1}{2} \mathcal{Z}_{\delta/2}^2 \right)$, we have $\lim_{\varepsilon \to 0} \frac{\mathcal{N}(p, \varepsilon, \delta)}{\mathcal{N}(p, \varepsilon, \delta, \zeta)} = 1$ for $p \in (0, 1)$ and $\delta \in (0, 1)$, which implies the asymptotic optimality of the double-parabolic sampling scheme.
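To give a feel for the two values of $\zeta$ discussed in Sections 4.2 and 4.3, the sketch below computes the asymptotic initial value $\frac{1}{\delta} \exp(-\frac{1}{2}\mathcal{Z}_{\delta/2}^2)$ from Theorem 3 and the sufficient bound from Theorem 2. It assumes that $\mathcal{Z}_{\alpha}$ is the upper-$\alpha$ critical value of the standard normal distribution; the function names are hypothetical.

```python
import math
from statistics import NormalDist


def zeta_asymptotic(delta):
    """Initial value zeta = exp(-Z_{delta/2}^2 / 2) / delta (Theorem 3).
    Assumes Z_alpha is the upper-alpha standard normal critical value."""
    z = NormalDist().inv_cdf(1.0 - delta / 2.0)
    return math.exp(-0.5 * z * z) / delta


def limiting_coverage(zeta, delta):
    """Right-hand side of (19): 2 * Phi(sqrt(2 ln(1/(zeta*delta)))) - 1."""
    arg = math.sqrt(2.0 * math.log(1.0 / (zeta * delta)))
    return 2.0 * NormalDist().cdf(arg) - 1.0


def zeta_theorem2(eps, delta, rho):
    """Upper end of the range of zeta that Theorem 2 shows to be sufficient."""
    numerator = math.log(delta / 2.0) + math.log(1.0 - math.exp(-2.0 * eps ** 2))
    return math.exp(numerator / (4.0 * eps * rho * (1.0 - rho * eps))) / delta
```

For delta = 0.05 the asymptotic initial value is roughly 2.93, while the Theorem 2 bound for eps = 0.1 and rho = 0.75 is many orders of magnitude smaller. This gap is consistent with the text's point that the guaranteed bound is very conservative and that a value of $\zeta$ as large as possible is better found computationally.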
By virtue of (20), an approximate formula for computing the average sample number is given as

\[
\mathbb{E}[\mathbf{n}] \approx \mathcal{N}(p, \varepsilon, \delta, \zeta) = \frac{2 p (1 - p) \ln \frac{1}{\zeta\delta}}{\varepsilon^2} \tag{22}
\]

for $p \in (0, 1)$ and $\varepsilon \in (0, 1)$. From (21), one obtains $\mathcal{N}(p, \varepsilon, \delta) \approx p (1 - p) \left( \frac{\mathcal{Z}_{\delta/2}}{\varepsilon} \right)^2$, which is a well-known result in statistics. In situations where no information about $p$ is available, one usually uses

\[
N_{\text{normal}} \stackrel{\text{def}}{=} \left\lceil \frac{1}{4} \left( \frac{\mathcal{Z}_{\delta/2}}{\varepsilon} \right)^2 \right\rceil \tag{23}
\]

as the sample size for estimating the binomial proportion $p$ with prescribed margin of error $\varepsilon$ and confidence level $1 - \delta$. Since the sample size formula (23) can lead to under-coverage, researchers in many areas are willing to use a more conservative but rigorous sample size formula

\[
N_{\text{ch}} \stackrel{\text{def}}{=} \left\lceil \frac{\ln \frac{2}{\delta}}{2 \varepsilon^2} \right\rceil, \tag{24}
\]

which is derived from the Chernoff-Hoeffding bound [2, 25]. Comparing (22) and (24), one can see that under the premise of guaranteeing the prescribed confidence level $1 - \delta$, the double-parabolic sampling scheme can lead to a substantial reduction of sample number when the unknown binomial proportion $p$ is close to 0 or 1.

4.4 Bounds on Distribution and Expectation of Sample Number

We shall derive analytic bounds for the cumulative distribution function and expectation of the sample number $\mathbf{n}$ associated with the double-parabolic sampling scheme. In this direction, we have obtained the following results.

Theorem 4 Let $p \in (0, \frac{1}{2}]$. Define $a_\ell = \frac{1}{2} - \rho\varepsilon - \sqrt{ \frac{1}{4} + \frac{\varepsilon^2 n_\ell}{2 \ln(\zeta\delta)} }$ for $\ell = 1, \cdots, s$. Let $\tau$ denote the index of stage such that $a_{\tau - 1} \le p < a_\tau$. Then, $\Pr\{\mathbf{n} > n_\ell \mid p\} \le \exp(n_\ell \mathcal{M}(a_\ell, p))$ for $\tau \le \ell < s$. Moreover,

\[
\mathbb{E}[\mathbf{n}] \le n_\tau + \sum_{\ell = \tau}^{s - 1} (n_{\ell + 1} - n_\ell) \exp(n_\ell \mathcal{M}(a_\ell, p)).
\]

See Appendix D for a proof. By the symmetry of the double-parabolic sampling scheme, similar analytic bounds for the distribution and expectation of the sample number can be derived for the case that $p \in [\frac{1}{2}, 1)$.
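The sample-size formulas (14), (16), (22), (23) and (24) of this section are simple enough to evaluate directly. The sketch below is an illustration with hypothetical function names. The helper `equally_spaced_sizes` is an assumption: it is one simple equal-spacing rule that happens to reproduce the stage sizes quoted in Section 4.1, and is not claimed to be formula (5) itself. The normal critical value is assumed to be the upper-$\delta/2$ quantile.

```python
import math
from statistics import NormalDist


def n_min(eps, delta, zeta, rho):
    """Minimum sample size (14)."""
    return math.ceil(2.0 * rho * (1.0 / eps - rho) * math.log(1.0 / (zeta * delta)))


def n_max(eps, delta, zeta):
    """Maximum sample size (16)."""
    return math.ceil(math.log(1.0 / (zeta * delta)) / (2.0 * eps ** 2))


def equally_spaced_sizes(lo, hi, s):
    """Hypothetical equal-spacing rule (an assumption, not formula (5)):
    s >= 2 integer sizes from lo to hi with floored linear interpolation."""
    return [lo + (hi - lo) * i // (s - 1) for i in range(s)]


def asn_approx(p, eps, delta, zeta):
    """Approximate average sample number (22)."""
    return 2.0 * p * (1.0 - p) * math.log(1.0 / (zeta * delta)) / eps ** 2


def n_normal(eps, delta):
    """Normal-approximation sample size (23)."""
    z = NormalDist().inv_cdf(1.0 - delta / 2.0)
    return math.ceil(0.25 * (z / eps) ** 2)


def n_chernoff(eps, delta):
    """Chernoff-Hoeffding sample size (24)."""
    return math.ceil(math.log(2.0 / delta) / (2.0 * eps ** 2))
```

For eps = 0.05, delta = 0.05, rho = 3/4 and zeta = 2.6759, the first two functions give 59 and 403, matching the example in Section 4.1, while (23) and (24) give 385 and 738. The approximation (22) at p = 0.1 is about 145, which illustrates the reduction claimed for proportions near 0 or 1.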
5 Comparison of Computational Methods

In this section, we shall compare various computational methods. First, we will illustrate why a frequently-used method of evaluating the coverage probability based on gridding the parameter space is not rigorous and is less efficient than the Adapted B&B Algorithm. Second, we will introduce the Adaptive Maximum Checking Algorithm of [9], which has better computational efficiency than the Adapted B&B Algorithm. Third, we will explain that it is more advantageous in terms of numerical accuracy to work with the complementary coverage probability than to evaluate the coverage probability directly. Finally, we will compare the computational methods of Chen [6, 8, 10, 12, 13] and Frey [22] for the design of sequential procedures for estimating a binomial proportion.

5.1 Verifying Coverage Guarantee without Gridding Parameter Space

For the purpose of constructing a sampling scheme so that the prescribed confidence level $1 - \delta$ is guaranteed, an essential task is to determine whether the coverage probability $\Pr\{|\widehat{\boldsymbol{p}} - p| < \varepsilon \mid p\}$ associated with a given stopping rule is no less than $1 - \delta$. In other words, it is necessary to compare the infimum of the coverage probability with $1 - \delta$. To accomplish such a task of checking coverage guarantee, a natural method is to evaluate the infimum of the coverage probability as follows:

(i): Choose $m$ grid points $p_1, \cdots, p_m$ from the parameter space $(0, 1)$.
(ii): Compute $c_j = \Pr\{|\widehat{\boldsymbol{p}} - p| < \varepsilon \mid p_j\}$ for $j = 1, \cdots, m$.
(iii): Take $\min\{c_1, \cdots, c_m\}$ as $\inf_{p \in (0, 1)} \Pr\{|\widehat{\boldsymbol{p}} - p| < \varepsilon \mid p\}$.

This method can easily be mistaken for an exact approach and has been frequently used for evaluating coverage probabilities in many problem areas.
It is not hard to show that if the sample size $\mathbf{n}$ of a sequential procedure has a support $\mathcal{S}$, then the coverage probability $\Pr\{|\widehat{\boldsymbol{p}} - p| < \varepsilon \mid p\}$ is discontinuous at $p \in \mathcal{P} \cap (0, 1)$, where $\mathcal{P} = \{ \frac{k}{n} \pm \varepsilon : k \text{ is a nonnegative integer no greater than } n \in \mathcal{S} \}$. The set $\mathcal{P}$ typically has a large number of parameter values. Due to the discontinuity of the coverage probability as a function of $p$, the coverage probabilities can differ significantly for two parameter values which are extremely close. This implies that an intolerable error can be introduced by taking the minimum of coverage probabilities of a finite number of parameter values as the infimum of coverage probability on the whole parameter space. So, if one simply uses the minimum of the coverage probabilities of a finite number of parameter values as the infimum of coverage probability to check the coverage guarantee, the sequential estimator $\widehat{\boldsymbol{p}}$ of the resultant stopping rule may fail to guarantee the prescribed confidence level.

In addition to the lack of rigor, another drawback of checking coverage guarantee based on the method of gridding the parameter space is its low efficiency. A critical issue is the choice of the number, $m$, of grid points. If the number $m$ is too small, the induced error can be substantial. On the other hand, choosing a large number for $m$ results in high computational complexity.

In contrast to the method based on gridding the parameter space, the Adapted B&B Algorithm is a rigorous approach for checking coverage guarantee as a consequence of the mechanism for comparing the bounds of coverage probability with the prescribed confidence level. The algorithm is also efficient due to the mechanism of pruning branches.
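The discontinuity described above can be seen in a toy setting. The sketch below uses a fixed-sample-size procedure (a single-stage case, chosen here only because its coverage probability is easy to write down, and not one of the sequential schemes of this paper) and evaluates the coverage on either side of a point of the form $\frac{k}{n} + \varepsilon$. The function name and the numerical values are illustrative assumptions.

```python
import math


def coverage_fixed_n(p, n, eps):
    """Pr{|K/n - p| < eps} for K ~ Binomial(n, p), by direct summation."""
    return sum(math.comb(n, k) * p ** k * (1.0 - p) ** (n - k)
               for k in range(n + 1)
               if abs(k / n - p) < eps)


# Evaluate just below and just above p = 1/10 + 0.12 = 0.22 for n = 10.
tiny = 1e-9
just_below = coverage_fixed_n(0.22 - tiny, 10, 0.12)
just_above = coverage_fixed_n(0.22 + tiny, 10, 0.12)
jump = just_below - just_above
```

Here the outcome k = 1 counts as covered just below p = 0.22 and as not covered just above it, so the coverage probability drops by more than 0.2 across a gap of 2e-9 in p. A grid that misses the unfavorable side of such a point would overstate the infimum.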
5.2 Adaptive Maximum Checking Algorithm

As illustrated in Section 2, the techniques developed in [6, 8, 10, 12, 13] are sufficient to provide exact solutions for a wide range of sequential estimation problems. However, one of the four components, the Adapted B&B Algorithm, requires computing both the lower and upper bounds of the complementary coverage probability. To further reduce the computational complexity, it is desirable to have a checking algorithm which needs only one of the lower and upper bounds. For this purpose, Chen had developed the Adaptive Maximum Checking Algorithm (AMCA) in [9, Section 3.3] and [14, Section 2.7]. In the following introduction of the AMCA, we shall follow the description of [9]. The AMCA can be applied to a wide class of computational problems dependent on the following critical subroutine:

Determine whether a function $C(\theta)$ is smaller than a prescribed number $\delta$ for every value of $\theta$ contained in the interval $[\underline{\theta}, \overline{\theta}]$.

Specifically, for checking the coverage guarantee in the context of estimating a binomial proportion, the parameter $\theta$ is the binomial proportion $p$ and the function $C(\theta)$ is actually the complementary coverage probability. In many situations, it is impossible or very difficult to evaluate $C(\theta)$ for every value of $\theta$ in the interval $[\underline{\theta}, \overline{\theta}]$, since the interval may contain infinitely many or an extremely large number of values. Similar to the Adapted B&B Algorithm, the purpose of AMCA is to reduce the computational complexity associated with the problem of determining whether the maximum of $C(\theta)$ over $[\underline{\theta}, \overline{\theta}]$ is less than $\delta$.
The only assumption required for AMCA is that, for any interval $[a, b] \subseteq [\underline{\theta}, \overline{\theta}]$, it is possible to compute an upper bound $\overline{C}(a, b)$ such that $C(\theta) \le \overline{C}(a, b)$ for any $\theta \in [a, b]$ and that the upper bound converges to $C(\theta)$ as the interval width $b - a$ tends to 0. The backward AMCA proceeds as follows:

∇ Choose initial step size $d > \eta$.
∇ Let $F \leftarrow 0$, $T \leftarrow 0$ and $b \leftarrow \overline{\theta}$.
∇ While $F = T = 0$, do the following:
    ⋄ Let $\mathrm{st} \leftarrow 0$ and $\ell \leftarrow 2$;
    ⋄ While $\mathrm{st} = 0$, do the following:
        ⋆ Let $\ell \leftarrow \ell - 1$ and $d \leftarrow d \, 2^{\ell}$.
        ⋆ If $b - d > \underline{\theta}$, then let $a \leftarrow b - d$ and $T \leftarrow 0$. Otherwise, let $a \leftarrow \underline{\theta}$ and $T \leftarrow 1$.
        ⋆ If $\overline{C}(a, b) < \delta$, then let $\mathrm{st} \leftarrow 1$ and $b \leftarrow a$.
        ⋆ If $d < \eta$, then let $\mathrm{st} \leftarrow 1$ and $F \leftarrow 1$.
∇ Return $F$.

The output of the backward AMCA is a binary variable $F$ such that "$F = 0$" means "$C(\theta) < \delta$" and "$F = 1$" means "$C(\theta) \ge \delta$". An intermediate variable $T$ is introduced in the description of AMCA such that "$T = 1$" means that the left endpoint of the interval is reached. The backward AMCA starts from the right endpoint of the interval (i.e., $b = \overline{\theta}$) and attempts to find an interval $[a, b]$ such that $\overline{C}(a, b) < \delta$. If such an interval is available, then it attempts to go backward to find the next consecutive interval with twice the width. If doubling the interval width fails to guarantee $\overline{C}(a, b) < \delta$, then it repeatedly cuts the interval width in half to ensure that $\overline{C}(a, b) < \delta$. If the interval width becomes smaller than a prescribed tolerance $\eta$, then AMCA declares that "$F = 1$". For our relevant statistical problems, if $C(\theta) \ge \delta$ for some $\theta \in [\underline{\theta}, \overline{\theta}]$, it is sure that "$F = 1$" will be declared. On the other hand, it is possible that "$F = 1$" is declared even though $C(\theta) < \delta$ for any $\theta \in [\underline{\theta}, \overline{\theta}]$. However, such a situation can be made extremely rare and immaterial if we choose $\eta$ to be a very small number.
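The backward scan described above can be sketched in a few lines. This is a simplified reading, not a literal transcription of the listed steps: the step size is doubled after each certified interval and halved after each failure, in place of the $d \leftarrow d \, 2^{\ell}$ bookkeeping. The function name, argument names and default values are hypothetical, and `upper_bound(a, b)` stands for a user-supplied bound $\overline{C}(a, b)$.

```python
def backward_amca(upper_bound, lo, hi, delta, d0=0.1, eta=1e-15):
    """Return 0 if upper_bound certifies C(theta) < delta on all of [lo, hi],
    and 1 if the step size falls below eta before that can be certified.

    upper_bound(a, b) must return a number that is at least C(theta)
    for every theta in [a, b].
    """
    d = d0
    b = hi
    failed = 0          # plays the role of F
    reached_left = 0    # plays the role of T
    while not failed and not reached_left:
        d *= 2.0        # first try a step twice as wide as the last accepted one
        while True:
            if b - d > lo:
                a, reached_left = b - d, 0
            else:
                a, reached_left = lo, 1
            if upper_bound(a, b) < delta:
                b = a   # [a, b] is certified; continue to the left of a
                break
            d /= 2.0    # bound too loose: halve the step and retry
            if d < eta:
                failed = 1
                break
    return failed
```

As a toy check, take $C(\theta) = 0.03 + 0.01\theta$ on $[0, 0.5]$ with the bound $\overline{C}(a, b) = C(b)$, which is valid because $C$ is increasing. The scan returns 0 for $\delta = 0.05$ and 1 for $\delta = 0.032$, since $C(0.5) = 0.035$.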
Moreover, this will only introduce negligible conservativeness in the evaluation of $C(\theta)$ if $\eta$ is chosen to be sufficiently small (e.g., $\eta = 10^{-15}$). Clearly, the backward AMCA can be easily modified into a forward AMCA. Moreover, the AMCA can also be easily modified into an Adaptive Minimum Checking Algorithm (forward and backward). For checking the maximum of the complementary coverage probability $\Pr\{|\widehat{\boldsymbol{p}} - p| \ge \varepsilon \mid p\}$, one can use the AMCA with $C(p) = \Pr\{|\widehat{\boldsymbol{p}} - p| \ge \varepsilon \mid p\}$ over the interval $[0, \frac{1}{2}]$. We would like to point out that, in contrast to the Adapted B&B Algorithm, it seems difficult to generalize the AMCA to problems involving multidimensional parameter spaces.

5.3 Working with Complementary Coverage Probability

We would like to point out that, instead of evaluating the coverage probability as in [22], it is better to evaluate the complementary coverage probability for the purpose of reducing numerical error. The advantage of working with the complementary coverage probability can be explained as follows: Note that, in many cases, the coverage probability is very close to 1 and the complementary coverage probability is very close to 0. Since the absolute precision for computing a number close to 1 is much lower than the absolute precision for computing a number close to 0, the method of directly evaluating the coverage probability will lead to intolerable numerical error for problems involving small $\delta$. As an example, consider a situation in which the complementary coverage probability is of the order of $10^{-5}$. Direct computation of the coverage probability can easily lead to an absolute error of the order of $10^{-5}$. However, the absolute error of computing the complementary coverage probability can be readily controlled at the order of $10^{-9}$.

5.4 Comparison of Approaches of Chen and J. Frey

As mentioned in the introduction, J.
Frey published a paper [22] in The American Statistician (TAS) on the sequential estimation of a binomial proportion with prescribed margin of error and confidence level. The approaches of Chen and Frey are based on the same strategy as follows: First, construct a family of stopping rules parameterized by $\gamma$ (and possibly other design parameters) so that the associated coverage probability $\Pr\{|\widehat{\boldsymbol{p}} - p| < \varepsilon \mid p\}$ can be controlled by the parameter $\gamma$ in the sense that the coverage probability can be made arbitrarily close to 1 by increasing $\gamma$. Second, adaptively and rigorously check the coverage guarantee by virtue of bounds of coverage probabilities. Third, apply a bisection search method to determine the parameter $\gamma$ so that the coverage probability is no less than the prescribed confidence level $1 - \delta$ for any $p \in (0, 1)$.

For the purpose of controlling the coverage probability, Frey [22] applied the inclusion principle previously proposed in [13, Section 3] and used in [6, 8, 10, 12]. As illustrated in Section 3, the central idea of the inclusion principle is to use a sequence of confidence intervals to construct stopping rules so that the sampling process is continued until a confidence interval is included by an interval defined in terms of the estimator and margin of error. Due to the inclusion relationship, the associated coverage probability can be controlled by the confidence coefficients of the sequence of confidence intervals. The critical value $\gamma$ used by Frey plays the same role for controlling coverage probabilities as that of the coverage tuning parameter $\zeta$ used by Chen. Frey [22] stated stopping rules in terms of confidence limits. This way of expressing stopping rules is straightforward and insightful, since one can readily see the principle behind the construction.
For convenience of practical use, Chen proposed to eliminate the necessity of computing confidence limits.

Frey's method for checking coverage guarantee differs from the Adapted B&B Algorithm, but coincides with other techniques of Chen [9]. On September 18, 2011, in response to an inquiry on the coincidence of the research results, Frey simultaneously emailed Xinjia Chen (the coauthor of the present paper) and TAS Editor John Stufken all pre-final revisions of his manuscript for the paper [22]. In his original manuscript submitted to TAS in July 2009, Frey's method was to "simply approximate $CP(\gamma)$ by taking the minimum over the grid of values $p = 1/2001, \ldots, 2000/2001$." In the first revision of his manuscript submitted to TAS in November 2009, Frey's method was to "approximate $CP(\gamma)$ by taking the minimum of $T(p; \gamma)$ over the grid of values $p = 1/2001, \ldots, 2000/2001$ and the set of values of the form $p = c \pm \epsilon$, where $c \in C$ and $\epsilon = 10^{-10}$." In Frey's notational system, $\gamma$ is the critical value which plays the same role as that of the coverage tuning parameter $\zeta$ in the present paper, $T(p; \gamma)$ is the coverage probability, $CP(\gamma)$ is the infimum of the coverage probability for $p \in (0, 1)$, and $C = \{ \widehat{p} \pm \varepsilon : \widehat{p} \text{ is a possible value of } \widehat{\boldsymbol{p}} \} \cap (0, 1)$. From the original and the first revision of his manuscript submitted to TAS before April 2010, it can be seen that Frey's method of checking coverage guarantee was dependent on taking the minimum of coverage probabilities for a finite number of gridding points of $p \in (0, 1)$ as the infimum coverage probability for $p \in (0, 1)$. As can be seen from Section 5.1 of the present paper, such a method lacks rigor and efficiency.
In the second revision of his manuscript submitted to TAS in April 2010, for the purpose of checking the coverage guarantee, Frey replaced the method of gridding the parameter space with an interval bounding technique and proposed a checking algorithm which is essentially the same as the AMCA previously established by Chen [9, Section 3.3] in November 2009. Similar to the AMCA of [9, Section 3.3], the algorithm of Frey [22, Appendix] for checking the coverage guarantee adaptively scans the parameter space based on interval bounding. The adaptive method used by Frey for updating the step size is essentially the same as that of the AMCA. Ignoring the number 0.01 in Frey's expression "$\epsilon_i = \min\{0.01, 2(p_{i-1} - p_{i-2})\}$", which has very little impact on the computational efficiency, Frey's step size $\epsilon_i$ can be identified as the adaptive step size $d$ in the AMCA. The operation associated with "$\epsilon_i = \min\{0.01, 2(p_{i-1} - p_{i-2})\}$" has a function similar to that of the command "Let $\text{st} \leftarrow 0$ and $\ell \leftarrow 2$" in the outer loop of the AMCA. The operation associated with Frey's expression "$p_{i-1} + \epsilon_i / 2^j$, $j \geq 0$" is equivalent to that of the command "Let $\ell \leftarrow \ell - 1$ and $d \leftarrow d\, 2^{\ell}$" in the inner loop of the AMCA. Frey proposed to declare a failure of the coverage guarantee if "the distance from $p_{i-1}$ to the candidate value for $p_i$ falls below $10^{-14}$". The number "$10^{-14}$" actually plays the same role as "$\eta$" in the AMCA, where "$\eta = 10^{-15}$" is recommended by [9].

6 Numerical Results

In this section, we shall illustrate the proposed double-parabolic sampling scheme through examples. As demonstrated in Section 2.2 and Section 4, the double-parabolic sampling scheme can be parameterized by the dilation coefficient $\rho$ and the coverage tuning parameter $\zeta$.
Hence, the performance of the resultant stopping rule can be optimized with respect to $\rho \in (0, 1]$ and $\zeta$ by choosing various values of $\rho$ from the interval $(0, 1]$ and determining the corresponding values of $\zeta$ by the computational techniques introduced in Section 2 to guarantee the desired confidence level.

6.1 Asymptotic Analysis May Be Inadequate

For fully sequential cases, we have evaluated the double-parabolic sampling scheme with $\varepsilon = 0.1$, $\delta = 0.05$, $\rho = 0.1$ and $\zeta = \frac{1}{\delta} \exp\left(-\frac{1}{2} Z_{\delta/2}^2\right) \approx 2.93$. The stopping boundary is displayed in the left side of Figure 2. The function of coverage probability with respect to the binomial proportion is shown in the right side of Figure 2, which indicates that the coverage probabilities are generally substantially lower than the prescribed confidence level $1 - \delta = 0.95$. By considering $\varepsilon = 0.1$ as a small number and applying the asymptotic theory, the coverage probability associated with the sampling scheme is expected to be close to 0.95. This numerical example demonstrates that although the asymptotic method is insightful and involves virtually no computation, it may not be adequate. In general, the main drawback of an asymptotic method is that there is no guarantee of coverage probability. Although an asymptotic method asserts that if the margin of error $\varepsilon$ tends to 0, the coverage probability will tend to the pre-specified confidence level $1 - \delta$, it is difficult to determine how small the margin of error $\varepsilon$ must be for the asymptotic method to be applicable. Note that $\varepsilon \to 0$ implies that the average sample size tends to $\infty$. However, in reality, the sample sizes must be finite. Consequently, an asymptotic method inevitably introduces unknown statistical error.
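The asymptotic value of $\zeta$ quoted above is easy to reproduce. The following minimal sketch assumes that $Z_{\delta/2}$ denotes the upper $\delta/2$ quantile of the standard normal distribution, and it uses the limiting coverage $2\Phi\big(\sqrt{2 \ln \frac{1}{\zeta\delta}}\big) - 1$ that is derived in the proof of Theorem 3 (Appendix C). The function names are illustrative and not part of the paper.

```python
from math import exp, log, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, mean 0 and standard deviation 1

def asymptotic_zeta(delta):
    """zeta = (1/delta) * exp(-Z^2 / 2), with Z the upper delta/2 normal quantile
    (assumed meaning of Z_{delta/2})."""
    z = STD_NORMAL.inv_cdf(1.0 - delta / 2.0)
    return exp(-0.5 * z * z) / delta

def limiting_coverage(zeta, delta):
    """Limit of the coverage probability as eps -> 0:
    2 * Phi(sqrt(2 * ln(1 / (zeta * delta)))) - 1, valid when zeta * delta < 1."""
    a = sqrt(2.0 * log(1.0 / (zeta * delta)))
    return 2.0 * STD_NORMAL.cdf(a) - 1.0

delta = 0.05
zeta = asymptotic_zeta(delta)
print(round(zeta, 2))                     # 2.93
print(limiting_coverage(zeta, delta))     # approximately 0.95
```

The second number is only the limit as $\varepsilon \to 0$; it says nothing about the coverage at $\varepsilon = 0.1$, which is the point of this subsection.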
Since an asymptotic method does not necessarily guarantee the prescribed confidence level, it is not fair to compare its associated sample size with that of an exact method, which guarantees the pre-specified confidence level. This example also indicates that, due to the discrete nature of the problem, the coverage probability is a discontinuous and erratic function of $p$, which implies that Monte Carlo simulation is not suitable for evaluating the coverage performance.

Figure 2: Double-parabolic sampling with $\varepsilon = 0.1$, $\delta = 0.05$, $\rho = \frac{1}{10}$ and $\zeta = 2.93$. [Left panel: relative frequency versus sample size. Right panel: coverage probability versus binomial proportion, showing the true coverage probability and the prescribed confidence level.]

6.2 Parametric Values of Fully Sequential Schemes

For fully sequential cases, to allow direct application of our double-parabolic sequential method, we have obtained values of the coverage tuning parameter $\zeta$, which guarantee the prescribed confidence levels, for double-parabolic sampling schemes with $\rho = \frac{3}{4}$ and various combinations of $(\varepsilon, \delta)$ as shown in Table 1. We used the computational techniques introduced in Section 2 to obtain this table.

Table 1: Coverage Tuning Parameter

  ε      δ      ζ       |  ε      δ      ζ       |  ε      δ      ζ
  0.1    0.1    2.0427  |  0.1    0.05   2.4174  |  0.1    0.01   3.0608
  0.05   0.1    2.0503  |  0.05   0.05   2.5862  |  0.05   0.01   3.3125
  0.02   0.1    2.1725  |  0.02   0.05   2.5592  |  0.02   0.01   3.4461
  0.01   0.1    2.1725  |  0.01   0.05   2.5592  |  0.01   0.01   3.4461

To illustrate the use of Table 1, suppose that one wants a fully sequential sampling procedure to ensure that $\Pr\{|\widehat{p} - p| < 0.1 \mid p\} > 0.95$ for any $p \in (0, 1)$. This means that one can choose $\varepsilon = 0.1$, $\delta = 0.05$, and the range of sample size is given by (18).
From Table 1, it can be seen that the value of $\zeta$ corresponding to $\varepsilon = 0.1$, $\delta = 0.05$ is 2.4174. Consequently, the stopping rule is completely determined by substituting the values of the design parameters $\varepsilon = 0.1$, $\delta = 0.05$, $\rho = \frac{3}{4}$, $\zeta = 2.4174$ into its definition. The stopping boundary of this sampling scheme is displayed in the left side of Figure 3. The function of coverage probability with respect to the binomial proportion is shown in the right side of Figure 3.

Figure 3: Double-parabolic sampling with $\varepsilon = 0.1$, $\delta = 0.05$, $\rho = \frac{3}{4}$ and $\zeta = 2.4174$. [Left panel: relative frequency versus sample size. Right panel: coverage probability versus binomial proportion, showing the true coverage probability and the prescribed confidence level.]

6.3 Parametric Values of Group Sequential Schemes

In many situations, especially in clinical trials, it is desirable to use group sequential sampling schemes. In Tables 2 and 3, assuming that sample sizes satisfy (5) for the purpose of having approximately equal group sizes, we have obtained parameters for concrete schemes by the computational techniques introduced in Section 2.

For dilation coefficient $\rho = \frac{3}{4}$ and confidence parameter $\delta = 0.05$, we have obtained values of the coverage tuning parameter $\zeta$, which guarantee the prescribed confidence level 0.95, for double-parabolic sampling schemes with the number of stages $s$ ranging from 3 to 10, as shown in Table 2.

For dilation coefficient $\rho = \frac{3}{4}$ and confidence parameter $\delta = 0.01$, we have obtained values of the coverage tuning parameter $\zeta$, which guarantee the prescribed confidence level 0.99, for double-parabolic sampling schemes with the number of stages $s$ ranging from 3 to 10, as shown in Table 3.

Table 2: Coverage Tuning Parameter ($\rho = \frac{3}{4}$, $\delta = 0.05$)

            s = 3    s = 4    s = 5    s = 6    s = 7    s = 8    s = 9    s = 10
  ε = 0.1   2.6583   2.6583   2.5096   2.5946   2.4459   2.6512   2.5096   2.4459
  ε = 0.05  2.6759   2.6759   2.6759   2.6759   2.6759   2.6759   2.6759   2.6759
  ε = 0.02  2.6725   2.6725   2.6725   2.6725   2.6725   2.6725   2.6725   2.6725
  ε = 0.01  2.6796   2.6796   2.6796   2.6796   2.6796   2.5875   2.6796   2.6796

Table 3: Coverage Tuning Parameter ($\rho = \frac{3}{4}$, $\delta = 0.01$)

            s = 3    s = 4    s = 5    s = 6    s = 7    s = 8    s = 9    s = 10
  ε = 0.1   3.3322   3.3322   3.3322   3.3322   3.3322   3.2709   3.0782   3.3322
  ε = 0.05  3.5074   3.5074   3.5074   3.5074   3.5074   3.5074   3.5074   3.5074
  ε = 0.02  3.5430   3.5430   3.5430   3.5430   3.5430   3.5430   3.5430   3.5430
  ε = 0.01  3.5753   3.5753   3.5753   3.5753   3.5753   3.5753   3.5753   3.5753

Figure 4: Double-parabolic sampling with $\varepsilon = \delta = 0.01$, $s = 10$, $\rho = \frac{3}{4}$ and $\zeta = 3.5753$. [Left panel: relative frequency versus sample size. Right panel: coverage probability versus binomial proportion, showing the true coverage probability and the prescribed confidence level.]

To illustrate the use of these tables, suppose that one wants a ten-stage sampling procedure of approximately equal group sizes to ensure that $\Pr\{|\widehat{p} - p| < 0.01 \mid p\} > 0.99$ for any $p \in (0, 1)$. This means that one can choose $\varepsilon = \delta = 0.01$, $s = 10$ and sample sizes satisfying (5). To obtain appropriate parameter values for the sampling procedure, one can look at Table 3 to find the coverage tuning parameter $\zeta$ corresponding to $\varepsilon = 0.01$ and $s = 10$. From Table 3, it can be seen that $\zeta$ can be taken as 3.5753. Consequently, the stopping rule is completely determined by substituting the values of the design parameters $\varepsilon = 0.01$, $\delta = 0.01$, $\rho = \frac{3}{4}$, $\zeta = 3.5753$, $s = 10$ into its definition and equation (5).
The stopping boundary of this sampling scheme and the function of coverage probability with respect to the binomial proportion are displayed, respectively, in the left and right sides of Figure 4.

6.4 Comparison of Sampling Schemes

We have conducted numerical experiments to investigate the impact of the dilation coefficient $\rho$ on the performance of our double-parabolic sampling schemes. Our computational experience indicates that the dilation coefficient $\rho = \frac{3}{4}$ is frequently a good choice in terms of average sample number and coverage probability. For example, consider the case that the margin of error is given as $\varepsilon = 0.1$ and the prescribed confidence level is $1 - \delta$ with $\delta = 0.05$. For the double-parabolic sampling scheme with the dilation coefficient $\rho$ chosen as $\frac{2}{3}$, $\frac{3}{4}$ and 1, we have determined that, to ensure the prescribed confidence level $1 - \delta = 0.95$, it suffices to set the coverage tuning parameter $\zeta$ as 2.1, 2.4 and 2.4, respectively. The average sample numbers of these sampling schemes and the coverage probabilities as functions of the binomial proportion are shown, respectively, in the left and right sides of Figure 5. From Figure 5, it can be seen that a double-parabolic sampling scheme with dilation coefficient $\rho = \frac{3}{4}$ has better performance in terms of average sample number and coverage probability than the double-parabolic sampling schemes with the smaller or larger values of the dilation coefficient.
Figure 5: Double-parabolic sampling with various dilation coefficients. [Left panel: average sample number versus binomial proportion for $\rho = 2/3$, $3/4$ and 1. Right panel: coverage probability versus binomial proportion for $\rho = 2/3$, $3/4$ and 1.]

We have investigated the impact of confidence intervals on the performance of fully sequential sampling schemes constructed from the inclusion principle. We have observed that the stopping rule derived from Clopper-Pearson intervals generally outperforms the stopping rules derived from other types of confidence intervals. However, via appropriate choice of the dilation coefficient, the double-parabolic sampling scheme can perform uniformly better than the stopping rule derived from Clopper-Pearson intervals. To illustrate, consider the case that $\varepsilon = 0.1$ and $\delta = 0.05$. For stopping rules derived from Clopper-Pearson intervals, Fishman's intervals, Wilson's intervals, and revised Wald intervals with $a = 4$, we have determined that to guarantee the prescribed confidence level $1 - \delta = 0.95$, it suffices to set the coverage tuning parameter $\zeta$ as 0.5, 1, 2.4 and 0.37, respectively. For the stopping rule derived from Wald intervals, we have determined $\zeta = 0.77$ to ensure the confidence level, under the condition that the minimum sample size is taken as $\left\lceil \frac{1}{\varepsilon} \ln \frac{1}{\zeta\delta} \right\rceil$. Recall that for the double-parabolic sampling scheme with $\rho = \frac{3}{4}$, we have obtained $\zeta = 2.4$ for the purpose of guaranteeing the confidence level. The average sample numbers of these sampling schemes are shown in Figure 6.
From these plots, it can be seen that as compared to the stopping rule derived from Clopper-Pearson intervals, the stopping rule derived from the revised Wald intervals performs better in the region of $p$ close to 0 or 1, but performs worse in the region of $p$ in the middle of $(0, 1)$. The performance of the stopping rules from Fishman's intervals (i.e., from the Chernoff bound) and Wald intervals is obviously inferior to that of the stopping rule derived from Clopper-Pearson intervals. It can be observed that the double-parabolic sampling scheme uniformly outperforms the stopping rule derived from Clopper-Pearson intervals.

Figure 6: Comparison of average sample numbers. [Left panel: average sample number versus binomial proportion for Double Parabolic, Clopper-Pearson and Revised Wald Interval. Right panel: average sample number versus binomial proportion for Double Parabolic, Chernoff and Wald Interval.]

6.5 Estimation with High Confidence Level

In some situations, we need to estimate a binomial proportion with a high confidence level. For example, one might want to construct a sampling scheme such that, for $\varepsilon = 0.05$ and $\delta = 10^{-10}$, the resultant sequential estimator $\widehat{p}$ satisfies $\Pr\{|\widehat{p} - p| < \varepsilon \mid p\} > 1 - \delta$ for any $p \in (0, 1)$. By working with the complementary coverage probability, we determined that it suffices to let the dilation coefficient $\rho = \frac{3}{4}$ and the coverage tuning parameter $\zeta = 7.65$. The stopping boundary and the function of coverage probability with respect to the binomial proportion are displayed, respectively, in the left and right sides of Figure 7. As addressed in Section 5.3, it should be noted that it is impossible to obtain such a sampling scheme without working with the complementary coverage probability.
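A generic floating-point effect gives some intuition for why the complementary coverage probability matters at very high confidence levels. The sketch below is not the authors' procedure from Section 5.3; it is a toy with a fixed sample size and deliberately extreme, hypothetical parameters. It computes the miss probability $\Pr\{|K/n - p| \geq \varepsilon\}$ for $K \sim \mathrm{Binomial}(n, p)$ in two ways: by summing the tail terms directly, and as one minus the coverage.

```python
from math import exp, fsum, lgamma, log

def binom_pmf(k, n, p):
    """Binomial pmf evaluated in log space to avoid overflow for large n."""
    log_pmf = (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
               + k * log(p) + (n - k) * log(1.0 - p))
    return exp(log_pmf)

def miss_probability_direct(n, p, eps):
    """Pr{|K/n - p| >= eps}, obtained by summing the tail terms themselves."""
    return fsum(binom_pmf(k, n, p) for k in range(n + 1)
                if abs(k / n - p) >= eps)

def miss_probability_by_subtraction(n, p, eps):
    """The same quantity computed as 1 - coverage, which cancels catastrophically
    once the true value is below the rounding error of the coverage sum."""
    cov = fsum(binom_pmf(k, n, p) for k in range(n + 1)
               if abs(k / n - p) < eps)
    return 1.0 - cov

n, p, eps = 2000, 0.5, 0.125  # hypothetical values chosen to make the tail tiny
direct = miss_probability_direct(n, p, eps)
indirect = miss_probability_by_subtraction(n, p, eps)
print(f"direct tail sum : {direct:.3e}")
print(f"1 - coverage    : {indirect:.3e}")
```

The direct sum keeps its relative accuracy however small the tail is, whereas the subtraction can only return rounding noise around zero. Comparing a coverage against $1 - \delta$ with a very small $\delta$ is exposed to the same loss of precision.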
Figure 7: Double-parabolic sampling with $\varepsilon = 0.05$, $\delta = 10^{-10}$, $\rho = \frac{3}{4}$ and $\zeta = 7.65$. [Left panel: relative frequency versus sample size. Right panel: complementary coverage probability (in units of $10^{-10}$) versus binomial proportion.]

7 Illustrative Examples for Clinical Trials

In this section, we shall illustrate the applications of our double-parabolic group sequential estimation method in clinical trials.

An example of our double-parabolic sampling scheme can be illustrated as follows. Assume that $\varepsilon = \delta = 0.05$ is given and that the sampling procedure is expected to have 7 stages with sample sizes satisfying (5). Choosing $\rho = \frac{3}{4}$, we have determined that it suffices to take $\zeta = 2.6759$ to guarantee that the coverage probability is no less than $1 - \delta = 0.95$ for all $p \in (0, 1)$. Accordingly, the sample sizes of this sampling scheme are calculated as 59, 116, 173, 231, 288, 345 and 403. This sampling scheme, with a sample path, is shown in the left side of Figure 8. In this case, the stopping rule can be equivalently described by virtue of Figure 8 as: continue sampling until $(\widehat{p}_\ell, n_\ell)$ hits a green line at some stage. The coverage probability is shown in the right side of Figure 8.

To apply this estimation method in a clinical trial for estimating the proportion $p$ of a binomial response with margin of error 0.05 and confidence level 95%, we can have seven groups of patients with group sizes 59, 57, 57, 58, 57, 57 and 58.

In the first stage, we conduct the experiment with the 59 patients of the first group. We observe the relative frequency of response and record it as $\widehat{p}_1$. Suppose there are 12 patients having positive responses; then the relative frequency at the first stage is $\widehat{p}_1 = \frac{12}{59} = 0.2034$. With the values of $(\widehat{p}_1, n_1) = (0.2034, 59)$, we check whether the stopping rule is satisfied. This is equivalent to checking whether the point $(\widehat{p}_1, n_1)$ hits a green line at the first stage. For such a value of $(\widehat{p}_1, n_1)$, it can be seen that the stopping condition is not fulfilled.

So, we need to conduct the second stage of the experiment with the 57 patients of the second group. We observe the responses of these 57 patients. Suppose we observe that 5 patients among this group have positive responses. Then, we add 5 to 12, the number of positive responses before the second stage, to obtain 17 positive responses among $n_2 = 59 + 57 = 116$ patients. So, at the second stage, we get the relative frequency $\widehat{p}_2 = \frac{17}{116} = 0.1466$.

Since the stopping rule is not satisfied with the values of $(\widehat{p}_2, n_2) = (0.1466, 116)$, we need to conduct the third stage of the experiment with the 57 patients of the third group. Suppose we observe that 14 patients among this group have positive responses. Then, we add 14 to 17, the number of positive responses before the third stage, to get 31 positive responses among $n_3 = 59 + 57 + 57 = 173$ patients. So, at the third stage, we get the relative frequency $\widehat{p}_3 = \frac{31}{173} = 0.1792$.

Since the stopping rule is not satisfied with the values of $(\widehat{p}_3, n_3) = (0.1792, 173)$, we need to conduct the fourth stage of the experiment with the 58 patients of the fourth group. Suppose we observe that 15 patients among this group have positive responses. Then, we add 15 to 31, the number of positive responses before the fourth stage, to get 46 positive responses among $n_4 = 59 + 57 + 57 + 58 = 231$ patients. So, at the fourth stage, we get the relative frequency $\widehat{p}_4 = \frac{46}{231} = 0.1991$.

Since the stopping rule is not satisfied with the values of $(\widehat{p}_4, n_4) = (0.1991, 231)$, we need to conduct the fifth stage of the experiment with the 57 patients of the fifth group. Suppose we observe that 6 patients among this group have positive responses. Then, we add 6 to 46, the number of positive responses before the fifth stage, to get 52 positive responses among $n_5 = 59 + 57 + 57 + 58 + 57 = 288$ patients. So, at the fifth stage, we get the relative frequency $\widehat{p}_5 = \frac{52}{288} = 0.1806$.

It can be seen that the stopping rule is satisfied with the values of $(\widehat{p}_5, n_5) = (0.1806, 288)$. Therefore, we can terminate the sampling experiment and take $\widehat{p} = \frac{52}{288} = 0.1806$ as an estimate of the proportion of the whole population having positive responses. With a 95% confidence level, one can believe that the difference between the true value of $p$ and its estimate $\widehat{p} = 0.1806$ is less than 0.05.

Figure 8: Double-parabolic sampling with $\varepsilon = \delta = 0.05$, $s = 7$, $\rho = \frac{3}{4}$ and $\zeta = 2.6759$. [Left panel: relative frequency versus sample size. Right panel: coverage probability versus binomial parameter, showing the true coverage probability and the prescribed confidence level.]

In this experiment, we only use 288 samples to obtain the estimate for $p$. Except for round-off error, there is no other source of error in reporting statistical accuracy, since no asymptotic approximation is involved. As compared to the fixed-sample-size procedure, we achieved a substantial saving of samples. To see this, one can check that using the rigorous formula (24) gives a sample size of 738, which is overly conservative. From the classical approximate formula (22), the sample size is determined as 385, which has been known to be insufficient to guarantee the prescribed confidence level 95%. The exact method of [15] shows that at least 391 samples are needed.
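The stage-by-stage walk-through above can be replayed in a few lines. The sketch below assumes the stopping condition in the form used in Appendix B, namely $(|\widehat{p}_\ell - \frac{1}{2}| - \rho\varepsilon)^2 \geq \frac{1}{4} + \frac{\varepsilon^2 n_\ell}{2 \ln(\zeta\delta)}$, and takes the group sizes as given rather than recomputing them from (5). The last lines reproduce the fixed-sample sizes 385 and 738 under the assumption that formulas (22) and (24), which are stated elsewhere in the paper, are the usual normal-approximation and Chernoff-Hoeffding sample sizes. All function names are illustrative.

```python
from math import ceil, log
from statistics import NormalDist

def should_stop(successes, n, eps, delta, rho, zeta):
    """Double-parabolic stopping test at one stage (assumes zeta * delta < 1)."""
    p_hat = successes / n
    lhs = (abs(p_hat - 0.5) - rho * eps) ** 2
    rhs = 0.25 + eps**2 * n / (2.0 * log(zeta * delta))
    return lhs >= rhs

def run_trial(group_sizes, group_successes, eps, delta, rho, zeta):
    """Accumulate groups until the rule fires; return (stage, successes, n)."""
    n = k = 0
    for stage, (size, succ) in enumerate(zip(group_sizes, group_successes), start=1):
        n += size
        k += succ
        if should_stop(k, n, eps, delta, rho, zeta):
            return stage, k, n
    return None  # the rule never fired on the supplied data

eps, delta, rho, zeta = 0.05, 0.05, 0.75, 2.6759
group_sizes = [59, 57, 57, 58, 57, 57, 58]
responses = [12, 5, 14, 15, 6]  # positive responses in groups 1 to 5, from the text

stage, k, n = run_trial(group_sizes, responses, eps, delta, rho, zeta)
print(stage, k, n, round(k / n, 4))  # 5 52 288 0.1806

# Fixed-sample comparisons; the forms of (22) and (24) are assumptions here.
z = NormalDist().inv_cdf(1.0 - delta / 2.0)
n_normal = ceil(z**2 / (4.0 * eps**2))               # 385
n_hoeffding = ceil(log(2.0 / delta) / (2.0 * eps**2))  # 738
print(n_normal, n_hoeffding)
```

The replay stops at the fifth stage with 52 positive responses among 288 patients, in agreement with the narrative, and the two fixed-sample sizes match the values quoted for (22) and (24).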
As compared to the best fixed sample size obtained by the method of [15], the reduction of sample size resulting from our double-parabolic sampling scheme is $391 - 288 = 103$. It can be seen that the fixed-sample-size procedure wastes $\frac{103}{288} = 35.76\%$ samples as compared to our group sequential method, which is also an exact method. This percentage might not be serious if it were a saving in the number of simulation runs. However, as the count here is of patients, the reduction of samples is important for ethical and economical reasons. Using our group sequential method, the worst-case sample size is equal to 403, which is only 12 more than the minimum sample size of the fixed-sample procedure. However, a lot of samples can be saved in the average case.

As $\varepsilon$ or $\delta$ becomes smaller, the reduction of samples is more significant. For example, letting $\varepsilon = 0.02$ and $\delta = 0.05$, we have a double-parabolic sampling scheme with 10 stages. The sampling scheme, with a sample path, is shown in the left side of Figure 9. The coverage probability is shown in the right side of Figure 9.

Figure 9: Double-parabolic sampling with $\varepsilon = 0.02$, $\delta = 0.05$, $s = 10$, $\rho = \frac{3}{4}$ and $\zeta = 2.6725$. [Left panel: relative frequency versus sample size. Right panel: coverage probability versus binomial proportion, showing the true coverage probability and the prescribed confidence level.]

8 Conclusion

In this paper, we have reviewed recent developments of group sequential estimation methods for a binomial proportion. We have illustrated the inclusion principle and its applications to various stopping rules. We have introduced computational techniques in the literature, which suffice for determining parameters of stopping rules to guarantee desired confidence levels.
Moreover, we have proposed a new family of sampling schemes with stopping boundary of double-parabolic shape, which are parameterized by the coverage tuning parameter and the dilation coefficient. These parameters can be determined by the exact computational techniques to reduce the sampling cost, while ensuring prescribed confidence levels. The new family of sampling schemes is extremely simple in structure and asymptotically optimal as the margin of error tends to 0. We have established analytic bounds for the distribution and expectation of the sample number at the termination of the sampling process. We have obtained parameter values via the exact computational techniques for the proposed sampling schemes such that the confidence levels are guaranteed and that the sampling schemes are generally more efficient as compared to existing ones.

A Proof of Theorem 1

Consider the function $g(x, z) = \frac{(x - z)^2}{x(1 - x)}$ for $x \in (0, 1)$ and $z \in [0, 1]$. It can be checked that

$$\frac{\partial g(x, z)}{\partial x} = (x - z)\,[z(1 - x) + x(1 - z)]\,[x(1 - x)]^{-2},$$

which shows that for any fixed $z \in [0, 1]$, $-g(x, z)$ is a unimodal function of $x \in (0, 1)$, with a maximum attained at $x = z$. By such a property of $g(x, z)$ and the definition of Wilson's confidence intervals, we have

$$\begin{aligned}
\{\widehat{p}_\ell - \varepsilon \leq L_\ell\}
&= \{0 < \widehat{p}_\ell - \varepsilon \leq L_\ell\} \cup \{\widehat{p}_\ell \leq \varepsilon\} \\
&= \left\{0 < \widehat{p}_\ell - \varepsilon \leq L_\ell \leq \widehat{p}_\ell,\; g(L_\ell, \widehat{p}_\ell) = \frac{Z_{\zeta\delta}^2}{n_\ell}\right\} \cup \{\widehat{p}_\ell \leq \varepsilon\} \\
&= \left\{\widehat{p}_\ell > \varepsilon,\; \frac{\varepsilon^2}{(\widehat{p}_\ell - \varepsilon)[1 - (\widehat{p}_\ell - \varepsilon)]} \geq \frac{Z_{\zeta\delta}^2}{n_\ell}\right\} \cup \{\widehat{p}_\ell \leq \varepsilon\}
\end{aligned}$$

and

$$\begin{aligned}
\{\widehat{p}_\ell + \varepsilon \geq U_\ell\}
&= \{1 > \widehat{p}_\ell + \varepsilon \geq U_\ell\} \cup \{\widehat{p}_\ell + \varepsilon \geq 1\} \\
&= \left\{1 > \widehat{p}_\ell + \varepsilon \geq U_\ell \geq \widehat{p}_\ell,\; g(U_\ell, \widehat{p}_\ell) = \frac{Z_{\zeta\delta}^2}{n_\ell}\right\} \cup \{\widehat{p}_\ell + \varepsilon \geq 1\} \\
&= \left\{\widehat{p}_\ell < 1 - \varepsilon,\; \frac{\varepsilon^2}{(\widehat{p}_\ell + \varepsilon)[1 - (\widehat{p}_\ell + \varepsilon)]} \geq \frac{Z_{\zeta\delta}^2}{n_\ell}\right\} \cup \{\widehat{p}_\ell \geq 1 - \varepsilon\}
\end{aligned}$$

for $\ell = 1, \cdots, s$, where we have used the fact that $\{\widehat{p}_\ell > \varepsilon\} \subseteq \{L_\ell > 0\}$, $\{\widehat{p}_\ell < 1 - \varepsilon\} \subseteq \{U_\ell < 1\}$ and $0 \leq L_\ell \leq \widehat{p}_\ell \leq U_\ell \leq 1$.
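Recall that $0 < \varepsilon < \frac{1}{2}$. It follows that

$$\begin{aligned}
\{\widehat{p}_\ell - \varepsilon \leq L_\ell \leq U_\ell \leq \widehat{p}_\ell + \varepsilon\}
&= \left\{\varepsilon < \widehat{p}_\ell < 1 - \varepsilon,\; \frac{\varepsilon^2}{(\widehat{p}_\ell - \varepsilon)[1 - (\widehat{p}_\ell - \varepsilon)]} \geq \frac{Z_{\zeta\delta}^2}{n_\ell},\; \frac{\varepsilon^2}{(\widehat{p}_\ell + \varepsilon)[1 - (\widehat{p}_\ell + \varepsilon)]} \geq \frac{Z_{\zeta\delta}^2}{n_\ell}\right\} \\
&\quad \cup \left\{\widehat{p}_\ell \leq \varepsilon,\; \frac{\varepsilon^2}{(\widehat{p}_\ell + \varepsilon)[1 - (\widehat{p}_\ell + \varepsilon)]} \geq \frac{Z_{\zeta\delta}^2}{n_\ell}\right\}
\cup \left\{\widehat{p}_\ell \geq 1 - \varepsilon,\; \frac{\varepsilon^2}{(\widehat{p}_\ell - \varepsilon)[1 - (\widehat{p}_\ell - \varepsilon)]} \geq \frac{Z_{\zeta\delta}^2}{n_\ell}\right\} \\
&= \left\{\varepsilon < \widehat{p}_\ell < 1 - \varepsilon,\; \left(\left|\widehat{p}_\ell - \tfrac{1}{2}\right| - \varepsilon\right)^2 \geq \tfrac{1}{4} - n_\ell \left(\frac{\varepsilon}{Z_{\zeta\delta}}\right)^2\right\} \\
&\quad \cup \left\{\widehat{p}_\ell \leq \varepsilon,\; \left(\left|\widehat{p}_\ell - \tfrac{1}{2}\right| - \varepsilon\right)^2 \geq \tfrac{1}{4} - n_\ell \left(\frac{\varepsilon}{Z_{\zeta\delta}}\right)^2\right\}
\cup \left\{\widehat{p}_\ell \geq 1 - \varepsilon,\; \left(\left|\widehat{p}_\ell - \tfrac{1}{2}\right| - \varepsilon\right)^2 \geq \tfrac{1}{4} - n_\ell \left(\frac{\varepsilon}{Z_{\zeta\delta}}\right)^2\right\} \\
&= \left\{\left(\left|\widehat{p}_\ell - \tfrac{1}{2}\right| - \varepsilon\right)^2 \geq \tfrac{1}{4} - n_\ell \left(\frac{\varepsilon}{Z_{\zeta\delta}}\right)^2\right\}
\end{aligned}$$

for $\ell = 1, \cdots, s$. This completes the proof of the theorem.

B Proof of Theorem 2

By the assumption that $n_s \geq \frac{1}{2\varepsilon^2} \ln \frac{1}{\zeta\delta}$, we have $\frac{1}{4} + \frac{\varepsilon^2 n_s}{2 \ln(\zeta\delta)} \leq 0$ and consequently,

$$\Pr\left\{\left(\left|\widehat{p}_s - \tfrac{1}{2}\right| - \rho\varepsilon\right)^2 \geq \tfrac{1}{4} + \frac{\varepsilon^2 n_s}{2 \ln(\zeta\delta)}\right\} = 1.$$

It follows from the definition of the sampling scheme that the sampling process must stop at or before the $s$-th stage. In other words, $\Pr\{\boldsymbol{l} \leq s\} = 1$. This allows one to write

$$\Pr\{|\widehat{p} - p| \geq \varepsilon \mid p\} = \sum_{\ell=1}^{s} \Pr\{|\widehat{p} - p| \geq \varepsilon,\; \boldsymbol{l} = \ell \mid p\} = \sum_{\ell=1}^{s} \Pr\{|\widehat{p}_\ell - p| \geq \varepsilon,\; \boldsymbol{l} = \ell \mid p\} \leq \sum_{\ell=1}^{s} \Pr\{|\widehat{p}_\ell - p| \geq \varepsilon \mid p\} \tag{25}$$

for $p \in (0, 1)$. By virtue of the well-known Chernoff-Hoeffding bound [2, 25], we have

$$\Pr\{|\widehat{p}_\ell - p| \geq \varepsilon \mid p\} \leq 2 \exp(-2 n_\ell \varepsilon^2) \tag{26}$$

for $\ell = 1, \cdots, s$. Making use of (25), (26) and the fact that $n_1 \geq 2\rho\left(\frac{1}{\varepsilon} - \rho\right) \ln \frac{1}{\zeta\delta}$, as can be seen from (18), we have

$$\begin{aligned}
\Pr\{|\widehat{p} - p| \geq \varepsilon \mid p\}
&\leq 2 \sum_{\ell=1}^{s} \exp(-2 n_\ell \varepsilon^2) \leq 2 \sum_{m=n_1}^{\infty} \exp(-2 m \varepsilon^2) = \frac{2 \exp(-2 n_1 \varepsilon^2)}{1 - \exp(-2\varepsilon^2)} \\
&\leq \frac{2 \exp\left(-2\varepsilon^2 \times 2\rho\left(\frac{1}{\varepsilon} - \rho\right) \ln \frac{1}{\zeta\delta}\right)}{1 - \exp(-2\varepsilon^2)} = \frac{2 \exp\left(4\varepsilon\rho(1 - \rho\varepsilon) \ln(\zeta\delta)\right)}{1 - \exp(-2\varepsilon^2)}
\end{aligned}$$

for any $p \in (0, 1)$.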
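Therefore, to guarantee that $\Pr\{|\widehat{p} - p| < \varepsilon \mid p\} \geq 1 - \delta$ for any $p \in (0, 1)$, it is sufficient to choose $\zeta$ such that $2 \exp\left(4\varepsilon\rho(1 - \rho\varepsilon) \ln(\zeta\delta)\right) \leq \delta\,[1 - \exp(-2\varepsilon^2)]$. This inequality can be written as $4\varepsilon\rho(1 - \rho\varepsilon) \ln(\zeta\delta) \leq \ln\frac{\delta}{2} + \ln\left[1 - \exp(-2\varepsilon^2)\right]$ or equivalently,

$$\zeta \leq \frac{1}{\delta} \exp\left(\frac{\ln\frac{\delta}{2} + \ln\left[1 - \exp(-2\varepsilon^2)\right]}{4\varepsilon\rho(1 - \rho\varepsilon)}\right).$$

The proof of the theorem is thus completed.

C Proof of Theorem 3

First, we need to show that $\Pr\left\{\lim_{\varepsilon \to 0} \frac{\mathbf{n}}{N(p, \varepsilon, \delta, \zeta)} = 1 \mid p\right\} = 1$ for any $p \in (0, 1)$. Clearly, the sample number $\mathbf{n}$ is a random number dependent on $\varepsilon$. Note that for any $\omega \in \Omega$, the sequences $\{\overline{X}_{\mathbf{n}(\omega)}(\omega)\}_{\varepsilon \in (0, 1)}$ and $\{\overline{X}_{\mathbf{n}(\omega) - 1}(\omega)\}_{\varepsilon \in (0, 1)}$ are subsets of $\{\overline{X}_m(\omega)\}_{m=1}^{\infty}$. By the strong law of large numbers, for almost every $\omega \in \Omega$, the sequence $\{\overline{X}_m(\omega)\}_{m=1}^{\infty}$ converges to $p$. Since every subsequence of a convergent sequence must converge, it follows that the sequences $\{\overline{X}_{\mathbf{n}(\omega)}(\omega)\}_{\varepsilon \in (0, 1)}$ and $\{\overline{X}_{\mathbf{n}(\omega) - 1}(\omega)\}_{\varepsilon \in (0, 1)}$ converge to $p$ as $\varepsilon \to 0$ provided that $\mathbf{n}(\omega) \to \infty$ as $\varepsilon \to 0$. Since it is certain that $\mathbf{n} \geq 2\rho\left(\frac{1}{\varepsilon} - \rho\right) \ln \frac{1}{\zeta\delta} \to \infty$ as $\varepsilon \to 0$, we have that $\left\{\lim_{\varepsilon \to 0} \frac{\mathbf{n} - 1}{\mathbf{n}} = 1\right\}$ is a sure event. It follows that

$$B = \left\{\lim_{\varepsilon \to 0} \overline{X}_{\mathbf{n} - 1} = p,\; \lim_{\varepsilon \to 0} \overline{X}_{\mathbf{n}} = p,\; \lim_{\varepsilon \to 0} \frac{\mathbf{n} - 1}{\mathbf{n}} = 1\right\}$$

is an almost sure event. By the definition of the sampling scheme, we have that

$$A = \left\{\left(\left|\overline{X}_{\mathbf{n} - 1} - \tfrac{1}{2}\right| - \rho\varepsilon\right)^2 < \tfrac{1}{4} + \frac{\varepsilon^2 (\mathbf{n} - 1)}{2 \ln(\zeta\delta)},\; \left(\left|\overline{X}_{\mathbf{n}} - \tfrac{1}{2}\right| - \rho\varepsilon\right)^2 \geq \tfrac{1}{4} + \frac{\varepsilon^2 \mathbf{n}}{2 \ln(\zeta\delta)}\right\}$$

is a sure event. Hence, $A \cap B$ is an almost sure event. Define $C = \left\{\lim_{\varepsilon \to 0} \frac{\mathbf{n}}{N(p, \varepsilon, \delta, \zeta)} = 1\right\}$. We need to show that $C$ is an almost sure event. For this purpose, we let $\omega \in A \cap B$ and expect to show that $\omega \in C$.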
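As a consequence of $\omega \in A \cap B$,

$$\frac{\mathbf{n}(\omega)}{N(p, \varepsilon, \delta, \zeta)} < \frac{\mathbf{n}(\omega)}{\mathbf{n}(\omega) - 1} \cdot \frac{\frac{1}{4} - \left(\left|\overline{X}_{\mathbf{n}(\omega) - 1}(\omega) - \frac{1}{2}\right| - \rho\varepsilon\right)^2}{p(1 - p)}, \qquad \lim_{\varepsilon \to 0} \overline{X}_{\mathbf{n}(\omega) - 1}(\omega) = p, \qquad \lim_{\varepsilon \to 0} \frac{\mathbf{n}(\omega) - 1}{\mathbf{n}(\omega)} = 1.$$

By the continuity of the function $\left|x - \frac{1}{2}\right| - \rho\varepsilon$ with respect to $x$ and $\varepsilon$, we have

$$\limsup_{\varepsilon \to 0} \frac{\mathbf{n}(\omega)}{N(p, \varepsilon, \delta, \zeta)} \leq \lim_{\varepsilon \to 0} \frac{\mathbf{n}(\omega)}{\mathbf{n}(\omega) - 1} \times \frac{\frac{1}{4} - \left(\left|\lim_{\varepsilon \to 0} \overline{X}_{\mathbf{n}(\omega) - 1}(\omega) - \frac{1}{2}\right| - \lim_{\varepsilon \to 0} \rho\varepsilon\right)^2}{p(1 - p)} = 1. \tag{27}$$

On the other hand, as a consequence of $\omega \in A \cap B$,

$$\frac{\mathbf{n}(\omega)}{N(p, \varepsilon, \delta, \zeta)} \geq \frac{\frac{1}{4} - \left(\left|\overline{X}_{\mathbf{n}(\omega)}(\omega) - \frac{1}{2}\right| - \rho\varepsilon\right)^2}{p(1 - p)}, \qquad \lim_{\varepsilon \to 0} \overline{X}_{\mathbf{n}(\omega)}(\omega) = p.$$

Making use of the continuity of the function $\left|x - \frac{1}{2}\right| - \rho\varepsilon$ with respect to $x$ and $\varepsilon$, we have

$$\liminf_{\varepsilon \to 0} \frac{\mathbf{n}(\omega)}{N(p, \varepsilon, \delta, \zeta)} \geq \frac{\frac{1}{4} - \left(\left|\lim_{\varepsilon \to 0} \overline{X}_{\mathbf{n}(\omega)}(\omega) - \frac{1}{2}\right| - \lim_{\varepsilon \to 0} \rho\varepsilon\right)^2}{p(1 - p)} = 1. \tag{28}$$

Combining (27) and (28) yields $\lim_{\varepsilon \to 0} \frac{\mathbf{n}(\omega)}{N(p, \varepsilon, \delta, \zeta)} = 1$ and thus $A \cap B \subseteq C$. This implies that $C$ is an almost sure event and thus $\Pr\left\{\lim_{\varepsilon \to 0} \frac{\mathbf{n}}{N(p, \varepsilon, \delta, \zeta)} = 1 \mid p\right\} = 1$ for $p \in (0, 1)$.

Next, we need to show that $\lim_{\varepsilon \to 0} \Pr\{|\widehat{p} - p| < \varepsilon \mid p\} = 2\Phi\left(\sqrt{2 \ln \frac{1}{\zeta\delta}}\right) - 1$ for any $p \in (0, 1)$. For simplicity of notation, let $\sigma = \sqrt{p(1 - p)}$ and $a = \sqrt{2 \ln \frac{1}{\zeta\delta}}$. Note that

$$\Pr\{|\widehat{p} - p| < \varepsilon \mid p\} = \Pr\{|\overline{X}_{\mathbf{n}} - p| < \varepsilon \mid p\} = \Pr\{\sqrt{\mathbf{n}}\,|\overline{X}_{\mathbf{n}} - p| / \sigma < \varepsilon\sqrt{\mathbf{n}} / \sigma\}.$$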
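Clearly, for any $\eta \in (0, a)$,

$$\begin{aligned}
\Pr\{\sqrt{\mathbf{n}}\,|\overline{X}_{\mathbf{n}} - p| / \sigma < \varepsilon\sqrt{\mathbf{n}} / \sigma\}
&\leq \Pr\{\sqrt{\mathbf{n}}\,|\overline{X}_{\mathbf{n}} - p| / \sigma < \varepsilon\sqrt{\mathbf{n}} / \sigma,\; \varepsilon\sqrt{\mathbf{n}} / \sigma \in [a - \eta, a + \eta]\} + \Pr\{\varepsilon\sqrt{\mathbf{n}} / \sigma \notin [a - \eta, a + \eta]\} \\
&\leq \Pr\{\sqrt{\mathbf{n}}\,|\overline{X}_{\mathbf{n}} - p| / \sigma < a + \eta,\; \varepsilon\sqrt{\mathbf{n}} / \sigma \in [a - \eta, a + \eta]\} + \Pr\{\varepsilon\sqrt{\mathbf{n}} / \sigma \notin [a - \eta, a + \eta]\} \\
&\leq \Pr\{\sqrt{\mathbf{n}}\,|\overline{X}_{\mathbf{n}} - p| / \sigma < a + \eta\} + \Pr\{\varepsilon\sqrt{\mathbf{n}} / \sigma \notin [a - \eta, a + \eta]\}
\end{aligned} \tag{29}$$

and

$$\begin{aligned}
\Pr\{\sqrt{\mathbf{n}}\,|\overline{X}_{\mathbf{n}} - p| / \sigma < \varepsilon\sqrt{\mathbf{n}} / \sigma\}
&\geq \Pr\{\sqrt{\mathbf{n}}\,|\overline{X}_{\mathbf{n}} - p| / \sigma < \varepsilon\sqrt{\mathbf{n}} / \sigma,\; \varepsilon\sqrt{\mathbf{n}} / \sigma \in [a - \eta, a + \eta]\} \\
&\geq \Pr\{\sqrt{\mathbf{n}}\,|\overline{X}_{\mathbf{n}} - p| / \sigma < a - \eta,\; \varepsilon\sqrt{\mathbf{n}} / \sigma \in [a - \eta, a + \eta]\} \\
&\geq \Pr\{\sqrt{\mathbf{n}}\,|\overline{X}_{\mathbf{n}} - p| / \sigma < a - \eta\} - \Pr\{\varepsilon\sqrt{\mathbf{n}} / \sigma \notin [a - \eta, a + \eta]\}.
\end{aligned} \tag{30}$$

Recall that we have established that $\mathbf{n} / N(p, \varepsilon, \delta, \zeta) \to 1$ almost surely as $\varepsilon \to 0$. This implies that $\varepsilon\sqrt{\mathbf{n}} / \sigma \to a$ and $\mathbf{n} / N(p, \varepsilon, \delta, \zeta) \to 1$ in probability as $\varepsilon$ tends to zero. It follows from Anscombe's random central limit theorem [1] that as $\varepsilon$ tends to zero, $\sqrt{\mathbf{n}}\,(\overline{X}_{\mathbf{n}} - p) / \sigma$ converges in distribution to a Gaussian random variable with zero mean and unit variance. Hence, from (29),

$$\limsup_{\varepsilon \to 0} \Pr\{\sqrt{\mathbf{n}}\,|\overline{X}_{\mathbf{n}} - p| / \sigma < \varepsilon\sqrt{\mathbf{n}} / \sigma\} \leq \lim_{\varepsilon \to 0} \Pr\{\sqrt{\mathbf{n}}\,|\overline{X}_{\mathbf{n}} - p| / \sigma < a + \eta\} + \lim_{\varepsilon \to 0} \Pr\{\varepsilon\sqrt{\mathbf{n}} / \sigma \notin [a - \eta, a + \eta]\} = 2\Phi(a + \eta) - 1$$

and from (30),

$$\liminf_{\varepsilon \to 0} \Pr\{\sqrt{\mathbf{n}}\,|\overline{X}_{\mathbf{n}} - p| / \sigma < \varepsilon\sqrt{\mathbf{n}} / \sigma\} \geq \lim_{\varepsilon \to 0} \Pr\{\sqrt{\mathbf{n}}\,|\overline{X}_{\mathbf{n}} - p| / \sigma < a - \eta\} - \lim_{\varepsilon \to 0} \Pr\{\varepsilon\sqrt{\mathbf{n}} / \sigma \notin [a - \eta, a + \eta]\} = 2\Phi(a - \eta) - 1.$$

Since this argument holds for arbitrarily small $\eta \in (0, a)$, it must be true that

$$\liminf_{\varepsilon \to 0} \Pr\{\sqrt{\mathbf{n}}\,|\overline{X}_{\mathbf{n}} - p| / \sigma < \varepsilon\sqrt{\mathbf{n}} / \sigma\} = \limsup_{\varepsilon \to 0} \Pr\{\sqrt{\mathbf{n}}\,|\overline{X}_{\mathbf{n}} - p| / \sigma < \varepsilon\sqrt{\mathbf{n}} / \sigma\} = 2\Phi(a) - 1.$$

So, $\lim_{\varepsilon \to 0} \Pr\{|\widehat{p} - p| < \varepsilon \mid p\} = \lim_{\varepsilon \to 0} \Pr\{\sqrt{\mathbf{n}}\,|\overline{X}_{\mathbf{n}} - p| / \sigma < \varepsilon\sqrt{\mathbf{n}} / \sigma\} = 2\Phi(a) - 1 = 2\Phi\left(\sqrt{2 \ln \frac{1}{\zeta\delta}}\right) - 1$ for any $p \in (0, 1)$.

Now, we focus our attention on showing that $\lim_{\varepsilon \to 0} \frac{\mathbb{E}[\mathbf{n}]}{N(p, \varepsilon, \delta, \zeta)} = 1$ for any $p \in (0, 1)$.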
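For this purpose, it suffices to show that

$$1 - \eta \leq \liminf_{\varepsilon \to 0} \frac{\mathbb{E}[\mathbf{n}]}{N(p, \varepsilon, \delta, \zeta)} \leq \limsup_{\varepsilon \to 0} \frac{\mathbb{E}[\mathbf{n}]}{N(p, \varepsilon, \delta, \zeta)} \leq 1 + \eta, \qquad \forall p \in (0, 1) \tag{31}$$

for any $\eta \in (0, 1)$. For simplicity of notation, we abbreviate $N(p, \varepsilon, \delta, \zeta)$ as $N$ in the sequel. Since we have established $\Pr\left\{\lim_{\varepsilon \to 0} \frac{\mathbf{n}}{N(p, \varepsilon, \delta, \zeta)} = 1\right\} = 1$, we can conclude that

$$\lim_{\varepsilon \to 0} \Pr\{(1 - \eta) N \leq \mathbf{n} \leq (1 + \eta) N\} = 1. \tag{32}$$

Noting that

$$\mathbb{E}[\mathbf{n}] = \sum_{m=0}^{\infty} m \Pr\{\mathbf{n} = m\} \geq \sum_{(1 - \eta) N \leq m \leq (1 + \eta) N} m \Pr\{\mathbf{n} = m\} \geq (1 - \eta) N \sum_{(1 - \eta) N \leq m \leq (1 + \eta) N} \Pr\{\mathbf{n} = m\},$$

we have

$$\mathbb{E}[\mathbf{n}] \geq (1 - \eta) N \Pr\{(1 - \eta) N \leq \mathbf{n} \leq (1 + \eta) N\}. \tag{33}$$

Combining (32) and (33) yields

$$\liminf_{\varepsilon \to 0} \frac{\mathbb{E}[\mathbf{n}]}{N(p, \varepsilon, \delta, \zeta)} \geq (1 - \eta) \lim_{\varepsilon \to 0} \Pr\{(1 - \eta) N \leq \mathbf{n} \leq (1 + \eta) N\} = 1 - \eta.$$

On the other hand, using $\mathbb{E}[\mathbf{n}] = \sum_{m=0}^{\infty} \Pr\{\mathbf{n} > m\}$, we can write

$$\mathbb{E}[\mathbf{n}] = \sum_{0 \leq m < (1 + \eta) N} \Pr\{\mathbf{n} > m\} + \sum_{m \geq (1 + \eta) N} \Pr\{\mathbf{n} > m\} \leq \lceil (1 + \eta) N \rceil + \sum_{m \geq (1 + \eta) N} \Pr\{\mathbf{n} > m\}.$$

Since $\limsup_{\varepsilon \to 0} \frac{\lceil (1 + \eta) N \rceil}{N(p, \varepsilon, \delta, \zeta)} = 1 + \eta$, for the purpose of establishing $\limsup_{\varepsilon \to 0} \frac{\mathbb{E}[\mathbf{n}]}{N(p, \varepsilon, \delta, \zeta)} \leq 1 + \eta$, it remains to show that

$$\limsup_{\varepsilon \to 0} \frac{\sum_{m \geq (1 + \eta) N} \Pr\{\mathbf{n} > m\}}{N(p, \varepsilon, \delta, \zeta)} = 0.$$

Consider the functions $f(x) = \frac{1}{4} - \left(\left|x - \frac{1}{2}\right| - \rho\varepsilon\right)^2$ and $g(x) = x(1 - x)$ for $x \in [0, 1]$. Note that

$$|f(x) - g(x)| = \left|\left(x - \tfrac{1}{2}\right)^2 - \left(\left|x - \tfrac{1}{2}\right| - \rho\varepsilon\right)^2\right| = \rho\varepsilon\,\big|\,|2x - 1| - \rho\varepsilon\,\big| \leq \rho\varepsilon(1 + \rho\varepsilon)$$

for all $x \in [0, 1]$. For $p \in (0, 1)$, there exists a positive number $\gamma < \min\{p, 1 - p\}$ such that $|g(x) - g(p)| < \frac{\eta}{2}\, p(1 - p)$ for any $x \in (p - \gamma, p + \gamma)$, since $g(x)$ is a continuous function of $x$. From now on, let $\varepsilon > 0$ be sufficiently small such that $\rho\varepsilon(1 + \rho\varepsilon) < \frac{\eta}{2}\, p(1 - p)$. Then,

$$f(x) \leq g(x) + \rho\varepsilon(1 + \rho\varepsilon) < g(p) + \frac{\eta}{2}\, p(1 - p) + \rho\varepsilon(1 + \rho\varepsilon) < (1 + \eta)\, p(1 - p)$$

for all $x \in (p - \gamma, p + \gamma)$.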
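This implies that
\[
\{ \overline{X}_m \in (p - \gamma, p + \gamma) \} \subseteq \left\{ (1 + \eta) p (1 - p) \ge \frac{1}{4} - \left( \left| \overline{X}_m - \frac{1}{2} \right| - \rho \varepsilon \right)^2 \right\}
\tag{34}
\]
for all $m > 0$. Taking complementary events on both sides of (34) leads to
\[
\left\{ (1 + \eta) p (1 - p) < \frac{1}{4} - \left( \left| \overline{X}_m - \frac{1}{2} \right| - \rho \varepsilon \right)^2 \right\} \subseteq \{ \overline{X}_m \notin (p - \gamma, p + \gamma) \}
\]
for all $m > 0$. Since
\[
(1 + \eta) p (1 - p) = \frac{(1 + \eta) N \varepsilon^2}{2 \ln \frac{1}{\zeta \delta}} \le \frac{m \varepsilon^2}{2 \ln \frac{1}{\zeta \delta}}
\]
for all $m \ge (1 + \eta) N$, it follows that
\[
\left\{ \frac{m \varepsilon^2}{2 \ln \frac{1}{\zeta \delta}} < \frac{1}{4} - \left( \left| \overline{X}_m - \frac{1}{2} \right| - \rho \varepsilon \right)^2 \right\} \subseteq \{ \overline{X}_m \notin (p - \gamma, p + \gamma) \}
\]
for all $m \ge (1 + \eta) N$. Therefore, we have shown that if $\varepsilon$ is sufficiently small, then there exists a number $\gamma > 0$ such that
\[
\{ n > m \} \subseteq \left\{ \left( \left| \overline{X}_m - \frac{1}{2} \right| - \rho \varepsilon \right)^2 < \frac{1}{4} + \frac{m \varepsilon^2}{2 \ln (\zeta \delta)} \right\} \subseteq \{ \overline{X}_m \notin (p - \gamma, p + \gamma) \}
\]
for all $m \ge (1 + \eta) N$. Using this inclusion relationship and the Chernoff-Hoeffding bound [2, 25], we have
\[
\Pr\{ n > m \} \le \Pr\{ \overline{X}_m \notin (p - \gamma, p + \gamma) \} \le 2 \exp(-2 m \gamma^2)
\tag{35}
\]
for all $m \ge (1 + \eta) N$, provided that $\varepsilon > 0$ is sufficiently small. Letting $k = \lceil (1 + \eta) N \rceil$ and using (35), we have
\[
\sum_{m \ge (1 + \eta) N} \Pr\{ n > m \} = \sum_{m \ge k} \Pr\{ n > m \} \le \sum_{m \ge k} 2 \exp(-2 m \gamma^2) = \frac{2 \exp(-2 k \gamma^2)}{1 - \exp(-2 \gamma^2)},
\]
provided that $\varepsilon$ is sufficiently small. Consequently,
\[
\limsup_{\varepsilon \to 0} \frac{\sum_{m \ge (1 + \eta) N} \Pr\{ n > m \}}{N(p, \varepsilon, \delta, \zeta)} \le \limsup_{\varepsilon \to 0} \frac{2}{N} \, \frac{\exp(-2 k \gamma^2)}{1 - \exp(-2 \gamma^2)} = 0,
\]
since $k \to \infty$ and $N \to \infty$ as $\varepsilon \to 0$. So, we have established (31). Since the argument holds for arbitrarily small $\eta > 0$, it must be true that $\lim_{\varepsilon \to 0} \frac{E[n]}{N(p, \varepsilon, \delta, \zeta)} = 1$ for any $p \in (0, 1)$. This completes the proof of the theorem.

D Proof of Theorem 4

Recall that $l$ denotes the index of stage at the termination of the sampling process.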
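Observing that
\[
\begin{aligned}
n_s - n_1 \Pr\{ l = 1 \}
&= n_s \Pr\{ l \le s \} - n_1 \Pr\{ l \le 1 \}
= \sum_{\ell=2}^{s} \left( n_\ell \Pr\{ l \le \ell \} - n_{\ell-1} \Pr\{ l < \ell \} \right) \\
&= \sum_{\ell=2}^{s} n_\ell \left( \Pr\{ l \le \ell \} - \Pr\{ l < \ell \} \right) + \sum_{\ell=2}^{s} ( n_\ell - n_{\ell-1} ) \Pr\{ l < \ell \} \\
&= \sum_{\ell=2}^{s} n_\ell \Pr\{ l = \ell \} + \sum_{\ell=1}^{s-1} ( n_{\ell+1} - n_\ell ) \Pr\{ l \le \ell \},
\end{aligned}
\]
we have $n_s - \sum_{\ell=1}^{s} n_\ell \Pr\{ l = \ell \} = \sum_{\ell=1}^{s-1} ( n_{\ell+1} - n_\ell ) \Pr\{ l \le \ell \}$. Making use of this result and the fact $n_s = n_1 + \sum_{\ell=1}^{s-1} ( n_{\ell+1} - n_\ell )$, we have
\[
\begin{aligned}
E[n] &= \sum_{\ell=1}^{s} n_\ell \Pr\{ l = \ell \}
= n_s - \left( n_s - \sum_{\ell=1}^{s} n_\ell \Pr\{ l = \ell \} \right) \\
&= n_1 + \sum_{\ell=1}^{s-1} ( n_{\ell+1} - n_\ell ) - \sum_{\ell=1}^{s-1} ( n_{\ell+1} - n_\ell ) \Pr\{ l \le \ell \} \\
&= n_1 + \sum_{\ell=1}^{\tau-1} ( n_{\ell+1} - n_\ell ) \Pr\{ l > \ell \} + \sum_{\ell=\tau}^{s-1} ( n_{\ell+1} - n_\ell ) \Pr\{ l > \ell \}.
\end{aligned}
\tag{36}
\]
By the definition of the stopping rule, we have
\[
\begin{aligned}
\{ l > \ell \}
&\subseteq \left\{ \left( \left| \widehat{p}_\ell - \frac{1}{2} \right| - \rho \varepsilon \right)^2 < \frac{1}{4} + \frac{\varepsilon^2 n_\ell}{2 \ln (\zeta \delta)} \right\} \\
&= \left\{ \rho \varepsilon - \sqrt{\frac{1}{4} + \frac{\varepsilon^2 n_\ell}{2 \ln (\zeta \delta)}} < \left| \widehat{p}_\ell - \frac{1}{2} \right| < \rho \varepsilon + \sqrt{\frac{1}{4} + \frac{\varepsilon^2 n_\ell}{2 \ln (\zeta \delta)}} \right\} \\
&= \left\{ \rho \varepsilon - \sqrt{\frac{1}{4} + \frac{\varepsilon^2 n_\ell}{2 \ln (\zeta \delta)}} < \frac{1}{2} - \widehat{p}_\ell < \rho \varepsilon + \sqrt{\frac{1}{4} + \frac{\varepsilon^2 n_\ell}{2 \ln (\zeta \delta)}},\ \widehat{p}_\ell \le \frac{1}{2} \right\} \\
&\quad \cup \left\{ \rho \varepsilon - \sqrt{\frac{1}{4} + \frac{\varepsilon^2 n_\ell}{2 \ln (\zeta \delta)}} < \widehat{p}_\ell - \frac{1}{2} < \rho \varepsilon + \sqrt{\frac{1}{4} + \frac{\varepsilon^2 n_\ell}{2 \ln (\zeta \delta)}},\ \widehat{p}_\ell > \frac{1}{2} \right\} \\
&\subseteq \{ a_\ell < \widehat{p}_\ell < b_\ell \} \cup \{ 1 - b_\ell < \widehat{p}_\ell < 1 - a_\ell \}
\end{aligned}
\tag{37}
\]
for $1 \le \ell < s$, where $b_\ell = \frac{1}{2} - \rho \varepsilon + \sqrt{\frac{1}{4} + \frac{\varepsilon^2 n_\ell}{2 \ln (\zeta \delta)}}$ for $\ell = 1, \cdots, s - 1$. By the assumption that $\varepsilon$ and $\rho$ are non-negative, we have $1 - b_\ell - a_\ell = 2 \rho \varepsilon \ge 0$, that is, $a_\ell \le 1 - b_\ell$, for $\ell = 1, \cdots, s - 1$. Hence both events on the right-hand side of (37) are contained in $\{ \widehat{p}_\ell > a_\ell \}$, and it follows from (37) that $\{ l > \ell \} \subseteq \{ \widehat{p}_\ell > a_\ell \}$ for $\ell = 1, \cdots, s - 1$. By the definition of $\tau$, we have $p < a_\ell$ for $\tau \le \ell < s$. Making use of this fact, the inclusion relationship $\{ l > \ell \} \subseteq \{ \widehat{p}_\ell > a_\ell \}$, $\ell = 1, \cdots, s - 1$, and the Chernoff-Hoeffding bound [2, 25], we have
\[
\Pr\{ n > n_\ell \mid p \} = \Pr\{ l > \ell \mid p \} \le \Pr\{ \widehat{p}_\ell > a_\ell \mid p \} \le \exp( n_\ell M ( a_\ell, p ) )
\tag{38}
\]
for $\tau \le \ell < s$.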
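The rearrangement leading to (36) is a summation-by-parts identity, so it can be sanity-checked numerically for any distribution of the stopping stage. The sketch below is illustrative only: the stage sample sizes and the stopping-stage probabilities are made-up values, and the function names are hypothetical rather than taken from the paper.

```python
import math


def expected_size_direct(sizes, pmf):
    """E[n] = sum over stages of n_l * Pr{stopping stage = l}."""
    return sum(n * q for n, q in zip(sizes, pmf))


def expected_size_tail(sizes, pmf):
    """E[n] = n_1 + sum_{l=1}^{s-1} (n_{l+1} - n_l) * Pr{stopping stage > l}."""
    total = float(sizes[0])
    cdf = 0.0
    for idx in range(len(sizes) - 1):
        cdf += pmf[idx]  # Pr{stopping stage <= idx + 1} in 1-based stage numbering
        total += (sizes[idx + 1] - sizes[idx]) * (1.0 - cdf)
    return total


if __name__ == "__main__":
    # Hypothetical four-stage design: increasing sample sizes n_1 < ... < n_s
    # and an arbitrary probability mass function for the stopping stage.
    sizes = [10, 25, 45, 80]
    pmf = [0.1, 0.2, 0.3, 0.4]
    direct = expected_size_direct(sizes, pmf)
    tail = expected_size_tail(sizes, pmf)
    assert math.isclose(direct, tail)
    print(direct, tail)
```

For these illustrative inputs both forms evaluate to 51.5. The tail-sum form is the convenient one for bounding purposes, because each term involves only a tail probability $\Pr\{ l > \ell \}$, which is the quantity controlled by (38).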
It follows from (36) and (38) that
\[
\begin{aligned}
E[n] &\le n_1 + \sum_{\ell=1}^{\tau-1} ( n_{\ell+1} - n_\ell ) + \sum_{\ell=\tau}^{s-1} ( n_{\ell+1} - n_\ell ) \Pr\{ l > \ell \} \\
&= n_\tau + \sum_{\ell=\tau}^{s-1} ( n_{\ell+1} - n_\ell ) \Pr\{ l > \ell \} \\
&\le n_\tau + \sum_{\ell=\tau}^{s-1} ( n_{\ell+1} - n_\ell ) \exp( n_\ell M ( a_\ell, p ) ).
\end{aligned}
\]
This completes the proof of the theorem.

References

[1] F. J. Anscombe, “Sequential estimation,” J. Roy. Statist. Soc. Ser. B, vol. 15, pp. 1–21, 1953.

[2] H. Chernoff, “A measure of asymptotic efficiency for tests of a hypothesis based on the sum of observations,” Ann. Math. Statist., vol. 23, pp. 493–507, 1952.

[3] H. Chen, “The accuracy of approximate intervals for a binomial parameter,” The Journal of American Statistical Association, vol. 85, pp. 514–518, 1990.

[4] S. C. Chow, J. Shao, and H. Wang, Sample Size Calculations in Clinical Trials, 2nd edition, Chapman & Hall, 2008.

[5] Y. S. Chow and H. Robbins, “On the asymptotic theory of fixed width confidence intervals for the mean,” Ann. Math. Statist., vol. 36, pp. 457–462, 1965.

[6] X. Chen, “A new framework of multistage estimation,” arXiv:0809.1241v1 [math.ST], appeared in http://arxiv.org/abs/0809.1241v1 on September 8, 2008.

[7] X. Chen, “A new framework of multistage estimation,” arXiv:0809.1241v4 [math.ST], appeared in http://arxiv.org/abs/0809.1241v4 on December 2, 2008.

[8] X. Chen, “A new framework of multistage estimation,” arXiv:0809.1241v12 [math.ST], appeared in http://arxiv.org/abs/0809.1241v12 on April 27, 2009.

[9] X. Chen, “A new framework of multistage estimation,” arXiv:0809.1241v16 [math.ST], appeared in http://arxiv.org/abs/0809.1241v16 on November 20, 2009.

[10] X. Chen, “Multistage estimation of bounded-variable means,” arXiv:0809.4679v1 [math.ST], appeared in http://arxiv.org/abs/0809.4679v1 on September 26, 2008.

[11] X. Chen, “Multistage estimation of bounded-variable means,” arXiv:0809.4679v2 [math.ST], appeared in http://arxiv.org/abs/0809.4679v2 on October 16, 2008.

[12] X. Chen, “Estimating the parameters of binomial and Poisson distributions via multistage sampling,” arXiv:0810.0430v1 [math.ST], appeared in http://arxiv.org/abs/0810.0430v1 on October 2, 2008.

[13] X. Chen, “Confidence interval for the mean of a bounded random variable and its applications in point estimation,” arXiv:0802.3458v2 [math.ST], appeared in http://arxiv.org/abs/0802.3458v2 on April 2, 2009.

[14] X. Chen, “A new framework of multistage parametric inference,” Proceeding of SPIE Conference, vol. 7666, pp. 76660R1–12, Orlando, Florida, April 2010.

[15] X. Chen, “Exact computation of minimum sample size for estimation of binomial parameters,” Journal of Statistical Planning and Inference, vol. 141, pp. 2622–2632, 2011.

[16] X. Chen, K. Zhou and J. Aravena, “Explicit formula for constructing binomial confidence interval with guaranteed coverage probability,” Communications in Statistics – Theory and Methods, vol. 37, pp. 1173–1180, 2008.

[17] C. J. Clopper and E. S. Pearson, “The use of confidence or fiducial limits illustrated in the case of the binomial,” Biometrika, vol. 26, pp. 404–413, 1934.

[18] G. S. Fishman, “Confidence intervals for the mean in the bounded case,” Statistics & Probability Letters, vol. 12, pp. 223–227, 1990.

[19] W. Feller, An Introduction to Probability Theory and Its Applications, vol. 1, 3rd ed., Wiley, New York, 1968.

[20] S. Franzén, “Fixed length sequential confidence intervals for the probability of response,” Sequential Analysis, vol. 20, pp. 45–54, 2001.

[21] S. Franzén, “SPRT fixed length confidence intervals,” Communications in Statistics – Theory and Methods, vol. 33, pp. 305–319, 2004.

[22] J. Frey, “Fixed-width sequential confidence intervals for a proportion,” The American Statistician, vol. 64, no. 3, pp. 242–249, August 2010. The original, first, second and final revisions of this paper were submitted to The American Statistician in July 2009, November 2009, April 2010, and June 2010 respectively.

[23] B. K. Ghosh and P. K. Sen, Handbook of Sequential Analysis, Marcel Dekker Inc., 1991.

[24] M. Ghosh, N. Mukhopadhyay and P. K. Sen, Sequential Estimation, Wiley, New York, 1997.

[25] W. Hoeffding, “Probability inequalities for sums of bounded variables,” J. Amer. Statist. Assoc., vol. 58, pp. 13–29, 1963.

[26] C. Jennison and B. W. Turnbull, Group Sequential Methods with Applications to Clinical Trials, Chapman & Hall, 1999.

[27] T. L. Lai, “Sequential Analysis: Some classical problems and new challenges,” Statistica Sinica, vol. 11, pp. 303–408, 2001.

[28] A. H. Land and A. G. Doig, “An automatic method of solving discrete programming problems,” Econometrica, vol. 28, no. 3, pp. 497–520, 1960.

[29] P. Massart, “The tight constant in the Dvoretzky-Kiefer-Wolfowitz inequality,” The Annals of Probability, vol. 18, pp. 1269–1283, 1990.

[30] L. Mendo and J. M. Hernando, “Improved sequential stopping rule for Monte Carlo simulation,” IEEE Trans. Commun., vol. 56, no. 11, pp. 1761–1764, Nov. 2008.

[31] J. R. Schultz, F. R. Nichol, G. L. Elfring, and S. D. Weed, “Multiple-stage procedures for drug screening,” Biometrics, vol. 29, pp. 293–300, 1973.

[32] D. Siegmund, Sequential Analysis: Tests and Confidence Intervals, Springer-Verlag, New York, 1985.

[33] M. Tanaka, “On a confidence interval of given length for the parameter of the binomial and Poisson distributions,” Ann. Inst. Statist. Math., vol. 13, pp. 201–215, 1961.

[34] A. Wald, Sequential Analysis, Wiley, New York, 1947.

[35] E. B. Wilson, “Probable inference, the law of succession, and statistical inference,” Journal of the American Statistical Association, vol. 22, pp. 209–212, 1927.
