Optimal Explicit Binomial Confidence Interval with Guaranteed Coverage Probability

In this paper, we develop an approach for optimizing the explicit binomial confidence interval recently derived by Chen et al. The optimization reduces conservativeness while guaranteeing prescribed coverage probability.

Authors: Xinjia Chen

Xinjia Chen*

Submitted in April 2008

* The author is currently with the Department of Electrical Engineering, Louisiana State University at Baton Rouge, LA 70803, USA, and the Department of Electrical Engineering, Southern University and A&M College, Baton Rouge, LA 70813, USA; Email: chenxinjia@gmail.com

1 Explicit Formula of Chen et al.

Let $X$ be a Bernoulli random variable defined in a probability space $(\Omega, \mathcal{F}, \Pr)$ with distribution $\Pr\{X = 1\} = 1 - \Pr\{X = 0\} = p \in (0, 1)$. A frequent problem is to construct a confidence interval for $p$ based on $n$ i.i.d. random samples $X_1, \cdots, X_n$ of $X$. Recently, Chen et al. proposed in [2] an explicit confidence interval with lower confidence limit

$$L_{n,\delta} = \frac{K}{n} + \frac{3}{4} \, \frac{1 - \frac{2K}{n} - \sqrt{1 + \frac{9}{2 \ln\frac{2}{\delta}} \, K \left(1 - \frac{K}{n}\right)}}{1 + \frac{9n}{8 \ln\frac{2}{\delta}}} \tag{1}$$

and upper confidence limit

$$U_{n,\delta} = \frac{K}{n} + \frac{3}{4} \, \frac{1 - \frac{2K}{n} + \sqrt{1 + \frac{9}{2 \ln\frac{2}{\delta}} \, K \left(1 - \frac{K}{n}\right)}}{1 + \frac{9n}{8 \ln\frac{2}{\delta}}} \tag{2}$$

where $K = \sum_{i=1}^{n} X_i$. This confidence interval guarantees that the coverage probability $\Pr\{L_{n,\delta} < p < U_{n,\delta} \mid p\}$ is greater than $1 - \delta$ for any $p \in (0, 1)$. Clearly, the explicit binomial confidence interval is conservative, and it is desirable to optimize it by tuning the parameter $\delta$. This is the objective of the next section.

2 Optimization of Explicit Binomial Confidence Interval

As will be shown in Section 3, the following result holds.

Theorem 1. For any fixed $n$ and $p \in (0, 1)$, the coverage probability of the confidence interval $[L_{n,\delta}, U_{n,\delta}]$ decreases as $\delta$ increases.
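As an illustration, the limits (1) and (2) can be evaluated directly. The Python sketch below is not part of the original paper: the function name `chen_interval` and the shorthand `lam` for the quantity $9n / (8 \ln(2/\delta))$ appearing in the denominators of (1) and (2) are assumptions made here for readability. The small loop at the end prints the interval for several values of $\delta$; the interval should narrow as $\delta$ grows, which is the mechanism behind Theorem 1.

```python
import math


def chen_interval(k, n, delta):
    """Return (lower, upper) from equations (1) and (2) for k successes in n trials.

    `lam` is a local shorthand (an assumption of this sketch) for
    9 n / (8 ln(2 / delta)).
    """
    if n < 1 or not 0 <= k <= n:
        raise ValueError("need n >= 1 and 0 <= k <= n")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie in (0, 1)")

    lam = 9.0 * n / (8.0 * math.log(2.0 / delta))
    z = k / n
    # 4 * lam * z * (1 - z) equals (9 / (2 ln(2/delta))) * k * (1 - k/n).
    root = math.sqrt(1.0 + 4.0 * lam * z * (1.0 - z))
    lower = z + 0.75 * (1.0 - 2.0 * z - root) / (1.0 + lam)
    upper = z + 0.75 * (1.0 - 2.0 * z + root) / (1.0 + lam)
    return lower, upper


if __name__ == "__main__":
    # Interval for 3 successes in 10 trials at several values of delta.
    for delta in (0.01, 0.05, 0.10, 0.20):
        low, up = chen_interval(3, 10, delta)
        print(f"delta={delta:.2f}  L={low:.4f}  U={up:.4f}  width={up - low:.4f}")
```

For some combinations of $k$, $n$ and $\delta$ the formulas return a lower limit below 0 or an upper limit above 1; the sketch returns these values unmodified rather than clipping them.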
By Theorem 1, it is possible to find $\delta > \alpha$ such that

$$\Pr\{L_{n,\delta} < p < U_{n,\delta} \mid p\} > 1 - \alpha, \quad \forall p \in (0, 1)$$

for $\alpha \in (0, 1)$. To reduce the conservatism of the confidence interval, we consider the following optimization problem: for a given $\alpha \in (0, 1)$, maximize $\delta$ subject to the constraint that

$$\inf_{p \in (0,1)} \Pr\{L_{n,\delta} \le p \le U_{n,\delta} \mid p\} \ge 1 - \alpha.$$

A similar problem is to maximize $\delta$ subject to the constraint that

$$\inf_{p \in (0,1)} \Pr\{L_{n,\delta} < p < U_{n,\delta} \mid p\} \ge 1 - \alpha.$$

As a result of Theorem 1, the maximum $\delta$ can be obtained from $(\alpha, 1)$ by a bisection search. For this purpose, it is essential to evaluate $\inf_{p \in (0,1)} \Pr\{L_{n,\delta} \le p \le U_{n,\delta} \mid p\}$ and $\inf_{p \in (0,1)} \Pr\{L_{n,\delta} < p < U_{n,\delta} \mid p\}$ efficiently. This is accomplished by the following theorem, derived from the theory of random intervals established in [3].

Theorem 2. Let $\delta \in (0, 1)$. Define

$$T^{-}(p) = np + \frac{1 - 2p - \sqrt{1 + \frac{18 n p (1 - p)}{\ln(2/\delta)}}}{\frac{2}{3n} + \frac{3}{\ln(2/\delta)}}, \qquad T^{+}(p) = np + \frac{1 - 2p + \sqrt{1 + \frac{18 n p (1 - p)}{\ln(2/\delta)}}}{\frac{2}{3n} + \frac{3}{\ln(2/\delta)}}$$

for $p \in (0, 1)$ and

$$L(k) = \frac{k}{n} + \frac{3}{4} \, \frac{1 - \frac{2k}{n} - \sqrt{1 + \frac{9}{2 \ln\frac{2}{\delta}} \, k \left(1 - \frac{k}{n}\right)}}{1 + \frac{9n}{8 \ln\frac{2}{\delta}}}, \qquad U(k) = \frac{k}{n} + \frac{3}{4} \, \frac{1 - \frac{2k}{n} + \sqrt{1 + \frac{9}{2 \ln\frac{2}{\delta}} \, k \left(1 - \frac{k}{n}\right)}}{1 + \frac{9n}{8 \ln\frac{2}{\delta}}}$$

for $k = 0, 1, \cdots, n$. Define

$$C_l(k) = \Pr\{\lceil T^{-}(L(k)) \rceil \le K \le k - 1 \mid L(k)\}, \qquad C'_l(k) = \Pr\{\lfloor T^{-}(L(k)) \rfloor + 1 \le K \le k - 1 \mid L(k)\}$$

for $k \in \{0, 1, \cdots, n\}$ such that $0 < L(k) < 1$. Define

$$C_u(k) = \Pr\{k + 1 \le K \le \lfloor T^{+}(U(k)) \rfloor \mid U(k)\}, \qquad C'_u(k) = \Pr\{k + 1 \le K \le \lceil T^{+}(U(k)) \rceil - 1 \mid U(k)\}$$

for $k \in \{0, 1, \cdots, n\}$ such that $0 < U(k) < 1$. Then, the following statements hold true:

(I): $\inf_{p \in (0,1)} \Pr\{L_{n,\delta} \le p \le U_{n,\delta} \mid p\}$ equals the minimum of $\{C_l(k) : 0 \le k \le n;\ 0 < L(k) < 1\} \cup \{C_u(k) : 0 \le k \le n;\ 0 < U(k) < 1\}$.
(II): $\inf_{p \in (0,1)} \Pr\{L_{n,\delta} < p < U_{n,\delta} \mid p\}$ equals the minimum of $\{C'_l(k) : 0 \le k \le n;\ 0 < L(k) < 1\} \cup \{C'_u(k) : 0 \le k \le n;\ 0 < U(k) < 1\}$.

The proof of Theorem 2 is provided in Section 4.

3 Proof of Theorem 1

For simplicity of notation, define $\lambda = \frac{9n}{8 \ln\frac{2}{\delta}}$ and $z = \frac{k}{n}$. Then, for $K = k$, the upper and lower confidence limits are $U_{n,\delta} = U(z)$ and $L_{n,\delta} = L(z)$, respectively, where

$$U(z) = z + \frac{3}{4} \, \frac{1 - 2z + \sqrt{1 + 4\lambda z(1 - z)}}{1 + \lambda}, \qquad L(z) = z + \frac{3}{4} \, \frac{1 - 2z - \sqrt{1 + 4\lambda z(1 - z)}}{1 + \lambda}.$$

Since $(1 - 2z)^2 \le 1 + 4\lambda z(1 - z)$ for $z \in [0, 1]$ and $\lambda > 0$, we have $L(z) \le z$ and $U(z) \ge z$. Hence, to show Theorem 1, it suffices to show that both $U(z) - z$ and $z - L(z)$ decrease as $\delta$ increases for any $z \in [0, 1]$.

We shall first show that $U(z) - z$ decreases as $\delta$ increases for any fixed $z \in [0, 1]$. Since $\lambda$ increases with $\delta$, it suffices to define $y = \frac{4[U(z) - z]}{3}$ and show that $\frac{\partial y}{\partial \lambda} < 0$. To this end, we can use the definition of $y$ to obtain the equation

$$[(1 + \lambda) y - (1 - 2z)]^2 = 1 + 4\lambda z(1 - z).$$

Differentiating both sides of this equation with respect to $\lambda$ yields

$$2[(1 + \lambda) y - (1 - 2z)] \left[(1 + \lambda) \frac{\partial y}{\partial \lambda} + y\right] = 4z(1 - z),$$

from which we have

$$(1 + \lambda) \frac{\partial y}{\partial \lambda} = \frac{2z(1 - z)}{(1 + \lambda) y - (1 - 2z)} - y.$$

Clearly, to show $\frac{\partial y}{\partial \lambda} < 0$, it suffices to show that the right-hand side of the above equation is negative for $\lambda > 0$. Because $(1 + \lambda) y - (1 - 2z) = \sqrt{1 + 4\lambda z(1 - z)}$, this amounts to showing

$$\frac{2z(1 - z)}{\sqrt{1 + 4\lambda z(1 - z)}} < y,$$

or equivalently,

$$(1 + \lambda) \, 2z(1 - z) < (1 - 2z) \sqrt{1 + 4\lambda z(1 - z)} + 1 + 4\lambda z(1 - z),$$

which can be written as $2z(1 - z) < w(\lambda)$, where

$$w(\lambda) = (1 - 2z) \sqrt{1 + 4\lambda z(1 - z)} + 1 + 2\lambda z(1 - z).$$

Note that, for $z \in (0, 1)$,

$$\frac{\partial w(\lambda)}{\partial \lambda} = 2z(1 - z) \left[\frac{1 - 2z}{\sqrt{1 + 4\lambda z(1 - z)}} + 1\right] > 0$$

as a result of

$$\frac{1 - 2z}{\sqrt{1 + 4\lambda z(1 - z)}} > \frac{-1}{\sqrt{1 + 4\lambda z(1 - z)}} > -1.$$

Hence, $w(\lambda) > w(0) = 2(1 - z)$ for any $\lambda > 0$.
This shows that $2z(1 - z) < w(\lambda)$ for any $z \in (0, 1)$ and $\lambda > 0$; the inequality also holds at $z = 0$, where $w(\lambda) = 2$. Consequently, we have established $\frac{\partial y}{\partial \lambda} < 0$ for $z \in [0, 1)$, which implies that $U(z) - z$ decreases as $\delta$ increases for any fixed $z \in [0, 1)$ (at $z = 1$, $U(1) - 1 = 0$ for every $\lambda$).

Observing that $L(z) = 1 - U(1 - z)$ for any $z \in [0, 1]$, we have

$$z - L(z) = z - [1 - U(1 - z)] = U(1 - z) - (1 - z).$$

Therefore, $z - L(z)$ decreases as $\delta$ increases for any fixed $z \in (0, 1]$ (at $z = 0$, $z - L(z) = 0$ for every $\lambda$). This completes the proof of Theorem 1.

4 Proof of Theorem 2

For simplicity of notation, we define $\lambda$, $z$, $L(z)$ and $U(z)$ as in the proof of Theorem 1. Note that $L(1) = 1 - \frac{3}{2(1 + \lambda)} < 1$ and that the derivative of $L(z)$ with respect to $z$ is

$$L'(z) = 1 + \frac{3}{4(1 + \lambda)} \left[-2 - \frac{1}{2} \, \frac{4\lambda(1 - 2z)}{\sqrt{1 + 4\lambda z(1 - z)}}\right] = 1 - \frac{3}{2(1 + \lambda)} - \frac{3}{2(1 + \lambda)} \, \frac{\lambda(1 - 2z)}{\sqrt{1 + 4\lambda z(1 - z)}} = \frac{1}{2(1 + \lambda)} \left[2\lambda - 1 - \frac{3\lambda(1 - 2z)}{\sqrt{1 + 4\lambda z(1 - z)}}\right],$$

which is positive if and only if $(2\lambda - 1) \sqrt{1 + 4\lambda z(1 - z)} > 3\lambda(1 - 2z)$.

To complete the proof of Theorem 2, we need some preliminary results.

Lemma 1. For any $n \ge 1$ and $\delta \in (0, 1)$,

$$\frac{(2\lambda - 1)^2 (1 + \lambda)}{36\lambda^2 + 4\lambda(2\lambda - 1)^2} \ge \frac{1}{4} \tag{3}$$

if and only if $\lambda \le \frac{1}{5}$.

Proof. Note that (3) is equivalent to $(2\lambda - 1)^2 (1 + \lambda) \ge 9\lambda^2 + \lambda(2\lambda - 1)^2$, which can be simplified as $(5\lambda - 1)(\lambda + 1) \le 0$. Since $\delta \in (0, 1)$ and $n \ge 1$, we have $\lambda > 0$. Hence, the inequality (3) holds if and only if $\lambda \le \frac{1}{5}$. □

Lemma 2. $L(z)$ is monotonically increasing with respect to $z \in [0, 1]$ such that $L(z) > 0$. Similarly, $U(z)$ is monotonically increasing with respect to $z \in [0, 1]$ such that $U(z) < 1$.

Proof. We shall first show that $L(z)$ is monotonically increasing with respect to $z \in [0, 1]$ such that $L(z) > 0$. It suffices to consider four cases:

Case (i): $\lambda \ge \frac{1}{2}$ and $0 < z \le \frac{1}{2}$;
Case (ii): $\lambda \ge \frac{1}{2}$ and $1 > z > \frac{1}{2}$;
Case (iii): $\lambda < \frac{1}{2}$ and $0 < z \le \frac{1}{2}$;
Case (iv): $\lambda < \frac{1}{2}$ and $1 > z > \frac{1}{2}$.
In Case (i), $L(z)$ increases if and only if $(2\lambda - 1)^2 [1 + 4\lambda z(1 - z)] > 9\lambda^2 (1 - 2z)^2$, or equivalently,

$$\left(z - \frac{1}{2}\right)^2 < \frac{(2\lambda - 1)^2 (1 + \lambda)}{36\lambda^2 + 4\lambda(2\lambda - 1)^2}.$$

Define

$$z^{*} = \frac{1}{2} - \sqrt{\frac{(2\lambda - 1)^2 (1 + \lambda)}{36\lambda^2 + 4\lambda(2\lambda - 1)^2}}.$$

By Lemma 1, we have $z^{*} > 0$. It follows that $L(z)$ is monotonically decreasing with respect to $z \in (0, z^{*})$ and monotonically increasing with respect to $z \in \left(z^{*}, \frac{1}{2}\right]$. This implies that $L(z)$ achieves its minimum at $z^{*}$ and that $L(z) < L(0) = 0$ for any $z \in (0, z^{*})$. Therefore, $L(z)$ is monotonically increasing with respect to those $z \in (0, 1)$ for which $L(z) \ge 0$ and the conditions of Case (i) hold.

In Case (ii), $L(z)$ increases for $z \in \left(\frac{1}{2}, 1\right)$.

In Case (iii), $L(z)$ decreases for $z \in \left(0, \frac{1}{2}\right]$. It can be seen that $L(z) < L(0) = 0$ for any $z \in \left(0, \frac{1}{2}\right]$.

In Case (iv), $L(z)$ increases if and only if $(2\lambda - 1) \sqrt{1 + 4\lambda z(1 - z)} > 3\lambda(1 - 2z)$, which can be written as $(1 - 2\lambda) \sqrt{1 + 4\lambda z(1 - z)} < 3\lambda(2z - 1)$, or equivalently,

$$\left(z - \frac{1}{2}\right)^2 > \frac{(2\lambda - 1)^2 (1 + \lambda)}{36\lambda^2 + 4\lambda(2\lambda - 1)^2}.$$

Define

$$z^{\star} = \frac{1}{2} + \sqrt{\frac{(2\lambda - 1)^2 (1 + \lambda)}{36\lambda^2 + 4\lambda(2\lambda - 1)^2}}.$$

If $\frac{1}{2} > \lambda > \frac{1}{5}$, then by Lemma 1 we have $z^{\star} < 1$. Hence, $L(z)$ increases for $z \in (z^{\star}, 1)$ and

$$L(z) < L(1) = \left. z + \frac{3}{4} \, \frac{1 - 2z - \sqrt{1 + 4\lambda z(1 - z)}}{1 + \lambda} \right|_{z = 1} = \frac{2\lambda - 1}{2(1 + \lambda)} < 0, \quad \forall z \in (z^{\star}, 1).$$

Moreover, $L(z)$ decreases for $z \in \left(\frac{1}{2}, z^{\star}\right)$ and

$$L(z) < L\left(\frac{1}{2}\right) = \left. z + \frac{3}{4} \, \frac{1 - 2z - \sqrt{1 + 4\lambda z(1 - z)}}{1 + \lambda} \right|_{z = 1/2} = \frac{1}{2} - \frac{3}{4\sqrt{1 + \lambda}} < 0, \quad \forall z \in \left(\frac{1}{2}, z^{\star}\right).$$

If $0 < \lambda \le \frac{1}{5}$, then by Lemma 1 we have $z^{\star} \ge 1$. Hence, $L(z)$ decreases for $z \in \left(\frac{1}{2}, 1\right)$ and $L(z) < L\left(\frac{1}{2}\right) < 0$ for all $z \in \left(\frac{1}{2}, 1\right)$.

Based on the preceding investigation, we can conclude that the lower confidence limit is non-decreasing with respect to $z \in (0, 1)$ such that $L(z) \ge 0$. Recalling that $L(1) < 1$, we have that $L(z) < 1$ for any $z \in (0, 1)$.
Since $U(z) = 1 - L(1 - z) > 0$ for any $z \in (0, 1)$, we have that the upper confidence limit $U(z)$ is also non-decreasing with respect to $z \in (0, 1)$ such that $U(z) \le 1$. □

Now we consider the minimum coverage probability. By the definitions of $L_{n,\delta}$, $U_{n,\delta}$ and $L(k)$, $U(k)$, we have

$$\Pr\{L_{n,\delta} \le p < U_{n,\delta} \mid U(k)\} = \Pr\{k < K \le T^{+}(p) \mid U(k)\}, \quad 0 < U(k) < 1,$$

$$\Pr\{L_{n,\delta} < p \le U_{n,\delta} \mid L(k)\} = \Pr\{T^{-}(p) \le K < k \mid L(k)\}, \quad 0 < L(k) < 1,$$

$$\Pr\{L_{n,\delta} < p < U_{n,\delta} \mid U(k)\} = \Pr\{k < K < T^{+}(p) \mid U(k)\}, \quad 0 < U(k) < 1,$$

$$\Pr\{L_{n,\delta} < p < U_{n,\delta} \mid L(k)\} = \Pr\{T^{-}(p) < K < k \mid L(k)\}, \quad 0 < L(k) < 1.$$

Since both $L(z)$ and $U(z)$ are monotone, the proof of Theorem 2 can be completed by making use of the above results and applying the theory of coverage probability of random intervals established by Chen in [3].

References

[1] Brown, L. D., Cai, T. and DasGupta, A., "Interval estimation for a binomial proportion and asymptotic expansions," The Annals of Statistics, vol. 30, pp. 160-201, 2002.

[2] Chen X., Zhou K. and Aravena J., "Explicit formula for constructing binomial confidence interval with guaranteed coverage probability," Communications in Statistics – Theory and Methods, vol. 37, pp. 1173-1180, 2008.

[3] Chen X., "Coverage Probability of Random Intervals," arXiv:0707.2814, July 2007.
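The bisection search described in Section 2 can be sketched numerically as follows. This Python example is an illustration under stated assumptions, not the author's implementation: the function names (`limits`, `t_bounds`, `binom_range`, `min_coverage`, `optimize_delta`), the tolerance `tol`, and the convention that the infimum equals 1 when no limit falls inside $(0, 1)$ are all choices made here. `min_coverage` evaluates the minimum in statement (I) of Theorem 2, and `optimize_delta` bisects on $(\alpha, 1)$, relying on Theorem 1 for monotonicity in $\delta$.

```python
import math


def limits(k, n, delta):
    """L(k) and U(k) as defined in Theorem 2."""
    lam = 9.0 * n / (8.0 * math.log(2.0 / delta))
    z = k / n
    root = math.sqrt(1.0 + 4.0 * lam * z * (1.0 - z))
    return (z + 0.75 * (1.0 - 2.0 * z - root) / (1.0 + lam),
            z + 0.75 * (1.0 - 2.0 * z + root) / (1.0 + lam))


def t_bounds(p, n, delta):
    """T^-(p) and T^+(p) as defined in Theorem 2."""
    c = math.log(2.0 / delta)
    root = math.sqrt(1.0 + 18.0 * n * p * (1.0 - p) / c)
    denom = 2.0 / (3.0 * n) + 3.0 / c
    return (n * p + (1.0 - 2.0 * p - root) / denom,
            n * p + (1.0 - 2.0 * p + root) / denom)


def binom_range(lo, hi, n, p):
    """Pr{lo <= K <= hi} for K ~ Binomial(n, p); an empty range gives 0."""
    lo, hi = max(lo, 0), min(hi, n)
    return sum(math.comb(n, j) * p ** j * (1.0 - p) ** (n - j)
               for j in range(lo, hi + 1))


def min_coverage(n, delta):
    """Minimum in statement (I) of Theorem 2 (closed-interval coverage)."""
    values = []
    for k in range(n + 1):
        low, up = limits(k, n, delta)
        if 0.0 < low < 1.0:  # C_l(k), evaluated at p = L(k)
            t_minus, _ = t_bounds(low, n, delta)
            values.append(binom_range(math.ceil(t_minus), k - 1, n, low))
        if 0.0 < up < 1.0:  # C_u(k), evaluated at p = U(k)
            _, t_plus = t_bounds(up, n, delta)
            values.append(binom_range(k + 1, math.floor(t_plus), n, up))
    # Assumption of this sketch: with no limit inside (0, 1), every interval
    # contains (0, 1), so the coverage is taken to be 1.
    return min(values, default=1.0)


def optimize_delta(n, alpha, tol=1e-6):
    """Bisection on (alpha, 1) for the largest delta with min coverage >= 1 - alpha."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie in (0, 1)")
    lo, hi = alpha, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if min_coverage(n, mid) >= 1.0 - alpha:
            lo = mid  # constraint holds, try a larger delta
        else:
            hi = mid
    return lo


if __name__ == "__main__":
    n, alpha = 20, 0.05
    delta = optimize_delta(n, alpha)
    print(f"n={n}, alpha={alpha}: delta={delta:.4f}, "
          f"min coverage={min_coverage(n, delta):.4f}")
```

The floor and ceiling operations are applied to floating-point values, so a value of $T^{-}$ or $T^{+}$ that lies extremely close to an integer may be rounded to the wrong side; a careful implementation would guard against this. The binomial sum is computed term by term, which is adequate for small $n$ only.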
