Optimal Explicit Binomial Confidence Interval with Guaranteed Coverage Probability

Xinjia Chen ∗

Submitted in April, 2008

Abstract

In this paper, we develop an approach for optimizing the explicit binomial confidence interval recently derived by Chen et al. The optimization reduces conservativeness while guaranteeing the prescribed coverage probability.

1 Explicit Formula of Chen et al.

Let $X$ be a Bernoulli random variable defined in a probability space $(\Omega, \mathcal{F}, \Pr)$ with distribution $\Pr\{X = 1\} = 1 - \Pr\{X = 0\} = p \in (0, 1)$. A frequent problem is to construct a confidence interval for $p$ based on $n$ i.i.d. random samples $X_1, \cdots, X_n$ of $X$. Recently, Chen et al. proposed in [2] an explicit confidence interval with lower confidence limit

\[
L_{n,\delta} = \frac{K}{n} + \frac{3}{4} \, \frac{1 - \frac{2K}{n} - \sqrt{1 + \frac{9}{2 \ln \frac{2}{\delta}} K \left(1 - \frac{K}{n}\right)}}{1 + \frac{9n}{8 \ln \frac{2}{\delta}}} \tag{1}
\]

and upper confidence limit

\[
U_{n,\delta} = \frac{K}{n} + \frac{3}{4} \, \frac{1 - \frac{2K}{n} + \sqrt{1 + \frac{9}{2 \ln \frac{2}{\delta}} K \left(1 - \frac{K}{n}\right)}}{1 + \frac{9n}{8 \ln \frac{2}{\delta}}} \tag{2}
\]

where $K = \sum_{i=1}^{n} X_i$. This confidence interval guarantees that the coverage probability $\Pr\{L_{n,\delta} < p < U_{n,\delta} \mid p\}$ is greater than $1 - \delta$ for any $p \in (0, 1)$. Clearly, the explicit binomial confidence interval is conservative, and it is desirable to optimize it by tuning the parameter $\delta$. This is the objective of the next section.

∗ The author is currently with the Department of Electrical Engineering, Louisiana State University at Baton Rouge, LA 70803, USA, and the Department of Electrical Engineering, Southern University and A&M College, Baton Rouge, LA 70813, USA; Email: chenxinjia@gmail.com

2 Optimization of Explicit Binomial Confidence Interval

As will be shown in Section 3, the following result holds.

Theorem 1 For any fixed $n$ and $p \in (0, 1)$, the coverage probability of the confidence interval $[L_{n,\delta}, U_{n,\delta}]$ decreases as $\delta$ increases.
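To make formulas (1) and (2) concrete, the following is a minimal Python sketch that evaluates both limits for an observed count. The function name `chen_interval` and the sample values (k = 30, n = 100, and the three values of δ) are illustrative assumptions, not part of [2].

```python
import math

def chen_interval(k, n, delta):
    """Evaluate the explicit limits (1) and (2) for k successes in n trials.

    Hypothetical helper written for illustration only.
    """
    c = math.log(2.0 / delta)                 # ln(2/delta)
    z = k / n                                 # sample proportion K/n
    root = math.sqrt(1.0 + 9.0 / (2.0 * c) * k * (1.0 - z))
    denom = 1.0 + 9.0 * n / (8.0 * c)
    lower = z + 0.75 * (1.0 - 2.0 * z - root) / denom
    upper = z + 0.75 * (1.0 - 2.0 * z + root) / denom
    return lower, upper

# The interval should shrink as delta grows, in line with Theorem 1.
for delta in (0.01, 0.05, 0.20):
    lo, hi = chen_interval(30, 100, delta)
    print(f"delta={delta:.2f}  L={lo:.4f}  U={hi:.4f}  width={hi - lo:.4f}")
```

The sketch returns the raw values of (1) and (2); for small counts the lower limit can be negative, in which case it may simply be read as 0.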
By Theorem 1, it is possible to find $\delta > \alpha$ such that

\[
\Pr\{L_{n,\delta} < p < U_{n,\delta} \mid p\} > 1 - \alpha, \quad \forall p \in (0, 1)
\]

for $\alpha \in (0, 1)$. To reduce the conservatism of the confidence interval, we consider the following optimization problem: for a given $\alpha \in (0, 1)$, maximize $\delta$ subject to the constraint that

\[
\inf_{p \in (0,1)} \Pr\{L_{n,\delta} \le p \le U_{n,\delta} \mid p\} \ge 1 - \alpha.
\]

A similar problem is to maximize $\delta$ subject to the constraint that

\[
\inf_{p \in (0,1)} \Pr\{L_{n,\delta} < p < U_{n,\delta} \mid p\} \ge 1 - \alpha.
\]

As a result of Theorem 1, the maximum $\delta$ can be obtained from $(\alpha, 1)$ by a bisection search. For this purpose, it is essential to evaluate efficiently $\inf_{p \in (0,1)} \Pr\{L_{n,\delta} \le p \le U_{n,\delta} \mid p\}$ and $\inf_{p \in (0,1)} \Pr\{L_{n,\delta} < p < U_{n,\delta} \mid p\}$. This is accomplished by the following theorem, derived from the theory of random intervals established in [3].

Theorem 2 Let $\delta \in (0, 1)$. Define

\[
T^{-}(p) = np + \frac{1 - 2p - \sqrt{1 + \frac{18 n p (1 - p)}{\ln(2/\delta)}}}{\frac{2}{3n} + \frac{3}{\ln(2/\delta)}}, \qquad
T^{+}(p) = np + \frac{1 - 2p + \sqrt{1 + \frac{18 n p (1 - p)}{\ln(2/\delta)}}}{\frac{2}{3n} + \frac{3}{\ln(2/\delta)}}
\]

for $p \in (0, 1)$ and

\[
L(k) = \frac{k}{n} + \frac{3}{4} \, \frac{1 - \frac{2k}{n} - \sqrt{1 + \frac{9}{2 \ln \frac{2}{\delta}} k \left(1 - \frac{k}{n}\right)}}{1 + \frac{9n}{8 \ln \frac{2}{\delta}}}, \qquad
U(k) = \frac{k}{n} + \frac{3}{4} \, \frac{1 - \frac{2k}{n} + \sqrt{1 + \frac{9}{2 \ln \frac{2}{\delta}} k \left(1 - \frac{k}{n}\right)}}{1 + \frac{9n}{8 \ln \frac{2}{\delta}}}
\]

for $k = 0, 1, \cdots, n$. Define

\[
C_l(k) = \Pr\{\lceil T^{-}(L(k)) \rceil \le K \le k - 1 \mid L(k)\}, \qquad
C'_l(k) = \Pr\{\lfloor T^{-}(L(k)) \rfloor + 1 \le K \le k - 1 \mid L(k)\}
\]

for $k \in \{0, 1, \cdots, n\}$ such that $0 < L(k) < 1$. Define

\[
C_u(k) = \Pr\{k + 1 \le K \le \lfloor T^{+}(U(k)) \rfloor \mid U(k)\}, \qquad
C'_u(k) = \Pr\{k + 1 \le K \le \lceil T^{+}(U(k)) \rceil - 1 \mid U(k)\}
\]

for $k \in \{0, 1, \cdots, n\}$ such that $0 < U(k) < 1$. Then, the following statements hold true:

(I): $\inf_{p \in (0,1)} \Pr\{L_{n,\delta} \le p \le U_{n,\delta} \mid p\}$ equals the minimum of $\{C_l(k) : 0 \le k \le n; \; 0 < L(k) < 1\} \cup \{C_u(k) : 0 \le k \le n; \; 0 < U(k) < 1\}$.

(II): $\inf_{p \in (0,1)} \Pr\{L_{n,\delta} < p < U_{n,\delta} \mid p\}$ equals the minimum of $\{C'_l(k) : 0 \le k \le n; \; 0 < L(k) < 1\} \cup \{C'_u(k) : 0 \le k \le n; \; 0 < U(k) < 1\}$.
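The procedure described above (evaluate the minimum coverage through statement (I) of Theorem 2, then bisect on δ over (α, 1)) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the author's implementation: the names `limits`, `t_bounds`, `binom_range`, `min_coverage` and `optimal_delta` are hypothetical, binomial probabilities are summed directly with `math.comb` (adequate only for moderate n), the bisection runs for a fixed number of iterations, and floating-point rounding could affect the floor and ceiling if a value of T⁻ or T⁺ falls numerically on an integer.

```python
import math

def limits(k, n, delta):
    """L(k) and U(k) of Theorem 2, written with lambda = 9n / (8 ln(2/delta))."""
    lam = 9.0 * n / (8.0 * math.log(2.0 / delta))
    z = k / n
    root = math.sqrt(1.0 + 4.0 * lam * z * (1.0 - z))
    return (z + 0.75 * (1.0 - 2.0 * z - root) / (1.0 + lam),
            z + 0.75 * (1.0 - 2.0 * z + root) / (1.0 + lam))

def t_bounds(p, n, delta):
    """T^-(p) and T^+(p) of Theorem 2."""
    c = math.log(2.0 / delta)
    root = math.sqrt(1.0 + 18.0 * n * p * (1.0 - p) / c)
    denom = 2.0 / (3.0 * n) + 3.0 / c
    return (n * p + (1.0 - 2.0 * p - root) / denom,
            n * p + (1.0 - 2.0 * p + root) / denom)

def binom_range(lo, hi, n, p):
    """Pr{lo <= K <= hi} for K ~ Binomial(n, p); an empty range gives 0."""
    lo, hi = max(lo, 0), min(hi, n)
    return sum(math.comb(n, j) * p**j * (1.0 - p)**(n - j)
               for j in range(lo, hi + 1))

def min_coverage(n, delta):
    """Statement (I): infimum over p of Pr{L <= p <= U | p}."""
    values = []
    for k in range(n + 1):
        low, up = limits(k, n, delta)
        if 0.0 < low < 1.0:                       # C_l(k)
            t_minus, _ = t_bounds(low, n, delta)
            values.append(binom_range(math.ceil(t_minus), k - 1, n, low))
        if 0.0 < up < 1.0:                        # C_u(k)
            _, t_plus = t_bounds(up, n, delta)
            values.append(binom_range(k + 1, math.floor(t_plus), n, up))
    # Assumption: if no limit lies in (0, 1), every interval covers (0, 1).
    return min(values, default=1.0)

def optimal_delta(n, alpha, iters=40):
    """Bisection on (alpha, 1) for the largest delta meeting the constraint."""
    lo, hi = alpha, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if min_coverage(n, mid) >= 1.0 - alpha:
            lo = mid          # constraint holds: try a larger delta
        else:
            hi = mid          # constraint violated: shrink delta
    return lo

if __name__ == "__main__":
    d = optimal_delta(20, 0.05)
    print(d, min_coverage(20, d))
```

The returned value is always one at which the constraint was verified to hold, so the resulting interval keeps the prescribed coverage even if rounding perturbs the search.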
The proof of Theorem 2 is provided in Section 4.

3 Proof of Theorem 1

For simplicity of notation, define

\[
\lambda = \frac{9n}{8 \ln \frac{2}{\delta}} \quad \text{and} \quad z = \frac{k}{n}.
\]

Then, for $K = k$, the upper and lower confidence limits are $U_{n,\delta} = U(z)$ and $L_{n,\delta} = L(z)$ respectively, where

\[
U(z) = z + \frac{3}{4} \, \frac{1 - 2z + \sqrt{1 + 4 \lambda z (1 - z)}}{1 + \lambda}, \qquad
L(z) = z + \frac{3}{4} \, \frac{1 - 2z - \sqrt{1 + 4 \lambda z (1 - z)}}{1 + \lambda}.
\]

Since $(1 - 2z)^2 \le 1 + 4 \lambda z (1 - z)$ for $z \in [0, 1]$ and $\lambda > 0$, we have $L(z) \le z$ and $U(z) \ge z$. Hence, to show Theorem 1, it suffices to show that both $U(z) - z$ and $z - L(z)$ decrease as $\delta$ increases for any $z \in [0, 1]$. Because $\lambda$ increases with $\delta$, this amounts to showing that both quantities decrease as $\lambda$ increases.

We shall first show that $U(z) - z$ decreases as $\delta$ increases for any fixed $z \in [0, 1]$. For this purpose, we define $y = \frac{4 [U(z) - z]}{3}$ and show that $\frac{\partial y}{\partial \lambda} < 0$. From the definition of $y$ we obtain the equation

\[
[(1 + \lambda) y - (1 - 2z)]^2 = 1 + 4 \lambda z (1 - z).
\]

Differentiating both sides of this equation with respect to $\lambda$ yields

\[
2 [(1 + \lambda) y - (1 - 2z)] \left[ (1 + \lambda) \frac{\partial y}{\partial \lambda} + y \right] = 4 z (1 - z),
\]

from which we have

\[
(1 + \lambda) \frac{\partial y}{\partial \lambda} = \frac{2 z (1 - z)}{(1 + \lambda) y - (1 - 2z)} - y.
\]

Clearly, to show $\frac{\partial y}{\partial \lambda} < 0$, it suffices to show that the right-hand side of the above equation is negative for any $z \in (0, 1)$ and $\lambda > 0$. That is, to show

\[
\frac{2 z (1 - z)}{\sqrt{1 + 4 \lambda z (1 - z)}} < y,
\]

or equivalently,

\[
(1 + \lambda) \, 2 z (1 - z) < (1 - 2z) \sqrt{1 + 4 \lambda z (1 - z)} + 1 + 4 \lambda z (1 - z),
\]

which can be written as $2 z (1 - z) < w(\lambda)$, where

\[
w(\lambda) = (1 - 2z) \sqrt{1 + 4 \lambda z (1 - z)} + 1 + 2 \lambda z (1 - z).
\]

Note that

\[
\frac{\partial w(\lambda)}{\partial \lambda} = 2 z (1 - z) \left[ \frac{1 - 2z}{\sqrt{1 + 4 \lambda z (1 - z)}} + 1 \right] > 0
\]

as a result of

\[
\frac{1 - 2z}{\sqrt{1 + 4 \lambda z (1 - z)}} > \frac{-1}{\sqrt{1 + 4 \lambda z (1 - z)}} > -1.
\]

Hence, $w(\lambda) > w(0) = 2 (1 - z)$ for any $\lambda > 0$. Since $2 z (1 - z) \le 2 (1 - z)$, this shows that $2 z (1 - z) < w(\lambda)$ for any $z \in (0, 1)$ and $\lambda > 0$. Consequently, we have established $\frac{\partial y}{\partial \lambda} < 0$ for $z \in (0, 1)$. At the endpoints, $U(0) - 0 = \frac{3}{2 (1 + \lambda)}$ also decreases in $\lambda$, while $U(1) - 1 = 0$ for every $\lambda$. It follows that $U(z) - z$ decreases (and at $z = 1$ stays constant) as $\delta$ increases for any fixed $z \in [0, 1]$.
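As a numerical sanity check of the monotonicity just derived (not a substitute for the proof), the short Python sketch below evaluates the scaled half-width y = 4[U(z) − z]/3 on a grid of λ values and confirms that it decreases. The function name `y`, the grid of λ values and the chosen values of z are illustrative assumptions.

```python
import math

def y(lam, z):
    """Scaled half-width y = 4[U(z) - z]/3 as a function of lambda."""
    return (1.0 - 2.0 * z + math.sqrt(1.0 + 4.0 * lam * z * (1.0 - z))) / (1.0 + lam)

lams = [0.1 * j for j in range(1, 200)]          # lambda from 0.1 to 19.9
for z in (0.05, 0.3, 0.5, 0.8, 0.95):
    vals = [y(lam, z) for lam in lams]
    # Each successive value should be smaller than the previous one.
    decreasing = all(a > b for a, b in zip(vals, vals[1:]))
    print(f"z={z:.2f}  decreasing in lambda: {decreasing}")
```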
Observing that $L(z) = 1 - U(1 - z)$ for any $z \in [0, 1]$, we have

\[
z - L(z) = z - [1 - U(1 - z)] = U(1 - z) - (1 - z).
\]

Therefore, $z - L(z)$ must also decrease as $\delta$ increases for any fixed $z \in [0, 1]$. This completes the proof of Theorem 1.

4 Proof of Theorem 2

For simplicity of notation, we define $\lambda$, $z$, $L(z)$ and $U(z)$ as in the proof of Theorem 1. Note that $L(1) = 1 - \frac{3}{2 (1 + \lambda)} < 1$ and that the derivative of $L(z)$ with respect to $z$ is

\[
L'(z) = 1 + \frac{3}{4 (1 + \lambda)} \left[ -2 - \frac{1}{2} \, \frac{4 \lambda (1 - 2z)}{\sqrt{1 + 4 \lambda z (1 - z)}} \right]
= 1 - \frac{3}{2 (1 + \lambda)} - \frac{3}{2 (1 + \lambda)} \, \frac{\lambda (1 - 2z)}{\sqrt{1 + 4 \lambda z (1 - z)}}
= \frac{1}{2 (1 + \lambda)} \left[ 2 \lambda - 1 - \frac{3 \lambda (1 - 2z)}{\sqrt{1 + 4 \lambda z (1 - z)}} \right],
\]

which is positive if and only if $(2 \lambda - 1) \sqrt{1 + 4 \lambda z (1 - z)} > 3 \lambda (1 - 2z)$. To complete the proof of Theorem 2, we need some preliminary results.

Lemma 1 For any $n \ge 1$ and $\delta \in (0, 1)$,

\[
\frac{(2 \lambda - 1)^2 (1 + \lambda)}{36 \lambda^2 + 4 \lambda (2 \lambda - 1)^2} \ge \frac{1}{4} \tag{3}
\]

if and only if $\lambda \le \frac{1}{5}$.

Proof. Note that (3) is equivalent to $(2 \lambda - 1)^2 (1 + \lambda) \ge 9 \lambda^2 + \lambda (2 \lambda - 1)^2$, which can be simplified to $(5 \lambda - 1)(\lambda + 1) \le 0$. Since $\delta \in (0, 1)$ and $n \ge 1$, we have $\lambda > 0$. Hence, the inequality (3) holds if and only if $\lambda \le \frac{1}{5}$. ✷

Lemma 2 $L(z)$ is monotonically increasing with respect to $z \in [0, 1]$ such that $L(z) > 0$. Similarly, $U(z)$ is monotonically increasing with respect to $z \in [0, 1]$ such that $U(z) < 1$.

Proof. We shall first show that $L(z)$ is monotonically increasing with respect to $z \in [0, 1]$ such that $L(z) > 0$. It suffices to consider four cases:

Case (i): $\lambda \ge \frac{1}{2}$ and $0 < z \le \frac{1}{2}$;
Case (ii): $\lambda \ge \frac{1}{2}$ and $1 > z > \frac{1}{2}$;
Case (iii): $\lambda < \frac{1}{2}$ and $0 < z \le \frac{1}{2}$;
Case (iv): $\lambda < \frac{1}{2}$ and $1 > z > \frac{1}{2}$.

In Case (i), $L(z)$ increases if and only if $(2 \lambda - 1)^2 [1 + 4 \lambda z (1 - z)] > 9 \lambda^2 (1 - 2z)^2$, or equivalently,

\[
\left( z - \frac{1}{2} \right)^2 < \frac{(2 \lambda - 1)^2 (1 + \lambda)}{36 \lambda^2 + 4 \lambda (2 \lambda - 1)^2}.
\]

Define

\[
z^{*} = \frac{1}{2} - \sqrt{\frac{(2 \lambda - 1)^2 (1 + \lambda)}{36 \lambda^2 + 4 \lambda (2 \lambda - 1)^2}}.
\]
By Lemma 1, we have $z^{*} > 0$. It follows that $L(z)$ is monotonically decreasing with respect to $z \in (0, z^{*})$ and monotonically increasing with respect to $z \in \left( z^{*}, \frac{1}{2} \right)$. This implies that $L(z)$ achieves its minimum at $z^{*}$ and that $L(z) < L(0) = 0$ for any $z \in (0, z^{*})$. Therefore, we have shown that $L(z)$ is monotonically increasing with respect to $z \in (0, 1)$ such that $L(z) \ge 0$ and the conditions of Case (i) hold true.

In Case (ii), $L(z)$ increases for $z \in \left( \frac{1}{2}, 1 \right)$, since $(2 \lambda - 1) \sqrt{1 + 4 \lambda z (1 - z)} \ge 0 > 3 \lambda (1 - 2z)$.

In Case (iii), $L(z)$ decreases for $z \in \left( 0, \frac{1}{2} \right]$, since $(2 \lambda - 1) \sqrt{1 + 4 \lambda z (1 - z)} < 0 \le 3 \lambda (1 - 2z)$. It can be seen that $L(z) < L(0) = 0$ for any $z \in \left( 0, \frac{1}{2} \right]$.

In Case (iv), $L(z)$ increases if and only if $(2 \lambda - 1) \sqrt{1 + 4 \lambda z (1 - z)} > 3 \lambda (1 - 2z)$, which can be written as $(1 - 2 \lambda) \sqrt{1 + 4 \lambda z (1 - z)} < 3 \lambda (2z - 1)$ or equivalently,

\[
\left( z - \frac{1}{2} \right)^2 > \frac{(2 \lambda - 1)^2 (1 + \lambda)}{36 \lambda^2 + 4 \lambda (2 \lambda - 1)^2}.
\]

Define

\[
z^{\star} = \frac{1}{2} + \sqrt{\frac{(2 \lambda - 1)^2 (1 + \lambda)}{36 \lambda^2 + 4 \lambda (2 \lambda - 1)^2}}.
\]

If $\frac{1}{2} > \lambda > \frac{1}{5}$, by Lemma 1, we have $z^{\star} < 1$. Hence, $L(z)$ increases for $z \in (z^{\star}, 1)$ and

\[
L(z) < L(1) = \left[ z + \frac{3}{4} \, \frac{1 - 2z - \sqrt{1 + 4 \lambda z (1 - z)}}{1 + \lambda} \right]_{z = 1} = \frac{2 \lambda - 1}{2 (1 + \lambda)} < 0, \quad \forall z \in (z^{\star}, 1).
\]

Moreover, $L(z)$ decreases for $z \in \left( \frac{1}{2}, z^{\star} \right)$ and

\[
L(z) < L\left( \frac{1}{2} \right) = \left[ z + \frac{3}{4} \, \frac{1 - 2z - \sqrt{1 + 4 \lambda z (1 - z)}}{1 + \lambda} \right]_{z = 1/2} = \frac{1}{2} - \frac{3}{4 \sqrt{1 + \lambda}} < 0, \quad \forall z \in \left( \frac{1}{2}, z^{\star} \right).
\]

If $0 < \lambda \le \frac{1}{5}$, by Lemma 1, we have $z^{\star} \ge 1$. Hence, $L(z)$ decreases for $z \in \left( \frac{1}{2}, 1 \right)$ and $L(z) < L\left( \frac{1}{2} \right) < 0$ for all $z \in \left( \frac{1}{2}, 1 \right)$.

Based on the preceding investigation, we can conclude that the lower confidence limit is non-decreasing with respect to $z \in (0, 1)$ such that $L(z) \ge 0$. Recalling that $L(1) < 1$, we have $L(z) < 1$ for any $z \in (0, 1)$. Since $U(z) = 1 - L(1 - z) > 0$ for any $z \in (0, 1)$, the upper confidence limit $U(z)$ is also non-decreasing with respect to $z \in (0, 1)$ such that $U(z) \le 1$. ✷

Now we consider the minimum coverage probability.
By the definitions of $L_{n,\delta}$, $U_{n,\delta}$ and $L(k)$, $U(k)$, we have

\[
\begin{aligned}
\Pr\{L_{n,\delta} \le p < U_{n,\delta} \mid U(k)\} &= \Pr\{k < K \le T^{+}(p) \mid U(k)\}, & 0 < U(k) < 1, \\
\Pr\{L_{n,\delta} < p \le U_{n,\delta} \mid L(k)\} &= \Pr\{T^{-}(p) \le K < k \mid L(k)\}, & 0 < L(k) < 1, \\
\Pr\{L_{n,\delta} < p < U_{n,\delta} \mid U(k)\} &= \Pr\{k < K < T^{+}(p) \mid U(k)\}, & 0 < U(k) < 1, \\
\Pr\{L_{n,\delta} < p < U_{n,\delta} \mid L(k)\} &= \Pr\{T^{-}(p) < K < k \mid L(k)\}, & 0 < L(k) < 1,
\end{aligned}
\]

where in each line $p$ takes the conditioning value $U(k)$ or $L(k)$. Since both $L(z)$ and $U(z)$ are monotone, the proof of Theorem 2 can be completed by making use of the above results and applying the theory of coverage probability of random intervals established by Chen in [3].

References

[1] Brown, L. D., Cai, T. and DasGupta, A., "Interval estimation for a binomial proportion and asymptotic expansions," The Annals of Statistics, vol. 30, pp. 160-201, 2002.

[2] Chen X., Zhou K. and Aravena J., "Explicit formula for constructing binomial confidence interval with guaranteed coverage probability," Communications in Statistics – Theory and Methods, vol. 37, pp. 1173-1180, 2008.

[3] Chen X., "Coverage Probability of Random Intervals," arXiv:0707.2814, July 2007.
