Simultaneous concentration of order statistics
Authors: Daniel Fresen
To my parents John Fresen and Jill Fresen

Abstract. Let $\mu$ be a probability measure on $\mathbb{R}$ with cumulative distribution function $F$, $(x_i)_1^n$ a large i.i.d. sample from $\mu$, and $F_n$ the associated empirical distribution function. The Glivenko-Cantelli theorem states that with probability 1, $F_n$ converges uniformly to $F$. In so doing it describes the macroscopic structure of $\{x_i\}_1^n$; however, it is insensitive to the position of individual points. Indeed, any subset of $o(n)$ points can be perturbed at will without disturbing the convergence. We provide several refinements of the Glivenko-Cantelli theorem which are sensitive not only to the global structure of the sample but also to individual points. Our main result provides conditions that guarantee simultaneous concentration of all order statistics. The example of main interest is the normal distribution.

2000 Mathematics Subject Classification. Primary 62G30; Secondary 60G55.
Key words and phrases. Glivenko-Cantelli theorem, order statistics, log-concave, Lipschitz.

1. Introduction

Let $\mu$ be a probability measure on $\mathbb{R}$ with cumulative distribution function $F$ and let $(x_i)_1^\infty$ denote an i.i.d. sequence of random variables with distribution $\mu$. For each $n \in \mathbb{N}$ let $F_n$ denote the empirical cumulative distribution function
$$F_n(t) = \frac{1}{n}\,|\{ i \in \mathbb{N} : i \le n,\ x_i \le t \}|$$
where $|A|$ denotes the cardinality of a set $A$. The Glivenko-Cantelli theorem (see e.g. [8]) states that with probability 1,
$$\lim_{n \to \infty} \sup_{t \in \mathbb{R}} |F(t) - F_n(t)| = 0.$$
The Dvoretzky-Kiefer-Wolfowitz inequality ([9] and [17]) provides a quantitative formulation of this and states that for all $n \in \mathbb{N}$ and all $\lambda > 0$, with probability at least $1 - 2\exp(-2\lambda^2)$,
$$\sup_{t \in \mathbb{R}} \sqrt{n}\,|F(t) - F_n(t)| \le \lambda.$$
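As a quick numerical sanity check of the Dvoretzky-Kiefer-Wolfowitz bound (an illustration only, not part of the paper's argument; the sample size, number of trials and threshold below are arbitrary choices), the following sketch estimates $\sup_t \sqrt{n}\,|F(t) - F_n(t)|$ for uniform samples, where $F(t) = t$, and checks that the empirical exceedance frequency respects $2\exp(-2\lambda^2)$.

```python
import math
import random

def ks_statistic(sample):
    """Compute sup_t sqrt(n) |F(t) - F_n(t)| for a Uniform[0,1] sample,
    where F(t) = t. For sorted points x_(1) <= ... <= x_(n) the supremum
    of |t - F_n(t)| is attained at (or just before) an order statistic."""
    xs = sorted(sample)
    n = len(xs)
    d = max(max(i / n - x, x - (i - 1) / n) for i, x in enumerate(xs, start=1))
    return math.sqrt(n) * d

random.seed(0)
n, trials, lam = 500, 200, 1.5
exceed = sum(
    ks_statistic([random.random() for _ in range(n)]) > lam for _ in range(trials)
) / trials
dkw_bound = 2 * math.exp(-2 * lam ** 2)  # DKW: P(sup > lambda) <= 2 exp(-2 lambda^2)
assert exceed <= dkw_bound + 0.05  # empirical frequency is consistent with the bound
```

The small additive slack in the final assertion only accounts for Monte Carlo error from the finite number of trials.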
I am grateful to John Fresen and Jill Fresen for my education and for many interesting mathematical discussions throughout the years. Many thanks to Joel Zinn as well as my advisors Alexander Koldobsky and Mark Rudelson for their comments and suggestions.

This titanic theorem would be well deserving of the name 'the fundamental theorem of statistics', as it is the theoretical foundation behind the idea that a large independent sample is representative of the population. There is, however, a certain crudeness in this noble theorem. Asymptotically, individual points play a negligible role and we learn very little about the finer structure of the sample $\{x_i\}_1^n$. For instance, it gives us almost no information about either the maximum or the minimum. We could take any subset of $o(n)$ points and perturb them as we please without affecting the convergence. Donsker's theorem (see e.g. [7], [14] and [16]) gives more insight into the structure of the sample. Consider the stochastic process $X_n$ defined on $\mathbb{R}$ by
$$X_n(t) = \sqrt{n}\,(F_n(t) - F(t)).$$
Provided that $F$ is strictly increasing and continuous, $X_n$ converges to a re-scaled Brownian bridge (more precisely, $X_n \circ F^{-1}$ converges to a Brownian bridge on $[0,1]$). However, Donsker's theorem is plagued by a similar insensitivity to the cries of the minority. Through the eyes of Donsker's theorem we can 'see' subsets as small as $\sqrt{n}$, but are blind to anything smaller, such as subsets of size $\log(n)$. In this paper we provide refined forms of the Glivenko-Cantelli theorem which, under certain conditions, guarantee tight control over all or most points in the sample, not only individually but simultaneously.
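Before the formal statements, a small simulation may make the phenomenon concrete. The sketch below (sample size and thresholds are illustrative choices, not taken from the paper) compares every order statistic of a standard normal sample, the example of main interest, with the deterministic quantiles $F^{-1}(i/(n+1))$: even the extremes stay within a modest distance, and the interior order statistics are far tighter.

```python
import random
from statistics import NormalDist

random.seed(1)
n = 10_000
F = NormalDist()  # standard normal; F.inv_cdf is the quantile function F^{-1}

x = sorted(random.gauss(0.0, 1.0) for _ in range(n))        # order statistics x_(i)
x_star = [F.inv_cdf(i / (n + 1)) for i in range(1, n + 1)]  # x*_(i) = F^{-1}(i/(n+1))

dev = [abs(a - b) for a, b in zip(x, x_star)]
max_dev = max(dev)                           # worst deviation over ALL n points
interior = max(dev[n // 100 : -(n // 100)])  # excluding the extreme 1% on each side

assert max_dev < 3.0   # even the minimum and maximum are pinned down (loose, safe bound)
assert interior < 0.5  # interior order statistics concentrate much more tightly
```

The gap between the two assertions reflects the theme of the paper: tails of the sample fluctuate more than the bulk, yet under suitable decay conditions all of them concentrate simultaneously.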
Super-exponential decay of the distribution provides simultaneous concentration of all order statistics (see theorem 1), while exponential decay provides simultaneous concentration of most order statistics and slightly weaker control over the rest (see theorems 2 and 3). We provide quantitative bounds for log-concave distributions (see theorem 4). Our results extend the Gnedenko law of large numbers, which guarantees concentration of $\max\{x_i\}_1^n$. They may be compared to the results in [10], where the Gnedenko law of large numbers is extended to the multi-dimensional setting, to the paper [13], which provides estimates of order statistics in terms of Orlicz functions, and to the article [1], which concerns optimal matchings of random points uniformly distributed within the unit square. We refer the reader to [11] and [19] for an extensive treatment of empirical process theory and to [2], [4] and [18] for information on order statistics. Interesting papers on the Glivenko-Cantelli theorem include [5], [20], [21] and [22].

Theorem 1. Let $\mu$ be any probability measure on $\mathbb{R}$ with a continuous strictly increasing cumulative distribution function $F$ such that for all $\varepsilon > 0$
$$\lim_{t \to \infty} \frac{1 - F(t + \varepsilon)}{1 - F(t)} = \lim_{t \to -\infty} \frac{F(t)}{F(t + \varepsilon)} = 0. \tag{1.1}$$
Then there exists a sequence $(\delta_n)_1^\infty$ with $\lim_{n \to \infty} \delta_n = 0$ such that for all $n \in \mathbb{N}$, if $(x_i)_1^n$ is an i.i.d. sample from $\mu$ with corresponding order statistics $(x_{(i)})_1^n$, then with probability at least $1 - \delta_n$,
$$\sup_{1 \le i \le n} |x_{(i)} - x^*_{(i)}| \le \delta_n \tag{1.2}$$
where $x^*_{(i)} = F^{-1}(i/(n+1))$.

Theorem 2.
Let $\mu$ be any probability measure on $\mathbb{R}$ with a continuous strictly increasing cumulative distribution function $F$ such that for all $\varepsilon > 0$
$$\limsup_{t \to \infty} \frac{1 - F(t + \varepsilon)}{1 - F(t)} < 1 \tag{1.3}$$
$$\limsup_{t \to -\infty} \frac{F(t)}{F(t + \varepsilon)} < 1. \tag{1.4}$$
Let $(\omega_n)_1^\infty$ be any sequence in $\mathbb{N}$ with $\lim_{n \to \infty} \omega_n = \infty$. Then there exists a sequence $(\delta_n)_1^\infty$ with $\lim_{n \to \infty} \delta_n = 0$, such that for all $n \in \mathbb{N}$, if $(x_i)_1^n$ is an i.i.d. sample from $\mu$ with corresponding order statistics $(x_{(i)})_1^n$, then with probability at least $1 - \delta_n$,
$$\sup_{\omega_n \le i \le n - \omega_n} |x_{(i)} - x^*_{(i)}| \le \delta_n$$
where $x^*_{(i)} = F^{-1}(i/(n+1))$.

Theorem 3. Let $\mu$ be any probability measure on $\mathbb{R}$ that obeys the conditions of theorem 2. Then there exists $k > 0$ such that for all $T > 10^6$ and all $n \in \mathbb{N}$, if $(x_i)_1^n$ is an i.i.d. sample from $\mu$ with corresponding order statistics $(x_{(i)})_1^n$, then with probability at least $1 - 400\,T^{-1/2}$,
$$\sup_{1 \le i \le n} |x_{(i)} - x^*_{(i)}| \le kT.$$

Note that in theorem 2 we can take $(\omega_n)_1^\infty$ to grow arbitrarily slowly, for example let $\omega_n = \log\log\log n$. We thus have tight control over almost the entire data set, with the exception of a very small proportion of points. This is substantially better than the $\sqrt{n}$ 'visibility' of Donsker's theorem.

A probability measure $\mu$ is called $p$-log-concave for some $p \in (0, \infty)$ if it has a density function of the form $f(x) = c\exp(-g(x)^p)$ where $g$ is non-negative and convex. The $1$-log-concave distributions are simply referred to as log-concave. If $\mu$ is $p$-log-concave then it is also $q$-log-concave for all $1 \le q \le p$.

Theorem 4. Let $p > 1$, $q > 0$ and let $\mu$ be a $p$-log-concave probability measure on $\mathbb{R}$ with a continuous strictly increasing cumulative distribution function $F$. Then there exists $c > 0$ such that for any $n \in \mathbb{N}$ and any i.i.d.
sample $(x_i)_1^n$ from $\mu$ with order statistics $(x_{(i)})_1^n$, with probability at least $1 - c(\log n)^{-q}$,
$$\sup_{1 \le i \le n} |x_{(i)} - x^*_{(i)}| \le \frac{c \log\log n}{(\log n)^{1 - 1/p}}$$
where $x^*_{(i)} = F^{-1}(i/(n+1))$.

The main idea behind the proof of these theorems is to first analyze the uniform distribution on $[0,1]$. We do this using a powerful representation of the empirical point process via independent random variables that allows us to use classical results such as the law of large numbers (in the form of Chebyshev's inequality) and the law of the iterated logarithm. A key step in this analysis is to exploit the inherent regularity of order statistics, which allows for control over all points based on an inspection of merely $\log n$ carefully chosen points. We then transform the points under the action of $F^{-1}$ to analyze the general case. We introduce a new class of metrics on $(0,1)$ defined by
$$\theta_p(x, y) = \max\left\{ \frac{\log(x^{-1} y)}{(\log x^{-1})^{1 - 1/p}},\ \frac{\log((1 - y)^{-1}(1 - x))}{(\log(1 - y)^{-1})^{1 - 1/p}} \right\} \tag{1.5}$$
for $1 \le p < \infty$ and $0 < x \le y < 1$. To see that each $\theta_p$ is indeed a metric, note that $\theta_p(x, y)$ is decreasing in $x$ and increasing in $y$ throughout the triangular region $\{(x, y) \in (0,1)^2 : x < y\}$. We show that $F^{-1}$ is either Lipschitz or uniformly continuous with respect to these metrics (depending on the assumptions imposed on $\mu$). After this, our main results become straightforward to prove.

There are endless variations on the main theme of this paper. Our intention is simply to highlight a phenomenon and introduce methods by which to study it. Note that our results are purely asymptotic in nature and we can (and do) assume throughout the paper that $n > n_0$ for some $n_0 \in \mathbb{N}$.

2. The uniform distribution

Let $(\gamma_i)_1^n$ denote an i.i.d.
sample from the uniform distribution on $[0,1]$ with corresponding order statistics $(\gamma_{(i)})_1^n$ and let $(z_i)_1^{n+1}$ be an i.i.d. sequence of random variables that follow the standard exponential distribution. For $1 \le i \le n$ define
$$y_i = \left( \sum_{j=1}^{i} z_j \right) \left( \sum_{j=1}^{n+1} z_j \right)^{-1}.$$
It is of great interest to us that $(y_i)_1^n$ and $(\gamma_{(i)})_1^n$ have the same distribution in $\mathbb{R}^n$ (see chapter 5 in [6]). This is nothing but an expression of the fact that the empirical point process locally resembles the Poisson point process. Also of interest is the fact that these random vectors have the same distribution as the partial sums of a random vector uniformly distributed (with respect to Lebesgue measure) in the standard simplex $\Delta^n = \{ w \in \mathbb{R}^{n+1} : w_i \ge 0\ \forall i,\ \sum_i w_i = 1 \}$. The power of this representation is that we have an expression for $(\gamma_{(i)})_1^n$ in terms of independent random variables. Note that
$$y_i = \frac{i}{n+1} \left( \frac{1}{i} \sum_{j=1}^{i} z_j \right) \left( \frac{1}{n+1} \sum_{j=1}^{n+1} z_j \right)^{-1}. \tag{2.1}$$
Both lemma 1 and lemma 3 below can be compared to the results in [23].

Lemma 1. Let $T > 10^6$ and $n \in \mathbb{N}$. With probability at least $1 - 400\,T^{-1/2}$ the following inequalities hold simultaneously for all $1 \le i \le n$,
$$T^{-1} \le \gamma_{(i)} \left( \frac{i}{n+1} \right)^{-1} \le T \tag{2.2}$$
$$T^{-1} \le (1 - \gamma_{(i)}) \left( 1 - \frac{i}{n+1} \right)^{-1} \le T. \tag{2.3}$$

Proof. Let $Q = 2^{-1} T^{1/2}$ and momentarily fix $1 \le i \le n+1$. The random variable $i^{-1} \sum_{j=1}^{i} z_j$ has mean 1 and variance $i^{-1}$. Using Chebyshev's inequality, with probability at least $1 - i^{-1} Q^{-2}$ we have
$$-Q < 1 - \frac{1}{i} \sum_{j=1}^{i} z_j < Q.$$
The random variable $U_i = |\{ j \in \mathbb{N} : j \le i,\ z_j \le 2Q^{-1} \}|$ follows a binomial distribution with $i$ trials and success probability $1 - \exp(-2Q^{-1}) \le 2Q^{-1}$. Using Chebyshev's inequality again, with probability at least $1 - 32\, i^{-1} Q^{-1}$ we have $U_i < i/2$, which implies that $i^{-1} \sum_{j=1}^{i} z_j > Q^{-1}$.
Hence, with probability at least $1 - 33\, i^{-1} Q^{-1}$ we have
$$Q^{-1} < \frac{1}{i} \sum_{j=1}^{i} z_j < Q + 1. \tag{2.4}$$
Let $M = \lfloor \log_2(n) \rfloor$. With probability at least $1 - 33 Q^{-1} \sum_{j=0}^{M} 2^{-j} - 33(n+1)^{-1} Q^{-1} \ge 1 - 100 Q^{-1}$, equation (2.4) holds simultaneously for $i = 1, 2, 2^2, 2^3, \dots, 2^M$ and for $i = n+1$. Hence, by (2.1), with probability at least $1 - 100 Q^{-1}$ we have that for all such $i$,
$$\frac{1}{2} Q^{-2} \frac{i}{n+1} \le y_i \le 2 Q^2 \frac{i}{n+1}.$$
Since $(y_i)_1^n$ is an increasing sequence, control over the values $(y_{2^j})_{j=1}^{M}$ leads to control over the entire sequence and, recalling the representation of $(\gamma_{(i)})_1^n$ in terms of $(y_i)_1^n$, the bound (2.2) follows for all $1 \le i \le n$. The bound (2.3) then follows by symmetry.

Lemma 2. Let $t \in (0,1)$ and $n \in \mathbb{N}$. With probability at least $1 - 2\exp(-nt^2/5)$ the following inequality holds simultaneously for all $1 \le i \le n$,
$$\left| \gamma_{(i)} - \frac{i}{n+1} \right| \le t. \tag{2.5}$$

Proof. We can assume without loss of generality that $n^{-1} \le 2t/3$ (otherwise the probability bound becomes trivial). Note that since our sample is taken from the uniform distribution we have
$$\sup_{1 \le i \le n} |\gamma_{(i)} - i(n+1)^{-1}| \le n^{-1} + \sup_{1 \le i \le n} |\gamma_{(i)} - i n^{-1}| = n^{-1} + \sup_{0 \le t \le 1} |F_n(t) - F(t)|$$
where $F(t) = t$ is the cumulative distribution function and $F_n$ is the empirical distribution function. By the Dvoretzky-Kiefer-Wolfowitz inequality (as mentioned in the introduction), with probability at least $1 - 2\exp(-5^{-1} n t^2)$ we have $\sup_{0 \le t \le 1} |F_n(t) - F(t)| \le t/3$ and the result follows.

Note that in the preceding proof one can also use Doob's martingale inequality (in the form of Kolmogorov's inequality) and the representation of $(\gamma_{(i)})_1^n$ in terms of $(y_i)_1^n$, although this approach yields an inferior probability bound.

Lemma 3. Let $(\omega_n)_1^\infty$ be any sequence in $\mathbb{N}$ such that $\lim_{n \to \infty} \omega_n = \infty$.
Then for all $T > 1$ and all $\delta \in (0,1)$ there exists $n_0 \in \mathbb{N}$ such that for all $n > n_0$, if $(\gamma_{(i)})_1^n$ are the order statistics from an i.i.d. sample from the uniform distribution on $[0,1]$, then with probability at least $1 - \delta$, (2.2) and (2.3) hold for all $\omega_n \le i \le n - \omega_n$.

Proof. We use the representation (2.1). Let $T > 1$ and $\delta \in (0,1)$ be given. Without loss of generality we may assume that $T \le 2$. Let $(\widetilde{z}_i)_1^\infty$ denote any i.i.d. sequence of random variables that follow the standard exponential distribution. Define the deterministic sequence $(\lambda_j)_1^\infty$ as follows,
$$\lambda_j = \mathbb{P}\left\{ \sup_{i \ge j}\ (2 i \log\log i)^{-1/2} \left| \sum_{k=1}^{i} (\widetilde{z}_k - 1) \right| \le 2 \right\}.$$
Note that $(\lambda_j)_1^\infty$ is an increasing sequence and, by the law of the iterated logarithm, $\lim_{j \to \infty} \lambda_j = 1$. Fix $n_0 \in \mathbb{N}$ with $n_0 \ge 64\,\delta^{-1} (T^{1/2} - 1)^{-2}$ such that for all $n > n_0$ we have the following inequalities,
$$\lambda_{\omega_n} \ge 1 - \delta/4$$
$$\left( \frac{8 \log\log \omega_n}{\omega_n} \right)^{1/2} \le T^{1/2} - 1.$$
Now consider any $n > n_0$ and let $(\gamma_{(i)})_1^n$ denote the order statistics mentioned in the statement of the lemma. With probability at least $1 - \delta/4$, for all $\omega_n \le i \le n$,
$$\left| 1 - \frac{1}{i} \sum_{j=1}^{i} z_j \right| \le \left( \frac{8 \log\log \omega_n}{\omega_n} \right)^{1/2} \le T^{1/2} - 1.$$
By Chebyshev's inequality and the fact that the function $u \mapsto u^{-1}$ is 4-Lipschitz on $[1/2, \infty)$, with probability at least $1 - 16 n^{-1} (T^{1/2} - 1)^{-2} \ge 1 - \delta/4$,
$$\left| 1 - \left( \frac{1}{n+1} \sum_{j=1}^{n+1} z_j \right)^{-1} \right| < T^{1/2} - 1.$$
By (2.1), with probability at least $1 - \delta/2$, (2.2) holds for all $\omega_n \le i \le n$. By symmetry, with the same probability (2.3) holds for all $1 \le i \le n - \omega_n$. The lemma is thus proven.

3. The general case

Lemma 4. Let $F$ be a continuous strictly increasing cumulative distribution function that satisfies (1.1).
Then $F^{-1}$ is continuous and for all $T > 1$ and all $\delta > 0$ there exists $\eta \in (0,1)$ such that for all $x, y \in (0, \eta)$ with $T^{-1} \le x y^{-1} \le T$ and all $x, y \in (1 - \eta, 1)$ with $T^{-1} \le (1 - x)(1 - y)^{-1} \le T$ we have $|F^{-1}(x) - F^{-1}(y)| \le \delta$.

Proof. Consider any $T > 1$ and $\delta > 0$. By (1.1) there exists $t_0 \in \mathbb{R}$ such that for all $t \le t_0$, $T F(t) < F(t + \delta)$. Let $\eta_1 = F(t_0)$. Consider any $x, y \in (0, \eta_1)$ such that $T^{-1} \le x y^{-1} \le T$. Without loss of generality, $x < y$. Let $s = F^{-1}(x)$ and $t = F^{-1}(y)$. Then $s \le t_0$, hence $F(t) = y \le T x = T F(s) < F(s + \delta)$, from which it follows that $t < s + \delta$ and that $|F^{-1}(x) - F^{-1}(y)| \le \delta$. Analysis of the right hand tail is identical and provides us with $\eta_2 > 0$ such that for all $x, y \in (1 - \eta_2, 1)$ with $T^{-1} \le (1 - x)(1 - y)^{-1} \le T$ we have $|F^{-1}(x) - F^{-1}(y)| \le \delta$. The result follows with $\eta = \min\{\eta_1, \eta_2\}$.

Lemma 5. Let $F$ be a continuous strictly increasing cumulative distribution function that satisfies both (1.3) and (1.4). Then $F^{-1}$ is continuous and for all $\delta > 0$ there exists $T > 1$ such that for all $x, y \in (0,1)$ such that $T^{-1} \le x y^{-1} \le T$ and $T^{-1} \le (1 - x)(1 - y)^{-1} \le T$ we have $|F^{-1}(x) - F^{-1}(y)| \le \delta$. In particular, $F^{-1}$ is uniformly continuous with respect to the metric $\theta_1$ (see (1.5)).

Proof. Consider any $\delta > 0$. By (1.4) there exist $T_1 > 1$ and $t_0 \in \mathbb{R}$ such that for all $t < t_0$, $T_1 F(t) \le F(t + \delta)$. Let $\eta_1 = \min\{F(t_0), 2^{-1}\}$. As in the proof of the previous lemma, it follows that for all $x, y \in (0, \eta_1)$ with $T_1^{-1} \le x y^{-1} \le T_1$ we have $|F^{-1}(x) - F^{-1}(y)| \le \delta$. Similarly (using (1.3)), there exist $T_2 > 1$ and $\eta_2 \in (2^{-1}, 1)$ such that for all $x, y \in (\eta_2, 1)$ with $T_2^{-1} \le (1 - x)(1 - y)^{-1} \le T_2$ we have $|F^{-1}(x) - F^{-1}(y)| \le \delta$.
By continuity of $F^{-1}$ relative to the standard topology on $(0,1)$, and by compactness of $[2^{-1}\eta_1,\ 1 - 2^{-1}(1 - \eta_2)]$, there exists $0 < \delta' < 10^{-1}\min\{\eta_1, 1 - \eta_2\}$ such that for all $x, y \in [2^{-1}\eta_1,\ 1 - 2^{-1}(1 - \eta_2)]$ with $|x - y| < \delta'$ we have $|F^{-1}(x) - F^{-1}(y)| \le \delta$. We leave it to the reader to verify that the result holds with $T = \min\{T_1, T_2, 1 + \delta'\}$.

Proof of theorem 1. We shall construct a function $h$ that takes an arbitrary $\delta \in (0,1)$ and produces an appropriate $n_0 = h(\delta) \in \mathbb{N}$. Then, using this function, we shall define the desired sequence $(\delta_n)_1^\infty$ that is mentioned in the statement of the theorem. To this end, let $\delta \in (0,1)$ be given. Define
$$T = 10^6 \delta^{-2}. \tag{3.1}$$
By lemma 4 there exists $\eta \in (0,1)$ such that if $x, y \in (0, \eta)$ and $T^{-1} \le x y^{-1} \le T$, or $x, y \in (1 - \eta, 1)$ and $T^{-1} \le (1 - x)(1 - y)^{-1} \le T$, then $|F^{-1}(x) - F^{-1}(y)| \le \delta$. By compactness, $F^{-1}$ is uniformly continuous on $[\eta/2, 1 - \eta/2]$, which implies the existence of $t \in (0, \eta/2)$ such that if $x, y \in [\eta/2, 1 - \eta/2]$ and $|x - y| \le t$, then $|F^{-1}(x) - F^{-1}(y)| \le \delta$. Define
$$n_0 = \lceil 5 t^{-2} \log(4 \delta^{-1}) \rceil \tag{3.2}$$
and consider any $n \ge n_0$. Let $(\gamma_{(i)})_1^n$ denote the order statistics corresponding to an i.i.d. sample from the uniform distribution on $[0,1]$. Note that we have the representation
$$x_{(i)} = F^{-1}(\gamma_{(i)}) \tag{3.3}$$
valid for all $1 \le i \le n$. By lemmas 1 and 2, as well as equations (3.1) and (3.2), with probability at least $1 - \delta$ inequalities (2.2), (2.3) and (2.5) hold simultaneously for all $1 \le i \le n$. Suppose that these inequalities do indeed hold and consider any fixed $1 \le i \le n$. Since $t \le \eta/2$, one of the three sets $[0, \eta]$, $[\eta/2, 1 - \eta/2]$ and $[1 - \eta, 1]$ contains both $\gamma_{(i)}$ and $i(n+1)^{-1}$, which implies that $|F^{-1}(\gamma_{(i)}) - F^{-1}(i(n+1)^{-1})| \le \delta$, which is inequality (1.2).
Define the non-decreasing sequence $(\kappa_n)_1^\infty$ by $\kappa_n = \max\{ h(e^{-i}) : 1 \le i \le n \}$ and set $\delta_n = \exp(-\max\{ i \in \mathbb{N} : \kappa_i \le n \})$ where we define $\max \emptyset = 0$. It is clear that $\lim_{n \to \infty} \delta_n = 0$. Consider any fixed $n \in \mathbb{N}$. If $\{ i \in \mathbb{N} : \kappa_i \le n \} = \emptyset$ then the probability bound is trivial, otherwise let $j = \max\{ i \in \mathbb{N} : \kappa_i \le n \}$. The result follows by the inequality $h(\delta_n) = h(e^{-j}) \le \kappa_j \le n$ and by definition of the function $h$.

Proof of theorems 2 and 3. The proof is very similar to that of theorem 1. We use the representation (3.3). The main difference is that we use lemmas 3 and 5 instead of lemmas 1 and 4. The details are left to the reader.

4. Log-concave distributions

The following two lemmas are modifications of lemmas 6 and 9 in [10].

Lemma 6. Let $\mu$ be a log-concave probability measure on $\mathbb{R}$ with a continuous strictly increasing cumulative distribution function $F$. Then there exists $c > 0$ such that for all $0 < x < y < 1$,
$$|F^{-1}(y) - F^{-1}(x)| \le c \max\left\{ |F^{-1}(y)|\, \frac{\log(x^{-1} y)}{\log y^{-1}},\ |F^{-1}(x)|\, \frac{\log((1 - x)/(1 - y))}{\log(1 - x)^{-1}} \right\}. \tag{4.1}$$

Proof. By theorem 5.1 in [15] (see lemma 5 in [10] for a proof), $F$ is log-concave. Hence the function $u(t) = -\log F(t)$ is convex (and strictly decreasing). Let $E\mu$ denote the centroid of $\mu$ (the expected value of a random variable with distribution $\mu$). By lemma 5.12 in [15] (see also lemma 3.3 in [3]), $F(E\mu) \ge e^{-1}$, hence $u(E\mu) \le 1$. By convexity of $u$ we have the inequality $(t - s)^{-1}(u(t) - u(s)) \le (E\mu - t)^{-1}(u(E\mu) - u(t))$, which is valid for all $s < t < E\mu$. Let $0 < x < y < \min\{ e^{-2}, F(0), F(-2|E\mu|) \}$ and define $s = F^{-1}(x)$ and $t = F^{-1}(y)$. Then we have
$$F^{-1}(y) - F^{-1}(x) \le (E\mu - F^{-1}(y))\, \frac{\log(x^{-1} y)}{\log y^{-1} - u(E\mu)}.$$
It follows from the restrictions on $y$ that $F^{-1}(y) < 0$ and that $|F^{-1}(y)| \ge 2|E\mu|$.
Since $y < F(E\mu)^2$, it follows that $\log y^{-1} > 2 u(E\mu)$ and (4.1) follows for such $x$ and $y$ with $c = 4$. For other values of $x$ and $y$, inequality (4.1) follows by compactness, continuity and symmetry.

Lemma 7. Let $p \ge 1$ and let $\mu$ be a $p$-log-concave probability measure on $\mathbb{R}$ with cumulative distribution function $F$. Then there exists $c > 0$ such that for all $x \in (0,1)$,
$$|F^{-1}(x)| \le c \max\{ (\log x^{-1})^{1/p},\ (\log(1 - x)^{-1})^{1/p} \}. \tag{4.2}$$
As a consequence of (4.2) and (4.1), $F^{-1}$ is Lipschitz with respect to the metric $\theta_p$ (see (1.5)).

Proof. By lemma 9 in [10] (which holds for $p \ge 1$) there exist $c_1, c_2 > 0$ and $t_0 > 1$ such that for all $t < -t_0$, $F(t) \le c_1 |t|^{1-p} \exp(-c_2 |t|^p)$. Let $\eta_1 = \min\{ F(-t_0), c_1^{-1} \}$ and consider any $x \in (0, \eta_1)$. Let $t = F^{-1}(x)$. Hence $x = F(t) \le c_1 |t|^{1-p} \exp(-c_2 |t|^p)$, which implies that
$$|F^{-1}(x)| = -t \le \left( c_2^{-1} (\log c_1 + \log x^{-1}) \right)^{1/p} \le 2^{1/p} c_2^{-1/p} (\log x^{-1})^{1/p}.$$
The result now follows by symmetry, compactness and continuity.

Lemma 8. Let $F$ be a continuous strictly increasing cumulative distribution function associated to a log-concave probability measure. Then there exists $c > 0$ such that for all $\varepsilon \in (0, 1/2)$ and all $x, y \in [\varepsilon, 1 - \varepsilon]$,
$$|F^{-1}(x) - F^{-1}(y)| \le c \varepsilon^{-1} |x - y|.$$

Proof. This follows from lemmas 6 and 7 with $p = 1$ and the inequality $\log t \le t - 1$.

Proof of theorem 4. By lemmas 1, 6 and 7, with probability at least $1 - 400(\log n)^{-q}$, for all $i \le n^{3/4}$ and all $i \ge n - n^{3/4}$ we have
$$|x_{(i)} - x^*_{(i)}| \le \frac{c \log\log n}{(\log n)^{1 - 1/p}}.$$
Let $I = [2^{-1} n^{-1/4},\ 1 - 2^{-1} n^{-1/4}]$.
By lemma 8, for all $x, y \in I$ we have
$$|F^{-1}(x) - F^{-1}(y)| \le c n^{1/4} |x - y|.$$
By lemma 2, with probability at least $1 - 2\exp(-5^{-1} n^{1/4})$, for all $1 \le i \le n$ we have
$$|\gamma_{(i)} - i(n+1)^{-1}| \le n^{-3/8}.$$
Hence for all $n^{3/4} \le i \le n - n^{3/4}$ both $\gamma_{(i)}$ and $i(n+1)^{-1}$ are elements of $I$ and the result follows.

References

[1] Ajtai, M., Komlós, J., Tusnády, G.: On optimal matchings. Combinatorica 4, 259-264 (1984)
[2] Balakrishnan, N., Clifford Cohen, A.: Order Statistics and Inference. Statistical Modeling and Decision Science. Academic Press (1991)
[3] Bobkov, S.: On concentration of distributions of random weighted sums. Ann. Probab. 31 (1), 195-215 (2003)
[4] David, H. A.: Order Statistics. Wiley Series in Probability and Mathematical Statistics. John Wiley & Sons (1970)
[5] Dehardt, J.: Generalizations of the Glivenko-Cantelli theorem. Ann. Math. Statist. 42 (6), 2050-2055 (1971)
[6] Devroye, L.: Non-Uniform Random Variate Generation. Originally published with Springer-Verlag, New York (1986)
[7] Donsker, M. D.: Justification and extension of Doob's heuristic approach to the Kolmogorov-Smirnov theorems. Ann. Math. Statist. 23, 277-281 (1952)
[8] Dudley, R. M.: Real Analysis and Probability. Wadsworth & Brooks/Cole (1989)
[9] Dvoretzky, A., Kiefer, J., Wolfowitz, J.: Asymptotic minimax character of the sample distribution function and of the classical multinomial estimator. Ann. Math. Statist. 27 (3), 642-669 (1956)
[10] Fresen, D.: A multivariate Gnedenko law of large numbers.
[11] Gaenssler, P., Stute, W.: Empirical processes: a survey of results for independent and identically distributed random variables. Ann. Probab. 7 (2), 193-243 (1979)
[12] Gnedenko, B.: Sur la distribution limite du terme maximum d'une série aléatoire. Ann. Math.
44, 423-453 (1943)
[13] Gordon, Y., Litvak, A., Schütt, C., Werner, E.: Uniform estimates for order statistics and Orlicz functions. arXiv:0809.2989v1
[14] Komlós, J., Major, P., Tusnády, G.: An approximation of partial sums of independent RV's and the sample DF. I. Z. Wahrscheinlichkeitstheorie verw. Gebiete 32, 111-131 (1975)
[15] Lovász, L., Vempala, S.: The geometry of logconcave functions and sampling algorithms. Random Structures Algorithms 30 (3), 307-358 (2007)
[16] Mason, D. M., van Zwet, W.: A refinement of the KMT inequality for the uniform empirical process. Ann. Probab. 15, 871-884 (1987)
[17] Massart, P.: The tight constant in the Dvoretzky-Kiefer-Wolfowitz inequality. Ann. Probab. 18 (3), 1269-1283 (1990)
[18] Sarhan, A. E., Greenberg, B. G. (eds.): Contributions to Order Statistics. Wiley Publications in Statistics. John Wiley & Sons (1962)
[19] Shorack, G., Wellner, J.: Empirical Processes with Applications to Statistics. Wiley Series in Probability and Mathematical Statistics. John Wiley & Sons (1986)
[20] Talagrand, M.: The Glivenko-Cantelli problem. Ann. Probab. 15 (3), 837-870 (1987)
[21] Talagrand, M.: The Glivenko-Cantelli problem, ten years later. J. Theoret. Probab. 9 (2), 371-384 (1996)
[22] Wellner, J.: A Glivenko-Cantelli theorem and strong laws of large numbers for functions of order statistics. Ann. Statist. 5 (3), 473-480 (1977)
[23] Wellner, J.: Limit theorems for the ratio of the empirical distribution function to the true distribution function. Probab. Theory Relat. Fields 45 (1), 73-88 (1978)

Department of Mathematics, University of Missouri
E-mail address: djfb6b@mail.missouri.edu