Tight Bounds for Blind Search on the Integers

We analyze a simple random process in which a token is moved in the interval $A = \{0, \dots, n\}$: Fix a probability distribution $\mu$ over $\{1, \dots, n\}$. Initially, the token is placed in a random position in $A$. In round $t$, a random value $d$ is chosen according to $\mu$. If the token is in position $a \ge d$, then it is moved to position $a - d$. Otherwise it stays put. Let $T$ be the number of rounds until the token reaches position 0. We show tight bounds for the expectation of $T$ for the optimal distribution $\mu$. More precisely, we show that $\min_\mu \{E_\mu(T)\} = \Theta((\log n)^2)$. For the proof, a novel potential function argument is introduced. The research is motivated by the problem of approximating the minimum of a continuous function over $[0, 1]$ with a "blind" optimization strategy.

Authors: Martin Dietzfelbinger, Jonathan E. Rowe, Ingo Wegener, Philipp Woelfel

Symposium on Theoretical Aspects of Computer Science 2008 (Bordeaux), pp. 241-252
www.stacs-conf.org

TIGHT BOUNDS FOR BLIND SEARCH ON THE INTEGERS

MARTIN DIETZFELBINGER¹, JONATHAN E. ROWE², INGO WEGENER³, AND PHILIPP WOELFEL⁴

¹ Fakultät für Informatik und Automatisierung, Technische Univ. Ilmenau, 98684 Ilmenau, Germany
E-mail address: martin.dietzfelbinger@tu-ilmenau.de

² School of Computer Science, University of Birmingham, Birmingham B15 2TT, United Kingdom
E-mail address: J.E.Rowe@cs.bham.ac.uk

³ FB Informatik, LS2, Universität Dortmund, 44221 Dortmund, Germany
E-mail address: ingo.wegener@uni-dortmund.de

⁴ Department of Computer Science, University of Calgary, Calgary, Alberta T2N 1N4, Canada
E-mail address: woelfel@cpsc.ucalgary.ca

Abstract. We analyze a simple random process in which a token is moved in the interval $A = \{0, \dots, n\}$: Fix a probability distribution $\mu$ over $\{1, \dots, n\}$. Initially, the token is placed in a random position in $A$. In round $t$, a random value $d$ is chosen according to $\mu$. If the token is in position $a \ge d$, then it is moved to position $a - d$. Otherwise it stays put. Let $T$ be the number of rounds until the token reaches position 0. We show tight bounds for the expectation of $T$ for the optimal distribution $\mu$. More precisely, we show that $\min_\mu \{E_\mu(T)\} = \Theta((\log n)^2)$. For the proof, a novel potential function argument is introduced. The research is motivated by the problem of approximating the minimum of a continuous function over $[0, 1]$ with a "blind" optimization strategy.

1. Introduction

For a positive integer $n$, assume a probability distribution $\mu$ on $X = \{1, \dots, n\}$ is given. Consider the following random process. A token moves in $A = \{0, \dots, n\}$, as follows:

- Initially, place the token in some position in $A$.
- In round $t$: The token is at position $a \in A$.
  Choose an element $d$ from $X$ at random, according to $\mu$. If $d \le a$, move the token to position $a - d$ (the step is "accepted"), otherwise leave it where it is (the step is "rejected").

Work of the first author was done in part while visiting ETH Zürich, Switzerland. The third author was supported in part by the DFG collaborative research project SFB 531. Work of the last author was done in part while at the University of Toronto, supported by DFG grant WO 1232/1-1 and by SUN Microsystems. Joint work on this topic was initiated during Dagstuhl Seminar 06111 on Complexity of Boolean Functions (2006).

© M. Dietzfelbinger, J.E. Rowe, I. Wegener, and P. Woelfel. Creative Commons Attribution-NoDerivs License.

When the token has reached position 0, no further moves are possible, and we regard the process as finished. At the beginning the token is placed at a position chosen uniformly at random from $\{1, \dots, n\} = A - \{0\}$. (For simplicity of notation, we prefer this initial distribution over the possibly more natural uniform distribution on $\{0, \dots, n\}$. Of course, there is no real difference between the two starting conditions.) Let $T$ be the number of rounds needed until position 0 is reached. A basic performance parameter for the process is $E_\mu(T)$. As $\mu$ varies, the value $E_\mu(T)$ will vary. The probability distribution $\mu$ may be regarded as a strategy. We ask: How should $\mu$ be chosen so that $E_\mu(T)$ is as small as possible?

It is easy to exhibit distributions $\mu$ such that $E_\mu(T) = O((\log n)^2)$. (All asymptotic notation in this paper refers to $n \to \infty$.) In particular, we will see that the "harmonic distribution" given by
$$\mu_{\mathrm{har}}(d) = \frac{1}{d \cdot H_n}, \quad \text{for } 1 \le d \le n, \qquad (1.1)$$
where $H_n = \sum_{1 \le d \le n} \frac{1}{d}$ is the $n$th harmonic number, satisfies $E_{\mu_{\mathrm{har}}}(T) = O((\log n)^2)$.
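To make the process concrete, the token walk and the harmonic distribution (1.1) are easy to simulate. The following is a minimal sketch only: the function names, the random seed, and the choices $n = 256$ and 200 trials are ours, not the paper's.

```python
import random

def harmonic_mu(n):
    """Weights mu_har(d) = 1/(d * H_n) for d = 1..n, cf. (1.1)."""
    H_n = sum(1.0 / d for d in range(1, n + 1))
    return [1.0 / (d * H_n) for d in range(1, n + 1)]

def run_process(n, mu, rng):
    """One run of process R: uniform start in [1, n]; in each round a step
    size d ~ mu is accepted iff d <= current position. Returns the number
    of rounds T until position 0 is reached."""
    a = rng.randint(1, n)
    t = 0
    while a > 0:
        d = rng.choices(range(1, n + 1), weights=mu)[0]
        if d <= a:              # accepted step
            a -= d
        t += 1                  # rejected rounds count as well
    return t

rng = random.Random(0)
n = 256
mu = harmonic_mu(n)
avg_T = sum(run_process(n, mu, rng) for _ in range(200)) / 200.0
```

For $n = 256$ the empirical average stays in the low hundreds, consistent with the $O((\log n)^2)$ bound derived below.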
As the main result of the paper, we will show that this upper bound is optimal up to constant factors: $E_\mu(T) = \Omega((\log n)^2)$, for every distribution $\mu$. For the proof of this lower bound, we introduce a novel potential function technique, which may be useful in other contexts.

1.1. Motivation and Background: Blind Optimization Strategies

Consider the problem of minimizing a function $f : [0, 1] \to \mathbb{R}$, in which the definition of $f$ is unknown: the only information we can gain about $f$ is through trying sample points. This is an instance of a black box optimization problem [1]. One algorithmic approach to such problems is to start with an initial random point, and iteratively attempt to improve it by making random perturbations. That is, if the current point is $x \in [0, 1]$, then we choose some distance $d \in [0, 1]$ according to some probability distribution $\mu$ on $[0, 1]$, and move to $x + d$ or $x - d$ if this is an improvement. The distribution $\mu$ may be regarded as a "search strategy". Such a search is "blind" in the sense that it does not try to estimate how close to the minimum it is and to adapt the distribution $\mu$ accordingly. The problem is how to specify $\mu$. Of course, an optimal distribution $\mu$ depends on details of the function $f$.

The difficulty the search algorithm faces is that for general functions $f$ there is no information about the scale of perturbations which are necessary to get close to the minimum. This leads us to the idea that the distribution might be chosen so that it is scale invariant, meaning that steps of all "orders of magnitude" occur with about the same probability. Such a distribution is described in [4]. One starts by specifying a minimum perturbation size $\varepsilon$.
Then one chooses the probability density function $h(t) = 1/(pt)$ for $\varepsilon \le t \le 1$, and $h(t) = 0$ otherwise, where $p = \ln(1/\varepsilon)$ is the precision of the algorithm. (A random number distributed according to this density function may be generated by taking $d = \exp(-pu)$, where $u$ is uniformly random in $[0, 1]$.)

For general functions $f$, no analysis of this search strategy is known, but in experiments on standard benchmark functions it (or higher-dimensional variants) exhibits a good performance. (For details see [4].) From here on, we focus on the simple case where $f$ is unimodal, meaning that it is strictly decreasing in $[0, x_0]$ and strictly increasing in $[x_0, 1]$, where $x_0$ is the unknown minimum point.

Remark 1.1. If one is given the information that $f$ is unimodal, one will use other, deterministic search strategies, which approximate the optimum up to $\varepsilon$ within $O(\log(1/\varepsilon))$ steps. As early as 1953, in [3], "Fibonacci search" was proposed and analyzed, which for a given tolerance $\varepsilon$ uses the optimal number of steps in a very strong sense.

The "blind search" strategy from [4] can be applied to more general functions $f$, but the following analysis is valid only for unimodal functions. If the distance of the current point $x$ from the optimum $x_0$ is $\tau \ge 2\varepsilon$, then every distance $d$ with $\frac{\tau}{2} \le d \le \tau$ will lead to a new point with distance at most $\tau/2$. Thus, the probability of at least halving the distance to $x_0$ in one step is at least
$$\frac{1}{2} \int_{\tau/2}^{\tau} \frac{dt}{pt} = \frac{\ln 2}{2p},$$
which is independent of the current state $x$. Obviously, then, the expected number of steps before the distance to $x_0$ has been halved is $2p/\ln 2$. We regard the algorithm to be successful if the current point has distance smaller than $2\varepsilon$ from $x_0$.
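The inverse-transform sampler $d = \exp(-pu)$ described above can be checked empirically. In this sketch (the bucket count of 6 and the sample size are our choices), samples must land in $[\varepsilon, 1]$, and each logarithmic "order of magnitude" should receive roughly the same share of samples, which is exactly the scale invariance the text describes.

```python
import math
import random

def sample_scale_invariant(eps, rng):
    """Draw d with density h(t) = 1/(p*t) on [eps, 1], p = ln(1/eps),
    via the inverse transform d = exp(-p*u) with u uniform in [0, 1)."""
    p = math.log(1.0 / eps)
    return math.exp(-p * rng.random())

rng = random.Random(1)
eps = 1e-6
samples = [sample_scale_invariant(eps, rng) for _ in range(10000)]
in_range = all(eps <= d <= 1.0 for d in samples)

# scale invariance: split [eps, 1] into 6 equal "orders of magnitude"
# [eps^((k+1)/6), eps^(k/6)]; each should receive about 1/6 of the samples
buckets = [0] * 6
for d in samples:
    k = min(5, int(6 * math.log(d) / math.log(eps)))
    buckets[k] += 1
```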
To reach this goal, the initial distance has to be halved at most $\log(1/\varepsilon)$ times, leading to a bound of $O(\log(1/\varepsilon)^2)$ for the expected number of steps.

The question then arises whether this is the best that can be achieved. Is there perhaps a choice for $\mu$ that works even better on unimodal functions? To investigate this question, we consider a discrete version of the situation. The domain of $f$ is $A = \{0, \dots, n\}$, and $f$ is strictly increasing, so that $f$ takes its minimum at $x_0 = 0$. In this case, the search process is very simple: the actual values of $f$ are irrelevant; going from $a$ to $a + d$ is never an improvement. Actually, the search process is fully described by the simple random process from Section 1. How long does it take to reach the optimal point 0, for a $\mu$ chosen as cleverly as possible? For $\mu = \mu_{\mathrm{har}}$, we will show an upper bound of $O((\log n)^2)$, with an argument very similar to the one leading to the bound $O(\log(1/\varepsilon)^2)$ in the continuous case. The main result of this paper is that the bound for the discrete case is optimal.

1.2. Formalization as a Markov chain

For the sake of simplicity, we let from now on $[a, b]$ denote the discrete interval $\{a, \dots, b\}$ if $a$ and $b$ are integers. Given a probability distribution $\mu$ on $[1, n]$, the Markov chain $R = (R_0, R_1, \dots)$ is defined over the state space $A = [0, n]$ by the transition probabilities
$$p_{a,a'} = \begin{cases} \mu(a - a') & \text{for } a' < a; \\ 1 - \sum_{1 \le d \le a} \mu(d) & \text{for } a' = a; \\ 0 & \text{for } a' > a. \end{cases}$$
Clearly, 0 is an absorbing state. We define the random variable $T = \min\{t \mid R_t = 0\}$. Let us write $E_\mu(T)$ for the expectation of $T$ if $R_0$ is uniformly distributed in $A - \{0\} = [1, n]$. We study $E_\mu(T)$ in dependence on $\mu$. In particular, we wish to identify distributions $\mu$ that make $E_\mu(T)$ as small as possible (up to constant factors, where $n$ is growing).
Observation 1.2. If $\mu(1) = 0$ then $E_\mu(T) = \infty$.

This is because with probability $\frac{1}{n}$ position 1 is chosen as the starting point, and from state 1, the process will never reach 0 if $\mu(1) = 0$. As a consequence, for the whole paper we assume that all distributions $\mu$ that are considered satisfy
$$\mu(1) > 0. \qquad (1.2)$$

Next we note that it is not hard to derive a "closed expression" for $E_\mu(T)$. Fix $\mu$. For $a \in A$, let
$$F(a) = \mu([1, a]) = \sum_{1 \le d \le a} \mu(d).$$
We note recursion formulas for the expected travel time $T_a = E_\mu(T \mid R_0 = a)$ when starting from position $a \in A$; from these, a closed expression for $E_\mu(T)$ can be obtained (details are omitted due to space constraints).

2. Upper bound

In this section we derive upper bounds on $E_\mu(T)$. For $0 \le i < L$, where $L = \lceil \log(n + 1) \rceil$, let $I_i = [2^i, \min\{2^{i+1} - 1, n\}]$; these intervals partition $[1, n]$. Let $p_i = \mu(I_i)$. Occasionally we refer to terms $p_i$ with $i < 0$ or $i \ge L$. Such terms are always meant to have value 0.

Consider the process $R = (R_0, R_1, \dots)$. Assume $t \ge 1$ and $i \ge 1$. If $R_{t-1} \ge 2^i$, then all numbers $d \in I_{i-1}$ will be accepted as steps and lead to a progress of at least $2^{i-1}$. Hence
$$\Pr(R_t \le R_{t-1} - 2^{i-1} \mid R_{t-1} \ge 2^i) \ge p_{i-1}.$$
Further, if $R_{t-1} \in I_i$, we need to choose step sizes from $I_{i-1}$ at most twice to get below $2^i$. Since the expected waiting time for the random distances to hit $I_{i-1}$ twice is $2/p_{i-1}$, the expected time process $R$ remains in $I_i$ is not larger than $2/p_{i-1}$. Adding up over $1 \le i \le L$, the expected time process $R$ spends in the interval $[2, a]$, where $a \in I_j$ is the starting position, is not larger than
$$\frac{2}{p_{j-1}} + \frac{2}{p_{j-2}} + \cdots + \frac{2}{p_1} + \frac{2}{p_0}.$$
After the process has left $I_1 = [2, 3]$, it has reached position 0 or position 1, and the expected time before we hit 0 is not larger than $1/p_0 = 1/\mu(1)$. Thus, the expected number $T_a$ of steps to get from $a \in I_j$ to 0 satisfies
$$T_a \le \frac{2}{p_{j-1}} + \frac{2}{p_{j-2}} + \cdots + \frac{2}{p_1} + \frac{3}{p_0}.$$
This implies the bound
$$E_\mu(T) \le \frac{2}{p_{L-1}} + \frac{2}{p_{L-2}} + \cdots + \frac{2}{p_1} + \frac{3}{p_0},$$
for arbitrary $\mu$.
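The recursion formulas for the travel times $T_a$, whose details the paper omits, can be made concrete by conditioning on the first round: with probability $1 - F(a)$ the step is rejected and nothing changes, and an accepted step $d < a$ leaves travel time $T_{a-d}$. This gives $T_a = (1 + \sum_{1 \le d < a} \mu(d)\, T_{a-d})/F(a)$ with $T_0 = 0$. The following dynamic program is our own sketch of this (it assumes $\mu(1) > 0$, cf. Observation 1.2):

```python
def expected_T(mu):
    """Exact E_mu(T) for process R by dynamic programming.

    mu[d-1] is the probability of step size d (d = 1..n); requires
    mu[0] > 0 (Observation 1.2).  Conditioning on the first round gives
        T_a = (1 + sum_{1 <= d < a} mu(d) * T_{a-d}) / F(a),   T_0 = 0,
    with F(a) = mu([1, a]); E_mu(T) averages T_a over a = 1..n."""
    n = len(mu)
    F = [0.0] * (n + 1)
    for a in range(1, n + 1):
        F[a] = F[a - 1] + mu[a - 1]
    T = [0.0] * (n + 1)
    for a in range(1, n + 1):
        T[a] = (1.0 + sum(mu[d - 1] * T[a - d] for d in range(1, a))) / F[a]
    return sum(T[1:]) / n

# sanity check: for the uniform step distribution on {1, ..., n} the
# recursion gives T_a = n for every a, hence E_mu(T) = n
E_uniform = expected_T([0.25, 0.25, 0.25, 0.25])
```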
If we arrange that
$$p_0 = \cdots = p_{L-1} = \frac{1}{L}, \qquad (2.1)$$
we will have $T_a \le (2j + 1)L \le (2(\log a) + 1)(\log n)^1 = O((\log a)(\log n)) = O((\log n)^2)$. Clearly, then, $E_\mu(T) = O((\log n)^2)$ as well. The simplest distribution $\mu$ with (2.1) is the one that distributes the weight evenly on the powers of 2 below $2^L$:
$$\mu_{\mathrm{pow2}}(d) = \begin{cases} 1/L, & \text{if } d = 2^i,\ 0 \le i < L, \\ 0, & \text{otherwise.} \end{cases}$$
¹ log means "logarithm to the base 2" throughout.

Thus, $E_{\mu_{\mathrm{pow2}}}(T) = O((\log n)^2)$. The "harmonic distribution" defined by (1.1) satisfies $p_i \approx (\ln(2^{i+1}) - \ln(2^i))/H_n \approx \ln 2/\ln n = 1/\log_2 n$, and we also get $T_a = O((\log a)(\log n))$ and $E_{\mu_{\mathrm{har}}}(T) = O((\log n)^2)$. More generally, all distributions $\mu$ with $p_0, \dots, p_{L-1} \ge \alpha/L$, where $\alpha > 0$ is constant, satisfy $E_\mu(T) = O((\log n)^2)$.

3. Lower bound

We show, as the main result of this paper, that the upper bound of Section 2 is optimal up to a constant factor.

Theorem 3.1. $E_\mu(T) = \Omega((\log n)^2)$ for all distributions $\mu$.

This theorem is proved in the remainder of this section. The distribution $\mu$ is fixed from here on; we suppress $\mu$ in the notation. Recall that we may assume that $\mu(1) > 0$. We continue to use the intervals $I_0, I_1, I_2, \dots, I_L$ that partition $[1, n]$, as well as the probabilities $p_i$, $0 \le i \le L$.

3.1. Intuition

The basic idea for the lower bound is the following. For the majority of the starting positions, the process has to traverse all intervals $I_{L-2}, I_{L-3}, \dots, I_1, I_0$. Consider an interval $I_i$. If the process reaches interval $I_{i+1}$, then afterwards steps of size $2^{i+2}$ and larger are rejected, and so do not help at all for crossing $I_i$. Steps of size from $I_{i+1}, I_i, I_{i-1}, I_{i-2}$ may be of significant help. Smaller step sizes will not help much.
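Condition (2.1) can be checked computationally for $\mu_{\mathrm{pow2}}$, and so can the approximation $p_i \approx \ln 2/\ln n$ for the harmonic distribution; a sketch (the helper names and the choice $L = 10$ are ours):

```python
def mu_pow2(L):
    """mu_pow2 on [1, 2^L - 1]: weight 1/L on each power of two 2^i, i < L.
    mu[d] is the probability of step size d (mu[0] unused)."""
    n = 2 ** L - 1
    mu = [0.0] * (n + 1)
    for i in range(L):
        mu[2 ** i] = 1.0 / L
    return mu

def interval_mass(mu, i):
    """p_i = mu(I_i) for the interval I_i = [2^i, 2^(i+1) - 1]."""
    lo, hi = 2 ** i, min(2 ** (i + 1) - 1, len(mu) - 1)
    return sum(mu[lo:hi + 1])

L = 10
n = 2 ** L - 1
ps = [interval_mass(mu_pow2(L), i) for i in range(L)]    # exactly 1/L each

H_n = sum(1.0 / d for d in range(1, n + 1))
mu_har = [0.0] + [1.0 / (d * H_n) for d in range(1, n + 1)]
ps_har = [interval_mass(mu_har, i) for i in range(L)]    # roughly ln2 / ln n
```

For $\mu_{\mathrm{pow2}}$ every interval mass is exactly $1/L$; for $\mu_{\mathrm{har}}$ with $n = 1023$ all masses fall near $\ln 2 / \ln n \approx 0.09$, as the text asserts.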
So, very roughly, the expected time to traverse interval $I_i$ completely when starting in $I_{i+1}$ will be bounded from below by
$$\frac{1}{p_{i+1} + p_i + p_{i-1} + p_{i-2}},$$
since $1/(p_{i+1} + p_i + p_{i-1} + p_{i-2})$ is the waiting time for the first step with a "significant" size to appear. If it were the case that there is a constant $\beta > 0$ with the property that for each $0 \le i < L - 1$ the probability that interval $I_{i+1}$ is visited is at least $\beta$, then it would not be hard to show that the expected travel time is bounded below by
$$\sum_{1 \le j < L/2} \frac{\beta}{p_{2j+1} + p_{2j} + p_{2j-1} + p_{2j-2}}. \qquad (3.1)$$
(We picked out only the even $i = 2j$ to avoid double counting.) Now the sum of the denominators in the sum in (3.1) is at most 2, and the sum is minimal when all denominators are equal, so the sum is bounded below by $\beta \cdot (L/2) \cdot (L/2)/2 = \beta \cdot L^2/8$; hence the expected travel time would be $\Omega(L^2) = \Omega((\log n)^2)$.

It turns out that it is not straightforward to turn this informal argument into a rigorous proof. First, there are (somewhat strange) distributions $\mu$ for which it is not the case that each interval is visited with constant probability. (For example, let $\mu(d) = B^{d-1} \cdot (B - 1)/(B^n - 1)$, for a large base $B$ like $B = n^3$. Then the "correct" jump directly to 0 has an overwhelming probability to be chosen first.²) Even for reasonable distributions $\mu$, it may happen that some intervals or even blocks of intervals are jumped over with high probability. This means that the analysis of the cost of traversing $I_i$ has to take into account that this traversal might happen in one big jump starting from an interval $I_j$ with $j$ much larger than $i$.

² The authors thank Uri Feige for pointing this out.
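The minimization step used above, "the sum is minimal when all denominators are equal", is the Cauchy-Schwarz (equivalently, AM-HM) inequality: $(\sum_j 1/x_j)(\sum_j x_j) \ge m^2$ for $m$ positive reals, so denominators summing to at most 2 force $\sum_j 1/x_j \ge m^2/2$. A quick numeric sanity check (all names and parameters ours):

```python
import random

def harmonic_sum(xs):
    """Sum of reciprocals, as in the lower bound (3.1)."""
    return sum(1.0 / x for x in xs)

# Cauchy-Schwarz: (sum 1/x_j) * (sum x_j) >= m^2, so if the m positive
# denominators sum to 2, then sum 1/x_j >= m^2 / 2, with equality
# exactly when all x_j are equal (x_j = 2/m).
rng = random.Random(2)
m = 8                                    # think of m as roughly L/2
min_seen = float("inf")
for _ in range(1000):
    raw = [rng.random() + 1e-9 for _ in range(m)]
    total = sum(raw)
    xs = [2.0 * r / total for r in raw]  # positive, summing to 2
    min_seen = min(min_seen, harmonic_sum(xs))
equal_case = harmonic_sum([2.0 / m] * m)
```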
Second, in a formal argument, the contribution of the steps of size smaller than $2^{i-2}$ must be taken into account.

In the remainder of this section, we give a rigorous proof of the lower bound. For this, some machinery has to be developed. The crucial components are a reformulation of process $R$ as another process, which as long as possible defers decisions about what the (randomly chosen) starting position is, and a potential function to measure how much progress the process has made in direction of its goal, namely reaching position 0.

3.2. Reformulation of the process

We change our point of view on the process $R$ (with initial distribution uniform in $[1, n]$). The idea is that we do not have to fix the starting position right at the beginning, but rather make partial decisions on what the starting position is as the process advances. The information we hold on for step $t$ is a random variable $S_t$, with the following interpretation: if $S_t > 0$ then $R_t$ is uniformly distributed in $[1, S_t]$; if $S_t = 0$ then $R_t = 0$.

What properties should the random process $S = (S_0, S_1, \dots)$ on $[0, n]$ have to be a proper model of the Markov chain $R$ from Section 1.2? We first give an intuitive description of process $S$, and later formally define the corresponding Markov chain. Clearly, $S_0 = n$: the starting position is uniformly distributed in $[1, n]$. Given $s = S_{t-1} \in [0, n]$, we choose a step length $d$ from $X$, according to distribution $\mu$. Then there are two cases.

Case 1: $d > s$. — If $s \ge 1$, this step cannot be used for any position in $[1, s]$, thus we reject it and let $S_t = s$. If $s = 0$, no further move is possible at all, and we also reject.

Case 2: $d \le s$. — Then $s \ge 1$, and the token is at some position in $[1, s]$. What happens now depends on the position of the token relative to $d$, for which we only have a probability distribution.
We distinguish three subcases:

(i) The position of the token is larger than $d$. — This happens with probability $(s - d)/s$. In this case we "accept" the step, and now know that the token is in $[1, s - d]$, uniformly distributed; thus, we let $S_t = s - d$.

(ii) The position of the token equals $d$. — This happens with probability $1/s$. In this case we "finish" the process, and let $S_t = 0$.

(iii) The position of the token is smaller than $d$. — This happens with probability $\frac{d-1}{s}$. In this case we "reject" the step, and now know that the token is in $[1, d - 1]$, uniformly distributed; thus, we let $S_t = d - 1$.

Clearly, once state 0 is reached, all further steps are rejected via Case 1. We formalize this idea by defining a new Markov chain $S = (S_0, S_1, \dots)$, as follows. The state space is $A = [0, n]$. For a state $s'$, we collect the total probability that we get from $s$ to $s'$. If $s' > s$, this probability is 0; for $s \ge 1$ we have
$$P_{s,s'} = \begin{cases} (\mu(s' + 1) + \mu(s - s')) \cdot s'/s & \text{if } s > s' \ge 1; \\ F(s)/s & \text{if } s' = 0; \\ 1 - F(s) & \text{if } s' = s, \end{cases}$$
and $P_{0,0} = 1$. Again, several initial distributions are possible for process $S$. The version with initial distribution $\Pr(S_0 = n) = 1$ is meant to describe process $R$. Define the stopping time $T_S = \min\{t \mid S_t = 0\}$.

We note that it is sufficient to analyze process $S$ (with the standard initial distribution).

Lemma 3.2. $E(T) = E(T_S)$.

Proof. For $0 \le s \le n$, consider the version $R^{(s)}$ of process $R$ induced by choosing the uniform distribution on $[1, s]$ (for $s \ge 1$) resp. $\{0\}$ (for $s = 0$) as the initial distribution. We let $A(s) = E(\min\{t \mid R^{(s)}_t = 0\})$. Clearly, $A(n) = E(T)$ and $A(0) = 0$. We derive a recurrence for $(A(0), \dots, A(n))$. Let $s \ge 1$, and assume the starting point $R_0$ is chosen uniformly at random from $[1, s]$.
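The transition probabilities of $S$ can be tabulated and sanity-checked. In the following sketch (the function name and the uniform example $\mu$ are our choices), the case $s' = 0$ carries probability $F(s)/s$, which is the total mass of subcase (ii) over all accepted $d$, and every row of the resulting matrix sums to 1:

```python
def transition_matrix_S(mu):
    """Transition matrix of process S on states 0..n, with mu[d] the
    probability of step size d (mu[0] unused) and F(s) = mu([1, s]):
        P[s][s'] = (mu(s'+1) + mu(s-s')) * s'/s   for 1 <= s' < s,
        P[s][0]  = F(s)/s,   P[s][s] = 1 - F(s),  P[0][0] = 1."""
    n = len(mu) - 1
    F = [0.0] * (n + 1)
    for d in range(1, n + 1):
        F[d] = F[d - 1] + mu[d]
    P = [[0.0] * (n + 1) for _ in range(n + 1)]
    P[0][0] = 1.0
    for s in range(1, n + 1):
        P[s][s] = 1.0 - F[s]
        P[s][0] = F[s] / s       # subcase (ii): token position equals d
        for sp in range(1, s):
            P[s][sp] = (mu[sp + 1] + mu[s - sp]) * sp / s
    return P

n = 6
mu = [0.0] + [1.0 / n] * n               # uniform step distribution
P = transition_matrix_S(mu)
row_sums = [sum(row) for row in P]       # every row must sum to 1
```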
We carry out the first step of $R^{(s)}$, which starts with choosing $d$. The following situations may arise.

(i) $d > s$. — This happens with probability $1 - F(s) < 1$. This distance will be rejected for all starting points in $[1, s]$, so the expected remaining travel time is $A(s)$ again.

(ii) $1 \le d \le s$. — For each $d$, the probability for this to happen is $\mu(d)$. For the starting point $R_0$ there are three possibilities:
- $R_0 \in [1, d - 1]$ (only possible if $d > 1$). — This happens with probability $\frac{d-1}{s}$. The remaining expected travel time is $A(d - 1)$.
- $R_0 = d$. — This happens with probability $\frac{1}{s}$. The remaining travel time is 0.
- $R_0 \in [d + 1, s]$ (only possible if $d < s$). — This happens with probability $\frac{s-d}{s}$. The remaining expected travel time in this case is $A(s - d)$.

We obtain:
$$A(s) = 1 + (1 - F(s)) A(s) + \sum_{1 \le d \le s} \mu(d) \left( \frac{d-1}{s} \cdot A(d-1) + \frac{s-d}{s} \cdot A(s-d) \right).$$
We rename $d - 1$ into $s'$ in the first sum and $s - d$ into $s'$ in the second sum and rearrange to obtain
$$A(s) = \frac{1}{F(s)} \cdot \left( 1 + \sum_{1 \le s' < s} (\mu(s' + 1) + \mu(s - s')) \cdot \frac{s'}{s} \cdot A(s') \right).$$
Now let $B(s)$, for $0 \le s \le n$, denote the expected absorption time of process $S$ when started in state $s$; clearly $B(0) = 0$ and $B(n) = E(T_S)$. For $s \ge 1$, carry out the first step of $S$. The following situations may arise.
- $S_t = s$. — This occurs with probability $1 - F(s)$. The expected remaining travel time is $B(s)$ again.
- $S_t = 0$. — This occurs with probability $F(s)/s$. The remaining travel time is 0.
- $S_t = s'$ for some $s'$ with $s > s' \ge 1$. — This occurs with probability $(\mu(s' + 1) + \mu(s - s')) \cdot s'/s$. The expected remaining travel time is $B(s')$.

Summing up, we obtain
$$B(s) = 1 + (1 - F(s)) B(s) + \sum_{1 \le s' < s} (\mu(s' + 1) + \mu(s - s')) \cdot \frac{s'}{s} \cdot B(s'),$$
which rearranges to the same recurrence as for $A(s)$. Since also $A(0) = B(0) = 0$, we conclude $A(s) = B(s)$ for all $s$, and in particular $E(T) = A(n) = B(n) = E(T_S)$.

3.3. A potential function

For $a \in [1, n]$, let $i(a)$ be the index with $a \in I_{i(a)}$, and define
$$\sigma_a = \sum_{1 \le b \le a} \mu(b) \sqrt{b/a} + \sum_{a < b \le n} \mu(b) \sqrt{a/b}, \qquad \varphi_a = \frac{1}{a \sigma_a},$$
and let $\Phi(s) = \sum_{1 \le a \le s} \varphi_a$, with $\Phi_t = \Phi(S_t)$; in particular $\Phi_0 = \Phi(n)$. Let $c = 2^{-1/2}$. Every $d \in I_j$ with $j \le i(a)$ satisfies $\sqrt{d/a} \le 2^{(j+1-i(a))/2}$, and every $d \in I_j$ with $j > i(a)$ satisfies $\sqrt{a/d} \le 2^{(i(a)+1-j)/2}$. Hence
$$\sigma_a \le \sum_{j \le i(a)} \sum_{d \in I_j} \mu(d) \cdot 2^{(j+1-i(a))/2} + \sum_{j > i(a)} \sum_{d \in I_j} \mu(d) \cdot 2^{(i(a)+1-j)/2} = \sum_{j \le i(a)} p_j \cdot 2^{(j+1-i(a))/2} + \sum_{j > i(a)} p_j \cdot 2^{(i(a)+1-j)/2} = 2c \cdot \left( \sum_{0 \le j \le L} p_j \cdot c^{|j - i(a)|} \right).$$
Hence, for $0 \le i < L - 1$ (so that $I_i$ contains $2^i$ elements, all smaller than $2^{i+1}$),
$$\sum_{a \in I_i} \varphi_a = \sum_{a \in I_i} \frac{1}{a \sigma_a} \ge \frac{2^i}{2c \cdot 2^{i+1} \cdot \left( \sum_{0 \le j \le L} p_j \cdot c^{|j-i|} \right)} = \frac{\psi_i}{4c}, \quad \text{where } \psi_i = \left( \sum_{0 \le j \le L} p_j \cdot c^{|j-i|} \right)^{-1}. \qquad (3.4)$$
Thus, $\Phi_0 \ge \sum_{0 \le i < L-1} \psi_i/(4c)$. Since $\sum_{0 \le i < L-1} 1/\psi_i = \sum_i \sum_j p_j c^{|j-i|} \le \sum_j p_j \cdot \sum_{k \in \mathbb{Z}} c^{|k|} = O(1)$, the Cauchy-Schwarz inequality yields $\sum_{0 \le i < L-1} \psi_i = \Omega(L^2)$. We record:

Lemma 3.4. (a) $\Phi_0 = \Omega((\log n)^2)$. (b) For all $t$: $S_t = 0$ if and only if $\Phi_t = 0$.

(Part (b) holds since $\varphi_a > 0$ for all $a$, so $\Phi(s) = 0$ exactly for $s = 0$.) The heart of the proof is the following "Main Lemma".

Lemma 3.5 (Main Lemma). There is a constant $C$ such that $E(\Phi_{t-1} - \Phi_t \mid S_{t-1} = s) \le C$ for all $t \ge 1$ and all $s \in [1, n]$.

Finally, we will use the following lemma.

Lemma 3.6. Let $X_1, X_2, \dots$ be random variables, let $g > 0$, and let $T = \min\{t \mid X_1 + \cdots + X_t \ge g\}$. If $E(T) < \infty$ and $E(X_t \mid T \ge t) \le C$ for all $t \in \mathbb{N}$, then $E(T) \ge g/C$.
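Lemma 3.2 can also be verified numerically: the recurrence for $A(s)$, obtained by conditioning on the first step of $R$, and the recurrence for $B(s)$ produce identical values. A sketch with a uniform $\mu$ (all function names and parameters are ours):

```python
def travel_times_R(mu):
    """T_a for process R via T_a = (1 + sum_{d<a} mu(d) T_{a-d}) / F(a),
    with mu[d] the probability of step size d (mu[0] unused)."""
    n = len(mu) - 1
    F = [0.0] * (n + 1)
    for d in range(1, n + 1):
        F[d] = F[d - 1] + mu[d]
    T = [0.0] * (n + 1)
    for a in range(1, n + 1):
        T[a] = (1.0 + sum(mu[d] * T[a - d] for d in range(1, a))) / F[a]
    return T

def absorption_times_S(mu):
    """B(s) for process S via the rearranged recurrence from the proof:
    B(s) = (1 + sum_{1<=s'<s} (mu(s'+1) + mu(s-s')) (s'/s) B(s')) / F(s)."""
    n = len(mu) - 1
    F = [0.0] * (n + 1)
    for d in range(1, n + 1):
        F[d] = F[d - 1] + mu[d]
    B = [0.0] * (n + 1)
    for s in range(1, n + 1):
        acc = 1.0 + sum((mu[sp + 1] + mu[s - sp]) * sp / s * B[sp]
                        for sp in range(1, s))
        B[s] = acc / F[s]
    return B

n = 8
mu = [0.0] + [1.0 / n] * n
T = travel_times_R(mu)
E_T = sum(T[1:]) / n                  # E(T): R_0 uniform in [1, n]
E_TS = absorption_times_S(mu)[n]      # E(T_S): S_0 = n
```

For the uniform step distribution both sides evaluate to exactly $n$, matching $E(T) = E(T_S)$.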
Proof of Theorem 3.1. Since $S_t = 0$ if and only if $\Phi_t = 0$ (Lemma 3.4(b)), the stopping time $T_\Phi = \min\{t \mid \Phi_t = 0\}$ of the potential reaching 0 satisfies $T_\Phi = T_S$. Thus, to prove Theorem 3.1, it is sufficient to show that $E(T_\Phi) = \Omega((\log n)^2)$. For this, we let $X_t = \Phi_{t-1} - \Phi_t$, the progress made in step $t$ in terms of the potential. By Lemma 3.5, $E(X_t \mid S_{t-1} = s) \le C$ for all $s \ge 1$, and hence $E(X_t \mid T_\Phi \ge t) = E(X_t \mid \Phi(S_{t-1}) > 0) \le C$. Observe that $X_1 + \cdots + X_t = \Phi_0 - \Phi_t$ and hence $T_\Phi = \min\{t \mid X_1 + \cdots + X_t \ge \Phi_0\}$. Applying Lemma 3.6, and combining with Lemma 3.4, we get that $E(T_\Phi) \ge \Phi_0/C = \Omega((\log n)^2)$, which proves Theorem 3.1. The only missing part to fill in is the proof of Lemma 3.5.

3.4. Proof of the Main Lemma (Lemma 3.5)

Fix $s \in [1, n]$, and assume $S_{t-1} = s$. Our aim is to show that the "expected potential loss" is constant, i.e., that $E(\Phi_{t-1} - \Phi_t \mid S_{t-1} = s) = O(1)$. Clearly, $E(\Phi_{t-1} - \Phi_t \mid S_{t-1} = s) = \sum_{0 \le x \le s} \Delta(s, x)$, where
$$\Delta(s, x) = \left( \Phi(s) - \Phi(x) \right) \cdot \Pr(S_t = x \mid S_{t-1} = s). \qquad (3.8)$$
We show that $\sum_{0 \le x \le s} \Delta(s, x)$ is bounded by a constant, by considering $\Delta(s, s)$, $\Delta(s, 0)$, and $\sum_{1 \le x < s} \Delta(s, x)$ separately. Trivially, $\Delta(s, s) = 0$; and one checks that $\Delta(s, 0) = \Phi(s) \cdot F(s)/s$ is bounded by a constant. For the remaining sum, interchanging the order of summation yields
$$\sum_{1 \le x < s} \Delta(s, x) = \frac{1}{s} \sum_{1 \le a \le s} (\lambda_a + \gamma_a), \qquad (3.11)$$
where
$$\lambda_a = \varphi_a \cdot \sum_{1 \le x < a} \mu(x + 1) \cdot x = \frac{\sum_{2 \le b \le a} \mu(b)(b - 1)}{\sum_{1 \le b \le a} \mu(b)\sqrt{ab} + \sum_{a < b \le n} \mu(b) \, a^{3/2}/\sqrt{b}} \qquad (3.12)$$
and $\gamma_a = \varphi_a \cdot \sum_{1 \le x < a} \mu(s - x) \cdot x$. Recall that $\mu(1) > 0$, so the denominator in (3.12) is not zero. For each $b \le a$ we clearly have $\mu(b)(b - 1) \le \mu(b)\sqrt{ab}$, thus the sum in the numerator in (3.12) is smaller than the sum in the denominator, and we get $\lambda_a < 1$.

Next, we bound $\gamma_a$ for $a \le s$. Writing $b = s - x$, the numerator of $\gamma_a$ is $\sum_{s-a < b < s} \mu(b)(s - b)$, which involves only $b$ with $\mu(b) > 0$. Hence, if $\mu(x) = 0$ for all $s - a < x < s$, then $\gamma_a = 0$. Otherwise, by omitting some of the summands in the denominator we obtain
$$\gamma_a \le \frac{\sum_{s-a < b < s} \mu(b)(s - b)}{\sum_{s-a < b \le a} \mu(b)\sqrt{ab} + \sum_{\max\{a, s-a\} < b < s} \mu(b) \, a^{3/2}/\sqrt{b}},$$
and we compare numerator and denominator summand by summand, for each $b$ with $\mu(b) > 0$. For $s - a < b \le a$, this quotient is
$$\frac{\mu(b)(s - b)}{\mu(b)\sqrt{ab}} \le \frac{a - 1}{\sqrt{a \cdot (s - a + 1)}} < \sqrt{\frac{a}{s - a + 1}} \le \sqrt{\frac{s}{s - a + 1}}.$$
For $\max\{a, s - a\} < b < s$, the quotient of the corresponding summands is
$$\frac{\mu(b)(s - b)}{\mu(b) \, a^{3/2}/\sqrt{b}} \le \frac{\min\{a, s - a\} \cdot \sqrt{b}}{a^{3/2}} \le \frac{a \cdot \sqrt{s}}{a^{3/2}} = \sqrt{\frac{s}{a}}.$$
Hence, $\gamma_a \le \sqrt{s/(s - a + 1)} + \sqrt{s/a}$. Plugging this bound on $\gamma_a$ and the bound $\lambda_a < 1$ into (3.11), and using that
$$\sum_{1 \le a \le s} \frac{1}{\sqrt{a}} = 1 + \sum_{2 \le a \le s} \frac{1}{\sqrt{a}} < 1 + \int_1^s \frac{dx}{\sqrt{x}} = 1 + \left[ 2\sqrt{x} \right]_1^s = 1 + 2\sqrt{s} - 2 < 2\sqrt{s},$$
we obtain
$$\sum_{1 \le x < s} \Delta(s, x) = \frac{1}{s} \sum_{1 \le a \le s} (\lambda_a + \gamma_a) < \frac{1}{s} \left( s + \sqrt{s} \cdot 2\sqrt{s} + \sqrt{s} \cdot 2\sqrt{s} \right) = 5.$$
Together with the bounds on $\Delta(s, s)$ and $\Delta(s, 0)$, this shows that $\sum_{0 \le x \le s} \Delta(s, x) = O(1)$, which completes the proof of Lemma 3.5, and with it the proof of Theorem 3.1.
