The Anderson-Weber strategy is not optimal for symmetric rendezvous search on K4
We consider the symmetric rendezvous search game on a complete graph of n locations. In 1990, Anderson and Weber proposed a strategy in which, over successive blocks of n-1 steps, the players independently choose either to stay at their initial locat…
Authors: Richard Weber
Richard Weber†

8 July 2009

Abstract

We consider the symmetric rendezvous search game on a complete graph of $n$ locations. In 1990, Anderson and Weber proposed a strategy in which, over successive blocks of $n-1$ steps, the players independently choose either to stay at their initial location or to tour the other $n-1$ locations, with probabilities $p$ and $1-p$, respectively. Their strategy has been proved optimal for $n=2$ with $p=1/2$, and for $n=3$ with $p=1/3$. The proof for $n=3$ is very complicated and it has been difficult to guess what might be true for $n>3$. Anderson and Weber suspected that their strategy might not be optimal for $n>3$, but they had no particular reason to believe this and no one has been able to find anything better. This paper describes a strategy that is better than Anderson–Weber for $n=4$. However, it is better by only a tiny fraction of a percent.

1 The Anderson–Weber strategy

In the symmetric rendezvous search game on $K_n$ (the completely connected graph on $n$ vertices) two players are initially placed at two distinct vertices (called locations). The game is played in discrete steps and at each step each player can either stay where he is or move to a different location. The players share no common labelling of the locations. Our aim is to find a (randomizing) strategy such that if both players independently follow this strategy then they minimize the expected number of steps until they first meet. Rendezvous search games of this type were first proposed by Steve Alpern in 1976. They are simple to describe, and have received considerable attention in the popular press as they model problems that are familiar in real life. They are notoriously difficult to analyse.
The Anderson–Weber strategy is a mixed strategy that proceeds in blocks of $n-1$ steps. Players begin at distinct locations, called their home locations. In each successive block a player either stays at his home location, with probability $p$, or makes a randomly chosen tour of his $n-1$ non-home locations, doing this with probability $1-p$. The motivation for the strategy comes from the wait-for-mommy strategy that is optimal in an asymmetric version of the problem. With probability $2p(1-p)$ the players play the wait-for-mommy strategy over the first $n-1$ steps and so rendezvous in expected time $(n+1)/2$.

Anderson and Weber (1990) proved that the above strategy is optimal for the game on $K_2$, with $p=1/2$, and conjectured that it should be optimal for $K_3$, with $p=1/3$. This was finally proved by Weber (2006), who established a strong AW property (SAW) that AW minimizes $E[\min\{T,k\}]$ for all $k$. Anderson and Weber suspected that their strategy might not be optimal for $n>3$, but they had no particular reason to believe this and no one has been able to find any strategy that is better. Indeed, AW has been shown optimal amongst 2–Markov strategies. Fan (2009) showed that AW minimizes $P(T>2)$ and $E[\min\{T,2\}]$. He also found that AW is not optimal on $K_4$ if players have the extra information that the locations can be viewed as being arranged on a circle and the players are given a common notion of clockwise. However, the question as to whether or not AW is optimal has remained open for the case in which there is no such special extra information. Fan writes, ‘The author believes that SAW still holds on $K_4$, and so AW strategy is still optimal’.

† Statistical Laboratory, Centre for Mathematical Sciences, Wilberforce Road, Cambridge CB2 0WB, rrw1@cam.ac.uk
We were inclined to agree, but now find that AW can be bettered. For more background to the problem see Weber (2006).

Let us begin by reprising the AW strategy for the symmetric rendezvous game on $K_4$. We assume that there is no special knowledge (such as a common notion of clockwise on a circle). The AW strategy is a 3–Markov strategy that repeats in blocks of 3 steps. In each successive block of 3 steps, each player, independently, remains at his home location with probability $p$, or does a randomly chosen tour of his 3 non-home locations, with probability $1-p$. This leads to rendezvous in an expected number of steps $ET$, where

$$ET = p^2\,(3+ET) + 2p(1-p)\times 2 + (1-p)^2\left[\tfrac12\,(16/9) + \tfrac12\,(3+ET)\right] = \frac{43-14p+25p^2}{9\,(1+2p-3p^2)}.$$

This is explained as follows.

1. If both stay home they do not meet.
2. If one stays home, while the other tours, then they meet in expected time 2.
3. If both tour, then they meet with probability 1/2, and conditional on meeting they meet in expected time 16/9.

One easily finds that the minimum of $ET$ is achieved by taking

$$p = \tfrac14\left(3\sqrt{681}-77\right) \approx 0.321983$$

and then

$$ET = \tfrac1{12}\left(15+\sqrt{681}\right) \approx 3.42466.$$

2 A strategy better than Anderson–Weber on $K_4$

We now explain how the AW strategy can be bettered. Suppose player I has location 1 as his home, and player II has location 2 as home. We might imagine that each player labels his non-home locations as $a, b, c$, and so a tour of his non-home locations is one of six possible tours: $abc$, $acb$, $bac$, $bca$, $cab$, $cba$. In the case that player I has $(a,b,c)=(2,3,4)$ and player II has $(a,b,c)=(1,3,4)$ we can compute the matrix

$$B = \begin{pmatrix}
2 & X & 3 & X & X & 2\\
X & 2 & X & 2 & 3 & X\\
3 & X & 1 & 1 & X & X\\
X & 2 & 1 & 1 & X & X\\
X & 3 & X & X & 1 & 1\\
2 & X & X & X & 1 & 1
\end{pmatrix}$$

where we have ordered the rows and columns to correspond to $abc$, $acb$, $bac$, $bca$, $cab$, $cba$.
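The matrix $B$ and the constants entering the formula for $ET$ can be reproduced by direct enumeration. The Python sketch below is illustrative only: the names `meeting_step` and `expected_time` are introduced here and do not come from the paper, and the location labels are those of the example above. It records, for each pair of tours, the first step at which the players coincide (`None`, printed as X, when they never do), and then evaluates the closed form for $ET$ at the stated optimal $p$.

```python
from fractions import Fraction
from itertools import permutations
from math import sqrt

# Tours are orderings of a player's three non-home locations.
# permutations() yields them in the order abc, acb, bac, bca, cab, cba.
TOURS_I = list(permutations((2, 3, 4)))    # player I: home 1, (a, b, c) = (2, 3, 4)
TOURS_II = list(permutations((1, 3, 4)))   # player II: home 2, (a, b, c) = (1, 3, 4)


def meeting_step(tour_i, tour_ii):
    """First step (1, 2 or 3) at which both players occupy the same
    location, or None if the two tours never coincide."""
    for step, (u, v) in enumerate(zip(tour_i, tour_ii), start=1):
        if u == v:
            return step
    return None


B = [[meeting_step(t1, t2) for t2 in TOURS_II] for t1 in TOURS_I]

steps = [s for row in B for s in row if s is not None]
p_meet = Fraction(len(steps), 36)              # both tour: chance of meeting
mean_step = Fraction(sum(steps), len(steps))   # expected step, given a meeting


def expected_time(p):
    """Closed form for the AW expected meeting time on K4."""
    return (43 - 14 * p + 25 * p * p) / (9 * (1 + 2 * p - 3 * p * p))


p_star = (3 * sqrt(681) - 77) / 4              # stated minimizer of ET

for row in B:
    print(" ".join("X" if s is None else str(s) for s in row))
print(p_meet, mean_step)                       # 1/2 16/9
print(round(p_star, 6), round(expected_time(p_star), 5))
```

Because each player's tour is uniform over the same six orderings whatever labelling he uses, a single matrix is enough to recover the values 1/2 and 16/9 used in item 3 of the explanation of $ET$.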
A number entry indicates the step at which players meet when they meet, and $X$ indicates that they do not meet. There are 36 such matrices, over which we must average, for each possible pair of assignments by players I and II, of $(2,3,4)$ and $(1,3,4)$, respectively, to $(a,b,c)$.

Let us begin by noting that if a player stays home for three steps and meeting does not occur, then the other player must also have been staying home. Similarly, if a player tours for three steps and meeting does not occur, then the other player must also have been touring (and their tours not meeting). Thus after any $3k$ steps (a multiple of 3) each player knows exactly how many times both have been touring. Whenever a player makes a tour in the AW strategy he chooses his tour at random (independently of previous tours). We show how to improve AW by introducing some dependence between tours.

Let us adopt a notation in which the first tour a player makes is labelled $A$. The second distinct tour a player makes is labelled $B$, and so on. So, for example, $AAB$ means that on his first three tours, a player (i) first makes a random tour, (ii) second makes the same tour as his first tour, (iii) and third makes a tour chosen randomly from amongst the 5 tours he has not yet tried.

Let us consider first a modified problem in which at each so-called ‘t–step’ each player makes a tour of his non-home locations. In this modified problem no player stays home for a t–step. We wish to minimize the expected number of t–steps until the players meet. At the first t–step both players do $A$ and the probability of meeting is 1/2. If a 1–Markov strategy is employed, so successive t–steps are chosen at random, then the expected number of t–steps until meeting occurs is 2.

Over the first two t–steps, the players can do either $AA$ or $AB$.
The matrix for not meeting is

$$P_2 = \begin{pmatrix} \frac12 & \frac15 \\[2pt] \frac15 & \frac{13}{50} \end{pmatrix}.$$

One can check that $P_2 \succ 0$ (i.e., $P_2$ is positive definite). Thus for a 2–Markov strategy we would be solving

$$ET = x^\top (J + P_2\,ET)\,x$$

where $J$ is a $2\times 2$ matrix filled with 1s. This has a minimum value of $ET=2$, when we take $x^\top = (1/6,\,5/6)$. This means that, restricting to 2–Markov strategies, tours should be chosen at random.

Similarly, over the first three t–steps, the players can do $AAA$, $AAB$, $ABA$, $ABB$, $ABC$. The matrix for not meeting is

$$P_3 = \begin{pmatrix}
\frac12 & \frac15 & \frac15 & \frac15 & \frac1{20}\\[2pt]
\frac15 & \frac{13}{50} & \frac2{25} & \frac2{25} & \frac{11}{100}\\[2pt]
\frac15 & \frac2{25} & \frac{13}{50} & \frac2{25} & \frac{11}{100}\\[2pt]
\frac15 & \frac2{25} & \frac2{25} & \frac{13}{50} & \frac{11}{100}\\[2pt]
\frac1{20} & \frac{11}{100} & \frac{11}{100} & \frac{11}{100} & \frac7{50}
\end{pmatrix}.$$

Again, $P_3 \succeq 0$, and $(x_{AAA}, x_{AAB}, x_{ABA}, x_{ABB}, x_{ABC}) = (1/36,\,5/36,\,5/36,\,5/36,\,20/36)$ is optimal in the sense of minimizing the solution of $ET = x^\top (J + P_2 + P_3\,ET)\,x$, where $J$ is now $5\times 5$ and $P_2$ is expanded to the appropriate $5\times 5$ matrix. Thus amongst 3–Markov strategies, tours should also be chosen at random.

However, over four t–steps things turn out differently. There are now 15 possible strategies: $AAAA$, $AAAB$, $AABA$, $AABB$, $AABC$, $ABAA$, $ABAB$, $ABAC$, $ABBA$, $ABBB$, $ABBC$, $ABCA$, $ABCB$, $ABCC$, $ABCD$.
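The pattern lists and the not-meeting matrices can be generated mechanically. The following Python sketch is an illustration under stated assumptions, not the paper's own computation: the helper names (`pattern`, `by_pattern`, `not_meeting_matrix`) are introduced here, and the tours are those of the example with homes 1 and 2. It treats a pattern such as $AAB$ as a uniform choice among the tour sequences that fit it, and counts the fraction of sequence pairs with no meeting.

```python
from fractions import Fraction
from itertools import permutations, product

TOURS_I = list(permutations((2, 3, 4)))    # player I, home location 1
TOURS_II = list(permutations((1, 3, 4)))   # player II, home location 2
# MISS[i][j] is True when tour i of player I never coincides with tour j of II.
MISS = [[all(u != v for u, v in zip(t1, t2)) for t2 in TOURS_II]
        for t1 in TOURS_I]


def pattern(seq):
    """Relabel a sequence of tour indices as A, B, C, ... by first use."""
    seen = []
    for t in seq:
        if t not in seen:
            seen.append(t)
    return "".join("ABCDEF"[seen.index(t)] for t in seq)


def by_pattern(k):
    """Group all 6**k tour sequences of length k by pattern, sorted by name."""
    groups = {}
    for seq in product(range(6), repeat=k):
        groups.setdefault(pattern(seq), []).append(seq)
    return dict(sorted(groups.items()))


def not_meeting_matrix(k):
    """P[s, t]: probability of no meeting in k t-steps when player I follows
    pattern s and player II pattern t, each uniform over matching sequences."""
    groups = by_pattern(k)
    P = {}
    for s, seqs_i in groups.items():
        for t, seqs_ii in groups.items():
            misses = sum(all(MISS[i][j] for i, j in zip(a, b))
                         for a in seqs_i for b in seqs_ii)
            P[s, t] = Fraction(misses, len(seqs_i) * len(seqs_ii))
    return groups, P


groups3, P3 = not_meeting_matrix(3)
# Pattern probabilities when every tour is chosen uniformly at random (AW).
x3 = {s: Fraction(len(seqs), 6 ** 3) for s, seqs in groups3.items()}

print(list(by_pattern(4)))        # the 15 four-tour patterns
print(x3)
print(P3["AAB", "ABC"])           # 11/100
```

Calling `not_meeting_matrix(2)` returns the four entries of $P_2$ in the same way. One property worth noting is that each row of $P_3$ weighted by `x3` sums to 1/8, the chance that three independent random tours all miss.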
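The matrix for not meeting can be computed to be

$$P_4 = \left(\begin{array}{*{15}{c}}
\tfrac12 & \tfrac15 & \tfrac15 & \tfrac15 & \tfrac1{20} & \tfrac15 & \tfrac15 & \tfrac1{20} & \tfrac15 & \tfrac15 & \tfrac1{20} & \tfrac1{20} & \tfrac1{20} & \tfrac1{20} & 0\\
\tfrac15 & \tfrac{13}{50} & \tfrac2{25} & \tfrac2{25} & \tfrac{11}{100} & \tfrac2{25} & \tfrac2{25} & \tfrac{11}{100} & \tfrac2{25} & \tfrac2{25} & \tfrac{11}{100} & \tfrac1{50} & \tfrac1{50} & \tfrac1{50} & \tfrac3{100}\\
\tfrac15 & \tfrac2{25} & \tfrac{13}{50} & \tfrac2{25} & \tfrac{11}{100} & \tfrac2{25} & \tfrac2{25} & \tfrac1{50} & \tfrac2{25} & \tfrac2{25} & \tfrac1{50} & \tfrac{11}{100} & \tfrac{11}{100} & \tfrac1{50} & \tfrac3{100}\\
\tfrac15 & \tfrac2{25} & \tfrac2{25} & \tfrac{13}{50} & \tfrac{11}{100} & \tfrac2{25} & \tfrac2{75} & \tfrac1{30} & \tfrac2{75} & \tfrac2{25} & \tfrac1{30} & \tfrac1{30} & \tfrac1{30} & \tfrac{11}{100} & \tfrac{23}{450}\\
\tfrac1{20} & \tfrac{11}{100} & \tfrac{11}{100} & \tfrac{11}{100} & \tfrac7{50} & \tfrac1{50} & \tfrac1{30} & \tfrac7{150} & \tfrac1{30} & \tfrac1{50} & \tfrac7{150} & \tfrac7{150} & \tfrac7{150} & \tfrac1{20} & \tfrac{14}{225}\\
\tfrac15 & \tfrac2{25} & \tfrac2{25} & \tfrac2{25} & \tfrac1{50} & \tfrac{13}{50} & \tfrac2{25} & \tfrac{11}{100} & \tfrac2{25} & \tfrac2{25} & \tfrac1{50} & \tfrac{11}{100} & \tfrac1{50} & \tfrac{11}{100} & \tfrac3{100}\\
\tfrac15 & \tfrac2{25} & \tfrac2{25} & \tfrac2{75} & \tfrac1{30} & \tfrac2{25} & \tfrac{13}{50} & \tfrac{11}{100} & \tfrac2{75} & \tfrac2{25} & \tfrac1{30} & \tfrac1{30} & \tfrac{11}{100} & \tfrac1{30} & \tfrac{23}{450}\\
\tfrac1{20} & \tfrac{11}{100} & \tfrac1{50} & \tfrac1{30} & \tfrac7{150} & \tfrac{11}{100} & \tfrac{11}{100} & \tfrac7{50} & \tfrac1{30} & \tfrac1{50} & \tfrac7{150} & \tfrac7{150} & \tfrac1{20} & \tfrac7{150} & \tfrac{14}{225}\\
\tfrac15 & \tfrac2{25} & \tfrac2{25} & \tfrac2{75} & \tfrac1{30} & \tfrac2{25} & \tfrac2{75} & \tfrac1{30} & \tfrac{13}{50} & \tfrac2{25} & \tfrac{11}{100} & \tfrac{11}{100} & \tfrac1{30} & \tfrac1{30} & \tfrac{23}{450}\\
\tfrac15 & \tfrac2{25} & \tfrac2{25} & \tfrac2{25} & \tfrac1{50} & \tfrac2{25} & \tfrac2{25} & \tfrac1{50} & \tfrac2{25} & \tfrac{13}{50} & \tfrac{11}{100} & \tfrac1{50} & \tfrac{11}{100} & \tfrac{11}{100} & \tfrac3{100}\\
\tfrac1{20} & \tfrac{11}{100} & \tfrac1{50} & \tfrac1{30} & \tfrac7{150} & \tfrac1{50} & \tfrac1{30} & \tfrac7{150} & \tfrac{11}{100} & \tfrac{11}{100} & \tfrac7{50} & \tfrac1{20} & \tfrac7{150} & \tfrac7{150} & \tfrac{14}{225}\\
\tfrac1{20} & \tfrac1{50} & \tfrac{11}{100} & \tfrac1{30} & \tfrac7{150} & \tfrac{11}{100} & \tfrac1{30} & \tfrac7{150} & \tfrac{11}{100} & \tfrac1{50} & \tfrac1{20} & \tfrac7{50} & \tfrac7{150} & \tfrac7{150} & \tfrac{14}{225}\\
\tfrac1{20} & \tfrac1{50} & \tfrac{11}{100} & \tfrac1{30} & \tfrac7{150} & \tfrac1{50} & \tfrac{11}{100} & \tfrac1{20} & \tfrac1{30} & \tfrac{11}{100} & \tfrac7{150} & \tfrac7{150} & \tfrac7{50} & \tfrac7{150} & \tfrac{14}{225}\\
\tfrac1{20} & \tfrac1{50} & \tfrac1{50} & \tfrac{11}{100} & \tfrac1{20} & \tfrac{11}{100} & \tfrac1{30} & \tfrac7{150} & \tfrac1{30} & \tfrac{11}{100} & \tfrac7{150} & \tfrac7{150} & \tfrac7{150} & \tfrac7{50} & \tfrac{14}{225}\\
0 & \tfrac3{100} & \tfrac3{100} & \tfrac{23}{450} & \tfrac{14}{225} & \tfrac3{100} & \tfrac{23}{450} & \tfrac{14}{225} & \tfrac{23}{450} & \tfrac3{100} & \tfrac{14}{225} & \tfrac{14}{225} & \tfrac{14}{225} & \tfrac{14}{225} & \tfrac7{90}
\end{array}\right)$$

It now turns out that $P_4$ has a negative eigenvalue. The AW strategy would be to choose tours at random, which gives

$$x^\top = \left(p_{AAAA}, p_{AAAB}, p_{AABA}, p_{AABB}, p_{AABC}, p_{ABAA}, p_{ABAB}, p_{ABAC}, p_{ABBA}, p_{ABBB}, p_{ABBC}, p_{ABCA}, p_{ABCB}, p_{ABCC}, p_{ABCD}\right) = \frac{1}{6^3}\,(1, 5, 5, 5, 20, 5, 5, 20, 5, 5, 20, 20, 20, 20, 60).$$

Solving $ET = x^\top (J + P_2 + P_3 + P_4\,ET)\,x$, we find $ET = 2$, as we expect. However consider

$$y^\top = \left(0, \tfrac1{12}, \tfrac1{12}, 0, 0, \tfrac1{12}, 0, 0, 0, \tfrac1{12}, 0, 0, 0, 0, \tfrac23\right).$$

Solving $ET = y^\top (J + P_2 + P_3 + P_4\,ET)\,y$ gives $ET = 2 - \frac{23}{16200} = 1.99858$.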
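The effect of $y$ can be checked by brute force over tour sequences. The Python sketch below is a hedged illustration: the helper names are introduced here, and the expected number of t–steps is computed with the tail-sum identity $ET=\sum_{k\ge 0}P(T>k)$, which includes the probability 1/2 of not meeting at the first t–step. That reading of the displayed equations is an assumption, chosen because it gives $ET=2$ for random tours. Two continuations after four unsuccessful t–steps are evaluated: restarting the four-tour pattern, and reverting to random tours (2 further t–steps on average).

```python
from fractions import Fraction
from itertools import permutations, product

TOURS_I = list(permutations((2, 3, 4)))    # player I, home location 1
TOURS_II = list(permutations((1, 3, 4)))   # player II, home location 2
MISS = [[all(u != v for u, v in zip(t1, t2)) for t2 in TOURS_II]
        for t1 in TOURS_I]

# The mixed strategy y over four-tour patterns (zero entries omitted).
Y = {"AAAB": Fraction(1, 12), "AABA": Fraction(1, 12), "ABAA": Fraction(1, 12),
     "ABBB": Fraction(1, 12), "ABCD": Fraction(2, 3)}


def pattern(seq):
    """Relabel a sequence of tour indices as A, B, C, ... by first use."""
    seen = []
    for t in seq:
        if t not in seen:
            seen.append(t)
    return "".join("ABCDEF"[seen.index(t)] for t in seq)


def sequence_weights(dist):
    """Spread each pattern's probability evenly over its tour sequences."""
    groups = {}
    for seq in product(range(6), repeat=4):
        groups.setdefault(pattern(seq), []).append(seq)
    return {seq: w / len(groups[s]) for s, w in dist.items() for seq in groups[s]}


def prob_no_meeting(weights, steps):
    """P(T > steps) when both players draw independently from `weights`."""
    total = Fraction(0)
    for a, wa in weights.items():
        for b, wb in weights.items():
            if all(MISS[i][j] for i, j in zip(a[:steps], b[:steps])):
                total += wa * wb
    return total


W = sequence_weights(Y)
tails = [prob_no_meeting(W, k) for k in (1, 2, 3, 4)]
q = tails[3]                       # no meeting within four t-steps
head = 1 + sum(tails[:3])          # 1 + P(T>1) + P(T>2) + P(T>3)
restart = head / (1 - q)           # the four-tour pattern restarts
revert = head + 2 * q              # random tours afterwards (2 more on average)

print(tails)
print(Fraction(1, 16) - q)         # 23/32400, the gain over random tours
print(restart, revert)
```

Under these assumptions the first three tail probabilities equal those of random tours (1/2, 1/4, 1/8), and only the fourth drops below 1/16. The `revert` value equals $2-23/16200$, the figure quoted in the text, while `restart` evaluates to $2-23/15199$; both are below 2.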
Thus, rendezvous occurs in a smaller expected number of t–steps than it does under AW. This happens when players use a mixed 4–Markov strategy of doing $AAAB$, $AABA$, $ABAA$, $ABBB$, each with probability 1/12, and $ABCD$ with probability 2/3. This corresponds to choosing tours for the first two t–steps at random, but then making the choice of tours at the 3rd and 4th t–step depend on the tours taken at the 1st and 2nd t–step. The choice of $y$ is not unique. It has been chosen to be simple, containing many 0s, and it was found by using the fact that the eigenvector of $P_4$ having a negative eigenvalue is of a pattern $(\alpha, \beta, \beta, \gamma, \delta, \beta, \gamma, \delta, \gamma, \beta, \delta, \delta, \delta, \delta, \epsilon)$ for some irrational $\alpha$, $\beta$, $\gamma$, $\delta$, $\epsilon$.

The above makes it very plausible that we can find a strategy that is better than AW on $K_4$. We now need to do some careful calculations. We consider a 12–Markov strategy consisting of 4 t–steps. In each t–step a player remains home with probability $p$, and tours with probability $1-p$. When he makes tours, he does so in a manner that achieves the distribution previously described. That is, any 1st and 2nd tours are made at random, but 3rd and 4th tours are made so that these are consistent with the distribution over 4 tours being $AAAB$, $AABA$, $ABAA$, $ABBB$, each with probability 1/12, and $ABCD$ with probability 2/3. If at the end of 12 steps the players have not met then the strategy restarts, forgetting about the number of previous t–steps on which players made non-meeting tours.

We found it easiest to calculate the expected meeting time by attaching a probability to each possible 12–step path that the strategy might take. There are 1585 possible paths which have nonzero probability.
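The tour-selection rule and the path count can be made concrete with a short Python sketch. It is illustrative only and rests on two assumptions that are not spelled out in the text: a path records, for each of the four t–steps, either "stay home" or one of the six tours, and $0<p<1$, so that both choices have positive probability. The function names are introduced here.

```python
from fractions import Fraction
from itertools import combinations, product

# Target distribution over four-tour patterns (zero entries omitted).
Y = {"AAAB": Fraction(1, 12), "AABA": Fraction(1, 12), "ABAA": Fraction(1, 12),
     "ABBB": Fraction(1, 12), "ABCD": Fraction(2, 3)}


def pattern(seq):
    """Relabel a sequence of tour indices as A, B, C, ... by first use."""
    seen = []
    for t in seq:
        if t not in seen:
            seen.append(t)
    return "".join("ABCDEF"[seen.index(t)] for t in seq)


def prefix_probability(prefix):
    """Probability that a player's first len(prefix) tours follow `prefix`."""
    return sum(w for s, w in Y.items() if s.startswith(prefix))


def next_label_probs(prefix):
    """Conditional distribution of the next tour's label given the pattern so
    far; a label not yet used means a uniform choice among untried tours."""
    base = prefix_probability(prefix)
    labels = "ABCD"[: len(set(prefix)) + 1]
    return {c: prefix_probability(prefix + c) / base
            for c in labels if prefix_probability(prefix + c) > 0}


def count_paths():
    """Count 12-step paths with nonzero probability: choose which of the four
    t-steps are tours, then any tour sequence whose pattern Y allows."""
    count = 0
    for k in range(5):                            # number of touring t-steps
        for _slots in combinations(range(4), k):  # which t-steps are tours
            for tours in product(range(6), repeat=k):
                if prefix_probability(pattern(tours)) > 0:
                    count += 1
    return count


print(next_label_probs("AA"))   # after repeating the first tour
print(next_label_probs("AB"))   # after two different tours
print(count_paths())            # 1585
```

For example, after two different tours (`"AB"`) the third tour repeats the first with probability 1/10, repeats the second with probability 1/10, and is a new tour with probability 4/5. The count splits as 1 + 24 + 216 + 864 + 480 according to the number of touring t–steps.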
We computed the step at which players meet, or the event that they do not meet, for each of the $1585 \times 1585$ possibilities, and averaged these using the appropriate probabilities. The calculations are intricate, but can be checked in various ways to provide confidence that no mistake has been made. It turns out that the expected meeting time is

$$ET = \frac{-227773p^8 + 582884p^7 - 1329319p^6 + 1737938p^5 - 1941235p^4 + 1420688p^3 - 998569p^2 + 389834p - 217648}{3\left(82001p^8 - 218608p^7 + 327728p^6 - 315256p^5 + 215870p^4 - 104656p^3 + 36128p^2 - 8008p - 15199\right)}.$$

Taking $p = \tfrac14\left(3\sqrt{681}-77\right)$, which is the optimal value for the AW strategy, we find that the new strategy produces an expected meeting time that is less than that of AW by

$$\frac{24375041961207 + 4700853101\sqrt{681}}{327540887401488016} \approx 0.000146683.$$

The tiny improvement is due to the fact that when both players do four t–tours (which happens with probability $(1-p)^4$), the new strategy gives a greater probability that the players meet than does AW. It would be possible to make the new strategy even better, by choosing $p$ slightly differently, or indeed making it depend on the number of tours that have been taken so far over which players have not met. We could also do better by not restarting after 12 steps. However, our aim is not to try to find the best strategy for $K_4$, which still seems very difficult, but simply to show that AW is not optimal. This we have now done.

References

[1] E. J. Anderson and R. R. Weber. The rendezvous problem on discrete locations. J. Appl. Prob., 27:839–851, 1990.
[2] J. Fan. Symmetric Rendezvous Problem with Overlooking. PhD thesis, University of Cambridge, 2009.
[3] R. R. Weber. The optimal strategy for symmetric rendezvous on $K_3$. arXiv:0906.5447v1, 2006.