SOME REMARKS ON COPS AND DRUNK ROBBERS

ATHANASIOS KEHAGIAS AND PAWEŁ PRAŁAT

Abstract. The cops and robbers game has been extensively studied under the assumption of optimal play by both the cops and the robbers. In this paper we study the problem in which cops are chasing a drunk robber (that is, a robber who performs a random walk) on a graph. Our main goal is to characterize the "cost of drunkenness." Specifically, we study the ratio of expected capture times for the optimal version and the drunk robber one. We also examine the algorithmic side of the problem; that is, how to compute near-optimal search schedules for the cops. Finally, we present a preliminary investigation of the invisible robber game and point out differences between this game and graph search.

1. Introduction

The game of Cops and Robbers, introduced independently by Nowakowski and Winkler [16] and Quilliot [19] almost thirty years ago, is played on a fixed undirected, simple, and finite graph $G$. There are two players, a team of $k$ cops, where $k \ge 1$ is a fixed integer, and the robber. In the first round of the game, the cops occupy any set of $k$ vertices and then the robber chooses a vertex to start from; in the following rounds, first the cops and then the robber move from vertex to vertex, following the edges of $G$. More than one cop is allowed to occupy a vertex, and the players may remain at their current positions. At every step of the game, both players know the positions of all cops and the robber. The cops win if they capture the robber, that is, if at least one cop eventually occupies the same vertex as the robber; the robber wins if he can avoid being captured indefinitely. The players are adversarial; that is, they play optimally against each other.
Since placing a cop on each vertex guarantees that the cops win, we may define the cop number, written $c(G)$, to be the minimum number of cops needed to win on $G$. The cop number was introduced by Aigner and Fromme in [1]. In this paper we study a new version of the game, in which the robber is drunk; that is, he performs a random walk on $G$. The cops are assumed to follow a strategy which is optimal with respect to the robber's random behavior. This version was proposed by D. Thilikos during the 4th Workshop on GRAph Searching, Theory and Applications (GRASTA 2011), and he specifically asked the following question: "what is the cost of drunkenness?" In other words, how much faster than the adversarial robber is the drunk one captured? We try to answer various versions of this question. In addition, we study some algorithmic questions; for example, how to compute the expected capture time for an optimal strategy of cops. There is a large bibliography on pursuit games on graphs. The reader interested in cops and robbers can start by perusing the surveys [2, 7, 8] and the recent book [4]. To the best of our knowledge, the problem of a drunk robber has not been previously studied in the cops and robbers literature. However, there is a strong connection to the Markov Decision Processes (MDP) literature; we will comment on this connection (and use it) in Section 5. The reader can refer to [12, 18, 20] for MDP surveys. While the emphasis of the current paper is on cops chasing the visible robber, we also touch briefly on the case of the invisible robber, both adversarial and drunk.
Not much has been written on this problem, but a related problem which has been extensively studied is the Graph Search problem, where a team of searchers try to locate in a graph an invisible fugitive, who is also assumed to be arbitrarily fast and omniscient (he always knows the searchers' locations as well as their strategy). A recent comprehensive review of graph search appears in [7]. We emphasize that the graph search problem is similar but not identical to cops chasing an invisible robber. The paper is structured as follows. In Section 2 we present definitions and our notation; the formulation is, naturally, probabilistic. In particular, we define the cost of drunkenness to be the ratio of the capture time for the adversarial robber and the expected capture time for the drunk robber. We also present a number of lemmas which we will use repeatedly in the following sections. In Section 3 we obtain bounds on the cost of drunkenness for various special families of graphs; for example, paths, cycles, grids, and complete $d$-ary trees. In Section 4 we look at the problem more generally and show that, for any $c \in [1, \infty)$, there is a graph for which the cost of drunkenness is arbitrarily close to $c$. In Section 5 we connect the cops and drunk robber problem to Markov Decision Processes (MDP); that is, Markov chains with a control input which can modify the transition probabilities. MDPs provide a natural language for the problem; in particular, they are useful in the computation of optimal cop strategies, that is, strategies which minimize the expected robber capture time. We then use the MDP machinery to present algorithms which compute the optimal cop strategy for a given graph and a drunk robber. In Section 6 we give a brief, preliminary discussion of the cost of drunkenness for an invisible robber.
Finally, in Section 7 we list possible future research directions.

2. Preliminaries

2.1. Definitions. Let $G = (V, E)$ be a fixed undirected, simple, and finite graph. Since the game played on a disconnected graph can be analyzed by investigating each component separately, we assume that $G$ is connected. We will use the following notation and assumptions.

(i) There are $k$ cops (for the time being we assume $k \ge c(G)$, but this assumption will be relaxed in later sections).

(ii) $X_t^i$ denotes the position of the $i$-th cop at time $t$ ($i \in \{1, 2, \ldots, k\}$, $t \in \{0, 1, 2, \ldots\}$); $X_t = (X_t^1, \ldots, X_t^k)$ denotes the vector of all cop positions at time $t$; $X = (X_0, X_1, X_2, \ldots)$ denotes the positions of all cops during the game ($X$ may have finite or infinite length).

(iii) $Y_t$ denotes the position of the robber at time $t$, and $Y = (Y_0, Y_1, Y_2, \ldots)$ the positions of the robber during the game. (Let us note that there is a correlation between $X$ and $Y$; that is, the players adjust their strategies by observing the moves of the opponent.)

(iv) The moving sequence is as follows: first the cops choose initial positions $X_0 \in V^k$, then the robber chooses $Y_0 \in V$. For $t \in \{1, 2, \ldots\}$, first the cops choose $X_t$ and then the robber chooses $Y_t$. The players use edges of the graph $G$ to move from one vertex to another; that is, $\{X_t^i, X_{t+1}^i\} \in E$ for $i \in \{1, 2, \ldots, k\}$ and $t \in \{0, 1, 2, \ldots\}$, and $\{Y_t, Y_{t+1}\} \in E$ for $t \in \{0, 1, 2, \ldots\}$.

(v) The capture time is denoted by $T$ and defined as follows:
$$T = \min\{t : \exists i \text{ such that } X_t^i = Y_t\};$$
that is, it is the first time a cop is located at the same vertex as the robber (note that this can happen either after the cops move or after the evader moves). Note that $T < \infty$, since $k \ge c(G)$ and $c(G)$ cops can capture the adversarial robber (and so, of course, the drunk one too).
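The round structure in (iv)-(v) translates directly into a simulation loop. The following minimal sketch (the example graph, the "sweep" strategy, and all helper names are our illustration, not notation from the paper) plays one game of a single cop against a drunk robber:

```python
import random

def play_drunk_robber(adj, cop_path, rng):
    """Play one game on the graph `adj` (dict: vertex -> list of neighbours).
    `cop_path[t]` is the single cop's position X_t (a fixed walk along edges);
    the drunk robber starts uniformly at random.  Returns the capture time T."""
    robber = rng.choice(sorted(adj))            # Y_0: uniform initial vertex
    if robber == cop_path[0]:
        return 0                                # captured at placement
    for t in range(1, len(cop_path)):
        if cop_path[t] == robber:               # first phase: the cop moves
            return t
        robber = rng.choice(adj[robber])        # second phase: the robber steps
        if robber == cop_path[t]:
            return t
    raise RuntimeError("cop strategy ended before capture")

# On the path P_6 the sweep strategy 0,1,...,5 always captures by time 5,
# because the robber can never slip past the cop.
n = 6
adj = {v: [u for u in (v - 1, v + 1) if 0 <= u < n] for v in range(n)}
rng = random.Random(0)
times = [play_drunk_robber(adj, list(range(n)), rng) for _ in range(1000)]
assert all(0 <= T <= n - 1 for T in times)
```

Averaging `times` over many plays estimates the expected capture time of this fixed strategy; Section 5 of the paper discusses computing such expectations exactly.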
Assuming for the moment adversarial cops and robber, and given initial cop positions $x \in V^k$ and robber position $y \in V$, we let $\mathrm{ct}_{x,y}(G, k) = T$. The $k$-capture time is defined as follows:
$$\mathrm{ct}(G, k) = \min_{x \in V^k} \max_{y \in V} \mathrm{ct}_{x,y}(G, k).$$
In other words, we allow our perfect players to choose their initial positions in order to achieve the best outcome. Finally, when $k = c(G)$ we simply write $\mathrm{ct}(G)$ instead of $\mathrm{ct}(G, c(G))$, and call it the capture time instead of the $c(G)$-capture time. Let us stress one more time that the above quantities are defined under the assumption of optimal play by both players.

Next let us assume that the cops are adversarial but the robber is drunk. More specifically, we assume the robber performs a random walk on $G$: given that he is at vertex $v \in V$ at time $t$, he moves to $u \in N(v)$ at time $t+1$ with probability $1/|N(v)|$. Note that we do not include $v$ in $N(v)$; that is, we consider open, not closed, neighbourhoods. Moreover, the robber's probability distribution does not depend on the current position of the cops; in particular, it can happen that the robber moves to a vertex occupied by a cop (something the adversarial robber would never do). Under the above assumptions, the drunk robber game is actually a one-player game and, for a given initial configuration and cop strategy, the capture time $T$ is a random variable. For any $x \in V^k$ and $y \in V$, let
$$\mathrm{dct}_{x,y}(G, k) = \mathbb{E}(T \mid X_0 = x,\ Y_0 = y,\ k \text{ cops are used optimally});$$
in other words, it is the expected capture time given initial cop and robber configurations $x, y$ and optimal play by the $k$ cops. Since the robber is drunk, we cannot expect him to choose the most suitable vertex to start with; instead, he chooses an initial vertex uniformly at random.
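For any fixed cop strategy, $\mathbb{E}(T \mid X_0 = x, Y_0 = y)$ is the absorption time of a finite Markov chain and can therefore be computed exactly by solving a linear system. A minimal sketch for the crudest possible strategy, a single cop parked forever on one end of a path (the graph choice, the exact-arithmetic solver, and all names are ours; being suboptimal, the resulting values only upper-bound the optimal ones):

```python
from fractions import Fraction

def stationary_cop_expected_times(n, cop):
    """Expected capture times h[y] for a drunk robber starting at vertex y of
    the path P_n = 0-1-...-(n-1), with a single cop parked on vertex `cop`.
    Solves (I - Q) h = 1, where Q is the robber's walk restricted to the
    uncaptured states (stepping onto the cop's vertex means absorption)."""
    states = [v for v in range(n) if v != cop]
    idx = {v: i for i, v in enumerate(states)}
    m = len(states)
    A = [[Fraction(0)] * m + [Fraction(1)] for _ in range(m)]  # [I - Q | 1]
    for v in states:
        A[idx[v]][idx[v]] += 1
        nbrs = [u for u in (v - 1, v + 1) if 0 <= u < n]
        for u in nbrs:
            if u != cop:
                A[idx[v]][idx[u]] -= Fraction(1, len(nbrs))
    for col in range(m):                       # exact Gauss-Jordan elimination
        piv = next(r for r in range(col, m) if A[r][col] != 0)
        A[col], A[piv] = A[piv], A[col]
        for r in range(m):
            if r != col and A[r][col] != 0:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    h = {cop: Fraction(0)}
    for v in states:
        h[v] = A[idx[v]][m] / A[idx[v]][idx[v]]
    return h

h = stationary_cop_expected_times(5, 0)        # cop parked on vertex 0 of P_5
assert [h[v] for v in range(5)] == [0, 7, 12, 15, 16]
assert sum(h.values()) / 5 == 10               # uniform Y_0: expected time 10
```

Averaging over the robber's uniform initial vertex, as in the last assertion, mirrors the definition of the expected capture time used below.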
Cops are, of course, aware of the robber's behaviour, and so they try to choose an initial configuration so that the expected length of the game is as small as possible. Hence, we define the expected $k$-capture time as follows:
$$\mathrm{dct}(G, k) = \min_{x \in V^k} \sum_{y \in V} \frac{\mathrm{dct}_{x,y}(G, k)}{|V|}.$$
As before, $\mathrm{dct}(G) = \mathrm{dct}(G, c(G))$. We define the cost of drunkenness as follows:
$$F(G) = \frac{\mathrm{ct}(G)}{\mathrm{dct}(G)},$$
and we obviously have $F(G) \ge 1$. While we concentrate on the case $k = c(G)$, it is also natural to consider the expected capture time $\mathrm{dct}(G, k)$ for $k \ne c(G)$. The next theorem shows that this is well defined for any $k \ge 1$ (in particular, even for $k < c(G)$).

Theorem 2.1. $\mathrm{dct}(G, k) < \infty$ for any connected graph $G$ and $k \ge 1$.

Proof. Let $G = (V, E)$ be any connected graph, $D = D(G)$ be the diameter of $G$, and $\Delta = \Delta(G)$ be the maximum degree of $G$. Fix any vertex $v \in V$, place $k$ cops on $v$, and let $X_t^i = v$ for all $i$ and $t$ (that is, the cops never move; this is clearly a suboptimal strategy). For a given vertex $y \in V$ occupied by the drunk robber, the probability that he uses a shortest path from $y$ to $v$ to move straight to $v$ is at least $(1/\Delta)^D$. This implies that, regardless of the current position of the robber at time $t$, the probability that he will be caught after at most $D$ further rounds is at least $\varepsilon = (1/\Delta)^D$. Moreover, the corresponding events for times $t + iD$, $i \in \mathbb{N} \cup \{0\}$, are mutually independent. Thus, we get immediately that
$$\mathbb{E} T = \sum_{t \ge 0} \mathbb{P}(T > t) \le \sum_{t \ge 0} \mathbb{P}\left(T > \left\lfloor \frac{t}{D} \right\rfloor D\right) = \sum_{i \ge 0} D \cdot \mathbb{P}(T > iD) \le D \sum_{i \ge 0} (1 - \varepsilon)^i = \frac{D}{\varepsilon} = D \Delta^D < \infty,$$
and we are done.

Let us remark that sharper bounds can be obtained for the capture time of a drunk robber, even in the case that the cops are also drunk; for example, see [5]. However, Theorem 2.1 will be sufficient for our needs.

2.2. Some Useful Lemmas.
We will be using the following version of a well-known Chernoff bound many times, so let us state it explicitly.

Lemma 2.2 ([11]). Let $X$ be a random variable that can be expressed as a sum $X = \sum_{i=1}^n X_i$ of independent random indicator variables, where $X_i \in \mathrm{Be}(p_i)$ with (possibly) different $p_i = \mathbb{P}(X_i = 1) = \mathbb{E} X_i$. Then the following holds for $t \ge 0$:
$$\mathbb{P}(X \ge \mathbb{E} X + t) \le \exp\left(-\frac{t^2}{2(\mathbb{E} X + t/3)}\right), \qquad \mathbb{P}(X \le \mathbb{E} X - t) \le \exp\left(-\frac{t^2}{2 \mathbb{E} X}\right).$$
In particular, if $\varepsilon \le 3/2$, then
$$\mathbb{P}(|X - \mathbb{E} X| \ge \varepsilon \mathbb{E} X) \le 2 \exp\left(-\frac{\varepsilon^2 \mathbb{E} X}{3}\right).$$

Let us now consider the following (simple) random walk on $\mathbb{Z}$. Understanding the behaviour of this Markov chain will be important in investigating simple families of graphs later. Let $X_0 = 0$ and, for a given $t \ge 0$, let
$$X_{t+1} = \begin{cases} X_t + 1 & \text{with probability } 1/2, \\ X_t - 1 & \text{otherwise.} \end{cases}$$
It is known that, with high probability, the random variable $X_t$ stays relatively close to zero. We make this precise below using the Chernoff bound.

Lemma 2.3. Let $n \in \mathbb{N}$ and $c \in (2, \infty)$. For a simple random walk $(X_t)$ on $\mathbb{Z}$ with $X_0 = 0$, we have $|X_t| \le c \sqrt{n \log n}$ for every $t \in \{0, 1, \ldots, n\}$ with probability at least $1 - 2n^{1 - c^2/4}$.

Proof. Fix $n \in \mathbb{N}$ and $c \in (2, \infty)$. Let us perform $n$ steps of a simple random walk on $\mathbb{Z}$ starting with $X_0 = 0$. Let $Y_t$ ($1 \le t \le n$) denote the number of times the process goes 'up' until time $t$. It is clear that $\mathbb{E} Y_t = t/2$ and $X_t = Y_t - (t - Y_t) = 2(Y_t - t/2)$. For a given $t$, it follows from the Chernoff bound (Lemma 2.2) that
$$\mathbb{P}\left(X_t < -c \sqrt{n \log n}\right) = \mathbb{P}\left(Y_t \le \frac{t}{2} - \frac{c}{2} \sqrt{n \log n}\right) \le \exp\left(-\frac{(c \sqrt{n \log n}/2)^2}{2(t/2)}\right) \le \exp\left(-\frac{c^2}{4} \log n\right) = n^{-c^2/4}.$$
A symmetric argument can be used to get that $X_t > c \sqrt{n \log n}$ with probability at most $n^{-c^2/4}$.
Finally, from a union bound we get that the probability that there exists $t$ ($1 \le t \le n$) with $|X_t| > c \sqrt{n \log n}$ is at most $n \cdot 2n^{-c^2/4} = 2n^{1 - c^2/4}$.

3. Bounds on the Cost of Drunkenness

In this section we place upper and lower bounds on the cost of drunkenness $F(G)$ when $k$ cops are available. We emphasize the case $k = c(G)$ but also consider values of $k \ne c(G)$. We start with simple graphs (namely, paths, cycles, trees, and grids) in order to prepare for slightly more complicated families in the next section.

3.1. Paths and a Suboptimal Strategy. In this subsection we play the game on $P_n$, a path on $n$ vertices ($V(P_n) = \{0, 1, \ldots, n-1\}$, $E(P_n) = \{\{i-1, i\} : i \in \{1, 2, \ldots, n-1\}\}$). Clearly, $c(P_n) = 1$; that is, one cop can catch the adversarial robber. Since the drunk robber is easier to catch than the adversarial one, let us study the drunk robber playing against a single cop. In this subsection we will compute the expected capture time using a suboptimal strategy, namely starting the cop at $X_0 = 0$ and moving him toward the other end until he reaches $n-1$ (or until capture takes place). It is clear that this strategy achieves capture; furthermore (as will become apparent in the following sections), many optimal strategies can be analyzed using this suboptimal one. Let $Z_t = Y_t - X_t$ be the distance between the players at time $t$. If the drunk robber starts at vertex $k \in \{0, 1, \ldots, n-1\}$, we have $Z_0 = Y_0 = k$. (In order to simplify the argument, we allow the players to "pass each other," which is never the case in the real game; that is, $Z_t$ can be negative.) We can redefine the capture time as $T_n = T_n(k) = \min\{t : Z_t \le 0\}$. Now, it is not so difficult to see the behaviour of the sequence $(Z_t)_{t \ge 0}$.
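Before analysing $(Z_t)$, which is driven by the same coin flips as the walk above, the concentration promised by Lemma 2.3 is easy to check empirically. A quick sketch (the parameters, seed, and trial count are our choices, and we take $\log$ to be the natural logarithm):

```python
import math
import random

def max_excursion(n, rng):
    """Largest |X_t| over t = 0..n for a simple random walk started at 0."""
    x, worst = 0, 0
    for _ in range(n):
        x += 1 if rng.random() < 0.5 else -1
        worst = max(worst, abs(x))
    return worst

n, c = 400, 3.0
bound = c * math.sqrt(n * math.log(n))          # c*sqrt(n log n), about 147 here
rng = random.Random(1)
trials = 200
inside = sum(max_excursion(n, rng) <= bound for _ in range(trials))
# Lemma 2.3 promises probability at least 1 - 2*n**(1 - c*c/4), about 0.9989
# for these values, so the empirical frequency should be very close to 1.
assert inside / trials >= 0.95
```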
Note that at time $t$ the maximum possible distance between the players is $n - 1 - t$, which implies that the robber will be caught in at most $n-1$ steps. We have the following Markov chain to investigate: for $t \in \{0, 1, \ldots, n-2\}$, if $Z_t < n - 1 - t$, then
$$Z_{t+1} = \begin{cases} Z_t - 2 & \text{with probability } 1/2 \text{ (the robber goes toward the cop)}, \\ Z_t & \text{with probability } 1/2 \text{ (the robber goes away from the cop)}. \end{cases}$$
If $Z_t = n - 1 - t$ (that is, the robber occupies the end of the path), then $Z_{t+1} = Z_t - 2$ (deterministically). Consider another Markov chain $Z'_t$, which has the following simple behaviour: $Z'_0 = k$ and, for every $t \ge 0$, $Z'_{t+1} = Z'_t - 2$ with probability $1/2$; otherwise $Z'_{t+1} = Z'_t$. Define $T' = \min\{t : Z'_t \le 0\}$. In other words, we will be chasing the robber on the infinite ray $R$ ($V(R) = \mathbb{N} \cup \{0\}$, $E(R) = \{\{i-1, i\} : i \in \mathbb{N}\}$), which is slightly more difficult for the cop. Hence, it is easy to prove that $\mathbb{E}(T_n \mid Z_0 = k) \le \mathbb{E}(T' \mid Z'_0 = k)$. Moreover, it is also easy (using a recursive argument) to show that $\mathbb{E}(T' \mid Z'_0 = k) = k$, and so $\mathbb{E}(T_n(k)) \le k$. Now we are ready to show the following.

Theorem 3.1. Suppose that the cop starts at one end of the path $P_n$ and moves toward the other end. Let $T_n$ be the capture time, provided that the robber is drunk. Then,
$$\frac{n}{2} \left(1 - O\left(\frac{\log n}{n}\right)\right) \le \mathbb{E} T_n \le \frac{n-1}{2}.$$

Before we move to the proof of this theorem, let us mention that, in fact, with a slightly more sophisticated argument it is possible to show that $\mathbb{E} T_n = n/2 - O(1)$.

Proof. Let $n \in \mathbb{N}$ and fix any $c > 2$. The robber starts his walk at a vertex $k \in \{0, 1, \ldots, n-1\}$. Let us note that he is captured after at most $n-1$ steps of the process (deterministically); that is, $T_n(k) \le n - 1$. As we already mentioned, $\mathbb{E} T_n(k) \le k$.
Since the starting vertex for the robber is chosen uniformly at random, we get that $\mathbb{E} T_n \le \sum_{k=0}^{n-1} k/n = (n-1)/2$, so it remains to investigate the lower bound. Suppose first that $k \le (n-1) - c \sqrt{n \log n}$. It follows from Lemma 2.3 that the robber reaches the other end of the path with probability at most $2n^{1-c^2/4}$. If this is the case, we apply a trivial lower bound for $T_n(k)$, namely $T_n(k) \ge 0$; otherwise we get that the (conditional) expectation of $T_n(k)$ is equal to $k$. Hence, $\mathbb{E} T_n(k) \ge k(1 - 2n^{1-c^2/4})$. Suppose now that $k > (n-1) - c \sqrt{n \log n}$. Using Lemma 2.3 one more time, we get that with probability at least $1 - 2n^{1-c^2/4}$ the robber is not caught before time $k - c \sqrt{n \log n}$. Since the starting vertex for the robber is chosen uniformly at random, we get that
$$\mathbb{E} T_n \ge \frac{1}{n} \left( \sum_{k=0}^{n-1-c\sqrt{n \log n}} k + \sum_{k=n-c\sqrt{n \log n}}^{n-1} \left(k - c \sqrt{n \log n}\right) \right) \left(1 - 2n^{1-c^2/4}\right) \ge \left( \frac{n-1}{2} - c^2 \log n \right) \left(1 - 2n^{1-c^2/4}\right).$$
For a given $n$, the parameter $c$ can be adjusted for the best outcome. To get the asymptotic behaviour, we can use, say, $c = 3$ to get that
$$\mathbb{E} T_n \ge \frac{n}{2} \left(1 - O\left(\frac{\log n}{n}\right)\right),$$
and the proof is complete.

The proof of the theorem actually gives us more. We get that, with probability tending to 1 as $n \to \infty$, for all starting points of the robber ($k \in \{0, 1, \ldots, n-1\}$), the cop needs $k + O(\sqrt{n \log n})$ moves to catch the robber.

3.2. Paths. We continue studying a visible robber on $P_n$, but we now apply the optimal capture strategy (it is optimal for both the adversarial and the drunk robber). If $n$ is odd, we start by placing a cop on vertex $(n-1)/2$; if $n$ is even we have two optimal strategies: the cop can start on $n/2$ or on $n/2 - 1$. In any case, after selecting an initial vertex the strategy is the same: the cop keeps moving toward the robber.
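Both the sweep strategy of Theorem 3.1 and the center-start strategy just described can be evaluated exactly (no simulation error) by propagating the joint (cop, robber) probability distribution round by round; a cop who always steps toward the robber reproduces the sweep when he starts at vertex 0. A sketch (the function name and test sizes are ours):

```python
def expected_capture_time(n, cop_start):
    """Exact E[T] on the path P_n when the cop starts at `cop_start` and always
    steps toward the robber, while the drunk robber starts uniformly at random.
    Propagates the joint (cop, robber) probability distribution round by round."""
    dist = {(cop_start, y): 1.0 / n for y in range(n) if y != cop_start}
    et, t = 0.0, 0                      # mass at y == cop_start: captured at T=0
    while dist:                         # capture is certain within n rounds
        t += 1
        nxt = {}
        for (x, y), p in dist.items():
            x2 = x + 1 if y > x else x - 1          # cop phase
            if x2 == y:
                et += t * p
                continue
            nbrs = [z for z in (y - 1, y + 1) if 0 <= z < n]
            for z in nbrs:                          # robber phase: random walk
                q = p / len(nbrs)
                if z == x2:
                    et += t * q
                else:
                    nxt[(x2, z)] = nxt.get((x2, z), 0.0) + q
        dist = nxt
    return et

assert abs(expected_capture_time(3, 0) - 2 / 3) < 1e-12    # tiny case, by hand
e_end, e_mid = expected_capture_time(101, 0), expected_capture_time(101, 50)
assert e_end <= 50.0 + 1e-9 and e_mid <= 101 / 4 + 1e-9    # Theorems 3.1, 3.2
assert 1.8 < (101 // 2) / e_mid < 2.6                      # cost of drunkenness near 2
```

For $n = 101$ the end start comes out close to $(n-1)/2$ and the center start close to $n/4$, consistent with the bounds above.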
Except for the initial placement, this is the strategy examined in the previous subsection, and we have $\mathrm{ct}(P_n) = \lfloor n/2 \rfloor$. We easily get the following result.

Theorem 3.2.
$$\frac{n}{4} \left(1 - O\left(\frac{\log n}{n}\right)\right) \le \mathrm{dct}(P_n) \le \frac{n}{4}.$$
In particular, $\mathrm{dct}(P_n) = (1 + o(1)) n/4$, and the cost of drunkenness is
$$F(P_n) = \frac{\mathrm{ct}(P_n)}{\mathrm{dct}(P_n)} = 2 + o(1).$$

Proof. As we already mentioned, after the robber selects his initial vertex, the game is played essentially on a path of length at most $\lfloor n/2 \rfloor + 1$. From Theorem 3.1 we get immediately that $\mathrm{dct}(P_n) \le \mathbb{E} T_{\lfloor n/2 \rfloor + 1} \le n/4$. For the lower bound, we notice that the length of each subpath is at least $\lfloor n/2 \rfloor$. By Theorem 3.1,
$$\mathrm{dct}(P_n) \ge \mathbb{E} T_{\lfloor n/2 \rfloor} \ge \frac{n}{4} \left(1 - O\left(\frac{\log n}{n}\right)\right),$$
and the proof is complete.

In the general case, when $k \in \mathbb{N}$ cops are available, we need to 'slice' the path into $k$ shorter paths and place a cop at the center of each. We get that $\mathrm{dct}(P_n, k) = (1 + o(1)) n/(4k)$.

3.3. Cycles. Let us play the game on a cycle $C_n$ for $n \ge 4$ ($V(C_n) = \{1, 2, \ldots, n\}$, $E(C_n) = \{\{i, i+1\} : i \in \{1, 2, \ldots, n-1\}\} \cup \{\{1, n\}\}$). It is not difficult to see that $c(C_n) = 2$; we use two cops to chase the robber. They start by occupying two vertices at distance $\lfloor (n+1)/2 \rfloor$, the maximum possible distance on the cycle. When the robber selects his starting vertex, they move toward him, and capture occurs at time $\mathrm{ct}(C_n) = \lfloor (n+1)/4 \rfloor$. The same strategy is used when the robber is drunk. As for paths, one can introduce a random variable $Z_t$ measuring the distance between the robber and the cops at time $t$. The problem (almost) reduces to the problem on a path. We briefly mention the differences below, but the formal proof is omitted. If $n$ is odd, then $Z_t$ has exactly the same behaviour as before.
However, $Z_0 = \lfloor (n+1)/2 \rfloor$ with probability two times smaller than any other legal starting value (note that a uniform distribution on $V(C_n)$ is used, but there is just one vertex at distance $\lfloor (n+1)/2 \rfloor$). If $n$ is even, then we get a uniform distribution for the starting values, but the transition from $Z_t$ to $Z_{t+1}$ is slightly different: namely, there is a chance for $Z_t$ to stay at the same value, provided that the robber occupies the vertex at maximum distance from the cops. In any case, it is straightforward to show that both the upper and lower bounds still hold, so we get the following.

Theorem 3.3.
$$\frac{n}{8} \left(1 - O\left(\frac{\log n}{n}\right)\right) \le \mathrm{dct}(C_n) \le \frac{n+1}{8}.$$
In particular, $\mathrm{dct}(C_n) = (1 + o(1)) n/8$, and the cost of drunkenness is
$$F(C_n) = \frac{\mathrm{ct}(C_n)}{\mathrm{dct}(C_n)} = 2 + o(1).$$

In the general case, when $k \in \mathbb{N}$ cops are available, we spread them as evenly as possible. We get that $\mathrm{dct}(C_n, k) = (1 + o(1)) n/(4k)$.

3.4. Trees. All the families of graphs we have discussed so far have a very nice property: it is clear what the optimal strategy for the cops is. Once the players fix their initial positions (that is, $X_0$ and $Y_0$), the cops must move toward the robber in order to decrease the expected capture time. As mentioned before, it is natural to measure the distance $Z_t$ between the players at time $t$; $Z_t$ decreases by 2 if the robber makes a bad move or is occupying a leaf; otherwise the distance remains the same. This applies to the family of trees as well (note that $c(T) = 1$ for any tree $T$). However, this time it is not clear which vertex the cop should start from in order to optimize the expected capture time.
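For a small tree, one can simply try every starting vertex by computer: fix the move-toward-the-robber strategy, propagate the joint (cop, robber) distribution exactly, and take the minimum over starts. A sketch (all names and the example graphs are ours):

```python
from collections import deque

def tree_expected_capture_time(adj, start):
    """Exact E[T] on the tree `adj` (dict: vertex -> neighbour list) when the
    cop starts at `start` and always steps along the tree path toward the
    robber, while the drunk robber starts uniformly at random."""
    n = len(adj)
    next_hop = {}                  # next_hop[x][y]: cop's step from x toward y
    for x in adj:
        parent, queue = {x: None}, deque([x])
        while queue:               # BFS rooted at x gives parents toward x
            v = queue.popleft()
            for u in adj[v]:
                if u not in parent:
                    parent[u] = v
                    queue.append(u)
        next_hop[x] = {}
        for y in adj:
            v = y
            while v != x and parent[v] != x:
                v = parent[v]
            next_hop[x][y] = v
    dist = {(start, y): 1.0 / n for y in adj if y != start}
    et, t = 0.0, 0
    while dist:                    # on a tree, capture is certain within n rounds
        t += 1
        nxt = {}
        for (x, y), p in dist.items():
            x2 = next_hop[x][y]                     # cop phase
            if x2 == y:
                et += t * p
                continue
            for z in adj[y]:                        # robber phase: random walk
                q = p / len(adj[y])
                if z == x2:
                    et += t * q
                else:
                    nxt[(x2, z)] = nxt.get((x2, z), 0.0) + q
        dist = nxt
    return et

# Brute force over starting vertices on the path 0-1-2-3-4 (a tree):
path = {v: [u for u in (v - 1, v + 1) if 0 <= u <= 4] for v in range(5)}
best = min(path, key=lambda v: tree_expected_capture_time(path, v))
assert best == 2                   # the center is the optimal starting vertex
```

On the star $K_{1,3}$ the same function gives $3/4$ for a cop starting at the center; for the regular rooted trees considered next, symmetry already forces the root as the starting vertex.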
For this family, the random variable $Z_t$ decreases with probability $1/\deg(v)$, provided that the robber occupies vertex $v$, and the behaviour of the sequence $(Z_t)_{t \ge 0}$ depends heavily not only on the degree distribution but on the structure of the tree as well. It is non-trivial to estimate the cost of drunkenness for a particular tree without performing extensive calculations for every vertex as a starting point (these calculations can be performed by computer, using the algorithms of Section 5.2). However, some sub-families of trees are still relatively easy to deal with. Let us consider the $d$-regular rooted tree $T(d, k)$ of depth $k$. The root vertex, on level 0, has $d$ neighbours (children); vertices on levels 1 to $k-1$ have degree $d+1$ (one parent and $d$ children); leaves, on level $k$, have degree 1 (just one parent). There are $d^i$ vertices on level $i$, for a total of $(d^{k+1} - 1)/(d - 1)$ vertices. Due to the symmetry, the cop must start the game at the root. Since the drunk robber prefers to move toward the leaves, it is natural to expect that his behaviour is similar to that of the adversarial robber. Moreover, almost all vertices are located on levels $k - o(k)$, so the robber almost always starts on these vertices, which is clearly a good move. We show that the cost of drunkenness is as small as possible; that is, $\mathrm{dct}(T(d, k))$ tends to $\mathrm{ct}(T(d, k)) = k$ as $k \to \infty$.

Theorem 3.4.
$$k - O\left(\sqrt{k \log k}\right) \le \mathrm{dct}(T(d, k)) \le k.$$
In particular, $\mathrm{dct}(T(d, k)) = (1 + o(1)) k$, and the cost of drunkenness is
$$F(T(d, k)) = \frac{\mathrm{ct}(T(d, k))}{\mathrm{dct}(T(d, k))} = 1 + o(1).$$

Proof. Suppose that the drunk robber starts on level $i \ge k - \sqrt{k \log k}$. It follows from Lemma 2.2 that with probability $1 - O(k^{-1})$ he will be caught on level $k - O(\sqrt{k \log k})$.
(In fact, this is also true for $i \ge k/d$, since the robber moves toward the leaves at a higher rate, namely with probability $(d-1)/d$. However, the error arising from this part is negligible compared to the other error, so we stay with this obvious bound for $i$.) Therefore,
$$\mathrm{dct}(T(d, k)) \ge \sum_{i = k - \sqrt{k \log k}}^{k} \frac{d^i}{(d^{k+1} - 1)/(d - 1)} \left(k - O\left(\sqrt{k \log k}\right)\right) \left(1 - O(k^{-1})\right) = \left(1 - O\left(d^{-\sqrt{k \log k}}\right)\right) \left(k - O\left(\sqrt{k \log k}\right)\right) \left(1 - O(k^{-1})\right) = k - O\left(\sqrt{k \log k}\right),$$
which finishes the proof.

3.5. Grids. The Cartesian product of two graphs $G$ and $H$ is the graph with vertex set $V(G) \times V(H)$, with vertices $(u_1, v_1)$ and $(u_2, v_2)$ adjacent if either $u_1 = u_2$ and $v_1, v_2$ are adjacent in $H$, or $v_1 = v_2$ and $u_1, u_2$ are adjacent in $G$. We denote the Cartesian product of $G$ and $H$ by $G \square H$. In this subsection we will study the square grid $P_n \square P_n$. It is known that for any two trees $T_1, T_2$ we have $c(T_1 \square T_2) = 2$ [15]. The capture time of the Cartesian product of trees was recently studied in [14]; it was shown that for any two trees $T_1, T_2$ we have
$$\mathrm{ct}(T_1 \square T_2) = \left\lfloor \frac{D(T_1 \square T_2)}{2} \right\rfloor = \left\lfloor \frac{D(T_1) + D(T_2)}{2} \right\rfloor,$$
where $D = D(G)$ is the diameter of $G$. In particular, for the square grid we have $\mathrm{ct}(P_n \square P_n) = n - 1$. We will show that the cost of drunkenness for a grid is asymptotic to $8/3$.

Theorem 3.5.
$$\mathrm{dct}(P_n \square P_n) = (1 + o(1)) \frac{3}{8} n,$$
and the cost of drunkenness is $F(P_n \square P_n) = 8/3 + o(1)$.

Proof. Suppose that the drunk robber occupies an internal vertex $(u, v)$. The decision where to go from there can be made in the following way: toss a coin to decide whether to modify the first coordinate ($u$) or the second one ($v$); independently, toss another coin to decide whether to increase or decrease the value. Hence the robber moves with probability $1/4$ to one of the four neighbours of $(u, v)$.
Note that if we restrict ourselves to one dimension only (say, the North/South direction), we see the robber going North with probability 1/4, going South with the same probability, and staying in place with probability 1/2. In other words, the robber performs a lazy random walk on the path. Hence, both coordinates behave like lazy random walks on the integers (move with probability 1/2; do nothing otherwise). The same argument as in the previous proofs can be used to show that with probability, say, $1 - o(n^{-1})$, the robber stays within distance $O(\sqrt{n \log n}) = o(n)$ of his initial vertex. Hence, viewed from a 'large distance,' the drunk robber is not moving at all. Therefore, since we are interested in the asymptotic behaviour, the problem reduces to finding a set $S$ consisting of two vertices such that the average distance to $S$ is as small as possible; the cops should start on $S$ to achieve the best outcome. It is clear that, due to the symmetry of $P_n \square P_n$, there are two symmetric optimal configurations for the set $S$:
$$S = \{(n/2 + O(1),\ n/4 + O(1)),\ (n/2 + O(1),\ 3n/4 + O(1))\},$$
$$S = \{(n/4 + O(1),\ n/2 + O(1)),\ (3n/4 + O(1),\ n/2 + O(1))\}.$$
In either case, the average distance is
$$\frac{1}{n^2} \sum_{u=0}^{n-1} \sum_{v=0}^{n-1} \mathrm{dist}((u, v), S) = (1 + o(1)) \, 8n \int_{x=0}^{1/2} \int_{y=0}^{1/4} (x + y) \, dy \, dx = (1 + o(1)) \frac{3}{8} n.$$
The result follows.

4. The Cost of Drunkenness

In this section we show that the cost of drunkenness can be arbitrarily close to any real number $c \in [1, \infty)$. In order to do so, we introduce two families of graphs: barbells and lollipops.

4.1. Barbell. Let $n \in \mathbb{N}$ and $c \ge 0$.
The barbell $B(n, c)$ is the graph obtained from two complete graphs $K_{\lfloor cn \rfloor}$ connected by a path $P_n$ (that is, one end of the path belongs to the first clique, whereas the other end belongs to the second one). The number of vertices of $B(n, c)$ is $(1 + 2c)n + O(1)$, and $c(B(n, c)) = 1$. In order to catch the robber (either the adversarial or the drunk one), the cop should start at the center of the path and move toward the robber; $\mathrm{ct}(B(n, c)) = n/2 + O(1)$. This family can be used to get any ratio in $(1, 2]$.

Theorem 4.1. Let $c \ge 0$. Then,
$$\mathrm{dct}(B(n, c)) = (1 + o(1)) \frac{n}{2} \cdot \frac{1 + 4c}{2 + 4c},$$
and the cost of drunkenness is
$$F(B(n, c)) = \frac{\mathrm{ct}(B(n, c))}{\mathrm{dct}(B(n, c))} = 1 + \frac{1}{1 + 4c} + o(1).$$

Proof. The drunk robber starts in a clique with probability $(2c)/(1 + 2c) + o(1)$. If this is the case, capture occurs at time $n/2 + O(\sqrt{n \log n})$ with probability, say, $1 - o(n^{-1})$, by Lemma 2.3. If the robber chooses a starting vertex at distance $k$ from the cop, he is captured after $k + O(\sqrt{n \log n})$ steps, again with probability $1 - o(n^{-1})$. Hence the expected capture time is
$$(1 + o(1)) \left( \frac{2c}{1 + 2c} \cdot \frac{n}{2} + \frac{1}{1 + 2c} \cdot \frac{n}{4} \right) = (1 + o(1)) \frac{n}{2} \cdot \frac{1 + 4c}{2 + 4c}.$$
The theorem holds.

4.2. Lollipop. Let $n \in \mathbb{N}$ and $c \ge 0$. The lollipop $L(n, c)$ is the graph obtained from a complete graph $K_{\lfloor cn \rfloor}$ connected to a path $P_n$ (that is, one end of the path belongs to the clique). The number of vertices of $L(n, c)$ is $(1 + c)n + O(1)$, and the cop number $c(L(n, c))$ is 1. In order to catch the adversarial robber, the cop should start at the center of the path and move toward the robber; $\mathrm{ct}(L(n, c)) = n/2 + O(1)$. However, it is not clear what the optimal strategy against the drunk robber is: the larger the clique, the closer to the clique the cop should start the game.

Theorem 4.2. Let $c \ge 0$.
Then,
$$\mathrm{dct}(L(n, c)) = \begin{cases} (1 + o(1)) \frac{n}{4} \cdot \frac{(\sqrt{2} - 1 + c)(\sqrt{2} + 1 - c)}{1 + c}, & \text{for } c \in [0, 1], \\ (1 + o(1)) \frac{n}{2(1 + c)}, & \text{for } c > 1, \end{cases}$$
and the cost of drunkenness is
$$F(L(n, c)) = \frac{\mathrm{ct}(L(n, c))}{\mathrm{dct}(L(n, c))} = \begin{cases} \frac{2(1 + c)}{(\sqrt{2} - 1 + c)(\sqrt{2} + 1 - c)} + o(1), & \text{for } c \in [0, 1], \\ (1 + c) + o(1), & \text{for } c > 1. \end{cases}$$

Before we move to the proof of this result, let us mention that the cost of drunkenness, as a function of the parameter $c$, has an interesting behaviour. For $c = 0$ it is 2 (we play on the path), but then it decreases, hitting its minimum of $1 + \sqrt{2}/2$ at $c = \sqrt{2} - 1$. After that it increases back to 2 at $c = 1$, and it goes to infinity together with $c$. Therefore, this family can be used to get any ratio at least $1 + \sqrt{2}/2 \approx 1.71$.

Proof. Let the cop start at the vertex at distance $(1 + o(1)) bn$ from the clique ($b \in [0, 1]$ will be chosen to obtain the minimum expected capture time). The drunk robber starts in the clique with probability $c/(1 + c) + o(1)$. If this is the case, capture occurs at time $bn + O(\sqrt{n \log n})$ with probability, say, $1 - o(n^{-1})$, by Lemma 2.3. If the robber chooses a vertex at distance $k$ from the cop, then he is captured, again with probability $1 - o(n^{-1})$, after $k + O(\sqrt{n \log n})$ rounds. The robber starts between the cop and the clique with probability $b/(1 + c) + o(1)$, and on the other side with the remaining probability. Hence the expected capture time is equal to
$$(1 + o(1)) \left( \frac{c}{1 + c} \cdot bn + \frac{b}{1 + c} \cdot \frac{bn}{2} + \frac{1 - b}{1 + c} \cdot \frac{(1 - b) n}{2} \right) = (1 + o(1)) \frac{n}{1 + c} \left( b^2 + (c - 1) b + \frac{1}{2} \right).$$
The above expression is a function of $b$ (that is, of the starting vertex $v$ of the cop) and is minimized at $b = \max\{(1 - c)/2,\ 0\}$. The theorem holds.

It follows immediately from Theorems 3.4, 4.1, and 4.2 that the cost of drunkenness can be arbitrarily close to any constant $c \ge 1$.

Corollary 4.3.
For every real constant $c \ge 1$, there exists a sequence of graphs $(G_n)_{n \ge 1}$ such that
\[
\lim_{n \to \infty} F(G_n) = \lim_{n \to \infty} \frac{\mathrm{ct}(G_n)}{\mathrm{dct}(G_n)} = c.
\]

5. Computational Aspects

In this section we deal with computational aspects of the cop against drunk robber problem. Our analysis holds for any number of cops; that is, we no longer assume that $k = c(G)$.

5.1. Computing expected capture time for a given strategy. Suppose that we are given a graph and we fix a strategy before the game actually starts. We will now show how to explicitly compute the probability of capture at time $t \in \{0, 1, 2, \ldots\}$ as well as the expected capture time. Fixing a strategy in advance is the best one can do in the invisible robber case (see Section 6), but against a visible robber the cops should adjust their strategy based on the behaviour of the opponent; this will be treated in Subsection 5.2. However, the approach presented here is less demanding computationally and can be used to provide an upper bound for the optimal expected capture time.

Let $G = (V,E)$ be a connected graph with $V = \{0, 1, \ldots, n-1\}$. Letting $P_{i,j} = \Pr(Y_t = j \mid Y_{t-1} = i)$, we have
\[
P_{i,j} =
\begin{cases}
\frac{1}{|N(i)|} & \text{for } j \in N(i), \\
0 & \text{otherwise.}
\end{cases}
\]
Note that $P$ is the $n \times n$ transition probability matrix governing the robber's random walk on $G$ in the absence of cops. To account for capture by the cops, define a new state space $\overline{V} = V \cup \{n\}$, that is, the old state space augmented by the capture state $n$. The corresponding $(n+1) \times (n+1)$ transition matrix is
\[
\overline{P} =
\begin{pmatrix}
P & 0 \\
0 & 1
\end{pmatrix}.
\]
In the absence of cops, the robber performs a standard random walk on $G$ and never enters the capture state; if however he starts in the capture state, he remains there forever: $\overline{P}_{n,n} = 1$. In other words, the Markov chain governed by $\overline{P}$ contains two noncommunicating equivalence classes: $\{0, 1, \ldots
, n-1\}$ and $\{n\}$.

Suppose now that a single cop is located at vertex $x$. We will denote the corresponding transition probability matrix by $\overline{P}(x)$. Obviously, $\overline{P}(x) \ne \overline{P}$. The difference is caused by the possibility of capture, which can occur in two ways.

(i) At the $(t-1)$-th round the robber is located at $x$ and, in the first phase of the $t$-th round, the cop moves into $x$. Then the robber is captured, so $\overline{P}_{x,n}(x) = 1$ and $\overline{P}_{x,y}(x) = 0$ for $y \in V$.

(ii) At the $(t-1)$-th round the robber is located at $y \ne x$ and, in the second phase of the $t$-th round, he moves from $y$ to $x$. Hence the robber is captured with probability $P_{y,x}$. So, for all $y \in V - \{x\}$: $\overline{P}_{y,n}(x) = P_{y,x}$ and $\overline{P}_{y,x}(x) = 0$.

We can summarize the above by writing
\[
\overline{P}(x) =
\begin{pmatrix}
P(x) & p(x) \\
0 & 1
\end{pmatrix},
\]
where $P(x)$ has 0's in the $x$-th row and column and the corresponding probabilities have been moved into the $p(x)$ vector. For example, letting $G$ be the path with 5 nodes, the matrices $\overline{P}$ and $\overline{P}(2)$ are:
\[
\overline{P} =
\begin{pmatrix}
0 & 1 & 0 & 0 & 0 & 0 \\
1/2 & 0 & 1/2 & 0 & 0 & 0 \\
0 & 1/2 & 0 & 1/2 & 0 & 0 \\
0 & 0 & 1/2 & 0 & 1/2 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1
\end{pmatrix},
\qquad
\overline{P}(2) =
\begin{pmatrix}
0 & 1 & 0 & 0 & 0 & 0 \\
1/2 & 0 & 0 & 0 & 0 & 1/2 \\
0 & 0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 1/2 & 1/2 \\
0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1
\end{pmatrix}.
\]
Especially for the placement round of the game ($t = 0$) we need a different matrix, because the robber does not perform a random walk but simply chooses an initial position uniformly at random; if he chooses the vertex already occupied by the cop, then he is captured immediately. Hence, for this round the appropriate transition matrix is $\widehat{P}(x)$, which is the $(n+1) \times (n+1)$ identity matrix with the one of the $x$-th row moved to the $(n+1)$-th column.

Let $\pi_i(t) = \Pr(Y_t = i)$ for $i \in \overline{V}$ and $t \in \{0, 1, \ldots, s\}$, and $\pi(t) = (\pi_0(t), \pi_1(t), \ldots, \pi_n(t))$; also let $\widehat{\pi}(0) = \left( \frac{1}{n}, \frac{1}{n}, \ldots, \frac{1}{n}, 0 \right)$. Then, given a strategy $X = (x_0, x_1, \ldots
, x_s)$, the above formulation yields $\pi(0) = \widehat{\pi}(0)\,\widehat{P}(x_0)$ and, for $t \in \{1, 2, \ldots\}$, $\pi(t) = \pi(t-1)\,\overline{P}(x_t)$. This implies that $\pi(t) = \widehat{\pi}(0)\,\widehat{P}(x_0)\,\overline{P}(x_1)\,\overline{P}(x_2) \cdots \overline{P}(x_t)$.

To illustrate this, let us continue the example. Suppose a single cop enters the path and follows the strategy $X = (0, 1, 2, 3, 4)$ (start on one end of the path and move to the other one). Then we have
\[
\pi(0) = \widehat{\pi}(0)\,\widehat{P}(0) =
\begin{pmatrix} \tfrac{1}{5} & \tfrac{1}{5} & \tfrac{1}{5} & \tfrac{1}{5} & \tfrac{1}{5} & 0 \end{pmatrix}
\begin{pmatrix}
0 & 0 & 0 & 0 & 0 & 1 \\
0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 1
\end{pmatrix}
=
\begin{pmatrix} 0 & \tfrac{1}{5} & \tfrac{1}{5} & \tfrac{1}{5} & \tfrac{1}{5} & \tfrac{1}{5} \end{pmatrix},
\]
\[
\pi(1) = \pi(0)\,\overline{P}(1) =
\begin{pmatrix} 0 & \tfrac{1}{5} & \tfrac{1}{5} & \tfrac{1}{5} & \tfrac{1}{5} & \tfrac{1}{5} \end{pmatrix}
\begin{pmatrix}
0 & 0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 1/2 & 0 & 1/2 \\
0 & 0 & 1/2 & 0 & 1/2 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1
\end{pmatrix}
=
\begin{pmatrix} 0 & 0 & \tfrac{1}{10} & \tfrac{3}{10} & \tfrac{1}{10} & \tfrac{1}{2} \end{pmatrix},
\]
\[
\pi(2) = \pi(1)\,\overline{P}(2) =
\begin{pmatrix} 0 & 0 & \tfrac{1}{10} & \tfrac{3}{10} & \tfrac{1}{10} & \tfrac{1}{2} \end{pmatrix}
\begin{pmatrix}
0 & 1 & 0 & 0 & 0 & 0 \\
1/2 & 0 & 0 & 0 & 0 & 1/2 \\
0 & 0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 1/2 & 1/2 \\
0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1
\end{pmatrix}
=
\begin{pmatrix} 0 & 0 & 0 & \tfrac{1}{10} & \tfrac{3}{20} & \tfrac{3}{4} \end{pmatrix},
\]
\[
\pi(3) = \pi(2)\,\overline{P}(3) =
\begin{pmatrix} 0 & 0 & 0 & \tfrac{1}{10} & \tfrac{3}{20} & \tfrac{3}{4} \end{pmatrix}
\begin{pmatrix}
0 & 1 & 0 & 0 & 0 & 0 \\
1/2 & 0 & 1/2 & 0 & 0 & 0 \\
0 & 1/2 & 0 & 0 & 0 & 1/2 \\
0 & 0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 0 & 1
\end{pmatrix}
=
\begin{pmatrix} 0 & 0 & 0 & 0 & 0 & 1 \end{pmatrix}.
\]
The elements $\pi_n(t)$ give the probabilities $\Pr(Y_t = n)$ at time $t$, that is, the probabilities of capture in at most $t$ steps. The probabilities of capture exactly at time $t$ are then given by $\pi_n(t) - \pi_n(t-1)$. The expected capture time (conditional on strategy $X$ being used) is
\[
E T = \sum_{t=1}^{\infty} t \cdot \left( \pi_n(t) - \pi_n(t-1) \right).
\]
In the above example we have
\[
E T = 1 \cdot \left( \frac{1}{2} - \frac{1}{5} \right) + 2 \cdot \left( \frac{3}{4} - \frac{1}{2} \right) + 3 \cdot \left( 1 - \frac{3}{4} \right) = \frac{31}{20}.
\]
The approach can be generalized to more than one cop by letting $x = (x^1, x^2, \ldots, x^k)$ be a configuration of cops and defining $\overline{P}(x)$, $\widehat{P}(x)$ analogously to the one-cop case. Given that the cops follow the strategy $X = (X_1, X_2, \ldots, X_s)$, the transition probabilities of $Y$ satisfy $\Pr(Y_t = j \mid Y_{t-1} = i) = \overline{P}_{ij}(X_t)$ for $t \le s$.
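The computation above is easy to mechanize. The following sketch (our own code, not the authors' CopsRobber package; all function names are ours) builds $\overline{P}(x)$ and the placement matrix from the rules stated above, runs the vector recursion for a fixed strategy, and reproduces $E T = 31/20$ for the 5-vertex path with strategy $X = (0,1,2,3,4)$:

```python
from fractions import Fraction

def walk_matrix(adj, n):
    """Transition matrix of the robber's random walk, with capture state n."""
    P = [[Fraction(0)] * (n + 1) for _ in range(n + 1)]
    for i in range(n):
        for j in adj[i]:
            P[i][j] = Fraction(1, len(adj[i]))
    P[n][n] = Fraction(1)
    return P

def capture_matrix(adj, n, x):
    """P-bar(x): the random-walk matrix modified for a cop sitting at vertex x."""
    P = walk_matrix(adj, n)
    # (i) the cop moves onto the robber's vertex x: immediate capture
    for j in range(n):
        P[x][j] = Fraction(0)
    P[x][n] = Fraction(1)
    # (ii) the robber walks from y into the cop's vertex x
    for y in range(n):
        if y != x and P[y][x] > 0:
            P[y][n] += P[y][x]
            P[y][x] = Fraction(0)
    return P

def expected_capture_time(adj, n, X):
    """E[T] for a fixed cop strategy X = (x_0, ..., x_s)."""
    # placement round: the robber picks a uniform vertex; he is caught
    # at time 0 if it coincides with the cop's start x_0
    pi = [Fraction(1, n)] * n + [Fraction(0)]
    pi[n] += pi[X[0]]
    pi[X[0]] = Fraction(0)
    ET, prev = Fraction(0), pi[n]
    for t, x in enumerate(X[1:], start=1):
        P = capture_matrix(adj, n, x)
        pi = [sum(pi[i] * P[i][j] for i in range(n + 1)) for j in range(n + 1)]
        ET += t * (pi[n] - prev)
        prev = pi[n]
    return ET

# path with 5 vertices 0-1-2-3-4
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(expected_capture_time(adj, 5, [0, 1, 2, 3, 4]))  # 31/20
```

Exact rational arithmetic is used so that the result matches the hand computation above without floating-point noise.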
So the robber process is an inhomogeneous Markov chain, with the transitions controlled by the cops' actions. Markov chains of this type are called Markov Decision Processes (MDPs) or Controlled Markov Processes, where the control function is $X_t$; it is a (stochastic) control in the sense that it allows us to change the transition probabilities of $Y_t$.

We can use the MDP formulation to compute $E T$ for any given strategy $X$ in reasonable time. Computing the optimal strategy this way is not computationally viable; for example, with $|V| = n$ and $k$ cops there may exist up to $\Theta(n^{kt})$ strategies of length $t$ (and the same number of corresponding $E T$'s) to evaluate. In Subsection 5.2 we will present a computationally viable approach to compute a strategy that is arbitrarily close to the optimal one. MDPs were introduced in the book [10]; book-length treatments are [3, 17, 18, 20]; an online tutorial is [12]. They have been applied to a version of the cops and robber problem in [6].

5.2. Computing near-optimal strategies and minimum expected capture time. Let us now present an algorithm to compute $F(G) = \mathrm{ct}(G)/\mathrm{dct}(G)$ with arbitrarily good precision. Basically this reduces to computing $\mathrm{ct}(G)$ and a good approximation of $\mathrm{dct}(G)$, which can be done independently. To this end we present two algorithms, both of which have previously appeared in the literature. To improve the presentation we assign a name to each algorithm and make a few notational modifications; we also point out the similarity between the two algorithms (which apparently has not been noticed before).

(i) The CAAR (Cop Against Adversarial Robber) algorithm computes $\mathrm{ct}_{x,y}(G)$ for every initial cop/robber configuration $(x,y)$. In addition, CAAR computes the optimal cop and robber play for every $(x,y)$.
Capture time $\mathrm{ct}(G)$ is easily computed from $\mathrm{ct}(G) = \min_x \max_y \mathrm{ct}_{x,y}(G)$.

(ii) Similarly, the CADR (Cop Against Drunk Robber) algorithm computes (an arbitrarily good approximation of) $\mathrm{dct}_{x,y}(G)$ and the (near-)optimal cop play for every $(x,y)$; drunk capture time $\mathrm{dct}(G)$ is computed from $\mathrm{dct}(G) = \min_x \frac{1}{n} \sum_y \mathrm{dct}_{x,y}(G)$.

CAAR was introduced by Hahn and MacGillivray in [9]. We present the algorithm for the case of a single cop (the generalization to more than one cop is straightforward). Slightly changing notation, we will use $C_{x,y}$ to denote the game duration when the cop is located at $x$, the robber at $y$, and it is the cop's turn to move (in other words, $C_{x,y}$ equals $\mathrm{ct}_{x,y}(G)$). Similarly, $R_{x,y}$ denotes the game duration when it is the robber's turn to move. For both $C_{x,y}$ and $R_{x,y}$ we assume optimal play by both cop and robber. Let us also define
\[
\widehat{V}^2 = V \times V - \{(x,x) : x \in V\}
\]
(that is, $V^2$ excluding the diagonal) and, for all $x \in V$, let $N^+(x) = N(x) \cup \{x\}$ be the closed neighbourhood of $x$. CAAR consists of the following recursion (for $i = 1, 2, \ldots$):
\[
\forall (x,y) \in \widehat{V}^2: \quad R^{(i)}_{x,y} = \max_{y' \in N^+(y)} C^{(i-1)}_{x,y'}, \tag{1}
\]
\[
\forall (x,y) \in \widehat{V}^2: \quad C^{(i)}_{x,y} = 1 + \min_{x' \in N^+(x)} R^{(i)}_{x',y}. \tag{2}
\]
$C$ and $R$ are initialized with $C^{(0)}_{x,y} = R^{(0)}_{x,y} = \infty$ for all $x \ne y$; we take $C^{(i)}_{x,x} = R^{(i)}_{x,x} = 0$ for $i = 0, 1, 2, \ldots$. Then (1)-(2) is essentially equivalent to the version presented by Hahn and MacGillivray in [9], with just one difference, which we will now discuss. In (1)-(2) the matrix $C$ is computed iteratively: the $(i-1)$-th matrix $C^{(i-1)}$ is stored and used in the $i$-th iteration to compute $C^{(i)}$. In numerical analysis this is known as a Jacobi iteration. It is well known that an alternative approach to computations of this type is the Gauss-Seidel iteration.
In this iteration a single copy of $C$ is stored and its elements are updated "in place." In [9], Hahn and MacGillivray present the Jacobi version of CAAR and prove that the algorithm converges (in a finite number of steps) if and only if $c(G) = 1$. Hence CAAR computes the solution of the equations
\[
\forall (x,y) \in \widehat{V}^2: \quad R_{x,y} = \max_{y' \in N^+(y)} C_{x,y'}, \tag{3}
\]
\[
\forall (x,y) \in \widehat{V}^2: \quad C_{x,y} = 1 + \min_{x' \in N^+(x)} R_{x',y}, \tag{4}
\]
\[
\forall x \in V: \quad C_{x,x} = R_{x,x} = 0. \tag{5}
\]
The interpretation of the equations is the following. Equation (3) captures the property that from configuration $(x,y)$ the robber moves so as to maximize the length of the game; similarly, (4) describes the cop's goal of minimizing the game duration (since the cop moves in the first phase of each round, one time unit must be added to $\min R_{x',y}$); finally, (5) says that the game ends when cop and robber occupy the same vertex.

Extending the CAAR idea to the drunk robber game, let us now use $C_{x,y}$ to denote $\mathrm{dct}_{x,y}(G)$. In other words, $C_{x,y}$ (respectively, $R_{x,y}$) is the expected game duration after the cop's (respectively, robber's) move. Recall (see Subsection 5.1) that $P_{y,y'}(x)$ is the probability of the robber transiting from $y$ to $y'$, given that the cop is at $x$; note that $P(x)$ is a substochastic matrix. The analog of (1)-(2) is
\[
\forall (x,y) \in \widehat{V}^2: \quad R^{(i)}_{x,y} = \sum_{y' \in N(y)} P_{y,y'}(x) \, C^{(i-1)}_{x,y'}, \tag{6}
\]
\[
\forall (x,y) \in \widehat{V}^2: \quad C^{(i)}_{x,y} = 1 + \min_{x' \in N^+(x)} R^{(i)}_{x',y}, \tag{7}
\]
and the analog of (3)-(5) is
\[
\forall (x,y) \in \widehat{V}^2: \quad R_{x,y} = \sum_{y' \in N(y)} P_{y,y'}(x) \, C_{x,y'}, \tag{8}
\]
\[
\forall (x,y) \in \widehat{V}^2: \quad C_{x,y} = 1 + \min_{x' \in N^+(x)} R_{x',y}, \tag{9}
\]
\[
\forall x \in V: \quad C_{x,x} = R_{x,x} = 0. \tag{10}
\]
We want (6)-(7) to converge to the solution of (8)-(10). We will discuss convergence conditions (and initialization) presently. Actually, (6)-(7) can be simplified.
Since the drunk robber does not choose his moves, we can eliminate $R^{(i)}_{x,y}$ from (6)-(7) and obtain the CADR algorithm recursion:
\[
\forall (x,y) \in \widehat{V}^2: \quad C^{(i)}_{x,y} = 1 + \min_{x' \in N^+(x)} \sum_{y' \in N(y)} P_{y,y'}(x') \, C^{(i-1)}_{x',y'}. \tag{11}
\]
We have derived (11) from (6)-(7), which we see as an analog of (1)-(2). However, we will now show that (11) is a version of the value iteration algorithm, introduced and studied in the MDP literature [3, 17, 18, 20]. Consider a general MDP with state space $S$, action space $A$, transition matrix $Q$, and cost matrix $G(a)$ (that is, $G_{s,s'}(a)$ is the cost of the transition $s \to s'$ using action $a$). The state space satisfies $S = S_T \cup S_A$, where $S_T$ are the transient states and $S_A$ the absorbing ones; it is assumed that transitions after absorption have zero cost: $G_{s,s'}(a) = 0$ for $s, s' \in S_A$. Let $C_s$ be the expected total cost of the process starting from state $s$ and continuing until absorption. Then [18] $C$ satisfies the equations
\[
\forall s \in S_T: \quad C_s = \min_{a \in A} \left( G_{s,s'}(a) + \sum_{s' \in S_T} Q_{s,s'}(a) \, C_{s'} \right) \tag{12}
\]
and the solution of (12) can be obtained by the following value iteration:
\[
\forall s \in S_T: \quad C^{(i)}_s = \min_{a \in A} \left( G_{s,s'}(a) + \sum_{s' \in S_T} Q_{s,s'}(a) \, C^{(i-1)}_{s'} \right). \tag{13}
\]
To show that (13) can be reduced to (11), let us take $S_T = \widehat{V}^2$ and $A = V$; in other words, states $s = (x,y)$ are cop/robber configurations and actions $a = x'$ are new cop positions. Regarding move costs: (a) before capture every move has unit cost; (b) after capture only moves of the form $(x,x) \to (x,x)$ are possible and these have zero cost; in short,
\[
G_{(x,y),(x',y')}(x') =
\begin{cases}
1 & \text{if } x \ne y, \\
0 & \text{otherwise.}
\end{cases}
\]
Finally,
\[
Q_{(x,y),(x',y')}(a) =
\begin{cases}
P_{y,y'}(x') & \text{if } a = x' \in N^+(x) \text{ and } y' \in N(y), \\
0 & \text{otherwise.}
\end{cases}
\]
Using the above, it is easy to reduce (13) to (11).
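As a concrete illustration, the value-iteration form (11) takes only a few lines of code. The sketch below is our own minimal implementation (not the authors' package; names are ours). It runs CADR on the 5-vertex path and recovers $\mathrm{dct}$ by averaging over the robber's uniform starting vertex; starting the cop at the centre vertex 2 turns out to be optimal, since from there every robber position is caught within one round:

```python
def cadr(adj, iters=200):
    """Value iteration (11) for one cop against a drunk robber."""
    V = list(adj)
    n_plus = {v: set(adj[v]) | {v} for v in V}

    def p(y, yp, x):
        # substochastic walk P(x): a robber already at the cop's vertex is
        # caught, and a step onto the cop's vertex x also means capture
        if y == x or yp == x:
            return 0.0
        return 1.0 / len(adj[y]) if yp in adj[y] else 0.0

    C = {(x, y): 0.0 for x in V for y in V}   # any nonnegative start works
    for _ in range(iters):
        C = {(x, y): 0.0 if x == y else
             1.0 + min(sum(p(y, yp, xp) * C[xp, yp] for yp in adj[y])
                       for xp in n_plus[x])
             for x in V for y in V}
    return C

adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}  # path P5
C = cadr(adj)
dct = min(sum(C[x, y] for y in adj) / len(adj) for x in adj)
print(dct)  # 0.8, attained by starting the cop at vertex 2
```

The fixed number of iterations is a crude stopping rule for this sketch; a production version would iterate until successive $C$'s differ by less than a tolerance.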
The convergence of the CADR algorithm has been studied by several authors, in various degrees of generality [6, 10, 20]. A simple yet strong result, derived in [6], uses the concept of a proper strategy: a strategy is called proper if it yields finite expected capture time. It is proved in [6] that if a proper strategy exists for graph $G$, then the Gauss-Seidel version of CADR converges to the true $C$ for arbitrary $C^{(0)}$, provided $C^{(0)}_{x,y} \ge 0$ for all $(x,y) \in \widehat{V}^2$. As we have seen in Theorem 2.1, the cop has a proper strategy for every $G$. It can be proved that the Jacobi version of CADR also converges under the same conditions.

Now $F(G)$ can be computed easily. For every pair $(x,y)$, one can obtain a desired approximation of $\mathrm{ct}_{x,y}(G)$ and $\mathrm{dct}_{x,y}(G)$ by performing CAAR and CADR, respectively. Then
\[
F(G) = \frac{\mathrm{ct}(G)}{\mathrm{dct}(G)} = \frac{\min_{x \in V} \max_{y \in V} \mathrm{ct}_{x,y}(G)}{\min_{x \in V} \frac{1}{|V|} \sum_{y \in V} \mathrm{dct}_{x,y}(G)}.
\]
Both CAAR and CADR can be generalized to the case of $k$ cops, replacing $x$ by a $k$-tuple $x = (x_1, x_2, \ldots, x_k)$; however, the execution time of both algorithms increases exponentially with $k$, hence the algorithms are computationally viable only for small $k$. Also, CADR works for any transition probability matrix $P$, not just for random walks. Hence, if desired, we can compute the cost of drunkenness for any number of cops (not just for $k = c(G)$) and for non-uniform random walks (e.g., discrete-time birth-and-death processes) and other kinds of Markovian robbers.

Both CAAR and CADR can easily provide an optimal or near-optimal cop strategy in feedback form $U_{x,y}$, that is, the optimal cop move when the cop/robber configuration is $(x,y)$. This is achieved by recording a minimizing $x'$ in (4)/(11). The optimal robber strategy $W_{x,y}$ (for the adversarial robber) can be similarly obtained from CAAR.
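For comparison, the CAAR fixed point (3)-(5) can be reached by the same kind of sweep. A minimal Jacobi-style sketch (again our own code, assuming a cop-win graph so that the iteration terminates, as guaranteed by [9] when $c(G) = 1$):

```python
INF = float('inf')

def caar(adj):
    """Jacobi iteration (1)-(2) for one cop against an adversarial robber."""
    V = list(adj)
    n_plus = {v: set(adj[v]) | {v} for v in V}
    C = {(x, y): 0 if x == y else INF for x in V for y in V}
    while True:
        # robber's turn: maximize over his closed neighbourhood
        R = {(x, y): 0 if x == y else max(C[x, yp] for yp in n_plus[y])
             for x in V for y in V}
        # cop's turn: minimize, plus one time unit for the move
        C_new = {(x, y): 0 if x == y else 1 + min(R[xp, y] for xp in n_plus[x])
                 for x in V for y in V}
        if C_new == C:   # fixed point; reached in finitely many steps if c(G) = 1
            return C
        C = C_new

adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}  # path P5
C = caar(adj)
ct = min(max(C[x, y] for y in adj) for x in adj)
print(ct)  # 2: the cop starts at the centre and walks toward the robber
```

Combining a CAAR run with a CADR run then gives $F(G)$ by the displayed formula; on $P_5$, for instance, the two sketches yield $\mathrm{ct} = 2$ and $\mathrm{dct} = 0.8$.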
For every configuration $(x,y)$ there can be more than one optimal move, but they all yield the same (optimal) game duration.

We have implemented the CAAR and CADR algorithms in the Matlab package CopsRobber, which can be downloaded from [13]. We have used this package to perform a number of numerical experiments, some of which are presented in the technical report [21]. This report also contains a presentation of the algorithms in pseudocode and a discussion of various computational issues.

6. The Invisible Robber

In this section we present an introductory discussion of the cops and robber game when the robber is invisible; in other words, the cops do not know the robber's location unless he occupies the same vertex as one of the cops. All the other rules of the game remain the same. This version raises several interesting questions, a full study of which will be undertaken in a future paper.

Since the cops never see the robber until capture, they cannot use feedback strategies. In other words, the cop strategy is determined before the game starts. This does not mean that every cop move is predetermined, because in certain cases it makes sense for the cops to randomize their moves. Hence capture time will in general be a random variable, even in the case of an adversarial robber (who may also benefit from a randomized strategy).

Let us first examine the case of the adversarial invisible robber. It is clear that, given enough cops, the expected capture time will be finite. This is obviously true for $|V|$ cops, but in fact $c(G)$ cops suffice, as seen by the following theorem.

Theorem 6.1. Suppose that $c(G)$ cops perform a random walk on a connected graph $G$, starting from any initial position. The robber, playing perfectly, is trying to avoid being captured. Let the random variable $T$ be the capture time.
Then $E T < \infty$.

Proof. Let $G = (V,E)$ be any connected graph, and let $\Delta = \Delta(G)$ be the maximum degree of $G$. Put $k = c(G)$. For any configuration of cops $x \in V^k$ and any vertex $y \in V$ occupied by the robber, there exists a winning strategy $S_{x,y}$ that guarantees that the robber is caught after at most $t_{x,y}$ rounds. It is clear that the cops will follow $S_{x,y}$ with probability at least $(1/\Delta)^{k t_{x,y}}$. Now, let us define $\varepsilon = \min_{x \in V^k, y \in V} (1/\Delta)^{k t_{x,y}} = (1/\Delta)^{k T_0} > 0$, where $T_0 = \max_{x \in V^k, y \in V} t_{x,y}$. This implies that, regardless of the current position of the players at time $t$, the probability that the robber will be caught within at most $T_0$ further rounds is at least $\varepsilon$. Moreover, the corresponding events for times $t, t+T_0, t+2T_0, \ldots$ are mutually independent. Thus we get immediately that
\[
E T = \sum_{t \ge 0} \Pr(T > t) \le \sum_{t \ge 0} \Pr\!\left(T > \left\lfloor \frac{t}{T_0} \right\rfloor T_0\right) = \sum_{i \ge 0} T_0 \Pr(T > i T_0) \le T_0 \sum_{i \ge 0} (1-\varepsilon)^i = \frac{T_0}{\varepsilon} < \infty, \tag{14}
\]
and we are done.

Hence $c(G)$ is the minimum number of cops required to capture the adversarial invisible robber in finite expected time, since this task is at least as hard as capturing the adversarial visible robber. Of course, it will generally take longer to capture the invisible robber than in the visible robber case. Let us define $\mathrm{ict}_{x,y}(G,k)$ to be the expected capture time when the initial cops/robber configuration is $(x,y)$ and both the $k$ cops and the robber play optimally; we also define $\mathrm{ict}(G,k) = \min_{x \in V^k} \max_{y \in V} \mathrm{ict}_{x,y}(G,k)$ and, finally, $\mathrm{ict}(G) = \mathrm{ict}(G, c(G))$.

We now turn to the drunk invisible robber. He chooses his starting vertex uniformly at random and performs a random walk, as before. For a given starting position $x \in V^k$ of the $k$ cops, there is a strategy that yields the smallest expected capture time $\mathrm{idct}_x(G,k)$.
The cops have to minimize this quantity by selecting a good starting position:
\[
\mathrm{idct}(G,k) = \min_{x \in V^k} \mathrm{idct}_x(G,k).
\]
As usual, $\mathrm{idct}(G) = \mathrm{idct}(G, c(G))$, but it makes sense to consider any value of $k \ge 1$. The proof of the next theorem is exactly the same as that of Theorem 2.1 and so is omitted.

Theorem 6.2. $\mathrm{idct}(G,k) < \infty$ for any connected graph $G$ and $k \ge 1$.

Finally, the cost of drunkenness for the invisible robber game is
\[
F_i(G) = \frac{\mathrm{ict}(G)}{\mathrm{idct}(G)}.
\]
It follows from the last theorem that this graph parameter is well defined (that is, finite).

Let us make a few remarks regarding the invisible robber with "infinite" speed (actually, what we mean by this is an arbitrarily high speed). Let us define the cop number for this case by $c_\infty(G)$; it is the minimum number of cops that have a strategy achieving a finite expected capture time. It is clear that $c(G) \le c_\infty(G) \le s(G)$, where $s(G)$ is the search number of $G$, that is, the minimum number of cops required to clean the graph in the Graph Search (GS) game (mentioned in Section 1). We want to emphasize that the cops and robber game (with an invisible, infinite speed robber) is different from the GS game and, in particular, there are graphs for which $c_\infty(G) < s(G)$. For example, for the cycle $C_3$ we have $s(C_3) = 2$ but $c_\infty(C_3) = 1$; namely, one cop using a randomized strategy can capture the invisible, adversarial, infinite speed robber in time $T$ with $E T = 2$. Similarly, one cop on $K_{1,3}$, the star with 3 rays, can achieve $E T = 11/3$. Many other examples can be found. The main reason for the discrepancy between $c_\infty(G)$ and $s(G)$ is that in the GS game the fugitive is assumed omniscient and (under one interpretation) this means he knows in advance all the cop moves (until the end of the game).
In the cops and robber family of games, on the other hand, omniscience is not assumed, either explicitly or implicitly. We can summarize this in one phrase: clearing is harder than capturing, even against an infinite speed robber. We intend to further explore this issue, as well as the computation of optimal strategies for cops chasing an invisible adversarial robber, in a future publication.

We will finish this section with the computation of the cost of drunkenness for two examples (path and cycle) involving an invisible (unit speed) robber. In both cases the computation is possible because the optimal strategy (for both the cops and the adversarial robber) is "obvious." Our examples are similar to the ones we have considered for the visible robber, and proofs are omitted since they are almost identical to those of Section 3.

Consider the path $P_n$ again, with a single cop and an invisible robber. It is clear that the best strategy for the cop (regardless of whether he is playing against a perfect robber or a drunk one) is to start from one end of the path (say, from vertex 0) and move along the path until the robber is captured. We have $\mathrm{ict}(P_n) = n - 1$. When the cop is playing against a drunk robber, the expected capture time is roughly two times smaller.

Theorem 6.3. We have
\[
\frac{n}{2} \left( 1 - O\!\left( \frac{\log n}{n} \right) \right) \le \mathrm{idct}(P_n) \le \frac{n-1}{2}.
\]
In particular, $\mathrm{idct}(P_n) = (1+o(1)) \, n/2$ and the cost of drunkenness is
\[
F_i(P_n) = \frac{\mathrm{ict}(P_n)}{\mathrm{idct}(P_n)} = 2 + o(1).
\]

Let us now play the game with two cops and an invisible robber on the cycle $C_n$ for $n \ge 4$. It is not difficult to see that $s(C_n) = 2 = c(C_n)$. The best cop strategy is to start on vertices 1 and $n$; the cop occupying vertex 1 will move toward higher values, the other one will move in the opposite direction. The game ends after $\mathrm{ict}(C_n) = \lfloor (n-1)/2 \rfloor$ steps.
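The upper bound in Theorem 6.3 comes from the sweep strategy just described, and it is easy to probe by simulation. The sketch below (our own code; names are ours) plays the sweep against a drunk robber on $P_{101}$; the empirical mean capture time lands near $n/2$, consistent with the theorem:

```python
import random

def sweep_capture_time(n, rng):
    """One game on the path P_n: the cop sweeps 0, 1, ..., n-1;
    the drunk robber starts uniformly at random and walks."""
    robber = rng.randrange(n)         # uniform starting vertex
    if robber == 0:
        return 0                      # caught at the placement round
    for t in range(1, n):
        cop = t                       # cop's move (first phase of round t)
        if cop == robber:
            return t
        nbrs = [v for v in (robber - 1, robber + 1) if 0 <= v < n]
        robber = rng.choice(nbrs)     # robber's move (second phase)
        if cop == robber:
            return t
    return n  # never reached: the sweep corners the robber by round n-1

rng = random.Random(0)
n, trials = 101, 20000
est = sum(sweep_capture_time(n, rng) for _ in range(trials)) / trials
print(est)  # roughly n/2, i.e. about 50
```

The robber can never slip behind the sweeping cop on a path, so every game ends within $n-1$ rounds; the Monte Carlo average is only an estimate of the sweep's expected capture time, which in turn upper-bounds $\mathrm{idct}(P_n)$.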
When the cops are playing against a drunk robber, the expected capture time is roughly two times smaller.

Theorem 6.4. We have
\[
\frac{n}{4} \left( 1 - O\!\left( \frac{\log n}{n} \right) \right) \le \mathrm{idct}(C_n) \le \frac{n-1}{4}.
\]
In particular, $\mathrm{idct}(C_n) = (1+o(1)) \, n/4$ and the cost of drunkenness is
\[
F_i(C_n) = \frac{\mathrm{ict}(C_n)}{\mathrm{idct}(C_n)} = 2 + o(1).
\]

7. Conclusion

Most of the results in this paper pertain to the case of a visible (adversarial or drunk) robber pursued by $k = c(G)$ cops. The cases of arbitrary $k$ and of the invisible robber have only been briefly touched upon. We conclude the current paper by listing additional questions regarding the cost of drunkenness.

We begin by listing several questions related to the visible robber.

(i) Our analysis can be expanded to strategies which use an arbitrary number of cops. As shown in Theorem 2.1, even a single cop can catch a drunk robber in finite expected time. Hence, for a given $G$ we can study $\mathrm{dct}(G,k)$ as a function of $k$. Obviously this is a decreasing function; what more can be said about it? As a first step in this direction, the numerical approach of Section 5 can be used to explore the properties of $\mathrm{dct}(G,k)$ for a given graph $G$.

(ii) Let us define $\mathrm{dct}(G,X)$ to be the expected capture time in graph $G$ using strategy $X$; it is no longer assumed that $X$ is an optimal strategy. Under what conditions on $X$ and/or $G$ will $\mathrm{dct}(G,X)$ be finite? Can we use the approach of Section 5 to obtain non-trivial bounds on $\mathrm{dct}(G,X)$?

(iii) A related question is whether (for a specific $G$ and either optimal or general strategies) the expected capture time can be connected to some graph parameter such as treewidth, pathwidth, etc.

(iv) How robust are our results to slight (natural) modifications of the cops/robber game rules? For example, would the cost of drunkenness change if we allowed the robber to loop into his current location (that is, to perform a lazy random walk)?
What about a "general" random walk (that is, one with non-uniform transition probabilities)? What about directed graphs? Finally, does the situation change significantly if the cops and the robber move simultaneously rather than the cops moving first? The algorithm of Section 5 can be easily modified to handle these cases, and numerical experiments may be useful for an initial exploration.

One can try to obtain similar results for the invisible robber. In Section 6 we showed how our approach can be extended (at least for certain families of graphs) to this case. In the examples we examined (paths, cycles) the optimal cop strategy is obvious. For general graphs, finding the search strategy optimal against the invisible (adversarial or drunk) robber will be more complicated. Is there a (computationally viable, perhaps approximate) algorithm to achieve this?

Finally, let us note that all of the above analyses adopt the cops' point of view. It will be interesting to study the cost of drunkenness for the cops. In other words, assuming an adversarial evader and $k$ drunk cops, can we place bounds on the increase of the expected capture time as compared to the case of adversarial cops? Theorem 6.1 may be used as a starting point toward this goal.

References

[1] M. Aigner and M. Fromme, A game of cops and robbers, Discrete Applied Mathematics 8 (1984) 1–12.
[2] B. Alspach, Sweeping and searching in graphs: a brief survey, Matematiche 59 (2006) 5–37.
[3] D. Bertsekas and J. Tsitsiklis, Parallel and Distributed Computation, Addison-Wesley, 1989.
[4] A. Bonato and R. Nowakowski, The Game of Cops and Robbers on Graphs, AMS, 2011.
[5] D. Coppersmith, P. Tetali and P. Winkler, Collisions among random walks on a graph, SIAM J. Disc. Math. 6 (1993) 363–374.
[6] J.H. Eaton and L.A.
Zadeh, Optimal pursuit strategies in discrete-state probabilistic systems, Trans. ASME Ser. D, J. Basic Eng. 84 (1962) 23–29.
[7] F.V. Fomin and D. Thilikos, An annotated bibliography on guaranteed graph searching, Theoretical Computer Science 399 (2008) 236–245.
[8] G. Hahn, Cops, robbers and graphs, Tatra Mountain Mathematical Publications 36 (2007) 163–176.
[9] G. Hahn and G. MacGillivray, A note on k-cop, l-robber games on graphs, Discrete Mathematics 306 (2006) 2492–2497.
[10] R.A. Howard, Dynamic Programming and Markov Processes, MIT Press, 1960.
[11] S. Janson, T. Łuczak, and A. Ruciński, Random Graphs, Wiley, New York, 2000.
[12] L. Kallenberg, Markov Decision Processes, http://www.math.leidenuniv.nl/~kallenberg/Survey%20MDP.pdf
[13] Ath. Kehagias and P. Pralat, Cops and visible robbers, Technical Report, available at http://users.auth.gr/~kehagiat/GraphSearch/TRCODvis.pdf
[14] A. Mehrabian, The capture time of grids, Discrete Math. 311 (2011) 102–105.
[15] S. Neufeld and R. Nowakowski, A game of cops and robbers played on products of graphs, Discrete Mathematics 186 (1998) 253–268.
[16] R. Nowakowski and P. Winkler, Vertex to vertex pursuit in a graph, Discrete Mathematics 43 (1983) 230–239.
[17] R. Pallu de la Barriere, Optimal Control Theory, Dover, 1980.
[18] M.L. Puterman, Markov Decision Processes, Wiley, 1994.
[19] A. Quilliot, Jeux et pointes fixes sur les graphes, Ph.D. Dissertation, Université de Paris VI, 1978.
[20] D.J. White, Markov Decision Processes, Wiley, 1993.
[21] http://users.auth.gr/~kehagiat/GraphSearch/

Department of Mathematics, Physics and Computer Sciences, Aristotle University of Thessaloniki, Thessaloniki GR54124, Greece
E-mail address: kehagiat@auth.gr

Department of Mathematics, Ryerson University, Toronto, ON, Canada, M5B 2K3
E-mail address: pralat@ryerson.ca