An expected-case sub-cubic solution to the all-pairs shortest path problem in R

Julian J. McAuley∗ and Tibério S. Caetano

November 1, 2018

Abstract

It has been shown by Alon et al. that the so-called ‘all-pairs shortest-path’ problem can be solved in $O((MV)^{2.688} \log^3(V))$ for graphs with $V$ vertices, with integer distances bounded by $M$. We solve the more general problem for graphs in $\mathbb{R}$ (assuming no negative cycles), with expected-case running time $O(V^{2.5} \log(V))$. While our result appears to violate the $\Omega(V^3)$ requirement of “Funny Matrix Multiplication” (due to Kerr), we find that it has a sub-cubic expected time solution subject to reasonable conditions on the data distribution. The expected time solution arises when certain sub-problems are uncorrelated, though we can do better/worse than the expected-case under positive/negative correlation (respectively). Whether we observe positive/negative correlation depends on the statistics of the graph in question. In practice, our algorithm is significantly faster than Floyd-Warshall, even for dense graphs.

1 Problem Definition

The all-pairs shortest path problem [Dijkstra, 1959] consists of solving

$$d(v, v') = \min_{p \in \mathcal{P}_{v,v'}} f(p) \tag{1}$$

for all vertices $v, v' \in V$, where $\mathcal{P}_{v,v'}$ is the space of all paths connecting $v$ to $v'$ in $V$, and $f(p)$ is the path length, i.e., $f(p) = \sum_{i=1}^{|p|-1} e(p_i, p_{i+1})$, where $e(p_i, p_j)$ is the weight of the edge connecting $p_i$ to $p_j$, or $\infty$ if no such edge exists.

A simple divide-and-conquer solution to (eq. 1) can be obtained by defining $d(u, v, k)$ to be the shortest path between $u$ and $v$ containing at most $k$ edges.
This solution exploits the fact that

$$d(u, v, k) = \begin{cases} e(u, v) & \text{if } k = 1 \\ \min_x \left( d(u, x, k/2) + d(x, v, k/2) \right) & \text{otherwise} \end{cases} \tag{2}$$

This allows us to solve the all-pairs shortest path problem via Algorithm 1, which requires $\Theta(V^3 \log(V))$ time (this is by no means the optimal solution, though it is this version to which our improvements apply).

∗ The authors are with the Statistical Machine Learning Program at NICTA, and the Research School of Information Sciences and Engineering, Australian National University. Queries should be addressed to julian.mcauley@nicta.com.au.

Algorithm 1 All-pairs shortest-path problem
Input: a graph V
     1: for u ∈ V do
     2:   for v ∈ V do
     3:     d(u, v, 0) := e(u, v)
     4:   end for
     5: end for
     6: for i ∈ {1 … ⌈log V⌉} do  {k = 2^i}
     7:   for u ∈ V do
     8:     for v ∈ V do
     9:       d(u, v, i) := min_x (d(u, x, i−1) + d(x, v, i−1))  {Θ(V)}
    10:     end for
    11:   end for  {Θ(V^3)}
    12: end for  {Θ(V^3 log(V))}

Algorithm 1, Line 9 requires that we solve a problem of the form

$$\Phi(a, b) = \min_x \underbrace{\Psi_1(a, x)}_{v_a} + \underbrace{\Psi_2(b, x)}_{v_b}. \tag{3}$$

Although this appears to be a linear-time operation (in $V$), we note that it can be reduced to $O(\sqrt{V})$ (in the expected-case) if we know the permutations that sort $v_a$ and $v_b$. The sorted values of $v_b$ will be reused for every value of $a$, and likewise the sorted values of $v_a$ will be reused for every value of $b$.

Lines 7–9 of Algorithm 1 are sometimes referred to as the “Funny Matrix Multiplication” problem: replacing $(\min, +)$ with $(+, \times)$ yields the traditional version of matrix multiplication. Kerr [Kerr, 1970] showed that it is $\Omega(V^3)$ if only the operations min and + are allowed. We find that under reasonable conditions on $v_a$ and $v_b$, an expected-case sub-cubic solution exists, requiring only min and +.

2 Our Approach

The following elementary lemma is the key observation required in order to solve (eq. 3) efficiently:

Lemma 1. If the $p$th smallest element of $v_a$ has the same index as the $q$th smallest element of $v_b$, then we only need to search through the $p$ smallest values of $v_a$, and the $q$ smallest values of $v_b$; any values ‘behind’ these cannot possibly contain the smallest solution.

(Any index that comes after position $p$ in sorted $v_a$ and after position $q$ in sorted $v_b$ has both of its terms at least as large as those of the shared index, so its sum cannot be smaller.)

This observation is used to construct Algorithm 2. Here we iterate through the indices starting from the smallest values of $v_a$ and $v_b$, stopping once both indices are ‘behind’ the minimum value found so far (which we then know is the minimum). This algorithm is demonstrated pictorially in Figure 1. An upper-bound on the expected-case running time of Algorithm 2 is given by the following theorem:

Theorem 2. The expected running time of Algorithm 2 is $O(\sqrt{V})$.
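For reference, the divide-and-conquer baseline of Algorithm 1 can be sketched in Python as follows. This is an illustrative sketch, not the authors' implementation: the function names `min_plus_square` and `apsp_repeated_squaring` are hypothetical, the graph is assumed to be a dense V × V list of lists with `math.inf` marking missing edges, and the diagonal is assumed to be zero so that paths with fewer than k edges are covered by the "at most k edges" recursion.

```python
import math

def min_plus_square(d):
    """One pass of Lines 7-11 of Algorithm 1.

    Returns d2 with d2[u][v] = min over x of d[u][x] + d[x][v],
    i.e. one (min, +) matrix product of d with itself.
    """
    n = len(d)
    return [[min(d[u][x] + d[x][v] for x in range(n)) for v in range(n)]
            for u in range(n)]

def apsp_repeated_squaring(e):
    """All-pairs shortest paths via ceil(log2 V) min-plus squarings.

    e: V x V edge-weight matrix, math.inf for missing edges.
    Assumption (not stated in the pseudocode): e[u][u] == 0.
    """
    n = len(e)
    d = [row[:] for row in e]
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 1.
    for _ in range((n - 1).bit_length()):
        d = min_plus_square(d)
    return d

if __name__ == "__main__":
    INF = math.inf
    example = [[0, 3, INF, 7],
               [8, 0, 2, INF],
               [5, INF, 0, 1],
               [2, INF, INF, 0]]
    for row in apsp_repeated_squaring(example):
        print(row)
```

Each call to `min_plus_square` costs Θ(V³), and it is exactly this inner minimisation that the sorted search described next is intended to speed up.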
Algorithm 2 Find i such that v_a[i] + v_b[i] is minimised
Input: two vectors v_a and v_b, and permutation functions p_a and p_b that sort them in increasing order (so that v_a[p_a[1]] is the smallest element in v_a)
     1: Initialize: start := 1, end_a := p_a^{-1}[p_b[1]], end_b := p_b^{-1}[p_a[1]]  {if end_b = k, then the smallest element in v_a has the same index as the kth smallest element in v_b}
     2: best := p_a[1], min := v_a[best] + v_b[best]
     3: if v_a[p_b[1]] + v_b[p_b[1]] < min then
     4:   best := p_b[1], min := v_a[best] + v_b[best]
     5: end if
     6: while start < end_a do
     7:   start := start + 1
     8:   if v_a[p_a[start]] + v_b[p_a[start]] < min then
     9:     best := p_a[start]
    10:     min := v_a[best] + v_b[best]
    11:   end if
    12:   if p_b^{-1}[p_a[start]] < end_b then
    13:     end_b := p_b^{-1}[p_a[start]]
    14:   end if
    15:   {repeat Lines 8–14, interchanging a and b}
    16: end while  {this takes expected time O(√V)}
    17: Return: best

The expected-case running time arises under the assumption that $v_a$ and $v_b$ are uncorrelated. The running time approaches $O(1)$ as $v_a$ and $v_b$ become increasingly correlated, and it approaches $O(V)$ as $v_a$ and $v_b$ become increasingly anti-correlated. Algorithm 2 shall be analysed in detail in Section 3.

Using Algorithm 2, we can solve the all-pairs shortest path problem in $O(V^{2.5} \log(V))$ in the expected-case, for graphs with edge-weights in $\mathbb{R}$ with no negative cycles. This is shown in Algorithm 3. For dense graphs, our method has worst-case performance $\Theta(V^3 \log(V))$, and best-case performance $\Theta(V^2 \log^2(V))$. Our algorithm requires $\Theta(V^2 \log(V))$ memory.
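A minimal Python sketch of Algorithm 2 follows, using 0-based indices. The helper names (`argsort`, `inverse_permutation`, `argmin_sum`) are illustrative and do not come from the paper. The inverse permutations are passed in explicitly as rank arrays `ra` and `rb`, on the assumption that they are precomputed alongside the sorting permutations.

```python
def argsort(values):
    """Indices that sort `values` in increasing order (the permutation p)."""
    return sorted(range(len(values)), key=values.__getitem__)

def inverse_permutation(perm):
    """rank[i] = position of index i in `perm` (the inverse permutation)."""
    rank = [0] * len(perm)
    for position, index in enumerate(perm):
        rank[index] = position
    return rank

def argmin_sum(va, vb, pa, pb, ra, rb):
    """Return an index i minimising va[i] + vb[i] (Algorithm 2, 0-based).

    pa, pb sort va, vb in increasing order; ra, rb are their inverses.
    """
    start = 0
    end_a = ra[pb[0]]   # rank in va of the index holding vb's smallest value
    end_b = rb[pa[0]]   # rank in vb of the index holding va's smallest value
    best = pa[0]
    best_val = va[best] + vb[best]
    if va[pb[0]] + vb[pb[0]] < best_val:
        best = pb[0]
        best_val = va[best] + vb[best]
    while start < end_a:
        start += 1
        # Candidate taken from the sorted order of va (Lines 8-14).
        i = pa[start]
        if va[i] + vb[i] < best_val:
            best, best_val = i, va[i] + vb[i]
        if rb[i] < end_b:
            end_b = rb[i]
        # Same steps with the roles of a and b interchanged (Line 15).
        j = pb[start]
        if va[j] + vb[j] < best_val:
            best, best_val = j, va[j] + vb[j]
        if ra[j] < end_a:
            end_a = ra[j]
    return best

if __name__ == "__main__":
    va = [2, 4, 72, 87, 8, 28, 12, 85]
    vb = [32, 93, 25, 4, 42, 72, 18, 31]
    pa, pb = argsort(va), argsort(vb)
    i = argmin_sum(va, vb, pa, pb,
                   inverse_permutation(pa), inverse_permutation(pb))
    print(i, va[i] + vb[i])
```

As in the pseudocode, `end_b` is maintained but only `end_a` is tested: once every position up to `start` has been read in both lists, `start >= end_a` holds exactly when `start >= end_b` does, so testing one of them suffices.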
Also note that Algorithm 2 can exploit sparsity in the graph structure: the algorithm may terminate as soon as it reaches entries with infinite weight. Thus if only $f(V)$ edges are viable, our algorithm has worst-case performance $O(V^2 f(V) \log(V))$ (meaning that it does not surpass Johnson’s Algorithm on sparse graphs [Johnson, 1977]).

2.1 Comparison to Existing Approaches

To our knowledge, the only existing sub-cubic approach is due to [Alon et al., 1997] (for edge weights taking small integer values); our algorithm shall not surpass this per se, as its running time is not deterministic: it depends on the distribution of the edge weights, and it is certainly possible to adversarially generate graphs yielding worst-case performance. Our algorithm has best-case and worst-case performance of $\Theta(V^2 \log^2(V))$ and $\Theta(V^3 \log(V))$ respectively; thus it does not surpass Floyd-Warshall on dense graphs in the worst-case. Unlike Floyd-Warshall it is able to exploit graph sparsity, though it does not have better worst-case performance than Johnson’s Algorithm. In short, our algorithm does not improve upon existing solutions in the worst-case, though under reasonable conditions, it has lower complexity than existing algorithms. We shall see in Section 4 that our algorithm is significantly faster than Floyd-Warshall in practice, making it a viable solution to real-world all-pairs shortest path problems, despite its lack of worst-case guarantees.

[Figure 1 graphic: the two unsorted lists (left) and four panels, labelled start = 1 to start = 4, over the sorted lists (right).]

Figure 1: Left: The lists $v_a$ and $v_b$ before sorting. Right: Black squares show corresponding elements in the sorted lists ($v_a[p_a[i]]$ and $v_b[p_b[i]]$); red squares indicate the elements currently being read ($v_a[p_a[start]]$ and $v_b[p_b[start]]$). We can imagine expanding a gray box of size $start \times start$ until it contains an entry; note that the minimum is found during the first step.

Algorithm 3 All-pairs shortest-path problem in expected-case O(V^2.5 log(V))
Input: a graph V
     1: for u ∈ V do
     2:   for v ∈ V do
     3:     d(u, v, 0) := e(u, v)
     4:   end for
     5: end for
     6: for i ∈ {1 … ⌈log V⌉} do  {k = 2^i}
     7:   for u ∈ V do
     8:     p_a(u) := permutation that sorts d(u, x, i−1)
     9:     p_b(u) := permutation that sorts d(x, u, i−1)  {Θ(V log(V))}
    10:   end for  {Θ(V^2 log(V))}
    11:   for u ∈ V do
    12:     for v ∈ V do
    13:       y := Alg2(d(u, x, i−1), d(x, v, i−1), p_a(u), p_b(v))  {O(√V)}
    14:       d(u, v, i) := d(u, y, i−1) + d(y, v, i−1)
    15:     end for
    16:   end for  {O(V^2 √V)}
    17: end for  {O(V^2 √V log(V))}

3 Asymptotic Performance of Algorithm 2

In this section we shall determine the expected-case running time of Algorithm 2. Algorithm 2 traverses $v_a$ and $v_b$ until it reaches the smallest value of $m$ for which there is some $j \le m$ for which $m \ge p_b^{-1}[p_a[j]]$. If $M$ is a random variable representing this smallest value of $m$, then we wish to find $E(M)$. By representing a permutation of the digits 1 to $V$ as shown in Figure 2, we observe that $m$ is simply the width of the smallest square (expanding from the top left) that includes an element of the permutation (i.e., it includes $i$ and $p[i]$). Simple analysis reveals that the probability of choosing a permutation that does not contain a value inside a square of size $m$ is

$$P(M > m) = \frac{(V - m)!\,(V - m)!}{(V - 2m)!\,V!}. \tag{4}$$

This is precisely $1 - F(m)$, where $F(m)$ is the cumulative distribution function of $M$. It is immediately clear that $1 \le M \le \lfloor V/2 \rfloor$, which defines the best and worst-case performance of Algorithm 2. Using the identity $E(X) = \sum_{x=1}^{\infty} P(X \ge x)$, we can write down a formula for the expected value of $M$:

$$E(M) = \sum_{m=0}^{\lfloor V/2 \rfloor} \frac{(V - m)!\,(V - m)!}{(V - 2m)!\,V!}. \tag{5}$$

Thus the expected-case running time of our all-pairs shortest path solver (assuming uncorrelated sub-problems) is $\Theta(V^2 E(M) \log(V))$. We show in the following section that $E(M) \in O(\sqrt{V})$.

[Figure 2 graphic: three panels (a), (b), (c) showing a $V \times V$ array with a box of side $m$ and a shaded region of width $f(V)$ and height $m$.]

Figure 2: (a) A permutation can be represented as an array, where there is exactly one non-zero entry in each row and column; (b) We want to find the smallest value of $m$ such that the grey box includes a non-zero entry; (c) For the sake of establishing an upper-bound, we consider a shaded region of width $f(V)$ and height $m$.

3.1 An Upper Bound on E(M)

Although (eq. 5) precisely defines the running time of Algorithm 2, it is not easy to ascertain the speed improvement it achieves, as the value to which the summation converges for large $V$ is not obvious. Here, we shall try to obtain an upper-bound on this quantity, which we shall assess experimentally in Section 4. In doing so we shall prove Theorem 2.

Proof of Theorem 2. Consider the shaded region in Figure 2 (c). This region has a width of $f(V)$, and its height $m$ is chosen such that it contains precisely one non-zero value. Let $\dot{M}$ be a random variable representing the height of the grey region needed in order to include a non-zero entry. We note that

$$E(\dot{M}) \in O(f(V)) \rightarrow E(M) \in O(f(V)); \tag{6}$$

our aim is to find the smallest $f(V)$ such that $E(\dot{M}) \in O(f(V))$. The probability that none of the first $m$ samples appear in the shaded region is

$$P(\dot{M} > m) = \prod_{i=0}^{m} \left( 1 - \frac{f(V)}{V - i} \right). \tag{7}$$

Next we observe that if the entries in our $V \times V$ grid do not define a permutation, but we instead choose a random entry in each row, then the probability (now for $\ddot{M}$) becomes

$$P(\ddot{M} > m) = \left( 1 - \frac{f(V)}{V} \right)^m \tag{8}$$

(for simplicity we allow $m$ to take arbitrarily large values). We certainly have that $P(\ddot{M} > m) \ge P(\dot{M} > m)$, meaning that $E(\ddot{M})$ is an upper bound on $E(\dot{M})$, and therefore on $E(M)$. Thus we compute the expected value

$$E(\ddot{M}) = \sum_{m=0}^{\infty} \left( 1 - \frac{f(V)}{V} \right)^m. \tag{9}$$

This is just a geometric progression, which sums to $V / f(V)$. Thus we need to find $f(V)$ such that

$$f(V) \in O\left( \frac{V}{f(V)} \right). \tag{10}$$

Clearly $f(V) \in O(\sqrt{V})$ will do. Thus we conclude that

$$E(M) \in O(\sqrt{V}). \tag{11}$$

We will show that this upper bound is empirically tight in the following section.

4 Experiments

4.1 Performance of Algorithm 2

For our first experiment, we compare the performance of Algorithm 2 to the naïve linear time solution. We generate $2V$ uniform samples from $[0, 1)$ to obtain the lists $v_a$ and $v_b$. $V$ corresponds to the size of the graph in question.

The performance of Algorithm 2 is shown in Figure 3; the value reported is simply the value of start upon termination of the algorithm; this is compared to $V$ itself, which is the number of elements read by the naïve solution. The upper-bounds we obtained in the previous section are also reported, as is the true expected performance (i.e., (eq. 5)). Visually, we find that our upper-bound is empirically very close to the true performance, suggesting that the bound is reasonably tight.
Figure 3: Performance of our algorithm and bounds. For K = 2, the exact expectation is shown, which appears to precisely match the average performance (over 100 trials). The dotted lines show the upper-bound, which appears to be extremely close to the average performance, indicating that the bound is reasonably tight.

4.2 Performance for Correlated Variables

The expected-case running time of our algorithm was obtained under the assumption that the variables were uncorrelated, as was the case for the previous experiment. We suggested that we will obtain worse performance in the case of negatively correlated variables, and better performance in the case of positively correlated variables; we will assess these claims in this experiment.

We report the performance for two lists (i.e., for Algorithm 2), whose values are sampled from a 2-dimensional Gaussian, with covariance matrix

$$\Sigma = \begin{pmatrix} 1 & c \\ c & 1 \end{pmatrix}, \tag{12}$$

meaning that the two lists are correlated with correlation coefficient $c$. Performance is shown in Figure 4 for different values of $c$ ($c = 0$ is not shown, as this is the case observed in the previous experiment).

In real graphs, $c$ shall be the correlation coefficient between $d(u, x, i-1)$ and $d(x, v, i-1)$ (taken over $x$). Unless $c$ is equal to precisely $-1$ for all $u$, $v$, and $i$, we obtain a sub-cubic solution. Whether we observe positive, negative, or zero correlation will depend on the statistics of the graphs in question.

4.3 Performance of Algorithm 3

Finally, we compare our algorithm to the divide-and-conquer solution of Algorithm 1, and to the popular Floyd-Warshall Algorithm [Floyd, 1962] on dense graphs in $\mathbb{R}^+$.

We generate dense graphs of size $V$ with edge weights sampled uniformly in $[0, 1)$. The performance of our algorithm, compared to Algorithm 1 and the Floyd-Warshall Algorithm, is shown in Figure 5.
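A compact Python sketch of Algorithm 3, cross-checked against Floyd-Warshall on a random dense graph of the kind described above, might look as follows. Everything here is an assumption made for illustration rather than the authors' code: the function names are hypothetical, the diagonal of the weight matrix is taken to be zero, and the inner search is a condensed restatement of Algorithm 2 that tracks a single stopping index `end` (the smallest value of max(rank in $v_a$, rank in $v_b$) seen so far), which stops at the same position as the two-variable form. The early exit on infinite weights mentioned for sparse graphs is not implemented.

```python
import math
import random

def argsort(values):
    return sorted(range(len(values)), key=values.__getitem__)

def ranks(perm):
    inv = [0] * len(perm)
    for rank, index in enumerate(perm):
        inv[index] = rank
    return inv

def argmin_sum(va, vb, pa, pb, ra, rb):
    """Condensed Algorithm 2 (0-based): index minimising va[i] + vb[i]."""
    best = pa[0]
    best_val = va[best] + vb[best]
    end = len(va) - 1
    s = 0
    while True:
        for i in (pa[s], pb[s]):          # read position s of both sorted lists
            val = va[i] + vb[i]
            if val < best_val:
                best, best_val = i, val
            end = min(end, max(ra[i], rb[i]))
        if s >= end:                      # every unread index is dominated
            return best
        s += 1

def apsp_sorted_search(e):
    """Algorithm 3: repeated min-plus squaring with sorted search."""
    n = len(e)
    d = [row[:] for row in e]
    for _ in range((n - 1).bit_length()):       # ceil(log2 V) rounds
        cols = [list(c) for c in zip(*d)]       # cols[v][x] = d[x][v]
        pa = [argsort(r) for r in d]
        pb = [argsort(c) for c in cols]
        ra = [ranks(p) for p in pa]
        rb = [ranks(p) for p in pb]
        nxt = []
        for u in range(n):
            row = []
            for v in range(n):
                y = argmin_sum(d[u], cols[v], pa[u], pb[v], ra[u], rb[v])
                row.append(d[u][y] + cols[v][y])
            nxt.append(row)
        d = nxt
    return d

def floyd_warshall(e):
    n = len(e)
    d = [row[:] for row in e]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d

def random_dense_graph(n, rng):
    """Dense graph with weights uniform in [0, 1) and a zero diagonal."""
    return [[0.0 if u == v else rng.random() for v in range(n)]
            for u in range(n)]

if __name__ == "__main__":
    g = random_dense_graph(30, random.Random(0))
    a, b = apsp_sorted_search(g), floyd_warshall(g)
    print(all(math.isclose(x, y, abs_tol=1e-9)
              for ra_, rb_ in zip(a, b) for x, y in zip(ra_, rb_)))
```

The comparison uses a small tolerance because the two methods add floating-point weights in different orders. Timing either function is left out; pure-Python constant factors would not necessarily reproduce the crossover points reported in the paper.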
We note that our algorithm is faster than Algorithm 1 after only $V = 4$, meaning that its computational overhead is negligible. It is faster than Floyd-Warshall after $V \simeq 90$.

Figure 4: Performance of our algorithm for different correlation coefficients. The top three plots show positive correlation, the bottom three show negative correlation. Correlation coefficients of $c = 1.0$ and $c = -1.0$ capture precisely the best and worst-case performance (respectively) of our algorithm.

Figure 5: The running time of our algorithm compared to the divide-and-conquer solution of Algorithm 1, and the Floyd-Warshall Algorithm. The average of 10 trials is shown. All algorithms were implemented in Python.

4.4 Conclusion

We have presented an expected-case subcubic solution to the problem of Funny Matrix Multiplication, resulting in an expected-case $O(V^{2.5} \log(V))$ solution to the all-pairs shortest path problem. The running time of our method depends on the distribution of edge weights for the graph in question, though we achieve performance at least as good as the expectation under reasonable conditions. Our algorithm is significantly faster than Floyd-Warshall in practice, making it a viable solution to real-world all-pairs shortest path problems.

Acknowledgements

We would like to thank Pedro Felzenszwalb for alerting us to the link between inference in graphical models and the all-pairs shortest path problem. NICTA is funded by the Australian Government’s Backing Australia’s Ability initiative, and the Australian Research Council’s ICT Centre of Excellence program.

References

[Alon et al., 1997] Alon, N., Galil, Z., and Margalit, O. (1997). On the exponent of the all pairs shortest path problem. Journal of Computer and System Sciences, 54(2):255–262.

[Dijkstra, 1959] Dijkstra, E. W. (1959). A note on two problems in connexion with graphs. Numerische Mathematik, 1(1):269–271.

[Floyd, 1962] Floyd, R. W. (1962). Algorithm 97: Shortest path. Commun. ACM, 5(6):345.

[Johnson, 1977] Johnson, D. B. (1977). Efficient algorithms for shortest paths in sparse networks. J. ACM, 24(1):1–13.

[Kerr, 1970] Kerr, L. R. (1970). PhD Thesis.
