Algorithmic Techniques for Several Optimization Problems Regarding Distributed Systems with Tree Topologies

As the development of distributed systems progresses, more and more challenges arise and the need for developing optimized systems and for optimizing existing systems from multiple perspectives becomes more stringent. In this paper I present novel al…

Authors: Mugurel Ionut Andreica

Mugurel Ionuţ Andreica
Politehnica University of Bucharest, Computer Science Department, Romania
mugurel.andreica@cs.pub.ro

As the development of distributed systems progresses, more and more challenges arise and the need for developing optimized systems and for optimizing existing systems from multiple perspectives becomes more stringent. In this paper I present novel algorithmic techniques for solving several optimization problems regarding distributed systems with tree topologies. I address topics like: reliability improvement, partitioning, coloring, content delivery, optimal matchings, as well as some tree counting aspects. Some of the presented techniques are only of theoretical interest, while others can be used in practical settings.

1 Introduction

Distributed systems are being increasingly developed and deployed all around the world, because they present efficient solutions to many practical problems. However, as their development progresses, many problems related to scalability, fault tolerance, stability, efficient resource usage and many other topics need to be solved. Developing efficient distributed systems is not an easy task, because many system parameters need to be fine tuned and optimized. Because of this, optimization techniques are required for designing efficient distributed systems or improving the performance of existing, already deployed ones.

In this paper I present several novel algorithmic techniques for some optimization problems regarding distributed systems with a tree topology. Trees are some of the simplest non-trivial topologies which appear in real-life situations.
Many of the existing networks have a hierarchical structure (a tree or tree-like graph), with user devices at the edge of the network and router backbones at its core. Some peer-to-peer systems used for content retrieval and indexing have a tree structure. Multicast content is usually delivered using multicast trees. Furthermore, many graph topologies can be reduced to tree topologies, by choosing a spanning tree or by covering the graph's edges with edge disjoint spanning trees [1]. In a tree, there exists a unique path between any two nodes. Thus, the network is quite fragile. The fragility is compensated by the simplicity of the topology, which makes many decisions easier.

This paper is structured as follows. Section 2 defines the main notations which are used in the rest of the paper. In Section 3 I consider the minimum weight cycle completion problem in trees. In Section 4 I discuss two tree partitioning problems and in Section 5 I consider two content delivery optimization problems. In Section 6 I solve several optimal matching problems in trees and powers of trees and in Section 7 I analyze the first fit online coloring heuristic, applied to trees. In Section 8 I consider three other optimization and tree counting problems. In Section 9 I discuss related work and in Section 10 I conclude and present future work.

2 Notations

A tree is an undirected, connected, acyclic graph. A tree may be rooted, in which case a special vertex r will be called its root. Even if the tree is unrooted, we may choose to root it at some vertex. In a rooted tree, we define parent(i) as the parent of vertex i and ns(i) as the number of sons of vertex i. For a leaf vertex i, ns(i) = 0 and for the root r, parent(r) is undefined. The sons of a vertex i are denoted by s(i, j) (1 ≤ j ≤ ns(i)).
A vertex j is a descendant of vertex i if parent(j) = i or parent(j) is also a descendant of vertex i. We denote by T(i) the subtree rooted at vertex i, i.e. the part of the tree composed of vertex i and all of its descendants (together with the edges connecting them). In the paper, the terms node and vertex will be used with the same meaning. A matching M of a graph G is a set of edges of the graph, such that any two edges in the set have distinct endpoints (vertices). A maximum matching is a matching with maximum cardinality (maximum number of edges).

3 Minimum Weight Cycle Completion of a Tree

We consider a tree network with n vertices. For m pairs of vertices (i, j) which are not adjacent in the tree, we are given a weight w(i, j) (we can consider w(i, j) = +∞ for the other pairs of vertices). We want to connect some of these m pairs (i.e. add extra edges to the tree), such that, in the end, every vertex of the tree belongs to exactly one cycle. The objective consists of minimizing the total weight of the edges added to the tree.

For the unweighted case (w(i, j) = 1) and when we can connect any pair of vertices which is not connected by a tree edge, there exists the following simple greedy algorithm [3]. We select an arbitrary root vertex r and then traverse the tree bottom-up (from the leaves towards the root). For each vertex i we will compute a value l(i), representing the largest number of vertices on a path P(i) starting at i and continuing in T(i), such that every vertex j ∈ (T(i) \ P(i)) belongs to exactly one cycle and the vertices in P(i) are the only ones which do not belong to a cycle. We denote by e(i) the second endpoint of the path (the first one being vertex i). For a leaf vertex i, we have l(i) = 1 and e(i) = i.
For a non-leaf vertex i, we first remove from its list of sons the sons s(i, j) with l(s(i, j)) = 0, update ns(i) and renumber the other sons starting from 1. If i remains with only one son, we set l(i) = l(s(i, 1)) + 1 and e(i) = e(s(i, 1)). If i remains with ns(i) > 1 sons, we will sort them according to the values l(s(i, j)), such that l(s(i, 1)) ≤ l(s(i, 2)) ≤ ... ≤ l(s(i, ns(i))). We will connect by an edge the vertices e(s(i, 1)) and e(s(i, 2)). This way, every vertex on the paths P(s(i, 1)) and P(s(i, 2)), plus the vertex i, belongs to exactly one cycle. For the other sons s(i, j) (3 ≤ j ≤ ns(i)), we will have to connect s(i, j) to e(s(i, j)). This will only be possible if l(s(i, j)) ≥ 3; otherwise, the tree admits no solution. Afterwards, we set l(i) = 0. If the root r has only one son, then we must have l(r) ≥ 3, such that we can connect r to e(r).

For the general case, I will describe a dynamic programming algorithm (as the greedy algorithm cannot be extended to this case). We will again root the tree at an arbitrary vertex r, thus defining parent-son relationships. For each vertex i, we will compute two values: wA(i)=the minimum total weight of a subset of edges added to the tree such that every vertex in T(i) belongs to exactly one cycle, and wB(i)=the minimum total weight of a subset of edges added to the tree such that every vertex in (T(i) \ {i}) belongs to exactly one cycle (and vertex i belongs to no cycle). We will compute the values from the leaves towards the root. For a leaf vertex i, we have wA(i) = +∞ and wB(i) = 0. For a non-leaf vertex i, we have: $wB(i) = \sum_{j=1}^{ns(i)} wA(s(i, j))$.
In order to compute wA(i) we will first traverse T(i) and for each vertex j, we will compute wAsum(i, j)=the sum of all the wA(p) values, where p is a son of a vertex q which is located on the path from i to j (P(i...j)) and p does not belong to P(i...j). We have wAsum(i, i) = wB(i) and for the other vertices j we have wAsum(i, j) = wAsum(i, parent(j)) − wA(j) + wB(j).

Now we will try to add an edge, such that it closes a cycle in the tree which contains vertex i. We will first try to add edges of the form (i, j), where j is a descendant of i (but not a son of i, of course) - these will be called type 1 edges. Adding such an edge (i, j) provides a candidate value wcand(i, i, j) for wA(i): wcand(i, i, j) = wAsum(i, j) + w(i, j). We will then consider edges of the form (p, q) (p ≠ i and q ≠ i), where the lowest common ancestor of p and q (LCA(p, q)) is vertex i - these will be called type 2 edges (we consider every pair of distinct sons s(i, a) and s(i, b), and for each such pair we consider every pair of vertices p ∈ T(s(i, a)) and q ∈ T(s(i, b)) and verify if the edge (p, q) can be added to the tree). Adding such an edge (p, q) provides a candidate value wcand(i, p, q) for wA(i): wcand(i, p, q) = wAsum(i, p) + wAsum(i, q) − wB(i) + w(p, q). wA(i) will be equal to the minimum of the candidate values wcand(i, *, *) (or to +∞ if no candidate value exists).

We can implement the algorithm in O(n^2) time, which is optimal in a sense, because m ≤ (n · (n − 1)/2 − n + 1), which is O(n^2). wA(r) is the answer to our problem and we can find the actual edges to add to the tree by tracing back the way the wA(*) and wB(*) values were computed.
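The dynamic programming scheme above can be sketched in Python as follows. This is only an illustrative sketch, not the author's implementation: the function name, the 0-based vertex numbering, the dictionary input format and the naive LCA climb are all assumptions made for the example. One practical detail is also an assumption of this sketch: since wA and wB may be +∞, the recurrence wAsum(i, j) = wAsum(i, parent(j)) − wA(j) + wB(j) would evaluate ∞ − ∞ in floating point, so every sum is stored as a pair (finite part, number of +∞ terms).

```python
import math
from collections import defaultdict

def min_weight_cycle_completion(n, tree_edges, extra_edges, root=0):
    """Return wA(root), or math.inf if no valid set of extra edges exists.

    tree_edges: list of (u, v); extra_edges: {(u, v): weight} over pairs that
    are not adjacent in the tree. Vertices are 0..n-1 (hypothetical format).
    """
    adj = [[] for _ in range(n)]
    for u, v in tree_edges:
        adj[u].append(v)
        adj[v].append(u)

    # Root the tree; 'order' lists every parent before its sons.
    parent, depth, order = [-1] * n, [0] * n, [root]
    for u in order:
        for v in adj[u]:
            if v != parent[u]:
                parent[v], depth[v] = u, depth[u] + 1
                order.append(v)
    sons = [[] for _ in range(n)]
    for v in order[1:]:
        sons[parent[v]].append(v)

    def lca(p, q):  # naive climb, enough for a sketch
        while p != q:
            if depth[p] < depth[q]:
                p, q = q, p
            p = parent[p]
        return p

    ledge = defaultdict(list)  # extra edges grouped by their LCA
    for (p, q), weight in extra_edges.items():
        ledge[lca(p, q)].append((p, q, weight))

    def as_pair(x):  # (finite part, count of +inf terms)
        return (0, 1) if x == math.inf else (x, 0)

    wA = [math.inf] * n
    wB = [(0, 0)] * n  # wB(i), stored as a pair
    for i in reversed(order):  # from the leaves towards the root
        wB[i] = (sum(as_pair(wA[s])[0] for s in sons[i]),
                 sum(as_pair(wA[s])[1] for s in sons[i]))
        if not ledge[i]:
            continue
        # wAsum(i, j) for every j in T(i), using the recurrence from the text.
        asum = {i: wB[i]}
        stack = [i]
        while stack:
            q = stack.pop()
            for j in sons[q]:
                a, b = as_pair(wA[j])
                asum[j] = (asum[q][0] - a + wB[j][0],
                           asum[q][1] - b + wB[j][1])
                stack.append(j)
        for p, q, weight in ledge[i]:
            if i in (p, q):  # type 1 edge
                fin, cnt = asum[q if p == i else p]
            else:  # type 2 edge
                fin = asum[p][0] + asum[q][0] - wB[i][0]
                cnt = asum[p][1] + asum[q][1] - wB[i][1]
            if cnt == 0:  # candidate value is finite
                wA[i] = min(wA[i], fin + weight)
    return wA[root]
```

For example, on the tree with edges (0,1), (1,2), (0,3), (3,4), (3,5) and extra edges (0,2) of weight 4, (4,5) of weight 1 and (2,4) of weight 7, the sketch selects the first two edges, while the third one would leave vertex 5 outside any cycle.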
However, when the number m of edges which can be added to the tree is significantly smaller, we can improve the time complexity to O((n + m) · log(n)). We will compute for each of the m edges (i, j) the lowest common ancestor of the vertices i and j (LCA(i, j)) in the rooted tree. This can be achieved by preprocessing the tree in O(n) time and then answering each LCA query in O(1) time [2]. If LCA(i, j) = k, then we will add the edge (i, j) to a list Ledge(k). Then, for each non-leaf vertex i, we will traverse the edges in Ledge(i). For each edge (p, q) we can easily determine if it is of type 1 (i = p or i = q) or of type 2 and use the corresponding equation. However, we need the values wAsum(i, p) and wAsum(i, q). Instead of recomputing these values from scratch, we will update them incrementally. It is obvious that wAsum(parent(i), p) = wAsum(i, p) + wB(parent(i)) − wA(i).

We will preprocess the tree, by assigning to each vertex i its DFS number DFSnum(i) (DFSnum(i) = j if vertex i was the j-th distinct vertex visited during a DFS traversal of the tree which started at the root). Then, for each vertex i, we compute DFSmax(i)=the maximum DFS number of a vertex in its subtree. For a leaf node i, we have DFSmax(i) = DFSnum(i). For a non-leaf vertex i, DFSmax(i) = max{DFSnum(i), DFSmax(s(i, 1)), ..., DFSmax(s(i, ns(i)))}. We will maintain a segment tree, using the algorithmic framework from [15]. The operations we will use are range addition update and point query. Initially, each leaf i (1 ≤ i ≤ n) has a value v(i) = 0. Before computing wA(i) for a vertex i, we set the value of leaf DFSnum(i) in the segment tree to wB(i).
Then, for each son s(i, j), we add the value (wB(i) − wA(s(i, j))) to the interval [DFSnum(s(i, j)), DFSmax(s(i, j))] (range update). We can obtain wAsum(i, p) for any vertex p ∈ T(i) by querying the value of the cell DFSnum(p) in the segment tree: we start from the (current) value of the leaf DFSnum(p) and add the update aggregates uagg stored at every ancestor node of the leaf in the segment tree. Queries and updates take O(log(n)) time each.

If the objective is to minimize the largest weight Wmax of an edge added to the tree, we can binary search Wmax and perform the following feasibility test on the values Wcand chosen by the binary search: we consider only the "extra" edges (i, j) with w(i, j) ≤ Wcand and run the algorithm described above for these edges; if wA(r) ≠ +∞, then Wcand is feasible.

4 Tree Partitioning Techniques

4.1 Tree Partitioning with Lower and Upper Size Bounds

Given a tree with n vertices, we want to partition the tree into several parts, such that the number of vertices in each part is at least Q and at most k · Q (k ≥ 1). Each part P must have a representative vertex u, which does not necessarily belong to P. However, (P ∪ {u}) must form a connected subtree. I will present an algorithm which works for k ≥ 3. We root the tree at any vertex r, traverse the tree bottom-up and compute the parts in a greedy manner. For each vertex i we compute w(i)=the size of a connected component C(i) in T(i), such that vertex i ∈ C(i), |C(i)| < Q, and all the vertices in (T(i) \ C(i)) were split into parts satisfying the specified properties. For a leaf vertex i, w(i) = 1 and C(i) = {i}. For a non-leaf vertex i, we traverse its sons (in any order) and maintain a counter ws(i)=the sum of the w(s(i, j)) values of the sons traversed so far.
If ws(i) exceeds Q − 1 after considering the son s(i, j), we form a new part from the connected components C(s(i, last_son + 1)), ..., C(s(i, j)) and assign vertex i as its representative. Then, we reset ws(i) to 0. Here, last_son (last_son < j) is the previous son where ws(i) was reset to 0 (or 0, if ws(i) was never reset to 0). After considering every son of vertex i, we set w(i) = ws(i) + 1 and the component C(i) is formed from the components C(s(i, j)) which were not used for forming a new part, plus vertex i. If ws(i) + 1 = Q, then we form a new part from the component C(i) and set w(i) = 0 and C(i) = {}. During the algorithm, the maximum size of any part formed is 2 · Q − 2. At the end of the algorithm, we may have that w(r) > 0. In this case, the vertices in C(r) were not assigned to any part. However, at least one vertex from C(r) is adjacent to a vertex assigned to some part P. Then, we can extend that part P in order to contain the vertices in C(r). This way, the maximum size of a part becomes 3 · Q − 3.

The pseudocode of the first part of the algorithm is presented below. In order to compute the parts, we maintain for each vertex i a value part(i), which is 0, initially (0 means that the vertex was not assigned to any part). In order to assign distinct part numbers, we will maintain a global counter part_number, whose initial value is 0. The first part of the algorithm has linear time complexity (O(n)). The second part (adding C(r) to an already existing part) can also be performed in linear time, by searching for an edge (p, q), such that part(p) = 0 and part(q) > 0 (there are only n − 1 = O(n) edges in a tree).

LowerUpperBoundTreePartitioning(Q, i):
  if (ns(i) = 0) then w(i) = 1
  else
    ws(i) = last_son = 0
    for j = 1 to ns(i) do // j = 1, 2, ..., ns(i)
      LowerUpperBoundTreePartitioning(Q, s(i, j))
      ws(i) = ws(i) + w(s(i, j))
      if (ws(i) ≥ Q) then
        part_number = part_number + 1
        for k = last_son + 1 to j do
          AssignPartNumber(s(i, k), part_number)
        last_son = j; ws(i) = 0
    w(i) = ws(i) + 1
    if (w(i) ≥ Q) then
      part_number = part_number + 1; w(i) = 0
      AssignPartNumber(i, part_number)

AssignPartNumber(i, part_number):
  if (part(i) ≠ 0) then return()
  part(i) = part_number
  for j = 1 to ns(i) do
    AssignPartNumber(s(i, j), part_number)

4.2 Connected Tree Partitioning

I will now present an efficient algorithm for identifying k connected parts of given sizes in a tree (if possible), subject to minimizing the total cost. Thus, given a tree with n vertices, we want to find k vertex-disjoint components (called parts), such that the i-th part (1 ≤ i ≤ k) has sz(i) vertices (sz(1) + sz(2) + ... + sz(k) ≤ n and sz(i) ≤ sz(i + 1) for 1 ≤ i ≤ k − 1). Each tree edge (i, j) has a cost ce(i, j) and each tree vertex i has a cost cv(i). We want to minimize the sum of the costs of the vertices and edges which do not belong to any part. An edge (i, j) belongs to a part p if both vertices i and j belong to part p. In order to obtain k connected components of the given sizes we need to keep Q − k edges of the tree and remove the others, where Q = sz(1) + ... + sz(k). We could try all the $\binom{n-1}{Q-k}$ possibilities of choosing Q − k edges out of the n − 1 edges of the tree. For each possibility, we obtain k′ = n − Q + k connected components with sizes sz′(1) ≤ sz′(2) ≤ ... ≤ sz′(k′); in case of several components with equal sizes, we sort them in increasing order of the total cost of the vertices in them.
Then, we must have sz(j) = sz′(k′ − k + j) and the total cost of the possibility is the sum of the costs of the removed edges plus the sum of the costs of the vertices in the components 1, 2, ..., k′ − k (which should have only one vertex each, if the size conditions hold). However, this approach is quite inefficient in most cases. I will present an algorithm with time complexity O(n^3 · 3^k). We root the tree at an arbitrary vertex r. Then, we compute a table Cmin(i, j, S)=the minimum cost of obtaining from T(i) the parts with indices in the set S and, besides them, we are left with a connected component consisting of j vertices which includes vertex i and, possibly, several vertices which are ignored (if j = 0, then every vertex in T(i) is assigned to one of the parts in S or is ignored). We compute this table bottom-up:

ConnectedTreePartitioning(i):
  for each S ⊆ {1, 2, ..., k} do
    for j = 0 to n do Cmin(i, j, S) = +∞
  Cmin(i, 1, {}) = 0; Cmin(i, 0, {}) = cv(i)
  for x = 1 to ns(i) do
    ConnectedTreePartitioning(s(i, x))
    for each S ⊆ {1, 2, ..., k} do
      for j = 0 to n do
        Caux(i, j, S) = Cmin(i, j, S); Cmin(i, j, S) = +∞
    for each S ⊆ {1, 2, ..., k} do
      for j = 0 to n do
        for each W ⊆ S do
          for q = 0 to qlimit(j) do
            Cmin(i, j, S) = min{Cmin(i, j, S), Caux(i, j − q, S \ W) + extra_cost(i, s(i, x), q) + Cmin(s(i, x), q, W)}
  for each S ⊆ {1, 2, ..., k} do
    for j = 0 to n do
      if (Cmin(i, j, S) < +∞) then
        for q = 1 to k do
          if ((j = sz(q)) and (q ∉ S)) then
            Cmin(i, 0, S ∪ {q}) = min{Cmin(i, j, S), Cmin(i, 0, S ∪ {q})}

We define extra_cost(i, son_x_i, q) = if (q > 0) then return(0) else return(ce(i, son_x_i)) and qlimit(j) = max{j − 1, 0}. The algorithm computes Cmin(i, *, *) from the values of vertex i's sons, using the principles of tree knapsack. The total amount of computations for each vertex is O(ns(i) · 3^k · n^2).
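The 3^k factor comes from the nested loops "for each S ⊆ {1, ..., k}" and "for each W ⊆ S": every index is either outside S, in S \ W, or in W, so there are 3^k pairs (S, W). The short Python sketch below (the function name and the bitmask encoding of sets are assumptions made for illustration) enumerates these pairs with the usual submask trick and counts them.

```python
def count_subset_pairs(k):
    """Count the pairs (S, W) with W ⊆ S ⊆ {1, ..., k}, sets encoded as bitmasks."""
    count = 0
    for S in range(1 << k):
        W = S
        while True:
            count += 1          # one (S, W) pair visited
            if W == 0:
                break
            W = (W - 1) & S     # next smaller submask of S
    return count
```

The same two loops could drive the inner part of the Cmin update, with S \ W obtained as `S ^ W`.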
Summing over all the vertices, we obtain O(n^3 · 3^k). The minimum total cost is Cmin(r, 0, {1, 2, ..., k}) (if this value is +∞, then we cannot obtain k parts with the given sizes). In order to find the actual parts, we need to trace back the way the Cmin(*, *, *) values were computed, which is a standard procedure. When the sum of the sizes of the k parts is n, then every vertex belongs to one part.

5 Content Delivery Optimization Problems

5.1 Minimum Number of Unicast Streams

We consider a directed acyclic graph G with n vertices and m edges. Every directed edge (u, v) has a lower bound lbe_G(u, v), an upper bound ube_G(u, v) and a cost ce_G(u, v). Every vertex u has a lower bound lbv_G(u), an upper bound ubv_G(u) and a cost cv_G(u). We need to determine the minimum number of unicast communication streams p and a path for each of the p streams, such that the number of stream paths npe(u, v) containing an edge (u, v) satisfies lbe_G(u, v) ≤ npe(u, v) ≤ ube_G(u, v) and the number of paths npv(u) containing a vertex u satisfies lbv_G(u) ≤ npv(u) ≤ ubv_G(u). Each vertex u can be a source node, a destination node, both or none. A stream's path may start at any source node and finish at any destination node. Moreover, for the number of streams p, we want to compute the paths such that the sum S over all the values (npe(u, v) − lbe_G(u, v)) · ce_G(u, v) and (npv(u) − lbv_G(u)) · cv_G(u) is minimum.

Particular cases of this problem have been studied previously. When lbv_G(u) = 1 and ubv_G(u) = 1 for every vertex u, lbe_G(u, v) = 0 and ube_G(u, v) = +∞ for every directed edge (u, v), all the costs are 0, and every vertex is a source and destination node, we obtain the minimum path cover problem in directed acyclic graphs, which is solved as follows [18].
Construct a bipartite graph B with n vertices x_1, ..., x_n on the left side and n vertices y_1, ..., y_n on the right side. We add an edge (x_i, y_j) in B if the directed edge (i, j) appears in G. Then, we compute a maximum matching in B. If the cardinality of this matching is C, then we need p = n − C streams. The paths are computed as follows. Having an edge (x_i, y_j) in the maximum matching means that the edge (i, j) in G belongs to some stream's path. If two edges (x_i, y_j) and (x_j, y_k) in B belong to the matching, then the edges (i, j) and (j, k) in G belong to the path of the same stream. For non-zero costs, we compute a minimum (total) weight matching in B (where every edge (x_i, y_j) has a weight equal to ce(i, j)).

In order to solve the problem I mentioned, we will use a standard transformation and construct a new graph G′ where every vertex u is represented by two vertices u_in and u_out. For every directed edge (u, v) in G, we add an edge (u_out, v_in) in G′, with the same cost and lower and upper bounds. We also add a directed edge from u_in to u_out in G′ (for every vertex u in G), with cost cv_G(u), lower bound lbv_G(u) and upper bound ubv_G(u). Then we add two special vertices s (source) and t (sink) to G′. For every source node u in G, we add a directed edge (s, u_in) in G′, with lower bound and cost 0 and upper bound +∞. For every destination node v in G, we add a directed edge (v_out, t), with lower bound and cost 0 and upper bound +∞. We also add the edges (s, t) and (t, s) with lower bound and cost 0 and upper bound +∞. The resulting graph G′ has costs, lower and upper bounds only on its edges and not on its vertices.
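The construction of G′ can be written down directly. The following Python sketch is illustrative only: the function name, the dictionary-based input format and the tuple encoding of u_in and u_out as (u, "in") and (u, "out") are assumptions, not part of the original description.

```python
import math

def build_split_graph(edges, vertex_data, sources, destinations):
    """Build the edge list of G' from G.

    edges:       {(u, v): (lower, upper, cost)} for the directed edges of G
    vertex_data: {u: (lower, upper, cost)} for the vertices of G
    Returns a list of (tail, head, lower, upper, cost) tuples.
    """
    g_prime = []
    for u, (lb, ub, cost) in vertex_data.items():
        # the vertex bounds and cost move onto the edge u_in -> u_out
        g_prime.append(((u, "in"), (u, "out"), lb, ub, cost))
    for (u, v), (lb, ub, cost) in edges.items():
        g_prime.append(((u, "out"), (v, "in"), lb, ub, cost))
    for u in sources:
        g_prime.append(("s", (u, "in"), 0, math.inf, 0))
    for v in destinations:
        g_prime.append(((v, "out"), "t", 0, math.inf, 0))
    # the two extra edges between s and t mentioned in the text
    g_prime.append(("s", "t", 0, math.inf, 0))
    g_prime.append(("t", "s", 0, math.inf, 0))
    return g_prime
```

For a graph with a single edge a → b, where a is the only source and b the only destination, the result has seven edges: two vertex edges, one copy of the original edge, one source edge, one sink edge and the pair (s, t), (t, s).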
In order to compute the minimum number of communication streams which satisfy the constraints imposed by G, it is enough to compute a (minimum cost) minimum feasible flow in G′, from s to t. Decomposing the flow into unit-flow paths (in order to obtain the path of each communication stream) can then be done easily. We repeatedly perform a graph traversal (DFS or BFS) from s to t in G′, considering only directed edges with positive flow on them. From the traversal tree, by following the "parent" pointers, we can find a path P from s to t, containing only edges with positive flow. We compute the minimum flow f_P on any edge of P, transform P into f_P unit paths and then decrease the flow on the edges in P by f_P. If we remove the first and last vertices on any unit path (i.e. s and t), we obtain a path from a vertex u_in to a vertex v_out, where u is a source node in G and v is a destination node in G.

We will use the algorithm presented in [18] for determining a feasible flow (not necessarily minimum) in a flow network with lower and upper bounds on its edges. We will denote this algorithm by A(F, s, t) (F is the flow network given as argument, s is the source vertex and t is the sink vertex). I will describe A(F, s, t) briefly. We construct a new graph F′ from F, as follows. We maintain all the vertices and edges in F. For every directed edge (u, v) in F, the directed edge (u, v) in F′ has the same cost, lower bound 0 and upper bound (ube_F(u, v) − lbe_F(u, v)). We add two extra vertices s′ and t′ and the following zero-cost directed edges: (s′, u) and (u, t′) for every vertex u in F (including s and t). The lower bound of every edge will be 0. The upper bound of a directed edge (s′, u) in F′ is equal to the sum of the lower bounds of the directed edges (*, u) in F.
The upper bound of every directed edge (u, t′) in F′ is equal to the sum of the lower bounds of the directed edges (u, *) in F. The algorithm A(F, s, t) computes a minimum cost maximum flow g in the graph F′ (which, as stated, only has upper bounds); if all the costs are 0, only a maximum flow is computed. If g is equal to the sum of the upper bounds of the edges (s′, *) (or, equivalently, of the edges (*, t′)), then a feasible flow from s to t exists in F: the flow on every directed edge (u, v) in F will be lbe_F(u, v) plus the flow on the edge (u, v) in F′.

We will first run the algorithm on G′ (i.e. call A(G′, s, t)) in order to verify if a feasible flow exists. If no feasible flow exists, then the constraints cannot be satisfied by any number of streams. Otherwise, we construct a graph G′′ from G′, by adding a new vertex snew and a zero-cost directed edge (snew, s) with lower bound 0 and upper bound x. snew will be the new source vertex and x is a parameter which is used in order to limit the amount of flow entering the old source vertex s. We will now perform a binary search on x, between 0 and gmax, where gmax is the value of the feasible flow computed by calling A(G′, s, t). The feasibility test consists of verifying if there exists a feasible flow in the graph G′′ (i.e. calling A(G′′, snew, t)). The minimum value of x for which a feasible flow exists in G′′ is the value of the minimum feasible flow in G′, from s to t. Obtaining the feasible flow in G′ from the feasible flow in G′′ is trivial: for every directed edge (u, v) in G′, we set its amount of flow to the flow of the same edge (u, v) in G′′.
The time complexity of the presented algorithm is O(MF(n, m) · log(gmax)), where gmax is a good upper bound on the value of a feasible flow and MF(n, m) is the best time complexity of a (minimum cost) maximum flow algorithm in a directed graph with n vertices and m edges.

5.2 Degree-Constrained Minimum Spanning Tree

In [13], the following problem was considered: given an undirected graph with n vertices and m edges, where each edge (i, j) has a weight w(i, j) > 0, compute a spanning tree MST of minimum total weight, such that a special vertex r has degree exactly k in MST. A solution was proposed, based on using a parameter d and setting the cost of each edge (r, j) adjacent to r, c(r, j) = d + w(r, j); the cost of the other edges is equal to their weight. Parameter d can range from −∞ to +∞. We denote by MST(d)=the minimum spanning tree using the cost functions defined previously. When d = −∞, MST(d) contains the maximum number of edges adjacent to r. For d = +∞, MST(d) contains the minimum number of edges adjacent to r. We define the function ne(d)=the number of edges adjacent to r in MST(d). ne(d) is non-increasing on the interval [−∞, +∞]. We will binary search the smallest value dopt of the parameter d in the interval [−∞, +∞], such that ne(dopt) ≤ k. We will finish the binary search when the length of the search interval is smaller than a small constant ε > 0. If ne(dopt) = k, then the edges in MST(dopt) form the required minimum spanning tree. If ne(dopt) < k, then ne(dopt − ε) > k. We define S(d)=the set of edges adjacent to vertex r in MST(d). It is easy to prove that S(dopt) is included in S(dopt − ε). The required minimum spanning tree is constructed in the following manner.
The edges adjacent to vertex r will be the edges in S(dopt), to which we add (k − ne(dopt)) arbitrary edges from the set S(dopt − ε) \ S(dopt). Once these edges are fixed, we construct the following graph G: we set the cost of the chosen edges to 0 and the cost of the other edges (i, j) to w(i, j). We now compute a minimum spanning tree MST_G in G. The edges in MST_G are the edges of the minimum spanning tree of the original graph, in which vertex r has degree exactly k. The time complexity of this approach is O(m · log(m) · log(DMAX)), where DMAX denotes the range over which we search the parameter d. When m is not too large (i.e. m is not of the order O(n^2)), this represents an improvement over the O(n^2) solution given in [13].

6 Matching Problems

6.1 Maximum Weight Matching in an Extended Tree

Let's consider a rooted tree T (with vertex r as the root). Each vertex i has a weight w(i). We want to find a matching in the following graph G (extended tree), having the same vertices as T and an edge (x, y) between two vertices x and y, if: (i) x and y are adjacent in the tree, or (ii) x and y have the same parent in the tree. The weight of an edge (x, y) in G is |w(x) − w(y)|. The weight of a matching is the sum of the weights of its edges. We are interested in a maximum weight matching in the graph G. For each vertex i, we sort its sons s(i, 1), ..., s(i, ns(i)) in non-decreasing order of their weights, i.e. w(s(i, 1)) ≤ ... ≤ w(s(i, ns(i))). We will compute for each vertex i two values: A(i)=the maximum weight of a matching in T(i) if vertex i is the endpoint of an edge in the matching and B(i)=the maximum weight of a matching in T(i) if vertex i is not the endpoint of any edge in the matching.
In order to compute these values, we will compute the following tables for every vertex i: CA(i, j, k)=the maximum weight of a matching in T(i) if vertex i is the endpoint of an edge in the matching and we only consider its sons s(i, j), s(i, j + 1), ..., s(i, k) (and their subtrees). Similarly, we have CB(i, j, k), where vertex i does not belong to any edge in the matching. The maximum weight of a matching is max{A(r), B(r)}. The actual matching can be computed easily, by tracing back the way the A(i), B(i), CA(i, *, *) and CB(i, *, *) values were computed. A recursive algorithm (called with r as its argument) is given below. The time complexity is O(ns(i)^2) for a vertex i and, thus, O(n^2) overall.

MaximumWeightMatching-ExtendedTree(i):
  if (ns(i) = 0) then A(i) = B(i) = 0
  else
    for j = 1 to ns(i) do MaximumWeightMatching-ExtendedTree(s(i, j))
    for j = 1 to ns(i) do
      CA(i, j, j − 1) = −∞; CA(i, j, j) = |w(i) − w(s(i, j))| + B(s(i, j))
      CB(i, j, j − 1) = 0; CB(i, j, j) = max{A(s(i, j)), B(s(i, j))}
    for count = 1 to (ns(i) − 1) do
      for j = 1 to (ns(i) − count) do
        k = j + count
        CA(i, j, k) = max{
          |w(s(i, j)) − w(s(i, k))| + B(s(i, j)) + B(s(i, k)) + CA(i, j + 1, k − 1),
          |w(i) − w(s(i, j))| + B(s(i, j)) + CB(i, j + 1, k),
          |w(i) − w(s(i, k))| + B(s(i, k)) + CB(i, j, k − 1),
          max{A(s(i, j)), B(s(i, j))} + CA(i, j + 1, k),
          max{A(s(i, k)), B(s(i, k))} + CA(i, j, k − 1)}
        CB(i, j, k) = max{
          |w(s(i, j)) − w(s(i, k))| + B(s(i, j)) + B(s(i, k)) + CB(i, j + 1, k − 1),
          max{A(s(i, j)), B(s(i, j))} + CB(i, j + 1, k),
          max{A(s(i, k)), B(s(i, k))} + CB(i, j, k − 1)}
    A(i) = CA(i, 1, ns(i)); B(i) = CB(i, 1, ns(i))

6.2 Maximum Matching in the Power of a Graph

The k-th power G^k (k ≥ 2) of a graph G is a graph with the same set of vertices as G, where there exists an edge (x, y) between two vertices x and y if the distance between x and y in G is at
most k . The distance b e t ween tw o vertices ( x, y ) in a g raph is the minimum num b er of edges which need to b e tr aversed in or der to r each v ertex y , starting from vertex x . A maxim um ma tc hing in G k of a gr aph G can be found by restricting our attention to a spanning tree T of G . The fo llowing linea r a lg orithm (called with i = r ), us ing obser v ations fro m [12], solves the pro blem (w e consider that, initially , no v ertex is matched): Maxim umMatc hingGk(i): if (n s(i)=0) then return() el se last son=0 for j=1 to ns(i) do // j=1,2,. . . ,ns(i) Maxim umMatc hingGk (s(i,j)) 9 if ( not matche d(s(i,j)) the n if (last son = 0) then last son = s(i,j) else add ed ge (last son, s( i,j)) to the matc hing matche d(last son) = matche d(s(i,j)) = tru e; last son = 0 if ( l ast son > 0 ) then add edge (i, last son) to the matching matche d(i) = m atche d(last son) = true 7 First F it Online T ree Coloring A v ery in tuitive a lg orithm for co loring a g raph with n vertices is the fi rst-fit on- line c oloring heuristic . W e trav erse the vertices in some order v (1) , v (2) , . . . , v ( n ). W e assign colo r 1 to v (1) and for i = 2 , . . . , n , w e assig n to v ( i ) the minim um color c ( i ) ≥ 1 which w as not assigned to any o f its neighbours v ( j ) ( j < i ). A tree is 2-c olor able : we ro ot the tree at any vertex r and then compute fo r each v ertex i its lev el in the tre e (distance from the ro ot); we ass ign the co lor 1 to the vertices on even levels a nd the co lor 2 to those on o dd levels. How ever, in some situatio ns , w e might be forced to pr oc e ss the v ertices in a given order. In this ca se, it would be useful to compute the worst-case coloring that can be obtained by this heuristic, i.e. the largest n umber of colors that are used, under the worst-case ordering of the tree vertices ( Gru n dy numb er ). 
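To make the heuristic concrete, the following minimal Python sketch applies first-fit coloring to a small path. The function name first_fit_coloring and the adjacency-dictionary representation are illustrative assumptions, not part of the original text. It is meant only to show how the vertex order alone changes the number of colors used.

```python
def first_fit_coloring(adj, order):
    """Color the vertices in the given order; each vertex gets the smallest
    color (starting at 1) not used by its already-colored neighbours."""
    color = {}
    for v in order:
        used = {color[u] for u in adj[v] if u in color}
        c = 1
        while c in used:
            c += 1
        color[v] = c
    return color


# Hypothetical example: the path 0 - 1 - 2 - 3.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
good = first_fit_coloring(path, [0, 1, 2, 3])   # natural order
bad = first_fit_coloring(path, [0, 3, 1, 2])    # endpoints first
print(max(good.values()), max(bad.values()))
```

On this path the natural order needs 2 colors, while coloring both endpoints first forces a third color on an inner vertex.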
I will present an O(n · log(log(n))) algorithm for this problem, similar in nature to the linear algorithm presented in [4]. For each vertex i, we will compute cmax(i) = the largest color that can be assigned to vertex i in the worst case, if vertex i is the last vertex to be colored. The value max{cmax(i) | 1 ≤ i ≤ n} is the largest number of colors that can be assigned by the first-fit online coloring heuristic. We will root the tree at an arbitrary vertex r. The algorithm consists of two stages. In the first stage, the tree is traversed bottom-up and for each vertex i we compute c(1, i) = the largest color that can be assigned to vertex i, considering only the tree T(i). For a leaf vertex i, we have c(1, i) = 1. For a non-leaf vertex i, we will sort its sons s(i, 1), ..., s(i, ns(i)), such that c(1, s(i, 1)) ≤ c(1, s(i, 2)) ≤ ... ≤ c(1, s(i, ns(i))). We will initialize c(1, i) to 1 and then consider the sons in the sorted order. When we reach son s(i, j), we compare c(1, s(i, j)) with c(1, i). If c(1, s(i, j)) ≥ c(1, i), then we increment c(1, i) by 1 (otherwise, c(1, i) stays the same). The justification of this algorithm is the following: if a vertex i can be assigned color c(1, i) in some ordering of the vertices in T(i), then there exists an ordering in which it can be assigned any other color c′, such that 1 ≤ c′ ≤ c(1, i). Then, when traversing the sons and reaching a son s(i, j) with c(1, s(i, j)) ≥ c(1, i), we consider an ordering of the vertices in T(s(i, j)), where the color of vertex s(i, j) is c(1, i); thus, we can increase the maximum color that can be assigned to vertex i. After the bottom-up tree traversal, we have cmax(r) = c(1, r), but we still have to compute the values cmax(i) for the other vertices of the tree.
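The first stage can be sketched in Python as follows. The representation (a dictionary children mapping each vertex to the list of its sons) and the name bottom_up_colors are assumptions made for illustration; the sketch uses an ordinary comparison sort and is not tuned for the stated time bound.

```python
def bottom_up_colors(children, root):
    """Return c1, where c1[v] is the largest first-fit color that vertex v
    can receive when only its own subtree T(v) is considered."""
    # Build an order in which every vertex appears after its parent,
    # without recursion (so deep trees do not hit the recursion limit).
    order, stack = [], [root]
    while stack:
        v = stack.pop()
        order.append(v)
        stack.extend(children[v])

    c1 = {}
    for v in reversed(order):            # sons are handled before their parent
        best = 1
        for cs in sorted(c1[s] for s in children[v]):
            if cs >= best:               # this son can be forced to take color `best`
                best += 1
        c1[v] = best
    return c1


# Hypothetical example: the path 0 - 1 - 2 - 3, rooted at vertex 1 and at vertex 0.
rooted_at_1 = {1: [0, 2], 0: [], 2: [3], 3: []}
rooted_at_0 = {0: [1], 1: [2], 2: [3], 3: []}
print(bottom_up_colors(rooted_at_1, 1)[1], bottom_up_colors(rooted_at_0, 0)[0])
```

The two roots give different values (3 and 2), which is why a single bottom-up pass only settles cmax for the chosen root.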
We could do that by rooting the tree at every vertex i and running the previously described algorithm, but this would take O(n^2 · log(log(n))) time. However, we can compute these values faster, by traversing the tree vertices in a top-down manner (considering the tree rooted at r). For each vertex i, we will compute colmax(parent(i), i) = the maximum color that can be assigned to parent(i) if we remove T(i) from the tree and afterwards we consider parent(i) to be the (new) root of the tree. We will use the values c(2, i) as temporary storage variables. c(2, i) is initialized to c(1, i), for every vertex i. When computing cmax(i), we consider that vertex i is the root of the tree. Let's assume that we computed the value cmax(i) of a vertex i and now we want to compute the value cmax(j) of a vertex j which is a son of vertex i. We remove j from the list of sons of vertex i and, if i ≠ r, add parent(i) to this list (parent(i) = vertex i's parent in the tree rooted at the initial vertex r). We now need to lift vertex j above vertex i and make j the new root of the tree. In order to do this, we will recompute the value c(2, i), which is computed similarly to c(1, i), except that we consider the new list of sons for vertex i (and their c(2, ∗) values). Afterwards, we add vertex i to the list of sons of vertex j. We will compute the value cmax(j) similarly to the value c(1, j), using the values c(2, ∗) of vertex j's sons (instead of the c(1, ∗) values of the sons). After computing cmax(j) we restore the lists of sons of vertices i and j to their original states (as if the tree were rooted at the initial vertex r). After computing the values cmax(u) of all the descendants u of a vertex j, we reset the value c(2, j) to c(1, j).
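A compact way to realise the second stage is to pass explicit "upward" values instead of editing the son lists in place. The Python sketch below takes that route, so it is a variation on the description above rather than a transcription of it: up[v] plays the role of colmax(parent(v), v), and the names greedy and grundy_number are hypothetical. It uses plain comparison sorting and makes no attempt to match the stated time bound.

```python
def greedy(values):
    """Largest color obtainable at a vertex whose neighbours can be forced
    to take any color up to the given values (one value per neighbour)."""
    best = 1
    for c in sorted(values):
        if c >= best:
            best += 1
    return best


def grundy_number(adj, root=0):
    """Return (cmax, max cmax) for a tree given as an adjacency dictionary."""
    parent, order, stack = {root: None}, [], [root]
    while stack:                          # order: every vertex after its parent
        v = stack.pop()
        order.append(v)
        for u in adj[v]:
            if u != parent[v]:
                parent[u] = v
                stack.append(u)
    sons = {v: [u for u in adj[v] if u != parent[v]] for v in adj}

    c1 = {}
    for v in reversed(order):             # first stage, bottom-up
        c1[v] = greedy(c1[s] for s in sons[v])

    up = {}                               # up[v] = colmax(parent(v), v)
    cmax = {root: c1[root]}
    for v in order:                       # second stage, top-down
        if v == root:
            continue
        p = parent[v]
        vals = [c1[s] for s in sons[p] if s != v]
        if p != root:
            vals.append(up[p])            # p's own parent acts as one more son
        up[v] = greedy(vals)
        cmax[v] = greedy([c1[s] for s in sons[v]] + [up[v]])
    return cmax, max(cmax.values())


# Hypothetical example: the path 0 - 1 - 2 - 3.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(grundy_number(path))
```

For the path, the inner vertices reach color 3 and the endpoints only color 2, so the worst-case number of colors is 3.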
Both traversals take O(n · log(n)) time, if we sort the ns(i) sons of every vertex i in O(ns(i) · log(ns(i))) time. However, it has been proved in [4] that the minimum number of vertices of a tree with the Grundy number q is 2^(q−1), a bound attained by the binomial tree B(q − 1). The binomial tree B(0) consists of only one vertex. The binomial tree B(k ≥ 1) has a root vertex with k neighbors; the i-th of these neighbors (0 ≤ i ≤ k − 1) is the root of a B(i) binomial tree. Thus, every value c(1, ∗), c(2, ∗) and cmax(∗) is at most log2(n) + 1 and can be represented using O(log(log(n))) bits. We can use radix sort and obtain an O(n · log(log(n))) time complexity. The pseudocode of the functions is given below. The main algorithm consists of calling FirstFit-BottomUp(r), initializing the c(2, ∗) values to the c(1, ∗) values, setting cmax(r) = c(1, r) and then calling FirstFit-TopDown(r).

Compute(i, idx):
  sort the sons of vertex i, such that c(idx, s(i, 1)) ≤ ... ≤ c(idx, s(i, ns(i)))
  c(idx, i) = 1
  for j = 1 to ns(i) do
    if (c(idx, s(i, j)) ≥ c(idx, i)) then c(idx, i) = c(idx, i) + 1

FirstFit-BottomUp(i):
  for j = 1 to ns(i) do FirstFit-BottomUp(s(i, j))
  Compute(i, 1)

FirstFit-TopDown(i):
  if (i ≠ r) then
    remove vertex i from the list of sons of parent(i)
    add parent(parent(i)) to the list of sons of parent(i) (if parent(i) ≠ r)
    Compute(parent(i), 2); colmax(parent(i), i) = c(2, parent(i))
    add parent(i) to the list of sons of vertex i
    Compute(i, 2); cmax(i) = c(2, i)
    restore the original lists of sons of the vertices parent(i) and i
  for j = 1 to ns(i) do FirstFit-TopDown(s(i, j))
  c(2, i) = c(1, i)

8 Other Optimization and Counting Problems

8.1 Building a (Constrained) Tree with Minimum Height

In this subsection I consider the following optimization problem: We are given a sequence of n leaves and each leaf i (1 ≤ i ≤ n) has a height h(i).
We want to construct a (strict) binary tree with n − 1 internal nodes, such that, in an inorder traversal of the tree, we encounter the n leaves in the given order. The height of an internal node i is h(i) = 1 + max{h(leftson(i)), h(rightson(i))} (the height of the leaves is given). We are interested in computing a tree whose root has minimum height. A straightforward dynamic programming solution is the following: compute Hmin(i, j) = the minimum height of a tree containing the leaves i, i+1, ..., j. We have Hmin(i, i) = h(i) and, for i < j, $Hmin(i, j) = 1 + \min_{i \le k \le j-1} \max\{Hmin(i, k), Hmin(k+1, j)\}$. Hmin(1, n) is the answer to our problem. However, the time complexity of this algorithm is O(n^3), which is unsatisfactory. An optimal, linear-time algorithm was given in [14]. The main idea of this algorithm is the following. We traverse the leaves from left to right and maintain information about the rightmost path of the optimal tree for the first i leaves. Then, we can add the (i+1)-st leaf by modifying the rightmost path of the optimal tree for the first i leaves. Let's assume that we processed the first i leaves and the optimal tree for these leaves contains, on its rightmost path, the vertices v(1), v(2), ..., v(nv(i)), in order, from the root to the rightmost leaf (v(1) is the root). Let's assume that the heights of the subtrees rooted at these vertices are hv(1), ..., hv(nv(i)). It is easy to build this tree for i = 1 and i = 2 (it is unique). When adding the (i+1)-st leaf, we traverse the rightmost path from nv(i) down to 2. Assume that we are considering the vertex v(j). If hv(j−1) < (2 + max{hv(j), h(i+1)}), then we discard the vertex v(j) from the rightmost path and move to the next vertex (v(j−1)); otherwise, we stop. Let's assume that the path now contains the vertices v(1), ..., v(nv′(i)).
We replace vertex v(nv′(i)) by a new vertex vnew, whose left son will be v(nv′(i)) (together with its subtree) and whose right son will be the (i+1)-st leaf. The height of the new vertex will be 1 + max{hv(nv′(i)), h(i+1)}. The rightmost path of the optimal tree behaves like a stack and, thus, the overall time complexity is linear.

I will present a sub-optimal O(n · log(n)) time algorithm which is interesting on its own. The algorithm is similar to Huffman's algorithm for computing optimal prefix-free codes, except that it maintains the order of the leaves. A suggestion that such an approach might work was given to me by C. Gheorghe. At step i (1 ≤ i ≤ n − 1) of the algorithm, we will have n − i + 1 subtrees of the optimal tree. Each subtree j contains an interval of leaves [leftleaf(j), rightleaf(j)] and its height is h(j). We will combine the two adjacent subtrees j and j + 1 whose combined height (1 + max{height(subtree j), height(subtree j+1)}) is minimum among all the O(n) pairs of adjacent subtrees. At the first step, the n subtrees are represented by the n leaves, whose heights are given. A straightforward implementation of this idea leads to an O(n^2) algorithm. However, the processing time can be improved by using two segment trees [15], A and B, with n and n − 1 leaves, respectively. Each node q of a segment tree corresponds to an interval of leaves [left(q), right(q)] (leaves are numbered starting from 1). Each leaf node of the segment tree A can be in the active or inactive state. Each node q of A (whether leaf or internal node) maintains a value nactive(q), denoting the number of active leaves in its subtree.
Initially, each of the n leaves of A is active and the nactive(∗) values are initialized appropriately, in a bottom-up manner (1, for a leaf node, and nactive(leftson(q)) + nactive(rightson(q)), for an internal node q). Segment tree B has n − 1 leaves and each node of B (leaf or internal node) stores a value hc. If leaf i (1 ≤ i ≤ n − 1) is active in A, then hc(leaf i) = 1 + max{h(i), h(j)}, where j > i is the next active leaf. If leaf i is not active in A or is the last active leaf, then hc(leaf i) = +∞. The value hc of each internal node q of B is the minimum among all the hc values of the leaves in node q's subtree, i.e. hc(node q) = min{hc(leftson(q)), hc(rightson(q))}. Moreover, each node q of B maintains the number lnum of the leaf in its subtree which gives the value hc(node q). We have lnum(leaf i) = i and lnum(internal node q) = if (hc(leftson(q)) ≤ hc(rightson(q))) then lnum(leftson(q)) else lnum(rightson(q)). At each step i (1 ≤ i ≤ n − 1), each active leaf is the leftmost leaf of a subtree of the optimal tree. After every step, the number of active leaves decreases by 1. We can find in O(log(n)) time the pair of adjacent subtrees to combine. The height of the combination of these subtrees is hc(root node of B), the leftmost leaf of the first subtree is i = lnum(root node of B) and that of the second subtree is j = next_active(i). We define the function next_active by using two other functions: rank(i) and unrank(r). rank(i) returns the number of active leaves before leaf i (0 ≤ rank(i) ≤ nactive(root node of A) − 1). unrank(r) returns the index of the leaf whose rank is r. The two functions are inverses of each other: unrank(rank(i)) = i and rank(unrank(r)) = r.
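One possible array-based realisation of the counters of segment tree A, together with rank, unrank and next_active, is sketched below in Python. The class name ActiveLeaves, the implicit heap-style layout (node q has sons 2q and 2q + 1) and the iterative walks are assumptions chosen for brevity; they are meant to illustrate the semantics of the three functions, not to reproduce any particular implementation.

```python
class ActiveLeaves:
    """Counts of active leaves for positions 1..n in an implicit binary tree."""

    def __init__(self, n):
        self.size = 1
        while self.size < n:
            self.size *= 2
        self.cnt = [0] * (2 * self.size)
        for i in range(n):                        # every leaf starts active
            self.cnt[self.size + i] = 1
        for q in range(self.size - 1, 0, -1):     # bottom-up initialisation
            self.cnt[q] = self.cnt[2 * q] + self.cnt[2 * q + 1]

    def deactivate(self, i):
        """Mark leaf i (assumed to be currently active) as inactive."""
        q = self.size + i - 1
        while q >= 1:
            self.cnt[q] -= 1
            q //= 2

    def rank(self, i):
        """Number of active leaves strictly before leaf i."""
        q, r = self.size + i - 1, 0
        while q > 1:
            if q % 2 == 1:                        # right son: add the left sibling
                r += self.cnt[q - 1]
            q //= 2
        return r

    def unrank(self, r):
        """Index of the active leaf whose rank is r, or -1 if there is none."""
        if r < 0 or r >= self.cnt[1]:
            return -1
        q = 1
        while q < self.size:
            if self.cnt[2 * q] <= r:
                r -= self.cnt[2 * q]
                q = 2 * q + 1
            else:
                q = 2 * q
        return q - self.size + 1

    def next_active(self, i):
        return self.unrank(self.rank(i) + 1)


a = ActiveLeaves(5)
a.deactivate(4)
print(a.next_active(3), a.rank(5), a.next_active(5))
```

Each operation touches one root-to-leaf path, so it runs in O(log(n)) time.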
We have rank(i) = rank′(i, root node of A), unrank(r) = unrank′(r, root node of A) and next_active(i) = unrank(rank(i) + 1).

rank′(i, q):
  if (q is a leaf node) then
    if (left(q) = right(q) = i) then return(0) else return(−1)
  else if (i > right(leftson(q))) then return(nactive(leftson(q)) + rank′(i, rightson(q)))
  else return(rank′(i, leftson(q)))

unrank′(r, q):
  if (q is a leaf node) then
    if (r > 0) then return(−1) else return(left(q))
  else if (nactive(leftson(q)) ≤ r) then return(unrank′(r − nactive(leftson(q)), rightson(q)))
  else return(unrank′(r, leftson(q)))

The functions rank, unrank and next_active take O(log(n)) time each. After obtaining the indices of the two active leaves i and j whose corresponding subtrees are united (by adding a new internal node whose left son is the root of i's subtree and whose right son is the root of j's subtree), we mark leaf j as inactive. We do this by traversing the segment tree A from leaf j towards the root (from j to parent(j), parent(parent(j)), ..., root node of A) and decrementing by 1 the nactive values of the visited nodes. Then, we change the h values of leaves i and j. We set h(i) = hc(root node of B) and h(j) = +∞. After this, we will also change the hc values associated to the leaves i and j in the segment tree B. The new hc value of leaf j will be +∞. If i is now the last active leaf, then hc(leaf i) becomes +∞, too. Otherwise, let j′ = next_active(i), the next active leaf after i (at this point, leaf j is not active anymore). We will change hc(leaf node i) to (1 + max{h(i), h(j′)}). The hc value of the active leaf preceding i (if there is one) also depends on h(i), so it is updated in the same way. After changing the hc value of a leaf k, we traverse the tree from leaf k towards the root (visiting all of k's ancestors, in order, starting from parent(k) and ending at the root of B).
For each ancestor node q, we recompute hc(node q) as min{hc(leftson(q)), hc(rightson(q))} (and lnum(node q) accordingly).

8.2 The Number of Trees with a Fixed Number of Leaves

In order to compute the number of labeled trees with n vertices and exactly p leaves, we will compute a table NT(i, j) = the number of trees with i vertices and exactly j leaves (1 ≤ j ≤ i ≤ n). Obviously, we have NT(1, 1) = NT(2, 2) = 1 and NT(i, j) = 0 for i = 1, 2 and j ≠ i. For i > 2, we have NT(i, i) = 0 and for 1 ≤ j ≤ i − 1, we will proceed as follows. The j leaves can be chosen in C(i, j) ways (i choose j). After choosing the identifiers of the j leaves, we will conceptually remove the leaves from the tree, thus remaining with a tree having i − j vertices and any number of leaves k (1 ≤ k ≤ j). Each of the j leaves that we conceptually removed is adjacent to one of these k vertices. Furthermore, each of these k vertices is adjacent to at least one of the j leaves from the larger tree. Thus, we need to compute the number of surjective functions f from a domain of size j to a codomain of size k. We will denote this value by NF(j, k). This is a "classical" problem, but I will present a simple solution, nevertheless. We have NF(0, 0) = 1, NF(j, 0) = 0 for j ≥ 1 and NF(j, k) = 0, if j < k. In order to compute the values for k ≥ 1 and j ≥ k, we will consider every number g of values x from the set {1, ..., j} for which f(x) = k. Once g is fixed, we have C(j, g) ways of choosing the g values from the set {1, ..., j}. For each such possibility we have NF(j − g, k − 1) ways of extending it to a surjective function. Thus, $NF(j, k) = \sum_{g=1}^{j} C(j, g) \cdot NF(j - g, k - 1)$. We can tabulate all the NF(∗, ∗) values in O(n^3) time (after tabulating the combinations C(∗, ∗) in O(n^2) time, first).
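The recurrence for NF can be tabulated directly. The Python sketch below is one way to do it, assuming Python 3.8 or later for math.comb (used here instead of a precomputed table of combinations); the function name surjection_table is hypothetical.

```python
from math import comb


def surjection_table(n):
    """NF[j][k] = number of surjective functions from a j-element set
    onto a k-element set, for 0 <= j, k <= n."""
    NF = [[0] * (n + 1) for _ in range(n + 1)]
    NF[0][0] = 1                      # the empty function; NF[j][0] = 0 for j >= 1
    for k in range(1, n + 1):
        for j in range(k, n + 1):     # entries with j < k stay 0
            # g = number of arguments mapped to the value k
            NF[j][k] = sum(comb(j, g) * NF[j - g][k - 1] for g in range(1, j + 1))
    return NF


NF = surjection_table(4)
print(NF[3][2], NF[4][2], NF[4][3])
```

As a sanity check, NF(3, 2) = 6: of the 2^3 = 8 functions from a 3-element set to a 2-element set, only the two constant ones fail to be surjective.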
With the NF(∗, ∗) values computed, we have $NT(i, j) = C(i, j) \cdot \sum_{k=1}^{j} NT(i - j, k) \cdot NF(j, k)$. We can easily compute each entry NT(i, j) in O(n) time, obtaining an O(n^3) overall time complexity. The technique of performing dynamic programming on successive layers of leaves of a tree is also useful in several other counting problems.

8.3 The Number of Trees with Degree Constraints

We want to compute the number of unlabeled, rooted trees with n ≥ 2 vertices, such that the (degree / number of sons) of each vertex belongs to a set S, which is a subset of {0, 1, 2, ..., n − 1}. By (a/b) we mean that a refers to the degree-constrained problem and b refers to the number-of-sons-constrained problem (everything else being the same). Because every tree with n ≥ 2 vertices must contain at least a leaf (a vertex of degree 1) and at least one vertex with at least 1 son, the set S will always contain the subset ({1}/{0, 1}). We will compute a table NT(i, j, p) = the number of trees with i vertices, such that the root has degree j (j sons) and the maximum number of vertices in the subtree of any son of the root is at most p; moreover, except perhaps the tree root, the (degrees / numbers of sons) of all the other vertices belong to the set S. Because the trees are unlabeled, we can sort the sons of each vertex in non-decreasing order of the numbers of vertices in their subtrees. Thus, we will compute the table NT in increasing order of p. NT(1, 0, p) = 1 and NT(1, j > 0, p) = NT(i ≥ 2, j, 0) = 0.
For p ≥ 1 and i ≥ 2, we have:

$NT(i, j, p) = NT(i, j, p - 1) + \sum_{k=1}^{\lfloor (i-1)/p \rfloor} NT(i - k \cdot p, j - k, p - 1) \cdot CR(TT(p), k)$

TT(p) is the total number of trees with p vertices, for which the (degree / number of sons) of the root is equal to some ((x − 1)/(x)), x ∈ S, and the (degrees / numbers of sons) of the other vertices belong to the set S. By CR(i, j) we denote combinations with repetitions of i elements, out of which we choose j. Because the argument i can be very large, we cannot tabulate CR(i, j). Instead, we will compute it on the fly. We know that CR(i, j) = C(i + j − 1, j) and that C(i, j) = ((i − j + 1)/j) · C(i, j − 1). Thus, CR(i, j) can be computed in O(j) time. Before computing any value NT(∗, ∗, p), we need to compute and store the values TT(p), $TT(p) = \sum_{x \in S} NT(p, ((x-1)/(x)), p - 1)$, and CR(TT(p), k), for all the values of k (1 ≤ k ≤ ⌊(n − 1)/p⌋). We can compute all of these values in O(n^3 · log(n)) time. The desired number of trees is $\sum_{x \in S} NT(n, x, n - 1)$. The memory storage can be reduced from O(n^3) to O(n^2), by noticing that the values NT(∗, ∗, p) are computed based only on the values NT(∗, ∗, p − 1). Thus, we can maintain these values only for the most recent two values of p.

A less efficient method is to compute the numbers Tok(i) = the number of trees with i vertices, such that each vertex satisfies the (degree/number of sons) constraints. Tok(1) = Tok(2) = 1. We will make use of the TT(i) values defined previously, except that they will be computed differently. For every i ≥ 2, we consider every possible number x of sons of the tree root and compute NT2(i, x) = the number of trees with i vertices, such that the tree root has x sons and all the other vertices satisfy the (degree/number of sons) constraints.
We will generate all the possibilities (y(1), y(2), ..., y(i − 1)), with 0 ≤ y(j) ≤ ⌊(i − 1)/j⌋ (1 ≤ j ≤ i − 1) and y(1) + ... + y(i − 1) = x. y(j) is the number of sons of the tree root which have j vertices in their subtrees (since these subtrees contain i − 1 vertices in total, we also need 1 · y(1) + 2 · y(2) + ... + (i − 1) · y(i − 1) = i − 1). The number of trees "matching" such a partition is equal to $\prod_{j=1}^{i-1} CR(TT(j), y(j))$. NT2(i, x) is computed by summing the numbers of trees "matching" every partition. Afterwards, if x ∈ S, we add NT2(i, x) to Tok(i). If x = ((y − 1)/(y)) and y ∈ S, then we add NT2(i, x) to TT(i). NT2(i, x) may be added to both Tok(i) and TT(i).

9 Related Work

Reliability analysis and improvement techniques for distributed systems were considered in [6,7]. Reliability analysis and optimization for tree networks in particular were considered in [3,5,8]. Different kinds of tree partitioning algorithms, based on optimizing several objectives, were proposed in [9,10,16]. Problems related to tree coloring were studied in [4]. Content delivery in distributed systems is a subject of high practical and theoretical interest and is studied from multiple perspectives. Communication scheduling in tree networks was considered in many papers (e.g. [17]) and the optimization of content delivery trees (multicast trees) was studied in [11].

10 Conclusions and Future Work

In this paper I considered several optimization problems regarding distributed systems with tree topologies (e.g. peer-to-peer networks, wireless networks, Grids), which have many practical applications: minimum weight cycle completion (reliability improvement), constrained partitioning (distributed coordination and control), minimum number of streams and degree-constrained minimum spanning trees (efficient content delivery), optimal matchings (data replication and resource allocation), coloring (resource management and frequency allocation) and tree counting aspects. All these problems are variations or extensions of problems which have been previously posed in other research papers. The presented techniques are either better (faster or more general) than the previous solutions or easier to implement.

References

1. J. Roskind, R. E. Tarjan, A Note on Finding Minimum-Cost Edge-Disjoint Spanning Trees, Mathematics of Operations Research 10 (4) (1985), 701-708.
2. M. A. Bender, M. Farach-Colton, The LCA Problem Revisited, Lecture Notes in Computer Science 1776 (2000), 88-94.
3. M. Scortaru, National Olympiad in Informatics, Gazeta de informatica (Informatics Gazette) 12 (7) (2002), 8-13.
4. S. M. Hedetniemi, S. T. Hedetniemi, T. Beyer, A Linear Algorithm for the Grundy (Coloring) Number of a Tree, Congressus Numerantium 36 (1982), 351-362.
5. M. I. Andreica, N. Tapus, Reliability Analysis of Tree Networks Applied to Balanced Content Replication, Proc. of the IEEE Intl. Conf. on Automation, Robotics, Quality and Testing (2008), 79-84.
6. D. J. Chen, T. H. Huang, Reliability Analysis of Distributed Systems Based on a Fast Reliability Algorithm, IEEE Trans. on Par. and Dist. Syst. 3 (1992), 139-154.
7. A. Kumar, A. S. Elmaghraby, S. P. Ahuja, Performance and Reliability Optimization for Distributed Computing Systems, Proc. of the IEEE Symp. on Comp. and Comm. (1998), 611-615.
8. H. Abachi, A.-J. Walker, Reliability Analysis of Tree, Torus and Hypercube Message Passing Architectures, Proc. of the IEEE S.-E. Symp. on System Theory (1997), 44-48.
9. G. N. Frederickson, Optimal Algorithms for Tree Partitioning, Proc. of the ACM-SIAM Symposium on Discrete Algorithms (SODA) (1991), 168-177.
10. R. Cordone, A Subexponential Algorithm for the Coloured Tree Partition Problem, Discrete Applied Mathematics 155 (10) (2007), 1326-1335.
11. Y. Cui, Y. Xue, K. Nahrstedt, Maxmin Overlay Multicast: Rate Allocation and Tree Construction, Proc. of the IEEE Workshop on QoS (IWQOS) (2004), 221-231.
12. Y. Qinglin, Factors and Factor Extensions, M.Sc. Thesis, Shandong Univ., 1985.
13. T. L. Magnanti, L. A. Wolsey, Optimal Trees, Handbooks in Operations Research and Management Science, vol. 7, chap. 9 (1995), 513-616.
14. S.-C. Mu, R. S. Bird, On Building Trees with Minimum Height, Relationally, Proc. of the Asian Workshop on Programming Languages and Systems (2000).
15. M. I. Andreica, N. Tapus, Optimal Offline TCP Sender Buffer Management Strategy, Proc. of the Intl. Conf. on Comm. Theory, Reliab., and QoS (2008), 41-46.
16. B. Y. Wu, H.-L. Wang, S. T. Kuan, K.-M. Chao, On the Uniform Edge-Partition of a Tree, Discrete Applied Mathematics 155 (10) (2007), 1213-1223.
17. M. R. Henzinger, S. Leonardi, Scheduling Multicasts on Unit-Capacity Trees and Meshes, J. of Comp. and Syst. Sci. 66 (3) (2003), 567-611.
18. T. H. Cormen, C. E. Leiserson, R. L. Rivest, C. Stein, Introduction to Algorithms, MIT Press and McGraw-Hill (2001).
