Algorithmic Techniques for Several Optimization Problems Regarding Distributed Systems with Tree Topologies

Mugurel Ionuţ Andreica
Politehnica University of Bucharest, Computer Science Department, Romania
mugurel.andreica@cs.pub.ro

As the development of distributed systems progresses, more and more challenges arise and the need for developing optimized systems and for optimizing existing systems from multiple perspectives becomes more stringent. In this paper I present novel algorithmic techniques for solving several optimization problems regarding distributed systems with tree topologies. I address topics like: reliability improvement, partitioning, coloring, content delivery, optimal matchings, as well as some tree counting aspects. Some of the presented techniques are only of theoretical interest, while others can be used in practical settings.

1 Introduction

Distributed systems are being increasingly developed and deployed all around the world, because they present efficient solutions to many practical problems. However, as their development progresses, many problems related to scalability, fault tolerance, stability, efficient resource usage and many other topics need to be solved. Developing efficient distributed systems is not an easy task, because many system parameters need to be fine tuned and optimized. Because of this, optimization techniques are required for designing efficient distributed systems or improving the performance of existing, already deployed ones.

In this paper I present several novel algorithmic techniques for some optimization problems regarding distributed systems with a tree topology. Trees are some of the simplest non-trivial topologies which appear in real-life situations. Many of the existing networks have a hierarchical structure (a tree or tree-like graph), with user devices at the edge of the network and router backbones at its core.
Some peer-to-peer systems used for content retrieval and indexing have a tree structure. Multicast content is usually delivered using multicast trees. Furthermore, many graph topologies can be reduced to tree topologies, by choosing a spanning tree or by covering the graph's edges with edge-disjoint spanning trees [1]. In a tree, there exists a unique path between any two nodes. Thus, the network is quite fragile. The fragility is compensated by the simplicity of the topology, which makes many decisions easier.

This paper is structured as follows. Section 2 defines the main notations which are used in the rest of the paper. In Section 3 I consider the minimum weight cycle completion problem in trees. In Section 4 I discuss two tree partitioning problems and in Section 5 I consider two content delivery optimization problems. In Section 6 I solve several optimal matching problems in trees and powers of trees and in Section 7 I analyze the first fit online coloring heuristic, applied to trees. In Section 8 I consider three other optimization and tree counting problems. In Section 9 I discuss related work and in Section 10 I conclude and present future work.

2 Notations

A tree is an undirected, connected, acyclic graph. A tree may be rooted, in which case a special vertex $r$ is called its root. Even if the tree is unrooted, we may choose to root it at some vertex. In a rooted tree, we define $parent(i)$ as the parent of vertex $i$ and $ns(i)$ as the number of sons of vertex $i$. For a leaf vertex $i$, $ns(i) = 0$ and for the root $r$, $parent(r)$ is undefined. The sons of a vertex $i$ are denoted by $s(i,j)$ ($1 \le j \le ns(i)$). A vertex $j$ is a descendant of vertex $i$ if $parent(j) = i$ or $parent(j)$ is also a descendant of vertex $i$. We denote by $T(i)$ the subtree rooted at vertex $i$, i.e. the part of the tree composed of vertex $i$ and all of its descendants (together with the edges connecting them). In the paper, the terms node and vertex are used with the same meaning. A matching $M$ of a graph $G$ is a set of edges of the graph, such that any two edges in the set have distinct endpoints (vertices). A maximum matching is a matching with maximum cardinality (maximum number of edges).

3 Minimum Weight Cycle Completion of a Tree

We consider a tree network with $n$ vertices. For $m$ pairs of vertices $(i,j)$ which are not adjacent in the tree, we are given a weight $w(i,j)$ (we can consider $w(i,j) = +\infty$ for the other pairs of vertices). We want to connect some of these $m$ pairs (i.e. add extra edges to the tree), such that, in the end, every vertex of the tree belongs to exactly one cycle. The objective consists of minimizing the total weight of the edges added to the tree.

For the unweighted case ($w(i,j) = 1$) and when we can connect any pair of vertices which is not connected by a tree edge, there exists the following simple greedy algorithm [3]. We select an arbitrary root vertex $r$ and then traverse the tree bottom-up (from the leaves towards the root). For each vertex $i$ we compute a value $l(i)$, representing the largest number of vertices on a path $P(i)$ starting at $i$ and continuing in $T(i)$, such that every vertex $j \in (T(i) \setminus P(i))$ belongs to exactly one cycle and the vertices in $P(i)$ are the only ones which do not belong to a cycle. We denote by $e(i)$ the second endpoint of the path (the first one being vertex $i$). For a leaf vertex $i$, we have $l(i) = 1$ and $e(i) = i$. For a non-leaf vertex $i$, we first remove from its list of sons the sons $s(i,j)$ with $l(s(i,j)) = 0$, update $ns(i)$ and renumber the other sons starting from 1.
If $i$ remains with only one son, we set $l(i) = l(s(i,1)) + 1$ and $e(i) = e(s(i,1))$. If $i$ remains with $ns(i) > 1$ sons, we sort them according to the values $l(s(i,j))$, such that $l(s(i,1)) \le l(s(i,2)) \le \ldots \le l(s(i,ns(i)))$. We connect by an edge the vertices $e(s(i,1))$ and $e(s(i,2))$. This way, every vertex on the paths $P(s(i,1))$ and $P(s(i,2))$, plus the vertex $i$, belongs to exactly one cycle. For the other sons $s(i,j)$ ($3 \le j \le ns(i)$), we have to connect $s(i,j)$ to $e(s(i,j))$. This is only possible if $l(s(i,j)) \ge 3$; otherwise, the tree admits no solution. Afterwards, we set $l(i) = 0$. If the root $r$ has only one son, then we must have $l(r) \ge 3$, such that we can connect $r$ to $e(r)$.

For the general case, I will describe a dynamic programming algorithm (as the greedy algorithm cannot be extended to this case). We again root the tree at an arbitrary vertex $r$, thus defining parent-son relationships. For each vertex $i$, we compute two values: $wA(i)$ = the minimum total weight of a subset of edges added to the tree such that every vertex in $T(i)$ belongs to exactly one cycle, and $wB(i)$ = the minimum total weight of a subset of edges added to the tree such that every vertex in $(T(i) \setminus \{i\})$ belongs to exactly one cycle (and vertex $i$ belongs to no cycle). We compute the values from the leaves towards the root. For a leaf vertex $i$, we have $wA(i) = +\infty$ and $wB(i) = 0$. For a non-leaf vertex $i$, we have:

$wB(i) = \sum_{j=1}^{ns(i)} wA(s(i,j))$.

In order to compute $wA(i)$ we first traverse $T(i)$ and for each vertex $j$ we compute $wAsum(i,j)$ = the sum of all the $wA(p)$ values, where $p$ is a son of a vertex $q$ which is located on the path from $i$ to $j$ ($P(i \ldots j)$) and $p$ does not belong to $P(i \ldots j)$.
We have $wAsum(i,i) = wB(i)$ and for the other vertices $j$ we have $wAsum(i,j) = wAsum(i, parent(j)) - wA(j) + wB(j)$. Now we try to add an edge which closes a cycle in the tree containing vertex $i$. We first try to add edges of the form $(i,j)$, where $j$ is a descendant of $i$ (but not a son of $i$, of course); these will be called type 1 edges. Adding such an edge $(i,j)$ provides a candidate value $wcand(i,i,j)$ for $wA(i)$:

$wcand(i,i,j) = wAsum(i,j) + w(i,j)$.

We then consider edges of the form $(p,q)$ ($p \neq i$ and $q \neq i$), where the lowest common ancestor of $p$ and $q$ ($LCA(p,q)$) is vertex $i$; these will be called type 2 edges (we consider every pair of distinct sons $s(i,a)$ and $s(i,b)$, and for each such pair we consider every pair of vertices $p \in T(s(i,a))$ and $q \in T(s(i,b))$ and verify if the edge $(p,q)$ can be added to the tree). Adding such an edge $(p,q)$ provides a candidate value $wcand(i,p,q)$ for $wA(i)$:

$wcand(i,p,q) = wAsum(i,p) + wAsum(i,q) - wB(i) + w(p,q)$.

$wA(i)$ is equal to the minimum of the candidate values $wcand(i,*,*)$ (or to $+\infty$ if no candidate value exists). We can implement the algorithm in $O(n^2)$ time, which is optimal in a sense, because $m \le n \cdot (n-1)/2 - n + 1$, which is $O(n^2)$. $wA(r)$ is the answer to our problem and we can find the actual edges to add to the tree by tracing back the way the $wA(*)$ and $wB(*)$ values were computed.

However, when the number $m$ of edges which can be added to the tree is significantly smaller, we can improve the time complexity to $O((n+m) \cdot \log(n))$. We compute for each of the $m$ edges $(i,j)$ the lowest common ancestor of the vertices $i$ and $j$ ($LCA(i,j)$) in the rooted tree.
This can be achieved by preprocessing the tree in $O(n)$ time and then answering each LCA query in $O(1)$ time [2]. If $LCA(i,j) = k$, then we add the edge $(i,j)$ to a list $Ledge(k)$. Then, for each non-leaf vertex $i$, we traverse the edges in $Ledge(i)$. For each edge $(p,q)$ we can easily determine if it is of type 1 ($i = p$ or $i = q$) or of type 2 and use the corresponding equation. However, we need the values $wAsum(i,p)$ and $wAsum(i,q)$. Instead of recomputing these values from scratch, we update them incrementally. It follows from the definition that

$wAsum(parent(i), p) = wAsum(i,p) + wB(parent(i)) - wA(i)$.

We preprocess the tree by assigning to each vertex $i$ its DFS number $DFSnum(i)$ ($DFSnum(i) = j$ if vertex $i$ was the $j$-th distinct vertex visited during a DFS traversal of the tree which started at the root). Then, for each vertex $i$, we compute $DFSmax(i)$ = the maximum DFS number of a vertex in its subtree. For a leaf node $i$, we have $DFSmax(i) = DFSnum(i)$. For a non-leaf vertex $i$,

$DFSmax(i) = \max\{DFSnum(i), DFSmax(s(i,1)), \ldots, DFSmax(s(i,ns(i)))\}$.

We maintain a segment tree, using the algorithmic framework from [15]. The operations we use are range addition update and point query. Initially, each leaf $i$ ($1 \le i \le n$) has a value $v(i) = 0$. Before computing $wA(i)$ for a vertex $i$, we set the value of leaf $DFSnum(i)$ in the segment tree to $wB(i)$. Then, for each son $s(i,j)$, we add the value $(wB(i) - wA(s(i,j)))$ to the interval $[DFSnum(s(i,j)), DFSmax(s(i,j))]$ (range update).
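The DFS numbering and the range-addition/point-query structure can be sketched as follows. This is only an illustrative sketch under stated assumptions: it uses a Fenwick (binary indexed) tree over a difference array instead of the segment tree framework of [15], and the names `dfs_intervals`, `RangeAddPointQuery` and the `children` dictionary are hypothetical helpers, not part of the paper.

```python
def dfs_intervals(children, root):
    """Return (dfs_num, dfs_max) for a rooted tree given as {vertex: [sons]}.

    dfs_num[v] is the 1-based visiting order of v; dfs_max[v] is the largest
    DFS number inside the subtree of v, so T(v) occupies the contiguous
    interval [dfs_num[v], dfs_max[v]].
    """
    dfs_num, dfs_max = {}, {}
    counter = 0
    stack = [(root, False)]
    while stack:
        v, finished = stack.pop()
        if finished:
            # Every vertex numbered since v was entered lies in T(v).
            dfs_max[v] = counter
            continue
        counter += 1
        dfs_num[v] = counter
        stack.append((v, True))
        for c in reversed(children.get(v, [])):
            stack.append((c, False))
    return dfs_num, dfs_max


class RangeAddPointQuery:
    """Fenwick tree over a difference array: range add, point query."""

    def __init__(self, n):
        self.n = n
        self.tree = [0] * (n + 2)

    def _add(self, pos, delta):
        while pos <= self.n:
            self.tree[pos] += delta
            pos += pos & -pos

    def range_add(self, lo, hi, delta):
        # Add delta to every cell in [lo, hi] (1-based, inclusive).
        self._add(lo, delta)
        self._add(hi + 1, -delta)

    def point_query(self, pos):
        # Current value of cell pos = prefix sum of the difference array.
        total = 0
        while pos > 0:
            total += self.tree[pos]
            pos -= pos & -pos
        return total
```

With this structure, setting the cell of vertex $i$ to $wB(i)$ amounts to a range addition of the difference on the one-cell interval $[DFSnum(i), DFSnum(i)]$, and both operations take $O(\log(n))$ time.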
We can obtain $wAsum(i,p)$ for any vertex $p \in T(i)$ by querying the value of the cell $DFSnum(p)$ in the segment tree: we start from the (current) value of the leaf $DFSnum(p)$ and add the update aggregates $uagg$ stored at every ancestor node of the leaf in the segment tree. Queries and updates take $O(\log(n))$ time each.

If the objective is to minimize the largest weight $Wmax$ of an edge added to the tree, we can binary search $Wmax$ and perform the following feasibility test on the values $Wcand$ chosen by the binary search: we consider only the "extra" edges $(i,j)$ with $w(i,j) \le Wcand$ and run the algorithm described above for these edges; if $wA(r) \neq +\infty$, then $Wcand$ is feasible.

4 Tree Partitioning Techniques

4.1 Tree Partitioning with Lower and Upper Size Bounds

Given a tree with $n$ vertices, we want to partition the tree into several parts, such that the number of vertices in each part is at least $Q$ and at most $k \cdot Q$ ($k \ge 1$). Each part $P$ must have a representative vertex $u$, which does not necessarily belong to $P$. However, $(P \cup \{u\})$ must form a connected subtree. I will present an algorithm which works for $k \ge 3$.

We root the tree at any vertex $r$, traverse the tree bottom-up and compute the parts in a greedy manner. For each vertex $i$ we compute $w(i)$ = the size of a connected component $C(i)$ in $T(i)$, such that vertex $i \in C(i)$, $|C(i)| < Q$, and all the vertices in $(T(i) \setminus C(i))$ were split into parts satisfying the specified properties. For a leaf vertex $i$, $w(i) = 1$ and $C(i) = \{i\}$. For a non-leaf vertex $i$, we traverse its sons (in any order) and maintain a counter $ws(i)$ = the sum of the $w(s(i,j))$ values of the sons traversed so far. If $ws(i)$ exceeds $Q - 1$ after considering the son $s(i,j)$, we form a new part from the connected components $C(s(i, lastson + 1)), \ldots, C(s(i,j))$ and assign vertex $i$ as its representative. Then, we reset $ws(i)$ to 0. Here $lastson < j$ is the previous son where $ws(i)$ was reset to 0 (or 0, if $ws(i)$ was never reset to 0). After considering every son of vertex $i$, we set $w(i) = ws(i) + 1$ and the component $C(i)$ is formed from the components $C(s(i,j))$ which were not used for forming a new part, plus vertex $i$. If $ws(i) + 1 = Q$, then we form a new part from the component $C(i)$ and set $w(i) = 0$ and $C(i) = \{\}$.

During the algorithm, the maximum size of any part formed is $2 \cdot Q - 2$. At the end of the algorithm, we may have that $w(r) > 0$. In this case, the vertices in $C(r)$ were not assigned to any part. However, at least one vertex from $C(r)$ is adjacent to a vertex assigned to some part $P$. Then, we can extend that part $P$ in order to contain the vertices in $C(r)$. This way, the maximum size of a part becomes $3 \cdot Q - 3$.

The pseudocode of the first part of the algorithm is presented below. In order to compute the parts, we maintain for each vertex $i$ a value $part(i)$, which is 0 initially (0 means that the vertex was not assigned to any part). In order to assign distinct part numbers, we maintain a global counter $partnumber$, whose initial value is 0. The first part of the algorithm has linear time complexity ($O(n)$). The second part (adding $C(r)$ to an already existing part) can also be performed in linear time, by searching for an edge $(p,q)$ such that $part(p) = 0$ and $part(q) > 0$ (there are only $n - 1 = O(n)$ edges in a tree).

LowerUpperBoundTreePartitioning(Q, i):
  if (ns(i) = 0) then w(i) = 1
  else
    ws(i) = lastson = 0
    for j = 1 to ns(i) do
      LowerUpperBoundTreePartitioning(Q, s(i,j))
      ws(i) = ws(i) + w(s(i,j))
      if (ws(i) ≥ Q) then
        // the sons lastson+1, ..., j form a new part, represented by i
        partnumber = partnumber + 1
        for k = lastson + 1 to j do
          AssignPartNumber(s(i,k), partnumber)
        lastson = j; ws(i) = 0
    w(i) = ws(i) + 1
  if (w(i) ≥ Q) then
    // C(i) itself has reached size Q and becomes a part
    partnumber = partnumber + 1; w(i) = 0
    AssignPartNumber(i, partnumber)

AssignPartNumber(i, partnumber):
  // stops at vertices which already belong to a part
  if (part(i) ≠ 0) then return()
  part(i) = partnumber
  for j = 1 to ns(i) do
    AssignPartNumber(s(i,j), partnumber)

4.2 Connected Tree Partitioning

I will now present an efficient algorithm for identifying $k$ connected parts of given sizes in a tree (if possible), subject to minimizing the total cost. Thus, given a tree with $n$ vertices, we want to find $k$ vertex-disjoint components (called parts), such that the $i$-th part ($1 \le i \le k$) has $sz(i)$ vertices ($sz(1) + sz(2) + \ldots + sz(k) \le n$ and $sz(i) \le sz(i+1)$ for $1 \le i \le k-1$). Each tree edge $(i,j)$ has a cost $ce(i,j)$ and each tree vertex $i$ has a cost $cv(i)$. We want to minimize the sum of the costs of the vertices and edges which do not belong to any part. An edge $(i,j)$ belongs to a part $p$ if both vertices $i$ and $j$ belong to part $p$.

In order to obtain $k$ connected components of the given sizes we need to keep $Q - k$ edges of the tree and remove the others, where $Q = sz(1) + \ldots + sz(k)$. We could try all the $\binom{n-1}{Q-k}$ possibilities of choosing $Q - k$ edges out of the $n - 1$ edges of the tree. For each possibility, we obtain $k' = n - Q + k$ connected components with sizes $sz'(1) \le sz'(2) \le \ldots \le sz'(k')$; in case of several components with equal sizes, we sort them in increasing order of the total cost of the vertices in them.
Then, we must have $sz(j) = sz'(k' - k + j)$ and the total cost of the possibility is the sum of the costs of the removed edges plus the sum of the costs of the vertices in the components $1, 2, \ldots, k' - k$ (which should have only one vertex each, if the size conditions hold). However, this approach is quite inefficient in most cases.

I will present an algorithm with time complexity $O(n^3 \cdot 3^k)$. We root the tree at an arbitrary vertex $r$. Then, we compute a table $Cmin(i,j,S)$ = the minimum cost of obtaining from $T(i)$ the parts with indices in the set $S$ and, besides them, we are left with a connected component consisting of $j$ vertices which includes vertex $i$ and, possibly, several vertices which are ignored (if $j = 0$, then every vertex in $T(i)$ is assigned to one of the parts in $S$ or is ignored). We compute this table bottom-up:

ConnectedTreePartitioning(i):
  for each S ⊆ {1, 2, ..., k} do
    for j = 0 to n do Cmin(i,j,S) = +∞
  Cmin(i,1,{}) = 0; Cmin(i,0,{}) = cv(i)
  for x = 1 to ns(i) do
    ConnectedTreePartitioning(s(i,x))
    for each S ⊆ {1, 2, ..., k} do
      for j = 0 to n do
        Caux(i,j,S) = Cmin(i,j,S); Cmin(i,j,S) = +∞
    for each S ⊆ {1, 2, ..., k} do
      for j = 0 to n do
        for each W ⊆ S do
          for q = 0 to qlimit(j) do
            Cmin(i,j,S) = min{Cmin(i,j,S), Caux(i,j-q,S\W) + extracost(i,s(i,x),q) + Cmin(s(i,x),q,W)}
  for each S ⊆ {1, 2, ..., k} do
    for j = 0 to n do
      if (Cmin(i,j,S) < +∞) then
        for q = 1 to k do
          if ((j = sz(q)) and (q ∉ S)) then
            Cmin(i,0,S∪{q}) = min{Cmin(i,j,S), Cmin(i,0,S∪{q})}

We define extracost(i, son, q) = if (q > 0) then return(0) else return(ce(i, son)), and qlimit(j) = max{j-1, 0}. The algorithm computes $Cmin(i,*,*)$ from the values of vertex $i$'s sons, using the principles of tree knapsack. The total amount of computations for each vertex is $O(ns(i) \cdot 3^k \cdot n^2)$.
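The $3^k$ factor comes from the nested loops over a set $S$ and a subset $W \subseteq S$: each index in $\{1, \ldots, k\}$ is either outside $S$, in $S \setminus W$, or in $W$. A small sketch, assuming the sets are encoded as bitmasks (an implementation choice which is not prescribed by the text, with the hypothetical helper name `count_subset_pairs`), illustrates the count:

```python
def count_subset_pairs(k):
    """Count the pairs (S, W) with W a subset of S and S a subset of {1..k}."""
    full = (1 << k) - 1
    count = 0
    for S in range(full + 1):
        W = S
        while True:
            count += 1          # one (S, W) pair visited by the inner DP loops
            if W == 0:
                break
            W = (W - 1) & S     # next smaller submask of S
    return count
```

For every $k$ the function returns $3^k$, which matches the factor in the per-vertex bound.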
Summing over all the vertices, we obtain $O(n^3 \cdot 3^k)$. The minimum total cost is $Cmin(r, 0, \{1, 2, \ldots, k\})$ (if this value is $+\infty$, then we cannot obtain $k$ parts with the given sizes). In order to find the actual parts, we need to trace back the way the $Cmin(*,*,*)$ values were computed, which is a standard procedure. When the sum of the sizes of the $k$ parts is $n$, then every vertex belongs to one part.

5 Content Delivery Optimization Problems

5.1 Minimum Number of Unicast Streams

We consider a directed acyclic graph $G$ with $n$ vertices and $m$ edges. Every directed edge $(u,v)$ has a lower bound $lbe_G(u,v)$, an upper bound $ube_G(u,v)$ and a cost $ce_G(u,v)$. Every vertex $u$ has a lower bound $lbv_G(u)$, an upper bound $ubv_G(u)$ and a cost $cv_G(u)$. We need to determine the minimum number of unicast communication streams $p$ and a path for each of the $p$ streams, such that the number of stream paths $npe(u,v)$ containing an edge $(u,v)$ satisfies $lbe_G(u,v) \le npe(u,v) \le ube_G(u,v)$ and the number of paths $npv(u)$ containing a vertex $u$ satisfies $lbv_G(u) \le npv(u) \le ubv_G(u)$. Each vertex $u$ can be a source node, a destination node, both or none. A stream's path may start at any source node and finish at any destination node. Moreover, for the number of streams $p$, we want to compute the paths such that the sum $S$ over all the values $(npe(u,v) - lbe_G(u,v)) \cdot ce_G(u,v)$ and $(npv(u) - lbv_G(u)) \cdot cv_G(u)$ is minimum.

Particular cases of this problem have been studied previously. When $lbv_G(u) = 1$ and $ubv_G(u) = 1$ for every vertex $u$, $lbe_G(u,v) = 0$ and $ube_G(u,v) = +\infty$ for every directed edge $(u,v)$, all the costs are 0, and every vertex is a source and destination node, we obtain the minimum path cover problem in directed acyclic graphs, which is solved as follows [18].
Construct a bipartite graph $B$ with $n$ vertices $x_1, \ldots, x_n$ on the left side and $n$ vertices $y_1, \ldots, y_n$ on the right side. We add an edge $(x_i, y_j)$ in $B$ if the directed edge $(i,j)$ appears in $G$. Then, we compute a maximum matching in $B$. If the cardinality of this matching is $C$, then we need $p = n - C$ streams. The paths are computed as follows. Having an edge $(x_i, y_j)$ in the maximum matching means that the edge $(i,j)$ in $G$ belongs to some stream's path. If two edges $(x_i, y_j)$ and $(x_j, y_k)$ in $B$ belong to the matching, then the edges $(i,j)$ and $(j,k)$ in $G$ belong to the path of the same stream. For non-zero costs, we compute a maximum matching of minimum total weight in $B$ (where every edge $(x_i, y_j)$ has a weight equal to $ce(i,j)$).

In order to solve the problem I mentioned, we use a standard transformation and construct a new graph $G'$ where every vertex $u$ is represented by two vertices $u_{in}$ and $u_{out}$. For every directed edge $(u,v)$ in $G$, we add an edge $(u_{out}, v_{in})$ in $G'$, with the same cost and lower and upper bounds. We also add a directed edge from $u_{in}$ to $u_{out}$ in $G'$ (for every vertex $u$ in $G$), with cost $cv_G(u)$, lower bound $lbv_G(u)$ and upper bound $ubv_G(u)$. Then we add two special vertices $s$ (source) and $t$ (sink) to $G'$. For every source node $u$ in $G$, we add a directed edge $(s, u_{in})$ in $G'$, with lower bound and cost 0 and upper bound $+\infty$. For every destination node $v$ in $G$, we add a directed edge $(v_{out}, t)$, with lower bound and cost 0 and upper bound $+\infty$. We also add the edges $(s,t)$ and $(t,s)$ with lower bound and cost 0 and upper bound $+\infty$. The resulting graph $G'$ has costs, lower and upper bounds only on its edges and not on its vertices.
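The construction of $G'$ can be sketched as follows. The function name `build_split_graph`, the tuple layout `(tail, head, lower, upper, cost)` and the vertex labels `(u, "in")`, `(u, "out")`, `"s"`, `"t"` are assumptions made for this illustration only.

```python
import math

INF = math.inf


def build_split_graph(edges, vertex_bounds, sources, destinations):
    """Build the edge list of G' from a DAG G with vertex and edge bounds.

    edges:          list of (u, v, lower, upper, cost) for directed edges of G
    vertex_bounds:  {u: (lower, upper, cost)} for every vertex u of G
    sources, destinations: iterables of vertices of G
    Returns a list of (tail, head, lower, upper, cost) edges of G'.
    """
    g = []
    # Each vertex u becomes the edge u_in -> u_out carrying the vertex bounds.
    for u, (lb, ub, cost) in vertex_bounds.items():
        g.append(((u, "in"), (u, "out"), lb, ub, cost))
    # Each original edge (u, v) becomes u_out -> v_in with the same bounds.
    for u, v, lb, ub, cost in edges:
        g.append(((u, "out"), (v, "in"), lb, ub, cost))
    # The special source s feeds every source node, every destination feeds t.
    for u in sources:
        g.append(("s", (u, "in"), 0, INF, 0))
    for v in destinations:
        g.append(((v, "out"), "t", 0, INF, 0))
    # The two extra edges between s and t mentioned in the text.
    g.append(("s", "t", 0, INF, 0))
    g.append(("t", "s", 0, INF, 0))
    return g
```

For a graph with $n$ vertices, $m$ edges, $a$ source nodes and $b$ destination nodes, the returned list contains $n + m + a + b + 2$ edges.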
In order to compute the minimum number of communication streams which satisfy the constraints imposed by $G$, it is enough to compute a (minimum cost) minimum feasible flow in $G'$, from $s$ to $t$. Decomposing the flow into unit-flow paths (in order to obtain the path of each communication stream) can then be done easily. We repeatedly perform a graph traversal (DFS or BFS) from $s$ to $t$ in $G'$, considering only directed edges with positive flow on them. From the traversal tree, by following the "parent" pointers, we can find a path $P$ from $s$ to $t$, containing only edges with positive flow. We compute the minimum flow $f_P$ on any edge of $P$, transform $P$ into $f_P$ unit paths and then decrease the flow on the edges in $P$ by $f_P$. If we remove the first and last vertices on any unit path (i.e. $s$ and $t$), we obtain a path from a vertex $u_{in}$ to a vertex $v_{out}$, where $u$ is a source node in $G$ and $v$ is a destination node in $G$.

We use the algorithm presented in [18] for determining a feasible flow (not necessarily minimum) in a flow network with lower and upper bounds on its edges. We denote this algorithm by $A(F, s, t)$ ($F$ is the flow network given as argument, $s$ is the source vertex and $t$ is the sink vertex). I will describe $A(F, s, t)$ briefly. We construct a new graph $F'$ from $F$, as follows. We maintain all the vertices and edges in $F$. For every directed edge $(u,v)$ in $F$, the directed edge $(u,v)$ in $F'$ has the same cost, lower bound 0 and upper bound $(ube_F(u,v) - lbe_F(u,v))$. We add two extra vertices $s'$ and $t'$ and the following zero-cost directed edges: $(s', u)$ and $(u, t')$ for every vertex $u$ in $F$ (including $s$ and $t$). The lower bound of every edge is 0. The upper bound of a directed edge $(s', u)$ in $F'$ is equal to the sum of the lower bounds of the directed edges $(*, u)$ in $F$.
The upper bound of every directed edge $(u, t')$ in $F'$ is equal to the sum of the lower bounds of the directed edges $(u, *)$ in $F$. The algorithm $A(F, s, t)$ computes a minimum cost maximum flow $g$ in the graph $F'$ (which, as stated, only has upper bounds); if all the costs are 0, only a maximum flow is computed. If $g$ is equal to the sum of the upper bounds of the edges $(s', *)$ (or, equivalently, of the edges $(*, t')$), then a feasible flow from $s$ to $t$ exists in $F$: the flow on every directed edge $(u,v)$ in $F$ is $lbe_F(u,v)$ plus the flow on the edge $(u,v)$ in $F'$.

We first run the algorithm on $G'$ (i.e. call $A(G', s, t)$) in order to verify if a feasible flow exists. If no feasible flow exists, then the constraints cannot be satisfied by any number of streams. Otherwise, we construct a graph $G''$ from $G'$, by adding a new vertex $snew$ and a zero-cost directed edge $(snew, s)$ with lower bound 0 and upper bound $x$. $snew$ is the new source vertex and $x$ is a parameter which is used in order to limit the amount of flow entering the old source vertex $s$. We now perform a binary search on $x$, between 0 and $gmax$, where $gmax$ is the value of the feasible flow computed by calling $A(G', s, t)$. The feasibility test consists of verifying if there exists a feasible flow in the graph $G''$ (i.e. calling $A(G'', snew, t)$). The minimum value of $x$ for which a feasible flow exists in $G''$ is the value of the minimum feasible flow in $G'$, from $s$ to $t$. Obtaining the feasible flow in $G'$ from the feasible flow in $G''$ is trivial: for every directed edge $(u,v)$ in $G'$, we set its amount of flow to the flow of the same edge $(u,v)$ in $G''$.
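The binary search on $x$ can be sketched generically. In the sketch below, `is_feasible` is a hypothetical callback standing in for the call $A(G'', snew, t)$ with the upper bound of the edge $(snew, s)$ set to $x$; the sketch assumes integer flow values and that feasibility is monotone in $x$ (once some $x$ is feasible, every larger value is feasible too).

```python
def min_feasible_value(is_feasible, upper):
    """Smallest integer x in [0, upper] with is_feasible(x) true.

    Assumes is_feasible is monotone (False ... False True ... True) and that
    is_feasible(upper) holds, as is the case for x = gmax in the text.
    """
    lo, hi = 0, upper
    while lo < hi:
        mid = (lo + hi) // 2
        if is_feasible(mid):
            hi = mid          # mid works, try to limit the flow further
        else:
            lo = mid + 1      # mid is too restrictive
    return lo
```

The loop calls the feasibility test $O(\log(gmax))$ times, which is where the logarithmic factor in the overall time complexity comes from.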
The time complexity of the presented algorithm is $O(MF(n,m) \cdot \log(gmax))$, where $gmax$ is a good upper bound on the value of a feasible flow and $MF(n,m)$ is the best time complexity of a (minimum cost) maximum flow algorithm in a directed graph with $n$ vertices and $m$ edges.

5.2 Degree-Constrained Minimum Spanning Tree

In [13], the following problem was considered: given an undirected graph with $n$ vertices and $m$ edges, where each edge $(i,j)$ has a weight $w(i,j) > 0$, compute a spanning tree $MST$ of minimum total weight, such that a special vertex $r$ has degree exactly $k$ in $MST$. A solution was proposed, based on using a parameter $d$ and setting the cost of each edge $(r,j)$ adjacent to $r$ to $c(r,j) = d + w(r,j)$; the cost of the other edges is equal to their weight. Parameter $d$ can range from $-\infty$ to $+\infty$. We denote by $MST(d)$ the minimum spanning tree using the cost functions defined previously. When $d = -\infty$, $MST(d)$ contains the maximum number of edges adjacent to $r$. For $d = +\infty$, $MST(d)$ contains the minimum number of edges adjacent to $r$. We define the function $ne(d)$ = the number of edges adjacent to $r$ in $MST(d)$. $ne(d)$ is non-increasing on the interval $[-\infty, +\infty]$.

We binary search the smallest value $dopt$ of the parameter $d$ in the interval $[-\infty, +\infty]$, such that $ne(dopt) \le k$. We finish the binary search when the length of the search interval is smaller than a small constant $\varepsilon > 0$. If $ne(dopt) = k$, then the edges in $MST(dopt)$ form the required minimum spanning tree. If $ne(dopt) < k$, then $ne(dopt - \varepsilon) > k$. We define $S(d)$ = the set of edges adjacent to vertex $r$ in $MST(d)$. It is easy to prove that $S(dopt)$ is included in $S(dopt - \varepsilon)$. The required minimum spanning tree is constructed in the following manner.
The edges adjacent to vertex $r$ are the edges in $S(dopt)$, to which we add $(k - ne(dopt))$ arbitrary edges from the set $S(dopt - \varepsilon) \setminus S(dopt)$. Once these edges are fixed, we construct the following graph $G$: we set the cost of the chosen edges to 0 and the cost of the other edges $(i,j)$ to $w(i,j)$. We now compute a minimum spanning tree $MST_G$ in $G$. The edges in $MST_G$ are the edges of the minimum spanning tree of the original graph, in which vertex $r$ has degree exactly $k$. The time complexity of this approach is $O(m \cdot \log(m) \cdot \log(DMAX))$, where $DMAX$ denotes the range over which we search the parameter $d$. When $m$ is not too large (i.e. $m$ is not of the order $O(n^2)$), this represents an improvement over the $O(n^2)$ solution given in [13].

6 Matching Problems

6.1 Maximum Weight Matching in an Extended Tree

Let us consider a rooted tree $T$ (with vertex $r$ as the root). Each vertex $i$ has a weight $w(i)$. We want to find a matching in the following graph $G$ (extended tree), having the same vertices as $T$ and an edge $(x,y)$ between two vertices $x$ and $y$ if: (i) $x$ and $y$ are adjacent in the tree; or (ii) $x$ and $y$ have the same parent in the tree. The weight of an edge $(x,y)$ in $G$ is $|w(x) - w(y)|$. The weight of a matching is the sum of the weights of its edges. We are interested in a maximum weight matching in the graph $G$.

For each vertex $i$, we sort its sons $s(i,1), \ldots, s(i,ns(i))$ in non-decreasing order of their weights, i.e. $w(s(i,1)) \le \ldots \le w(s(i,ns(i)))$. We compute for each vertex $i$ two values: $A(i)$ = the maximum weight of a matching in $T(i)$ if vertex $i$ is the endpoint of an edge in the matching, and $B(i)$ = the maximum weight of a matching in $T(i)$ if vertex $i$ is not the endpoint of any edge in the matching.
In or der to compute these v alues, we will compute the following tables for every vertex i : C A ( i, j, k )=the maximum weigh t o f a matching in T ( i ) if vertex i is the endpo int of an edge in the ma tc hing a nd we only co nsider its sons s ( i , j ) , s ( i, j + 1) , . . . , s ( i, k ) (and their subtrees). Similar ly , we hav e C B ( i, j, k ), where v ertex i does not belong to an y edge in the matching. The maximum w eight of a matching is max { A ( r ) , B ( r ) } . The actual matching can b e computed ea sily , b y tracing back the wa y the A ( i ), B ( i ), C A ( i, ∗ , ∗ ) a nd C B ( i , ∗ , ∗ ) v alues were computed. A r ecursive algor ithm (called with r as its argument) is given b elow. The time complexity is O ( ns ( i ) 2 ) for a vertex i and, thus, O ( n 2 ) ov erall. Maxim umW eigh tMatc hi ng-ExtendedT ree(i): if (n s(i)=0) then A(i)=B(i)=0 else for j=1 to ns(i) do M axim umW ei gh tMatc hi ng-ExtendedT ree (s(i,j)) for j=1 to ns(i) do CA(i, j, j - 1)= −∞ ; C A ( i, j, j ) = | w ( i ) − w ( s ( i, j )) | + B ( s ( i, j )) CB(i, j, j - 1)= 0; CB(i, j, j)=max { A(s(i,j)), B(s(i,j)) } for c ount = 1 to (ns(i)-1) do for j=1 to (ns(i)-c ount) d o k = j + c ount CA(i,j,k)=max {| w ( s ( i, j )) − w ( s ( i, k )) | + B(s(i,j)) + B(s(i,k)) + CA(i, j + 1, k - 1), | w ( i ) − w ( s ( i, j )) | + B(s(i,j)) + CB(i, j+1, k), | w ( i ) − w ( s ( i, k )) | + B(s(i,k)) + CB(i, j, k-1), max { A(s( i,j)), B(s(i,j)) } + CA(i, j+1, k), max { A(s(i,k)), B(s(i,k)) } + CA(i, j, k-1) } CB(i,j, k)=max {| w ( s ( i, j )) − w ( s ( i, k )) | + B(s(i,j)) + B(s(i,k)) + CB(i, j + 1, k - 1), max { A(s(i,j)), B(s(i,j)) } + CB(i, j+1, k), max { A( s ( i,k)), B(s(i,k)) } + CB(i, j, k-1) } A(i)=CA(i,1,ns(i)); B(i)=CB(i,1,ns(i)) 6.2 Maxim um Matc hing in the P o w er of a Gr aph The k th power G k ( k ≥ 2) of a graph G is a graph with the sa me set o f vertices as G , where there exists an edge ( x, y ) betw een t wo vertices x and y if the distance betw een x and y in G is a t 
most $k$. The distance between two vertices $x$ and $y$ in a graph is the minimum number of edges which need to be traversed in order to reach vertex $y$, starting from vertex $x$. A maximum matching in $G^k$ of a graph $G$ can be found by restricting our attention to a spanning tree $T$ of $G$. The following linear algorithm (called with $i = r$), using observations from [12], solves the problem (we consider that, initially, no vertex is matched):

MaximumMatchingGk(i):
    if (ns(i)=0) then return()
    else
        last_son = 0
        for j=1 to ns(i) do // j=1,2,...,ns(i)
            MaximumMatchingGk(s(i,j))
            if (not matched(s(i,j))) then
                if (last_son = 0) then last_son = s(i,j)
                else
                    add edge (last_son, s(i,j)) to the matching
                    matched(last_son) = matched(s(i,j)) = true; last_son = 0
        if (last_son > 0) then
            add edge (i, last_son) to the matching
            matched(i) = matched(last_son) = true

7 First Fit Online Tree Coloring

A very intuitive algorithm for coloring a graph with $n$ vertices is the first-fit online coloring heuristic. We traverse the vertices in some order $v(1), v(2), \ldots, v(n)$. We assign color 1 to $v(1)$ and, for $i = 2, \ldots, n$, we assign to $v(i)$ the minimum color $c(i) \ge 1$ which was not assigned to any of its neighbours $v(j)$ ($j < i$). A tree is 2-colorable: we root the tree at any vertex $r$ and then compute for each vertex $i$ its level in the tree (distance from the root); we assign the color 1 to the vertices on even levels and the color 2 to those on odd levels. However, in some situations, we might be forced to process the vertices in a given order. In this case, it would be useful to compute the worst-case coloring that can be obtained by this heuristic, i.e. the largest number of colors that are used, under the worst-case ordering of the tree vertices (Grundy number).
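To make the heuristic concrete, here is a minimal Python sketch of first-fit coloring for a given vertex order. The function name `first_fit_coloring` and the adjacency-dictionary representation are assumptions made for illustration only; they do not come from the paper.

```python
def first_fit_coloring(adj, order):
    """Color the vertices in the given order with the first-fit rule.

    adj   -- dict mapping each vertex to a list of its neighbours (assumed format)
    order -- list containing every vertex exactly once
    Returns a dict vertex -> color (colors are 1, 2, 3, ...).
    """
    color = {}
    for v in order:
        # Colors already taken by the neighbours that were processed earlier.
        used = {color[u] for u in adj[v] if u in color}
        c = 1
        while c in used:
            c += 1
        color[v] = c
    return color


# A path with 4 vertices: 0 - 1 - 2 - 3.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
good = first_fit_coloring(path, [0, 1, 2, 3])  # uses colors 1 and 2 only
bad = first_fit_coloring(path, [0, 3, 1, 2])   # vertex 2 is forced to color 3
```

On this path, the order 0, 1, 2, 3 yields a 2-coloring, whereas the order 0, 3, 1, 2 forces a third color on vertex 2. This is the kind of worst case that the Grundy number measures.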
I will present an $O(n \cdot \log(\log(n)))$ algorithm for this problem, similar in nature to the linear algorithm presented in [4]. For each vertex $i$, we will compute $cmax(i)$ = the largest color that can be assigned to vertex $i$ in the worst case, if vertex $i$ is the last vertex to be colored. The value $\max\{cmax(i) \mid 1 \le i \le n\}$ is the largest number of colors that can be assigned by the first-fit online coloring heuristic. We will root the tree at an arbitrary vertex $r$. The algorithm consists of two stages. In the first stage, the tree is traversed bottom-up and for each vertex $i$ we compute $c(1,i)$ = the largest color that can be assigned to vertex $i$, considering only the tree $T(i)$. For a leaf vertex $i$, we have $c(1,i) = 1$. For a non-leaf vertex $i$, we will sort its sons $s(i,1), \ldots, s(i,ns(i))$, such that $c(1,s(i,1)) \le c(1,s(i,2)) \le \ldots \le c(1,s(i,ns(i)))$. We will initialize $c(1,i)$ to 1 and then consider the sons in the sorted order. When we reach son $s(i,j)$, we compare $c(1,s(i,j))$ with $c(1,i)$. If $c(1,s(i,j)) \ge c(1,i)$, then we increment $c(1,i)$ by 1 (otherwise, $c(1,i)$ stays the same). The justification of this algorithm is the following: if a vertex $i$ can be assigned color $c(1,i)$ in some ordering of the vertices in $T(i)$, then there exists an ordering in which it can be assigned any other color $c'$, such that $1 \le c' \le c(1,i)$. Then, when traversing the sons and reaching a son $s(i,j)$ with $c(1,s(i,j)) \ge c(1,i)$, we consider an ordering of the vertices in $T(s(i,j))$, where the color of vertex $s(i,j)$ is $c(1,i)$; thus, we can increase the maximum color that can be assigned to vertex $i$. After the bottom-up tree traversal, we have $cmax(r) = c(1,r)$, but we still have to compute the values $cmax(i)$ for the other vertices of the tree.
We could do that by rooting the tree at every vertex $i$ and running the previously described algorithm, but this would take $O(n^2 \cdot \log(\log(n)))$ time. However, we can compute these values faster, by traversing the tree vertices in a top-down manner (considering the tree rooted at $r$). For each vertex $i$, we will compute $colmax(parent(i), i)$ = the maximum color that can be assigned to $parent(i)$ if we remove $T(i)$ from the tree and afterwards we consider $parent(i)$ to be the (new) root of the tree. We will use the values $c(2,i)$ as temporary storage variables. $c(2,i)$ is initialized to $c(1,i)$, for every vertex $i$. When computing $cmax(i)$, we consider that vertex $i$ is the root of the tree. Let's assume that we computed the value $cmax(i)$ of a vertex $i$ and now we want to compute the value $cmax(j)$ of a vertex $j$ which is a son of vertex $i$. We remove $j$ from the list of sons of vertex $i$ and add $parent(i)$ to this list ($parent(i)$ = vertex $i$'s parent in the tree rooted at the initial vertex $r$). We now need to lift vertex $j$ above vertex $i$ and make $j$ the new root of the tree. In order to do this, we will recompute the value $c(2,i)$, which is computed similarly to $c(1,i)$, except that we consider the new list of sons for vertex $i$ (and their $c(2,*)$ values). Afterwards, we add vertex $i$ to the list of sons of vertex $j$. We will compute the value $cmax(j)$ similarly to the value $c(1,j)$, using the values $c(2,*)$ of vertex $j$'s sons (instead of the $c(1,*)$ values of the sons). After computing $cmax(j)$ we restore the lists of sons of vertices $i$ and $j$ to their original states (as if the tree were rooted at the initial vertex $r$). After computing the values $cmax(u)$ of all the descendants $u$ of a vertex $j$, we reset the value $c(2,j)$ to $c(1,j)$.
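The two stages can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the paper's exact procedure: instead of editing the lists of sons in place and reusing the $c(2,*)$ values, it stores $colmax(parent(v), v)$ in a dictionary called `up`, and it uses a plain comparison sort, so it does not aim at any particular time bound. The names `grundy_number_of_tree`, `compute`, `c1`, `up` and the adjacency-dictionary input are hypothetical.

```python
def grundy_number_of_tree(adj, root=0):
    """Return (Grundy number, cmax) for a tree given as dict vertex -> neighbours."""

    def compute(values):
        # Largest color reachable at a vertex, given the best colors of its sons.
        c = 1
        for v in sorted(values):
            if v >= c:
                c += 1
        return c

    # Root the tree: parent pointers and a breadth-first (top-down) order.
    parent = {root: None}
    order = [root]
    for v in order:  # the list grows while we iterate over it
        for u in adj[v]:
            if u != parent[v]:
                parent[u] = v
                order.append(u)

    # Stage 1 (bottom-up): c1[v] considers only the subtree T(v).
    c1 = {}
    for v in reversed(order):
        c1[v] = compute(c1[u] for u in adj[v] if u != parent[v])

    # Stage 2 (top-down): up[v] plays the role of colmax(parent(v), v).
    up = {}
    cmax = {root: c1[root]}
    for v in order[1:]:
        p = parent[v]
        vals = [c1[u] for u in adj[p] if u != v and u != parent[p]]
        if parent[p] is not None:
            vals.append(up[p])  # contribution of the part of the tree above p
        up[v] = compute(vals)
        cmax[v] = compute([c1[u] for u in adj[v] if u != p] + [up[v]])

    return max(cmax.values()), cmax
```

For the path 0 - 1 - 2 - 3 rooted at vertex 0, the sketch yields `cmax = {0: 2, 1: 3, 2: 3, 3: 2}`, so the largest number of colors is 3. Because `cmax(v)` does not depend on the initial root, calling the function with different roots should return the same dictionary.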
Both traversals take $O(n \cdot \log(n))$ time, if we sort the $ns(i)$ sons of every vertex $i$ in $O(ns(i) \cdot \log(ns(i)))$ time. However, it has been proved in [4] that the minimum number of vertices of a tree with the Grundy number $q$ is $2^{q-1}$, which is the binomial tree $B(q-1)$. The binomial tree $B(0)$ consists of only one vertex. The binomial tree $B(k \ge 1)$ has a root vertex with $k$ neighbors; the $i$th of these neighbors ($0 \le i \le k-1$) is the root of a $B(i)$ binomial tree. Thus, every value $c(1,*)$, $c(2,*)$ and $cmax(*)$ can be represented using $O(\log(\log(n)))$ bits. We can use radix-sort and obtain an $O(n \cdot \log(\log(n)))$ time complexity. The pseudocode of the functions is given below. The main algorithm consists of calling FirstFit-BottomUp(r), initializing the $c(2,*)$ values to the $c(1,*)$ values, setting $cmax(r) = c(1,r)$ and then calling FirstFit-TopDown(r).

Compute(i, idx):
    sort the sons of vertex i, such that c(idx,s(i,1)) ≤ ... ≤ c(idx,s(i,ns(i)))
    c(idx,i)=1
    for j=1 to ns(i) do
        if (c(idx,s(i,j)) ≥ c(idx,i)) then c(idx,i)=c(idx,i)+1

FirstFit-BottomUp(i):
    for j=1 to ns(i) do FirstFit-BottomUp(s(i,j))
    Compute(i, 1)

FirstFit-TopDown(i):
    if (i ≠ r) then
        remove vertex i from the list of sons of parent(i)
        add parent(parent(i)) to the list of sons of parent(i) (if parent(i) ≠ r)
        Compute(parent(i),2); colmax(parent(i),i)=c(2,parent(i))
        add parent(i) to the list of sons of vertex i
        Compute(i,2); cmax(i)=c(2,i)
        restore the original lists of sons of the vertices parent(i) and i
    for j=1 to ns(i) do FirstFit-TopDown(s(i,j))
    c(2,i)=c(1,i)

8 Other Optimization and Counting Problems

8.1 Building a (Constrained) Tree with Minimum Height

In this subsection I consider the following optimization problem: We are given a sequence of $n$ leaves and each leaf $i$ ($1 \le i \le n$) has a height $h(i)$.
We want to construct a (strict) binary tree with $n-1$ internal nodes, such that, in an inorder traversal of the tree, we encounter the $n$ leaves in the given order. The height of an internal node $i$ is $h(i) = 1 + \max\{h(leftson(i)), h(rightson(i))\}$ (the height of the leaves is given). We are interested in computing a tree whose root has minimum height. A straightforward dynamic programming solution is the following: compute $Hmin(i,j)$ = the minimum height of a tree containing the leaves $i, i+1, \ldots, j$. We have $Hmin(i,i) = h(i)$ and, for $i < j$, $Hmin(i,j) = 1 + \min_{i \le k \le j-1} \max\{Hmin(i,k), Hmin(k+1,j)\}$. $Hmin(1,n)$ is the answer to our problem. However, the time complexity of this algorithm is $O(n^3)$, which is unsatisfactory. An optimal, linear-time algorithm was given in [14]. The main idea of this algorithm is the following. We traverse the leaves from left to right and maintain information about the rightmost path of the optimal tree for the first $i$ leaves. Then, we can add the $(i+1)$st leaf by modifying the rightmost path of the optimal tree for the first $i$ leaves. Let's assume that we processed the first $i$ leaves and the optimal tree for these leaves contains, on its rightmost path, the vertices $v(1), v(2), \ldots, v(nv(i))$, in order, from the root to the rightmost leaf ($v(1)$ is the root). Let's assume that the heights of the subtrees rooted at these vertices are $hv(1), \ldots, hv(nv(i))$. It is easy to build this tree for $i = 1$ and $i = 2$ (it is unique). When adding the $(i+1)$st leaf, we traverse the rightmost path from index $j = nv(i)$ down to $j = 2$. Assume that we are considering the vertex $v(j)$. If $hv(j-1) < (2 + \max\{hv(j), h(i+1)\})$, then we remove the vertex $v(j)$ from the rightmost path and move to the next vertex ($v(j-1)$). Let's assume that the path now contains the vertices $v(1), \ldots, v(nv'(i))$.
We replace vertex $v(nv'(i))$ by a new vertex $v_{new}$, whose left son will be $v(nv'(i))$ (together with its subtree) and whose right son will be the $(i+1)$st leaf. The height of the new vertex will be $1 + \max\{hv(nv'(i)), h(i+1)\}$. The rightmost path of the optimal tree behaves like a stack and, thus, the overall time complexity is linear.

I will present a sub-optimal $O(n \cdot \log(n))$ time algorithm which is interesting on its own. The algorithm is similar to Huffman's algorithm for computing optimal prefix-free codes, except that it maintains the order of the leaves. A suggestion that such an approach might work was given to me by C. Gheorghe. At step $i$ ($1 \le i \le n-1$) of the algorithm, we will have $n-i+1$ subtrees of the optimal tree. Each subtree $j$ contains an interval of leaves $[leftleaf(j), rightleaf(j)]$ and its height is $h(j)$. We will combine the two adjacent subtrees $j$ and $j+1$ whose combined height ($1 + \max\{height(\text{subtree } j), height(\text{subtree } j+1)\}$) is minimum among all the $O(n)$ pairs of adjacent subtrees. At the first step, the $n$ subtrees are represented by the $n$ leaves, whose heights are given. A straightforward implementation of this idea leads to an $O(n^2)$ algorithm. However, the processing time can be improved by using two segment trees [15], $A$ and $B$, with $n$ and $n-1$ leaves, respectively. Each node $q$ of a segment tree corresponds to an interval of leaves $[left(q), right(q)]$ (leaves are numbered starting from 1). Each leaf node of the segment tree $A$ can be in the active or inactive state. Each node $q$ of $A$ (whether leaf or internal node) maintains a value $nactive(q)$, denoting the number of active leaves in its subtree.
Initially, each of the $n$ leaves of $A$ is active and the $nactive(*)$ values are initialized appropriately, in a bottom-up manner (1, for a leaf node, and $nactive(leftson(q)) + nactive(rightson(q))$, for an internal node $q$). Segment tree $B$ has $n-1$ leaves and each node of $B$ (leaf or internal node) stores a value $hc$. If leaf $i$ ($1 \le i \le n-1$) is active in $A$, then $hc(\text{leaf } i) = 1 + \max\{h(i), h(j)\}$, where $j > i$ is the next active leaf. If leaf $i$ is not active in $A$ or is the last active leaf, then $hc(\text{leaf } i) = +\infty$. The value $hc$ of each internal node $q$ of $B$ is the minimum among all the $hc$ values of the leaves in node $q$'s subtree, i.e. $hc(\text{node } q) = \min\{hc(leftson(q)), hc(rightson(q))\}$. Moreover, each node $q$ of $B$ maintains the number $lnum$ of the leaf in its subtree which gives the value $hc(\text{node } q)$. We have $lnum(\text{leaf } i) = i$ and, for an internal node $q$, $lnum(q) = lnum(leftson(q))$ if $hc(leftson(q)) \le hc(rightson(q))$, and $lnum(q) = lnum(rightson(q))$ otherwise. At each step $i$ ($1 \le i \le n-1$), each active leaf is the leftmost leaf of a subtree of the optimal tree. After every step, the number of active leaves decreases by 1. We can find in $O(\log(n))$ time the pair of adjacent subtrees to combine. The height of the combination of these subtrees is $hc(\text{root node of } B)$, the leftmost leaf of the first subtree is $i = lnum(\text{root node of } B)$ and that of the second subtree is $j = next\_active(i)$. We define the function $next\_active$ by using two other functions: $rank(i)$ and $unrank(r)$. $rank(i)$ returns the number of active leaves before leaf $i$ ($0 \le rank(i) \le nactive(\text{root node of } A) - 1$). $unrank(r)$ returns the index of the leaf whose rank is $r$. The two functions are inverses of each other: $unrank(rank(i)) = i$ and $rank(unrank(r)) = r$. We have $rank(i) = rank'(i, \text{root node of } A)$, $unrank(r) = unrank'(r, \text{root node of } A)$ and $next\_active(i) = unrank(rank(i) + 1)$.
rank'(i, q):
    if (q is a leaf node) then
        if (left(q)=right(q)=i) then return (0) else return (-1)
    else
        if (i > right(leftson(q))) then return (nactive(leftson(q)) + rank'(i, rightson(q)))
        else return (rank'(i, leftson(q)))

unrank'(r, q):
    if (q is a leaf node) then
        if (r > 0) then return (-1) else return (left(q))
    else
        if (nactive(leftson(q)) ≤ r) then return (unrank'(r - nactive(leftson(q)), rightson(q)))
        else return (unrank'(r, leftson(q)))

The functions $rank$, $unrank$ and $next\_active$ take $O(\log(n))$ time each. After obtaining the indices of the two active leaves $i$ and $j$ whose corresponding subtrees are united (by adding a new internal node whose left son is the root of $i$'s subtree and whose right son is the root of $j$'s subtree), we mark leaf $j$ as inactive. We do this by traversing the segment tree $A$ from leaf $j$ towards the root (from $j$ to $parent(j)$, $parent(parent(j))$, ..., root node of $A$) and decrementing by 1 the $nactive$ values of the visited nodes. Then, we change the $h$ values of leaves $i$ and $j$. We set $h(i) = hc(\text{root node of } B)$ and $h(j) = +\infty$. After this, we will also change the $hc$ values associated to the leaves $i$ and $j$ in the segment tree $B$. The new $hc$ value of leaf $j$ will be $+\infty$. If $i$ is now the last active leaf, then $hc(\text{leaf } i)$ becomes $+\infty$, too. Otherwise, let $j' = next\_active(i)$, the next active leaf after $i$ (at this point, leaf $j$ is not active anymore). We will change $hc(\text{leaf } i)$ to $1 + \max\{h(i), h(j')\}$. Because $h(i)$ has changed, the $hc$ value of the active leaf $i''$ which precedes $i$ (if any, i.e. $i'' = unrank(rank(i) - 1)$) must also be updated, to $1 + \max\{h(i''), h(i)\}$. After changing the $hc$ value of a leaf $k$, we traverse the tree from leaf $k$ towards the root (visiting all of $k$'s ancestors, in order, starting from $parent(k)$ and ending at the root of $B$). For each ancestor node $q$, we recompute $hc(\text{node } q)$ as $\min\{hc(leftson(q)), hc(rightson(q))\}$ (and $lnum(q)$ accordingly).
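The following Python sketch illustrates two of the approaches from this subsection on small inputs: the cubic dynamic program over $Hmin(i,j)$ and the straightforward quadratic version of the adjacent-merge procedure (without the segment trees). The function names and the 0-based list representation of the leaf heights are assumptions made for illustration; the sketch only reports the height of the root, not the tree itself.

```python
from functools import lru_cache


def min_height_dp(h):
    """Cubic dynamic program: Hmin over intervals of leaves (0-based, inclusive)."""
    n = len(h)

    @lru_cache(maxsize=None)
    def hmin(i, j):
        if i == j:
            return h[i]  # a single leaf keeps its given height
        # Try every split point k: leaves i..k go left, k+1..j go right.
        return 1 + min(max(hmin(i, k), hmin(k + 1, j)) for k in range(i, j))

    return hmin(0, n - 1)


def min_height_adjacent_merge(h):
    """Quadratic version of the adjacent-merge rule (no segment trees)."""
    heights = list(h)
    while len(heights) > 1:
        # Leftmost adjacent pair with the smallest combined height.
        best = min(range(len(heights) - 1),
                   key=lambda j: 1 + max(heights[j], heights[j + 1]))
        merged = 1 + max(heights[best], heights[best + 1])
        heights[best:best + 2] = [merged]  # the pair becomes one subtree
    return heights[0]
```

For the leaf heights `[3, 0, 0]`, both functions return 4: the two leaves of height 0 are joined first, and the resulting subtree of height 1 is then joined with the leaf of height 3.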
8.2 The Number of Trees with a Fixed Number of Leaves

In order to compute the number of labeled trees with $n$ vertices and exactly $p$ leaves, we will compute a table $NT(i,j)$ = the number of trees with $i$ vertices and exactly $j$ leaves ($1 \le j \le i \le n$). Obviously, we have $NT(1,1) = NT(2,2) = 1$ and $NT(i,j) = 0$ for $i = 1, 2$ and $j \ne i$. For $i > 2$, we have $NT(i,i) = 0$ and for $1 \le j \le i-1$, we will proceed as follows. The $j$ leaves can be chosen in $C(i,j)$ ways ($i$ choose $j$). After choosing the identifiers of the $j$ leaves, we will conceptually remove the leaves from the tree, thus remaining with a tree having $i-j$ vertices and any number of leaves $k$ ($1 \le k \le j$). Each of the $j$ leaves that we conceptually removed is adjacent to one of the $i-j$ remaining vertices. Furthermore, each of the $k$ leaves of the smaller tree is adjacent to at least one of the $j$ leaves from the larger tree (otherwise, it would also be a leaf of the larger tree). If the smaller tree had no vertices other than its $k$ leaves, we would need to compute the number of surjective functions $f$ from a domain of size $j$ to a codomain of size $k$. We will denote this value by $NF(j,k)$. This is a "classical" problem, but I will present a simple solution, nevertheless. We have $NF(0,0) = 1$, $NF(j,0) = 0$ for $j \ge 1$, and $NF(j,k) = 0$ if $j < k$. In order to compute the values for $k \ge 1$ and $j \ge k$, we will consider every number $g$ of values $x$ from the set $\{1, \ldots, j\}$ for which $f(x) = k$. Once $g$ is fixed, we have $C(j,g)$ ways of choosing the $g$ values from the set $\{1, \ldots, j\}$. For each such possibility we have $NF(j-g, k-1)$ ways of extending it to a surjective function. Thus, $NF(j,k) = \sum_{g=1}^{j} C(j,g) \cdot NF(j-g, k-1)$. We can tabulate all the $NF(*,*)$ values in $O(n^3)$ time (after tabulating the combinations $C(*,*)$ in $O(n^2)$ time, first). In general, however, a removed leaf may also be adjacent to one of the $i-j-k$ non-leaf vertices of the smaller tree, so surjections onto the $k$ leaves are not sufficient (they give $NT(6,3) = 360$ instead of the correct value 720). We therefore use $NG(j,v,k)$ = the number of functions from a domain of size $j$ to a set of $v$ vertices, whose image contains $k$ designated vertices ($NF(j,k) = NG(j,k,k)$ is the special case $v = k$). We have $NG(j,v,0) = v^j$ and, for $k \ge 1$, $NG(j,v,k) = NG(j,v,k-1) - NG(j,v-1,k-1)$: from the functions covering the first $k-1$ designated vertices we subtract those which avoid the $k$th one. These values can also be tabulated in $O(n^3)$ time. With the $NG(*,*,*)$ values computed, we have $NT(i,j) = C(i,j) \cdot \sum_{k=1}^{\min(j,\, i-j)} NT(i-j,k) \cdot NG(j, i-j, k)$. For example, $NT(6,3) = C(6,3) \cdot NT(3,2) \cdot NG(3,3,2) = 20 \cdot 3 \cdot 12 = 720$.
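A minimal Python sketch of this layered counting idea is given here. It counts the attachments of the $j$ removed leaves to all of the $i-j$ remaining vertices, requiring only that each of the $k$ leaves of the smaller tree receives at least one removed leaf; with this counting, the row sums reproduce Cayley's formula $i^{i-2}$. The function name and the table name `ng` (filled by the inclusion-exclusion recurrence `ng[j][v][k] = ng[j][v][k-1] - ng[j][v-1][k-1]`) are illustrative assumptions.

```python
from math import comb


def count_labeled_trees_by_leaves(n):
    """Return nt, where nt[i][j] = number of labeled trees with i vertices and j leaves."""
    # ng[j][v][k]: functions from j removed leaves to v vertices whose image
    # contains k designated vertices (the leaves of the smaller tree).
    ng = [[[0] * (n + 1) for _ in range(n + 1)] for _ in range(n + 1)]
    for j in range(1, n + 1):
        for v in range(n + 1):
            ng[j][v][0] = v ** j
            for k in range(1, v + 1):
                # Cover the first k-1 designated vertices, minus those
                # functions that avoid the k-th designated vertex.
                ng[j][v][k] = ng[j][v][k - 1] - ng[j][v - 1][k - 1]

    nt = [[0] * (n + 1) for _ in range(n + 1)]
    if n >= 1:
        nt[1][1] = 1
    if n >= 2:
        nt[2][2] = 1
    for i in range(3, n + 1):
        for j in range(1, i):
            rest = i - j  # vertices left after removing the j leaves
            total = sum(nt[rest][k] * ng[j][rest][k]
                        for k in range(1, min(j, rest) + 1))
            nt[i][j] = comb(i, j) * total
    return nt
```

For `n = 6`, the row `nt[6]` contains 360, 720, 210 and 6 trees with 2, 3, 4 and 5 leaves, respectively, which sum to $6^4 = 1296$.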
We can easily compute each entry $NT(i,j)$ in $O(n)$ time, obtaining an $O(n^3)$ overall time complexity. The technique of performing dynamic programming on successive layers of leaves of a tree is also useful in several other counting problems.

8.3 The Number of Trees with Degree Constraints

We want to compute the number of unlabeled, rooted trees with $n \ge 2$ vertices, such that the (degree / number of sons) of each vertex belongs to a set $S$, which is a subset of $\{0, 1, 2, \ldots, n-1\}$. By (a/b) we mean that a refers to the degree-constrained problem and b refers to the number-of-sons-constrained problem (everything else being the same). Because every tree with $n \ge 2$ vertices must contain at least a leaf (a vertex of degree 1) and at least one vertex with at least 1 son, the set $S$ will always contain the subset ($\{1\}$ / $\{0,1\}$). We will compute a table $NT(i,j,p)$ = the number of trees with $i$ vertices, such that the root has degree $j$ ($j$ sons) and the number of vertices in the subtree of any son of the root is at most $p$; moreover, except perhaps the tree root, the (degrees / numbers of sons) of all the other vertices belong to the set $S$. Because the trees are unlabeled, we can sort the sons of each vertex in non-decreasing order of the numbers of vertices in their subtrees. Thus, we will compute the table $NT$ in increasing order of $p$. $NT(1,0,p) = 1$ and $NT(1, j > 0, p) = NT(i \ge 2, j, 0) = 0$. For $p \ge 1$ and $i \ge 2$, we have:

$$NT(i,j,p) = NT(i,j,p-1) + \sum_{k=1}^{\lfloor (i-1)/p \rfloor} NT(i - k \cdot p,\; j-k,\; p-1) \cdot CR(TT(p), k)$$

$TT(p)$ is the total number of trees with $p$ vertices, for which the (degree / number of sons) of the root is equal to some (($x-1$)/($x$)), $x \in S$, and the (degrees / numbers of sons) of the other vertices belong to the set $S$.
By $CR(i,j)$ we denote combinations with repetitions of $i$ elements, out of which we choose $j$. Because the argument $i$ can be very large, we cannot tabulate $CR(i,j)$. Instead, we will compute it on the fly. We know that $CR(i,j) = C(i+j-1, j)$ and that $C(i,j) = \frac{i-j+1}{j} \cdot C(i,j-1)$. Thus, $CR(i,j)$ can be computed in $O(j)$ time. Before computing any value $NT(*,*,p)$, we need to compute and store the values $TT(p)$, $TT(p) = \sum_{x \in S} NT(p, ((x-1)/(x)), p-1)$, and $CR(TT(p), k)$, for all the values of $k$ ($1 \le k \le \lfloor (n-1)/p \rfloor$). We can compute all of these values in $O(n^3 \cdot \log(n))$ time. The desired number of trees is $\sum_{x \in S} NT(n, x, n-1)$. The memory storage can be reduced from $O(n^3)$ to $O(n^2)$, by noticing that the values $NT(*,*,p)$ are computed based only on the values $NT(*,*,p-1)$. Thus, we can maintain these values only for the most recent two values of $p$.

A less efficient method is to compute the numbers $Tok(i)$ = the number of trees with $i$ vertices, such that each vertex satisfies the (degree / number of sons) constraints. $Tok(1) = Tok(2) = 1$. We will make use of the $TT(i)$ values defined previously, except that they will be computed differently. For every $i \ge 2$, we consider every possible number $x$ of sons of the tree root and compute $NT2(i,x)$ = the number of trees with $i$ vertices, such that the tree root has $x$ sons and all the other vertices satisfy the (degree / number of sons) constraints. We will generate all the possibilities $(y(1), y(2), \ldots, y(i-1))$, with $0 \le y(j) \le \lfloor (i-1)/j \rfloor$ ($1 \le j \le i-1$), $y(1) + \ldots + y(i-1) = x$ and $1 \cdot y(1) + 2 \cdot y(2) + \ldots + (i-1) \cdot y(i-1) = i-1$ (the subtrees of the sons contain all the vertices except the root). $y(j)$ is the number of sons of the tree root which have $j$ vertices in their subtrees. The number of trees "matching" such a partition is equal to $\prod_{j=1}^{i-1} CR(TT(j), y(j))$.
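The first (table-based) method can be sketched in Python for the number-of-sons variant only, keeping a single layer of the table at a time. The function name, the argument `allowed` (playing the role of $S$) and the use of `math.comb` for $CR(i,j) = C(i+j-1, j)$ are assumptions of this illustration; the degree-constrained variant would need the $(x-1)$ adjustment for the root of each son's subtree and is not covered here.

```python
from math import comb


def count_trees_with_allowed_son_counts(n, allowed):
    """Count unlabeled rooted trees with n vertices in which every vertex
    (the root included) has a number of sons belonging to `allowed`."""
    allowed = set(allowed)
    tt = [0] * (n + 1)  # tt[p]: valid trees with exactly p vertices

    # nt[i][j]: trees with i vertices whose root has j sons, every son subtree
    # having at most p vertices (p = 0 initially); non-root vertices are valid.
    nt = [[0] * n for _ in range(n + 1)]
    nt[1][0] = 1

    for p in range(1, n):
        # nt currently holds layer p-1, which is what TT(p) needs.
        tt[p] = sum(nt[p][x] for x in allowed if x < n)
        new = [row[:] for row in nt]
        for i in range(2, n + 1):
            for j in range(1, i):
                # k sons of the root have subtrees with exactly p vertices.
                for k in range(1, min(j, (i - 1) // p) + 1):
                    ways = comb(tt[p] + k - 1, k)  # CR(TT(p), k)
                    new[i][j] += nt[i - k * p][j - k] * ways
        nt = new

    return sum(nt[n][x] for x in allowed if x < n)
```

With `allowed = range(n)` (no constraint) the sketch counts all unlabeled rooted trees, giving 9 for `n = 5`. With `allowed = {0, 2}` it counts unordered full binary trees, giving 2 for `n = 7`.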
$NT2(i,x)$ is computed by summing the numbers of trees "matching" every partition. Afterwards, if $x \in S$, we add $NT2(i,x)$ to $Tok(i)$. If $x$ = (($y-1$)/($y$)) and $y \in S$, then we add $NT2(i,x)$ to $TT(i)$. $NT2(i,x)$ may be added to both $Tok(i)$ and $TT(i)$.

9 Related Work

Reliability analysis and improvement techniques for distributed systems were considered in [6,7]. Reliability analysis and optimization for tree networks in particular were considered in [3,5,8]. Different kinds of tree partitioning algorithms, based on optimizing several objectives, were proposed in [9,10,16]. Problems related to tree coloring were studied in [4]. Content delivery in distributed systems is a subject of high practical and theoretical interest and is studied from multiple perspectives. Communication scheduling in tree networks was considered in many papers (e.g. [17]) and the optimization of content delivery trees (multicast trees) was studied in [11].

10 Conclusions and Future Work

In this paper I considered several optimization problems regarding distributed systems with tree topologies (e.g. peer-to-peer networks, wireless networks, Grids), which have many practical applications: minimum weight cycle completion (reliability improvement), constrained partitioning (distributed coordination and control), minimum number of streams and degree-constrained minimum spanning trees (efficient content delivery), optimal matchings (data replication and resource allocation), coloring (resource management and frequency allocation) and tree counting aspects. All these problems are variations or extensions of problems which have been previously posed in other research papers. The presented techniques are either better (faster or more general) than the previous solutions or easier to implement.

References

1. J. Roskind, R. E. Tarjan, A Note on Finding Minimum-Cost Edge-Disjoint Spanning Trees, Mathematics of Operations Research 10 (4) (1985), 701-708.
2. M. A. Bender, M. Farach-Colton, The LCA Problem Revisited, Lecture Notes in Computer Science 1776 (2000), 88-94.
3. M. Scortaru, National Olympiad in Informatics, Gazeta de informatica (Informatics Gazette) 12 (7) (2002), 8-13.
4. S. M. Hedetniemi, S. T. Hedetniemi, T. Beyer, A Linear Algorithm for the Grundy (Coloring) Number of a Tree, Congressus Numerantium 36 (1982), 351-362.
5. M. I. Andreica, N. Tapus, Reliability Analysis of Tree Networks Applied to Balanced Content Replication, Proc. of the IEEE Intl. Conf. on Automation, Robotics, Quality and Testing (2008), 79-84.
6. D. J. Chen, T. H. Huang, Reliability Analysis of Distributed Systems Based on a Fast Reliability Algorithm, IEEE Trans. on Par. and Dist. Syst. 3 (1992), 139-154.
7. A. Kumar, A. S. Elmaghraby, S. P. Ahuja, Performance and Reliability Optimization for Distributed Computing Systems, Proc. of the IEEE Symp. on Comp. and Comm. (1998), 611-615.
8. H. Abachi, A.-J. Walker, Reliability Analysis of Tree, Torus and Hypercube Message Passing Architectures, Proc. of the IEEE S.-E. Symp. on System Theory (1997), 44-48.
9. G. N. Frederickson, Optimal Algorithms for Tree Partitioning, Proc. of the ACM-SIAM Symposium on Discrete Algorithms (SODA) (1991), 168-177.
10. R. Cordone, A Subexponential Algorithm for the Coloured Tree Partition Problem, Discrete Applied Mathematics 155 (10) (2007), 1326-1335.
11. Y. Cui, Y. Xue, K. Nahrstedt, Maxmin Overlay Multicast: Rate Allocation and Tree Construction, Proc. of the IEEE Workshop on QoS (IWQOS) (2004), 221-231.
12. Y. Qinglin, Factors and Factor Extensions, M.Sc. Thesis, Shandong Univ., 1985.
13. T. L. Magnanti, L. A. Wolsey, Optimal Trees, Handbooks in Operations Research and Management Science, vol. 7, chap. 9 (1995), 513-616.
14. S.-C. Mu, R. S. Bird, On Building Trees with Minimum Height, Relationally, Proc. of the Asian Workshop on Programming Languages and Systems (2000).
15. M. I. Andreica, N. Tapus, Optimal Offline TCP Sender Buffer Management Strategy, Proc. of the Intl. Conf. on Comm. Theory, Reliab., and QoS (2008), 41-46.
16. B. Y. Wu, H.-L. Wang, S. T. Kuan, K.-M. Chao, On the Uniform Edge-Partition of a Tree, Discrete Applied Mathematics 155 (10) (2007), 1213-1223.
17. M. R. Henzinger, S. Leonardi, Scheduling Multicasts on Unit-Capacity Trees and Meshes, J. of Comp. and Syst. Sci. 66 (3) (2003), 567-611.
18. T. H. Cormen, C. E. Leiserson, R. L. Rivest, C. Stein, Introduction to Algorithms, MIT Press and McGraw-Hill (2001).
