Efficient Sum-Based Hierarchical Smoothing Under $\ell_1$-Norm

Siavosh Benabbas∗ (siavosh@cs.toronto.edu), Hyun Chul Lee† (chul.lee@thoora.com), Joel Oren∗‡ (oren@cs.toronto.edu), Yuli Ye∗‡ (y3ye@cs.toronto.edu)

June 7, 2018

Abstract

We introduce a new regression problem which we call the Sum-Based Hierarchical Smoothing problem. Given a directed acyclic graph and a non-negative value, called target value, for each vertex in the graph, we wish to find non-negative values for the vertices satisfying a certain constraint while minimizing the distance between these assigned values and the target values in the $\ell_p$-norm. The constraint is that the value assigned to each vertex should be no less than the sum of the values assigned to its children. We motivate this problem with applications in information retrieval and web mining. While our problem can be solved in polynomial time using linear programming, given the input size in these applications such a solution is too slow. We mainly study the $\ell_1$-norm case, restricting the underlying graphs to rooted trees. For this case we provide an efficient algorithm running in $O(n^2)$ time. While the algorithm is purely combinatorial, its proof of correctness is an elegant use of linear programming duality. We also present a number of other positive and negative results for different norms and certain other special cases. We believe that our approach may be applicable to similar problems where comparable hierarchical constraints are involved, e.g. considering the average of the values assigned to the children of each vertex. While similar in flavour to other smoothing problems like Isotonic Regression (see for example [Angelov et al. SODA '06]), our problem is arguably richer and theoretically more challenging.
∗ Department of Computer Science, University of Toronto
† Thoora Inc., Toronto, ON, Canada
‡ This research was supported by the MITACS Accelerate program, Thoora Inc., and The University of Toronto, Department of Computer Science.

1 Introduction

The prevalence of popular web services like Amazon, Google, Netflix, and StumbleUpon has given rise to many interesting large-scale problems related to classification, recommendation, ranking, and collaborative filtering. In several recent studies (e.g. [KFB09, PG08, CKP07]), researchers have incorporated the underlying class hierarchies of the data-sets into the setting of recommendation systems. Moreover, Koren et al. [DKK11] recently demonstrated an application of hierarchical classifications of topics, i.e. taxonomies, in Collaborative Filtering settings, in particular, music recommendation. In these application scenarios, the taxonomies are abstracted as trees. Associated with the vertices are scalar target values, typically inferred through the use of various machine learning or information retrieval methods. For instance, given a hierarchy of topics and a search query, the target values could be the relevance measures of the topics to the search query.

When a taxonomy is used, one would usually like to enforce particular constraints on the values assigned to the vertices to properly represent the hierarchical relationship among them. Typically, the relevant machine learning approaches are ill-equipped to handle these requirements. Often, these constraints state that the value of each vertex should be at least some function of the values of its direct children in the taxonomy (e.g. [PG08, CKP07]).
Going back to the previous example of topics and search query, imagine that the taxonomy contains the topics sports, baseball, football, and basketball, with the first topic being the parent of the other three, and that the search query is "ESPN". One would like to find the relevance of this query to every topic in the taxonomy. A reasonable requirement of these relevance values would be that the relevance of "ESPN" to sports be no less than the sum of its relevance to baseball, football, and basketball. One way to solve this problem would be to directly impose such a constraint on the learning algorithm that infers the relevance values using regularization; i.e. adding an additional term in the objective function of that algorithm penalizing any violation of the constraint. However, this approach has two problems. First, it "softens" our requirements; i.e. it allows for possible violations, to some limited extent. Moreover, it can dramatically deteriorate the running time of the learning process or restrict our choice of the learning algorithm.

Instead, we take the following, widely used, two-step approach. Given a search query $s$, we first infer the relevance score of each of the topics, disregarding the hierarchy constraints. Then, we smoothen the inferred relevance scores by modifying them so as to uphold the above sum constraint. We would want the change of the relevance scores in the second step to be as small as possible. As the relevance scores are scalar values, we can represent both the original and final relevance scores as two vectors with non-negative values, and measure their difference in a suitable norm (e.g. the $\ell_1$, $\ell_2$ or $\ell_\infty$ norms). The subject of this paper is how to perform the second step.
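To make the sum constraint and the choice of norm concrete, the following Python sketch checks the constraint on the four-topic taxonomy of the example and measures the distance between two score vectors. The relevance scores, the dictionary layout, and the helper names `is_feasible` and `lp_distance` are illustrative assumptions, not values or code taken from the paper.

```python
import math

# Taxonomy from the running example: parent -> list of children.
children = {
    "sports": ["baseball", "football", "basketball"],
    "baseball": [], "football": [], "basketball": [],
}

def is_feasible(x, children):
    """True if every topic's score is at least the sum of its children's scores."""
    return all(x[v] >= sum(x[c] for c in kids) for v, kids in children.items())

def lp_distance(a, x, p):
    """l_p distance between target scores a and assigned scores x (p may be math.inf)."""
    diffs = [abs(a[v] - x[v]) for v in a]
    if p == math.inf:
        return max(diffs)
    return sum(d ** p for d in diffs) ** (1.0 / p)

# Hypothetical (integer-scaled) relevance scores of the query "ESPN",
# inferred without looking at the hierarchy.
target = {"sports": 5, "baseball": 3, "football": 3, "basketball": 2}
# One hypothetical smoothed vector that upholds the sum constraint.
smoothed = {"sports": 6, "baseball": 2, "football": 2, "basketball": 2}

print(is_feasible(target, children))        # raw scores: 5 < 3 + 3 + 2
print(is_feasible(smoothed, children))      # smoothed scores: 6 >= 2 + 2 + 2
print(lp_distance(target, smoothed, 1))     # l_1 distance
print(lp_distance(target, smoothed, math.inf))  # l_infinity distance
```

The same pair of vectors can look quite different under different norms, which is why the choice of $p$ leads to different variants of the smoothing step.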
We formulate this problem, which we call the Sum-Based Hierarchical Smoothing problem (SBHSP), as follows. Given a rooted tree (or in general a directed acyclic graph) $G = (V, E)$ and a vector of original vertex values (called target values) $a = (a_{v_1}, a_{v_2}, \ldots, a_{v_n})$, the objective is to find a vector of new vertex values (called assigned values) $x = (x_{v_1}, x_{v_2}, \ldots, x_{v_n})$ with the following properties: (i) for any node $w$ with incoming edges $(u_1, w), \ldots, (u_k, w)$ we have $x_{u_1} + \cdots + x_{u_k} \le x_w$; (ii) $\|a - x\|_p$ is minimized.

Different values of $p$ result in different variants of the problem. We mainly study the problem for $p = 1$ and $p = \infty$, but the case of $p = 2$ is also interesting. It is not hard to see that for $p = 1$ the problem can be solved in polynomial time using linear programming (see inequalities (3a)-(3d)) and for $p > 1$ it can be solved by using a suitable separation oracle and the Ellipsoid method. However, given the typical size of taxonomies these solutions are too slow.

We note that this problem seems to be more complex than other previously considered similar problems, as the assigned value of each vertex affects the possible values for any vertex it shares a parent with. In particular, to the best of our knowledge, techniques used for similar problems are ineffective for it.

Contributions: Our main contribution is a purely combinatorial algorithm for the case when the input is a rooted tree and $p = 1$ (i.e. the $\ell_1$ norm) that runs in time $O(n^2)$. We note that the $\ell_1$ norm was previously used as a good measure of difference in similar regression problems (e.g. see [AHKW06]). As many hierarchical structures in practice are trees, our algorithm can be used in many practical applications. Our second contribution is a linear time algorithm for the case $p = \infty$ which works for any directed acyclic graph.
We also show an efficient FPTAS for optimizing the $\ell_1$ norm for another class of DAGs (directed bilayer graphs). Finally, we show that if one adds the extra condition that the assigned values should be integral, the problem is hard to approximate (to within a polylogarithmic factor) for any $\ell_p$ norm with $1 \le p < \infty$. Interestingly, given that our algorithm for the $\ell_1$ norm on trees always outputs an integral solution, this last result suggests that new ideas are needed to extend it to general DAGs.

Our algorithm for the $\ell_1$ case has a rather simple structure. We assign values to the vertices of the tree in a bottom-up manner. For each vertex we first assign a valid (but possibly suboptimal) value and then use paths going down from that vertex to "push the excess" down the tree and improve the objective value. While the algorithm is purely combinatorial, its proof of correctness is an elegant use of linear programming duality. In particular, we use the complementary slackness conditions to show that if the algorithm can no longer push the excess of a node down the tree, the values assigned to its subtree must be optimal.

Organization: We present the relevant previous work in Section 2. In Section 3, we present a precise definition of the problem and some preliminaries. We present our first algorithm, which is for the case of trees and the $\ell_1$ norm, in Section 4 and prove its correctness. In Section 5 we show how this algorithm can be optimized to run in the promised $O(n^2)$ time. We conclude and propose several open problems in Section 6. We extend the algorithm to the case of the weighted $\ell_1$ norm in Appendix A. We present our algorithm for the case of $\ell_\infty$ in Appendix B. We leave our hardness of approximation result to Appendix C and our results for the case of bilayer graphs to Appendix D.
2 Previous Work

The main motivation of the current paper is the application of taxonomies in regression. A recent example, studied by Koren et al. [DKK11], is the application of topic hierarchies in the context of collaborative filtering. They provide a method of linking the data-set to a four-level taxonomy, which helps them circumvent difficulties related to the size of the data-set.

Regression and smoothing problems have been studied extensively in recent years. Perhaps the most relevant problem to our setting is the Isotonic regression problem and its variants. There one wishes to find a closest fit to a given vector subject to a set of monotonicity constraints. More precisely, let $a = \langle a_1, \ldots, a_n \rangle$ be $n$ target values, and let $E$ be a set of $m$ pairwise order constraints on these variables. The Isotonic regression problem is to find values $x = \langle x_1, \ldots, x_n \rangle$ such that $x_i \ge x_j$ whenever $(i, j) \in E$ and for which the distance between $x$ and $a$ is minimized. To put things in a language similar to ours, in isotonic regression the assigned value of each vertex should be no smaller than the maximum of the assigned values of its children, as opposed to the sum of those values in our problem. Common choices of distance functions include the weighted $\ell_1$, $\ell_2$ and $\ell_\infty$ norms. The Isotonic regression problem for such weighted norms has been studied extensively. For some of the results for the $\ell_1$ and $\ell_2$ norms see [Sto08, AHKW06, BC90]. Stout also maintains a web site containing some of the fastest known Isotonic regression algorithms for different settings at [Sto]. The Isotonic regression problem belongs to a more general class of problems known as order restricted statistical inference. Order restricted statistical inference was first studied by Barlow et al. [BBBB72].
The Isotonic regression problem became popular since it has many applications in testing [LB01, MAC01], modelling [MJDP+00, Ulm86], data smoothing [FT84, PG08] and other areas [RWD88] related to statistical and computational data analysis. It has been shown to be an important post-processing smoothing tool to impose desired hard constraints on the values that a learning algorithm has produced. Variations of Isotonic regression have been used for other applications like template learning [CKP07], ranking [DCZ+10, MSCZ10], and classification [KFB09].

3 Preliminaries

We now formally define the problem as follows. We are given a tree (or DAG) $T = (V, E)$ rooted at node $r \in V$, and a vector $a \in \mathbb{R}^n_{\ge 0}$ of the target values of the vertices. We wish to find the closest vector $x \in \mathbb{R}^n_{\ge 0}$, in the $\ell_p$-norm, so that for each node $v$ with children $u_1, \ldots, u_k$, $x_v \ge x_{u_1} + x_{u_2} + \cdots + x_{u_k}$. While most of the paper addresses the case of $p = 1$, we also discuss the case of $p = \infty$ in Appendix B. Note that our hardness results apply to all $1 \le p < \infty$.

For a vertex $u \in T$, we call the set of nodes with edges to $u$ the children of $u$ and denote it by $C(u)$; similarly, the parent of $u$ is $A(u)$ (in the case where the underlying graph is a general DAG, $A(u)$ will be a set of nodes). Throughout the paper, we will make extensive use of various paths in the given tree. For this purpose, we let $P_{u \to v}$ denote the (unique) path from vertex $u$ to vertex $v$ in $T$. We denote the sub-tree rooted in vertex $v$ by $T_v$. For a given sub-tree $T_v$, we define $a|_{T_v}$ as the vector of target values corresponding to the nodes in $T_v$; we similarly define $x|_{T_v}$.

4 The Algorithmic Approach for $\ell_1$

As an initial attempt, consider the following trivial feasible solution. For each leaf $\ell \in T$, set $x_\ell = a_\ell$.
Then, for each internal node $v$ set $x_v = \max(a_v, \sum_{u \in C(v)} x_u)$, by traversing the tree in post-order. However, it is not hard to see that this approach can be arbitrarily sub-optimal (see Figure 1). Indeed, in some cases it is preferable to lower the existing $x$ values of a given node's children, instead of raising the node's $x$ value, as this might help the objective value on the node's ancestors as well.

In order to optimize the objective function, our algorithm will proceed as follows. By traversing the tree $T$ in post-order, it performs the following sequence of steps for every vertex $v$. First, $x_v$ is set to the maximum of $a_v$ and the sum of the $x$ values of its children, which is clearly a feasible assignment for $T_v$. It then improves the assignments for $T_v$ by sequentially decreasing the values of some vertices that are located on some path $P$ from $v$ to some other node in $T_v$. The adjustments are made so that the overall improvement in the objective function equals the improvement in $|a_v - x_v|$. We will refer to such paths as push-paths, and the improvements made on them as push operations.

The algorithm is presented below as Algorithm 1. The procedure Push-Path$(x, P, \epsilon)$ checks what is the improvement on the objective function value if we reduce the $x$ value of all vertices in the path $P$ by $\epsilon$. This path will always start at the current vertex $v$. For now we do not discuss how to find the push-path or the exact value that we push down that path. This abstraction was made deliberately, so as to separate the correctness of the algorithm from its performance. In fact, we later show that the individual paths need not be enumerated separately.
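The trivial bottom-up assignment and a single push operation can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the tree is a hypothetical three-vertex chain (not the instance of Figure 1), the dictionary-based tree representation and the function names are our own, and `push_path` works on a copy of the assignment so that it can be used as a dry run.

```python
def post_order(children, root):
    """Return the vertices of the tree in post-order (children before parents)."""
    order, stack = [], [(root, False)]
    while stack:
        v, expanded = stack.pop()
        if expanded:
            order.append(v)
        else:
            stack.append((v, True))
            stack.extend((c, False) for c in children[v])
    return order

def naive_assignment(children, root, a):
    """Trivial feasible solution: x_v = max(a_v, sum of the children's x values)."""
    x = {}
    for v in post_order(children, root):
        x[v] = max(a[v], sum(x[c] for c in children[v]))
    return x

def push_path(x, a, children, path, eps):
    """Dry run of a push along `path` (listed top to bottom) on a copy of x.

    Lowers the top vertex by eps, then lowers each later vertex by exactly the
    amount by which its parent's constraint is violated. Returns (gain, new_x),
    where gain is the decrease of sum |x_v - a_v| over the path.
    """
    y = dict(x)
    old = sum(abs(y[v] - a[v]) for v in path)
    y[path[0]] -= eps
    for parent, child in zip(path, path[1:]):
        violation = sum(y[c] for c in children[parent]) - y[parent]
        if violation > 0:
            y[child] -= violation
    new = sum(abs(y[v] - a[v]) for v in path)
    return old - new, y

def l1_cost(x, a):
    """Objective value: l_1 distance between assigned and target values."""
    return sum(abs(x[v] - a[v]) for v in a)

# Hypothetical chain r -> c -> l in which only the leaf has a non-zero target.
children = {"r": ["c"], "c": ["l"], "l": []}
a = {"r": 0, "c": 0, "l": 1}

x = naive_assignment(children, "r", a)
print(x, l1_cost(x, a))        # every vertex is raised to 1, cost 2

gain, y = push_path(x, a, children, ["r", "c", "l"], eps=1)
print(gain, y, l1_cost(y, a))  # gain 1 equals eps, cost drops to 1
```

On this chain the naive assignment pays once for every zero-target ancestor of the leaf, so lengthening the chain makes its cost grow linearly while the all-zero assignment keeps a cost of 1. Note also that the push from `c` alone gains nothing (it trades one unit at `c` for one unit at `l`), whereas the push from `r` gains exactly `eps`.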
Algorithm 1: Push-Improve

Input: Rooted tree $T = (V, E)$, with a vector of vertex weights $a \in \mathbb{R}^n_+$
Output: A feasible vector of weights $x \in \mathbb{R}^n_+$ for $V$

    Let $v_1, v_2, \ldots, v_{n-1}, v_n$ be the vertices in $T$ sorted in post-order.
    for $i \leftarrow 1$ to $n$ do
        $x_{v_i} = \max\{\sum_{u \in C(v_i)} x_u,\ a_{v_i}\}$
        ImproveSubtree($v_i$)
    end

    ImproveSubtree(Vertex $u$)
        while $\exists$ path $P$ from $u$ down to a vertex $v$, and $\epsilon > 0$ such that $v$ is either a leaf or $x_v > \sum_{w \in C(v)} x_w$ and Push-Path$(x, P, \epsilon) = \epsilon$ do
            Push-Path$(x, P, \epsilon)$
        end

    Push-Path(Assignment $x$, Path $P$, Non-negative real value $\epsilon$)
    begin
        Let $v_1, \ldots, v_k$ be the sequence of nodes on $P$ from top to bottom.
        $old = \sum_{1 \le i \le k} |x_{v_i} - a_{v_i}|$
        $x_{v_1} = x_{v_1} - \epsilon$
        for $i = 2$ to $k$ do
            $t = \sum_{u \in C(v_{i-1})} x_u - x_{v_{i-1}}$
            if $t > 0$ then
                $x_{v_i} = x_{v_i} - t$
            end
        end
        $new = \sum_{1 \le i \le k} |x_{v_i} - a_{v_i}|$
        return $old - new$
    end

The following theorem states that the output of Algorithm 1 is optimal.

Theorem 4.1. When Algorithm 1 terminates, the obtained vector $x$ is a feasible and optimal assignment for the given tree $T$.

Our proof of Theorem 4.1 will proceed as follows. We begin by characterizing the necessary push-path improvement at each step of the while-loop. We then inductively argue that before and after each push operation, the value of the objective function for each sub-tree rooted in a child of the current node remains optimal. We conclude by using an LP duality argument in order to show that once no more push operations exist for the current vertex in the for-loop, $T_v$ is assigned optimal $x$ values.

The following lemma refers to the series of improvements performed on node $v$, and can be viewed as the set of invariants of the outer for-loop.

Lemma 1. Let $v$ be the current node, $P = (v = u_0, \ldots$
$, u_k)$ be a push-path such that for $1 \le i \le k$, $u_i \in C(u_{i-1})$. Then the following invariants hold throughout the execution of the inner while-loop:

1. If, for $\epsilon > 0$, Push-Path$(x, P, \epsilon) > 0$, then Push-Path$(x, P, \epsilon) \le \epsilon$. Furthermore, if for path $P$ and $\epsilon > 0$, Push-Path$(x, P, \epsilon) = \delta > 0$, then there exists $\epsilon' > 0$ such that Push-Path$(x, P, \epsilon') = \epsilon'$.
2. If for path $P$ and $\epsilon > 0$, Push-Path$(x, P, \epsilon) = \epsilon$, then for each $u \in C(v)$, $T_u$ is optimally set before and after running Push-Path$(x, P, \epsilon)$.

Proof. First, notice that the above invariants clearly hold if the current node $v$ is a leaf, as the initial $x$ values of leaves are set to their $a$ values, and will only be modified as a result of performing Push-Path on their ancestors. Assume that the invariants hold for all nodes preceding $v$ in the post-order, and suppose for contradiction that there exists some path $P = (v = u_0, \ldots, u_m)$ and $\epsilon > 0$ such that Push-Path$(x, P, \epsilon) > \epsilon$. Since the sub-trees rooted in the children of $v$ are assumed optimal, any $\epsilon$-improvement on $v$ cannot entail an additional improvement on the rest of the push-path. Hence the first part of the first invariant holds.

We now consider the second invariant, while briefly deferring the proof of the second part of the first invariant. Let $P$ be a modification-path, and $\epsilon > 0$ such that Push-Path$(x, P, \epsilon) = \epsilon$. First, notice that for each $\ell \in C(v) - \{u_1\}$, the assignments to $T_\ell$ do not change. On the other hand, notice that $x_v$ is reduced by exactly $\epsilon$. As the total improvement along $P$ is also exactly $\epsilon$, this implies that $\|x|_{T_{u_1}} - a|_{T_{u_1}}\|_1$ remains unchanged, thereby remaining optimal.

We now turn to the remaining part of the first invariant. Consider a modification-path $P$ and $\delta > 0$. By the first part of the invariant, Push-Path$(x, P, \delta) \le \delta$. If Push-Path$(x, P, \delta) = \delta$, then the claim holds trivially.
Hence, assume Push-Path$(x, P, \delta) < \delta$. We restrict ourselves to dealing with $\delta$ values in the range $(0, x_v - a_v]$. The following observation stems from the fact that during the push operation, $x$ values along $P$ only decrease.

Observation 1. For path $P$ and $\epsilon > 0$, if Push-Path$(x, P, \epsilon) > 0$ then

$$|\{j \in P : x_j > a_j\}| > |\{j \in P : x_j \le a_j\}| \tag{1}$$

In fact, using the induction hypothesis, we can make Observation 1 even stronger:

Claim 1. For path $P$ and $\epsilon > 0$, if Push-Path$(x, P, \epsilon) > 0$ then

$$|\{j \in P : x_j > a_j\}| - |\{j \in P : x_j \le a_j\}| = 1 \tag{2}$$

Claim 1 can be justified by noticing that otherwise, the sub-tree rooted in one of $v$'s children would be amenable to path-improvements, contradicting optimality. The invariant follows, as we could simply set $\epsilon$ to be the minimum (positive) amount that maintains the number of nodes along $P$ with $x$ values that are larger than their $a$ values.

Lemma 1 implies that each push operation improves the value of the objective function for the current sub-tree, while maintaining the optimality of the sub-trees rooted in the children of $v$. However, in order to show that the local optimum obtained by the algorithm is the globally optimal feasible solution, we need to argue that as long as the current assignment is not optimal, there exists a feasible path-improvement with a corresponding $\epsilon > 0$ value. The following theorem, which constitutes the main technical part of this paper, formalizes this notion.

Theorem 4.2. Upon termination of the inner while-loop, the sub-tree rooted in vertex $v$ is assigned optimal $x$ values.

Proof. First, notice that the algorithm clearly maintains the feasibility of the solution throughout its execution. The following observation follows from the definition of the algorithm.

Observation 2. During the execution of the algorithm, $x_v \ge a_v$.
Furthermore, if $x_v = a_v$, the solution is trivially optimal.

The proof of Theorem 4.2 will proceed as follows. We give the LP for the optimization problem, and its corresponding dual LP. We then construct a feasible solution for the dual LP that satisfies the complementary slackness conditions with respect to the solution of the algorithm. In order to construct a valid dual solution, we inductively bootstrap the dual solutions constructed for the sub-trees rooted at the node's children. From LP duality, we then conclude that the two solutions are optimal for the primal and dual problems. Recall that we inductively assume that the sub-trees rooted in the children of $v$ are optimally adjusted.

It is not hard to write a linear program which formulates our problem. This program and its dual can be seen below. The variables $d_i$ are introduced to avoid using absolute values in the objective function.

$$
\begin{aligned}
\min\ & \sum_{i \in T_v} d_i \\
\text{subject to}\ & d_i + x_i \ge a_i && \text{(3a)} \\
& d_i - x_i \ge -a_i && \text{(3b)} \\
& x_i - \sum_{j \in C(i)} x_j \ge 0 && \text{(3c)} \\
& x_i \ge 0 \quad \forall i \in T_v && \text{(3d)}
\end{aligned}
$$

$$
\begin{aligned}
\max\ & \sum_{i \in T_v} a_i (\lambda_i - \lambda'_i) \\
\text{subject to}\ & \lambda_i + \lambda'_i = 1 \quad \forall i \in T_v && \text{(4a)} \\
& (\lambda_i - \lambda'_i) + \alpha_i - \alpha_{p(i)} \le 0 \quad \forall i \in T_v \setminus \{v\} && \text{(4b)} \\
& (\lambda_v - \lambda'_v) + \alpha_v \le 0 && \text{(4c)} \\
& \lambda_i, \lambda'_i, \alpha_i \ge 0 \quad \forall i \in T_v && \text{(4d)}
\end{aligned}
$$

Here $p(i)$ denotes the parent of $i$, i.e. $p(i) = A(i)$. Note the special case for vertex $v$ (inequality 4c). By denoting $\beta_i = \lambda_i - \lambda'_i$, one can simplify the dual LP:

$$
\begin{aligned}
\max\ & \sum_{i \in T_v} a_i \beta_i \\
\text{subject to}\ & -1 \le \beta_i \le 1 \quad \forall i \in T_v && \text{(5a)} \\
& \beta_i + \alpha_i - \alpha_{A(i)} \le 0 \quad \forall i \in T_v - v && \text{(5b)} \\
& \beta_v + \alpha_v \le 0 && \text{(5c)} \\
& \alpha_i \ge 0 \quad \forall i \in T_v && \text{(5d)}
\end{aligned}
$$

We now summarize the necessary complementary slackness conditions required by the dual:

$$
\begin{aligned}
x_i > a_i &\Rightarrow \lambda_i = 0,\ \lambda'_i = 1 \quad (\beta_i = -1) && \text{(C1)} \\
x_i < a_i &\Rightarrow \lambda_i = 1,\ \lambda'_i = 0 \quad (\beta_i = 1) && \text{(C2)} \\
x_i > \sum_{j \in C(i)} x_j &\Rightarrow \alpha_i = 0 && \text{(C3)} \\
x_i > 0 &\Rightarrow \lambda_i - \lambda'_i + \alpha_i - \alpha_{p(i)} = 0 && \text{(C4)}
\end{aligned}
$$

Since throughout the execution of the while-loop $x_v \ge a_v$ and the case where $x_v = a_v$ is trivial, we will assume from now on that $x_v > a_v$.
This implies the last necessary condition:

$$x_v > a_v \Rightarrow \alpha_v = 1 \tag{C5}$$

We begin by suggesting an initial assignment which might not be feasible, and in addition, might violate one of the complementary slackness properties.

The following lemma is a direct consequence of the construction of the dual LP and the complementary slackness constraints. It refers to a family of assignments to the dual LP that satisfy a subset of the complementary slackness conditions.

Lemma 2. Let $x, d$ be a feasible solution for the primal such that the sub-trees rooted in the children of $v$ are optimally assigned and $v$ admits no Push-Path improvements. Let $\alpha, \beta$ be an assignment for the dual variables such that the following holds:

$$
\begin{cases}
\alpha_i = \alpha_{p(i)} - \beta_i, & \text{if } x_i > 0 \\
\alpha_i \le \alpha_{p(i)} - \beta_i, & \text{otherwise}
\end{cases}
\qquad
\beta_i =
\begin{cases}
-1, & \text{if } x_i > a_i \\
1, & \text{if } x_i < a_i \\
\text{a value in } [-1, 1], & \text{if } x_i = a_i
\end{cases}
\tag{6}
$$

Then $\alpha, \beta$ satisfy all the properties of a feasible dual solution, and $(\alpha, \beta)$ along with $(x, d)$ satisfy complementary slackness, except that $\alpha_i$ might be negative for some nodes, and condition C3 could be falsified.

Next, we observe that if our modified dual LP admits an optimal feasible solution, then our range of possible values for $\alpha, \beta$ can be narrowed due to the total unimodularity of the simplified constraint matrix of the dual LP:

Observation 3. If the dual LP has an optimal and feasible solution, then it has an integral, feasible and optimal solution, as well. In particular, for every $i \in T_v$, $\beta_i \in \{-1, 0, 1\}$.

Observation 3 can be verified by induction on the constraint matrix of the dual LP, in order to show that every square sub-matrix of it has a determinant of $0$ or $\pm 1$.

The following lemma complements Lemma 2 by suggesting a concrete assignment for each $\beta_i$ in the case where $x_i = a_i$.

Lemma 3. Consider an assignment as described in Lemma 2.
If we set $\beta_i = 1$ whenever $x_i = a_i$, then:

$$\forall j \in T_v, \quad x_j > \sum_{k \in C(j)} x_k \Rightarrow \alpha_j \le 0$$

Proof. We prove the claim by way of contradiction. Suppose that the claim is false, and let $j$ be the highest node for which the claim does not hold. That is, $x_j > \sum_{k \in C(j)} x_k$ and $\alpha_j > 0$. Consider $P_{j \to v}$, the path from $j$ to $v$. As we are trying to prove an upper bound for $\alpha_j$, we will assume that for every node $k$ on the path from $v$ to $j$, $\alpha_k = \alpha_{p(k)} - \beta_k$, as lower values will only strengthen our claim. This implies

$$\alpha_j = -\sum_{k \in P_{j \to v}} \beta_k. \tag{7}$$

Since $\beta_i = 1$ for all nodes $i$ such that $x_i \le a_i$, and $\beta_i = -1$ otherwise, $\alpha_j > 0$ implies:

$$|\{k \in P_{j \to v} : x_k > a_k\}| > |\{k \in P_{j \to v} : x_k \le a_k\}| \tag{8}$$

This implies that we can reduce all $x$ values of nodes on $P_{j \to v}$ by an amount of at most $x_j - \sum_{k \in C(j)} x_k$ so as to get a feasible solution with a better objective function value. However, this is exactly a push operation, thereby contradicting the assumption that no further push-paths exist.

The following corollary is the contrapositive statement of Lemma 3.

Corollary 1. If there exists a node $j \in T_v$ such that $\alpha_j > 0$ and $x_j - \sum_{k \in C(j)} x_k > 0$, then there exists an ancestor $i$ of $j$ such that $\beta_i \in \{0, -1\}$ and $x_i = a_i$.

We now prove the main theorem by way of induction. We inductively assume that the sub-trees rooted in the children of $v$ have both an optimal setting for the primal LP and an integral, feasible solution for the dual LP that together satisfy the complementary slackness conditions. Without loss of generality, we assume that no child $i$ of $v$ has $x_i = 0$, since otherwise, we could use its assignments without any modifications, as $x_i$ does not harden the feasibility constraints of $v$. Consider the assumed set of assignments for the sub-trees rooted in the children of $v$. By the assumption, they have corresponding assignments to the dual LPs.
Observe that since the conditions listed in Lemma 2 are a subset of the complementary slackness conditions, Lemma 2 applies to them automatically. We will start from a tentative solution to the dual by initially setting $(\alpha, \beta)$ according to the assumed assignments, and setting $\alpha_v = 1$, $\beta_v = -1$. We let $s_1$ denote the above assignment. Notice that for each child $i$ of $v$, the dual LP that corresponds to the current assignment had $\alpha_i + \beta_i = 0$, as $i$ was the root (this is a strict equality as by our assumption $x_i > 0$). However, in the current LP, the corresponding dual constraint becomes $\beta_i + \alpha_i - \alpha_v = 0$. As $\alpha_v = 1$, this equality is therefore violated. In order to rectify this, we first raise all the $\alpha$ values (except $v$'s) by 1, and denote the resulting solution by $s_2$. Note that by the feasibility of the original assignments to the sub-trees and by the definition of $s_2$, all the nodes in $T_v$ have non-negative $\alpha$ values. Also observe that $s_2$ now has all the properties listed in Lemma 2. Thus, by Lemma 2, we can conclude that $s_2$ is a feasible solution to the dual LP, and $s_2$ along with $(x, d)$ satisfy complementary slackness, except that complementary slackness condition C3 might be violated.

Our next step would be to adjust $s_2$ so as to fix any violation of condition C3. Let $W$ be the set of all infeasible nodes:

$$W = \{j \in T_v : x_j > \sum_{k \in C(j)} x_k \text{ and } \alpha_j > 0\} \tag{9}$$

By Corollary 1, for each $j \in W$ there exists an ancestor $i$ such that (1) $x_i = a_i$ and (2) $\beta_i \in \{-1, 0\}$. We let

$$X = \{i \in T_v : x_i = a_i,\ \beta_i \in \{-1, 0\}\} \tag{10}$$

Moreover, we let

$$Y = \{i \in X : \text{there is no ancestor of } i \text{ in } X\} \tag{11}$$

Thus, for each node $j \in W$ there exists an ancestor $i \in Y$. We now define the final solution to the dual LP. Define assignment $s_3$ to the dual LP for $T_v$ by taking solution $s_2$ with the following modifications:

1. $\forall k \in T_i$ such that $i \in Y$, subtract 1 from $\alpha_k$.
2.
$\forall i \in Y$, add 1 to $\beta_i$.

Increasing the $\beta_i$ values by 1 makes sure that complementary slackness condition C4 is satisfied after applying the first step. Applying the first modification step guarantees that complementary slackness condition C3 is again satisfied, as all nodes in $W$ undergo the first modification. Observe that by definition, all the sub-trees rooted in nodes in $Y$ are pair-wise disjoint. Hence, each $\alpha$ value can be decremented at most once. Also observe that in addition to nodes in $W$, other nodes may have their $\alpha$ values decremented. However, by the definition of $W$, these nodes do not need to maintain condition C3, and thus this step will not violate their constraints. In addition, their $\alpha$ values are guaranteed to remain non-negative as they were previously incremented by 1. In conclusion, all of the complementary slackness conditions for the dual LP now hold for $(x, d)$ and $s_3$. Therefore, $(x, d)$ is an optimal solution for $T_v$.

5 The Algorithm

Given the general technique presented in Algorithm 1, we conclude our results by giving an $O(n^2)$ algorithm that follows the spirit of improving by pushing the surplus from a given vertex downward, along a path. Recall that the algorithm Push-Improve performs the push operations one path at a time. Instead, we can leverage the fact that some paths can share the same prefix. Specifically, instead of the inner while-loop, executed for each node $v$ in the tree, we introduce a depth-first-search algorithm in which for each node $j \in T_v$, the algorithm remembers the maximal amount pushable through $P_{v \to j}$. We make use of two measures, defined for each node $u \in T_v$. Let

$$\delta_u = |\{j \in P_{v \to u} : x_j > a_j\}| - |\{j \in P_{v \to u} : x_j \le a_j\}|.$$
In other words, for any push operation along $P_{v \to u}$, $\delta_u$ is the difference between the number of nodes that will improve the objective function value, and the number of nodes that will worsen the objective function value, if we push a small enough value through $P_{v \to u}$. Additionally, we define the positive bottleneck along $P_{v \to u}$ as

$$\epsilon_u = \min_{j \in P_{v \to u}} \{x_j - a_j : x_j > a_j\}.$$

This is the maximum $\epsilon$ we can push on the path $P_{v \to u}$ while gaining exactly $\delta_u \epsilon$ in the objective function. In order to maintain feasibility, we restrict $\epsilon_u$ to be no more than $x_k$, for any node $k$ on $P_{v \to u}$. This value will have a similar function as the $\epsilon$ value given in Algorithm 1. That is, for the current node $v$ and a descendant $u$, $\epsilon_u$ will serve as the amount of excess we push through $P_{v \to u}$. Our algorithm will maintain feasibility by restricting the decrease in $x_u$ by the sum of the decreases made on the direct children of $u$ (unless $x_u$ was strictly bigger than the sum of the $x$'s of its children before the decrease).

The final algorithm for optimizing the assignment to $T_v$ can be seen as Algorithm 4 in Appendix E. It differs from Algorithm 1 in the way the sub-tree $T_v$ is modified for each node $v$. The following theorem states that Algorithm 4 is optimal.

Theorem 5.1. When Algorithm 4 leaves node $v \in V$, there is no push-path that goes from the root $r$, ends at a leaf, and passes through $v$.

Proof. First, we note the following observation, which suggests that the potential for improvement on any path from the root to a node $v$ cannot increase.

Observation 4. Let $u$ be a node in $T$. Let $\delta^*_u = |\{i \in P_{r \to u} : x_i > a_i\}| - |\{i \in P_{r \to u} : x_i \le a_i\}|$. Then $\delta^*_u$ does not increase throughout the execution of the algorithm.

The observation follows immediately from the fact that the only modifications to the $x$ values of the nodes are decreases.
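As an aside to the definitions of $\delta_u$ and $\epsilon_u$ earlier in this section, the following Python sketch computes both measures for every node of $T_v$ in one depth-first traversal. It is not Algorithm 4 (which appears in Appendix E); it only illustrates why paths sharing a prefix need not be enumerated separately. The function name `path_measures`, the dictionary-based tree, and the sample chain are assumptions made for the illustration.

```python
import math

def path_measures(children, v, x, a):
    """For every u in the sub-tree of v, compute (delta_u, eps_u) along the path v -> u.

    delta_u: (# path vertices with x > a) - (# path vertices with x <= a).
    eps_u:   smallest surplus x_j - a_j over path vertices with x_j > a_j,
             additionally capped by every x_k on the path
             (math.inf only if no bound applies).
    """
    measures = {}
    # Each stack entry carries the measures accumulated on the path above u.
    stack = [(v, 0, math.inf)]
    while stack:
        u, delta, eps = stack.pop()
        if x[u] > a[u]:
            delta, eps = delta + 1, min(eps, x[u] - a[u])
        else:
            delta -= 1
        eps = min(eps, x[u])  # never push more than some x_k on the path
        measures[u] = (delta, eps)
        stack.extend((c, delta, eps) for c in children[u])
    return measures

# Hypothetical chain r -> c -> l after the trivial bottom-up assignment.
children = {"r": ["c"], "c": ["l"], "l": []}
a = {"r": 0, "c": 0, "l": 1}
x = {"r": 1, "c": 1, "l": 1}

m = path_measures(children, "r", x, a)
print(m)  # (delta, eps) per vertex; for the leaf: delta = 1, eps = 1
```

Each vertex of $T_v$ is visited once, so one such traversal costs time linear in the size of $T_v$, in line with the per-node depth-first search used in the running time analysis.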
We proceed to prove the theorem by induction on the height $h$ of the node $v$. For $h = 0$ (leaves), the claim is trivial. Assume the claim holds for $h = k$, and let $v$ be a node of height $k + 1$. The claim follows immediately from Observation 4: no sub-tree rooted at a child of $v$ can be improved as a result of a push-path through it. Additionally, the path from $r$ to $v$ never becomes amenable to improvements through push operations once the algorithm leaves $v$. This concludes the proof.

Running Time. The algorithm essentially performs a depth-first search for every node $v$ of the tree. Each search visits at most $n$ nodes and there are $n$ nodes, so the running time of the algorithm is $O(n^2)$.

6 Conclusions and Future Work

We have demonstrated the technical difficulties that our problem entails, as well as an efficient method for handling a broad class of instances of the problem. Due to their high efficiency, our methods can be run on relatively large instances in practice. We also believe that our algorithm might be applicable to settings beyond recommendation systems.

An immediate open question is to extend our algorithm to the case of general DAGs. It seems that one needs some new ideas to give a combinatorial algorithm for this general case. In fact, even a (fast) approximation algorithm for this case seems to be beyond the reach of our techniques.

Another interesting direction would be to consider other measures such as the $\ell_2$-norm. Due to the fundamental difference between the $\ell_1$ and $\ell_2$ norms, we suspect that this different distance measure will require a completely different approach.

In addition to considering alternative objective functions, we can also consider other constraints. For instance, we can consider comparing the value assigned to each node to the average value of its children.
Another type of constraint would be to require equality between the value of a node and the sum of the values of its children.

References

[AHKW06] Stanislav Angelov, Boulos Harb, Sampath Kannan, and Li-San Wang. Weighted isotonic regression under the $\ell_1$ norm. In SODA, pages 783–791, 2006.

[BBBB72] R. E. Barlow, D. J. Bartholomew, J. M. Bremner, and H. D. Brunk. Statistical Inference Under Order Restrictions. Wiley, 1972.

[BC90] M. J. Best and N. Chakravarti. Active set algorithms for isotonic regression: a unifying framework. Math. Program., 47:425–439, August 1990.

[CKP07] Deepayan Chakrabarti, Ravi Kumar, and Kunal Punera. Page-level template detection via isotonic smoothing. In WWW, pages 61–70, 2007.

[DCZ+10] Anlei Dong, Yi Chang, Zhaohui Zheng, Gilad Mishne, Jing Bai, Ruiqiang Zhang, Karolina Buchner, Ciya Liao, and Fernando Diaz. Towards recency ranking in web search. In WSDM, pages 11–20, 2010.

[DKK11] Gideon Dror, Noam Koenigstein, and Yehuda Koren. Yahoo! music recommendations: Modeling music ratings with temporal dynamics and item taxonomy. In Recommender Systems, Chicago, IL, USA, 2011. ACM.

[Fei98] Uriel Feige. A threshold of ln n for approximating set cover. J. ACM, 45:634–652, July 1998.

[Fle04] Lisa Fleischer. A fast approximation scheme for fractional covering problems with variable upper bounds. In Proceedings of the Fifteenth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA '04, pages 1001–1010, Philadelphia, PA, USA, 2004. Society for Industrial and Applied Mathematics.

[FT84] J. Friedman and R. Tibshirani. The monotone smoothing of scatterplots. Technometrics, 1984.

[KFB09] Rémon Kamp, Ad Feelders, and Nicola Barile. Isotonic classification trees. In Proceedings of the 8th International Symposium on Intelligent Data Analysis: Advances in Intelligent Data Analysis VIII, IDA '09, pages 405–416, Berlin, Heidelberg, 2009. Springer-Verlag.

[LB01] Klervi Leuraud and Jacques Benichou. A comparison of several methods to test for the existence of a monotonic dose-response relationship in clinical and epidemiological studies. Statistics in Medicine, 20(22):3335–3351, 2001.

[MAC01] Jessica Y. Mancuso, Hongshik Ahn, and James J. Chen. Order-restricted dose-related trend tests. Statistics in Medicine, 20(15):2305–2318, 2001.

[MJDP+00] Tony Morton-Jones, Peter Diggle, Louise Parker, Heather O. Dickinson, and Keith Binks. Additive isotonic regression models in epidemiology. Statistics in Medicine, 19(6):849–859, 2000.

[MSCZ10] Taesup Moon, Alex J. Smola, Yi Chang, and Zhaohui Zheng. Intervalrank: isotonic regression with listwise and pairwise constraints. In WSDM, pages 151–160, 2010.

[PG08] Kunal Punera and Joydeep Ghosh. Enhanced hierarchical classification via isotonic smoothing. In WWW, pages 151–160, 2008.

[RWD88] T. Robertson, F. T. Wright, and R. L. Dykstra. Order Restricted Statistical Inference. Wiley, 1988.

[Sto] Quentin F. Stout. Isotonic Regression Algorithms. http://www.eecs.umich.edu/~qstout/IsoRegAlg.html. Retrieved: Aug. 6th, 2011.

[Sto08] Quentin F. Stout. Unimodal regression via prefix isotonic regression. Computational Statistics & Data Analysis, 53(2):289–297, 2008.

[Ulm86] K. Ulm. Nonparametric analysis of dose-response relations in epidemiology. Mathematical Modelling, 7(5-8):777–783, 1986.

A The Weighted $\ell_1$-Norm Case

We now discuss the case where the nodes of the given tree can have varying levels of importance with respect to the objective function.
Specifically, as done in related studies, we consider the case in which each node $i \in V$ has an associated weight $w_i$. For ease of presentation, we assume that all weights are integral, i.e. $w \in \mathbb{N}^n$. The objective function is

$$g(x) = \sum_{i \in V} w_i \cdot |a_i - x_i|.$$

Hence, we can simply reinterpret the variable $\delta_i$ defined for Algorithm 4 as the weighted balance between the nodes that will benefit and the nodes that will "suffer" as a result of a Push-Path operation. More precisely, when considering the path $P_{v \to i}$, we compute

$$\delta_i = \sum_{j \in P_{v \to i} :\, x_j > a_j} w_j \;-\; \sum_{j \in P_{v \to i} :\, x_j \le a_j} w_j.$$

Therefore, a feasible push-improvement across a path $P_{v \to u}$ leads to an improvement if and only if $\delta_u > 0$. The above discussion leads to the following simple modification to procedure Set-Params, given in Algorithm 2. Note that the weighted case reduces to the unweighted case by setting the weights to 1.

Algorithm 2: The modified Set-Params procedure for the weighted case

Input: Vertex v

    Set-Params(Vertex i, Non-negative integer δ, Non-negative real-value ε)
    begin
        if x_i > a_i then
            δ_i ← δ + w_i, ε_i ← min{ε, x_i − a_i}
        end
        else
            δ_i ← δ − w_i, ε_i ← min{x_i, ε}
        end
    end

Clearly, the algorithm has the same $O(n^2)$ running time as the original algorithm. The following theorem states the optimality of the modified algorithm.

Theorem A.1. The algorithm resulting from the modification given in Algorithm 2 obtains the optimal weighted-$\ell_1$ objective function value.

Proof. In order to argue about the correctness of the modified algorithm, we compare the objective function value obtained by the algorithm to the one obtained by the original algorithm on an equivalent unweighted tree.

The construction. We construct the tree $\tilde{T} = (\tilde{V}, \tilde{E})$ by replacing each node $i$ with a chain $i_1, \ldots, i_{w_i}$, such that for any $1 \le j < w_i$, $(i_j, i_{j+1}) \in \tilde{E}$. Additionally, for every child $k \in C(i)$ we set $(k_{w_k}, i_1) \in \tilde{E}$, and for $i$'s parent $\ell$ we set $(i_{w_i}, \ell_1) \in \tilde{E}$. It is easy to see that $\tilde{T}$ is a tree. Notice that $\tilde{T}$ might be arbitrarily large (depending on the weights). However, it is used only for the sake of the proof of correctness, and is never actually constructed by the algorithm. The following proof sketch highlights the equivalence of the uniform-weight case and the weighted case.

Claim 2. Let $x$ and $\tilde{x}$ be optimal assignments for $T$ and $\tilde{T}$, respectively. Then $g(x) = f(\tilde{x})$.

The following immediate observation, which follows from the construction of $\tilde{T}$, implies the above claim.

Observation 5. Let $\tilde{x}$ be an optimal assignment for $\tilde{T}$. Then for any chain $(i_1, \ldots, i_{w_i})$ that corresponds to vertex $i$ in $T$: $\tilde{x}_{i_1} = \tilde{x}_{i_2} = \ldots = \tilde{x}_{i_{w_i}}$.

The following claim complements Claim 2:

Claim 3. Let $x$ and $\tilde{x}$ be the feasible assignments returned by the modified algorithm for the weighted tree $T$ and by Algorithm 4 on $\tilde{T}$, respectively. Then $g(x) = f(\tilde{x})$.

B The $\ell_\infty$-norm

We now turn our attention to the case of the $\ell_\infty$-norm, i.e. minimizing the maximal difference $\max_{i \in V} |a_i - x_i|$. In contrast to the case of the $\ell_1$-norm, this optimization problem can be solved in a straightforward manner by using dynamic programming, even when the underlying graph is a directed acyclic graph. For a given value $t \ge 0$, the algorithm goes over all nodes and tries to produce an assignment of objective value at most $t$. We can show that if the algorithm fails, then there is no valid assignment of objective value at most $t$. To find the optimal objective value, one then only needs to run a binary search on the variable $t$.

Algorithm 3: The dynamic programming algorithm for the $\ell_\infty$-norm case

Input: DAG G = (V, E), vertices 1, ..., n sorted in topological order (so that every child precedes its parents), vertex weight vector a.

    for i ← 1 to n do
        x^min_i ← max{0, Σ_{j ∈ C(i)} x^min_j, a_i − t}
    end
    return x^min

As mentioned above, we perform a binary search on $t$ in the range $[0, \sum_i a_i]$. Clearly, for an instance of the problem with optimal solution value $\tau$, the running time of Algorithm 3 together with the binary search is $O(n \cdot \log \tau)$. We now briefly outline the proof of correctness of the algorithm.

Theorem B.1. For any given $t \ge 0$, $x = x^{\min}$ is a valid solution. Furthermore, if $\|x - a\|_\infty > t$, then there does not exist a valid solution $x'$ such that $\|a - x'\|_\infty \le t$.

Proof. The validity of $x$ follows from the definition. To prove the second part we show the following simple lemma.

Lemma 4. If $x'$ is a valid solution and $\|a - x'\|_\infty \le t$, then for all $i$, $x'_i \ge x^{\min}_i$.

Proof. The proof follows by a simple induction on $i$. Note that because $\|a - x'\|_\infty \le t$, we have $x'_i \ge a_i - t$. Furthermore, $x'$ is a valid solution, so $x'_i \ge 0$ and

$$x'_i \ge \sum_{j \in C(i)} x'_j \ge \sum_{j \in C(i)} x^{\min}_j,$$

where we have used the induction hypothesis for the second inequality. It then follows that

$$x'_i \ge \max\Big\{ 0,\ \sum_{j \in C(i)} x^{\min}_j,\ a_i - t \Big\} = x^{\min}_i.$$

Now assume that there is a valid solution $x'$ with objective value at most $t$. It follows that for all $i$,

$$a_i - t \le x^{\min}_i \le x'_i \le a_i + t,$$

that is, $\|x^{\min} - a\|_\infty \le t$.

C Hardness of Approximation in General Graphs

As mentioned before, when the objective value is the $\ell_1$ norm of the difference between the $x$ and $a$ vectors, the (most general case of the) problem can be solved exactly in polynomial time by solving a linear program. In fact, it is not hard to see that using a similar approach one can solve this general case of the problem for any $\ell_p$ norm with $1 \le p \le \infty$.
The only difference is that one has a linear program with infinitely many facets, which has an efficient separation oracle and can be solved with the Ellipsoid method.

For an instance where all the input values are integral, one might ask whether the task of finding an optimal integral solution is tractable or not. This is especially interesting for the $\ell_1$ case, since in the case of trees an integral solution can be found efficiently by our algorithm of Section 4 if the initial $a$ values are integral. Unfortunately, as soon as one considers the DAG case (even the special case of layered DAGs), this problem becomes intractable for essentially all $\ell_p$ norms. The following theorem summarizes our hardness results.

Theorem C.1. Unless $NP \subseteq TIME(n^{O(\log \log n)})$, it is NP-hard to approximate the Integral Isotonic Regression problem for the case of directed acyclic graphs better than $\Theta((\log n)^{1/p})$ for the $\ell_p$ norm.

Proof. We prove the theorem by a reduction from the Set Cover problem. In the Set Cover problem one is given sets $S_1, S_2, \ldots, S_m$ such that $S_1 \cup S_2 \cup \ldots \cup S_m = \{1, 2, \ldots, n\}$, and the objective is to select a minimum number of $S_i$'s such that their union is still $\{1, 2, \ldots, n\}$. It is a well-known result of Feige [Fei98] that unless $NP \subseteq TIME(n^{O(\log \log n)})$, it is NP-hard to approximate Set Cover better than a factor of $(1 - o(1)) \log n$. Our reduction uses vertex weights so as to simplify the construction. However, one can easily adapt the construction to the uniform case by adding multiple copies of nodes so as to simulate large weights.

Given an instance of the Set Cover problem, we construct the following instance of the SBHSP problem:

• The vertex set of the output digraph will be $V = \{v_1, \ldots, v_m, u_1, \ldots, u_n\}$.
• The edge set of the output digraph will be $E = \{(v_i, u_j) : j \in S_i\}$.

• The $a$ values on the vertices will be as follows. For all $v_i$ we have $a(v_i) = 1$, while for all $u_j$ we have $a(u_j) = |\{i : j \in S_i\}| - 1$.

• The $w$ values (weights) of the vertices will be as follows. For all $v_i$ we have $w(v_i) = 1$, while for all $u_j$ we have $w(u_j) = m$.

On the one hand, it is easy to see that for any set cover of the original instance (of size $\alpha$) one can construct a solution to the SBHSP instance (of cost $\sqrt[p]{\alpha}$) by assigning $x(v_i) = 0$ if $S_i$ is selected and 1 otherwise, and $x(u_j) = a(u_j)$ for all $u_j$.

On the other hand, it is not hard to see that the optimal solution to the SBHSP instance will have $x(v_i) \in \{0, 1\}$ for all $i$ and $x(u_j) = a(u_j)$ for all $j$. Furthermore, for any such solution (of cost $\alpha$), the set of $S_i$ for which $x(v_i) = 0$ can easily be seen to be a valid set cover of size $\alpha^p$. Hence, a hardness of approximation of $(1 - o(1)) \log n$ for the Set Cover problem implies a hardness of approximation of $\Theta(\sqrt[p]{\log n})$ for SBHSP when the objective value is defined using the $\ell_p$ norm for any $1 \le p < \infty$.

Remark 1. The hard instances of Set Cover generated by Feige [Fei98] have fewer sets than elements. As a result, it is not hard to see that the hardness achieved by the proof of the above theorem is in fact $((1 - o(1)) \ln n)^{1/p}$ for the weighted case and $\left(\frac{(1 - o(1)) \ln n}{2}\right)^{1/p}$ for the non-weighted case.

D FPTAS for optimizing under the $\ell_1$ norm for Bilayered graphs

Consider a DAG $G = (V, E)$ which is bilayered, i.e. the vertex set can be partitioned as $V = U \cup W$ and each edge goes from the $U$ side to the $W$ side ($E \subseteq U \times W$). In this section we show a fast Fully Polynomial-Time Approximation Scheme (FPTAS) for SBHSP with the $\ell_1$ norm for such DAGs. The runtime will be close to linear in the size of the DAG.
The algorithm is a simple reduction to a well-known class of problems which admit such FPTASes. These problems are a restricted class of the Mixed Positive Packing and Covering Problem; see [Fle04]. We start with the following simple observation.

Lemma 5. When optimizing the $\ell_1$ norm and when the input DAG is bilayered, there is always an optimal solution with the following two properties: (i) $\forall w \in W : x_w = a_w$, (ii) $\forall u \in U : x_u \le a_u$.

Proof. Consider any optimal solution $x$, and a vertex $w \in W$ for which $x_w \ne a_w$. If $x_w < a_w$, changing the assigned value of this vertex to $a_w$ produces another valid solution with a better objective function value. Now consider the case in which $x_w > a_w$. If $x_w > \sum_{u \in C(w)} x_u$, then again we can improve the objective function by decreasing $x_w$ slightly, and if $x_w = \sum_{u \in C(w)} x_u$ we can simultaneously decrease $x_w$ and the assigned value of some of its children. This last step helps the objective value due to the improvement on $w$, and possibly hurts it by the exact same amount due to the decrease on its children, while maintaining a valid solution. Doing this step on every node on the $W$ side results in a solution that satisfies the first condition.

For the second condition, observe that if $x_u > a_u$ we can simply decrease it to $a_u$ without changing the validity of the solution while decreasing the objective function. In other words, any optimal solution must satisfy the second condition.

Given the above lemma, one can write the following linear program whose solution is the exact value of the optimal solution. The left hand side is the original program based on the LP (3a)-(3d), while the right hand side is the result of a simplification.
Original program (left):

$$
\begin{aligned}
\min\ & \sum_{u \in U} d_u \\
\text{subject to}\ & d_u \ge 0 && \forall u \in U \\
& x_u \ge 0 && \forall u \in U \\
& d_u + x_u = a_u && \forall u \in U \\
& \sum_{u \in C(w)} x_u \le a_w && \forall w \in W
\end{aligned}
$$

Simplified program (right):

$$
\begin{aligned}
\min\ & \sum_{u \in U} d_u \\
\text{subject to}\ & d_u \ge 0 && \forall u \in U && \text{(13a)} \\
& d_u \le a_u && \forall u \in U && \text{(13b)} \\
& \sum_{u \in C(w)} d_u \ge \Big( \sum_{u \in C(w)} a_u \Big) - a_w && \forall w \in W && \text{(13c)}
\end{aligned}
$$

Once written in this form, the above formulation is a so-called Mixed Positive Packing and Covering Program. In fact, it is among a certain class of such programs for which Fleischer [Fle04] provides a fast FPTAS. In particular, we have the following theorem.

Theorem D.1. When the input is a bilayered graph and the objective value is in terms of the $\ell_1$ norm, there is an algorithm that, given $\epsilon > 0$, runs in time $O(|V||E| \log(|V|)/\epsilon^2)$ and returns a valid solution with objective value no more than $(1 + \epsilon)$ times that of the optimum.

Proof. The proof is a simple application of Theorem 2.1 from [Fle04] to the above linear program. A simple corollary of that theorem is that the algorithm finishes in $O(|V||E| \log(|V|)/\epsilon^2)$ steps [1], and in each step one has to find the most unsatisfied constraint among (13a)-(13c) given a current solution $d$. Each such step can be done by evaluating all the constraints in total time $|E|$.

[1] The constant $C$ in Theorem 2.1 of [Fle04] is 1 in our case.

E Omitted figures and algorithms

[Figure 1: tree diagram with node annotations 8, 8, 5, 5]

Figure 1: A counter-example for the naive algorithm. Node annotations denote the $a$ values. The naive algorithm will obtain an objective function value of 4, whereas the optimum value is 2.
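The per-step work mentioned in the proof of Theorem D.1, evaluating constraints (13a)-(13c) for a current solution $d$, can be illustrated with a small Python sketch. It is only an illustration under stated assumptions: the function names `most_violated` and `recover_x`, the dictionary-based graph representation, and the use of the largest absolute violation as the meaning of "most unsatisfied" are ours, and [Fle04] may use a different measure. The second function maps a feasible $d$ back to an assignment via $x_u = a_u - d_u$ on the $U$ side and $x_w = a_w$ on the $W$ side, as in Lemma 5.

```python
def most_violated(children, a, d):
    """Return (amount, label) of the largest absolute violation among
    (13a)-(13c), or (0.0, None) if d satisfies all of them.

    children: dict mapping each w in W to the list C(w) of its children in U
    a:        dict of target values for all vertices (U and W)
    d:        dict of current d_u values for u in U
    Assumption: "most unsatisfied" is measured by absolute violation.
    """
    worst = (0.0, None)
    for u, du in d.items():
        # (13a): d_u >= 0 and (13b): d_u <= a_u
        for cand in ((-du, ("13a", u)), (du - a[u], ("13b", u))):
            if cand[0] > worst[0]:
                worst = cand
    for w, cs in children.items():
        # (13c): sum of d_u over C(w) >= (sum of a_u over C(w)) - a_w
        need = sum(a[u] for u in cs) - a[w]
        gap = need - sum(d[u] for u in cs)
        if gap > worst[0]:
            worst = (gap, ("13c", w))
    return worst


def recover_x(children, a, d):
    """Map a feasible d back to an assignment x (x_u = a_u - d_u, x_w = a_w)."""
    x = {u: a[u] - d[u] for u in d}
    for w in children:
        x[w] = a[w]
    return x


# Hypothetical instance: one W-vertex "w" with two children in U.
children = {"w": ["u1", "u2"]}
a = {"u1": 3, "u2": 4, "w": 5}
print(most_violated(children, a, {"u1": 0, "u2": 0}))
print(recover_x(children, a, {"u1": 2, "u2": 0}))
```

Both loops together touch every vertex of $U$ once and every edge once, which matches the "total time $|E|$" remark up to the additive $|U|$ term.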
Algorithm 4: The improved Improve-Subtree procedure

Input: Vertex v

    begin
        /* If x_v = a_v, T(v) is optimal */
        if x_v > a_v then
            Push-Search(v, ∞, 0)
        end
    end

    Push-Search(Vertex u, Non-negative real-value ε, Non-negative integer δ)
    begin
        Set-Params(u, δ, ε)
        if ε_u = 0 then
            return 0
        end
        sum ← 0
        ℓ ← min{ε_u, x_u − Σ_{k : child of u} x_k}
        if ℓ > 0 and δ_u > 0 then
            x_u ← x_u − ℓ, sum ← ℓ
            Set-Params(u, δ, ε − ℓ)
        end
        foreach j ∈ c(u) do
            if sum = ε_u then
                return 0    /* Speedup */
            end
            t ← Push-Search(j, ε_u, δ_u)
            sum ← sum + t, x_u ← x_u − t
            Set-Params(u, δ, ε − sum)
        end
        return sum
    end

    Set-Params(Vertex i, Non-negative integer δ, Non-negative real-value ε)
        if x_i > a_i then
            δ_i ← δ + 1, ε_i ← min{ε, x_i − a_i}
        end
        else
            δ_i ← δ − 1, ε_i ← min{x_i, ε}
        end
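To make the bookkeeping of Set-Params concrete, the following Python sketch computes the two measures $\delta_u$ and $\epsilon_u$ for every node of a subtree, using the update rule from Algorithm 4 (and, when weights are supplied, the weighted rule of Algorithm 2). It is a minimal illustration only: it does not perform any push operations, and the function name `path_params`, the dictionary representation of the tree, and the iterative traversal are our assumptions rather than part of the paper's algorithm.

```python
import math


def path_params(children, x, a, v, w=None):
    """Compute delta_u and eps_u for every node u in the subtree rooted at v.

    children: dict mapping a node to the list of its children (hypothetical layout)
    x, a:     dicts of current and target values
    w:        optional dict of integer weights; defaults to all ones
    Only the Set-Params update is applied; no values of x are changed.
    """
    if w is None:
        w = {u: 1 for u in x}
    delta, eps = {}, {}
    # Each stack entry carries the (delta, eps) values of the path prefix above u.
    stack = [(v, 0, math.inf)]
    while stack:
        u, d_in, e_in = stack.pop()
        if x[u] > a[u]:
            # Decreasing x_u helps the objective; the bottleneck is x_u - a_u.
            delta[u] = d_in + w[u]
            eps[u] = min(e_in, x[u] - a[u])
        else:
            # Decreasing x_u hurts the objective; x_u must stay non-negative.
            delta[u] = d_in - w[u]
            eps[u] = min(e_in, x[u])
        for c in children.get(u, []):
            stack.append((c, delta[u], eps[u]))
    return delta, eps


# Hypothetical chain 0 -> 1 -> 2 (node 0 is the root of the subtree).
children = {0: [1], 1: [2]}
x = {0: 5, 1: 3, 2: 2}
a = {0: 3, 1: 3, 2: 1}
delta, eps = path_params(children, x, a, 0)
print(delta, eps)
```

In this example the path from node 0 to node 2 has two nodes above their targets and one node at its target, so a sufficiently small push along it improves the objective. Passing the path prefix values down the traversal is what lets paths with a shared prefix reuse work, which is the idea behind replacing the per-path loop with a single depth-first search.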
