Set Covering Problems with General Objective Functions

Jean Cardinal, Christophe Dumeunier
Université Libre de Bruxelles (ULB)
Computer Science Department, CP 212
B-1050 Brussels, Belgium
{jcardin,cdumeuni}@ulb.ac.be

Abstract. We introduce a parameterized version of set cover that generalizes several previously studied problems. Given a ground set $V$ and a collection of subsets $S_i$ of $V$, a feasible solution is a partition of $V$ such that each subset of the partition is included in one of the $S_i$. The problem involves maximizing the mean subset size of the partition, where the mean is the generalized mean of parameter $p$, taken over the elements. For $p = -1$, the problem is equivalent to the classical minimum set cover problem. For $p = 0$, it is equivalent to the minimum entropy set cover problem, introduced by Halperin and Karp. For $p = 1$, the problem includes the maximum-edge clique partition problem as a special case. We prove that the greedy algorithm simultaneously approximates the problem within a factor of $(p+1)^{1/p}$ for any $p \in \mathbb{R}^+$, and that this is the best possible unless P = NP. These results both generalize and simplify previous results for special cases. We also consider the corresponding graph coloring problem, and prove several tractability and inapproximability results. Finally, we consider a further generalization of the set cover problem in which we aim to minimize the sum of some concave function of the part sizes. As an application, we derive an approximation ratio for a Rent-or-Buy set cover problem.

1 Introduction

The greedy strategy is one of the simplest and best-known heuristics, and can be applied to many combinatorial optimization problems. In the case of the minimum set cover problem, it involves iteratively choosing a subset that covers a maximum number of uncovered elements.
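As an illustration, the greedy rule just described can be sketched in Python. This is a minimal sketch, not the authors' implementation: the function name `greedy_cover` and the representation of the subsets as Python sets are assumptions made for the example. It returns, for each element, the index of the subset chosen at the step where that element was first covered.

```python
def greedy_cover(ground, subsets):
    """Greedy set cover: map each element to the index of the subset covering it.

    `ground` is a set of elements and `subsets` a list of sets (hypothetical
    input format chosen for this sketch).
    """
    uncovered = set(ground)
    phi = {}
    while uncovered:
        # Choose the subset covering the most still-uncovered elements
        # (ties are broken by the lowest index).
        best = max(range(len(subsets)), key=lambda i: len(subsets[i] & uncovered))
        part = subsets[best] & uncovered
        if not part:
            raise ValueError("the subsets do not cover the ground set")
        for v in part:
            phi[v] = best
        uncovered -= part
    return phi


ground = {1, 2, 3, 4, 5, 6}
subsets = [{1, 2, 3, 4}, {3, 4, 5}, {5, 6}, {1, 6}]
phi = greedy_cover(ground, subsets)
```

In this small instance the first subset is picked first (it covers four elements), and the third subset then covers the two remaining elements.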
We study this algorithm on a natural family of set covering problems in which the value of a subset depends on the number of elements it covers, and a parameter $p$ encodes the way in which these values are combined. This parameter interpolates between different versions of the set covering problem, in particular between the classical minimum set cover problem, the minimum entropy set cover problem, and the simpler problem of finding a subset of maximum size. Intuitively, the greedy algorithm should perform better for objective functions in which more importance is given to subsets covering many elements. We give formal support to this intuition by showing that the greedy algorithm provides a constant factor approximation for all positive values of the parameter $p$. We further show that this is the best we can achieve unless P = NP.

We first introduce some notation. Let $V$ be an $n$-element ground set and $\mathcal{S} = \{S_1, \dots, S_k\}$ a collection of $k$ subsets of $V$, whose union is $V$. In the minimum set cover problem, we seek a minimum size subset $\mathcal{T} \subseteq \mathcal{S}$ such that $\bigcup_{S_i \in \mathcal{T}} S_i = V$. We define a cover as an assignment $\varphi : V \to \mathcal{S}$ of each element of $V$ to a set of $\mathcal{S}$ such that $v \in \varphi(v)$ for all $v \in V$. This definition allows us to define alternative objective functions for the set cover problem. Given a cover $\varphi$, let us define a part as a set $\varphi^{-1}(S_i)$ for some $S_i \in \mathcal{S}$. We use the following two notations: $c_i := |\varphi^{-1}(S_i)|$ is the part size of the $i$th subset $S_i$ with respect to $\varphi$, and $a_v := |\varphi^{-1}(\varphi(v))|$ is the size of the part containing the element $v$, with $v \in V$.

We define a new family of set cover problems in which we aim to maximize the mean $M(\{a_v : v \in V\})$ of the values $a_v$. There exist many definitions of the mean $M(\{a_1, a_2, \dots, a_n\})$ of a set of numbers. The most widely used definition is the arithmetic mean: $M_1(\{a_1, a_2, \dots, a_n\}) := \frac{1}{n} \sum_{i=1}^{n} a_i$.
Another well-known definition is the geometric mean: $M_0(\{a_1, a_2, \dots, a_n\}) := (a_1 \cdot a_2 \cdots a_n)^{1/n}$. Finally, we also consider the harmonic mean: $M_{-1}(\{a_1, a_2, \dots, a_n\}) := n / \left(\sum_{i=1}^{n} a_i^{-1}\right)$. The arithmetic, geometric, and harmonic means are special cases of the generalized mean:
\[ M_p(\{a_1, a_2, \dots, a_n\}) = \left( \frac{1}{n} \sum_{v \in V} a_v^p \right)^{1/p} = \left( \frac{1}{n} \sum_{i : c_i \neq 0} c_i^{p+1} \right)^{1/p}. \tag{1} \]
This value is the arithmetic mean for $p = 1$, and the harmonic mean for $p = -1$. It is well known that the limit of the generalized mean for $p \to 0$ is equal to the geometric mean. The generalized mean with parameter $p$ is also called the normalized $L_p$-norm$^1$.

Definition 1 (Maximum $p$-mean set cover). Given an $n$-element ground set $V$ and a collection $\mathcal{S} = \{S_1, \dots, S_k\}$ of subsets of $V$ whose union is $V$, find a cover $\varphi : V \to \mathcal{S}$ that maximizes $M_p(\{a_v : v \in V\})$, where $a_v := |\varphi^{-1}(\varphi(v))|$, and $M_p$ is the generalized mean of parameter $p$.

Special Cases

Interestingly, letting $p = -1$ (harmonic mean) or $p = 0$ (geometric mean) yields set cover problems that are already known: the harmonic mean version is the minimum set cover problem, while the geometric mean version is the minimum entropy set cover problem [6]. A special case of the maximum $p$-mean set cover problem for $p = 1$ has recently been introduced in the form of a graph coloring problem [9].

Minimum Set Cover. The maximum harmonic mean set cover problem can be cast as $\min_{\varphi} \sum_{v \in V} \frac{1}{a_v}$. We can rewrite this objective function as
\[ \sum_{v \in V} \frac{1}{a_v} = \sum_{S_i \in \mathcal{S}} \sum_{v \in \varphi^{-1}(S_i)} \frac{1}{c_i} = |\{S_i : c_i \neq 0\}|. \]
Hence the maximum harmonic mean set cover problem is the standard minimum set cover problem. This problem is among the most studied NP-hard problems. It has long been known to be approximable within a factor $H_{\max_i |S_i|}$ with the greedy algorithm, where $H_m$ denotes the $m$th harmonic number. The first proof is from Johnson [20].
Lovász [23] obtained the same factor with a different method. Later, Chvátal extended the result to the weighted set cover problem [8], in which the subsets $S_i$ have nonuniform costs. A number of papers show that the logarithmic approximation guarantee is likely to be optimal. Lund and Yannakakis [24] first proved that the problem is not approximable within $\log n / 4$ unless $\mathrm{NP} \subseteq \mathrm{DTIME}(n^{\mathrm{polylog}(n)})$. This result has been improved to $(1 - o(1)) \ln n$ by Feige [10], under the hypothesis $\mathrm{NP} \not\subseteq \mathrm{DTIME}(n^{O(\log \log n)})$. Raz and Safra [27], and Alon, Moshkovitz, and Safra [1] proved inapproximability results for factors of the form $c \ln n$ for some constant $c$ under the hypothesis P $\neq$ NP. These results are consequences of new PCP characterizations of NP.

Minimum Entropy Set Cover. Let us now consider the geometric mean version: $\max_{\varphi} \left( \prod_{v \in V} a_v \right)^{1/n}$. We relate this mean to the entropy of the discrete probability distribution found by dividing each part size by $n$:
\[ -\sum_{i=1}^{k} \frac{c_i}{n} \log \frac{c_i}{n} = -\sum_{v \in V} \frac{1}{n} \log \frac{a_v}{n} = \log n - \frac{1}{n} \sum_{v \in V} \log a_v = \log n - \log M_0(\{a_v : v \in V\}). \]
Thus the maximum geometric mean set cover problem is equivalent to the problem of minimizing the entropy of the partition. This problem is known as the minimum entropy set cover problem. It was introduced by Halperin and Karp [19], and has applications in the field of computational biology. They proved that the problem is approximable within a constant additive term with the greedy algorithm. Improving on this work, Cardinal, Fiorini, and Joret [6] provided a simple analysis showing that the constant is at most $\log_2 e \simeq 1.4427$ bits, and that this is the smallest additive error achievable in polynomial time, unless P = NP.

$^1$ We use the word $p$-mean here in order to avoid confusion with the "minimum $L_p$-norm set cover" problem [15].
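The two identities above (the harmonic mean counts the nonempty parts, and the geometric mean determines the entropy of the partition) can be checked numerically. The Python sketch below does so on a small hypothetical cover; the helper names `element_part_sizes`, `p_mean`, and `partition_entropy` are assumptions introduced for this illustration, and logarithms are taken in base 2.

```python
import math
from collections import Counter


def element_part_sizes(phi):
    """Return the list of a_v: for each element, the size of its part."""
    counts = Counter(phi.values())  # c_i for every nonempty part
    return [counts[s] for s in phi.values()]


def p_mean(values, p):
    """Generalized mean M_p; p = 0 is handled as the geometric mean (its limit)."""
    n = len(values)
    if p == 0:
        return math.exp(sum(math.log(a) for a in values) / n)
    return (sum(a ** p for a in values) / n) ** (1.0 / p)


def partition_entropy(phi):
    """Entropy in bits of the distribution (c_i / n) over the nonempty parts."""
    n = len(phi)
    return -sum(c / n * math.log2(c / n) for c in Counter(phi.values()).values())


# Hypothetical cover: four elements assigned to subset 0, two to subset 2.
phi = {1: 0, 2: 0, 3: 0, 4: 0, 5: 2, 6: 2}
a = element_part_sizes(phi)
n = len(a)

# p = -1: the sum of 1/a_v, i.e. n / M_{-1}, is the number of nonempty parts.
parts_from_mean = n / p_mean(a, -1)

# p = 0: the entropy equals log n - log M_0.
entropy_from_mean = math.log2(n) - math.log2(p_mean(a, 0))
```

Here `parts_from_mean` agrees with the number of distinct values of `phi`, and `entropy_from_mean` agrees with `partition_entropy(phi)`, up to floating-point error.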
The minimum entropy vertex cover [7] and minimum entropy graph coloring [5] problems, which are special cases of minimum entropy set cover, have been studied by the same authors.

Maximum-Edge Clique Partition. In a recent publication [9], Dessmark, Jansson, Lingas, Lundell, and Persson studied the maximum-edge clique partition (Max-ECP) problem. In this problem, we aim to partition a graph $G$ into cliques in order to maximize the number of edges whose endpoints are in the same clique of the partition. This is an implicit set cover problem, in which the subsets $S_i$ are the cliques of $G$, and the function to maximize is:
\[ \sum_{i=1}^{k} \binom{c_i}{2} = \frac{1}{2} \left( -n + \sum_{i=1}^{k} c_i^2 \right) = \frac{n}{2} \left( M_1(\{a_v : v \in V\}) - 1 \right). \]
Thus the problem can be seen as an implicit maximum $p$-mean set cover problem for $p = 1$. They show that the problem is 2-approximable on perfect graphs using the greedy algorithm, and that it is not approximable within a factor $n^{1 - O(1/(\log n)^{\gamma})}$ for some constant $\gamma$ in polynomial time unless $\mathrm{NP} \subseteq \mathrm{ZPTIME}(2^{(\log n)^{O(1)}})$.

Max-Max and Max-Min Set Cover. When $p \to \infty$, the maximum $p$-mean set cover problem involves finding a cover in which the largest part has maximum size. This problem is trivial when the subsets in $\mathcal{S}$ are given explicitly, but not when they are given implicitly, as in the graph coloring problem. For $p \to -\infty$, the problem is that of maximizing the size of the smallest part, thus solving $\max_{\varphi} \min_{v \in V} a_v = \max_{\varphi} \min_{i : c_i \neq 0} c_i$. This problem seems much more challenging. We will refer to it as the max-min set cover problem.

Our results

We show in Section 2 that for any $p \in \mathbb{R}^+$, the maximum $p$-mean set cover problem is approximable within a factor of $(p+1)^{1/p}$. This factor is less than $e$ for all positive values of $p$, hence this can be seen as a robust $e$-approximation for all $p$-means with positive $p$. This result generalizes the approximability results of Cardinal et al. [6] for the case $p \to 0$, and of Dessmark et al. [9] for $p = 1$.
We also prove that this is the best we can achieve in polynomial time unless P = NP, using a powerful reduction due to Feige et al. [10,11]. When $p$ is negative, we show that the performance of the greedy algorithm degrades. We give an inapproximability result for max-min set cover.

Graph coloring problems can be seen as implicit set cover problems in which the subsets $S_i$ are the maximal independent sets of the graph. The subsets are not given explicitly, which would cause an exponential blowup in the problem size, but rather implicitly, from the graph structure. We define the maximum $p$-mean graph coloring problem in this natural way. Special cases of the maximum $p$-mean graph coloring problem include the standard minimum coloring problem ($p = -1$), the minimum entropy coloring problem [5] ($p \to 0$), the maximum-edge clique partition problem [9] ($p = 1$), and the maximum independent set problem ($p \to +\infty$). In Section 3 we give approximability and inapproximability results for this problem.

The maximum $p$-mean set cover problem involves maximizing the sum of the $(p+1)$th powers of the part sizes, as can be seen in equation (1). In Section 4, we consider weighted instances, and a further generalization of the set cover problem, in which we aim to minimize the sum of some concave function of the part sizes. We give a closed form of the approximation ratio achieved by the greedy algorithm for this general class of problems, and apply this result to the case of the Rent-or-Buy set cover problem [12].

Related works

Minimum sum set cover. In the minimum sum set cover problem we aim to find an ordering of the subsets that minimizes the average cover time of an element of the ground set, where the cover time of an element is the index of the first subset covering it. This problem was first considered in its graph coloring version [4].
Feige, Lovász, and Tetali [11] gave an elegant proof of the fact that greedy is a 4-approximation algorithm, and that this is the best one can hope for unless P = NP. They also studied the related minimum sum vertex cover problem, for which they provided a 2-approximation algorithm.

Generalizations of minimum sum set cover. Munagala, Babu, Motwani, and Widom [25] introduced the pipelined set cover problem. In this problem, we aim to find an ordering of the subsets in $\mathcal{S}$ that minimizes the $L_p$-norm of the vector $(R_i)$, where $R_i$ is the number of elements that are not contained in any of the first $(i-1)$ subsets. For $p = 1$, the problem is equivalent to the minimum sum set cover problem. They generalize the technique of Feige et al. [11] to prove a $4^{1/p}$-approximation. More recently, Golovin, Gupta, Kumar, and Tangwongsan [15] considered another minimum $L_p$-norm set cover problem. This variant involves finding an ordering of the subsets minimizing the $L_p$-norm of the cover time vector. This problem is a simultaneous generalization of the minimum set cover problem and the minimum sum set cover problem. They prove that the greedy algorithm provides an $O(p)$-approximate solution, and that this is the best possible, up to a constant factor, unless $\mathrm{NP} \subseteq \mathrm{DTIME}(n^{O(\log \log n)})$.

Graph Coloring. The greedy algorithm for set cover translates to the MaxIS algorithm for graph coloring, in which a maximum independent set is iteratively chosen as a new color class. This algorithm has in particular been analyzed for the minimum sum [4] and minimum entropy [5,6] graph coloring problems.

Recently, Fukunaga, Halldórsson, and Nagamochi [13] initiated the study of a very general family of minimum cost graph coloring problems, similar to what we propose in Section 4. They proved that any minimum cost graph coloring problem in this family is 4-approximable on weighted interval graphs, provided that the cost function is both monotone and concave.
The proposed algorithm iteratively removes a maximum $i$-colorable subgraph, where $i$ is doubled at each iteration. In another recent contribution, Fukunaga, Halldórsson, and Nagamochi [12] introduced the Rent-or-Buy coloring problem in vertex-weighted graphs, in which the cost of a color class is the minimum between 1 and the total weight of the class. This models situations in which each color class has to be paid for either by "buying" it for a fixed cost, or "renting" it for a price proportional to its size. They gave, among other results, a 2-approximation for this problem in perfect graphs. We consider the set cover version of this problem in Section 4.

Clique Partitioning with Value-polymatroidal Costs. Gijswijt, Jost, and Queyranne [14] recently studied clique partitioning problems with value-polymatroidal cost functions. A function $f$ over the subsets of $V$ is said to be value-polymatroidal whenever $f(\emptyset) = 0$, $f$ is non-decreasing, and for all subsets $S$ and $T$ with $f(S) \geq f(T)$, and every $u$ in $V \setminus (T \cup S)$, the inequality $f(S + u) - f(S) \leq f(T + u) - f(T)$ holds. They define the cost of a clique partition as the sum of the costs of its cliques. They prove, among other results, that this problem is solvable in polynomial time on interval graphs.

Minimum $L_p$-norm problems. Azar, Epstein, Richter, and Woeginger [2] studied approximation algorithms for a scheduling problem in which we aim to minimize the $L_p$-norm of the part sizes. A similar problem has been studied by Azar and Taub [3], who proposed all-norm approximation algorithms. Although similar in spirit, the goal is different from ours, since we instead seek the most "nonuniform" distribution, with maximum $L_p$-norm.

A number of other problems with general cost functions have been studied, such as facility location [17]. Due to space constraints, we do not give more details here.

2 Approximability

Lemma 1.
The maximum $p$-mean set cover problem for $p \in \mathbb{R}$ is approximable in polynomial time within a factor of
\[ \left( \frac{n^{p+1}}{\sum_{j=1}^{n} j^p} \right)^{1/p}. \tag{2} \]

Proof. We consider an optimal cover $\varphi_{\mathrm{OPT}}$, and a part $C_i = \varphi_{\mathrm{OPT}}^{-1}(S_i)$ in this cover, of size $|C_i| = c_i$. We define $a'_v := |\varphi^{-1}(\varphi(v))|$ for the cover $\varphi$ returned by the greedy algorithm. We first suppose that $p \geq 0$, and give a lower bound on the value of the cover $\varphi$ restricted to $C_i$. We do so by examining the elements of $C_i$ in the order in which they are covered by the greedy algorithm, breaking ties arbitrarily. The first covered element $v_1 \in C_i$ must belong to a part of size at least $c_i$ in $\varphi$, since $C_i$ can be chosen as a new part, and the greedy algorithm chooses the largest part. Hence $a'_{v_1} \geq c_i$. Similarly, the second element $v_2$ of $C_i$ that is covered by greedy must belong to a part of size at least $c_i - 1$. Hence $a'_{v_2} \geq c_i - 1$. In general, for the $k$th element $v_k$ covered by the greedy algorithm, $a'_{v_k} \geq c_i - k + 1$. Thus we have
\[ \sum_{v \in C_i} (a'_v)^p \geq \sum_{j=1}^{c_i} j^p. \tag{3} \]
Letting $a_v := |\varphi_{\mathrm{OPT}}^{-1}(\varphi_{\mathrm{OPT}}(v))|$, the corresponding value for $\varphi_{\mathrm{OPT}}$ is $\sum_{v \in C_i} a_v^p = c_i^{p+1}$, hence we get the following upper bound:
\[ \frac{\sum_{v \in C_i} a_v^p}{\sum_{v \in C_i} (a'_v)^p} \leq \frac{c_i^{p+1}}{\sum_{j=1}^{c_i} j^p}. \tag{4} \]
This ratio is increasing with $c_i$, and holds for all the parts $C_i$ of $\varphi_{\mathrm{OPT}}$. Letting $c_i = n$ and taking the $p$th root gives the result. A similar reasoning holds for $p < 0$, with the direction of inequalities (3) and (4) reversed. □

The approximation ratios for various values of $p$ and $n$ are given in Fig. 1.

[Figure: approximation ratio plotted against $p$, with curves for $n = 5$, $n = 20$, and $n = 50$.]
Fig. 1. Approximation ratios for the greedy algorithm.

We next give a constant upper bound on the approximation ratio in the case $p \geq 0$. We need the following lemma.

Lemma 2. For $p \in \mathbb{R}^+$ and $n \in \mathbb{N}$,
\[ \sum_{j=1}^{n} j^p \geq \frac{n^{p+1}}{p+1}. \]

Proof. The inequality holds for $p = 0$.
For $p > 0$, the function $x^p$ is increasing, so each term $j^p$ is larger than $\int_{j-1}^{j} x^p \, dx$, and approximating the sum by an integral yields a lower bound:
\[ \sum_{j=1}^{n} j^p = \sum_{j=0}^{n} j^p > \int_{0}^{n} x^p \, dx = \frac{n^{p+1}}{p+1}. \]
□

Combining Lemmas 1 and 2 proves the following theorem. Tightness can be proved using known tight examples for special cases (see for instance [6]).

Theorem 1. The maximum $p$-mean set cover problem is approximable in polynomial time within a factor of $(p+1)^{1/p}$ for $p \in \mathbb{R}^+$. This bound is asymptotically tight.

Note that $\lim_{p \to +\infty} (p+1)^{1/p} = 1$, hence in the case of $p \to +\infty$, the approximation ratio is equal to 1. This formalizes the trivial observation that if our goal is to maximize the size of the largest part, then the greedy algorithm returns an optimal solution. Also, $\lim_{p \to 0} (p+1)^{1/p} = e$, which proves that the greedy algorithm approximates the minimum entropy set cover within an additive term of $\log_2 e$ bits. This was shown by Cardinal, Fiorini, and Joret [6]. Finally, for $p = 1$, the greedy algorithm returns a 2-approximation. A proof of this result was given by Dessmark, Jansson, Lingas, Lundell, and Persson [9].

We now turn to the case $p < 0$. We know that the greedy algorithm approximates the problem for $p = -1$ within a logarithmic factor. The following result shows that the performance of greedy degrades dramatically as $p$ becomes smaller.

Theorem 2. The maximum $p$-mean set cover problem is approximable in polynomial time within a factor of $n^{1 - \frac{1}{q}} \zeta(q)^{\frac{1}{q}}$ for any real $p = -q < -1$, where $\zeta(q) = \sum_{j=1}^{\infty} j^{-q}$ is the Riemann zeta function.

Proof. We consider expression (2) in Lemma 1 and replace $p$ by $-q$:
\[ \left( \frac{n^{1-q}}{\sum_{j=1}^{n} j^{-q}} \right)^{-\frac{1}{q}} = \left( \frac{\sum_{j=1}^{n} j^{-q}}{n^{1-q}} \right)^{\frac{1}{q}} \leq \left( \frac{\zeta(q)}{n^{1-q}} \right)^{\frac{1}{q}} = n^{1 - \frac{1}{q}} \zeta(q)^{\frac{1}{q}}. \tag{5} \]
□

The bound is asymptotically tight if we replace $n$ by $\max_i |S_i|$. Note that we need $q > 1$, otherwise the Dirichlet series defining the zeta function does not converge.
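Expression (2) and the bounds of Theorems 1 and 2 are easy to evaluate numerically. The following Python sketch is an illustration only; the function name `greedy_ratio_bound` and the sample values of $n$ and $p$ are assumptions chosen for the example. It compares expression (2) with $(p+1)^{1/p}$ for a few positive $p$, and with $n^{1/2} \zeta(2)^{1/2} = \pi \sqrt{n/6}$ for $p = -2$.

```python
import math


def greedy_ratio_bound(n, p):
    """Expression (2): (n^(p+1) / sum_{j=1}^n j^p)^(1/p), for p != 0."""
    total = sum(j ** p for j in range(1, n + 1))
    return (n ** (p + 1) / total) ** (1.0 / p)


n = 50

# Positive p: expression (2) against the constant of Theorem 1.
positive_checks = {
    p: (greedy_ratio_bound(n, p), (p + 1) ** (1.0 / p)) for p in (0.5, 1, 2, 5)
}

# p = -2, that is q = 2: expression (2) against the bound of Theorem 2.
negative_value = greedy_ratio_bound(n, -2)
negative_bound = math.pi * math.sqrt(n / 6)
```

For $p = 1$ the expression simplifies to $2n/(n+1)$, which approaches the factor 2 of Theorem 1 from below as $n$ grows.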
In particular, when $q = 1$ (and thus $p = -1$), expression (2) becomes the harmonic number $H_n$, which is the approximation ratio for the minimum set cover problem. An interesting special case is when $p = -2$. This means that the cost of a part of size $c_i$ in the cover is $1/c_i$. In that case, the approximation ratio of the greedy algorithm becomes
\[ n^{1 - \frac{1}{2}} \zeta(2)^{\frac{1}{2}} = \pi \sqrt{\frac{n}{6}}. \tag{6} \]

We now show that the approximability result in Theorem 1 for positive values of $p$ is the best we can hope for, unless P = NP. We need the following lemma, which is a simple consequence of the convexity of the function $f(x) = x^{p+1}$. Consider two sorted sequences $c_1 \geq c_2 \geq \dots \geq c_k$ and $c'_1 \geq c'_2 \geq \dots \geq c'_k$. We say that $(c_i)$ dominates $(c'_i)$ if
\[ \sum_{i=1}^{j} c_i \geq \sum_{i=1}^{j} c'_i \quad \forall j \in \{1, 2, \dots, k\}. \tag{7} \]

Lemma 3. If $(c_i)$ dominates $(c'_i)$, then for any $p \in \mathbb{R}^+$,
\[ \sum_{i=1}^{k} c_i^{p+1} \geq \sum_{i=1}^{k} (c'_i)^{p+1}. \tag{8} \]

Theorem 3. It is NP-hard to approximate the maximum $p$-mean set cover problem within a factor less than $(p+1)^{1/p}$ for $p \in \mathbb{R}^+$.

Proof. Feige, Lovász, and Tetali [11] gave a procedure for transforming a 3SAT-6 formula into a set system $(V, \mathcal{S})$ with the following properties:
– each subset $S_i \in \mathcal{S}$ has size $n/t$ for a certain parameter $t$,
– if the formula is satisfiable, then there exists an exact cover of $V$ with $t$ subsets,
– if the formula is $\delta$-satisfiable, that is, if at most a fraction $\delta$ of the clauses can be satisfied, then every $i$ subsets of $\mathcal{S}$ cover at most a fraction $(1 - (1 - 1/t)^i) - \varepsilon$ of the elements of $V$, for $i \in \{1, 2, \dots, at\}$ and any choice of the constants $\varepsilon > 0$ and $a > 0$.

Given a formula known to be either satisfiable or $\delta$-satisfiable, the problem of distinguishing between the two is NP-hard [11]. Using the transformation above, we show that a polynomial algorithm with an approximation ratio less than $(p+1)^{1/p}$ for maximum $p$-mean set cover would solve this problem.
If the formula is satisfiable, then $V$ can be covered by exactly $t$ disjoint sets of $\mathcal{S}$. From Lemma 3, this is the optimal solution. The part sizes $c_i$ in this solution satisfy
\[ \sum_{i=1}^{k} \left( \frac{c_i}{n} \right)^{p+1} = \sum_{i=1}^{t} \left( \frac{1}{t} \right)^{p+1} = \frac{1}{t^p}. \tag{9} \]

We now suppose the formula is only $\delta$-satisfiable. We consider the distribution in which the $i$th part covers a fraction
\[ \left( 1 - (1 - 1/t)^i \right) - \left( 1 - (1 - 1/t)^{i-1} \right) = \frac{1}{t} \left( 1 - \frac{1}{t} \right)^{i-1} \]
of the elements of $V$, for $i \in \{1, 2, \dots, at\}$, and the remaining parts cover exactly a fraction $\frac{1}{t} \left( 1 - \frac{1}{t} \right)^{at}$ each. We denote by $r$ the number of remaining parts, so that the sum of the fractions equals 1. From Lemma 3 and the properties of the reduction, this distribution dominates all other achievable distributions. Therefore the following upper bound holds:
\[ \sum_{i=1}^{k} \left( \frac{c_i}{n} \right)^{p+1} \leq \sum_{i=0}^{at} \left( \frac{1}{t} \left( 1 - \frac{1}{t} \right)^{i} \right)^{p+1} + r \left( \frac{1}{t} \left( 1 - \frac{1}{t} \right)^{at} \right)^{p+1} \tag{10} \]
\[ \simeq \frac{1}{t^{p+1}} \sum_{i=0}^{at} e^{-(p+1) \frac{i}{t}} + \frac{r}{t^{p+1}} e^{-a(p+1)}. \tag{11} \]
We can approximate the sum by an integral:
\[ \sum_{i=0}^{at} e^{-(p+1) \frac{i}{t}} \simeq \int_{0}^{at} e^{-(p+1) \frac{x}{t}} \, dx = \frac{t}{p+1} \left( 1 - e^{-a(p+1)} \right). \tag{12} \]
The value $r$ is the number of parts of size $\frac{1}{t} \left( 1 - \frac{1}{t} \right)^{at} \simeq \frac{1}{t} e^{-a}$ needed to cover a fraction $1 - \sum_{i=0}^{at} \frac{1}{t} \left( 1 - \frac{1}{t} \right)^{i} \simeq e^{-a}$ of the elements. Thus $r \sim t$, and $\frac{r}{t^{p+1}} e^{-a(p+1)} \simeq \frac{1}{t^p} e^{-a(p+1)}$. Note that since the constant $t$ can be assumed to be arbitrarily large [11], the approximations above are arbitrarily accurate. Hence expression (11) can be made arbitrarily close to
\[ \frac{1}{t^p} \left( \frac{1}{p+1} \cdot \left( 1 - e^{-a(p+1)} \right) + e^{-a(p+1)} \right). \tag{13} \]
Now by choosing $a$ sufficiently large, the ratio between (9) and (13) can be made arbitrarily close to $p + 1$. The gap between the $p$-means is obtained by taking the $p$th root. □

In the case $p \to 0$, the above inapproximability proof shows that the additive $\log_2 e$ error term is best possible (unless P = NP) for the minimum entropy set cover problem.
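The last step of the proof of Theorem 3 can be illustrated numerically: the factor $1/t^p$ cancels in the ratio between (9) and (13), so the gap depends only on $p$ and $a$. In the Python sketch below, the function names are assumptions introduced for this illustration, and the values of $a$ are arbitrary samples.

```python
import math


def soundness_over_completeness(p, a):
    """Expression (13) divided by expression (9); the factor 1/t^p cancels."""
    decay = math.exp(-a * (p + 1))
    return (1.0 - decay) / (p + 1) + decay


def p_mean_gap(p, a):
    """Gap between the p-means: the p-th root of (9) / (13)."""
    return (1.0 / soundness_over_completeness(p, a)) ** (1.0 / p)


# For p = 1 the gap increases towards (p + 1)^(1/p) = 2 as a grows.
gaps_p1 = [p_mean_gap(1, a) for a in (1, 3, 20)]

# For p = 2 the limiting gap is 3^(1/2).
gap_p2 = p_mean_gap(2, 20)
```

The gap stays below $(p+1)^{1/p}$ for every finite $a$ and approaches it as $a$ increases, in line with the statement of Theorem 3.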
The optimality of this additive term was also shown previously by Cardinal, Fiorini, and Joret [6].

Although we do not have a precise inapproximability threshold for negative values of $p$, we can prove the following result for $p \to -\infty$, that is, for the max-min set cover problem, in which we aim to maximize the size of the smallest part.

Theorem 4. It is NP-hard to approximate the max-min set cover problem within any constant factor.

Proof. The proof uses the same reduction as the proof of Theorem 3. We consider set systems $(V, \mathcal{S})$ constructed from a 3SAT-6 formula, such that there exists an exact cover with $t$ parts of size $\frac{n}{t}$ if the formula is satisfiable, and every $i$ subsets of $\mathcal{S}$ cover at most a fraction $(1 - (1 - 1/t)^i) - \varepsilon$ of the elements, for $i \in \{1, 2, \dots, at\}$, if the formula is $\delta$-satisfiable. But this means that in the latter case, at least $at$ subsets are needed to cover $V$. This implies that there is a part of size at most $\frac{n}{at}$. Since $a$ can be chosen greater than any constant, the gap can be made arbitrarily large. □

3 Graph Coloring

We now define the graph coloring variant of the maximum $p$-mean set cover problem.

Definition 2 (Maximum $p$-mean graph coloring). Given a simple, undirected graph $G = (V, E)$, find an assignment $\varphi : V \to \mathbb{N}$ of colors to vertices such that adjacent vertices receive different colors, and $M_p(\{a_v : v \in V\})$ is maximized, where $a_v := |\varphi^{-1}(\varphi(v))|$ and $M_p$ is the generalized mean with parameter $p$.

The greedy algorithm extends naturally to what is referred to as the MaxIS algorithm, in which a maximum independent set is iteratively removed from the graph. This procedure can run in polynomial time only if at each step we can find a maximum independent set in polynomial time. This is true for large families of graphs, such as perfect graphs [16] and claw-free graphs [26]. We thus have the following corollary of Theorem 1 (the proof of tightness is omitted).

Corollary 1.
The maximum $p$-mean graph coloring problem restricted to perfect or claw-free graphs is approximable in polynomial time within a factor of $(p+1)^{1/p}$ for $p \in \mathbb{R}^+$. This bound is asymptotically tight.

It may happen that we only have an approximation algorithm for the maximum independent set problem. Then the following result applies. Proofs are given in Appendix A.

Theorem 5. If the maximum independent set problem can be approximated within a factor $\rho$ in polynomial time, then the maximum $p$-mean graph coloring problem is approximable within a factor of $\rho (p+1)^{1/p}$ in polynomial time.

Corollary 2. The minimum entropy coloring problem [5] is approximable in polynomial time within an additive error of $\log_2(\Delta + 2) - 0.14226$ on graphs with maximum degree $\Delta$.

In the max-min graph coloring problem, that is when $p \to -\infty$, we aim to maximize the size of the smallest color class. Using a recent polynomial algorithm from Kierstead and Kostochka to construct equitable $(\Delta + 1)$-colorings [22], we have the following approximability result.

Corollary 3. The max-min graph coloring problem can be approximated in polynomial time within a factor $\left( 1 + O\left( \frac{1}{n} \right) \right) \frac{\Delta + 1}{\chi}$ on graphs of order $n$, maximum degree $\Delta$, and chromatic number $\chi$.

The maximum independent set problem is the special case of maximum $p$-mean coloring in which $p \to +\infty$. It is therefore not surprising that the general coloring problem is not well approximable for any positive value of $p$, as the following lemma shows.

Lemma 4. If the maximum independent set problem cannot be approximated in polynomial time within $n^{1 - \varepsilon}$ for some $\varepsilon = \varepsilon(n)$, then the maximum $p$-mean graph coloring problem with $p \in \mathbb{R}^+$ cannot be approximated in polynomial time within $n^{1 - \left( 2 + \frac{1}{p} \right) \varepsilon}$.

Proof. If the maximum independent set cannot be approximated within $n^{1 - \varepsilon}$, then we can safely assume that this holds for graphs having an independent set of size $\alpha \geq n^{1 - \varepsilon}$.
In such a graph, we consider the coloring obtained with an $n^{1 - t\varepsilon}$-approximation algorithm for maximum $p$-mean coloring, for some constant $t$ to be fixed later. The optimal solution in this graph has value at least $\left( \alpha^{p+1} \right)^{\frac{1}{p}}$ (here values are taken without the normalization factor $1/n$, which cancels in ratios). Thus the value $A$ of the coloring satisfies
\[ A \geq \frac{\left( \alpha^{p+1} \right)^{\frac{1}{p}}}{n^{1 - t\varepsilon}}. \tag{14} \]
We now consider the largest color class in this coloring, and denote its size by $h$. We then get the following upper bound on $A$:
\[ A \leq \left( \frac{n}{h} h^{p+1} \right)^{\frac{1}{p}} = h n^{\frac{1}{p}}. \tag{15} \]
Putting this together, we obtain
\[ h n^{\frac{1}{p}} \geq \frac{\left( \alpha^{p+1} \right)^{\frac{1}{p}}}{n^{1 - t\varepsilon}} \geq \frac{\left( n^{(1 - \varepsilon)(p+1)} \right)^{\frac{1}{p}}}{n^{1 - t\varepsilon}} \tag{16} \]
\[ h \geq n^{\left( t - 1 - \frac{1}{p} \right) \varepsilon}. \tag{17} \]
Letting $t = 2 + \frac{1}{p}$, we obtain an independent set of size at least $n^{\varepsilon}$, which is an $n^{1 - \varepsilon}$-approximation for the maximum independent set problem, a contradiction. □

Applying this lemma and using a result from Khot [21], we obtain the following.

Theorem 6. The maximum $p$-mean graph coloring problem, for $p \in \mathbb{R}^+$, is not approximable in polynomial time within a factor $n^{1 - O(1/(\log n)^{\gamma})}$ for some constant $\gamma$ unless $\mathrm{NP} \subseteq \mathrm{ZPTIME}(2^{(\log n)^{O(1)}})$.

A similar result for $p \to 0$ was proved by Cardinal et al. [5]. The special case $p = 1$ was proved by Dessmark et al. [9].

We end our discussion of the graph coloring problems with the equivalent problem in the complement of the graph $G$, which we call the maximum $p$-mean clique partition problem. The Max-ECP problem corresponds to the special case $p = 1$. Gijswijt, Jost, and Queyranne [14] provided an $O(n^3)$ dynamic programming algorithm for finding a partition of an interval graph into cliques that minimizes the sum of a value-polymatroidal cost. Unfortunately, our objective function does not fall in that class, since the equivalent minimization problem involves minimizing a concave decreasing cost function, and value-polymatroidal functions must be non-decreasing.
However, the correctness of their dynamic programming relies solely on the fact that an optimal partition always contains a maximal clique. This is true in our case as well, at least for $p > 0$, and is a consequence of Lemma 3. Thus the algorithm can be applied and we get the following results.

Theorem 7. The maximum $p$-mean clique partition problem with $p \in \mathbb{R}^+$ can be solved in $O(n^3)$ time on interval graphs.

Corollary 4. The Max-ECP problem [9] can be solved in $O(n^3)$ time on interval graphs.

4 Further Generalizations

Weighted variant. We first observe that Theorems 1 and 3 also hold for a weighted version of the maximum $p$-mean set cover problem. In this problem, each element $v$ of $V$ has a weight $w(v)$. The objective function is the same, except that $a_v$ is now defined as $w(\varphi^{-1}(\varphi(v)))$. The approximability proofs above still hold, using a simple reduction for integer weights. Given a weighted instance, we can transform it into an unweighted instance by replacing each element $v \in V$ by $w(v)$ copies of it, each belonging to the same subsets as $v$. Then all copies of a duplicated element must belong to the same part of the (greedy or optimal) solution. Otherwise, from Lemma 3, some elements can be reassigned so that the $p$-mean increases. The argument extends to rational and, by continuity, real weights.

General costs. Following the definition of Fukunaga, Halldórsson, and Nagamochi [13] for minimum cost colorings, we now consider a much more general family of set cover problems. In these problems, we aim to minimize a sum of some concave function $f(c_i)$ of the part sizes. The functions $f$ are concave in the sense that they are discrete restrictions of concave functions $f : \mathbb{R}^+ \to \mathbb{R}$. We also assume $f(0) = 0$. Setting $f(c_i) = -c_i^{p+1}$, for instance, yields a problem similar to the maximum $p$-mean set cover problem, without the $1/p$ exponent. The definition of this new family is as follows.
Definition 3 (Set cover with general costs). Given an $n$-element ground set $V$ and a collection $\mathcal{S} = \{S_1, \dots, S_k\}$ of subsets of $V$ whose union is $V$, find a cover $\varphi : V \to \mathcal{S}$ that minimizes $\sum_{i=1}^{k} f(c_i)$, where $c_i := |\varphi^{-1}(S_i)|$ and $f$ is a concave function.

Concavity implies that we seek a distribution of the part sizes that is as unbalanced as possible. In particular, the following generalization of Lemma 3 holds.

Lemma 5. Given two nonincreasing sequences $(c_i)$ and $(c'_i)$, such that $(c_i)$ dominates $(c'_i)$, and a concave function $f$, we have $\sum_{i=1}^{k} f(c_i) \leq \sum_{i=1}^{k} f(c'_i)$.

Although the approximation ratio obtained with the greedy algorithm depends on the function $f$, we can give a simple expression for it.

Theorem 8. The set cover problem with general costs can be approximated in polynomial time within a factor of
\[ \max \left\{ \frac{1}{f(c)} \sum_{j=1}^{c} \frac{f(j)}{j} : 1 \leq c \leq \max_i |S_i| \right\}. \]

Proof (sketch). Given a solution $\varphi$, we associate to each element $v \in V$ the cost $\frac{f(a_v)}{a_v}$, where $a_v = |\varphi^{-1}(\varphi(v))|$ as before. The cost of this solution is the sum $\sum_{v \in V} \frac{f(a_v)}{a_v}$. Using concavity, we can bound this sum in a greedy solution as in the proof of Lemma 1: we show that the sum over the elements in a part of size $c$ in the optimal solution is at most $\sum_{j=1}^{c} f(j)/j$. The ratio follows. □

Note that we recover the approximation ratio $H_n$ of minimum set cover by setting $f(c) = 1$ if $c > 0$ and $f(0) = 0$. This result also encompasses our analyses of the approximability of minimum entropy and maximum $p$-mean set cover.

We now give an application of this result to a new problem. In this problem, we suppose that the cost of using a subset $S_i$ is 1 if $S_i$ covers many elements, but is proportional to the size of its part if the fraction of elements covered by $S_i$ is small.
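The ratio in Theorem 8 can be evaluated numerically for any positive concave cost. The Python sketch below is an illustration under stated assumptions: `greedy_ratio` and `rent_or_buy_cost` are names chosen for this example, and the capped linear cost $\min\{1, c/(\beta n)\}$ is assumed here as a formal version of the rent-or-buy cost just outlined, with sample values $n = 100$ and $\beta = 0.1$.

```python
import math


def greedy_ratio(f, c_max):
    """Theorem 8: max over 1 <= c <= c_max of (1 / f(c)) * sum_{j=1}^c f(j) / j.

    Assumes f(c) > 0 for every c >= 1.
    """
    best = 0.0
    running = 0.0  # running value of sum_{j=1}^c f(j) / j
    for c in range(1, c_max + 1):
        running += f(c) / c
        best = max(best, running / f(c))
    return best


def rent_or_buy_cost(beta, n):
    """Capped linear cost: c / (beta * n) up to the breakpoint, 1 beyond it."""
    return lambda c: min(1.0, c / (beta * n))


# Unit cost per nonempty part: the ratio is the harmonic number H_5.
set_cover_ratio = greedy_ratio(lambda c: 1.0, 5)

# Rent-or-buy cost with n = 100 and beta = 0.1 (breakpoint at 10 elements).
n, beta = 100, 0.1
rent_or_buy_ratio = greedy_ratio(rent_or_buy_cost(beta, n), n)
rent_or_buy_bound = 1.0 - math.log(beta)
```

With the unit cost the function returns $H_5 = 137/60$, and with the capped linear cost the computed ratio stays below $1 - \ln \beta$.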
More precisely, if the fraction $c_i/n$ of elements covered by $S_i$ is at most some constant $\beta < 1$, then the incurred cost is $(c_i/n)/\beta$. Otherwise, the cost is 1. Thus $\beta$ defines a breakpoint, above which it is less costly to “buy” the subset than to “rent” it. Hence we define the Rent-or-Buy set cover problem as the set cover problem with the following cost function:
$$f(c) = \begin{cases} c/(\beta n) & \text{if } c \le \beta n, \\ 1 & \text{otherwise.} \end{cases}$$

This models situations in which, for instance, jobs are assigned to machines, and machines can be either bought or rented. The model was introduced recently by Fukunaga, Halldórsson, and Nagamochi as a graph coloring problem [12]. The original description of the Rent-or-Buy model was on a weighted graph, and the coloring problem was to find a coloring minimizing the sum of the values $\min\{1, w(C_i)\}$ over all color classes $C_i$, where $w(C_i)$ is the sum of the weights of the vertices in $C_i$. From our reduction of weighted instances described above, this is equivalent to our problem with $\beta = 1/w(V)$.

Corollary 5. The Rent-or-Buy set cover problem is approximable in polynomial time within a factor of $1 - \ln \beta$.

Proof. We let $t = \beta n$. Let us first suppose that $c \le t$. Then we have
$$\frac{1}{f(c)} \sum_{j=1}^{c} \frac{f(j)}{j} = \frac{t}{c} \sum_{j=1}^{c} \frac{1}{t} = 1. \tag{18}$$
Otherwise, if $c > t$, we have
$$\frac{1}{f(c)} \sum_{j=1}^{c} \frac{f(j)}{j} = 1 + \sum_{j=t+1}^{c} \frac{1}{j} = 1 + H_c - H_t \le 1 + H_n - H_t \le 1 - \ln \beta, \tag{19}$$
where the last inequality uses $H_n - H_t \le \ln(n/t) = -\ln \beta$. Hence from Theorem 8, this is the worst-case approximation ratio achieved by the greedy algorithm. □

Since the greedy algorithm can be implemented to run in polynomial time on perfect or claw-free graphs, we obtain the following result on the Rent-or-Buy graph coloring problem.

Corollary 6. The Rent-or-Buy coloring problem is approximable in polynomial time within a factor of $1 + \ln w(V)$ on perfect or claw-free graphs.

This improves on the 2-approximation algorithm [12] when the overall weight $w(V)$ does not exceed $e$.

References

1. N. Alon, D.
Moshkovitz, and M. Safra. Algorithmic construction of sets for k-restrictions. ACM Transactions on Algorithms, 2(2):153–177, 2006.
2. Y. Azar, L. Epstein, Y. Richter, and G. J. Woeginger. All-norm approximation algorithms. In Scandinavian Workshop on Algorithms Theory (SWAT), volume 2368 of Lecture Notes in Computer Science, pages 288–297. Springer-Verlag, 2002.
3. Y. Azar and S. Taub. All-norm approximation for scheduling on identical machines. In Scandinavian Workshop on Algorithms Theory (SWAT), volume 3111 of Lecture Notes in Computer Science, pages 298–310. Springer-Verlag, 2004.
4. A. Bar-Noy, M. Bellare, M. M. Halldórsson, H. Shachnai, and T. Tamir. On chromatic sums and distributed resource allocation. Information and Computation, 140(2):183–202, 1998.
5. J. Cardinal, S. Fiorini, and G. Joret. Minimum entropy coloring. In 16th International Symposium on Algorithms and Computation (ISAAC), volume 3827 of Lecture Notes in Computer Science, pages 819–828. Springer-Verlag, 2005.
6. J. Cardinal, S. Fiorini, and G. Joret. Tight results on minimum entropy set cover. In 9th International Workshop on Approximation Algorithms for Combinatorial Optimization Problems (APPROX), volume 4110 of Lecture Notes in Computer Science, pages 61–69. Springer-Verlag, 2006. To appear in Algorithmica.
7. J. Cardinal, S. Fiorini, and G. Joret. Minimum entropy orientations. Preprint arXiv:0802.1237v1 [cs.DS], 2008.
8. V. Chvátal. A greedy heuristic for the set-covering problem. Mathematics of Operations Research, 4(3):233–235, 1979.
9. A. Dessmark, J. Jansson, A. Lingas, E.-M. Lundell, and M. Persson. On the approximability of maximum and minimum edge clique partition problems. International Journal of Foundations of Computer Science, 18(2):217–226, 2007.
10. U. Feige. A threshold of ln n for approximating set cover. Journal of the ACM, 45(4):634–652, 1998.
11. U. Feige, L. Lovász, and P. Tetali.
Approximating min sum set cover. Algorithmica, 40(4):219–234, 2004.
12. T. Fukunaga, M. M. Halldórsson, and H. Nagamochi. “Rent-or-buy” scheduling and cost coloring problems. In Foundations of Software Technology and Theoretical Computer Science (FSTTCS), volume 4855 of Lecture Notes in Computer Science, pages 84–95. Springer-Verlag, 2007.
13. T. Fukunaga, M. M. Halldórsson, and H. Nagamochi. Robust cost colorings. In ACM-SIAM Symposium on Discrete Algorithms (SODA), 2008.
14. D. Gijswijt, V. Jost, and M. Queyranne. Clique partitioning of interval graphs with submodular costs on the cliques. RAIRO Operations Research, 41:275–287, 2007.
15. D. Golovin, A. Gupta, A. Kumar, and K. Tangwongsan. All-norms and all-$L_p$-norms approximation algorithms. Technical Report CMU-CS-07-153, Carnegie-Mellon University, 2007.
16. M. Grötschel, L. Lovász, and A. Schrijver. Geometric algorithms and combinatorial optimization, volume 2 of Algorithms and Combinatorics. Springer-Verlag, Berlin, second edition, 1993.
17. M. T. Hajiaghayi, M. Mahdian, and V. S. Mirrokni. The facility location problem with general cost functions. Networks, 42(1):42–47, 2003.
18. M. M. Halldórsson and J. Radhakrishnan. Greed is good: approximating independent sets in sparse and bounded-degree graphs. Algorithmica, 18:145–163, 1997.
19. E. Halperin and R. M. Karp. The minimum-entropy set cover problem. Theoretical Computer Science, 348(2-3):240–250, 2005.
20. D. S. Johnson. Approximation algorithms for combinatorial problems. Journal of Computer and System Sciences, 9:256–278, 1974.
21. S. Khot. Improved inapproximability results for maxclique, chromatic number and approximate graph coloring. In Proc. Annual Symposium on Foundations of Computer Science (FOCS), pages 600–609, 2001.
22. H. A. Kierstead and A. V. Kostochka. A short proof of the Hajnal-Szemerédi theorem on equitable colouring.
Combinatorics, Probability and Computing, to appear.
23. L. Lovász. On the ratio of optimal integral and fractional covers. Discrete Mathematics, 13:383–390, 1975.
24. C. Lund and M. Yannakakis. On the hardness of approximating minimization problems. Journal of the ACM, 41(5):960–981, 1994.
25. K. Munagala, S. Babu, R. Motwani, and J. Widom. The pipelined set cover problem. In 10th International Conference on Database Theory (ICDT), volume 3363 of Lecture Notes in Computer Science, pages 83–98. Springer-Verlag, 2004.
26. D. Nakamura and A. Tamura. A revision of Minty’s algorithm for finding a maximum weight stable set of a claw-free graph. J. Oper. Res. Soc. Japan, 44(2):194–204, 2001.
27. R. Raz and M. Safra. A sub-constant error-probability low-degree test, and a sub-constant error-probability PCP characterization of NP. In Proc. Annual ACM Symposium on Theory of Computing (STOC), pages 475–484, 1997.

A Proof of theorem 5 and corollary 2

The proof is similar to that of lemma 1. We consider the approximate MaxIS algorithm in which a $\rho$-approximate maximum independent set is chosen at each step. We consider a class $C_i$ in an optimal coloring, of size $c_i$. The first vertex $v_1$ of $C_i$ that is colored by the approximate MaxIS algorithm will be assigned a value $a'_{v_1}$ at least $c_i/\rho$, since there exists an independent set of size $c_i$ in the current graph. By iterating this argument, we obtain that $\sum_{v \in C_i} (a'_v)^p \ge \frac{1}{\rho^p} \sum_{j=1}^{c_i} j^p$. In the optimal coloring, the value of this color class is $c_i^{p+1}$. Hence the ratio is at most
$$\left( \frac{n^{p+1}}{\frac{1}{\rho^p} \sum_{j=1}^{n} j^p} \right)^{1/p} = \rho \left( \frac{n^{p+1}}{\sum_{j=1}^{n} j^p} \right)^{1/p}. \tag{20}$$
For positive values of $p$, combining with lemma 2 yields an approximation factor of $\rho (p+1)^{1/p}$.

We now prove the corollary for the minimum entropy set cover problem. Using a greedy algorithm for the maximum independent set, we have $\rho = (\Delta + 2)/3$ [18].
This ratio is valid for each step of the algorithm, as the maximum degree of the graph cannot increase. From (2), the error term for the minimum entropy problem is at most
$$\lim_{p \to 0} \log_2\left( \frac{\Delta + 2}{3} (p+1)^{1/p} \right) = \log_2(\Delta + 2) + \log_2(e) - \log_2(3) < \log_2(\Delta + 2) - 0.14226.$$
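The limit above can be checked numerically. The short Python sketch below is only an illustration and not part of the original proof: the function names `error_term` and `limit_error_term` are hypothetical. It evaluates $\log_2\big(\rho\,(p+1)^{1/p}\big)$ with $\rho = (\Delta+2)/3$ for a small positive $p$ and compares it with the closed form $\log_2(\Delta+2) + \log_2(e) - \log_2(3)$.

```python
from math import e, log2


def error_term(delta, p):
    """log2 of rho * (p + 1)^(1/p), with rho = (delta + 2) / 3 and p > 0."""
    rho = (delta + 2) / 3
    # log2((p + 1)^(1/p)) = log2(p + 1) / p
    return log2(rho) + log2(p + 1) / p


def limit_error_term(delta):
    """Closed form of the limit of error_term(delta, p) as p tends to 0."""
    return log2(delta + 2) + log2(e) - log2(3)


if __name__ == "__main__":
    for delta in (3, 5, 10):
        approx = error_term(delta, 1e-6)
        exact = limit_error_term(delta)
        # The constant log2(e) - log2(3) is slightly below -0.14226.
        print(delta, approx, exact, exact < log2(delta + 2) - 0.14226)
```

The comparison relies on $(p+1)^{1/p} \to e$ as $p \to 0$, so the two quantities agree up to a term of order $p$.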
