Nonlinear Optimization over a Weighted Independence System
We consider the problem of optimizing a nonlinear objective function over a weighted independence system presented by a linear-optimization oracle. We provide a polynomial-time algorithm that determines an r-best solution for nonlinear functions of t…
Authors: Jon Lee, Shmuel Onn, Robert Weismantel
Abstract

We consider the problem of optimizing a nonlinear objective function over a weighted independence system presented by a linear-optimization oracle. We provide a polynomial-time algorithm that determines an r-best solution for nonlinear functions of the total weight of an independent set, where r is a constant that depends on certain Frobenius numbers of the individual weights and is independent of the size of the ground set. In contrast, we show that finding an optimal (0-best) solution requires exponential time even in a very special case of the problem.

1 Introduction

An independence system is a nonempty set of vectors $S \subseteq \{0,1\}^n$ with the property that $x \in \{0,1\}^n$, $x \le y \in S$ implies $x \in S$. The general nonlinear optimization problem over a multiply-weighted independence system is as follows.

Nonlinear optimization over a multiply-weighted independence system. Given independence system $S \subseteq \{0,1\}^n$, weight vectors $w^1, \dots, w^d \in \mathbb{Z}^n$, and function $f : \mathbb{Z}^d \to \mathbb{R}$, find $x \in S$ minimizing the objective

$$f(w^1 x, \dots, w^d x) = f\Big(\sum_{j=1}^n w^1_j x_j, \dots, \sum_{j=1}^n w^d_j x_j\Big).$$

The representation of the objective in the above composite form has several advantages. First, for $d > 1$, it can naturally be interpreted as multi-criteria optimization: the $d$ given weight vectors $w^1, \dots, w^d$ represent $d$ different criteria, where the value of $x \in S$ under criterion $i$ is its $i$-th total weight $w^i x = \sum_{j=1}^n w^i_j x_j$; and the objective is to minimize the "balancing" $f(w^1 x, \dots, w^d x)$ of the $d$ given criteria by the given function $f$.
Second, it allows us to classify nonlinear optimization problems into a hierarchy of increasing generality and complexity: at the bottom lies standard linear optimization, recovered with $d = 1$ and $f$ the identity on $\mathbb{Z}$; and at the top lies the problem of minimizing an arbitrary function, which is typically intractable, arising with $d = n$ and $w^i = \mathbf{1}_i$ the $i$-th standard unit vector in $\mathbb{Z}^n$ for all $i$.

The computational complexity of the problem depends on the number $d$ of weight vectors, on the weights $w^i_j$, on the type of function $f$ and its presentation, and on the type of independence system $S$ and its presentation. For example, when $S$ is a matroid, the problem can be solved in polynomial time for any fixed $d$, any $\{0, 1, \dots, p\}$-valued weights $w^i_j$ with $p$ fixed, and any function $f$ presented by a comparison oracle, even when $S$ is presented by a mere membership oracle, see [2]. Also, when $S$ consists of the matchings in a given bipartite graph $G$, the problem can be solved in polynomial time for any fixed $d$, any weights $w^i_j$ presented in unary, and any concave function $f$, see [3]; but on the other hand, for convex $f$, already with fixed $d = 2$ and $\{0,1\}$-valued weights $w^i_j$, it includes as a special case the notorious exact matching problem, the complexity of which is long open [5, 6].

In view of the difficulty of the problem already for $d = 2$, in this article we take a first step and concentrate on nonlinear optimization over a (singly) weighted independence system, that is, with $d = 1$, single weight vector $w = (w_1, \dots, w_n) \in \mathbb{Z}^n$, and univariate function $f : \mathbb{Z} \to \mathbb{R}$. The function $f$ can be arbitrary and is presented by a comparison oracle that, queried on $x, y \in \mathbb{Z}$, asserts whether or not $f(x) \le f(y)$. The weights $w_j$ take on values in a $p$-tuple $a = (a_1, \dots, a_p)$ of positive integers.
Without loss of generality we assume that $a = (a_1, \dots, a_p)$ is primitive, by which we mean that the $a_i$ are distinct positive integers having greatest common divisor $\gcd(a) := \gcd(a_1, \dots, a_p)$ equal to 1. The independence system $S$ is presented by a linear-optimization oracle that, queried on vector $v \in \mathbb{Z}^n$, returns an element $x \in S$ that maximizes the linear function $vx = \sum_{j=1}^n v_j x_j$.

It turns out that solving this problem to optimality may require exponential time (see Theorem 7.1), and so we settle for an approximate solution in the following sense, which is interesting in its own right. For a nonnegative integer $r$, we say that $x^* \in S$ is an r-best solution to the optimization problem over $S$ if there are at most $r$ better objective values attained by feasible solutions. In particular, a 0-best solution is optimal. Recall that the Frobenius number of a primitive $a$ is the largest integer $\mathrm{F}(a)$ that is not expressible as a nonnegative integer combination of the $a_i$. We prove the following theorem.

Theorem 1.1. For every primitive $p$-tuple $a = (a_1, \dots, a_p)$, there is a constant $r(a)$ and an algorithm that, given any independence system $S \subseteq \{0,1\}^n$ presented by a linear-optimization oracle, weight vector $w \in \{a_1, \dots, a_p\}^n$, and function $f : \mathbb{Z} \to \mathbb{R}$ presented by a comparison oracle, provides an $r(a)$-best solution to the nonlinear problem $\min\{f(wx) : x \in S\}$, in time polynomial in $n$. Moreover:

1. If $a_i$ divides $a_{i+1}$ for $i = 1, \dots, p-1$, then the algorithm provides an optimal solution.
2. For $p = 2$, that is, for $a = (a_1, a_2)$, the algorithm provides an $\mathrm{F}(a)$-best solution.

In fact, we give an explicit upper bound on $r(a)$ in terms of the Frobenius numbers of certain subtuples derived from $a$.
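To make the notion of an r-best solution concrete, the following minimal sketch counts the distinct objective values strictly better than that of a candidate. It is an illustration only: it assumes that $S$ is given as an explicit list and that $f$ can be evaluated directly (the paper accesses $S$ and $f$ only through oracles), and the names `better_value_count` and `is_r_best` are hypothetical.

```python
from itertools import product

def better_value_count(S, w, f, x_star):
    """Count distinct values f(w.x), x in S, that are strictly below f(w.x_star)."""
    dot = lambda x: sum(wj * xj for wj, xj in zip(w, x))
    target = f(dot(x_star))
    return len({f(dot(x)) for x in S if f(dot(x)) < target})

def is_r_best(S, w, f, x_star, r):
    """x_star is r-best if at most r better objective values are attained on S."""
    return better_value_count(S, w, f, x_star) <= r

# Toy instance (an assumption for illustration): all subsets of a
# 3-element ground set, weights in a = (2, 3), and f(t) = (t - 4)^2.
S = list(product((0, 1), repeat=3))
w = (2, 3, 3)
f = lambda t: (t - 4) ** 2
```

On this toy instance the point $(0,1,0)$ has weight 3 and is 0-best, while $(1,0,0)$ has weight 2 and is 1-best but not 0-best.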
Because $\mathrm{F}(2,3) = 1$, Theorem 1.1 (Part 2) assures us that we can efficiently compute a 1-best solution in that case. It is natural to wonder then whether, in this case, an optimal (i.e., 0-best) solution can be calculated in polynomial time. The next result indicates that this cannot be done.

Theorem 1.2. There is no polynomial time algorithm for computing an optimal (i.e., 0-best) solution of the nonlinear optimization problem $\min\{f(wx) : x \in S\}$ over an independence system presented by a linear optimization oracle with $f$ presented by a comparison oracle and weight vector $w \in \{2,3\}^n$.

The next sections gradually develop the various necessary ingredients used to establish our main results. §2 sets some notation. §3 discusses a naïve solution strategy that does not directly lead to a good approximation, but is a basic building block that is refined and repeatedly used later on. §4 describes a way of partitioning an independence system into suitable pieces, on each of which a suitable refinement of the naïve strategy will be applied separately. §5 provides some properties of monoids and Frobenius numbers that will allow us to show that the refined naïve strategy applied to each piece gives a good approximation within that piece. §6 combines all ingredients developed in §3–5, provides a bound on the approximation quality $r(a)$, and provides the algorithm establishing Theorem 1.1. §7 demonstrates that finding an optimal solution is provably intractable, proving a refined version of Theorem 1.2. §8 concludes with some final remarks and questions.

2 Some Notation

In this section we provide some notation that will be used throughout the article. Some more specific notation will be introduced in later sections. We denote by $\mathbb{R}$, $\mathbb{R}_+$, $\mathbb{Z}$ and $\mathbb{Z}_+$ the reals, nonnegative reals, integers and nonnegative integers, respectively.
For a positive integer $n$, we let $N := \{1, \dots, n\}$. The $j$-th standard unit vector in $\mathbb{R}^n$ is denoted by $\mathbf{1}_j$. The support of $x \in \mathbb{R}^n$ is the index set $\mathrm{supp}(x) := \{j : x_j \ne 0\} \subseteq N$ of nonzero entries of $x$. The indicator of a subset $J \subseteq N$ is the vector $\mathbf{1}_J := \sum_{j \in J} \mathbf{1}_j \in \{0,1\}^n$, so that $\mathrm{supp}(\mathbf{1}_J) = J$. The positive and negative parts of a vector $x \in \mathbb{R}^n$ are denoted, respectively, by $x^+, x^- \in \mathbb{R}^n_+$, and defined by $x^+_i := \max\{x_i, 0\}$ and $x^-_i := -\min\{x_i, 0\}$ for $i = 1, \dots, n$. So, $x = x^+ - x^-$, and $x^+_i x^-_i = 0$ for $i = 1, \dots, n$.

Unless otherwise specified, $x$ denotes an element of $\{0,1\}^n$ and $\lambda, \mu, \tau, \nu$ denote elements of $\mathbb{Z}^p_+$. Throughout, $a = (a_1, \dots, a_p)$ is a primitive $p$-tuple, by which we mean that the $a_i$ are distinct positive integers having greatest common divisor $\gcd(a) := \gcd(a_1, \dots, a_p)$ equal to 1. We will be working with weights taking values in $a$, that is, vectors $w \in \{a_1, \dots, a_p\}^n$. With such a weight vector $w$ being clear from the context, we let $N_i := \{j \in N : w_j = a_i\}$ for $i = 1, \dots, p$, so that $N = \biguplus_{i=1}^p N_i$. For $x \in \{0,1\}^n$ we let $\lambda_i(x) := |\mathrm{supp}(x) \cap N_i|$ for $i = 1, \dots, p$, and $\lambda(x) := (\lambda_1(x), \dots, \lambda_p(x))$, so that $wx = \lambda(x) a$. For integers $z, s \in \mathbb{Z}$ and a set of integers $Z \subseteq \mathbb{Z}$, we define $z + sZ := \{z + sx : x \in Z\}$.

3 A Naïve Strategy

Consider a set $S \subseteq \{0,1\}^n$, weight vector $w \in \{a_1, \dots, a_p\}^n$, and function $f : \mathbb{Z} \to \mathbb{R}$ presented by a comparison oracle. Define the image of $S$ under $w$ to be the set of values $wx$ taken by elements of $S$,

$$w \cdot S := \Big\{ wx = \sum_{j=1}^n w_j x_j : x \in S \Big\} \subseteq \mathbb{Z}_+.$$

As explained in the introduction, for a nonnegative integer $r$, we say that $x^* \in S$ is an r-best solution if there are at most $r$ better objective values attained by feasible solutions.
Formally, $x^* \in S$ is an r-best solution if

$$|\{ f(wx) : f(wx) < f(wx^*),\ x \in S \}| \le r.$$

We point out the following simple observation.

Proposition 3.1. If $f$ is given by a comparison oracle, then a necessary condition for any algorithm to find an r-best solution to the problem $\min\{f(wx) : x \in S\}$ is that it computes all but at most $r$ values of the image $w \cdot S$ of $S$ under $w$.

Note that this necessary condition is also sufficient for computing the weight $wx^*$ of an r-best solution, but not for computing an actual r-best solution $x^* \in S$, which may be harder. Any point $\bar{x}$ attaining $\max\{wx : x \in S\}$ provides an approximation of the image given by

$$\{wx : x \le \bar{x}\} \subseteq w \cdot S \subseteq \{0, 1, \dots, w\bar{x}\}. \tag{1}$$

This suggests the following natural naïve strategy for finding an approximate solution to the optimization problem over an independence system $S$ that is presented by a linear-optimization oracle.

Naïve Strategy
    input independence system $S \subseteq \{0,1\}^n$ presented by a linear-optimization oracle, $f : \mathbb{Z} \to \mathbb{R}$ presented by a comparison oracle, and $w \in \{a_1, \dots, a_p\}^n$;
    obtain $\bar{x}$ attaining $\max\{wx : x \in S\}$ using the linear-optimization oracle for $S$;
    output $x^*$ as one attaining $\min\{f(wx) : x \le \bar{x}\}$ using the algorithm of Lemma 3.3 below.

Unfortunately, as the next example shows, the number of values of the image that are missing from the approximating set on the left-hand side of equation (1) cannot generally be bounded by any constant. So by Proposition 3.1, this strategy cannot be used as is to obtain a provably good approximation.

Example 3.2. Let $a := (1, 2)$, $n := 4m$, $y := \sum_{i=1}^{2m} \mathbf{1}_i$, $z := \sum_{i=2m+1}^{4m} \mathbf{1}_i$, and $w := y + 2z$, that is,

$$y = (1, \dots, 1, 0, \dots, 0), \quad z = (0, \dots, 0, 1, \dots, 1), \quad w = (1, \dots, 1, 2, \dots, 2),$$

define $f$ on $\mathbb{Z}$ by

$$f(k) := \begin{cases} k, & k \text{ odd}; \\ 2m, & k \text{ even}, \end{cases}$$

and let $S$ be the independence system

$$S := \{x \in \{0,1\}^n : x \le y\} \cup \{x \in \{0,1\}^n : x \le z\}.$$

Then the unique optimal solution of the linear-objective problem $\max\{wx : x \in S\}$ is $\bar{x} := z$, with $w\bar{x} = 4m$, and therefore

$$\{wx : x \le \bar{x}\} = \{2i : i = 0, 1, \dots, 2m\}, \quad\text{and}\quad w \cdot S = \{i : i = 0, 1, \dots, 2m\} \cup \{2i : i = 0, 1, \dots, 2m\}.$$

So all $m$ odd values (i.e., $1, 3, \dots, 2m-1$) in the image $w \cdot S$ are missing from the approximating set $\{wx : x \le \bar{x}\}$ on the left-hand side of (1), and $x^*$ attaining $\min\{f(wx) : x \le \bar{x}\}$ output by the above strategy has objective value $f(wx^*) = 2m$, while there are $m = n/4$ better objective values (i.e., $1, 3, \dots, 2m-1$) attainable by feasible points (e.g., $\sum_{i=1}^k \mathbf{1}_i$, for $k = 1, 3, \dots, 2m-1$).

Nonetheless, a more sophisticated refinement of the naïve strategy, applied repeatedly to several suitably chosen subsets of $S$ rather than $S$ itself, will lead to a good approximation. In the next two sections, we develop the necessary ingredients that enable us to implement such a refinement of the naïve strategy and to prove a guarantee on the quality of the approximation it provides. Before proceeding to the next section, we note that the naïve strategy can be efficiently implemented as follows.

Lemma 3.3. For every fixed $p$-tuple $a$, there is a polynomial-time algorithm that, given univariate function $f : \mathbb{Z} \to \mathbb{R}$ presented by a comparison oracle, weight vector $w \in \{a_1, \dots, a_p\}^n$, and $\bar{x} \in \{0,1\}^n$, solves

$$\min\{f(wx) : x \le \bar{x}\}.$$

Proof. Consider the following algorithm:

    input function $f : \mathbb{Z} \to \mathbb{R}$ presented by a comparison oracle, $w \in \{a_1, \dots, a_p\}^n$ and $\bar{x} \in \{0,1\}^n$;
    let $N_i := \{j : w_j = a_i\}$ and $\tau_i := \lambda_i(\bar{x}) = |\mathrm{supp}(\bar{x}) \cap N_i|$, $i = 1, \dots, p$;
    for every choice of $\nu = (\nu_1, \dots, \nu_p) \le (\tau_1, \dots, \tau_p) = \tau$ do
        determine some $x_\nu \le \bar{x}$ with $\lambda_i(x_\nu) = |\mathrm{supp}(x_\nu) \cap N_i| = \nu_i$, $i = 1, \dots, p$;
    end
    output $x^*$ as one minimizing $f(wx)$ among the $x_\nu$ by using the comparison oracle of $f$.

Since the value $wx$ depends only on the cardinalities $|\mathrm{supp}(x) \cap N_i|$, $i = 1, \dots, p$, it is clear that

$$\{wx : x \le \bar{x}\} = \{w x_\nu : \nu \le \tau\}.$$

Clearly, for each choice $\nu \le \tau$ it is easy to determine some $x_\nu \le \bar{x}$ by zeroing out suitable entries of $\bar{x}$. The number of choices $\nu \le \tau$ and hence of loop iterations and comparison-oracle queries of $f$ to determine $x^*$ is

$$\prod_{i=1}^p (\tau_i + 1) \le (n + 1)^p.$$

4 Partitions of Independence Systems

Define the face of $S \subseteq \{0,1\}^n$ determined by two disjoint subsets $L, U \subseteq N = \{1, \dots, n\}$ to be

$$S^U_L := \{x \in S : x_j = 0 \text{ for } j \in L,\ x_j = 1 \text{ for } j \in U\}.$$

Our first simple lemma reduces linear optimization over faces of $S$ to linear optimization over $S$.

Lemma 4.1. Consider any nonempty set $S \subseteq \{0,1\}^n$, weight vector $w \in \mathbb{Z}^n$, and disjoint subsets $L, U \subseteq N$. Let $\alpha := 1 + 2n \max_j |w_j|$, let $\mathbf{1}_L, \mathbf{1}_U \in \{0,1\}^n$ be the indicators of $L, U$ respectively, and let

$$v := \max\{(w + \alpha(\mathbf{1}_U - \mathbf{1}_L))x : x \in S\} - |U|\alpha = \max\Big\{wx - \alpha\Big(\sum_{j \in U}(1 - x_j) + \sum_{j \in L} x_j\Big) : x \in S\Big\}. \tag{2}$$

Then either $v > -\frac{1}{2}\alpha$, in which case $\max\{wx : x \in S^U_L\} = v$ and the set of maximizers of $wx$ over $S^U_L$ is equal to the set of maximizers of the program (2), or $v < -\frac{1}{2}\alpha$, in which case $S^U_L$ is empty.

Proof. For all $x \in \{0,1\}^n$, we have $-\frac{1}{2}\alpha < wx < \frac{1}{2}\alpha$, and so for all $y \in S \setminus S^U_L$ and $z \in S^U_L$ we have

$$wy - \alpha\Big(\sum_{j \in U}(1 - y_j) + \sum_{j \in L} y_j\Big) \le wy - \alpha < \tfrac{1}{2}\alpha - \alpha = -\tfrac{1}{2}\alpha < wz = wz - \alpha\Big(\sum_{j \in U}(1 - z_j) + \sum_{j \in L} z_j\Big).$$

Let $S \subseteq \{0,1\}^n$ and $w \in \{a_1, \dots, a_p\}^n$ be arbitrary, and let $N_i := \{j \in N : w_j = a_i\}$ as usual.
As usual, for $x \in S$, let $\lambda_i(x) := |\mathrm{supp}(x) \cap N_i|$ for each $i$. For $p$-tuples $\mu = (\mu_1, \dots, \mu_p)$ and $\lambda = (\lambda_1, \dots, \lambda_p)$ in $\mathbb{Z}^p_+$ with $\mu \le \lambda$, define

$$S^\lambda_\mu := \left\{ x \in S : \begin{array}{ll} \lambda_i(x) = \mu_i, & \text{if } \mu_i < \lambda_i, \\ \lambda_i(x) \ge \mu_i, & \text{if } \mu_i = \lambda_i \end{array} \right\}. \tag{3}$$

Proposition 4.2. Let $S \subseteq \{0,1\}^n$ be arbitrary. Then every $\lambda \in \mathbb{Z}^p_+$ induces a partition of $S$ given by

$$S = \biguplus_{\mu \le \lambda} S^\lambda_\mu.$$

Proof. Consider any $x \in S$, and define $\mu \le \lambda$ by $\mu_i := \min\{\lambda_i(x), \lambda_i\}$. Then $x \in S^\lambda_\mu$, but $x \notin S^\lambda_\nu$ for $\nu \le \lambda$, $\nu \ne \mu$.

Lemma 4.3. For all fixed $p$-tuples $a$ and $\lambda \in \mathbb{Z}^p_+$, there is a polynomial-time algorithm that, given any independence system $S$ presented by a linear-optimization oracle, $w \in \{a_1, \dots, a_p\}^n$, and $\mu \in \mathbb{Z}^p_+$ with $\mu \le \lambda$, solves

$$\max\{wx : x \in S^\lambda_\mu\}.$$

Proof. Consider the following algorithm:

    input independence system $S \subseteq \{0,1\}^n$ presented by a linear-optimization oracle, $w \in \{a_1, \dots, a_p\}^n$, and $\mu \le \lambda$;
    let $I := \{i : \mu_i < \lambda_i\}$ and $N_i := \{j \in N : w_j = a_i\}$, $i = 1, \dots, p$;
    for every $S_i \subseteq N_i$ with $|S_i| = \mu_i$, $i = 1, \dots, p$, if any, do
        let $L := \bigcup_{i \in I} (N_i \setminus S_i)$ and $U := \bigcup_{i=1}^p S_i$;
        find by the algorithm of Lemma 4.1 an $x(S_1, \dots, S_p)$ attaining $\max\{wx : x \in S^U_L\}$ if any;
    end
    output $x^*$ as one maximizing $wx$ among all of the $x(S_1, \dots, S_p)$ (if any) found in the loop above.

It is clear that $S^\lambda_\mu$ is the union of the $S^U_L$ over all choices $S_1, \dots, S_p$ as above, and therefore $x^*$ is indeed a maximizer of $wx$ over $S^\lambda_\mu$. The number of such choices and hence of loop iterations is

$$\prod_{i=1}^p \binom{|N_i|}{\mu_i} \le \prod_{i=1}^p n^{\mu_i} \le \prod_{i=1}^p n^{\lambda_i},$$

which is polynomial because $\lambda$ is fixed. In each iteration, we find $x(S_1, \dots, S_p)$ maximizing $wx$ over $S^U_L$ or detect $S^U_L = \emptyset$ by applying the algorithm of Lemma 4.1 using a single query of the linear-optimization oracle for $S$.
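The reduction of Lemma 4.1, which is the single oracle query used in each iteration above, can be sketched as follows. This is a minimal illustration under stated assumptions: the oracle is simulated by brute force over an explicitly listed $S$, indices are 0-based, and the names `make_oracle` and `maximize_over_face` are hypothetical.

```python
def make_oracle(S):
    """Brute-force linear-optimization oracle for an explicitly listed S (illustration only)."""
    def oracle(v):
        return max(S, key=lambda x: sum(vj * xj for vj, xj in zip(v, x)))
    return oracle

def maximize_over_face(oracle, w, L, U):
    """Return a maximizer of w.x over the face S^U_L, or None if the face is empty.

    L and U are disjoint sets of 0-based indices; one oracle query is made.
    """
    n = len(w)
    alpha = 1 + 2 * n * max(abs(wj) for wj in w)
    # Penalized objective w + alpha * (1_U - 1_L) from program (2).
    query = [wj + alpha * ((j in U) - (j in L)) for j, wj in enumerate(w)]
    x = oracle(query)
    v = sum(q * xj for q, xj in zip(query, x)) - len(U) * alpha
    # v > -alpha/2 certifies that x lies in the face; otherwise the face is empty.
    return x if 2 * v > -alpha else None

# Toy independence system (an assumption): all x below (1,1,0) or (0,1,1).
S = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 0), (0, 0, 1), (0, 1, 1)]
w = (2, 3, 3)
oracle = make_oracle(S)
```

With this toy system, fixing $x_1 = 1$ and $x_2 = 0$ (0-based `U = {0}`, `L = {1}`) leaves only $(1,0,0)$, while fixing $x_1 = x_3 = 1$ gives an empty face.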
We will later show that, for a suitable choice of $\lambda$, we can guarantee that, for every block $S^\lambda_\mu$ of the partition of $S$ induced by $\lambda$, the naïve strategy applied to $S^\lambda_\mu$ does give a good solution, with only a constant number of better objective values obtainable by solutions within $S^\lambda_\mu$. For this, we proceed next to take a closer look at the monoid generated by a $p$-tuple $a$ and at suitable restrictions of this monoid.

5 Monoids and Frobenius Numbers

Recall that a $p$-tuple $a = (a_1, \dots, a_p)$ is primitive if the $a_i$ are distinct positive integers having greatest common divisor $\gcd(a) = \gcd(a_1, \dots, a_p)$ equal to 1. For $p = 1$, the only primitive $a = (a_1)$ is the one with $a_1 = 1$. The monoid of $a = (a_1, \dots, a_p)$ is the set of nonnegative integer combinations of its entries,

$$M(a) = \Big\{ \mu a = \sum_{i=1}^p \mu_i a_i : \mu \in \mathbb{Z}^p_+ \Big\}.$$

The gap set of $a$ is the set $G(a) := \mathbb{Z}_+ \setminus M(a)$ and is well known to be finite [4]. If all $a_i \ge 2$, then $G(a)$ is nonempty, and its maximum element is known as the Frobenius number of $a$, and will be denoted by $\mathrm{F}(a) := \max G(a)$. If some $a_i = 1$, then $G(a) = \emptyset$, in which case we define $\mathrm{F}(a) := 0$ by convention. Also, we let $\mathrm{F}(a) := 0$ by convention for the empty $p$-tuple $a = ()$ with $p = 0$.

Example 5.1. If $a = (3, 5)$ then the gap set is $G(a) = \{1, 2, 4, 7\}$, and the Frobenius number is $\mathrm{F}(a) = 7$.

Classical results of Schur and Sylvester, respectively, assert that for all $p \ge 2$ and all $a = (a_1, \dots, a_p)$ with each $a_i \ge 2$, the Frobenius number obeys the upper bound

$$\mathrm{F}(a) + 1 \le \min\{(a_i - 1)(a_j - 1) : 1 \le i < j \le p\}, \tag{4}$$

with equality $\mathrm{F}(a) + 1 = (a_1 - 1)(a_2 - 1)$ holding for $p = 2$. See [4] and references therein for proofs.

Define the restriction of $M(a)$ by $\lambda \in \mathbb{Z}^p_+$ to be the following subset of $M(a)$:

$$M(a, \lambda) := \{\mu a : \mu \in \mathbb{Z}^p_+,\ \mu \le \lambda\}.$$
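For small tuples, the gap set and the Frobenius number can be computed by a direct sweep. The sketch below is an illustration, not part of the paper's algorithm; the names `gap_set` and `frobenius` are hypothetical. It marks each integer $t$ as a member of $M(a)$ when $t - a_i \in M(a)$ for some $i$, and stops once $\min(a)$ consecutive members have been seen, since every larger integer is then reachable by adding copies of $\min(a)$.

```python
from functools import reduce
from math import gcd

def gap_set(a):
    """Gap set G(a) = Z_+ minus M(a) of a primitive tuple a of positive integers."""
    if not a:
        return set()  # empty tuple: F(a) = 0 by the convention above
    if reduce(gcd, a) != 1:
        raise ValueError("a must be primitive (gcd equal to 1)")
    a_min = min(a)
    in_monoid = [True]  # in_monoid[t] is True iff t lies in M(a); 0 always does
    run, t = 1, 0       # run = length of the current streak of consecutive members
    while run < a_min:
        t += 1
        ok = any(t >= ai and in_monoid[t - ai] for ai in a)
        in_monoid.append(ok)
        run = run + 1 if ok else 0
    return {s for s, ok in enumerate(in_monoid) if not ok}

def frobenius(a):
    """Frobenius number F(a), with F(a) = 0 when the gap set is empty."""
    return max(gap_set(a), default=0)
```

For instance, `gap_set((3, 5))` reproduces the gap set of Example 5.1, and `frobenius((4, 7)) + 1` equals $(4-1)(7-1) = 18$, in agreement with the equality case of (4).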
We start with a few simple facts.

Proposition 5.2. For every $\lambda \in \mathbb{Z}^p_+$, $M(a, \lambda)$ is symmetric on $\{0, 1, \dots, \lambda a\}$, that is, we have that $g \in M(a, \lambda)$ if and only if $\lambda a - g \in M(a, \lambda)$.

Proof. Indeed, $g = \mu a$ with $0 \le \mu \le \lambda$ if and only if $\lambda a - g = (\lambda - \mu) a$ with $0 \le \lambda - \mu \le \lambda$.

Recall that for $z, s \in \mathbb{Z}$ and $Z \subseteq \mathbb{Z}$, we let $z + sZ := \{z + sx : x \in Z\}$.

Proposition 5.3. For every $\lambda \in \mathbb{Z}^p_+$, we have

$$M(a, \lambda) \subseteq \{0, 1, \dots, \lambda a\} \setminus \big( G(a) \cup (\lambda a - G(a)) \big). \tag{5}$$

Proof. Clearly, $M(a, \lambda) \subseteq \{0, 1, \dots, \lambda a\} \setminus G(a)$. The claim now follows from Proposition 5.2.

Call $\lambda \in \mathbb{Z}^p_+$ saturated for $a$ if (5) holds for $\lambda$ with equality. In particular, if some $a_i = 1$, then $\lambda$ saturated for $a$ implies $M(a, \lambda) = \{0, 1, \dots, \lambda a\}$.

Example 5.1, continued. For $a = (3, 5)$ and say $\lambda = (3, 4)$, we have $\lambda a = 29$, and it can be easily checked that there are two values, namely $12 = 4 \cdot 3 + 0 \cdot 5$ and $17 = 4 \cdot 3 + 1 \cdot 5$, that are not in $M(a, \lambda)$ but are in $\{0, 1, \dots, \lambda a\} \setminus (G(a) \cup (\lambda a - G(a)))$. Hence, in this case $\lambda$ is not saturated for $a$.

Let $\max(a) := \max\{a_1, \dots, a_p\}$. Call $a = (a_1, \dots, a_p)$ divisible if $a_i$ divides $a_{i+1}$ for $i = 1, \dots, p-1$. The following theorem asserts that, for any fixed primitive $a$, every (component-wise) sufficiently large $p$-tuple $\lambda$ is saturated for $a$.

Theorem 5.4. Let $a = (a_1, \dots, a_p)$ be any primitive $p$-tuple. Then the following statements hold:

1. Every $\lambda = (\lambda_1, \dots, \lambda_p)$ satisfying $\lambda_i \ge \max(a)$ for $i = 1, \dots, p$ is saturated for $a$.
2. For divisible $a$, every $\lambda = (\lambda_1, \dots, \lambda_p)$ satisfying $\lambda_i \ge \frac{a_{i+1}}{a_i} - 1$ for $i = 1, \dots, p-1$ is saturated for $a$.

Proof. We begin with Part 1. As we go, we make some claims for which we employ somewhat tedious and lengthy elementary arguments to carefully verify.
We relegate proofs of these claims, specifically Claim 1 and SubClaims 2.1–2.4, to the Appendix.

Suppose that $\lambda_i \ge \max(a)$, for $i = 1, \dots, p$. Suppose that the result is false. Then there is a $p$-tuple $\mu \in \mathbb{Z}^p_+$ so that $\mu a \le \lambda a$ but $\mu a \notin M(a, \lambda)$. By Proposition 5.2, we can assume that $\mu a \le \frac{1}{2} \lambda a$. Among all such $\mu$, choose one that has minimum violation $\sum_{i=1}^p (\mu_i - \lambda_i)^+$. Let $j$ be an index such that $\mu_j > \lambda_j$.

Claim 1: There are at least two indices $k$ for which $\mu_k < \lambda_k / 2$.

Next, for every integer $0 \le \gamma \le a_j - 1$, consider the two-variable integer linear program:

$$\min\ x_l(\gamma) \quad \text{s.t.} \quad a_j x_j(\gamma) - a_l x_l(\gamma) = \gamma a_k; \quad x_j(\gamma),\ x_l(\gamma) \in \mathbb{Z}_+. \tag{$P_\gamma$}$$

Claim 2: For some $\gamma \le \lceil a_j / 2 \rceil$, there is a nonzero optimal solution to $P_\gamma$, such that $x_l(\gamma) \le \lfloor a_j / 2 \rfloor$.

Proof of Claim 2: For the purpose of establishing Claim 2, we assume, without loss of generality, that $\gcd(a_j, a_k, a_l) = 1$; if this did not hold, we could just divide the integers $a_j, a_k, a_l$ by their greatest common divisor, thus proving a stronger result.

SubClaim 2.1: The integer program $P_\gamma$ is feasible for all integers $0 \le \gamma\ (\le a_j - 1)$ that are integer multiples of $\gcd(a_l, a_j)$.

SubClaim 2.2: In fact, for $\gamma = z_k \gcd(a_l, a_j)$ with $z_k \in \mathbb{Z}_+$, we have that $x^*_l(\gamma) = z_l \gcd(a_k, a_j)$ for some $z_l \in \mathbb{Z}_+$.

SubClaim 2.3: For $0 \le \gamma, \gamma' < a_j / \gcd(a_k, a_j)$, we have that $x^*_l(\gamma) \ne x^*_l(\gamma')$ for $\gamma \ne \gamma'$.

SubClaim 2.4: For integer $\gamma \ge a_j / \gcd(a_k, a_j)$, we write $\gamma$ uniquely as $\gamma = \gamma' + \mu a_j / \gcd(a_k, a_j)$, with $\mu \in \mathbb{Z}_+$, $\gamma' \in \mathbb{Z}_+$, $\gamma' < a_j / \gcd(a_k, a_j)$. Then we have that

$$x^*_l(\gamma) = x^*_l(\gamma'), \qquad x^*_j(\gamma) = x^*_j(\gamma') + \mu a_k / \gcd(a_k, a_j).$$

Now we are in position to complete the proof of Claim 2.
First, if $\gcd(a_l, a_j) \ge 2$, then Claim 2 follows because $x_l(0) := a_j / \gcd(a_l, a_j)$, $x_j(0) := a_l / \gcd(a_l, a_j)$ is a feasible solution of $P_0$ with $x_l(0) \le \lfloor a_j / 2 \rfloor$. So, we can assume from now on that $\gcd(a_l, a_j) = 1$. We denote by $\Omega$ the set of all integers $0 \le \gamma \le a_j - 1$ for which $P_\gamma$ is feasible.

Next, assume that $\gcd(a_k, a_j) \ge 2$. Then by what we have shown already,

$$\{x^*_l(\gamma) : \gamma \in \Omega\} = \{x^*_l(\gamma) : \gamma \in \Omega,\ \gamma < a_j / \gcd(a_k, a_j)\}.$$

Because $a_j / \gcd(a_k, a_j) \le a_j / 2$, there is a $\gamma \le a_j / \gcd(a_k, a_j) \le a_j / 2$ such that $P_\gamma$ has a feasible solution with $x_l(\gamma) = 1$.

So we now can further assume that $\gcd(a_k, a_j) = 1$. Then $x^*_l(\gamma) \ne x^*_l(\gamma')$ for all $\gamma, \gamma' \in \Omega$, $\gamma \ne \gamma'$, implies that the cardinality of the set $\{x^*_l(\gamma) : 1 \le \gamma \le \lceil a_j / 2 \rceil\}$ is equal to $\lceil a_j / 2 \rceil$. Because $x^*_l(\gamma)$ is an integer between 0 and $a_j - 1$, it follows that there must exist a $\gamma^*$ with $1 \le \gamma^* \le \lceil a_j / 2 \rceil$ such that $x^*_l(\gamma^*) \le \lfloor a_j / 2 \rfloor$. Hence we have established Claim 2.

Notice that this then also implies that

$$x^*_j(\gamma^*)\, a_j = \gamma^* a_k + x^*_l(\gamma^*)\, a_l \le \max(a) \big( \gamma^* + x^*_l(\gamma^*) \big) \le \max(a)\, a_j,$$

which implies $x^*_j(\gamma^*) \le \max(a)$.

Now, define a new $p$-tuple $\nu$ by

$$\nu_j := \mu_j - x^*_j(\gamma^*), \quad \nu_l := \mu_l + x^*_l(\gamma^*), \quad \nu_k := \mu_k + \gamma^*, \quad \text{and } \nu_i := \mu_i \text{ for all } i \ne j, k, l.$$

Because $x^*_j(\gamma^*) \le \max(a)$, it follows that $\nu_j > 0$. Moreover, for $i \in \{k, l\}$, $0 \le \nu_i \le \lambda_i$. Therefore $\nu$ is nonnegative, satisfies $\nu a = \mu a = v$, and has lesser violation than $\mu$, which is a contradiction to the choice of $\mu$. So indeed $v \in M(a, \lambda)$, and we have established Part 1 of the theorem.

Before continuing, we note that a much simpler elementary argument can be used to establish Part 1 of the theorem under the stronger hypothesis: $\lambda_i \ge 2 \max(a)$ for $i = 1, \dots, p$.
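Saturation, as defined before Theorem 5.4, can be checked by brute force for small tuples. The following sketch is an illustration only and assumes a primitive $a$; the names `restricted_monoid` and `is_saturated` are hypothetical. It compares $M(a, \lambda)$ with the right-hand side of (5) and reports the values that are missing.

```python
from itertools import product

def restricted_monoid(a, lam):
    """M(a, lam) = { mu.a : 0 <= mu <= lam }, by direct enumeration."""
    return {sum(m * ai for m, ai in zip(mu, a))
            for mu in product(*(range(l + 1) for l in lam))}

def is_saturated(a, lam):
    """Return (saturated?, missing values) for the inclusion (5)."""
    top = sum(l * ai for l, ai in zip(lam, a))
    # Membership in the full monoid M(a), restricted to {0, ..., lam.a}.
    in_monoid = [True] + [False] * top
    for t in range(1, top + 1):
        in_monoid[t] = any(t >= ai and in_monoid[t - ai] for ai in a)
    gaps = {t for t in range(top + 1) if not in_monoid[t]}
    # Right-hand side of (5): remove the gaps and their mirror images.
    upper = set(range(top + 1)) - gaps - {top - g for g in gaps}
    missing = upper - restricted_monoid(a, lam)
    return (not missing, missing)
```

For $a = (3, 5)$ this reports the two missing values 12 and 17 when $\lambda = (3, 4)$, as in Example 5.1, and no missing values when $\lambda = (5, 5)$, which meets the hypothesis $\lambda_i \ge \max(a)$ of Part 1.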
We next proceed with establishing Part 2 of the theorem. We begin by using induction on $p$. For $p = 1$, we have $a_1 = 1$, and every $\lambda = (\lambda_1)$ is saturated because every $0 \le v \le \lambda a = \lambda_1$ satisfies $v = \mu a = \mu_1$ for $\mu \le \lambda$ given by $\mu = (\mu_1)$ with $\mu_1 = v$.

Next consider $p > 1$. We use induction on $\lambda_p$. Suppose first that $\lambda_p = 0$. Let $a' := (a_1, \dots, a_{p-1})$ and $\lambda' := (\lambda_1, \dots, \lambda_{p-1})$. Consider any value $0 \le v \le \lambda a = \lambda' a'$. Since $\lambda'$ is saturated by induction on $p$, there exists $\mu' \le \lambda'$ with $v = \mu' a'$. Then, $\mu := (\mu', 0) \le \lambda$ and $v = \mu a$. So $\lambda$ is also saturated. Next, consider $\lambda_p > 0$. Let $\tau := (\lambda_1, \dots, \lambda_{p-1}, \lambda_p - 1)$. Consider any value $0 \le v \le \tau a = \lambda a - a_p$. Since $\tau$ is saturated by induction on $\lambda_p$, there is a $\mu \le \tau < \lambda$ with $v = \mu a$, and so $v \in M(a, \tau) \subseteq M(a, \lambda)$. Moreover, $v + a_p = \hat{\mu} a$ with $\hat{\mu} := (\mu_1, \dots, \mu_{p-1}, \mu_p + 1) \le \lambda$, so $v + a_p \in M(a, \lambda)$ as well. Therefore

$$\{0, 1, \dots, \tau a\} \cup \{a_p, a_p + 1, \dots, \lambda a\} \subseteq M(a, \lambda). \tag{6}$$

Now,

$$\tau a = \sum_{i=1}^p \tau_i a_i \ge \sum_{i=1}^{p-1} \lambda_i a_i \ge \sum_{i=1}^{p-1} \Big( \frac{a_{i+1}}{a_i} - 1 \Big) a_i = \sum_{i=1}^{p-1} (a_{i+1} - a_i) = a_p - 1,$$

implying that the left-hand side of (6) is in fact equal to $\{0, 1, \dots, \lambda a\}$. Therefore $\lambda$ is indeed saturated. This completes the double induction, the proof of Part 2, and the proof of the theorem.

6 Obtaining an r-Best Solution

We can now combine all the ingredients developed in the previous sections and provide our algorithm. Let $a = (a_1, \dots, a_p)$ be a fixed primitive $p$-tuple. Define $\lambda = (\lambda_1, \dots, \lambda_p)$ by $\lambda_i := \max(a)$ for every $i$. For $\mu \le \lambda$ define

$$I^\lambda_\mu := \{i : \mu_i = \lambda_i\} \quad \text{and} \quad a^\lambda_\mu := \Big( \frac{a_i}{\gcd(a_i : i \in I^\lambda_\mu)} : i \in I^\lambda_\mu \Big).$$

Finally, define

$$r(a) := \sum_{\mu \le \lambda} \mathrm{F}(a^\lambda_\mu). \tag{7}$$

The next corollary gives some estimates on $r(a)$, including a general bound implied by Theorem 5.4.

Corollary 6.1. Let $a = (a_1, \dots, a_p)$ be any primitive $p$-tuple. Then the following hold:

1. An upper bound on $r(a)$ is given by $r(a) \le (2 \max(a))^p$.
2. For divisible $a$, we have $r(a) = 0$.
3. For $p = 2$, that is, for $a = (a_1, a_2)$, we have $r(a) = \mathrm{F}(a)$.

Proof. Define $\lambda = (\lambda_1, \dots, \lambda_p)$ by $\lambda_i := \max(a)$ for every $i$. First note that if $I^\lambda_\mu$ is empty or a singleton then $a^\lambda_\mu$ is empty or $a^\lambda_\mu = (1)$, and hence $\mathrm{F}(a^\lambda_\mu) = 0$.

Part 1: As noted, $\mathrm{F}(a^\lambda_\mu) = 0$ for each $\mu \le \lambda$ with $|I^\lambda_\mu| \le 1$. There are at most $2^p (\max(a))^{p-2}$ $p$-tuples $\mu \le \lambda$ with $|I^\lambda_\mu| \ge 2$ and for each, the bound of equation (4) implies $\mathrm{F}(a^\lambda_\mu) \le (\max(a))^2$. Hence

$$r(a) \le 2^p (\max(a))^{p-2} (\max(a))^2 \le (2 \max(a))^p.$$

Part 2: If $a$ is divisible, then the least entry of every nonempty $a^\lambda_\mu$ is 1, and hence $\mathrm{F}(a^\lambda_\mu) = 0$ for every $\mu \le \lambda$. Therefore $r(a) = 0$.

Part 3: As noted, $\mathrm{F}(a^\lambda_\mu) = 0$ for each $\mu \le \lambda$ with $|I^\lambda_\mu| \le 1$. For $p = 2$, the only $\mu \le \lambda$ with $|I^\lambda_\mu| = 2$ is $\mu = \lambda$. Because $a^\lambda_\lambda = a$, we find that $r(a) = \mathrm{F}(a)$.

We are now in position to prove the following refined version of our main theorem (Theorem 1.1).

Theorem 6.2. For every primitive $p$-tuple $a = (a_1, \dots, a_p)$, with $r(a)$ as in (7) above, there is an algorithm that, given any independence system $S \subseteq \{0,1\}^n$ presented by a linear-optimization oracle, weight vector $w \in \{a_1, \dots, a_p\}^n$, and function $f : \mathbb{Z} \to \mathbb{R}$ presented by a comparison oracle, provides an $r(a)$-best solution to the nonlinear problem $\min\{f(wx) : x \in S\}$, in time polynomial in $n$. Moreover:

1. If $a_i$ divides $a_{i+1}$ for $i = 1, \dots, p-1$, then the algorithm provides an optimal solution.
2. For $p = 2$, that is, for $a = (a_1, a_2)$, the algorithm provides an $\mathrm{F}(a)$-best solution.

Proof.
Consider the following algorithm:

    input independence system $S \subseteq \{0,1\}^n$ presented by a linear-optimization oracle, $f : \mathbb{Z} \to \mathbb{R}$ presented by a comparison oracle, and $w \in \{a_1, \dots, a_p\}^n$;
    define $\lambda = (\lambda_1, \dots, \lambda_p)$ by $\lambda_i := \max(a)$ for every $i$;
    for every choice of $p$-tuple $\mu \in \mathbb{Z}^p_+$, $\mu \le \lambda$ do
        find by the algorithm of Lemma 4.3 an $x_\mu$ attaining $\max\{wx : x \in S^\lambda_\mu\}$ if any;
        if $S^\lambda_\mu \ne \emptyset$ then find by the algorithm of Lemma 3.3 an $x^*_\mu$ attaining $\min\{f(wx) : x \in \{0,1\}^n,\ x \le x_\mu\}$;
    end
    output $x^*$ as one minimizing $f(wx)$ among the $x^*_\mu$.

First note that the number of $p$-tuples $\mu \le \lambda$ and hence of loop iterations and applications of the polynomial-time algorithms of Lemma 3.3 and Lemma 4.3 is $\prod_{i=1}^p (\lambda_i + 1) = (1 + \max(a))^p$, which is constant since $a$ is fixed. Therefore the entire running time of the algorithm is polynomial.

Consider any $p$-tuple $\mu \le \lambda$ with $S^\lambda_\mu \ne \emptyset$, and let $x_\mu$ be an optimal solution of $\max\{wx : x \in S^\lambda_\mu\}$ determined by the algorithm. Let $I := I^\lambda_\mu = \{i : \mu_i = \lambda_i\}$, let $g := \gcd(a_i : i \in I)$, let $\bar{a} := a^\lambda_\mu = \frac{1}{g}(a_i : i \in I)$, and let $h := \sum \{\mu_i a_i : i \notin I\}$. For each point $x \in \{0,1\}^n$ and for each $i = 1, \dots, p$, let as usual $\lambda_i(x) := |\mathrm{supp}(x) \cap N_i|$, where $N_i = \{j : w_j = a_i\}$, and let $\bar{\lambda}(x) := (\lambda_i(x) : i \in I)$. By the definition of $S^\lambda_\mu$ in equation (3) and of $I$ above, for each $x \in S^\lambda_\mu$ we have

$$wx = \sum_{i \notin I} \lambda_i(x) a_i + \sum_{i \in I} \lambda_i(x) a_i = \sum_{i \notin I} \mu_i a_i + g \sum_{i \in I} \lambda_i(x) \frac{1}{g} a_i = h + g \bar{\lambda}(x) \bar{a}.$$

In particular, for every $x \in S^\lambda_\mu$ we have $wx \in h + g M(\bar{a})$ and $wx \le w x_\mu = h + g \bar{\lambda}(x_\mu) \bar{a}$, and therefore

$$w \cdot S^\lambda_\mu \subseteq h + g \big( M(\bar{a}) \cap \{0, 1, \dots, \bar{\lambda}(x_\mu) \bar{a}\} \big).$$

Let $T := \{x : x \le x_\mu\}$.
Clearly, for any $\bar{\nu} \le \bar{\lambda}(x_\mu)$ there is an $x \in T$ obtained by zeroing out suitable entries of $x_\mu$ such that $\bar{\lambda}(x) = \bar{\nu}$ and $\lambda_i(x) = \lambda_i(x_\mu) = \mu_i$ for $i \notin I$, and hence $wx = h + g \bar{\nu} \bar{a}$. Therefore

$$h + g M\big(\bar{a}, \bar{\lambda}(x_\mu)\big) \subseteq w \cdot T.$$

Since $x_\mu \in S^\lambda_\mu$, by the definition of $S^\lambda_\mu$ and $I$, for each $i \in I$ we have

$$\lambda_i(x_\mu) = |\mathrm{supp}(x_\mu) \cap N_i| \ge \mu_i = \lambda_i = \max(a) \ge \max(\bar{a}).$$

Therefore, by Theorem 5.4, we conclude that $\bar{\lambda}(x_\mu) = (\lambda_i(x_\mu) : i \in I)$ is saturated for $\bar{a}$ and hence

$$M\big(\bar{a}, \bar{\lambda}(x_\mu)\big) = \big( M(\bar{a}) \cap \{0, 1, \dots, \bar{\lambda}(x_\mu) \bar{a}\} \big) \setminus \big( \bar{\lambda}(x_\mu) \bar{a} - G(\bar{a}) \big).$$

This implies that

$$w \cdot S^\lambda_\mu \setminus w \cdot T \subseteq h + g \big( \bar{\lambda}(x_\mu) \bar{a} - G(\bar{a}) \big),$$

and hence

$$|w \cdot S^\lambda_\mu \setminus w \cdot T| \le |G(\bar{a})| \le \mathrm{F}(\bar{a}).$$

Therefore, as compared to the objective value of the optimal solution $x^*_\mu$ of $\min\{f(wx) : x \in T\} = \min\{f(wx) : x \le x_\mu\}$ determined by the algorithm, at most $\mathrm{F}(\bar{a})$ better objective values are attained by points in $S^\lambda_\mu$.

Since $S = \biguplus_{\mu \le \lambda} S^\lambda_\mu$ by Proposition 4.2, the independence system $S$ has altogether at most

$$\sum_{\mu \le \lambda} \mathrm{F}(a^\lambda_\mu) = r(a)$$

better objective values $f(wx)$ attainable than that of the solution $x^*$ output by the algorithm. Therefore $x^*$ is indeed an $r(a)$-best solution to the nonlinear optimization problem over the (singly) weighted independence system.

In fact, as the above proof of Theorem 6.2 shows, our algorithm provides a better, $g(a)$-best, solution, where $g(a)$ is defined as follows in terms of the cardinalities of the gap sets of the subtuples $a^\lambda_\mu$ with $\lambda$ defined again by $\lambda_i := \max(a)$ for all $i$ (in particular, $g(a) = |G(a)|$ for $p = 2$),

$$g(a) := \sum_{\mu \le \lambda} |G(a^\lambda_\mu)|. \tag{8}$$

7 Finding an Optimal Solution Requires Exponential Time

We now demonstrate that our results are best possible in the following sense. Consider $a := (2, 3)$.
Because $F(2,3) = 1$, Theorem 1.1 (Part 2) assures that our algorithm produces a 1-best solution in polynomial time. We next establish a refined version of Theorem 1.2, showing that a 0-best (i.e., optimal) solution cannot be found in polynomial time.

Theorem 7.1. There is no polynomial time algorithm for computing a 0-best (i.e., optimal) solution of the nonlinear optimization problem $\min\{f(wx) : x \in S\}$ over an independence system presented by a linear optimization oracle with $f$ presented by a comparison oracle and weight vector $w \in \{2,3\}^n$. In fact, to solve the nonlinear optimization problem over every independence system $S$ with a ground set of $n = 4m$ elements with $m \ge 2$, at least $\binom{2m}{m+1} \ge 2^m$ queries of the oracle presenting $S$ are needed.

Proof. Let $n := 4m$ with $m \ge 2$, $I := \{1, \dots, 2m\}$, $J := \{2m+1, \dots, 4m\}$, and let $w := 2 \cdot \mathbf{1}_I + 3 \cdot \mathbf{1}_J$. For $E \subseteq \{1, \dots, n\}$ and any nonnegative integer $k$, let $\binom{E}{k}$ be the set of all $k$-element subsets of $E$. For $i = 0, 1, 2$, let

$$T_i := \left\{ x = \mathbf{1}_A + \mathbf{1}_B : A \in \tbinom{I}{m+i},\ B \in \tbinom{J}{m-i} \right\} \subset \{0,1\}^n .$$

Let $S$ be the independence system generated by $T_0 \cup T_2$, that is,

$$S := \{ z \in \{0,1\}^n : z \le x \text{ for some } x \in T_0 \cup T_2 \} .$$

Note that the $w$-image of $S$ is

$$w \cdot S = \{0, \dots, 5m\} \setminus \{1, 5m-1\} .$$

For every $y \in T_1$, let $S_y := S \cup \{y\}$. Note that each $S_y$ is an independence system as well, but with $w$-image

$$w \cdot S_y = \{0, \dots, 5m\} \setminus \{1\} ;$$

that is, the $w$-image of each $S_y$ is precisely the $w$-image of $S$ augmented by the value $5m - 1$.

Finally, for each vector $c \in \mathbb{Z}^n$, let

$$Y(c) := \{ y \in T_1 : cy > \max\{cx : x \in S\} \} .$$

Claim: $|Y(c)| \le \binom{2m}{m-1}$ for every $c \in \mathbb{Z}^n$.

Proof of Claim: Consider two elements (if any) $y, z \in Y(c)$. Then $y = \mathbf{1}_A + \mathbf{1}_B$ and $z = \mathbf{1}_U + \mathbf{1}_V$ for some $A, U \in \binom{I}{m+1}$ and $B, V \in \binom{J}{m-1}$. Suppose, indirectly, that $A \ne U$ and $B \ne V$.
Pick $a \in A \setminus U$ and $v \in V \setminus B$. Consider the following vectors,

$$x^0 := y - \mathbf{1}_a + \mathbf{1}_v \in T_0, \qquad x^2 := z + \mathbf{1}_a - \mathbf{1}_v \in T_2 .$$

Now $y, z \in Y(c)$ and $x^0, x^2 \in S$ imply the contradiction

$$c_a - c_v = cy - cx^0 > 0, \qquad c_v - c_a = cz - cx^2 > 0 .$$

This implies that all vectors in $Y(c)$ are of the form $\mathbf{1}_A + \mathbf{1}_B$ with either $A \in \binom{I}{m+1}$ fixed, in which case $|Y(c)| \le \binom{2m}{m-1}$, or $B \in \binom{J}{m-1}$ fixed, in which case $|Y(c)| \le \binom{2m}{m+1} = \binom{2m}{m-1}$, as claimed.

Continuing with the proof of our theorem, consider any algorithm, and let $c^1, \dots, c^p \in \mathbb{Z}^n$ be the sequence of oracle queries made by the algorithm. Suppose that $p < \binom{2m}{m+1}$. Then

$$\left| \bigcup_{i=1}^p Y(c^i) \right| \le \sum_{i=1}^p |Y(c^i)| \le p \binom{2m}{m-1} < \binom{2m}{m+1}\binom{2m}{m-1} = |T_1| .$$

This implies that there exists some $y \in T_1$ that is an element of none of the $Y(c^i)$, that is, satisfies $c^i y \le \max\{c^i x : x \in S\}$ for each $i = 1, \dots, p$. Therefore, whether the linear optimization oracle presents $S$ or $S_y$, on each query $c^i$ it can reply with some $x^i \in S$ attaining

$$c^i x^i = \max\{c^i x : x \in S\} = \max\{c^i x : x \in S_y\} .$$

Therefore, the algorithm cannot tell whether the oracle presents $S$ or $S_y$ and hence can neither compute the $w$-image of the independence system nor solve the nonlinear optimization problem correctly.

8 Discussion

We view this article as a first step in understanding the complexity of the general nonlinear optimization problem over an independence system presented by an oracle. Our work raises many intriguing questions including the following. Can the saturated $\lambda$ for $a$ be better understood or even characterized? Can a saturated $\lambda$ smaller than that with $\lambda_i = \max(a)$ be determined for every $a$ and be used to obtain a better running-time guarantee for the algorithm of Theorem 1.1 and better approximation quality $r(a)$?
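As a sanity check on the lower-bound construction from the proof of Theorem 7.1, the independence system $S$ generated by $T_0 \cup T_2$ is small enough to enumerate for small $m$. The Python sketch below represents each 0/1 vector by its support and verifies the stated $w$-images and the size of $T_1$; the helper names (`generators`, `down_closure`, `weight`, `check`) are hypothetical, and the brute-force enumeration is only an illustration, not part of the paper's argument.

```python
from itertools import combinations
from math import comb


def generators(m, i):
    """Supports of the vectors in T_i: (m+i) elements of I plus (m-i) elements of J."""
    I = range(1, 2 * m + 1)
    J = range(2 * m + 1, 4 * m + 1)
    return {frozenset(A) | frozenset(B)
            for A in combinations(I, m + i)
            for B in combinations(J, m - i)}


def down_closure(tops):
    """All subsets of the given supports, i.e. the independence system they generate."""
    closed = set()
    for top in tops:
        items = sorted(top)
        for k in range(len(items) + 1):
            closed.update(frozenset(sub) for sub in combinations(items, k))
    return closed


def weight(support, m):
    """Total weight under w = 2 on I = {1..2m} and 3 on J = {2m+1..4m}."""
    return sum(2 if j <= 2 * m else 3 for j in support)


def check(m):
    S = down_closure(generators(m, 0) | generators(m, 2))
    T1 = generators(m, 1)
    # w-image of S is {0,...,5m} minus {1, 5m-1}.
    assert {weight(x, m) for x in S} == set(range(5 * m + 1)) - {1, 5 * m - 1}
    # |T_1| equals the product of binomial coefficients used in the counting step.
    assert len(T1) == comb(2 * m, m + 1) * comb(2 * m, m - 1)
    for y in T1:
        assert y not in S
        # S plus y stays down-closed: dropping any one element of y lands in S.
        assert all(y - {j} in S for j in y)
        # y contributes exactly the missing value 5m-1 to the w-image.
        assert weight(y, m) == 5 * m - 1
    return len(S), len(T1)


print(check(2))  # (126, 16)
```

Running `check(3)` performs the same verification on a ground set of 12 elements.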
Can tighter bounds on $r(a)$ in equation (7) and $g(a)$ in equation (8), and possibly formulas for $r(a)$ and $g(a)$ for small values of $p$, in particular $p = 3$, be derived? For which primitive $p$-tuples $a$ can an exact solution to the nonlinear optimization problem over a (singly) weighted independence system be obtained in polynomial time, at least for small $p$, in particular $p = 2$? For $p = 2$ we know that we can when $a_1$ divides $a_2$, and we cannot when $a := (2,3)$, but we do not have a complete characterization. How about $d = 2$? While this includes the notorious exact matching problem as a special case, it may still be that a polynomial-time solution is possible. And how about larger, but fixed, $d$?

In another direction, it can be interesting to consider the problem for functions $f$ with some structure that helps to localize minima. For instance, if $f : \mathbb{R} \to \mathbb{R}$ is concave or even more generally quasiconcave (that is, its "upper level sets" $\{z \in \mathbb{R} : f(z) \ge \tilde f\}$ are convex subsets of $\mathbb{R}$, for all $\tilde f \in \mathbb{R}$; see [1], for example), then the optimal value $\min\{f(wx) : x \in S\}$ is always attained on the boundary of $\mathrm{conv}(w \cdot S)$, i.e., if $x^*$ is a minimizer, then either $wx^* = 0$ or $wx^*$ attains $\max\{wx : x \in S\}$, so the problem is easily solvable by a single query to the linear-optimization oracle presenting $S$ and a single query to the comparison oracle of $f$. Also, if $f$ is convex or even more generally quasiconvex (that is, its "lower level sets" $\{z \in \mathbb{R} : f(z) \le \tilde f\}$ are convex subsets of $\mathbb{R}$, for all $\tilde f \in \mathbb{R}$), then a much simplified version of the algorithm (from the proof of Theorem 6.2) gives an $r$-best solution as well, as follows.

Proposition 8.1. For every primitive $p$-tuple $a = (a_1, \dots, a_p)$, there is an algorithm that, given independence system $S \subseteq \{0,1\}^n$ presented by a linear-optimization oracle, weight vector $w \in \{a_1, \dots, a_p\}^n$, and quasiconvex function $f : \mathbb{R} \to \mathbb{R}$ presented by a comparison oracle, provides a $(\max(a) - 1)$-best solution to the nonlinear problem $\min\{f(wx) : x \in S\}$, in time polynomial in $n$.

Proof. We could describe the construction as a specialization of the algorithm from the proof of Theorem 6.2, but it is clearer to present it directly. We first use our linear-optimization oracle to find $x^*$ attaining $\max\{wx : x \in S\}$. Then, by repeatedly, and in an arbitrary order, decreasing a single component of the point by unity, we obtain a sequence of points

$$x^k := x^* \ge x^{k-1} \ge \dots \ge x^0 := 0, \qquad \text{with } k = \sum_{j=1}^n x^*_j \le n .$$

Let $\breve f := \min\{f(wx^t) : 0 \le t \le k\}$. Next, using the comparison oracle (a linear number of times), we find the least and greatest indices $t$, say $t_{\min}$ and $t_{\max}$ respectively, for which $x^t$ minimizes $f(wx^t)$. Quasiconvexity of $f$ implies that $f(wx^t) = \breve f$ for $t_{\min} \le t \le t_{\max}$. Moreover, quasiconvexity implies that there is an index $s$, satisfying $t_{\min} - 1 \le s \le t_{\max}$, such that all points $z \in [0, wx^*] \cap \mathbb{Z}$ having $f(z) < \breve f$ are in $[wx^s + 1, wx^{s+1} - 1] \cap \mathbb{Z}$ (that is, in one of the $t_{\max} - t_{\min} + 2$ intervals $[wx^t, wx^{t+1}]$, beginning with the one immediately to the left of $t_{\min}$ and ending with the one immediately to the right of $t_{\max}$, and excluding the endpoints of that interval). The result now follows by noticing that $wx^{t+1} - wx^t \le \max(a)$ for $t = 0, \dots, k-1$, in particular for $t = s$.

In yet another direction, it would be interesting to consider other (weaker or stronger) oracle presentations of the independence system $S$. While a membership oracle suffices for nonlinear optimization when $S$ is a matroid [2], in general it is much too weak, as the following proposition shows.

Proposition 8.2.
There is no polynomial time algorithm for solving the nonlinear optimization problem $\min\{f(wx) : x \in S\}$ over an independence system presented by a membership oracle with $f$ presented by a comparison oracle, even with all weights equal to 1, that is, for $p = 1$, $a = 1$, $w = (1, \dots, 1)$.

Proof. Let $n := 2m$, let $w := \sum_{i=1}^n \mathbf{1}_i = (1, \dots, 1)$, and let

$$S := \{ x \in \{0,1\}^n : |\mathrm{supp}(x)| \le m - 1 \} .$$

For each $y \in \{0,1\}^n$ with $|\mathrm{supp}(y)| = m$, let $S_y := S \cup \{y\}$. Note that

$$w \cdot S = \{0, 1, \dots, m-1\}, \qquad w \cdot S_y = \{0, 1, \dots, m-1, m\} .$$

Now, suppose an algorithm queries the membership oracle fewer than $\binom{n}{m}$ times. Then some $y \in \{0,1\}^n$ with $|\mathrm{supp}(y)| = m$ is not queried, and so the algorithm cannot tell whether the oracle presents $S$ or $S_y$ and hence can neither compute the image nor solve the nonlinear optimization problem correctly.

Acknowledgment

This research was supported by the Mathematisches Forschungsinstitut Oberwolfach during a stay within the Research in Pairs Programme.

References

[1] Avriel, M., Diewert, W.E., Schaible, S., Zang, I.: Generalized concavity. Mathematical Concepts and Methods in Science and Engineering, 36. Plenum Press, New York (1988)
[2] Berstein, Y., Lee, J., Maruri-Aguilar, H., Onn, S., Riccomagno, E., Weismantel, R., Wynn, H.: Nonlinear matroid optimization and experimental design. SIAM Journal on Discrete Mathematics (to appear)
[3] Berstein, Y., Onn, S.: Nonlinear bipartite matching. Discrete Optimization 5:53–65 (2008)
[4] Brauer, A.: On a problem of partitions. American Journal of Mathematics 64:299–312 (1942)
[5] Mulmuley, K., Vazirani, U.V., Vazirani, V.V.: Matching is as easy as matrix inversion. Combinatorica 7:105–113 (1987)
[6] Papadimitriou, C.H., Yannakakis, M.: The complexity of restricted spanning tree problems. Journal of the Association for Computing Machinery 29:285–309 (1982)

Appendix

Claim 1: There are at least two indices $k$ for which $\mu_k < \lambda_k / 2$.

Proof of Claim 1: We note that $\mu_j > \lambda_j$ trivially implies

$$0 \le a_j(\mu_j - \lambda_j - 1) . \tag{9}$$

Also, $\mu a \le \frac{1}{2}\lambda a$ can be written as

$$\sum_{k \ne j} a_k(\mu_k - \lambda_k/2) \le a_j(-\mu_j + \lambda_j/2) . \tag{10}$$

Now, adding (9) and (10), we obtain

$$\sum_{k \ne j} a_k(\mu_k - \lambda_k/2) \le -a_j(\lambda_j/2 + 1) . \tag{11}$$

The right-hand side of (11) is negative, therefore the left-hand side must also be negative. Suppose that there is but a single index $k$ for which a summand on the left-hand side of (11) is negative. Then we have

$$a_k(\mu_k - \lambda_k/2) \le -a_j(\lambda_j/2 + 1),$$

which implies

$$\max(a)\,(\mu_k - \lambda_k/2) \le -a_j(\max(a)/2 + 1) . \tag{12}$$

We observe that we must have $\mu_k - \lambda_k > -a_j$, otherwise we could decrease the violation by decreasing $\mu_j$ by $a_k$ and increasing $\mu_k$ by $a_j$. But $\mu_k - \lambda_k > -a_j$ implies that

$$\mu_k - \lambda_k/2 \ge -a_j + 1 + \lambda_k/2 \ge -a_j + 1 + \max(a)/2 . \tag{13}$$

Next, we combine (12) and (13) to arrive at

$$\max(a)\,(-a_j + 1 + \max(a)/2) \le -a_j(\max(a)/2 + 1),$$

or, equivalently,

$$a_j(\max(a)/2 - 1) \ge \max(a)\,(\max(a)/2 + 1),$$

which cannot hold. So Claim 1 is established.

SubClaim 2.1: The integer program $P_\gamma$ is feasible for all integers $0 \le \gamma\ (\le a_j - 1)$ that are integer multiples of $\gcd(a_l, a_j)$.

Proof of SubClaim 2.1: Suppose that $\gamma := z_k \gcd(a_l, a_j)$, for some $z_k \in \mathbb{Z}_+$. By Bézout's Lemma, there are integers $\beta_j, \beta_l$ such that

$$a_j\beta_j + a_l\beta_l = \gcd(a_l, a_j) .$$

Moreover, there is an infinite family indicated by

$$a_j\left(\beta_j + t\,a_l/\gcd(a_l, a_j)\right) + a_l\left(\beta_l - t\,a_j/\gcd(a_l, a_j)\right) = \gcd(a_l, a_j),$$

with $t$ ranging over $\mathbb{Z}$. Multiplying through by $z_k a_k$, and rearranging terms, we obtain

$$a_j z_k a_k\left(\beta_j + t\,a_l/\gcd(a_l, a_j)\right) + a_l z_k a_k\left(\beta_l - t\,a_j/\gcd(a_l, a_j)\right) = z_k \gcd(a_l, a_j)\,a_k = \gamma a_k .$$
Now, for a sufficiently large positive integer $t$, we will have

$$\beta_l - t\,a_j/\gcd(a_l, a_j) \le 0,$$

and so

$$x_j(\gamma) := z_k a_k\left(\beta_j + t\,a_l/\gcd(a_l, a_j)\right); \qquad -x_l(\gamma) := z_k a_k\left(\beta_l - t\,a_j/\gcd(a_l, a_j)\right)$$

will be a feasible solution to $P_\gamma$. Thus we have established SubClaim 2.1.

SubClaim 2.2: In fact, for $\gamma = z_k \gcd(a_l, a_j)$ with $z_k \in \mathbb{Z}_+$, we have that $x^*_l(\gamma) = z_l \gcd(a_k, a_j)$ for some $z_l \in \mathbb{Z}_+$.

Proof of SubClaim 2.2:

$$a_l x^*_l(\gamma) = a_j x^*_j(\gamma) - \gamma a_k = \left( a_j x^*_j(\gamma)/\gcd(a_k, a_j) - \gamma a_k/\gcd(a_k, a_j) \right) \gcd(a_k, a_j) .$$

As $\gcd(a_k, a_j)$ divides both $a_j$ and $a_k$, we have $a_l x^*_l(\gamma) = z \gcd(a_k, a_j)$ for some $z \in \mathbb{Z}_+$, and hence $x^*_l(\gamma) = (z/a_l)\gcd(a_k, a_j)$. As $\gcd(a_l, \gcd(a_k, a_j)) = 1$, it is clear that $a_l$ must divide $z$ (after all, $x^*_l(\gamma) \in \mathbb{Z}$), and hence SubClaim 2.2 is established.

SubClaim 2.3: For $0 \le \gamma, \gamma' < a_j/\gcd(a_k, a_j)$, we have that $x^*_l(\gamma) \ne x^*_l(\gamma')$ for $\gamma \ne \gamma'$.

Proof of SubClaim 2.3: Suppose the contrary. Without loss of generality, $\gamma' > \gamma$. Then we have the following two equations:

$$a_k\gamma' + a_l x^*_l(\gamma') = x^*_j(\gamma')\,a_j ; \tag{14}$$

$$a_k\gamma + a_l x^*_l(\gamma) = x^*_j(\gamma)\,a_j . \tag{15}$$

Subtracting (15) from (14) gives

$$a_k(\gamma' - \gamma) = a_j\left( x^*_j(\gamma') - x^*_j(\gamma) \right). \tag{16}$$

Because $\gamma' > \gamma$, the left-hand side of (16) is positive, which implies that $x^*_j(\gamma') - x^*_j(\gamma) > 0$. But $\gamma' - \gamma < a_j/\gcd(a_k, a_j)$. This contradicts that $\gcd\left(a_j/\gcd(a_k, a_j),\ a_k/\gcd(a_k, a_j)\right) = 1$, because every positive integer solution of $a_k x_k = a_j x_j$ is a positive multiple of $x_k := a_j/\gcd(a_k, a_j)$, $x_j := a_k/\gcd(a_k, a_j)$. Thus we have established SubClaim 2.3.

SubClaim 2.4: For integer $\gamma \ge a_j/\gcd(a_k, a_j)$, we write $\gamma$ uniquely as

$$\gamma = \gamma' + \mu\,a_j/\gcd(a_k, a_j),$$

with $\mu \in \mathbb{Z}_+$, $\gamma' \in \mathbb{Z}_+$, $\gamma' < a_j/\gcd(a_k, a_j)$.
Then we have that

$$x^*_l(\gamma) = x^*_l(\gamma'), \qquad x^*_j(\gamma) = x^*_j(\gamma') + \mu\,a_k/\gcd(a_k, a_j) .$$

Proof of SubClaim 2.4: We can directly check feasibility for $P_\gamma$:

$$a_j\left( x^*_j(\gamma') + \mu\,a_k/\gcd(a_k, a_j) \right) - a_l x^*_l(\gamma') = \left( \gamma' + \mu\,a_j/\gcd(a_k, a_j) \right) a_k = \gamma a_k .$$

Moreover, there is no feasible solution $\bar x$ for $P_\gamma$ having $\bar x_l < x^*_l(\gamma')$, because if there were, we would simply subtract $\mu\,a_k/\gcd(a_k, a_j)$ from $\bar x_j$ (which stays nonnegative, since $a_j \bar x_j \ge \gamma a_k \ge \mu\,a_j a_k/\gcd(a_k, a_j)$), and leave $\bar x_l$ unchanged, to produce a feasible solution for $P_{\gamma'}$ having objective value less than $x^*_l(\gamma')$, a contradiction. Thus we have established SubClaim 2.4.

Jon Lee
IBM T.J. Watson Research Center, Yorktown Heights, NY 10598, USA
email: jonlee@us.ibm.com, http://www.research.ibm.com/people/j/jonlee

Shmuel Onn
Technion - Israel Institute of Technology, 32000 Haifa, Israel
email: onn@ie.technion.ac.il, http://ie.technion.ac.il/~onn

Robert Weismantel
Otto-von-Guericke Universität Magdeburg, D-39106 Magdeburg, Germany
email: weismantel@imo.math.uni-magdeburg.de, http://www.math.uni-magdeburg.de/~weismant