Algorithms for Temperature-Aware Task Scheduling in Microprocessor Systems
We study scheduling problems motivated by recently developed techniques for microprocessor thermal management at the operating systems level. The general scenario can be described as follows. The microprocessor's temperature is controlled by the hard…
Authors: Marek Chrobak, Christoph Durr, Mathilde Hur
Marek Chrobak∗, Christoph Dürr†, Mathilde Hurand†, Julien Robert‡

Abstract

We study scheduling problems motivated by recently developed techniques for microprocessor thermal management at the operating systems level. The general scenario can be described as follows. The microprocessor's temperature is controlled by the hardware thermal management system that continuously monitors the chip temperature and automatically reduces the processor's speed as soon as the thermal threshold is exceeded. Some tasks are more CPU-intensive than others and thus generate more heat during execution. The cooling system operates non-stop, reducing (at an exponential rate) the deviation of the processor's temperature from the ambient temperature. As a result, the processor's temperature, and thus the performance as well, depends on the order of the task execution. Given a variety of possible underlying architectures, models for cooling and for hardware thermal management, as well as types of tasks, this scenario gives rise to a plethora of interesting and never studied scheduling problems.

We focus on scheduling real-time jobs in a simplified model for cooling and thermal management. A collection of unit-length jobs is given, each job specified by its release time, deadline and heat contribution. If, at some time step, the temperature of the system is $\tau$ and the processor executes a job with heat contribution $h$, then the temperature at the next step is $(\tau + h)/2$. The temperature cannot exceed the given thermal threshold $T$. The objective is to maximize the throughput, that is, the number of tasks that meet their deadlines. We prove that, in the offline case, computing the optimum schedule is NP-hard, even if all jobs are released at the same time.
In the online case, we show a 2-competitive deterministic algorithm and a matching lower bound.

∗ Department of Computer Science, University of California, Riverside, CA 92521, USA. Supported by NSF grants OISE-0340752 and CCF-0729071.
† CNRS, LIX UMR 7161, Ecole Polytechnique 91128 Palaiseau, France. Supported by ANR Alpage.
‡ Laboratoire de l'Informatique du Parallélisme, Ecole Normale Supérieure de Lyon; ENS Lyon, France.

1 Introduction

Background. The problem of managing the temperature of processor systems is not new; in fact, system builders have had to deal with this challenge since the inception of computers. Since the early 1990s, the introduction of battery-operated laptop computers and sensor systems has highlighted the related issue of controlling the energy consumption. Most of the initial work on these problems was hardware and systems oriented, and only during the last decade has substantial progress been achieved on developing models and algorithmic techniques for microprocessor temperature and energy management.

This work proceeded in several directions. One direction is based on the fact that the energy consumption is a fast growing function of the processor speed (or frequency). Thus we can save energy by simply slowing down the processor. Here, algorithmic research focused on speed scaling, namely dynamically adjusting the processor speed over time to optimize the energy consumption while ensuring that the system meets the desired performance requirements. Another technique (applicable to the whole system, not just the microprocessor) involves power-down strategies, where the system is powered down or even completely turned off when some of its components are idle. Since changing the power level of a system introduces some overhead, scheduling the work to minimize the overall energy usage in this model becomes a challenging optimization problem.
Models have also been developed for the processor's thermal behavior. Here, the main objective is to ensure that the system's temperature does not exceed the so-called thermal threshold, above which the processor cannot operate correctly, or may even be damaged. In this direction, techniques and algorithms have been proposed for using speed scaling to optimize the system's performance while maintaining the temperature below the threshold. We refer the reader to the survey by Irani and Pruhs [5], and references therein, for more in-depth information on the models and algorithms for thermal and energy management.

Temperature-aware scheduling. The above models address energy and thermal management at the micro-architecture level. In contrast, the problem we study in this paper addresses the issue of thermal management at the operating systems level. Most of the previous work in this direction focused on multi-core systems, where one can move tasks between the processors to minimize the maximum temperature [9, 1, 2, 6, 7, 8, 4, 10]. However, as has been recently discovered, even in single-core systems one can exploit variations in heat contributions among different tasks to reduce the processor's temperature through appropriate task scheduling [1, 4, 6, 7, 11]. In this scenario, the microprocessor's temperature is controlled by the hardware dynamic thermal management (DTM) system that continuously monitors the chip temperature and automatically reduces the processor's speed as soon as the thermal threshold (maximum safe operating temperature) is exceeded. Typically, the frequency is reduced by half, although it can be further reduced to one fourth or even one eighth, if needed. Running at a lower frequency, the CPU generates less heat. The cooling system operates non-stop, reducing (at an exponential rate) the deviation of the processor's temperature from the ambient temperature.
Once the chip cools down to below the threshold, the frequency is increased again.

Different tasks use different microprocessor units in different ways; in particular, some tasks are more CPU-intensive than others. As a result, the processor's thermal behavior, and thus the performance as well, depends on the order of the task execution. In particular, Yang et al. [11] point out that, based on the standard model for the microprocessor thermal behavior, for any given two tasks, scheduling the "hotter" job before the "cooler" one results in a lower final temperature than the reverse order. They take advantage of this phenomenon to reduce the number of DTM invocations, thus improving the performance of the OS scheduler.

With multitudes of possible underlying architectures (for example, single- vs. multi-core systems), models for cooling and hardware thermal management, as well as types of jobs (real-time, batch, etc.), the scenario outlined above gives rise to a plethora of interesting and never yet studied scheduling problems.

Our model. We focus on scheduling real-time jobs in a somewhat simplified model for cooling and thermal management. The time is divided into unit time slots and each job has unit length. (These jobs represent unit slices of the processes present in the OS scheduler's queue.) We assume that the heat contributions of these jobs are known. This is counterintuitive, but reasonably realistic, for, as discussed in [11], these values can be well approximated using appropriate prediction methods. In our thermal model we assume, without loss of generality, that the ambient temperature is 0 and that the heat contributions are expressed in the units of temperature (that is, by how much they would increase the chip temperature in the absence of cooling).
In reality [11], during the execution of a job, its heat contribution is spread over the whole time slot and so is the effect of cooling; thus, the final temperature can be expressed using an integral function. In this paper, we use a simplified model where we first take into account the job's heat contribution, and then apply the cooling, where the cooling simply reduces the temperature by half. Finally, we assume that only one processor frequency is available. Consequently, if there is no job whose execution does not cause a thermal violation, the processor must stay idle through the next time slot.

Our results. Summarizing, our scheduling problem can now be formalized as follows. A collection of unit-length jobs is given, each job $j$ with a release time $r_j$, deadline $d_j$ and heat contribution $h_j$. If, at some time step, the temperature of the system is $\tau$ and the processor executes a job $j$, then the temperature at the next step is $(\tau + h_j)/2$. The temperature cannot exceed the given thermal threshold $T$. The objective is to compute a schedule which maximizes the number of tasks that meet their deadlines.

We prove that in the offline case computing the optimum schedule is NP-hard, even if all jobs are released at the same time and have equal deadlines. In the online case, we show a 2-competitive deterministic algorithm and a matching lower bound.

2 Terminology and Notation

The input consists of $n$ unit-length jobs that we number $1, 2, \ldots, n$. Each job $j$ is specified by a triple $(r_j, d_j, h_j) \in \mathbb{N} \times \mathbb{N} \times \mathbb{Q}$, where $r_j$ is its release time, $d_j$ is the deadline and $h_j$ is its heat contribution. The time is divided into unit-length slots and each job can be executed in any time slot in the interval $[r_j, d_j]$. By $\tau_u$ we denote the processor temperature at time $u$.
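The update rule $(\tau + h_j)/2$ is simple enough to simulate directly. The following is a minimal sketch and not part of the original paper: the function name `simulate` and its parameters are hypothetical, an idle slot is modelled as a heat contribution of 0 (pure cooling), and exact rationals are used to avoid rounding. As a small illustration it also replays, in this simplified model, the effect attributed to Yang et al. [11]: running the hotter of two jobs first ends at a lower temperature.

```python
from fractions import Fraction

def simulate(heats, threshold, start=Fraction(0)):
    """Return the temperature after each slot; raise if the threshold is exceeded.

    heats: one heat contribution per time slot (0 models an idle slot).
    """
    temps = []
    tau = Fraction(start)
    for h in heats:
        tau = (tau + Fraction(h)) / 2  # add the job's heat, then cooling halves the result
        if tau > threshold:
            raise ValueError(f"thermal violation: {tau} exceeds {threshold}")
        temps.append(tau)
    return temps

# Two hypothetical jobs with heat contributions 1 and 1/5, threshold 1.
hot, cool = Fraction(1), Fraction(1, 5)
hot_first = simulate([hot, cool], threshold=1)
cool_first = simulate([cool, hot], threshold=1)
print(hot_first[-1], cool_first[-1])  # 7/20 11/20
```

The final temperature after two slots is $\tau/4 + h_1/4 + h_2/2$, so the job executed last carries twice the weight of the job executed first, which is why the hotter-first order ends cooler.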
The initial temperature is $\tau_0 = 0$, and it changes according to the following rules: if the temperature of the system at a time $u$ is $\tau_u$ and the processor executes a job $j$ then the temperature at time $u + 1$ is $\tau_{u+1} = (\tau_u + h_j)/2$. The temperature cannot exceed the given thermal threshold $T$. Without loss of generality, we assume that $T = 1$. Thus if $(\tau_u + h_j)/2 > 1$ then $j$ cannot be executed at time $u$. Idle slots are treated as executing a job with heat contribution 0, that is, after an idle slot the temperature decreases by half.

Given an instance, as above, the objective is to compute a schedule with maximum throughput, where throughput is defined as the number of completed jobs. Extending the standard notation for scheduling problems, we denote the offline version of this problem by $1 \mid r_i, h_i, p_i = 1 \mid \sum U_i$.

In the online version, denoted $1 \mid \text{online-}r_i, h_i, p_i = 1 \mid \sum U_i$, jobs are available to the algorithm at their release time. Scheduling decisions of the algorithm cannot depend on the jobs that have not yet been released.

Example. Suppose we have four jobs, specified in notation $j \to (r_j, d_j, h_j)$: $1 \to (0, 2, 0.4)$, $2 \to (0, 4, 0.6)$, $3 \to (2, 3, 1.9)$, $4 \to (4, 6, 0.8)$.

[Figure 1: Example of two schedules (panels: Schedule 1, Schedule 2).]

Figure 1 shows these jobs and two schedules. Numbers above the schedules denote temperatures. In the first schedule, when we schedule job 2 at time 1, the processor is too hot to execute job 3, so it will not complete job 3 at all. In the second schedule, we stay idle in step 2, allowing us to complete all jobs.

3 The NP-Completeness Proofs

In this section we prove that the scheduling problem $1 \mid r_i, h_i, p_i = 1 \mid \sum U_i$ is NP-hard.
For the sake of exposition, we start with a proof for the general case, and later we give a proof for the special case when all release times and deadlines are equal.

Theorem 1 The offline problem $1 \mid r_i, h_i, p_i = 1 \mid \sum U_i$ is NP-hard.

Proof: We use a reduction from the 3-Partition problem, defined as follows: we are given a set $S$ of $3n$ positive integers $a_1, \ldots, a_{3n}$ such that $\beta/4 < a_i < \beta/2$ for all $i$, where $\beta = \frac{1}{n} \sum a_i$. The goal is to partition $S$ into $n$ subsets, each subset with the total sum equal exactly $\beta$. (By the assumption on the $a_i$, each subset will have to have exactly 3 elements.) This partition of $S$ will be called a 3-partition. 3-Partition is well-known to be NP-hard [3] in the strong sense, that is, even if $\max_i a_i \le p(n)$, for some polynomial $p(n)$.

We now describe the reduction. For the given instance of 3-Partition we construct an instance of $1 \mid r_i, h_i, p_i = 1 \mid \sum U_i$ with $4n$ jobs. These jobs will be of two types:

(i) First, we have $3n$ jobs that correspond to the instance of 3-Partition. For every $1 \le i \le 3n$ we create a job of heat contribution $2 - 2^{1 - a_i}$, release time 1 and deadline $n(\beta + 1)$.

(ii) Next, we create $n$ additional "gadget" jobs. These jobs are tight, meaning that their deadline is just one unit after the release time. The first of these jobs has heat contribution 2 and release time 0. Then, for each $1 \le j \le n - 1$, we have a job with heat contribution 1 and release time $j(\beta + 1)$.

We claim that $S$ has a 3-partition if and only if the instance of $1 \mid r_i, h_i, p_i = 1 \mid \sum U_i$ constructed above has a schedule with throughput $4n$ (that is, with all jobs completed).

The main idea is this: Imagine that at some moment the temperature is 1, and we want to schedule a job of heat contribution $2 - 2^{1 - x}$, for some integer $x \ge 1$.
Then we must first wait $x - 1$ time units, so that the processor cools down to $(\frac{1}{2})^{x-1} = 2^{1-x}$, before we can schedule that job, and then right at the completion of this job the temperature is 1 again. The analogous property holds, in fact, even if at the beginning the temperature was some $\tau > 1/2$, except that then, after completing this job, the new temperature will be $\tau' = (\tau 2^{1-x} + 2 - 2^{1-x})/2 > (2^{1-x}/2 + 2 - 2^{1-x})/2 = 1 - 2^{-x-1} \ge 1/2$, that is, it may be different than $\tau$ but still strictly greater than $1/2$.

With this observation in mind, the proof of the above claim is quite easy.

($\Leftarrow$) First we show that if there is a solution to the scheduling problem, then $S$ has a 3-partition. Note that the tight jobs divide the time into $n$ intervals of length $\beta$ each. Also each of the $3n$ other jobs is scheduled in exactly one of these intervals. This defines a partition of $S$ into $n$ sets.

Now we claim that after every job execution the temperature is strictly more than $1/2$. This is true for the first job to be scheduled, since it has heat contribution 2. Each other job in the instance, including the tight jobs, has heat contribution at least 1. Therefore right after its execution the temperature is at least $1/2$, already if we take only this job into account. But there is also a declining but non-zero temperature contribution from the first tight job. So overall the temperature after every execution is strictly more than $1/2$.

Together with the earlier observation, this implies that every non-tight job of heat contribution $2 - 2^{1 - a_i}$ must be preceded by $a_i - 1$ idle units, thus using $a_i$ time slots in total. Therefore every set in the above mentioned partition has the total sum at most $\beta$. Since there are at most $n$ sets in the partition, the total sum of each must be exactly $\beta$. This proves that $S$ has a 3-partition.
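The observation that drives this direction, namely that a job of heat contribution $2 - 2^{1-x}$ occupies exactly $x$ slots (including $x - 1$ idle ones) whenever the preceding temperature exceeds $1/2$, can be checked mechanically. The snippet below is only an illustrative sketch; the helper name `idle_slots_needed` is hypothetical and the threshold is fixed to $T = 1$ as in the text.

```python
from fractions import Fraction

def idle_slots_needed(tau, heat):
    """Count idle slots until a job of the given heat fits under threshold 1.

    Returns (number of idle slots, temperature right after the job runs).
    """
    if heat > 2 or (heat == 2 and tau > 0):
        raise ValueError("job can never be executed from this temperature")
    idle = 0
    while (tau + heat) / 2 > 1:
        tau = tau / 2  # an idle slot halves the temperature
        idle += 1
    return idle, (tau + heat) / 2

# Starting from temperature exactly 1: x - 1 idle slots, and the temperature returns to 1.
for x in range(1, 8):
    heat = 2 - Fraction(2) ** (1 - x)
    assert idle_slots_needed(Fraction(1), heat) == (x - 1, Fraction(1))

# Starting from some temperature in (1/2, 1): still x - 1 idle slots,
# and the resulting temperature stays strictly above 1/2.
idle, after = idle_slots_needed(Fraction(3, 4), 2 - Fraction(2) ** (1 - 3))
print(idle, after)  # 2 31/32
```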
($\Rightarrow$) Now we show the other implication, namely that if $S$ has a 3-partition then there is a solution to the scheduling instance. Simply schedule the tight jobs at their release time. This divides the time into $n$ intervals of length $\beta$ each. Assign each of the $n$ sets in the partition to a distinct interval and schedule its jobs in this interval: every integer $a_i$ in the set corresponds to a job of heat contribution $2 - 2^{1 - a_i}$, and we schedule it preceded by $a_i - 1$ idle time units. The jobs of the set can be scheduled in an arbitrary order, the important property being that, since their total sum is $\beta$, they all fit exactly in this interval. After all jobs in one set are executed the temperature is exactly 1, and during the execution the temperature does not exceed 1 (because we pad the schedule with enough idle slots). All release time and deadline constraints are satisfied, so the scheduling instance has a feasible solution as well.

It remains to show that the above instance of $1 \mid r_i, h_i, p_i = 1 \mid \sum U_i$ can be produced in polynomial time from the instance of 3-Partition. Indeed, every number $a_i$ is mapped to some number $2 - 2^{1 - a_i}$, which is described with $O(a_i)$ bits. Since we assumed that all numbers $a_i$ for $1 \le i \le 3n$ are polynomial in $n$, the reduction will take polynomial time, and the proof is complete.

The above construction used the constraints of the release times and deadlines to fix tight jobs that force a partition of the time into intervals. We can actually prove a stronger result, namely that the problem remains NP-complete even if all release times are 0 and all deadlines are equal. Why is it interesting? One common approach in designing online algorithms for scheduling is to compute the optimal schedule for the pending jobs, and use this schedule to
make the decision as to what to execute next. The set of pending jobs forms an instance where all jobs have the same release time. Our NP-hardness result does not necessarily imply that the above method cannot work (assuming that we do not care about the running time of the online algorithm), but it makes this approach much less appealing, since reasoning about the properties of the pending jobs is likely to be very difficult.

Theorem 2 The offline problem $1 \mid r_i, h_i, p_i = 1 \mid \sum U_i$ is strongly NP-hard even for the special case when jobs are released at time 0 and all deadlines are equal.

Proof: The reduction is from Numerical-3D-Matching. In this problem, we are given 3 sets $A, B, C$ of $n$ non-negative integers each and a positive integer $\beta$. A 3-dimensional numerical matching is a set of $n$ triples $(a, b, c) \in A \times B \times C$ such that each number is matched exactly once (appears in exactly one triple) and all triples $(a, b, c)$ in the set satisfy $a + b + c = \beta$. Numerical-3D-Matching is known to be NP-complete even when the values of all numbers are bounded by a polynomial in $n$; it is listed as problem [SP16] in [3]. (Clearly, this problem is quite similar to 3-Partition that we used in the previous proof.)

Without loss of generality, we can assume (A1) that every $x \in A \cup B \cup C$ satisfies $x \le \beta$ and (A2) that $\sum_{x \in A \cup B \cup C} x = \beta n$.

We now describe the reduction. Let $\alpha = 1/25$ be a constant and let $f$ be the function $f : x \mapsto \alpha(1 + x/(8\beta))$. The instance of $1 \mid r_i, h_i, p_i = 1 \mid \sum U_i$ will have $4n + 1$ jobs, all with release time 0 and deadline $4n + 1$. These jobs will be of two types:

(i) First we have $3n$ jobs that correspond to the instance of Numerical-3D-Matching: for every $a \in A$, there is a job of heat contribution $8f(a)$, for every $b \in B$, there is a job of heat contribution $4f(b)$, and for every $c \in C$, there is a job of heat contribution $2f(c)$.
We call these jobs, respectively, A-jobs, B-jobs and C-jobs.

(ii) Next, we have $n + 1$ "gadget" jobs. The first of these jobs has heat contribution 2, and the remaining ones 1.75. We call these jobs, respectively, 2- and 1.75-jobs.

We claim that the instance $A, B, C, \beta$ has a numerical 3-dimensional matching if and only if the instance of $1 \mid r_i, h_i, p_i = 1 \mid \sum U_i$ that we constructed has a schedule with all jobs completed not later than at time $4n + 1$.

The idea of the proof is that the gadget jobs are so hot that they can only be scheduled every 4th time unit, separating the time into $n$ blocks of 3 consecutive time slots each. Every remaining job has a heat contribution that consists of two parts: a constant part ($8\alpha$, $4\alpha$ or $2\alpha$) and a tiny variable part that depends on the instance of the matching problem. This constant part is so large that in every block there is a single A-job, a single B-job and a single C-job, and they must be scheduled in that order. This defines a partitioning of $A, B, C$ into triplets of the form $(a, b, c) \in A \times B \times C$. Since the gadget jobs are so hot, they force every triple $(a, b, c)$ to satisfy $a + b + c \le \beta$. We now make this argument formal.

($\Rightarrow$) Suppose there is a solution to the instance of Numerical-3D-Matching. We construct a schedule where all jobs complete at or before time $4n + 1$. Schedule the 2-job at time 0, and all other gadget jobs every 4th time slot. Now the remaining slots are grouped into blocks consisting of 3 consecutive time slots each. For $i = 1, 2, \ldots, n$, associate the $i$-th triple $(a, b, c)$ from the matching with the $i$-th block, and the corresponding A-, B- and C-jobs are executed in this block in the order A, B, C (see Figure 2).

[Figure 2: The structure of the schedules in the proof: the 2-job, followed by blocks A B C, each block followed by a 1.75-job.]
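The schedule just described can be built and replayed on a toy instance. The sketch below is an illustration only, not code from the paper: the function names are hypothetical, and the instance ($A = \{1, 2\}$, $B = \{2, 3\}$, $C = \{3, 1\}$, $\beta = 6$, with the matching $(1, 2, 3)$, $(2, 3, 1)$) is made up for the example. It uses exact rationals so that the temperature after each gadget job can be compared with 1 exactly.

```python
from fractions import Fraction

ALPHA = Fraction(1, 25)

def f(x, beta):
    """The map f(x) = alpha * (1 + x / (8 * beta)) from the reduction."""
    return ALPHA * (1 + Fraction(x, 8 * beta))

def heats_from_matching(triples, beta):
    """Heat sequence: the 2-job, then per triple the A-, B-, C-jobs and a 1.75-job."""
    heats = [Fraction(2)]
    for a, b, c in triples:
        heats += [8 * f(a, beta), 4 * f(b, beta), 2 * f(c, beta), Fraction(7, 4)]
    return heats

def temperatures(heats):
    """Temperature after each slot, starting from temperature 0."""
    tau, out = Fraction(0), []
    for h in heats:
        tau = (tau + h) / 2
        out.append(tau)
    return out

triples = [(1, 2, 3), (2, 3, 1)]  # toy matching; every triple sums to beta = 6
temps = temperatures(heats_from_matching(triples, beta=6))

print(len(temps))       # 9, that is 4n + 1 slots for n = 2
print(max(temps) <= 1)  # True: the threshold is never exceeded
print(temps[0::4])      # temperature right after each gadget job
```

On this instance every entry of `temps[0::4]` equals 1, in line with the feasibility argument that follows.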
By construction, every job meets the deadline, so it remains to show that the temperature never exceeds 1. The non-gadget jobs all have heat contribution smaller than 1, by assumption (A1), so execution of a non-gadget job cannot increase the temperature to above 1, as long as the temperature before was not greater than 1. Now we show by induction that right after the execution of a gadget job the temperature is exactly 1. This is clearly the case after execution of the first job, since its heat contribution is 2. Now let $u$ be the time when a 1.75-job is scheduled, and suppose that at time $u - 3$ the temperature was 1. Let $(a, b, c)$ be the triple associated with the block consisting of time slots between $u - 3$ and $u$. Then, by $a + b + c = \beta$, at time $u$ the temperature is

$$\frac{1}{8} + \frac{8f(a)}{8} + \frac{4f(b)}{4} + \frac{2f(c)}{2} = \frac{1}{8} + \alpha\left(3 + \frac{a + b + c}{8\beta}\right) = \frac{1}{8} + \alpha\left(\frac{24}{8} + \frac{1}{8}\right) = \frac{1}{4}.$$

This shows that at time $u + 1$, after scheduling a 1.75-job, the temperature is again $(1.75 + 1/4)/2 = 1$. We conclude that the schedule is feasible.

($\Leftarrow$) Now we show the remaining implication. Suppose the instance of $1 \mid r_i, h_i, p_i = 1 \mid \sum U_i$ constructed above has a schedule where all jobs meet the deadline $4n + 1$. We show that there exists a matching of $A, B, C$.

We first show that this schedule must have the form from Figure 2. First, note that since all $4n + 1$ jobs have deadline $4n + 1$, all jobs must be scheduled without any idle time between time 0 and $4n + 1$. This means that the gadget job of heat contribution 2 must be scheduled at time 0, because that is the only moment of temperature 0. Also note that every job has heat contribution at least $2f(0) = 2\alpha$.

Now we claim that all 1.75-jobs have to be scheduled every 4th time unit. This holds because two units after scheduling a 1.75-job, the temperature is at least

$$\frac{1.75}{8} + \frac{2\alpha}{4} + \frac{2\alpha}{2} > \frac{1}{4}.$$

Therefore two executions of 1.75-jobs must be at least 4 time units apart, and this is only possible if they are scheduled exactly at times $4i$ for $i = 1, \ldots, n$.

We claim that after every execution of a gadget job, the temperature is at least $\tau = 364/375$. Clearly this is the case after the execution of the 2-job. Now assume that at time $4i + 1$, for some $i = 0, \ldots, n - 1$, the claim holds. Then at time $4i + 5$, after the execution of the next 1.75-job, the temperature is at least

$$\frac{\tau}{16} + \frac{2\alpha}{16} + \frac{2\alpha}{8} + \frac{2\alpha}{4} + \frac{1.75}{2} = \tau.$$

We show now that every block contains exactly one A-job, one B-job and one C-job, in that order. Towards contradiction, suppose that some A-job is scheduled at the 2nd position of a block, say at time $4i + 2$ for some $i \in \{0, \ldots, n - 1\}$. Its heat contribution is at least $8f(0)$. Therefore the temperature at time $4i + 4$ would be at least

$$\frac{\tau}{8} + \frac{2\alpha}{8} + \frac{8\alpha}{4} + \frac{2\alpha}{2} > \frac{1}{4},$$

contradicting that a 1.75-job is scheduled at that time. A similar argument shows that A-jobs cannot be scheduled at position 3 in a block, and therefore the 1st position of a block is always occupied by an A-job.

By an analogous reasoning, we show that a B-job cannot be scheduled at the 3rd position of some block. If it were scheduled there, the temperature at the end of the block would be at least

$$\frac{\tau}{8} + \frac{8\alpha}{8} + \frac{2\alpha}{4} + \frac{4\alpha}{2} > \frac{1}{4},$$

again contradicting that a 1.75-job is scheduled at the end of the block.

We showed that every block contains jobs that correspond to some triple $(a, b, c) \in A \times B \times C$. It remains to show that each such triple satisfies $a + b + c = \beta$. Let $(a_i, b_i, c_i)$ be the triple corresponding to the $i$-th block, for $1 \le i \le n$. Define $t_0 = 1$ and

$$t_i = \frac{1}{16} \cdot 8f(a_i) + \frac{1}{8} \cdot 4f(b_i) + \frac{1}{4} \cdot 2f(c_i) + \frac{1}{2} \cdot 1.75 = \frac{1}{400}\left[374 + (a_i + b_i + c_i)/\beta\right].$$
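The numeric constants used in this direction of the proof (the invariant $\tau = 364/375$, the three lower bounds exceeding $1/4$, and the closed form for $t_i$) are easy to confirm with exact rational arithmetic. The following is a hedged sanity check written for this purpose, not code from the paper; the helper names `f` and `t` and the sample values passed to them are hypothetical.

```python
from fractions import Fraction

alpha = Fraction(1, 25)
tau = Fraction(364, 375)
gadget = Fraction(7, 4)   # heat contribution 1.75
quarter = Fraction(1, 4)  # a 1.75-job can run only if the temperature is at most 1/4

# Invariant: three jobs of the minimum heat 2*alpha followed by a 1.75-job map tau to tau.
assert tau / 16 + 2 * alpha / 16 + 2 * alpha / 8 + 2 * alpha / 4 + gadget / 2 == tau

# Two slots after a 1.75-job the processor is still too hot for another 1.75-job.
assert gadget / 8 + 2 * alpha / 4 + 2 * alpha / 2 > quarter
# An A-job in the 2nd position of a block leaves the block too hot.
assert tau / 8 + 2 * alpha / 8 + 8 * alpha / 4 + 2 * alpha / 2 > quarter
# A B-job in the 3rd position of a block leaves the block too hot.
assert tau / 8 + 8 * alpha / 8 + 2 * alpha / 4 + 4 * alpha / 2 > quarter

def f(x, beta):
    return alpha * (1 + Fraction(x, 8 * beta))

def t(a, b, c, beta):
    """Contribution of a block (a, b, c) plus the following 1.75-job."""
    return 8 * f(a, beta) / 16 + 4 * f(b, beta) / 8 + 2 * f(c, beta) / 4 + gadget / 2

# Closed form t_i = (374 + (a + b + c) / beta) / 400, on two sample triples with beta = 6.
for a, b, c in [(1, 2, 3), (1, 1, 1)]:
    assert t(a, b, c, 6) == (374 + Fraction(a + b + c, 6)) / 400

print(t(1, 2, 3, 6))  # 15/16, the value taken when a + b + c = beta
```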
Thus $t_i$ represents the contribution of the $i$-th block and a following gadget job to the temperature right after this gadget job. This implies that, for $1 \le k \le n$, the temperature at time $4k + 1$ is exactly $\sum_{i=0}^{k} (1/16)^{k-i} t_i$.

By Assumption (A2), $\sum_{i=1}^{n} (a_i + b_i + c_i) = n\beta$, and thus $\sum_{i=1}^{n} t_i = \frac{15}{16} n$. Define $p_i = t_i - 15/16$ for $i = 1, 2, \ldots, n$. From the previous paragraph,

$$\sum_{i=1}^{n} p_i = 0. \tag{1}$$

As mentioned earlier, $\sum_{i=0}^{k} (1/16)^{k-i} t_i$ is the temperature at time $4k + 1$, so we have $\sum_{i=0}^{k} (1/16)^{k-i} t_i \le 1$. Therefore, for all $1 \le k \le n$ we get

$$16^{-k} \sum_{i=1}^{k} 16^{i} p_i = 16^{-k} \sum_{i=1}^{k} 16^{i} (t_i - 15/16) = \sum_{i=0}^{k} (1/16)^{k-i} t_i - (1/16)^{k} - 15 \sum_{i=1}^{k} (1/16)^{k-i+1} \le 1 - (1/16)^{k} - 15\left(1 - (1/16)^{k}\right)/15 = 0.$$

We conclude that for $k = 1, 2, \ldots, n$ we have

$$\sum_{i=1}^{k} 16^{i} p_i \le 0. \tag{2}$$

To complete the proof, it remains to show that $p_i = 0$ for all $i$, for this will imply that $a_i + b_i + c_i = \beta$, which in turn implies that there is a matching. We prove this claim by contradiction. Suppose that not all $p_i$'s are zero. Let $\ell$ be the smallest index such that $p_\ell > 0$ and

$$p_1 + \ldots + p_\ell \ge 0. \tag{3}$$

Clearly, $\ell \ge 2$. By the minimality of $\ell$, for every $2 \le k \le \ell - 1$ we have $p_1 + \ldots + p_{k-1} \le 0$ and $p_k + \ldots + p_\ell \ge 0$. There are $\sigma_i > 0$, $i = 1, \ldots, \ell$, such that $\sum_{i=1}^{j} \sigma_i = 16^{j}$ for every $1 \le j \le \ell$. Then

$$\sum_{j=1}^{\ell} 16^{j} p_j = \sum_{j=1}^{\ell} \sum_{i=1}^{j} \sigma_i p_j = \sum_{i=1}^{\ell - 1} \sigma_i \sum_{j=i}^{\ell} p_j + \sigma_\ell p_\ell > 0,$$

because all terms are non-negative and $p_\ell > 0$. This contradicts (2).

It remains to show that the above instance of $1 \mid h_i, p_i = 1 \mid \sum U_i$ can be produced in polynomial time from the instance of Numerical-3D-Matching. Indeed, every number $x \in A \cup B \cup C$ is mapped to some fraction, where both the denominator and numerator are linear in $x$ and $\beta$.
Therefore if we represent fractions by writing the denominator and numerator, and not as some rounded decimal expansion, the reduction will take polynomial time, and the proof is complete.

Theorem 2 implies that other variants of temperature-aware scheduling are NP-hard as well. Consider for example the problem $1 \mid h_i, p_i = 1 \mid C_{\max}$, where the objective is to minimize the makespan, that is, the maximum completion time. In the decision version of this problem we ask if all jobs can be completed by some given deadline $C$, which is exactly what we proved above to be NP-hard. It also gives us NP-hardness of $1 \mid h_i, p_i = 1 \mid \sum C_j$. To prove this, we can use the decision version of this problem where we ask if there is a schedule for which the total completion time is at most $n(n - 1)/2$ (where $n$ is the number of jobs).

4 An Online Competitive Algorithm

In this section we show that there is a 2-competitive algorithm for $1 \mid \text{online-}r_i, h_i, p_i = 1 \mid \sum U_i$. We will show, in fact, that a large class of deterministic algorithms is 2-competitive.

Given a schedule, we will say that a job $j$ is pending at time $u$ if $j$ is released, not expired (that is, $r_j \le u < d_j$) and $j$ has not been scheduled before $u$. If the temperature at time $u$ is $\tau_u$ and $j$ is pending, then we call $j$ admissible if $\tau_u + h_j \le 2$, that is, $j$ is not too hot to be executed.

We say that a job $j$ dominates a job $k$ if $j$ is both not hotter and has the same or smaller deadline than $k$, that is $h_j \le h_k$ and $d_j \le d_k$. If at least one of these inequalities is strict, then we say that $j$ strictly dominates $k$. An online algorithm is called reasonable if at each step (i) it schedules a job whenever one is admissible (the non-waiting property), and, if there is one, (ii) it schedules an admissible job that is not strictly dominated by another pending job.
The class of reasonable algorithms contains, for example, the following two natural algorithms:

CoolestFirst: Always schedule a coolest admissible job (if there is any), breaking ties in favor of jobs with earlier deadlines.

EarliestDeadlineFirst: Always schedule an admissible job (if there is one) with the earliest deadline, breaking ties in favor of cooler jobs.

If two jobs have the same deadlines and heat contributions, both algorithms give preference to one of them arbitrarily.

Theorem 3 Any reasonable algorithm for $1 \mid \text{online-}r_i, h_i, p_i = 1 \mid \sum U_i$ is 2-competitive.

Proof: Let $\mathcal{A}$ be any reasonable algorithm. We fix some instance, and we compare the schedules produced by $\mathcal{A}$ and the adversary on this instance. The proof is based on a charging scheme that maps jobs executed by the adversary to jobs executed by $\mathcal{A}$ in such a way that no job in $\mathcal{A}$'s schedule gets more than two charges.

[Figure 3: Four types of charges (panels: Type-1 Charge, Type-2 Charge, Type-3 Charge). The vertical inequality signs between the schedules show the relation between the temperatures.]

We now describe this charging scheme. (See Figure 3 for illustration.) There will be three types of charges, depending on whether $\mathcal{A}$ is busy or idle at a given time step and on the relative temperatures of the schedules of $\mathcal{A}$ and the adversary. The temperature at time $u$ in the schedules of $\mathcal{A}$ and the adversary will be denoted by $\tau_u$ and $\tau'_u$, respectively.

Suppose that at some time $u$, $\mathcal{A}$ schedules a job $k$ while the adversary schedules a job $j$, or that the adversary is idle (we treat this case as if executing a job $j$ with $h_j = 0$). Then step $u$ will be called a relative-heating step if $k$ is strictly hotter than $j$, that is $h_k > h_j$.
Note that if $\tau_v > \tau'_v$ for some time $v$, then a relative-heating step must have occurred before time $v$.

Consider now a job $j$ scheduled by the adversary, say at time $v$. The charge of $j$ is defined as follows:

Type-1 Charges: If $\mathcal{A}$ schedules a job $k$ at time $v$, charge $j$ to $k$. Otherwise, $\mathcal{A}$ is idle at time $v$, and then we have two more cases.

Type-2 Charges: Suppose that $\mathcal{A}$ is hotter than the adversary at time $v$ but not at $v + 1$, that is $\tau_v > \tau'_v$ and $\tau_{v+1} \le \tau'_{v+1}$. In this case we charge $j$ to the job $q$ executed by $\mathcal{A}$ in the last relative-heating step before $v$. (As explained above, this step exists.)

Type-3 Charges: Suppose now that either $\mathcal{A}$ is not hotter than the adversary at $v$ or $\mathcal{A}$ is hotter than the adversary at $v + 1$. In other words, $\tau_v \le \tau'_v$ or $\tau_{v+1} > \tau'_{v+1}$. (Note that in the latter case we must also have $\tau_v > \tau'_v$ as well, since the algorithm is idle.)

We claim that $\tau_v + h_j \le 2$, which means that neither $j$ nor any job $\ell$ with $h_\ell \le h_j$ can be pending at $v$. To justify this, we consider the two sub-cases of the condition for type-3 charges. If $\tau_v \le \tau'_v$, the claim is trivial, since then $\tau_v + h_j \le \tau'_v + h_j \le 2$, because the adversary executes $j$ at time $v$. So assume now that $\tau_{v+1} > \tau'_{v+1}$. Since $\mathcal{A}$ is idle, we have $\tau_{v+1} \le 1/2$. Therefore $h_j = 2\tau'_{v+1} - \tau'_v \le 2\tau'_{v+1} \le 1$, and the claim follows because $\tau_v \le 1$ as well.

From the above claim, $j$ was scheduled by $\mathcal{A}$ at some time $u < v$. To find a job that we can charge $j$ to, we construct a chain of jobs $j, j', j'', \ldots, j^*$ with strictly decreasing heat contributions. Further, all jobs in this chain except $j^*$ will be executed by $\mathcal{A}$ at relative-heating steps. This chain will be determined uniquely by $j$, and we will charge $j$ to $j^*$. If, at time $u$, the adversary is idle or schedules an equal or hotter job, then $j^* = j$, that is, we charge $j$ to itself (its "copy" in $\mathcal{A}$'s schedule).
Otherwise, if the adversary schedules $j'$ at time $u$, then $j'$ is strictly cooler than $j$, that is, $h_{j'} < h_j$. Now we claim that the algorithm schedules $j'$ at some time before $v$. Indeed, if $j'$ is scheduled before $u$, we are done. Otherwise, $j'$ is pending at $u$, and, since the algorithm never schedules a dominated job, we must have $d_{j'} \ge d_j \ge v + 1$. By our earlier observation and by $h_{j'} < h_j$, if $A$ did not schedule $j'$ before $v$, then $j'$ would have been admissible at $v$, contradicting the fact that $A$ is idle at $v$. So now the chain is $j, j'$. Let $u'$ be the time when $A$ schedules $j'$. If the adversary is idle at time $u'$ or if $j'$ is not hotter than the job executed by the adversary at time $u'$, we take $j^* = j'$. Otherwise, we take $j''$ to be the job executed by the adversary at time $u'$, and so on. This process must end at some point, since the jobs in the chain are strictly cooler and cooler. So the job $j^*$ is well-defined.

This completes the description of the charging scheme. Now we show that any job scheduled by $A$ gets at most two charges. Obviously, each job in $A$'s schedule gets at most one type-1 charge. In between any two time steps that satisfy the condition of the type-2 charge there must be a relative-heating step, so the type-2 charges are assigned to distinct relative-heating steps. As for type-3 charges, every chain defining a type-3 charge is uniquely determined by its first job, and these chains are disjoint. Therefore type-3 charges are assigned to distinct jobs.

Now let $k$ be a job scheduled by $A$ at some time $v$. By the previous paragraph, $k$ can get at most one charge of each type. We claim that $k$ cannot get charges of all three types. Indeed, if $k$ receives a type-1 charge, then the adversary is not idle at time $v$, and schedules some job $\ell$. If $k$ also receives a type-2 charge, then $v$ must be a relative-heating step, that is, $h_k > h_\ell$.
But to receive a type-3 charge, $k$ would have to be the last job $j^*$ in the chain of some job $j$, and since the chain was not extended further, we must have $h_k \le h_\ell$. So type-1, type-2 and type-3 charges cannot coincide.

Summarizing the argument, every job scheduled by the adversary is charged to some job scheduled by $A$, and every job scheduled by $A$ receives no more than 2 charges. Therefore the competitive ratio of $A$ is not more than 2.

5 A Lower Bound on the Competitive Ratio

Theorem 4 Every deterministic online algorithm for $1 \mid \text{online-}r_i, h_i, p_i = 1 \mid \sum U_i$ has competitive ratio at least 2.

Proof: Fix some deterministic online algorithm $A$. We (the adversary) release a job $1 \to (0, 3, 1.2)$ (in the notation $j \to (r_j, d_j, h_j)$). If $A$ schedules it at time 0, we release at time 1 a tight job $2 \to (1, 2, 1.6)$ and schedule it at time 1, followed by job 1. $A$'s schedule is too hot at time 1 to execute job 2: its temperature is 0.6, and $(0.6 + 1.6)/2 = 1.1$ exceeds the threshold. If $A$ does not schedule job 1 at time 0, then we schedule it at time 0, and at time 2 we release and schedule a tight job $3 \to (2, 3, 1.6)$. In this case, $A$ cannot complete both jobs 1 and 3 without violating the thermal threshold. In both cases we schedule two jobs, while $A$ schedules only one, completing the proof. (See Figure 4.)

[Figure 4: The lower bound for deterministic algorithms.]

6 Final Comments

Many open problems remain. Perhaps the most intriguing one is to determine the randomized competitive ratio for the problem we studied. The proof of Theorem 4 can easily be adapted to prove a lower bound of 1.5, but we have not been able to improve the upper bound of 2; this is, in fact, the main focus of our current work on this scheduling problem.
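The two-case instance used in the proof of Theorem 4 is small enough to check exhaustively. The Python sketch below is a sanity check under stated assumptions, not part of the original proof. It assumes a threshold of 1, an initial temperature of 0, and that an idle step halves the temperature. The helper names `completed` and `best` are hypothetical, and a small tolerance `EPS` is added only to guard against floating-point rounding at the threshold.

```python
from itertools import product

T = 1.0      # assumed thermal threshold
EPS = 1e-9   # tolerance for floating-point comparisons at the threshold


def completed(schedule, jobs):
    """Number of jobs completed by a schedule, or None if it is infeasible.

    schedule[t] is a job id or None (idle) for time step t;
    jobs maps a job id to (release, deadline, heat).
    """
    temp, done = 0.0, set()
    for t, j in enumerate(schedule):
        if j is None:
            temp /= 2  # idle step
            continue
        r, d, h = jobs[j]
        temp = (temp + h) / 2
        if j in done or not (r <= t < d) or temp > T + EPS:
            return None  # repeated job, outside its window, or too hot
        done.add(j)
    return len(done)


def best(jobs, first_slot, horizon=3):
    """Maximum number of completed jobs over feasible schedules whose
    choice at time 0 satisfies the predicate first_slot."""
    choices = [None, *jobs]
    counts = (
        completed(s, jobs)
        for s in product(choices, repeat=horizon)
        if first_slot(s[0])
    )
    return max(c for c in counts if c is not None)


# Case 1: A runs job 1 at time 0, so the adversary releases job 2.
case1 = {1: (0, 3, 1.2), 2: (1, 2, 1.6)}
# Case 2: A does not run job 1 at time 0, so the adversary releases job 3.
case2 = {1: (0, 3, 1.2), 3: (2, 3, 1.6)}

print(completed([None, 2, 1], case1), best(case1, lambda j: j == 1))
print(completed([1, None, 3], case2), best(case2, lambda j: j != 1))
```

In each case the first number is the adversary's throughput on the schedule described in the proof, and the second is the best that any schedule consistent with the algorithm's choice at time 0 can achieve.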
One approach, based on Theorem 3, would be to choose at random, at the beginning of the computation, between two different reasonable algorithms $A_1$ and $A_2$, each with probability 1/2, and then deterministically execute the chosen $A_i$. So far, we have been able to show that for many natural choices of $A_1$ and $A_2$ (say, CoolestFirst and EarliestDeadlineFirst), this approach does not work.

Extensions of the cooling model can be considered, where the temperature after executing $j$ is $(\tau + h_j)/R$, for some $R > 1$. Even this formula, however, is only a discrete approximation of the true model (see, for example, [11]), and it would be interesting to see if the ideas behind our 2-competitive algorithm can be adapted to these more realistic cases.

In reality, thermal violations do not cause the system to idle, but only to reduce the frequency. With the frequency reduced to half, a unit job will execute for two time slots. Several frequency levels may be available.

We assumed that the heat contributions are known. This is counter-intuitive, but not unrealistic, since the "jobs" in our model are unit slices of longer jobs. Prediction methods are available that can quite accurately predict the heat contribution of each slice based on the heat contributions of the previous slices. Nevertheless, it may be interesting to study a model where exact heat contributions are not known.

Other types of jobs may be studied. For real-time jobs, one can consider the case when not all jobs are equally important, which can be modeled by assigning weights to jobs and maximizing the weighted throughput. For batch jobs, other objective functions can be optimized, for example the flow time.

A more realistic scenario would be to represent whole processes as jobs, rather than their slices. This naturally leads to scheduling problems with preemption and with jobs of arbitrary processing times.
When the thermal threshold is reached, the execution of a job is slowed down by a factor of 2. Here, a scheduling algorithm may decide to preempt a job when another one is released or, say, when the processor gets too hot.

Finally, in multi-core systems one can explore migrations (say, moving jobs from hotter to cooler cores) to keep the temperature under control. This leads to even more scheduling problems that may be worth studying.

References

[1] F. Bellosa, A. Weissel, M. Waitz, and S. Kellner. Event-driven energy accounting for dynamic thermal management. In Workshop on Compilers and Operating Systems for Low Power, 2003.
[2] J. Choi, C-Y. Cher, H. Franke, H. Hamann, A. Weger, and P. Bose. Thermal-aware task scheduling at the system software level. In International Symposium on Low Power Electronics and Design, pages 213–218, 2007.
[3] M.R. Garey and D.S. Johnson. Computers and Intractability: A Guide to the Theory of NP-Completeness. W.H. Freeman and Co., 1979.
[4] M. Gomaa, M. D. Powell, and T. N. Vijaykumar. Heat-and-run: leveraging SMT and CMP to manage power density through the operating system. SIGPLAN Not., 39(11):260–270, 2004.
[5] S. Irani and K. R. Pruhs. Algorithmic problems in power management. SIGACT News, 36(2):63–76, 2005.
[6] J. Donald and M. Martonosi. Techniques for multicore thermal management: Classification and new exploration. In Proceedings of the International Symposium on Computer Architecture, pages 78–88, 2006.
[7] A. Kumar, L. Shang, L-S. Peh, and N. K. Jha. HybDTM: a coordinated hardware-software approach for dynamic thermal management. In DAC '06: Proceedings of the 43rd Annual Conference on Design Automation, pages 548–553, 2006.
[8] E. Kursun, C.-Y. Cher, A. Buyuktosunoglu, and P. Bose. Investigating the effects of task scheduling on thermal behavior. In the 3rd Workshop on Temperature-Aware Computer Systems, 2006.
[9] A. Merkel and F. Bellosa. Balancing power consumption in multiprocessor systems. SIGOPS Oper. Syst. Rev., 40(4):403–414, 2006.
[10] J. Moore, J. Chase, P. Ranganathan, and R. Sharma. Making scheduling "cool": temperature-aware workload placement in data centers. In ATEC '05: Proceedings of the USENIX Annual Technical Conference, pages 5–5, 2005.
[11] J. Yang, X. Zhou, M. Chrobak, and Y. Zhang. Dynamic thermal management through task scheduling. In IEEE International Symposium on Performance Analysis of Systems and Software, 2008. To appear.