Self-* overload control for distributed web systems
Unexpected increases in demand and most of all flash crowds are considered the bane of every web application as they may cause intolerable delays or even service unavailability. Proper quality of service policies must guarantee rapid reactivity and r…
Authors: - Novella Bartolini (Department of Computer Science, University of Rome “Sapienza”, Italy) - Giancarlo Bongiovanni (Department of Computer Science
Novella Bartolini, Giancarlo Bongiovanni, Simone Silvestri
Department of Computer Science, University of Rome “Sapienza”, Italy
Email: {novella,bongiovanni,simone.silvestri}@di.uniroma1.it

Abstract. Unexpected increases in demand and most of all flash crowds are considered the bane of every web application as they may cause intolerable delays or even service unavailability. Proper quality of service policies must guarantee rapid reactivity and responsiveness even in such critical situations. Previous solutions fail to meet common performance requirements when the system has to face sudden and unpredictable surges of traffic. Indeed they often rely on a proper setting of key parameters which requires laborious manual tuning, preventing a fast adaptation of the control policies.

We contribute an original Self-* Overload Control (SOC) policy. This allows the system to self-configure a dynamic constraint on the rate of admitted sessions in order to respect service level agreements and maximize the resource utilization at the same time. Our policy does not require any prior information on the incoming traffic or manual configuration of key parameters.

We ran extensive simulations under a wide range of operating conditions, showing that SOC rapidly adapts to time varying traffic and self-optimizes the resource utilization. It admits as many new sessions as possible in observance of the agreements, even under intense workload variations. We compared our algorithm to previously proposed approaches, highlighting a more stable behavior and a better performance.

I. INTRODUCTION

Quality of Service (QoS) management for web-based applications is typically considered a problem of system sizing: enough resources are to be provisioned to meet quality of service requirements under a wide range of operating conditions.
While this approach is beneficial in making the site performance satisfactory in the most common working situations, it still leaves the site incapable of facing sudden and unexpected surges of traffic. In these situations, in fact, it is impossible to predict the intensity of the overload. The architecture in use, although over-dimensioned, may not be sufficient to meet the desired QoS. For this reason, unexpected increases of requests and most of all flash crowds are considered the bane of every internet based application, and must be addressed in terms of performance control rather than capacity sizing.

Due to the ineffectiveness of static resource over-provisioning, several alternative approaches have been proposed for overload management in web systems, such as dynamic provisioning, dynamic content adaptation, performance degradation and admission control. Most of the previously proposed works on this topic rely on laborious parameter tuning and manual configuration that impede fast adaptation of the control policies.

This work is motivated by the need to formulate a fast reactive and autonomous approach to admission control. We contribute an original Self-* Overload Control policy (SOC) which enables some fundamental self-* properties such as self-configuration, self-optimization and self-protection. In particular, the proposed system is capable of self-configuring its component level parameters according to performance requirements. At the same time it optimizes its own responsiveness and self-protects from overload.

The proposed policy is to be adopted by web cluster dispatching points (DP) and does not require any modification of the client and/or server software. DPs intercept requests and make decisions to block or accept incoming new sessions to meet the service level requirements detailed in a Service Level Agreement (SLA).
Decisions whether to accept or refuse new sessions are made on the basis of a dynamically adjusted upper limit on the admission rate. This limit is updated and kept consistent with the system capacity and time varying traffic behavior, measured by a dedicated self-learning monitor module. This module performs an autonomous and continuous measurement activity that is of primary importance if human supervision is to be avoided.

Our proposal is oriented to the management of web based traffic, and for this reason provides admission control at session granularity. Nevertheless, it does not require any prior knowledge of the incoming traffic, and can be applied to non-session based traffic as well.

Unlike previous works, our approach is rapidly adaptive, and also capable of dealing with flash crowds, which are detected as soon as they arise by a simple change detection mechanism that permits a fast adaptation of the rate of decision updates. The inter-decision time becomes increasingly shorter as traffic changes become sudden and fast, as in the presence of flash crowds. This interval is set back to longer values when the workload conditions return to normality.

Although inspired by our previous work [5], this proposal is original as it includes the anomaly detection and decision rate adaptation mechanisms necessary to perform flash crowd management. It also provides a considerably improved measurement validation system, as detailed in section IV.

We designed a synthetic traffic generator, based on the industrial standard benchmark SPECWEB2005, which we used to run simulations under a wide range of operating conditions. We compared SOC to other commonly adopted approaches, showing that it outperforms the others in terms of performance and stability even in the presence of flash crowds.
Indeed SOC does not show the typical oscillations of response time due to the over-reactive behavior of other policies. A wide range of experiments has been conducted to test the sensitivity of the proposed solution to the configuration of the few startup parameters. Experiments show that the behavior of our policy is not dependent on the initial parameter setting, while other policies achieve an acceptable performance only when perfectly tuned and in very stable scenarios.

The paper is organized as follows: in section II we formulate the problem of overload control in distributed web systems. In section III we sketch the basic actions of the proposed overload control policy. In section IV we introduce our algorithm in deeper detail. In section V we introduce some previous approaches that we compare to ours in section VI. Section VII outlines the state of the art of admission control in distributed autonomic web systems, while section VIII concludes the paper with some final remarks.

II. THE PROBLEM

We tackle the problem of admission control for web based services. In this context, the user interaction with the application typically consists of a sequence of requests forming a navigation session. As justified by [10], [9], we make the admission control work at session granularity.

Since the system should promptly react to traffic anomalies, any type of solution that requires human intervention is to be excluded. For this reason we address this problem by applying the autonomic computing [1] design paradigm.

We consider a typical multi-tier architecture [8], [22]. Each tier is composed of several replicated servers, while a front-end dispatcher hosts the admission control and dispatch module. Each request may involve execution at different depths in the tiered architecture.
This results in a differentiation of requests into several categories whose average processing times may differ significantly.

The quality of service of web applications is usually regulated by an SLA. Although our work may be applied to several formulations of SLA, when clusters of heterogeneous tiers are considered, the most appropriate formulation is the following, as we argue in [4]:

• $RT^i_{SLA}$: maximum acceptable value of the 95%-ile of the response time for requests of type $i \in \{1, 2, \ldots, K\}$, where $K$ is the number of cluster tiers.
• $\lambda_{SLA}$: minimum guaranteed admission rate. If $\lambda_{in}(t)$ is the rate of incoming sessions, and $\lambda_{adm}(t)$ is the rate of admitted sessions, this agreement imposes that $\lambda_{adm}(t) \ge \min\{\lambda_{in}(t), \lambda_{SLA}\}$.
• $T_{SLA}$: observation interval between two subsequent checks of the satisfaction of the SLA constraints.

Meeting these quality requirements under sudden traffic variations requires novel techniques that guarantee the necessary responsiveness. In such cases the respect of the agreement on response time is a challenging problem. Some other performance issues arise as well, such as the presence of oscillatory behavior, which typically affects some over-reacting policies, as we show in the experimental section VI.

III. THE IDEA

We designed SOC, a session based admission control policy that self-configures a limit on the incoming rate of new sessions. This limit corresponds to the maximum capacity of the system to sustain the incoming traffic without violating the agreements on quality. It cannot be evaluated off-line because it depends on the particular traffic rate and profile that the system has to face. Since we do not want to rely on any prior assumption on the incoming traffic, we introduce a monitor module that makes the system capable of learning its capacity to face each particular traffic profile as it arrives.
For this reason we make the system measure and learn the relationship between the rate of admitted sessions and the corresponding measure of response time. By accurately processing raw measures, the system can “learn” the maximum session admission rate that can be adopted in observance of the SLA requirements. This learning activity introduces some issues, such as how to time performance controls, how to aggregate measures and how to detect changes, that will be dealt with in detail in the following sections. We just mention that as soon as a change is detected the proposed system varies the rate of performance controls to guarantee accuracy and responsiveness at the same time.

According to our proposal the admission controller operates at the application level of the protocol stack because session information is necessary to discriminate which requests are to be accepted (namely requests belonging to already ongoing sessions), and which are to be refused (requests that imply the creation of a new session). The cluster dispatcher can discriminate between new requests and requests belonging to ongoing sessions because either a cookie or an http parameter is appended to the request. This technique ensures two important benefits: 1) the admission controller can be implemented on DPs, and does not require any modification of client and server software; 2) the dispatcher can immediately respond to non-admitted requests, sending an “I am busy” page to inform the client of the overload situation. This prevents the expiration of protocol time-outs from affecting the user perceived performance and mitigates the retrial phenomenon.

IV. SELF-* OVERLOAD CONTROL (SOC) POLICY

SOC works in two modalities, namely normal mode and flash crowd management mode, switching from one to the other according to the traffic scenario being considered.
During stable load situations the timing of performance control is regularly paced at time intervals of length $T^{SOC}_{AC}$. If a sudden change of the traffic scenario is detected, the system enters the flash crowd management modality, during which performance controls and policy updates are made more often in order to avoid a system overload.

SOC provides a probabilistic admission control mechanism which filters incoming sessions according to an adaptive rate limit $\lambda^*$. In order to properly calculate $\lambda^*$, the monitor module takes measures to analyze the relationship between the observed Response Time (RT) and the rate of admitted sessions. The value of $\lambda^*$ is then calculated as the highest rate that the site can support without violating the constraints on RT defined in the SLAs. The admission control policy varies the admission probability according to a prediction of the future workload and to the estimated value of $\lambda^*$.

    init;
    normal_mode:
    while ((t < T_AC^SOC) AND !change_detection()) {
        n = n + 1;
        for each session arrival {
            probabilistic_admission_control;
            collect_raw_measures;
        }
    } /* end while */
    if change_detection()
        goto flash_crowd_mode;
    else {
        update_stats;
        update_curve;
        update_admission_probability;
        t = 0;
        goto normal_mode;
    }

Fig. 1. Pseudo-code of SOC (normal mode)

The behavior of our policy under normal mode is described in figure 1, while figure 2 describes the flash crowd management mode. For the sake of simplicity, we leave the description of the parameter initialization (instruction init) to the end of the algorithm description, in section IV-G.

Normal Mode

At each iterative cycle $n$, the admission controller accepts new sessions with an autonomously tuned probability $p(n)$ and collects related raw measures of response time and session arrival rate (more details on these phases are given in sections IV-A and IV-B).
If no abrupt change is detected in the demand intensity, the while loop of the normal modality is repeated every $T^{SOC}_{AC}$ seconds. At the end of each cycle execution, the system processes the raw measures to calculate some statistical metrics (update_stats instruction), such as the mean session arrival rate $\lambda_{in}(n)$, the mean session admission rate $\lambda_{adm}(n)$ and the 95%-ile of response time $RT_i(n)$, $i \in \{1, 2, \ldots, K\}$. Details on the statistics update instruction are given in section IV-C.

The execution of the update_curve instruction is of primary importance to determine the autonomic behavior of our policy. The system constructs the function between the observed traffic rate $\lambda_{adm}(\cdot)$ and the corresponding response time $RT_i(\cdot)$ for the $K$ types of requests being served. In paragraph IV-D we give complete details regarding the construction of this function by means of the statistical metrics calculated in the previous update_stats instruction.

    flash_crowd_mode:
    for each session arrival {
        update_stats;  /* calculates λ_in(n), ..., and S */
        n++;
        update_admission_probability;
        probabilistic_admission_control;
        collect_raw_measures;
        measure λ_ist;
        if λ_ist < λ*
            goto normal_mode
        else
            goto flash_crowd_mode;
    }

Fig. 2. Pseudo-code of SOC (flash crowd management mode)

Before starting a new admission control cycle, the algorithm evaluates a new limit $\lambda^*(n)$ on the admission rate, and calculates the new session admission probability accordingly, as detailed in section IV-E.

While in normal mode, if a flash crowd occurs and a sudden surge in demand is detected, the system enters the flash crowd management mode. It persists in this modality as long as the traffic pattern keeps on varying significantly.
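The per-arrival step of the normal-mode loop can be illustrated with a minimal Python sketch. It assumes a hypothetical `AdmissionGate` class (the name, fields and methods are illustrative and do not come from the paper) that always lets through requests of ongoing sessions, admits new sessions with the current probability $p(n)$, and keeps the counters from which $\lambda_{in}(n)$ and $\lambda_{adm}(n)$ can later be derived.

```python
import random


class AdmissionGate:
    """Hypothetical dispatcher-side gate; all names are illustrative assumptions."""

    def __init__(self, p=1.0, rng=random.random):
        self.p = p            # current admission probability p(n)
        self.rng = rng        # callable returning a uniform sample in [0, 1)
        self.arrived = 0      # new sessions seen during the current cycle
        self.admitted = 0     # new sessions admitted during the current cycle

    def handle(self, belongs_to_ongoing_session):
        """Return True if the request is served, False if it is refused."""
        if belongs_to_ongoing_session:
            return True       # requests of ongoing sessions are always accepted
        self.arrived += 1
        if self.rng() < self.p:
            self.admitted += 1
            return True
        return False

    def close_cycle(self, cycle_length):
        """Return (lambda_in, lambda_adm) for the cycle and reset the counters."""
        rates = (self.arrived / cycle_length, self.admitted / cycle_length)
        self.arrived = self.admitted = 0
        return rates
```

Injecting the random source as a parameter is a design choice of this sketch only: it makes the gate deterministic under test, while the default draws from Python's standard generator.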
Flash Crowd Management Mode

The flash crowd management mode provides that statistical metrics are updated every time a new session arrives, thus ensuring a perfect adaptivity (update_stats instruction). Although statistical metrics are updated at each session arrival, no learning mechanism is activated in flash crowd management mode, i.e. there is no update_curve instruction, due to the high variability of the incoming traffic.

The policy returns to normal mode only when the admission probability has been properly adapted to ensure that the instantly measured session admission rate $\lambda_{ist}$ is actually below the limit $\lambda^*$. In this case we can assume the unexpected surge is under control and the policy can return to normal mode, during which performance controls are paced at a slower and regular rate.

In the following paragraphs we discuss the details of the instructions provided in figures 1 and 2.

A. Instruction probabilistic_admission_control

The purpose of this instruction is to limit the incoming rate to $\lambda^*(n)$ by means of a probabilistic admission control. New sessions will be admitted with probability $p(n)$, initially set to 1 and autonomously tuned as described in section IV-E on the basis of a forecast of the session arrival rate for the next iteration.

B. Instruction collect_raw_measures

This instruction enables the collection of raw measures of the RT of all requests belonging to the currently admitted sessions. We define $T^n_i$ as the set of raw measures of response time for requests of type $i$, $i \in \{1, 2, \ldots, K\}$, during the time interval $[t_n, t_{n+1})$.

C. Instruction update_stats

At the execution of this instruction raw measurements are processed to calculate some statistical parameters:

• $RT_i(n)$, that is the 95%-ile of the set $T^n_i$, for $i \in \{1, 2, \ldots, K\}$;
• $\lambda_{in}(n)$, that is the average incoming rate of new sessions observed during the time interval $[t_n, t_{n+1})$;
• $\lambda_{adm}(n)$, that is the average rate of admitted sessions during the time interval $[t_n, t_{n+1})$.

In order to ensure a proper system reactivity, all statistical metrics are calculated over the set $S$ composed of the last $\min\{\lfloor \lambda_{in} \cdot t \rfloor ; \lfloor \lambda^* \cdot T^{SOC}_{AC} \rfloor\}$ admitted sessions. In normal mode, this allows an early adaptation of the admission control probability to a possibly increased demand even if it has not yet triggered the change detection mechanism. In flash crowd mode this ensures that the rate limit is calculated on the basis of the smallest time window that still guarantees a sufficiently numerous set of raw measures.

D. Instruction update_curve

This instruction provides the self-learning activity of our algorithm. It allows the system to discover the function that relates the rate of admitted sessions and the RT of each tier.

The statistics collected with the update_stats instruction give the system the following information: during the time interval $[t_n, t_{n+1})$, a rate of $\lambda_{in}(n)$ new sessions reached the DP; only a rate of $\lambda_{adm}(n)$ of those sessions was actually served, and the 95%-ile of the response time for type $i$ requests was $RT_i(n)$.

A statistical metric calculated from samples of raw measures as described in paragraph IV-C, taken during a single iteration, is not reliable enough for two reasons: first, the workload is subject to variations that may cause transient effects; second, the number of samples may not be sufficient to ensure an acceptable confidence level.
The use of longer inter-observation periods may allow the collection of more numerous samples, but it is impossible to define a sufficiently long inter-observation period for any possible traffic situation, and the incoming workload may vary before a sufficiently representative set of samples is gathered. Moreover, too long an inter-observation period may lead to low responsiveness of the admission policy.

The idea at the basis of our proposal is to collect these statistics under a range of workload levels. At each algorithm iteration the DP acquires $K$ pairs $(\lambda_{adm}(n), RT_i(n))$ for $i = 1, \ldots, K$, where $RT_i(n)$ is the 95%-ile of request RT measured at the $i$-th tier.

Let us consider the set of pairs $R_i \triangleq \{(\lambda_{adm}(n), RT_i(n)),\ n \in \{0, 1, \ldots\}\}$, where $i \in \{1, 2, \ldots, K\}$, and let us partition the Cartesian plane into rectangular intervals of length $l_\lambda$ along the $\lambda_{adm}$ axis, as shown in figure 3. For every interval $[(k-1) l_\lambda ; k l_\lambda)$, with $k = 1, 2, \ldots$, we define $P^i_k = \{(\lambda_{adm}, RT_i) \mid \lambda_{adm} \in [(k-1) l_\lambda ; k l_\lambda)\}$. Then we calculate the barycenter $B^i_k = (\lambda^B_k, RT^{B_i}_k)$ of the $k$-th interval as the point with average coordinates over the set $P^i_k$. An interval has no barycenter if $P^i_k = \emptyset$.

Fig. 3. Curve set construction, regular slice barycenters (x-axis: average session arrival rate, sessions/sec; y-axis: 95%-ile RT, sec; series: 95%-ile RT, barycenters, RT curve, RT SLA)

Figure 3 shows the collected statistics taken at run-time at the database tier of an example scenario. It also points out the calculated barycenters for each interval.

Every time a new point is added to a set $P^i_k$, the monitor module updates the values of the barycenter coordinates, standard deviation and cardinality of the set being modified.
Notice that the update of these values is performed for only one set at a time (sets that have not been modified do not require statistic updates) and is incrementally calculated with respect to a synthetic statistical representation. This representation avoids the computational and storage costs that would be incurred if all the pairs had to be considered.

Barycenters calculated with a standard error higher than 20% are discarded, while the others are considered sufficiently reliable and are included in the corresponding lists $L_i$, where $i \in \{1, 2, \ldots, K\}$. The elements of these lists are ordered on the basis of the first coordinate $\lambda_{adm}$.

Since we know that the relation between $\lambda_{adm}$ and $RT_i$ is monotonically non-decreasing, we can assume that if two subsequent barycenters do not satisfy this basic monotonicity property, the corresponding slices can be aggregated to improve the measure reliability. For this reason, if $L_i$ contains two adjacent points which do not correspond to growing values of $RT_i$, the sets of statistics related to the corresponding intervals are aggregated and $L_i$ is updated until it contains a list of pairs in growing order in both coordinates, as shown in figure 4. Notice that this procedure permits a further validation of the measures, beyond the already performed test on the standard error value.

After a few aggregations, the list $L_i$ contains an ordered set of pairs which can be linearly interpolated to obtain an estimate of the function that relates $\lambda_{adm}$ and $RT_i$. Thanks to the frequent updates, this list is a highly dynamic structure that continuously adapts itself to changing workload situations. The linear interpolation of the points in $L_i$ makes it possible to forecast the response time corresponding to any possible workload rate.
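The curve construction just described (slices of width $l_\lambda$, incrementally updated barycenters, the 20% reliability filter and the aggregation of slices that break monotonicity) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `Slice`, `add_point` and `build_curve` are hypothetical, the "standard error" is read here as the standard error of the mean RT relative to that mean, slices with a single sample are treated as unreliable, and aggregated slices are not re-checked against the 20% filter.

```python
import math
from dataclasses import dataclass


@dataclass
class Slice:
    """Running sums for one interval of the lambda_adm axis (synthetic representation)."""
    n: int = 0
    sum_rate: float = 0.0
    sum_rt: float = 0.0
    sum_rt_sq: float = 0.0

    def add(self, rate, rt):
        # Incremental update: no raw pair needs to be stored.
        self.n += 1
        self.sum_rate += rate
        self.sum_rt += rt
        self.sum_rt_sq += rt * rt

    def merge(self, other):
        return Slice(self.n + other.n, self.sum_rate + other.sum_rate,
                     self.sum_rt + other.sum_rt, self.sum_rt_sq + other.sum_rt_sq)

    def barycenter(self):
        return (self.sum_rate / self.n, self.sum_rt / self.n)

    def rel_std_error(self):
        # Assumption: standard error of the mean RT, relative to the mean RT.
        if self.n < 2:
            return math.inf
        mean = self.sum_rt / self.n
        var = max(0.0, (self.sum_rt_sq - self.n * mean * mean) / (self.n - 1))
        return math.sqrt(var / self.n) / mean if mean > 0 else math.inf


def add_point(slices, rate, rt, width):
    """Add the pair (lambda_adm, RT_i) to the slice that contains its rate."""
    slices.setdefault(int(rate // width), Slice()).add(rate, rt)


def build_curve(slices, max_rel_err=0.20):
    """Return barycenters with strictly growing RT, ordered by lambda_adm."""
    reliable = [s for _, s in sorted(slices.items())
                if s.rel_std_error() <= max_rel_err]
    stack = []
    for s in reliable:
        stack.append(s)
        # Aggregate adjacent slices while their barycenters break monotonicity.
        while len(stack) >= 2 and stack[-2].barycenter()[1] >= stack[-1].barycenter()[1]:
            last = stack.pop()
            stack[-1] = stack[-1].merge(last)
    return [s.barycenter() for s in stack]
```

Because the aggregated groups cover contiguous, disjoint rate ranges, their barycenters stay ordered along the $\lambda_{adm}$ axis, so the returned list can be interpolated linearly as it is.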
Notice that the use of common regression techniques as an alternative to linear interpolation is inadvisable, because it would require a prior assumption on the type of functions being parametrized for the regression. Experiments we conducted on different traffic profiles (e.g. by using SPECWEB2005 [20] and TPC-W [21] oriented traffic generators) show that, apart from monotonicity, no other structural property is generally valid for all the possible traffic scenarios. This would make it difficult to choose the type of regression (polynomial, exponential, power law) to use.

Fig. 4. Curve set construction, aggregated slice barycenters (x-axis: average session arrival rate, sessions/sec; y-axis: 95%-ile RT, sec; series: 95%-ile RT, barycenters, RT curve, RT SLA)

E. Instruction update_admission_probability

The self-constructed set $L_i$ described in paragraph IV-D is linearly interpolated to obtain an estimate of the function $f_i(\cdot)$ that relates $\lambda_{adm}$ and the 95%-ile of response time measured at the $i$-th tier. This function is then used to evaluate the highest session admission rate $\lambda^*$ that can be adopted to remain under the response time constraints defined in the SLA.

Thanks to this estimation, the DP can configure the session admission probability according to a forecast of the incoming workload. The algorithm is based on a prediction of the session arrival rate $\hat{\lambda}_{in}(n)$ for the next iteration interval $[t_n, t_{n+1})$. It assumes that an estimate of the current session arrival rate $\hat{\lambda}_{in}(n)$ can be based on the incoming session rate $\lambda_{in}(n-1)$ observed during the previous interval $[t_{n-1}, t_n)$, that is, $\hat{\lambda}_{in}(n) = \lambda_{in}(n-1)$. The algorithm is sufficiently robust to possibly false predictions, as they will be corrected at the next iteration, making use of updated statistics.
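The evaluation of the rate limit from the interpolated curve, together with the admission probability rule $p(n) = \min\{1, \lambda^*(n) / \hat{\lambda}_{in}(n)\}$ used in this section, can be sketched in Python. This is a minimal sketch under stated assumptions: the function names are hypothetical, each curve is a list of (rate, RT) pairs strictly growing in both coordinates whose first point lies below the SLA limit, and the extrapolation for curves lying entirely below the limit uses the first and last points, which is one possible reading of "the extreme two points". The test on the validity of the previous limit, which decides whether $\lambda^*$ is recomputed at all, is left out.

```python
def rate_limit_for_tier(curve, rt_sla):
    """Rate at which the piecewise-linear curve reaches rt_sla, or None.

    Returns None when fewer than two points are available, in which case
    the caller keeps admitting every session.
    """
    if len(curve) < 2:
        return None
    if curve[0][1] > rt_sla:
        raise ValueError("first point is expected to lie below the SLA limit")
    # Look for the segment that crosses the SLA limit.
    for (x0, y0), (x1, y1) in zip(curve, curve[1:]):
        if y0 <= rt_sla <= y1:
            return x0 + (rt_sla - y0) * (x1 - x0) / (y1 - y0)
    # All points are below the limit: prolong the line through the extremes.
    (x0, y0), (x1, y1) = curve[0], curve[-1]
    return x0 + (rt_sla - y0) * (x1 - x0) / (y1 - y0)


def soc_rate_limit(curves, rt_slas):
    """lambda* as the minimum of the per-tier limits (None if none is available)."""
    limits = [rate_limit_for_tier(c, r) for c, r in zip(curves, rt_slas)]
    limits = [x for x in limits if x is not None]
    return min(limits) if limits else None


def admission_probability(lambda_star, predicted_in):
    """p = min{1, lambda* / predicted incoming rate}; 1 when nothing constrains it."""
    if lambda_star is None or predicted_in <= 0:
        return 1.0
    return min(1.0, lambda_star / predicted_in)
```

With the prediction rule of this section, `predicted_in` would simply be the incoming rate measured over the previous interval.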
New sessions will be admitted with probability $p(n) = \min\{1, \lambda^*(n) / \hat{\lambda}_{in}(n)\}$. This way, if the incoming rate of new sessions in the present time interval is the same as that observed in the previous one, the upper limit on the total incoming rate of new sessions is met.

The on-line self-tuning of the admission probability has several benefits. On the one hand, the highest possible rate of incoming sessions is admitted, optimizing the system utilization. On the other hand, it protects the system from overload by quickly reducing the admission probability as the traffic grows.

The execution of this instruction starts with a test to verify the validity of the rate limit $\lambda^*(n-1)$ adopted in the previous time interval. To this end we define two types of error in the evaluation of $\lambda^*(n-1)$:

• $error^-$: The system admitted new sessions with probability $p(n-1)$ but the admitted rate was unexpectedly greater than $\lambda^*(n-1)$. In such a situation, if the rate limit was properly estimated, some SLA limits should have been violated. In this erroneous situation, although the rate limit was exceeded, the SLA limits were not violated. The occurrence of this error reveals a possible underestimation of the rate limit $\lambda^*(n-1)$. More formally, if ($\lambda_{adm}(n-1) \ge \lambda^*(n-1)$ AND $\forall i \in \{1, 2, \ldots, K\}$ $RT_i < RT^i_{SLA}$) then $error^- = true$.
• $error^+$: The system admitted new sessions with probability $p(n-1)$ and, as expected, the admitted rate was lower than $\lambda^*(n-1)$. In such a situation, if the rate limit was properly estimated, there should not be any violation of the agreements. In this erroneous situation, although the rate limit was not exceeded, a violation of at least one of the SLA limits was observed. The occurrence of this error reveals a possible overestimation of the rate limit $\lambda^*(n-1)$.
More formally, if ($\lambda_{adm}(n-1) \le \lambda^*(n-1)$ AND $\exists i \in \{1, 2, \ldots, K\}$ s.t. $RT_i > RT^i_{SLA}$) then $error^+ = true$.

If none of these errors occurred, the upper limit on the rate of admitted sessions was properly set and there is no need to change the value of the rate limit. Therefore, in the absence of errors, $\lambda^*(n) = \lambda^*(n-1)$.

If instead one of these two types of error has occurred, the value of $\lambda^*(n-1)$ needs to be updated. To this purpose the set $L_i$ is linearly interpolated and the resulting function $f_i(\cdot)$ is inverted at the value of the SLA limit on the 95%-ile of the response time $RT^i_{SLA}$. The function $f_i(\cdot)$ crosses the line $t = RT^i_{SLA}$ in a point $P^*_i = (\lambda^*_i(n), RT^i_{SLA})$, whose first coordinate, $\lambda^*_i(n)$, is the estimated optimal session admission rate for the $i$-th tier.

To guarantee the fulfillment of the SLA on each tier, the optimal admission rate for the next round is set as follows: $\lambda^*(n) = \min_{i=1,\ldots,K} \lambda^*_i(n)$.

Notice that at startup $L_i$ may contain only one point (the benchmark point described in paragraph IV-D) or several points located below the SLA constraint. In the first case, the admission probability $p(n)$ is set to 1. In the second case the linear interpolation between the extreme two points in $L_i$ is prolonged until it crosses the SLA constraint.

F. Function change_detection()

This mechanism consists of two joint controls and triggers only if both of them give a positive result:

1) the number of sessions admitted during the current execution cycle (we call it $N$) exceeds the expectations for a single cycle, that is $(N > \lambda^* \cdot T^{SOC}_{AC})$;
2) the current admission rate exceeds the limit $\lambda^*$ by $k$ times the measured standard deviation of the admitted rate, that is $((N/t) > (\lambda^* + k \cdot \sigma_\lambda))$, where $t$ is the time elapsed from the start of the current iteration.
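The two joint controls listed above translate directly into a small predicate. The following Python sketch uses hypothetical parameter names; the guard for a zero elapsed time is an addition of this sketch, needed only to avoid a division by zero, and the value of `sigma_lambda` is assumed to be supplied by the monitor module.

```python
def change_detection(n_admitted, elapsed, lambda_star, t_ac, sigma_lambda, k):
    """Return True only if both controls of the change detection test are positive.

    n_admitted   -- N, sessions admitted since the start of the current cycle
    elapsed      -- t, time elapsed since the start of the current cycle
    lambda_star  -- current rate limit
    t_ac         -- regular cycle length in normal mode
    sigma_lambda -- measured standard deviation of the admitted rate
    k            -- multiplier applied to sigma_lambda
    """
    if elapsed <= 0:
        return False  # guard added in this sketch: no rate can be measured yet
    exceeds_cycle_budget = n_admitted > lambda_star * t_ac
    exceeds_rate_margin = (n_admitted / elapsed) > (lambda_star + k * sigma_lambda)
    return exceeds_cycle_budget and exceeds_rate_margin
```

Requiring both conditions means a short burst at the very start of a cycle does not trigger the flash crowd mode until the admitted count has also exceeded what a whole regular cycle is expected to carry.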
Notice that the value of $\sigma_\lambda$ is calculated at run-time by measuring the standard deviation of the admitted rate $\lambda_{adm}$ in situations where $\lambda_{in}$ is greater than $\lambda^*$. It measures the intensity of the inherent variability of the admitted rate $\lambda_{adm}$, which cannot be filtered by a probabilistic admission control. The pseudo-code of the change detection mechanism is described in figure 5.

    Boolean change_detection() {
        if ((N > λ* · T_AC^SOC) AND ((N/t) > (λ* + k · σ_λ)))
            return TRUE;
        else
            return FALSE;
    }

Fig. 5. Pseudo-code of the change detection mechanism

G. Instruction Init

The autonomic behavior of our algorithm makes the system capable of adapting itself to changing traffic conditions when prior knowledge of the traffic parameters is useless or even misleading. For this reason the initial setting of the system parameters is not of primary importance. As the initial setting of our algorithm we use $n = 0$, $\lambda^*(0) = \lambda_{SLA}$ and $p(n) = 1$.

As the initial setting of the curve construction phase, we insert the point $P^i_{bench} = (0, RT^i_{bench})$ in $L_i$, representing the lower bound on the 95%-ile of the response times of type $i$ requests. This point is the 95%-ile of response time measured at the $i$-th tier when the system is in a completely idle state, that is when $\lambda_{adm} \simeq 0$. In order to calculate the average response time in such a situation we use an offline benchmark, obtaining the points $P^i_{bench} = (0, RT^i_{bench})$, $i \in \{1, 2, \ldots, K\}$.

The proper setting of the points $P^i_{bench}$ with value $P^i_{bench} = (0, RT^i_{bench})$, as detailed in section IV-D, is not a key point in the algorithm, since it can be substituted with the origin $O = (0, 0)$ with no impact other than a small difference in the time to converge to a stable choice of $\lambda^*(n)$.
The use of this point in the interpolation of the curve obtained from the set $L_i$ is in fact limited to the first executions of the instruction update_curve, when too few reliable points are available.

V. OTHER ADMISSION CONTROL STRATEGIES

In this section we describe other previously proposed QoS policies to make performance comparisons. These policies can be formulated in many variants depending on the considered performance objective. We limit our analysis to the optimization of response time, which is strictly related to the user perceived quality.

A. Threshold Based Admission Control

Fixed threshold policies have been proposed in many fields of computer science, and in particular for web applications with several variants [10], [11], [6].

According to the Threshold Based Admission Control (TBAC) policy, the DP makes periodic evaluations of the 95%-ile of response time of each tier, every $T^{TBAC}_{AC}$ seconds. If there is at least one tier for which the 95%-ile of response time exceeds a threshold $RT^{TBAC}$, the DP rejects new sessions and only accepts requests that belong to ongoing sessions. On the contrary, if the value of the 95%-ile of response time at each tier is lower than $RT^{TBAC}$, all new sessions are accepted for the next $T^{TBAC}_{AC}$ seconds.

This policy, like all threshold based policies, implies a typical on/off behavior of the admission controller. This causes unacceptable oscillations of response time. Furthermore, its performance depends on a proper parameter setting (i.e. the choice of the threshold $RT^{TBAC}$ and of the period between two succeeding decisions $T^{TBAC}_{AC}$), and for this reason it cannot be used in traffic scenarios characterized by highly variable workloads.

B. Probabilistic Admission Control

Probabilistic Admission Control (PAC) is a well known technique in control theory, commonly used when oscillations are to be avoided. This policy was proposed for Internet services in [24], while a similar version was also proposed for web systems in [3].

According to this policy, a new session is admitted with a certain probability, whose value depends on the measured response time. The DP evaluates, every $T^{PAC}_{AC}$ seconds, the response time of each tier. It compares the measured response times with two thresholds, $RT^{PAC}_{low}$ and $RT^{PAC}_{high}$. The acceptance probability for the $i$-th tier is a piece-wise linear function of the measured 95%-ile of the response time $r_i$, and has the following formulation:

$$p(r_i) \triangleq \begin{cases} 1 & \text{if } r_i \le RT^{PAC}_{low} \\ \dfrac{RT^{PAC}_{high} - r_i}{RT^{PAC}_{high} - RT^{PAC}_{low}} & \text{if } RT^{PAC}_{low} < r_i \le RT^{PAC}_{high} \\ 0 & \text{if } r_i > RT^{PAC}_{high} \end{cases} \tag{1}$$

Then the session admission probability for the next round is given by: $p = \min_{i=1,\ldots,K} p(r_i)$.

Notice that the two threshold values, $RT^{PAC}_{high}$ and $RT^{PAC}_{low}$, that characterize this policy are arbitrarily set offline, independently of the observed incoming session rate and of the inter-observation period $T^{PAC}_{AC}$. Therefore, the performance of this policy is dependent on a proper tuning of these parameters, as we show in section VI.

VI. SIMULATION RESULTS

In order to make performance comparisons among the different policies and to investigate the flash crowd management capabilities of SOC, we developed a simulator on the basis of the OPNET modeler software [18].

In our experimental setting, we assume that the interarrival time of new sessions follows a negative exponential distribution. The interarrival time of requests belonging to the same session is more complex.
In order to have a realistic traffic generator, we used the phase model of an industrial standard benchmark: SPECWEB2005 [20]. We refer to [20] for a detailed description of the state model and of the functionalities of each phase.

Upon reception of a response, the next request is sent after a think time interval $T_{think}$ spent by the user analyzing the received web page. Our model of $T_{think}$ is based on TPC-W [21], [15] and on other works in the area of web traffic analysis such as [23]. As in the TPC-W model, we assume an exponential distribution of think times with a lower bound of 1 sec. Therefore $T_{think} = \max\{-\log(r)\,\mu,\ 1\}$, where $r$ is uniformly distributed in the interval $[0, 1]$ and $\mu = 10$ sec.

To model a realistic user behavior, we also introduce a timeout to represent the maximum response time tolerable by the users. After a request has been sent, if the timeout expires before the reception of the response, the client abandons the system.

We assume each phase of the session state model can be mapped onto a specific tier of a 3-tier cluster. We use an approximate estimate of the average processing times of the different tiers on the basis of the experiments detailed in [11]. We assume each session phase requires an exponentially distributed execution time set as follows: the average execution time of pure http requests is 0.001 sec, that of servlet requests is 0.01 sec, and that of database requests is 1 sec.

For the sake of brevity, we conduct our analysis on the database tier, which is the bottleneck of the architecture considered in these simulations. Thus, for simplicity, we indicate the limit on the database response time, defined in the SLA, as $RT_{SLA}$. All the experiments of this section are conducted with 20 application servers, a client timeout of 8 sec and $RT_{SLA} = 5$ sec.
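The traffic model described above can be sketched in a few lines of Python. This is a minimal illustration rather than the OPNET simulator used for the experiments; the function names are hypothetical, and drawing $r$ from $(0, 1]$ instead of $[0, 1]$ is our own choice, made only to avoid evaluating $\log(0)$.

```python
import math
import random

MU = 10.0          # mean of the exponential think time, in seconds
MIN_THINK = 1.0    # lower bound on the think time, in seconds


def think_time(rng):
    """Sample T_think = max(-log(r) * mu, 1) with r uniform in (0, 1]."""
    r = 1.0 - rng.random()  # rng.random() is in [0, 1), so r is in (0, 1]
    return max(-math.log(r) * MU, MIN_THINK)


def session_interarrival(rate, rng):
    """Sample the time between two new sessions (negative exponential)."""
    return rng.expovariate(rate)


if __name__ == "__main__":
    rng = random.Random(1)
    print([round(think_time(rng), 2) for _ in range(5)])
    print(round(session_interarrival(4.0, rng), 3))
```

Because of the 1 sec lower bound, the mean think time is slightly larger than $\mu$: it equals $1 + \mu e^{-1/\mu}$, about 10.05 sec for $\mu = 10$ sec.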
The fixed threshold $RT^{TBAC}$ of the TBAC policy is always set in agreement with the SLA constraints on the 95%-ile of database response time, therefore $RT^{TBAC} = RT_{SLA}$. The thresholds of the PAC policy are defined as follows: $RT^{PAC}_{low} = 3$ sec and $RT^{PAC}_{high} = RT_{SLA}$, in agreement with the SLA constraints.

A first set of experiments (figures 6 and 7) shows how SOC outperforms the TBAC and PAC policies in terms of both performance and stability.

Fig. 6. 95%-ile of database RT [plot: 95%-ile RT (sec) versus average session arrival rate (session/sec); curves: SOC, TBAC, PAC, $RT_{SLA}$]

Figure 6 highlights the adaptive behavior of SOC. On the one hand, when the traffic load is high, SOC finds the suitable session arrival rate and admits as many sessions as possible while remaining under the SLA limits. On the other hand, when the traffic is low, it accepts almost all incoming sessions. Unlike SOC, other non adaptive policies, such as TBAC and PAC, typically under-utilize the system resources in low workload conditions, and violate the QoS agreements when the workload is high.

SOC outperforms TBAC and PAC also in terms of stability.

Fig. 7. Oscillations of 95%-ile of database RT [plot: 95%-ile RT (sec) versus time (sec); curves: SOC, TBAC, PAC, $RT_{SLA}$]

As figure 7 points out, TBAC shows an evident oscillatory behavior due to its on/off nature, while PAC has an over-reacting behavior in many situations. SOC, instead, shows a more stable response time. The self-learning activity allows SOC to build a reliable knowledge of the system capacity with respect to the incoming traffic, which is used to derive a good and stable estimation of the optimal admission rate.
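To make the comparison concrete, the decision rules of the two baseline policies of section V can be sketched in Python with the thresholds used in these experiments. The function names are hypothetical, and the handling of a response time exactly equal to the TBAC threshold is an assumption, since the description of TBAC leaves that case open.

```python
RT_SLA = 5.0            # SLA limit on the 95%-ile of response time (sec)
RT_TBAC = RT_SLA        # TBAC threshold
RT_PAC_LOW = 3.0        # lower PAC threshold (sec)
RT_PAC_HIGH = RT_SLA    # upper PAC threshold


def tbac_accepts_new_sessions(tier_rts, threshold=RT_TBAC):
    """On/off rule: reject new sessions if any tier exceeds the threshold.

    tier_rts holds the measured 95%-ile of response time of each tier.
    Assumption: a value equal to the threshold still allows admission.
    """
    return all(rt <= threshold for rt in tier_rts)


def pac_tier_probability(r, low=RT_PAC_LOW, high=RT_PAC_HIGH):
    """Piece-wise linear acceptance probability of equation (1)."""
    if r <= low:
        return 1.0
    if r <= high:
        return (high - r) / (high - low)
    return 0.0


def pac_admission_probability(tier_rts, low=RT_PAC_LOW, high=RT_PAC_HIGH):
    """Session admission probability: the minimum over all tiers."""
    return min(pac_tier_probability(r, low, high) for r in tier_rts)


if __name__ == "__main__":
    measured = [2.0, 4.0, 3.5]  # illustrative values, not simulation results
    print(tbac_accepts_new_sessions(measured))
    print(pac_admission_probability(measured))
```

The sketch makes the difference between the two policies visible: TBAC returns only an all-or-nothing decision, while PAC degrades the admission probability linearly between its two thresholds.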
With the following experiments we want to show that although SOC is based on the off-line configuration of some parameters (in particular $T_{AC}$ and $l_\lambda$), this does not harm its autonomy. In fact the experiments detailed in figures 8, 9 and 10 show that the policy behavior is insensitive to the particular setting of those parameters. These experiments were conducted with slowly varying traffic scenarios.

In this experimental setting, the particular choice of $T_{AC}$ does not influence the policy performance. Furthermore, although small values of $T_{AC}$ may cause frequent triggers of the change detection mechanism (due to false positive results of the tests described in section IV-F), these triggers only cause more mode switches, without significant impact on performance (figure 8).

Similarly the choice of the interval size $l_\lambda$, which defines the curve construction and determines the occurrence of aggregation of measurement sample sets, does not affect SOC performance. Both response time and admission probability are stable (figures 9 and 10) even when $l_\lambda$ varies significantly.

Fig. 8. 95%-ile of database RT [plot: 95%-ile RT (sec) versus $T_{AC}$ (sec); curves: SOC, TBAC, PAC, $RT_{SLA}$]

Fig. 9. 95%-ile of database RT [plot: 95%-ile RT (sec) versus $l_\lambda$; curves: 10 Server, 20 Server, 30 Server, $RT_{SLA}$]

Fig. 10. Session admission probability [plot: session admission probability versus $l_\lambda$; curves: 10 Server, 20 Server, 30 Server]

Given the slowly varying traffic scenario that characterizes the experimental setting of the previous experiments, we did not show any performance comparison with the AACA policy that we introduced in [5].
In fact, in this scenario the performance of SOC is only marginally better than that of AACA, and the lines in the figures would have overlapped each other in many cases.

In the following experiments we studied the performance of SOC with and without activating the change detection and flash crowd management capability described in section IV. In figures 12, 13, 14 and 15 the former version is called Flash Crowd Management while the latter is called Base. The Base version is the same policy we introduced in [5] with the addition of the new monitor module detailed in paragraphs IV-C and IV-D.

Figure 11 characterizes the traffic scenario of the last set of experiments. It shows a session arrival rate that is subject to several sudden surges of growing intensity.

Fig. 11. Session arrival rate [plot: incoming session rate (sess/sec) versus time (sec)]

Figures 12 and 13 show how the flash crowd management support greatly mitigates the spikes of response time caused by the occurrence of flash crowds. These spikes are instead present in figure 13, showing that without proper flash crowd management a violation of the service level agreements is inevitable.

Fig. 12. 95%-ile RT (FCM) [plot: 95%-ile RT (sec) versus time (sec); curves: Flashcrowd Management, $RT_{SLA}$]

Figures 14 and 15 focus on the management of the flash crowd that occurs at 100000 seconds of simulation. These figures highlight the increased reactivity of SOC when using the flash crowd management support. The Base version takes almost 40 seconds to discover the occurrence of the flash crowd and consequently adapt the admission probability, while the enhanced version reacts almost immediately.
Notice the time scale difference between figures 14 and 15, and the fact that a 40 second delay in discovering the flash crowd implies that the system is in overload for almost 500 seconds. This is mostly due to the fact that the admission controller works at session granularity. Notice that premature session interruption would not solve this problem because, on the one hand, sessions are terminating anyway due to client timeout and, on the other hand, the increased session interruption rate should obviously be considered as another aspect of degraded performance.

In particular, figure 14 shows how the Base version of SOC is incapable of facing such a flash crowd, as can be seen from the high values of the 95%-ile of response time, which exceed the user time-out. This means that users are abandoning the site due to poor performance or system unavailability. On the contrary, the flash crowd management enhanced version of SOC is capable of maintaining the response time at acceptable levels by rapidly reducing the session admission probability as soon as the surge in demand is detected.

Fig. 13. 95%-ile RT (Base) [plot: 95%-ile RT (sec) versus time (sec); curves: Base, $RT_{SLA}$]

Fig. 14. 95%-ile RT [plot: 95%-ile RT (sec) versus time (sec); curves: Flashcrowd Management, Base, $RT_{SLA}$]

VII. RELATED WORK

There is an impressively growing interest in autonomic computing and self-managing systems, starting from several industrial initiatives from IBM [1], Hewlett Packard [2] and Microsoft [16]. Although self-adaptation capabilities could dramatically improve web system reactivity and overload control during flash crowds, little effort has been spent on the problem of autonomous tuning of QoS policies for web systems.
The application of the autonomic computing paradigm to the problem of overload control in web systems poses some key problems concerning the design of the monitoring module.

The authors of [19] propose a technique for learning dynamic patterns of web user behavior. A finite state machine representing the typical user behavior is constructed on the basis of past history and used for prediction and prefetching techniques. In paper [12] the problem of delay prediction is analyzed on the basis of a learning activity exploiting passive measurements of query executions. Such predictive capability is exploited to enhance traditional query optimizers. The cited proposals [12], [19] can partially contribute to improving the QoS of web systems but, differently from our work, none of them directly formulates a complete autonomic solution that at the same time gives directions on how to take measures and how to make the corresponding admission control decisions for web cluster architectures.

Fig. 15. Session admission probability [plot: session admission probability versus time (sec); curves: Flashcrowd Management, Base]

The authors of [14] also address a very important decision problem in the design of the monitoring module: the timing of performance control. They propose to adapt the time interval between successive decisions to the size of workload dependent system parameters, such as the processor queue length. The dynamic adjustment of this interval is of primary importance for threshold based policies, for which a constant time interval between decisions may lead to an oscillatory behavior in high load scenarios, as we show in Section VI. Simulations reveal that our algorithm is not subject to oscillations and shows very little dependence on the time interval between decisions.
The problem of designing adaptive component-level thresholds is analyzed in [7] for a general context of autonomic computing. The mechanism proposed in the paper consists in monitoring the threshold values in use by keeping track of false alarms with respect to possible violations of service level agreements. A regression model is used to fit the observed history. When a sufficiently confident fit is attained, the thresholds are calculated accordingly. On the contrary, if the required confidence is not attained, the thresholds are set to random values as if there was no history. A critical problem of this proposal is the fact that the most common threshold policies cause on/off behaviors that often result in unacceptable performance. Our proposal is instead based on a probabilistic approach and on a learning technique that dynamically creates a knowledge basis for the online evaluation of the best decision to make, even for traffic situations that never occurred in the past history.

The problem of autonomously configuring a computing cluster to satisfy SLA requirements is addressed in [13]. This paper is similar to ours in the design of a strategy for autonomic computing that divides the problem into different phases, called monitor, analyze, plan and execute (MAPE, according to the terminology in use by IBM [17]), in order to meet SLA requirements in terms of response time and server utilization. Unlike our work, the authors of this paper designed a policy whose decisions concern the reconfiguration of resource allocation to services.

The design of SOC is inspired by the policy AACA we introduced in a previous work [5], to which we added the anomaly detection and decision rate adaptation mechanism that is necessary to manage flash crowd situations. With respect to [5], we also largely improved the design of the monitor module, as we detail in section IV.
C O N C L U S I O N In this p aper we address the prob lem of overload contro l for web based sy stems. W e introdu ce an origin al policy , that we name SOC, that permits th e self-con figuration and rapid ad ap- ti vity . SOC exploits a cha nge detectio n mecha nim to switch between two modalities acco rding to the time variability of the in coming traf fic. When th e incomin g traffic is stable, the policy works in normal mode in which perf ormanc e controls are p aced at a regular rate. The policy switches to flash cr owd man agement mode as soo n as a rapid surge of demand is detected. It the n increases the rate of perfor mance contro ls until the incomin g traffic bec omes more stable. This per mits a fast reactio n to sudd en changes in traffic intensity , and a hig h system responsiveness. Our policy does not requ ire any prior kn owledge of the incoming traffic, n or any assumption o n the p robab ility d istri- bution of requ est in ter-arri val and service time. Unlike other propo sals in the a rea, our policy works u nder a wide ran ge of operating cond itions without the need of labo rious man ual parameter tuning. It is en tirely implemented on dispatchin g points, withou t the n eed of any modification of c lient and server software. W e compar ed our policy to previously proposed approach es. Extensive simulations show that it permits a n excellent u tiliza- tion of system resourc es while always respecting the limits on response time imp osed by service lev el a greements. W e show that our policy red uces the o scillations o f response time common to other po licies that work a t session granularity . Simulation results also high light the flash crowd man agement capabilities of SOC, sh owing h ow it rapidly adapts the admis- sion p robability to keep the overload u nder c ontrol. R E F E R E N C E S [1] Ibm: the v ision of autonomic computing. http:// www . re sear ch.ibm.com/ autonomic/manifesto . 
[2] Hewlett Packard: Adaptive enterprise design principles. http://h71028.www7.hp.com/enterprise/cache/80425-0-0-0-121.html.
[3] J. Aweya, M. Ouelette, D. Y. Montuno, B. Doray, and K. Felske. An adaptive load balancing scheme for web servers. International Journal of Network Management, 12:3–39, 2002.
[4] N. Bartolini, G. Bongiovanni, and S. Silvestri. An adaptive admission control policy for geographically distributed web system. The ACM Proceedings of the International Conference on Scalable Information Systems (INFOSCALE), 2007.
[5] N. Bartolini, G. Bongiovanni, and S. Silvestri. An autonomic admission control policy for distributed web systems. The IEEE Proceedings of the International Symposium on Modelling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS), 2007.
[6] N. Bartolini, G. Bongiovanni, and S. Silvestri. Distributed server selection and admission control in replicated web systems. The IEEE Proceedings of the 6th International Symposium on Parallel and Distributed Computing (ISPDC), 2007.
[7] D. Breitgand, E. Henis, and O. Shehory. Automated and adaptive threshold setting: enabling technology for autonomy and self-management. Proceedings of the International Conference on Autonomic Computing (ICAC), 2005.
[8] V. Cardellini, E. Casalicchio, and M. Colajanni. The state of the art in locally distributed web server systems. ACM Computing Surveys, 34(2):263–311, 2002.
[9] J. Carlstrom and R. Rom. Application aware admission control and scheduling in web servers. Proceedings of the IEEE Conference on Computer Communications (INFOCOM), 2002.
[10] L. Cherkasova and P. Phaal. Session based admission control: a mechanism for peak load management of commercial web sites. IEEE Transactions on Computers, 51(6), 2002.
[11] S. Elnikety, E. Nahum, J. Tracey, and W. Zwaenepoel. A method for transparent admission control and request scheduling in e-commerce web sites. Proceedings of the ACM World Wide Web Conference (WWW), May 2004.
[12] J.-R. Gruser, L. Raschid, V. Zadorozhny, and T. Zhan. Learning response time for websources using query feedback and application in query optimization. The International Journal on Very Large Data Bases, 9(1), March 2000.
[13] Y. Li, K. Sun, J. Qiu, and Y. Chen. Self-reconfiguration of service-based systems: a case study for service level agreements and resource optimization. Proceedings of the IEEE International Conference on Web Services (ICWS), 2005.
[14] X. Liu, R. Zheng, J. Heo, Q. Wang, and L. Sha. Timing performance control in web server systems utilizing server internal state information. Proceedings of the IEEE Joint International Conference on Autonomic and Autonomous Systems and International Conference on Networking and Services (ICAS/ICNS), 2005.
[15] D. Menasce. TPC-W: A benchmark for e-commerce. IEEE Internet Computing, May/June 2002.
[16] Microsoft: The drive to self-managing dynamic systems. http://www.microsoft.com/windowsserversystem/dsi/default.mspx.
[17] B. Miller. The autonomic computing edge: The role of knowledge in autonomic systems. http://www-128.ibm.com/developerworks/autonomic/library/ac-edge6/.
[18] OPNET Technologies Inc. http://www.opnet.com.
[19] S-Pradeep, C. Ramachandran, and S. Srinivasa. Towards autonomic web-sites based on learning automata. Proceedings of the ACM World Wide Web Conference (WWW), 2005.
[20] SPECweb2005 design document. http://www.spec.org/web2005/docs/designdocument.html.
[21] The Transaction Processing Council (TPC). TPC-W. http://www.tpc.org/tpcw.
[22] B. Urgaonkar, G. Pacifici, P. Shenoy, M. Spreitzer, and A. Tantawi. An analytical model for multi-tier internet services and its applications. IEEE Transactions on the Web, 1(1), 2007.
[23] H. Weinreich, H. Obendorf, E. Herder, and M. Mayer. Off the beaten tracks: Exploring three aspects of web navigation. Proceedings of the ACM World Wide Web Conference (WWW), 2006.
[24] Z. Xu and G. v. Bochmann. A probabilistic approach for admission control to web servers. Proceedings of the International Symposium on Performance Evaluation of Computer and Telecommunication Systems (SPECTS), July 2004.