Solving the At-Most-Once Problem with Nearly Optimal Effectiveness

Sotirios Kentros^{a,1}, Aggelos Kiayias^{a,2}

^a Computer Science and Engineering, University of Connecticut, Storrs, USA

Abstract

We present and analyze a wait-free deterministic algorithm for solving the at-most-once problem: how m shared-memory fail-prone processes perform asynchronously n jobs at most once. Our algorithmic strategy provides for the first time nearly optimal effectiveness, which is a measure that expresses the total number of jobs completed in the worst case. The effectiveness of our algorithm equals n − 2m + 2. This is up to an additive factor of m close to the known effectiveness upper bound n − m + 1 over all possible algorithms, and improves on the previously best known deterministic solutions that have effectiveness only n − log m · o(n). We also present an iterative version of our algorithm that for any m = O((n/log n)^{1/(3+ε)}) is both effectiveness-optimal and work-optimal, for any constant ε > 0. We then employ this algorithm to provide a new algorithmic solution for the Write-All problem which is work-optimal for any m = O((n/log n)^{1/(3+ε)}).

Keywords: at-most-once problem, task allocation, write-all, I/O automata, asynchronous shared memory, deterministic algorithms, distributed computing

1. Introduction

The at-most-once problem for asynchronous shared memory systems was introduced by Kentros et al. [26] as the problem of performing a set of n jobs by m fail-prone processes while maintaining at-most-once semantics.

The at-most-once semantic for object invocation ensures that an operation accessing and altering the state of an object is performed no more than once. This semantic is among the standard semantics for remote procedure calls (RPC) and method invocations, and it provides important means for reasoning about the safety of critical applications.
Uniprocessor systems may trivially provide solutions for at-most-once semantics by implementing a central schedule for operations.

Email addresses: skentros@engr.uconn.edu (Sotirios Kentros), aggelos@kiayias.com (Aggelos Kiayias)
1 Research supported in part by the State Scholarships Foundation of Greece.
2 Research supported in part by NSF awards 0831304, 0831306 and EU projects RECUP and CODAMODA.
3 NOTICE: This is the authors' version of a work that was accepted for publication in Theoretical Computer Science. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in Theoretical Computer Science, [Volume 496, 22 July 2013] DOI:10.1016/j.tcs.2013.04.017

The problem becomes very challenging for autonomous processes in a system with concurrent invocations on multiple objects. At-most-once semantics have been thoroughly studied in the context of at-most-once message delivery [8, 30, 33] and at-most-once process invocation for RPC [6, 31, 37]. However, finding effective solutions for asynchronous shared-memory multiprocessors, in terms of how many at-most-once invocations can be performed by the cooperating processes, is largely an open problem. Solutions for the at-most-once problem, using only atomic read/write memory, and without specialized hardware support such as conditional writing, provide a useful tool in reasoning about the safety properties of applications developed for a variety of multiprocessor systems, including those not supporting bus-interlocking instructions and multi-core systems.
Specifically, in recent years, attention has shifted from increasing clock speed towards chip multiprocessing, in order to increase the performance of systems. Because of the differences between multi-core systems, asynchronous shared memory is becoming an important abstraction for arguing about the safety properties of parallel applications in such systems. In the coming years, one can expect chip multiprocessing to appear in a wide range of applications, many of which will have components that need to satisfy at-most-once semantics in order to guarantee safety. Such applications may include autonomous robotic devices, robotic devices for assisted living, and automation in production lines or medical facilities. In such applications, performing specific jobs at most once may be of paramount importance for the safety of patients, the workers in a facility, or the devices themselves. Such jobs could be the triggering of a motor in a robotic arm, the activation of the X-ray gun in an X-ray machine, or supplying a dosage of medicine to a patient.

Perhaps the most important question in this area is devising algorithms for the at-most-once problem with good effectiveness. The complexity measure of effectiveness [26] describes the number of jobs completed (at most once) by an implementation, as a function of the overall number of jobs n, the number of processes m, and the number of crashes f. The only deterministic solutions known exhibit very low effectiveness (n^{1/log m} − 1)^{log m} (see [26]), which for most choices of the parameters is very far from optimal (unless m = O(1)). Contrary to this, the present work presents the first wait-free deterministic algorithm for the at-most-once problem which is optimal up to additive factors of m.
Specifically, our effectiveness is n − (2m − 2), which comes within an additive factor of m of the known upper bound n − m + 1 on the effectiveness of all possible algorithms (from [26]). We also demonstrate how to construct an algorithm which has effectiveness n − O(m^2 log n log m) and work complexity O(n + m^{3+ε} log n), and is both effectiveness- and work-optimal when m = O((n/log n)^{1/(3+ε)}), for any constant ε > 0 (work complexity counts the total number of basic operations performed by the processes). Finally, we show how to use this algorithm in order to solve the Write-All problem [23] with work complexity O(n + m^{3+ε} log n).

Related Work: A wide range of works study at-most-once semantics in a variety of settings. At-most-once message delivery [8, 30, 33, 38] and at-most-once semantics for RPC [6, 31–33, 37] are two areas that have attracted a lot of attention. Both in at-most-once message delivery and RPCs, we have two entities (sender/client and receiver/server) that communicate by message passing. Any entity may fail and recover, and messages may be delayed or lost. In the first case one wants to guarantee that duplicate messages will not be accepted by the receiver, while in the case of RPCs, one wants to guarantee that the procedure called in the remote server will be invoked at most once [37].

In Kentros et al. [26], the at-most-once problem for asynchronous shared memory systems and the correctness properties to be satisfied by any solution were defined. The first algorithms that solve the at-most-once problem were provided and analyzed.
Specifically, they presented two algorithms that solve the at-most-once problem for two processes with optimal effectiveness, and a multi-process algorithm, which employs a two-process algorithm as a building block, and solves the at-most-once problem with effectiveness n − log m · o(n) and work complexity O(n + m log m). Subsequently, Censor-Hillel [22] provided a probabilistic algorithm in the same setting with optimal effectiveness and expected work complexity O(nm^2 log m), by employing a probabilistic multi-valued consensus protocol as a building block.

Following the conference version of this paper [25], and motivated by the difficulty of implementing wait-free deterministic solutions for the at-most-once problem that are effectiveness-optimal, Kentros et al. [24] introduced the strong at-most-once problem and studied its feasibility. The strong at-most-once problem refers to the setting where effectiveness is measured only in terms of the jobs that need to be executed and the processes that took part in the computation and crashed. The strong at-most-once problem demands solutions that are adaptive, in the sense that the effectiveness depends only on the behavior of processes that participate in the execution. In this manner trivial solutions are excluded and, as demonstrated in [24], processes have to solve an agreement primitive in order to make progress and provide a solution for the problem. Kentros et al. [24] prove that the strong at-most-once problem has consensus number 2 as defined by Herlihy [21] and observe that it belongs in the Common2 class as defined by Afek et al. [1]. As a result, there exists no wait-free deterministic solution for the strong at-most-once problem in the asynchronous shared memory model, using atomic read/write registers. Kentros et al.
[24] present a randomized k-adaptive effectiveness-optimal solution for the strong at-most-once problem, with expected work complexity O(n + k^{2+ε} log n) for any small constant ε, where k is the number of processes that participate in the execution.

Di Crescenzo and Kiayias in [11] (and later Fitzi et al. [14]) demonstrate the use of the at-most-once semantic in message passing systems for the purpose of secure communication. Driven by the fundamental security requirements of one-time pad encryption, the authors partition a common random pad among multiple communicating parties. Perfect security can be achieved only if every piece of the pad is used at most once. The authors show how the parties maintain security while maximizing efficiency by applying at-most-once semantics on pad expenditure.

Drucker et al. [12] consider a distributed task allocation problem, where players that communicate using a shared blackboard or an arbitrary directed communication graph want to assign the tasks so that each task is performed exactly once. They consider synchronous execution without failures and examine the communication and round complexity required to solve the problem, providing relevant lower and upper bounds. If crashes are introduced in their model, the impossibility results from Kentros et al. [26] will apply to the at-most-once version of their problem.

Another related problem is the semi-matching problem [7, 10, 20]. The semi-matching problem, known also as the load balancing problem, has been extensively studied under various names in the network scheduling literature. Recently it has received renewed attention after a paper by Harvey et al. [20], where the name semi-matching was introduced. Semi-matching can be seen as an abstraction of the problem of matching clients with servers, each of which can process a subset of clients.
The goal is to match each client with at most one server. Clients and servers are abstracted as the vertices of a bipartite graph, and a synchronous, failure-free, message-passing model of computation is assumed, where edges represent communication links.

One can also relate the at-most-once problem to the consensus problem [13, 21, 29, 35]. Indeed, consensus can be viewed as an at-most-once distributed decision. Another related problem is process renaming, see Attiya et al. [4], where each process identifier should be assigned to at most one process.

The at-most-once problem also has many similarities with the Write-All problem for the shared memory model [3, 9, 18, 23, 28, 36]. First presented by Kanellakis and Shvartsman [23], the Write-All problem is concerned with performing each job at-least-once. Most of the solutions for the Write-All problem exhibit super-linear work even when m ≪ n. Malewicz [36] was the first to present a solution for the Write-All problem that has linear work for a non-trivial number of processors. The algorithm presented by Malewicz [36] has work O(n + m^4 log n) and uses test-and-set operations. Later, Kowalski and Shvartsman [28] presented a solution for the Write-All problem that for any constant ε has work O(n + m^{2+ε}). Their algorithm uses a collection of q permutations with contention O(q log q) for a properly chosen constant q and does not rely on test-and-set operations. Although an efficient polynomial-time construction of permutations with contention O(q polylog q) has been developed by Kowalski et al. [27], it is not known to date how to construct permutations with contention O(q log q) in polynomial time. Subsequent to the conference version of this paper [25], Alistarh et al.
[2] show that there exists a deterministic algorithm for the Write-All problem with work O(n + m log^5 n log^2 max(n, m)), by derandomizing their randomized solution for the problem. Their solution is a breakthrough in terms of bridging the gap between the Ω(n + m log m) lower bound for the Write-All problem and known deterministic solutions, but is so far existential. For a detailed overview of research on the Write-All problem, we refer the reader to the books by Georgiou and Shvartsman [15, 16].

We note that the at-most-once problem becomes much simpler when shared memory is supplemented by some type of read-modify-write operations. For example, one can associate a test-and-set bit with each job, ensuring that the job is assigned to the only process that successfully sets the shared bit. An effectiveness-optimal implementation can then be easily obtained from any Write-All solution. In this paper we deal only with the more challenging setting where algorithms use atomic read/write registers.

Contributions: We present and analyze the algorithm KKβ that solves the at-most-once problem. The algorithm is parametrized by β ≥ m and has effectiveness n − β − m + 2. If β < m the correctness of the algorithm is still guaranteed, but the termination of the algorithm cannot be guaranteed. For β = m the algorithm has optimal effectiveness of n − 2m + 2, up to an additive factor of m. Note that the upper bound for the effectiveness of any algorithm is n − f [26], where f ≤ m − 1 is the number of failures in the system. We further prove that for β ≥ 3m^2 the algorithm has work complexity O(nm log n log m). We use algorithm KKβ with β = 3m^2 in order to construct an iterated version of our algorithm which, for any constant ε > 0, has effectiveness n − O(m^2 log n log m) and work complexity O(n + m^{3+ε} log n).
This is both effectiveness-optimal and work-optimal for any m = O((n/log n)^{1/(3+ε)}). We note that our solutions are deterministic and assume worst-case behavior. In the probabilistic setting, Censor-Hillel [22] and Kentros et al. [24] show that optimal effectiveness can be achieved with expected work complexity O(nm^2 log m) and O(n + m^{2+ε} log n), for any small constant ε, respectively.

We then demonstrate how to use the iterated version of our algorithm in order to solve the Write-All problem with work complexity O(n + m^{3+ε} log n) for any constant ε > 0. Our solution improves on the algorithm of Malewicz [36], which solves the Write-All problem for a non-trivial number of processes with optimal (linear) work complexity, in two ways. First, our solution is work-optimal for a wider range of choices for m, namely for any m = O((n/log n)^{1/(3+ε)}), cf. the restriction m = O((n/log n)^{1/4}) of Malewicz [36]. Second, our solution does not assume the test-and-set primitive used by Malewicz and relies only on atomic read/write memory. There is also a Write-All algorithm due to Kowalski and Shvartsman [28], which does not use test-and-set operations and is work-optimal for a wider range of processors m than our algorithm, specifically for m = O(n^{1/(2+ε)}). However, their algorithm uses a collection of q permutations with contention O(q log q), and it is not known to date how to construct such permutations in polynomial time (see the discussion in the related work section). Finally, subsequent to the conference version of this paper [25], Alistarh et al. [2] show that there exists a deterministic algorithm for the Write-All problem with work O(n + m log^5 n log^2 max(n, m)). Their solution is so far existential, while ours is explicit.

Outline: In Section 2 we formalize the model and introduce definitions and notation used in the paper.
In Section 3 we present the algorithm KKβ. In Sections 4 and 5 we analyze the correctness, effectiveness and work complexity of algorithm KKβ. In Section 6 we present and analyze the iterative algorithm IterativeKK(ε). In Section 7 we present and analyze the iterative algorithm WAIterativeKK(ε) for the Write-All problem. Finally, we conclude with Section 8.

2. Model, Definitions, and Efficiency

We define our model, the at-most-once problem, and measures of efficiency.

2.1. Model and Adversary

We model a multi-processor as m asynchronous, crash-prone processes with unique identifiers from some set P. Shared memory is modeled as a collection of atomic read/write memory cells, where the number of bits in each cell is explicitly defined. We use the Input/Output Automata formalism [34, 35] to specify and reason about algorithms; specifically, we use the asynchronous shared memory automaton formalization [17, 35]. Each process p is defined in terms of its states states_p and its actions acts_p, where each action is of the type input, output, or internal. A subset start_p ⊆ states_p contains all the start states of p. Each shared variable x takes values from a set V_x, among which there is init_x, the initial value of x. We model an algorithm A as a composition of the automata for each process p. Automaton A consists of a set of states states(A), where each state s contains a state s_p ∈ states_p for each p, and a value v ∈ V_x for each shared variable x. The set of start states start(A) is a subset of states(A), where each state contains a start state start_p for each p and init_x for each x. The actions of A, acts(A), consist of the actions π ∈ acts_p for each process p. A transition is the modification of the state as a result of an action and is represented by a triple (s, π, s′), where s, s′ ∈ states(A) and π ∈ acts(A).
State s is called the enabling state of action π. The set of all transitions is denoted by trans(A). Each action in acts(A) is performed by a process; thus for any transition (s, π, s′), s and s′ may differ only with respect to the state s_p of the process p that invoked π, and potentially the value of the shared variable that p interacts with during π. We also use triples ({vars_s}, π, {vars_{s′}}), where vars_s and vars_{s′} are subsets of variables in s and s′ respectively, as a shorthand to describe transitions without having to specify s and s′ completely; here vars_s and vars_{s′} contain only the variables whose value changes as the result of π, plus possibly some other variables of interest.

An execution fragment of A is either a finite sequence, s_0, π_1, s_1, ..., π_r, s_r, or an infinite sequence, s_0, π_1, s_1, ..., π_r, s_r, ..., of alternating states and actions, where (s_k, π_{k+1}, s_{k+1}) ∈ trans(A) for any k ≥ 0. If s_0 ∈ start(A), then the sequence is called an execution. The set of executions of A is execs(A). We say that execution α is fair if α is finite and its last state is a state of A where no locally controlled action is enabled, or if α is infinite and every locally controlled action π ∈ acts(A) is performed infinitely many times or there are infinitely many states in α where π is disabled. The set of fair executions of A is fairexecs(A). An execution fragment α′ extends a finite execution fragment α of A if α′ begins with the last state of α. We let α · α′ stand for the execution fragment resulting from concatenating α and α′ and removing the (duplicated) first state of α′.

For two states s and s′ of an execution fragment α, we say that state s precedes state s′, and we write s < s′, if s appears before s′ in α.
Moreover, we write s ≤ s′ if state s either precedes state s′ in α or the states s and s′ are the same state of α. We use the term precedes and the symbols < and ≤ in the same way for the actions of an execution fragment. We also use the term precedes and the symbol < if an action π appears before a state s in an execution fragment α, or if a state s appears before an action π in α. Finally, for a set of states S of an execution fragment α, we define as s_max = max S the state s_max ∈ S such that ∀s ∈ S, s ≤ s_max in α.

We model process crashes by an action stop_p in acts(A) for each process p. If stop_p appears in an execution α, then no actions π ∈ acts_p appear in α thereafter. We then say that process p crashed. Actions stop_p arrive from some unspecified external environment, called an adversary. In this work we consider an omniscient, on-line adversary [23] that has complete knowledge of the algorithm executed by the processes. The adversary controls asynchrony and crashes. We allow up to f < m crashes. We denote by fairexecs_f(A) all fair executions of A with at most f crashes. Note that since the processes can only communicate through atomic read/write operations in the shared memory, all the asynchronous executions are linearizable. This means that concurrent actions can be mapped to an equivalent sequence of state transitions, where only one process performs an action in each transition, and thus the model presented above is appropriate for the analysis of a multi-process asynchronous atomic read/write shared memory system.

2.2. At-Most-Once Problem, Effectiveness and Complexity

Let A be an algorithm specified for m processes with ids from set P = [1 ... m], and for n jobs with unique ids from set J = [1 ... n]. We assume that there are at least as many jobs as there are processes, i.e., n ≥ m.
We model the performance of job j by process p by means of the action do_{p,j}. For a sequence c, we let len(c) denote its length, and we let c|π denote the sequence of the elements π occurring in c. Then for an execution α, len(α|do_{p,j}) is the number of times process p performs job j. Finally, we denote by F_α = {p | stop_p occurs in α} the set of crashed processes in execution α. Now we define the number of jobs performed in an execution. Note here that we are borrowing most definitions from Kentros et al. [26].

Definition 2.1. For execution α let J_α = {j ∈ J | do_{p,j} occurs in α for some p ∈ P}. The total number of jobs performed in α is defined to be Do(α) = |J_α|.

We next define the at-most-once problem.

Definition 2.2. Algorithm A solves the at-most-once problem if for each execution α of A we have ∀j ∈ J: Σ_{p∈P} len(α|do_{p,j}) ≤ 1.

Definition 2.3. Let S be a set of elements with unique identifiers. We define as the rank of element x ∈ S, written [x]_S, the rank of x if we sort the elements of S in ascending order according to their identifiers.

Measures of Efficiency. We analyze our algorithms in terms of two complexity measures: effectiveness and work. Effectiveness counts the number of jobs performed by an algorithm in the worst case.

Definition 2.4. E_A(n, m, f) = min_{α ∈ fairexecs_f(A)}(Do(α)) is the effectiveness of algorithm A, where m is the number of processes, n is the number of jobs, and f is the number of crashes.

A trivial algorithm can solve the at-most-once problem by splitting the n jobs in groups of size n/m and assigning one group to each process. Such a solution has effectiveness E(n, m, f) = (m − f) · n/m (consider an execution where f processes fail at the beginning of the execution).
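The worst-case behavior of this trivial partition strategy is easy to check by simulation. The sketch below (our own illustration, not from the paper; all names are ours) assigns each process a contiguous group of n/m jobs and counts the jobs performed when a set of processes crashes before doing any work:

```python
# Sketch (not from the paper): the trivial static-partition strategy.
# Each process p (0-indexed here) owns jobs [p*(n//m), (p+1)*(n//m));
# a process crashing at the start performs none of its group, so the
# adversary erases n/m jobs per crash.

def trivial_effectiveness(n, m, crashed):
    """Jobs performed when the processes in `crashed` fail at the start.
    Assumes m divides n for simplicity."""
    group = n // m
    performed = set()
    for p in range(m):
        if p in crashed:
            continue  # a crashed process performs none of its jobs
        for j in range(p * group, (p + 1) * group):
            performed.add(j)  # each job belongs to exactly one process
    return len(performed)

# With n = 12 jobs, m = 4 processes and f = 1 crash, effectiveness is
# (m - f) * n/m = 3 * 3 = 9.
```

This matches the bound E(n, m, f) = (m − f) · n/m stated above, and illustrates why static assignment is far from the n − f upper bound when f is small but n/m is large.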
Work complexity measures the total number of basic operations (comparisons, additions, multiplications, shared memory reads and writes) performed by an algorithm. We assume that each internal or shared memory cell has size O(log n) bits and that performing operations involving a constant number of memory cells costs O(1). This is consistent with the way work complexity is measured in previous related work [23, 28, 36].

Definition 2.5. The work of algorithm A, denoted by W_A, is the worst case total number of basic operations performed by all the processes of algorithm A.

Finally, we repeat here as a theorem Corollary 1 from Kentros et al. [26], which gives an upper bound on the effectiveness of any algorithm solving the at-most-once problem.

Theorem 2.1 (from Kentros et al. [26]). For all algorithms A that solve the at-most-once problem with m processes and n ≥ m jobs in the presence of f < m crashes, it holds that E_A(n, m, f) ≤ n − f.

3. Algorithm KKβ

We present algorithm KKβ, which solves the at-most-once problem. Parameter β ∈ N is the termination parameter of the algorithm. Algorithm KKβ is defined for all β ≥ m. If β = m, algorithm KKβ has effectiveness optimal up to an additive factor of m. Note that although β ≥ m is not necessary in order to prove the correctness of the algorithm, if β < m we cannot guarantee termination of algorithm KKβ.

Shared Variables:
  next = {next_1, ..., next_m}, next_q ∈ {0, ..., n}, initially 0
  done = {done_{1,1}, ..., done_{m,n}}, done_{q,i} ∈ {0, ..., n}, initially 0

Signature:
  Input: stop_p, p ∈ P
  Output: do_{p,j}, p ∈ P, j ∈ J
  Internal: compNext_p, p ∈ P; check_p, p ∈ P
  Internal Read: gatherTry_p, p ∈ P; gatherDone_p, p ∈ P
  Internal Write: setNext_p, p ∈ P; done_p, p ∈ P

State:
  STATUS_p ∈ {comp_next, set_next, gather_try, gather_done, check, do, done, end, stop}, initially STATUS_p = comp_next
  FREE_p, DONE_p, TRY_p ⊆ J, initially FREE_p = J and DONE_p = TRY_p = ∅
  POS_p = {POS_p(1), ..., POS_p(m)}, where POS_p(i) ∈ {1, ..., n}, initially POS_p(i) = 1
  NEXT_p ∈ {1, ..., n}, initially undefined
  TMP_p ∈ {0, ..., n}, initially undefined
  Q_p ∈ {1, ..., m}, initially 1

Figure 1: Algorithm KKβ: Shared Variables, Signature and States

The idea behind the algorithm KKβ (see Figs. 1, 2) is quite intuitive and is based on an algorithm for renaming processes presented by Attiya et al. [4]. Each process p picks a job i to perform, announces (by writing in shared memory) that it is about to perform the job, and then checks if it is safe to perform it (by reading the announcements other processes made in the shared memory, and the jobs other processes announced they have performed). If it is safe to perform the job i, process p will proceed with the do_{p,i} action and then mark the job completed. If it is not safe to perform i, p will release the job. In either case, p picks a new job to perform. In order to pick a new job, p reads from the shared memory and gathers information on which jobs are safe to perform, by reading the announcements that other processes made in the shared memory about the jobs they are about to perform, and the jobs other processes announced they have already performed. Assuming that those jobs are ordered, p splits the set of "free" jobs in m intervals and picks the first job of the interval with rank equal to p's rank.
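The announce/check cycle described above can be illustrated with a deliberately sequential sketch (our own simplification, not the paper's wait-free I/O automaton; processes are 0-indexed and all names are ours). Each process writes its candidate to a shared `next` array, gathers the other processes' announcements (TRY) and completed jobs (DONE), and performs the candidate only if it appears in neither set:

```python
# Sketch (our sequential simplification of the announce/check idea).
# next_ann[q] holds the job process q is about to perform (0 = none);
# done_log[q] holds the jobs process q has announced as performed.

n, m = 10, 3
next_ann = [0] * m
done_log = [[] for _ in range(m)]

def gather(p):
    """Collect the other processes' announced (TRY) and done (DONE) jobs."""
    trty = {next_ann[q] for q in range(m) if q != p and next_ann[q] > 0}
    done = {j for q in range(m) if q != p for j in done_log[q]}
    return trty, done

def step(p, candidate):
    """Announce `candidate` (a job id in 1..n), then perform it iff safe."""
    next_ann[p] = candidate
    trty, done = gather(p)
    if candidate not in trty and candidate not in done:
        done_log[p].append(candidate)  # the do action, then mark completed
        return True
    return False  # unsafe: release the job and pick another

# Run sequentially: process 0 performs job 1; process 1 then sees job 1
# announced and done by process 0, so it must release it.
assert step(0, 1) is True
assert step(1, 1) is False
```

Because the demo runs one full cycle at a time, at-most-once holds trivially here; the point of the actual algorithm (and of its correctness proof in the paper) is that the announce-before-check order preserves the property under arbitrary interleaving of the individual reads and writes.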
Note that since the information needed in order to decide whether it is safe to perform a specific job and in order to pick the next job to perform is the same, these steps are combined in the algorithm. In Figure 2, we use the function rank(SET1, SET2, i), which returns the element of set SET1 \ SET2 that has rank i. If SET1 and SET2 have O(n) elements and are stored in some tree structure like a red-black tree or some variant of a B-tree, the operation rank(SET1, SET2, i) costs O(|SET2| log n), assuming that SET2 ⊆ SET1. We will prove that algorithm KKβ has effectiveness n − (β + m − 2). For β = O(m) this effectiveness is asymptotically optimal for any m = o(n). Note that by Theorem 2.1 the upper bound on the effectiveness of the at-most-once problem is n − f, where f is the number of failed processes in the system.

Next we present algorithm KKβ in more detail.

Shared Variables. next is an array with m elements. In the cell next_q of the array, process q announces the job it is about to perform. From the structure of algorithm KKβ, only process q writes in cell next_q. On the other hand, any process may read cell next_q.

done is an m × n matrix. In line q of the matrix, process q announces the jobs it has performed. Each cell of line q contains the identifier of exactly one job that has been performed by process q. Only process q writes in the cells of line q, but any process may read them. Moreover, process q updates line q by adding entries at the end of it.

Internal Variables of process p.
The variable S TA T U S p records the statu s of pr ocess p and defines its next actio n as follows: S TA T U S p = comp next - proce ss p is rea dy to compu te the next job to perform (this is th e initial status of p ), S TA T U S p = set next - p comp uted the next jo b to perform and is r eady to announce it by writing in th e shared memory , S TA T U S p = g ather tr y - p reads th e ar ray next in shared memory in or der to c ompute the TR Y p set, S TA T U S p = g athe r done - p reads th e matrix done in share d m emory in order to u pdate the DONE p and FREE p sets, S TA T U S p = check - p h as to check whether it is safe to per form its current jo b, S TA T U S p = do - p can safely pe rform its current job, S TA T U S p = done - p perf ormed its current job an d need s to u pdate the sh ared mem ory , S TA T U S p = e nd - p termina ted, S TA T U S p = stop - p crashed. FREE p , DONE p , TR Y p ⊆ J ar e thr ee sets that are u sed b y p rocess p in order to comp ute the next job to perfor m and whether it is safe to perform it. W e u se some tree stru cture like r ed- black tr ee o r some variant of B-tr ee [5, 1 9] for th e sets FREE p , DONE p and TR Y p , in ord er to be able to add, remove and search elements in them with O(log n ) work. FREE p , is initially set to J and contains an estimate of th e jobs th at are still av ailab le. DONE p is in itially em pty and contains an estimate of the jobs that h av e b een p erform ed. No job is r emoved from DO NE p or added to FREE p during the execution of algorithm KK β . TR Y p is initially empty and contain s an estimate of the job s that other p rocesses are abo ut to perf orm. I t holds that | TR Y p | < m , since there are m − 1 p rocesses apart from process p tha t may be attempting to perfor m a job . P O S p is an array of m elem ents. 
Position POS_p(q) of the array contains a pointer in the line q of the shared matrix done. POS_p(q) is the element of line q that process p will read from. In the special case where q = p, POS_p(p) is the element of line p that process p will write into after performing a new job.

Transitions of process p:

  Input stop_p
    Effect:
      STATUS_p ← stop

  Internal compNext_p
    Precondition: STATUS_p = comp_next
    Effect:
      if |FREE_p \ TRY_p| ≥ β then
        TMP_p ← (|FREE_p| − (m − 1)) / m
        if TMP_p ≥ 1 then
          TMP_p ← ⌊(p − 1) · TMP_p⌋ + 1
          NEXT_p ← rank(FREE_p, TRY_p, TMP_p)
        else
          NEXT_p ← rank(FREE_p, TRY_p, p)
        end
        Q_p ← 1
        TRY_p ← ∅
        STATUS_p ← set_next
      else
        STATUS_p ← end
      end

  Internal Write setNext_p
    Precondition: STATUS_p = set_next
    Effect:
      next_p ← NEXT_p
      STATUS_p ← gather_try

  Internal Read gatherTry_p
    Precondition: STATUS_p = gather_try
    Effect:
      if Q_p ≠ p then
        TMP_p ← next_{Q_p}
        if TMP_p > 0 then
          TRY_p ← TRY_p ∪ {TMP_p}
        end
      end
      if Q_p + 1 ≤ m then
        Q_p ← Q_p + 1
      else
        Q_p ← 1
        STATUS_p ← gather_done
      end

  Internal Read gatherDone_p
    Precondition: STATUS_p = gather_done
    Effect:
      if Q_p ≠ p then
        TMP_p ← done_{Q_p, POS_p(Q_p)}
        if POS_p(Q_p) ≤ n AND TMP_p > 0 then
          DONE_p ← DONE_p ∪ {TMP_p}
          FREE_p ← FREE_p \ {TMP_p}
          POS_p(Q_p) ← POS_p(Q_p) + 1
        else
          Q_p ← Q_p + 1
        end
      else
        Q_p ← Q_p + 1
      end
      if Q_p > m then
        Q_p ← 1
        STATUS_p ← check
      end

  Internal check_p
    Precondition: STATUS_p = check
    Effect:
      if NEXT_p ∉ TRY_p AND NEXT_p ∉ DONE_p then
        STATUS_p ← do
      else
        STATUS_p ← comp_next
      end

  Output do_{p,j}
    Precondition: STATUS_p = do, NEXT_p = j
    Effect:
      STATUS_p ← done

  Internal Write done_p
    Precondition: STATUS_p = done
    Effect:
      done_{p, POS_p(p)} ← NEXT_p
      DONE_p ← DONE_p ∪ {NEXT_p}
      FREE_p ← FREE_p \ {NEXT_p}
      POS_p(p) ← POS_p(p) + 1
      STATUS_p ← comp_next

Figure 2: Algorithm KK_β: Transitions
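For intuition, the candidate-selection rule of compNext_p and the rank operation it relies on can be sketched in Python as follows. This is our own illustrative rendering of the pseudocode above, not code from the paper; the naive rank below scans the whole set in O(n) time, whereas the O(|SET2| log n) bound quoted earlier requires rank-augmented (order-statistic) search trees.

```python
def rank(set1_sorted, set2, i):
    """Element of SET1 \\ SET2 with rank i (1-based); assumes SET2 is a subset of SET1.

    Naive linear scan over SET1 (given as a sorted list); an order-statistic
    tree would achieve the O(|SET2| log n) bound quoted in the text.
    """
    remaining = [x for x in set1_sorted if x not in set2]
    return remaining[i - 1]

def comp_next(p, m, beta, free_sorted, try_set):
    """Candidate job chosen by process p in compNext_p, or None if p
    would terminate because |FREE_p \\ TRY_p| < beta."""
    if len([x for x in free_sorted if x not in try_set]) < beta:
        return None                        # STATUS_p <- end
    shrunk = len(free_sorted) - (m - 1)    # numerator of TMP_p
    if shrunk >= m:                        # i.e. TMP_p = shrunk / m >= 1
        i = ((p - 1) * shrunk) // m + 1    # floor((p - 1) * TMP_p) + 1
    else:
        i = p
    return rank(free_sorted, try_set, i)
```

For example, with n = 12 jobs, m = 3 processes and empty TRY sets, TMP = (12 − 2)/3, so processes 1, 2 and 3 pick the jobs of rank 1, 4 and 7 in FREE \ TRY, spreading their candidates apart.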
The elements of the shared matrix done are read when process p is updating the DONE_p set. NEXT_p contains the job process p is attempting to perform. TMP_p is a temporary storage for values read from the shared memory. Q_p ∈ {1, …, m} is used as an index for looping through process identifiers.

Actions of process p. We visit them one by one below.

compNext_p: Process p computes the set FREE_p \ TRY_p and, if it has at least β elements, where β is the termination parameter of the algorithm, process p computes its next candidate job by splitting the FREE_p \ TRY_p set in m parts and picking the first element of the p-th part. In order to do that it uses the function rank(SET1, SET2, i), which returns the element of set SET1 \ SET2 with rank i. Finally process p sets the TRY_p set to the empty set, the Q_p internal variable to 1 and its status to set_next, in order to update the shared memory with its new candidate job. If the FREE_p \ TRY_p set has fewer than β elements, process p terminates.

setNext_p: Process p announces its new candidate job by writing the contents of its NEXT_p internal variable in the p-th position of the next array. Remember that the next array is stored in shared memory. Process p changes its status to gather_try, in order to start collecting the TRY_p set from the next array.

gatherTry_p: With this action process p implements a loop which reads from the shared memory all the positions of the array next and updates the TRY_p set. In each execution of the action, process p checks whether Q_p is equal to p. If it is not equal, p reads the Q_p-th position of the array next, checks whether the value read is greater than 0 and, if it is, adds the value it read to the TRY_p set. If Q_p is equal to p, p just skips the step described above. Then p checks whether the value Q_p + 1 is less than m + 1.
If it is, then p increases Q_p by 1 and leaves its status gather_try; otherwise p has finished updating the TRY_p set and thus sets Q_p to 1 and changes its status to gather_done, in order to update the DONE_p and FREE_p sets from the contents of the done matrix.

gatherDone_p: With this action process p implements a loop which updates the DONE_p and FREE_p sets with values read from the matrix done, which is stored in shared memory. In each execution of the action, process p checks whether Q_p is equal to p. If it is not equal, p uses the internal variable POS_p(Q_p) in order to read fresh values from the line Q_p of the done matrix. In detail, p reads the shared variable done_{Q_p, POS_p(Q_p)}, checks whether POS_p(Q_p) is less than n + 1 and whether the value read is greater than 0. If both conditions hold, p adds the value read to the DONE_p set, removes the value read from the FREE_p set and increases POS_p(Q_p) by one. Otherwise, either process Q_p has terminated (by performing all the n jobs) or the line Q_p does not contain any new completed jobs; in either case p increases the value of Q_p by 1. The value of Q_p is increased by 1 also if Q_p was equal to p. Finally p checks whether Q_p is greater than m; if it is, p has completed the loop and thus changes its status to check.

check_p: Process p checks whether it is safe to perform its current job. This is done by checking whether NEXT_p belongs to the set TRY_p or to the set DONE_p. If it does not, then it is safe to perform the job NEXT_p and p changes its status to do. Otherwise it is not safe, and thus p changes its status to comp_next, in order to find a new job that may be safe to perform.

do_{p,j}: Process p performs job j. Note that NEXT_p = j is part of the preconditions for the action to be enabled in a state. Then p changes its status to done.
done_p: Process p writes the value of NEXT_p in the position done_{p, POS_p(p)} of the shared memory, letting other processes know that it performed job NEXT_p. Also p adds NEXT_p to its DONE_p set, removes NEXT_p from its FREE_p set, increases POS_p(p) by 1 and changes its status to comp_next.

stop_p: Process p crashes by setting its status to stop.

4. Correctness and Effectiveness Analysis

We begin the analysis of algorithm KK_β by showing in Lemma 4.1 that KK_β solves the at-most-once problem; that is, there exists no execution of KK_β in which two distinct actions do_{p,i} and do_{q,i} appear for some i ∈ J and p, q ∈ P. We continue the analysis by showing in Theorem 4.4 that algorithm KK_β has effectiveness E_{KK_β}(n, m, f) = n − (β + m − 2). This is done in two steps. First, in Lemma 4.2, we show that algorithm KK_β cannot terminate its execution if fewer than n − (β + m − 1) jobs are performed. The effectiveness analysis is completed by showing in Lemma 4.3 that the algorithm is wait-free (it has no infinite fair executions). In Theorem 4.4 we combine the two lemmas in order to show that the effectiveness of algorithm KK_β is greater than or equal to n − (β + m − 2). Moreover, we show the existence of an adversarial strategy that results in a terminating execution where n − (β + m − 2) jobs are completed, showing that the bound is tight.

In the analysis that follows, for a state s and a process p we denote by s.FREE_p, s.DONE_p, s.TRY_p the values of the internal variables FREE, DONE and TRY of process p in state s. Moreover, with s.next and s.done we denote the contents of the array next and the matrix done in state s. Remember that next and done are stored in shared memory.

Lemma 4.1. There exists no execution α of algorithm KK_β such that ∃ i ∈ J and ∃ p, q ∈ P for which do_{p,i}, do_{q,i} ∈ α.

Proof.
Let us, for the sake of contradiction, assume that there exists an execution α ∈ execs(KK_β) and i ∈ J and p, q ∈ P such that do_{p,i}, do_{q,i} ∈ α. We examine two cases.

Case 1, p = q: Let states s_1, s′_1, s_2, s′_2 ∈ α, such that the transitions (s_1, do_{p,i}, s′_1), (s_2, do_{p,i}, s′_2) ∈ α and, without loss of generality, assume s′_1 ≤ s_2 in α. From Figure 2 we have that s′_1.NEXT_p = i, s′_1.STATUS_p = done and s_2.NEXT_p = i, s_2.STATUS_p = do. From algorithm KK_β, state s_2 must be preceded by a transition (s_3, check_p, s′_3), such that s_3.NEXT_p = i and s′_3.NEXT_p = i, s′_3.STATUS_p = do, where s′_1 precedes s_3 in α. Finally s_3 must be preceded in α by a transition (s_4, done_p, s′_4), where s′_1 precedes s_4, such that s_4.NEXT_p = i and i ∈ s′_4.DONE_p. Since s′_4 precedes s_3 and during the execution of KK_β no elements are removed from DONE_p, we have that i ∈ s_3.DONE_p. This is a contradiction, since the transition ({NEXT_p = i, i ∈ DONE_p}, check_p, {NEXT_p = i, STATUS_p = do}) ∉ trans(KK_β).

Case 2, p ≠ q: Given transition (s_1, do_{p,i}, s′_1) in execution α, we deduce from Fig. 2 that there exist in α transitions (s_2, setNext_p, s′_2), (s_3, gatherTry_p, s′_3), (s_4, check_p, s′_4), where s′_2.next_p = s′_2.NEXT_p = i, s_3.next_p = s_3.NEXT_p = i, s_3.Q_p = q, s_4.NEXT_p = i, s′_4.NEXT_p = i, s′_4.STATUS_p = do, such that s_2 < s_3 < s_4 < s_1, and there exists no action π = compNext_p in execution α such that s_2 < π < s′_1. Similarly, for transition (t_1, do_{q,i}, t′_1) there exist in execution α transitions (t_2, setNext_q, t′_2), (t_3, gatherTry_q, t′_3), (t_4, check_q, t′_4), where t′_2.next_q = t′_2.NEXT_q = i, t_3.next_q = t_3.NEXT_q = i, t_3.Q_q = p, t_4.NEXT_q = i, t′_4.NEXT_q = i, t′_4.STATUS_q = do, such that t_2 < t_3 < t_4 < t_1, and there exists no action π′ = compNext_q in execution α such that t_2 < π′ < t′_1. Either s_2 < t_3, or t_3 < s_2, which implies t_2 < s_3. We will show that if s_2 < t_3 then do_{q,i} cannot take place, leading to a contradiction. The case where t_2 < s_3 is symmetric and is omitted.

Let us assume that s_2 precedes t_3. We have two cases: either t_3.next_p = i or t_3.next_p ≠ i. In the first case i ∈ t′_3.TRY_q. The only action in which entries are removed from the TRY_q set is action compNext_q, where the TRY_q set is reset to ∅. Thus i ∈ t_4.TRY_q, since ∄ π′ = compNext_q ∈ α such that t_2 < π′ < t_1. This is a contradiction, since (t_4, check_q, t′_4) ∉ trans(KK_β) if i ∈ t_4.TRY_q, t_4.NEXT_q = i and t′_4.STATUS_q = do.

If t_3.next_p ≠ i, since (s_2, setNext_p, s′_2) ∈ α and s′_2 < t_3, there exists an action π_1 = setNext_p ∈ α such that s′_2 < π_1 < t_3. Moreover, there exists an action π_2 = compNext_p in α such that s′_2 < π_2 < π_1. Since ∄ π = compNext_p ∈ α such that s_2 < π < s′_1, it holds that s′_1 < π_2 < π_1 < t_3. Furthermore, from Fig. 2 there exists a transition (s_5, done_p, s′_5) in α and j ∈ {1, …, n}, such that s_5.POS_p(p) = j, s_5.done_{p,j} = 0, s_5.NEXT_p = i, s′_5.done_{p,j} = i and s′_1 < s′_5 < π_2 < t_3. It must be the case that i ∉ t_2.DONE_q, since t_2.NEXT_q = i. From that and from Fig. 2 we have that there exists a transition (t_6, gatherDone_q, t′_6) in α, such that t_6.Q_q = p, t_6.POS_q(p) = j and t_3 < t_6 < t_4. Since s′_5 < t_3 and, from algorithm KK_β, done_{p,j} cannot be changed again in execution α, we have that t_6.done_{p,j} = i and as a result i ∈ t′_6.DONE_q.
Moreover, during the execution of algorithm KK_β, entries in the set DONE_q are only added and never removed; thus we have that i ∈ t_4.DONE_q. This is a contradiction, since (t_4, check_q, t′_4) ∉ trans(KK_β) if i ∈ t_4.DONE_q, t_4.NEXT_q = i and t′_4.STATUS_q = do. This completes the proof.

Next we examine the effectiveness of the algorithm. First we show that algorithm KK_β cannot terminate its execution if fewer than n − (β + m − 1) jobs are performed.

Lemma 4.2. For any β ≥ m, f ≤ m − 1 and for any finite execution α ∈ execs(KK_β) with Do(α) ≤ n − (β + m − 1), there exists a (non-empty) execution fragment α′ such that α · α′ ∈ execs(KK_β).

Proof. From the algorithm KK_β, we have that for any process p and any state s ∈ α, |s.FREE_p| ≥ n − Do(α) and |s.TRY_p| ≤ m − 1. The first inequality holds since the s.FREE_p set is estimated by p by examining the done matrix, which is stored in shared memory. From algorithm KK_β, a job j is only inserted in line q of the matrix done if a do_{q,j} action has already been performed by process q. The second inequality is obvious. Thus we have that ∀ p ∈ P and ∀ s ∈ α, |s.FREE_p \ s.TRY_p| ≥ n − (Do(α) + m − 1). If Do(α) ≤ n − (β + m − 1), then ∀ p ∈ P and ∀ s ∈ α we have that |s.FREE_p \ s.TRY_p| ≥ β. Since there can be f ≤ m − 1 failed processes in our system, at the final state s′ of execution α there exists at least one process p ∈ P that has not failed. This process has not terminated, since from Fig. 2 a process p can only terminate if in the enabling state s of action compNext_p, |s.FREE_p \ s.TRY_p| < β. This process can continue executing steps, and thus there exists a (non-empty) execution fragment α′ such that α · α′ ∈ execs(KK_β).
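The counting step of this proof can be checked with a small helper (ours, for illustration only): since |FREE_p| ≥ n − Do(α) and |TRY_p| ≤ m − 1, we get |FREE_p \ TRY_p| ≥ n − Do(α) − (m − 1), which stays at least β as long as Do(α) ≤ n − (β + m − 1).

```python
def free_minus_try_lower_bound(n, m, done_count):
    """Lower bound on |FREE_p \\ TRY_p| used in the proof of Lemma 4.2:
    |FREE_p| >= n - Do(alpha) and |TRY_p| <= m - 1."""
    return n - done_count - (m - 1)

def may_terminate(n, m, beta, done_count):
    """A process can only reach status `end` once the bound drops below beta."""
    return free_minus_try_lower_bound(n, m, done_count) < beta
```

For example, with n = 100, m = 5, β = 5 and Do(α) = 91 = n − (β + m − 1), the bound is exactly 5 = β, so no process can terminate yet; one more completed job makes termination possible.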
Since no finite execution of algorithm KK_β can terminate if fewer than n − (β + m − 1) jobs are performed, Lemma 4.2 implies that if the algorithm KK_β has effectiveness less than or equal to n − (β + m − 1), there must exist some infinite fair execution α with Do(α) ≤ n − (β + m − 1). Next we prove that algorithm KK_β is wait-free (it has no infinite fair executions).

Lemma 4.3. For any β ≥ m, f ≤ m − 1 there exists no infinite fair execution α ∈ execs(KK_β).

Proof. We will prove this by contradiction. Let β ≥ m and let α ∈ execs(KK_β) be an infinite fair execution with f ≤ m − 1 failures, and let Do(α) be the jobs executed by execution α according to Definition 2.1. Since α ∈ execs(KK_β) and, from Lemma 4.1, KK_β solves the at-most-once problem, Do(α) is finite. Clearly there exists at least one process in execution α that has not crashed and does not terminate (some process must take steps in α in order for it to be infinite). Since Do(α) and f are finite, there exists a state s_0 in α such that after s_0 no process crashes, no process terminates, no do action takes place in α and no process adds new entries in the done matrix in shared memory. The latter holds since the execution is infinite and fair and Do(α) is finite; consequently any non-failed process q that has not terminated will eventually update the line q of the done matrix to be in agreement with the do_{q,∗} actions it has performed. Moreover, any process q that has terminated has already updated the line q of the done matrix with the latest do action it performed before it terminated, since in order to terminate it must have reached a compNext action that set its status to end. We define the following sets of processes and jobs according to state s_0. J_α are the jobs that have been performed in α according to Definition 2.1. P_α are the processes that do not crash and do not terminate in α.
By the way we defined state s_0, only processes in P_α take steps in α after state s_0. STUCK_α = {i ∈ J \ J_α | ∃ failed process p : s_0.next_p = i}, i.e., STUCK_α expresses the set of jobs that are held by failed processes. DONE_α = {i ∈ J_α | ∃ p ∈ P and j ∈ {1, …, n} : s_0.done_p(j) = i}, i.e., DONE_α expresses the set of jobs that have been performed before state s_0 and whose performing processes managed to update the shared memory. Finally we define POOL_α = J \ (J_α ∪ STUCK_α).

After state s_0, all processes in P_α keep executing. This means that whenever a process p ∈ P_α takes action compNext_p in α, the first if statement is true. Specifically, it holds that ∀ p ∈ P_α and for all the enabling states s ≥ s_0 of actions compNext_p in α, |FREE_p \ TRY_p| ≥ β. From Figure 2 we have that for any p ∈ P_α, ∃ s_p ∈ α such that s_p > s_0 and for all states s ≥ s_p, s.DONE_p = DONE_α, s.FREE_p = J \ DONE_α and s.FREE_p \ s.TRY_p ⊆ POOL_α. Let s′_0 = max_{p ∈ P_α}[s_p]. From the above we have: |J \ DONE_α| ≥ β ≥ m and |POOL_α| ≥ β ≥ m, since ∀ s′ ≥ s′_0 we have that s′.FREE_p = J \ DONE_α and s′.FREE_p \ s′.TRY_p ⊆ POOL_α, and ∀ p ∈ P_α and for all the enabling states s ≥ s′_0 of actions compNext_p in α we have that |FREE_p \ TRY_p| ≥ β.

Let p_0 be the process with the smallest process identifier in P_α. We examine 2 cases according to the size of J \ DONE_α.

Case A, |J \ DONE_α| ≥ 2m − 1: Let x_0 ∈ POOL_α be the job such that [x_0]_{POOL_α} = ⌊(p_0 − 1) · (|J \ DONE_α| − (m − 1)) / m⌋ + 1. Such an x_0 exists since ∀ p ∈ P_α and ∀ s ≥ s′_0 it holds that s.FREE_p \ s.TRY_p ⊆ POOL_α and s.FREE_p = J \ DONE_α, from which we have that |POOL_α| ≥ |J \ DONE_α| − |s.TRY_p| ≥ |J \ DONE_α| − (m − 1) ≥ m.
It follows that any p ∈ P_α that executes action compNext_p after state s′_0 will have its NEXT_p variable pointing to a job x with [x]_{POOL_α} ≥ ⌊(p − 1) · (|J \ DONE_α| − (m − 1)) / m⌋ + 1. Thus ∀ p ∈ P_α, ∃ s′_p ≥ s′_0 in α such that for all states s ≥ s′_p, [s.next_p]_{POOL_α} ≥ ⌊(p − 1) · (|J \ DONE_α| − (m − 1)) / m⌋ + 1. Let s′′_0 = max_{p ∈ P_α}[s′_p]; we have 2 cases for p_0:

Case A.1) After s′′_0, process p_0 executes action compNext_{p_0} and the transition leads to a state s_1 > s′′_0 such that s_1.NEXT_{p_0} = x_0. Since [x_0]_{POOL_α} = ⌊(p_0 − 1) · (|J \ DONE_α| − (m − 1)) / m⌋ + 1 and p_0 = min_{p ∈ P_α}[p], from the previous discussion we have that ∀ s ≥ s_1 and ∀ p ∈ P \ {p_0}, s.next_p ≠ x_0. Thus when p_0 executes action check_p of Fig. 2 for the first time after state s_1, the condition will be true, so in some subsequent transition p_0 will have to execute action do_{p_0,x_0}, performing job x_0, which is a contradiction, since after state s_0 no jobs are executed.

Case A.2) After s′′_0, process p_0 executes action compNext_{p_0} and the transition leads to a state s_1 > s′′_0 such that s_1.NEXT_{p_0} > x_0. Since p_0 = min_{p ∈ P_α}[p], it holds that ∀ x ∈ POOL_α such that [x]_{POOL_α} ≤ ⌊(p_0 − 1) · (|J \ DONE_α| − (m − 1)) / m⌋ + 1, ∄ p ∈ P such that s_1.next_p = x. Let the transition (s_2, compNext_{p_0}, s′_2) ∈ α, where s_2 > s_1, be the first time that action compNext_{p_0} is executed after state s_1. We have that ∀ x ∈ POOL_α such that [x]_{POOL_α} ≤ ⌊(p_0 − 1) · (|J \ DONE_α| − (m − 1)) / m⌋ + 1, x ∉ s_2.DONE_{p_0} ∪ s_2.TRY_{p_0}, since from the discussion above we have that ∀ s ≥ s_1 and ∀ p ∈ P_α \ {p_0}, [s.next_p]_{POOL_α} ≥ ⌊(p − 1) · (|J \ DONE_α| − (m − 1)) / m⌋ + 1. Thus [x_0]_{s_2.FREE_{p_0} \ s_2.TRY_{p_0}} = [x_0]_{POOL_α} = ⌊(p_0 − 1) · (|J \ DONE_α| − (m − 1)) / m⌋ + 1. As a result, s′_2.NEXT_{p_0} = x_0.
With similar arguments as in case A.1, we can see that job x_0 will be performed by process p_0, which is a contradiction, since after state s_0 no jobs are executed.

Case B, |J \ DONE_α| < 2m − 1: Let x_0 ∈ POOL_α be the job such that [x_0]_{POOL_α} = p_0. Such an x_0 exists since β ≥ m and |POOL_α| ≥ β. It follows that any p ∈ P_α that executes action compNext_p after state s′_0 will have its NEXT_p variable pointing to a job x with [x]_{POOL_α} ≥ p. Thus ∀ p ∈ P_α, ∃ s′_p ≥ s′_0 in α such that for all states s ≥ s′_p, [s.next_p]_{POOL_α} ≥ p. Let s′′_0 = max_{p ∈ P_α}[s′_p]; we have 2 cases for p_0:

Case B.1) After s′′_0, process p_0 executes action compNext_{p_0} and the transition leads to a state s_1 > s′′_0 such that s_1.NEXT_{p_0} = x_0. Since [x_0]_{POOL_α} = p_0 and p_0 = min_{p ∈ P_α}[p], from the previous discussion we have that ∀ s ≥ s_1 and ∀ p ∈ P \ {p_0}, s.next_p ≠ x_0. Thus when p_0 executes action check_p of Fig. 2 for the first time after state s_1, the condition will be true, so in some subsequent transition p_0 will have to execute action do_{p_0,x_0}, performing job x_0, which is a contradiction, since after state s_0 no jobs are executed.

Case B.2) After s′′_0, process p_0 executes action compNext_{p_0} and the transition leads to a state s_1 > s′′_0 such that s_1.NEXT_{p_0} > x_0. Since p_0 = min_{p ∈ P_α}[p], it holds that ∀ x ∈ POOL_α such that [x]_{POOL_α} ≤ p_0, ∄ p ∈ P such that s_1.next_p = x. Let the transition (s_2, compNext_{p_0}, s′_2) ∈ α, where s_2 > s_1, be the first time that action compNext_{p_0} is executed after state s_1. We have that ∀ x ∈ POOL_α such that [x]_{POOL_α} ≤ p_0, x ∉ s_2.DONE_{p_0} ∪ s_2.TRY_{p_0}, since from the discussion above we have that ∀ s ≥ s_1 and ∀ p ∈ P_α \ {p_0}, [s.next_p]_{POOL_α} ≥ p. Thus [x_0]_{s_2.FREE_{p_0} \ s_2.TRY_{p_0}} = [x_0]_{POOL_α} = p_0. As a result, s′_2.NEXT_{p_0} = x_0.
With similar arguments as in case B.1, we can see that job x_0 will be performed by process p_0, which is a contradiction, since after state s_0 no jobs are executed.

We combine the last two lemmas in order to show the main result on the effectiveness of algorithm KK_β.

Theorem 4.4. For any β ≥ m, f ≤ m − 1 algorithm KK_β has effectiveness E_{KK_β}(n, m, f) = n − (β + m − 2).

Proof. From Lemma 4.2 we have that any finite execution α ∈ execs(KK_β) with Do(α) ≤ n − (β + m − 1) can be extended, essentially proving that in such executions no process has terminated. Moreover, from Lemma 4.3 we have that KK_β is wait-free, and thus there exists no infinite fair execution α ∈ execs(KK_β) such that Do(α) ≤ n − (β + m − 1). Since finite fair executions are executions where all non-failed processes have terminated, from the above we have that E_{KK_β}(n, m, f) ≥ n − (β + m − 2). If all processes but the process with id m fail in an execution α in such a way that J_α ∩ STUCK_α = ∅ and |STUCK_α| = m − 1 (where STUCK_α is defined as in the proof of Lemma 4.3), it is easy to see that there exists an adversarial strategy such that, when process m terminates, β + m − 2 jobs have not been performed. Such an execution is a finite fair execution where n − (β + m − 2) jobs are performed. Thus we have that E_{KK_β}(n, m, f) = n − (β + m − 2).

5. Work Complexity Analysis

In this section we are going to prove that for β ≥ 3m² algorithm KK_β has work complexity O(nm log n log m). The main idea of the proof is to demonstrate that, under the assumption β ≥ 3m², process collisions on a job cannot accrue without progress being made in the algorithm.
In order to prove that, we first demonstrate in Lemma 5.1 that if two different processes p, q set their NEXT_p, NEXT_q internal variables to the same job i in some compNext actions, then the DONE_p and DONE_q sets of the processes have at least |q − p| · m different elements, given that β ≥ 3m². Next we prove in Lemma 5.4 that if two processes p, q collide three consecutive times while trying to perform some jobs, the size of the set DONE_p ∪ DONE_q that processes p and q know will increase by at least |q − p| · m elements. This essentially tells us that every three collisions between the same two processes a significant number of jobs has been performed, and thus enough progress has been made. In order to prove the above statement, we formally define what we mean by a collision in Definition 5.2, and tie such a collision to some specific state — the state at which the collision is detected — so that we have a fixed "point of reference" in the execution; and we show in Lemmas 5.2 and 5.3 that the order in which collisions are detected in an execution is consistent with the order in which the involved processes attempt to perform the respective jobs. Finally we use Lemma 5.4 in order to prove in Lemma 5.5 that a process p cannot collide with a process q more than 2⌈n / (m · |q − p|)⌉ times in any execution. This is proven by contradiction, showing that if process p collides with process q more than 2⌈n / (m · |q − p|)⌉ times, there exist states for which the set DONE_p ∪ DONE_q has more than n elements, which is impossible. Lemma 5.5 is used in order to prove the main result on the work complexity of algorithm KK_β for β ≥ 3m², Theorem 5.6. We obtain Theorem 5.6 by counting the total number of collisions that can happen and the cost of each collision.

We start by defining the notion of immediate predecessor transition for a state s in an execution α.
The immediate predecessor is the last transition of a specific action type that precedes state s in the execution. This is particularly useful in uniquely identifying the transition with action compNext_p in an execution that last set a NEXT_p internal variable to a specific value, given a state s of interest.

Definition 5.1. We say that transition (s_1, π_1, s′_1) is an immediate predecessor of state s_2 in an execution α ∈ execs(KK_β), and we write (s_1, π_1, s′_1) ↦ s_2, if s′_1 < s_2 and in the execution fragment α′ that begins with state s′_1 and ends with state s_2 there exists no action π_3 = π_1.

Next we define what a collision between two processes means. We say that process p collided with process q in job i at state s, if process p attempted to perform job i but was not able to, because it detected in state s that either process q was trying to perform job i or process q had already performed job i.

Definition 5.2. In an execution α ∈ execs(KK_β), we say that process p collided with process q in job i at state s, if (i) there exist in α transitions (s_1, compNext_p, s′_1), (t_1, compNext_q, t′_1) and (s_2, check_p, s′_2), where (s_1, compNext_p, s′_1) ↦ s_2, t_1 < s_2 and s′_1.NEXT_p = t′_1.NEXT_q = s_2.NEXT_p = i, s′_1.STATUS_p = t′_1.STATUS_q = set_next, s′_2.STATUS_p = comp_next, and (ii) in the execution fragment α′ = s′_1, …, s_2 there exists either a transition (s, gatherTry_p, s′) such that s.Q_p = q, s.next_q = i, or a transition (s, gatherDone_p, s′) and j ∈ {1, …, n} such that s.Q_p = q, s.POS_p(q) = j, s.done_{q,j} = i and i ∉ s.TRY_p.

Definition 5.3. In an execution α ∈ execs(KK_β), we say that processes p, q collide in job i at state s, if process p collided with process q or process q collided with process p in job i at state s, according to Definition 5.2.
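For intuition about the counting behind Lemma 5.5 and Theorem 5.6, as sketched in the overview above, summing the per-pair collision bound 2⌈n / (m · |q − p|)⌉ over all ordered pairs of processes gives a harmonic sum, i.e. O(n log m) collisions in total. The helper below is ours, for illustration only; it is not part of the paper.

```python
import math

def total_collision_bound(n, m):
    """Sum the Lemma 5.5 bound 2*ceil(n / (m*|q-p|)) over all ordered
    pairs p != q; each inner sum over the distances |q-p| is a harmonic
    series, so the total is O(n log m) plus lower-order rounding slack."""
    total = 0
    for p in range(1, m + 1):
        for q in range(1, m + 1):
            if p != q:
                total += 2 * math.ceil(n / (m * abs(q - p)))
    return total
```

For example, total_collision_bound(10, 2) sums the two ordered pairs at distance 1, each contributing 2·⌈10/2⌉ = 10, for a total of 20; for a single process (m = 1) there are no pairs and the bound is 0.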
Next we show that if two processes p, q decide, with some compNext actions, to perform the same job i, then their DONE sets at the enabling states of those compNext actions differ in at least |q − p| · m elements.

Lemma 5.1. If β ≥ 3m² and in an execution α ∈ execs(KK_β) there exist states s_1, t_1 and processes p, q ∈ P with p < q such that s_1.NEXT_p = t_1.NEXT_q = i ∈ J, then there exist transitions (s_2, compNext_p, s′_2) ↦ s_1 and (t_2, compNext_q, t′_2) ↦ t_1, where s′_2.NEXT_p = t′_2.NEXT_q = i, s′_2.STATUS_p = t′_2.STATUS_q = set_next, and |s_2.DONE_p \ t_2.DONE_q| > (q − p) · m or |t_2.DONE_q \ s_2.DONE_p| > (q − p) · m.

Proof. We will prove this by contradiction. From algorithm KK_β there must exist transitions (s_2, compNext_p, s′_2) ↦ s_1 and (t_2, compNext_q, t′_2) ↦ t_1, where s′_2.NEXT_p = i and t′_2.NEXT_q = i, if there exist s_1, t_1 ∈ α and p, q ∈ P with p < q such that s_1.NEXT_p = t_1.NEXT_q = i ∈ J, since those are the transitions that set NEXT_p and NEXT_q to i. In order to get a contradiction we assume that |s_2.DONE_p \ t_2.DONE_q| ≤ (q − p) · m and |t_2.DONE_q \ s_2.DONE_p| ≤ (q − p) · m. We will prove that if this is the case, then s′_2.NEXT_p ≠ t′_2.NEXT_q.

Let A = J \ s_2.DONE_p = s_2.FREE_p and B = J \ t_2.DONE_q = t_2.FREE_q; thus from the contradiction assumption we have that |A \ B| ≤ (q − p) · m and |B \ A| ≤ (q − p) · m. It could either be that |A| < |B| or |A| ≥ |B|.

Case 1, |A| < |B|: From the contradiction assumption we have that |B \ A| ≤ (q − p) · m. Since s_2.FREE_p \ s_2.TRY_p can have up to m − 1 fewer elements than A — the elements of the set s_2.TRY_p — and it can be the case that s_2.TRY_p ∩ t_2.TRY_q = ∅, we have:

|(t_2.FREE_q \ t_2.TRY_q) \ (s_2.FREE_p \ s_2.TRY_p)| ≤ m(q − p) + m − 1    (1)

Moreover, since s_2.FREE_p \ s_2.TRY_p ⊆ A and |s_2.FREE_p \ s_2.TRY_p| ≥ β ≥ 3m², we have |A| ≥ 3m². Similarly |B| ≥ 3m². We have:

(q − 1)|B| / m = (p − 1)|B| / m + (q − p)|B| / m > (p − 1)|A| / m + (q − p)|B| / m
⇒ (q − 1)|B| / m > (p − 1)|A| / m + 3m(q − p)
⇒ (q − 1)|B| / m > (p − 1)|A| / m + (3m − 1)(q − p) + (q − p)
⇒ (q − 1)|B| / m > (p − 1)|A| / m + (3m − 1)(q − p) + (q − p)(m − 1) / m
⇒ ⌊(q − 1)(|B| − (m − 1)) / m⌋ + 1 ≥ ⌊(p − 1)(|A| − (m − 1)) / m⌋ + 1 + (3m − 1)(q − p)    (2)

Since s′_2.NEXT_p = t′_2.NEXT_q = i, we have:

[i]_{s_2.FREE_p \ s_2.TRY_p} = ⌊(p − 1)(|A| − (m − 1)) / m⌋ + 1
[i]_{t_2.FREE_q \ t_2.TRY_q} = ⌊(q − 1)(|B| − (m − 1)) / m⌋ + 1

Equation (2) becomes:

[i]_{t_2.FREE_q \ t_2.TRY_q} ≥ [i]_{s_2.FREE_p \ s_2.TRY_p} + (3m − 1)(q − p)

Thus the set t_2.FREE_q \ t_2.TRY_q must have at least (3m − 1)(q − p) more elements with rank less than the rank of i than the set s_2.FREE_p \ s_2.TRY_p does. This is a contradiction, since from eq. (1) we have that:

|(t_2.FREE_q \ t_2.TRY_q) \ (s_2.FREE_p \ s_2.TRY_p)| ≤ m(q − p) + m − 1

Case 2, |B| ≤ |A|: We have that |A \ B| ≤ (q − p) · m and |B \ A| ≤ (q − p) · m from the contradiction assumption. Since s_2.FREE_p \ s_2.TRY_p can have up to m − 1 fewer elements than A — the elements of the set s_2.TRY_p — and it can be the case that s_2.TRY_p ∩ t_2.TRY_q = ∅, we have:

|(t_2.FREE_q \ t_2.TRY_q) \ (s_2.FREE_p \ s_2.TRY_p)| ≤ m(q − p) + m − 1    (3)

From the contradiction assumption and the Case 2 assumption we have that |B| ≤ |A| ≤ |B| + (q − p) · m. Moreover |A| ≥ β ≥ 3m² and |B| ≥ β ≥ 3m².
We have:

(q − 1)(|B| + (q − p)m) / m = (p − 1)(|B| + (q − p)m) / m + (q − p)(|B| + (q − p)m) / m
  ≥ (p − 1)|A| / m + (q − p)(|B| + (q − p)m) / m ≥ (p − 1)|A| / m + 3m(q − p) + (q − p)²
⇒ (q − 1)|B| / m ≥ (p − 1)|A| / m + 3m(q − p) + (q − p)² − (q − 1)(q − p)
⇒ (q − 1)|B| / m ≥ (p − 1)|A| / m + (3m − p + 1)(q − p)
⇒ (q − 1)|B| / m ≥ (p − 1)|A| / m + (2m + 2)(q − p)
⇒ (q − 1)|B| / m ≥ (p − 1)|A| / m + (2m + 1)(q − p) + (q − p)(m − 1) / m
⇒ ⌊(q − 1)(|B| − (m − 1)) / m⌋ + 1 ≥ ⌊(p − 1)(|A| − (m − 1)) / m⌋ + 1 + (2m + 1)(q − p)    (4)

Since s′_2.NEXT_p = t′_2.NEXT_q = i, we have:

[i]_{s_2.FREE_p \ s_2.TRY_p} = ⌊(p − 1)(|A| − (m − 1)) / m⌋ + 1
[i]_{t_2.FREE_q \ t_2.TRY_q} = ⌊(q − 1)(|B| − (m − 1)) / m⌋ + 1

Equation (4) becomes:

[i]_{t_2.FREE_q \ t_2.TRY_q} ≥ [i]_{s_2.FREE_p \ s_2.TRY_p} + (2m + 1)(q − p)

Thus the set t_2.FREE_q \ t_2.TRY_q must have at least (2m + 1)(q − p) more elements with rank less than the rank of i than the set s_2.FREE_p \ s_2.TRY_p does. This is a contradiction, since from eq. (3) we have that:

|(t_2.FREE_q \ t_2.TRY_q) \ (s_2.FREE_p \ s_2.TRY_p)| ≤ m(q − p) + m − 1

Next we show that if a process p detects consecutive collisions with process q, the processes p, q attempted to perform the jobs associated with the collisions in the same order, and the order in which process p detects the collisions according to Definition 5.2 is the same as the order in which processes p, q attempted to perform the jobs. In the proofs that follow, for a state s in execution α we define as s.DONE the following set: s.DONE = {i ∈ J | ∃ p ∈ P and j ∈ {1, …, n} : s.done_p(j) = i}.

Lemma 5.2.
In an execution α ∈ execs(KK_β) for any β ≥ m, if there exist processes p, q, jobs i_1, i_2 ∈ J and states s̃_1 < s̃_2 such that process p collided with process q in job i_1 at state s̃_1 and in job i_2 at state s̃_2 according to Definition 5.2, then there exist transitions (s_1, compNext_p, s′_1) ↦ s̃_1, (s_2, compNext_p, s′_2) ↦ s̃_2 and (t_1, compNext_q, t′_1), (t_2, compNext_q, t′_2), where s′_1.NEXT_p = t′_1.NEXT_q = i_1, s′_2.NEXT_p = t′_2.NEXT_q = i_2 and s′_1.STATUS_p = s′_2.STATUS_p = t′_1.STATUS_q = t′_2.STATUS_q = set_next, such that s_1 < s_2 and t_1 < t_2.

Proof. From Definition 5.2 we have that there exist transitions (s_1, compNext_p, s′_1), (s_2, compNext_p, s′_2) with s′_1.NEXT_p = i_1, s′_2.NEXT_p = i_2, s′_1.STATUS_p = s′_2.STATUS_p = set_next, and there exists no action π_1 = compNext_p for which s_1 < π_1 < s̃_1 or s_2 < π_1 < s̃_2. From the latter and the fact that s̃_1 < s̃_2, it must be the case that s_1 < s̃_1 < s_2 < s̃_2. Furthermore, from Definition 5.2 we have that there exist transitions (t_1, compNext_q, t′_1), (t_2, compNext_q, t′_2) with t′_1.NEXT_q = i_1, t′_2.NEXT_q = i_2, t′_1.STATUS_q = t′_2.STATUS_q = set_next, such that t′_1 < s̃_1 and t′_2 < s̃_2. We can pick these transitions in α in such a way that there exists no other transition between t′_1 and s̃_1 that sets NEXT_q to i_1 and, similarly, no other transition between t′_2 and s̃_2 that sets NEXT_q to i_2.

We now need to prove that t_1 < t_2. We will prove this by contradiction. Let t_2 < t_1. Since t′_1 < s̃_1, we have that t_2 < t_1 < t′_1 < s̃_1 < s_2 < s̃_2. Since from Definition 5.2 either s̃_1.next_q = i_1 or there exists j ∈ {1, …, n} such that s̃_1.done_{q,j} = i_1, it must be the case that s̃_2.STATUS_p = gather_done, s̃_2.Q_p = q and there exists j′ ∈ {1, …, n} such that s̃_2.done_{q,j′} = i_2. Essentially, it must be the case that process q performed job i_2 after the transition (t_2, compNext_q, t′_2). This means that there exists a transition (t_3, done_q, t′_3) and j′ ∈ {1, …, n} such that t′_3.done_{q,j′} = i_2 and t_2 < t′_3 < t_1 < t′_1 < s̃_1 < s_2 < s̃_2.

If s̃_1.STATUS_p = gather_try, then from algorithm KK_β we have that s̃_1.DONE ⊆ s_2.DONE_p, since actions gatherTry_p are followed by actions gatherDone_p before any action setNext_p takes place. As a result i_2 ∈ s_2.DONE_p, which is a contradiction, since (s_2, compNext_p, s′_2) ∉ trans(KK_β) if i_2 ∈ s_2.DONE_p and s′_2.NEXT_p = i_2, s′_2.STATUS_p = set_next.

If s̃_1.STATUS_p = gather_done, then from algorithm KK_β we have that s̃_1.Q_p = q and there exists j ∈ {1, …, n} such that s̃_1.POS_p(q) = j and s̃_1.done_{q,j} = i_1. Since t_2 < t′_3 < t_1 < t′_1 < s̃_1 < s_2 < s̃_2, it must be the case that j′ < j and as a result i_2 ∈ s̃_1.DONE_p. Clearly s̃_1.DONE_p ⊆ s_2.DONE_p, which is a contradiction, since (s_2, compNext_p, s′_2) ∉ trans(KK_β) if i_2 ∈ s_2.DONE_p and s′_2.NEXT_p = i_2, s′_2.STATUS_p = set_next.

Next we show that if two consecutive collisions take place between processes p, q, and p detects one collision and q detects the other, then processes p, q attempted to perform the jobs associated with the collisions in the same order, and the order in which the processes detect the collisions according to Definition 5.2 is the same as the order in which processes p, q attempted to perform the jobs.

Lemma 5.3.
In an execution α ∈ execs(KK_β) for any β ≥ m, if there exist processes p, q, jobs i_1, i_2 ∈ J and states s̃_1 < s̃_2 such that process p collided with process q in job i_1 at state s̃_1 and process q collided with process p in job i_2 at state s̃_2 according to Definition 5.2, then there exist transitions (s_1, compNext_p, s′_1) ↦ s̃_1, (t_2, compNext_q, t′_2) ↦ s̃_2 and (s_2, compNext_p, s′_2), (t_1, compNext_q, t′_1), where s′_1.NEXT_p = t′_1.NEXT_q = i_1, s′_2.NEXT_p = t′_2.NEXT_q = i_2 and s′_1.STATUS_p = s′_2.STATUS_p = t′_1.STATUS_q = t′_2.STATUS_q = set_next, such that s_1 < s_2 and t_1 < t_2.

Proof. From Definition 5.2 we have that there exist transitions (s_1, compNext_p, s′_1), (s_2, compNext_p, s′_2) with s′_1.NEXT_p = i_1, s′_2.NEXT_p = i_2, s′_1.STATUS_p = s′_2.STATUS_p = set_next, and there exists no action π_1 = compNext_p for which s_1 < π_1 < s̃_1. Furthermore, from Definition 5.2 we have that there exist transitions (t_1, compNext_q, t′_1), (t_2, compNext_q, t′_2) with t′_1.NEXT_q = i_1, t′_2.NEXT_q = i_2, t′_1.STATUS_q = t′_2.STATUS_q = set_next, and there exists no action π_2 = compNext_q for which t_2 < π_2 < s̃_2. From the latter and the fact that s̃_1 < s̃_2, it must be the case that t_1 < t_2 < s̃_2. We can pick the transitions that are enabled by states t_1 and s_2 in α in such a way that there exists no other transition between t′_1 and s̃_1 that sets NEXT_q to i_1 and, similarly, no other transition between s′_2 and s̃_2 that sets NEXT_p to i_2.

We now need to prove that s_1 < s_2. We will prove this by contradiction. Let s_2 < s_1.
From algorithm KK_β and Definition 5.2 there exist transitions (s_3, setNext_p, s′_3) and (t_3, setNext_q, t′_3), where s′_3.next_p = i_2, t′_3.next_q = i_1 and s_2 < s′_3 < s_1, t_1 < t′_3 < t_2. There are two cases: either s′_3 < t′_3 or t′_3 < s′_3.

Case 1, s′_3 < t′_3: We have that s′_3 < t′_3 < t_2 and (t_2, compNext_q, t′_2), where t′_2.NEXT_q = i_2 and t′_2.STATUS_q = set_next, which means that i_2 ∉ t_2.TRY_q ∪ t_2.DONE_q. This is a contradiction, since the sets t_2.TRY_q and t_2.DONE_q are computed by actions gatherTry_q and gatherDone_q that are preceded by state s′_3. Either i_2 ∈ t_2.TRY_q, or a new action setNext_p took place before the gatherTry_q actions. In the latter case, if there is a transition (s_4, done_p, s′_4) with s_4.next_p = i_2 before that action setNext_p, it must be the case that i_2 ∈ t_2.DONE_q. If there exists no such transition, we again have a contradiction, since we cannot have a collision in job i_2 at state s̃_2 as defined in Definition 5.2.

Case 2, t′_3 < s′_3: We have that t′_3 < s′_3 < s_1 and (s_1, compNext_p, s′_1), where s′_1.NEXT_p = i_1 and s′_1.STATUS_p = set_next, which means that i_1 ∉ s_1.TRY_p ∪ s_1.DONE_p. This is a contradiction, since the sets s_1.TRY_p and s_1.DONE_p are computed by gatherTry_p and gatherDone_p actions that are preceded by state t′_3. Either i_1 ∈ s_1.TRY_p, or a new action setNext_q took place before the gatherTry_p actions. In the latter case, if there is a transition (t_4, done_q, t′_4) with t_4.next_q = i_1 before that action setNext_q, it must be the case that i_1 ∈ s_1.DONE_p. If there exists no such transition, we again have a contradiction, since we cannot have a collision in job i_1 at state s̃_1 as defined in Definition 5.2.
Next we show that if two processes p, q ∈ P collide three times, their DONE sets at the third collision will contain at least m·|q−p| more jobs than they did at the first collision. This will allow us to derive an upper bound on the number of collisions a process may participate in. It is possible that both processes become aware of a collision, or that only one of them does while the other one successfully completes the job.

Lemma 5.4. If β ≥ 3m² and in an execution α ∈ execs(KK_β) there exist processes p ≠ q, jobs i_1, i_2, i_3 ∈ J and states s̃_1 < s̃_2 < s̃_3 such that processes p, q collide in job i_1 at state s̃_1, in job i_2 at state s̃_2 and in job i_3 at state s̃_3 according to Definition 5.3, then there exist states s_1 < s_3 and t_1 < t_3 such that:

s_1.DONE_p ∪ t_1.DONE_q ⊆ s_3.DONE_p ∩ t_3.DONE_q
|s_3.DONE_p ∪ t_3.DONE_q| − |s_1.DONE_p ∪ t_1.DONE_q| ≥ m·|q−p|

Proof. From Definitions 5.2 and 5.3 we have that there exist transitions (s_1, compNext_p, s′_1), (s_2, compNext_p, s′_2), (s_3, compNext_p, s′_3) and (t_1, compNext_q, t′_1), (t_2, compNext_q, t′_2), (t_3, compNext_q, t′_3), where s′_1.NEXT_p = t′_1.NEXT_q = i_1, s′_2.NEXT_p = t′_2.NEXT_q = i_2, s′_3.NEXT_p = t′_3.NEXT_q = i_3, s′_1.STATUS_p = s′_2.STATUS_p = s′_3.STATUS_p = t′_1.STATUS_q = t′_2.STATUS_q = t′_3.STATUS_q = set_next and s_1 < s̃_1, t_1 < s̃_1, s_2 < s̃_2, t_2 < s̃_2, s_3 < s̃_3, t_3 < s̃_3. We pick from α the transitions (s_1, compNext_p, s′_1), (t_1, compNext_q, t′_1) in such a way that there exists no other compNext_p (respectively compNext_q) between states s_1 and s̃_1 (respectively t_1 and s̃_1) that sets NEXT_p (respectively NEXT_q) to i_1. We can pick the transitions for jobs i_2, i_3 in a similar manner. From Lemmas 5.2, 5.3 and Definitions 5.2, 5.3 we have that s_1 < s_2 < s_3 and t_1 < t_2 < t_3.
We will first prove that:

s_1.DONE_p ∪ t_1.DONE_q ⊆ s_3.DONE_p ∩ t_3.DONE_q

From algorithm KK_β we have that there exist in α transitions (s_4, setNext_p, s′_4), (t_4, setNext_q, t′_4) with s′_4.next_p = i_2, t′_4.next_q = i_2, and there exists no action π_1 = compNext_p such that s′_2 < π_1 < s_4, and no action π_2 = compNext_q such that t′_2 < π_2 < t_4. We need to prove that t_1 < s_4 and s_1 < t_4.

We start by proving that t_1 < s_4. In order to get a contradiction, we assume that s_4 < t_1. From algorithm KK_β we have that there exists in α a transition (t_5, gatherTry_q, t′_5) with t_5.Q_q = p, and there exists no action π_2 = compNext_q such that t′_5 < π_2 < t_2. We have that s_4 < t_1 < t′_5 < t_2 and i_2 ∉ t_2.TRY_q ∪ t_2.DONE_q. If t_5.next_p = i_2 we have a contradiction, since then i_2 ∈ t_2.TRY_q. If t_5.next_p ≠ i_2, there exists an action π_3 = setNext_p in α such that s_4 < π_3 < t_5. If this π_3 = setNext_p is preceded by a transition (s_5, done_p, s′_5) with s_5.NEXT_p = i_2, we have a contradiction, since then i_2 ∈ t_5.DONE and t_2.DONE_q is computed by gatherDone_q actions that are preceded by state t_5, which results in i_2 ∈ t_2.DONE_q. If there exists no such transition, we again have a contradiction, since we cannot have a collision in job i_2 at state s̃_2 as defined in Definition 5.2. The case s_1 < t_4 is symmetric and can be proved with similar arguments.

From the discussion above we have that t_1 < s_4, thus t_1.DONE_q ⊆ s_4.DONE. Moreover, s_3.DONE_p is computed by gatherDone_p actions that are preceded by state s_4, from which we have that t_1.DONE_q ⊆ s_3.DONE_p. Since s_1 < s_3 it holds that s_1.DONE_p ⊆ s_3.DONE_p, thus we have that s_1.DONE_p ∪ t_1.DONE_q ⊆ s_3.DONE_p. From s_1 < t_4, with similar arguments as before, we can prove that s_1.DONE_p ∪ t_1.DONE_q ⊆ t_3.DONE_q, which gives us that:

s_1.DONE_p ∪ t_1.DONE_q ⊆ s_3.DONE_p ∩ t_3.DONE_q

Now it only remains to prove that:

|s_3.DONE_p ∪ t_3.DONE_q| − |s_1.DONE_p ∪ t_1.DONE_q| > m·|q−p|

If p < q, from Lemma 5.1 we have that |s_3.DONE_p \ t_3.DONE_q| > (q−p)·m or |t_3.DONE_q \ s_3.DONE_p| > (q−p)·m. Since s_1.DONE_p ∪ t_1.DONE_q ⊆ s_3.DONE_p ∩ t_3.DONE_q, we have that:

|s_3.DONE_p ∪ t_3.DONE_q| − |s_1.DONE_p ∪ t_1.DONE_q| > (q−p)·m

If q < p, with similar arguments we have that:

|s_3.DONE_p ∪ t_3.DONE_q| − |s_1.DONE_p ∪ t_1.DONE_q| > (p−q)·m

Combining the above we have:

|s_3.DONE_p ∪ t_3.DONE_q| − |s_1.DONE_p ∪ t_1.DONE_q| > m·|q−p|

Next we prove that a process p cannot collide with a process q more than 2⌈n/(m·|q−p|)⌉ times in any execution.

Lemma 5.5. If β ≥ 3m², there exists no execution α ∈ execs(KK_β) in which process p collided with process q in more than 2⌈n/(m|q−p|)⌉ states according to Definition 5.2.

Proof. Let α ∈ execs(KK_β) be an execution in which process p collided with process q in at least 2⌈n/(m|q−p|)⌉ + 1 states, and let us examine the first 2⌈n/(m|q−p|)⌉ + 1 such states. Let those states be s̃_1 < s̃_2 < … < s̃_{2⌈n/(m|q−p|)⌉+1}. From Lemma 5.2 we have that there exist states s_1 < s_2 < … < s_{2⌈n/(m|q−p|)⌉+1} that enable the compNext_p actions and states t_1 < t_2 < … < t_{2⌈n/(m|q−p|)⌉+1} that enable the compNext_q actions that lead to the collisions at the states s̃_1 < s̃_2 < … < s̃_{2⌈n/(m|q−p|)⌉+1}. Then from Lemma 5.4 we have that for all i ∈ {1, …, ⌈n/(m|q−p|)⌉}:

|s_{2i+1}.DONE_p ∪ t_{2i+1}.DONE_q| − |s_{2i−1}.DONE_p ∪ t_{2i−1}.DONE_q| > m|q−p|
⇒ |s_{2i+1}.DONE_p ∪ t_{2i+1}.DONE_q| − |s_1.DONE_p ∪ t_1.DONE_q| > i·m|q−p|
⇒ |s_{2i+1}.DONE_p ∪ t_{2i+1}.DONE_q| > i·m|q−p|   (5)

From eq. 5 we have that:

|s_{2⌈n/(m|q−p|)⌉+1}.DONE_p ∪ t_{2⌈n/(m|q−p|)⌉+1}.DONE_q| > m|q−p|·⌈n/(m|q−p|)⌉ ≥ n   (6)

Equation 6 leads to a contradiction, since s_{2⌈n/(m|q−p|)⌉+1}.DONE_p ∪ t_{2⌈n/(m|q−p|)⌉+1}.DONE_q ⊆ J and |J| = n.

Finally, we are ready to prove the main theorem on the work complexity of algorithm KK_β for β ≥ 3m².

Theorem 5.6. If β ≥ 3m², algorithm KK_β has work complexity W_{KK_β} = O(nm log n log m).

Proof. We start with the observation that in any execution α of algorithm KK_β, if there exist a process p, a job i, a transition (s_1, done_p, s′_1) and j ∈ {1, …, n} such that s_1.POS_p(p) = j and s_1.NEXT_p = i, then for any process q ≠ p there exists at most one transition (t_1, gatherDone_q, t′_1) in α with t_1.Q_q = p, t_1.POS_q(p) = j and t_1 ≥ s_1. Such a transition performs exactly one read operation from shared memory, one insertion into the set DONE_q and one removal from the set FREE_q; thus such a transition costs O(log n) work. Clearly there exist at most m − 1 such transitions for each done_p. From Lemma 4.1, over all processes there can be at most n actions done_p in any execution α of algorithm KK_β. Each done_p action performs one write operation in shared memory, one insertion into the set DONE_p and one removal from the set FREE_p; thus such an action costs O(log n) work. Furthermore, any done_p is preceded by m − 1 gatherTry_p read actions, which read the next array and each add at most one element to the set TRY_p with cost O(log n), and by m − 1 gatherDone_p read actions that do not add elements to the DONE_p set. Note that we have already counted the gatherDone_p read actions that result in adding jobs to the DONE_p set. Finally, any done_p action is preceded by one compNext_p action.
This action is dominated by the cost of the rank(FREE_p, TRY_p, i) function. If the sets FREE_p, TRY_p are represented by an efficient tree structure such as a red-black tree or some variant of a B-tree [5, 19], which supports insertion, deletion and search of an element in O(log n), an invocation of the function rank(FREE_p, TRY_p, i) costs O(m log n) work. That gives us a total of O(nm log n) work associated with the done_p actions.

If a process p collided with a process q in job i at state s, we have an extra compNext_p action, m − 1 extra gatherTry_p read actions with insertions into the TRY_p set, and m − 1 gatherDone_p read actions that do not add elements to the DONE_p set. Thus each collision costs O(m log n) work. Since β ≥ 3m², from Lemma 5.5 we have that for two distinct processes p, q there exist fewer than 2⌈n/(m|q−p|)⌉ collisions in any execution α of algorithm KK_β. For process p, if we count all such collisions with every other process q we get:

Σ_{q ∈ P−{p}} 2⌈n/(m|q−p|)⌉ ≤ 2(m−1) + (2n/m)·Σ_{q ∈ P−{p}} 1/|q−p| ≤ 2(m−1) + (4n/m)·Σ_{i=1}^{⌈m/2⌉} 1/i ≤ 2(m−1) + (4n/m)·log m   (7)

If we count the total number of collisions over all m processes, we get that if β ≥ 3m², in any execution of algorithm KK_β there can be at most 2m² + 4n log m < 4(n+1) log m collisions (since n > β). Thus collisions cost O(nm log n log m) work. Finally, any process p that fails may add to the work complexity less than O(m log n) work from its compNext_p action and from reads (if the process fails without performing a done_p action after its latest compNext_p action). So for the work complexity of algorithm KK_β with β ≥ 3m² we have that W_{KK_β} = O(nm log n log m).

6.
An Asymptotically Work Optimal Algorithm

We demonstrate how to solve the at-most-once problem with effectiveness n − O(m² log n log m) and work complexity O(n + m^{3+ε} log n), for any constant ε > 0 such that 1/ε is a positive integer, when m = O(∛n), using algorithm KK_β with β = 3m². Algorithm IterativeKK(ε), presented in Fig. 3, performs iterative calls to a variation of algorithm KK_β called IterStepKK. IterativeKK(ε) uses 3 + 1/ε distinct matrices done and vectors next in shared memory, with different granularities. One done matrix stores the regular jobs performed, while the remaining 2 + 1/ε matrices store super-jobs. Super-jobs are groups of consecutive jobs. Of these matrices, one stores super-jobs of size m log n log m, while the remaining 1 + 1/ε matrices store super-jobs of size m^{1−iε} log n log^{1+i} m for i ∈ {1, …, 1/ε}. The 3 + 1/ε distinct vectors next are used in a similar way as the matrices done.

Algorithm IterStepKK differs from KK_β in the following ways. First, all instances of IterStepKK work with β = 3m². Moreover, IterStepKK has a termination flag in shared memory. This termination flag is initially 0 and is set to 1 by any process that decides to terminate.
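The grouping of jobs into super-jobs and the shrinking super-job sizes used across the iterations described above can be sketched as follows. This is a minimal illustration, not the paper's code: the names superjob_of and size_schedule are ours, and we fix logarithms to base 2.

```python
import math

def superjob_of(job: int, size: int) -> int:
    """Map a job id (0-based) to the id of the super-job, i.e. the
    group of `size` consecutive jobs, that contains it. A job always
    lands in the same super-job of a given size, and distinct
    super-jobs of the same size share no jobs."""
    return job // size

def size_schedule(n: int, m: int, eps: float):
    """Super-job sizes used across the successive IterStepKK calls:
    first m*log(n)*log(m), then m**(1 - i*eps) * log(n) * log(m)**(1+i)
    for i = 1..1/eps, and finally 1 (individual jobs)."""
    logn, logm = math.log2(n), math.log2(m)
    sizes = [m * logn * logm]
    for i in range(1, int(1 / eps) + 1):
        sizes.append(m ** (1 - i * eps) * logn * logm ** (1 + i))
    sizes.append(1)
    return sizes

sizes = size_schedule(n=2**20, m=64, eps=0.5)
# sizes == [7680.0, 5760.0, 4320.0, 1]: three super-job granularities,
# then individual jobs for the final IterStepKK call.
```

For eps = 1/2 the loop contributes two intermediate granularities, matching the 1 + 1/ε intermediate levels of the schedule.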
In the execution of algorithm IterStepKK, a process p that in an action compNext_p has |FREE_p \ TRY_p| < 3m² sets the termination flag to 1, computes new sets FREE_p and TRY_p, returns the set FREE_p \ TRY_p and terminates. After a process p checks whether it is safe to perform a job, the process also checks the termination flag; if the flag is 1, the process, instead of performing the job, computes new sets FREE_p and TRY_p, returns the set FREE_p \ TRY_p and terminates. Finally, algorithm IterStepKK takes as input the variable size and a set SET_1 such that |SET_1| > 3m², and returns the set SET_2 as output. SET_1 contains super-jobs of size size. In IterStepKK, with an action do_{p,j} process p performs all the jobs of super-job j. A process p performs as many super-jobs as it can and returns in SET_2 the super-jobs it can verify that no process will perform.

In algorithm IterativeKK(ε) we also use the function SET_2 = map(SET_1, size_1, size_2), which takes a set SET_1 of super-jobs of size size_1 and maps it to a set SET_2 of super-jobs of size size_2. A job i is always mapped to the same super-job of a specific size, and there is no intersection between the jobs in super-jobs of the same size.

IterativeKK(ε) for process p:
00  size_{p,1} ← 1
01  size_{p,2} ← m log n log m
02  FREE_p ← map(J, size_{p,1}, size_{p,2})
03  FREE_p ← IterStepKK(FREE_p, size_{p,2})
04  for (i ← 1; i ≤ 1/ε; i++)
05      size_{p,1} ← size_{p,2}
06      size_{p,2} ← m^{1−iε} log n log^{1+i} m
07      FREE_p ← map(FREE_p, size_{p,1}, size_{p,2})
08      FREE_p ← IterStepKK(FREE_p, size_{p,2})
09  endfor
10  size_{p,1} ← size_{p,2}
11  size_{p,2} ← 1
12  FREE_p ← map(FREE_p, size_{p,1}, size_{p,2})
13  FREE_p ← IterStepKK(FREE_p, size_{p,2})

Figure 3: Algorithm IterativeKK(ε): pseudocode

6.1.
Analysis

We begin the analysis of algorithm IterativeKK(ε) by showing in Theorem 6.3 that IterativeKK(ε) solves the at-most-once problem. This is done by first showing in Lemma 6.1 that algorithm IterStepKK solves the at-most-once problem for the set of all super-jobs of a specific size, and then by showing in Lemma 6.2 that no performed super-jobs appear in any output set SET_2. We complete the analysis with Theorem 6.4, where we show that algorithm IterativeKK(ε) has effectiveness E_{IterativeKK(ε)}(n, m, f) = n − O(m² log n log m) and work complexity W_{IterativeKK(ε)} = O(n + m^{3+ε} log n).

Let SuperSet_d be the set of all super-jobs of a specific size d. All invocations of algorithm IterStepKK on sets SET_1 ⊆ SuperSet_d use the matrix done and vector next that correspond to the super-jobs of size d. Moreover, each process p invokes algorithm IterStepKK for a set SET_1 ⊆ SuperSet_d only once. We have the following lemma.

Lemma 6.1. Algorithm IterStepKK solves the at-most-once problem for the set SuperSet_d.

Proof. As described above, algorithm IterStepKK differs from KK_β in the following ways:

• Process p in algorithm IterStepKK has an input set SET_1 ⊆ SuperSet_d of super-jobs of size d to be performed, and outputs a set SET_2 ⊂ SuperSet_d of super-jobs that have not been performed. Process p initially sets its set FREE_p equal to SET_1 and proceeds as it would when executing KK_β, with the difference that an action do_{p,i} results in performing all the jobs under super-job i. Entries in the matrix done and vector next in shared memory correspond to the identifiers of super-jobs of the set SuperSet_d. Again, after its initialization, entries are only removed from the set FREE_p.
Note that the main difference this modification introduces between algorithm IterStepKK and algorithm KK_β is that jobs are replaced by super-jobs, and that the initial sets FREE_p and FREE_q of processes p, q could be set to different subsets of the set SuperSet_d. This does not affect the correctness of the algorithm, since in any state s of an execution α of algorithm KK_β the sets FREE_p and FREE_q could be different subsets of the set of all jobs J.

• Algorithm IterStepKK has a termination flag in shared memory. The termination flag is initially 0 and is set to 1 by any process that decides to terminate. As mentioned above, any process that discovers that |FREE_p \ TRY_p| < 3m² in an action compNext_p sets the termination flag to 1, computes new sets FREE_p and TRY_p, returns the set FREE_p \ TRY_p and terminates. This modification only affects the sequence of actions during the termination of a process p. Observe that process p does not perform any super-jobs in that termination sequence. Additionally, after a process p checks whether it is safe to perform a super-job, it also checks the termination flag; if the flag is 1, the process, instead of performing the super-job, enters the termination sequence, computing new sets FREE_p and TRY_p, returning the set FREE_p \ TRY_p and terminating. A process p first checks whether it is safe to perform a super-job according to algorithm KK_β and then checks the flag. Thus this modification only affects the effectiveness, but not the correctness, of the algorithm, since it could only result in a super-job that was safe to perform not being performed.

• Finally, all instances of IterStepKK work with β = 3m². This does not affect correctness, since Lemma 4.1 holds for any β.

It is easy to see that none of the modifications described above affect the key arguments in the proof of Lemma 4.1.
Thus, with similar arguments as in the proof of Lemma 4.1, we can show that there exists no execution of algorithm IterStepKK in which two distinct actions π = do_{p,i} and π′ = do_{q,i} take place for a super-job i ∈ SuperSet_d and processes p, q ∈ P (p could be equal to q).

Next we show that the output sets of algorithm IterStepKK at a specific iteration (calls for super-jobs of size d) include no completed super-jobs. Combined with the previous lemma, this argument will help us establish that algorithm IterativeKK(ε) solves the at-most-once problem.

Lemma 6.2. There exists no execution α of algorithm IterStepKK such that there exists an action do_{q,i} ∈ α for some process q and super-job i in the output set SET_2 ⊂ SuperSet_d of some process p (p could be equal to process q).

Proof. As described above, a process p, before terminating algorithm IterStepKK, either sets the flag to 1 or observes that the flag is set to 1. The process p then computes new sets FREE_p and TRY_p, returns the set FREE_p \ TRY_p and terminates its execution of algorithm IterStepKK for input set SET_1 ⊆ SuperSet_d and super-jobs of size d. Let state s be the state at which process p terminates; we have that SET_2 = s.FREE_p \ s.TRY_p. If p = q and there exists an action π = do_{p,i} in execution α of algorithm IterStepKK for super-job i ∈ SuperSet_d, then clearly π < s, from which we have that i ∉ s.FREE_p and thus i ∉ SET_2. It is easy to see that if p ≠ q and i ∈ SET_2 of process p, there exists no action π = do_{q,i} in execution α. If i ∈ SET_2, then i ∈ s.FREE_p and i ∉ s.TRY_p. Moreover, process p either set the flag to 1 or observed that the flag was set before computing the sets s.FREE_p and s.TRY_p.
If there exists π = do_{q,i} ∈ α for process q, it must be the case that after process q performed the transition (t, compNext_q, t′) ↦ π (see Definition 5.1 of immediate predecessor), it read the flag and found it equal to 0. This leads to a contradiction, since it must then be the case that either i ∈ s.TRY_p or i ∉ s.FREE_p.

We are now ready to show the correctness of algorithm IterativeKK(ε).

Theorem 6.3. Algorithm IterativeKK(ε) solves the at-most-once problem.

Proof. From Lemma 6.1 we have that any super-job of a specific size d is performed at most once (if performed at all) in the execution of algorithm IterStepKK for the super-jobs in the set SuperSet_d. Moreover, from Lemma 6.2 we have that super-jobs in the output sets of an execution of algorithm IterStepKK for super-jobs of size d have not been performed. The function SET_2 = map(SET_1, size_1, size_2) maps the jobs in the super-jobs of set SET_1 to super-jobs in SET_2. A job i is always mapped to the same super-job of a specific size d, and there is no intersection between the jobs of the super-jobs in the set SuperSet_d. It is easy to see that there exists no execution of algorithm IterativeKK(ε) in which a job i is performed more than once.

We complete the analysis of algorithm IterativeKK(ε) with Theorem 6.4, which gives upper bounds for the effectiveness and work complexity of the algorithm.

Theorem 6.4. Algorithm IterativeKK(ε) has work complexity W_{IterativeKK(ε)} = O(n + m^{3+ε} log n) and effectiveness E_{IterativeKK(ε)}(n, m, f) = n − O(m² log n log m).

Proof. In order to determine the effectiveness and work complexity of algorithm IterativeKK(ε), we compute the jobs performed by, and the work spent in, each invocation of IterStepKK. Moreover, we compute the work added by the invocations of the function map().
The first invocation of function map() in line 02 can be completed by process p with work O((n/(m log n log m)) · log n), since process p needs to construct a tree with n/(m log n log m) elements. This contributes a total of O(n/log m) work over all processes. From Theorem 5.6 we have that IterStepKK in line 03 has total work O(n + (n/(m log n log m)) · m log n log m) = O(n), where the first n comes from do actions and the second term from the work complexity of Theorem 5.6. Note that we count O(1) work for each normal job executed by a do action on a super-job; that means that in the invocation of IterStepKK in line 03 each do action costs m log n log m work. Moreover, from Theorem 4.4 we have effectiveness n/(m log n log m) − (3m² + m − 2) on the super-jobs of size m log n log m. Of the super-jobs not completed, up to m − 1 may be contained in the TRY_p sets upon termination in line 03. Since those super-jobs are not added (and thus are ignored) in the output FREE_p set in line 03, up to (m−1) · m log n log m jobs may not be performed by IterativeKK(ε). The set FREE_p returned by algorithm IterStepKK in line 03 has no more than 3m² + m − 2 super-jobs of size m log n log m.

In each repetition of the loop in lines 04−09, the map() function in line 07 constructs a FREE_p set with at most O(m^{2+ε}/log m) elements, which costs O(m^{2+ε}) per process p, for a total of O(m^{3+ε}) work over all processes. Moreover, each invocation of IterStepKK in line 08 costs O(3m³ log n log m + m^{3+ε} log m) < O(m^{3+ε} log n) work by Theorem 5.6, where the term 3m³ log n log m is an upper bound on the work needed for the do actions on the super-jobs. From Theorem 4.4 we have that each output FREE_p set in line 08 has at most 3m² + m − 2 super-jobs. Moreover, in each invocation of IterStepKK in line 08 at most m − 1 super-jobs are lost in TRY sets.
Those account for less than (m−1) · m log n log m jobs in each iteration, since the size of the super-jobs in the iterations of the loop in lines 04−09 is strictly less than m log n log m. When we leave the loop in lines 04−09, we have a FREE_p set with at most 3m² + m − 2 super-jobs of size log n log^{1+1/ε} m, which means that in line 12 the function map() will return a set FREE_p with fewer than (3m² + m − 2)(log n log^{1+1/ε} m) elements that correspond to jobs and not super-jobs. This costs over all processes a total of O(m³ log m log log n log log m) < O(m^{3+ε} log n) work, since ε is a constant. Finally, from Theorem 5.6, IterStepKK in line 13 has work O(m³ log² m log log n log log m) < O(m^{3+ε} log n) and, from Theorem 4.4, effectiveness (3m² + m − 2)(log n log^{1+1/ε} m) − (3m² + m − 2).

If we add up all the work, we have that W_{IterativeKK(ε)} = O(n + m^{3+ε} log n), since the loop in lines 04−09 repeats 1 + 1/ε times and ε is a constant. Moreover, for the effectiveness, we have that at most (m−1) · m log n log m jobs will be lost in the TRY sets at line 03; after that, strictly fewer than (m−1) · m log n log m jobs will be lost in the TRY sets of the iterations of the loop in lines 04−09, and fewer than 3m² + m − 2 jobs will be lost from the effectiveness of the last invocation of IterStepKK in line 13. Thus we have that E_{IterativeKK(ε)}(n, m, f) = n − O(m² log n log m).

For any m = O((n/log n)^{1/(3+ε)}), algorithm IterativeKK(ε) is work optimal and asymptotically effectiveness optimal.

7. An Asymptotically Optimal Algorithm for the Write-All Problem

Based on IterativeKK(ε) we construct algorithm WA_IterativeKK(ε) (Fig. 4), which solves the Write-All problem [23] with work complexity O(n + m^{3+ε} log n), for any constant ε > 0 such that 1/ε is a positive integer.
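The work-optimality range claimed for these bounds is a quick calculation: for m = O((n/log n)^{1/(3+ε)}) the term m^{3+ε} log n is dominated by n, so the total work O(n + m^{3+ε} log n) collapses to O(n). A small numeric sanity check (logarithms base 2; the helper name work_bound and the concrete n, m, ε values are ours, chosen only for illustration):

```python
import math

def work_bound(n: int, m: int, eps: float) -> float:
    """The claimed work bound n + m^(3+eps) * log n."""
    return n + m ** (3 + eps) * math.log2(n)

# At the threshold m = (n / log n)^(1/(3+eps)), the second term equals
# (n / log n) * log n = n, so the whole bound stays O(n).
n, eps = 2 ** 20, 0.5
m = int((n / math.log2(n)) ** (1 / (3 + eps)))  # largest work-optimal m
assert m ** (3 + eps) * math.log2(n) <= n       # dominated by n
assert work_bound(n, m, eps) <= 2 * n           # total work is O(n)
```

For m above this threshold the m^{3+ε} log n term dominates and the algorithm is no longer work optimal, which matches the processor range stated in the theorems.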
From Kanellakis and Shvartsman [23], the Write-All problem for the shared-memory model consists of: "Using m processors write 1's to all locations of an array of size n." The problem assumes that all cells of the array are initialized to 0. Algorithm WA_IterativeKK(ε) differs from IterativeKK(ε) in two ways. It uses a modified version of IterStepKK, named WA_IterStepKK, that upon termination returns the set FREE_p instead of the set FREE_p \ TRY_p. Moreover, in WA_IterativeKK(ε), after line 13 process p, instead of terminating, executes all jobs in the set FREE_p. Note that since we are interested in the Write-All problem, when process p performs a job i with action do_{p,i}, process p just writes 1 in the i-th position of the Write-All array wa[1, …, n] in shared memory.

WA_IterativeKK(ε) for process p:
00  size_{p,1} ← 1
01  size_{p,2} ← m log n log m
02  FREE_p ← map(J, size_{p,1}, size_{p,2})
03  FREE_p ← WA_IterStepKK(FREE_p, size_{p,2})
04  for (i ← 1; i ≤ 1/ε; i++)
05      size_{p,1} ← size_{p,2}
06      size_{p,2} ← m^{1−iε} log n log^{1+i} m
07      FREE_p ← map(FREE_p, size_{p,1}, size_{p,2})
08      FREE_p ← WA_IterStepKK(FREE_p, size_{p,2})
09  endfor
10  size_{p,1} ← size_{p,2}
11  size_{p,2} ← 1
12  FREE_p ← map(FREE_p, size_{p,1}, size_{p,2})
13  FREE_p ← WA_IterStepKK(FREE_p, size_{p,2})
14  for (i ∈ FREE_p)
15      do_{p,i}
16  endfor

Figure 4: Algorithm WA_IterativeKK(ε): pseudocode

Theorem 7.1. Algorithm WA_IterativeKK(ε) solves the Write-All problem with work complexity W_{WA_IterativeKK(ε)} = O(n + m^{3+ε} log n).

Proof. We prove this with arguments similar to those in the proof of Theorem 6.4. From Theorem 4.4, after each invocation of WA_IterStepKK the output set FREE_p has fewer than 3m² + m − 1 super-jobs.
The difference is that now we do not leave jobs in the TRY_p sets, since we are not interested in maintaining the at-most-once property between successive invocations of algorithm WA_IterStepKK. Since after each invocation of WA_IterStepKK the output set FREE_p has the same upper bound on super-jobs as in IterativeKK(ǫ), with arguments similar to those in the proof of Theorem 6.4 we have that at line 13 the total work performed by all processes is O(n + m^{3+ǫ} log n). Moreover, from Theorem 4.4, the output set FREE_p at line 13 has fewer than 3m^2 + m − 2 jobs. This gives, over all processes, a total work of O(m^3) for the loop in lines 14–16. After the loop in lines 14–16 all jobs have been performed, since we left no TRY sets behind; thus algorithm WA_IterativeKK(ǫ) solves the Write-All problem with work complexity W_{WA_IterativeKK(ǫ)} = O(n + m^{3+ǫ} log n).

For any m = O((n/ log n)^{1/(3+ǫ)}), algorithm WA_IterativeKK(ǫ) is work-optimal.

8. Conclusions

We devised and analyzed a deterministic algorithm for the at-most-once problem, called KK_β. For β = m, algorithm KK_β has effectiveness n − 2m + 2, which is asymptotically optimal for any m = o(n) and within an additive factor of m of the effectiveness upper bound n − m + 1 over all possible algorithms. This is a significant improvement over the previously best known deterministic algorithm [26], which achieves asymptotically optimal effectiveness only for m = O(1). With respect to work complexity, for any constant ǫ and for m = O((n/ log n)^{1/(3+ǫ)}), we demonstrated how to use KK_β with β = 3m^2 in order to construct an iterated algorithm, IterativeKK(ǫ), that is work-optimal and asymptotically effectiveness-optimal.
Finally, we used algorithm IterativeKK(ǫ) to solve the Write-All problem with work complexity O(n + m^{3+ǫ} log n), for any constant ǫ > 0, which is work-optimal for m = O((n/ log n)^{1/(3+ǫ)}). Our solution improves on the algorithm of Malewicz [36] both in the range of processors for which we achieve optimal work and in the fact that we do not assume test-and-set primitives, but use only atomic read/write shared memory. The solution of Kowalski and Shvartsman [28] is work-optimal for a wider range of processors m than our algorithm, but their algorithm uses a collection of q permutations with contention O(q log q). Although an efficient polynomial-time construction of permutations with contention O(q polylog q) has been developed by Kowalski et al. [27], constructing permutations with contention O(q log q) in polynomial time is still an open problem. Subsequent to the conference version of this paper [25], Alistarh et al. [2] showed that there exists a deterministic algorithm for the Write-All problem with work O(n + m log^5 n log^2 max(n, m)), obtained by derandomizing their randomized solution to the problem. Their solution is so far existential, while ours is explicit.

In terms of open questions, there still exists a gap between the shown effectiveness n − 2m + 2 of algorithm KK_β and the known effectiveness upper bound n − m + 1; it would be interesting to see whether this gap can be bridged for deterministic algorithms. Moreover, there is a lack of an upper bound on work complexity when the effectiveness of an algorithm approaches the optimal. Finally, it would be interesting to study the existence and efficiency of algorithms that implement at-most-once semantics in systems with different means of communication, such as message-passing systems.

References

[1] Y. Afek, E. Weisberger, and H. Weisman.
A completeness theorem for a class of synchronization objects. In Proc. of the 12th Annual ACM Symp. on Principles of Distributed Computing (PODC '93), pages 159–170. ACM, 1993.
[2] D. Alistarh, M. Bender, S. Gilbert, and R. Guerraoui. How to allocate tasks asynchronously. In Foundations of Computer Science (FOCS), 2012 IEEE 53rd Annual Symposium on, pages 331–340, Oct. 2012.
[3] R. J. Anderson and H. Woll. Algorithms for the certified write-all problem. SIAM J. Computing, 26(5):1277–1283, 1997.
[4] H. Attiya, A. Bar-Noy, D. Dolev, D. Peleg, and R. Reischuk. Renaming in an asynchronous environment. J. ACM, 37(3):524–548, 1990.
[5] R. Bayer. Symmetric binary b-trees: Data structure and maintenance algorithms. Acta Informatica, 1:290–306, 1972.
[6] A. D. Birrell and B. J. Nelson. Implementing remote procedure calls. ACM Trans. Comput. Syst., 2(1):39–59, 1984.
[7] D. Bokal, B. Brešar, and J. Jerebic. A generalization of Hungarian method and Hall's theorem with applications in wireless sensor networks. Discrete Appl. Math., 160(4-5):460–470, Mar. 2012.
[8] S. Chaudhuri, B. A. Coan, and J. L. Welch. Using adaptive timeouts to achieve at-most-once message delivery. Distrib. Comput., 9(3):109–117, 1995.
[9] B. S. Chlebus and D. R. Kowalski. Cooperative asynchronous update of shared memory. In STOC, pages 733–739, 2005.
[10] A. Czygrinow, M. Hańćkowiak, E. Szymańska, and W. Wawrzyniak. Distributed 2-approximation algorithm for the semi-matching problem. In Proceedings of the 26th International Conference on Distributed Computing, DISC'12, pages 210–222, Berlin, Heidelberg, 2012. Springer-Verlag.
[11] G. Di Crescenzo and A. Kiayias. Asynchronous perfectly secure communication over one-time pads. In Proc. of 32nd International Colloquium on Automata, Languages and Programming (ICALP '05), pages 216–227. Springer, 2005.
[12] A. Drucker, F. Kuhn, and R. Oshman.
The communication complexity of distributed task allocation. In Proc. of the 31st Annual Symp. on Principles of Distributed Computing (PODC '12), pages 67–76. ACM, 2012.
[13] M. J. Fischer, N. A. Lynch, and M. S. Paterson. Impossibility of distributed consensus with one faulty process. J. ACM, 32(2):374–382, 1985.
[14] M. Fitzi, J. B. Nielsen, and S. Wolf. How to share a key. In Allerton Conference on Communication, Control, and Computing 2007, 2007.
[15] C. Georgiou and A. A. Shvartsman. Do-All Computing in Distributed Systems: Cooperation in the Presence of Adversity. Springer, 2008.
[16] C. Georgiou and A. A. Shvartsman. Cooperative Task-Oriented Computing: Algorithms and Complexity. Synthesis Lectures on Distributed Computing Theory. Morgan & Claypool Publishers, 2011.
[17] K. J. Goldman and N. A. Lynch. Modelling shared state in a shared action model. In Logic in Computer Science, pages 450–463, 1990.
[18] J. Groote, W. Hesselink, S. Mauw, and R. Vermeulen. An algorithm for the asynchronous write-all problem based on process collision. Distributed Computing, 14(2):75–81, 2001.
[19] L. J. Guibas and R. Sedgewick. A dichromatic framework for balanced trees. In 19th Annual Symposium on Foundations of Computer Science (FOCS), pages 8–21, 1978.
[20] N. J. A. Harvey, R. E. Ladner, L. Lovász, and T. Tamir. Semi-matchings for bipartite graphs and load balancing. J. Algorithms, 59(1):53–78, Apr. 2006.
[21] M. Herlihy. Wait-free synchronization. ACM Transactions on Programming Languages and Systems, 13:124–149, 1991.
[22] K. C. Hillel. Multi-sided shared coins and randomized set-agreement. In Proc. of the 22nd ACM Symp. on Parallel Algorithms and Architectures (SPAA '10), pages 60–68, 2010.
[23] P. C. Kanellakis and A. A. Shvartsman. Fault-Tolerant Parallel Computation. Kluwer Academic Publishers, 1997.
[24] S. Kentros, C. Kari, and A. Kiayias. The strong at-most-once problem.
In Proc. of 26th International Symp. on Distributed Computing (DISC'12), pages 390–404, 2012.
[25] S. Kentros and A. Kiayias. Solving the at-most-once problem with nearly optimal effectiveness. In ICDCN, pages 122–137, 2012.
[26] S. Kentros, A. Kiayias, N. C. Nicolaou, and A. A. Shvartsman. At-most-once semantics in asynchronous shared memory. In Proc. of 23rd International Symp. on Distributed Computing (DISC'09), pages 258–273, 2009.
[27] D. Kowalski, P. M. Musial, and A. A. Shvartsman. Explicit combinatorial structures for cooperative distributed algorithms. In Proceedings of the 25th IEEE International Conference on Distributed Computing Systems, ICDCS '05, pages 49–58, Washington, DC, USA, 2005. IEEE Computer Society.
[28] D. R. Kowalski and A. A. Shvartsman. Writing-all deterministically and optimally using a nontrivial number of asynchronous processors. ACM Transactions on Algorithms, 4(3), 2008.
[29] L. Lamport. The part-time parliament. ACM Trans. Comput. Syst., 16(2):133–169, 1998.
[30] B. W. Lampson, N. A. Lynch, and J. F. S-Andersen. Correctness of at-most-once message delivery protocols. In Proc. of the IFIP TC6/WG6.1 6th International Conference on Formal Description Techniques (FORTE '93), pages 385–400. North-Holland Publishing Co., 1994.
[31] K.-J. Lin and J. D. Gannon. Atomic remote procedure call. IEEE Trans. Softw. Eng., 11(10):1126–1135, 1985.
[32] B. Liskov. Distributed programming in Argus. Commun. ACM, 31(3):300–312, 1988.
[33] B. Liskov, L. Shrira, and J. Wroclawski. Efficient at-most-once messages based on synchronized clocks. ACM Trans. Comput. Syst., 9(2):125–142, 1991.
[34] N. Lynch and M. Tuttle. An introduction to input/output automata. CWI-Quarterly, pages 219–246, 1989.
[35] N. A. Lynch. Distributed Algorithms. Morgan Kaufmann Publishers, 1996.
[36] G. Malewicz.
A work-optimal deterministic algorithm for the certified write-all problem with a nontrivial number of asynchronous processors. SIAM J. Comput., 34(4):993–1024, 2005.
[37] A. Z. Spector. Performing remote operations efficiently on a local computer network. Commun. ACM, 25(4):246–260, 1982.
[38] R. W. Watson. The delta-t transport protocol: Features and experience. In Proc. of the 14th Conf. on Local Computer Networks, pages 399–407, 1989.