Algorithms For Extracting Timeliness Graphs
We consider asynchronous message-passing systems in which some links are timely and processes may crash. Each run defines a timeliness graph among correct processes: (p; q) is an edge of the timeliness graph if the link from p to q is timely (that is…
Authors: Carole Delporte-Gallet (LIAFA), Stephane Devismes (VERIMAG - IMAG), Hugues Fauconnier (LIAFA)
Algorithms for Extracting Timeliness Graphs ∗

Carole Delporte-Gallet, University Paris Diderot, Carole.Delporte@liafa.jussieu.fr
Stéphane Devismes, University Joseph Fourier (Grenoble), Stephane.Devismes@imag.fr
Hugues Fauconnier, University Paris Diderot, Hugues.Fauconnier@liafa.jussieu.fr
Mikel Larrea, University of the Basque Country, Mikel.Larrea@ehu.es

Abstract

We consider asynchronous message-passing systems in which some links are timely and processes may crash. Each run defines a timeliness graph among correct processes: (p, q) is an edge of the timeliness graph if the link from p to q is timely (that is, there is a bound on communication delays from p to q). The main goal of this paper is to approximate this timeliness graph by graphs having some properties (such as being trees, rings, ...). Given a family S of graphs, for runs such that the timeliness graph contains at least one graph in S, using an extraction algorithm, each correct process has to converge to the same graph in S that is, in a precise sense, an approximation of the timeliness graph of the run. For example, if the timeliness graph contains a ring, then using an extraction algorithm, all correct processes eventually converge to the same ring, and in this ring all nodes will be correct processes and all links will be timely. We first present a general extraction algorithm and then a more specific extraction algorithm that is communication efficient (i.e., eventually all the messages of the extraction algorithm use only links of the extracted graph).

1 Introduction

Designing fault-tolerant protocols for asynchronous systems is highly desirable but also highly complex. Some classical agreement problems such as consensus and reliable broadcast are well-known tools for solving more sophisticated tasks in faulty environments (e.g., [17, 15]).
Roughly speaking, with consensus processes must reach a common decision on their inputs, and with reliable broadcast processes must deliver the same set of messages.

It is well known that consensus cannot be solved in asynchronous systems with failures [14], and several mechanisms were introduced to circumvent this impossibility result: randomization [7], partial synchrony [11, 12] and (unreliable) failure detectors [6].

Informally, a failure detector is a distributed oracle that gives (possibly incorrect) hints about the process crashes. Each process can access a local failure detector module that monitors the processes of the system and maintains a list of processes that are suspected of having crashed.

Several classes of failure detectors have been introduced, e.g., P, S, Ω, etc. Failure detector classes can be compared by reduction algorithms, so for any given problem P, a natural question is "What is the weakest failure detector (class) that can solve P?". This question has been extensively studied for several problems in systems with infinite process memory (e.g., uniform and non-uniform versions of consensus [5, 13], non-blocking atomic commit [9], uniform reliable broadcast [1, 19], implementing an atomic register in a message-passing system [9], mutual exclusion [10], boosting obstruction-freedom [16], set consensus [21, 22], etc.). This question, however, has not been as extensively studied in the context of systems with finite process memory.

∗ This work has been supported in part by the ANR project SHAMAN.

In this paper, we consider systems where processes have finite memory, processes can crash and links can lose messages (more precisely, links are fair lossy and FIFO¹).
Such environments can be found in many systems; for example, in sensor networks, sensors are typically equipped with small memories, they can crash when their batteries run out, and they can experience message losses if they use wireless communication.

In such systems, we consider (the uniform versions of) reliable broadcast, consensus and repeated consensus. Our contribution is threefold: First, we establish that the weakest failure detector for reliable broadcast is P⁻, a failure detector that is almost as powerful as the perfect failure detector P.² Next, we show that consensus can be solved using failure detector S. Finally, we prove that P⁻ is the weakest failure detector for repeated consensus. Since S is strictly weaker than P⁻, in some precise sense these results imply that, in the systems that we consider here, consensus is easier to solve than reliable broadcast, and reliable broadcast is as difficult to solve as repeated consensus.

The above results are somewhat surprising because, when processes have infinite memory, reliable broadcast is easier to solve than consensus³, and repeated consensus is not more difficult to solve than consensus.

Roadmap. The rest of the paper is organized as follows: In the next section, we present the model considered in this paper. In Section ??, we show that in case of process memory limitation and possibility of crashes, P⁻ is necessary and sufficient to solve reliable broadcast. In Section ??, we show that consensus can be solved using a failure detector of type S in our systems. In Section ??, we show that P⁻ is necessary and sufficient to solve repeated consensus in this context. For space considerations, all the proofs are relegated to an optional appendix.

2 Informal Model

Graphs. We begin with some definitions and notations concerning graphs.
For a directed graph G = ⟨N, E⟩, Node(G) and Edge(G) denote N and E, respectively. Given a graph G and a set M ⊆ Node(G), G[M] is the subgraph of G induced by M, i.e., G[M] is the graph ⟨M, Edge(G)[M]⟩ where (p, q) ∈ Edge(G)[M] if and only if p, q ∈ M and (p, q) ∈ Edge(G).

The tuple (X, Y) is a directed cut (dicut for short) of G if and only if X and Y define a partition of Node(G) and there is no directed edge (y, x) ∈ Edge(G) such that x ∈ X and y ∈ Y. We say that G′ is a dicut reduction from G if there exists a dicut (X, Y) of G such that G′ = G[X]. A set S of graphs is dicut-closed if and only if it is closed under dicut reduction, namely if G ∈ S then all the graphs obtained by a dicut reduction of G are in S.

Processes and Links. We consider distributed systems composed of n processes which communicate by message-passing through directed links. We denote the set of processes by Π = {p_1, ..., p_n}. We assume that the communication graph is complete, i.e., for each pair of distinct processes (p, q), there is a directed link from p to q.

¹ The FIFO assumption is necessary because, from the results in [20], if lossy links are not FIFO, reliable broadcast requires unbounded message headers.
² Note that P ⊆ P⁻ and P⁻ is unrealistic according to the definition in [8].
³ With infinite memory and fair lossy links, (uniform) reliable broadcast can be solved using Θ [4], and Θ is strictly weaker than (Σ, Ω) which is necessary to solve consensus.

A process may fail by crashing, in which case it definitively stops its local algorithm. A process that never crashes is said to be correct, faulty otherwise.

The (directed) links are reliable, i.e.,
every message sent through a link (p, q) is eventually received by q if q is correct, and if a message m from p is received by q, m is received by q at most once, and only if p previously sent m to q.

The links being reliable, an implementation of reliable broadcast [18] is possible. A reliable broadcast is defined with two primitives: rbroadcast⟨m⟩ and rdeliver⟨m⟩. Informally, after a correct process p invokes rbroadcast⟨m⟩, all correct processes eventually rdeliver⟨m⟩; after a faulty process p invokes rbroadcast⟨m⟩, either all correct processes eventually rdeliver⟨m⟩ or correct processes never rdeliver⟨m⟩.

Timeliness. To simplify the presentation, we assume the existence of a discrete global clock. This is merely a fictional device: the processes do not have access to it. We take the range T of the clock's ticks to be the set of natural numbers.

We assume that every correct process p is timely, i.e., there is a lower and an upper bound on the execution rate of p. Correct processes also have clocks that are not necessarily synchronized, but we assume that they can accurately measure intervals of time.

A link (p, q) is timely if there is an unknown bound δ such that no message sent by p to q at time t may be received by q after time t + δ.

A timeliness graph is simply a directed graph whose set of nodes is a subset of Π. The timeliness graph represents the timeliness properties of the links. Intuitively, for timeliness graph G, Node(G) is the set of correct processes and (p, q) is in Edge(G) if and only if the link (p, q) is timely.

Runs. An algorithm A consists of n deterministic (infinite) automata, one for each process; the automaton for process p is denoted A(p). The execution of an algorithm A proceeds as a sequence of process steps. Each process performs its steps atomically.
During a step, a process may send and/or receive some messages and changes its state.

A run r of algorithm A is a tuple r = ⟨T, I, E, S⟩ where T is a timeliness graph, I is the initial state of the processes in Π, E is an infinite sequence of steps of A, and S is a list of increasing time values indicating when each step in E occurred. A run must satisfy the usual properties concerning sending and receiving messages. Moreover, we assume that (1) all correct processes make an infinite number of steps: p ∈ Node(T) if and only if p makes an infinite number of steps in E, and (2) the timeliness of links is deduced from the timeliness graph: (p, q) ∈ Edge(T) if and only if the link (p, q) is timely in E.

In the following, for run r = ⟨T, I, E, S⟩, T(r) denotes T, the timeliness graph of r, and Correct(r) is the set of correct processes for the run r, namely, Correct(r) = Node(T(r)). Note that by definition, (p, q) is a timely link if and only if (p, q) ∈ Edge(T).

Remark that in the definition given here a link may be timely even if no message is sent on the link. If link (p, q) is FIFO (i.e., messages from p to q are received in the order they are sent) and p regularly sends messages to q, then the timeliness of these messages implies the timeliness of the link itself. So in the following we always assume that links are FIFO.

2.1 Some Systems

We say that timeliness graph G is compatible with timeliness graph G′ if and only if (1) Node(G) = Node(G′) and (2) Edge(G) ⊆ Edge(G′). By extension, timeliness graph G is compatible with run r if G is compatible with T(r), the timeliness graph of r. Hence, timeliness graph G is compatible with run r if Node(G) is the set of correct processes in r and if (p, q) is an edge of G then (p, q) is timely in r.

A system X is defined as a set of timeliness graphs.
The set of runs of system X, denoted R(X), is the set of all runs r such that there exists a timeliness graph G in X compatible with r. Below, we define the systems considered in this paper:

• ASYNC is the set of all timeliness graphs G such that Edge(G) = ∅. In ASYNC there is no timeliness assumption about links and R(ASYNC) is the set of all runs in an asynchronous system.
• COMPLETE is the set of all complete graphs whose node sets are subsets of Π.
• STAR is the set of all timeliness graphs with a source, i.e., G ∈ STAR if and only if Node(G) ⊆ Π and there exists p_0 ∈ Node(G) (the center of the star or the source) such that Edge(G) = {(p_0, q) | q ∈ Node(G) \ {p_0}}. Clearly a run r is in R(STAR) if and only if there is at least one source in r.
• TREE is the set of all timeliness graphs G that are rooted directed trees, i.e., |Edge(G)| = |Node(G)| − 1 and there exists p_0 in Node(G) such that ∀q ∈ Node(G), there is a directed path of G from p_0 to q. Clearly a run r is in R(TREE) if and only if there is at least one timely path from a correct process to all correct processes.
• RING is the set of all timeliness graphs G such that G is a directed cycle (a ring). Clearly a run r is in R(RING) if and only if there is a timely (directed) cycle over all correct processes.
• SC is the set of all timeliness graphs that are strongly connected. Clearly, a run r is in R(SC) if and only if there exists a (directed) timely path between each pair of distinct correct processes.
• BIC is the set of all timeliness graphs G such that for all p, q ∈ Node(G), there exist at least two distinct paths from p to q. BIC corresponds to the set of 2-strongly-connected graphs. Clearly, a run r is in R(BIC) if and only if there exist at least two distinct timely paths between each pair of distinct correct processes.
• PAIR is the set of all timeliness graphs G such that Edge(G) = {(p_0, p_1), (p_1, p_0)} with p_0, p_1 ∈ Node(G) and p_1 ≠ p_0. Clearly, a run r is in R(PAIR) if and only if there exist two distinct correct processes p_0 and p_1 such that (p_0, p_1) and (p_1, p_0) are timely links.

3 Extraction Algorithms

Given a system X, the goal of an extraction algorithm is to ensure that in each run r in X, all correct processes eventually agree on the same element of X and that this element is, in some precise sense, an approximation of the timeliness graph of run r. For example, in RING, all processes have to eventually agree on some ring and this ring has to be compatible with the timeliness graph of the run. In particular this ring contains all the correct processes.

However, the compatibility relation may be too strong: in many systems, it is not possible to distinguish between a crashed process and a correct one, so the graph G on which the processes eventually agree may contain crashed processes and then the graph is not exactly compatible with the run. Then we weaken the compatibility and impose only that the subgraph of G induced by the set of correct processes of the run is a dicut reduction of the timeliness graph of the run.

We now formally define what an extraction algorithm is. First, in such an algorithm, every process p maintains a local variable G_p which contains a timeliness graph.
Then, we say that an algorithm extracts a timeliness graph in X if and only if for every run r in X there is a timeliness graph G (called the extracted graph) such that:

• Convergence: for all correct processes p there is a time t after which G_p = G
• Compatibility: G[Correct(r)] is compatible with T(r)
• Closure: G[Correct(r)] is a dicut reduction of G or is equal to G
• Validity: G is in X

Remark that for all systems that contain ASYNC there is a trivial extraction algorithm: for each run, processes extract the graph G such that Node(G) = Π and Edge(G) = ∅.

A more constrained version of the extraction problem is the following: an algorithm A extracts exactly timeliness graphs in X if for every run r in system X, the extracted graph G is compatible with T(r). In this case, all correct processes eventually know the exact set of correct processes: it is the set of nodes of the extracted graph.

Some Results about Extraction Algorithms. First we show that an extraction algorithm may help to route messages using only timely links:

Lemma 3.1 Let G be a graph extracted from run r. If (p, q) is in Edge(G) and q is a correct process then p is correct.

Proof. By contradiction, assume that p is not correct. Then (Correct(r), Node(G) − Correct(r)) is not a dicut because (p, q) ∈ Edge(G), p ∈ Node(G) − Correct(r) and q ∈ Correct(r), which contradicts the Closure property. □

From this lemma and the Compatibility property, we deduce directly:

Proposition 3.2 If (p = p_0, ..., p_i, ..., q = p_m) is a path in the extracted graph and p and q are correct processes, then for every i such that 0 ≤ i < m the link (p_i, p_{i+1}) is timely and process p_i is correct.
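Lemma 3.1 and Proposition 3.2 both rest on the Closure property, so it may help to see that property checked on a concrete graph. The following is a minimal Python sketch, not part of the paper: graphs are represented as a set of nodes plus a set of directed edges, the function names `is_dicut` and `satisfies_closure` are hypothetical, and we assume that both parts of a dicut are non-empty.

```python
def is_dicut(nodes, edges, x, y):
    """(X, Y) is a dicut of G: X and Y partition Node(G) and no edge goes from Y to X."""
    x, y = set(x), set(y)
    # Assumption: both parts of the partition are non-empty.
    if not x or not y or x & y or (x | y) != set(nodes):
        return False
    return not any(src in y and dst in x for (src, dst) in edges)


def satisfies_closure(nodes, edges, correct):
    """Closure: G[Correct(r)] is a dicut reduction of G or is equal to G."""
    correct = set(correct)
    if not correct <= set(nodes):
        return False
    faulty = set(nodes) - correct
    if not faulty:
        return True  # G[Correct(r)] is G itself
    return is_dicut(nodes, edges, correct, faulty)


# A star centred at process 1 over processes {1, 2, 3}.
star_nodes = {1, 2, 3}
star_edges = {(1, 2), (1, 3)}

# A crashed leaf is tolerated; a crashed center leaves the edge (1, 2)
# going from a faulty process to a correct one, as excluded by Lemma 3.1.
leaf_crashed = satisfies_closure(star_nodes, star_edges, correct={1, 2})
center_crashed = satisfies_closure(star_nodes, star_edges, correct={2, 3})
```

In this sketch `leaf_crashed` holds and `center_crashed` does not, which matches the statement that every edge of an extracted graph ending at a correct process starts at a correct process.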
From a practical point of view, this proposition shows that the extracted graph may be used to route messages between processes using only timely links: the route from p to q is a path in the extracted graph (if any). All intermediate nodes are correct processes and agree on the extracted graph and then on the path. For example with TREE, the tree extracted by the algorithm makes it possible to route messages from the root of the tree to any other process, and the routing uses only timely links.

Generally, the main goal of the extraction algorithm is not only to extract a graph G in X but also to ensure that G[Correct(r)] is in X (even if the processes do not know the set of correct processes). In particular, this property is ensured if X is dicut-closed: the Closure property implies that G[Correct(r)] is in X.

Among the systems we consider, only system PAIR is not dicut-closed: H = ⟨{x}, ∅⟩ is a dicut reduction of G = ⟨{x, y, z}, {(y, z), (z, y)}⟩ but is not in PAIR. It is easy to verify that every other previously introduced system is dicut-closed. For these systems we obtain:

Proposition 3.3 Consider any extraction algorithm for the system X.
• If X = STAR, then the center of the extracted star is a correct process.
• If X = TREE, then the root of the extracted tree is a correct process.
• If X ∈ {SC, COMPLETE, RING, BIC}, then the extraction is exact.

Proof. For STAR and TREE, all the dicut reductions of the extracted graph contain at least, respectively, the center and the root; hence the restriction of the extracted graph to the correct processes contains at least these nodes, proving that they are correct processes. There is no dicut for a strongly connected graph.
Hence in SC, there is no dicut reduction, so by the Closure property the subgraph of the extracted graph induced by the set of correct processes is the extracted graph itself. COMPLETE, RING, and BIC are particular cases of systems only composed of strongly connected timeliness graphs. □

An immediate consequence of Proposition 3.3 is that any extraction algorithm gives an implementation of eventual leader election (failure detector Ω) for systems STAR and TREE as well as an implementation of failure detector ♦P for systems COMPLETE, RING, SC and BIC.

Due to the lack of space, the proofs of the two following propositions have been moved to the appendix. In the first proposition we show that extraction is not always possible. Actually, in the proof we exhibit a non dicut-closed system, namely PAIR, where no extraction algorithm can be implemented.

Proposition 3.4 There exist some systems X for which there is no extraction algorithm.

In the next section we show that for all dicut-closed systems there is an extraction algorithm. For systems like STAR, TREE and PAIR, there exists no exact extraction algorithm.

Proposition 3.5 There exist some systems X for which there is an extraction algorithm and there is no exact extraction algorithm.

4 An Extraction Algorithm

The aim of this section is to show that the dicut-closed property of a system is sufficient to solve the extraction problem. To that end, we propose in Figure 1 an extraction algorithm, called A(X), for dicut-closed systems X.

The basic idea of Algorithm A(X) is to make processes select a graph that is compatible with the timeliness graph of the run. For this, each process maintains for each graph x in X an accusation counter Acc[x]. This counter grows without bound if some correct process is not in x or if some directed edge of x is not timely.
Then, Acc[x] is bounded if and only if x contains all correct processes and all edges of x between pairs of correct processes are timely links.

We implement accusation counters as follows. A process regularly blames all the graphs in X in which it is not a node: it increments the accusation counters of all these graphs. Note that if the process is correct this accusation is justified, and if the process is not correct, after some time the process, being dead, stops incrementing the accusation counters.

Moreover, each process regularly sends alive messages on its outgoing links. Each process maintains an estimate of the communication delay for each incoming link (Δ[q] for the incoming link (q, p)). If it does not receive alive messages within these estimates on some incoming link, it blames all timeliness graphs in X containing this link (i.e., increments the accusation counters for these graphs). As the estimate of the communication delay may be too short, each time it is exceeded the process increases it for the link. In this way, if the link is timely, at some time the estimate will be greater than the bound on the communication delay.

The accusation counters are broadcast by reliable broadcasts. Each time a process receives a new value of an accusation counter it updates its own accusation counter to the maximum of the received value and its current value. Hence, if some timely graph stops being blamed then all correct processes eventually agree on the value of its accusation counter.

By selecting the graph G with the lowest accusation value (to break ties, we assume a total order among the graphs of X), if any, correct processes eventually agree on the same timeliness graph of X; moreover, we can prove that (1) this graph contains all the correct processes, and (2) all its edges between correct processes are timely links.
As a consequence, the Convergence, the Compatibility and the Validity properties of the extraction algorithm are ensured. Nevertheless, this graph can also contain faulty processes and edges between correct and faulty processes.

Consider now the Closure property. If G contains only correct processes then the Closure property is trivially satisfied. Otherwise, G contains Correct(r) and a set F of faulty processes. In this case, (Correct(r), F) is a dicut of G: indeed, if there is an edge in G from a faulty process q to a correct process p, eventually the process p stops receiving messages from q and the accusation counter of G is incremented infinitely often. Hence, in all cases, the Closure property is satisfied.

Hence, if X is dicut-closed, Algorithm A(X) extracts a graph in X. Moreover, from Proposition 3.3, if all the graphs of X are strongly connected then the algorithm exactly extracts a graph in X.

In the algorithm, each process p uses local timers, one per process. The timer of p dedicated to q is set (by setting settimer(q) to a positive value) to a time interval rather than an absolute time. The timer is decremented until it expires. When the timer expires, timerexpire(q) becomes true. Note that a timer can be restarted before it expires.

In the algorithm, we denote by ≺ the total order relation on X and by ≺lex (see Line 2) the total order relation defined as follows: ∀x, y ∈ X, ∀c_x, c_y ∈ ℕ, (c_x, x) ≺lex (c_y, y) ≡ [c_x < c_y ∨ (c_x = c_y ∧ x ≺ y)].
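The order ≺lex defined above is an ordinary lexicographic comparison of pairs, which can be illustrated in a few lines. The following Python sketch is only an illustration under stated assumptions: the graph names, the dictionary `rank` (standing for the total order ≺ on X) and the function `select_graph` are hypothetical and do not come from the paper.

```python
def select_graph(acc, rank):
    """Return the graph x whose pair (Acc[x], x) is minimal for the lexicographic order."""
    # Python compares tuples lexicographically: first the accusation counter,
    # then the position of the graph in the assumed total order.
    return min(acc, key=lambda x: (acc[x], rank[x]))


# Three hypothetical graphs of X with their current accusation counters.
acc = {"ring_a": 3, "ring_b": 1, "ring_c": 1}
# Hypothetical total order on X: ring_a before ring_b before ring_c.
rank = {"ring_a": 0, "ring_b": 1, "ring_c": 2}

chosen = select_graph(acc, rank)
```

Here `ring_b` and `ring_c` tie on the counter, and the tie is broken by the total order, so `chosen` is `ring_b`. Any deterministic total order known to all processes would serve the same purpose.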
Code for each process p

 1: Procedure updateExtractedGraph()
 2:   G ← x such that (Acc[x], x) = min≺lex {(Acc[x′], x′) such that x′ ∈ X}

 3: On initialization:
 4:   for all x ∈ X do Acc[x] ← 0
 5:   for all q ∈ Π \ {p} do
 6:     Δ[q] ← 1
 7:     settimer(q) ← Δ[q]
 8:   updateExtractedGraph()
 9:   start tasks 1 and 2

10: task 1:
11:   loop forever
12:     send ⟨alive⟩ to every q ∈ Π \ {p} every K time units
13:     rbroadcast ⟨ACC, ⊥, p⟩ every K time units   /* to accuse graphs that do not contain p */

14: task 2:
15:   upon receive ⟨alive⟩ from q do
16:     settimer(q) ← Δ[q]
17:   upon timerexpire(q) do
18:     rbroadcast ⟨ACC, q, p⟩   /* to accuse graphs that contain the link (q, p) */
19:     Δ[q] ← Δ[q] + 1
20:     settimer(q) ← Δ[q]
21:   upon rdeliver ⟨ACC, q, h⟩ do   /* information from h */
22:     for all x ∈ X do
23:       if q = ⊥ then
24:         if h ∉ Node(x) then Acc[x] ← Acc[x] + 1
25:       else
26:         if (q, h) ∈ Edge(x) then Acc[x] ← Acc[x] + 1
27:     updateExtractedGraph()

Figure 1: Algorithm A(X) extracts a graph in X

A sketch of the correctness proof of A(X) is given below. In this sketch, we consider a run r of A(X) in dicut-closed system X. We will denote by var_p^t the value of var of process p at time t.

We first notice that all variables Acc_p[x] are monotonically increasing:

Lemma 4.1 For all times t and t′ such that t ≥ t′, for all processes p, for all graphs x in X, Acc_p^t[x] ≥ Acc_p^{t′}[x].

Let sup(Acc_p[x]) be the supremum of Acc_p^t[x] for all t; we say that Acc_p[x] is unbounded if sup(Acc_p[x]) is equal to ∞ and bounded otherwise.
As Acc_p[x] is also updated by reliable broadcast each time some process q modifies Acc_q[x], we have:

Lemma 4.2 For all correct processes p and q, for all graphs x in X, sup(Acc_p[x]) = sup(Acc_q[x]).

Let sup(Acc[x]) be the common value of sup(Acc_p[x]) over all correct processes p; by Lemma 4.2, sup(Acc[x]) is well defined. If there is at least one x ∈ X such that sup(Acc[x]) is bounded, then min{sup(Acc[x′]) | x′ ∈ X} is finite, hence G, the graph such that (Acc[G], G) = min≺lex {(Acc[x′], x′) | x′ ∈ X}, is well defined. Then all correct processes converge to the same graph:

Lemma 4.3 If there exists x in X such that sup(Acc[x]) is bounded then there is a time after which for every correct process p, G_p is G.

We now prove the Compatibility property. Consider any timeliness graph x compatible with T(r), and assume that x ∈ X. Then there is a time t_0 after which all faulty processes are dead and the estimates of communication delays are greater than the bounds on communication delays of the timely links of the run. After time t_0, (1) as x contains all correct processes, no process will blame x because it is not a node of x, and (2) as all edges of x are timely, no process will blame x for one of its edges. Then:

Lemma 4.4 If x in X is compatible with T(r), then sup(Acc[x]) is bounded.

Reciprocally, let x be a timeliness graph of X that is not compatible with the run. If process p is not correct, there is a time t after which it does not send any alive message, and there is a time after which the timers on p expire forever for all correct processes; then if p is a node of some x ∈ X, Acc_p[x] is incremented infinitely often and sup(Acc[x]) = ∞.
In the same way, if (p, q) is not timely, by the FIFO property of the link, the timer for p expires infinitely often for process q, and if (p, q) is an edge of x then Acc_q[x] is incremented infinitely often and sup(Acc[x]) = ∞. Then:

Lemma 4.5 For every x in X, if sup(Acc[x]) is bounded then x[Correct(r)] is compatible with T(r).

Hence:

Lemma 4.6 (Compatibility) G[Correct(r)] is compatible with T(r).

It remains to prove that G satisfies the Closure property: G[Correct(r)] is a dicut reduction of G or is equal to G. As G[Correct(r)] is compatible with T(r), we have:

Lemma 4.7 Correct(r) ⊆ Node(G).

Let F = Node(G) − Correct(r). If F is empty the Closure property is trivially ensured. Consider now the case where F is not empty. F contains only faulty processes and (Correct(r), F) is a partition of Node(G). If there is an edge in Edge(G) from a faulty process q to a correct process p, eventually the process p never receives a message from q and the accusation counter of G will be unbounded, contradicting the choice of G. So, we have:

Lemma 4.8 If F ≠ ∅ then Edge(G) ∩ (F × Correct(r)) = ∅.

Hence, (Correct(r), F) is a dicut of G.

Lemma 4.3 and Lemma 4.4 prove the Convergence property, Lemma 4.6 proves the Compatibility property and Lemma 4.8 proves the Closure property. Moreover, G is clearly in X, proving Validity. Proposition 3.3 shows that the extraction is exact when all graphs of X are strongly connected. Hence, we can conclude with the following theorem:

Theorem 4.9 Let X be a dicut-closed system. Algorithm A(X) extracts a graph in X. Moreover, if all graphs of X are strongly connected, Algorithm A(X) exactly extracts a graph in X.
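To make the role of the accusation counters in A(X) more concrete, here is a small single-process Python sketch of the rdeliver handler of Figure 1 (Lines 21 to 27) together with the selection of Line 2. It is a simplification under stated assumptions, not the authors' implementation: timers, alive messages and the network are left out, accusations are fed in by hand, `None` stands for ⊥, graphs are identified by integers whose natural order plays the role of ≺, and the names `Extractor` and `star` are hypothetical.

```python
def star(center, processes):
    """Star over all processes with the given center (an element of STAR)."""
    nodes = frozenset(processes)
    edges = frozenset((center, q) for q in processes if q != center)
    return nodes, edges


class Extractor:
    """Local state of one process: accusation counters and the extracted graph."""

    def __init__(self, graphs):
        self.graphs = graphs                    # graph id -> (nodes, edges)
        self.acc = {x: 0 for x in graphs}       # Acc[x] <- 0
        self.extracted = self._select()

    def _select(self):
        # Line 2: minimum of (Acc[x], x); integer ids give the tie-breaking order.
        return min(self.acc, key=lambda x: (self.acc[x], x))

    def on_rdeliver_acc(self, q, h):
        """Handle <ACC, q, h>; q is None when h accuses graphs it is not a node of."""
        for x, (nodes, edges) in self.graphs.items():
            if q is None:
                if h not in nodes:
                    self.acc[x] += 1
            elif (q, h) in edges:
                self.acc[x] += 1
        self.extracted = self._select()


processes = [1, 2, 3]
proc = Extractor({c: star(c, processes) for c in processes})
first_choice = proc.extracted

proc.on_rdeliver_acc(1, 2)      # process 2 timed out on link (1, 2)
proc.on_rdeliver_acc(2, 1)      # process 1 timed out on link (2, 1)
proc.on_rdeliver_acc(None, 3)   # process 3 accuses graphs that do not contain it
```

In this run the process first selects the star centered at 1, then moves on as stars 1 and 2 are accused, and ends with the star centered at 3, whose counter stayed at 0. In the actual algorithm convergence follows from the counters of compatible graphs being bounded (Lemma 4.4), which this hand-fed scenario only imitates.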
5 An Efficient Extraction Algorithm

In this section, we propose another extraction algorithm called AF(X) (Figures 2 and 3). This algorithm is efficient, meaning that the (correct) processes eventually only send messages along the edges of the extracted graph. AF(X) (exactly) extracts a timeliness graph from system X, where (1) X is dicut-closed and (2) for all graphs g ∈ X there is some process p, called root, such that there is a directed path from p to every node of g. For example, TREE and RING systems have this property. In the following, we refer to these systems as dicut-closed systems with a root. For every graph g in X, the function root(g) returns a root of g.

In the algorithm, every process p stores several values concerning the graphs x ∈ X such that root(x) = p: (1) Acc[x] is the accusation counter of x, whose goal is the same as in Algorithm 1, (2) Prop[x] is a proposition counter whose goal will be explained later, and (3) Δ[x] gives the expected time for a message to go from p (the root of x) to all the nodes of x.

Every process also maintains a set variable Candidates. Each element of this set is a 4-tuple composed of a graph x of X and the newest values of Acc[x], Prop[x], and Δ[x] known by the process (the exact values are maintained at root(x)). Each element in this set is called a candidate and each process selects its extracted graph among the graphs in the candidate elements.

As in Algorithm 1:

(1) Each process p sends alive messages on its outgoing links and monitors its incoming links. However, we restrict here the sending of alive messages: process p sends alive messages on its outgoing link (p, q) only if (p, q) is in a graph candidate.
(2) A graph candidate is blamed if (a) a correct process is not in the graph or (b) a process receives an out-of-date message through one of its incoming links. In both cases the candidate is definitively removed from the Candidates sets of all processes. To achieve this goal the process sends an accusation message (ACC) using a reliable broadcast and uses an array Heard that ensures that an identical candidate (that is, the same graph with the same accusation and proposition values) can never be added again. Moreover, upon delivery of an accusation message for graph x, root(x) increments Acc[x].

We now present the different mechanisms used to obtain efficiency. For all graphs x ∈ X, only the process root(x) is allowed to propose x as a candidate to the rest. Each process p stores its best candidate in its variable me, that is, the least blamed graph x such that root(x) = p.

• If a process finds in Candidates a better candidate than me, it removes me from Candidates.
• If a process finds that me is better, it adds me to Candidates and sends a new message containing me (1) to all processes that are not in Node(me), and (2) to the immediate successors of p in me. The immediate successors in me add me to their Candidates set and relay the new message, and so on. By the reliability of the links, every correct process that is not in me eventually receives this message and blames me.

These mechanisms are achieved by the procedure updateExtractedGraph(). This procedure is called each time a graph candidate is blamed or a new candidate is proposed. Note that the Candidates set is maintained with the set OtherCand (the candidates of other processes), a boolean Local that is true when the process has a candidate, and me, the graph candidate.
A process p may give up a candidate without this candidate being blamed: in this case, p is the root of the candidate, it finds a better candidate in OtherCand, and it removes me from Candidates. Then, p must not increment Acc[me] when it receives accusations caused by this removal; indeed, these accusations are not due to delayed messages. That is the goal of the proposition counter (Prop): in Prop[x], root(x) counts the number of times it proposes x as a candidate and includes this value in each of its new messages (to inform the other processes of the current value of the counter). Hence, when q wants to blame x, it now includes its own view of Prop[x] in the accusation message. This accusation will be considered legitimate by root(x) (that is, it will cause an increment of Acc[x]) only when the proposition counter inside the message matches Prop[x]. Also, whenever root(x) removes x from Candidates, root(x) increments Prop[x] and does not send the new value to the other processes. In this way, accusations due to this removal will be ignored.

For any timely candidate, the accusation counter will be bounded and its proposition counter is increased each time it is proposed. In this way, the graph with the smallest accusation and proposition values eventually remains forever in the Candidates set of all correct processes and it is chosen as the extracted graph. (This is done in the procedure updateExtractedGraph().) Moreover, eventually all other candidates are given up and only this graph remains in Candidates. In this way, only alive messages are sent, and they are sent along the directed edges of the extracted graph, which ensures efficiency.
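The effect of the proposition counter can be illustrated with a small Python sketch. It is a simplified model under stated assumptions, not the algorithm itself: the class RootState and its method names are hypothetical, messages are modelled as plain (acc, prop) pairs, and, following the pseudocode of Figure 2, the counter is incremented when the root gives up its candidate.

```python
class RootState:
    """Hypothetical state kept by root(x) for one graph x that it owns."""

    def __init__(self):
        self.acc = 0    # accusation counter Acc[x]
        self.prop = 0   # proposition counter Prop[x]

    def propose(self):
        """Return the (acc, prop) stamp carried by the <new> message."""
        return (self.acc, self.prop)

    def give_up(self):
        """Withdraw x voluntarily; the new Prop value is not sent to anyone."""
        self.prop += 1

    def on_accusation(self, acc, prop):
        """Count the accusation only if its stamp matches the current counters."""
        legitimate = (acc, prop) == (self.acc, self.prop)
        if legitimate:
            self.acc += 1
        return legitimate


root = RootState()
stamp = root.propose()       # other processes learn this stamp
root.give_up()               # a better candidate appeared; x is withdrawn
root.on_accusation(*stamp)   # stale accusation: ignored, Acc[x] is unchanged
stamp = root.propose()       # x is proposed again with the new Prop value
root.on_accusation(*stamp)   # up-to-date accusation: Acc[x] is incremented
```

Accusations triggered by the voluntary withdrawal carry the old proposition value, so they fail the equality test and do not inflate the accusation counter of a graph whose links are in fact timely.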
Code for each process p

Procedure updateExtractedGraph()
    Let (a_min, min) = min_≺lex ({(acc, c) such that (c, acc, −, −) ∈ OtherCand} ∪ {(∞, ∞)})
    if (a_min, min) < (Acc[me], me) ∧ Local then        /* Give up me */
        rbroadcast ⟨ACC, me, Acc[me], Prop[me], Δ[me]⟩
        Prop[me] ← Prop[me] + 1
        Local ← false
        Candidates ← OtherCand
    me ← x such that (a, x) = min_≺lex {(Acc[c], c) such that c ∈ X ∧ root(c) = p}
    if (Acc[me], me) < (a_min, min) ∧ Local = false then        /* Propose me */
        Local ← true
        Candidates ← Candidates ∪ {(me, Acc[me], Prop[me], Δ[me])}
        send ⟨new, me, Acc[me], Prop[me], Δ[me]⟩ to every process not in Node(me)
        for all h ∈ Π \ {p} do
            if (h, p) ∈ Edge(me) then
                Δ[h] ← max(Δ[h], Δ[me])
                settimer(h) ← Δ[h]
            if (p, h) ∈ Edge(me) and h ≠ root(me) then
                send ⟨new, me, Acc[me], Prop[me], Δ[me]⟩ to h
    G ← x such that (a, x) = min_≺lex {(a′, x′) such that (x′, a′, p′, d′) ∈ Candidates}

Figure 2: Procedure updateExtractedGraph of Algorithm AF(X)

A sketch of the correctness proof of AF(X) is given in the appendix. We can then conclude with the following theorem:

Theorem 5.1 Let X be a dicut-closed system with a root. Algorithm AF(X) efficiently extracts a graph in X.
Moreover, if all graphs of X are strongly connected, Algorithm AF(X) efficiently and exactly extracts a graph in X.

Code for each process p

On initialization:
    for all x ∈ X such that root(x) = p do
        Acc[x] ← 0; Prop[x] ← 0; Δ[x] ← n
    for all x ∈ X such that root(x) ≠ p do Heard[x] ← (−1, −1)
    for all q ∈ Π \ {p} do Δ[q] ← 1
    OtherCand ← ∅
    Local ← false
    me ← min {x such that x ∈ X ∧ root(x) = p}
    updateExtractedGraph()
    start tasks 1 and 2

task 1:
    loop forever
        send ⟨alive⟩ to every process q such that ∃(x, −, −, −) ∈ Candidates and (p, q) ∈ Edge(x), every K time units

task 2:
    upon receive ⟨alive⟩ from q do
        settimer(q) ← Δ[q]

    upon timerexpire(q) do        /* Link (q, p) is not timely: blame all candidates that contain (q, p) */
        for all (x, a, pr, d) ∈ OtherCand such that (q, p) ∈ Edge(x) do
            rbroadcast ⟨ACC, x, a, pr, d⟩
        if (q, p) ∈ Edge(me) then
            rbroadcast ⟨ACC, me, Acc[me], Prop[me], Δ[me]⟩

    upon receive ⟨new, x, a, pr, d⟩ from q do        /* Proposition of a new candidate */
        if p ∉ Node(x) then        /* Blame x, which does not contain p */
            rbroadcast ⟨ACC, x, a, pr, d⟩
        else
            newCand ← false
            if (x, −, −, −) ∉ OtherCand and Heard[x] < (a, pr) then        /* New candidate */
                newCand ← true
            if ∃(x, a_c, pr_c, d_c) ∈ OtherCand with (a_c, pr_c) < (a, pr) then        /* New candidate */
                OtherCand ← OtherCand \ {(x, a_c, pr_c, d_c)}
                newCand ← true
            if newCand then
                OtherCand ← OtherCand ∪ {(x, a, pr, d)}
                updateExtractedGraph()
                Heard[x] ← (a, pr)
                for all h ∈ Π \ {p} do
                    if (h, p) ∈ Edge(x) then
                        Δ[h] ← max(Δ[h], d)
                        settimer(h) ← Δ[h]
                    if (p, h) ∈ Edge(x) and h ≠ root(x) then send ⟨new, x, a, pr, d⟩ to h

    upon rdeliver ⟨ACC, x, a, pr, d⟩ do
        if root(x) = p then
            if x = me ∧ a = Acc[me] ∧ pr = Prop[me] then        /* Check if the accusation is up to date */
                Acc[me] ← Acc[me] + 1; Δ[me] ← Δ[me] + 1
                Local ← false
        else
            OtherCand ← OtherCand \ {(x, a, pr, d)}
            if Heard[x] < (a, pr) then Heard[x] ← (a, pr)
        updateExtractedGraph()

Figure 3: Algorithm AF(X) that efficiently extracts a graph in X

6 Conclusion

Failure detector implementations in partially synchronous models generally use the timeliness properties of the system to approximate the set of correct (or faulty) processes. In some way, the extraction problem is a kind of generalization: instead of only searching for the set of correct processes, here we also try to extract information about the timeliness of links. Besides, our solutions are based on already existing mechanisms used in failure detector implementations, as in [2, 3].

Information about the timeliness of links is useful for the efficiency of fault-tolerant algorithms. In particular, in any extracted graph, any path between a pair of correct processes consists only of timely links. This property is particularly interesting for obtaining efficient routing algorithms.

We gave an extraction algorithm for dicut-closed sets of timeliness graphs. Moreover, we proved that the extraction is exact when all the timeliness graphs are also strongly connected. Given dicut-closed timeliness graphs that contain a root, we showed how to efficiently extract a graph from them. By efficiency we mean giving a solution where eventually messages are only sent over the links of the extracted graph.

It is important to note that the main purpose of the algorithms we proposed is to show the feasibility of the extraction under some conditions.
So, the complexity of our algorithms was not the main focus of this paper. As a consequence, our algorithms are somewhat unrealistic because of their high complexity. Giving more practical solutions will be the purpose of our future work.

Acknowledgments

We are grateful to the members of the GRAPH team of the LIAFA Lab for the helpful discussions and their interesting suggestions.

References

[1] Marcos K. Aguilera, Sam Toueg, and Boris Deianov. Revisiting the weakest failure detector for uniform reliable broadcast. In DISC '99: Proceedings of the thirteenth International Symposium on Distributed Computing, pages 13–33, LNCS vol. 1693. Springer-Verlag, September 1999.
[2] Marcos Kawazoe Aguilera, Carole Delporte-Gallet, Hugues Fauconnier, and Sam Toueg. On implementing omega with weak reliability and synchrony assumptions. In PODC, pages 306–314, 2003.
[3] Marcos Kawazoe Aguilera, Carole Delporte-Gallet, Hugues Fauconnier, and Sam Toueg. Communication-efficient leader election and consensus with limited link synchrony. In Soma Chaudhuri and Shay Kutten, editors, PODC, pages 328–337. ACM, 2004.
[4] Rida A. Bazzi and Gil Neiger. Simulating crash failures with many faulty processors (extended abstract). In 6th International Workshop on Distributed Algorithms (WDAG '92), volume 647 of Lecture Notes in Computer Science, pages 166–184. Springer, 1992.
[5] Tushar Deepak Chandra, Vassos Hadzilacos, and Sam Toueg. The weakest failure detector for solving consensus. Journal of the ACM, 43(4):685–722, 1996.
[6] Tushar Deepak Chandra and Sam Toueg. Unreliable failure detectors for reliable distributed systems. Journal of the ACM, 43(2):225–267, 1996.
[7] Benny Chor and Brian A. Coan. A simple and efficient randomized byzantine agreement algorithm. IEEE Trans. Software Eng., 11(6):531–539, 1985.
[8] Carole Delporte-Gallet, Hugues Fauconnier, and Rachid Guerraoui. A realistic look at failure detectors. In DSN, pages 345–353. IEEE Computer Society, 2002.
[9] Carole Delporte-Gallet, Hugues Fauconnier, Rachid Guerraoui, Vassos Hadzilacos, Petr Kouznetsov, and Sam Toueg. The weakest failure detectors to solve certain fundamental problems in distributed computing. In Twenty-Third Annual ACM Symposium on Principles of Distributed Computing (PODC 2004), pages 338–346, 2004.
[10] Carole Delporte-Gallet, Hugues Fauconnier, Rachid Guerraoui, and Petr Kouznetsov. Mutual exclusion in asynchronous systems with failure detectors. Journal of Parallel and Distributed Computing, 65(4):492–505, April 2005.
[11] Danny Dolev, Cynthia Dwork, and Larry J. Stockmeyer. On the minimal synchronism needed for distributed consensus. Journal of the ACM, 34(1):77–97, 1987.
[12] Cynthia Dwork, Nancy A. Lynch, and Larry J. Stockmeyer. Consensus in the presence of partial synchrony. Journal of the ACM, 35(2):288–323, 1988.
[13] Jonathan Eisler, Vassos Hadzilacos, and Sam Toueg. The weakest failure detector to solve nonuniform consensus. Distributed Computing, 19(4):335–359, 2007.
[14] Michael J. Fischer, Nancy A. Lynch, and Mike Paterson. Impossibility of distributed consensus with one faulty process. Journal of the ACM, 32(2):374–382, 1985.
[15] Eli Gafni and Leslie Lamport. Disk paxos. Distributed Computing, 16(1):1–20, 2003.
[16] Rachid Guerraoui, Michal Kapalka, and Petr Kouznetsov. The weakest failure detectors to boost obstruction-freedom. In DISC '06: Proceedings of the twentieth International Symposium on Distributed Computing, pages 399–412, LNCS vol. 4167. Springer-Verlag, September 2006.
[17] Rachid Guerraoui and André Schiper. The generic consensus service. IEEE Transactions on Software Engineering, 27(1):29–41, 2001.
[18] V. Hadzilacos and S. Toueg. A modular approach to fault-tolerant broadcasts and related problems. Technical Report TR 94-1425, Department of Computer Science, Cornell University, 1994.
[19] Joseph Y. Halpern and Aleta Ricciardi. A knowledge-theoretic analysis of uniform distributed coordination and failure detectors. In Eighteenth Annual ACM Symposium on Principles of Distributed Computing (PODC '99), pages 73–82, 1999.
[20] Nancy A. Lynch, Yishay Mansour, and Alan Fekete. Data link layer: Two impossibility results. In Symposium on Principles of Distributed Computing, pages 149–170, 1988.
[21] Michel Raynal and Corentin Travers. In search of the holy grail: Looking for the weakest failure detector for wait-free set agreement. In Alexander A. Shvartsman, editor, OPODIS, volume 4305 of Lecture Notes in Computer Science, pages 3–19. Springer, 2006.
[22] Piotr Zielinski. Anti-omega: the weakest failure detector for set agreement. Technical Report UCAM-CL-TR-694, Computer Laboratory, University of Cambridge, Cambridge, UK, July 2007.

A Appendix

A.1 Proof of Proposition 3.4

Proposition 3.4 There exist some systems X for which there is no extraction algorithm.

Sketch of Proof. Assume there is an extraction algorithm A for PAIR with 5 processes. Consider a run r of A in system PAIR with T(r) = ⟨{p_1, p_2, p_3, p_4, p_5}, {(p_1, p_2), (p_2, p_1), (p_3, p_4), (p_4, p_3)}⟩. To satisfy the properties of the extraction, ⟨{p_1, p_2, p_3, p_4, p_5}, {(p_1, p_2), (p_2, p_1)}⟩ or ⟨{p_1, p_2, p_3, p_4, p_5}, {(p_3, p_4), (p_4, p_3)}⟩ must be extracted from the run r. There is a time t_1 after which r converges, for example, to ⟨{p_1, p_2, p_3, p_4, p_5}, {(p_1, p_2), (p_2, p_1)}⟩.

Consider now run r′ of A in system PAIR with T(r′) = ⟨{p_3, p_4, p_5}, {(p_3, p_4), (p_4, p_3)}⟩ such that r and r′ are indistinguishable until time t_1 and p_1 and p_2 crash in r′ at time t_1 + 1.
There is a time t_2 after which r′ converges to a graph with the directed edges {(p_3, p_4), (p_4, p_3)}. Consider now that in r all messages from p_1 and p_2 to {p_3, p_4, p_5} sent after time t_1 are delayed until after time t_2. For p_5, the runs r and r′ are indistinguishable until t_2. So, at time t_2, p_5 outputs a graph with directed edges {(p_3, p_4), (p_4, p_3)}.

Now consider run r″ of A in system PAIR with T(r″) = ⟨{p_1, p_2, p_5}, {(p_1, p_2), (p_2, p_1)}⟩ such that r and r″ are indistinguishable until time t_2 and p_3 and p_4 crash in r″ at time t_2 + 1. There is a time t_3 after which r″ converges to a graph with the directed edges {(p_1, p_2), (p_2, p_1)}. Consider again that in the run r all messages from p_3 and p_4 to {p_1, p_2, p_5} sent after time t_2 are delayed until after t_3. For p_5, the runs r and r″ are indistinguishable. So, at time t_3, p_5 outputs a graph with directed edges {(p_1, p_2), (p_2, p_1)}.

Inductively, we can construct the run r in such a way that p_5 alternates forever between a graph with directed edges {(p_1, p_2), (p_2, p_1)} and a graph with directed edges {(p_3, p_4), (p_4, p_3)} and never converges definitively. This contradicts the existence of an algorithm that extracts a graph in PAIR. □

A.2 Proof of Proposition 3.5

Proposition 3.5 There exist some systems X for which there is an extraction algorithm and there is no exact extraction algorithm.

Sketch of Proof. Consider the system TREE with 3 processes. We prove in the next section that there is an extraction algorithm for this system. Assume there is an exact extraction algorithm A for this system. Consider a run r of A in this system with T(r) = ⟨{p_1, p_2, p_3}, {(p_1, p_2), (p_1, p_3)}⟩.
To satisfy the properties of the exact extraction, there is a time t_1 after which the graph ⟨{p_1, p_2, p_3}, {(p_1, p_2), (p_1, p_3)}⟩ is extracted.

Consider now run r′ of A in system TREE with T(r′) = ⟨{p_1, p_2}, {(p_1, p_2)}⟩ such that r and r′ are indistinguishable until time t_1 and p_3 crashes in r′ at time t_1 + 1. There is a time t_2 after which r′ converges to ⟨{p_1, p_2}, {(p_1, p_2)}⟩. Consider now that in r all messages from p_3 to {p_1, p_2} sent after time t_1 are delayed until after time t_2. For p_1, the runs r and r′ are indistinguishable until t_2. So, at time t_2, p_1 outputs ⟨{p_1, p_2}, {(p_1, p_2)}⟩.

Inductively, we can construct the run r in such a way that p_1 alternates forever between the graph ⟨{p_1, p_2, p_3}, {(p_1, p_2), (p_1, p_3)}⟩ and the graph ⟨{p_1, p_2}, {(p_1, p_2)}⟩ and never converges definitively. This contradicts the existence of an algorithm that exactly extracts a graph in TREE. □

A.3 Proof of Theorem 5.1

In this section, we propose a sketch of the correctness proof of the efficient extraction algorithm AF(X) (Figures 2 and 3). In this sketch, we consider a run r of AF(X) in a dicut-closed system with a root, X. We denote by var_p^t the value of var_p at time t.

We first notice that all variables Acc[x] and Prop[x] can only be modified by the process root(x) and are non-decreasing:

Lemma A.1 For all times t and t′ with t ≥ t′, for all processes p, and for all graphs x in X such that p = root(x), Acc_p^t[x] ≥ Acc_p^{t′}[x] and Prop_p^t[x] ≥ Prop_p^{t′}[x].

Consider a graph x such that its root p crashes. Eventually, every process q such that x ∈ OtherCand and (p, q) ∈ Edge(x) reliably broadcasts an accusation for x.
This way, x is removed from the OtherCand set of any correct process and is never added again (because p has crashed), hence:

Lemma A.2 If p is faulty, there exists a time t such that for all graphs x of X with root(x) = p, for all correct processes q in r, and for all t′ ≥ t: x ∉ OtherCand_q^{t′}.

As r is a run of X, there exists some timeliness graph o in X such that o is compatible with T(r). In this case, Node(o) = Correct(r) and the process root(o) is a correct process:

Lemma A.3 There exists a timeliness graph o of X such that o is compatible with T(r) and root(o) is a correct process.

Moreover:

Lemma A.4 Let o be a timeliness graph of X such that o[Correct(r)] is compatible with T(r) and root(o) is a correct process: Acc_{root(o)}[o] is bounded.

For all correct processes p, for all graphs x in X with root(x) = p, let A_p[x] be the largest value of Acc_p[x] in r (∞ if Acc_p[x] is unbounded). Let g be the graph with the smallest A_p[g] (break ties by the total order on graphs). Let C be the value of A_p[g]. Note that from Lemma A.3 and Lemma A.4, C < ∞. Moreover, by construction of g, root(g) is a correct process, root(g) eventually elects g forever (me_{root(g)} = g), and as a consequence Prop_{root(g)}[g] becomes constant:

Lemma A.5 There exists a time after which me_{root(g)} = g.

Lemma A.6 There exists a time after which Prop_{root(g)}[g] stops changing.

Let P be the largest value of the proposition counter of g (Prop[g]). The following three lemmas are immediate consequences of Lemma A.5:

Lemma A.7 For every correct process p ≠ root(g), there exists a time after which g ∈ OtherCand_p.

Lemma A.8 There exists a time after which me_{root(g)} = g and Local_{root(g)} = true and OtherCand_{root(g)} = ∅.
Lemma A.9 For every correct process p ≠ root(g), there exists a time after which OtherCand_p = {g} and Local_p = false.

From Lemmas A.8 and A.9, the algorithm converges to a graph of X:

Lemma A.10 There exists a timeliness graph x ∈ X (actually g) such that every correct process q outputs x forever.

From Lemma A.8 and Lemma A.9, we can deduce that the algorithm is efficient:

Lemma A.11 There is a time after which every correct process p sends messages only to the processes q such that there is a directed edge (p, q) in Edge(g).

From Lemma A.10, we deduce the Convergence and the Validity properties. It remains to prove that g satisfies the properties of the approximation: (1) g[Correct(r)] is compatible with T(r), and (2) g[Correct(r)] is a dicut reduction of g or is equal to g.

When root(g) sets Local to true and me to (g, C, P, −), it sends a new message to all processes (recall that C is the final value of the accusation counter of g and P the final value of its proposition counter). As the links are reliable, all correct processes eventually receive this message. If a correct process q is not in Node(g), it reliably broadcasts an accusation message ACC. When process root(g) delivers such a broadcast, it increments the accusation counter of g, contradicting the fact that Acc[g] is bounded by C, hence:

Lemma A.12 Correct(r) ⊆ Node(g).

When a correct process p receives this new message, it sends ⟨alive⟩ to every process q such that (p, q) is in Edge(g), and it monitors all incoming links (q, p) such that (q, p) is in Edge(g). If there is a link (a, b) of Edge(g) between two correct processes a and b, then a regularly sends alive messages to b. By construction of g, b never blames g, so b receives no out-of-date message.
By the FIFO property of the link, the link is timely:

Lemma A.13 g[Correct(r)] is compatible with T(r).

By Lemma A.12, Node(g) = Correct(r) ∪ F. If F is empty, the Closure property is trivially ensured. We now consider the case where F is not empty. F contains only faulty processes. If there is an edge in Edge(g) from a faulty process q to a correct process p, eventually the process p stops receiving messages from q and the accusation counter of g will be incremented, which contradicts the fact that the accusation counter of g remains equal to C forever. So we have:

Lemma A.14 If F ≠ ∅ then Edge(g) ∩ (F × Correct(r)) = ∅.

We showed Convergence (Lemma A.10), Validity (Lemma A.10), Compatibility (Lemma A.13), Closure (Lemma A.14), and Efficiency (Lemma A.11). Moreover, Proposition 3.3 shows the exact extraction when all graphs of X are strongly connected. Hence, we can conclude with the following theorem:

Theorem 5.1 Let X be a dicut-closed system with a root. Algorithm AF(X) efficiently extracts a graph in X. Moreover, if all graphs of X are strongly connected, Algorithm AF(X) efficiently and exactly extracts a graph in X.
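The three structural conditions stated in Lemmas A.12 to A.14 can be checked mechanically on a small instance. The Python sketch below is only an illustration of those statements, not a formal definition of compatibility or dicut reduction: the function name check_extracted_graph, the representation of graphs as sets of nodes and of directed edges, and the process names are assumptions made for the example.

```python
def check_extracted_graph(nodes, edges, correct, timely):
    """Check the conditions of Lemmas A.12 to A.14 on one candidate graph.

    nodes, correct: sets of process names; edges, timely: sets of (src, dst).
    All names are hypothetical; this is not part of the algorithm AF(X).
    """
    faulty = nodes - correct  # plays the role of the set F in the proof
    return {
        # Lemma A.12: every correct process is a node of the graph.
        "covers_correct": correct <= nodes,
        # Lemma A.13: every edge between correct processes is a timely link.
        "correct_edges_timely": all(
            (u, v) in timely
            for (u, v) in edges
            if u in correct and v in correct
        ),
        # Lemma A.14: no edge goes from a faulty process to a correct one.
        "no_faulty_to_correct_edge": not any(
            u in faulty and v in correct for (u, v) in edges
        ),
    }


ring_nodes = {"p1", "p2", "p3"}
ring_edges = {("p1", "p2"), ("p2", "p3"), ("p3", "p1")}

# A ring of correct processes whose links are all timely passes every check.
report = check_extracted_graph(ring_nodes, ring_edges, ring_nodes, ring_edges)

# Adding a faulty process p4 with an edge into p1 violates the third check.
bad = check_extracted_graph(
    ring_nodes | {"p4"}, ring_edges | {("p4", "p1")}, ring_nodes, ring_edges
)
```

The second call corresponds to the situation excluded in the proof of Lemma A.14: a correct process would eventually stop receiving alive messages from the faulty one and would blame the graph.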