Solving k-Set Agreement with Stable Skeleton Graphs

In this paper we consider the k-set agreement problem in distributed message-passing systems using a round-based approach: Both synchrony of communication and failures are captured just by means of the messages that arrive within a round, resulting i…

Authors: Martin Biely, Peter Robinson, Ulrich Schmid

Martin Biely*, Peter Robinson‡, and Ulrich Schmid†
* EPFL, Switzerland, biely@ecs.tuwien.ac.at
† ECS Group, Technische Universität Wien, Austria, s@ecs.tuwien.ac.at
‡ Division of Mathematical Sciences, Nanyang Technological University, Singapore, peter.robinson@ntu.edu.sg

Abstract: In this paper¹ we consider the $k$-set agreement problem in distributed message-passing systems using a round-based approach: Both synchrony of communication and failures are captured just by means of the messages that arrive within a round, resulting in round-by-round communication graphs that can be characterized by simple communication predicates. We introduce the weak communication predicate $P_{srcs}(k)$ and show that it is tight for $k$-set agreement, in the following sense: We (i) prove that there is no algorithm for solving $(k-1)$-set agreement in systems characterized by $P_{srcs}(k)$, and (ii) present a novel distributed algorithm that achieves $k$-set agreement in runs where $P_{srcs}(k)$ holds. Our algorithm uses local approximations of the stable skeleton graph, which reflects the underlying perpetual synchrony of a run. We prove that this approximation is correct in all runs, regardless of the communication predicate, and show that graph-theoretic properties of the stable skeleton graph can be used to solve $k$-set agreement if $P_{srcs}(k)$ holds.

I. INTRODUCTION

The quest of finding minimal synchrony requirements for circumventing the impossibility of distributed agreement problems like consensus [9] has always been a very active research topic in distributed computing.
Since the exact solvability border of consensus has been researched exhaustively, see e.g. [2], [6], [12], the attention has shifted to weaker agreement problems, in particular $k$-set agreement [1], [11], [14], which allows the processes in a distributed system to agree on at most $k$ different values. For $k > 1$, the problem itself is possibly not as interesting as consensus ($k = 1$) from a practical point of view, except for partitionable systems that need to reach consensus in every partition. In any case, $k$-set agreement is highly relevant from a theoretical perspective, as it allows one to study what level of agreement can be achieved in a fault-tolerant distributed system. This question is definitely relevant in practice, e.g., for name-space reduction (renaming) and similar problems.

¹ Peter Robinson has been supported by the Austrian Science Foundation (FWF) project P20529 and Nanyang Technological University grant M58110000.

One way to model synchrony requirements is through the use of round models. Round-based distributed algorithms execute in a sequence of communication-closed rounds, which consist of message exchanges and processing steps. The classic partially synchronous models of Dwork et al. [7] were probably the first to allow some messages not to arrive within a round due to asynchrony (i.e., non-timeliness), rather than solely due to failures. The seminal work by Santoro and Widmayer [15], [16] unified the treatment of asynchrony and failures by considering synchronous processes that only suffer from "end-to-end communication failures". This idea also underlies the Round-by-Round failure detector (RRFD) approach by Gafni [10], which assumes a local RRFD that tells whether a process shall wait for a round message from some other process or not.
The actual reason why a receiver process does not get a message from the sender process is considered irrelevant here. The Heard-Of (HO) model [3], [4] integrates this unified treatment of failures and asynchrony of [15], [16] with a flexible way of describing guarantees about communication. The basic entities of this model are communication-closed rounds and HO predicates, which specify conditions on the collection of heard-of sets: For each round $r$ and process $p$, $HO(p, r)$ denotes the set of processes that $p$ hears of (i.e., receives a message from) in round $r$.

In this paper, we will use properties of communication graphs for studying $k$-set agreement in message-passing systems with very weak synchrony requirements. In $k$-set agreement, correct processes must output a single value based on values proposed locally, with no more than $k$ different values being output system-wide.

Detailed contributions: We introduce an algorithm for $k$-set agreement, which exploits a natural correspondence between communication predicates and round-by-round "timely communication" graphs $G^r$ in a run; $G^r$ contains an edge $(q \to p)$ when process $p$ hears of $q$ in round $r$. Our algorithm incorporates a generic method for approximating the stable skeleton $G^{\cap\infty}$, which is the intersection of all $G^r$ and reflects the underlying perpetual synchrony of a run. We also introduce the class of communication predicates $P_{srcs}(k)$, which guarantees that at least two processes in every subset of $k + 1$ processes hear from a common process, in every round. Using the graph-theoretic properties of $G^{\cap\infty}$ guaranteed by the predicate $P_{srcs}(k)$, we show that our algorithm solves $k$-set agreement in all runs where $P_{srcs}(k)$ holds. Moreover, we also show that $P_{srcs}(k)$ is "tight" for $k$-set agreement, as it is too weak for solving $(k-1)$-set agreement.

II. COMPUTING MODEL AND PROBLEM DEFINITION

We consider distributed computations of a set of processes $\Pi$ communicating by message passing. Moreover, we consider that the computation is organized in an infinite sequence of communication-closed [8] rounds; that is, any message sent in a round can be received only in that round. As in the models of Gafni [10] and Charron-Bost and Schiper [4], we will express assumptions about the synchrony and the reliability of communication in a system by a predicate that characterizes the set of edges in the communication graph of each round. Intuitively speaking, there is an edge from process $p$ to $q$ in the communication graph of round $r$ if $q$ received $p$'s round $r$ message. We will in fact name a system by its predicate, that is, in a system $P$ the collection of communication graphs of each run of an algorithm in that system must fulfill predicate $P$.

We now formally define computations in our round model. As in the aforementioned models, an algorithm is composed of two functions: The sending function determines, for each process $p$ and round $r > 0$, the message $p$ broadcasts in round $r$ based on $p$'s state at the beginning of round $r$. The transition function determines, for each $p$ and round $r$ and the vector of messages received in $r$, the state at the end of round $r$, i.e., at the beginning of round $r + 1$. Clearly, a run of an algorithm is completely determined by the initial states of the processes and the sequence of communication graphs.

For each round $r$, we denote the communication graph by $G^r = \langle V, E^r \rangle$, where each node of the set $V$ is associated with one process from $\Pi$, and where $E^r$ is the set of directed timely edges for round $r$. There is an edge from $p$ to $q$, denoted as $(p \to q)$, if and only if $q$ receives $p$'s round $r$ message (in round $r$).
² Since we consider communication-closed rounds, a message sent in round $r$ cannot be received in any later round.

To simplify the presentation, we will denote a process and the associated node in the communication graph by the same symbols. However, as we differentiate between $V$ and $\Pi$, we will always be able to resolve possible ambiguities by stating from which set a node or process is taken. We will write $p \in G^r$ and $(p \to q) \in G^r$ instead of $p \in V$ resp. $(p \to q) \in E^r$.

We are primarily interested in the round $r$ skeleton $G^{\cap r}$ of $G^r$, which we define as the subgraph consisting of the edges that have been timely in all rounds up to round $r$. Formally, $G^{\cap r} := \langle V, E^{\cap r} \rangle$ where $E^{\cap r} := \bigcap_{0 < r' \leq r} E^{r'}$. Hence, the edge sets can only shrink over time: $\forall r > 0 : E^{\cap r} \supseteq E^{\cap (r+1)}$, which implies the subgraph relation

$$\forall r > 0 : G^{\cap r} \supseteq G^{\cap (r+1)}. \tag{1}$$

We are particularly interested in the stable skeleton of a run, which we define as the intersection³ over all rounds, i.e.,

$$G^{\cap\infty} := \bigcap_{r \in \mathbb{N}^+} G^{\cap r}. \tag{2}$$

³ For simplicity, we set $G \cap G' := \langle V \cap V', E \cap E' \rangle$.

Considering that a run $\alpha$ consists of infinitely many rounds, whereas our system consists of only a finite number of processes, it follows that the number of possible distinct stable skeletons must also be finite. Consequently, the subgraph property (1) implies that there is some round $r_{ST}$ when $G^{\cap\infty}$ has stabilized, i.e., $\forall r \geq r_{ST} : G^{\cap r} = G^{\cap\infty}$.

As mentioned in the introduction, our algorithm will solve $k$-set agreement by approximating the stable skeleton of a run. The first step in this effort is to use the locally available information about the communication graph, which is captured by the notion of timely neighborhoods. The timely neighborhood of $p$, denoted as $PT(p, r)$, is the set of processes that process $p$ has perceived as perpetually timely until round $r$.
In other words, $p$ has received a message from every process in $PT(p, r)$ in every round up to and including $r$, i.e., $PT(p, r) := \{ q \mid (q \to p) \in G^{\cap r} \}$. Analogously to (1) and (2), we have

$$PT(p, r) \supseteq PT(p, r + 1) \tag{3}$$

and define

$$PT(p) := \bigcap_{r > 0} PT(p, r). \tag{4}$$

We will make heavy use of the standard graph-theoretic notion of a strongly connected component of $G^{\cap r}$. Note that we implicitly assume that strongly connected components are always nonempty and maximal. We use the superscript notation $C^r$ when talking about a strongly connected component of $G^{\cap r}$. Moreover, we write $C^r_p$ to denote the (unique) strongly connected component of $G^{\cap r}$ that contains process $p$ in round $r$. The strongly connected component $C^\infty_p \subseteq G^{\cap\infty}$ that contains $p$ in a run is defined analogously to (2) as $C^\infty_p := \bigcap_{r > 0} C^r_p$. Note that when $p$ and $q$ are strongly connected in $G^{\cap r}$, then they are also strongly connected in all $G^{\cap r'}$, for $0 < r' \leq r$. From property (1) of $G^{\cap r}$, we immediately have

$$\forall r > 0 : C^r_p \supseteq C^{r+1}_p. \tag{5}$$

We will also use directed paths in $G^{\cap r}$, where we assume that all nodes on a path are distinct. Let $C^r \subseteq G^{\cap r}$ be a strongly connected component. If $C^r$ has no incoming edges from any $q \in G^{\cap r} \setminus C^r$, we say $C^r$ is a root component in round $r$. Formally, $\forall p \in C^r \; \forall q \in G^{\cap r} : (q \to p) \in G^{\cap r} \Rightarrow q \in C^r$. Figure 1b shows a graph with 2 root components $\{p_3, p_4, p_5\}$ and $\{p_1, p_2\}$.

Regarding the relation to the existing round-by-round models, we briefly recall what their predicates are based on: In the Heard-Of model [4], for each round $r$ and each process $p$, the set $HO(p, r)$ contains those processes that $p$ hears from, i.e., receives a message from, in round $r$. In the case of the Round-by-Round Fault Detectors [10], the output of $p$'s fault detector in round $r$ is referred to by $D(p, r)$.
In each round $r$, process $p$ waits until it receives a message from every process that is not contained in $D(p, r)$. While it is possible that $p$ also receives a round $r$ message from a process in $D(p, r)$, we will consider that this is never the case. From this it is evident that we have the following correspondence between our skeleton graphs and the HO/RbR model:

$$(p \to q) \in E^{\cap r} \iff \begin{cases} \forall r' \leq r : p \in HO(q, r') \\ \forall r' \leq r : p \notin D(q, r') \end{cases} \tag{6}$$

Thus a process can determine its timely neighborhood in the two models as follows:

$$PT(p, r) = \begin{cases} \bigcap_{0 < r' \leq r} HO(p, r') \\ \Pi \setminus \bigcup_{0 < r' \leq r} D(p, r') \end{cases} \tag{7}$$

[…] $k$ processes can crash. Recalling the correspondence between crashed processes and processes that no one hears of, it is not surprising that this impossibility also holds for the system $P_{true} :: \mathrm{TRUE}$, where all runs are admissible.

III. A TIGHT COMMUNICATION PREDICATE FOR k-SET AGREEMENT

In this section, we introduce a predicate that, together with Algorithm 1 in Section IV, is sufficient for solving $k$-set agreement. For a run $\alpha$, predicate $P_{srcs}(k)$ requires that in every set $S$ of $k + 1$ processes, there are two processes $q, q'$ that receive timely messages from the same common process $p$, in every round. We say that $p$ is a 2-source and $q, q'$ are timely receivers of $p$ in $\alpha$.

$$P_{src}(p, S) :: \exists q, q' \in S, q \neq q' : p \in (PT(q) \cap PT(q'))$$
$$P_{srcs}(k) :: \forall S, |S| = k + 1 \;\; \exists p \in \Pi : P_{src}(p, S) \tag{8}$$

Note that $p$ is not required to be distinct from $q$ and $q'$: $P_{srcs}(k)$ still holds if $p = q$, i.e., $p$ always perceives itself in a timely fashion. Regarding communication graphs, this predicate ensures that any induced subgraph $S$ of $G^{\cap\infty}$ with $k + 1$ nodes contains distinct nodes $q$ and $q'$, such that, for some node $p$, edges $(p \to q)$ and $(p \to q')$ exist (one of which may be a self-loop). Figure 1b shows the stable skeleton graph in a run where $P_{srcs}(k)$ holds for $k = 3$.
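To make predicate (8) concrete, the following minimal sketch checks it by brute force over all subsets of size $k + 1$. It assumes the perpetual timely neighborhoods $PT(q)$ are already known and given as a dictionary; the function name `satisfies_p_srcs`, the input format, and the 4-process example are illustrative assumptions and are not taken from the paper.

```python
from itertools import combinations

def satisfies_p_srcs(pt, k):
    """Check predicate (8) on perpetual timely neighborhoods.

    pt maps every process q to the set PT(q) of processes that q hears
    from in every round (hypothetical input format).
    """
    for subset in combinations(sorted(pt), k + 1):
        # P_src(p, S): two distinct receivers in S share a common source p.
        has_two_source = any(pt[q] & pt[q2]
                             for q, q2 in combinations(subset, 2))
        if not has_two_source:
            return False
    return True

# Hypothetical 4-process run: process 4 is heard by 2, 3 and itself forever,
# while process 1 only ever hears from itself.
pt = {1: {1}, 2: {2, 4}, 3: {3, 4}, 4: {4}}
holds_for_2 = satisfies_p_srcs(pt, 2)
holds_for_1 = satisfies_p_srcs(pt, 1)
```

In this example every set of three processes contains two timely receivers of process 4, so the check succeeds for $k = 2$, whereas the pair $\{1, 2\}$ has no common source and the check fails for $k = 1$.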
At first glance, it might appear that the perpetual nature of $P_{srcs}(k)$ is an unnecessarily strong restriction. To see why some (possibly weak) perpetual synchrony is necessary, consider the predicate $\Diamond P_{srcs}(k)$ that satisfies (8) just eventually, and suppose that there is an algorithm $A$ that solves $k$-set agreement in system $\Diamond P_{srcs}(k)$. Due to its "eventual" nature, $\Diamond P_{srcs}(k)$ allows runs where every process forms a root component by itself, i.e., hears from no other process, for a finite number of rounds. Moreover, for any $k$, the (infinite) run where a single process forms a root component forever, and thus has to decide on its own input value, is admissible. Using a simple indistinguishability argument, it is easy to show that processes decide on $n$ different values.

The following result will be instrumental in Section IV, where we show how to solve $k$-set agreement with $P_{srcs}(k)$. Note that Theorem 1 is independent of the algorithm employed.

Theorem 1: There are at most $k$ root components in any run that is admissible in system $P_{srcs}(k)$.

Proof: Assume by contradiction that there is a run $\alpha$ of some algorithm $A$ that is admissible in system $P_{srcs}(k)$, where there is a set of $\ell \geq k + 1$ disjoint root components $\mathcal{R} = \{ C^\infty_{p_1}, \dots, C^\infty_{p_\ell} \}$ containing processes $p_1, \dots, p_{k+1}, \dots, p_\ell$. Let $r$ be the round where every strongly connected root component $C^\infty_{p_i} \in \mathcal{R}$ has stabilized, i.e., $\forall i : C^r_{p_i} = C^\infty_{p_i}$. That is, any two distinct root components in $\mathcal{R}$ must already be disjoint from round $r$ on. Since $\alpha$ satisfies $P_{srcs}(k)$ and $\ell \geq k + 1$, there must be a 2-source $p$ such that, for two distinct processes $p_i, p_j \in \{ p_1, \dots, p_{k+1} \}$, it holds that $p \in (PT(p_i) \cap PT(p_j))$. By (6), it follows that the edges $e_i = (p \to p_i)$ and $e_j = (p \to p_j)$ are in $G^{\cap r}$.
Considering that $C^r_{p_i}$ and $C^r_{p_j}$ are root components by assumption, i.e., do not have incoming edges, it must be that $e_i \in C^r_{p_i}$ and $e_j \in C^r_{p_j}$, and therefore $p \in C^r_{p_i} \cap C^r_{p_j}$. This, however, contradicts the fact that $C^r_{p_i}$ and $C^r_{p_j}$ are disjoint, which completes our proof.

A. Impossibility of (k − 1)-Set Agreement

We will now show that $P_{srcs}(k)$ does not allow solving $(k-1)$-set agreement. More specifically, we will prove this by assuming the existence of such an algorithm $A$, and then constructing a run fulfilling $P_{srcs}(k)$ where processes decide on $k$ (instead of $k - 1$) different values.

Theorem 2: Consider any $k$ such that $1 < k < n$. There is no algorithm $A$ that solves $(k-1)$-set agreement in system $P_{srcs}(k)$.

Proof: Assume for the sake of a contradiction that such an algorithm $A$ exists. Suppose that all processes start with pairwise distinct input values. Consider the run $\alpha$ and a fixed set $L$ of $k - 1$ processes that only hear from themselves, formally speaking, $\forall p \in L : PT(p) = \{p\}$. Moreover, there is one process $s$ such that every process not in $L$ only hears from itself and $s$, i.e., $\forall p \in \Pi \setminus L : PT(p) = \{p, s\}$. Since, by validity and termination, processes eventually have to decide on some input value and processes in $L \cup \{s\}$ cannot learn any other process' input value, they have to decide on their own value. Thus, we have $k$ different decision values, as we have assumed a unique input value for each process, and therefore a violation of $(k-1)$-agreement. What remains to be shown is that this run $\alpha$ actually fulfills $P_{srcs}(k)$. Recall equation (8), i.e., the definition of $P_{srcs}$, and consider for any set $S$ of size $k + 1$ the set $P = S \setminus L$. Since $|S \setminus L| \geq 2$, the set $P$ contains at least two distinct processes that permanently hear from $s$ (one of which may be $s$).
That is, process $s$ is the required 2-source for any set $S$ of $k + 1$ processes.

IV. APPROXIMATING THE STABLE SKELETON GRAPH AND SOLVING k-SET AGREEMENT

In this section, we present and analyze an algorithm that solves $k$-set agreement with predicate $P_{srcs}(k)$. Algorithm 1 employs a generic approximation of the stable skeleton graph of the run, which works as follows: First, every process $p$ keeps track of the processes it has perceived as timely until round $r$ in the set $PT_p$, updated in Line 9. Lemma 3 will show that $PT_p$ satisfies the definition of $PT(p, r)$, for all rounds $r$. In addition, every process $p$ locally maintains an approximation graph $G_p$ of the stable skeleton, denoted $G^r_p$ for round $r$, which is broadcast in every round. If a process $q$ receives such a graph $G^r_p$ from some process $p$ in its timely neighborhood $PT(q, r)$, it adds the information contained in $G^r_p$ to its own local approximation $G^r_q$. Note that, in contrast to the stable skeleton graph $G^{\cap r}$, the approximation graph $G_p$ is actually a weighted directed graph. The edge labels of $G_p$ correspond to the round number when a particular edge was added by some process, i.e., the edge $(q' \xrightarrow{r} q)$ is in $G_p$ if, and only if, $q' \in PT(q, r)$ (cf. Lemma 3(b)). To prevent outdated information from remaining in the approximation graph permanently, every process $p$ purges all edges in $G^r_p$ that were initially added more than $n - 1$ rounds ago. Figures 1c-1h show this approximation mechanism at work.

For $k$-set agreement, process $p$ only considers proposal values for its estimated decision value $x_p$ that were sent by processes in its current timely neighborhood, i.e., in $PT_p$. This ensures that $p$ and $q$ will have a common estimated decision value $x_p = x_q$ in round $n$, if they are in the same strongly connected component (cf. Lemma 14).
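The estimate rule described above (each process adopts the minimum estimate received from its timely neighborhood) can be sketched in isolation. The code below is a simplified illustration only: it assumes a fixed timely neighborhood that contains the process itself, and it ignores decide messages and the skeleton approximation. The function name `propagate_estimates` and the example skeleton are hypothetical.

```python
def propagate_estimates(pt, proposals, rounds):
    """Apply the minimum-estimate update for the given number of rounds.

    pt        -- maps each process p to its timely neighborhood PT_p,
                 assumed fixed and containing p itself
    proposals -- maps each process to its initial value v_p
    """
    x = dict(proposals)
    for _ in range(rounds):
        # Every process adopts the smallest estimate sent by a timely neighbor.
        x = {p: min(x[q] for q in pt[p]) for p in pt}
    return x

# Hypothetical skeleton: cycle 1 -> 2 -> 3 -> 1, and 4 additionally hears from 3.
pt = {1: {1, 3}, 2: {1, 2}, 3: {2, 3}, 4: {3, 4}}
proposals = {1: 5, 2: 7, 3: 2, 4: 9}
after_one = propagate_estimates(pt, proposals, 1)
after_n = propagate_estimates(pt, proposals, len(pt))
```

After one round, process 2 still holds the estimate 5 because the value 2 has not yet travelled along the cycle; after $n = 4$ rounds all members of the strongly connected component $\{1, 2, 3\}$ hold the same estimate.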
To determine when to terminate, $p$ analyzes its approximation graph in every round $r \geq n$ and decides if $G^r_p$ is a strongly connected graph.

Algorithm 1: Approximating the stable skeleton graph and solving $k$-set agreement with $P_{srcs}(k)$

Variables and Initialization:
 1: $PT_p \in 2^\Pi$, initially $\Pi$
 2: $x_p \in \mathbb{N}$, initially $v_p$   // estimated decision value
 3: $G_p := \langle V_p, E_p \rangle$, initially $\langle \{p\}, \emptyset \rangle$   // weighted digraph
 4: $decided_p \in \{0, 1\}$, initially 0   // is 1 iff p has decided

Round r: sending function $S^r_p$:
 5: if $decided_p = 1$ then
 6:     send $(decide, x_p, G_p)$ to all processes
 7: else
 8:     send $(prop, x_p, G_p)$ to all processes

Round r: transition function $T^r_p$:
 9: update $PT_p$
10: if received $(decide, x_q, \_)$ from $q \in PT_p$ and $decided_p = 0$ then
11:     $x_p \leftarrow x_q$
12:     decide on $x_p$
13:     $decided_p \leftarrow 1$
14: // Approximate stable skeleton graph:
15: $G_p \leftarrow \langle \{p\}, \emptyset \rangle$
16: for $q \in PT_p$ do
17:     add directed edge $(q \xrightarrow{r} p)$ to $E_p$
18:     $V_p \leftarrow V_p \cup V_q$
19: for every pair of nodes $(p_i, p_j) \in V_p \times V_p$ do
20:     $R_{i,j} \leftarrow \{ r_e \mid \exists q \in PT_p : (p_i \xrightarrow{r_e} p_j) \in E_q \}$
21:     if $R_{i,j} \neq \emptyset$ then
22:         $r_{max} \leftarrow \max(R_{i,j})$
23:         $E_p \leftarrow E_p \cup \{ (p_i \xrightarrow{r_{max}} p_j) \}$
24: discard all $(p_i \xrightarrow{r_e} p_j)$ from $E_p$ where $r_e \leq r - n$
25: discard $p_i \neq p$ from $V_p$ if $p$ is unreachable from $p_i$ in $G_p$
26: if $decided_p = 0$ then
27:     $x_p \leftarrow \min \{ x_q \mid q \in PT_p \}$
28:     if $r \geq n$ and $G_p$ is strongly connected then
29:         decide on $x_p$
30:         $decided_p \leftarrow 1$

Why is this decision safe with respect to the agreement property? Using our graph approximation results, we will show in Lemma 15 that any strongly connected approximation graph contains at least one root component in the stable skeleton graph. Furthermore, if two processes decide on different values, it follows that their approximated graphs in the rounds of their respective decision are disjoint.
Since Theorem 1 confirms that there are at most $k$ root components in any run where $P_{srcs}(k)$ holds, there can be in fact at most $k$ different decision values.

[Figure 1, panels on nodes $p_1, \dots, p_6$ with round-labeled edges: (a) $G^{\cap 2}$, (b) $G^{\cap\infty}$, (c) $G^1_{p_6}$, (d) $G^2_{p_6}$, (e) $G^3_{p_6}$, (f) $G^4_{p_6}$, (g) $G^5_{p_6}$, (h) $G^6_{p_6}$]
Fig. 1: A system of 6 processes where $P_{srcs}(3)$ holds. The stable skeleton graph for round 2 is depicted in Figure 1a; 1b shows the stable skeleton graph for the entire run. For simplicity, we omit self-loops, i.e., $\forall p_i : p_i \in PT(p_i)$. Figures 1c-1h show process $p_6$'s approximation of $G^{\cap\infty}$ during rounds 1 to 6.

A. Approximation of the Stable Skeleton Graph

Throughout our analysis, we denote the value of variable $var$ of process $p$ at the end of round $r$ as $var^r_p$. When we use the subgraph relation ($\subseteq$) between graphs $C^r_p$ and $G^r_p$, we mean the standard subgraph relation between $C^r_p$ and the unweighted version of $G^r_p$. We first state some obvious facts that follow directly from the code of the algorithm:

Observation 1: For any round $r > 0$ it holds that $p \in G^r_p$ and that no edge $(q' \xrightarrow{s} q) \in G^r_p$ has $s \leq r - n$.

Note that, after the initial assignment, $p$ only updates variable $PT_p$ in Line 9, which is equivalent to (7). From this and the inspection of Lines 15 and 17, Lemma 3 follows immediately:

Lemma 3: It holds that $q \in PT(p, r)$ if, and only if, all of the following are true: (a) $q \in PT^r_p$, (b) $p$ adds a directed edge $(q \xrightarrow{r} p)$ to $G^r_p$ by executing Line 17 in round $r$, and (c) for any $r' \neq r$, there is no other edge $(q \xrightarrow{r'} p)$ in $G^r_p$.

The following lemma shows that the approximation graph $G_{p_{\ell+1}}$ accurately reflects the timely neighborhood of a process.
That is, if $p_1$ is connected to $p_{\ell+1}$ through a path of length $\ell$, then $p_{\ell+1}$ will add the timely neighborhood information of $p_1$ to its approximated graph by round $\ell$.

Lemma 4: Suppose that there exists a directed path $\Gamma = (p_1 \to \dots \to p_{\ell+1})$ in $G^{\cap r}$ for round $r \geq n$, where $\Gamma$ has length $\ell \leq n - 1$. Then, $\forall q \in PT(p_1, r - \ell)$ it holds that (a) edge $(q \xrightarrow{r_q} p_1)$ is in $G^r_{p_{\ell+1}}$ where $r \geq r_q \geq r - \ell$, and (b) $G^r_{p_{\ell+1}}$ contains no other edges from $q$ to $p_1$.

Proof: Consider an arbitrary $q \in PT(p_1, r - \ell)$. The proof proceeds by induction over the edges of path $\Gamma$ indexed by $k$. That is, we show that for all $k$, with $0 \leq k \leq \ell$, it holds that there is an edge $e = (q \xrightarrow{r_k} p_1)$ in $G^{r-\ell+k}_{p_{1+k}}$ where $r - \ell + k \geq r_k \geq r - \ell$.

For the base case ($k = 0$), we have to show that the edge $e$ is in $G^{r-\ell}_{p_1}$, but this already follows from $q \in PT(p_1, r - \ell)$, by Lemma 3.

For the induction step, we assume that the statement holds for some $k < \ell$ and then show that it holds for $k + 1$ as well. In round $r - \ell + (k + 1)$ process $p_{1+k}$ broadcasts its current graph estimate, i.e., $G^{r-\ell+k}_{p_{1+k}}$, to all. We know that $p_{1+(k+1)}$ will receive this message since $(p_{1+k} \to p_{1+(k+1)})$ is in the path $\Gamma \subseteq G^{\cap r}$, which means that $p_{1+k} \in PT(p_{1+(k+1)}, r - \ell + (k + 1))$. By the induction hypothesis, the edge $(q \xrightarrow{r_k} p_1)$ is in $G^{r-\ell+k}_{p_{1+k}}$ and therefore will be among the edges that $p_{1+(k+1)}$ considers in Line 20. This in turn implies that $p_{1+(k+1)}$ will add an edge $(q \xrightarrow{r_{k+1}} p_1)$ to its graph $G^{r-\ell+(k+1)}_{p_{1+(k+1)}}$ in Line 23, whereby $r_{k+1}$ is calculated in Line 22 such that $r_{k+1} \geq r_k$. Moreover, by the induction hypothesis we have $r_k \geq r - \ell > r - n$, which ensures that the edge will not be discarded in Line 24.
Since the code following the for-loop in Line 19 is executed exactly once for every edge, no other edge $(q \xrightarrow{r'} p_1)$ is added to $G^{r-\ell+(k+1)}_{p_{1+(k+1)}}$. This completes the proof of our lemma.

The next lemma shows that the approximation graph correctly (over)estimates the strongly connected component from round $n$ on:

Lemma 5: Let $r \geq n$ and consider the strongly connected component $C^r_p$ containing $p$ in $G^{\cap r}$. Then, it holds that $G^r_p \supseteq C^r_p$.

Proof: Consider any edge $(q' \to q) \in C^r_p$. Since $C^r_p$ is strongly connected, there is a directed path between any pair of processes in $C^r_p$; in particular, there is a path of length $\ell \leq n - 1$ from $q$ to $p$. By the definition of $C^r_p$ we know that $q$ perceives $q'$ as timely in all rounds up to round $r$, which means that $q' \in PT(q, r - \ell)$. Then, by applying Lemma 4, we get that the edge $(q' \xrightarrow{r'} q)$ is in $G^r_p$, for some $r' \geq r - \ell$, which shows that $C^r_p$ is a subgraph of $G^r_p$.

Lemma 3 showed that the timely neighborhood is eventually in the approximated graph. We now show that our approximation contains only valid information:

Lemma 6: Let $r \geq 1$ and suppose that there is an edge $e = (q' \xrightarrow{s} q)$ in the approximated stable skeleton graph $G^r_p$ of process $p$. Then it holds that $q' \in PT(q, s)$.

Proof: Note that processes only add edges to their approximation graphs in Line 17 or in Line 23. If an edge is added via Line 23, then this edge has previously been added by another process by executing Line 17. Therefore, every edge must have been added by some process via Line 17. In case of $e$, this process can only be $q$. By Lemma 3 this happens in round $s$ and $q' \in PT(q, s)$.
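The approximation mechanism analyzed in Lemmas 3 to 6 can be tried out on a small example. The following sketch mirrors Lines 15-25 of Algorithm 1 under some simplifying assumptions that are not part of the paper: labeled edges are stored in a dictionary that keeps only the largest round label per node pair, edges incident to discarded nodes are dropped along with the nodes, and the timely neighborhoods are fixed and include self-loops. The name `approximation_step` and the 3-process chain are hypothetical.

```python
def approximation_step(p, r, n, timely, received):
    """One pass over Lines 15-25 of Algorithm 1 for process p in round r.

    timely   -- PT_p after the update in Line 9
    received -- maps each q in timely to the graph (V_q, E_q) it sent,
                with E_q a dict {(src, dst): round_label}
    """
    nodes, edges = {p}, {}                        # Line 15
    for q in timely:                              # Lines 16-18
        edges[(q, p)] = r
        nodes |= received[q][0]
    for q in timely:                              # Lines 19-23 (largest label wins)
        for edge, label in received[q][1].items():
            edges[edge] = max(label, edges.get(edge, label))
    # Line 24: purge edges whose label is at most r - n.
    edges = {e: s for e, s in edges.items() if s > r - n}
    # Line 25: keep only nodes from which p is reachable.
    keep, frontier = {p}, [p]
    while frontier:
        v = frontier.pop()
        for src, dst in edges:
            if dst == v and src not in keep:
                keep.add(src)
                frontier.append(src)
    # Assumption of this sketch: edges touching discarded nodes are dropped too.
    edges = {e: s for e, s in edges.items() if e[0] in keep and e[1] in keep}
    return keep, edges

# Hypothetical run with n = 3 and the stable chain 1 -> 2 -> 3 (plus self-loops).
n = 3
timely = {1: {1}, 2: {1, 2}, 3: {2, 3}}
graphs = {p: ({p}, {}) for p in timely}           # initial G_p
for r in (1, 2):
    graphs = {p: approximation_step(p, r, n, timely[p], graphs)
              for p in timely}
nodes_3, edges_3 = graphs[3]
```

In this run, process 3 learns the edge $(1 \to 2)$ in round 2, one round after process 2 recorded it, and the edge keeps the label of the round in which it was last added by process 2 before being forwarded. This is consistent with Lemma 6: every labeled edge in the approximation corresponds to a timely neighborhood that actually existed in the labeled round.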
The following Lemma 7 is in some sense the converse result of Lemma 5, as it states that the approximated graph must approach $C^r_p$ from below, if it is strongly connected:

Lemma 7: Let $r \geq 1$ and consider the strongly connected component $C^r_p$. If the approximated skeleton graph $G^{r+n-1}_p$ is strongly connected, then $C^r_p \supseteq G^{r+n-1}_p$.

Proof: Consider any edge $e = (q' \xrightarrow{r'} q) \in G^{r+n-1}_p$. By Lemma 6, we know that $q' \in PT(q, r')$. It follows by the subset property (3) that $q' \in PT(q, r)$, as Observation 1 implies $r' > (r + n - 1) - n = r - 1$. Therefore, there is an edge $(q' \to q)$ in $G^{\cap r}$. It follows that $G^{r+n-1}_p$ is isomorphic to a (not necessarily maximal) strongly connected component $S^r$ in $G^{\cap r}$. Because $C^r_p$ and $S^r$ both contain $p$, their intersection is nonempty; as $C^r_p$ is maximal, it follows that $C^r_p \supseteq S^r$, i.e., $C^r_p \supseteq G^{r+n-1}_p$.

As a final result about the approximated skeleton graph, we show that once the approximation $G_p$ is strongly connected in round $r \geq n$, it is closed w.r.t. strongly connected components. This means that $G_p$ can be partitioned into disjoint strongly connected components in $G^{\cap\infty}$.

Theorem 8: Suppose that $R \geq n$. If the approximated skeleton graph $G^R_p$ is strongly connected, then it contains the strongly connected component $C^\infty_q$ of every $q \in G^R_p$.

Proof: Consider any $q \in G^R_p$ and its strongly connected component $C^\infty_q$. From (5) and Lemma 7 it follows that $q \in G^R_p \subseteq C^{R-n+1}_p \subseteq C^1_p$, i.e., $q \in C^1_p \cap C^1_q$. Moreover, due to the well-known fact that two maximal strongly connected components in a digraph are either disjoint or equivalent, we get that $C^1_q = C^1_p$. Now suppose the theorem does not hold. Then there exists some $q' \in C^\infty_q$ such that $q' \notin G^R_p$. Due to Lemma 5, $q'$ cannot be contained in $C^R_p$, but due to (5), $q' \in C^R_q \supseteq C^\infty_q$. Therefore, $C^R_q \neq C^R_p$, and thus $C^R_q \cap C^R_p = \emptyset$.
Since $G^R_p$ is strongly connected and contains $q$, it also contains a path $\Gamma = (q = p_\ell \to \dots \to p_0 = p)$, such that $\forall i, 0 \leq i < \ell : p_{i+1} \in PT(p_i, R - i)$. Let $j$ be the minimal index $i$ such that $p_j \in C^R_q$, and let $\Gamma_j = (p_j \to \dots \to p_0)$ be the path remaining from $p_j$. As both $q'$ and $p_j$ are in $C^R_q$, there is a path $\Gamma'$ from $q'$ to $p_j$ in $C^R_q$. Let $k$ be the length of this path. Moreover, by applying Lemma 4, we get that $G^{R-j}_{p_j}$ contains the outgoing edge $e$ of $q'$ on this path, labeled with some round

$$r' \geq R - j - k. \tag{9}$$

But then, by the definition of $\Gamma$, it follows that when $G^R_p$ contains $p_j$ (which it does), then it must also contain $q'$, unless some process $p_i$ ($i < j$) removed $e$ from its set of edges in Line 24 in round $R - i$ because $r' \leq R - i - n$. Since round $R$ at process $p$ ($= p_0$) is the latest round when this can occur, we get that $r' \leq R - n$, and thus, by (9),

$$R - j - k \leq r' \leq R - n, \text{ i.e., } j + k \geq n. \tag{10}$$

Let $\Delta$ be the subgraph obtained by concatenating paths $\Gamma'$ and $\Gamma_j$. By construction, $\Gamma_j$ and $\Gamma'$ only share node $p_j$, and thus $\Delta$ is a (simple) path and must have length $j + k \leq n - 1$, as no path can exceed length $n - 1$. This contradicts (10) and thus completes the proof that $q'$ is in $G^R_p$. The proof showing that all edges of $C^\infty_q$ are in $G^R_p$ proceeds analogously, by assuming that some edge in $C^\infty_q$ ending in $q'$ is not in $G^R_p$.

B. k-Set Agreement

In this section, we will show that Algorithm 1 not only approximates the stable skeleton graph, but also solves $k$-set agreement. Our previous results allow us to immediately prove the validity and the termination properties.

Lemma 9 (Validity): If a process decides on $v$, then $v$ was the initial value of some process.

Proof: Observe that the decision value $x_p$ of any process $p$ is initially set to its proposal value $v_p$, which is then broadcast.
On all subsequent updates of $x_p$ in Line 27, a value $x_q$ that was sent by some process $q$ (which originated from some $v_{q'}$) is assigned, therefore validity holds.

Lemma 10: Every process decides at most once in any run.

Proof: Observe that no process executes Line 29 and Line 12 in the same run. This is guaranteed by the fact that process $p$ cannot pass the if-conditions in Line 10 or in Line 26 after $decided_p$ is set to 1, which happens whenever $p$ decides.

Lemma 11 (Termination): Every process decides exactly once.

Proof: Lemma 10 shows that every process decides at most once. We will now show that every process decides at least once. First, we will show that there is a root component in every round. Consider the strongly connected components that partition the set of nodes of the stable skeleton graph $G^{\cap r}$ in some round $r$. Such a set always exists, since the strongly connected components form equivalence classes of nodes. It is well known that the contraction of the strongly connected components is a directed acyclic graph, which reveals that there is at least one node $C^r$ in the contracted graph that has no incoming edges. Clearly, $C^r$ satisfies the definition of a root component in $G^{\cap r}$. Therefore, there is a nonempty set $\mathcal{R}^r$ of strongly connected components all of which are root components in round $r$.

Let $r \geq 1$ be the earliest round where $G^{\cap r}$ is stable for at least $n - 1$ rounds, i.e., $\forall r' \in [r, r + n - 1] : G^{\cap r'} = G^{\cap r}$. Note that property (1) implies that $r$ exists. Now, consider any root component $R^r \in \mathcal{R}^r$: Clearly, since every process is in exactly one strongly connected component, we have

$$\forall p \in R^r : C^r_p = R^r = R^{r+n-1} = C^{r+n-1}_p. \tag{11}$$

We will now show that the approximated skeleton graph of such a process $p$ is in fact exactly the strongly connected component of $p$. Consider any $p \in R^r$ ($= C^{r+n-1}_p$).
First, since $(r+n-1) \ge n$, Lemma 5 and (11) imply that $R^r \subseteq G_p^{r+n-1}$. We will now show that $R^r \supseteq G_p^{r+n-1}$, which proves that these graphs are equal: Since $G_p^{r+n-1}$ is connected by construction, it is sufficient to show that every edge in $G_p^{r+n-1}$ is also in $R^r$. Assume in contradiction that there is an edge $e = (q' \xrightarrow{r'} q)$ in $G_p^{r+n-1}$ such that $q \in R^r$ but $q' \notin R^r$; note that the other way round ($q' \in R^r$ but $q \notin R^r$) is impossible by construction. Using Lemma 6 we know that $q' \in PT(q, r')$, and Observation 1 implies that $r' > (r+n-1) - n = r - 1$, i.e., $r' \ge r$. Then, by definition, we have that $e \in G^{\cap r}$, i.e., $e$ is an incoming edge of $R^r$, contradicting the assumption that $R^r$ is a root component. We can therefore conclude that $R^r = G_p^{r+n-1}$.

By assumption, $R^r$ is a root component, which tells us that $G_p^{r+n-1}$ is strongly connected, i.e., $p$ will pass the if-condition in Line 28 in round $r+n-1$ and decide. Recall the contracted stable skeleton graph of round $r+n-1$. Every path in this graph is rooted at some node corresponding to a root component in the set $\mathcal{R}^r$. Thus, all processes that are not in a root component will receive a decision message by round $r+2n-1$ and also decide, which completes our proof.

In the remainder of this section we will prove that Algorithm 1 satisfies the k-agreement property. We will start out with some basic invariants on decision estimates.

Observation 2 (Monotonicity): In any run of Algorithm 1 it holds that $\forall r > 0 : x_p^r \ge x_p^{r+1}$.

Lemma 12: If process $p$ does not decide in Line 12, we have that $\forall r > n-1 : x_p^r = x_p^{r+1}$.

Proof: Suppose that there is an $r > n-1$ such that $p$ sets $x_p^{r+1} \leftarrow x_q$ and $x_p^r \ne x_q$. This can only occur in Line 27, if the process does not decide in Line 12. From Observation 2 and validity (cf.
Lemma 9), we know that $p$ did not previously receive $x_q$ and that $x_q$ is the initial value of some distinct process $q$. Since processes forward their estimated decision value in every round, (3) implies that the shortest path from $q$ to $p$ (along which $x_p$ has been propagated to $p$) in $G^{\cap (r+1)}$ has length $r+1$. However, this is impossible as $r + 1 > n$ and the longest possible path has length $n-1$.

Lemma 13: Suppose that some process $p$ decides on $x_p$ in round $r$ by executing Line 12. Then some process $q \ne p$ has decided on $x_p$ in round $r' < r$ by executing Line 29.

Proof: Every process decides either in Line 29 or in Line 12, but not both (Lemma 10). Since $p$ decided in Line 12, it must have received a (decide, $x_q$, ) message from some distinct process $q$. If $q$ decided in Line 29, we are done; otherwise $q$ decided in Line 12 in round $r-1$, and we can repeat the same argument for $q$. After at most $n-1$ iterations, we arrive at some process that must have decided using Line 29.

Lemma 14: Let $C_p^n$ be the strongly connected component of process $p$ in round $n$. Then, it holds that $\forall q \in C_p^n : x_q^n = x_p^n$.

Proof: First, observe that due to Lemma 13 and the fact that no process can pass the check in Line 28 before round $n$, no process can decide before round $n$. Therefore, processes can update their estimate values until at least round $n$. Suppose that there are processes $p, q \in C_p^n$ such that $x_p^n \ne x_q^n$. In particular, we assume without loss of generality that $x_q^n$ is minimal among all round $n$ estimation values of processes in $C_p^n$, i.e., $x_p^n > x_q^n$. Let $r_q$ be the round where $q$ first sets $x_q$ to the value $x_q^n$. By Observation 2 it follows that $q$ does not update $x_q$ anymore before round $n$. Since Algorithm 1 satisfies validity (Lemma 9), we know that there is some process $s$ that is the source of this value, i.e., $s$ initially proposed $x_q^{r_q}$.
By the code of the algorithm we know that in round $r$ process $p$ only considers values in Line 27 that were sent by some process in $PT(p, r)$. This implies that there is a sequence of pairwise distinct processes $s = q_1, \ldots, q_\ell = q$, such that

$$\forall i,\ (1 \le i < \ell) : q_i \in PT(q_{i+1}, i). \qquad (12)$$

Clearly, $r_q = \ell - 1$. Let $j \le \ell$ be such that $q_j \in C_p^n$ and $j$ is minimal, and let $\Gamma_q$ be the path in $G^{\cap 1}$ induced by the sequence from $s$ up to $q_j$. Moreover, since $q_j \in C_p^n$, there is a path $\Gamma_p$ in $C_p^n$ from $q_j$ to $p$. Since $C_p^n \subseteq G^{\cap 1}$, $\Gamma_p$ is a path in $G^{\cap 1}$ as well. Let $\Gamma$ be the path in $G^{\cap 1}$ obtained by appending $\Gamma_p$ to $\Gamma_q$. By construction $\Gamma$ is simple, and therefore its length is bounded by $n-1$. Moreover, the initial value of $s$ was propagated along this path: over $\Gamma_q$ by construction, and over $\Gamma_p$ because $x_q^n$ is minimal in $C_p^n$. This leads to process $p$ assigning this value to $x_p$ in some round $r_p \le n-1$, which contradicts the assumption that $x_p^n > x_q^n$.

Lemma 15 (k-Agreement): Processes decide on at most $k$ distinct values.

Proof: For the sake of a contradiction, assume that there is a set of $\ell > k$ processes $D = \{p_1, \ldots, p_\ell\}$ in a run $\alpha$ where $p_i$ decides on $x_i^\infty = x_i^{r_i}$ (see footnote 4) in round $r_i \ge n$ and $\forall p_i, p_j \in D : x_{p_i}^\infty \ne x_{p_j}^\infty$. By virtue of Lemma 13, we can assume that every $p_i$ has decided by executing Line 29. Considering that no process decides before round $n$, applying Lemma 12 yields that

$$\forall r \ge n\ \forall p_i, p_j \in D : x_{p_i}^r \ne x_{p_j}^r. \qquad (13)$$

Note that the approximated skeleton graphs $G_{p_i}^{r_i}$ and $G_{p_j}^{r_j}$ are strongly connected in round $r_i$ resp. $r_j$, otherwise the processes could not have passed the if-condition before Line 29.

We will first show that the different decision values of $p_i$ and $p_j$ imply that their approximated skeleton graphs in rounds $r_i$ resp. $r_j$ are disjoint.
Lemma 7 reveals that these skeleton graphs are contained within the respective strongly connected components of an earlier round, i.e., $C_{p_i}^{r_i-n+1} \supseteq G_{p_i}^{r_i}$ and $C_{p_j}^{r_j-n+1} \supseteq G_{p_j}^{r_j}$. If these strongly connected components of $p_i$ and $p_j$ are disjoint, then so are the approximated skeleton graphs and we are done. [Footnote 4: Note that $x_p^\infty$ denotes $p$'s final "estimate", i.e., the actual decision value of process $p$.] Therefore, assume in contradiction that

$$I = C_{p_i}^{r_i-n+1} \cap C_{p_j}^{r_j-n+1} \ne \emptyset.$$

We will now prove that one of these components contains the other. Without loss of generality, suppose that $r_i \le r_j$ and consider any node $p \in I \subseteq C_{p_j}^{r_j-n+1}$. Clearly, $p$ is strongly connected to every node in $C_{p_j}^{r_j-n+1}$. Let $Z$ be the induced subgraph of $C_{p_j}^{r_j-n+1}$ in the skeleton graph $G^{\cap (r_i-n+1)}$. By the subgraph property (5) and since $r_i \le r_j$, it follows that $Z = C_{p_j}^{r_j-n+1}$, and hence $Z \cap C_{p_i}^{r_i-n+1} \ne \emptyset$. By the fact that $p \in I$, we know that $p \in C_{p_i}^{r_i-n+1}$. That is, in the skeleton graph $G^{\cap (r_i-n+1)}$, process $p$ is strongly connected to all nodes in $C_{p_i}^{r_i-n+1}$ and $Z$. But since the strongly connected component $C_{p_i}^{r_i-n+1}$ is maximal, we actually have $C_{p_i}^{r_i-n+1} \supseteq Z = C_{p_j}^{r_j-n+1}$, which means that $p_j \in C_{p_i}^{r_i-n+1}$. Then, Lemma 14 readily implies that $\forall q \in C_{p_i}^{r_i-n+1}$ it holds that $x_{p_i}^n = x_q^n$ and, in particular, $x_{p_i}^n = x_{p_j}^n$, which contradicts (13). We can therefore conclude that the intersection of the strongly connected components, and therefore, by Lemma 7, also the intersection of $G_{p_i}^{r_i}$ and $G_{p_j}^{r_j}$ is indeed empty, i.e.,

$$\forall p_i, p_j \in D : (G_{p_i}^{r_i} \cap G_{p_j}^{r_j}) = \emptyset. \qquad (14)$$

By Theorem 8 it follows that each of the strongly connected approximated skeleton graphs $G_{p_i}^{r_i}$ can be partitioned into a set $D_i$ of strongly connected components in $G^{\cap \infty}$.
By Theorem 1, at most $k$ of the sets $D_i$ can contain a root component. Note that (14) implies that no strongly connected component is in two distinct sets $D_i$, $D_j$. For the sake of a contradiction, assume that (w.l.o.g.) the set $D_\ell$ corresponding to $G_{p_\ell}^{r_\ell}$ does not contain a root component. Now consider the contracted graph of $G^{\cap \infty}$ where the nodes are the strongly connected components. Since the contracted graph is acyclic, it follows that there exists a path $\Gamma$ in the (non-contracted) graph $G^{\cap \infty}$ that ends at process $p_\ell \in D_\ell$ and is rooted at some process $q \in C_q^\infty$, where $C_q^\infty$ is a root component and thus by assumption not in $D_\ell$. However, by the subgraph property (1), we know that the path $\Gamma$ is also in $G^{\cap r_\ell}$. But then Lemma 4 implies that $q \in G_{p_i}^{r_i}$, and Theorem 8 shows that $C_q^\infty \in D_\ell$, i.e., one of the components in $D_\ell$ in fact is a root component. This provides the required contradiction.

Theorem 16: Algorithm 1 solves k-set agreement in system $P_{srcs}(k)$.

Proof: Lemma 15 implies k-agreement. Termination is guaranteed by Lemma 11, and Lemma 9 shows that validity holds.

V. DISCUSSION AND FUTURE WORK

We have introduced the notion of communication graphs and presented an algorithm that approximates the stable skeleton of a run. The algorithm is based on exchanging local approximations of the stable skeleton, and hence has a worst-case message bit complexity that is polynomial in $n$. We have also introduced a class of communication predicates $P_{srcs}(k)$ and proved that using this approximation one can solve k-set agreement in a system that guarantees $P_{srcs}(k)$. Note that the algorithm actually solves consensus in sufficiently well-behaved runs.
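The role that root components play in the k-agreement argument can be made concrete with a small sketch. It is not part of Algorithm 1: it takes a global view that no single process has, and all function names are hypothetical. It assumes that an edge (u, v) means that v hears from u in a round, and that the stable skeleton of a finite prefix is the intersection of the round-by-round communication graphs (the formal definition lies outside this section). Under these assumptions, the sketch computes the skeleton, its strongly connected components, and the components without incoming edges.

```python
def stable_skeleton(round_graphs):
    """Edges present in every round graph.

    Each round graph is a set of (sender, receiver) pairs (assumed encoding).
    """
    graphs = [set(g) for g in round_graphs]
    return set.intersection(*graphs) if graphs else set()


def strongly_connected_components(nodes, edges):
    """Kosaraju's two-pass algorithm; returns a list of frozensets."""
    succ = {v: set() for v in nodes}
    pred = {v: set() for v in nodes}
    for u, v in edges:
        succ[u].add(v)
        pred[v].add(u)

    seen = set()

    def dfs(v, adj, out):
        # Depth-first search that appends nodes in order of completion.
        seen.add(v)
        for w in adj[v]:
            if w not in seen:
                dfs(w, adj, out)
        out.append(v)

    order = []
    for v in nodes:
        if v not in seen:
            dfs(v, succ, order)

    seen.clear()
    components = []
    for v in reversed(order):
        if v not in seen:
            comp = []
            dfs(v, pred, comp)  # second pass runs on the reversed graph
            components.append(frozenset(comp))
    return components


def root_components(nodes, edges):
    """Strongly connected components with no incoming edge from outside."""
    comps = strongly_connected_components(nodes, edges)
    comp_of = {v: c for c in comps for v in c}
    has_incoming = {comp_of[v] for u, v in edges if comp_of[u] != comp_of[v]}
    return [c for c in comps if c not in has_incoming]


# Illustrative data: three rounds over processes 1..4.
nodes = [1, 2, 3, 4]
rounds = [
    {(1, 2), (2, 1), (1, 3), (3, 4), (4, 3), (2, 4)},
    {(1, 2), (2, 1), (1, 3), (3, 4), (4, 3)},
    {(1, 2), (2, 1), (1, 3), (3, 4), (4, 3), (4, 1)},
]
skeleton = stable_skeleton(rounds)
roots = root_components(nodes, skeleton)
print([sorted(c) for c in roots])
```

In this example the transient edges (2, 4) and (4, 1) drop out of the skeleton, leaving the components {1, 2} and {3, 4}, of which only {1, 2} has no incoming edge. A skeleton with a single root component corresponds to the case in which, following the argument of Lemma 15, at most one decision value can arise.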
The one-to-one correspondence between the (at most) $k$ root components of the stable skeleton graph and distinct decision values shows that these communication graphs are a promising new tool for studying the underlying synchrony in a system. Since our algorithm yields a correct approximation atop of any communication predicate, part of our future work will be devoted to finding a graph-theoretic characterization of the weakest synchrony requirements for different agreement problems and further exploring the duality between communication predicates and graph-theoretic properties.

REFERENCES

[1] E. Borowsky and E. Gafni. Generalized FLP impossibility result for t-resilient asynchronous computations. In STOC '93: Proceedings of the twenty-fifth annual ACM symposium on Theory of computing, pages 91–100, New York, NY, USA, 1993. ACM.
[2] T. D. Chandra, V. Hadzilacos, and S. Toueg. The weakest failure detector for solving consensus. Journal of the ACM, 43(4):685–722, June 1996.
[3] B. Charron-Bost and A. Schiper. Improving fast Paxos: being optimistic with no overhead. In 12th IEEE Pacific Rim International Symposium on Dependable Computing (PRDC 2006), pages 287–295. IEEE Computer Society, 2006.
[4] B. Charron-Bost and A. Schiper. The Heard-Of model: computing in distributed systems with benign faults. Distributed Computing, 22(1):49–71, Apr. 2009.
[5] S. Chaudhuri. More choices allow more faults: set consensus problems in totally asynchronous systems. Inf. Comput., 105(1):132–158, 1993.
[6] D. Dolev, C. Dwork, and L. Stockmeyer. On the minimal synchronism needed for distributed consensus. Journal of the ACM, 34(1):77–97, Jan. 1987.
[7] C. Dwork, N. Lynch, and L. Stockmeyer. Consensus in the presence of partial synchrony. Journal of the ACM, 35(2):288–323, Apr. 1988.
[8] T. Elrad and N. Francez.
Decomposition of distributed programs into communication-closed layers. Science of Computer Programming, 2(3):155–173, 1982.
[9] M. J. Fischer, N. A. Lynch, and M. S. Paterson. Impossibility of distributed consensus with one faulty process. Journal of the ACM, 32(2):374–382, Apr. 1985.
[10] E. Gafni. Round-by-round fault detectors (extended abstract): unifying synchrony and asynchrony. In Proceedings of the Seventeenth Annual ACM Symposium on Principles of Distributed Computing, pages 143–152, Puerto Vallarta, Mexico, 1998. ACM Press.
[11] M. Herlihy and N. Shavit. The asynchronous computability theorem for t-resilient tasks. In STOC '93: Proceedings of the twenty-fifth annual ACM symposium on Theory of computing, pages 111–120, New York, NY, USA, 1993. ACM.
[12] M. Hutle, D. Malkhi, U. Schmid, and L. Zhou. Chasing the weakest system model for implementing omega and consensus. IEEE Transactions on Dependable and Secure Computing, 6(4):269–281, 2009.
[13] M. Hutle and A. Schiper. Communication predicates: A high-level abstraction for coping with transient and dynamic faults. In 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN'07), pages 92–101, 2007.
[14] M. Saks and F. Zaharoglou. Wait-free k-set agreement is impossible: The topology of public knowledge. SIAM J. Comput., 29(5):1449–1483, 2000.
[15] N. Santoro and P. Widmayer. Time is not a healer. In Proc. 6th Annual Symposium on Theor. Aspects of Computer Science (STACS'89), LNCS 349, pages 304–313, Paderborn, Germany, Feb. 1989. Springer-Verlag.
[16] N. Santoro and P. Widmayer. Agreement in synchronous networks with ubiquitous faults. Theor. Comput. Sci., 384(2-3):232–249, 2007.
