An Analytical Model of Information Dissemination for a Gossip-based Protocol

An Analytical Mo del of Information Dissemination for a Gossip-based Proto col Rena Bakhshi, Daniela Gavidia, W an F okkink, and Maarten v an Steen Department of Computer Science, V rije Un ivers iteit Amsterdam, Netherlands { rbakhshi ,daniela,wanf,stee n } @few.vu.nl Abstract. W e develop an analytical mod el of information dissemination for a gossiping proto col that combines b oth pull and push approaches. With this model we analyse how fast an item is replicated t hrough a netw ork, and how fast the item spreads in the netw ork, and how fast the item cov ers the netw ork. W e also determine the optimal size of the exchange buﬀer, to obt ain fast rep lication. Our results are conﬁrmed by large-scale sim ulation exp erimen ts. 1 In tro duction T o day , large-sca le distributed sy stems consisting of thousands of no des are co m- monplace, due to the wide av aila bility of high-p erfor ma nce and low-cost dev ic es. Such systems are highly dynamic in the sense that no des a r e contin uously in ﬂux, with new no des joining a nd existing no des leaving. In pr actice, large -scale sys tems are o ften emulated to discover co r relations betw ee n de s ign pa r ameters a nd obser ved b ehaviour. Such exp erimental r esults provide esse n tia l data on system b ehaviour. How ever, they usually show only behaviour of a par ticula r implementation, and can b e time cons uming. More- ov er, in general exp eriments do not give a go o d understanding of the emerg ent behaviour of the s ystem, and into how pa rameter settings inﬂuence the extra- functional pr op erties o f the system. As a result, it is very diﬃcult to pre dict what the eﬀects o f certain design decisio ns a re, as it is pra ctically infeasible to explore the full range of input data. A ch a llenge is to develop ana lytical mo dels that capture (part of ) the b ehaviour of a s ystem, and then subsequently optimize design para meters following an analy tica l ra ther tha n an exp erimental a pproach. W e ar e interested in developing and v alidating a nalytical mo dels for g o ssip- based systems (cf. [1 ]). These systems rely on epidemic techniques for the com- m unica tion and ex change of infor ma tion. These c o mm unica tion proto cols, while having simple sp eciﬁcations, show complex and o ften unexp ected b ehaviour when exe c uted on a larg e sc ale. Our ana lytical mo dels of gossip pro to cols need to b e realistic, yet, suﬃciently abstr act to allow for eas y prediction of systems behaviour. By ‘realistic’ we mean that they can b e applied to la rge-sca le systems and ca n capture functional and extra-functional be haviour s uch as replication, cov erag e, conv ergenc e , and other system dynamics (see [2]). Such mo dels are amenable for mathematica l a na lysis, to make prec is e predictions. F ur thermore, we will exploit the fact that be c a use an analytical mo del presents an abstrac - tion of the orig inal pro to c ol, a simulation o f the mo del tends to b e muc h more eﬃcient (in computatio n time and memor y consumption) than a simulation of an implemen tatio n of this proto co l. In this pap er, we dev elo p a n analytical mo del o f a shuﬄe proto co l from [3], which was developed to disseminate da ta items to a collectio n of wireless devices, in a decentralized fas hio n. A dec ent r alized solution considera bly decr eases the probability of information loss or unav a ila bility that may o ccur due to a single po int o f failure, o r high latency due to the overload of a no de. No des executing the pro to col p e rio dically contact each other, according to so me probability dis- tribution, and e x change data items. Concisely , a no de initiates a contact with its random neighbour , pulls a random subset of items from the contacted node, si- m ultane o usly pushing its own ra ndom subset of items. This push/pull approach has a b etter p erfo rmance than a pure push o r pull appro ach [4, 5]. The amo un t o f information exchanged during each contact b e t ween tw o co mm unica ting no des is limited. Replication ensures the av ailabilit y of the data items even in the face of dyna mic b ehaviour, which is characteristic of wireless environmen ts. Thus, no des no t only conserve the data collectively stored in the netw ork, but a lso relo cate it in a r andom fashion; node s will even tually see a ll data items. The central point of our study is a rigor ous probabilistic analysis of infor ma- tion dissemination in a larg e -scale netw ork using the aforementioned proto co l. The behaviour of the proto co l is mo delle d on a n abstract lev el as pairwise no de int er actions. When tw o neig hbo uring no des interact w ith each other (go ssip), they may undergo a state transition (exchange items) with a certain proba bilit y . The transition probabilities dep end on the probability that a given item in a no de’s cache has b een r e placed b y another item after the shuﬄe. W e calculated accurate v alues for these pro babilities, yielding a rather complicated expression. W e also determined a clos e approximation that is expresse d by a m uch simpler formula, as well a s a correctio n factor for this approximation, allowing for precise error estimatio ns. Thus we obtain a b etter understanding of the emerg e n t b e- haviour of the pr o to col, and how parameter se ttings inﬂuence its extra -functional behaviour. W e investigated tw o pr op erties characterizing the proto c ol, namely , the num- ber of replicas of a given item in the netw ork at a certain moment in time (r epli- cation), and the num b er of no des tha t have ‘s een’ this item over time (cov er- age). Using the v a lues of the tra nsition probabilities, we determined the optimal nu mber of items to exchange pe r g ossip, for a fas t conv erg ence of cov er a ge and replication. Moreov er, we determined formulas that capture the diss emination o f an item in a fully connected netw or k. All our modelling and analysis results a re conﬁrmed by la rge-sca le simulations, in which sim ulations bas ed on our analyt- ical mo dels ar e compar e d with running the actual proto c ol. T o the b est of our knowledge, we are the ﬁr st to develop a n accurate, realistic for mal mo del that can b e used to optimally design and ﬁne-tune a given gossip pr oto col. In this 2 sense, our main contribution is demonstrating the feasibility of a mo del-dr iven approach to developing real- world gossip protoco ls. The pap er is structur ed as follows. The r e mainder o f this int r o duction dis- cusses rela ted work. Sec tio n 2 explains the shuﬄe proto col. In Section 3 the analytical mo del is developed. Sectio n 4 discusses the results of o ur exp erimen- tal ev aluations. Section 5 pr esents a round-based p ersp ective of r eplication and cov erag e. And Section 6 contains the co nclusions. Related w ork Two areas of research a re relev ant to our pap er: r igorous analysis of gos s ip (and related) proto cols, a nd results from ma thema tical theor y of epidemics [6, 7]. The results from epidemics ar e often used in the analysis of gossip proto c o ls [8]. W e restrict our overview to the most relev ant publications from the a rea o f gos s ip proto cols. Several works have fo cused on g o ssip-based membership management pro to- cols. Allav ena et al. [9 ] pro p osed a g ossip-based membership ma nagement proto col and analysed the ev o lution of the n umber of links b etw een t wo no des executing the proto col. T he states of the a s so ciated Markov c hain are the num b er of links betw ee n pair s of no des. F rom the des igned Ma rko v chain they calculated the exp ected time until a netw or k partition o ccurs. This ca se s tudy als o includes a mo del of the s ystem under ch urn. A g oal of that pap er is to show the eﬀect of mixing bo th pull a nd push approaches. Eugster et al. [10] presented a ligh tw eight probabilistic broadca st algor ithm, and analyse d the evolution of pro c e s ses that gossip one mes s age. The states o f the as so ciated Mar ko v chain are the num b er of pro cesses that propag ate one gossip mess a ge. F rom the designed Marko v chain, the authors computed the distribution of the go ssiping nodes . Their analys is has shown that the exp ected nu mber of rounds to propag ate the messag e to the e n tir e sys tem do es not dep end on the out-degree of no des. The s e results are based o n the a nalysis a ssumption that the individual out-deg rees ar e uniform. Ho wev er , this simpliﬁcation has shown to b e v alid only for small systems (cf. [4 ]). Bonnet [11] studied the evolution of the in-degree distribution of no des exe- cuting the Cyclo n pr oto col [12]. The states o f the asso ciated Mar kov c ha in ar e the fraction of no des with a sp eciﬁc in-de g ree distribution. F rom the designed Marko v chain the a utho r deter mined the distribution to which the proto col con- verges. There are a num be r o f theoretical results on goss ip pro to c ols, targ eted to a distributed agg regation. Boyd et al. [1 3] studied the av er aging problem and analysed a gossip proto c ol in which no des compute the av era ge of their lo cal measur ement s. The Mar ko v chain is deﬁned by a w eig ht ed random walk o n the g raph. E very time step, a pair of no des (connected by an edg e) communicates with a transition probability , and sets their v alues equal to the av erag e of their curr ent v alues. A state of the asso ciated Ma rko v chain is a vector o f v alues at the end of the time step. The 3 authors c o nsidered the o ptimization of the neighbour sele ction pr obabilities for each node, to ﬁnd the fa s test-mixing Mar ko v c ha in (for fast co nv ergence of the algorithm) on the gr aph. Jelasity et a l. [14] prop o sed a solution for agg regation in large dynamic net- works, supp orted by a p erformance ana lysis of the proto co l. A state o f the sys tem is represented by a vector, the elements o f whic h cor resp ond to the v alues at the no des, a target v a lue of the pro to col ca lculated from the vector elemen ts, and a measure of homogeneity c ha racterizing the quality of lo ca l appr oximations. The vector ev o lves a t every step o f the system acco rding to some dis tribution. In the analysis, the authors cons idered diﬀerent strategies (e.g., neighbour selection) to optimize the proto col implementation, and calculated the expected v alues for the ab ov ementioned pr o to col parameter s. Deb et a l. [15] studied the adaptatio n of r andom netw ork co ding to gos sip proto cols. The authors analys ed the exp ected time and messa ge c omplexity of t wo go ssip pr oto cols for message transmissio n with pure push and pure pull communication mo dels. 2 A Gossip-based Pr oto col for Wireless Net works This sec tio n describ es the shuﬄe proto c o l intro duced in [3]. It is a goss ip pro to col to disseminate small data items o f general interest to a c o llection of wireless devices. T he proto col r elies on replication to ensure the a v aila bility of data items in the face of dy namic b ehaviour, which is ch a racteristic of wir eless environments. The sy stem consists of a collectio n of wireles s no des , each of w hich contributes a limited amo unt of sto r age space (which we will refer to as the no de’s cache) to store data items. The nodes per io dically swap (sh uﬄe) data items from their cache with a randomly chosen neighbo ur. In this way , nodes update their cac hes on a regular basis, allowing no des to gr adually discover new items as they ar e disseminated through the netw or k. Items can b e published by any use r o f the system, and are pro pa gated thro ugh the netw ork . While an item is a piece o f informatio n, a co py is the representation of the item in the netw or k, and for e ach item sev er al co pies may exist. As items are go ssip ed b etw een neig hbouring no des , replication may o c c ur when a no de has av aila ble storage space to keep a cop y of an item it just gossip ed to a neighbour. 2.1 Proto col assumptions All no des have a commo n agre e men t on the frequency of go ssiping. How ever, there is no agr eement on when to go ssip. In terms of storag e space, we as s ume that all no des dedicate the same amo unt of storag e spa c e to keep items lo ca lly , and that all items a re o f the same s iz e. Therefore, w e say that each no de has a cache size of c . When shuﬄing, each no de sends a ﬁxed num b er s o f the c items in the c a che. The goss ip exchange is p erfor med as a n atomic pro cedur e, meaning that once a no de initiates an exchange with another no de, these pair of no des ca nnot bec ome inv olved in another exchange until the current exc hange is ﬁnished. 4 2.2 Description No des executing the shuﬄe pro to col initia te a shuﬄe p erio dically . In order to execute the pro to col, the initiating no de needs to contact a goss iping par tner. W e describ e the proto co l from the p oint of view of eac h participating no de. W e refer to [3] for a mo r e detailed description. No de A initiates the shuﬄe b y exe c uting the following steps: 1. pic ks a neighbour ing no de B at rando m; 2. selects ra ndomly s items from the lo cal cache, a nd se nds a copy of these items to B ; 3. receives s items fr om the lo cal cache of B ; 4. c hecks whe ther any of the r eceived items are already in its cache; if so, these received items are eliminated; 5. adds the r est of the received items to the loca l cac he; if the total n umber of items exceeds cach e size c , remov es items among the ones that were sen t b y A to B , but no t those that were also received by A from B , until the cache contains c items. In resp onse to b eing contacted b y A , no de B executes the following s teps: 1. receives s items fr om the lo cal cache of A ; 2. selects rando mly s items from its lo cal ca che, and sends a copy of these items to A ; 3. c hecks whe ther any of the r eceived items are already in its cache; if so, these received items are eliminated; 4. adds the r est of the received items to the loca l cac he; if the total n umber of items exceeds cach e size c , remov es items among the ones that were sen t b y B to A , but not those that were also received b y B fro m A , un til the cache contains c items. According to the pr o to col, each no de agrees to keep the items received from a neighbour. Given the limited storage space av ailable in each node, k eeping the items received during an exchange implies discarding some items that the no de has in its cache. By picking the items to b e discarded from the ones that have bee n sent to the neighbour, the co nserv ation of data in the netw ork is ensured. 2.3 Prop erties W e are int er ested in c ha r acteristics o f the disse mina tion of data items when the proto col is executed at a large scale, i.e. with a larg e set of no des. F o r this reason, we focus o n tw o prop erties tha t can b e obse r ved in la rge deploymen ts: i) the num b er o f replicas of an item in the netw or k, and ii) the coverage achieved by an item ov er time. Replication This prop er t y is deﬁned as the frac tio n o f nodes that hold a cop y of a generic item d in their cache, a t a g iven moment. After a n item is introduced int o the net work, w ith every shuﬄe inv o lv ing a no de that ha s the item in its cache, there is a chance that a new co py of the item will be crea ted, o r that 5 the item will b e discarded. As a result, with every pass ing round the num b er of copies in the net work for a pa r ticular item ﬂuctuates. Given that the stor age space at the no des is limited, items are in constant compe titio n to place c o pies in the netw or k. Since comp etition is fair (a ll items have the same chance of being r eplicated or disca rded), even tually the stor age capacit y is ev enly divided betw ee n the existing items. T o be mor e precise, consider a netw ork of N no des, in which n diﬀerent items hav e b een published in tota l. Since there are N · c ca che ent r ies in the netw ork in total, the av era ge num b er of copies that an individua l item has in the net work will conv erge to N · c n . So replica tion will conv erge to c n . Co vera ge This prop er ty is deﬁned as the fraction of no des in the netw o rk that have seen a gener ic item d since it was introduced into the netw ork. As explained e arlier, several copies of an item a re gener ated after the item is ﬁr st published. Due to the p erio dic nature of the proto col, these copies contin ually mov e thro ugh the netw ork. This results in no des disco vering item d over s e veral rounds. With each passing round, more no des will hav e seen d . Even tually , d will hav e been s een by a ll no des (i.e., the coverage is equal to 1 ). The sp eed at which the cov erag e grows is inﬂuenced by several factors (a s will b e expla ined later on) including the num b er of diﬀerent items in the netw ork (i.e. co mpe tition), cache size, and the size of the exchange buﬀer. 3 An Analytical Mo del of Information Dissemination W e analys e dissemination of a ge ne r ic item d in a netw ork in which the no des execute the shuﬄing proto col. 3.1 Probabilities of state transiti o ns Fig. 1. Symbo lic repres ent a tion for caches of g ossiping no des. W e pr esent a model of the s h uﬄe proto col that captures the pres ence or a bs ence of a generic item d after shuﬄing of t wo nodes A a nd B . There are four p os sible states of the caches of A and B b efor e the shuﬄe: bo th hold d , either A ’s or B ’s cache holds d , or neither cache holds d . W e use the nota tion P ( a 2 b 2 | a 1 b 1 ) for the pro bability that from state a 1 b 1 af- ter a s hu ﬄe we get to state a 2 b 2 , with a i , b i ∈ { 0 , 1 } . The indices a 1 , a 2 and b 1 , b 2 indicate the presence (if equal to 1) or the abs ence (if e q ual to 0) of a generic item d in the cache of an initiato r A and the con ta c ted node B , resp ectively . F or example, P (01 | 10) means that no de A had d befor e the shuﬄe, whic h then mov ed to 6 the cache of B , afterwards. Due to the s ymmetry of informatio n exchange b e- t ween nodes A and B in the s huﬄe proto col, P ( a 2 b 2 | a 1 b 1 ) = P ( b 2 a 2 | b 1 a 1 ). Fig. 1 depicts all p ossible outco mes fo r the caches of go s siping no des as a state transition diag ram. If b e fo re the exchange A and B do not ha ve d ( a 1 b 1 = 00), then clear ly after the exchange A and B still do not have d ( a 2 b 2 = 0 0 ). Otherwise, if A or B has d ( a 1 = 1 ∨ b 1 = 1), the shuﬄe proto co l guarantees that after the exchange A or B s till has d ( a 2 = 1 ∨ b 2 = 1). Therefor e, the state ( − , − ) has a self-tra nsition, and no other outgoing or incoming tra nsitions. W e deter mine v alues for all pro babilities P ( a 2 b 2 | a 1 b 1 ). They are expressed in terms of proba bilities P sele ct and P dr op . The probability P sele ct expresses the chance of an item to be selected b y a no de from its lo cal ca che when engage d in an exchange. The pro ba bilit y P dr op represents a proba bilit y that an item which can b e ov erwr itten (meaning it is in the ex change buﬀer of its no de, but not of the other no de in the s huﬄe) is indeed overwritten by a n item received by its no de in the shuﬄe. Due to the symmetr y of the proto col, these probabilities ar e the same for bo th initiating and contacted no des. In Sec. 3.2 , we will ca lculate P sele ct and P dr op . W e write P ¬ select for 1 − P sele ct and P ¬ dr op for 1 − P dr op . Scenario 1 ( a 1 b 1 = 00 ) Befor e shuﬄing, neither node A nor no de B have d in their cache. a 2 b 2 = 00 : neither node A nor no de B have item d after a shuﬄe b ecause neither of them had it in the c aches b efor e the shuﬄe: P (00 | 0 0) = 1 a 2 b 2 ∈ { 01 , 10 , 1 1 } : cannot o c c ur, bec a use none of the no des have item d . Scenario 2 ( a 1 b 1 = 01 ) Befor e shuﬄing, a co py of d is only in the ca che of no de B . a 2 b 2 = 01 : no de A do es not hav e d b ecause no de B had d but did not select it (to send) and, th us , B did not o verwrite d , i.e. the probability is P (01 | 01) = P ¬ select a 2 b 2 = 10 : o nly no de A has d b ecause no de B selected d and dropp ed it; that is, the pro bability is P (10 | 0 1 ) = P sele ct · P dr op a 2 b 2 = 11 : b oth nodes A and B have a co py of d b ecaus e no de B selected d and kept it; that is, P (11 | 01) = P sele ct · P ¬ dr op a 2 b 2 = 00 : ca nnot o ccur as completely dis carding d is no t p o ssible in the pro - to col; that is , if either no des send an item, its pa rtner keeps this copy as well, and if an item is not a mo ng the selected for a shuﬄe, the item is not replaced by another one (see Sec. 2.2). Scenario 3 ( a 1 b 1 = 10 ) Befor e shuﬄing, d is only in the cache of no de A . Due to the sy mmetry o f no des A a nd B , this scena r io is symmetric to the previous one with P ( a 2 b 2 | 10) = P ( b 2 a 2 | 01). Scenario 4 ( a 1 b 1 = 11 ) Befor e shuﬄing, d is in the cache of node A as well as in the cache of node B . a 2 b 2 = 01 : o nly no de B has d b ecause no de A sele c ted d and dropp ed it and no de B did not select d ; that is, P (01 | 11) = P sele ct · P dr op · P ¬ select 7 a 2 b 2 = 10 : this outcome is symmetr ic to the previo us one: P (10 | 11) = P ¬ select · P sele ct · P dr op a 2 b 2 = 11 : a fter the shuﬄe b oth no des A and B hav e d , b ecause: – nodes A and B had d but b oth did not select it, i.e. P ¬ select · P ¬ select ; – both no des A and B selected d (thus, both kept it), i.e. P sele ct · P sele ct ; – node A selected d and kept it and no de B did not select d : P sele ct · P ¬ dr op · P ¬ select ; – symmetric case with the previous o ne: P ¬ select · P sele ct · P ¬ dr op . Thu s , P (11 | 11) = P ¬ select · P ¬ select + P sele ct · P sele ct + 2 · P sele ct · P ¬ select · P ¬ dr op a 2 b 2 = 00 : ca nnot o ccur , discar ding of an item is not permitted by the pr oto col (see Sec. 2.2). 3.2 Probabilities of selecting and dropp i ng an item The follo wing analysis assumes that all node caches are full (that is, the netw or k is alrea dy r unning for a while). Mor eov er, w e assume a uniform distr ibution of items ov er the netw ork ; this ass umption is suppor ted by ex p er iment s in [3 , 4]. Consider no des A and B engag ed in a shuﬄ e, and let B receive the exchange buﬀer S A from A . Let k b e the num b er of duplica tes (see Fig. 2), i.e. the items of an in tersection of the no de ca che C B and the ex change buﬀer of its g o ssiping partner S A (i.e. S A ∩ C B ). Recall fro m Se c . 2.1 that C A and C B contain the same n umber of items for all A and B , and likewise for S A and S B ; we use c and s for these v alues. The total num ber of diﬀerent items in the netw ork is denoted as n . n S A k C B Fig. 2. k items in S A ∩ C B S A S B b s C B Fig. 3. b s items in S A ∩ S B The proba bility o f selecting an item d in the cache is the probability of a single selection trial (i.e. 1 c ) times the num b er of selec tio ns (i.e. s ): P sele ct = s c . Thu s , the proba bilit y that an item d in the cache is not se lected is: P ¬ select = 1 − P sele ct = c − s c . Consider Figs. 2 and 3 . The shuﬄe proto col demands that all items in S A are kept in C B after the shuﬄe. This implies that: a) a ll items in S A \ C B will ov erwrite items in S B ⊆ C B , and b) all items in S A ∩ C B are k ept in C B . Th us, the proba bilit y that an item from S B will b e ov erwr itten is deter mined by the 8 probability that an item from S A is in C B , but not in S B . Namely , the items in S B \ S A provide a space in the cache for items from S A \ C B . W e w ould like to express the probability P dr op of a s elected item d in S B \ S A (or S A \ S B ) to be ov erwritten by another item in C B (or C A ). Due to symmetry , this probability is the same for A a nd B ; therefore, we o nly calculate the pro bability that an item in S B \ S A is dropp ed from C B . The exp ected v alue of this proba bility dep ends on how many duplicates a no de re ceives from its g ossiping partner: E [ P dr op ] =            s X k =0 ( P | S A ∩ C B | = k dr op · P | S A ∩ C B | = k ) if s + c 6 n s X k =( s + c ) − n ( P | S A ∩ C B | = k dr op · P | S A ∩ C B | = k ) otherwise where P | S A ∩ C B | = k is the probability of having exactly k items in S A ∩ C B , and P | S A ∩ C B | = k dr op is the proba bility that an item in S B \ S A is dropped from C B given k duplica tes in S A ∩ C B . The case distinction is because if s + c > n , then clea rly there are at lea st ( s + c ) − n items in S A ∩ C B . F rom the  n s  po ssible sets S A , we compute how ma ny hav e k items in co mmo n with C B . Firstly , there are  c k  wa ys to choos e k such items in C B . Secondly , there are  n − c s − k  wa ys to choo s e the remaining s − k items o utside C B . So in total,  c k  ·  n − c s − k  po ssible sets S A hav e k items in common with C B . Hence, under the assumption o f a uniform distribution of the da ta items over the caches of the no des, 1 P | S A ∩ C B | = k =  c k  ( n − c s − k ) ( n s ) . The exp ected v alue of P | S A ∩ C B | = k dr op is: E [ P | S A ∩ C B | = k dr op ] =              k X b s =0 P | S A ∩ S B | = b s dr op · P | S A ∩ S B | = b s if s + k 6 c k X b s =( s + k ) − c P | S A ∩ S B | = b s dr op · P | S A ∩ S B | = b s otherwise where b s is the num b er of items in S A ∩ S B (see Fig. 3). T he cas e distinction is bec ause if s + k > c (with k the num b er of items in S A ∩ C B ), then clearly there are at least ( s + k ) − c items in S A ∩ S B . Among the s items in S B , there a re b s items also in S A , and th us o nly the s − b s items in S B \ S A can b e dropp ed from C B . P | S A ∩ S B | = b s dr op is the probability that an item in S B \ S A is dropp ed from C B , given b s items in S A ∩ S B : P | S A ∩ S B | = b s dr op = ( 0 if s = b s s − k s − b s otherwise P | S A ∩ S B | = b s is the pr obability of having ex actly b s items in S A ∩ S B : E [ P | S A ∩ S B | = b s ] =  s b s  ( c − s k − b s ) ( c k ) . The intuit io n b ehind this exp ected v alue of P | S A ∩ S B | = b s is similar to 1 Here we u se a generalization of t h e usual deﬁn ition of binomial co eﬃcien ts to negative integ ers. That is, for all m and l ≥ 0, ` m l ´ = ( − 1) l ` − m + l − 1 l ´ (cf. [16]) 9 the one o f P | S A ∩ C B | = k . F r om the  c k  po ssible s e ts S A , we co mpute how many hav e b s items in common with S B . That is, there ar e  s b s  wa ys to c ho o se b s items in S B , and  c − s k − b s  wa ys to c ho o se the remaining k − b s items outside S B . Let’s assume 2 s ≤ c ≤ n − s (b eca use then s + c ≤ n a nd s + k ≤ 2 s ≤ c ). Then, substituting in the expression for E [ P dr op ] in ca se s + c ≤ n , and noting that in the summand k = s the factor P | S A ∩ S B | = s dr op is equal to zero , w e get: E [ P dr op ] = s − 1 X k =0  c k   n − c s − k   n s  k X b s =0 s − k s − b s  s b s   c − s k − b s   c k  = n − c  n s  s − 1 X k =0  ( n − c ) − 1 ( s − k ) − 1  k X b s =0  c − s k − b s  s b s  s − b s (1) The probability of keeping an item d in S B \ S A ⊆ C B can be expr essed as P ¬ dr op = 1 − P dr op . 3.3 Simpliﬁ cation of P dr op In order to ga in a clea rer insight into the emerge nt b ehaviour of the gos s iping proto col we make an eﬀort to simplify the for m ula for the probability P dr op of an item in S B \ S A to b e dropp ed from C B after a s huﬄe. Therefor e, we re- examine the rela tio nships b etw een the k duplicates received fr om a neighbour, the b s items of the overlap S A ∩ S B , and P dr op . Let’s estimate P | S A ∩ C B | = k dr op by considering each item from S A separately , and calculating the probability that the item is a duplicate (i.e., is also in C B ). The proba bilit y o f an item from S A to b e a duplicate (also present in C B ) is c n . In v iew of the uniform distribution of items o ver the netw ork , the items in a no de’s cache are a ra ndom sa mple from the universe of n data items; so all items in S A hav e the s ame chance to be a duplicate. Th us, the exp ected num b er of items in S A ∩ C B can be estimated by E [ k ] = s · c n . And the ex pec ted num b er of items in S A ∩ S B can b e es tima ted by E [ b s ] = k · s c , b ecaus e only the k items in S A ∩ C B may end up in S A ∩ C B ; s c captures the probability that an item fr o m C B is also selected to be in S B . It follows that the pr obability of an item in S B \ S A to b e dropp ed fro m C B after the shuﬄe is E [ P dr op ] = s − k s − b s = s − s · c n s − s · c n · s c = n − c n − s . The complementary probability of keeping an item is E [ P ¬ dr op ] = 1 − n − c n − s = c − s n − s . These estimates a re v alid for general s ≤ c ≤ n . Substituting the expressions for P sele ct and the simpliﬁed P dr op int o the for- m ula s for the transition pro babilities in Fig. 1, we obtain: P (01 | 01) = P (10 | 10) = c − s c P (01 | 11) = P (10 | 11 ) = s c c − s c n − c n − s P (10 | 01) = P (01 | 10) = s c n − c n − s P (11 | 11) = 1 − 2 s c c − s c n − c n − s P (11 | 01) = P (11 | 10) = s c c − s n − s In or de r to verify the accuracy of the prop osed simpliﬁcation for E [ P dr op ], we compare the simpliﬁcation and the accura te formula (1) for diﬀerent v alues of 10 n . W e plo t the diﬀerence of the a ccurate P dr op and the simpliﬁcation, for ca che sizes c = 250 and c = 500 (Fig. 4 ). 1e-200 1e-180 1e-160 1e-140 1e-120 1e-100 1e-80 1e-60 1e-40 1e-20 1 30 60 90 120 relative error exchange buffer size c=250 n=700 n=1100 n=1500 1e-15 1e-10 1e-05 1 1 3 5 1e-15 1e-10 1e-05 1 1 3 5 0 1e-300 1e-250 1e-200 1e-150 1e-100 1e-50 1 60 120 180 exchange buffer size c=500 n=1400 n=2200 n=3000 1e-15 1e-10 1e-05 1 1 3 5 1e-15 1e-10 1e-05 1 1 3 5 Fig. 4. The diﬀerence of the accurate P dr op and its appro ximation, for diﬀeren t val u es of n and c . 3.4 Correction factor W e now examine how clo sely the simpliﬁed formula E [ P dr op ] = n − c n − s (here re- ferred a s S ( n, c, s )) a pproximates for mula (1) (here referr ed as E ( n, c, s )). W e compared the diﬀerence b etw een these tw o formulas using an implementation on the basis of common fractio ns , which provides loss-les s ca lculation [17]. W e observed that the inv erse of the diﬀerence of the inv ers e v alues of b oth formulas, i.e. e c,s ( n ) =  E ( n, c, s ) − 1 − S ( n, c, s ) − 1  − 1 , exhibits a certa in pattern for diﬀer- ent v alues o f n , c and s . F or s = 1, E ( n, c, 1) = n − c n , whereas S ( n, c, 1) = n − c n − 1 . W e then inv estigate the cor rection factor θ in E ( n, c, s ) = n − c ( n − s )+ θ . Thus, for s = 1 we have θ = 1. Y et, for s > 1 the situa tio n tur ned o ut to b e more complicated. F or s = 2, we go t e 4 , 2 (7) − e 4 , 2 (6) = 3 . 5, e 4 , 2 (8) − e 4 , 2 (7) = 4, e 4 , 2 (9) − e 4 , 2 (8) = 4 . 5, and etc. Therefo r e w e ca lculated the ﬁrst, the second a nd other (forward) diﬀerences 2 ov er n . W e r ecognized that the s - th diﬀerence of the function e c,s ( n ) is always 1 s . Moreover, at the p oint n = 0 the 1st, . . . , s - th dif- ferences of the function e c,s exhibit a pattern similar to the Pascal triangle [19]; i.e. for d ≥ 1 the d -th diﬀerence is: (∆ d e c,s )(0) = 1 s · ( s − 1 d ) (assuming  a b  = 0, whenever b > a ). Knowing the initial diﬀerence at the p o in t n = 0, we were able to use the Newton for ward diﬀerence equatio n [18 ] to derive the following formula fo r n > 0: E [ P dr op ] = n − c ( n − s )+ 1 γ , where γ = s − 1 X d =0  n d  s ·  s − 1 d  =  n s  ( n − s ) + 1 · s − 1 X d =0 1  n − d ( s − 1) − d  (2) 2 A forw ard diﬀerence of discrete function f : Z → Z is a function ∆f : Z → Z with ∆f ( n ) = f ( n + 1) − f ( n ) (cf. [18] ). 11 In this equation the sum is ﬁnite b ecause due to the o bserv ation that the s -th diﬀerence is constant 1 s , all higher diﬀerences ar e 0. Extensive exp er imen ts with Mathematica and Matlab indicate that n − c ( n − s )+ 1 γ and formula (1) coincide. W e can also see from Fig. 4 that the corr ection facto r is small. 3.5 Optimal si ze for the excha nge buﬀer 50 55 60 65 70 75 80 85 90 0 250 500 750 1000 optimal exchange buffer size items (n) Fig. 5 . O ptimal v alue of exchange buﬀer size, dep ending o n n . W e study what is the optimal v alue for fa s t c o nv ergence of replication and coverage with resp ect to an item d . Since d is introduced at only one no de in the netw ork, o ne needs to o ptimize the chance that an item is duplicated. That is , the probabilities P (11 | 01) and P (11 | 10) should b e optimized (then P (01 | 11 ) and P (10 | 11) are o ptimized as well, int uitively b ecause for eac h dupli- cated item in a shuﬄe, another item m ust b e dropp ed). Thes e pr o babil- ities bo th equal s c c − s n − s ; w e compute when the s -der iv ative of this form ula is zer o. This yields the equation s 2 − 2 ns + nc = 0; taking into the a ccount that s ≤ n , the only solution of this equatio n is s = n − p n ( n − c ). W e conclude that this is the optimal v alue fo r s to obtain fast conv er gence o f replication and cov era ge. This will also be conﬁrmed by the exp eriments a nd analyses in the following sections. 4 Exp erimen tal Ev aluation In o rder to test the v alidity of the ana ly tical mo del of informa tion spr e ad under the shuﬄe pr o to col presented in the previous section, we follow ed an exp eri- men ta l appr oach. W e compar ed prop erties observed while running the shuﬄe proto col in a large- scale deploymen t with simulations of the mo del under the same co nditions. These expe r iments show that the analytical mo del indeed cap- tures infor mation spr ead of the shuﬄe pro to col. W e note that a simulation of the analy tical mo del is muc h mor e eﬃcient (in computation time a nd memory consumption) than a simulation of the implement a tion of the shuﬄe proto col. The exp eriments simulate the case where a new item d is intro duced at o ne no de in a netw or k, in which all caches are full and unifor mly p opula ted by n = 5 00 items. They w ere performed o n a net work of N = 2500 no des, arranged in a square grid top o logy (50 × 50 ), where each no de can communicate only with its four immediate neighbo urs (to the North, South, Ea s t and W est). 12 This conﬁgura tion of no des is ar bitrary , we only require a la rge num be r of no des for the obser v ation of emergent b ehaviour. Our aim is to v alida te the co r- rectness o f our analytica l mo del, not to test the endless po ssibilities of net work conﬁguratio ns. The model and the shuﬄe pr oto col do no t make any a ssumptions ab out the net work. The net work conﬁguratio n is provided by the simulation en- vironment and can easily b e c hang ed into something diﬀer ent , e.g. other netw o r k top ology . F or this reaso n, we hav e chosen this la r ge grid for testing, although other conﬁgura tions could ha ve been po ssible. Each no de has a cache size of c = 100 , and sends s items when go ssiping. In each round, every no de r andomly sele c ts one of its neighbour s , and up dates its state according to the transition probabilities in tro duced befor e (Fig. 1). This mimics (the probabilities of ) an a ctual exchange o f items betw een a pair of no des a ccording to the shuﬄe proto col. While in the proto co l, this r e sults in b oth no des up dating the conten ts o f their cac hes, in a simulation using the analytical mo del, up da ting the state of a no de refers to up dating only one v ariable: whether the no de is in p ossess io n of the item d or not. In the exp eriments, a fter each gossiping ro und, we measure d the total num b er of o ccurrences of d in the netw or k (replication), and how ma ny no des in total hav e seen d (cov era ge); se e Sec. 2.3 . In order to ﬁll the caches o f the no des with a r andom selection of items, the measurements a re initiated after 1 000 rounds of gossiping . In o ther words, 500 diﬀerent items are inserted at the b eginning of the simulation, and sh uﬄed for 1000 rounds. During this time, items a re replica ted a nd the replicas ﬁll the caches of all no des. At round 100 0, a co py of the fres h item d is inserted a t a random lo ca tio n, and its spread throug h the netw o rk is track ed over the nex t 2000 rounds. 0 100 200 300 400 500 600 0 200 400 600 800 1000 1200 1400 number of replicas rounds s = 95 s = 50 s = 10 0 100 200 300 400 500 600 0 200 400 600 800 1000 1200 1400 number of replicas rounds s = 95 s = 50 s = 10 0 0.2 0.4 0.6 0.8 1 0 200 400 600 800 1000 1200 1400 coverage (% of nodes) rounds s = 95 s = 50 s = 10 0 0.2 0.4 0.6 0.8 1 0 200 400 600 800 1000 1200 1400 coverage (% of nodes) rounds s = 95 s = 50 s = 10 Fig. 6. The shuﬄe proto col ( left) and the mod el (right), for N = 2500, n = 500, c = 100 and diﬀerent v alues of s . 13 Fig. 6 shows the behaviour of bo th the s huﬄe pr oto col and the analytical mo del in terms of replica tion and cov erag e of d , for v ar ious v alues of s . E ach curve in the g raphs repres ent s the av era ge and standar d devia tion calculated over 100 runs. The ex per iments with the mo del calcula te P dr op using the simpliﬁed formula n − c n − s describ ed in Sec. 3.3 . It can be observed very clearly that the results obtained from the mo del (right) rese mble close ly the ones fro m executing the proto col (left). W e note that in all cases , the netw ork conv erg es to a situa tio n in whic h there are 5 00 copies o f d , mea ning that replication is 500 2500 = 0 . 2; this ag rees with the fact that c n = 100 500 = 0 . 2. Moreover, our exp eriments show that r eplication and cov erag e displa y the fastest conv ergence when s = 50 ; this agrees with the fact that n − p n ( n − c ) = 5 00 − √ 500 · 40 0 ≈ 50 (cf. Sec. 3 .5). 5 Round-based Mo delling of Proto col Prop erties In this sectio n we e x ploit the ana lytical mo del of infor mation disseminatio n to per form a mathematical analy s is of replication and cov era ge with regar d to the shuﬄe pr oto col. F or the particula r case of a netw ork with full connectiv ity , where a no de can g ossip with any other no de in the netw ork, we can ﬁnd explicit ex- pressions for the disse mina tion of a generic item d in terms of the proba bilities presented in Sec. 3 . W e construct tw o diﬀerential equations that ca pture repli- cation and coverage of item d from a round-based p er sp ective. The a dv an tag e of this approach is that we can determine the lo ng-term b ehaviour of the system as a function of the pa rameters. 5.1 Replication One no de intro duces a new item d into the netw ork at time t = 0, by placing d int o its cache. F rom that moment o n, d is r eplicated a s a consequence of go ssiping among no des. Let x ( t ) represe n t the p ercentage of nodes in the netw ork that hav e d in their cache at time t , where each gossip ro und ta kes one time unit. The v ariation in x per time unit dx dt can be derived based on the pr o bability that an item d will replicate or disapp ear after an exchange betw een tw o no des, where at least one of the no des has d in its cache: dx dt = [ P (11 | 10) + P (11 | 01 )] · (1 − x ) · x − [ P (10 | 1 1 ) + P (01 | 11)] · x · x The ﬁrst ter m repr esents duplication of d when a no de that has d in its cache initiates the shuﬄe, and contacts a node that do es not have the item. The second term repr e sents the opp osite situation, when a no de that do es not hav e the item d initiates a sh uﬄe with a no de tha t has d . The third and fourth term in the equation r epresent the ca ses where b oth gossiping no des hav e d in their c a che, and a fter the ex change only one copy o f d re ma ins. Substituting 14 P (11 | 10) = P (11 | 01 ) = s c c − s n − s and P (10 | 11) = P (01 | 11 ) = s c n − c n − s c − s c , w e obtain dx dt = 2 · s c · c − s n − s · x · (1 − n c · x ) (3) The solution of this equation, taking into account that x (0) = 1 N with N the nu mber o f no des in the netw or k , is x ( t ) = e αt ( N − n c ) + n c e αt (4) where α denotes 2 s c c − s n − s . By imp osing s ta tionarity , i.e. dx dt = 0, we ﬁnd the stationary solution c n . Hence, this ca lculation conﬁr ms the observ ation in Se c . 2.3 that the netw ork conv er g es to a situation in which replication of d is c n . 0 0.05 0.1 0.15 0.2 0.25 0.3 0 200 400 600 800 1000 replicas (% of nodes) rounds simulation (average) model 0 0.05 0.1 0.15 0.2 0.25 0.3 0 200 400 600 800 1000 replicas (% of nodes) rounds simulation (average) model 500 items 1000 items 0 0.05 0.1 0.15 0.2 0.25 0.3 0 200 400 600 800 1000 replicas (% of nodes) rounds simulation (average) model 2000 items Fig. 7. P ercentage of no des in the netw ork with a replica of item d in th eir cache, fo r N = 2500, c = 100, s = 50, and n = 500, n = 1000 or n = 2000. W e ev aluate the accuracy o f x ( t ) as a r epresentation of the fractio n of no des carrying a replica of d , by running a ser ies of exper iments where N = 25 00 no des 15 execute the shuﬄe proto co l, and their caches are monito r ed for the presence o f d . Unlike the exp eriments in Sec. 4, we assume full co nnectivity; that is, for each no de, all o ther no des are within reach. After 10 00 rounds, wher e items are disseminated and r e plicated, a new item d is inser ted at a random no de, at time t = 0. W e track the num b er of replicas of d fo r the next 1 000 ro unds. The exp eriment is r ep eated 100 times and the results are av era ged. Thes e sim ulation results (av erag e and standar d deviation) for the pro to c ol, together with x ( t ), are presented in Fig. 7. This ﬁg ure shows the same initial increase in r eplicas after d has b een inser ted, and in all cases the s tea dy state reaches pr e cisely the exp ected v alue c n predicted from the sta tio nary solution. W e rep ea t the ca lculation fro m Sec. 3.5 , but now aga inst x ( t ), to deter- mine which size of the exchange buﬀer yields the fastest co n vergence to the steady-state for both replica tion and coverage. That is, w e search for the s tha t maximizes the v alue of x ( t ). W e ﬁrst compute the deriv ative of x ( t ) with resp ect to s ( z ( t, s )), a nd then derive the v alue of s that maximizes x ( t ), by tak ing z ( · , m ) = ∂ x ∂ s | m = 0: z ( t, s ) = ∂ x ∂ s = 2 e kt ( cN − n )( cn + s ( − 2 n + s )) t ( cN +( − 1+ e kt ) n ) 2 ( n − s ) 2 , where k = 2 s c c − s n − s . Let z ( t, s ) = 0. F or t > 0, c n = s (2 n − s ). Solving this equa tio n we get s = n ± p n ( n − c ). T aking in to the account that s ≤ n , the only solution is s = n − p n ( n − c ) . So this c o incides with the optimal exchange buﬀer size found in Sec. 3.5. 5.2 Co vera ge W e use the term cov erag e to deno te the p ercentage of no des in the net work that hav e seen item d from the momen t it was in tro duced into the netw or k. Let y ( t ) represent the cov erag e of d at time t . The v ariation in cov erag e p er time unit, dy dt , is determined b y the fraction of no des that hav e not see n d , 1 − y , that in tera c ts with no des that hav e d in their cache, x . Let ∗ ∈ { 0 , 1 } , then: dy dt = P (1 ∗ | 01) · P ( ∗ 1 | ∗ 1 ) · (1 − y ) · x + P (1 ∗ | 01) · P (1 ∗ | 1 ∗ ) · x · (1 − y ) (5) The ﬁr s t term is repres ents increased cov erag e due to no des discovering d after int er acting with no des that hav e d in their cache. This ca n o cc ur when a no de initiates the exchange ( P (1 ∗ | 01 )), or when the no de is co n ta c ted ( P ( ∗ 1 | 10)). The seco nd par t of these terms represe n ts the case when a no de discov er s a nd do es no t give aw ay its copy of d within the same ro und to another no de. This is b ecause coverage is only measur ed a t the end of a go ssiping round, meaning that a no de tha t sees item d for the ﬁrst time, and drops it in the same round, is cons idered not to ha ve se en item d y et. 3 Since no des shuﬄe, on average, twice per round (once when they initiate the s hu ﬄe a nd ag ain if they a r e co nt a cted by a neighbour), this could o ccur under t wo scena rios: i) the no de acquire d d 3 The reason for th is is t hat the application has an op p ortunity to read from the lo wer-lev el cac h e only once every round . 16 by initiating an exchange with a no de that had d ( P (1 ∗ | 01)) and next lost its copy of d when shuﬄing with a no de that contacted it ( P ( ∗ 1 | ∗ 1)), or ii) the no de was ﬁrs t contacted by a no de that sent a co py of d ( P ( ∗ 1 | 1 0)) and nex t initiated a shuﬄe and g av e aw ay its copy of d ( P (1 ∗| 1 ∗ )). Note that P ( ∗ 1 |∗ 1) = 1 − P (01 | 10 ), as the proba bilit y P (01 | 10) is the o nly transition probability that do es no t ma tch the pattern P ( ∗ 1 | ∗ 1 ). Due to the s ymmetry o f b o th g ossiping no des, P ( ∗ 1 | ∗ 1) = P (1 ∗ | 1 ∗ ). Substituting (4) into (5), we obta in dy dt = 2 · s c ·  1 − s c · n − c n − s  · (1 − y ) · e αt ( N − n c ) + n c e αt 0 0.2 0.4 0.6 0.8 1 0 200 400 600 800 1000 coverage (% of nodes) rounds simulation (average) model 0 0.2 0.4 0.6 0.8 1 0 200 400 600 800 1000 coverage (% of nodes) rounds simulation (average) model 500 items 1000 items 0 0.2 0.4 0.6 0.8 1 0 200 400 600 800 1000 coverage (% of nodes) rounds simulation (average) model 2000 items Fig. 8. P ercentage of nodes in th e netw ork that hav e already seen a replica of item d , for N = 2500, c = 100, s = 50, and n = 500, n = 1000 or n = 2000. The solution of this equa tion, taking int o a ccount that y (0) = 1 N , is y ( t ) = 1 − ( N − 1) N β − 1  N − n c  + n c · e αt  − β 17 where β denotes n c · ( 1 − s c · n − c n − s ) c − s n − s . By impo sing s tationarity dy dt = 0, we ﬁnd the stationary solution 1, meaning that even tually all no des will s e e d . In order to ev aluate ho w closely y ( t ) models co verage, we us e the traces from the simulations executed for Sec . 5 .1 . A t e very round, the no des that car ry a replica of d ar e identiﬁed, and a re cord of the no des that hav e seen d since it was published is kept. Fig . 8 presents the cov era ge measured for four sets of exp eriments, each set with a diﬀerent v alue for n . As n increases, a new ly inserted item r equires mor e time to cov er the whole net work. This is due to having more comp etition from other items to create replicas in the limited space av aila ble, as was previo usly s hown in Fig. 7. Howev er, as predicted b y the statio nary solution, in all c a ses the coverage eventually rea ches 1. As shown in Fig. 8, the s olution y ( t ) mo dels the b ehaviour observed in simulations, falling nicely within the standard deviation of the simulation results. 6 Conclusions In this pap er, we hav e demonstrated that it is p os s ible to mo del a g ossip pro to col through a rigo rous proba bilis tic ana ly sis of the state transitio ns of a pair o f no des engaged in the g ossip. W e hav e shown, thr ough an extensive s im ula tion study , that the dissemination of a data item ca n b e faithfully repro duced by the mo del. Having an accura te mo del of no de interactions, we hav e b een able to car r y out the following: – After ﬁnding precise expressions for the pr obabilities involv ed in the mo del, we provide a simpliﬁed version of the transition probabilities. These sim- pliﬁed, y et accurate, expressions c a n b e easily computed, allowing us to simulate the disseminatio n of an item without the c omplexity of executing the a ctual shuﬄe proto c o l. These simulations use very little state (only some parameters and v ariables, as opp osed to maintaining a cache) a nd can b e executed in a fra ction of the time required to run the proto col. – The mo del r eveals relationships be t ween the pa r ameters of the sy stem. Armed with this knowledge, we successfully optimize o ne o f the para meter s (the size of the exchange buﬀer) to obtain the fastest con vergence of the observed prop erties. – Under the a ssumption of full connectivity , we are able to use the transition probabilities to mo del the prop erties of the dissemination of a g eneric item. Each pro pe r ty is ultimately expr essed as a for mula which is shown to display the same behaviour a s the average behaviour o f the pr oto col, verifying the v alidit y of the mo del. While gossip pr oto cols are easy to understand, ev en for a s imple push/pull pr o- to col, the int er actions b etw een no des a re unexp ectedly complex. Understand- ing these int er actions provides insig ht into the mechanics b ehind the emerg ent behaviour observed in goss ip proto cols. W e b e lieve that unders tanding the me- chanics of gossiping is the key to o ptimizing (and even shaping ) the emergent 18 prop erties that make g o ssiping app ealing as communication par adigm for dis- tributed systems. References 1. Bakhshi, R., Bonnet, F., F okk ink, W., Ha verko rt, B.: F ormal analysis techniques for gossiping proto cols. ACM SIGOPS Op er. Syst. Rev . 41 (5) (2007) 28–36 2. Eugster, P ., Guerraoui, R., Kermarrec, A.M., Massouli ´ e, L.: Epidemic Information Dissemination in Distributed S ystems. IEEE Co mp uter 37 (5) (2004) 60–67 3. Ga vidia, D., V oulga ris, S., v an Steen, M.: A Gossip-based Distribut ed News Service for Wireless Mesh Netw orks. In: Pro c. 3rd I EEE Conf. on Wireless On -demand Netw ork Sy stems and Services (WONS’06), IEEE (2006) 59–67 4. Jelasit y , M., V oulgaris, S., Guerraoui, R., Kermarrec, A.M., v an Steen, M.: Gossi p - based p eer sampling. ACM T rans. Comput. Sy st. 25 (3) (2007 ) 8 5. Karp, R ., Schindelhauer, C., Shenker, S., V o cking, B.: Rand omized ru mor spread- ing. In: Proc. 41st S y mp. on F ound. of Comput. S ci. ( F O CS’00), IEEE (2000) 565–574 6. Bailey , N.T.: Mathematical Theory of I n fectious Diseases an d Its Ap plications. second edn. Griﬃn, London (1975) 7. Daley , D.J., Gani, J.: Epidemic Mo delling: An I ntroduction. Cambridge Universit y Press, Cambridge, U K (1999) 8. Eugster, P ., Guerraoui, R., Kermarrec, A .M., Massouli ´ e, L.: F rom epidemics to distributed comp u ting. I EEE Computer 37 (5) (2004) 60–67 9. Alla vena, A., Demers, A., Hop croft, J.E.: Correctness of a gossip based membership protocol. In: Pro c. 24th ACM Symp . on Principles of Distributed Computing (PODC’05), A CM Press (2005) 292 –301 10. Eugster, P ., Guerraoui, R., H anduruk and e, S., Kermarrec, A.M., Kouznetsov, P .: Ligh tw eight Probabilistic Broadcast. ACM T rans. Comput. Syst. 21 (4) (2003) 341–374 11. Bonnet, F.: P erformance analysis of Cyclon, an inexp ensive mem b ership man- agemen t for unstructu red P2P overl ays. Master thesis, EN S Cachan Bretagne, Universit y of Rennes, IR ISA (2006) 12. V oulgaris, S., Gavidia, D., v an St een , M.: Cyclon: In ex p ensive membership man- agemen t for unstructured p2p ov erlays. J. Netw ork and Syst. Manage. 13 (2) (2005) 197–217 13. Bo yd, S., Ghosh, A., Prabhaka r, B., S hah, D.: Gossip algorithms: Design, anal- ysis and applications. In: Proc. 24th IEEE Conf. on Comput. Commun. (IN F O - COM’05). V olume 3., IEEE (2005) 1653–1 664 14. Jelasi ty , M., Montresor, A., Babaoglu, O.: Gossip-based aggregatio n in large dy- namic n etw orks. ACM T rans. Comput. Syst. 23 (3) (2005) 219–252 15. Deb, S ., M´ edard, M., Choute, C.: Algebraic gossip: a netw ork co ding approac h to optimal multiple rumor mongering. IEEE/ACM T rans. Netw. 14 (SI) (2006) 2486–25 07 16. Hilton, P ., Holton, D., Pe d ersen, J.: Mathematical Reﬂ ections. S pringer-V erlag , New Y ork (1997) 17. Gillela nd , M.: W orkin g with F ractions in Ja va (2002) http://www .merriampark.com/ fractions.ht m . 18. Abramo witz, M., Stegun, I.A.: Handb ook of Mathematical F unctions with F ormu- las, Graph s, and Mathematical T ables . 9 edn. Dov er, New Y ork (1972) 19 19. R.Graham, Knuth, D., P otashnik , O.: Concrete Mathematics. 2 edn . A ddison- W esley (1994) 20

An Analytical Model of Information Dissemination for a Gossip-based Protocol

Original Paper

Comments & Academic Discussion

Leave a Comment

Original Paper

Related Papers

Comments & Academic Discussion

Leave a Comment