Emergence and resilience of cooperation in the spatial Prisoner's Dilemma via a reward mechanism

Raúl Jiménez a,d, Haydee Lugo b,d, José A. Cuesta c, Angel Sánchez c,e,∗

a Departamento de Estadística, Facultad de Ciencias Sociales y Jurídicas, Universidad Carlos III de Madrid, 28903 Getafe, Spain
b Departamento de Economía, Facultad de Ciencias Sociales y Jurídicas, Universidad Carlos III de Madrid, 28903 Getafe, Spain
c Grupo Interdisciplinar de Sistemas Complejos (GISC), Departamento de Matemáticas, Escuela Politécnica Superior, Universidad Carlos III de Madrid, 28911 Leganés, Spain
d Departamento de Cómputo Científico y Estadística, Universidad Simón Bolivar, A.P. 89000, Caracas 1090, Venezuela
e Instituto de Biocomputación y Física de Sistemas Complejos, Universidad de Zaragoza, 50009 Zaragoza, Spain

Abstract

We study the problem of the emergence of cooperation in the spatial Prisoner's Dilemma. The pioneering work by Nowak and May (1992) showed that large initial populations of cooperators can survive and sustain cooperation in a square lattice with imitate-the-best evolutionary dynamics. We revisit this problem in a cost-benefit formulation suitable for a number of biological applications. We show that if a fixed-amount reward is established for cooperators to share, a single cooperator can invade a population of defectors and form structures that are resilient to re-invasion even if the reward mechanism is turned off. We discuss analytically the case of the invasion by a single cooperator and present agent-based simulations for small initial fractions of cooperators. Large cooperation levels, in the sustainability range, are found.
In the conclusions we discuss possible applications of this model as well as its connections with other mechanisms proposed to promote the emergence of cooperation.

Key words: Emergence of cooperation, Evolutionary game theory

∗ Corresponding author.
Email addresses: rjjimene@est-econ.uc3m.es (Raúl Jiménez), haydee.lugo@cantv.net (Haydee Lugo), cuesta@math.uc3m.es (José A. Cuesta), anxo@math.uc3m.es (Angel Sánchez).

Preprint submitted to Elsevier, 3 March 2022

1 Introduction

The emergence of cooperative behavior among unrelated individuals is one of the most prominent unsolved problems of current research (Pennisi, 2005). While such non-kin cooperation is evident in human societies (Hammerstein, 2003), it is by no means exclusive to them, and can be observed in many different species (Doebeli and Hauert, 2005) down to the level of microorganisms (Velicer, 2003; Wingreen and Levin, 2006). This conundrum can be suitably formulated in terms of evolutionary game theory (Maynard-Smith, 1982; Gintis, 2000; Nowak and Sigmund, 2004; Nowak, 2006) by studying games that are stylized versions of social dilemmas (Kollock, 1998), i.e., situations in which individually reasonable behavior leads to a situation in which everyone is worse off than they might have been otherwise. Paradigmatic examples of these dilemmas are the provision of public goods (Samuelson, 1954), the tragedy of the commons (Hardin, 1968), and the Prisoner's Dilemma (PD) (Axelrod and Hamilton, 1981). The first two involve multiple actors, while the last involves only two actors and is the setting of choice for a majority of models on the evolution of cooperation. The PD embodies a stringent form of social dilemma, namely a situation in which individuals can benefit from mutual cooperation but can do even better by exploiting the cooperation of others.
To be specific, the two players in the PD can adopt either one of two strategies: cooperate (C) or defect (D). Cooperation results in a benefit $b$ to the opposing player, but incurs a cost $c$ to the cooperator (where $b > c > 0$). Defection has no costs and produces no benefits. Therefore, if the opponent plays C, a player gets the payoff $b - c$ if she also plays C, but she can do even better and get $b$ if she plays D. On the other hand, if the opponent plays D, a player gets the lowest payoff $-c$ if she plays C, and she gets 0 if she also defects. In either case, it is better for each player to play D, in spite of the fact that mutual cooperation would yield higher benefits for both of them, hence the dilemma.

Conflicting situations that can be described by the PD, either at the level of individuals or at the level of populations, are ubiquitous. Thus, Turner and Chao (1999) showed that interactions between RNA phages co-infecting bacteria are governed by a PD. Escherichia coli stationary phase GASP mutants in starved cultures are another example of this dilemma (Vulić and Kolter, 2001). A PD also arises when different yeasts compete by switching from respiration to respirofermentation when resources are limited (Frick and Schuster, 2003). Hermaphroditic fish that alternately release sperm and eggs end up involved in a PD with cheaters that release only sperm with less metabolic effort (Dugatkin and Mesterton-Gibbons, 1996). A recent study of cooperative territorial defence in lions (Panthera leo) described the correct ranking structure for a PD (Legge, 1996). And, of course, the PD applies to very many different situations of interactions between human individuals or collectives (Axelrod, 1984; Camerer, 2003).

In view of its wide applicability, the PD is a suitable context to pose the question of the emergence of cooperation.
How do cooperative individuals or populations survive or even thrive in the context of a PD, where defecting is the only evolutionarily stable strategy (Maynard-Smith, 1982; Nowak, 2006)? Several answers to this puzzle have been put forward (Nowak, 2006), among which the most relevant examples are kin selection theory (Hamilton, 1964), reciprocal altruism or direct reciprocity (Trivers, 1971; Axelrod and Hamilton, 1981), indirect reciprocity (Nowak and Sigmund, 1998), emergence of cooperation through punishment (Fehr and Gächter, 2002) or the existence of a spatial or social structure of interactions (Nowak and May, 1992). This last approach has received a great deal of attention in the last decade and has proven a source of important insights into the evolution of cooperation (see Szabó and Fáth (2007) for a recent and comprehensive review). One such insight is the fact that cooperators can outcompete defectors by forming clusters where they help each other. This result, in turn, leaves open the question of the emergence of cooperation in a population with a majority of defectors. Recently, it has been shown (Ohtsuki et al., 2006) that, if the average number of connections in the interaction network is $k$, the condition $b/c > k$ implies that selection favors cooperators invading defectors in the weak selection limit, i.e., when the contribution of the game to the fitness of the individual is very small. However, a general result valid for any intensity of the selection is still lacking.

In this paper, we propose a new mechanism for the emergence of cooperation, which we call shared reward. In this setting, players interact through a standard PD, but in a second stage cooperators receive an additional payoff coming from a resource available only to them and not to defectors.
It should be emphasized that similar reward mechanisms may be relevant for a number of specific applications, such as mutualistic situations with selection imposed by hosts rewarding cooperation or punishing less cooperative behavior (see, e.g., Kiers et al. (2003) and references therein). Another context that may be modelled by our approach is team formation in animal societies (Anderson and Franks, 2001), e.g., in cooperative hunting (Packer and Ruttan, 1988). On the other hand, the idea of a shared reward could be implemented in practice as a way to promote cooperation in human groups or, alternatively, may arise from costly signaling prior to the game, when the exchange of cooperative signals among cooperators is free (Skyrms, 2004). As we will see, this scheme makes it possible for a single cooperator to invade a population of defectors. Furthermore, when strategies evolve by unconditional imitation (Nowak and May, 1992), cooperation persists after the additional resource has been exhausted or turned off. We present evidence for these conclusions coming from numerical simulations on a regular network. In the conclusion, we discuss the reason for this surprising result and the relation of our proposal to previous work on evolutionary games on graphs and to public goods games.

2 Spatial Prisoner's dilemma with shared reward

Our model is defined by a two-stage game on a network. In the first stage, players interact with their neighbors and obtain payoffs as prescribed by the PD game, whose payoff matrix in a cost-benefit context (rows: the player's own strategy; columns: the opponent's strategy) is given by

$$
\begin{array}{c|cc}
 & \mathrm{C} & \mathrm{D} \\ \hline
\mathrm{C} & b - c & -c \\
\mathrm{D} & b & 0
\end{array}
\tag{1}
$$

Subsequently, in the second stage of the game, a fixed amount $\rho$ is distributed among all cooperators.
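As a quick sanity check of the first-stage matrix (1), the following minimal Python sketch encodes the payoffs and verifies the two properties that make the game a dilemma: defection is the better reply to either move, yet mutual cooperation pays more than mutual defection. The function names `pd_payoffs` and `defection_dominates` and the values $b = 1$, $c = 0.2$ are illustrative choices, not part of the model definition.

```python
def pd_payoffs(b, c):
    """Payoffs of matrix (1), keyed by (own move, opponent's move)."""
    if not b > c > 0:
        raise ValueError("the cost-benefit PD requires b > c > 0")
    return {
        ("C", "C"): b - c,  # mutual cooperation
        ("C", "D"): -c,     # exploited cooperator
        ("D", "C"): b,      # defector exploiting a cooperator
        ("D", "D"): 0.0,    # mutual defection
    }


def defection_dominates(payoff):
    """True if D pays strictly more than C against either opposing move."""
    return all(payoff[("D", other)] > payoff[("C", other)] for other in "CD")


# Illustrative values only (assumed, not prescribed by the model).
payoff = pd_payoffs(b=1.0, c=0.2)
print(defection_dominates(payoff))                 # D is always the better reply
print(payoff[("C", "C")] > payoff[("D", "D")])     # yet mutual C beats mutual D
```

Both checks hold for any $b > c > 0$, which is why the second-stage reward is needed to change the outcome.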
It is important to realize at this point that such a two-stage game is only interesting in a population setting: in a two-player game, the second stage would amount to shifting the cooperator's payoff by $\rho/2$ or $\rho$, depending on the opponent's strategy. Then, for $\rho < c$ we would simply have another PD, whereas for $2c > \rho > c$ we would have the Hawk-Dove or Snowdrift game (Maynard-Smith, 1982), and for $\rho > 2c$ we would have the trivial Harmony game, also called Byproduct Mutualism (Dugatkin et al., 1992; Connor, 1995). In a population setting, the amount received by a cooperator depends on the number of cooperators in the total population and is therefore subject to evolution as the population itself changes.

In order to write down the payoffs for the game after the second stage, we need to introduce some notation. Let us consider a population of $N$ players, each of whom plays the game against $k$ other players. For player $i$, $1 \le i \le N$, let us denote by $V_i$ the number of cooperators among the opponents of $i$, and by $N_c$ the total number of cooperators in the population. The payoffs can then be written as follows:

$$
P_i =
\begin{cases}
V_i b - kc + \dfrac{\rho}{N_c}, & \text{if } i \text{ cooperates}, \\[4pt]
V_i b, & \text{if } i \text{ defects}.
\end{cases}
\tag{2}
$$

This mechanism to reward cooperation has been studied by Cuesta et al. (2007) in a game theoretical model of $n$ players with no spatial structure. As stated above, our goal here is to understand whether or not the mechanism of the shared reward can explain the emergence of cooperation in the Prisoner's Dilemma on networks. To address this problem, we will consider below this game in the framework of a spatial setup following the same general lines as Nowak and May (1992) for comparison. We place $N$ individuals on a square lattice with periodic boundary conditions, each of whom cooperates or defects with her four nearest neighbors ($k = 4$, von Neumann neighborhood).
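The payoffs (2) on the lattice just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `payoffs`, the encoding of strategies as 1 (cooperate) and 0 (defect) in a list of lists, and the parameter values in the usage lines are hypothetical and are not taken from the simulations reported in this paper.

```python
def payoffs(strategy, b, c, rho):
    """Eq. (2) on a periodic square lattice with von Neumann neighbours.

    strategy[x][y] is 1 for a cooperator and 0 for a defector
    (an encoding assumed here for illustration).
    """
    size = len(strategy)
    k = 4                                  # von Neumann neighbourhood
    n_coop = sum(map(sum, strategy))       # N_c, total number of cooperators
    result = [[0.0] * size for _ in range(size)]
    for x in range(size):
        for y in range(size):
            # V_i: cooperators among the four neighbours (periodic boundaries).
            v = (strategy[(x - 1) % size][y] + strategy[(x + 1) % size][y]
                 + strategy[x][(y - 1) % size] + strategy[x][(y + 1) % size])
            if strategy[x][y]:
                # A cooperator exists here, so n_coop >= 1 and the division is safe.
                result[x][y] = v * b - k * c + rho / n_coop
            else:
                result[x][y] = v * b
    return result


# Usage with assumed values: one cooperator in the middle of a 5 x 5 lattice.
lattice = [[0] * 5 for _ in range(5)]
lattice[2][2] = 1
p = payoffs(lattice, b=1.0, c=0.2, rho=2.0)
print(p[2][2], p[2][3], p[0][0])  # cooperator, neighbouring defector, distant defector
```

Because the reward term depends on the global count $N_c$ rather than on the local neighborhood, it is computed once per round and shared by all cooperators.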
We have chosen this neighborhood for the sake of simplicity in the calculation; results for the Moore neighborhood [used, e.g., by Nowak and May (1992)] can be obtained in a straightforward manner. After receiving their payoffs according to (2), all individuals update their strategy synchronously for the next round by imitate-the-best (also called unconditional imitation) dynamics: they look in their neighborhood for players whose payoff is higher than their own. If there is any, the player adopts the strategy that led to the highest payoff among them (randomly chosen in case of a tie). We then repeat the process and let the simulation run until the density of cooperators in the lattice reaches an asymptotic average value or else it becomes 0 or 1 (note that these two states, corresponding to full defection and full cooperation, are absorbing states of the dynamics because there are no mutations).

From the work by Nowak and May (1992), we know that if we begin the simulation with a sufficiently large cooperator density, then the lattice helps sustain the cooperation level by allowing cooperators interacting with cooperators in clusters to survive and avoid invasion by defectors; defectors thrive in the boundaries between cooperator clusters. What we are interested in is the question of how the large initial cooperator level required by Nowak and May (1992) may arise; if the initial number of cooperators is small, they cannot form clusters and full defection is finally established. On the other hand, another relevant point is resilience, i.e., the resistance of the cooperator cluster to re-invasion by defectors. In this respect, we note that while the clusters obtained by Nowak and May (1992) did show resilience, their corresponding cooperation level was not large.
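The imitate-the-best rule described above admits a short Python sketch. It assumes that payoffs have already been computed for every site (for instance from Eq. (2)); the function name `imitate_the_best`, the list-of-lists encoding with 1 for cooperators and 0 for defectors, and the use of Python's `random` module for tie-breaking are assumptions made for illustration only.

```python
import random


def imitate_the_best(strategy, payoff, rng=random):
    """One synchronous unconditional-imitation step on a periodic square lattice.

    A site changes strategy only if some von Neumann neighbour earned strictly
    more than it did; it then copies the best-paid neighbour, choosing at
    random among neighbours tied for the highest payoff. Assumes size >= 3 so
    that the four neighbours are distinct sites.
    """
    size = len(strategy)
    new = [row[:] for row in strategy]  # all reads use the old configuration
    for x in range(size):
        for y in range(size):
            neighbours = [((x - 1) % size, y), ((x + 1) % size, y),
                          (x, (y - 1) % size), (x, (y + 1) % size)]
            best = max(payoff[i][j] for i, j in neighbours)
            if best > payoff[x][y]:
                tied = [strategy[i][j] for i, j in neighbours
                        if payoff[i][j] == best]
                new[x][y] = rng.choice(tied)
    return new


# Usage with hand-built payoffs (assumed values b = 1, c = 0.2, rho = 2):
# a lone cooperator earns rho - 4c = 1.2, each adjacent defector earns b = 1.
size = 5
strategy = [[0] * size for _ in range(size)]
strategy[2][2] = 1
payoff = [[0.0] * size for _ in range(size)]
payoff[2][2] = 1.2
for i, j in ((1, 2), (3, 2), (2, 1), (2, 3)):
    payoff[i][j] = 1.0
new_strategy = imitate_the_best(strategy, payoff)
print(sum(map(sum, new_strategy)))  # prints 5: the seed plus its four neighbours
```

Copying the lattice before the loop is what makes the update synchronous: every site decides on the basis of the previous round, never on a neighbor's freshly updated strategy.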
As we will see below, the mechanism we are proposing will lead to higher cooperation levels with good resilience properties, even for medium costs. To address these issues, we begin by discussing the invasion by a single cooperator placed on the center of the lattice (in fact, on any site, as the periodic boundary conditions make all sites equivalent). This, along with the possible scenarios of invasion by a single defector, will lead to a classification of the different regimes in terms of the cost parameter. Subsequently, we will carry out simulations with a very low initial concentration of cooperators.

3 Invasion by a single cooperator and resilience of cooperation

As our strategy update rule is unconditional imitation, the process is fully deterministic, so we can compute analytically the evolution of the process. Thus, for the first cooperator, seeded at time $t = 0$, to transform her defector neighbors into new cooperators, it is immediate to see that the reward must satisfy $\rho > b + 4c$ (her payoff, $\rho - 4c$, has to exceed the payoff $b$ of each neighboring defector); otherwise, the cooperator is changed into a defector and the evolution ends. If the condition is satisfied, the four neighbors become cooperators, and we have now a rhomb centered on the site of the initial cooperator. In what follows, we discuss the generic situation in the subsequent evolution of the system. After the initial cooperator has given rise to a rhomb, there will always be four types of players in the system:

• The cooperators in the bulk, that interact with another four cooperators.
• The cooperators in the boundary, defined as the set of cooperators that have links with defectors. These boundary players have two cooperator neighbors, or only one if they are at the corners of the rhomb, but the key point is that they are always connected to a cooperator that interacts only with cooperators.
• The defectors in the boundary, that interact with one (opposite to the corners of the rhomb) or two cooperators.
• The defectors in the bulk, that interact with another four defectors.

For the rhomb to grow two conditions must be met: first of all, the payoff obtained by the boundary cooperators at the corners ($b - 4c$ plus the reward contribution) has to be larger than that of the boundary defectors with only one cooperator ($b$); secondly, the payoff obtained by cooperators that have two cooperator neighbors ($2b - 4c$ plus the reward contribution) has to be larger than that of the boundary defectors that interact with two cooperators ($2b$). If both conditions are verified, defectors are forced to become cooperators by imitation. Therefore, we must have

$$
b - 4c + \frac{\rho}{N_c(t)} > b
\quad \text{and} \quad
2b - 4c + \frac{\rho}{N_c(t)} > 2b
\iff
\frac{\rho}{N_c(t)} > 4c.
\tag{3}
$$

We thus find that the condition for invasion does not depend on the benefit $b$. In addition, it predicts that invasion proceeds until the rhomb contains so many cooperators that the condition is not fulfilled anymore. In view of this result, we find it convenient to introduce a parameter to measure the reward in terms of the cost:

$$
\delta \equiv \frac{\rho}{4cN}.
\tag{4}
$$

With this notation, the prediction for the invasion by a single cooperator is that it will proceed as long as the fraction of cooperators verifies $N_c(t)/N \le \delta$. $N_c(t)$, the number of cooperators at time $t$, can be easily determined from the recurrence relation for the growing rhomb: in case the cooperators increase, a new boundary layer is added to the rhomb, and we have $N_c(t) = N_c(t-1) + 4t$, which can be immediately solved (with initial condition $N_c(0) = 1$) to give $N_c(t) = 2t^2 + 2t + 1$. Inserting this result in the above condition allows us to determine the maximum growth time for the cluster, that is $t^* = \max\{t : 2t^2 + 2t + 1 \le \delta N\}$, and the fraction of cooperators in the steady state:

$$
\frac{N_c(t^*)}{N}.
\tag{5}
$$

So far, we have seen that when the reward is large enough ($\rho > 4cN$), full cooperation sets in, whereas for smaller reward, a cooperator cluster grows up to a final size that depends on $\delta$. Interestingly, when $b/2 > c$, the reward mechanism is only needed to establish an initial population of cooperators, i.e., the rhomb is resilient. To show this, notice that boundary cooperators observe the defectors that earn the largest payoff (those with links to two cooperators) and compare it with the payoff obtained by bulk cooperators; boundary cooperators are linked to both and unconditional imitation will lead them to adopt the strategy of the neighbor with the largest payoff. The condition for the cooperators to resist re-invasion is then

$$
4(b - c) > 2b \iff c < \frac{b}{2}.
\tag{6}
$$

Indeed, if after a number of time steps we turn off the reward, the rhomb structure arising from the evolutionary process cannot be re-invaded by defectors, as can be seen from Eq. (6). In the opposite case, $c > b/2$, the reward must be kept at all times to stabilize the cooperator cluster.

Fig. 1. Final stage of the invasion of a cooperator population by a single defector, for the medium cost case $b/4 < c < b/2$. Defectors are white, cooperators are black.

In order to study the resilience of clusters of cooperators, we consider the simplest case of invasion by a single defector in the Prisoner's Dilemma (without reward). It can be easily shown that this leads to three different cost regimes (Jiménez et al., 2007):

• Low cost case, $c < b/4$: the defector is only able to invade its 4 neighbors, giving rise to a 5-defector rhomb.
• Medium cost case, $b/4 < c < b/2$: a structure with the shape of a cross with sawtooth boundaries is formed, implying a finite density of defectors in the final state (cf. Fig. 1).
• High cost case, $c > b/2$: the system is fully invaded by the defector, and cooperators go extinct.

4 Simulations with an initial concentration of defectors

After considering the case of the invasion of a defecting population by a single cooperator, we now proceed to a more general situation in which a number of cooperators appear randomly distributed on the lattice. To this end, we have carried out simulations on square lattices of size $N = 100 \times 100$ for different initial numbers of cooperators as a function of the cost parameter (we take $b = 1$ for reference) and the reward. A single simulation consists of running the game until a steady state is reached, as shown by the fraction of cooperators becoming approximately constant. Generally speaking, the steady state is reached in some 100 games per player. For every choice of parameters, we compute an average over 100 realizations of the initial distribution of the cooperators. Results are shown in Fig. 2 for low, medium and high costs.

Fig. 2. Average fraction of cooperators in the steady state (vertical axis) as a function of the rescaled reward $\delta = \rho/4Nc$ (horizontal axis), obtained starting with 1 (+), 10 (∗), 100 (◦), and 1000 (⋄) initial cooperators. a) low cost, $c = 0.2$; b) medium cost, $c = 0.4$; c) high cost, $c = 0.7$.

Figure 2 shows a number of remarkable features. To begin with, the case of invasion by a single cooperator reproduces the analytical result (5). On the other hand, in all three plots we see that if instead of a single cooperator we have an initial density of cooperators, the resulting level of cooperation is considerably higher, particularly when costs are low. Indeed, by looking at panel a), for which $c = 0.2$ ($b = 1$), we see that with 10% of initial cooperators cooperation sets in even without reward, as observed by Nowak and May (1992). Notwithstanding, a more remarkable result is the fact that with an initial density of cooperators as low as 0.1% we find large cooperation levels for small rewards, for all values of costs. Clearly, the cooperation level decreases with increasing cost, but even for high costs [panel c), $c = 0.7$], the cooperation level is significantly higher than the single cooperator one. In this last case, we also observe that the final state becomes practically independent of the density of initial cooperators. Finally, an intriguing result is that in the low cost case, the observed cooperation fraction is not a monotonically increasing function of the reward: as can be seen from the plot, for moderate and particularly for large initial densities of cooperators, increasing the reward may lead to lower levels of cooperation. The reason for this phenomenon is that, if the reward increases, the cooperator clusters arising from the cooperator invaders grow larger and overlap. Therefore, clusters with rugged boundaries are formed, allowing for defectors with three cooperator neighbors, which may then be able to reinvade. Further increments of the reward restore the cooperation levels because then even these special defectors are overridden. The important consequence is that one cannot assume that, for any situation, increasing the reward leads to an increase of the cooperation, i.e., one has to be careful in designing the reward for each specific application.

The other relevant issue to address in the simulations is the resilience of the attained cooperation levels. Figure 3 summarizes our results in this regard. Both for low and high rewards, we confirm the result for the single cooperator invasion that cooperation disappears if the reward is turned off when the costs are high ($c = 0.7 > b/2$). For moderate and low costs, the structures arising from the evolution with reward do show resilience, at least to some degree. Interestingly, the case of low reward [panel a)] gives rise to extremely robust cooperation levels, whereas higher rewards [panel b)] lead to structures for which cooperation decreases when the reward is absent (medium cost case). This result is connected with the one already discussed that the cooperation level may not be monotonic in the reward, and makes it clear that structures originating from a very aggressive, high reward policy may be less resilient than those built with low rewards.

Fig. 3. Time evolution of the fraction of cooperators for the cases of low (dot-dashed line, $c = 0.2$), medium (solid line, $c = 0.4$) and high (dashed line, $c = 0.7$) costs, for simulations starting with 100 initial cooperators (density 1%) randomly distributed. Shown are the cases of a) low ($\delta = 0.1$) and b) high ($\delta = 0.5$) rewards. Reward is set in place until $t = 100$ and turned off afterwards.

Fig. 4. System snapshots at the stationary state of a single realization of the evolution (before switching off the reward, see Fig. 3) for the low reward case ($\delta = 0.1$). The initial density of cooperators is 1%. a) low cost ($c = 0.2$), b) medium cost ($c = 0.4$). Defectors are white, cooperators are black.

Further insight on the cluster structure arising from the invasion process fueled by the reward can be gained from Figs. 4 and 5.
Figure 4 shows the stationary structure of the cooperator clusters for the low reward case ($\delta = 0.1$). As we are now considering that the initial configuration contains 1% of cooperators randomly distributed, the shapes are irregular, and some rhombs are larger than others because they merge during evolution. In accordance with Fig. 3, in the low cost situation the cooperation level reached is much larger than in the medium cost case. However, both structures are resilient and survive unchanged if the reward is removed. This is due to the fact that, as discussed above, in that case defectors can never invade a cooperating population. The final structure for the high cost case is similar to Fig. 4b), but in this case suppression of the reward leads to an immediate invasion by defectors until they occupy the whole system.

Fig. 5. System snapshots at the stationary state of a single realization of the evolution, a) before and b) after switching off the reward (see Fig. 3), for the high reward case ($\delta = 0.5$) and medium cost ($c = 0.4$). Defectors are white, cooperators are black.

When the reward is larger, the situation is somewhat different, as can be appreciated from Fig. 5. While for low cost we again obtain resilient structures that are preserved even without reward, in the medium cost regime the patterns change. Panel a) shows the stationary state reached with the reward; when the reward is taken away, the state changes and evolves to the configuration shown in panel b). What is taking place here is that due to the high reward, a cooperation level close to 1 is reached, most of the defectors being isolated or along lines. When the reward is switched off, these defectors are in a position to reap a large payoff from their interactions with the cooperators, allowing for a partial reinvasion. Therefore, the final cooperation level has more or less halved.
We stress that even then the cooperation level that remains after the suppression of the reward is rather large (about 60%), another hint of the efficiency of this mechanism to promote cooperation.

5 Discussion and conclusions

We have proposed a mechanism that allows a population of cooperators to grow and reach sizeable proportions in the spatial Prisoner's Dilemma in a cost-benefit framework. This mechanism is based on the distribution of a fixed-amount reward among all cooperators at every time step. With this contribution to the payoffs of the standard Prisoner's Dilemma, even a single cooperator is able to invade a fully defecting population. The resulting cooperator fraction is determined by the amount of the reward as compared to the total number of players and to the cost of the interaction. Furthermore, for low and medium costs ($c < b/2$) cooperation is resilient in the sense that if at some time step the reward is suppressed, the cooperator cluster cannot be re-invaded by the defectors. Finally, we have seen that low rewards are capable of inducing a very large cooperation level, so the mechanism works even when it changes the payoffs of the Prisoner's dilemma only a little.

The result we have obtained is relevant, in the first place, as a necessary complement of the original work by Nowak and May (1992) within the cost-benefit context. In their work they showed that the spatial structure allowed cooperator clusters to survive and resist invasion by defectors, but they began with a large population of cooperators. Our work provides a putative explanation as to where this population comes from. We note that in the original work by Nowak and May (1992) they observed that the cooperation level decreased with respect to the initial population, so a mechanism leading to the appearance of high cooperator levels is certainly needed.
In this regard, we want to stress that the reward mechanism gives rise to structures with very good resilience properties: simulations without the reward show that starting from a randomly distributed population of cooperators with very large density (∼90%), the final cooperation level is halved for low costs, and practically disappears for moderate costs.

We stress that, to our knowledge, this is the first time that a mechanism based on a fixed-amount reward to be shared among cooperators is proposed. Notwithstanding, there are other proposals which are somewhat related to ours, most prominent among them being those by Lugo and Jiménez (2006) and Hauert (2006). Lugo and Jiménez (2006) introduce a tax mechanism in which everybody in the population contributes towards a pool that is subsequently distributed among the cooperators. This is different from the present proposal in so far as the contribution from the tax is not a fixed quantity but rather increases with the average payoff. On the other hand, Hauert (2006) focuses on the effects of nonlinear discounts (or synergistic enhancement) depending on the number of cooperators in the groups of interacting individuals. Although the corresponding game theoretical model, discussed by Hauert et al. (2006), belongs to the same general class of $n$-player games as our shared reward model, the spatial implementation of the two models is very different. Thus, in Hauert (2006), payoffs for a given individual depend on the number of cooperators in her neighborhood, whereas in the present work payoffs depend on the total number of cooperators in the network. On the other hand, our interest is also different, in so far as we are discussing a mechanism to foster the appearance of an initial, sizeable population of cooperators which can later be stable without this additional resource.
It is important to stress that with our mechanism a large level of cooperation can be established and (in the appropriate parameter range) stabilized.

We believe that our results may be relevant for a number of experimental situations where the Prisoner's Dilemma has been shown to appear in nature. Thus, the stabilization of mutualistic symbioses by rewards or sanctions as observed in, e.g., legume-rhizobium mutualism (Kiers et al., 2003) is related to the mechanism we are proposing here: it is observed that soybeans penalize rhizobia that fail to fix N$_2$ in their root nodules. This decreases the defector's payoff, which is similar to increasing the cooperator's payoff by a reward. On the other hand, a description of the interaction between different strains of microorganisms [see Crespi (2001); Velicer (2003) and references therein] in terms of this reward mechanism instead of the standard Prisoner's Dilemma may prove more accurate and closer to the actual interaction process. An example could be the evolution of cooperators with reduced sensitivity to defectors in the RNA phage Φ6 (Turner and Chao, 1999, 2003). Cooperative foraging is another context where the mechanism of rewarding cooperation may be relevant, ranging from microorganisms such as Myxococcus xanthus (Dworkin, 1996) through beetles (Berryman et al., 1985) to wolves or lions (Anderson and Franks, 2001). Finally, the question arises as to the validity of such a mechanism to promote cooperation among humans, as individual players cannot predict in advance the additional payoff they will obtain from the reward, and therefore it is not clear whether it would have an influence on them or not. Evidence from cooperative hunting in humans (Alvard, 2001, 2004) shows that high levels of sharing help sustain cooperative behavior.
However, in the human case, contexts where the reward would be more explicitly included in a manner transparent to the players are possible and amenable to experiments. Research along these lines is necessary to assess the possible role of the reward mechanism in specific situations.

Acknowledgments

This work is partially supported by Ministerio de Educación y Ciencia (Spain) under grants Ingenio-MATHEMATICA, MOSAICO and NAN2004-9087-C03-03 and by Comunidad de Madrid (Spain) under grants SIMUMAT and MOSSNOHO.

References

Alvard, M., 2001. Mutualistic hunting. In The Early Human Diet: The Role of Meat, Craig Stanford and Henry Bunn, eds., pp. 261–278. Oxford University Press, New York.
Alvard, M., 2004. Good hunters keep smaller shares of larger pies. Behav. Brain Sci. 27, 560–561.
Anderson, C., Franks, N. R., 2001. Teams in animal societies. Behav. Ecol. 12, 534–540.
Axelrod, R., 1984. The Evolution of Cooperation. Penguin, London.
Axelrod, R., Hamilton, W. D., 1981. The evolution of cooperation. Science 211, 1390–1396.
Berryman, A. A., Dennis, B., Raffa, K. F., Stenseth, N. C., 1985. Evolution of optimal group attack with particular reference to bark beetles (Coleoptera: Scolytidae). Ecology 66, 898–903.
Camerer, C., 2003. Behavioral Game Theory: Experiments in Strategic Interaction. Princeton University Press, Princeton, NJ.
Connor, R. C., 1995. The benefits of mutualism: a conceptual framework. Biol. Rev. 70, 427–457.
Crespi, B. J., 2001. The evolution of social behavior in microorganisms. Trends Ecol. Evol. 16, 178–183.
Cuesta, J. A., Jiménez, R., Lugo, H., Sánchez, A., 2007. Rewarding cooperation in social dilemmas. Working paper 07–52, Departamento de Economía, Universidad Carlos III de Madrid.
Doebeli, M., Hauert, C., 2005. Models of cooperation based on the Prisoner's Dilemma and the Snowdrift game. Ecol. Lett. 8, 748–766.
Dugatkin, L. A., Mesterton-Gibbons, M., Houston, A. I., 1992. Beyond the prisoner's dilemma: Toward models to discriminate among mechanisms of cooperation in nature. Trends Ecol. Evol. 7, 202–205.
Dugatkin, L. A., Mesterton-Gibbons, M., 1996. Cooperation among unrelated individuals: reciprocal altruism, by-product mutualism and group selection in fishes. BioSystems 37, 19–30.
Dworkin, M., 1996. Recent advances in the social and developmental biology of the Myxobacteria. Microbiol. Rev. 60, 70–102.
Fehr, E., Gächter, S., 2002. Altruistic punishment in humans. Nature 415, 137–140.
Frick, T., Schuster, S., 2003. An example of the prisoner's dilemma in biochemistry. Naturwiss. 90, 327–331.
Gintis, H., 2000. Game Theory Evolving. Princeton University Press, Princeton, NJ.
Hamilton, W. D., 1964. The genetical evolution of social behavior I. J. Theor. Biol. 7, 1–16.
Hammerstein, P. (Ed.), 2003. Genetic and Cultural Evolution of Cooperation. MIT Press, Cambridge, MA.
Hardin, G., 1968. The tragedy of the commons. Science 162, 1243–1248.
Hauert, C., 2006. Spatial effects in social dilemmas. J. Theor. Biol. 240, 627–636.
Hauert, C., Michor, F., Nowak, M. A., Doebeli, M., 2006. Synergy and discounting of cooperation in social dilemmas. J. Theor. Biol. 239, 195–202.
Jiménez, R., Lugo, H., Eguíluz, V., San Miguel, M., 2007. Learning to cooperate. Unpublished manuscript.
Kiers, E. T., Rousseau, R. A., West, S. A., Denison, R. F., 2003. Host sanctions and the legume-rhizobium mutualism. Nature 425, 78–81.
Kollock, P., 1998. Social dilemmas: The anatomy of cooperation. Annu. Rev. Sociol. 24, 183–214.
Legge, S., 1996. Cooperative lions escape the Prisoner's Dilemma. Trends Ecol. Evol. 11, 2–3.
Lugo, H., Jiménez, R., 2006. Incentives to cooperate in network formation. Comp. Econ. 28, 15–27.
Maynard-Smith, J., 1982. Evolution and the Theory of Games. Cambridge University Press, Cambridge, UK.
Nowak, M. A., 2006. Five rules for the evolution of cooperation. Science 314, 1560–1563.
Nowak, M. A., 2006. Evolutionary Dynamics. Harvard University Press, Cambridge, MA.
Nowak, M. A., May, R. M., 1992. Evolutionary games and spatial chaos. Nature 359, 826–829.
Nowak, M. A., Sigmund, K., 1998. Evolution of indirect reciprocity by image scoring. Nature 393, 573–577.
Nowak, M. A., Sigmund, K., 2004. Evolutionary dynamics of biological games. Science 303, 793–799.
Ohtsuki, H., Hauert, C., Lieberman, E., Nowak, M. A., 2006. A simple rule for the evolution of cooperation on graphs and social networks. Nature 441, 502–505.
Packer, C., Ruttan, L., 1988. The evolution of cooperative hunting. Am. Nat. 132, 159–198.
Pennisi, E., 2005. How did cooperative behavior evolve? Science 309, 93.
Samuelson, P. A., 1954. The pure theory of public expenditure. Rev. Econ. Stat. 36, 387–389.
Skyrms, B., 2006. The Stag Hunt and the Evolution of Social Structure. Cambridge University Press, Cambridge, UK.
Szabó, G., Fáth, G., 2007. Evolutionary games on graphs. Phys. Rep., in press.
Trivers, R. L., 1971. The evolution of reciprocal altruism. Q. Rev. Biol. 46, 35–57.
Turner, P. E., Chao, L., 1999. Prisoner's dilemma in an RNA virus. Nature 398, 441–443.
Turner, P. E., Chao, L., 2003. Escape from prisoner's dilemma in RNA phage Φ6. Am. Nat. 161, 497–505.
Velicer, G. J., 2003. Social strife in the microbial world. Trends Microbiol. 11, 330–337.
Vulić, M., Kolter, R., 2001. Evolutionary cheating in Escherichia coli stationary phase cultures. Genetics 158, 519–526.
Wingreen, N. S., Levin, S. A., 2006. Cooperation among microorganisms. PLoS Biology 4, 1486–1488.
