The Impact of Complex and Informed Adversarial Behavior in Graphical Coordination Games


Authors: Keith Paarporn, Brian Canty, Philip N. Brown, Mahnoosh Alizadeh, and Jason R. Marden

Abstract—How does system-level information impact the ability of an adversary to degrade performance in a networked control system? How does the complexity of an adversary's strategy affect its ability to degrade performance? This paper focuses on these questions in the context of graphical coordination games, where an adversary can influence a given fraction of the agents in the system and the agents follow log-linear learning, a well-known distributed learning algorithm. Focusing on a class of homogeneous ring graphs of various connectivity, we begin by demonstrating that minimally connected ring graphs are the most susceptible to adversarial influence. We then proceed to characterize how both (i) the sophistication of the attack strategies (static vs. dynamic) and (ii) informational awareness about the network structure can be leveraged by an adversary to degrade system performance. Focusing on the set of adversarial policies that induce stochastically stable states, our findings demonstrate that the relative importance of sophistication versus information changes depending on the influencing power of the adversary. In particular, sophistication far outweighs informational awareness with regard to inflicting system-level damage when the adversary's influence power is relatively weak. However, the opposite is true when the adversary's influence power is more substantial.

I. INTRODUCTION

A networked system can be viewed as a collection of subsystems, each required to make local and independent decisions in response to available information.
The information available to each subsystem could pertain to local environmental conditions or the behavior of a selected group of neighboring agents in the system; hence, the information available to one subsystem could be vastly different from the information available to another. Regardless of the specific problem domain and informational characteristics, the underlying goal is to derive agent control policies that ensure the emergent collective behavior is desirable with respect to a system-level performance metric.

A central focus of such systems is the design of networked control algorithms that provide strong guarantees on the quality of emergent outcomes. A networked control algorithm can be viewed as a decision-making rule that specifies how subsystems respond to local conditions. There are several noteworthy results in this domain, ranging from consensus and flocking [2], [3], sensor allocation [4], [5], and coordination of unmanned vehicles [6] to many others. A common theme in all of these works is the following: if all agents follow the prescribed decision-making rules, then the emergent behavior is both stable and desirable.

[Footnote: This research was supported by UCOP grant LFR-18-548175, ONR grant #N00014-17-1-2060, and NSF grant #ECCS-1638214. The material in this paper substantially extends the conference paper [1] by providing complete proofs and novel results that fully characterize dynamic adversarial influence. K. Paarporn, M. Alizadeh, and J. R. Marden are with the Department of Electrical and Computer Engineering, University of California, Santa Barbara. B. Canty is with CACI International. P. N. Brown is with the Department of Computer Science at the University of Colorado, Colorado Springs. Contact: kpaarporn@ucsb.edu, {alizadeh,jrmarden}@ece.ucsb.edu, brian.canty@caci.com, philip.brown@uccs.edu. *These authors contributed equally to this work.]
In contrast to this work, here we seek to address whether such decision-making rules are robust to adversarial interventions. While the decentralization associated with distributed architectures is undoubtedly appealing for a host of reasons, it is important to highlight that it also introduces vulnerabilities. In particular, the decision-making process of individual subsystems can potentially be influenced by adversarial actors in the system through corrupting or augmenting the information available to the subsystems. Accordingly, in this paper we ask whether an adversary can exploit these interconnections to negatively influence the quality of the emergent collective behavior. Formal analysis of this interplay has emerged in recent years, often in the context of robust consensus, distributed optimization, and cyber-physical system security [7]–[9].

The focus of this paper is the susceptibility of a distributed algorithm known as log-linear learning in networked control systems [10]–[12]. Log-linear learning has received significant attention recently in the area of distributed control, as it can often be employed to ensure that the resulting behavior is near optimal. A representative set of examples ranges from control of wind farms [13] to sensor networks [4], [5], [14]–[16] and task assignment [17], among others [18]. However, the susceptibility of this approach to adversarial interventions is generally unknown.

The goal of this paper is to shed light on the susceptibility of log-linear learning to adversarial interventions. To that end, we focus on a well-studied class of systems known as graphical coordination games [19], [20]. Graphical coordination games model strategic scenarios where agents are tasked with adopting conventions and derive benefits from coordinating with the choices of their neighbors, e.g., adoption of technology or conventions [20], [21].
Regardless of the specifics of the graphical coordination game, log-linear learning is known to asymptotically achieve optimal system-level behavior. In this work, we focus on characterizing the degree to which the performance guarantees of log-linear learning can be undermined by adversarial manipulations. In particular, our goal is to evaluate how different adversarial features can inflict harm on the system. How much more of a threat is an adversary that knows the underlying network structure versus one that does not? An adversary that can dynamically alter its strategy versus one that cannot? We begin by stating our model to ensure that our contributions are clear.

A. Model: Graphical Coordination Games

We consider the framework of graphical coordination games, where a collection of agents N = {1, 2, ..., n} is enmeshed in an underlying undirected network G = (N, E), where E ⊆ N × N defines the inter-agent interconnections. There are two different conventions, denoted by x and y, and each agent i ∈ N must decide between a set of conventions A_i ⊆ {x, y}. Note that if A_i = {y}, agent i is required to select convention y. The benefit agent i associates with a choice x or y depends on how many of its network neighbors N_i = {j ∈ N : (i, j) ∈ E} have selected the same convention. More formally, given a joint action profile a = (a_1, ..., a_n) ∈ A := A_1 × ... × A_n, the total benefit agent i experiences is given by

U_i(a) := \sum_{j ∈ N_i} V(a_i, a_j),  (1)

where V : {x, y}^2 → R defines the per-agent benefit of coordinating with a neighboring agent on a given convention. Throughout, we consider V of the following form, where α > 0:

          x            y
  x   1+α, 1+α       0, 0
  y     0, 0         1, 1

The system welfare associated with the action profile a ∈ A is given by

W(a) := \sum_{i ∈ N} U_i(a).  (2)

The goal of a system operator is to assign decision-making rules for the agents such that their emergent collective behavior optimizes the system welfare, i.e., the emergent action profile is of the form

a_opt ∈ \arg\max_{a ∈ A} W(a).  (3)

One such algorithm that achieves this objective is log-linear learning [11], [14], [22]–[25]. Log-linear learning is a stochastic distributed algorithm that governs the evolution of agents' decisions over time. More formally, log-linear learning produces a sequence of joint action profiles {a(t)}_{t=0}^∞, which we also call states, determined by the following process:

Definition 1 (Log-Linear Learning). Let a(0) ∈ A be any action profile. At each time t ≥ 1, one agent i ∈ N is selected uniformly at random and allowed to alter its action choice. All other agents are required to repeat their previous action, i.e., a_{-i}(t) = a_{-i}(t-1), where a_{-i} = (a_1, ..., a_{i-1}, a_{i+1}, ..., a_n) captures the action choices of all agents other than i. The updating agent i selects action a_i ∈ A_i at time t with probability

e^{β U_i(a_i, a_{-i}(t-1))} / \sum_{ã_i ∈ A_i} e^{β U_i(ã_i, a_{-i}(t-1))},  (4)

where β > 0 is a given algorithm parameter. Once agent i selects her action, the process is repeated.

Log-linear learning induces an ergodic process over the joint action profiles A in any graphical coordination game of the above form. The set of stochastically stable states, which we denote by LLL(G, A, α) ⊆ A, is defined as the support of the limiting distribution as β → ∞. In the context of graphical coordination games, log-linear learning ensures that

LLL(G, A, α) = \arg\max_{a ∈ A} W(a).  (5)

Note that log-linear learning guarantees that the emergent behavior optimizes the system-level objective irrespective of the graph G, the convention choices available to the agents A, and the value of α.
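As a concrete illustration, the update rule of Definition 1 is easy to simulate on a 1-connected ring with full choice sets A_i = {x, y}. The following is a minimal sketch (the function names are ours, not from the paper):

```python
import math
import random

def run_log_linear_learning(n, alpha, beta, steps, seed=0):
    """Simulate Definition 1 on a 1-connected ring of n agents,
    each with full choice set A_i = {x, y}."""
    rng = random.Random(seed)
    # payoff matrix V of the coordination game
    V = {("x", "x"): 1 + alpha, ("y", "y"): 1.0,
         ("x", "y"): 0.0, ("y", "x"): 0.0}
    a = [rng.choice("xy") for _ in range(n)]      # a(0): arbitrary profile
    for _ in range(steps):
        i = rng.randrange(n)                      # uniformly random updater
        nbrs = [a[(i - 1) % n], a[(i + 1) % n]]   # ring neighbors N_i
        # U_i for each candidate action, per eq. (1)
        weights = {c: math.exp(beta * sum(V[(c, b)] for b in nbrs))
                   for c in "xy"}
        z = sum(weights.values())                 # normalizer of eq. (4)
        a[i] = "x" if rng.random() < weights["x"] / z else "y"
    return a

def welfare(a, alpha):
    """System welfare W(a) of eq. (2) on the ring (each edge counted twice)."""
    V = {("x", "x"): 1 + alpha, ("y", "y"): 1.0,
         ("x", "y"): 0.0, ("y", "x"): 0.0}
    n = len(a)
    return sum(V[(a[i], a[(i + 1) % n])] + V[(a[i], a[(i - 1) % n])]
               for i in range(n))
```

For large β and long runs, the realized profile concentrates on the welfare-maximizing all-x convention, consistent with (5).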
Note that in the special case when A_i = {x, y} for all i ∈ N, LLL(G, A, α) = {~x = (x, ..., x)}, the all-x convention. For alternative choices of A, the action profiles that optimize system welfare are not as straightforward.

B. Models of Adversarial Interventions

In this paper we consider an adversary seeking to influence the decision-making process of log-linear learning by strategically integrating S = {1, ..., |S|} adversarial nodes into the system. Each adversarial node s ∈ S is integrated into the network through a connection to a unique single agent i ∈ N that the adversarial node is tasked with influencing through a choice a_s = {x} or a_s = {y}. Let S_x, S_y ⊆ N, with |S_x| + |S_y| ≤ |S|, denote the sets of agents that are being influenced by an adversary promoting {x} and {y}, respectively. Given S_x and S_y, the influenced utility of an agent i ∈ N is of the form

Ũ_i(a; S_x, S_y) := { U_i(a) + V(a_i, x),  if i ∈ S_x;
                     U_i(a) + V(a_i, y),  if i ∈ S_y;
                     U_i(a),              otherwise.     (6)

In words, an agent i ∈ S_y (resp. i ∈ S_x) experiences the usual benefits from its neighbors in N_i, plus an additional utility of 1 if a_i = y (resp. 1 + α if a_i = x). While the adversarial nodes S do not directly contribute to the system-level objective as defined in (2), they modify the network agents' utility functions, which invariably influences the resulting asymptotic behavior associated with log-linear learning. We now denote by LLL(G, A, α, π) the (possibly modified) set of stochastically stable states, where π defines the process, or policy, through which S_x and S_y are chosen. Technically speaking, the sets S_x(t), S_y(t) are drawn from the distribution π(t). The performance degradation associated with the adversarial policy π is measured by

η(G, A, α, π) := min_{a ∈ LLL(G, A, α, π)} W(a) / W(a_opt) ≥ 0.  (7)

We will focus on graphical coordination games where the agents have full choice of conventions, i.e., A_i = {x, y} for all i ∈ N. For that setting, we omit the dependence on A in the definitions of η(·) and LLL(·), i.e., we instead write η(G, α, π) and LLL(G, α, π).

C. Summary of Contributions

The focus of this manuscript is on characterizing the susceptibility of log-linear learning to adversarial interventions in networked coordination games. In particular, our goal is to identify the salient features of the worst-case adversarial policies. Specifically, we focus on identifying the importance of the following two attributes:

- Informational Awareness: Does the adversary know the network structure?
- Strategic Sophistication: Can the adversarial nodes dynamically alter their location and convention choice over time?

The above attributes define four classes of adversarial policies, which we represent by {Π_{I,D}, Π_I, Π_D, Π}, where the subscript I denotes informationally aware and the subscript D denotes dynamic adversarial policies; the absence of a subscript denotes the negation. For example, Π_{I,D} denotes the set of adversarial policies that are dynamic and can utilize information about the network structure, while Π denotes the set of adversarial policies that are static and agnostic to network structure. By dynamic, we mean that the adversary can alter its behavior based on the current network state; that is, we consider stationary policies {S_x(a(t)), S_y(a(t))}_{t=1,2,...}. A static policy does not allow this flexibility: S_x(a(t)) = S_x and S_y(a(t)) = S_y for all t.

Our first set of main results identifies the most vulnerable graph structures.
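The influenced utility (6) is straightforward to evaluate for a fixed static allocation (S_x, S_y). A minimal sketch on a 1-connected ring, with hypothetical helper names of our own:

```python
def influenced_utility(a, i, Sx, Sy, alpha):
    """Compute Ũ_i(a; S_x, S_y) of eq. (6) on a 1-connected ring:
    the base utility U_i(a) from the two ring neighbors, plus the
    payoff of the adversarial link (if agent i is influenced)."""
    V = {("x", "x"): 1 + alpha, ("y", "y"): 1.0,
         ("x", "y"): 0.0, ("y", "x"): 0.0}
    n = len(a)
    # U_i(a) per eq. (1): coordination payoff with each ring neighbor
    u = V[(a[i], a[(i - 1) % n])] + V[(a[i], a[(i + 1) % n])]
    if i in Sx:
        u += V[(a[i], "x")]   # extra 1 + alpha, but only when a_i = x
    elif i in Sy:
        u += V[(a[i], "y")]   # extra 1, but only when a_i = y
    return u
```

For instance, on an all-y ring with α = 0.5, a y-adversary attached to agent 0 raises its utility from 2 to 3, while an x-adversary attached to agent 0 adds nothing as long as a_0 = y.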
Focusing on a class of homogeneous ring graphs where an adversary can influence at most γ·n agents, with γ ∈ [0, 1], we demonstrate that the most susceptible graphs, i.e., those that lead to the lowest efficiency as defined in (7), are minimally connected ring graphs. We demonstrate this over the sets of adversarial policies Π and Π_D (Theorems 2.1 and 2.2). This matches intuition, as the graphs with the fewest internal edges are in fact the most susceptible to adversarial interference.

Our second set of results focuses exclusively on these ring graphs and seeks to identify how information regarding the network structure can be exploited by the adversary. In doing so, we characterize the tight worst-case performance guarantees as in (7) over policies belonging to Π_I and Π_{I,D} (Theorems 3.1 and 3.2). Figure 1 highlights an instance of worst-case performance guarantees for all four types of policies {Π_{I,D}, Π_I, Π_D, Π} when α = 0.5. As expected, the adversary leverages information and sophistication to most effectively degrade performance guarantees. However, the regimes where each of these attributes is most valuable are not so predictable. When an adversary has limited strength, i.e., γ < 0.5, sophistication is far more valuable to the adversary than informational awareness: the best adversarial policy in Π_D significantly outperforms the best adversarial policy in Π_I. When an adversary has more substantial strength, i.e., γ > 0.5, the opposite is true. We formalize these conclusions in Theorem 3.3. Theorem 3.4 highlights the performance differences between static and dynamic policies: in particular, dynamic policies can achieve the same performance as static ones using fewer adversarial nodes, but such performance saturates above a threshold budget.
We provide proofs in Sections IV (static adversaries), V (dynamic adversaries), and the Appendix (Theorems 3.3 and 3.4).

[Footnote 1: In this paper, we restrict attention to adversarial policies that induce stochastically stable states, in particular, static policies and dynamic policies that are stationary. It will be of interest in future work to investigate other types of dynamic policies that may not guarantee a stochastically stable state is induced.]

Fig. 1: This figure highlights the interplay between an adversary's informational awareness (informed vs. uninformed), strategic sophistication (static vs. dynamic), budget, and the minimum efficiency it can induce on the system. The green and red lines characterize the minimum efficiencies induced by the four adversarial models (static uninformed, static informed, dynamic uninformed, dynamic informed) on ring networks of sufficiently large size, as a function of the fractional budget γ ∈ [0, 1] (the fraction of agents the adversary can influence), for α = 0.5. At γ = 0, no adversarial model can induce any damage on efficiency. For low budgets (i.e., γ < 0.5), strategic sophistication is more valuable than having system-level information about the network. The converse holds for higher budgets (i.e., γ > 0.5): system-level information is more valuable than the ability to implement dynamic policies.

D. Related Work

Previous work has studied to what extent networked distributed algorithms, designed to operate in the absence of adversarial interference, are susceptible to such influence [9], [26]–[30]. For example, distributed multi-agent optimization algorithms have been shown to be easily compromised by adversarial behaviors [9], [30]. Indeed, there are fundamental limitations to these algorithms and their variants: such algorithms cannot simultaneously perform optimally in the absence of adversaries and be resilient to adversarial attacks [9], [30], [31].
How emergent behavior associated with game-theoretic learning algorithms, such as log-linear learning, can be influenced by adversarial nodes was initially studied in [32], [33]. That work centered on how easily adversarial nodes could steer agents toward an inefficient Nash equilibrium. In this paper, we instead focus on an adversary seeking to minimize system-level performance.

II. ANALYSIS OF SUSCEPTIBLE GRAPHS

This section focuses on identifying which graph structures are most susceptible to adversarial influence. To that end, we focus on a class of graphs that we term k-connected ring graphs, for k ∈ {1, ..., ⌊n/2⌋}. A graph G = (N, E) is a k-connected ring graph if N_i = {i−k, ..., i−1, i+1, ..., i+k} for each agent i ∈ N, where addition and subtraction are modulo n. Note that when k = 1 we have the usual ring graph, and when k = ⌊n/2⌋ we have the complete graph. Let G^k_n denote the set of all k-connected ring graphs of size n. The following theorems outline the degradation in performance attainable through admissible adversarial policies belonging to Π and Π_D, i.e., static and dynamic uninformed adversaries, respectively.

Theorem 2.1. Consider the class of network coordination games where (i) α ∈ [0, 1),² and (ii) an admissible adversarial policy can influence at most a fraction γ ∈ [0, 1] of agents in the network. Recall that Π(G, γ) is the set of admissible static adversarial policies that are agnostic about the network structure. Then,

lim_{n→∞} inf_{π ∈ Π(G,γ), G ∈ G^k_n} η(G, α, π) =
    { 1,                                       if γ < kα,
    { (1 − (k−1)α − αγ) / ((1+α)(1−kα)),       if γ ≥ kα.     (8)

[Footnote 2: Values of α ≥ 1 are not considered here. If α ≥ 1, then a single x link is valued as much as or more than two y links. In this case, no y agents can be induced in the stochastically stable state under any adversarial policy on ring graphs, and no damage can be inflicted.]

Theorem 2.2. Consider Π_D(G, γ), the set of admissible dynamic adversarial policies that are agnostic about the network structure on any graph G ∈ G^k_n. Then for α ∈ [0, 1) and γ ∈ [0, 1],

inf_{π ∈ Π_D(G,γ)} η(G, α, π) ≥
    { 1/(1+α),   if α < 1/k and γ ≠ 0,
    { 1,         if α ≥ 1/k or γ = 0.     (9)

Furthermore, the limiting efficiency as the size of G grows (n → ∞) equals the lower bound.

There are several interesting things to note from Theorems 2.1 and 2.2. First, if α ≥ 1/k, neither class of adversarial policies, Π_D(G, γ) nor Π(G, γ), can inflict any damage on the system, regardless of the budget γ. Second, the achievable efficiency of a dynamic uninformed adversary, i.e., restriction to Π_D(γ), is constant for γ ∈ (0, 1]. Third, the induced efficiency from a static uninformed adversary, i.e., restriction to Π(G, γ), is decreasing in γ. Lastly, by tightness we know that for any k ≥ 1 and γ ∈ (0, 1] we have

inf_{G_1 ∈ G^1, π ∈ Π} η(G_1, α, π) ≤ inf_{G_k ∈ G^k, π ∈ Π} η(G_k, α, π),

and an identical relation holds for policies in Π_D. Here, we omit the dependence on Π(·) for brevity. Hence, ring graphs (k = 1) are the graphs most susceptible to adversarial interference.

III. THE IMPACT OF INFORMATION ON RING GRAPHS

The previous section demonstrated that ring graphs are the most susceptible to adversarial influence. In this section we explicitly characterize the impact informational awareness has on the potential degradation by admissible adversarial policies. We focus this analysis exclusively on ring graphs.

A. Static Informed Adversarial Policies

This section focuses on the potential degradation achievable by adversarial policies in the set Π_I. By knowing the graph's structure, the adversary can explicitly target specific agents S_x, S_y ⊆ N in the network. An adversarial policy π ∈ Π_I defines the process by which these agents are selected.
The resulting policy is static in the sense that for all times t ≥ 1, (S_x(t), S_y(t)) = (S_x, S_y). The following theorem characterizes the potential degradation caused by such adversarial policies.

Theorem 3.1. Consider the class of network coordination games where (i) α ∈ [0, 1) and (ii) G ∈ G^1_n. Given a fractional adversarial budget γ, consider Π_I(G, γ), the set of admissible adversarial policies that are static but can depend on the network structure. Then for γ ∈ (0, 1],

inf_{π ∈ Π_I(G,γ)} η(G, α, π) ≥ inf_{ℓ^x_1, ℓ^x_2, ℓ^y_1, ℓ^y_2 ∈ Z_{≥0}} \frac{1}{1+α} \left( 1 + \frac{(2+α)\left(\frac{s_1}{s_2} − 1\right) + α\left(ℓ^x_1 − \frac{s_1}{s_2} ℓ^x_2\right)}{ℓ^x_1 + ℓ^y_1 − \frac{s_1}{s_2}\left(ℓ^x_2 + ℓ^y_2\right)} \right),

subject to, for j = 1, 2:

ℓ^x_j ≥ 2,  ℓ^y_j ≥ ⌈(2+α)/(1−α)⌉,
s_j = γ(ℓ^x_j + ℓ^y_j) − ⌈α(ℓ^y_j + 1)⌉ − 2 − ⌈[2 − α(ℓ^x_j − 1)]_+ / (1+α)⌉,
s_1 = 0 with ℓ^x_2 = ℓ^y_2 = 0, or s_1 > 0 and s_2 < 0.     (SI-OPT)

For γ = 0, the efficiency for any graph is 1. Here, we denote [z]_+ = max{z, 0} for any z ∈ R. Furthermore, the limiting efficiency as the size of G grows (n → ∞) equals the lower bound (SI-OPT).

There are several interesting things to note from Theorem 3.1, which characterizes the greatest damage that an adversary can inflict upon the system when relying on static policies that can depend on the graph structure. The theorem reveals the structure of the worst-case attack, in which the adversary attempts to stabilize alternating x, y sequences of four distinct lengths. While the structure of this adversarial attack is not necessarily fundamental, the interesting part of the theorem centers on tightness: the adversary can never inflict more damage than the bound given in Theorem 3.1, and the best adversarial strategy approaches this bound as the size of the ring graph under consideration grows.

B. Dynamic Informed Adversarial Policies

This section focuses on the potential degradation achievable by adversarial policies in the set Π_{DI}.
Here, the adversary can target specific agents S_x(a(t)), S_y(a(t)) ⊆ N using knowledge of the graph structure G and the sequence of action profiles {a(t)}_{t ∈ Z_{≥0}}. An adversarial policy π ∈ Π_{DI} defines the process by which these agents are selected. The following theorem characterizes the maximum potential degradation caused by such adversarial policies.

Theorem 3.2. Consider Π_{DI}(G, γ), the set of admissible adversarial policies that are dynamic and can depend on the network structure, and α ∈ [0, 1). Then the fundamental lower bound for inf_{π ∈ Π_{DI}(G,γ)} η(G, α, π) is given by the right-hand side of (SI-OPT), where the s_j variables are instead

s_j = { γ(ℓ^x_j + ℓ^y_j) − 4,   if α < 1/2 and ℓ^x_j ≤ 1 + ⌈(1−α)/α⌉,
      { γ(ℓ^x_j + ℓ^y_j) − 2,   otherwise,     (10)

for j = 1, 2. Furthermore, the limiting efficiency as the size of G grows (n → ∞) equals the lower bound.

Similar to (SI-OPT), the lower bound of Theorem 3.2 takes the form of an integer programming problem. While the structure of this adversarial attack is not necessarily fundamental, the interesting part of the theorem centers on tightness: the adversary can never inflict more damage than the bound described in Theorem 3.2, and the best adversarial strategy approaches this bound as the size of the ring graph grows.

C. Comparison Between Information and Sophistication

Here, we emphasize the qualitative differences between information and sophistication. The theorem below asserts that sophistication, i.e., the ability to implement a dynamic policy, is a more desirable attribute for the adversary if its budget is relatively low, while information is more valuable if its budget is high.

Theorem 3.3. Suppose α ∈ [0, 1). For budgets γ ∈ (0, α) (an empty interval if α = 0), we have

lim_{n→∞} inf_{G ∈ G^1_n, π ∈ Π_D(G,γ)} η(G, α, π) < lim_{n→∞} inf_{G ∈ G^1_n, π ∈ Π_I(G,γ)} η(G, α, π).     (11)

For budgets γ ∈ (α, 1], the opposite (strict) inequality holds.
They are equal if γ = α. Hence, in the low-budget regime γ < α, the adversary prefers to be uninformed and dynamic over being informed but static. The opposite conclusion holds in the high-budget regime γ > α. This characterization allows us to explicitly identify the importance of information and sophistication in adversarial policies, as highlighted in Figure 1.³

The next result provides a comparison between static and dynamic informed adversaries. It states that, given a sufficiently large adversarial budget, an optimal static informed policy can do just as much damage as an optimal dynamic informed policy.

Theorem 3.4. The fundamental lower bound on performance for static informed policies is

(1/(1+α)) · (ℓ* + α)/(ℓ* + 2)     (12)

if and only if the adversary has a budget γ ≥ γ^SI_sat := (ℓ* + ⌈(2−α)/(1+α)⌉)/(ℓ* + 2), where ℓ* := ⌈(2+α)/(1−α)⌉. Furthermore, the fundamental lower bound on performance for dynamic informed policies coincides with (12) for budgets γ ≥ γ^DI_sat := (2 + 2·1(α < 1/2))/(ℓ* + 2), where 1(·) is the indicator function.

[Footnote 3: Figure 1 plots the bounds characterized by the four main results, Theorems 2.1, 2.2, 3.1, and 3.2. While the bounds for Theorems 2.1 and 2.2 (uninformed adversaries) are analytically derived, the plots for informed adversaries closely approximate the true values by solving their respective integer optimization problems with a finite upper bound of 100 on the decision variables.]

In other words, there are saturation levels on the budget for both types of adversaries (DI and SI): influencing more than a γ^DI_sat (resp. γ^SI_sat) fraction of agents does not offer any additional performance gains. However, a static adversary will not exhibit saturation if α < 1/2; that is, in this case the static adversary achieves performance level (12) if and only if it has a full budget γ = 1.
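The saturation thresholds of Theorem 3.4 are simple to evaluate numerically. The sketch below tabulates ℓ*, the performance level (12), and the two saturation budgets for a given α; the helper name is ours, and the formulas are as reconstructed in the theorem statement above:

```python
import math

def saturation_levels(alpha):
    """Evaluate the quantities appearing in Theorem 3.4 for alpha in [0, 1)."""
    l_star = math.ceil((2 + alpha) / (1 - alpha))       # l* = ceil((2+a)/(1-a))
    perf = (1 / (1 + alpha)) * (l_star + alpha) / (l_star + 2)   # level (12)
    # static-informed saturation budget
    gamma_si = (l_star + math.ceil((2 - alpha) / (1 + alpha))) / (l_star + 2)
    # dynamic-informed saturation budget, with indicator 1(alpha < 1/2)
    gamma_di = (2 + 2 * (alpha < 0.5)) / (l_star + 2)
    return l_star, perf, gamma_si, gamma_di
```

For example, at α = 0.5: ℓ* = 5, performance level 11/21 ≈ 0.524, γ^SI_sat = 6/7, and γ^DI_sat = 2/7, so the dynamic informed adversary saturates at a far smaller budget. Note also that for α < 1/2 the formula gives γ^SI_sat = 1, matching the no-saturation remark above.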
It is interesting to note from the above theorem that the dynamic informed adversary can maintain the performance level (12) for a wider range of budgets, γ ∈ [γ^DI_sat, 1] ⊇ [γ^SI_sat, 1], than the static informed adversary can. The ranges coincide (neither exhibits saturation) if and only if α = 0. Essentially, dynamic policies can inflict the same level of damage with fewer adversaries than static policies. The proofs of both theorems in this subsection are given in the Appendix.

IV. PROOFS: PERFORMANCE OF STATIC POLICIES

In this section, we provide proofs for the minimum efficiency a static adversary can induce. We first prove Theorem 3.1, the case of a static informed adversary. As discussed, we limit our attention here to ring graphs G ∈ G^1. We then give a proof of Theorem 2.1, the case of a static uninformed adversary. This result relies on extending an intermediate step from the proof of Theorem 3.1 to k-connected ring graphs.

The adversary's objective is to steer the system to a stochastically stable state of minimal efficiency. We refer to the action profiles that can be stabilized through some static policy as the sets of target profiles that a static uninformed and a static informed adversary can induce, respectively. Indeed, we would like to characterize the target profile of minimal efficiency an adversary can achieve over any ring graph, i.e.,

inf_{G ∈ G^1, π ∈ Π_I(G,γ)} η(G, α, π).     (13)

Our approach is to view any action profile a (and hence any target profile) as composed of alternating x and y segments. A y segment L_y is any subset {j, j+1, ..., j+|L_y|−1} ⊆ N such that a_i = y for all i ∈ L_y and a_{j−1} = a_{j+|L_y|} = x (modulo-n arithmetic). Similarly, L_x describes any such segment of x agents.

A. Proof of Theorem 3.1

To begin, we give a general outline of the forthcoming proof, which we break up into three steps.
Following the outline, we give proofs for each of the individual steps.

Step 1: Necessary and sufficient budget conditions to stabilize target profiles. We derive the minimum number of adversarial nodes that is necessary and sufficient to stabilize a given action profile a. Indeed, suppose S is an allocation of adversarial nodes. Then a is stochastically stable if and only if for every y segment L_y and x segment L_x contained in a,

|S_y ∩ L_y| ≥ ⌈α(|L_y| + 1)⌉ + 2,     (14)
|S_x ∩ L_x| ≥ ⌈[2 − α(|L_x| − 1)]_+ / (1 + α)⌉,     (15)

and the spacing between two sequential y adversarial nodes within L_y is no more than ⌊1/α⌋. We observe that segment lengths must satisfy |L_y| ≥ ⌈(2+α)/(1−α)⌉ and |L_x| ≥ 2.

Step 2: Characterizing minimal-efficiency target profiles. Having established the number of adversarial nodes needed to stabilize target profiles, we identify structural properties of minimal-efficiency target profiles that are stabilizable within the budget γ ∈ (0, 1]. In particular, we show that:

(2A) Among adversarial policies that induce maximal damage, there is at least one that utilizes its full budget. Specifically, if the policy π_S ∈ Π_I(G, γ) with |S| < ⌊γ·n⌋ stabilizes profile a, then one can always use a policy π_{S'} with |S'| = ⌊γ·n⌋ that also stabilizes a.

(2B) The target profile of minimal efficiency contains at most two unique xy segment patterns.

Step 3: Optimization over worst-case target profiles. We formulate an integer optimization problem whose solution gives (13). The decision variables are the lengths of the two unique xy segment patterns, subject to necessity constraints derived from (14) and (15), as well as constraints given by the structural properties (2A) and (2B) of minimal-efficiency target profiles. This formulation yields (SI-OPT), and thus the proof of Theorem 3.1.
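The counting conditions of Step 1 can be sketched directly (the helper names below are ours). As a sanity check, the example of Figure 2 (α = 1/4, |L_y| = 9) requires ⌈0.25 · 10⌉ + 2 = 5 adversaries:

```python
import math

def min_y_adversaries(Ly, alpha):
    """Necessary and sufficient number of y-adversaries to stabilize a
    y segment of length Ly, per condition (14): ceil(a(|L_y|+1)) + 2."""
    return math.ceil(alpha * (Ly + 1)) + 2

def min_x_adversaries(Lx, alpha):
    """Minimum x-adversaries for an x segment of length Lx, per (15):
    ceil([2 - a(|L_x|-1)]_+ / (1+a))."""
    return math.ceil(max(2 - alpha * (Lx - 1), 0) / (1 + alpha))

def min_segment_lengths(alpha):
    """Minimum admissible segment lengths:
    |L_y| >= ceil((2+a)/(1-a)) and |L_x| >= 2."""
    return math.ceil((2 + alpha) / (1 - alpha)), 2
```

At α = 1/4, the minimum admissible y segment has length ⌈2.25/0.75⌉ = 3, and a minimum-length x segment of two agents still requires ⌈1.75/1.25⌉ = 2 x-adversaries.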
Before getting into the proofs of the claims giv en in the outline, we first present preliminary analytical tools for characterizing the emergent behavior when an adversarial policy π S ∈ Π I interferes with the agents’ log-linear learning dynamics. Specifically , we seek to compute the stochastically stable states LLL ( G, α, π S ) . T o do this, we can rely on the fact the graphical coordination game with static adversarial influence has a potential game structure [34]. In potential games, the stochastically stable states associated with log- linear learning are the action profiles that maximize the poten- tial function [35], [36]. One can show that φ ( a ; S ) := W ( a ) 2 + P i ∈ S x V ( a i , x ) + P i ∈ S y V ( a i , y ) . is a potential function for this game. Here, φ simply measures the number of coordinat- ing links, including those induced from adversaries, weighted by their payoffs (i.e. x or y links). Hence, for any graph G and static policy π S , we hav e LLL ( G, α, π S ) = arg max a ∈A φ ( a ; S ) . Proof of Step 1 W e present the proof only for y segments, as the arguments for x segments are analogous. Suppose a is stochastically sta- ble, and contains a y segment L y = { j, j + 1 , . . . , j + | L y | − 1 } . That is, a i = y for i ∈ L y , and a j − 1 = a j + | L y | = x . Consider any de viation a 0 from a that differs only within the segment L y . Then it holds that φ ( a 0 ; S ) ≤ φ ( a ; S ) . In particular , if a 0 is the profile where all agents in L y deviate to x , then (1+ α )( | L y | + 1) ≤ | L y |− 1+ | S y ∩ L y | must hold. Rearranging, we obtain | S y ∩ L y | ≥ α ( | L y | + 1) + 2 . Since | S y ∩ L y | is a non- negati ve integer , it must hold that | S y ∩ L y | ≥ 2+ d α ( | L y | +1) e . T o prove sufficiency , we need to construct an allocation S y ∩ L y of d α ( | L y | + 1) e + 2 y adversarial nodes such that φ ( a ; S ) ≥ φ ( a 0 ; S ) , where a i = y ∀ i ∈ L y and for any a 0 deviating from a in agents only in L y . 
We first assume that $|L_y| \ge \lceil \alpha(|L_y| + 1) \rceil + 2$, i.e., the length of the segment itself is greater than or equal to the necessary number of adversaries.

Fig. 2: An illustration of the constructed influence set given by (16), (17) to stabilize an isolated $y$ segment. The $y$ adversaries belonging to $S_y$ are depicted as the smaller circles attaching to agents (larger circles) in the network. In this example, $\alpha = \frac{1}{4}$ and $|L_y| = 9$. The necessary and sufficient number of adversaries to stabilize the segment is 5.

Indeed, let us define the sets $W_1$ and $W_2$ as follows:
$$W_1 = \{ i \in L_y : \lfloor \alpha(i - j + 1) \rfloor - \lfloor \alpha(i - j) \rfloor > 0 \}, \quad (16)$$
$$W_2 = \{ j, w, j + |L_y| - 1 \}, \quad (17)$$
where $w = \max\{ i : i \in L_y \setminus (W_1 \cup \{ j + |L_y| - 1 \}) \}$, i.e., the largest index that is neither in $W_1$ nor the endpoint $j + |L_y| - 1$. Then, set $S_y \cap L_y = W_1 \cup W_2$. An illustration of this influence set is depicted in Figure 2. Such a placement "spreads out" adversaries along $L_y$ at a spacing of $\lceil 1/\alpha \rceil$ nodes, and additionally places adversaries at the endpoints. We will show this placement ensures the sufficiency condition.

Let us denote by $a_{L_y}$ the actions of agents in $L_y$, by $y_{L_y}$ the partial profile where all agents in $L_y$ play $y$, and by $x_{L_y}$ the one where all agents in $L_y$ play $x$. Let us assume $S_x \cap L_y = \emptyset$. Any profile $a_{L_y} \notin \{ y_{L_y}, x_{L_y} \}$ belongs to one of three classes: 1) a single isolated $x$ segment $X_1$ within $L_y$; 2) an $x$ segment $X_2$ on the left and/or right edge of $L_y$; 3) a combination of classes 1) and 2). For class 1 profiles, the potential of $y_{L_y}$ exceeds that of $a_{L_y}$ if
$$|S_y \cap X_1| > \alpha(|X_1| - 1) - 2. \quad (18)$$
By construction of $S_y$, a lower bound on the number of $y$ adversaries influencing $X_1$ is $|S_y \cap X_1| \ge \lfloor \alpha |X_1| \rfloor$, which satisfies (18). Similarly, for class 2 profiles, a sufficient condition is
$$|S_y \cap X_2| > \alpha |X_2|. \quad (19)$$
First, suppose $j \in X_2$. The number of adversaries on $X_2$ is given by $\lfloor \alpha |X_2| \rfloor + 1$, which clearly satisfies (19). Alternatively, suppose $j + |L_y| - 1 \in X_2$. The number of adversaries influencing $X_2$ is at least
$$\begin{cases} |X_2| & \text{if } w \notin X_2, \\ \lfloor \alpha(|X_2| - 1) \rfloor + 2 & \text{otherwise}, \end{cases} \quad (20)$$
since when $w \notin X_2$, $X_2 \subset S_y$; and when $w \in X_2$, $\{w, j + |L_y| - 1\} \subseteq S_y$ in addition to the nodes in $W_1$. Both cases satisfy (19). Thus, $S_y$ satisfies the requirements of (18) and (19), and consequently $S_y$ satisfies the requirement for class 3 profiles as well. By construction, $|S_y| \le \lfloor \alpha |L_y| \rfloor + 3$. If the inequality is strict, one can simply add additional $y$ adversaries anywhere in the segment to meet the necessary condition (14), i.e., the case where $a_{L_y} = x_{L_y}$. □

Proof of property (2A). Suppose the minimum efficiency profile $a^*$ on the graph $G \in \mathcal{G}_1$ is stabilized by the policy $\pi_S \in \Pi_I(G, \gamma)$, where not all adversaries are utilized: $|S| < \lfloor \gamma n \rfloor$. The conditions (14) and (15) are met for all $x$ and $y$ segments, respectively. One can always add the remaining available $x$ (resp. $y$) adversaries to the existing $x$ (resp. $y$) segments while retaining stability of $a^*$. Therefore, there exists a policy $\pi_{S'}$ with $|S'| = \lfloor \gamma n \rfloor$ that also stabilizes $a^*$. □

Proof of property (2B). Before proving this property explicitly, we first define some relevant notation. We can describe a profile $a$ as a sequence of alternating segments $L^1_x L^1_y L^2_x L^2_y \cdots$. For each unique segment pair pattern that appears in $a$, i.e., $|L_x|$ $x$ agents followed by $|L_y|$ $y$ agents, let us define the (column) vector $\ell_x(a)$ whose elements are the lengths $|L_x|$ among the unique patterns. We define $\ell_y(a)$ similarly for the corresponding lengths $|L_y|$. Let us also define the vector $r(a)$ whose elements are the number of times each unique pattern appears in $a$. We will drop the dependencies on $a$ when the context is clear. We refer to $r$ as the repetition vector.
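The endpoint-and-spacing placement (16)-(17) used in the sufficiency proof of Step 1 can be sketched as follows (our own function name; indices follow the segment convention $L_y = \{j, \ldots, j+|L_y|-1\}$).

```python
import math

def influence_set(j, seg_len, alpha):
    """Construct S_y within L_y as W1 union W2 per (16)-(17): adversaries
    spread roughly ceil(1/alpha) apart, plus both endpoints and node w."""
    L = range(j, j + seg_len)
    end = j + seg_len - 1
    W1 = {i for i in L
          if math.floor(alpha * (i - j + 1)) - math.floor(alpha * (i - j)) > 0}
    # w: largest index neither in W1 nor the right endpoint
    w = max(i for i in L if i not in W1 and i != end)
    W2 = {j, w, end}
    return W1 | W2
```

For the Figure 2 example ($\alpha = 1/4$, $|L_y| = 9$), this yields exactly the five adversaries required by condition (14).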
The efficiency of $a$ can be rewritten in the following suggestive form:
$$\eta(a) = \frac{r^\top ((1+\alpha)\ell_x + \ell_y) - (2+\alpha)\|r\|_1}{(1+\alpha)\, r^\top (\ell_y + \ell_x)}. \quad (21)$$
We note that the denominator of (21) is simply the number of links in the ring, $n$, multiplied by $1 + \alpha$; this is the optimal welfare $\frac{1}{2} W(a^{\mathrm{opt}})$. The numerator of (21) counts (and weights with the associated payoff) the number of coordinating $x$ and $y$ links given the description vectors $\ell_x$, $\ell_y$, and $r$.

For a profile $a$ and its associated description vectors $\ell_x$ and $\ell_y$, let us define the vector $s(a)$ of identical length, whose components are given by
$$s_j(a) := \gamma(\ell_{x,j} + \ell_{y,j}) - \lceil \alpha(\ell_{y,j} + 1) \rceil - 2 - \left\lceil \frac{[2 - \alpha(\ell_{x,j} - 1)]_+}{1+\alpha} \right\rceil.$$
The number $s_j$ is the difference between the adversaries available to a particular segment pattern (given budget $\gamma$) whose lengths are given by $\ell_{x,j}$ and $\ell_{y,j}$, and the minimum number of adversaries needed to ensure its stability (given by (14), (15)). We refer to $s$ as the surplus vector. The quantity $r^\top s$ is the excess budget after using the minimum required number of adversaries to stabilize $a$. Property (2A) asserts that a target profile of minimum efficiency satisfies $r^\top s = 0$.

Now, consider an action profile $a^1$ with $\ell^1_x = (\ell_{x,1}, \ell_{x,2}, \ell_{x,3})$, $\ell^1_y = (\ell_{y,1}, \ell_{y,2}, \ell_{y,3})$, and $s = (s_1, s_2, s_3)$ with $s_1 > 0$ and $s_2, s_3 < 0$. Hence, we can find $r^1$ such that $(r^1)^\top s = 0$. Thus, $a^1$ is a candidate for a minimum efficiency stable state. Furthermore, consider the profiles $a^2$ and $a^3$ (possibly defined on different ring graphs), where $a^2$ is associated with $\ell^2_x = (\ell_{x,1}, \ell_{x,2})$ and $\ell^2_y = (\ell_{y,1}, \ell_{y,2})$, and $a^3$ is associated with $\ell^3_x = (\ell_{x,1}, \ell_{x,3})$ and $\ell^3_y = (\ell_{y,1}, \ell_{y,3})$. One can find repetition vectors $r^2$, $r^3$ that satisfy $(r^2)^\top (s_1, s_2) = 0$ and $(r^3)^\top (s_1, s_3) = 0$.
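Expression (21) can be evaluated directly from the description vectors; a small sketch (our own function, not from the paper):

```python
def efficiency(r, lx, ly, alpha):
    """Efficiency (21): payoff-weighted coordinating links divided by the
    optimal welfare, given repetition vector r and length vectors lx, ly."""
    num = sum(ri * ((1 + alpha) * a + b) for ri, a, b in zip(r, lx, ly))
    num -= (2 + alpha) * sum(r)  # (2 + alpha) * ||r||_1
    den = (1 + alpha) * sum(ri * (a + b) for ri, a, b in zip(r, lx, ly))
    return num / den
```

As a sanity check, a single repeated pattern with $|L_x| = 2$, $|L_y| = 5$ gives $\eta = \frac{(1+\alpha)\cdot 1 + 4}{(1+\alpha)\cdot 7}$, independent of the repetition count.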
Define $g_i = \ell_{y,i} + (1+\alpha)\ell_{x,i} - (2+\alpha)$ and $\ell_i = \ell_{x,i} + \ell_{y,i}$ for each $i = 1, 2, 3$. We can express the efficiency of $a^1$ as
$$\eta(a^1) = \frac{r^1_2 \left( g_2 - \frac{s_2}{s_1} g_1 \right) + r^1_3 \left( g_3 - \frac{s_3}{s_1} g_1 \right)}{(1+\alpha)\left( r^1_2 \left( \ell_2 - \frac{s_2}{s_1} \ell_1 \right) + r^1_3 \left( \ell_3 - \frac{s_3}{s_1} \ell_1 \right) \right)}.$$
One can write the efficiencies of $a^2$, $a^3$ as
$$\eta(a^2) = \frac{g_2 - \frac{s_2}{s_1} g_1}{(1+\alpha)\left( \ell_2 - \frac{s_2}{s_1} \ell_1 \right)} \quad \text{and} \quad \eta(a^3) = \frac{g_3 - \frac{s_3}{s_1} g_1}{(1+\alpha)\left( \ell_3 - \frac{s_3}{s_1} \ell_1 \right)}.$$
Observe that $\eta(a^1)$ is a mediant sum of the weighted values $\eta(a^2)$ and $\eta(a^3)$. Hence, either $\eta(a^2)$ or $\eta(a^3)$ is less than or equal to $\eta(a^1)$. This result can be extended in a similar way to show that, for any profile consisting of multiple segment patterns, one can construct another profile of lower efficiency using up to two unique segment patterns from the original action profile. □

Proof of Step 3 (Theorem 3.1). Using the collection of results obtained in Steps 1 and 2, we can now prove Theorem 3.1. From property (2B), the search for a minimal efficiency stable state, i.e., one that gives the efficiency (13), reduces to finding four lengths: $\ell_x = (\ell_{x,1}, \ell_{x,2})$ and $\ell_y = (\ell_{y,1}, \ell_{y,2})$. The form of the objective function in the integer program (SI-OPT) thus coincides with the expression for $\eta(a^2)$ in property (2B). Each $\ell_{z,i}$, $z \in \{x, y\}$ and $i \in \{1, 2\}$, must satisfy the length criteria $\ell_{y,i} \ge \lceil \frac{2+\alpha}{1-\alpha} \rceil$ and $\ell_{x,i} \ge 2$ ($3$ if $\alpha = 0$). These length conditions are consequences of the stabilizability conditions (14) and (15). Lastly, one can find a repetition vector $r$ that satisfies $r^\top s = 0$ as long as $s_1 > 0$ and $s_2 < 0$, or $s_1 = 0$ with $\ell_{x,1}, \ell_{x,2} = 0$. □

B. Proof of Theorem 2.1

Here, we provide a proof of Theorem 2.1, which characterizes the minimal efficiency a static uninformed adversary can induce on a $k$-connected ring graph. The arguments rely on an extension of intermediate Step 1 from the proof of Theorem 3.1 to $k$-connected ring graphs.
In particular, the necessary and sufficient condition to stabilize a $y$ segment in a $k$-connected ring graph is
$$|S_y \cap L_y| \ge \left\lceil \alpha \left( k|L_y| + \frac{k(k+1)}{2} \right) \right\rceil + k(k+1), \quad (22)$$
and the spacing between two sequential $y$ adversaries within $L_y$ is no more than $\lceil \frac{1}{k\alpha} \rceil$. Note that according to this condition, the segment length must also satisfy $|L_y| \ge \max\left\{ 1, \left\lceil \frac{k(k+1)(1+\alpha/2)}{1 - k\alpha} \right\rceil \right\}$.

A derivation of the condition is as follows. There are $\sum_{j=1}^k (|L_y| - j)$ links between agents in $L_y$. Assuming $|L_y|$ satisfies the length requirement, there are $2\sum_{j=1}^k j$ links from $L_y$ to outside $L_y$. The potential of $y_{L_y}$ (all agents in $L_y$ play $y$) exceeds that of $x_{L_y}$ (all play $x$) if
$$|S_y \cap L_y| + \sum_{j=1}^k (|L_y| - j) \ge (1+\alpha)\left[ \sum_{j=1}^k (|L_y| - j) + 2\sum_{j=1}^k j \right],$$
which reduces to (22). One can prove sufficiency in a similar manner as Step 1 from the previous section: by allocating the $y$ adversaries with a spacing of $\lceil \frac{1}{k\alpha} \rceil$ apart, the potential of $y_{L_y}$ exceeds that of any other $a_{L_y} \notin \{ y_{L_y}, x_{L_y} \}$.

We are now ready to prove Theorem 2.1. A static and uninformed policy cannot strategically place adversarial nodes in the network; it can only specify the numbers of $x$ and $y$ adversaries. Its baseline performance is given by the minimal damage that can be inflicted over all possible allocations of these adversaries. Hence, to characterize (8), we seek the allocation of adversaries that ensures the best-case efficiency for the network.

First, we consider the case $\gamma < k\alpha$. The adversarial nodes can be allocated sparsely enough across the entire network that condition (22) is violated. Consequently, the all-$x$ profile is the unique stochastically stable state, and no damage can be inflicted on the system in this regime. Note that if $k\alpha > 1$, no damage is possible regardless of the budget. Now, consider $\gamma > k\alpha$.
If $k\alpha < 1$ and $G \in \mathcal{G}_k$ is sufficiently large, an allocation of $y$ adversaries according to (22) would ensure conversion of the entire network to $y$, giving an efficiency of $\frac{1}{1+\alpha}$. However, let us consider a re-allocation of these adversaries that maximally mitigates such damage. The idea is to allow only a minimal fraction $f$ of the network to be converted to $y$, while the rest of the network plays $x$. Suppose $y$ adversaries are allocated to every agent in a contiguous segment whose length is a fraction $f$ of the entire network, and suppose this segment is sufficiently long that (22) is satisfied. Now, the remaining $\gamma - f$ adversaries should be allocated to the rest of the network such that the remaining fraction $1 - f$ of the network (another contiguous segment) remains stable to $x$. Indeed, an adversarial agent density of up to $k\alpha$ in the remaining network fails to induce any $y$ agents. The smallest $f$ that satisfies these conditions is given by $f = \frac{\gamma - k\alpha}{1 - k\alpha}$. This establishes (8). Note that in this analysis, the adversary exclusively chooses to implement $y$ adversaries: based on the above arguments, an optimal static uninformed policy never chooses to use $x$ adversaries.

V. PROOFS: PERFORMANCE OF DYNAMIC POLICIES

In this section, we give proofs for the minimum efficiency dynamic adversaries can induce. Similar to Section IV, we will first prove Theorem 3.2, the case of a dynamic informed adversary. We then give the proof of Theorem 2.2. Due to the time-dependent nature of dynamic policies, we cannot rely on potential game arguments to compute stochastically stable states as we did in Section IV. One must instead leverage the theory of regularly perturbed Markov processes and resistance trees. Before delving into the proof of Theorem 3.2, we provide a brief overview of this theory below. More detailed treatments can be found in [21], [37].
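The static uninformed characterization of Theorem 2.1 above reduces to a simple threshold formula for the converted fraction $f$; a sketch (our own helper name, not from the paper):

```python
def converted_fraction(gamma, k, alpha):
    """Smallest fraction of a k-connected ring that a static uninformed
    adversary with budget gamma can force to y; zero when gamma <= k*alpha
    or k*alpha >= 1, where no damage is possible."""
    if k * alpha >= 1 or gamma <= k * alpha:
        return 0.0
    return (gamma - k * alpha) / (1 - k * alpha)
```

For example, with $k = 1$, $\alpha = 1/4$, and budget $\gamma = 1/2$, the adversary converts at most a third of the ring.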
A. Preliminary: Regularly perturbed Markov processes and resistance trees

Definition 2. A Markov process with transition matrix $P_\epsilon$ defined over state space $\mathcal{A}$ and parameterized by a perturbation $\epsilon \in (0, \bar\epsilon]$ for some $\bar\epsilon > 0$ is a regular perturbation of the process $P^0$ if it satisfies:
1) $P_\epsilon$ is aperiodic and irreducible for all $\epsilon \in (0, \bar\epsilon]$.
2) $\lim_{\epsilon \to 0^+} P_\epsilon(a, a') = P^0(a, a')$ for all $a, a' \in \mathcal{A}$.
3) If $P_\epsilon(a, a') > 0$ for some $\epsilon \in (0, \bar\epsilon]$, then there exists $r(a, a') \ge 0$ such that $0 < \lim_{\epsilon \to 0^+} \frac{P_\epsilon(a, a')}{\epsilon^{r(a,a')}} < \infty$.

We call $r(a, a')$ the resistance of the transition $a \to a'$. The log-linear learning process is a regularly perturbed process with error parameter $\epsilon = e^{-\beta}$. The transition graph of $P_\epsilon$ is a directed graph whose nodes are the action profiles $\mathcal{A}$, and the edge $(a, a')$ exists if and only if $P_\epsilon(a, a') > 0$. The weight of such an edge is given by the resistance $r(a, a')$. The resistance of a path of length $m$, $\zeta = (z^1 \to z^2 \to \cdots \to z^m)$, is the sum of resistances along the state transitions: $r(\zeta) := \sum_{k=1}^{m-1} r(z^k, z^{k+1})$.

Let us denote the recurrent classes of the unperturbed process $P^0$ by $E_1, E_2, \ldots, E_N$ with $N \ge 1$, where each class $E_k \subset \mathcal{A}$. A recurrent class satisfies the following:
1) For all $a \in \mathcal{A}$, there is a zero-resistance path from $a$ to $E_k$ for some $k \in \{1, \ldots, N\}$.
2) For all $k \in \{1, \ldots, N\}$ and all $a, a' \in E_k$, there exists a zero-resistance path from $a$ to $a'$ and from $a'$ to $a$.
3) For all $a, a'$ with $a \in E_k$ for some $k \in \{1, \ldots, N\}$ and $a' \notin E_k$, $r(a, a') > 0$.

One can also consider another directed transition graph whose nodes are the $N$ recurrent classes. In this graph, all edges exist. Edge $(E_i, E_j)$ is weighted by $\rho_{ij}$, defined as the minimum resistance among paths in the action profile transition graph starting from $E_i$ and ending in $E_j$: $\rho_{ij} := \min_{a \in E_i, a' \in E_j} \min_{\zeta \in \mathcal{P}(a \to a')} r(\zeta)$,
where $\mathcal{P}(a \to a')$ denotes the set of all paths starting at $a$ and ending at $a'$. Let $\mathcal{T}_k$ be the set of all spanning trees rooted at the class $E_k$. That is, an element $T \in \mathcal{T}_k$ is a directed graph with $N - 1$ edges such that there is a unique path from $E_j$ to $E_k$ for every $j \ne k$. The resistance $R(T)$ of the rooted tree $T$ is the sum of the resistances $\rho_{ij}$ on the $N - 1$ edges that compose it. Now, define $\psi_k := \min_{T \in \mathcal{T}_k} R(T)$ as the stochastic potential of recurrent class $E_k$. We will use the following result to identify stochastically stable states.

Lemma 5.1 (from [37]). The state $a \in \mathcal{A}$ is stochastically stable if and only if $a \in E_k$, where $k \in \arg\min_{j \in \{1, \ldots, N\}} \psi_j$; that is, it belongs to a recurrent class with minimum stochastic potential. It is the unique stochastically stable state if and only if $E_k = \{a\}$ and $\psi_k < \psi_j$ for all $j \ne k$.

B. Proof of Theorem 3.2

The logic of the proof follows the same three-step structure as the proof of Theorem 3.1. The only component that differs is the necessary and sufficient budget conditions to stabilize $x$ and $y$ segments. Indeed, we expect a dynamic informed adversary to need fewer adversaries than its static counterpart. An outline of the proof is as follows. We show that a particularly defined dynamic policy, which we term an aggressive policy, is sufficient to stabilize a given target profile. This entails proving that $a$ is the recurrent class of minimum stochastic potential (see Lemma 5.1). To do so, we characterize the set of recurrent classes and demonstrate that the minimum resistance path leaving each class leads to another that is more "similar" to the target profile $a$. We then prove necessity: any other dynamic policy utilizing strictly fewer adversarial nodes than the aggressive policy cannot stabilize $a$. We can then formulate an integer optimization problem similar to that of Theorem 3.1, but with different constraints on the number of adversaries needed for each $x$ and $y$ segment.
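The stochastic potentials of Lemma 5.1 can be computed by brute force when the number of recurrent classes is small. The sketch below (our own illustration, not from the paper) enumerates, for each candidate root, every way the other classes can each pick one outgoing edge, keeping only the choices that form a tree into the root.

```python
import itertools

def stochastic_potentials(rho):
    """Brute-force psi_k = min resistance over trees rooted at class k
    (every other class has a unique directed path to k), given the
    resistance matrix rho[i][j]. Feasible only for a handful of classes."""
    N = len(rho)
    psi = []
    for k in range(N):
        others = [j for j in range(N) if j != k]
        best = float('inf')
        # each non-root class chooses exactly one outgoing edge (its parent)
        for choice in itertools.product(range(N), repeat=len(others)):
            parent = dict(zip(others, choice))
            if any(j == p for j, p in parent.items()):
                continue  # disallow self loops
            ok = True
            for j in others:  # following parents from j must reach root k
                seen, cur = set(), j
                while cur != k:
                    if cur in seen:
                        ok = False  # cycle among non-root classes
                        break
                    seen.add(cur)
                    cur = parent[cur]
                if not ok:
                    break
            if ok:
                best = min(best, sum(rho[j][parent[j]] for j in others))
        psi.append(best)
    return psi
```

With two classes the potentials are just the two cross resistances; with three, the class whose incoming edges are cheap and outgoing edges expensive attains the minimum.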
To begin, we formally define the aggressive policy based on a profile $a \in \mathcal{A}$.

Fig. 3: An illustration of both defensive and offensive strategies in an aggressive policy. (Left) Defensive $y$ strategies are applied to the first and third segments from the left. The fourth agent from the left transitioning from $x$ to $y$ at time $t+1$ activates a defensive $x$ strategy in the second. No adversaries are deployed to $x$ segments until only two neighboring agents playing $x$ remain. (Right) A defensive and an offensive $y$ strategy are applied simultaneously to the first segment. The offensive strategy attaches a $y$ adversary to the $x$ agent that has a $y$ neighbor.

Definition 3 (Aggressive policy targeting $a$). An aggressive policy targeting $a \in \mathcal{A}$ is a state-dependent policy with adversarial placements $\{S_x(a(t)), S_y(a(t))\}_{t \ge 0}$ satisfying the following properties.

1) (Defensive $y$ strategy) For each $y$-segment $L_y$ contained in $a$, suppose $[p, q] = \{p, p+1, \ldots, q\} \subseteq L_y$, with $p \ne q$, is the longest segment of agents within $L_y$ playing $y$ in $a(t)$. Then,
$$S_y(a(t)) \cap [p, q] = \begin{cases} \{p, q\} & \text{if } a_{p-1}(t) = a_{q+1}(t) = x, \\ \{p\} & \text{if } a_{p-1}(t) = x,\ a_{q+1}(t) = y, \\ \{q\} & \text{if } a_{p-1}(t) = y,\ a_{q+1}(t) = x. \end{cases} \quad (23)$$
If the length of $[p, q]$ is one ($p = q$), then $p \notin S_y(a(t))$.

2) (Defensive $x$ strategy) For each $x$-segment $L_x$ contained in $a$, suppose $[p, q]$, $p \ne q$, is the longest segment of agents within $L_x$ playing $x$ in $a(t)$.
(a) If $\alpha < \frac{1}{2}$ and $|L_x| \le 1 + \lceil \frac{1-\alpha}{\alpha} \rceil$, then
$$S_x(t) \cap [p, q] = \begin{cases} \{p, q\} & \text{if } q - p = 1, \\ \emptyset & \text{otherwise}. \end{cases} \quad (24)$$
If $|L_x| \ge 2 + \lceil \frac{1-\alpha}{\alpha} \rceil$, then $S_x(t) \cap L_x = \emptyset$.
(b) Suppose $\alpha \ge \frac{1}{2}$. Then $S_x(t) \cap L_x = \emptyset$.

3) (Offensive strategies) Consider the segment of lowest index that is not aligned, i.e.,
$a_i(t) \ne y$ for at least one $i \in L_y = [u, v]$ in the case of a $y$-segment. Then the following properties hold for $S_y(t)$; a similar implementation holds if it is an $x$-segment.
(a) Denote by $[p, q] \subset L_y$ the longest segment of agents within $L_y$ playing $y$ in $a(t)$. Then $S_y(t) \cap L_y$ contains either $p - 1$ or $q + 1$, but not both.
(b) If $a_i(t) = x$ for all $i \in L_y$, then $S_y(t) \cap L_y$ contains
• $u$ or $v$ (but not both), when $a_{u-1}(t) = a_{v+1}(t)$;
• $u$, when $a_{u-1}(t) = y$ and $a_{v+1}(t) = x$;
• $v$, when $a_{u-1}(t) = x$ and $a_{v+1}(t) = y$.

Properties 1 and 2 describe "defensive $y$ (resp. $x$)" strategies to maintain $y$ (resp. $x$) segments over time. A defensive $y$ strategy is implemented on all $y$-segments at any given time. In property 2, defensive $x$ strategies are applied only if $\alpha < \frac{1}{2}$, and only to segments that are shorter than a threshold length. Furthermore, the strategy does not "activate" until there are at most two consecutive agents playing $x$ in the segment. Property 3 describes "offensive" strategies that are intended to convert segments back to their original type $x$ or $y$. Note that the aggressive policy applies an offensive strategy to only a single segment at any time $t$, if needed. Figure 3 depicts an illustration of allocations of adversarial nodes to segments according to an aggressive policy.

Let $n_y$ be the number of $y$ segments and $n_x$ the number of $x$ segments of length at most $1 + \lceil \frac{1-\alpha}{\alpha} \rceil$ in profile $a$. Then, the minimum number of adversarial nodes needed to implement an aggressive policy targeting $a$ is
$$\begin{cases} 2(n_y + n_x) + 1 & \text{if } \alpha < \frac{1}{2}, \\ 2n_y + 1 & \text{if } \alpha \ge \frac{1}{2}. \end{cases} \quad (25)$$
Here, the additional $+1$ adversary is needed to implement an offensive strategy.

We now establish some basic facts and terminology. Under log-linear learning, states transition via unilateral agent deviations.
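The budget count (25) is easy to compute from a target profile's segment decomposition. A sketch (our own helper names), assuming `x_lengths` lists the lengths of all $x$ segments:

```python
import math

def aggressive_budget(x_lengths, n_y, alpha):
    """Minimum adversarial nodes (25) for an aggressive policy: two per
    y-segment; two per 'short' x-segment (length <= 1 + ceil((1-alpha)/alpha))
    when alpha < 1/2; plus one for the single offensive strategy."""
    if alpha >= 0.5:
        return 2 * n_y + 1
    threshold = 1 + math.ceil((1 - alpha) / alpha)
    n_x = sum(1 for L in x_lengths if L <= threshold)
    return 2 * (n_y + n_x) + 1
```

For instance, with $\alpha = 1/4$ the threshold is $4$, so a profile with $x$ segments of lengths $2$ and $5$ and two $y$ segments needs $2(2+1)+1 = 7$ adversaries.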
If two profiles $a^1$ and $a^2$ differ by agent $i$'s deviation, the resistance is
$$r(a^1 \to a^2) = \left[ \tilde U_i(a^1_i, a^1_{-i}; S(a^1)) - \tilde U_i(a^2_i, a^1_{-i}; S(a^1)) \right]_+, \quad (26)$$
where recall that $\tilde U_i$ is agent $i$'s perceived utility (6). To be more specific about such deviations, we introduce the following notation. Suppose that in profile $a^1$, agent $i$ plays $z \in \{x, y\}$, has $b \in \{0, 1, 2\}$ neighbors also playing $z$, and is influenced by an adversarial node of type $s \in \{x, y, \emptyset\}$. In $a^2$, agent $i$ unilaterally deviates to $\{x, y\} \setminus z$. Then we write
$$\omega(z, b, s) := r(a^1 \to a^2) \quad (27)$$
to denote the magnitude of the resistance for this transition.

The following result characterizes the minimum required lengths of $x$ and $y$ segments in a target profile.

Lemma 5.2. If $a \in \mathcal{A}$ is stochastically stable under the aggressive policy targeting $a$, then
• all $x$ segments of $a$ are of length $2$ or greater;
• all $y$ segments of $a$ are of length $\lceil \frac{2+\alpha}{1-\alpha} \rceil$ or greater.

Proof. Suppose $(a_{i-1}, a_i, a_{i+1}) = (y, x, y)$ for some agent $i$. Regardless of the adversarial policy applied, there is a zero-resistance path out of $a$: specifically, $r(a \to a') = 0$, where $a'_i = y$ and $a'_{-i} = a_{-i}$. The least resistant path from $a'$ to $a$ has resistance $1 - \alpha > 0$, possible if and only if $i \in S_x(a')$. Hence, $a$ is not a recurrent class and therefore is not stochastically stable.

Suppose $a$ has a $y$ segment $L_y$ with $|L_y| \le \lceil \frac{1+2\alpha}{1-\alpha} \rceil$. Let $a'$ be the similar profile with $a'_{L_y} = x_{L_y}$, and note that $a'$ is a recurrent class. When the aggressive policy applies an offensive strategy on $L_y$, the minimum resistance path starting from $a'$ and ending in $a$ is given by a border agent's $x \to y$ transition (having resistance $1 + 2\alpha$), followed by each subsequent neighbor's $x \to y$ transition (each having resistance $0$). The resistance of this path is $\rho_{a',a} = 1 + 2\alpha$.
The minimum resistance path starting from $a$ and ending in $a'$ consists of $|L_y| - 1$ transitions of resistance $1 - \alpha$. Hence, $\rho_{a,a'} = (1-\alpha)(|L_y| - 1) < (1-\alpha)\frac{1+2\alpha}{1-\alpha} = 1 + 2\alpha$. Let $T$ be the minimum resistance tree rooted at $a$, and note that the edge $(a', a)$ is necessarily part of $T$. Consider the tree $T'$ rooted at $a'$ obtained by replacing the edge $(a', a)$ in $T$ with $(a, a')$. Then $T'$ has lower stochastic potential than $T$, and therefore $a$ is not stochastically stable. □

Henceforth, we only consider target profiles $a$ with the properties given by Lemma 5.2. In the forthcoming analysis, we characterize the set of recurrent classes induced by the aggressive policy. We first define terminology to describe any profile $a'$ relative to $a \in \mathcal{A}$. Let $L_y$ be a $y$-segment contained in $a$. We say the segment $L_y$ is homogeneous in $a'$ if $a'_i = a'_j$ for all $i, j \in L_y$, and heterogeneous if it is not homogeneous. Similar terminology applies to the $x$-segments of $a$. We say the profile $a'$ is homogeneous if every $x$- and $y$-segment is homogeneous in $a'$, and heterogeneous otherwise.

We will denote particular portions of a homogeneous action profile $a'$ with brackets $|_x$ and $|_y$ that separate the segments based on the target profile $a$. For instance, $|X|_x X|_y$ refers to the actions of agents in two consecutive segments with all agents playing $x$ in the first as well as the second; the subscripts convey that agents play $x$ in the target profile $a$ in the first segment and $y$ in the second segment. We will often compare two homogeneous action profiles that differ only in one segment, and term the two profiles similar.

The following result characterizes the set of all recurrent classes induced by the aggressive policy. In particular, each recurrent class consists of a single homogeneous action profile.

Lemma 5.3.
The recurrent classes associated with the aggressive policy targeting $a$ are the homogeneous action profiles that do not contain an instance of $|X|_y Y|_x$.

Proof. We can disqualify any action profile $a^1$ having at least one heterogeneous segment $L$ from being a recurrent class. The reason is that an offensive strategy induces a zero-resistance path from $a^1$ to $a^2$, where $a^2_L$ is homogeneous, while the resistance of any path from $a^2$ back to $a^1$ is necessarily non-zero. Thus, recurrent classes must be homogeneous profiles. Observe that any homogeneous profile containing an instance of $|X|_y Y|_x$ has a zero-resistance path to a profile where that instance is replaced by $|X|_y X|_x$, because an aggressive policy applies an offensive $x$ strategy on the $Y|_x$ segment. Any homogeneous profile that does not contain an instance of $|X|_y Y|_x$ is necessarily composed of instances of $|X|_x Y|_y X|_x$, $|Y|_y X|_x Y|_y$, $|X|_x X|_y X|_x$, and $|Y|_y Y|_x Y|_y$. Under an aggressive policy, there are no zero-resistance paths out of any of these segment patterns. □

Let us denote by $\mathcal{A}_R \subset \mathcal{A}$ the set of recurrent classes. We can thus focus our attention on the homogeneous action profiles described in Lemma 5.3 as candidates for stochastically stable states, which include the target profile $a$ itself. For the next steps in the proof, we need the following calculations regarding minimum resistances between similar recurrent classes.

Lemma 5.4. Consider the following transitions between two similar recurrent classes $a^1, a^2 \in \mathcal{A}_R$. If the transition is
1) $|Y|_y Y|_x Y|_y \to |Y|_y X|_x Y|_y$, then $\rho_{a^1,a^2} = 1 - \alpha$;
2) $|X|_x X|_y X|_x \to |X|_x Y|_y X|_x$, then $\rho_{a^1,a^2} = 1 + 2\alpha$;
3) $|X|_x Y|_y X|_x \to |X|_x X|_y X|_x$, then $\rho_{a^1,a^2} > 1 + 2\alpha$;
4) $|X|_x \to |Y|_y$, then $\rho_{a^1,a^2} > 1 - \alpha$.

Proof.
1) When an aggressive policy implements an offensive $x$ strategy on the middle segment $L_x$, a path from $a^1$ to $a^2$ consists of $|L_x|$ unilateral switches from $y$ to $x$. The first switch requires resistance $\omega(y, 2, x) = 1 - \alpha$; the rest switch with resistance $\omega(y, 1, x) = 0$. The resistance of any other path is necessarily greater than $1 - \alpha$.
2) The least resistant path occurs when an offensive $y$ strategy is applied to the middle segment $L_y$, and the agents along the segment switch sequentially. This path requires one deviation of resistance $\omega(x, 2, y) = 1 + 2\alpha$, while the other $|L_y| - 1$ deviations have resistance $\omega(x, 1, y) = 0$.
3) A defensive $y$ strategy is applied to the middle segment $L_y$. The path $\zeta$ in which each agent sequentially deviates requires $|L_y| - 1$ deviations of type $\omega(y, 1, y) = 1 - \alpha$. By Lemma 5.2, $|L_y| > \lceil \frac{1+2\alpha}{1-\alpha} \rceil$, so $r(\zeta) > 1 + 2\alpha$. Any other path requires at least one deviation of an agent in the middle of the segment, which has resistance $\omega(y, 2, \emptyset) = 2$. Hence, $\rho_{a^1,a^2} > 1 + 2\alpha$.
4) A transition of this form either
• has at least one deviation of type $\omega(x, 2, \emptyset) = 2(1+\alpha) > 1 - \alpha$, e.g., an agent with two $y$ neighbors switches;
• has at least one deviation of type $\omega(x, 1, x) = 1 + 2\alpha > 1 - \alpha$, e.g., a defensive $x$ strategy is applied; or
• has only deviations of type $\omega(x, 1, \emptyset) = \alpha$, i.e., no defensive $x$ strategy is applied. This is the case if $\alpha \ge \frac{1}{2}$, in which case $\alpha > 1 - \alpha$. The other case is $\alpha < \frac{1}{2}$ and $|L_x| \ge 2 + \lceil \frac{1-\alpha}{\alpha} \rceil$; it then takes $|L_x| - 1$ transitions of type $\omega(x, 1, \emptyset) = \alpha$, for which $\alpha(|L_x| - 1) > 1 - \alpha$. □

Every $a' \in \mathcal{A}_R$ can be assigned a level of "disagreement" $d(a')$ corresponding to the number of homogeneous segments that differ relative to their counterparts in the target profile $a$:
$$d(a') := |\{ L_z : a'_{L_z} \ne a_{L_z},\ z \in \{x, y\} \}|. \quad (28)$$
The next result demonstrates that disagreement decreases along minimum resistance paths between recurrent classes.

Lemma 5.5. Consider the directed graph $\Sigma = (\mathcal{A}_R, E)$ in which the edges $E$ are formed by connecting recurrent classes through the minimum resistance edge leaving each class. Then $\Sigma$ is composed of a collection of disconnected subgraphs $\Sigma_u = (\mathcal{A}_u, E_u)$, each one corresponding to a particular recurrent class $u$. Each subgraph $\Sigma_u$ has the following properties:
• The class $u$ belongs to $\Sigma_u$, and for every node $v \in \mathcal{A}_u$, $v \ne u$, there is a unique path from $v$ to $u$.
• There exists a class $v \in \mathcal{A}_u$ such that $(u, v), (v, u) \in E_u$.
• $u = \arg\min_{v \in \mathcal{A}_u} d(v)$.
We refer to the class $u$ as the head of the subgraph $\Sigma_u$.

Proof. By Lemma 5.3, each recurrent class is a homogeneous action profile not containing an instance of $|X|_y Y|_x$. Let
$$\mathcal{A}'_R := \{ a' \in \mathcal{A}_R : a' \text{ has an instance of } |Y|_y Y|_x Y|_y \}. \quad (29)$$
Suppose $a^1 \in \mathcal{A}'_R$. By Lemma 5.4, the edge $(a^1, a^2) \in E$ has resistance $1 - \alpha$, where $a^2$ is similar to $a^1$ with an instance of $|Y|_y Y|_x Y|_y$ in $a^1$ replaced by $|Y|_y X|_x Y|_y$. Thus, any $(a^1, a^2) \in E$ with $a^1 \in \mathcal{A}'_R$ satisfies $d(a^2) = d(a^1) - 1$. We term this type of edge a "type 0" transition.

Now, consider any class $a^1 \in \mathcal{A}_R \setminus \mathcal{A}'_R$. The minimum resistance edge leaving $a^1$ to some $a^2 \in \mathcal{A}_R$ is one of the following types (with resistances due to Lemma 5.4):
1) an instance $|X|_x X|_y X|_x$ replaced by $|X|_x Y|_y X|_x$, with resistance $r_1(a^1) = 1 + 2\alpha$; then $d(a^2) = d(a^1) - 1$;
2) an instance $|X|_y X|_x Y|_y$ replaced by $|Y|_y Y|_x Y|_y$, with resistance $r_2(a^1) > 1 - \alpha$; then $d(a^2) = d(a^1)$;
3) an instance $|X|_x Y|_y X|_x$ replaced by $|X|_x X|_y X|_x$, with resistance $r_3(a^1) > 1 + 2\alpha$; then $d(a^2) = d(a^1) + 1$;
4) an instance $|Y|_y X|_x Y|_y$ replaced by $|Y|_y Y|_x Y|_y$, with resistance $r_4(a^1) > 1 - \alpha$.
Then $d(a^2) = d(a^1) + 1$.

For $a^1 = a$, the minimum resistance edge leaving can only be of type 3 or 4; thus, the minimum resistance path out of $a$ must increase disagreement. Subsequently, $(a, a^2) \in E$ and $(a^2, a) \in E$. To see this, note that the case of $(a, a^2)$ being type 4 follows from the previous arguments. For the case of $(a, a^2)$ being type 3, it necessarily holds that $r_3(a^2), r_4(a^2) > 1 + 2\alpha$; hence, the edge $(a^2, a)$ exists and is of type 1.

For classes $a^1 \ne a$, the minimum resistance edge leaving $a^1$ cannot be of type 3. If it is of type 1 or 2, the edge leads either to a class with lower disagreement or to a class in $\mathcal{A}'_R$, which in turn leads to a class with lower disagreement. If $a^1 \ne a$ has a type 4 edge, then $r_4(a^1) < 1 + 2\alpha$ and the edge leads to a class $a^2 \in \mathcal{A}'_R$ (with higher disagreement); subsequently, the minimum resistance edge from $a^2$ leads back to $a^1$ via a type 0 edge.

We can thus deduce, for each class $u \ne a$ that has a type 4 edge, that there is a corresponding subgraph $\Sigma_u \subset \Sigma$ with the following properties. All edges in $E_u$, except the edge leaving $u$, lead to a class that either has lower disagreement or has an edge connecting to another class with lower disagreement. Consequently, $u$ has the lowest disagreement of all classes in $\Sigma_u$, i.e., $u = \arg\min_{a' \in \mathcal{A}_u} d(a')$. □

An illustration of a subgraph $\Sigma_u$ is shown in Figure 4 (left). Note that the rooted tree on a subgraph $\Sigma_u$ with minimal stochastic potential is given by $\Sigma_u$ without the edge $(u, v)$. The next result asserts that the minimal resistance edge leaving a subgraph leads to another subgraph whose head node has lower disagreement.

Lemma 5.6. Consider the subgraph $\Sigma_u = (\mathcal{A}_u, E_u)$, where $u \ne a$. Then there is an edge $(u, v)$ starting at the head $u$ and leading to a class $v$ of another subgraph $\Sigma_{u'}$ satisfying
$$(u, v) \in \arg\min_{a^1 \in \mathcal{A}_u,\, a^2 \notin \mathcal{A}_u} \rho_{a^1, a^2}. \quad (30)$$
Furthermore, $d(u') < d(u)$.

Proof.
We first observe that any edge $e_u \in \arg\min_{a^1 \in \mathcal{A}_u, a^2 \notin \mathcal{A}_u} \rho_{a^1,a^2}$ must be of type 1 or 2. This is because the type 1 edge out of $u$ has resistance $1 + 2\alpha$, so $e_u$ must incur a resistance no greater than $1 + 2\alpha$; this disqualifies $e_u$ from being of type 3. Type 4 is also disqualified, because the only occurrence of a type 4 edge is between $u$ and $v$ for some $v \in \mathcal{A}_u$. Hence, if $e_u$ is of type 1, a minimal resistance path leaving $\Sigma_u$ leaves from the head $u$.

Fig. 4: (Left) The graph formed by connecting recurrent classes through the minimum resistance edge leaving each class is composed of disconnected subgraphs $\Sigma_u$ with the structure illustrated above (Lemma 5.5). All paths in $\Sigma_u$ lead to a "head node" $u$ that has minimal disagreement among all nodes; disagreement decreases along any path. (Right) The resistance tree rooted at $a$ of minimum stochastic potential. A minimum resistance edge leaving each subgraph exits via the head node (Lemma 5.6).

The other possibility is that $e_u$ is of type 2. The head node $u$ has the most instances of $|Y|_y$ among all nodes in $\mathcal{A}_u$. This follows from the fact that all edges in $E_u$, except the one leaving $u$, are of type 0, 1, or 2. Consequently, $u$ has the most instances of $|X|_y X|_x Y|_y$, from which a type 2 transition leaves $\Sigma_u$. Moreover, any such instance appearing in any other class in $\mathcal{A}_u$ also appears in $u$. Therefore, if $e_u$ is of type 2, a minimal resistance path leaving $\Sigma_u$ leaves from the head $u$.

Whether $e_u$ is of type 1 or 2, it leads to a class of $\Sigma_{u'}$ that either has lower disagreement or has an edge leading to another class with lower disagreement. Therefore, $d(u') < d(u)$. □

These results give us enough structure on the resistance trees to deduce that $a$ is the unique stochastically stable state.

Proposition 5.1. The profile $a$ is the unique stochastically stable state under an aggressive policy targeting $a$.

Proof.
We need to show that the rooted tree with minimal stochastic potential is rooted at $a$ (Lemma 5.1). Consider each subgraph $\Sigma_u$ detailed in Lemma 5.5. By removing the edge $(u, v)$, we end up with a tree $T_u$, restricted over $\mathcal{A}_u$, with minimal stochastic potential that is rooted at $u$. Let $T$ be the rooted tree over all recurrent classes $\mathcal{A}_R$ constructed by connecting the trees $T_u$ through the minimum resistance edges leaving each head node $u \neq a$ (Lemma 5.6). Since each of these edges connects to a different rooted tree with strictly lower disagreement, $T$ must be rooted at $a$, the class with minimal disagreement. The tree $T$ has the minimum stochastic potential among all possible rooted trees because each sub-tree is of minimal potential, and the edges connecting sub-trees are the minimal resistance edges leaving each sub-tree. □

The structure of the minimum potential rooted tree $T$ is illustrated in Figure 4 (right). Proposition 5.1 is a sufficiency result: the aggressive policy is a dynamic policy that stabilizes the profile $a$. Our next result asserts necessity: the number of adversarial nodes employed by the aggressive policy is the minimum budget required to stabilize $a$.

Proposition 5.2. A policy using fewer adversarial nodes than the aggressive policy targeting $a \in \mathcal{A}$ cannot stabilize $a$.

Proof. Consider a policy $\pi'$ that uses fewer adversaries than the aggressive policy $\pi$. Then there exists either an $x$- or $y$-segment of $a$ on which a defensive strategy is implemented under $\pi$ but not under $\pi'$. Suppose it is an $x$-segment, and suppose $a'$ is a profile similar to $a$ in which an instance of $|X|_x$ is replaced by $|Y|_x$, and on which $\pi$ applies a defensive $x$ strategy. This is the case only if $\alpha < \frac{1}{2}$ and $|L_x| \leq 1 + \lfloor \frac{1-\alpha}{\alpha} \rfloor$. Under $\pi'$, there is a path from $a$ to $a'$ using $|L_x| - 1$ deviations of type $\omega(x, 1, 0) = \alpha$. The resistance of this path is less than $1 - \alpha$.
A minimum resistance path from $a'$ back to $a$ consists of at least one transition of type $\omega(y, 2, x) = 1 - \alpha$ or $\omega(y, 2, \emptyset) = 2$. Hence, $r(a', a) \geq 1 - \alpha$. Now suppose a defensive strategy fails on a $y$-segment. Suppose $a'$ is similar to $a$ in which an instance of $|Y|_y$ is replaced with $|X|_y$. A minimal path has zero resistance, since it uses only deviations of type $\omega(y, 1, 0) = 0$. In a similar manner, any path from $a'$ back to $a$ has resistance $r(a', a) \geq 1 + 2\alpha > 0$. In both cases, the resistance to return to $a$ from $a'$ is greater than that of going from $a$ to $a'$. Therefore, the target profile $a$ cannot be stochastically stable under $\pi'$. □

We are now in a position to prove Theorem 3.2.

Proof of Theorem 3.2. We have just established that an aggressive policy targeting $a$ is the dynamic policy that stabilizes $a$ with the fewest number of adversaries. By Lemma 5.2, a target profile must have $x$-segments of length 2 or greater and $y$-segments of length $\lceil \frac{2+\alpha}{1-\alpha} \rceil$ or greater. To implement the aggressive policy, two adversaries are needed for each $y$-segment of any length, and two adversaries for each $x$-segment of length no greater than $1 + \lfloor \frac{1-\alpha}{\alpha} \rfloor$ when $\alpha < \frac{1}{2}$. When $\alpha \geq \frac{1}{2}$, no adversaries are needed on any $x$-segment. Propositions 5.1 and 5.2 assert that this number of adversarial nodes is necessary and sufficient to stabilize the target profile. By Step (2B) from Theorem 3.1, the minimum efficiency target profile has at most two segment patterns. We thus obtain an integer optimization problem similar to (SI-OPT), except with the above constraints taken into account instead. □

C. Proof of Theorem 2.2: Dynamic uninformed adversary

In this subsection, we derive the minimum efficiency a dynamic uninformed adversary can induce on $k$-connected ring graphs.
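Before turning to that derivation, the segment-length and budget accounting from the proof of Theorem 3.2 above can be sketched numerically. The helpers below are our own illustration (not code from the paper); the function names are hypothetical.

```python
import math

def min_segment_lengths(alpha: float):
    """Minimum stabilizable segment lengths (Lemma 5.2): x-segments need
    length >= 2, y-segments need length >= ceil((2 + alpha)/(1 - alpha))."""
    ell_star = math.ceil((2 + alpha) / (1 - alpha))
    return 2, ell_star

def aggressive_policy_budget(alpha: float, x_lengths, y_lengths) -> int:
    """Adversaries used by the aggressive policy (proof of Theorem 3.2):
    two per y-segment of any length; two per x-segment of length at most
    1 + floor((1 - alpha)/alpha) when alpha < 1/2, and none when alpha >= 1/2."""
    budget = 2 * len(y_lengths)
    if alpha < 0.5:
        threshold = 1 + math.floor((1 - alpha) / alpha)
        budget += 2 * sum(1 for L in x_lengths if L <= threshold)
    return budget

# Example: alpha = 0.3 forces y-segments of length at least 4,
# and a minimal pattern (one x-segment of length 2, one y-segment
# of length 4) costs the aggressive policy 4 adversaries.
min_x, min_y = min_segment_lengths(0.3)
budget = aggressive_policy_budget(0.3, x_lengths=[2], y_lengths=[min_y])
```

When $\alpha \geq 1/2$ the $x$-segment term drops out, so the same pattern costs only the two $y$ adversaries.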
Similar to the static uninformed case (Theorem 2.1), the dynamic uninformed adversary can effectively only select how many $x$ and $y$ adversaries to allocate at each time step, and cannot place them in a strategic manner. The major difference here is that it can also randomize among influence sets by selecting different distributions at each time step. The idea of the proof is that, by being able to probabilistically attach an adversary to each agent in the network, the all-$y$ profile can be stabilized independently of the budget $\gamma$. Let us define $\Pi^* \subset \Pi_D$ as the set of dynamic uninformed policies with the following properties. Suppose $\pi \in \Pi^*$. Then

(a) $|S_x(t)| = 0$ for all $t = 0, 1, 2, \ldots$;
(b) for any $i \in N$ and $t = 0, 1, 2, \ldots$, there exists a $0 \leq \tau < \infty$ such that $\Pr(i \in S_y(t + \tau)) > 0$;
(c) there exists a subset $T \subset N$ of agents satisfying $\frac{|T|}{n} = \gamma' < \gamma$ for all $n$, such that $\Pr(i \in S_y(t)) = 1$ for all $i \in T$ and $t = 0, 1, 2, \ldots$.

Property (a) asserts that $x$ adversaries are never utilized. Property (b) ensures that any given agent is influenced by a $y$ adversary infinitely often. Property (c) says the adversary deterministically influences a fixed fraction of agents in the network. In a sense, policies belonging to $\Pi^*$ are "partially static".

Lemma 5.7. Under a policy $\pi \in \Pi^*$, the all-$y$ and all-$x$ profiles are the only two recurrent classes.

Proof. Under a policy $\pi \in \Pi^*$, a transition out of either $\vec{y}$ or $\vec{x}$ has a non-zero resistance. Hence, we need to show there is a zero-resistance transition out of any profile $a \notin \{\vec{y}, \vec{x}\}$. First, assume $\alpha < 1/k$ (similar arguments apply when $\alpha > 1/k$). Profiles can be split into three categories.

1) There is a $y$-segment of length $\geq k$. In this case, there exists a zero-resistance transition. Consider an $x$ agent that neighbors the $y$-segment of length $\geq k$. This agent could have up to $k$ neighbors also playing $x$. There is non-zero probability that it becomes influenced by a $y$ adversary.
It switches to $y$ with resistance $\omega(x, k, y) = [k + 1 - k(1+\alpha)]_+$, which is zero if $\alpha < 1/k$.

2) There is no $y$-segment of length $\geq k$, but there is an $x$-segment of length $\geq k$. Consider a $y$ agent that neighbors such an $x$-segment. Its payoff from playing $x$ is at least $k(1+\alpha)$, and its payoff from playing $y$ is at most $k - 1$. Hence, it transitions to $x$ with zero resistance.

3) There are no $x$- or $y$-segments of length $\geq k$. Then there always exists some agent that switches to $x$ or $y$ with zero resistance. In other words, there exists some $y$ (or $x$) agent that will attain more payoff by switching to $x$ (resp. $y$). □

To determine whether $\vec{x}$ or $\vec{y}$ is stochastically stable, we now calculate the minimum resistance path between $\vec{x}$ and $\vec{y}$ ($\rho_{\vec{x}\vec{y}}$). Suppose we are in $\vec{x}$. Consider a path in which $k$ consecutive agents switch to $y$, and each agent that unilaterally switches is influenced by a $y$-adversary. This occurs with non-zero probability due to property (b). Each unilateral switch in this sequence has a non-zero resistance. However, after this sequence of $k$ switches, every subsequent switch of neighboring agents has zero resistance. The total resistance (as long as $\alpha < \frac{1}{k}$) is therefore
$$\rho_{\vec{x}\vec{y}} = \rho^*_{\vec{x}\vec{y}} := \sum_{i=1}^{k} \left[ (1+\alpha)(2k - (i-1)) - i \right].$$
We can similarly calculate the minimum resistance path between $\vec{y}$ and $\vec{x}$ ($\rho_{\vec{y}\vec{x}}$). Starting from $\vec{y}$, we consider a path in which $k$ consecutive agents switch to $x$, and none of these agents are influenced by a $y$-adversary (this occurs with non-zero probability). After this sequence of $k$ switches, every subsequent switch of neighboring agents has zero resistance. Once this growing $x$-segment neighbors a member of $T$, the node belonging to $T$ will also switch with zero resistance as long as $\alpha < \frac{1}{k}$. The total resistance is therefore
$$\rho_{\vec{y}\vec{x}} = \rho^*_{\vec{y}\vec{x}} := \sum_{i=1}^{k} \left[ (2k - (i-1)) - (i-1)(1+\alpha) \right]_+.$$
For $\vec{y}$ to be stochastically stable, the condition $\rho_{\vec{x}\vec{y}} < \rho_{\vec{y}\vec{x}}$ must hold.
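The two path resistances $\rho^*_{\vec{x}\vec{y}}$ and $\rho^*_{\vec{y}\vec{x}}$ can be evaluated directly. The sketch below is our own illustration of the two sums (function names are hypothetical):

```python
def rho_x_to_y(alpha: float, k: int) -> float:
    """Resistance of the described path from all-x to all-y:
    sum_{i=1}^{k} [(1 + alpha)(2k - (i - 1)) - i]."""
    return sum((1 + alpha) * (2 * k - (i - 1)) - i for i in range(1, k + 1))

def rho_y_to_x(alpha: float, k: int) -> float:
    """Resistance of the described path from all-y to all-x:
    sum_{i=1}^{k} [(2k - (i - 1)) - (i - 1)(1 + alpha)]_+ (clipped at 0)."""
    return sum(max(0.0, (2 * k - (i - 1)) - (i - 1) * (1 + alpha))
               for i in range(1, k + 1))
```

For example, with $k = 2$ and $\alpha = 0.1$, the sums evaluate to $4.7$ and $5.9$, so $\rho_{\vec{x}\vec{y}} < \rho_{\vec{y}\vec{x}}$ and the all-$y$ profile is favored; with $\alpha = 0.6$ the comparison reverses.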
This yields $\alpha < \frac{|T|}{k(|T| + k)}$. As the number of nodes tends to infinity, so does $|T|$, and we are left with the condition $\alpha < \frac{1}{k}$. This yields an efficiency of $\frac{1}{1+\alpha}$ regardless of the fractional budget $\gamma$.

We now proceed to show that any $\pi \in \Pi_{DU} \setminus \Pi^*$ cannot induce an efficiency less than $\frac{1}{1+\alpha}$.

– Suppose $\pi$ satisfies properties (a) and (c), but not (b). Then there is a set $B \subset N$ of agents that are never influenced by $y$ adversaries: $\Pr(i \in S_y(t)) = 0$ for all $i \in B$ and $t = 0, 1, 2, \ldots$. Suppose $|B| \geq k$. Suppose $B$ is one contiguous segment, and $T$ another. The collection of profiles $E \subset \mathcal{A}$ that satisfy $a_B = x$ and $a_T = y$ constitutes a recurrent class. The profiles $\vec{x}$ and $\vec{y}$ are the other two recurrent classes. Similar resistance arguments show that $E$ has minimal stochastic potential, and hence $\mathrm{LLL}(G, \alpha, \pi) = E$. The profile $a = (y_T, x_{-T})$ achieves the maximal efficiency, which yields an efficiency $> \frac{1}{1+\alpha}$ for $n$ sufficiently large. For $|B| < k$, $E$ is not a recurrent class. The same analysis as for $\pi \in \Pi^*$ applies in this case, yielding an efficiency of $\frac{1}{1+\alpha}$.

– If $\pi$ satisfies (a) and (b) but not (c), there are no static adversarial nodes, and there are no recurrent classes other than $\vec{y}$ and $\vec{x}$. The analysis is similar to the case in which $\pi$ did satisfy (c), where we obtain $\rho^*_{\vec{x}\vec{y}} < \rho^*_{\vec{y}\vec{x}}$ if $\alpha < 1/k$, and the opposite if $\alpha > 1/k$. Regardless, this yields an efficiency $\geq \frac{1}{1+\alpha}$.

– If $\pi$ satisfies only (a), $\vec{x}$ and $\vec{y}$ are the only recurrent classes. Regardless of which is stable, the efficiency is $\geq \frac{1}{1+\alpha}$.

– Any policy that does not satisfy property (a) induces an efficiency at least as high as that of a corresponding policy that does. The addition of $x$ adversaries only lessens the resistance of transitions from $y$ to $x$. Also, if there is a "static" $x$ adversary, property (c) allows a single segment to be stabilized to $x$.
For any number of $x$ adversaries that are "random" in the sense of property (b), the adversary cannot stabilize alternating $x$- and $y$-segments in a repeating fashion. Therefore, no induced efficiency for large $n$ can be less than $\frac{1}{1+\alpha}$. □

VI. SIMULATIONS

In this section, we provide numerical simulations of log-linear learning dynamics for two of the four adversarial models: static and dynamic informed (SI and DI). We verify the tightness of the lower bounds given in Theorems 3.1 and 3.2. We simulate the dynamics on finite ring graphs and compute the average efficiency the network experiences when the adversary implements an optimal policy. Our results are given in Table I. We observe that the average efficiency approaches the fundamental lower bounds as the size of the graph increases.

The experiments are set up as follows. For SI, we first compute the minimum efficiency SSS given $\alpha$, budget $\lfloor \gamma \cdot n \rfloor$, and network size $n$. We then determine an adversarial set $S_{xy}$ that is necessary and sufficient to stabilize the SSS (according to the proof of Theorem 3.1). We then run the log-linear learning dynamics initialized at the SSS and with the static $S_{xy}$. We perform a similar experiment for DI, except that the aggressive policy (Definition 3) is implemented during the dynamics.

For the static uninformed model, we arrange a static adversarial set according to the proof of Theorem 2.1 if $\gamma \geq \alpha$. That is, it is the arrangement of solely $y$ adversaries with the given budget that induces the least amount of damage. If $\gamma < \alpha$, the $y$ adversaries are evenly spaced throughout the ring. We initialized each repetition at the stochastically stable state. For the dynamic uninformed model, we arranged the adversary set according to the proof of Theorem 2.2. That is, it uses exclusively $y$ adversaries and randomly allocates them. We initialized each repetition with a random initial action profile.

Static Informed Adversary
  n                        α = 0.3   α = 0.5   α = 0.7
  10                       0.6846    0.6667    1
  20                       0.6730    0.6500    0.6529
  30                       0.6692    0.6444    0.6588
  Fundamental lower bound  0.6615    0.6400    0.6470

Dynamic Informed Adversary
  n                        α = 0.3   α = 0.5   α = 0.7
  10                       0.6384    0.5666    0.5882
  20                       0.5730    0.5666    0.5500
  30                       0.5948    0.5666    0.5373
  Fundamental lower bound  0.5692    0.5416    0.5245

Static Uninformed Adversary
  n                        α = 0.3   α = 0.5   α = 0.7
  10                       0.7751    0.6744    1
  20                       0.7993    0.8121    1
  30                       0.8371    0.8503    1
  Fundamental lower bound  0.6615    0.6400    0.6470

Dynamic Uninformed Adversary
  n                        α = 0.3   α = 0.5   α = 0.7
  10                       0.7999    0.7222    0.6294
  20                       0.7769    0.6777    0.5882
  30                       0.7692    0.6777    0.5883
  Fundamental lower bound  0.7692    0.6666    0.5882

TABLE I: Simulation results of log-linear learning for the four adversarial models. Each entry represents the averaged efficiency over 30 repetitions, where each repetition consists of $10^6$ time steps. We fix the learning parameter $\beta = 25$.

VII. CONCLUSION

This paper investigated the susceptibility of distributed game-theoretic learning algorithms to adversarial influences. We considered a scenario of an adversary intent on maximally degrading a network system's performance guarantees associated with distributed learning algorithms. We asked: 1) How susceptible are these algorithms to adversarial interference? In particular, this paper focused on one such algorithm, log-linear learning, which possesses desirable properties in non-adversarial settings. 2) How do an adversary's sophistication and system-level knowledge impact the degradation that the adversary can inflict on the system? We studied both of these questions in the context of graphical coordination games.

In particular, we considered two levels of adversarial sophistication – static and dynamic policies – and two levels of information, informed and agnostic about network structure. In a static policy, the adversary cannot change its influence over time.
The dynamic policies we considered are stationary, in which the adversary can respond to the current system state as it evolves. While both types of policies induce asymptotic outcomes characterized by stochastically stable states, it is of interest in future work to consider non-stationary adversarial policies. That is, how can adversaries exploit dynamic policies that may not induce a stable asymptotic outcome?

An important insight gleaned from the analysis is that an adversary with a low resource budget – described by the fraction of agents in the network it can influence – does not benefit as much from system-level information as it would from the ability to employ dynamic strategies. On the other hand, when the adversary's budget is high, the opposite conclusion holds. While the results in this paper are adversarial-centric, these findings provide insight into what actions a system operator could take to best protect system behavior. For instance, our analysis can inform decisions of whether to obfuscate system-level information from potential adversarial actors, or to disable the capabilities of a highly sophisticated attacker.

APPENDIX

Here, we provide the proofs of Theorems 3.3 and 3.4.

A. Proof of Theorem 3.4

The performance saturates when the minimum length requirements on $x$- and $y$-segments can be stabilized. For the static informed adversary, the minimum required $y$ length is $\ell^* = \min\{\ell : \ell = \lceil \alpha(\ell+1) \rceil + 2\} = \lceil \frac{2+\alpha}{1-\alpha} \rceil$, and the minimum required $x$ length is 2. These lengths yield the minimum efficiency (12). One needs $\ell^*$ $y$-adversaries to stabilize the $y$-segment (14), and $\lceil \frac{2-\alpha}{1+\alpha} \rceil$ $x$-adversaries to stabilize the $x$-segment (15). Hence, the static informed adversary can achieve this pattern if and only if $\gamma \geq \frac{\ell^* + \lceil \frac{2-\alpha}{1+\alpha} \rceil}{\ell^* + 2}$.

For dynamic informed policies, Lemma 5.2 says the minimum required $x$- and $y$-segment lengths are the same as for static informed adversaries.
Hence, the saturation value is the same (12). However, the aggressive policy only needs two $y$ adversaries for each $y$-segment. It needs two $x$ adversaries for each $x$-segment if $\alpha < 1/2$, and none if $\alpha \geq 1/2$. Hence, the performance saturates when $\gamma \geq \frac{2 + 2 \cdot \mathbb{1}(\alpha < \frac{1}{2})}{\ell^* + 2}$.

B. Proof of Theorem 3.3

Our approach to proving this result is to analytically characterize $\inf_{G \in \mathcal{G}_1, \pi \in \Pi_I(G,\gamma)} \eta(G, \alpha, \pi)$, which was expressed as an integer optimization problem in Theorem 3.1. Indeed, we will demonstrate that this value satisfies the following relations:
$$\inf_{G \in \mathcal{G}_1, \pi \in \Pi_I(G,\gamma)} \eta(G, \alpha, \pi) \begin{cases} = 1 - \frac{\gamma}{1+\alpha}, & \text{if } \gamma \leq \alpha, \\ \leq \frac{(\gamma - \alpha)(\ell^* + \alpha)}{(1+\alpha)(1-\alpha)(\ell^* + 2)} + \frac{1 - \gamma}{(1+\alpha)(1-\alpha)}, & \text{if } \gamma > \alpha. \end{cases} \qquad (31)$$
Comparing this value to $\frac{1}{1+\alpha}$, the minimum efficiency a dynamic uninformed adversary can induce on ring graphs, we obtain the result. In particular, it is larger when $\gamma \leq \alpha$ and smaller when $\gamma > \alpha$.

We first consider $\gamma \leq \alpha$. From Theorem 3.1, a profile of minimal efficiency consists of at most two unique segment patterns. We will show that for any profile $a$ with one segment pattern, $\eta(a) \geq 1 - \frac{\gamma}{1+\alpha}$. Note the efficiency of $a$ can be characterized on a ring graph with just a single repetition of this segment pattern, i.e., there is only one $y$- and one $x$-segment. To stabilize the $y$-segment $L_y$, $\lceil \alpha(|L_y|+1) \rceil + 2$ adversarial nodes are needed. This number already exceeds the adversary's fractional budget for this segment. That is,
$$\frac{\lceil \alpha(|L_y|+1) \rceil + 2}{|L_y|} \geq \frac{\alpha(|L_y|+1) + 2}{|L_y|} > \alpha \geq \gamma. \qquad (32)$$
The length $|L_x|$ of the $x$-segment must balance out the fractional budget constraint. With $n = |L_y| + |L_x|$, this means
$$\frac{\lceil \alpha(|L_y|+1) \rceil + 2 + \left[ \frac{2 - \alpha(|L_x|-1)}{1+\alpha} \right]_+}{n} \leq \gamma \qquad (33)$$
must be satisfied, from which we obtain $n \geq \frac{\alpha(|L_y|+1) + 2}{\gamma}$.
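The budget accounting above can be checked numerically. This sketch is our own illustration (the function names are hypothetical): it computes the adversary count for a $y$-segment and the smallest ring size $n$ compatible with the derived bound, which drops the nonnegative $x$-segment term from the constraint.

```python
import math

def y_adversaries(alpha: float, Ly: int) -> int:
    """Adversaries needed to stabilize a y-segment of length Ly:
    ceil(alpha * (Ly + 1)) + 2."""
    return math.ceil(alpha * (Ly + 1)) + 2

def min_ring_size(alpha: float, gamma: float, Ly: int) -> int:
    """Smallest integer n satisfying n >= (alpha * (Ly + 1) + 2) / gamma,
    the relaxation of the fractional budget constraint obtained by
    ignoring the nonnegative x-segment term."""
    return math.ceil((alpha * (Ly + 1) + 2) / gamma)
```

For instance, with $\alpha = 0.3$ and $|L_y| = 4$, the $y$-segment costs $\lceil 1.5 \rceil + 2 = 4$ adversaries, and a budget $\gamma = 0.2$ forces $n \geq 18$.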
The efficiency of $a$ can be written as
$$\eta(a) = \frac{(|L_y| - 1) + (1+\alpha)(n - |L_y| - 1)}{(1+\alpha)n} = 1 - \frac{\alpha(|L_y|+1) + 2}{(1+\alpha)n} \geq 1 - \frac{\gamma}{1+\alpha}. \qquad (34)$$
To show that $1 - \frac{\gamma}{1+\alpha}$ is the infimum of lower bounds on efficiency over all graphs (i.e., that it is tight), consider the limit of large ring graphs, i.e., $n \to \infty$. Suppose the adversary stabilizes one $y$-segment and one $x$-segment, and the length of the $y$-segment is a fraction $f$ of the entire network. As $n \to \infty$, the adversary needs a fractional budget $\alpha f$ to stabilize this segment, and no adversaries at all to stabilize the rest of the network to $x$. If it uses its entire budget to stabilize the $y$-segment, $\gamma = \alpha f$, it induces the efficiency
$$\lim_{n \to \infty} \frac{(|L_y| - 1) + (1+\alpha)(n - |L_y| - 1)}{(1+\alpha)n} = 1 - \frac{\gamma}{1+\alpha}. \qquad (35)$$
Through similar arguments, one can show that $\eta(a) \geq 1 - \frac{\gamma}{1+\alpha}$ holds for any profile $a$ that contains two segment patterns. We omit these details for brevity.

Now consider $\gamma > \alpha$. We will characterize a particular action profile the adversary can stabilize, whose efficiency, given by the second entry of (31), serves as an upper bound on $\inf_{G \in \mathcal{G}_1, \pi \in \Pi_I(G,\gamma)} \eta(G, \alpha, \pi)$. Indeed, consider an action profile where a fraction $f$ of the ring consists only of repeated segment patterns of the type described in the proof of Theorem 3.4. That is, the pattern consists of the minimum required lengths on $x$- and $y$-segments. On the other hand, suppose the remaining fraction $1 - f$ of the ring is a single $y$-segment. For sufficiently large graphs, a fractional budget $f$ is needed to stabilize the first portion, and another fractional budget $(1-f)\alpha$ to stabilize the second. Hence, given a total fractional budget $\gamma$, $f = \frac{\gamma - \alpha}{1 - \alpha} > 0$. The fraction $f$ ranges from 0 (when $\gamma = \alpha$) to 1 (when $\gamma = 1$). The efficiency of this action profile is
$$\frac{(\gamma - \alpha)(\ell^* + \alpha)}{(1+\alpha)(1-\alpha)(\ell^* + 2)} + \frac{1 - \gamma}{(1+\alpha)(1-\alpha)}. \qquad (36)$$
□
