On the Sample Complexity of Learning Graphical Games


Authors: Jean Honorio

Jean Honorio
Computer Science, Purdue University
West Lafayette, IN 47907, USA
jhonorio@purdue.edu

Abstract

We analyze the sample complexity of learning graphical games from purely behavioral data. We assume that we can only observe the players' joint actions and not their payoffs. We analyze the sufficient and necessary number of samples for the correct recovery of the set of pure-strategy Nash equilibria (PSNE) of the true game. Our analysis focuses on directed graphs with $n$ nodes and at most $k$ parents per node. Sparse graphs correspond to $k \in O(1)$ with respect to $n$, while dense graphs correspond to $k \in O(n)$. By using VC dimension arguments, we show that if the number of samples is greater than $O(kn \log^2 n)$ for sparse graphs or $O(n^2 \log n)$ for dense graphs, then maximum likelihood estimation correctly recovers the PSNE with high probability. By using information-theoretic arguments, we show that if the number of samples is less than $\Omega(kn \log^2 n)$ for sparse graphs or $\Omega(n^2 \log n)$ for dense graphs, then any conceivable method fails to recover the PSNE with arbitrary probability.

1 Introduction

Non-cooperative game theory has been considered the appropriate mathematical framework in which to formally study strategic behavior in multi-agent scenarios. The core solution concept of Nash equilibrium (NE) [15] serves a descriptive role for the stable outcome of the overall behavior of self-interested agents (e.g., people, companies, governments, groups or autonomous systems) interacting strategically with each other in distributed settings.

Algorithmic Game Theory and Applications.
There has been considerable progress on computing classical equilibrium solution concepts such as NE and correlated equilibria [1] in graphical games (see, e.g., [3, 12, 13, 14, 17, 18, 20] and the references therein), as well as on computing the price of anarchy in graphical games (see, e.g., [2]). Indeed, graphical games played a prominent role in establishing the computational complexity of computing NE in general normal-form games (see, e.g., [6] and the references therein).

In political science, for instance, the work of [10] identified the most influential senators in the U.S. Congress (i.e., a small set of senators whose collective behavior forces every other senator to a unique choice of vote). The most influential senators were intriguingly similar to the gang-of-six senators formed during the national debt-ceiling negotiation in 2011. Additionally, it was observed in [9] that the influence from Obama to Republicans increased in the last sessions before his candidacy, while McCain's influence on Republicans decreased.

Learning Graphical Games. The problems in algorithmic game theory described above (i.e., computing Nash equilibria, computing the price of anarchy, or finding the most influential agents) require a known graphical game, which is unobserved in the real world. To overcome this issue, learning binary-action graphical games from behavioral data was proposed in [9], using maximum likelihood estimation (MLE). We also note that [9, 10] have shown the successful use of graphical games in real-world settings, such as the analysis of U.S. congressional voting records as well as the U.S. Supreme Court. More recently, the work of [8] provides a statistically and computationally efficient method for learning binary-action sparse games.

Contributions.
In this paper, we study the statistical aspects of the problem of learning graphical games with general discrete actions from strictly behavioral data. As in [8, 9], we assume that we can only observe the players' joint actions and not their payoffs. The class of models considered here are polymatrix graphical games [11, 14]. We study the sufficient and necessary number of samples for the correct recovery of the pure-strategy Nash equilibria (PSNE) set of the true game, for directed graphs with $n$ nodes and at most $k$ parents per node. Theorem 3 shows that the sufficient number of samples for MLE is $O(kn \log^2 n)$ for sparse graphs, and $O(n^2 \log n)$ for dense graphs. Theorem 4 shows that the necessary number of samples for any conceivable method is $\Omega(kn \log^2 n)$ for sparse graphs, and $\Omega(n^2 \log n)$ for dense graphs. Thus, MLE is statistically optimal.

Discussion. While sparsity-promoting methods were used in prior work [8, 9] for binary actions, the benefit of sparsity for learning games with general discrete actions has not been theoretically analyzed before. In this paper, we focus on the statistical analysis of exact MLE.¹ Prior work on MLE estimation [9] has not focused on correct PSNE recovery, but on generalization bounds. More formally, Corollary 15 in [9] shows that for dense graphs with $n$ nodes and binary actions, $O(n^3)$ samples are sufficient for the empirical MLE minimizer to be close to the best achievable expected log-likelihood. As a byproduct of our PSNE recovery analysis, Lemma 2 shows that for dense graphs and general discrete actions, only $O(n^2 \log n)$ samples are sufficient for obtaining a good expected log-likelihood.

[Footnote 1: We leave the analysis of computationally efficient methods for future work. To put this in context, note that theoretical analysis for learning Bayesian networks has focused exclusively on exact MLE [4].]
Regarding PSNE recovery, the results of [8] provide an $O(k^4 \log n)$ sample complexity for learning binary-action sparse games. The above results pertain to a specific class of payoff functions with a particular parametric representation, which allows for a logistic-regression approach. The results in [8] also assume strict positivity of the payoffs in the PSNE set. Thus, it is unclear how these results can be extended to general discrete actions.

2 Graphical Games

In classical game theory (see, e.g., [7] for a textbook introduction), a normal-form game is defined by a set of $n$ players $V = \{1, \dots, n\}$ and, for each player $i$, a set of actions, or pure strategies, $A_i$, and a payoff function $u_i : A \to \mathbb{R}$, where $A$ is the Cartesian product $A \equiv \prod_{j \in V} A_j$. The payoff functions $u_i$ map the joint actions of all the players to a real number. In non-cooperative game theory we assume players are greedy, rational, and act independently, by which we mean that each player $i$ always wants to maximize their own utility, subject to the actions selected by others, irrespective of how the chosen optimal action helps or hurts others.

A core solution concept in non-cooperative game theory is that of a Nash equilibrium. A joint action $x \in A$ is a pure-strategy Nash equilibrium (PSNE) of a non-cooperative game if, for each player $i$, $x_i \in \arg\max_{a \in A_i} u_i(a, x_{-i})$. That is, $x$ constitutes a mutual best response: no player $i$ has any incentive to unilaterally deviate from the prescribed action $x_i$, given the joint action of the other players $x_{-i} \in \prod_{j \in V - \{i\}} A_j$ in the equilibrium. For normal-form games, we denote a game by $\mathcal{G} = \{u_i : A \to \mathbb{R}\}_{i \in V}$, and the set of all pure-strategy Nash equilibria of $\mathcal{G}$ by

$$NE(\mathcal{G}) \equiv \{x \mid (\forall i \in V, a \in A_i)\ u_i(x_i, x_{-i}) \ge u_i(a, x_{-i})\}. \quad (1)$$
A (directed) graphical game is a game-theoretic graphical model [14]. It provides a succinct representation of normal-form games. In a graphical game, we have a (directed) graph $G = (V, E)$ in which each node in $V$ corresponds to a player in the game. The interpretation of the edges/arcs $E$ of $G$ is that the payoff function of player $i$ is only a function of his own action and the actions of the set of parents/neighbors $N(i) \equiv \{j \mid (i, j) \in E\}$ in $G$ (i.e., the set of players corresponding to nodes that point to the node corresponding to player $i$ in the graph). In this context, for each player $i$, we have a local payoff function $u_i : A_i \times \prod_{j \in N(i)} A_j \to \mathbb{R}$. A joint action $x \in A$ is a PSNE if, for each player $i$, $x_i \in \arg\max_{a \in A_i} u_i(a, x_{N(i)})$. For graphical games, we denote a game by $\mathcal{G} = \{u_i : A_i \times \prod_{j \in N(i)} A_j \to \mathbb{R}\}_{i \in V}$.

In this paper, we focus on polymatrix games [11]. Under this model, the local payoff functions $u_i : A_i \times \prod_{j \in N(i)} A_j \to \mathbb{R}$ have a succinct representation as a sum of a unary potential function $u_{ii} : A_i \to \mathbb{R}$ and several pairwise potential functions $u_{ij} : A_i \times A_j \to \mathbb{R}$, that is,

$$u_i(x_i, x_{N(i)}) = u_{ii}(x_i) + \sum_{j \in N(i)} u_{ij}(x_i, x_j). \quad (2)$$

For polymatrix graphical games, we denote a game by $\mathcal{G} = \{u_{ii} : A_i \to \mathbb{R},\ u_{ij} : A_i \times A_j \to \mathbb{R}\}_{i,j \in V}$. We assume that $A_i$ is a finite set such that $|A_i| \ge 2$ for all players $i$. Further, $|A_i| \in O(1)$ with respect to $n$ and $k$. The binary-action models considered in [8, 9, 10] are a restricted subclass of the models that we consider here. The results in [8, 9, 10] assume that $A_i = \{-1, +1\}$, $u_{ii}(x_i) = w_{ii} x_i$ and $u_{ij}(x_i, x_j) = w_{ij} x_i x_j$ for all $i, j$ and for a weight matrix $W \in \mathbb{R}^{n \times n}$.

Equivalence Classes.
Each PSNE set defines an equivalence class of games for which players have the same joint behavior. Thus, as argued further in Section 4 of [9] for binary-action games, it is not possible to recover the structure and payoff functions of the true game from observed joint actions. Instead, we can recover the PSNE set (or equivalence class) of the true game. Here, we study the sufficient and necessary number of samples for the correct recovery of the PSNE set of the true game.

Main Assumptions. Our assumptions are minimal:

• We do not assume the availability of any information regarding the structure or parameters of the true graphical game. The problem is precisely to infer that information.

• We do not assume the availability of data related to the temporal dynamics, i.e., each player's moves. Instead, we assume that we only observe steady-state joint actions, i.e., NE. Learning only from NEs is arguably more challenging than learning from temporal dynamics.

• To make learning even more challenging, we assume that the data might not be entirely faithful to a graphical game. That is, we assume that a portion of the joint actions in the observed/training data is not an NE. This "corruption" can be modeled via a noise mechanism.

• Learning games is an unsupervised task, i.e., we do not know which joint actions in the observed/training data are NE or not.

• We assume that payoffs are unavailable in the observed/training data, which is a reasonable assumption in some real-world instances.

3 Learning Graphical Games

In this paper, we define $\mathcal{H}$ to be the class of polymatrix graphical games with $n$ nodes and at most $k$ parents per node, as follows:²

$$\mathcal{H} \equiv \left\{ \mathcal{G} \;\middle|\; \mathcal{G} = \{u_{ii} : A_i \to \mathbb{R},\ u_{ij} : A_i \times A_j \to \mathbb{R}\}_{i,j \in V} \;\wedge\; (\forall i \in V)\ |N(i)| \le k \;\wedge\; |NE(\mathcal{G})| \in \{1, \dots, |A| - 1\} \right\}.$$
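As a concrete illustration of these definitions, the PSNE set of a small polymatrix game in $\mathcal{H}$ can be enumerated by brute force, directly from the best-response condition in eq.(1) and the payoff decomposition in eq.(2). The sketch below is not from the paper; the helper names (`psne_set`, `payoff`) and the toy two-player game are ours:

```python
import itertools

def payoff(i, x, unary, pairwise, parents):
    """Local payoff u_i(x_i, x_N(i)) = u_ii(x_i) + sum_j u_ij(x_i, x_j), per eq.(2)."""
    return unary[i][x[i]] + sum(pairwise[i, j][x[i]][x[j]] for j in parents[i])

def psne_set(actions, unary, pairwise, parents):
    """Enumerate all joint actions and keep those where every player's action
    is a best response to its parents' actions (the PSNE set of eq.(1))."""
    equilibria = []
    for x in itertools.product(*[range(a) for a in actions]):
        if all(payoff(i, x, unary, pairwise, parents)
               >= max(payoff(i, x[:i] + (a,) + x[i + 1:], unary, pairwise, parents)
                      for a in range(actions[i]))
               for i in range(len(actions))):
            equilibria.append(x)
    return equilibria

# Toy game: two binary-action players; player 1's payoff depends on player 0.
actions = [2, 2]
parents = {0: [], 1: [0]}
unary = {0: [0.0, 1.0], 1: [0.0, 0.0]}            # player 0 prefers action 1
pairwise = {(1, 0): [[1.0, 0.0], [0.0, 1.0]]}     # player 1 is rewarded for matching
print(psne_set(actions, unary, pairwise, parents))  # [(1, 1)]
```

Enumeration costs $|A| = \prod_i |A_i|$ payoff evaluations, which is exponential in $n$; this is for illustration only, not an efficient learning or equilibrium-computation method.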
Next, we introduce an extension of the generative model proposed in [9], originally for binary actions. Let $\mathcal{G}$ be a game, and let $Q_{\mathcal{G}}$ be a set defined as follows:³

$$Q_{\mathcal{G}} \equiv \left( \frac{|NE(\mathcal{G})|}{|A|},\ 1 - \frac{1}{2|A|} \right].$$

With some probability $q \in Q_{\mathcal{G}}$, a joint action $x$ is chosen uniformly at random from $NE(\mathcal{G})$; otherwise, $x$ is chosen uniformly at random from its complement set $A - NE(\mathcal{G})$. Hence, the generative model is a mixture model with mixture parameter $q$ corresponding to the probability that a stable outcome (i.e., a PSNE) of the game is observed, uniform over PSNE. Formally, the probability mass function (PMF) over joint behaviors $x \in A$ parameterized by $(\mathcal{G}, q)$ is

$$p_{\mathcal{G},q}(x) \equiv q\, \frac{1[x \in NE(\mathcal{G})]}{|NE(\mathcal{G})|} + (1 - q)\, \frac{1[x \notin NE(\mathcal{G})]}{|A| - |NE(\mathcal{G})|}, \quad (3)$$

where we can think of $q$ as the "signal" level, and thus $1 - q$ as the "noise" level in the data. Additionally, $P_{\mathcal{G},q}$ denotes the probability distribution defined by the PMF $p_{\mathcal{G},q}(\cdot)$.

By using the PMF in eq.(3), we can define a (scaled) negative log-likelihood function over joint behaviors $x \in A$ for a game $\mathcal{G}$ and mixture parameter $q$ as follows:

$$\mathcal{L}_{\mathcal{G},q}(x) = \frac{-\log p_{\mathcal{G},q}(x)}{\log(2|A|^2)} = -\frac{1[x \in NE(\mathcal{G})]}{\log(2|A|^2)} \log \frac{q}{|NE(\mathcal{G})|} - \frac{1[x \notin NE(\mathcal{G})]}{\log(2|A|^2)} \log \frac{1-q}{|A| - |NE(\mathcal{G})|}. \quad (4)$$

Note that since we scale the negative log-likelihood by a factor $1/\log(2|A|^2)$, we have $\mathcal{L}_{\mathcal{G},q}(x) \in [0, 1]$ for all $\mathcal{G} \in \mathcal{H}$, $q \in Q_{\mathcal{G}}$ and $x \in A$.

Maximum likelihood estimation (MLE) allows us to infer the game (and mixture parameter) from observed joint actions. More formally, given a dataset $S$ of $m$ joint actions, the empirical MLE minimizer is

$$(\hat{\mathcal{G}}, \hat{q}) = \arg\min_{\mathcal{G} \in \mathcal{H},\, q \in Q_{\mathcal{G}}} \frac{1}{m} \sum_{x \in S} \mathcal{L}_{\mathcal{G},q}(x).$$

[Footnote 2: $NE(\mathcal{G}) \ne \emptyset$ and $NE(\mathcal{G}) \ne A$ ensure that $P_{\mathcal{G},q}$ does not degenerate into a uniform distribution; see, e.g., Definition 4 in [9].]
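The mixture PMF of eq.(3) and the scaled negative log-likelihood of eq.(4) are straightforward to state in code. A minimal sketch (not from the paper; the names and the toy setting are ours), representing joint actions abstractly as indices $0, \dots, |A| - 1$:

```python
import math

def pmf(x, psne, num_joint, q):
    """Mixture PMF of eq.(3): uniform over the PSNE set with probability q,
    uniform over its complement with probability 1 - q."""
    if x in psne:
        return q / len(psne)
    return (1 - q) / (num_joint - len(psne))

def scaled_nll(x, psne, num_joint, q):
    """Scaled negative log-likelihood of eq.(4); the 1/log(2|A|^2) factor
    keeps it in [0, 1] for q in Q_G."""
    return -math.log(pmf(x, psne, num_joint, q)) / math.log(2 * num_joint ** 2)

# Toy setting: |A| = 8 joint actions (indexed 0..7), two of them PSNE, q = 0.9.
psne, num_joint, q = {0, 1}, 8, 0.9
assert abs(sum(pmf(x, psne, num_joint, q) for x in range(num_joint)) - 1.0) < 1e-12
assert all(0.0 <= scaled_nll(x, psne, num_joint, q) <= 1.0 for x in range(num_joint))
```

The two assertions check that the PMF sums to one and that the scaled loss lands in $[0, 1]$, as claimed below eq.(4), for this valid choice $q = 0.9 \in (2/8,\ 1 - 1/16]$.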
[Footnote 3: $q > |NE(\mathcal{G})|/|A|$ ensures that $p_{\mathcal{G},q}(x_1) > p_{\mathcal{G},q}(x_2)$ for $x_1 \in NE(\mathcal{G})$, $x_2 \notin NE(\mathcal{G})$; see, e.g., Proposition 5 and Definition 7 in [9].]

Assume that a joint action $x$ is drawn from an arbitrary data distribution $\mathcal{D}$. The expected MLE minimizer is given by

$$(\bar{\mathcal{G}}, \bar{q}) = \arg\min_{\mathcal{G} \in \mathcal{H},\, q \in Q_{\mathcal{G}}} \mathbb{E}_{\mathcal{D}}[\mathcal{L}_{\mathcal{G},q}(x)].$$

Note that if the data is generated by a true game $\mathcal{G}^* \in \mathcal{H}$ and mixture parameter $q^* \in Q_{\mathcal{G}^*}$, then the expected MLE minimizer is the true game and mixture parameter. That is, if $\mathcal{D} = P_{\mathcal{G}^*,q^*}$ then $NE(\bar{\mathcal{G}}) = NE(\mathcal{G}^*)$ and $\bar{q} = q^*$.

4 Sufficient Samples for PSNE Recovery

In this section, we show that if the number of samples is greater than $O(kn \log^2 n)$ for sparse graphs or $O(n^2 \log n)$ for dense graphs, then MLE correctly recovers the PSNE with high probability.

Number of PSNE Sets. First, we show that the number of PSNE sets induced by polymatrix graphical games is $O(e^{kn \log^2 n})$ for sparse graphs, and $O(e^{n^2 \log n})$ for dense graphs. These results will be useful later in obtaining a generalization bound, as well as for analyzing the correct recovery of PSNE.

Lemma 1 (Number of PSNE sets). Let $\mathcal{H}$ be the class of polymatrix graphical games with $n$ nodes and at most $k$ parents per node. Let $d(\mathcal{H})$ be the number of PSNE sets that can be produced by games in $\mathcal{H}$, i.e., $d(\mathcal{H}) = |\bigcup_{\mathcal{G} \in \mathcal{H}} \{NE(\mathcal{G})\}|$. We have that $d(\mathcal{H}) \in O(e^{kn \log^2 n})$ for $k \in O(1)$, and $d(\mathcal{H}) \in O(e^{n^2 \log n})$ for $k \in O(n)$.

Proof. Let $A_i = \{1, \dots, |A_i|\}$ for all $i \in V$, w.l.o.g. First, we introduce an equivalent representation of polymatrix graphical games. To each unary potential function $u_{ii} : A_i \to \mathbb{R}$, we associate a vector $\theta^{(i)} \in \mathbb{R}^{|A_i|}$ such that $u_{ii}(x_i) = \sum_{b \in A_i} \theta^{(i)}_b \, 1[x_i = b]$.
To each pairwise potential function $u_{ij} : A_i \times A_j \to \mathbb{R}$, we associate a matrix $\Theta^{(i,j)} \in \mathbb{R}^{|A_i| \times |A_j|}$ such that $u_{ij}(x_i, x_j) = \sum_{b \in A_i, c \in A_j} \theta^{(i,j)}_{bc} \, 1[x_i = b, x_j = c]$. Note that the payoff functions $u_i$ are linear with respect to the vectors $\theta^{(i)}$ and matrices $\Theta^{(i,j)}$ for all $i, j \in V$, that is,

$$u_i(x_i, x_{-i}) = \sum_{b \in A_i} \theta^{(i)}_b \, 1[x_i = b] + \sum_{j \in V,\, b \in A_i,\, c \in A_j} \theta^{(i,j)}_{bc} \, 1[x_i = b, x_j = c].$$

In the above, we can define the parent/neighbor set $N(i) \equiv \{j \mid \Theta^{(i,j)} \ne 0\}$, and thus summation across $j \in V$ is equivalent to summation across $j \in N(i)$.

By eq.(1), a PSNE $x$ fulfills $u_i(x_i, x_{-i}) - u_i(a, x_{-i}) \ge 0$ for all $i \in V$ and $a \in A_i$. For polymatrix graphical games, for all players $i$ and $a \in A_i$, a PSNE $x$ fulfills

$$\sum_{b \in A_i} \theta^{(i)}_b \, 1[x_i = b] + \sum_{j \in V,\, b \in A_i,\, c \in A_j} \theta^{(i,j)}_{bc} \, 1[x_i = b, x_j = c] - \theta^{(i)}_a - \sum_{j \in V,\, c \in A_j} \theta^{(i,j)}_{ac} \, 1[x_j = c] \ge 0.$$

Thus, a PSNE is defined by $\sum_{i \in V} |A_i|$ linear inequalities with respect to the vectors $\theta^{(i)}$ and matrices $\Theta^{(i,j)}$. For every player $i$ and $a \in A_i$, let $D(i) \equiv (1 + |A_i|)(1 + \sum_{j \in V - \{i\}} |A_j|)$ and define the vectors $y^{(i,a)} \in \{0, 1\}^{D(i)}$ and $\phi^{(i,a)} \in \mathbb{R}^{D(i)}$ as follows:

$$y^{(i,a)} \equiv (\{1[x_i = b]\}_{b \in A_i},\ \{1[x_i = b, x_j = c]\}_{j \in V, b \in A_i, c \in A_j},\ -1,\ \{-1[x_j = c]\}_{j \in V, c \in A_j}),$$
$$\phi^{(i,a)} \equiv (\{\theta^{(i)}_b\}_{b \in A_i},\ \{\theta^{(i,j)}_{bc}\}_{j \in V, b \in A_i, c \in A_j},\ \theta^{(i)}_a,\ \{\theta^{(i,j)}_{ac}\}_{j \in V, c \in A_j}).$$

For polymatrix graphical games, a PSNE $x$ fulfills $\phi^{(i,a)T} y^{(i,a)} \ge 0$ for all $i \in V$ and $a \in A_i$. Let $D(i, k) \equiv (1 + |A_i|)(1 + \max_{\pi \subseteq V - \{i\},\, |\pi| \le k} \sum_{j \in \pi} |A_j|)$.
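The equivalent linear parameterization used in this proof can be checked numerically: under the assumed toy setup below (three players, with player 0 having parents {1, 2}; all names and sizes are our own illustrative choices), the table form of eq.(2) and the indicator expansion agree on every joint action.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
sizes = [2, 3, 2]            # |A_i| for three players (toy values)
i, parents = 0, [1, 2]       # player 0 with parent set N(0) = {1, 2}

theta = rng.normal(size=sizes[i])                                    # theta^(i)
Theta = {j: rng.normal(size=(sizes[i], sizes[j])) for j in parents}  # Theta^(i,j)

def u_i(x):
    """Local payoff via eq.(2): u_ii(x_i) + sum_{j in N(i)} u_ij(x_i, x_j)."""
    return theta[x[i]] + sum(Theta[j][x[i], x[j]] for j in parents)

def u_i_linear(x):
    """The same payoff written as the proof's indicator expansion,
    linear in the entries of theta^(i) and Theta^(i,j)."""
    val = sum(theta[b] * (x[i] == b) for b in range(sizes[i]))
    val += sum(Theta[j][b, c] * ((x[i] == b) and (x[j] == c))
               for j in parents for b in range(sizes[i]) for c in range(sizes[j]))
    return val

# The two forms agree on all |A| = 2 * 3 * 2 joint actions.
for x in itertools.product(*[range(s) for s in sizes]):
    assert abs(u_i(x) - u_i_linear(x)) < 1e-12
```

Linearity in $(\theta^{(i)}, \Theta^{(i,j)})$ is exactly what lets the proof treat each best-response constraint as a linear classifier $1[\phi^{(i,a)T} y^{(i,a)} \ge 0]$.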
For every player $i$ and $a \in A_i$, define the function class $H^{(i,a)}$ as follows:

$$H^{(i,a)} \equiv \left\{ f : \{0,1\}^{D(i)} \to \{0,1\} \;\middle|\; f(y^{(i,a)}) = 1[\phi^{(i,a)T} y^{(i,a)} \ge 0] \;\wedge\; \phi^{(i,a)} \in \mathbb{R}^{D(i)} \;\wedge\; \textstyle\sum_l 1[\phi^{(i,a)}_l \ne 0] \le D(i,k) \right\}.$$

Note that $H^{(i,a)}$ is the class of linear classifiers in $D(i)$ dimensions, of weight vectors with at most $D(i,k)$ nonzero elements. For $k \in O(1)$, by Theorem 20 in [16] for the Vapnik-Chervonenkis (VC) dimension of sparse linear classifiers, and since $D(i,k) \in O(k)$ and $D(i) \in O(n)$, the VC dimension of $H^{(i,a)}$ is bounded as follows:

$$VC(H^{(i,a)}) \le 2(D(i,k) + 1) \log D(i) \in O(k \log n). \quad (5)$$

For $k = n - 1$, we have that $D(i, n-1) = D(i) \in O(n)$; by the well-known VC dimension of linear classifiers, we have

$$VC(H^{(i,a)}) \le D(i) + 1 \in O(n). \quad (6)$$

Recall that a PSNE is defined by $D \equiv \sum_{i \in V} |A_i|$ linear inequalities. Define the boolean function $g : \{0,1\}^D \to \{0,1\}$ as $g(\{z_{ia}\}_{i \in V, a \in A_i}) \equiv \prod_{i \in V, a \in A_i} z_{ia}$. Note that if $f^{(i,a)} \in H^{(i,a)}$ for all $i$ and $a \in A_i$, then $g(\{f^{(i,a)}(y^{(i,a)})\}_{i \in V, a \in A_i}) = 1 \Leftrightarrow x \in NE(\mathcal{G})$. Define the function class

$$g(\{H^{(i,a)}\}_{i \in V, a \in A_i}) \equiv \left\{ g(\{f^{(i,a)}(y^{(i,a)})\}_{i \in V, a \in A_i}) \;\middle|\; (\forall i \in V, a \in A_i)\ f^{(i,a)} \in H^{(i,a)} \right\}.$$

By Lemma 2 in [19], then

$$VC(g(\{H^{(i,a)}\}_{i \in V, a \in A_i})) \le 2D(1 + \log D) \max_{i \in V, a \in A_i} VC(H^{(i,a)}).$$

Note that $D \in O(n)$. For $k \in O(1)$, by eq.(5), we have $VC(g(\{H^{(i,a)}\}_{i \in V, a \in A_i})) \in O(kn \log^2 n)$. For $k \in O(n)$, by eq.(6), we have $VC(g(\{H^{(i,a)}\}_{i \in V, a \in A_i})) \in O(n^2 \log n)$.

Finally, note that our analysis of $VC(g(H_1, \dots, H_n))$ provides a bound with respect to PSNE, while we are interested in PSNE sets.
Therefore, $d(\mathcal{H}) \le \max_{i \in V} |A_i|^{VC(g(H_1, \dots, H_n))} \in O(e^{VC(g(H_1, \dots, H_n))})$, and we prove our claim. ∎

Generalization Bound. Next, we show that if the number of samples is greater than $O(kn \log^2 n)$ for sparse graphs or $O(n^2 \log n)$ for dense graphs, then the empirical MLE minimizer is close to the best achievable expected log-likelihood.

Lemma 2 (Generalization bound). Fix $\delta, \varepsilon \in (0,1)$. Let $\mathcal{H}$ be the class of polymatrix graphical games with $n$ nodes and at most $k$ parents per node. Assume an arbitrary data distribution $\mathcal{D}$. Assume that $S$ is a dataset of $m$ joint actions (of the $n$ players), each independently drawn from $\mathcal{D}$. If $m \in O(\frac{1}{\varepsilon^2}(kn \log^2 n + \log \frac{1}{\delta}))$ for $k \in O(1)$, or $m \in O(\frac{1}{\varepsilon^2}(n^2 \log n + \log \frac{1}{\delta}))$ for $k \in O(n)$, then

$$P_S\left[\mathbb{E}_{\mathcal{D}}[\mathcal{L}_{\hat{\mathcal{G}},\hat{q}}(x) - \mathcal{L}_{\bar{\mathcal{G}},\bar{q}}(x)] \le \varepsilon\right] \ge 1 - \delta.$$

Proof. For clarity, let $\mathcal{L}_S(\mathcal{G}, q) \equiv \frac{1}{m} \sum_{x \in S} \mathcal{L}_{\mathcal{G},q}(x)$ and $\mathcal{L}_{\mathcal{D}}(\mathcal{G}, q) \equiv \mathbb{E}_{\mathcal{D}}[\mathcal{L}_{\mathcal{G},q}(x)]$. By Lemma 11 in [9], for any game $\mathcal{G}$, for $0 < q_2 < q_1 < q < 1$ and for any $\varepsilon > 0$, we have

$$|\mathcal{L}_S(\mathcal{G}, q) - \mathcal{L}_{\mathcal{D}}(\mathcal{G}, q)| \le \varepsilon \;\wedge\; |\mathcal{L}_S(\mathcal{G}, q_2) - \mathcal{L}_{\mathcal{D}}(\mathcal{G}, q_2)| \le \varepsilon \;\Rightarrow\; |\mathcal{L}_S(\mathcal{G}, q_1) - \mathcal{L}_{\mathcal{D}}(\mathcal{G}, q_1)| \le \varepsilon.$$

The above implies that for any game $\mathcal{G}$ and for any $\varepsilon > 0$, we have

$$(\forall q \in \partial Q_{\mathcal{G}})\ |\mathcal{L}_S(\mathcal{G}, q) - \mathcal{L}_{\mathcal{D}}(\mathcal{G}, q)| \le \varepsilon \;\Rightarrow\; (\forall q \in Q_{\mathcal{G}})\ |\mathcal{L}_S(\mathcal{G}, q) - \mathcal{L}_{\mathcal{D}}(\mathcal{G}, q)| \le \varepsilon,$$

where $\partial Q_{\mathcal{G}}$ is the boundary of the set $Q_{\mathcal{G}}$, i.e., $\partial Q_{\mathcal{G}} = \{\frac{|NE(\mathcal{G})|}{|A|},\ 1 - \frac{1}{2|A|}\}$.
From the above, the union bound, Hoeffding's inequality, and Lemma 1, we have

$$P_S[(\forall \mathcal{G} \in \mathcal{H}, q \in Q_{\mathcal{G}})\ |\mathcal{L}_S(\mathcal{G}, q) - \mathcal{L}_{\mathcal{D}}(\mathcal{G}, q)| \le \tfrac{\varepsilon}{2}] = 1 - P_S[(\exists \mathcal{G} \in \mathcal{H}, q \in Q_{\mathcal{G}})\ |\mathcal{L}_S(\mathcal{G}, q) - \mathcal{L}_{\mathcal{D}}(\mathcal{G}, q)| > \tfrac{\varepsilon}{2}]$$
$$\ge 1 - P_S[(\exists \mathcal{G} \in \mathcal{H}, q \in \partial Q_{\mathcal{G}})\ |\mathcal{L}_S(\mathcal{G}, q) - \mathcal{L}_{\mathcal{D}}(\mathcal{G}, q)| > \tfrac{\varepsilon}{2}] \ge 1 - 2 d(\mathcal{H})\, P_S[|\mathcal{L}_S(\mathcal{G}, q) - \mathcal{L}_{\mathcal{D}}(\mathcal{G}, q)| > \tfrac{\varepsilon}{2}]$$
$$\ge 1 - 4 d(\mathcal{H})\, e^{-m\varepsilon^2/2} \ge 1 - \delta,$$

where $d(\mathcal{H})$ is the number of PSNE sets that can be produced by games in $\mathcal{H}$, as defined in Lemma 1. The factor 2 in $2 d(\mathcal{H})$ in the union bound comes from the fact that the set $\partial Q_{\mathcal{G}}$ has exactly two elements. Let $T(n, k) \equiv kn \log^2 n$ if $k \in O(1)$, and $T(n, k) \equiv n^2 \log n$ if $k \in O(n)$. By solving for $m$ in the last inequality, since $d(\mathcal{H}) \in O(e^{T(n,k)})$, we get $m \in O(\frac{1}{\varepsilon^2}(T(n,k) + \log \frac{1}{\delta}))$.

We proved so far that, with probability at least $1 - \delta$, we have $|\mathcal{L}_S(\mathcal{G}, q) - \mathcal{L}_{\mathcal{D}}(\mathcal{G}, q)| \le \frac{\varepsilon}{2}$ simultaneously for all $\mathcal{G} \in \mathcal{H}$ and $q \in Q_{\mathcal{G}}$. Additionally, since $(\hat{\mathcal{G}}, \hat{q})$ is the pair with minimum $\mathcal{L}_S(\mathcal{G}, q)$ over all $\mathcal{G} \in \mathcal{H}$ and $q \in Q_{\mathcal{G}}$, we have that

$$\mathbb{E}_{\mathcal{D}}[\mathcal{L}_{\hat{\mathcal{G}},\hat{q}}(x) - \mathcal{L}_{\bar{\mathcal{G}},\bar{q}}(x)] = \mathcal{L}_{\mathcal{D}}(\hat{\mathcal{G}}, \hat{q}) - \mathcal{L}_{\mathcal{D}}(\bar{\mathcal{G}}, \bar{q}) \le \mathcal{L}_S(\hat{\mathcal{G}}, \hat{q}) + \tfrac{\varepsilon}{2} - \mathcal{L}_S(\bar{\mathcal{G}}, \bar{q}) + \tfrac{\varepsilon}{2} \le \varepsilon,$$

with probability at least $1 - \delta$, which proves our claim. ∎

Sufficient Samples for PSNE Recovery. Finally, we show that if the number of samples is greater than $O(kn \log^2 n)$ for sparse graphs or $O(n^2 \log n)$ for dense graphs, then MLE correctly recovers the PSNE with high probability.

Theorem 3 (Sufficient samples for PSNE recovery). Fix $\delta, \varepsilon \in (0,1)$. Let $\mathcal{H}$ be the class of polymatrix graphical games with $n$ nodes and at most $k$ parents per node. Assume that the data distribution $\mathcal{D} = P_{\mathcal{G}^*,q^*}$ for some true game $\mathcal{G}^* \in \mathcal{H}$ and mixture parameter $q^* \in Q_{\mathcal{G}^*}$. Assume that $S$ is a dataset of $m$ joint actions (of the $n$ players), each independently drawn from $\mathcal{D}$.
If $m \in O(\frac{1}{\varepsilon^2}(kn \log^2 n + \log \frac{1}{\delta}))$ for $k \in O(1)$, or $m \in O(\frac{1}{\varepsilon^2}(n^2 \log n + \log \frac{1}{\delta}))$ for $k \in O(n)$, then

$$P_S[NE(\mathcal{G}^*) \subseteq NE(\hat{\mathcal{G}})] \ge 1 - \delta,$$

provided that $|NE(\mathcal{G}^*)| \ge 2$ and $\varepsilon < \beta(|NE(\mathcal{G}^*)|, q^*)$, where⁴

$$\beta(r, q) = \frac{1}{\log(2|A|^2)} \left( q \log \frac{q}{r} + (1-q) \log \frac{1-q}{|A|-r} - \frac{r-1}{r}\, q \log \frac{q}{r-1} - \left( \frac{q}{r} + 1 - q \right) \log \frac{1-q}{|A|-r+1} \right).$$

Proof. Here, we follow a worst-case approach in which we analyze the identifiability of the PSNE set of $\mathcal{G}^*$ with respect to a game $\mathcal{G}^-$ that has one PSNE less than $\mathcal{G}^*$. For our argument, showing the existence of such a polymatrix graphical game $\mathcal{G}^-$ is not necessary. In fact, a more general argument could be made with respect to a game that has $k^- \ge 1$ fewer PSNE than $\mathcal{G}^*$. The analysis for $k^- = 1$ provides the sufficient conditions for the general case $k^- \ge 1$.

[Footnote 4: $\beta(r, q) \le \frac{q}{2r}$. This maximum value is reached when $|A| \to \infty$.]

For clarity, let $c(n) \equiv \log(2|A|^2)$, $\widehat{NE} \equiv NE(\hat{\mathcal{G}})$ and $NE^* \equiv NE(\mathcal{G}^*)$. Define the game $\mathcal{G}^-$ by its PSNE set $NE^- \equiv NE(\mathcal{G}^-)$, where $NE^- = NE^* - \{x\}$ for some $x \in NE^*$. It can be easily verified that

$$|NE^-| = |NE^*| - 1,\quad |NE^* \cap NE^-| = |NE^*| - 1,\quad |NE^* \cup NE^-| = |NE^*|,\quad |NE^* - NE^-| = 1,\quad |NE^- - NE^*| = 0.$$

For any pair of games $\mathcal{G}, \mathcal{G}' \in \mathcal{H}$, let $NE \equiv NE(\mathcal{G})$ and $NE' \equiv NE(\mathcal{G}')$.
For any pair of games $\mathcal{G}, \mathcal{G}' \in \mathcal{H}$ and mixture parameters $q \in Q_{\mathcal{G}}$ and $q' \in Q_{\mathcal{G}'}$, splitting the sum over the four regions $NE \cap NE'$, $NE - NE'$, $NE' - NE$ and $A - (NE \cup NE')$, we have

$$\mathbb{E}_{P_{\mathcal{G},q}}[\log p_{\mathcal{G}',q'}(x)] = \sum_{x \in A} p_{\mathcal{G},q}(x) \log p_{\mathcal{G}',q'}(x)$$
$$= \frac{|NE \cap NE'|}{|NE|}\, q \log \frac{q'}{|NE'|} + \frac{|NE - NE'|}{|NE|}\, q \log \frac{1-q'}{|A|-|NE'|} + \frac{|NE' - NE|}{|A|-|NE|}\, (1-q) \log \frac{q'}{|NE'|} + \frac{|A|-|NE \cup NE'|}{|A|-|NE|}\, (1-q) \log \frac{1-q'}{|A|-|NE'|}. \quad (7)$$

Note that the pair $(\mathcal{G}^-, q^*)$ is well defined. More formally, since $|NE^-| = |NE^*| - 1$, we have that $Q_{\mathcal{G}^-} = ((|NE^*| - 1)/|A|,\ 1 - 1/(2|A|)]$. Thus, $q^* \in Q_{\mathcal{G}^*} \Rightarrow q^* \in Q_{\mathcal{G}^-}$. From eq.(7), we have

$$KL(P_{\mathcal{G}^*,q^*} \| P_{\mathcal{G}^-,q^*}) = \mathbb{E}_{P_{\mathcal{G}^*,q^*}}[\log p_{\mathcal{G}^*,q^*}(x) - \log p_{\mathcal{G}^-,q^*}(x)]$$
$$= q^* \log \frac{q^*}{|NE^*|} + (1-q^*) \log \frac{1-q^*}{|A|-|NE^*|} - \frac{|NE^*|-1}{|NE^*|}\, q^* \log \frac{q^*}{|NE^*|-1} - \left( \frac{q^*}{|NE^*|} + 1 - q^* \right) \log \frac{1-q^*}{|A|-|NE^*|+1}.$$

By the assumption in the theorem and the above, we have that

$$c(n)\, \varepsilon < c(n)\, \beta(|NE^*|, q^*) = KL(P_{\mathcal{G}^*,q^*} \| P_{\mathcal{G}^-,q^*}). \quad (8)$$

Note that since $\mathcal{D} = P_{\mathcal{G}^*,q^*}$, then $NE(\bar{\mathcal{G}}) = NE(\mathcal{G}^*)$ and $\bar{q} = q^*$. By Lemma 2 and eq.(4), if $m \in O(\frac{1}{\varepsilon^2}(kn \log^2 n + \log \frac{1}{\delta}))$ for $k \in O(1)$ or $m \in O(\frac{1}{\varepsilon^2}(n^2 \log n + \log \frac{1}{\delta}))$ for $k \in O(n)$, then

$$c(n)\, \varepsilon \ge c(n)\, \mathbb{E}_{P_{\mathcal{G}^*,q^*}}[\mathcal{L}_{\hat{\mathcal{G}},\hat{q}}(x) - \mathcal{L}_{\mathcal{G}^*,q^*}(x)] = \mathbb{E}_{P_{\mathcal{G}^*,q^*}}[\log p_{\mathcal{G}^*,q^*}(x) - \log p_{\hat{\mathcal{G}},\hat{q}}(x)] = KL(P_{\mathcal{G}^*,q^*} \| P_{\hat{\mathcal{G}},\hat{q}}).$$

From the above and eq.(8), we have that $KL(P_{\mathcal{G}^*,q^*} \| P_{\hat{\mathcal{G}},\hat{q}}) < KL(P_{\mathcal{G}^*,q^*} \| P_{\mathcal{G}^-,q^*})$. That is, the empirical MLE minimizer $(\hat{\mathcal{G}}, \hat{q})$ is better than the pair $(\mathcal{G}^-, q^*)$. Therefore, $\widehat{NE}$ includes all the PSNE in $NE^*$, i.e., $NE^* \subseteq \widehat{NE}$, and we prove our claim. ∎
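As a numerical sanity check on the threshold in Theorem 3, the identity $c(n)\,\beta(|NE^*|, q^*) = KL(P_{\mathcal{G}^*,q^*} \| P_{\mathcal{G}^-,q^*})$ can be compared against a direct evaluation of the KL divergence from the two PMFs. A small sketch (the helper names and toy values $r = 4$, $q = 0.8$, $|A| = 1024$ are ours, not the paper's):

```python
import math

def beta(r, q, A):
    """beta(r, q) from Theorem 3, including the 1/log(2|A|^2) scaling."""
    c = math.log(2 * A ** 2)
    val = (q * math.log(q / r)
           + (1 - q) * math.log((1 - q) / (A - r))
           - ((r - 1) / r) * q * math.log(q / (r - 1))
           - (q / r + 1 - q) * math.log((1 - q) / (A - r + 1)))
    return val / c

def kl_drop_one(r, q, A):
    """KL(P_{G*,q} || P_{G-,q}) computed directly: G- keeps the same PSNE set
    minus one joint action, so the sum splits over three regions."""
    kl = (r - 1) * (q / r) * math.log((q / r) / (q / (r - 1)))          # shared PSNE
    kl += (q / r) * math.log((q / r) / ((1 - q) / (A - r + 1)))         # dropped PSNE
    kl += (1 - q) * math.log((A - r + 1) / (A - r))                     # non-equilibria
    return kl

r, q, A = 4, 0.8, 1024
assert abs(math.log(2 * A ** 2) * beta(r, q, A) - kl_drop_one(r, q, A)) < 1e-12
assert 0 < beta(r, q, A) <= q / (2 * r)   # consistent with the bound in Footnote 4
```

The assertions confirm the closed form of eq.(8) against direct computation, and that $\beta(r, q) \le q/(2r)$ holds at this point, as Footnote 4 states.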
Remark. A similar argument as in Theorem 3 can be used to show that $NE(\hat{\mathcal{G}}) \subseteq NE(\mathcal{G}^*)$, although the sufficient number of samples increases to $O(kn^3 \log^2 n)$ for sparse graphs, and $O(n^4 \log n)$ for dense graphs. (The function $\beta$ in such a case does not contain the $1/\log(2|A|^2) \in O(1/n)$ factor.)

5 Necessary Samples for PSNE Recovery

In this section, we show that if the number of samples is less than $\Omega(kn \log^2 n)$ for sparse graphs or $\Omega(n^2 \log n)$ for dense graphs, then any conceivable method fails to recover the PSNE with probability at least $1/2$.

Theorem 4 (Necessary samples for PSNE recovery). Let $\mathcal{H}$ be the class of polymatrix graphical games with $n$ nodes and at most $k$ parents per node. Assume that the true game $\mathcal{G}^*$ is chosen uniformly at random (by nature) from a finite subset of $\mathcal{H}$. Assume that the true mixture parameter $q^*$ is known to the learner. After choosing the true game $\mathcal{G}^*$, nature generates a dataset $S$ of $m$ joint actions (of the $n$ players), each independently drawn from $P_{\mathcal{G}^*,q^*}$. Assume that a learner uses the dataset $S$ in order to choose a game $\hat{\mathcal{G}}$. If the number of samples $m$ is less than $\Omega(kn \log^2 n)$ for $k \in O(1)$, or less than $\Omega(n^2 \log n)$ for $k \in O(n)$, then $P_{\mathcal{G}^*,S}[NE(\hat{\mathcal{G}}) \ne NE(\mathcal{G}^*)] \ge 1/2$, for any conceivable learning mechanism for choosing $\hat{\mathcal{G}}$.

Proof. Let $A_i = \{1, \dots, |A_i|\}$ for all $i \in V$, w.l.o.g. Let $\Pi = \{\pi \mid \pi \subseteq V \wedge |\pi| = k\}$, and let $\pi \in \Pi$ be the set of $k$ "influential" players. Assume that nature picks $\pi$ uniformly at random from the $\binom{n}{k}$ elements in $\Pi$. For a fixed $\pi$, we will construct a true game $\mathcal{G}_\pi$. For clarity, we define $\mathcal{G}_\pi \equiv \mathcal{G}^*$ and $q \equiv q^*$. The goal of the learner is to use the dataset $S$ in order to choose a set $\hat{\pi}$ of $k$ players, and to output a game $\mathcal{G}_{\hat{\pi}} \equiv \hat{\mathcal{G}}$. For a fixed $\pi$, we construct a game $\mathcal{G}_\pi$ with a single PSNE (i.e., $|NE(\mathcal{G}_\pi)| = 1$) as follows.
The $k$ "influential" players do not have any parents, i.e., $N(i) = \emptyset$ for $i \in \pi$. We force the "influential" players $i \in \pi$ to have best response 1, by setting their potential functions as follows:

$$(\forall i \in \pi)\ u_{ii}(x_i) = 1[x_i = 1].$$

By eq.(2), the local payoff function for $i \in \pi$ becomes $u_i(x_i) = 1[x_i = 1]$. The remaining $n - k$ "influenced" players have the $k$ "influential" players as parents, i.e., $N(i) = \pi$ for $i \notin \pi$. We force the "influenced" players to have best response 2, by setting their potential functions as follows:

$$(\forall i \notin \pi)\ u_{ii}(x_i) = 0,\qquad (\forall i \notin \pi, j \in \pi)\ u_{ij}(x_i, x_j) = 1[x_i = 2, x_j = 1].$$

By eq.(2), the local payoff function for $i \notin \pi$ becomes $u_i(x_i, x_{N(i)}) = \sum_{j \in \pi} 1[x_i = 2, x_j = 1]$. The constructed game $\mathcal{G}_\pi$ has a single PSNE $x^\pi$. More specifically,

$$(\forall i \in \pi)\ x^\pi_i = 1,\qquad (\forall i \notin \pi)\ x^\pi_i = 2,\qquad NE(\mathcal{G}_\pi) = \{x^\pi\}.$$

Since we assume a known fixed mixture parameter $q$ and since $|NE(\mathcal{G}_\pi)| = 1$, the PMF defined in eq.(3) reduces to

$$p_\pi(x) \equiv p_{\mathcal{G}_\pi,q}(x) = 1[x = x^\pi]\, q + 1[x \ne x^\pi]\, \frac{1-q}{|A|-1}.$$

Let $P_\pi$ denote the probability distribution defined by the PMF $p_\pi(\cdot)$. Clearly, $\pi \ne \pi' \Leftrightarrow x^\pi \ne x^{\pi'}$. Thus, for all $\pi \ne \pi'$ the Kullback-Leibler divergence is bounded as follows:

$$KL(P_\pi \| P_{\pi'}) = \sum_{x \in A} p_\pi(x) \log p_\pi(x) - \sum_{x \in A} p_\pi(x) \log p_{\pi'}(x)$$
$$= q \log q + (|A|-1)\, \frac{1-q}{|A|-1} \log \frac{1-q}{|A|-1} - q \log \frac{1-q}{|A|-1} - \frac{1-q}{|A|-1} \log q - (|A|-2)\, \frac{1-q}{|A|-1} \log \frac{1-q}{|A|-1}$$
$$= \frac{|A| q - 1}{|A|-1} \left( \log q - \log \frac{1-q}{|A|-1} \right).$$

Assume that the value of the mixture parameter (known to the learner) is $q \equiv 2/|A| \in Q_{\mathcal{G}_\pi}$.
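The lower-bound construction above can be verified by brute force for a small instance: with the assumed toy values $n = 4$, $\pi = \{0, 2\}$ and three actions per player (our illustrative choices, not the paper's), the constructed game $\mathcal{G}_\pi$ indeed has the single PSNE $x^\pi$.

```python
import itertools

n, pi = 4, {0, 2}          # n players; pi = the set of "influential" players
actions = [1, 2, 3]        # |A_i| = 3 actions per player

def is_psne(x, n, pi):
    """Best-response check for the constructed game G_pi: players in pi get
    payoff 1[x_i = 1]; players outside pi get sum_{j in pi} 1[x_i = 2, x_j = 1]."""
    for i in range(n):
        if i in pi:
            payoffs = [1.0 if a == 1 else 0.0 for a in actions]
        else:
            payoffs = [float(sum(1 for j in pi if a == 2 and x[j] == 1))
                       for a in actions]
        if payoffs[actions.index(x[i])] < max(payoffs):
            return False
    return True

psne = [x for x in itertools.product(actions, repeat=n) if is_psne(x, n, pi)]
x_pi = tuple(1 if i in pi else 2 for i in range(n))
assert psne == [x_pi]      # unique PSNE: influential players play 1, the rest play 2
```

Uniqueness is easy to see from the code: an influential player's only best response is 1 regardless of the others, and once all of $\pi$ plays 1, each influenced player's only best response is 2.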
Thus, for all $\pi \ne \pi'$ we have

$$KL(P_\pi \| P_{\pi'}) = \frac{\log(|A|-1) - \log(|A|/2 - 1)}{|A|-1} \in O(1/(n \log n)).$$

Conditioned on $\pi$, $S$ is a dataset of $m$ i.i.d. joint actions drawn from $P_\pi$. That is, $S \mid \pi \sim P_\pi^m$. The mutual information can be bounded by a pairwise KL-based bound [21] as follows:

$$I(\pi; S) \le \frac{1}{|\Pi|^2} \sum_{\pi \in \Pi} \sum_{\pi' \in \Pi} KL(P_\pi^m \| P_{\pi'}^m) \le \max_{\pi \ne \pi'} KL(P_\pi^m \| P_{\pi'}^m) = m \max_{\pi \ne \pi'} KL(P_\pi \| P_{\pi'}) \in O(m/(n \log n)).$$

Note that $\hat{\pi} = \pi \Leftrightarrow NE(\mathcal{G}_{\hat{\pi}}) = NE(\mathcal{G}_\pi)$. Let $T(n, k) \equiv k \log n$ if $k \in O(1)$, and $T(n, k) \equiv n$ if $k = n/2$. Next, we show that $\log |\Pi| \in \Omega(T(n,k))$. For $k \in O(1)$, we have $|\Pi| = \binom{n}{k} \ge (\frac{n}{k})^k$, and thus $\log |\Pi| \in \Omega(k \log n) = \Omega(T(n,k))$. For $k = n/2$, we have $|\Pi| = \binom{n}{n/2} \ge (\frac{n}{n/2})^{n/2} = 2^{n/2}$, and thus $\log |\Pi| \in \Omega(n) = \Omega(T(n,k))$. By Fano's inequality [5] on the Markov chain $\pi \to S \to \hat{\pi}$, we have

$$P_{\mathcal{G}^*,S}[NE(\hat{\mathcal{G}}) \ne NE(\mathcal{G}^*)] = P_{\pi,S}[NE(\mathcal{G}_{\hat{\pi}}) \ne NE(\mathcal{G}_\pi)] = P_{\pi,S}[\hat{\pi} \ne \pi] \ge 1 - \frac{I(\pi; S) + \log 2}{\log |\Pi|} \ge 1 - O\left( \frac{m/(n \log n)}{T(n,k)} \right) = 1/2.$$

By solving the last equality for $m$, we prove our claim. ∎

6 Concluding Remarks

There are several ways of extending this research. Other noise processes can be analyzed, such as a local noise model where the observations are drawn from the PSNE set and, subsequently, each action is independently corrupted by noise. Other equilibrium concepts can also be studied, such as mixed-strategy Nash equilibria, correlated equilibria, and epsilon-Nash equilibria.

Acknowledgements. We thank Xi Chen and Richard Cole for the helpful and valuable discussions.

References

[1] R. Aumann. Subjectivity and correlation in randomized strategies. Journal of Mathematical Economics, 1:67–96, 1974.

[2] O. Ben-Zwi and A. Ronen. Local and global price of anarchy of graphical games.
Theoretical Computer Science, 412:1196–1207, 2011.

[3] B. Blum, C.R. Shelton, and D. Koller. A continuation method for Nash equilibria in structured games. Journal of Artificial Intelligence Research, 25:457–502, 2006.

[4] E. Brenner and D. Sontag. SparsityBoost: A new scoring function for learning Bayesian network structure. Uncertainty in Artificial Intelligence, pages 112–121, 2013.

[5] T. Cover and J. Thomas. Elements of Information Theory. John Wiley & Sons, 2nd edition, 2006.

[6] C. Daskalakis, P. Goldberg, and C. Papadimitriou. The complexity of computing a Nash equilibrium. Communications of the ACM, 52(2):89–97, 2009.

[7] D. Fudenberg and J. Tirole. Game Theory. The MIT Press, 1991.

[8] A. Ghoshal and J. Honorio. From behavior to sparse graphical games: Efficient recovery of equilibria. IEEE Allerton Conference on Communication, Control, and Computing, pages 1220–1227, 2016.

[9] J. Honorio and L. Ortiz. Learning the structure and parameters of large-population graphical games from behavioral data. Journal of Machine Learning Research, 16(Jun):1157–1210, 2015.

[10] M. Irfan and L. Ortiz. On influence, stable behavior, and the most influential individuals in networks: A game-theoretic approach. Artificial Intelligence, 215:79–119, 2014.

[11] E. Janovskaja. Equilibrium situations in multi-matrix games. Litovskiĭ Matematicheskiĭ Sbornik, 8:381–384, 1968.

[12] A. Jiang and K. Leyton-Brown. Polynomial-time computation of exact correlated equilibrium in compact games. ACM Electronic Commerce Conference, pages 119–126, 2011.

[13] S. Kakade, M. Kearns, J. Langford, and L. Ortiz. Correlated equilibria in graphical games. ACM Electronic Commerce Conference, pages 42–47, 2003.

[14] M. Kearns, M. Littman, and S. Singh. Graphical models for game theory.
Uncertainty in Artificial Intelligence, pages 253–260, 2001.

[15] J. Nash. Non-cooperative games. Annals of Mathematics, 54(2):286–295, 1951.

[16] T. Neylon. Sparse Solutions for Linear Prediction Problems. PhD thesis, New York University, May 2006.

[17] L. Ortiz and M. Kearns. Nash propagation for loopy graphical games. Neural Information Processing Systems, 15:817–824, 2002.

[18] C. Papadimitriou and T. Roughgarden. Computing correlated equilibria in multi-player games. Journal of the ACM, 55(3):1–29, 2008.

[19] E. Sontag. VC dimension of neural networks. In Neural Networks and Machine Learning, pages 69–95. Springer, 1998.

[20] D. Vickrey and D. Koller. Multi-agent algorithms for solving graphical games. Association for the Advancement of Artificial Intelligence Conference, pages 345–351, 2002.

[21] B. Yu. Assouad, Fano, and Le Cam. In D. Pollard, E. Torgersen, and G. Yang, editors, Festschrift for Lucien Le Cam: Research Papers in Probability and Statistics, pages 423–435. Springer New York, 1997.
