A survey of statistical network models

Anna Goldenberg, University of Toronto
Alice X. Zheng, Microsoft Research
Stephen E. Fienberg, Carnegie Mellon University
Edoardo M. Airoldi, Harvard University

December 2009

Contents

Preface
1 Introduction
  1.1 Overview of Modeling Approaches
  1.2 What This Survey Does Not Cover
2 Motivation and Dataset Examples
  2.1 Motivations for Network Analysis
  2.2 Sample Datasets
    2.2.1 Sampson’s “Monastery” Study
    2.2.2 The Enron Email Corpus
    2.2.3 The Protein Interaction Network in Budding Yeast
    2.2.4 The Add Health Adolescent Relationship and HIV Transmission Study
    2.2.5 The Framingham “Obesity” Study
    2.2.6 The NIPS Paper Co-Authorship Dataset
3 Static Network Models
  3.1 Basic Notation and Terminology
  3.2 The Erdős-Rényi-Gilbert Random Graph Model
  3.3 The Exchangeable Graph Model
  3.4 The $p_1$ Model for Social Networks
  3.5 $p_2$ Models for Social Networks and Their Bayesian Relatives
  3.6 Exponential Random Graph Models
  3.7 Random Graph Models with Fixed Degree Distribution
  3.8 Blockmodels, Stochastic Blockmodels and Community Discovery
  3.9 Latent Space Models
    3.9.1 Comparison with Stochastic Blockmodels
4 Dynamic Models for Longitudinal Data
  4.1 Random Graphs and the Preferential Attachment Model
  4.2 Small-World Models
  4.3 Duplication-Attachment Models
  4.4 Continuous Time Markov Chain Models
  4.5 Discrete Time Markov Models
    4.5.1 Discrete Markov ERGM Model
    4.5.2 Dynamic Latent Space Model
    4.5.3 Dynamic Contextual Friendship Model (DCFM)
5 Issues in Network Modeling
6 Summary
Bibliography

Preface

Networks are ubiquitous in science and have become a focal point for discussion in everyday life. Formal statistical models for the analysis of network data have emerged as a major topic of interest in diverse areas of study, and most of these involve a form of graphical representation. Probability models on graphs date back to 1959. Along with empirical studies in social psychology and sociology from the 1960s, these early works generated an active “network community” and a substantial literature in the 1970s. This effort moved into the statistical literature in the late 1970s and 1980s, and the past decade has seen a burgeoning network literature in statistical physics and computer science. The growth of the World Wide Web and the emergence of online “networking communities” such as Facebook, MySpace, and LinkedIn, and a host of more specialized professional network communities, have intensified interest in the study of networks and network data.
Our goal in this review is to provide the reader with an entry point to this burgeoning literature. We begin with an overview of the historical development of statistical network modeling and then we introduce a number of examples that have been studied in the network literature. Our subsequent discussion focuses on a number of prominent static and dynamic network models and their interconnections. We emphasize formal model descriptions, and pay special attention to the interpretation of parameters and their estimation. We end with a description of some open problems and challenges for machine learning and statistics.

Chapter 1
Introduction

Many scientific fields involve the study of networks in some form. Networks have been used to analyze interpersonal social relationships, communication networks, academic paper coauthorships and citations, protein interaction patterns, and much more. Popular books on networks and their analysis began to appear a decade ago [see, e.g., 24; 50; 318; 319; 68], and online “networking communities” such as Facebook, MySpace, and LinkedIn are an even more recent phenomenon.

In this work, we survey selected aspects of the literature on statistical modeling and analysis of networks in social sciences, computer science, physics, and biology. Given the volume of books, papers, and conference proceedings published on the subject in these different fields, a single comprehensive survey would be impossible. Our goal is far more modest. We attempt to chart the progress of statistical modeling of network data over the past seventy years, to outline succinctly the major schools of thought and approaches to network modeling, and to describe some of their interconnections. We also attempt to identify major statistical gaps in these modeling efforts.
From this overview one might then synthesize and deduce promising future research directions. Kolaczyk [177] provides a complementary statistical overview.

The existing set of statistical network models may be organized along several major axes. For this article, we choose the axis of static vs. dynamic models. Static network models concentrate on explaining the observed set of links based on a single snapshot of the network, whereas dynamic network models are often concerned with the mechanisms that govern changes in the network over time. Most early examples of networks were single static snapshots. Hence static network models have been the main focus of research for many years. However, with the emergence of online networks, more data is available for dynamic analysis, and in recent years there has been growing interest in dynamic modeling.

In the remainder of this chapter we provide a brief historical overview of network modeling approaches. In subsequent chapters we introduce some examples studied in the network literature and give a more detailed comparative description of select modeling approaches.

1.1 Overview of Modeling Approaches

Almost all of the “statistically” oriented literature on the analysis of networks derives from a handful of seminal papers. In social psychology and sociology there is the early work of Simmel and Wolff [268] at the turn of the last century and Moreno [221] in the 1930s, as well as the empirical studies of Stanley Milgram [215; 298] in the 1960s; in mathematics/probability there is the Erdős-Rényi paper on random graph models [94]. There are other papers that dealt with these topics contemporaneously or even earlier, but these are the ones that appear to have had lasting impact.
Moreno [221] invented the sociogram, a diagram of points and lines used to represent relations among persons and a precursor to the graph representation for networks. Luce and others developed a mathematical structure to go with Moreno’s sociograms using incidence matrices and graphs (see, e.g., [202; 200; 201; 203; 244; 282; 11]), but the structure they explored was essentially deterministic. Milgram gave the name to what is now referred to as the “Small World” phenomenon (short paths of connections linking most people in social spheres), and his experiments had provocative results: the shortest path between any two people for completed chains has a median length of around 6; however, the majority of chains initiated in his experiments were never completed! (His studies provided the title for the play and movie Six Degrees of Separation, ignoring the complexity of his results due to the censoring.) White [321] and Fienberg and Lee [100] gave a formal Markov-chain-like model and analysis of the Milgram experimental data, including information on the uncompleted chains. Milgram’s data were gathered in batches of transmission, and thus these models can be thought of as representing early examples of generative descriptions of dynamic network evolution. Recently, Dodds et al. [86] studied a global “replication” variation on the Milgram study in which more than 60,000 e-mail users attempted to reach one of 18 target persons in 13 countries by forwarding messages to acquaintances. Only 384 of 24,163 chains reached their targets, but they estimate the median length for completions to be 7, by assuming that attrition occurs at random.

The social science network research community that arose in the 1970s was built upon these earlier efforts, in particular the Erdős-Rényi-Gilbert model. Research on the Erdős-Rényi-Gilbert model (along with works by Katz et al.
[166; 168; 167]) engendered the field of random graph theory. In their papers, Erdős and Rényi worked with a fixed number of vertices, $N$, and number of edges, $E$, and studied the properties of this model as $E$ increases. Gilbert studied a related two-parameter version of the model, with $N$ as the number of vertices and $p$ the fixed probability for choosing edges. Although their descriptions might at first appear to be static in nature, we could think in terms of adding edges sequentially and thus turn the model into a dynamic one. In this alternative binomial version of the Erdős-Rényi-Gilbert model, the key to asymptotic behavior is the value $\lambda = pN$. There is a “phase change” associated with the value of $\lambda = 1$, at which point we shift from seeing many small connected components in the form of trees to the emergence of a single “giant connected component.” Probabilists such as Pittel [243] imported ideas and results from stochastic processes into the random graph literature.

The $p_1$ model of Holland and Leinhardt [149] extended the Erdős-Rényi-Gilbert model to allow for differential attraction (popularity) and expansiveness, as well as an additional effect due to reciprocation. The $p_1$ model was log-linear in form, which allowed for easy computation of maximum likelihood estimates using a contingency table formulation of the model [101; 102]. It also allowed for various generalizations to multidimensional network structures [103] and stochastic blockmodels. This approach to modeling network data quickly evolved into the class of $p^*$ or exponential random graph models (ERGM) originating in the work of Frank and Strauss [110] and Strauss and Ikeda [287].
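The phase change in the Erdős-Rényi-Gilbert model described earlier in this overview can be illustrated with a small simulation. The following Python sketch is only an illustration under stated assumptions, not code from the literature surveyed: it draws each possible edge independently with probability $p = \lambda / N$ and tracks connected components with a union-find structure. The function name `largest_component_fraction`, the graph size, and the chosen values of $\lambda$ are hypothetical choices made for this example.

```python
import random


def largest_component_fraction(n, lam, rng):
    """Simulate a binomial random graph with p = lam / n and return the
    fraction of vertices lying in its largest connected component."""
    p = lam / n
    parent = list(range(n))  # union-find parent pointers
    size = [1] * n           # component sizes, valid at root vertices

    def find(v):
        # Follow parent pointers to the root, halving the path as we go.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:  # edge {i, j} is present
                ri, rj = find(i), find(j)
                if ri != rj:
                    if size[ri] < size[rj]:
                        ri, rj = rj, ri
                    parent[rj] = ri       # merge smaller into larger
                    size[ri] += size[rj]

    return max(size[find(v)] for v in range(n)) / n


if __name__ == "__main__":
    rng = random.Random(2009)  # fixed seed, chosen arbitrarily
    for lam in (0.5, 1.0, 2.0):
        frac = largest_component_fraction(1000, lam, rng)
        print(f"lambda = {lam}: largest component fraction = {frac:.3f}")
```

For $\lambda$ well below 1 the largest component should cover only a small fraction of the vertices, while for $\lambda$ well above 1 a single component should contain most of them. The exact values depend on the seed and on $N$, so none are quoted here.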
A trio of papers demonstrating procedures for using ERGMs [316; 241; 254] led to the widespread use of ERGMs in a descriptive form for cross-sectional network structures or cumulative links for networks, which is what we refer to here as static models. Full maximum likelihood approaches for ERGMs appeared in the work of Snijders and Handcock and their collaborators, some of which we describe in chapter 3.

Most of the early examples of networks in the social science literature were relatively small (in terms of the number of nodes) and involved the study of the network at a fixed point in time or cumulatively over time. Only a few studies (e.g., Sampson’s 1968 data on novice monks in the monastery [259]) collected, reported, and analyzed network data at multiple points in time so that one could truly study the evolution of the network, i.e., network dynamics. The focus on relatively small networks reflected the state of the art of computation, but it was sufficient to trigger the discussion of how one might assess the fit of a network model. Should one focus on “small sample” properties and exact distributions given some form of minimal sufficient statistic, as one often did in other areas of statistics, or should one look at asymptotic properties, where there is a sequence of networks of increasing size? Even if we have “repeated cross-sections” of the network, if the network is truly evolving in continuous time we need to ask how to ensure that the continuous time parameters are estimable. We return to many of these questions in subsequent chapters.

In the late 1990s, physicists began to work on network models and study their properties in a form similar to the macro-level descriptions of statistical physics.
Barabási, Newman, and Watts, among others, produced what we can think of as variations on the Erdős-Rényi-Gilbert model which either controlled the growth of the network or allowed for differential probabilities for edge addition and/or deletion. These variations were intended to produce phenomena such as “hubs,” “local clustering,” and “triadic closures.” The resulting models gave us fixed degree distribution limits in the form of power laws (variations on preferential attachment models, “the rich get richer,” that date back to Yule [329] and Simon [269]; see also [218]) as well as what became known as “small world” models. The small-world phenomenon, which harks back to Milgram’s 1960s studies, usually refers to two distinct properties: (1) small average distance and (2) the “clustering” effect, where two nodes with a common neighbor are more likely to be adjacent. Many of these authors claim that these properties are ubiquitous in realistic networks. To model networks with the small-world phenomenon, it is natural to utilize randomly generated graphs with a power law degree distribution, where the fraction of nodes with degree $k$ is proportional to $k^{-a}$ for some positive exponent $a$. Many of the most relevant papers are included in an edited collection by Newman et al. [231]. More recently this style of statistical physics model has been used to detect community structure in networks, e.g., see Girvan and Newman [122] and Backstrom et al. [20], a phenomenon which has its counterpart description in the social science network modeling literature.

The probabilistic literature on random graph models from the 1990s made the link with epidemics and other evolving stochastic phenomena. Picking up on this idea, Watts and Strogatz [320] and others used epidemic models to capture general characteristics of the evolution of these new variations on random networks.
Durrett [91] has provided us with a book-length treatment on the topic with a number of interesting variations on the theme. The appeal of stochastic processes as descriptions of dynamic network models comes from being able to exploit the extensive literature already developed, including the existence and the form of stationary distributions and other model features or properties. Chung and Lu [69] provide a complementary treatment of these models and their probabilistic properties.

One of the principal problems with this diverse network literature that we see is that, with some notable exceptions, the statistical tools for estimation and assessing the fit of “statistical physics” or stochastic process models are lacking. Consequently, no attention is paid to the fact that real data may often be biased and noisy. What authors in the network literature have often relied upon is the extraction of key features of the related graphical network representation, e.g., the use of power laws to represent degree distributions or measures of centrality and clustering, without any indication that they are either necessary or sufficient as descriptors for the actual network data. Moreover, these summary quantities can often be highly misleading, as the critique by Stouffer et al. [285, 286] of methods used by Barabási [25] and Vázquez et al. [304] suggests. Barabási claimed that the dynamics of a number of human activities are scale-free, i.e., he specifically reported that the probability distribution of time intervals between consecutive e-mails sent by a single user and time delays for e-mail replies follow a power-law with exponent $-1$, and he proposed a priority-queuing process as an explanation of the bursty nature of human activity. Stouffer et al.
[286] demonstrated that the reported power-law distribution was solely an artifact of the analysis of the empirical data and used Bayes factors to show that the proposed model is not representative of e-mail communication patterns. See a related discussion of the poor fit of power laws in Clauset et al. [74].

There are several works, however, that try to address model fitting and model comparison. For example, the work of Williams and Martinez [323] showed how a simple two-parameter model predicted “key structural properties of the most complex and comprehensive food webs in the primary literature”. Another good example is the work of Middendorf et al. [214], where the authors used network motif counts as input to a discriminative systematic classification for deciding which configuration model the actual observed network came from; they looked at power law, small-world, duplication-mutation, duplication-mutation-complementation and other models (seven in total) and concluded that the duplication-mutation-complementation model best described the protein-protein interaction data in Drosophila melanogaster.

Machine learning approaches emerged in several forms over the past decade with the empirical studies of Faloutsos et al. [97] and Kleinberg [173, 172, 174], who introduced a model for which the underlying graph is a grid; the graphs generated do not have a power law degree distribution, and each vertex has the same expected degree. The strict requirement that the underlying graph be a cycle or grid renders the model inapplicable to webgraphs or biological networks. Durrett [91] treats variations on this model as well.
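The earlier cautions about power-law claims can be made concrete with a small fitting exercise. The sketch below is a hedged illustration, not the procedure of any paper cited here. It applies the standard maximum likelihood estimator for the exponent of a continuous power law with density proportional to $x^{-a}$ for $x \ge x_{\min}$, namely $\hat{a} = 1 + n / \sum_i \ln(x_i / x_{\min})$, to synthetic data. Node degrees are discrete, so treating them as continuous is an approximation. The function names, the cutoff `x_min`, and the sample size are assumptions made for illustration.

```python
import math
import random


def sample_power_law(n, a, x_min, rng):
    """Draw n values from a continuous power law with exponent a > 1
    and lower cutoff x_min, using inverse-transform sampling."""
    return [x_min * (1.0 - rng.random()) ** (-1.0 / (a - 1.0))
            for _ in range(n)]


def fit_power_law_exponent(xs, x_min):
    """Maximum likelihood estimate of the exponent for values >= x_min,
    returned together with its approximate standard error."""
    tail = [x for x in xs if x >= x_min]
    log_sum = sum(math.log(x / x_min) for x in tail)
    if not tail or log_sum == 0.0:
        raise ValueError("need observations strictly above x_min")
    n = len(tail)
    a_hat = 1.0 + n / log_sum
    std_err = (a_hat - 1.0) / math.sqrt(n)
    return a_hat, std_err


if __name__ == "__main__":
    rng = random.Random(74)  # fixed seed, chosen arbitrarily
    data = sample_power_law(20000, 2.5, 1.0, rng)
    a_hat, std_err = fit_power_law_exponent(data, 1.0)
    print(f"estimated exponent: {a_hat:.3f} (std. err. {std_err:.3f})")
```

A precise estimate of the exponent does not by itself show that a power law describes the data. That requires a separate goodness-of-fit assessment and a comparison against alternative distributions, which this sketch does not attempt.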
More recently, a number of authors have looked to combine the stochastic blockmodel ideas from the 1980s with latent space models, model-based clustering [137] or mixed membership models [9], to provide generative models that scale in reasonable ways to substantial-sized networks. The class of mixed membership models resembles a form of soft clustering [95] and includes the latent Dirichlet allocation model [41] from machine learning as a special case. This class of models offers much promise for the kinds of network dynamical processes we discuss here.

1.2 What This Survey Does Not Cover

This survey focuses primarily on statistical network models and their applications. As a consequence there are a number of topics that we touch upon only briefly or essentially not at all, such as:

• Probability theory associated with random graph models. The probabilistic literature on random graph models is now truly extensive and the bulk of the theorems and proofs, while interesting in their own right, is largely unconnected with the present exposition. For excellent introductions to this literature, see Chung and Lu [69] and Durrett [91]. For related results on the mathematics of graph theory, see Bollobás [43].

• Efficient computation on networks. There is a substantial computer science literature dealing with efficient calculation of quantities associated with network structures, such as shortest paths, network diameter, and other measures of connectivity, centrality, clustering, etc. The edited volume by Brandes and Erlebach [48] contains good overviews of a number of these topics as well as other computational issues associated with the study of graphs.

• Use of the network as a tool for sampling. Adaptive sampling strategies modify the sampling probabilities of selection based on observed values in a network structure.
This strategy is beneficial when searching for rare or clustered populations. Thompson and Seber [296] and Thompson [293] discuss adaptive sampling in detail. There is also related work on target sampling [294] and respondent-driven sampling [258; 305].

• Neural networks. Neural networks originated as simple models for connections in the brain but have more recently been used as a computational tool for pattern recognition (e.g., Bishop [38]), machine learning (e.g., Neal [228]), and models of cognition (e.g., Rogers and McClelland [257]).

• Networks and economic theory. A relatively new area of study is the link between network problems, economic theory, and game theory. Some useful entry points to this literature are Even-Dar and Kearns [96], Goyal [131], Kearns et al. [169], and Jackson [160], whose book contains an excellent semi-technical introduction to network concepts and structures.

• Relational networks. This is a very popular area in machine learning. It uses probabilistic graphical models to represent uncertainty in the data. The types of “networks” in this area, such as Bayes nets, dependency diagrams, etc., have a different meaning than the networks we consider in this review. The main difference is that the networks in our work are considered to “be given” or arising directly from properties of the network under study, rather than being representative of the uncertainty of the relationships between nodes and node attributes. There is an extensive literature on relational networks, e.g., see Friedman et al. [112], Getoor et al. [117], Neville and Jensen [229], Neville et al. [230], and Getoor and Taskar [116].

• Bi-partite graphs. These are graphs that represent measurement on two populations of objects, such as individuals and features.
The graphs in this context are seldom the best representation of the data, with the exception perhaps of binary measurements or when the true populations have comparable sizes. Recent work on exchangeable Rasch matrices is related to this topic and potentially relevant for network analysis. Lauritzen [186, 187] and Bassetti et al. [29] suggest applications to bipartite graphs.

• Agent-based modeling. Building on older ideas such as cellular automata, agent-based modeling attempts to simulate the simultaneous operations of multiple agents, in an effort to re-create and predict the actions of complex phenomena. Because the interest is often on the interaction among the agents, this domain of research has been linked with network ideas. With the recent advances in high-performance computing, simulations of large-scale social systems have become an active area of research, e.g., see [46]. In particular, there is a strong interest in areas that revolve around national security and the military, with studies on the effects of catastrophic events and biological warfare, as well as computational explorations of possible recovery strategies [57; 59]. These works are the contemporary counterparts of more classical work at the interface between artificial intelligence and the social sciences [54; 56; 55].

Chapter 2
Motivation and Dataset Examples

2.1 Motivations for Network Analysis

Why do we analyze networks? The motivation behind network analysis is as diverse as the origin of network problems within differing academic fields. Before we delve into details of the “how” of statistical network modeling, we start with some examples of the “why.” This chapter also includes descriptions of popular datasets for interested readers who may wish to exercise their modeling muscles.
Social scientists are often interested in questions of interpretation such as the meanings of edges in a social network [181]. Do they arise out of friendliness, strategic alliance, obligation, or something else? When the meaning of edges is known, the objective is often to characterize the structure of those relations (e.g., whether friendships or strategic alliances are hierarchical or transitive). A large volume of statistically-oriented social science literature is dedicated to modeling the mechanisms and relations of network properties and testing hypotheses about network structure; see, e.g., [280].

Physicists, on the other hand, tend to be interested in understanding parsimonious mechanisms for network formation [28; 235]. For example, a common modeling goal is to explain how a given network comes to have its particular degree distribution or diameter at time $t$.

Several network analysis concepts have found niches in computational biology. For example, work on protein function classification can be thought of as finding hidden groups in the protein-protein interaction network [7; 8] to gain better understanding of underlying biological processes. Label propagation (node similarity) in networks can be harnessed to help with functional gene annotation [226]. Graph alignment can be used to locate subgraphs that are common among species, thus advancing our understanding of evolution [105]. Motif finding, or more generally the search for subgraph patterns, also has many applications [17]. Combining networks from heterogeneous data sources helps to improve the accuracy of predicted genetic interactions [327]. Heterogeneity of network data sources in biology introduces a lot of noise into the global network structure, especially when networks created for different purposes (such as protein co-regulation and gene co-expression) are combined.
Reference [225] addresses network de-noising via degree-based structure priors on graphs. For a review of biological applications of networks, please see [332].

The task of finding hidden groups is also relevant in analyzing communication networks, e.g., in detecting possible latent terrorist cells [30]. The related task of discovering the “roles” of individual nodes is useful for identity disambiguation [36] and for business organization analysis [207]. These applications often take the machine learning approach of graph partitioning, a topic previously known in social science and statistics literature as blockmodeling [199; 89]. A related question is functional clustering, where the goal is not to statistically cluster the network, but to discover members of dynamic communities with similar functions based on existing network connectivity [122; 232; 234; 266].

In the machine learning community, networks are often used to predict missing information, which can be edge related, e.g., predicting missing links in the network [238; 73; 198], or attribute related, e.g., predicting how likely a movie is to be a box office hit [229]. Other applications include locating the crucial missing link in a business or a terrorist network, or calculating the probability that a customer will purchase a new product, given the pattern of purchases of his friends [142]. The latter question can more generally be stated as predicting an individual’s preferences given the preferences of her “friends”. This research direction has evolved into an area of its own under the name of recommender systems, which has recently received a lot of media attention due to the competition held by the largest online movie rental company, Netflix.
The company has awarded a prize of one million dollars to a team of researchers that was able to predict customer ratings of movies with accuracy more than 10% higher than that of the company’s own in-house system [290].

The concept of information propagation also finds many applications in the network domain, such as virus propagation in computer networks [310], HIV infection networks [222; 163; 164], viral marketing [87] and more generally gossiping [170]. Here some work focuses on finding network configurations optimal for routing, while other research assumes that the network structure is given and focuses on suitable models for disease or information spread.

2.2 Sample Datasets

A plethora of datasets are available for network analysis, and more are emerging every year. We provide a quick guided tour of the most popular datasets and applications in each field.

In his ground-breaking paper, Milgram [215] experimented with the construction of interpersonal social networks. His result that the median length of completed chains was approximately 6 led to the pop-culture coining of the phrase “six degrees of separation.” Subjects of subsequent studies ranged from social interactions of monks [259], to hierarchies of elephants [209; 303], to sexual relationships between adults of Colorado [176], to friendships amongst elementary school students [141; 299].

While a lot of biological applications focus on the study of protein-protein interaction networks [114; 115; 184; 248; 328], metabolic networks [158], functional and co-expression gene similarity networks and gene regulatory networks [111; 309], computer science applications revolve around e-mail [207], the internet [97; 63; 151], the web [152; 13], academic paper co-authorship [127] and citation networks [204; 216].
Citation networks have a long history of modeling in different areas of research, starting with the seminal paper of de Solla Price [83] and more recently in physics [190]. With the recent rise of online networks, computer science and social science researchers are also starting to examine blogger networks such as LiveJournal, social networks found on Friendster, Facebook, and Orkut, and dating networks such as Match.com. Terrorist networks (often simulated) and telecommunication networks have come under similar scrutiny, especially since the events of September 11, 2001 (e.g., see [182; 250; 249; 62]). There has also been work on ecological networks such as foodwebs [323; 16], neuronal networks [188], network epidemiology [306], economic trading networks [123], transportation networks (roads, railways, airplanes; e.g., [113]), resource distribution networks, mobile phone networks [92] and many others.

Several network data repositories are available on public websites and as part of packages. For example, UCINet¹ includes a lot of well known smaller scale datasets such as the Davis Southern Club Women dataset [80], Zachary’s karate club dataset [330], and Sampson’s monk data [259] described below. Pajek² contains a larger set of small and large networks from domains such as biology, linguistics, and food-web. Additional datasets in a variety of domains include power grid networks, US politics, cellular and protein networks and others³. A collection of large and very large directed and undirected networks in the areas of communication, citation, internet and others is available as part of the Stanford Network Analysis Package (SNAP)⁴.

We now introduce six examples of networks studied in the literature, describing the data in reasonable detail and including graphs depicting the networks wherever feasible.
For each network example we articulate specific questions of interest.

Footnote 1: http://www.analytictech.com/ucinet/
Footnote 2: http://vlado.fmf.uni-lj.si/pub/networks/data/
Footnote 3: http://www-personal.umich.edu/~mejn/netdata/ http://cdg.columbia.edu/cdg/datasets http://www.nd.edu/~networks/resources.htm
Footnote 4: http://snap.stanford.edu/data/

2.2.1 Sampson's “Monastery” Study

A classic example of a social network is the one derived from the survey administered by Sampson and published in his doctoral dissertation [259]. Figure 2.1 displays the network derived from the “whom do you like” sociometric relations in this dataset.

Figure 2.1: Network derived from “whom do you like” sociometric relations collected by Sampson.

Sampson spent several months in a monastery in New England, where a number of novices were preparing to join a monastic order. Sampson's original analysis was rooted in direct anthropological observations. He strongly suggested the existence of tight factions among the novices: the loyal opposition (whose members joined the monastery first), the young turks (whose members joined later on), the outcasts (who were not accepted in either of the two main factions), and the waverers (who did not take sides). The events that took place during Sampson's stay at the monastery supported his observations. For instance, John and Gregory, two members of the young turks, were expelled over religious differences, and other members resigned shortly after these events.
About a year after leaving the monastery, Sampson surveyed all of the novices and asked them to rank the other novices in terms of four sociometric relations (like/dislike, esteem, personal influence, and alignment with the monastic credo), retrospectively, at four different epochs spanning his stay at the monastery.

The presence of a well-defined social structure within the monastery (the factions) that can be inferred from responses to the survey, as well as the social dynamics of subtle ideological conflicts that led to the dissolution of the monastic order, have much intrigued both statisticians and social scientists for the past four decades. Researchers typically consider the faction labels assigned by Sampson to the novices as the anthropological ground truth in their analysis. For example analyses, we refer to [103; 137; 81; 9].

2.2.2 The Enron Email Corpus

The Enron email corpus has been widely studied in the recent machine learning network literature. Enron Corporation was an energy and trading company specializing in the marketing of electricity and gas. In 2000 it was the seventh largest company in the United States with reported revenues of over $100 billion. On December 2, 2001, Enron filed for bankruptcy. The sudden collapse cast suspicions over its management and prompted federal investigations. Thirty-four Enron officials were prosecuted, and top Enron executives and associates were subsequently found guilty of accounting fraud. During the investigation, the courts subpoenaed extensive email logs from most of Enron's employees, and the Federal Energy Regulatory Commission (FERC) published the database online (footnote 5). Subsequently, researchers in the CALO (Cognitive Assistant that Learns and Organizes) project corrected integrity problems in the dataset (footnote 6). The original FERC dataset contains 619,446 email messages (about 92% of Enron's staff emails), and the cleaned-up CALO dataset contains 200,399 messages from 158 users. Another version of the data consists of the contents of the mail folders of the top 151 executives, containing about 225,000 messages covering a period from 1997 to 2004 (footnote 7). Figure 2.2 and Figure 2.3 give network snapshots of the e-mail traffic among these 151 executives with thresholds of 5 and 30 messages, respectively.

Research activity on the Enron dataset ranges from document classification to social network analysis to visualization. A collection of papers working with the Enron corpus was gathered together in a special 2005 issue of Computational & Mathematical Organization Theory; see [58].

Footnote 5: http://www.ferc.gov/industries/electric/indus-act/wec/enron/info-release.asp
Footnote 6: http://www.cs.cmu.edu/~enron/
Footnote 7: http://www.isi.edu/~adibi/Enron/Enron.htm

Figure 2.2: E-mail exchange data among 151 Enron executives, using a threshold of a minimum of 5 messages for each link. Source: [153].

Figure 2.3: E-mail exchange data among 151 Enron executives, using a threshold of a minimum of 30 messages for each link. Source: [153].

2.2.3 The Protein Interaction Network in Budding Yeast

The budding yeast is a unicellular organism that has become a de facto model organism for the study of molecular and cellular biology [47]. There are about 6,000 proteins in the budding yeast, which interact in a number of ways [64]. For instance, proteins bind together to form protein complexes, the physical units that carry out most functions in the cell [184]. In recent years, a large amount of resources has been directed to collecting experimental evidence of physical protein binding, in an effort to infer and catalogue protein complexes and their multifaceted functional roles [e.g., 98; 159; 300; 114; 143].
Currently, there are four main sources of interactions between pairs of proteins, which target proteins localized in different cellular compartments with variable degrees of success: (i) literature-curated interactions [248], (ii) yeast two-hybrid (Y2H) interaction assays [328], (iii) protein fragment complementation (PCA) interaction assays [291], and (iv) tandem affinity purification (TAP) interaction assays [115; 184]. These collections include a total of about 12,292 protein interactions [162], although the number of such interactions is estimated to be between 18,000 [328] and 30,000 [307]. Figure 2.4 shows a popular image of the interaction network among proteins in the budding yeast, produced as part of an analysis by Barabási and Oltvai [27].

Statistical methods have been developed for analyzing many aspects of this large protein interaction network, including de-noising [32; 8], function prediction [227], and identification of binding motifs [23].

2.2.4 The Add Health Adolescent Relationship and HIV Transmission Study

The National Longitudinal Study of Adolescent Health (Add Health) is a study of adolescents in the United States drawn from a representative sample of middle, junior high, and high schools. The study focused on patterns of friendship, sexual relationships, and disease transmission. To date, four waves of surveys have been collected over the course of fifteen years.

Wave I surveys occurred between 1994 and 1995 and included 90,118 students from 145 schools across the country. Each student completed an in-school questionnaire on his or her family background, school life and activities, friendships, and health status. Administrators from participating schools also completed questionnaires about student demography and school curriculum and services.
In addition, 20,745 students were chosen for an in-home interview that included more sensitive topics such as sexual behavior. For 16 selected schools (two large and fourteen small), Add Health attempted to administer the in-home survey to all enrolled students. This saturated sample distinguishes itself from the ego-centric and snowball samples collected in past studies; it allows for the construction of relationship networks with more accurate global characteristics. The fully observed friendship networks in all the schools are also a valuable resource and an important contribution of this work.

Wave II data collection occurred in 1996, 18 months after Wave I, and followed up on the in-home interviews. The dataset covered 14,738 adolescents and 128 school administrators. Based on the data collected from Waves I and II, Bearman et al. [31] constructed the timed sequence of relationship networks amongst students from the two large schools with saturated sampling. The resulting sexual relationship network bears strong resemblance to a spanning tree as opposed to the previously hypothesized core or inverse-core structures (footnote 8; see Figure 2.5).

Footnote 8: A core is a group of inter-connected individuals who sit at the center of the graph and interact with individuals on the periphery. An inverse core is a group of central individuals who are connected to those on the periphery but not to each other.

Figure 2.4: A popular image of the protein interaction network in Saccharomyces cerevisiae, also known as the budding yeast. The figure is reproduced with permission. Source: [27].

Figure 2.5: The Add Health sexual relationships network of US high school adolescents. This figure is reproduced with permission. Source: Bearman et al. [31].

Wave III interviews were conducted in 2001 and 2002 with topics including marriage, childbearing, and sexually transmitted diseases.
Of the original Wave I in-home respondents, 15,170 were interviewed again for Wave III. Of these, 13,184 participants provided oral fluid specimens for HIV testing. Morris et al. [223] studied the prevalence of HIV infections among young adults based on data collected in Wave III.

Wave IV interviews were conducted in 2007 and 2008 with the original Wave I respondents, who are now dispersed across the nation in all 50 states. Of the original respondents, 92.5% were located and 80.3% were interviewed. The interview included a comprehensive survey of the social, emotional, spiritual, and physical aspects of health. Physical measurements, biospecimens, and geographical data were also collected.

For detailed information about the data, as well as access to the public-domain and restricted-access datasets, see http://www.cpc.unc.edu/projects/addhealth.

2.2.5 The Framingham “Obesity” Study

One of the most famous and important epidemiological studies was initiated in Framingham, Massachusetts, a suburb of Boston, in 1948 with an originally enrolled cohort of 5,209 people. In 1971 investigators initiated an “offspring” cohort study which enrolled most of the children of the original cohort and their spouses. Participants completed a questionnaire and underwent physical examinations (including measurements of height and weight) in three-year periods beginning 1973, 1981, 1985, 1989, 1992, 1997, 1999. Christakis and Fowler [65] derived body mass index information on a total of 12,067 individuals who appeared in any of the Framingham Heart cohorts (one “close friend” for each cohort member) (footnote 9). There were 38,611 observed family and social ties (edges) to the core 5,124 cohort members. Through a series of network snapshots and statistical analyses, Christakis and Fowler described the evolution of the “clustering” of obesity in this social network.
In particular, they claim to have examined whether the data conformed to “small-world,” “scale-free,” and “hierarchical” types of random graph network models. Figure 2.6 depicts data on the largest connected subcomponent (the so-called giant component) for the network in 2000, which consists of 2,200 individuals. Other analyses in their paper explore attributions of the individuals via longitudinal logistic-regression models with lagged effects. Subsequently, they have published similar papers focused on the dynamics of smoking behavior over time [66] and on happiness [67], both using the structure of the Framingham “offspring” cohort.

This work has come under criticism by others. For example, Cohen-Cole and Fletcher note that there are plausible alternative explanations to the network structure based on contextual factors [77], and in a separate paper demonstrate that the same methodology detects “implausible” social network effects for such medical conditions as acne and headaches as well as for physical height [78]. The authors' answer to these criticisms can be found in [108]. The question of the magnitude and significance of social network effects is still the subject of an ongoing debate.

2.2.6 The NIPS Paper Co-Authorship Dataset

The NIPS dataset contains information on publications that appeared in the Neural Information Processing Systems (NIPS) conference proceedings, volumes 1 through 12, corresponding to years 1987-1999, the pre-electronic submission era. The original collection contained scanned full papers made available by Yann LeCun. Sam Roweis subsequently processed the data to glean information such as title, authorship information, and word counts per document. In total, there are 2,037 authors and 1,740 papers with an average of 2.29 authors per paper and 1.96 papers per author.
The NIPS database is available from Sam Roweis' website (footnote 10) in raw and MATLAB formats along with a detailed description and information on its construction. Various authors have used the NIPS data to analyze author-to-author connectivity in static [126] as well as dynamic settings [264]. Li and McCallum [197] modeled the text of the documents and Sarkar et al. [265] analyzed the two-mode network (author-word-author) in a dynamic context. In Figure 2.7 we reproduce a graphic illustration of the inferred dynamic evolution of the network from [263].

Footnote 9: A body-mass index value (weight in kilograms divided by the square of the height in meters) of 30 or more was taken to indicate obesity.
Footnote 10: http://www.cs.toronto.edu/~roweis/data.html

Figure 2.6: Obesity network from Framingham offspring cohort data. Each node represents one person in the dataset (a total of 2,200 in this picture). Circles with red borders denote women, and circles with blue borders denote men. The size of each circle is proportional to the body-mass index. The color inside the circle denotes obesity status: yellow is obese (body-mass index ≥ 30), green is non-obese. The colors of ties between nodes indicate relationships: purple denotes a friendship or marital tie and orange is a familial tie. This figure is reproduced with permission. Source: [65].

Figure 2.7: NIPS paper co-authorship data. Each point represents an author. Two authors are linked by an edge if they have co-authored at least one paper at NIPS. Left: 1991-1994. Right: 1995-1998. Each graph contains all the links for the selected period. Several well-known people in the Machine Learning field are highlighted. The size of the circle around a selected individual depends on that individual's number of collaborations. Colors are meant to facilitate visualization. This figure is reproduced with permission. Source: [263].
Chapter 3 Static Network Models

A number of basic network models are essentially static in nature. The statistical activities associated with them focus on certain local and global network statistics and the extent to which they capture the main elements of actual realized networks. In this chapter, we briefly summarize two lines of research. The first originates in the mathematics community with the Erdős-Rényi-Gilbert model and led to two types of generalizations: (i) the “statistical physics” generalizations that led to power laws for degree distributions (the so-called scale-free graphs), and (ii) the exchangeable graph models that introduce weak dependences among the edges in a controlled fashion, which ultimately lead to a range of more structured connectivity patterns and enable model comparison strategies rooted in information theory. A second line of research originated in the statistics and social sciences communities in response to a need for models of social networks. The $p_1$ model of Holland and Leinhardt, which in some sense generalizes the Erdős-Rényi-Gilbert model, and the more general descriptive family of exponential random graph models effectively initiate this line of modeling. Some of these models also have a generative interpretation that allows us to think about their use in a dynamic, evolutionary setting. We define and discuss popular dynamic interpretations of the data generating process, including the generative interpretation, in chapter 4.

3.1 Basic Notation and Terminology

In theoretical computer science, a graph or network $G$ is often defined in terms of nodes and edges, $G \equiv G(\mathcal{N}, \mathcal{E})$, where $\mathcal{N}$ is a set of nodes and $\mathcal{E}$ a set of edges, and $N = |\mathcal{N}|$, $E = |\mathcal{E}|$. In the statistical literature, $G$ is often defined in terms of the nodes and the corresponding measurements on pairs of nodes, $G \equiv G(\mathcal{N}, Y)$.
$Y$ is usually represented as a square matrix of size $N \times N$. For instance, $Y$ may be represented as an adjacency matrix with binary elements in a setting where we are only concerned with encoding the presence or absence of edges between pairs of nodes. For undirected relations the adjacency matrix is symmetric. Henceforth we will work with graphs mostly defined in terms of their set of $N$ nodes and their binary adjacency matrix $Y$ containing $\sum_{ij} Y_{ij} = E$ directed edges.

Nodes in the network may represent individuals, organizations, or some other kind of unit of study. Edges correspond to types of links, relationships, or interactions between the units, and they may be directed, as in the Holland-Leinhardt model, or undirected, as in the Erdős-Rényi-Gilbert model.

A note about terminology: in computer science, graphs contain nodes and edges; in social sciences, the corresponding terminology is usually actors and ties. We largely follow the computer science terminology in this review.

3.2 The Erdős-Rényi-Gilbert Random Graph Model

The mathematical biology literature of the 1950s contains a number of papers using what we now know as the network model $G(N, p)$, which for a network of $N$ nodes sets the probability of an edge between each pair of nodes equal to $p$, independently of the other edges; e.g., see Solomonoff and Rapoport [281], who discuss this model as a description of a neural network. But the formal properties of simple random graph network models are usually traced back to Gilbert [119], who examined $G(N, p)$, and to Erdős and Rényi [93]. The Erdős-Rényi-Gilbert random graph model, $G(N, E)$, describes an undirected graph involving $N$ nodes and a fixed number of edges, $E$, chosen randomly from the $\binom{N}{2}$ possible edges in the graph; an equivalent interpretation is that all $\binom{\binom{N}{2}}{E}$ graphs are equally likely (footnote 1).

Footnote 1: Both versions are often referred to as Erdős-Rényi models in the current literature.

The $G(N, p)$ model has a binomial likelihood where the probability of $E$ edges is

\[ \ell(G(N, p) \text{ has } E \text{ edges} \mid p) = p^{E} (1 - p)^{\binom{N}{2} - E}, \]

or, equivalently, in terms of the $N \times N$ binary adjacency matrix $Y$,

\[ \ell(Y \mid p) = \prod_{i \neq j} p^{Y_{ij}} (1 - p)^{1 - Y_{ij}}. \]

The likelihood of the $G(N, E)$ model is a hypergeometric distribution and this induces a uniform distribution over the sample space of possible graphs. The $G(N, p)$ model specifies the probability of every edge, $p$, and controls the expected number of edges, $p \cdot \binom{N}{2}$. The $G(N, E)$ model specifies the number of edges, $E$, and implies the expected “marginal” probability of every edge, $E / \binom{N}{2}$.

The $G(N, p)$ model is more commonly found in modern literature on random graph theory, in part because the independence of edges simplifies analysis [see, e.g., 69; 91]. Erdős and Rényi [94] went on to describe in detail the behavior of $G(N, E)$ as $p = E / \binom{N}{2}$ increases from 0 to 1. In the binomial version the key to asymptotic behavior is the value of $\lambda = pN$. One of the important Erdős-Rényi results is that there is a phase change at $\lambda = 1$, where a giant connected component emerges while the other components remain relatively small and mostly in the form of trees [see 69; 91]. More formally,

P1. If $\lambda < 1$, then a graph in $G(N, p)$ will have no connected components of size larger than $O(\log N)$, a.s. as $N \to \infty$.

P2. If $\lambda = 1$, then a graph in $G(N, p)$ will have a largest component whose size is of $O(N^{2/3})$, a.s. as $N \to \infty$.

P3. If $\lambda$ tends to a constant $c > 1$, then a graph in $G(N, p)$ will have a unique “giant” component containing a positive fraction of the nodes, a.s. as $N \to \infty$. No other component will contain more than $O(\log N)$ nodes, a.s. as $N \to \infty$.
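The phase change described in P1 through P3 can be explored numerically. The following is a minimal sketch, not part of the original survey: it samples undirected $G(N, p)$ graphs, evaluates the log of $p^{E}(1-p)^{\binom{N}{2}-E}$ for the sampled graph, and reports the size of the largest connected component for $\lambda = pN$ below, at, and above 1. The function names, the union-find implementation, and the choice $N = 1000$ are illustrative assumptions.

```python
import math
import random
from itertools import combinations


def sample_gnp(n, p, rng):
    """Sample an undirected G(N, p) graph as a set of edges (i, j) with i < j."""
    return {(i, j) for i, j in combinations(range(n), 2) if rng.random() < p}


def gnp_log_likelihood(n, num_edges, p):
    """Log of p^E (1 - p)^(C(N, 2) - E); requires 0 < p < 1."""
    pairs = math.comb(n, 2)
    return num_edges * math.log(p) + (pairs - num_edges) * math.log(1.0 - p)


def largest_component_size(n, edges):
    """Size of the largest connected component, computed with union-find."""
    parent = list(range(n))

    def find(x):
        # Follow parent pointers to the root, halving the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj

    sizes = {}
    for v in range(n):
        root = find(v)
        sizes[root] = sizes.get(root, 0) + 1
    return max(sizes.values())


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs agree
    n = 1000                # illustrative network size (an assumption)
    for lam in (0.5, 1.0, 2.0):
        p = lam / n
        edges = sample_gnp(n, p, rng)
        print(
            f"lambda={lam}: E={len(edges)}, "
            f"log-lik={gnp_log_likelihood(n, len(edges), p):.1f}, "
            f"largest component={largest_component_size(n, edges)}"
        )
```

For a finite $N$ the transition is blurred rather than sharp, so a single run only suggests the asymptotic statements; repeating the experiment over several seeds and larger $N$ gives a clearer picture.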
A summary of a proof using branching processes is given in the appendix of this chapter. Some of the proof concepts will be useful for discussion of exchangeable graph models in section 3.3.

The Erdős-Rényi-Gilbert model has spawned an enormous number of mathematical papers that study and generalize it, e.g., see [43]. But few of them are especially relevant for the actual statistical analysis of network data. In essence, the model dictates that every node in a graph has approximately the same number of neighbors. Empirically there are few observed networks with such simple structure, but we still need formal tools for deciding on how poor a fit the model provides for a given observed network, and what kinds of generalized network models appear to be more appropriate. This has led to two separate literatures, one of which has focused on formal statistical properties associated with estimating parameters of network models (the $p_1$ and exponential random graph models described below) and a second that identifies selected predicted features of models and empirically checks observed networks for those features. The latter is largely associated with papers emanating from statistical physics and computer science, several of which are described in detail in chapter 4.

3.3 The Exchangeable Graph Model

The exchangeable graph model provides the simplest possible extension of the original random graph model by introducing a weak form of dependence among the probabilities of sampling edges (i.e., exchangeability) that is due to non-observable node attributes, in the form of node-specific binary strings. This extension helps focus the analysis, whether empirical or theoretical, on the interplay between connectivity of a graph and its node-specific sources of variability [1; 5].
Consider the following data generating process for an exchangeable graph model, which generates binary observations on pairs of nodes.

1. Sample node-specific $K$-bit binary strings for each node $n \in \mathcal{N}$:
   $\vec{b}_n \sim \mathrm{unif}(\text{vertex set of the } K\text{-hypercube})$.
2. Sample directed edges for all node pairs $(n, m) \in \mathcal{N} \times \mathcal{N}$:
   $Y_{nm} \sim \mathrm{Bern}\big(q(\vec{b}_n, \vec{b}_m)\big)$,

where $\vec{b}_{1:N}$ are $K$-bit binary strings (footnote 2), and $q$ maps pairs of binary strings into the $[0, 1]$ interval. This generation process induces weakly dependent edges. The edges are conditionally independent given the binary string representations of the incident nodes. They are exchangeable in the sense of De Finetti [82].

Footnote 2: Note that the space of $K$-bit binary strings can be mapped one-to-one to the vertex set of the $K$-hypercube, i.e., the unit hypercube in $K$ dimensions.

From a statistical perspective, the exchangeable graph model we survey here [1; 5] provides perhaps the simplest step-up in complexity from the random graph model [93; 119]. In the data generation process, the bit strings are equally probable but the induced probabilities of observing edges are different. A class of random graphs with such a property has been recently rediscovered and further explored in the mathematics literature, where the class of such graphs is referred to as inhomogeneous random graphs [45].

An alternative and arguably more interesting set of specifications can be obtained by imposing dependence among the bits at each node. This can be accomplished by sampling sets of dependent probabilities from a family of distributions on the unit hypercube, $\vec{p}_n \in [0, 1]^K$, and then sampling the bits independently given these dependent probabilities.

1. Sample node-specific $K$-bit binary strings for each node $n \in \mathcal{N}$:
   $\vec{p}_n \sim \mathrm{hypercube}(\vec{\mu}, \sigma, \alpha)$, where $\sigma > (K - 1) \cdot \alpha > 0$,
   $b_{nk} \sim \mathrm{Bern}(p_{nk})$, for $k = 1, \ldots, K$.
2. Sample directed edges for all node pairs $(n, m) \in \mathcal{N} \times \mathcal{N}$:
   $Y_{nm} \sim \mathrm{Bern}\big(q(\vec{b}_n, \vec{b}_m)\big)$.

In the hypercube distribution (footnote 3), $\vec{\mu}, \sigma, \alpha$ control the frequency, variability and correlation of the bits within a string, respectively; and $q$ maps pairs of binary strings into the unit interval.

Footnote 3: The hypercube distribution can be obtained using a hierarchical construction as follows. Sample $\vec{u} \sim \mathrm{Normal}_k(\vec{\mu}, \Sigma)$, where $\vec{u} \in \mathbb{R}^k$ and $\Sigma_{ii} = \alpha$, $\Sigma_{ij} = \beta$ for $i \neq j$. Then define $p_i = (1 + e^{-u_i})^{-1}$ for $i = 1, \ldots, k$. The resulting density for $\vec{p}$, where $\vec{p} \in [0, 1]^k$, is
\[ f_P(\vec{p} \mid \vec{\mu}, \alpha, \beta) = \frac{|2\pi\Sigma|^{-1/2}}{\prod_{j=1}^{k} p_j (1 - p_j)} \exp\left\{ -\frac{1}{2} \left( \log\frac{\vec{p}}{1 - \vec{p}} - \vec{\mu} \right)' \Sigma^{-1} \left( \log\frac{\vec{p}}{1 - \vec{p}} - \vec{\mu} \right) \right\}. \]
For more details see [4].

In the exchangeable graph model, the number of bits, $K$, captures the complexity of the graph. For instance, for $K < N$ the model provides a compression of the graph. For directed graphs the function $q$ is asymmetric in its arguments. The sparsity of the bit strings is controlled by the parameter $\alpha > 0$. A larger value of $\alpha$ leads to larger negative correlation among the bits and thereby a sparser network. In such an exchangeable graph model there are two main sources of variability: (i) the probability of an edge decreases with the number of bits $K$, as more complexity reduces the chances of an edge, and (ii) the probability of an edge increases with $1/\alpha$, as concentrating density in the corners of the unit $K$-hypercube improves the chances of an edge. While this model does not quite fit the definition of the inhomogeneous models of Bollobás et al. [45], it is tractable enough to allow the analysis of the giant component in $(K, \alpha)$ space, by leveraging the branching process strategy developed by Durrett [91] (see the appendix at the end of the chapter). As in Durrett's analysis, the giant component emerges because a number of smaller components must intersect with high probability. In exchangeable graph models, however, the giant component has a peculiar structure; connected components are themselves connected to form the giant component as soon as bit strings that match on two bits appear with high probability. Figure 3.1 provides a graphical illustration of this intuition. Nodes that bridge two connected components are evident in the left panel. Note that there are no nodes that bridge three components, as bit strings that match on three bits are an unlikely event in a graph with 100 nodes.

Figure 3.1: Left panel. An example adjacency matrix that corresponds to a fully connected component among 100 nodes. Right panel. The clustering coefficient as a function of $\alpha$ on a sequence of graphs with 100 nodes. Here $\sigma = 12$, and $\log(\mu_i) = 1/K$ for every $i = 1, \ldots, K$.

Given a graph, we can infer the corresponding set of binary strings from data. The likelihood that corresponds to an exchangeable graph model is simple to write,

\[ \ell(Y \mid \theta) = \int d\vec{b}_{1:N} \left( \prod_{n,m} \Pr(Y_{nm} \mid \vec{b}_n, \vec{b}_m, q) \prod_{n} \Pr(\vec{b}_n \mid \theta) \right), \]

where $\theta = (\vec{\mu}, \sigma, \alpha)$ or an appropriate set of parameters. We can apply standard inference techniques [2; 9].

Fitting an exchangeable graph model allows us to assess the complexity of an observed graph, leveraging notions from information theory. For instance, we can use the minimum description length (MDL) principle to decide how many bits we need to explain the observed connectivity patterns with high probability. We can also quantify how much information is retained at different bit-lengths, and plot the corresponding information profile for $K < N$ and an entropy histogram for any given value of $K$.

The exchangeable graph model allows for algorithmic comparison of any set of statistical models that are proposed to summarize an observed graph.
As an illustration, consider an observed graph $G$ and two alternative models $A$ and $B$. Rather than comparing how well models $A$ and $B$ recover the degree distribution of $G$ or other graph statistics, and independently of whether it makes sense to directly compare the two likelihoods of $A$ and $B$ (in fact, these models need not have a likelihood), we can proceed as follows.

1. Given a graph $G$, fit models $A(\Theta^{a})$ and $B(\Theta^{b})$ to obtain estimates of their parameters, $\Theta^{a}_{\mathrm{Est}}$ and $\Theta^{b}_{\mathrm{Est}}$, respectively.
2. Sample $M$ graphs at random from the support of each of $A(\Theta^{a}_{\mathrm{Est}})$ and $B(\Theta^{b}_{\mathrm{Est}})$.
3. Compute the distributions of summary statistics based on notions from information theory, such as the information profile and the entropy histogram, corresponding to the $2M$ graphs sampled from $A$ and $B$.
4. Compare the models in terms of the distributions of the statistics above, such as the complexity of the two models' supports and their similarity to the complexity of $G$.

The exchangeable graph model also allows for evaluation of the distribution of the number of bit strings with $I$ matching bits, for any integer $I < K$. In theory this distribution leads to expectations on the number of nodes that bridge $I$ communities, where the members of each community have only one out of $I$ matching bits. In practice, we may want to specify $K$ in advance so that each bit corresponds to a well-defined property. For instance, in applications to biology, nodes may correspond to proteins and the $K$ bits encode presence or absence of specific protein domains. The distribution on the number of $I$ matchings leads to p-values that summarize how unexpected it is to observe binding events among a set of proteins that share a certain combination of domains.
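The first data generating process of this section (uniform bit strings followed by Bernoulli edges) can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' implementation: the survey leaves the link function $q$ unspecified, so `q_shared_bits` below is a hypothetical choice, and the sketch skips self-pairs $n = m$, which the text does not explicitly exclude.

```python
import random


def sample_bit_strings(n_nodes, k_bits, rng):
    """Step 1: one K-bit string per node, uniform over the K-hypercube vertices."""
    return [tuple(rng.randint(0, 1) for _ in range(k_bits)) for _ in range(n_nodes)]


def q_shared_bits(b_n, b_m):
    """Hypothetical link function q: fraction of positions where both strings hold a 1."""
    shared = sum(x & y for x, y in zip(b_n, b_m))
    return shared / len(b_n)


def sample_exchangeable_graph(n_nodes, k_bits, q, rng):
    """Step 2: draw Y[n][m] ~ Bernoulli(q(b_n, b_m)) for every ordered pair n != m."""
    bits = sample_bit_strings(n_nodes, k_bits, rng)
    y = [[0] * n_nodes for _ in range(n_nodes)]
    for n in range(n_nodes):
        for m in range(n_nodes):
            # Self-pairs are skipped here; this is an assumption of the sketch.
            if n != m and rng.random() < q(bits[n], bits[m]):
                y[n][m] = 1
    return bits, y


if __name__ == "__main__":
    rng = random.Random(0)
    bits, y = sample_exchangeable_graph(n_nodes=8, k_bits=4, q=q_shared_bits, rng=rng)
    for b, row in zip(bits, y):
        print(b, row)
```

Given the sampled bit strings, each entry of `y` is drawn independently, which mirrors the conditional independence of edges noted in the text. Because `q_shared_bits` is symmetric in its arguments, both directions of a dyad share the same edge probability; an asymmetric $q$ would be needed to model direction-specific effects.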
Overall, the exchangeable graph model introduces weak dependences among the edges of a random graph in a controlled fashion, which ultimately lead to a range of more structured connectivity patterns and enable model comparison strategies rooted in notions from information theory. The focus here is not on modeling per se. In fact, the model is kept as simple as possible. Rather, the focus is on modeling as a means to establish a technical link between graph connectivity and node attributes. This technical link is useful to address some of the issues listed in Chapter 5. For more details see [5].

There exist other complex graph models in the network analysis literature that induce exchangeable or partially exchangeable edges. We will discuss latent space models [146; 137] and stochastic blockmodels [236; 7; 9] as examples. These models can all be traced back to an original analysis of multivariate sociometric relations (measurements of relations represented as vectors rather than scalars) that was developed a few decades ago [103]. The difference between these models and the exchangeable graph model lies in the interpretation of the latent variables and in the goal of the analysis. Latent space models interpret the latent variables as latent positions in a social space, and blockmodels interpret the latent membership vectors in terms of functional association or community membership. In the exchangeable graph model, the latent binary strings do not carry semantic meaning; rather, they are mathematical artifacts that help to represent a graph and induce an expressive parametric family of distributions [15; 165; 5]. Most importantly, the exchangeable graph model is meant to be a tool to represent and explore the space of connectivity patterns in a smooth, principled semi-parametric fashion.
In this regard, exchangeable graph models differ substantially from latent space models or stochastic blockmodels.

3.4 The $p_1$ Model for Social Networks

A conceptually separate thread of research developed in parallel in the statistics and social sciences literature, starting with the introduction of the $p_1$ model. Consider a directed graph on the set of $n$ nodes. Holland and Leinhardt's $p_1$ model focuses on dyadic pairings and keeps track of whether node $i$ links to $j$, $j$ to $i$, neither, or both. It contains the following parameters:

• $\theta$: a base rate for edge propagation,
• $\alpha_i$ (expansiveness): the effect of an outgoing edge from $i$,
• $\beta_j$ (popularity): the effect of an incoming edge into $j$,
• $\rho_{ij}$ (reciprocation/mutuality): the added effect of reciprocated edges.

Let $P_{ij}(0,0)$ be the probability for the absence of an edge between $i$ and $j$, $P_{ij}(1,0)$ the probability of $i$ linking to $j$ ("1" indicates the outgoing node of the edge), $P_{ij}(0,1)$ the probability of $j$ linking to $i$, and $P_{ij}(1,1)$ the probability of $i$ linking to $j$ and $j$ linking to $i$. The $p_1$ model posits the following probabilities (see [149]):

\begin{align}
\log P_{ij}(0,0) &= \lambda_{ij}, \tag{3.1} \\
\log P_{ij}(1,0) &= \lambda_{ij} + \alpha_i + \beta_j + \theta, \tag{3.2} \\
\log P_{ij}(0,1) &= \lambda_{ij} + \alpha_j + \beta_i + \theta, \tag{3.3} \\
\log P_{ij}(1,1) &= \lambda_{ij} + \alpha_i + \beta_j + \alpha_j + \beta_i + 2\theta + \rho_{ij}. \tag{3.4}
\end{align}

In this representation of $p_1$, $\lambda_{ij}$ is a normalizing constant that ensures that the probabilities for each dyad $(i, j)$ add to 1. For our present purposes, assume that the dyad is in one and only one of the four possible states. The reciprocation effect, $\rho_{ij}$, implies that the odds of observing a mutual dyad, with an edge from node $i$ to node $j$ and one from $j$ to $i$, are enhanced by a factor of $\exp(\rho_{ij})$ over and above what we would expect if the edges occurred independently of one another. The problem with this general $p_1$ representation is that there is a lack of identification of the reciprocation parameters.
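To make equations (3.1) to (3.4) concrete, the following minimal sketch computes the four dyad probabilities for a single pair $(i, j)$, solving for $\lambda_{ij}$ so that they add to 1. The function name and argument layout are illustrative assumptions, not part of the original formulation.

```python
import math


def p1_dyad_probs(theta, alpha_i, beta_i, alpha_j, beta_j, rho_ij):
    """Return {(y_ij, y_ji): probability} for one dyad under the p1 model."""
    # Right-hand sides of (3.1)-(3.4) with the common term lambda_ij left out.
    log_w = {
        (0, 0): 0.0,
        (1, 0): alpha_i + beta_j + theta,
        (0, 1): alpha_j + beta_i + theta,
        (1, 1): alpha_i + beta_j + alpha_j + beta_i + 2 * theta + rho_ij,
    }
    # lambda_ij is fixed by the requirement that the four probabilities sum to 1.
    lam = -math.log(sum(math.exp(v) for v in log_w.values()))
    return {state: math.exp(lam + v) for state, v in log_w.items()}
```

With these probabilities, the cross-product ratio $P_{ij}(1,1)\,P_{ij}(0,0) / \bigl(P_{ij}(1,0)\,P_{ij}(0,1)\bigr)$ equals $\exp(\rho_{ij})$, which is the sense in which reciprocation enhances the odds of a mutual dyad.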
The following special cases of $p_1$ are identifiable and of special interest:

1. $\alpha_i = 0$, $\beta_j = 0$, and $\rho_{ij} = 0$. This is basically an Erdős-Rényi-Gilbert model for directed graphs: each directed edge has the same probability of appearance.
2. $\rho_{ij} = 0$, no reciprocal effect. This model effectively focuses solely on the degree distributions into and out of nodes.
3. $\rho_{ij} = \rho$, constant reciprocation. This was the version of $p_1$ studied in depth by Holland and Leinhardt using maximum likelihood estimation.
4. $\rho_{ij} = \rho + \rho_i + \rho_j$, edge-dependent reciprocation. Fienberg and Wasserman [101, 102] described this model and how to find maximum likelihood estimates for the parameters.

In the constant reciprocation setting, the elevated probability of reciprocal edges does not depend on the dyad, whereas edge-dependent reciprocation dictates multiplicative increases of the reciprocation probability based on node-specific parameters.

The likelihood function for the $p_1$ model is clearly in exponential family form. For the constant reciprocation version, we have

\[
\log \Pr\nolimits_{p_1}(y) \propto y_{++}\,\theta + \sum_i y_{i+}\,\alpha_i + \sum_j y_{+j}\,\beta_j + \sum_{ij} y_{ij}\, y_{ji}\, \rho, \tag{3.5}
\]

where a "+" denotes summing over the corresponding subscript. The minimal sufficient statistics (MSSs) are $y_{i+}$, $y_{+j}$, and $\sum_{ij} y_{ij} y_{ji}$. Using the usual exponential family theory, we know that the likelihood equations are found by setting the MSSs equal to their expectations (cf. [308]). Holland and Leinhardt gave an explicit iterative algorithm for solving these equations with the added constraints that the probabilities for each dyad add to 1.

A major problem with the $p_1$ and related models, recognized by Holland and Leinhardt, is the lack of standard asymptotics to assist in the development of goodness-of-fit procedures for the model.
Since the number of $\{\alpha_i\}$ and $\{\beta_j\}$ increases directly with the number of nodes, we have no consistency results for the maximum likelihood estimates, and no simple way to test for $\rho = 0$, for example. A few ad hoc fixes have been suggested in the literature, the most direct of which deals with the problem by setting subsets of the $\{\alpha_i\}$ and $\{\beta_j\}$ equal to one another (see the discussion of blockmodels below) or by considering them as arising from common prior distributions (see, e.g., [311]). Fienberg et al. [104] recently suggested the use of tools from algebraic statistics to find Markov basis generators for the model and the conditional distribution of the data given the MSSs.

Fienberg and Wasserman proposed a slightly different dyad-based data representation for the $p_1$ model. Conceptually, the dyad considers the two directed measurements together: $\{D_{ij} = (y_{ij}, y_{ji})\}$. In their work, they define

\[
x_{ijkl} =
\begin{cases}
1 & \text{if } D_{ij} = (y_{ij}, y_{ji}) = (k, l), \\
0 & \text{otherwise},
\end{cases}
\]

where $k$ and $l$ take the values of 1 or 0. This representation converts the dyad $\{D_{ij} = (y_{ij}, y_{ji})\}$ into a $2 \times 2$ table with exactly one entry of 1 and the rest 0. Now if we collect the data for the $n(n-1)/2$ dyads together, they form an $n \times n \times 2 \times 2$ incomplete contingency table with "structural" zeros down the diagonal of the $n \times n$ marginal (i.e., no self loops), and "duplicate" data for each dyad above and below the diagonal. In this redundant 4-way table, the model of no second-order interaction corresponds to $p_1$ with constant reciprocation, and the standard iterative proportional fitting (IPF) algorithm can be used to compute the maximum likelihood estimates (for details on IPF for contingency tables, see [39; 99]). Fienberg et al.
[103] show that the same type of contingency table representation also works for the correlated $p_1$ model for multiple relations, and Meyer [213] provides a technical statistical rationale for these contingency table representations.

Holland and Leinhardt analyzed Sampson's monk dataset (cf. subsection 2.2.1 and [259]) using the $p_1$ model. Fienberg et al. [103] analyzed an 8-relation version of the Sampson data (4 positive and 4 negative) using their multiple-relation generalizations of $p_1$, but focusing on an aggregation of the 18 monks into the three blocks identified in [322]: a top-esteemed block of 7 monks with an unambivalently positive attitude towards itself, in conflict with a more ambivalent block of 7, and a block of 4 outcasts and waverers.

3.5 $p_2$ Models for Social Networks and Their Bayesian Relatives

In the statistical literature, the notion of fixed effects typically refers to a set of unknown constant quantities, each of which is used to partly explain the variability of the observations corresponding to a unit of analysis, e.g., an individual or a pair of individuals. This contrasts with the notion of random effects, which refers to a set of unknown variable quantities that serve a similar purpose and are drawn from the same underlying distribution.

The $p_1$ model treats expansiveness, $\{\alpha_i\}$, and popularity, $\{\beta_j\}$, as fixed effects associated with unique nodes in the network. Often it makes more sense to think about the ensemble of expansiveness and/or popularity effects as a sample drawn from some underlying distribution, and then estimate the parameters of that distribution. This type of random effects network model has been developed in a series of papers by Snijders and his collaborators, who refer to it as the $p_2$ network model; see, e.g., van Duijn et al. [301].
It is reasonably straightforward to take any of the multivariate variations on $p_1$ and generate a family of multi-level models with mixtures of fixed and random effects in the spirit of $p_2$; see, e.g., Zijlstra et al. [333].

Bayesian extensions of frequentist approaches often involve positing a statistical model for fixed effects, thus converting them into random effects. The principal distinction between the $p_2$ models and Bayesian extensions of $p_1$ is that, in the latter, the other unknown constant quantities, $\lambda$, $\theta$, $\rho$, may also be converted into random effects. Furthermore, there may be additional levels to the multilevel hierarchy in these models, and there are prior distributions on the parameters at the highest level of the hierarchy (cf. Gill and Swartz [121]; Wang and Wong [311]).

It should come as no surprise that authors using the Bayesian approach have worked with Markov chain Monte Carlo (MCMC) methods, as have those using versions of $p_2$. MCMC implementations of $p_2$ models in STOCNET⁵ are well-suited for networks with a relatively large number of nodes; e.g., Zijlstra et al. [333] study network data from 20 Dutch high schools with a total of 1,232 pupils.

⁵ STOCNET is a freestanding software package for the statistical analysis of social networks, available at http://stat.gamma.rug.nl/stocnet/.
3.6 Exponential Random Graph Models

Under the assumption that two possible edges are dependent only if they share a common node (this is the definition of the Markov property for spatial processes on a lattice in [33]), Frank and Strauss [110] proved the following characterization for the probability distribution of undirected Markov graphs:

\[
\Pr\nolimits_{\theta}\{Y = y\} = \exp\Bigl\{ \sum_{k=1}^{n-1} \theta_k\, S_k(y) + \tau\, T(y) + \psi(\theta, \tau) \Bigr\}, \qquad y \in \mathcal{Y}, \tag{3.6}
\]

where $\theta := \{\theta_k\}$ and $\tau$ are parameters, $\psi(\theta, \tau)$ is the normalizing constant, and the statistics $S_k$ and $T$ are counts of specific structures such as edges, triangles, and $k$-stars:

• number of edges: $S_1(y) = \sum_{1 \le i \le j \le n} y_{ij}$,
• number of $k$-stars ($k \ge 2$): $S_k(y) = \sum_{1 \le i \le n} \binom{y_{i+}}{k}$,
• number of triangles: $T(y) = \sum_{1 \le i \le j \le h \le n} y_{ij}\, y_{ih}\, y_{jh}$.

Note that there is a dependence structure to the parameters of this model, with edges being contained in 2-stars, and 2-stars being contained in both triangles and three-stars. Certain variations of this ERGM that involve directed edges are natural generalizations of the $p_1$ model. Alternative parameterizations that go beyond Markov graph models have been recently proposed; see, e.g., [280; 317; 21].

Frank and Strauss [110] worked mainly with the three-parameter model where $\theta_3, \ldots, \theta_{n-1} = 0$. They proposed a pseudo-likelihood parameter estimation method [287] that maximizes

\[
\ell(\theta) = \sum_i \bigl( u(y) - \psi(\theta) \bigr). \tag{3.7}
\]

The statistics $u(y)$ are counts of graph structures. Although they are not independent (they count overlapping sets of edges), they are assumed independent in the pseudo-likelihood. Ignoring these correlations is problematic: it causes extreme sensitivity of the predicted number of edges to small changes in the value of certain parameters [302]. Park and Newman [240] formally characterized these sensitivity issues. Snijders et al. [280] recently proposed a variant of
these models where the major problem of double-counting is mitigated but not overcome. Hunter and Handcock [155] estimate likelihood ratios for nearby $\{\theta_i\}$ using an MCMC procedure related to the work of Geyer and Thompson [118]. Their estimation procedure can be used for models based on distributions in the curved exponential family.

Robins et al. [256] describe problems associated with the estimation of parameters in many ERGMs, involving near degeneracies of the likelihood function and thus of methods used to estimate parameters using maximum likelihood. For example, for a certain combination of ERGM statistics, the likelihood function may have multiple, clearly distinct modes, and there are very few network configurations, often radically different from each other, that have non-zero probabilities. This is a topic of current theoretical and empirical investigation rooted in the theory of discrete exponential families [136; 251]. For a discussion of mixing times of MCMC methods for ERGMs and the relevance to convergence and degeneracies, see [35].

There are two carefully constructed packages of routines that are available for analyzing network data using ERGMs: statnet⁷ and SIENA⁸. These packages focus on the use of MCMC methods for estimating the parameters in ERGMs.

Remark. It is possible to express the current formulation of exponential random graphs using the formalism of undirected graphical models and the Hammersley-Clifford theorem [76; 33]. We can write the likelihood of an arbitrary undirected graph as

\[
\Pr(y \mid \theta) = \frac{\prod_{c \in \mathcal{C}} \psi(y_c \mid \theta_c)}{z}, \tag{3.8}
\]

where $y_c$ denotes the nodes in clique $c$, $\theta_c$ denotes the corresponding set of parameters, $\psi$ are non-normalized potentials over the cliques, and $z = \sum_{y} \prod_{c \in \mathcal{C}} \psi(y_c \mid \theta_c)$ is the normalization constant.
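As a concrete check on the role of the normalizing constant in (3.6) and (3.8), the three-parameter Markov graph model can be normalized by brute force when $n$ is very small. The sketch below enumerates every undirected graph on $n$ nodes, so it is feasible only for a handful of nodes; it is meant to illustrate why MCMC is needed in practice, and it is not how statnet or SIENA work. Function names are illustrative, and in the notation of (3.6) the returned constant $z$ corresponds to $\psi(\theta, \tau) = -\log z$.

```python
import itertools
import math


def markov_stats(n, edges):
    """Return (number of edges, number of 2-stars, number of triangles)."""
    adj = [[0] * n for _ in range(n)]
    for i, j in edges:
        adj[i][j] = adj[j][i] = 1
    deg = [sum(row) for row in adj]
    s1 = len(edges)
    s2 = sum(math.comb(d, 2) for d in deg)  # 2-stars centered at each node
    t = sum(adj[i][j] * adj[i][h] * adj[j][h]
            for i, j, h in itertools.combinations(range(n), 3))
    return s1, s2, t


def ergm_distribution(n, theta1, theta2, tau):
    """Exact distribution over all graphs on n nodes (tiny n only).
    Returns (dict mapping edge tuple -> probability, normalizing constant z)."""
    pairs = list(itertools.combinations(range(n), 2))
    weights = {}
    for mask in itertools.product((0, 1), repeat=len(pairs)):
        edges = tuple(p for p, bit in zip(pairs, mask) if bit)
        s1, s2, t = markov_stats(n, edges)
        weights[edges] = math.exp(theta1 * s1 + theta2 * s2 + tau * t)
    z = sum(weights.values())
    return {g: w / z for g, w in weights.items()}, z
```

Setting `theta2 = tau = 0` recovers independent edges with probability $e^{\theta_1} / (1 + e^{\theta_1})$, which gives a simple sanity check; the number of terms in $z$ grows as $2^{n(n-1)/2}$, which is what rules out exact normalization for realistic networks.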
If the likelihood is in the exponential family, then the log potentials are linear in $\theta_c$ and "features" $u(y_c)$, and we can write:

\[
\Pr(y \mid \theta) = \exp\Bigl\{ \sum_{c \in \mathcal{C}} \log \psi(y_c \mid \theta_c) - \log z \Bigr\}
= \exp\Bigl\{ \sum_{c \in \mathcal{C}} \theta_c^{\top} u(y_c) - \log z \Bigr\}
= \exp\bigl\{ \theta^{\top} u(y) - \log z \bigr\}.
\]

Within the exponential family, the advantage is that computing derivatives and the likelihood, and deriving the corresponding EM algorithm, are feasible, although possibly computationally expensive, by using variational approximation strategies and Monte Carlo methods. A lot of methodology on the subject has been developed in the area of machine learning. There, undirected graphs appear primarily in the context of relational learning and imaging. For an in-depth discussion on exact and approximation methods and for references see [247; 308].

⁷ A package written for the R statistical environment, described at http://csde.washington.edu/statnet/. See also the documentation in [138; 157; 224; 129].
⁸ Simulation Investigation for Empirical Network Analysis, a freestanding package available at http://stat.gamma.rug.nl/snijders/siena.html.

3.7 Random Graph Models with Fixed Degree Distribution

The Erdős-Rényi-Gilbert random graph model is fully symmetric: the degree of every node (the number of edges associated with that node) follows the same binomial distribution, so the expected degree is the same for all nodes in the graph. A number of natural extensions of the Erdős-Rényi-Gilbert model result in varying node degrees. For example,

• the preferential attachment model [26] captures the formation of hubs in a graph (see section 4.1);
• the one-parameter "small-world" model [320] interpolates between an ordered finite-dimensional lattice and an Erdős-Rényi-Gilbert random graph in order to produce local clustering and triadic closures (see section 4.2).

Albert and Barabási [12] describe a number of variants on these themes.
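The binomial degree behavior of the Erdős-Rényi-Gilbert model noted at the start of this section is easy to verify by simulation. In the sketch below, which assumes an undirected graph on $n$ nodes with independent edge probability $p$, each node has $n - 1$ potential neighbors, so its degree is Binomial$(n-1, p)$ with mean $(n-1)p$. The function names are illustrative.

```python
import math
import random


def binomial_pmf(k, trials, p):
    """Probability that a node has degree k when it has `trials` potential neighbors."""
    return math.comb(trials, k) * p ** k * (1 - p) ** (trials - k)


def er_degrees(n, p, rng):
    """Sample one undirected Erdos-Renyi-Gilbert graph and return its degree sequence."""
    deg = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                deg[i] += 1
                deg[j] += 1
    return deg


def mean_degree(n, p, samples, seed=0):
    """Average node degree over several sampled graphs; should be close to (n - 1) * p."""
    rng = random.Random(seed)
    total = sum(sum(er_degrees(n, p, rng)) for _ in range(samples))
    return total / (n * samples)
```

For example, `mean_degree(30, 0.2, samples=200)` should be close to $29 \times 0.2 = 5.8$, and the empirical degree histogram can be compared against `binomial_pmf(k, 29, 0.2)`.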
Many of the investigators exploring the use of such models focus on the empirical degree distribution, claiming for example that it follows a power law in many real-world networks (cf. [26; 232; 69; 91]). The papers utilizing these "statistical physics" style models often talk about fixed-degree distributions [e.g., 239], and they either fix the degree-distribution parameters or compute distributions that are conditional on some function of the degree distributions or sequences, such as their expectations (cf. [235; 70]). Software is available to sample from the space of random graphs with a given degree distribution based on Markov chain Monte Carlo methods [42; 138].

There would appear to be a direct link between these ideas and the representation of degree distributions in the family of $p_1$ models. In the latter, the $\alpha_i$ and $\beta_i$ parameters represent the out-degree and in-degree for the $i$th node, and the corresponding sufficient statistics are the empirical values for these. In the statistical literature there is a long tradition of looking at distributions conditional on minimal sufficient statistics, and for network models such a notion was investigated as early as 1975 by Holland and Leinhardt, who looked at the version of $p_1$ with $\rho = 0$, conditioned on the empirical in-degree and out-degree for all nodes in the network [147]. This allows for the calculation of an exact distribution that is independent of the $\{\alpha_i\}$ and $\{\beta_i\}$ by enumerating all possible adjacency matrices in the reference set with the observed in-degrees and out-degrees. There is the expectation that such an approach could lead to a uniformly most powerful test for $\rho = 0$, but there is no theory to support this expectation as of yet.
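For a very small directed graph, the exact conditional distribution described above can be obtained by literal enumeration. The sketch below lists every adjacency matrix without self loops that shares the observed in-degrees and out-degrees and tabulates the number of mutual dyads. It relies on the fact that, under $p_1$ with $\rho = 0$, the likelihood depends on the data only through the degrees, so all matrices in the reference set are equally likely. Enumeration over $2^{n(n-1)}$ matrices is an illustrative device for tiny $n$, not the procedure of [147]; function names are hypothetical.

```python
import itertools
from collections import Counter


def out_in_degrees(y):
    """Out-degrees (row sums) and in-degrees (column sums) of adjacency matrix y."""
    n = len(y)
    out_deg = tuple(sum(y[i]) for i in range(n))
    in_deg = tuple(sum(y[i][j] for i in range(n)) for j in range(n))
    return out_deg, in_deg


def mutual_count(y):
    """Number of dyads with edges in both directions."""
    n = len(y)
    return sum(y[i][j] * y[j][i] for i in range(n) for j in range(i + 1, n))


def exact_mutual_distribution(y_obs):
    """Distribution of the mutual-dyad count over the reference set of all
    digraphs with the same in- and out-degrees as y_obs (uniform when rho = 0)."""
    n = len(y_obs)
    target = out_in_degrees(y_obs)
    cells = [(i, j) for i in range(n) for j in range(n) if i != j]
    counts = Counter()
    for bits in itertools.product((0, 1), repeat=len(cells)):
        y = [[0] * n for _ in range(n)]
        for (i, j), b in zip(cells, bits):
            y[i][j] = b
        if out_in_degrees(y) == target:
            counts[mutual_count(y)] += 1
    total = sum(counts.values())
    return {m: c / total for m, c in sorted(counts.items())}
```

An exact test of $\rho = 0$ would then compare the observed mutual count with this distribution. The exponential growth of the enumeration is what motivates the Markov chain approaches discussed next.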
McDonald, Smith and Forster [211] suggest an iterative approach for such calculations, using a Metropolis-Hastings algorithm to generate from the conditional distribution of the triad census given the in-degrees, the out-degrees and the number of mutual dyads. In a pair of papers [279; 280], Snijders and colleagues explore such conditioning for maximum likelihood estimation for exponential random graph models, largely as a mechanism for avoiding the degeneracies and near degeneracies observed when unconditional maximum likelihood is used, cf. section 3.6 and [256]. Snijders [274] does something similar for dynamic models for graphs. Roberts [252] suggests an algorithm for the conditional distribution of the $p_1$ model where $\rho_{ij} = \rho$ given the full set of minimal sufficient statistics, but McDonald et al. [211] offer a counterexample and suggest an alteration of their algorithm to generate the proper exact distribution. Generating such exact distributions is a very tricky matter in discrete exponential families because of the need to utilize appropriate Markov bases, either explicitly as in Diaconis and Sturmfels [85] or implicitly. It is unclear whether the proposals in this literature are in fact reaching all possible tables associated with the distribution.

Blitzstein and Diaconis [42] explore different efficient mechanisms for generating random graphs with fixed degree sequence and explicitly make the link between the "statistical physics" and "sociological" literatures, whereas the earlier papers by Newman [232] and Park and Newman [239] reference exponential random graphs but only approach the notion of fixed degree distributions from a statistical physics perspective, focusing on characteristics of network ensembles rather than maximum likelihood estimation and assessment of goodness-of-fit.
3.8 Blockmodels, Stochastic Blockmodels and Community Discovery

A problem which has been a focus of attention for at least 40 years in the network literature has been the search for an "optimal partition" of the nodes into groups or blocks. In the sociometric literature this was known as blockmodeling. A formalization of networks in terms of non-stochastic blocks goes back at least as far as Lorrain and White [199]. Their paper and the discussion of structural equivalence gave rise to innumerable papers in mathematical sociology (see, e.g., [53]) and algorithmic search strategies for determining blocks (see, e.g., [19; 88; 89]). By embedding these ideas within a framework of random graphs, Holland et al. [150] explained how a special version of $p_1$ could be used to describe a random graph model with predefined blocks. (See also the related discussion in [103] and [311].)

A true stochastic blockmodel approach, however, involves the discovery of the block structure as part of the model search strategy [314], and the first attempts at doing this within the framework of $p_1$ and its exponential family generalizations were due to Nowicki and Snijders, who focused on technical issues such as non-identifiability in a restricted version of the blockmodel [277; 236; 237; 79]. A comprehensive statistical treatment of these models was recently developed for analyzing protein interaction data [7; 8] and then further developed in the context of social network data [9]. Handcock et al. [137] approach this stochastic blockmodeling problem through a combination of latent space models and traditional clustering. We describe some of this work in more detail below.

More recently, in the statistical physics and computer science literatures, the problem has gone under the label of detection of community structure; see, e.g., [122; 232; 71; 233; 266; 217].
This literature is now voluminous and seemingly unconnected to the statistical blockmodel work.

The basic idea, in both the model-based and algorithmic approaches as well as the community detection literature, is that nodes that are heavily interconnected should form a block or community. The nodes are reordered to display the blocks down the diagonal of the adjacency matrix representing the network. Moreover, the connections between nodes in different blocks appear in much sparser off-diagonal blocks. In model-based approaches, the partition of the nodes maximizes a statistical criterion linked to the model, e.g., a likelihood function, whereas most algorithmic solutions maximize ad hoc criteria related to the "density" of links within and between blocks.

More formally, a blockmodel is a model of network data that relies on the intuitive notion of structural equivalence: two nodes are defined to be structurally equivalent if their connectivity with similar nodes is similar. This is a "soft" definition.⁹ Following up this idea, we can imagine collapsing structurally equivalent nodes together to form a super-node, or a block in the language of blockmodels. Keeping the notion of a block in mind, we can now revisit and sharpen the definition of structurally equivalent nodes: given $N$ nodes and $K$ blocks, let $Y_{N \times N}$ be the adjacency matrix of the graph $G(N, Y)$; then two nodes $a$ and $b$ are structurally equivalent, and thus belong to the same block $h$, if their connectivity patterns $C_a$ and $C_b$ with nodes in other blocks are similar.
The equivalence between connectivity patterns of nodes $a$ and $b$ can be formally stated as follows:

\[
C_a \equiv \bigl\{ Y(a, i \in h_k) : \forall\, h_k \neq h \bigr\} \approx C_b,
\]

where the index $i$ runs over the nodes other than $a, b$, the index $k$ runs over the blocks other than $h$, $h_k$ is the set of nodes in block $k$, and $\approx$ quantifies similarity according to a suitable distance metric. This definition relies on a pre-specified partitioning of the $N$ nodes into $K$ blocks.

A blockmodel is useful, for instance, in the analysis of social relations, where blocks may correspond to social factions, as well as in the analysis of protein interactions, where blocks may correspond to stable protein complexes.

Collapsing nodes into blocks by leveraging the notion of structural equivalence above is a more general task than clustering. Consider, for example, the green nodes 7-9 in the left panel of Figure 3.2. They are structurally equivalent according to the definition above, as $C_7 \approx C_8 \approx C_9$, although there are no direct connections among the nodes 7-9 themselves. In this sense, nodes 7-9 would not represent a tight cluster according to measures of similarity based on direct connectivity. Blocks that would correspond to clusters can be obtained by pre-specifying an identity blockmodel, $B = I_k$, in which all off-diagonal blocks equal zero and all diagonal blocks equal one.

⁹ The term stochastic equivalence is often used in place of structural equivalence; see, e.g., [315].

At the technical level, we need two sets of parameters in order to instantiate a blockmodel: (i) the blockmodel itself, $B$, is a $K \times K$ matrix, in which the $B(g, h)$ entry specifies, for instance, the average probability that nodes in block $g$ have connections directed to nodes in block $h$, and (ii) a mapping between nodes and blocks, $\vec{\pi}_{1:N} = \Pi$, where the node-specific array summarizes some notion of membership. Airoldi et al. [9], for instance, specify
the mapping in terms of mixed membership arrays, in which $\pi_n(h)$ specifies the relative frequency of interactions. Node $n$ participates in $2N - 2$ interactions in total and instantiates connectivity patterns that are typical of nodes in its block.

Figure 3.2: Left: An example graph on nodes 1-9. Right: The corresponding blockmodel with blocks (1,2,3), (4,5,6) and (7,8,9), where red nodes have been collapsed into the red block and similarly for the other colors. Note that this problem is not a typical clustering problem, as the green nodes do not share any direct connections; each green node, however, has connections directed to blue nodes, and connections directed from red nodes. In other words, given the partition into monochromatic blocks, the nodes in the green block share patterns of connectivity to nodes in other blocks.

These two sets of parameters, $B$ and $\Pi$, are two latent sources of variability that compete to explain the observed connectivity. However, the blockmodel $B$ explains global asymmetric block connectivity patterns, while the (mixed) membership mapping $\Pi$ explains node-specific symmetric connectivity patterns. In this sense, instantiating a blockmodel in terms of $B$ and $\Pi$ does not introduce any source of non-identifiability beyond the usual multiplicity of parametric configurations that lead to exactly the same likelihood, which is well characterized in this model by Nowicki and Snijders [236].

As a concrete example, consider the mixed membership stochastic blockmodel (MMB) introduced by [9]; the data generating process for a graph $G = (N, Y)$ is the following.

1. For each node $p \in N$:
1.1 Sample mixed membership $\vec{\pi}_p \sim \mathrm{Dirichlet}_K(\vec{\alpha})$.
2. For each pair of nodes $(p, q) \in N \times N$:
2.1 Sample membership indicator, $\vec{z}_{p \to q} \sim \mathrm{mult}_K(\vec{\pi}_p)$.
2.2 Sample membership indicator, $\vec{z}_{p \leftarrow q} \sim \mathrm{mult}_K(\vec{\pi}_q)$.
2.3 Sample interaction, $Y(p, q) \sim \mathrm{Bern}(\vec{z}_{p \to q}^{\top}\, B\, \vec{z}_{p \leftarrow q})$.

Note that the group membership of each node is context dependent. That is, each node may assume different membership when interacting with or being interacted with by different peers. Statistically, each node is an admixture of group-specific interactions. The two sets of latent group indicators are denoted by $\{\vec{z}_{p \to q} : p, q \in N\} =: Z_{\to}$ and $\{\vec{z}_{p \leftarrow q} : p, q \in N\} =: Z_{\leftarrow}$. Also note that the pairs of group memberships that underlie interactions need not be equal; this fact is useful for characterizing asymmetric interaction networks. Equality may be enforced when modeling symmetric interactions.

Inference in the blockmodel is challenging, as the integrals that need to be solved to compute the likelihood cannot be evaluated analytically. For simplicity, the likelihood is

\[
\ell(Y \mid \vec{\alpha}, B) = \int_{\Pi} \int_{Z} \Pr(Y \mid Z, B)\, \Pr(Z \mid \Pi)\, \Pr(\Pi \mid \vec{\alpha})\; dZ\, d\Pi .
\]

While the inner integral is easily solvable¹⁰, the outer integral is not. Exact inference is thus not an option. To complicate things, the number of observations scales as the square of the number of nodes, $O(N^2)$. Sampling algorithms such as Markov chain Monte Carlo are typically too slow for real-size problems in the natural, social, and computational sciences. Airoldi et al. [9] suggest a nested variational inference strategy to approximate the posterior distribution on the latent variables, $(\Pi, Z)$. (Variational methods scale to large problems without losing much in terms of accuracy [3; 49; 308].)

Bickel and Chen [37], the most recent contribution to this literature, bring new twists to the model-based approach to community discovery. They use a blockmodel to formalize a given network in terms of its community structure.
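The MMB data generating process in steps 1 to 2.3 above can be simulated directly; the following is a minimal sketch, not the implementation of [9]. Membership indicators are stored as block indices rather than indicator vectors, so that $\vec{z}_{p \to q}^{\top} B\, \vec{z}_{p \leftarrow q}$ reduces to a lookup `B[g][h]`. Skipping pairs with $p = q$ is an assumption made here to avoid self loops, and the function names are illustrative.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw from Dirichlet(alpha) by normalizing independent Gamma(alpha_k, 1) draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def sample_categorical(probs, rng):
    """Draw a block index k with probability probs[k] (a single multinomial trial)."""
    u = rng.random()
    cumulative = 0.0
    for k, p in enumerate(probs):
        cumulative += p
        if u < cumulative:
            return k
    return len(probs) - 1  # guard against floating-point round-off


def sample_mmb(n_nodes, alpha, B, seed=0):
    """Simulate mixed memberships pi and a directed adjacency matrix Y."""
    rng = random.Random(seed)
    # Step 1: one mixed membership vector per node.
    pi = [sample_dirichlet(alpha, rng) for _ in range(n_nodes)]
    Y = [[0] * n_nodes for _ in range(n_nodes)]
    # Step 2: per-pair membership indicators and the resulting interaction.
    for p in range(n_nodes):
        for q in range(n_nodes):
            if p == q:
                continue  # assumption: no self loops
            g = sample_categorical(pi[p], rng)  # z_{p -> q}
            h = sample_categorical(pi[q], rng)  # z_{p <- q}
            Y[p][q] = 1 if rng.random() < B[g][h] else 0
    return pi, Y
```

For instance, `sample_mmb(20, [0.1, 0.1, 0.1], B)` with a `B` that is close to the identity tends to produce nodes that belong mostly to one block and a graph with dense diagonal blocks, while larger values in `alpha` yield more evenly mixed memberships.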
The main result of Bickel and Chen [37] implies that community detection algorithms based on the modularity score of Newman and Girvan [122] are (asymptotically) biased. It shows that using modularity scores can lead to the discovery of an incorrect community structure even in the favorable case of large graphs, where communities are substantial in size and composed of many individuals. This work also proves that blockmodels and the corresponding likelihood-based algorithms are (asymptotically) unbiased and lead to the discovery of the correct community structure. The proof relies on the exchangeability results developed in the statistics community [15; 165] applied to paired measurements [84].

¹⁰ The inner integral resolves into a series of sums, each one over the support of an individual $\vec{z}$ variable. The support is the same for all such $\vec{z}$ variables, and it is given by the $N$ vertices of the $K$-dimensional unit hypercube. In other words, the inner integral is a series of sums, each over the same $N$ elements.

3.9 Latent Space Models

The intuition at the core of latent space models is that each node $i \in N$ can be represented as a point $z_i$ in a "low dimensional" space, say $\mathbb{R}^k$. The existence of an edge in the adjacency matrix, $Y(i, j) = 1$, is determined by the distance between the corresponding pair of nodes in the low dimensional space, $d(z_i, z_j)$, and by the values of a number of covariates measured on each node individually. The latent space model was first introduced by Hoff et al. [146] with applications to social network analysis, and has been recently extended in a number of directions to include treatment of transitivity, homophily on node-specific attributes, clustering, and heterogeneity of nodes [144; 137; 183].
The conditional probability model for the adjacency matrix $Y$ is

\[
\Pr(Y \mid Z, X, \Theta) = \prod_{i \neq j} \Pr\bigl(Y(i, j) \mid Z_i, Z_j, X_{ij}, \Theta\bigr),
\]

where $X$ are covariates, $\Theta$ are parameters, and $Z$ are the positions of nodes in the low dimensional latent space. Each relationship $Y(i, j)$ is sampled from a Bernoulli distribution whose natural parameter depends on $Z_i$, $Z_j$, $X_{ij}$ and $\Theta$. In their model, Hoff et al. [146] generated the paired observations $Y(i, j)$ starting from the relevant pair of node representations, $(Z_i, Z_j)$, through a distance model, pair-specific covariates $X_{ij}$, and parameters $\Theta = (\alpha, \beta)$. The log-odds are then

\[
\log \frac{\Pr(Y(i, j) = 1)}{1 - \Pr(Y(i, j) = 1)} = \alpha + \beta' X_{ij} - |Z_i - Z_j| \equiv \eta_{ij},
\]

and the corresponding log likelihood is

\[
\log \Pr(Y \mid \eta) = \sum_{i \neq j} \bigl[ \eta_{ij}\, Y_{ij} - \log\bigl(1 + e^{\eta_{ij}}\bigr) \bigr].
\]

One can easily extend the latent space modeling approach to weighted networks. In the general case, paired observations $Y$ may be modeled using a generalized linear model that makes use of $Z_{1:N}$, $X_{ij}$, and $\Theta$. Following the formalism in [210], a generalized linear model that generates the observed edge weights can be specified in terms of three quantitative elements:

i. the error model $\Pr(Y_{ij})$, i.e., the model for the observed edge weights with mean $\mu_{ij} = E[y_{ij}]$;
ii. the linear model $\eta_{ij} = \eta_{ij}(\beta, Z_i, Z_j)$;
iii. the link function $g(\mu_{ij}) = \eta_{ij}$, which maps the support of $\mu_{ij}$ to that of $\eta_{ij}$, typically $\mathbb{R}$.

For example, in the binary graph, the error model is $\Pr(Y_{ij}) = \mathrm{Bern}(\mu_{ij})$, where $\mu_{ij} \in [0, 1]$ for all node pairs $(i, j)$; the linear model is $\eta_{ij} = \beta + d(Z_i, Z_j)$; the link function is $g(\mu_{ij}) = \log\bigl(\mu_{ij} / (1 - \mu_{ij})\bigr)$, with its inverse being $\mu_{ij} = 1 / (1 + \exp(-\eta_{ij}))$ [146].
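For the binary case, the log-odds and log likelihood given above for the distance model of Hoff et al. [146] can be evaluated directly. The sketch below assumes Euclidean distance for $|Z_i - Z_j|$, stores covariates as a nested list `X[i][j]` of the same length as `beta`, and sums over ordered pairs $i \neq j$. These layout choices and the function names are assumptions for illustration.

```python
import math


def eta(alpha, beta, x_ij, z_i, z_j):
    """Log-odds of an edge: alpha + beta' x_ij - |z_i - z_j| (Euclidean distance)."""
    return alpha + sum(b * x for b, x in zip(beta, x_ij)) - math.dist(z_i, z_j)


def latent_space_loglik(Y, Z, X, alpha, beta):
    """Log likelihood: sum over i != j of eta_ij * Y_ij - log(1 + exp(eta_ij))."""
    n = len(Y)
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            e = eta(alpha, beta, X[i][j], Z[i], Z[j])
            total += e * Y[i][j] - math.log1p(math.exp(e))
    return total
```

Moving two linked nodes further apart in the latent space lowers the log likelihood, which is the behavior an MCMC or optimization routine exploits when estimating the positions $Z$.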
In a graph with non-negative, integer edge weights, we can posit $\Pr(Y_{ij}) = \mathrm{Poi}(\mu_{ij})$, where $\mu_{ij} \in \mathbb{R}^{+}$ for all node pairs $(i, j)$; the linear model is $\eta_{ij} = \beta + d(Z_i, Z_j)$, the same as in the previous example; the link function is $g(\mu_{ij}) = \log(\mu_{ij})$, and its inverse is $\mu_{ij} = e^{\eta_{ij}}$.

In the general case, the generalized linear model for $\eta_{ij}$ may also include an explicit distance model $d$ in the latent space $Z$:

\[
\eta_{ij} = \eta_{ij}\bigl(\beta, Z_i, Z_j\bigr) = \eta_{ij}\bigl(\beta, d(Z_i, Z_j)\bigr).
\]

Note that it is possible to re-parametrize $Z_i = \rho_i\, \omega_i$ to separate the position in a latent reference space, $\Omega$, from its magnitude, $\rho_i$, a scalar. A simple intuition suggests the use of an explicit distance model in the latent space: in a binary graph, for example, edges are more likely to be generated between pairs of nodes whose representations in the latent space are close. A popular choice of distance measure is Euclidean distance.

Estimation can be done via MCMC sampling. Inference in latent space models has been carried out via Markov chain Monte Carlo in networks with up to several thousand nodes [130]. Scalability issues remain to be addressed before larger networks can be analyzed.

3.9.1 Comparison with Stochastic Blockmodels

The latent space model of Hoff et al. [146] projects nodes onto a latent Euclidean space by inverting the logistic link. While in practice there is often interest in identifying groups of similar nodes, e.g., individuals or proteins, there is no explicit clustering model in the latent space. To identify groups of similar nodes, clustering methods must be used to analyze the set of latent positions inferred by the latent space model. To allow joint inference on latent positions and clusters, Handcock et al. [137] introduce an explicit clustering model in the latent space in the form of a mixture of (spherical) Gaussians.
$$\Pr(Y \mid Z, X, \Theta) = \prod_{i \neq j} \Pr\bigl(Y(i,j) \mid Z_i, Z_j, X_{ij}, \Theta\bigr), \qquad Z_i \sim \sum_{k=1}^{K} N(\mu_k, \sigma_k^2 \cdot I).$$

This model combines the original latent space model [146] with a finite mixture of Gaussians approach to clustering [297; 205]. It posits that the latent positions $Z_i \in \mathbb{R}^d$ come from a mixture model with $K$ components.

This extension is related to the stochastic blockmodel of [9], which posits a latent membership vector for each node. These vectors can be viewed as cluster assignment probabilities for each node. The observed binary relationships between nodes are mediated by per-pair latent variables, each drawn conditioned on a node's mixed membership vector. In its general form, the blockmodel allows for multiple relations and covariates. Similarly, the model in [137] is also a hierarchical model, as a Gaussian distribution is placed on the latent positions $Z_i$. In contrast, however, each node belongs to a single cluster and the corresponding partition governs the observed relationships. There can be variance in the latent position variables, but the idea of belonging to two or more groups cannot be represented. Posterior uncertainty about cluster membership is different from having an explicit distribution that controls mixed membership, which carries with it an additional level of uncertainty.

With that said, the latent space in which nodes are projected in [137] is somewhat comparable to the space of cluster proportions in [9]. The former maps nodes to a Euclidean space, while the latter maps nodes to the simplex. Both models share the same goal: inferring latent structure that explains the variability of the connectivity in an observed network.

In the mixed membership model, full MCMC for any but the simplest problems is unreasonably expensive. Airoldi et al.
[9] appeal to variational methods for a computationally efficient approximation to the posterior. These methods can scale to large matrices (e.g., millions of nodes) because of the simplified approximation, but at an unknown cost to accuracy. It would be interesting to explore computational tradeoffs for the latent space cluster model [137] as the sample size grows and when large numbers of covariates are added.

Remark. Blei and Fienberg [40] argue that a stochastic blockmodel and node-specific mixed membership vectors are two sets of parameters that are directly interpretable in terms of notions and concepts relevant to social scientists, and better suited to assist these scientists in extracting substantive knowledge from noisy data, to ultimately inform or support the development of new hypotheses and theories.

Applying the mixed membership stochastic blockmodel (MMB) to Sampson's data demonstrates both similarities and differences [2]. For instance, BIC suggests the existence of three factions among the 18 monks when fitting the MMB, but the groupings differ from those found by the latent space cluster model. One major benefit of applying the mixed membership model to the data is the ability to quantitatively identify two out of three of the novices that Sampson labeled as waverers in his analysis based on anthropological observations. This could lead to the formation of a social theory of failure in isolated communities, which could possibly be confirmed with real longitudinal data [2].

In the sociology literature, certain specifications of blockmodels are referred to as latent class models, and certain specifications of latent space models are referred to as latent distance models. Hoff [145] provides a nice comparison, both theoretical and empirical, of these two types of models with the eigenmodel.
The eigenmodel is based on a singular value decomposition of the socio-matrix. It can capture more connectivity patterns than the latent class and the latent distance models for a given degree of model complexity, which can be the number of classes, the number of dimensions in the latent space, or the number of eigenvectors. There is a price to pay, however. The eigenmodel is the least amenable to interpretation among the three models, as the inferred patterns that capture connectivity are in terms of eigenvectors. The latent space model can be interpreted in terms of distances. The latent class models can be interpreted in terms of blocks of connectivity, or tight micro-communities; this is the easiest model to interpret.

Appendix: Phase Transition Behavior of the Erdös-Rényi-Gilbert Model

A simple way to analyze the phase transition behavior of Erdös-Rényi-Gilbert models at $\lambda = 1$ is to study the emergence of the giant component as a branching process [91]. Intuitively, consider branching processes that start at every node: for certain values of $\lambda$ all the branching processes will keep growing with high probability. Their supports, i.e., the sets of nodes involved in each process, will intersect with high probability, leading to the emergence of the giant component, $G$, in which each node can be reached from every other node.

The following formal argument comes from lecture notes by Guetz and Constantine [133] based on proofs given by Janson et al. [161]. Pick a node $v \in \mathcal{N}$. If $v$ is connected to all of the nodes in $G$, then we say that $v$ is saturated in $G$. Now work as follows: pick a node $v$ and place it on the list. Then, identify all its neighbors in $G$, and add them to the list. Next, take the first unsaturated node on the list and add to the list all of its neighbors which are not already in it.
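The list-based exploration just described is essentially a breadth-first search of the component containing $v$. The Python sketch below is an illustration only, not the argument of [133]: it reads "saturated" as "all neighbors already placed on the list", samples a graph in which each pair of nodes is joined independently with probability $p = \lambda / N$ (taking $\lambda$ to be the expected degree, which is an assumption about the notation of section 3.2), and uses hypothetical function names.

```python
import random


def sample_graph(n, p, rng):
    """Join each unordered pair of nodes independently with probability p."""
    adj = {v: set() for v in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def explore_component(adj, v):
    """Return the list built by the exploration process started at v."""
    listed = [v]
    on_list = {v}
    head = 0  # position of the first unsaturated node on the list
    while head < len(listed):
        node = listed[head]
        for neighbor in sorted(adj[node]):
            if neighbor not in on_list:
                on_list.add(neighbor)
                listed.append(neighbor)
        head += 1  # all neighbors of `node` are now on the list
    return listed


def largest_component_fraction(n, lam, seed=0):
    """Fraction of nodes in the largest component when p = lam / n."""
    rng = random.Random(seed)
    adj = sample_graph(n, lam / n, rng)
    unexplored = set(adj)
    largest = 0
    while unexplored:
        component = explore_component(adj, next(iter(unexplored)))
        unexplored -= set(component)
        largest = max(largest, len(component))
    return largest / n


for lam in (0.5, 1.0, 2.0):
    print(lam, largest_component_fraction(500, lam))
```

In simulations of this kind the largest component typically holds a small share of the nodes when $\lambda$ is well below 1 and a substantial share when $\lambda$ is well above 1, in line with the phase transition discussed in this appendix.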
The proof is constructed by considering the distribution of the number of nodes an unsaturated node adds to the list and by using Chernoff bounds to bound the size of the connected component each node belongs to. For details on this proof please see [43]. Bollobás et al. [45] carried out an extensive analysis of the phase transition that mathematically characterizes emergence of the giant component in inhomogeneous random graphs.

Chapter 4

Dynamic Models for Longitudinal Data

In chapter 3 we focused on models for static networks, which consider a cross-section of a real network at a given point in time. However, real networks often contain a dynamic component. In the language of networks, dynamics can be translated into the birth and death of edges and nodes. For example, in a friendship network, new nodes may be introduced at any time and old nodes may drop out due to inactivity; links of friendships and alliances may be even more brittle.

Dynamic network modeling has been a neglected sibling of static network modeling, partly due to the added complexity and partly due to a lack of datasets to study. Sampson's monastery study [259] produced one of the earliest datasets with information on the dynamics in the network of the 18 initiates. The original research, however, focused on the network structure at each given time point, rather than modeling the underlying dynamics explicitly. As online communities gain in popularity, we are beginning to get access to an increasing number of dynamic network datasets of much larger size and longer time span. At the same time, advances in statistical and computational methods for inference and learning have enabled development of richer models.

Bearing this in mind, in this chapter we consider three different classes of models.
We begin by revisiting the Erdös-Rényi-Gilbert random graph model and its generalizations, viewing them as models for dynamic processes. Then we turn to continuous time Markov process models (CMPM) and their discrete time cousins (such as a dynamic version of ERGM and other recently proposed models).

4.1 Random Graphs and the Preferential Attachment Model

Many variations on the classical Erdös-Rényi-Gilbert random graph model in section 3.2 are typically considered to be static models, in that they model a single, static snapshot of the network, as opposed to multiple snapshots recorded at different time steps. However, they also contain processes for link addition and modification. These amount to a dynamic process that may have generated the observed graph, though there is no attempt to fit these dynamic model properties to observed data. For this reason, we view them as "pseudo-dynamic" models and discuss three examples here: the Erdös-Rényi-Gilbert model, the preferential attachment model, and small-world models.

For example, we can view the Erdös-Rényi-Gilbert model $G(N, E)$ itself as a dynamic process used to generate a random graph:

• start from the graph of $N$ unconnected nodes at time 0;
• at each subsequent time step, add a different edge to the network with probability $p = E / \binom{N}{2}$.

By convention, we usually fix the number of nodes at $N$, although we can extend the process to allow for addition of nodes. This model assumes that edges (and nodes) are not removed once they are added. The degree distribution for $G(N, E)$ is binomial. When $N$ is large and $Np$ stays roughly constant, it is approximately Poisson. Durrett [91] provides a rich discussion situating this dynamic description within the tradition of discrete time random walks and branching processes.
In particular, he uses this representation to explore the emergence of the giant component described in section 3.2 (see the appendix of chapter 3).

The Erdös-Rényi-Gilbert model is simple and easy to study but does not address many issues present in real network dynamics. One of the major criticisms [26] of this model centers on the fact that it does not produce a scale-free network, i.e., the resulting node degree distribution does not follow a power law. The network literature is replete with claims that many real networks exhibit the power-law phenomenon (cf. [12]), and much subsequent research has focused on how various generalizations of the Erdös-Rényi-Gilbert model conform to the power law degree distribution. Molloy and Reed [219] were the first to describe how to construct graphs with a general degree distribution and they went on to describe the emergence of the giant component in that context as well [220].

Barabási and Albert [26] described a dynamic preferential attachment (PA) model specifically designed to generate scale-free networks. At time 0, the model starts out with $N_0$ unconnected nodes. At each subsequent time step, a new node is added with $m \le N_0$ edges. The probability that the new node is connected to an existing node is proportional to the degree of the latter. In other words, the new node picks $m$ nodes out of the existing network according to the multinomial distribution

$$p_i = \frac{\delta_i}{\sum_j \delta_j},$$

where $\delta_i$ denotes the (undirected) degree of node $i$. This model, which was described much earlier in the statistical literature by Yule [329] and Simon [269], is intended to describe networks that grow from a small nucleus of nodes and follow a "rich-get-richer" scheme. The assumption is that, for instance, a new web page will more likely link via a URL to a well-known web page as opposed to a little-known one.
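The growth rule above can be sketched in a few lines of Python. The description leaves one detail open: at time 0 all degrees are zero, so $p_i$ is undefined. The sketch assumes, purely for illustration, that targets are drawn uniformly whenever all remaining candidates have degree zero, and that the $m$ targets of a new node are distinct. The function name and parameter values are hypothetical.

```python
import random


def preferential_attachment(n0, m, steps, seed=0):
    """Grow a graph from n0 isolated nodes, adding one node with m edges per step.

    Returns (degree, edges), where degree[v] is the undirected degree of v and
    edges is a list of (new_node, target) pairs.
    """
    if not 1 <= m <= n0:
        raise ValueError("need 1 <= m <= n0")
    rng = random.Random(seed)
    degree = [0] * n0
    edges = []
    for _ in range(steps):
        new = len(degree)
        candidates = list(range(new))
        weights = [degree[v] for v in candidates]
        targets = []
        for _ in range(m):
            if sum(weights) == 0:
                # Assumption: fall back to a uniform draw when no remaining
                # candidate has positive degree (e.g. at the first step).
                idx = rng.randrange(len(candidates))
            else:
                # Draw proportionally to degree: p_i = delta_i / sum_j delta_j.
                idx = rng.choices(range(len(candidates)), weights=weights)[0]
            targets.append(candidates.pop(idx))  # remove so targets are distinct
            weights.pop(idx)
        degree.append(0)
        for target in targets:
            edges.append((new, target))
            degree[target] += 1
            degree[new] += 1
    return degree, edges


degree, edges = preferential_attachment(n0=3, m=2, steps=2000)
print("largest degree:", max(degree), "median degree:", sorted(degree)[len(degree) // 2])
```

Comparing the largest degree with the median degree gives a quick, informal view of the "rich-get-richer" effect; it is not a substitute for a formal assessment of a power law.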
Mitzenmacher [218] gives a brief history of generative models for power law distributions.

The preferential attachment model of Barabási and Albert results in a network with a power law degree distribution whose exponent is empirically determined to be $\gamma_{BA} = 2.9 \pm 0.1$, whereas the Erdös-Rényi-Gilbert model has a Poisson degree distribution. Many extensions of the model have been proposed that allow for flexible power-law exponents, edge modifications, non-uniform dependence on the node degree distributions, etc. For example, Dorogovtsev and Mendes [90] proposed that the probability of creating an edge to node $i$ should be proportional not just to its degree $k_i$ but also to its age, decaying as $(t - t_i)^{-\nu}$, where $\nu$ is a tunable parameter. This leads to a power law degree distribution only if $\nu < 1$. Barabási et al. [28] and Durrett [91] provide an account of this and other extensions to the original model of Albert and Barabási.

New graph generation mechanisms continue to appear: R-MAT [60], "winners don't take all" [242], "forest fire" [194], "butterfly" [212] and RTG [10], to name a few. The most recent of these, the RTG model, is shown to conform to 11 empirical laws observed in real networks. The main goal of these random graph models is to describe a process that could generate networks emulating certain known network properties. The generative process could then give an insight into the dynamics that led to the observed network. But these models are often applied to network data that are gathered at only a few points in time (sometimes only once). Thus the networks are often examined statically.

It has been recently pointed out that, contrary to previous claims, the empirical laws that generative models aim to emulate are not always supported by real data. Visual comparisons are not sufficient for determining the goodness of fit of a model.
For example, a lot of attention has recently been paid to the degree distribution. Figure 4.1 shows indegree and outdegree distributions for blog and query databases from an unnamed large company. They are plotted on a log-log scale and the downward slopes, if fitted by straight lines, would be visually similar to power law distributions with exponents less than 2. A careful examination of these plots, however, reveals a curvilinear relationship in all cases, which suggests that there is a different generating process than those usually used to justify power laws in empirical network data. Data such as that displayed in Figure 4.1 are often fitted by ordinary least squares or even by eye; often the claim is that a degree distribution is scale-free except for a cutoff at very high or very low degrees, without any adjustment for searching for a cutoff! There have been a number of recent efforts to assess the fit of degree distributions, such as those associated with power laws in log-log plots, with more rigor, e.g., see [74]. As in this example, results from such careful assessments of fit often contradict the assumption of linearity.

Li et al. [196] give a "structural metric" for examining simple connected graphs having identical degree distributions and derive theoretical properties of scale-free graphs. They provide at least one possible way to assess whether a graph corresponding to a network is in fact scale-free. For more informal discussions related to this theoretical work, see [14; 324]. Flaxman et al. [106; 107] describe a class of network models linked to the preferential attachment model that also yield a power-law degree distribution.
Most descriptions of generative models fall short of studying the full parameter space and do not propose procedures for fitting the proposed methods to real data, though there are a few works that suggest maximum likelihood, MCMC and other frameworks for fitting these models to data (e.g., [34; 75; 214; 323]). One of the notable exceptions is work based on Kronecker graph multiplication. What started as yet another generative procedure [192] has turned into a well analyzed methodology [195] with an efficient algorithm for model fitting, analysis of the parameter space, and model selection. This work goes further in understanding real network structure and provides a way for principled graph sampling.

Figure 4.1: Log-log plots of degree distributions for a query database and a blog database from a company. Left: Blog indegree and outdegree distributions. Right: Query indegree and outdegree distributions. Source: Data from an unnamed large company, stored in iLab, Carnegie Mellon University.

4.2 Small-World Models

Watts and Strogatz [320] proposed a small-world model which can be thought of as a "pseudo-dynamic" model in the sense we described in section 4.1. This one-parameter "small-world" model interpolates between an ordered finite-dimensional lattice and an Erdös-Rényi-Gilbert random graph in order to produce local clustering and triadic closures. Bollobás and Chung [44] had previously noted that adding random edges to a ring of $N$ nodes drastically reduces the diameter of the network. The Watts-Strogatz model begins with a ring lattice with $N$ nodes and $k$ edges per node, and randomly rewires each edge with probability $p$. As $p$ goes from 0 to 1, the construction moves toward an Erdös-Rényi-Gilbert model. Watts and Strogatz, and others who followed, studied the behavior of such small-world networks when $0 < p < 1$.
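A minimal Python sketch of the Watts-Strogatz construction follows. It assumes, beyond what is stated above, that $k$ is even (each node is joined to its $k/2$ nearest neighbors on either side of the ring), that a rewired edge keeps one endpoint and moves the other to a uniformly chosen node, and that self-loops and duplicate edges are disallowed. The function name is hypothetical.

```python
import random


def watts_strogatz(n, k, p, seed=0):
    """Ring lattice on n nodes with k edges per node, each rewired with probability p.

    Returns a list of neighbor sets, one per node.
    """
    if k % 2 != 0 or not 0 < k < n:
        raise ValueError("this sketch assumes an even k with 0 < k < n")
    rng = random.Random(seed)
    adj = [set() for _ in range(n)]
    # Build the ring lattice: node i is linked to i + 1, ..., i + k/2 (mod n).
    for i in range(n):
        for d in range(1, k // 2 + 1):
            j = (i + d) % n
            adj[i].add(j)
            adj[j].add(i)
    # Visit every lattice edge (i, i + d) once and rewire it with probability p.
    for d in range(1, k // 2 + 1):
        for i in range(n):
            if rng.random() >= p:
                continue
            j = (i + d) % n
            options = [v for v in range(n) if v != i and v not in adj[i]]
            if not options:
                continue  # node i is already linked to everyone; keep the edge
            new = rng.choice(options)
            adj[i].discard(j)
            adj[j].discard(i)
            adj[i].add(new)
            adj[new].add(i)
    return adj


for p in (0.0, 0.1, 1.0):
    graph = watts_strogatz(25, 4, p)
    print(p, sorted(len(neighbors) for neighbors in graph)[:5])
```

With $p = 0$ the ring lattice is returned unchanged; for $p > 0$ the number of edges is preserved while individual degrees start to vary.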
This model is not dynamic although it is often used to describe networks that evolve over time. Figure 4.2 shows a small-world graph for $N = 25$ nodes and 2 rewirings per node.

Kleinberg [174] introduced a variation on the small-world model where random edges are added to a fixed grid. Starting with an underlying finite-dimensional grid, he added shortcut edges, where the probability that two nodes are connected by a long edge depends on the distance between them in the grid. More precisely, the probability that two non-adjacent nodes $x$ and $y$ are connected is proportional to $d(x, y)^{-\alpha}$. With $\alpha$ set to the dimension of the lattice, the greedy routing algorithm can find paths from one node to another in a polylogarithmic number of expected steps.

Figure 4.2: Small-world graph for $N = 25$ nodes and 2 rewirings per node. The red edges form the ring lattice and the blue edges the rewiring. This graph was generated using the Java applet at http://cs.gmu.edu/~astavrou/smallworld.html

Several follow-up works have made adjustments to Kleinberg's rewiring procedure in an attempt to improve the understanding and efficiency of the navigability of networks. For example, Clauset and Moore [72] suggested rewiring a long-distance edge from node $x$ if, during a greedy walk toward $y$, the original topology of the network did not allow $y$ to be reached within $T_{\text{thresh}}$ steps. The edge was rewired to the place where the search gave up (the node reached after $T_{\text{thresh}}$ steps of the walk). They show that through this rewiring procedure the network degree distribution converges to a power law, where $\alpha = \alpha_{\text{rewired}}$. Their work also studied finite size effects and showed that $\alpha_{\text{opt}} \to d$ only rather slowly as $n \to \infty$.

Sandberg [260; 261] and Sandberg and Clarke [262] introduced a different rewiring scheme with the end goal of making the network more amenable to statistical analysis.
Starting with $N$ nodes on a ring, each with two neighbor links and a long-range link, the model of Sandberg [260] randomly rewires a graph in the following steps:

• at each time step $j = 1, 2, 3, \ldots$, choose a random starting node $x$ and a target node $y$ and perform greedy routing from $x$ to $y$;
• independently and with (small) probability $x$, update the long-range link of each node on the resulting path to point to $y$.

This defines a Markov chain on a collection of labeled graphs. Sandberg and Clarke [262] conjecture that when the chain achieves stationarity, the distribution of distances spanned by long-range links is (close to) the theoretical optimum for search and the expected length of searches is polylogarithmic. They support the conjecture by a series of simulations. This methodology has been applied to the study of peer-to-peer (P2P) networks.

Durrett [91] discusses links between small-world models and stochastic processes. Typical uses of small-world models include empirical analyses involving aggregate summary statistics (see, e.g., [18; 231]). There are as yet no formal statistical methods for examining the evolution of small-world network models and for assessing their fit to network data measured over time.

4.3 Duplication-Attachment Models

Duplication-attachment models were originally developed in the computer science theory community to study the world wide web as a directed graph [175; 185]. These models aim at describing properties of a snapshot of the web graph at a specific time, that is, a static directed graph. The data generating process underlying these models, however, is explicitly dynamic. The following example demonstrates some basic assumptions behind the dynamics. Consider a newly added web page $A$, which provides a new node in the web graph.
The creator of web page $A$ will then add hyper-links to it, which provide new directed edges in the web graph. In particular, some of these hyper-links will point to other web pages regardless of whether their topical content matches the topical content of web page $A$, but most of these hyper-links will point to web pages with a topical content that closely matches the topical content of web page $A$.

Technically, there are many possible specifications and variants. The basic duplication-attachment model proposed and analyzed by Kumar et al. [185] is as follows. Denote the graph at time $t$ as $G_t = (N_t, E_t)$. At each step, say $t + 1$, one new node $N$ is added to $G_t$. The new node is connected to a prototype node $m$, chosen uniformly at random among those in $N_t$. Then $d$ out-links are added to node $N$. The $i$th out-link is chosen as follows: with probability $\alpha$ the destination node is chosen uniformly at random among those in $N_t$, and with probability $1 - \alpha$ the destination node is taken to be the $i$th out-link of the prototype node $m$. Note that this is possible since the algorithm generates a graph in which every node has the same out-degree $d$.

Rather than proposing estimation strategies for the two parameters $(\alpha, d)$ of this particular duplication-attachment model, the analysis of Kumar et al. [185] focuses on deriving results about topological properties of duplication-attachment graphs, described as functions of the two parameters $(\alpha, d)$. Recent extensions of this model include a model where fractions of both out-links and in-links of the prototype node $m$ are copied by the newly added node $N$ [193]. The goal of the analyses in this line of research, however, remains that of replicating properties of observed graphs, with a few exceptions. In the biological context, duplication-attachment models have proved useful in modeling protein-protein interaction networks.
For example, Ratmann et al. [245] proposed a mixture of preferential attachment and duplication divergence with parent-child attachment model to assess evolutionary dynamics of protein interaction networks of H. pylori and P. falciparum. They proposed a likelihood-free MCMC-based routine to estimate the posterior of network summary statistics. A more general review of work in modeling dynamics (evolution) on the basis of protein-protein interaction data is available in [246].

Wiuf et al. [326] have developed a recursive construction of the likelihood for duplication-attachment models, effectively enabling principled statistical data analysis, estimation and inference.

4.4 Continuous Time Markov Chain Models

The use of continuous Markov processes to model dynamic networks was first proposed by Holland and Leinhardt [148] and Wasserman [312] and most recently studied by Snijders and colleagues [275; 276]. As shall become clear in this section, continuous Markov process models (CMPM) are intimately tied to the ERGM models described in section 3.6. Within the CMPM family, network edges are taken to be binary (either absent or present, but not weighted), and the evolution occurs one edge at a time. Model variants arise due to the many possible specifications of edge change probability. Some exceptions to this general approach include the party model of Mayer [206], where multiple edges are allowed to change at the same time, and the work of Koskinen and Snijders [179], which deals with Bayesian parameter inference methods for the case where not all edge modifications are observed.

We begin by providing a quick reminder of continuous Markov processes, borrowing notation from [275]. Define $\{Y(t) \mid t \in T\}$ to be a stochastic process, where $Y(t)$ has a finite outcome space $\mathcal{Y}$ and $T$ is a continuous time interval.
Suppose that a Markov condition holds: for any possible outcome $\tilde{y} \in \mathcal{Y}$ and any pair of time points $\{t_a < t_b \mid t_a, t_b \in T\}$,

$$\Pr\{Y(t_b) = \tilde{y} \mid Y(t) = y(t), \ \forall t : t \le t_a\} = \Pr\{Y(t_b) = \tilde{y} \mid Y(t_a) = y(t_a)\}. \tag{4.1}$$

In other words, supposing that $t_b$ denotes the future and $t_a$ the present, then conditioning on the past is equivalent to conditioning on the present when it comes to determining the future. If the probability in Equation 4.1 depends only on $t_b - t_a$, then one can prove that $Y(t)$ has a stationary transition distribution, and the transition matrix

$$\Pr(t_b - t_a) := \Bigl[ \Pr\{Y(t_b) = \tilde{y} \mid Y(t_a) = y\} \Bigr]_{y, \tilde{y} \in \mathcal{Y}} \tag{4.2}$$

can be written as a matrix exponential

$$\Pr(t) = e^{tQ}, \tag{4.3}$$

where $Q$ is known as the intensity matrix with elements $q(y, \tilde{y})$. The elements $q(y, \tilde{y})$ can be thought of as the slope (rate of change) of the probability of state change as a function of time, i.e., $\Pr\{Y(t + \epsilon) = \tilde{y} \mid Y(t) = y\} \approx \epsilon \, q(y, \tilde{y})$ for $\tilde{y} \neq y$ and small $\epsilon > 0$. The diagonal elements $q(y, y)$ are negative and are defined so that the rows of $Q$ sum to zero.

When modeling a social network, the outcome space $\mathcal{Y}$ is taken to be all possible edge configurations of an $N$-node network, and an individual configuration $y \in \mathcal{Y}$ is taken to be a binary vector of length $\binom{N}{2}$. We use the shorthand $q_{ij}(y)$ to denote the propensity for the edge between node $i$ and $j$ to flip into its opposite value under configuration $y$. The function $q_{ij}(y)$ completely specifies the dynamics of the network model. We now review several variants of CMPM which differ only in their definition of $q_{ij}(y)$.

Independent arc, reciprocity, and popularity models. The independent arc model employs the simplest definition of $q_{ij}(y)$:

$$\text{Independent arc model:} \quad q_{ij}(y) = \lambda_{y_{ij}}, \tag{4.4}$$

i.e., $Y_{ij}$ changes from 0 to 1 at rate $\lambda_0$, and from 1 to 0 at rate $\lambda_1$.
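As a worked instance of Equations 4.3 and 4.4, a single arc under the independent arc model is a two-state chain with intensity matrix $Q$ having rows $(-\lambda_0, \lambda_0)$ and $(\lambda_1, -\lambda_1)$. The Python sketch below compares a truncated power series for $e^{tQ}$ with the standard closed form for a two-state chain. The rate values, the truncation length, and the function names are illustrative assumptions.

```python
import math


def mat_mul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def expm_series(q, t, terms=60):
    """Approximate exp(tQ) by the first `terms` terms of its power series."""
    n = len(q)
    tq = [[t * q[i][j] for j in range(n)] for i in range(n)]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        # term now holds (tQ)^k / k!
        term = [[x / k for x in row] for row in mat_mul(term, tq)]
        for i in range(n):
            for j in range(n):
                result[i][j] += term[i][j]
    return result


def two_state_closed_form(lam0, lam1, t):
    """Transition matrix of the chain with rates 0 -> 1: lam0 and 1 -> 0: lam1."""
    total = lam0 + lam1
    decay = math.exp(-total * t)
    return [[(lam1 + lam0 * decay) / total, lam0 * (1.0 - decay) / total],
            [lam1 * (1.0 - decay) / total, (lam0 + lam1 * decay) / total]]


lam0, lam1, t = 0.3, 0.7, 2.0  # illustrative rates and elapsed time
Q = [[-lam0, lam0], [lam1, -lam1]]
print(expm_series(Q, t))
print(two_state_closed_form(lam0, lam1, t))
```

As $t$ grows, both rows of the transition matrix approach $(\lambda_1, \lambda_0) / (\lambda_0 + \lambda_1)$, the stationary probabilities of an arc being absent or present.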
In this model, modification to one edge does not depend on the setting of other edges. The model is simple enough that the transition probabilities $\Pr(t)$ can be derived in closed form (see, e.g., Taylor and Carlin [292] p. 362-364). Maximum likelihood parameter estimation for this model was discussed in [278].

In the reciprocity model, the rate of change in $y_{ij}$ depends only on the reciprocal edge $y_{ji}$:

$$\text{Reciprocity model:} \quad q_{ij}(y) = \lambda_{y_{ij}} + \mu_{y_{ij}} y_{ji}. \tag{4.5}$$

Thus, if no link currently exists between nodes $i$ and $j$, then the propensity for adding either directed edge is $\lambda_0$; if one directed edge exists, then the reciprocal edge is added with propensity $\lambda_0 + \mu_0$. If one directed edge exists, then it is deleted with rate $\lambda_1$. If both edges exist, then the deletion propensity for either is $\lambda_1 + \mu_1$. The transition matrix $\Pr(t)$ can be derived but has a complicated form [189; 272].

Along the same line of development, the popularity model and the expansiveness model [312; 313] define the change rate for edge $y_{ij}$ to be dependent on $y_{+j}$, the in-degree of node $j$, or $y_{i+}$, the out-degree of node $i$:

$$\text{Popularity model:} \quad q_{ij}(y) = \lambda_{y_{ij}} + \pi_{y_{ij}} y_{+j}, \tag{4.6}$$

$$\text{Expansiveness model:} \quad q_{ij}(y) = \lambda_{y_{ij}} + \pi_{y_{ij}} y_{i+}. \tag{4.7}$$

Edge-oriented dynamics. Snijders [276] outlines two categories of transition dynamics: edge-oriented and node-oriented. In both cases, the intensity matrix is factored into two components: one controls the opportunity for change, and the other specifies the propensity of change. More precisely, the continuous time Markov process is now split into two sub-processes: the first operates in the continuous time domain and dictates when a change should occur; the second deals with the probability of the discrete event of individual edge flips.
Both edge-oriented and node-oriented dynamics can be interpreted as stochastic optimizations of a potential function $f(y)$ on the network configuration. The difference is that, in the edge-oriented case, $f$ is based on global statistics of the network, whereas in the node-oriented case, $f$ is defined for each node's local neighborhood. Moreover, the choice of which edge to flip differs between the two formulations.

Using $y(i, j, z)$ to denote the configuration that agrees with $y$ except that the edge $e_{ij}$ has the value $z \in \{0, 1\}$, edge-oriented dynamics can be written in the following general form:

$$q_{ij}(y) = \rho \, p_{ij}(y), \tag{4.8}$$

where

$$p_{ij}(y) = \frac{\exp\bigl(f(y(i, j, 1 - y_{ij}))\bigr)}{\exp\bigl(f(y(i, j, 0))\bigr) + \exp\bigl(f(y(i, j, 1))\bigr)}. \tag{4.9}$$

Thus, in edge-oriented dynamics each edge follows an independent Poisson process, so that the time until the next event has an exponential distribution with parameter $\rho$. When an event occurs for edge $i \to j$, the edge flips to its opposite value with probability $p_{ij}(y)$.

The potential function $f(y)$ is usually defined as a linear combination of network statistics:

$$f(y) = \sum_k \beta_k s_k(y). \tag{4.10}$$

This should start to look familiar. Indeed the CMPM process with edge-oriented dynamics is equivalent to the Gibbs sampling process for ERGMs (where the next edge to be updated is selected randomly). The statistics $s_k(y)$ take on the usual forms (see Table 4.1).

Number of directed arcs: $s_1(y) = \sum_{ij} y_{ij}$
Number of reciprocated arcs: $s_2(y) = \sum_{ij} y_{ij} y_{ji}$
Number of pairs of arcs with the same target: $s_3(y) = \sum_{ijk} y_{ij} y_{kj}$
Number of pairs of arcs with the same origin: $s_4(y) = \sum_{ijk} y_{ik} y_{ij}$
Number of paths of length two: $s_5(y) = \sum_{ijk} y_{ij} y_{jk}$
Number of transitive triplets: $s_6(y) = \sum_{ijk} y_{ij} y_{ik} y_{jk}$

Table 4.1: The table of network statistics for a directed social network.
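The following Python sketch shows how Equations 4.9 and 4.10 fit together for a small directed network. For brevity it uses only three of the statistics in Table 4.1 ($s_1$, $s_2$ and $s_6$), and it reads the sums as running over ordered tuples of distinct nodes, so that a mutual dyad contributes 2 to $s_2$. That reading, the coefficient values, and the function names are assumptions made for illustration.

```python
import math


def network_statistics(y):
    """Return (s1, s2, s6) of Table 4.1 for a 0/1 adjacency matrix y.

    Sums run over ordered tuples of distinct nodes (an assumed reading of the
    table), so a mutual dyad is counted twice in s2.
    """
    nodes = range(len(y))
    s1 = sum(y[i][j] for i in nodes for j in nodes if i != j)
    s2 = sum(y[i][j] * y[j][i] for i in nodes for j in nodes if i != j)
    s6 = sum(y[i][j] * y[i][k] * y[j][k]
             for i in nodes for j in nodes for k in nodes
             if len({i, j, k}) == 3)
    return s1, s2, s6


def potential(y, beta):
    """f(y) = sum_k beta_k * s_k(y), as in Equation 4.10."""
    return sum(b * s for b, s in zip(beta, network_statistics(y)))


def with_edge(y, i, j, z):
    """Copy of y in which arc i -> j is set to z, i.e. the configuration y(i, j, z)."""
    copy = [row[:] for row in y]
    copy[i][j] = z
    return copy


def flip_probability(y, i, j, beta):
    """p_ij(y) of Equation 4.9: probability that arc i -> j flips at an event."""
    f0 = potential(with_edge(y, i, j, 0), beta)
    f1 = potential(with_edge(y, i, j, 1), beta)
    f_flip = f1 if y[i][j] == 0 else f0
    shift = max(f0, f1)  # subtract the maximum for numerical stability
    return math.exp(f_flip - shift) / (math.exp(f0 - shift) + math.exp(f1 - shift))


y = [[0, 1, 1],
     [1, 0, 1],
     [0, 0, 0]]
print(network_statistics(y))
print(flip_probability(y, 2, 0, beta=(-1.0, 0.5, 0.2)))
```

Because only the difference $f(y(i,j,1)) - f(y(i,j,0))$ enters Equation 4.9, the flip probability is a logistic function of the change in the statistics caused by toggling the arc.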
The statistics in Table 4.1 assume directed graphs; however, it is easy to come up with the corresponding statistics for undirected graphs. For example, in the undirected case all the edges are "reciprocal" and thus $s_1$ and $s_2$ are combined into $s_0(y) = \sum_{i \in \mathcal{N}} \sum_{j > i} y_{ij}$.

Due to their close relation to ERGMs, edge-oriented models suffer the same fate of degeneracy. For example, if the parameter $\beta$ for transitive triplets is not too small, then with high probability the simulated network will be a complete graph. However, compared to static networks, degeneracy in the longitudinal case is not as much of a concern, as the complete graph will only emerge at some distant time in the future.

Node-oriented dynamics. Fully node-oriented dynamics [275] defines the intensity matrix as

$$q_{ij}(y) = \rho_i \, p_{ij}(y), \tag{4.11}$$

where

$$p_{ij}(y) = \frac{\exp\bigl(f_i(y(i, j, 1 - y_{ij}))\bigr)}{\sum_{h \neq i} \exp\bigl(f_i(y(i, h, 1 - y_{ih}))\bigr)}. \tag{4.12}$$

Thus the independent Poisson processes for determining edge change opportunity are now defined for each node (with intensity $\rho_i$) as opposed to each edge. Given the opportunity for edge change, each node seeks to optimize its own potential function as defined by

$$f_i(y) = \sum_k \beta_k s_{ik}(y). \tag{4.13}$$

The function $f_i(y)$ is similar to the global potential $f(y)$ in Equation 4.10 but only aggregates over the local neighborhood of node $i$. Node $i$ favors changing the incident edge that would lead to the biggest increase in its potential.

Edge-node mixed dynamics. Snijders [276] also suggested a form of mixed dynamics where the opportunity for change is edge-oriented, but the potential functions are node-oriented:

$$q_{ij}(y) = \rho \, \frac{\exp\bigl(f_i(y(i, j, 1 - y_{ij}))\bigr)}{\sum_{h \neq i} \exp\bigl(f_i(y(i, h, 1 - y_{ih}))\bigr)}. \tag{4.14}$$

Thus the opportunity to modify each edge $i \to j$ follows independent Poisson processes with parameter $\rho$.
But given the opportunity for change, the probability of an actual flip depends on node $i$'s local network configuration.

Remark. Parameter estimation in CMPM models has until recently been done via the method of moments, where the expected values are obtained through MCMC on simulated networks [273]. Koskinen and Snijders [179] proposed a Bayesian inference method that allows for computation of the posterior distribution of the parameters and treats missing values more adequately. For details of the procedure, please refer to Koskinen and Snijders [179].

4.5 Discrete Time Markov Models

In this section, we outline three recent proposals of dynamic network models operating in the discrete time domain (see also [22]). All three models have the Markov property and represent the likelihood as a sequence of factored conditional probabilities

\[ \Pr(Y^1, Y^2, \ldots, Y^T) = \Pr(Y^T \mid Y^{T-1}) \Pr(Y^{T-1} \mid Y^{T-2}) \cdots \Pr(Y^2 \mid Y^1) \Pr(Y^1), \tag{4.15} \]

where $\{Y^1, \ldots, Y^T\}$ is a sequence of $T$ observed snapshots of the network. Banks and Carley [22] discussed the simplest version of such models. See also [253].

4.5.1 Discrete Markov ERGM Model

Hanneke and Xing [139] proposed a natural extension of the ERGM model in the discrete Markov domain. Unlike the setup in the continuous domain, the potential function in this model involves the statistics of two consecutive configurations of the network:

\[ \Pr(y^t \mid y^{t-1}) = \frac{1}{Z} \exp\Big\{ \sum_k \beta_k s_k(y^t, y^{t-1}) \Big\}. \tag{4.16} \]

Table 4.2 lists a few examples of network statistics defined on pairs of network snapshots.

Density of edges: $s_1(y^t, y^{t-1}) = \frac{1}{n-1} \sum_{ij} y^t_{ij}$
Stability: $s_2(y^t, y^{t-1}) = \frac{1}{n-1} \sum_{ij} \big[ y^t_{ij} y^{t-1}_{ij} + (1 - y^t_{ij})(1 - y^{t-1}_{ij}) \big]$
Reciprocity: $s_3(y^t, y^{t-1}) = n \sum_{ij} y^t_{ji} y^{t-1}_{ij} \Big/ \sum_{ij} y^{t-1}_{ij}$
Transitivity: $s_4(y^t, y^{t-1}) = n \sum_{ijk} y^t_{ik} y^{t-1}_{ij} y^{t-1}_{jk} \Big/ \sum_{ijk} y^{t-1}_{ij} y^{t-1}_{jk}$

Table 4.2: The table of network statistics for pairs of network snapshots.

The basic model may be extended to allow for multiple relations, node attributes, and $K$-th order Markov dependencies of the form

\[ \Pr(Y^{K+1}, Y^{K+2}, \ldots, Y^T \mid Y^1, \ldots, Y^K) = \prod_{t=K+1}^{T} \Pr(Y^t \mid Y^{t-K}, \ldots, Y^{t-1}), \tag{4.17} \]

where

\[ \Pr(Y^t \mid Y^{t-K}, \ldots, Y^{t-1}) = \frac{1}{Z} \exp\Big\{ \sum_k \beta_k s_k(Y^t, \ldots, Y^{t-K}) \Big\}. \tag{4.18} \]

The joint distribution of the first $K$ network snapshots may be represented by an ERGM for the first snapshot, and a $(k-1)$-th order discrete Markov dependency model for each later snapshot $Y^k$, $k \le K$. The paired network statistics may be extended over $K$ network sequences.

Maximum likelihood parameter estimates may be computed via any numerical optimization technique such as the Newton-Raphson method. Computation of the gradient and Hessian requires the mean and covariance of the sequence network statistics, which are exactly computable for a pair of networks, but require Gibbs sampling in the $K$-sequence case [139]. The likelihood of this model is well behaved if the minimal sufficient statistics involve only dyads; however, similar to its static counterpart, the full dynamic ERGM is prone to likelihood degeneracy.

4.5.2 Dynamic Latent Space Model

Sarkar and Moore [264] extended the static latent space model of Hoff et al. [146] (cf. section 3.9) in the time domain. Recall that in the static latent space model, the log odds of a link between nodes $i$ and $j$ depends on the distance between their latent positions $z_i$ and $z_j$. The dynamic latent space model allows the latent positions to change over time in Gaussian-distributed random steps:

\[ Z^t \mid Z^{t-1} \sim \mathcal{N}(Z^{t-1}, \sigma^2 I). \tag{4.19} \]

The observation model is a modified version of the original latent space model (see footnote 1):

\[ p^L_{ij} := p^L(y_{ij} = 1) = \frac{1}{1 + \exp(d_{ij} - r_{ij})}, \tag{4.20} \]

where $d_{ij}$ is the Euclidean distance between $i$ and $j$ in latent space, and $r_{ij}$ is a radius of influence defined as $c \times (\max(\delta_i, \delta_j) + 1)$ ($\delta_i$ and $\delta_j$ being the degrees of nodes $i$ and $j$, respectively). The “radius of influence” is based on the assumption that the higher the maximum degree of the two end nodes, the more likely the edge. This may be true in citation networks where prolific authors are more likely to form new co-authorships. The constant 1 is added to ensure that the radius is non-zero, and $c$ is estimated from data by a line search (a minimization method in one dimension).

The link probability $p_{ij}$ is defined to be a mixture between the modified latent space link probability $p^L_{ij}$ and a noise probability $\rho$, with a distance-dependent mixing weight $\kappa(d_{ij})$. The idea is that pairs of nodes that are outside of each other's radius have only a low noise probability of establishing a link, while nodes within each other's radii follow the probability $p^L_{ij}$:

\[ p_{ij} = \kappa(d_{ij})\, p^L_{ij} + (1 - \kappa(d_{ij}))\, \rho. \tag{4.21} \]

The full observation model is then

\[ \Pr(Y^t \mid Z^t) = \prod_{i \sim j} p_{ij} \prod_{i \not\sim j} (1 - p_{ij}), \tag{4.22} \]

where $i \sim j$ denotes the presence of an edge between $i$ and $j$. The latent space positions $Z^t$ are estimated in sequence for $t = 1, \ldots, T$ by maximizing the likelihood of the observed $Y^t$ weighted by the transition density from the previous positions:

\[ Z^t = \arg\max_Z \Pr(Y^t \mid Z) \Pr(Z \mid Z^{t-1}). \tag{4.23} \]

The authors propose conjugate gradient optimization starting from an initial estimate of the latent positions based on a multidimensional scaling (MDS) transform of the observed pairwise distances. To eliminate rotational ambiguity, a Procrustean (rotationally invariant) transform is applied to the MDS transform so that $Z^t$ is aligned with $Z^{t-1}$.

Applying the model to the NIPS paper co-authorship dataset (cf.
subsection 2.2.6), the authors gave anecdotal evidence of the validity of the changing embeddings of several well-known machine learning researchers over time. The dynamics of the researchers' latent positions allowed for an insight into the evolution of the machine learning community.

Footnote 1: Note that in this dynamic version of the latent space model, links are assumed to be undirected.

Sarkar et al. [265] also proposed a richer model based on [124], which improved upon previous work in two ways. One of the differentiating features of this work was the ability to simultaneously embed words and authors into the latent space, which allowed for representation of a two-mode network. The major advantage, however, was the inference method: the authors proposed a Kalman-filter-like dynamic procedure, which allowed for estimation of the posterior distributions over the positions of the authors in the latent space. The proposed procedure was applied to a simulated NIPS dataset.

The impact of this line of work is twofold: first, it offers an explanation of the network at every time step, and second, it enables an accurate and efficient prediction of the state of the network at a time step in the future. The proposed inference procedures made it possible for network modeling to scale to large dynamic collections of data. The drawback of this approach is the lack of an explicit mechanism that could explain the dynamics behind the real networks.

Another latent model for citation networks was developed in the physics community. Leicht et al. [190] proposed to use latent variables to capture the grouping of papers that have similar citation profiles over time. The network in this case is a directed acyclic graph and the nodes are papers rather than authors.
Using as an example a set of opinions from the US Supreme Court and their citations between the years 1789 and 2007, the authors showed how a simple latent model was able to recover, in a completely unsupervised manner, the different eras in US Supreme Court opinion references. The parameters of the model, except for the number of latent classes, were estimated using an EM algorithm. Different numbers of latent classes were tested and each revealed something new about the underlying data. The authors also compared the latent method to a clustering based on network modularity [233]. Even with the information about time (directionality in the graph) removed, the latent variable model was still able to discover the same split between two groups of opinions that happened around 1937. The network modularity clustering in a way validated the outcome of the latent model.

In a separate experiment, Leicht et al. [190] showed that deterministic approaches such as “hubs and authorities” and eigenvector centrality [171] discovered interesting network properties that were not revealed by the statistical models. The deterministic analyses showed several significant drops in the age of authorities cited, meaning that once in a while, the younger set of opinions became the new authorities and that the process happened in a “decisive” manner, rather than gradually. In this way, deterministic network analysis approaches complement statistical models.

4.5.3 Dynamic Contextual Friendship Model (DCFM)

The dynamic contextual friendship model (DCFM) of Goldenberg and Zheng [128] represents an attempt to capture several aspects of the complexity of the evolution of real social networks over time.
In a real-life friendship network, people may meet and interact with each other under different contexts (e.g., school, work projects, social outings, etc.), and the strength of interpersonal relationships changes over time based on these interactions. DCFM offers such a mechanism for network evolution, where edges have weights that indicate the strength of the relationship, and each node is given a distribution over social interaction spheres (contexts). A context is defined to be any activity where people may interact with each other. At each given time step, each node chooses a random context according to the node's distribution over contexts. Nodes that appear in the same context update the weights of the links between them. The probability of a weight increase (or decrease) depends on whether the pair had a chance to meet (a coin toss in the model) and the “friendliness” parameter of the individuals involved. The possibility of both positive and negative weight updates allows for edge birth and death over time. An extension of the model also allows for addition and deletion of nodes.

The underlying dynamics are captured by a first-order Markov chain model. Letting $W^t$ denote the weighted adjacency matrix at time $t$, the basic generative process at time $t$ can be formalized as follows:

1. For each node $i$, sample context $C_i \sim \mathrm{mult}(\theta_i)$, where $\theta_i$ denotes the context distribution parameters.

2. For each pair of nodes $i$ and $j$ in the same context, sample meeting variable $M_{ij} \sim \mathrm{Bern}(\nu_i \nu_j)$, where $\nu_i$ and $\nu_j$ represent the “friendliness” of nodes $i$ and $j$.

3. Sample the new weight
\[ W^t_{ij} \sim \begin{cases} \mathrm{Poi}(\lambda_h (W^{t-1}_{ij} + 1)) & \text{if } M_{ij} = 1, \\ \mathrm{Poi}(\lambda_\ell\, W^{t-1}_{ij}) & \text{otherwise}, \end{cases} \]
where $\lambda_h$ and $\lambda_\ell$ are hyperparameters indicating the rates of growth and decay, respectively.

The idea is that a meeting should increase the edge weight with high probability; otherwise the weight decays.
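The three generative steps above can be sketched as a single simulation step. This is a minimal illustration under stated assumptions, not the authors' code: the names `sample_poisson` and `dcfm_step` are hypothetical, weights are assumed symmetric (an undirected friendship network), pairs that do not share a context are treated as not having met ($M_{ij} = 0$), and Poisson draws are generated by counting unit-rate exponential arrivals because the Python standard library has no Poisson sampler.

```python
import random


def sample_poisson(lam, rng):
    """Draw from Poisson(lam) by counting rate-1 exponential arrivals in [0, lam]."""
    if lam <= 0:
        return 0
    count, total = 0, rng.expovariate(1.0)
    while total <= lam:
        count += 1
        total += rng.expovariate(1.0)
    return count


def dcfm_step(W, theta, nu, lam_h, lam_l, rng):
    """One time step of the DCFM generative process (illustrative sketch).

    W      -- symmetric matrix of non-negative integer weights at time t-1
    theta  -- theta[i] is node i's probability vector over contexts
    nu     -- nu[i] is node i's "friendliness" in [0, 1]
    lam_h, lam_l -- growth and decay rate hyperparameters
    Returns (W at time t, list of sampled contexts).
    """
    n = len(W)
    # Step 1: each node picks a context from its own context distribution.
    contexts = [rng.choices(range(len(theta[i])), weights=theta[i])[0]
                for i in range(n)]
    W_new = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Step 2: nodes sharing a context meet with probability nu_i * nu_j.
            met = contexts[i] == contexts[j] and rng.random() < nu[i] * nu[j]
            # Step 3: Poisson update, growth if they met, decay otherwise.
            rate = lam_h * (W[i][j] + 1) if met else lam_l * W[i][j]
            W_new[i][j] = W_new[j][i] = sample_poisson(rate, rng)
    return W_new, contexts


if __name__ == "__main__":
    rng = random.Random(42)
    n = 5
    W = [[0] * n for _ in range(n)]
    theta = [[0.5, 0.5] for _ in range(n)]   # two equally likely contexts
    nu = [0.8] * n
    for _ in range(10):
        W, contexts = dcfm_step(W, theta, nu, lam_h=1.2, lam_l=0.8, rng=rng)
    print(W)
```

Note that a pair with zero weight that does not meet stays at zero, since the decay rate is then zero; an edge is born only through a meeting and dies when a Poisson draw returns zero.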
The parameters $\theta_i$, $\nu_i$, $\lambda_h$, $\lambda_\ell$ all have conjugate priors and are estimated through Gibbs sampling [331].

The model can generate networks with a number of different properties. For example, Figure 4.3 shows various degree distributions generated by DCFM, while Figure 4.4 demonstrates possible relation dynamics. Pair (47, 45) shows a brief resumption of the relationship, which dissolves again in the next moment. While DCFM is capable of emulating such long-term memory of past relationships, it does so at the cost of added model complexity.

Figure 4.3: Log-log plot of the degree distributions of a network with 200 people. $\nu_i$ is drawn from Beta(1, 3) for the plot on the left, and from Beta(1, 8) for the right hand side. Solid lines represent a linear fit and dashed lines a quadratic fit to the data. Contexts are drawn every 50th timestep.

Figure 4.4: Weight dynamics for 4 different pairs in a DCFM simulated network of 600 people over 600 time steps. Context switches occur every 50th timestep and b = 3. [Panels: pairs (11,33), (52,49), (47,45), and (52,53); horizontal axis: time.]

Few datasets contain weighted relationships. The Enron dataset (cf. subsection 2.2.2) contains email exchanges that can be aggregated on a weekly basis to simulate strength of relationships. In the NIPS dataset (cf. subsection 2.2.6), the number of joint publications per year can represent the strength of the coauthorship. In these cases, the DCFM contexts can be taken to be the topics of emails or articles, and the friendliness parameters can be estimated using the method of moments.

One drawback of DCFM is its lack of identifiability; it is impossible to tell without additional knowledge whether an individual formed many friendships because he frequently changes contexts and is very friendly or because the contexts themselves tend to be large. Also, weighted network data are hard to come by and thus pseudo-weights often have to be used. The DCFM model is important in its own right: the life-mimicking, rich generative mechanism is a step towards realistic complex models that ultimately can be used to explain the intricacies of observed data, especially if additional information about contexts and individuals' friendliness is available.

Chapter 5

Issues in Network Modeling

There are a number of major statistical modeling and inferential challenges in the analysis of network data that go well beyond those described in previous sections of this article. These relate to both the quality and the ease of statistical inference and we mention a few of them here:

Network Visualization. With the rise of online social networks and network modeling, we have seen a proliferation of visualization tools, especially those based on variations of constraint-based spring model algorithms, e.g., see the discussion and references in Shneiderman and Aris [267]. The automated algorithms often use node degrees or some form of distance metric between nodes to arrange their placement. For example, SoNIA (see footnote 1) is a popular package for visualizing dynamic or longitudinal network data; it can be used as a platform for the development, testing, and comparison of various static and dynamic layout techniques. However, little is known about how to effectively combine visualization with the kinds of statistical models we review here, especially if one wants to use the visualization as another tool in the analysis of network data.

Computability.
Can we do statistical estimation computations and model fitting exactly for large networks, e.g., by full MCMC methods for mixed membership and exponential random graph models, or do we need to resort to approximations such as those involved in the variational approximation employed by [8; 9]?

For ERGM models a newly updated suite of programs and documentation is now available [138; 157; 224; 129]. The SIENA package (see footnote 2) developed by Snijders and colleagues contains a complementary suite of programs that are particularly useful for longitudinal network analyses (though Rinaldo et al. [251] speak words of caution). The packages are capable of learning networks of size up to a few thousand nodes.

Footnote 1: http://sonia.stanford.edu/
Footnote 2: http://stat.gamma.rug.nl/siena.html

The truth is that it is unrealistic to expect that really large networks with millions of nodes can be estimated using exact methods. Even variational approximations, which have their own drawbacks such as sensitivity to the starting point, are not feasible for networks on a really large scale. The key to network modeling and parameter estimation is to take into account the sparsity that comes with size. The methods that are good on small or medium-sized but relatively dense networks might be computationally infeasible or contain invalid assumptions for larger networks. As we gear up to model very large networks, it is important to focus not only on the disadvantages that size brings but also on its advantages.

Asymptotics and Assessing Goodness of Fit. There is no standard large sample asymptotics for networks (e.g., as $N$ goes to infinity) that can be used to assess the goodness-of-fit of models. Thus we may have serious problems with variance estimates for parameters and with confidence or posterior interval estimates.
While a few models with a small number of fixed parameters have well-behaved asymptotics, the problems here tend to be the inherent dependence of network data and the growth in the number of parameters to be estimated as $N$ increases. Haberman [134] comments briefly on asymptotics in his discussion of the $p_1$ model, and notes the similarity to issues for the Rasch model from item response theory. The lack of asymptotics means that we may have problems of consistency of estimators, but it also means that there is no standard basis for model comparison and assessing goodness of fit. Most other authors have addressed these issues either empirically, e.g., Hunter et al. [156], or not at all.

There are two alternative approaches. We can consider assessing fit or comparing models using exact distributions given the minimal sufficient statistics (MSSs). This works for simple models but not obviously for the general class of ERGMs or most dynamic models in the literature. Further, for many of the models, especially those involving latent variables, the MSSs are the data themselves. Alternatively, we could think in terms of some form of cross-validation for model selection and assessment. The problem with cross-validation is the boundary effects associated with subsets of nodes. This is directly related to the problem of sampling in networks.

Bickel and Chen [37] address the problem of asymptotics in the context of blockmodeling or community discovery, and the methods they exploit may be useful in a broader context when the number of parameters to be estimated grows as $N$ increases.

Sampling. Do our data represent the entire network or are they based on only a subnetwork or subgraph? When the data come from a subgraph, even one selected at random, we need to worry about the effects at the boundary (see footnote 3) and the attendant biases they bring to parameter estimates, cf. the negative result in Stumpf et al.
for scale-free models in which they show the extent and nature of the bias [289]. (Footnote 3: The boundary is the collection of observed nodes which have links to the unobserved nodes. The boundary can potentially include all observed nodes. Only nodes for which the set of known links is certain to be complete are not included in the boundary; this condition is hard to satisfy in real-world networks.) Most of the early results on sampling for network data focused on random subgraphs and exploited the traditional statistical theory of design-based sampling, in which the properties of the network are assumed to be fixed, and we evaluate sample quantities by considering their distribution under all possible similarly selected subgraphs. For details, see the many papers by Ove Frank [109; 295] and others [125; 135; 258]. Wiuf and Stumpf [325] and Stumpf and Thorne [288] recently adopted a related but different approach focusing on properties such as degree distributions using binomial random sample sizes from “large” graphs. Others such as Leskovec and Faloutsos [191] examine aspects of the question in an empirical but ad hoc fashion.

The relevance of sampling for model-based network inference was first addressed by Thompson and Frank [295], and further developed by Handcock and Gile [135], who adapt MCMC algorithms for exponential random graph models to account for sampling designs. To date, these are the only works to seriously explore this important topic. Airoldi and Carley [6] quantify the sensitivity of alternative sampling algorithms to generate graphs that share similar topological properties, as well as the divergence of topological properties of algorithms for sampling popular network models. We expect the issue of sampling to be of relevance to virtually all of the models and we need to explore its consequences.
This will be especially true when we try to update model parameter estimates based on extracts of data in a dynamic fashion.

Missing data. Along with sampling arises the question of the treatment of missing data in statistical networks. Usually, the non-respondents to surveys are excluded from the analysis and the modeling considers only individuals for whom all data are available. A few works deal with missing data directly. The empirical impact of non-respondents in a survey on the analysis is considered in [284]; the modeling implications and inference for non-respondents in ERGMs can be found in [255; 120; 178]. Missing data in longitudinal studies is the subject of [154]. This work makes assumptions about sampling strategies to justify the estimation of missing edges using a Missing at Random assumption. Because this is not in general a correct assumption, we have an interesting set of open problems. Kossinets [180] considers three missing data mechanisms: network boundary specification (non-inclusion of actors or affiliations), survey non-response, and censoring by vertex degree (fixed choice design), and examines their effect on a study of a scientific collaboration network. One type of missing data, namely links or relations, can be treated as a prediction task by treating links between nodes in a given network as probabilistic quantities and using statistical models based on the available data to estimate the likelihood of those edges being there. The problem of prediction is often addressed in the machine learning community and we discuss it next.

Prediction. In our review of the literature on networks across many disciplines we have found limited methodological work focusing on evaluating and comparing the predictive ability of various models, static or dynamic. There are papers on link prediction in the relational network model literature (e.g., [238]).
Liben-Nowell and Kleinberg [198] develop approaches to link prediction based on measures for analyzing the “proximity” of nodes in a network, e.g., the WWW. In the biological literature, a number of papers examine the problem of predicting missing links in biological networks (e.g., [327] is one of the earlier works). However, these papers focus on how to cleverly combine heterogeneous data in order to discover new links. The evaluation is usually limited to cross-validation on the known links, information that is incomplete and available only for a few organisms. In the sociological literature on organizations, there is often interest in distinguishing among organizations on the basis of their network structure, so there would clearly be interest in utilizing methodology for prediction based on network structure. Because making predictions of various sorts from dynamic network models fits well within the machine learning paradigm, we expect to see many more papers on the topic in the not too distant future.

Embeddability. Underlying most dynamic network models is a continuous time stochastic process, even though the data used to study the models and their implications may come in the form of repeated snapshots at discrete time points (epochs), a form of time sampling as opposed to the node sampling referred to above, or of cumulative network links. In such circumstances we need to take special care in how we represent and estimate the continuous-time parameters in the actual data realizations used to fit models. This is known in the statistical literature as the embeddability problem and was studied for Markov processes in the 1970s by Singer and Spilerman [270, 271] for social processes, and more recently by Hansen and Scheinkman [140] in the context of econometric models and by others in the computational finance literature.
Wasserman [313] and various papers by Snijders and his collaborators illustrate how to address embedding in some simple dynamic models.

Identifiability. Identifiability of model parameters is a technical issue in statistics that refers to the fact that multiple solutions may exist (in the parameter space) that lead to exactly the same likelihood. In this sense, no inference procedure can distinguish between these solutions. For instance, in a mixture model we can permute the assignments of points to mixture components to obtain an equivalent solution. There are a number of papers that describe the issue in various models (e.g., [283; 132]) and from different perspectives (e.g., [51; 52] from the algebraic perspective). A few solutions to address this issue have been proposed recently. Some consider inference on equivalence classes in a blockmodel for network data [236]. Others pre-process the data to identify a reference solution that drives the inference [137].

Combining links with their attributes. In many network data sets, especially those arising in machine learning contexts, there are attributes associated with the network links. For example, in e-mail and blog databases, the attributes may be taken to be the contents of the messages or postings. There is an emerging literature focused on cascades of such links, but few papers are situated in a full network model setting and few authors attempt to combine the models for links with models for message or posting texts. This is a natural extension to the models described here, especially the mixed membership stochastic blockmodels of section 3.8, since the text could naturally be modeled by mixed-membership topic models. McCallum et al. [208] and Chang and Blei [61] suggest different ways to approach this kind of combination model.
Dynamic models that combine evolving block and topic structures would be of special interest for such applications.

Chapter 6

Summary

The ubiquity of networks in areas as diverse as the social sciences, biology, computer science, physics, and economics has spawned an extensive literature on the subject. In this review, we discussed in detail a few main trends in the statistical network modeling literature, focusing on models that have historically inspired many others as well as a few recent proposals. By charting the evolution of statistical network modeling approaches, we pointed out explicit connections between the discussed models. Figure 6.1 provides a visual diagram of model influence; an arrow pointing from A to B means either that the development of model A influenced the subsequent development of model B, or that B can be viewed as a generalization of A.

The literature on network modeling may be divided along different lines of motivation. Models primarily introduced in the physics literature are motivated by asymptotic properties of networks, whereas the literature stemming from statistics and statistical social science is concerned with the inference step in addition. Thus, the main criticism of the random graph models primarily developed in statistical physics is the lack of assessment of the fit of the models to the data. The main drawback in the statistical literature is the lack of a comprehensive asymptotic analysis. Though degeneracy found in the limiting case of the earlier versions of the ERGM has been addressed, a broader analysis is still missing.

In this work we made a distinction between static and dynamic models. Descriptive models such as $p_1$, $p_2$, and ERGM are clearly static as they infer a set of sufficient statistics from a single snapshot of an existing network.
The families of continuous and discrete time Markov models, on the other hand, are clearly dynamic as they seek to model multiple snapshots of an evolving network. The Erdös-Rényi-Gilbert, preferential attachment, and small-world models, while ultimately aiming to model a single time point snapshot of a network, are usually described via generative processes, where edges are added one at a time. These models can thus be considered as either static, with respect to what they model, or dynamic, with respect to how they are represented. In this work we refer to them as pseudo-dynamic.

Within the category of static models we discussed two main directions: models that take networks as given (see section 3.4, section 3.5, and section 3.6) and models that assume and estimate latent structures (section 3.8 and section 3.9). Latent structure models have to make certain assumptions about the data. Stochastic blockmodels assume structural equivalence of the nodes, whereas latent space models assume the existence of an embedding of the network in a low dimensional space. These models allow for better understanding of the data in cases where it is believed to contain hidden structure.

Figure 6.1: Network summarizing the relations between models discussed in our review. White nodes denote static models, yellow nodes “pseudo-dynamic” models, and green nodes dynamic models. Arrows indicate inspiration or influence of the model at the source on the model at the target. [Nodes shown in the figure: Erdös-Rényi-Gilbert random graph models (Gilbert 1959, Erdös-Rényi 1959); Small-World studies (Milgram 1967); Exchangeable graph model (Airoldi 2009); p1 models (Holland and Leinhardt 1981); p2 random effects model (van Duijn, Snijders, Zijlstra, 2004, 2006); p* models / ERGM (Frank and Strauss 1986); Continuous Time Markov Models (Holland and Leinhardt 1977, Wasserman 1977, Snijders 2005, 2006); Latent space models (Hoff, Raftery, Handcock 2002, Handcock et al. 2007); Mixed membership blockmodel (Airoldi et al. 2008); Discrete Markov ERGM (Hanneke and Xing 2006); Dynamic latent space model (Sarkar and Moore 2005); Dynamic contextual friendship model (Zheng and Goldenberg 2006); Preferential attachment model (Barabási and Albert 1999); Small-World model (Watts and Strogatz 1998); Duplication attachment model (Kumar et al. 2000, Wiuf et al. 2006).]

We divided the category of dynamic models into continuous time Markov models and discrete time Markov models. CMPM (section 4.4) assumes that the adjacency matrix evolves according to a continuous Markov chain whose intensity matrix can depend on various edge and node dynamics. Discrete time Markov network models deal with a set of network snapshots observed at various time points. Examples of discrete time Markov network models include dynamic extensions of ERGM (subsection 4.5.1) and the latent space model (subsection 4.5.2), the duplication-attachment model, as well as a generative dynamic model for friendship networks (subsection 4.5.3).

Despite the many advances in network modeling over the last decade, there remains a host of unresolved issues. We listed some of the issues in chapter 5. We feel that, from a statistics or machine learning perspective, the biggest breakthroughs are to be made in the areas of inference and dynamic modeling. Creating a model, or perhaps fixing an existing one, so that it provides realistic generative and inference mechanisms that can identifiably infer the parameters of a large real-world network would make a great contribution to the statistical network modeling community.
Acknowledgments

This research was partly supported by United States National Institute of General Medical Sciences Center of Excellence grant P50 GM071508, by National Science Foundation grants DBI-0546275, IIS-0513552, by National Institutes of Health grant R01 GM071966 to Princeton University, by National Science Foundation grant DMS-0907009 to Harvard University, and by National Science Foundation grant DMS-0631589 and partial support from U.S. Army Research Office Contract W911NF0910360 to the Department of Statistics, Carnegie Mellon University. Edoardo M. Airoldi was a postdoctoral fellow in the Department of Computer Science and the Lewis-Sigler Institute for Integrative Genomics at Princeton University when a large portion of this work was carried out.

We thank three anonymous reviewers for their valuable comments, as well as their helpful additions and corrections to our citation list. We thank Joseph Blitzstein and Pavel Krivitsky for a careful reading and the correction of a number of infelicities. We finally wish to thank László Barabási and Zoltán Oltvai; Peter Bearman, James Moody, and Katherine Stovel; James Fowler and Nicholas Christakis; Purnamrita Sarkar and Andrew Moore for giving permission to reprint figures from their original papers [27; 31; 65; 263].

Bibliography

[1] E. M. Airoldi. Bayesian Mixed Membership Models of Complex and Evolving Networks. PhD thesis, School of Computer Science, Carnegie Mellon University, 2006.
[2] E. M. Airoldi. Model-based clustering for social networks: Discussion. Journal of the Royal Statistical Society, Series A, 170(2):330–331, 2007.
[3] E. M. Airoldi. Getting started in probabilistic graphical models. PLoS Computational Biology, 3(12):e252, 2007.
[4] E. M. Airoldi. A family of distributions on the unit hypercube. Technical Report 2, Department of Statistics, Harvard University, 2009.
[5] E. M.
Airoldi. The exchangeable graph model. Technical Report 1, Department of Statistics, Harvard University, 2009.
[6] E. M. Airoldi and K. M. Carley. Sampling algorithms for pure network topologies: A study on the stability and the separability of metric embeddings. ACM SIGKDD Explorations, 7(2):13–22, 2005.
[7] E. M. Airoldi, D. M. Blei, E. P. Xing, and S. E. Fienberg. A latent mixed-membership model for relational data. In Proceedings of the 3rd International Workshop on Link Discovery: Issues, Approaches and Applications (LinkKDD ’05), in conjunction with the 11th International ACM SIGKDD Conference, pages 82–89. ACM Press, New York, 2005.
[8] E. M. Airoldi, D. M. Blei, S. E. Fienberg, and E. P. Xing. Mixed membership analysis of high-throughput interaction studies: Relational data. 0294, 2007.
[9] E. M. Airoldi, D. M. Blei, S. E. Fienberg, and E. P. Xing. Mixed membership stochastic blockmodels. Journal of Machine Learning Research, 9:1981–2014, 2008.
[10] L. Akoglu and C. Faloutsos. RTG: A recursive realistic graph generator using random typing. In Data Mining and Knowledge Discovery, 19(2):194–209, Springer Netherlands, 2009.
[11] R. D. Alba. A graph-theoretic definition of a sociometric clique. Journal of Mathematical Sociology, 3:113–126, 1973.
[12] R. Albert and A.-L. Barabási. Statistical mechanics of complex networks. Reviews of Modern Physics, 74(1):47–97, 2002.
[13] R. Albert, H. Jeong, and A.-L. Barabási. Diameter of the world wide web. Nature, 401:130–131, 1999.
[14] D. L. Alderson. Catching the ‘network science’ bug: Insight and opportunity for the operations researcher. Operations Research, 56(5):1047–1065, 2008.
[15] D. J. Aldous. Exchangeability and related topics. In Lecture Notes in Mathematics, volume 1117, pages 1–198. Springer Berlin / Heidelberg, 1985. (Also in Ecole d’Ete St Flour 1983).
[16] S. Allesina, D. Alonso, and M. Pascual.
A general model for food web structure. Science, 320(5876):658–661, 2008.
[17] U. Alon. Network motifs: Theory and experimental approaches. Nature Reviews Genetics, 8:450–461, 2007.
[18] L. A. N. Amaral, A. Scala, M. Barthélémy, and H. E. Stanley. Classes of small-world networks. Proceedings of the National Academy of Sciences, 97(21):11149–11152, 2000.
[19] P. Arabie, S. A. Boorman, and P. R. Levitt. Constructing blockmodels: How and why. Journal of Mathematical Psychology, 17(1):21–63, 1978.
[20] L. Backstrom, D. Huttenlocher, J. Kleinberg, and X. Lan. Group formation in large social networks: Membership, growth, and evolution. In Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 44–54. ACM Press, New York, 2006.
[21] D. Banks and K. M. Carley. Metric inference for social networks. Journal of Classification, 11(1):121–149, 1994.
[22] D. Banks and K. M. Carley. Models for network evolution. Journal of Mathematical Sociology, 21:173–196, 1996.
[23] E. Banks, E. Nabieva, R. Peterson, and M. Singh. NetGrep: Fast network schema searches in interactomes. Genome Biology, 9(9):R138, 2008. http://genomebiology.com/content/9/9/R138.
[24] A.-L. Barabási. Linked: The New Science of Networks. Perseus, Cambridge, MA, 2002.
[25] A.-L. Barabási. The origin of bursts and heavy tails in human dynamics. Nature, 435:207–211, 2005.
[26] A.-L. Barabási and R. Albert. Emergence of scaling in random networks. Science, 286(5439):509–512, 1999.
[27] A.-L. Barabási and Z. Oltvai. Network biology: Understanding the cell’s functional organization. Nature Reviews Genetics, 5(2):101–113, 2004.
[28] A.-L. Barabási, H. Jeong, Z. Neda, E. Ravasz, A. Schubert, and T. Vicsek. Evolution of the social network of scientific collaboration. Physica A, 311(3–4):590–614, 2002.
[29] F. Bassetti, M.
Cosentino Lagomarsino, and S. Mandra. Exchangeable random networks. Internet Mathematics, 4(4):357–400, 2007.
[30] J. Baumes, M. Goldberg, M. Magdon-Ismail, and W. A. Wallace. Discovering hidden groups in communication networks. In Lecture Notes in Computer Science, volume 3073, pages 378–389. Springer Berlin / Heidelberg, 2004.
[31] P. S. Bearman, J. Moody, and K. Stovel. Chains of affection: The structure of adolescent romantic and sexual networks. American Journal of Sociology, 110(1):44–91, 2004.
[32] A. Bernard, D. S. Vaughn, and A. J. Hartemink. Reconstructing the topology of protein complexes. In T. Speed and H. Huang, editors, Research in Computational Molecular Biology 2007 (RECOMB07), volume 4453 of Lecture Notes in Bioinformatics, pages 32–46. Springer Berlin / Heidelberg, 2007.
[33] J. Besag. Spatial interaction and the statistical analysis of lattice systems. Journal of the Royal Statistical Society, Series B, 36(2):192–236, 1974.
[34] I. Bezáková, A. Kalai, and R. Santhanam. Graph model selection using maximum likelihood. In Proceedings of the 23rd International Conference on Machine Learning, volume 148 of ACM International Conference Proceeding Series, pages 105–112. ACM Press, New York, 2006.
[35] S. Bhamidi, G. Bresler, and A. Sly. Mixing time of exponential random graphs. In Proceedings of the 49th Annual IEEE Symposium on Foundations of Computer Science, pages 803–812. IEEE Computer Society, Washington, D.C., 2008.
[36] I. Bhattacharya. Collective Entity Resolution in Relational Data. PhD thesis, University of Maryland, 2006.
[37] P. J. Bickel and A. Chen. A nonparametric view of network models and Newman-Girvan and other modularities. Proceedings of the National Academy of Sciences, (to appear), 2009.
[38] C. M. Bishop. Neural Networks for Pattern Recognition. Oxford University Press, 1995.
[39] Y. M. M. Bishop, S.
E. Fienberg, and P. W. Holland. Discrete Multivariate Analysis: Theory and Practice. MIT Press, Cambridge, MA, 1975. Reprinted by Springer-Verlag, 2007.
[40] D. M. Blei and S. E. Fienberg. Model-based clustering for social networks: Discussion. Journal of the Royal Statistical Society, Series A, 170(2):332, 2007.
[41] D. M. Blei, A. Y. Ng, and M. I. Jordan. Latent Dirichlet allocation. Journal of Machine Learning Research, 3:993–1022, 2003.
[42] J. Blitzstein and P. Diaconis. A sequential importance sampling algorithm for generating random graphs with prescribed degrees. Technical report, Stanford University, 2006.
[43] B. Bollobás. Random Graphs. Cambridge University Press, New York, 2nd edition, 2001.
[44] B. Bollobás and F. R. K. Chung. The diameter of a cycle plus a random matching. SIAM Journal on Discrete Mathematics, 1(3):328–333, 1988.
[45] B. Bollobás, S. Janson, and O. Riordan. The phase transition in inhomogeneous random graphs. Random Structures & Algorithms, 31(1):3–122, 2007.
[46] E. Bonabeau. Agent-based modeling: Methods and techniques for simulating human systems. Proceedings of the National Academy of Sciences, 99(Suppl. 3):7280–7287, 2002.
[47] D. Botstein, S. A. Chervitz, and J. M. Cherry. Yeast as a model organism. Science, 277(5330):1259–1260, 1997.
[48] U. Brandes and T. Erlebach, editors. Network Analysis: Methodological Foundations, volume 3418 of Lecture Notes in Computer Science. Springer Berlin / Heidelberg, 2005.
[49] M. Braun and J. McAuliffe. Variational inference for large-scale models of discrete choice, 2007.
[50] M. Buchanan. Nexus: Small Worlds and the Groundbreaking Science of Networks. W. W. Norton & Company, New York, 2002.
[51] M.-L. G. Buot and D. S. P. Richards. Counting and locating the solutions of polynomial systems of maximum likelihood equations, I. Journal of Symbolic Computation, 41(2):234–244, 2006.
[52] M.-L.
G. Buot and D. S. P. Richards. Counting and locating the solutions of polynomial systems of maximum likelihood equations, II: The Behrens-Fisher problem. http://arXiv.org/abs/0709.0957, 2007.
[53] R. S. Burt. Models of network structure. Annual Review of Sociology, 6:79–141, 1980.
[54] K. M. Carley. Group stability: A socio-cognitive approach. In E. Lawler, B. Markovsky, C. Ridgeway, and H. Walker, editors, Advances in Group Processes, pages 1–44. JAI Press, Greenwich, CT, 1990.
[55] K. M. Carley. Smart agents and organizations of the future. In L. Lievrouw and S. Livingstone, editors, The Handbook of New Media, pages 206–220. Sage, Thousand Oaks, CA, 2002.
[56] K. M. Carley and A. Newell. The nature of the social agent. Journal of Mathematical Sociology, 19(4):221–262, 1994.
[57] K. M. Carley and J. Reminga. ORA: Organizational Risk Analyzer. http://www.casos.cs.cmu.edu/projects/ora/, 2004.
[58] K. M. Carley and D. Skillicorn. Special issue on analyzing large scale networks: The Enron corpus. Computational & Mathematical Organization Theory, 11(3):179–181, Springer Netherlands, 2005.
[59] K. M. Carley, D. B. Fridsma, E. Casman, A. Yahja, N. Altman, L.-C. Chen, B. Kaminsky, and D. Nave. BioWar: Scalable agent-based model of bioattacks. IEEE Transactions on Systems, Man, and Cybernetics, Part A: Systems and Humans, 36(2):252–265, 2006.
[60] D. Chakrabarti, Y. Zhan, and C. Faloutsos. R-MAT: A recursive model for graph mining. In Proceedings of the 4th SIAM International Conference on Data Mining, 2004.
[61] J. Chang and D. M. Blei. Relational topic models for document networks. In Proceedings of the 12th International Conference on Artificial Intelligence and Statistics (AISTATS ’09), 2009.
[62] H. Chen, E. Reid, J. Sinai, A. Silke, and B. Ganor, editors. Terrorism Informatics: Knowledge Management and Data Mining for Homeland Security.
Springer-Verlag, New York, 2008.
[63] Q. Chen, H. Chang, R. Govindan, S. Jamin, S. J. Shenker, and W. Willinger. The origin of power laws in internet topologies revisited. In Proceedings of the 21st Annual Joint Conference of the IEEE Computer and Communication Societies, 2:608–617, 2002.
[64] J. M. Cherry, C. Ball, S. Weng, G. Juvik, R. Schmidt, C. Adler, B. Dunn, S. Dwight, L. Riles, R. K. Mortimer, and D. Botstein. Genetic and physical maps of Saccharomyces cerevisiae. Nature, 387(6632 Suppl.):67–73, 1997.
[65] N. A. Christakis and J. H. Fowler. The spread of obesity in a large social network over 32 years. New England Journal of Medicine, 357:370–379, 2007.
[66] N. A. Christakis and J. H. Fowler. The collective dynamics of smoking in a large social network. New England Journal of Medicine, 358:2249–2258, 2008.
[67] N. A. Christakis and J. H. Fowler. Dynamic spread of happiness in a large social network: Longitudinal analysis over 20 years in the Framingham Heart Study. British Medical Journal, 337:a2338, 2008.
[68] N. A. Christakis and J. H. Fowler. Connected: The Surprising Power of Our Social Networks and How They Shape Our Lives. Little, Brown and Co., New York, 2009.
[69] F. Chung and L. Lu. Complex Graphs and Networks. American Mathematical Society, Providence, RI, 2006.
[70] F. Chung, L. Lu, and V. Vu. The spectra of random graphs with given expected degrees. Proceedings of the National Academy of Sciences, 100(11):6313–6318, 2003.
[71] A. Clauset. Finding local community structure in networks. Physical Review E, 72(2):026132, 2005.
[72] A. Clauset and C. Moore. How do networks become navigable? abs/cond-mat/0309415, 2003.
[73] A. Clauset, C. Moore, and M. E. J. Newman. Hierarchical structure and the prediction of missing links in networks. Nature, 453:98–101, 2008.
[74] A. Clauset, C. R. Shalizi, and M. E. J. Newman.
Power-law distributions in empirical data. SIAM Review, 51(4):661–703, 2009.
[75] R. Clegg, R. Landa, U. Harder, and M. Rio. Evaluating and optimising models of network growth, 2009.
[76] P. Clifford. Markov random fields in statistics. In G. R. Grimmett and D. J. A. Welsh, editors, Disorder in Physical Systems: A Volume in Honour of John M. Hammersley, pages 19–32. Oxford University Press, 1990.
[77] E. Cohen-Cole and J. M. Fletcher. Is obesity contagious? Social networks vs. environmental factors in the obesity epidemic. Journal of Health Economics, 27:1382–1387, 2008.
[78] E. Cohen-Cole and J. M. Fletcher. Detecting implausible social network effects in acne, height, and headaches: Longitudinal analysis. British Medical Journal, 337:a2533, 2008.
[79] J. Copic, M. O. Jackson, and A. Kirman. Identifying community structures from network data via maximum likelihood methods. The B.E. Journal of Theoretical Economics, 9(1), 2009.
[80] A. Davis, B. B. Gardner, M. R. Gardner, and J. J. Wallach. Deep South: A Social Anthropological Study of Caste and Class. University of Chicago Press, 1941. Reprinted by University of South Carolina Press, 2009.
[81] G. B. Davis and K. M. Carley. Clearing the FOG: Fuzzy, overlapping groups for social networks. Social Networks, 30(3):201–212, 2008.
[82] B. de Finetti. Theory of Probability, Vol. 1-2. John Wiley & Sons, New York, 1990. Reprint of the 1974–1975 translation.
[83] D. J. de Solla Price. Networks of scientific papers: The pattern of bibliographic references indicates the nature of the scientific research front. Science, 149(3683):510–515, 1965.
[84] P. Diaconis and S. Janson. Graph limits and exchangeable random graphs. Technical report, Department of Statistics, Stanford University, 2008.
[85] P. Diaconis and B. Sturmfels. Algebraic algorithms for sampling from conditional distributions.
Annals of Statistics, 26(1):363–397, 1998.
[86] P. S. Dodds, R. Muhamad, and D. J. Watts. An experimental study of search in global social networks. Science, 301(5634):827–829, 2003.
[87] P. Domingos. Mining social networks for viral marketing. IEEE Intelligent Systems, 20(1):80–82, 2005.
[88] P. Doreian, V. Batagelj, and A. Ferligoj. Generalized blockmodeling of two-mode network data. Social Networks, 26:29–53, 2004.
[89] P. Doreian, V. Batagelj, and A. Ferligoj. Generalized Blockmodeling (Structural Analysis in the Social Sciences). Cambridge University Press, 2004.
[90] S. N. Dorogovtsev and J. F. F. Mendes. Scaling behavior of developing and decaying networks. Europhysics Letters, 52(1):33, 2000.
[91] R. Durrett. Random Graph Dynamics. Cambridge University Press, 2006.
[92] N. Eagle, A. Pentland, and D. Lazer. Inferring friendship network structure by using mobile phone data. Proceedings of the National Academy of Sciences, 106(36):15274–15278, 2009.
[93] P. Erdös and A. Rényi. On Random Graphs, I. Publicationes Mathematicae, 6:290–297, 1959.
[94] P. Erdös and A. Rényi. The evolution of random graphs. Magyar Tud. Akad. Mat. Kutató Int. Közl., 5:17–61, 1960.
[95] E. A. Erosheva, S. E. Fienberg, and J. Lafferty. Mixed-membership models of scientific publications. Proceedings of the National Academy of Sciences, 101(Suppl. 1):5220–5227, 2004.
[96] E. Even-Dar and M. Kearns. A small world threshold for economic network formation. In B. Schölkopf, J. Platt, and T. Hoffman, editors, Advances in Neural Information Processing Systems (NIPS), volume 19, pages 385–392. MIT Press, Cambridge, MA, 2007.
[97] M. Faloutsos, P. Faloutsos, and C. Faloutsos. On power-law relationships of the internet topology.
In Proceedings of the Conference on Applications, Technologies, Architectures, and Protocols for Computer Communication (SIGCOMM ’99), pages 251–261. ACM Press, New York, 1999.
[98] S. Fields and O. Song. A novel genetic system to detect protein-protein interactions. Nature, 340(6230):245–246, 1989.
[99] S. E. Fienberg. Analysis of Cross-Classified Categorical Data. MIT Press, Cambridge, MA, 2nd edition, 1980. Reprinted by Springer-Verlag, 2007.
[100] S. E. Fienberg and S. K. Lee. On small world statistics. Psychometrika, 40(2):219–228, 1975.
[101] S. E. Fienberg and S. S. Wasserman. Categorical data analysis of single sociometric relations. Sociological Methodology, pages 156–192, 1981.
[102] S. E. Fienberg and S. S. Wasserman. An exponential family of probability distributions for directed graphs: Comment. Journal of the American Statistical Association, 76(373):54–57, 1981.
[103] S. E. Fienberg, M. M. Meyer, and S. S. Wasserman. Statistical analysis of multiple sociometric relations. Journal of the American Statistical Association, 80:51–67, 1985.
[104] S. E. Fienberg, S. Petrović, and A. Rinaldo. Algebraic statistics for p1 random graph models: Markov bases and their uses. In S. Sinharay and N. J. Dorans, editors, Papers in Honor of Paul W. Holland. Educational Testing Service, 2009.
[105] J. Flannick, A. Novak, B. S. Srinivasan, H. H. McAdams, and S. Batzoglou. Græmlin: General and robust alignment of multiple large interaction networks. Genome Research, 16(9):1169–1181, 2006.
[106] A. D. Flaxman, A. M. Frieze, and J. Vera. A geometric preferential attachment model of networks. Internet Mathematics, 3(2):187–206, 2006.
[107] A. D. Flaxman, A. M. Frieze, and J. Vera. A geometric preferential attachment model of networks II. Internet Mathematics, 4(1):87–112, 2007.
[108] J. Fowler and N. Christakis. Estimating peer effects on health in social networks.
Journal of Health Economics, 27(5):1400–1405, 2008.
[109] O. Frank. Network sampling and model fitting. In P. J. Carrington, J. Scott, and S. S. Wasserman, editors, Models and Methods in Social Network Analysis, pages 31–56. Cambridge University Press, 2005.
[110] O. Frank and D. Strauss. Markov graphs. Journal of the American Statistical Association, 81(395):832–842, 1986.
[111] N. Friedman. Inferring cellular networks using probabilistic graphical models. Science, 303(5659):799–805, 2004.
[112] N. Friedman, L. Getoor, D. Koller, and A. Pfeffer. Learning probabilistic relational models. In Proceedings of the 16th International Joint Conference on Artificial Intelligence (IJCAI-99), pages 1300–1309, 1999.
[113] M. T. Gastner and M. E. J. Newman. Shape and efficiency in spatial distribution networks. Journal of Statistical Mechanics: Theory and Experiment, 1:P01015, 2006.
[114] A.-C. Gavin, M. Bösche, R. Krause, P. Grandi, M. Marzioch, A. Bauer, J. Schultz, J. M. Rick, A.-M. Michon, C.-M. Cruciat, M. Remor, C. Höfert, M. Schelder, M. Brajenovic, H. Ruffner, A. Merino, K. Klein, M. Hudak, D. Dickson, T. Rudi, V. Gnau, A. Bauch, S. Bastuck, B. Huhse, C. Leutwein, M.-A. Heurtier, R. R. Copley, A. Edelmann, E. Querfurth, V. Rybin, G. Drewes, M. Raida, T. Bouwmeester, P. Bork, B. Seraphin, B. Kuster, G. Neubauer, and G. Superti-Furga. Functional organization of the yeast proteome by systematic analysis of protein complexes. Nature, 415:141–147, 2002.
[115] A.-C. Gavin, P. Aloy, P. Grandi, R. Krause, M. Boesche, M. Marzioch, C. Rau, L. J. Jensen, S. Bastuck, B. Dümpelfeld, A. Edelmann, M.-A. Heurtier, V. Hoffman, C. Hoefert, K. Klein, M. Hudak, A.-M. Michon, M. Schelder, M. Schirle, M. Remor, T. Rudi, S. Hooper, A. Bauer, T. Bouwmeester, G. Casari, G. Drewes, G. Neubauer, J. M. Rick, B. Kuster, P. Bork, R. B. Russell, and G. Superti-Furga.
Proteome survey reveals modularity of the yeast cell machinery. Nature, 440(7084):631–636, 2006.
[116] L. Getoor and B. Taskar, editors. Introduction to Statistical Relational Learning. MIT Press, Cambridge, MA, 2007.
[117] L. Getoor, N. Friedman, D. Koller, and B. Taskar. Learning probabilistic models of link structure. Journal of Machine Learning Research, 3:679–707, 2003.
[118] C. J. Geyer and E. A. Thompson. Constrained Monte Carlo maximum likelihood for dependent data (with discussion). Journal of the Royal Statistical Society, Series B, 54:657–699, 1992.
[119] E. N. Gilbert. Random graphs. Annals of Mathematical Statistics, 30(4):1141–1144, 1959.
[120] K. J. Gile and M. S. Handcock. Model-based assessment of the impact of missing data on inference for networks. CSSS Working paper No. 66, 2006.
[121] P. S. Gill and T. B. Swartz. Bayesian analysis of directed graphs data with application to social networks. Applied Statistics, 53(2):249–260, 2004.
[122] M. Girvan and M. E. J. Newman. Community structure in social and biological networks. Proceedings of the National Academy of Sciences, 99(12):7821–7826, 2002.
[123] K. S. Gleditsch. Expanded trade and GDP data. Journal of Conflict Resolution, 46(5):712–724, 2002.
[124] A. Globerson, G. Chechik, F. Pereira, and N. Tishby. Euclidean embedding of co-occurrence data. Journal of Machine Learning Research, 8:2265–2295, 2007.
[125] S. Goel and M. J. Salganik. Respondent-driven sampling as Markov chain Monte Carlo. Statistics in Medicine, 28(17):2202–2229, 2009.
[126] A. Goldenberg and A. Moore. Tractable learning of large Bayes net structures from sparse data. In Proceedings of the 21st International Conference on Machine Learning, page 44. ACM Press, New York, 2004.
[127] A. Goldenberg and A. Moore. Bayes net graphs to understand coauthorship networks.
In KDD Workshop on Link Discovery: Issues, Approaches and Applications, 2005.
[128] A. Goldenberg and A. Zheng. Exploratory study of a new model for evolving networks. In E. M. Airoldi, D. M. Blei, S. E. Fienberg, A. Goldenberg, E. P. Xing, and A. X. Zheng, editors, Statistical Network Analysis: Models, Issues and New Directions, volume 4503 of Lecture Notes in Computer Science. Springer Berlin / Heidelberg, 2007.
[129] S. M. Goodreau, M. S. Handcock, D. R. Hunter, C. T. Butts, and M. Morris. A statnet tutorial. Journal of Statistical Software, 24(9):1–26, 2008.
[130] S. M. Goodreau, J. A. Kitts, and M. Morris. Birds of a feather, or friend of a friend? Using exponential random graph models to investigate adolescent social networks. Demography, 46(1):103–125, 2009.
[131] S. Goyal. Connections: An Introduction to the Economics of Networks. Princeton University Press, 2007.
[132] B. Grün and F. Leisch. Dealing with label switching in mixture models under genuine multimodality. Journal of Multivariate Analysis, 100(5):851–861, 2008.
[133] A. Guetz and P. Constantine. Lecture notes for course on Information Networks, 2007. http://www.stanford.edu/class/msande337/notes/Lec1.pdf.
[134] S. J. Haberman. An exponential family of probability distributions for directed graphs: Comment. Journal of the American Statistical Association, 76(373):60–61, 1981.
[135] M. S. Handcock and K. J. Gile. Modeling networks from sampled data. Annals of Applied Statistics, 4(1), 2010.
[136] M. S. Handcock, G. L. Robins, T. A. B. Snijders, J. Moody, and J. Besag. Assessing degeneracy in statistical models of social networks. Journal of the American Statistical Association, 76:33–50, 2003.
[137] M. S. Handcock, A. E. Raftery, and J. Tantrum. Model-based clustering for social networks (with discussion). Journal of the Royal Statistical Society, Series A, 170:301–354, 2007.
[138] M. S.
Handcock, D. R. Hunter, C. T. Butts, S. M. Goodreau, and M. Morris. statnet: Software tools for the representation, visualization, analysis and simulation of network data. Journal of Statistical Software, 24(1):12–25, 2008.
[139] S. Hanneke and E. P. Xing. Discrete temporal models of social networks. In E. M. Airoldi, D. M. Blei, S. E. Fienberg, A. Goldenberg, E. P. Xing, and A. X. Zheng, editors, Statistical Network Analysis: Models, Issues and New Directions, volume 4503 of Lecture Notes in Computer Science. Springer Berlin / Heidelberg, 2007.
[140] L. P. Hansen and J. A. Scheinkman. Back to the future: Generating moment implications for continuous-time Markov processes. Econometrica, 63(4):767–804, 1995.
[141] K. M. Harris, F. Florey, J. Tabor, P. S. Bearman, J. Jones, and R. J. Udry. The National Longitudinal Study of Adolescent Health: Research Design. Technical report, Carolina Population Center, University of North Carolina, Chapel Hill, 2003.
[142] S. Hill, F. Provost, and C. Volinsky. Network-based marketing: Identifying likely adopters via consumer networks. Statistical Science, 21(2):256–276, 2006.
[143] Y. Ho, A. Gruhler, A. Heilbut, G. D. Bader, L. Moore, S.-L. Adams, A. Millar, P. Taylor, K. Bennett, K. Boutilier, L. Yang, C. Wolting, I. Donaldson, S. Schandorff, J. Shewnarane, M. Vo, J. Taggart, M. Goudreault, B. Muskat, C. Alfarano, D. Dewar, Z. Lin, K. Michalickova, A. R. Willems, H. Sassi, P. A. Nielsen, K. J. Rasmussen, J. R. Andersen, L. E. Johansen, L. H. Hansen, H. Jespersen, A. Podtelejnikov, E. Nielsen, J. Crawford, V. Poulsen, B. D. Sørensen, J. Matthiesen, R. C. Hendrickson, F. Gleeson, T. Pawson, M. F. Moran, D. Durocher, M. Mann, C. W. V. Hogue, D. Figeys, and M. Tyers. Systematic identification of protein complexes in Saccharomyces cerevisiae by mass spectrometry. Nature, 415:180–183, 2002.
[144] P. D. Hoff.
Random effects models for network data. In R. Breiger, K. M. Carley, and P. E. Pattison, editors, Dynamic Social Network Modeling and Analysis: Workshop Summary and Papers, pages 303–312. The National Academies Press, Washington, D.C., 2003.
[145] P. D. Hoff. Modeling homophily and stochastic equivalence in symmetric relational data. In J. C. Platt, D. Koller, Y. Singer, and S. Roweis, editors, Advances in Neural Information Processing Systems (NIPS), volume 20, pages 657–664. MIT Press, 2008.
[146] P. D. Hoff, A. E. Raftery, and M. S. Handcock. Latent space approaches to social network analysis. Journal of the American Statistical Association, 97(460):1090–1098, 2002.
[147] P. W. Holland and S. Leinhardt. Local structure in social networks. Sociological Methodology, 7:1–45, 1976.
[148] P. W. Holland and S. Leinhardt. A dynamic model for social networks. Journal of Mathematical Sociology, 5(1):5–20, 1977.
[149] P. W. Holland and S. Leinhardt. An exponential family of probability distributions for directed graphs (with discussion). Journal of the American Statistical Association, 76(373):33–65, 1981.
[150] P. W. Holland, K. B. Laskey, and S. Leinhardt. Stochastic blockmodels: First steps. Social Networks, 5(2):109–137, 1983.
[151] P. Holme, J. Karlin, and S. Forrest. An integrated model of traffic, geography and economy in the internet. ACM SIGCOMM Computer Communication Review, 38(3):7–15, 2008.
[152] B. A. Huberman and L. A. Adamic. Growth dynamics of the world-wide web. Nature, 401:131, 1999.
[153] S. Huh and S. E. Fienberg. Temporally-evolving mixed membership stochastic blockmodels: Exploring the Enron e-mail database. In Proceedings of the NIPS Workshop on Analyzing Graphs: Theory & Applications, Whistler, British Columbia, 2008.
[154] M. Huisman and C. Steglich. Treatment of non-response in longitudinal network studies.
Social Networks, 30(4):297–308, 2008.
[155] D. R. Hunter and M. S. Handcock. Inference in curved exponential family models for networks. Journal of Computational and Graphical Statistics, 15(3):565–583, 2006.
[156] D. R. Hunter, S. M. Goodreau, and M. S. Handcock. Goodness of fit of social network models. Journal of the American Statistical Association, 103(481):248–258, 2008.
[157] D. R. Hunter, M. S. Handcock, C. T. Butts, S. M. Goodreau, and M. Morris. ergm: A package to fit, simulate and diagnose exponential-family models for networks. Journal of Statistical Software, 24(3), 2008. http://www.jstatsoft.org/v24/i03/paper.
[158] M. Huss and P. Holme. Currency and commodity metabolites: Their identification and relation to the modularity of metabolic networks. IET Systems Biology, 1:280–285, 2007.
[159] T. Ito, K. Tashiro, S. Muta, R. Ozawa, T. Chiba, M. Nishizawa, K. Yamamoto, S. Kuhara, and Y. Sakaki. Toward a protein-protein interaction map of the budding yeast: A comprehensive system to examine two-hybrid interactions in all possible combinations between the yeast proteins. Proceedings of the National Academy of Sciences, 97(3):1143–1147, 2000.
[160] M. O. Jackson. Social and Economic Networks. Princeton University Press, 2008.
[161] S. Janson, T. Luczak, and A. Ruciński. Random Graphs. John Wiley & Sons, New York, 2000.
[162] L. J. Jensen and P. Bork. Biochemistry: Not comparable, but complementary. Science, 322(5898):56–57, 2008.
[163] J. H. Jones and M. S. Handcock. Social networks (communication arising): Sexual contacts and epidemic thresholds. Nature, 423:605–606, 2003.
[164] J. H. Jones and M. S. Handcock. An assessment of preferential attachment as a mechanism for human sexual network formation. In Proceedings of the Royal Society, Series B, volume 270, number 1520, pages 1123–1128, 2003.
[165] O. Kallenberg.
Probabilistic symmetries and invariance principles. In Probability and its Applications. Springer, New York, 2005.
[166] L. Katz. The distribution of the number of isolates in a social group. Annals of Mathematical Statistics, 23(2):271–276, 1952.
[167] L. Katz and J. H. Powell. Probability distributions of random variables associated with a structure of the sample space of sociometric investigations. Annals of Mathematical Statistics, 28(2):442–448, 1957.
[168] L. Katz and T. R. Wilson. The variance of the number of mutual choices in sociometry. Psychometrika, 21(3):299–304, 1956.
[169] M. Kearns, S. Suri, and N. Montfort. An experimental study of the coloring problem on human subject networks. Science, 313(5788):824–827, 2006.
[170] D. Kempe, J. Kleinberg, and E. Tardos. Influential nodes in a diffusion model for social networks. In Automata, Languages and Programming, volume 3580 of Lecture Notes in Computer Science, pages 1127–1138. Springer Berlin / Heidelberg, 2005.
[171] J. M. Kleinberg. Authoritative sources in a hyperlinked environment. Journal of the ACM (JACM), 46(5):604–632, 1999.
[172] J. M. Kleinberg. Navigation in a small world: It is easier to find short chains between points in some networks than others. Nature, 406:845, 2000.
[173] J. M. Kleinberg. The small-world phenomenon: An algorithmic perspective. In Proceedings of the 32nd ACM Symposium on Theory of Computing, pages 163–170. ACM Press, New York, 2000.
[174] J. M. Kleinberg. Small-world phenomena and the dynamics of information. In Advances in Neural Information Processing Systems (NIPS), volume 14. MIT Press, Cambridge, MA, 2001.
[175] J. M. Kleinberg, S. R. Kumar, P. Raghavan, S. Rajagopalan, and A. S. Tomkins. The web as a graph: Measurements, models and methods. In Computing and Combinatorics, volume 1627 of Lecture Notes in Computer Science, pages 1–17.
Springer Berlin / Heidelb erg, 1999. [176] A. S. Klo vdahl, J. J. Potterat, D. E. W o o dhouse, J. B. Muth, S. Q. Muth, and W. W. Darro w. So cial net w orks and infectious disease: The Colorado Springs study . So cial Scienc e & Me dicine , 38(1):79–88, 1994. [177] E. D. Kolacyzk. Statistic al Anaysis of Network Mo dels . Springer, New Y ork, 2009. [178] J. Koskinen, G. L. Robins, and P . E. Pattison. Analysing exp onen tial random graph (p-star) mo dels with missing data using Bay esian data augmentation. T echnical rep ort, Departmen t of Psychology , School of Behavioural Science, Universit y of Melb ourne, Austrailia, 2008. [179] J. H. Koskinen and T. A. B. Snijders. Ba yesian inference for dynamic so cial net work data. Journal of Statistic al Planning and Infer enc e , 137(12):3930–3938, 2007. [180] G. Kossinets. Eﬀects of missing data in so cial net works. So cial Networks , 28(3):247– 268, 2006. [181] D. Krac khardt. The ties that torture: Simmelian tie analysis in organizations. R ese ar ch in the So ciolo gy of Or ganizations , 16:183–210, 1999. [182] V. E. Krebs. Mapping netw orks of terrorist cells. Conne ctions , 24(3):43–52, 2002. [183] P . N. Krivitsky , M. S. Handco c k, A. E. Raftery , and P . D. Hoﬀ. Representing degree distributions, clustering, and homophily in so cial netw orks with latent cluster random eﬀects mo dels. So cial Networks , 31(3):204–213, 2009. 80 [184] N. J. Krogan, G. Cagney , H. Y u, G. Zhong, X. Guo, A. Ignatchenk o, J. Li, S. Pu, N. Datta, A. P . Tikuisis, T. Punna, J. M. Peregr ´ ın-Alv arez, M. Shales, X. Zhang, M. Da v ey , M. D. Robinson, A. Paccanaro, J. E. Bray , A. Sheung, B. Beattie, D. P . Ric hards, V. Canadien, A. Lalev, F. Mena, P . W ong, A. Starostine, M. M. Canete, J. Vlasblom, S. W u, C. Orsi, S. R. Collins, S. Chandran, R. Ha w, J. J. Rilstone, K. Gandi, N. J. Thompson, G. Musso, P . St Onge, S. Ghann y , M. H. Lam, G. Butland, A. M. Altaf-Ul, S. Kanay a, A. Shilatifard, E. O’Shea, J. S. W eissman, C. J. 
Ingles, T. R. Hughes, J. P arkinson, M. Gerstein, S. J. W o dak, A. Emili, and J. F. Green blatt. Global landscap e of protein complexes in the yeast Sac char omyc es c er evisiae . Natur e , 440(7084):637–643, 2006. [185] R. Kumar, P . Raghav an, S. Ra jagopalan, D. Siv akumar, A. T omkins, and E. Upfal. Sto c hastic mo dels for the w eb graph. In Pr o c e e dings of the 41st Annual Symp osium on F oundations of Computer Scienc e , pages 57–65, 2000. [186] S. L. Lauritzen. Rasc h models with exc hangeable ro ws and columns. In J. M. Bernardo et al., Bayesian Statistics 7 , pages 215–232. Oxford Univ ersit y Press, 2003. [187] S. L. Lauritzen. Exc hangeable Rasch matrices. R endic onti di Matematic a, Serie VII , 28(1):83–95, 2008. [188] S. Lee and C. F. Stev ens. General design principle for scalable neural circuits in a v ertebrate retina. Pr o c e e dings of the National A c ademy of Scienc es , 104(31):12931– 12935, 2007. [189] R. T. A. J. Leenders. Mo dels for net w ork dynamics: A Marko vian framew ork. Journal of Mathematic al So ciolo gy , 20:1–21, 1995. [190] E. A. Leic h t, G. Clarkson, K. Shedden, and M. Newman. Large-scale structure of time ev olving citation net w orks. Eur op e an Physics Journal B , 59(1):75–83, 2007. [191] J. Lesk ov ec and C. F aloutsos. Sampling from large graphs. In Pr o c e e dings of the 12th A CM SIGKDD International Confer enc e on Know le dge Disc overy and Data Mining , pages 631–636. ACM Press, New Y ork, 2006. [192] J. Lesko vec, D. Chakrabarti, J. Klein b erg, and C. F aloutsos. Realistic, mathematically tractable graph generation and ev olution, using Kronec ker multiplication. In Know l- e dge Disc overy in Datab ases: PKDD 2005 , volume 3721 of L e ctur e Notes in Computer Scienc e , pages 133–145. Springer Berlin / Heidelb erg, 2005. [193] J. Lesko vec, J. Klein b erg, and C. F aloutsos. Graphs ov er time: Densiﬁcation laws, shrinking diameters and p ossible explanations. 
In Pr o c e e dings of the 11th ACM SIGKDD International Confer enc e on Know le dge Disc overy and Data Mining , pages 177–187. A CM Press, New Y ork, 2005. 81 [194] J. Lesk ov ec, J. Kleinberg, and C. F aloutsos. Graph ev olution: Densiﬁcation and shrink- ing diameters. ACM T r ansactions on Know le dge Disc overy fr om Data (TKDD) , 1(1): 2, 2007. [195] J. Lesk ov ec, D. Chakrabarti, J. Klein b erg, C. F aloutsos, and Z. Ghahramani. Kronec ker graphs: An approac h to mo deling netw orks. , 2009. [196] L. Li, D. Alderson, J. C. Doyle, and W. Willinger. T o wards a theory of scale-free graphs: Deﬁnition, prop er ties, and implications. Internet Mathematics , 2(4):431–523, 2005. [197] W. Li and A. McCallum. Pac hinko allo cation: DAG-structured mixture mo dels of topic correlations. In Pr o c e e dings of the 23r d International Confer enc e on Machine L e arning , volume 148 of ACM International Confer enc e Pr o c e e ding Series , pages 577– 584. A CM Press, New Y ork, 2006. [198] D. Lib en-Now ell and J. Kleinberg. The link prediction problem for so cial netw orks. In Pr o c e e dings of the 12th International Confer enc e on Information and Know le dge Management (CIKM ’03) , pages 556–559. ACM Press, New Y ork, 2003. [199] F. Lorrain and H. C. White. Structural equiv alence of individuals in so cial net w orks. Journal of Mathematic al So ciolo gy , 1:49–80, 1971. [200] R. D. Luce. Connectivity and generalized cliques in so ciometric group structure. Psy- chometrika , 15(2):169–190, 1950. [201] R. D. Luce. Net w orks satisfying minimality conditions. Americ an Journal of Mathe- matics , 75(4):825–838, 1953. [202] R. D. Luce and A. D. P erry . A metho d of matrix analysis of group structure. Psy- chometrika , 14(2):95–116, 1949. [203] R. D. Luce, J. Macy , Jr., and R. T agiuri. A statistical mo del for relational analysis. Psychometrika , 20(4):319–327, 1955. [204] G. S. Mann, D. Mimno, and A. McCallum. Bibliometric impact measures lev eraging topic analysis. 
In Pr o c e e dings of the 6th A CM/IEEE-CS Joint Confer enc e on Digial Libr aries , pages 65–74. ACM Press, New Y ork, 2006. [205] J.-M. Marin, K. Mengersen, and C. P . Rob ert. Ba yesian modelling and inference on mixtures of distributions. In D. Dey and C. R. Rao, editors, Handb o ok of Statistics 25 , pages 15840–15845. Elsevier Sciences, 2005. [206] T. F. May er. Parties and net works: Sto chastic models for relationship netw orks. Jour- nal of Mathematic al So ciolo gy , 10:51–103, 1984. 82 [207] A. McCallum, A. Corradda-Emmanuel, and X. W ang. T opic and role discov ery in so cial net works. In Pr o c e e dings of the International Joint Confer enc e on A rtiﬁcial Intel ligenc e , pages 786–791, 2005. [208] A. McCallum, X. W ang, and N. Mohan t y . Join t group and topic discov ery from re- lations and text. In E. M. Airoldi, D. M. Blei, S. E. Fienberg, A. Golden b erg, E. P . Xing, and A. Zheng, editors, Statistic al Network Analysis: Mo dels, Issues and New Dir e ctions , v olume 4503 of L e ctur e Notes in Computer Scienc e , pages 28–44. Springer Berlin / Heidelb erg, 2007. [209] K. McComb, C. Moss, S. M. Durant, L. Bak er, and S. Sayialel. Matriarc hs as rep osi- tories of so cial knowledge in African elephan ts. Scienc e , 292(5516):491–494, 2001. [210] P . McCullagh and J. A. Nelder. Gener alize d Line ar Mo dels . Chapman & Hall/CR C, 2nd edition, 1989. [211] J. W. McDonald, P . W. F. Smith, and J. J. F orster. Marko v chain Mon te Carlo exact inference for so cial netw orks. So cial Networks , 29(1):127–136, 2007. [212] M. McGlohon, L. Ak oglu, and C. F aloutsos. W eigh ted graphs and disconnected compo- nen ts: P atterns and a generator. In Pr o c e e dings of the 14th International Confer enc e on Know le dge Disc overy and Data Mining , pages 524–532. ACM Press, New Y ork, 2008. [213] M. M. Mey er. T ransforming contingency tables. A nnals of Statistics , 10(4):1172–1181, 1982. [214] M. Middendorf, E. Ziv, and C. H. Wiggins. 
Inferring netw ork mechanisms: The Drosophila melanogaster protein in teraction netw ork. Pr o c e e dings of the National A c ademy of Scienc es , 102(9):3192–3197, 2005. [215] S. Milgram. The small world problem. Psycholo gy T o day , 1(1):60–67, 1967. [216] D. Mimno and A. McCallum. Mining a digital library for inﬂuen tial authors. In Pr o c e e dings of the 7th ACM/IEEE-CS Joint Confer enc e on Digital Libr aries , pages 105–106. A CM Press, New Y ork, 2007. [217] N. Mishra, R. Schreiber, I. Stanton, and R. E. T arjan. Finding strongly-knit clusters in so cial net w orks. Internet Mathematics , 5(1-2):155–174, 2008. [218] M. Mitzenmac her. A brief history of generative mo dels for p o w er la w and lognormal distributions. Internet Mathematics , 1(2):226–251, 2004. [219] M. Molloy and B. Reed. A critical p oin t for random graphs with a giv en degree sequence. R andom Structur es and Algorithms , 6(2–3):161–180, 1995. 83 [220] M. Mollo y and B. Reed. The size of the largest comp onent of a random graph on a ﬁxed degree sequence. Combinatorics, Pr ob ability and Computing , 7:295–306, 1998. [221] J. Moreno. Who Shal l Survive? Nervous and Men tal Disease Publishing Compan y , W ashington, D.C., 1934. [222] M. Morris and M. Kretzschmar. Concurren t partnerships and transmission dynamics in net w orks. So cial Networks , 17(3–4):299–318, 1995. [223] M. Morris, M. S. Handco ck, W. C. Miller, C. A. F ord, J. L. Sc hmitz, M. M. Hobbs, M. S. Cohen, K. M. Harris, and J. R. Udry . Prev alence of HIV infection among young adults in the United States: Results from the Add Health Study . A meric an Journal of Public He alth , 96(6):1091–1097, 2006. [224] M. Morris, M. S. Handcock, and D. R. Hun ter. Sp eciﬁcation of exp onen tial-family ran- dom graph mo dels: T erms and computational asp ects. Journal of Statistic al Softwar e , 24(4), 2008. http://www.jstatsoft.org/v24/i04 . [225] Q. Morris, B. F rey , and C. P aige. Denoising and un tangling graphs using degree priors. 
In A dvanc es in Neur al Information Pr o c essing Systems (NIPS) , v olume 16. MIT Press, Cam bridge, MA, 2003. [226] S. Mostafavi, D. Ray , D. W arde-F arley , C. Grouios, and Q. Morris. GeneMANIA: A real-time multiple asso ciation netw ork integration algorithm for predicting gene function. Genome Biolo gy , 9(Suppl. 1):S4, 2008. [227] E. Nabiev a, K. Jim, A. Agarw al, B. Chazelle, and M. Singh. Whole-proteome predic- tion of protein function via graph-theoretic analysis of interaction maps. Bioinformat- ics , 21(Suppl. 1):i302–i310, 2005. [228] R. M. Neal. Bayesian L e arning for Neur al Networks , v olume 118 of L e ctur e Notes in Statistics . Springer-V erlag, New Y ork, 1996. [229] J. Neville and D. Jensen. Collectiv e classiﬁcation with relational dep endency net works. In Pr o c e e dings of the 2nd Multi-R elational Data Mining Workshop, 9th ACM SIGKDD International Confer enc e on Know le dge Disc overy and Data Mining , 2003. [230] J. Neville, D. Jensen, L. F riedland, and M. Ha y . Learning relational probabilit y trees. In Pr o c e e dings of the Ninth ACM SIGKDD International Confer enc e on Know le dge Disc overy and Data Mining , pages 625–630. A CM Press, New Y ork, 2003. [231] M. Newman, A.-L. Barab´ asi, and D. J. W atts, editors. The Structur e and Dynamics of Networks . Princeton Universit y Press, 2006. [232] M. E. J. Newman. Detecting comm unit y structure in netw orks. Eur op e an Physics Journal B , 38(2):321–330, 2004. 84 [233] M. E. J. Newman. Mo dularit y and communit y structure in net works. Pr o c e e dings of the National A c ademy of Scienc es , 103(23):8577–8582, 2006. [234] M. E. J. Newman. Finding comm unit y structure in net w orks using the eigen v ectors of matrices. Physic al R eview E , 74(3):036104, 2006. [235] M. E. J. Newman, D. J. W atts, and S. H. Strogatz. Random graph mo dels of so cial net w orks. Pr o c e e dings of the National A c ademy of Scienc e , 99(Suppl. 1):2566–2572, 2002. [236] K. Nowic ki and T. A. B. 
Snijders. Estimation and prediction for sto chastic blo c kstruc- tures. Journal of the Americ an Statistic al Asso ciation , 96(455):1077–1087, 2001. [237] M. Nunkesser and D. Sawitzki. Blo c kmo dels. In U. Brandes and T. Erlebach, editors, Network A nalysis , volume 3418 of L e ctur e Notes in Computer Scienc e , pages 253–292. Springer Berlin / Heidelb erg, 2005. [238] J. O’Madadhain, P . Smyth, and L. Adamic. Learning predictive models for link for- mation. In Pr o c e e dings of the International Sunb elt So cial Network Confer enc e , 2005. [239] J. P ark and M. E. J. Newman. Statistical mechanics of net w orks. Physic al R eview E , 70:066117, 2004. [240] J. P ark and M. E. J. Newman. Solution of the tw o-star mo del of a netw ork. Physics R eviews E , 70:066146, 2004. [241] P . E. Pattison and S. S. W asserman. Logit mo dels and logistic regressions for so cial net w orks: II. Multiv ariate relations. British Journal of Mathematic al and Statistic al Psycholo gy , 52(2):169–193, 1999. [242] D. M. Pennock, G. W. Flake, S. La wrence, E. J. Glov er, and C. L. Giles. Winners don’t tak e all: Characterizing the comp etition for links on the web. Pr o c e e dings of the National A c ademy of Scienc es , 99(8):5207–5211, 2002. [243] B. Pittel. On tree census and the gian t comp onent in sparse random graphs. R andom Structur es A lgorithms , 1(3):311–342, 1990. [244] R. Radner and A. T ritter. Communication in net w orks. T echnical Rep ort Ec2098, Co wles Commission, Univ ersit y of Chicago, 1954. [245] O. Ratmann, O. Jørgensen, T. Hinkley , M. P . H. Stumpf, S. Ric hardson, and C. Wiuf. Using likelihoo d-free inference to compare evolutionary dynamics of the protein net- w orks of H . pylori and P . falcip arum . PL oS Computational Biolo gy , 3(11):2266–2278, 2007. [246] O. Ratmann, C. Wiuf, and J. W. Pinney . F rom evidence to inference: Probing the ev olution of protein interaction netw orks. HFSP Journal , 3(5):290–306, 2009. 85 [247] P . Ra vikumar. 
Appr oximate Infer enc e, Structur e L e arning and F e atur e Estimation in Markov R andom Fields. PhD thesis, Machine Learning Department, School of Computer Science, Carnegie Mellon Universit y , 2007. [248] T. Reguly , A. Breitkreutz, L. Boucher, B.-J. Breitkreutz, G. C. Hon, C. L. My ers, A. P arsons, H. F riesen, R. Oughtred, A. T ong, C. Stark, Y. Ho, D. Botstein, B. An- drews, C. Bo one, O. G. T roy anskya, T. Idek er, K. Dolinski, N. N. Batada, and M. Ty- ers. Comprehensive curation and analysis of global interaction netw orks in Sac cha- r omyc es c er evisiae . Journal of Biolo gy , 5(4):11, 2006. [249] E. Reid and H. Chen. Mapping the contemporary terrorism researc h domain: Re- searc hers, publications, and institutions analysis. In Intel ligenc e and Se curity Infor- matics , volume 3495 of L e ctur e Notes in Computer Scienc e , pages 322–339. Springer Berlin / Heidelb erg, 2005. [250] E. Reid, J. Qin, W. Chung, J. Xu, Y. Zhou, R. Sc humak er, M. Sageman, and H. Chen. T errorism knowledge disco very pro ject: A kno wledge discov ery approac h to addressing the threats of terrorism. In Intel ligenc e and Se curity Informatics , v olume 3073 of L e ctur e Notes in Computer Scienc e , pages 125–145. Springer Berlin / Heidelb erg, 2004. [251] A. Rinaldo, S. E. Fienberg, and Y. Zhou. On the geometry of discrete exp onential families with application to exp onential random graph mo dels. Ele ctr onic Journal of Statististics , 3:446–484, 2009. [252] J. M. Rob erts, Jr. Simple metho ds for sim ulating so ciomatrices with giv en marginal totals. So cial Networks , 22(3):273–283, 2000. [253] G. L. Robins and P . E. P attison. Random graph mo dels for temp oral pro cesses in so cial net w orks. Journal of Mathematic al So ciolo gy , 25:5–41, 2001. [254] G. L. Robins, P . E. P attison, and S. S. W asserman. Logit mo dels and logistic regressions for so cial net w orks: III. Valued relations. Psychometrika , 64(3):371–394, 1999. [255] G. L. Robins, P . E. 
Pattison, and J. W o olcock. Missing data in netw orks: Exp onential random graph (p*) mo dels for netw orks with non-resp onden ts. So cial Networks , 26(3): 257–283, 2004. [256] G. L. Robins, T. A. B. Snijders, P . W ang, M. S. Handco c k, and P . E. P attison. Recent dev elopmen ts in exp onential random graph ( p ∗ ) mo dels for so cial netw orks. So cial Networks , 29(2):192–215, 2007. [257] T. T. Rogers and J. L. McClelland. Semantic Co gnition: A Par al lel Distribute d Pr o- c essing Appr o ach . MIT Press, Cam bridge, MA, 2004. [258] M. J. Salganik and D. D. Heck athorn. Sampling and estimation in hidden p opulations using resp onden t-driv en sampling. So ciolo gic al Metho dolo gy , 34:193–239, 2004. 86 [259] F. S. Sampson. A Novitiate in a Perio d of Change: A n Exp erimental and Case Study of So cial R elationships . PhD thesis, Cornell Universit y , 1968. [260] O. Sandberg. Se ar ching in a Smal l World . PhD thesis, Division of Mathematical Statistics, Departmen t of Mathematical Sciences, Chalmers Univ ersity of T echnology and G¨ oteb org Univ ersit y , G¨ oteb org, Sw eden, 2005. [261] O. Sandb erg. Neigh b or selection and hitting probabilit y in small-w orld graphs. A nnals of Applie d Pr ob ability , 18(5):1771–1793, 2008. [262] O. Sandb erg and I. Clarke. The evolution of navigable small-w orld netw orks. http: //arXiv.org/abs/cs/0607025 , 2006. [263] P . Sark ar and A. W. Mo ore. Dynamic so cial netw ork analysis using latent space mo dels. In A dvanc es in Neur al Information Pr o c essing Systems (NIPS) , volume 18, pages 1145–1152. MIT Press, Cambridge, MA, 2005. [264] P . Sark ar and A. W. Mo ore. Dynamic so cial netw ork analysis using laten t space mo dels. SIGKDD Explor ations: Sp e cial Edition on Link Mining , 7(2):31–40, 2005. [265] P . Sark ar, S. M. Siddiqi, and G. J. Gordon. A laten t space approach to dynamic em b edding of co-o ccurrence data. 
In Pr o c e e dings of the 11th International Confer enc e on Artiﬁcial Intel ligenc e and Statistics (AI-ST A TS ’07) , 2007. [266] C. R. Shalizi, M. F. Camp eri, and K. L. Klinkner. Discov ering functional comm unities in dynamical net works. In E. M. Airoldi, D. M. Blei, S. E. Fienberg, A. Golden b erg, E. P . Xing, and A. Zheng, editors, Statistic al Network A nalysis: Mo dels, Issues and New Dir e ctions , volume 4503 of L e ctur e Notes in Computer Scienc e , pages 140–157. Springer Berlin / Heidelb erg, 2007. [267] B. Shneiderman and A. Aris. Netw ork visualization b y semantic substrates. IEEE T r ansactions on Visualization and Computer Gr aphics , 12(5):733–740, 2006. [268] G. Simmel and K. H. W olﬀ. The So ciolo gy of Ge or g Simmel . The F ree Press, New Y ork, 1950. [269] H. A. Simon. On a class of sk ew distribution functions. Biometrika , 42(3–4):425–440, 1955. [270] B. Singer and S. Spilerman. So cial mobilit y mo dels for heterogenous p opulations. So ciolo gic al Metho dolo gy , 5:356–401, 1973–1974. [271] B. Singer and S. Spilerman. The represen tation of so cial pro cesses by Marko v mo dels. The Americ an Journal of So ciolo gy , 82(1):1–54, 1976. [272] T. A. B. Snijders. The transition probabilities of the recipro city mo del. Journal of Mathematic al So ciolo gy , 23(4):241–253, 1999. 87 [273] T. A. B. Snijders. The statistical ev aluation of so cial net work dynamics. So ciolo gic al Metho dolo gy , 31:361–395, 2001. [274] T. A. B. Snijders. Accoun ting for degree distributions in empirical analysis of netw ork dynamics. In R. L. Breiger, K. M. Carley , and P . E. Pattison, editors, Dynamic So cial Network Mo deling and A nalysis: Workshop Summary and Pap ers , pages 146–161. The National Academies Press, W ashington, D.C., 2003. [275] T. A. B. Snijders. Mo dels for longitudinal net w ork data. In P . J. Carrington, J. Scott, and S. S. W asserman, editors, Mo dels and Metho ds in So cial Network A nalysis , c hap- ter 11. 
Cambridge Universit y Press, New Y ork, 2005. [276] T. A. B. Snijders. Statistical metho ds for net work dynamics. In S. R. Luc hini et al., editors, Pr o c e e dings of the XLIII Scientiﬁc Me eting, Italian Statistic al So ciety , pages 281–296, P ado v a: CLEUP , 2006. [277] T. A. B. Snijders and K. No wicki. Estimation and prediction for stochastic blo c kmo dels for graphs with latent blo ck structure. Journal of Classiﬁc ation , 14(1):75–100, 1997. [278] T. A. B. Snijders and M. A. J. v an Duijin. Simulation for statistical inference in dy- namic net work models. In R. Conte, R. Hegselmann, and P . T erna, editors, Simulating So cial Phenomena , pages 493–512. Springer, Berlin, 1997. [279] T. A. B. Snijders and M. A. J. v an Duijn. Conditional maximum lik eliho o d estimation under v arious sp eciﬁcations of exp onen tial random graph mo dels. In J. Hagb erg, edi- tor, Contributions to So cial Network A nalysis, Information The ory, and Other T opics in Statistics; A F estschrift in honour of Ove F r ank , pages 117–134. Department of Statistics, Univ ersit y of Sto c kholm, Sto ckhol m, Sweden , 2002. [280] T. A. B. Snijders, P . E. Pattison, G. L. Robins, and M. S. Handco c k. New sp eciﬁcations for exp onen tial random graph mo dels. So ciolo gic al Metho dolo gy , 36:99–153, 2006. [281] R. Solomonoﬀ and A. Rap op ort. Connectivit y of random nets. Bul letin of Mathematic al Biolo gy , 13(2):107–117, 1951. [282] S. Spilerman. Structural analysis and the generation of so ciograms. Behavior al Scienc e , 11:312–318, 1966. [283] M. Stephens. Bay esian analysis of mixtures with an unkno wn n um b er of components— an alternativ e to reversible jump metho ds. Annals of Statistics , 28(1):40–74, 2000. [284] D. Stork and W. Ric hards. Nonresp ondents in communication net w ork studies. Gr oup & Or ganization Management , 17(2):193–209, 1992. [285] D. B. Stouﬀer, R. D. Malmgren, and L. A. N. Amaral. Commen t on Barab´ asi, Nature 435, 207 (2005). , 2005. 
88 [286] D. B. Stouﬀer, R. D. Malmgren, and L. A. N. Amaral. Log-normal statistics in e-mail comm unication patterns. , 2008. [287] D. Strauss and M. Ik eda. Pseudolikelihoo d estimation for so cial netw orks. Journal of the Americ an Statistic al Asso ciation , 85(409):204–212, 1990. [288] M. P . H. Stumpf and T. Thorne. Multi-mo del inference of net w ork prop erties from incomplete data. Journal of Inte gr ative Bioinformatics , 3(2):32, 2006. http: //journal.imbio.de/index.php?paper_id=32 . [289] M. P . H. Stumpf, C. Wiuf, and R. M. May . Subnets of scale-free netw orks are not scale-free: Sampling prop erties of net w orks. Pr o c e e dings of the National A c ademy of Scienc es , 102(12):4221–4224, 2005. [290] S. Sw asey . Netﬂix aw ards $1 million Netﬂix prize and announces second $1 million c hallenge. W all Street Journal, Septem b er 21, 2009. [291] K. T arassov, V. Messier, C. R. Landry , S. Radinovic, M. M. Serna Molina, I. Shames, Y. Malitsk ay a, J. V ogel, H. Bussey , and S. W. Michnic k. An in viv o map of the y east protein in teractome. Scienc e , 320(5882):1465–1470, 2008. [292] H. M. T aylor and S. Carlin. An Intr o duction to Sto chastic Mo deling . Academic Press, New Y ork, 3rd edition, 1998. [293] S. K. Thompson. Adaptiv e web sampling. Biometrics , 62(4):1224–1234, 2006. [294] S. K. Thompson. T argeted random walk designs. Survey Metho dolo gy , 32(1):11–24, 2006. [295] S. K. Thompson and O. F rank. Mo del-based estimation with link-tracing sampling designs. Survey Metho dololo gy , 26(1):87–98, 2000. [296] S. K. Thompson and G. A. F. Seb er. A daptive Sampling . Wiley , New Y ork, 1996. [297] D. M. Titterington, A. F. M. Smith, and U. E. Mak o v. Statistic al Analysis of Finite Mixtur e Distributions . John Wiley & Sons, New Y ork, 1986. [298] J. T rav ers and S. Milgram. An exp erimen tal study of the small w orld problem. So- ciometry , 32(4):425–443, 1969. [299] R. J. Udry . 
The National L ongitudinal Study of A dolesc ent He alth: (Add he alth) Waves I and II, 1994–1996; Wave III, 2001–2002 . T echnical rep ort, Carolina Population Cen ter, Univ ersit y of North Carolina, Chap el Hill, 2003. [300] P . Uetz, L. Giot, G. Cagney , T. A. Mansﬁeld, R. S. Judson, J. R. Knight, D. Lo ck- shon, V. Nara yan, M. Sriniv asan, P . Pochart, A. Qureshi-Emili, Y. Li, B. Go dwin, D. Cono v er, T. Kalbﬂeisc h, G. Vija yada mo dar, M. Y ang, M. Johnston, S. Fields, and 89 J. M. Rothberg. A comprehensive analysis of protein-protein interactions in Sac cha- r omyc es c er evisiae . Natur e , 403(6770):623–627, 2000. [301] M. A. J. v an Duijn, T. A. B. Snijders, and B. J. H. Zijlstra. p 2 : A random eﬀects mo del with cov ariates for directed graphs. Statistic a Ne erlandic a , 58(2):234–254, 2004. [302] M. A. J. v an Duijn, K. J. Gile, and M. S. Handco ck. A framew ork for the comparison of maxim um pseudo-likelihoo d and maximum lik eliho o d estimation of exp onential family random graph mo dels. So cial Networks , 31(1):52–62, 2009. [303] E. A. V ance, E. A. Arc hie, and C. J. Moss. So cial netw orks in African elephants. Computational & Mathematic al Or ganization The ory, http: // www. springerlink. com/ content/ enpk5g428272927m , 2008. T o app ear in prin t, 2009. [304] A. V´ azquez, J. G. Oliv eira, Z. Dezs¨ o, K. Goh, I. Kondor, and A.-L. Barab´ asi. Mo deling bursts and heavy tails in h uman dynamics. Physic al R eview E , 73:036127, 2006. [305] E. V olz and D. D. Heck athorn. Probability based estimation theory for resp ondent driv en sampling. Journal of Oﬃcial Statistics , 24(1):79–97, 2008. [306] E. V olz and L. A. Meyers. Epidemic thresholds in dynamic contact netw orks. Journal of the R oyal So ciety Interfac e , 6(32):233–241, 2009. [307] C. von Mering, R. Krause, B. Snel, M. Cornell, S. G. Oliv er, S. Fields, and P . Bork. Comparativ e assessmen t of large-scale data sets of protein-protein interactions. 
Natur e , 417(6887):399–403, 2002. [308] M. J. W ainwrigh t and M. I. Jordan. Graphical mo dels, exp onen tial families, and v ariational inference. F oundations and T r ends in Machine L e arning , 1(1–2):1–305, 2008. [309] A. M. W alczak, A. Mugler, and C. H. Wiggins. A stochastic spectral analysis of transcriptional regulatory cascades. Pr o c e e dings of the National A c ademy of Scienc es , 106(16):6529–6534, 2009. [310] Y. W ang, D. Chakrabarti, C. W ang, and C. F aloutsos. Epidemic spreading in real net- w orks: An eigenv alue viewp oint. In Pr o c e e dings of the 22nd International Symp osium on R eliable Distribute d Systems (SRDS ’03) , pages 25–34, 2003. [311] Y. Y. W ang and G. Y. W ong. Sto c hastic blo ckmodels for directed graphs. Journal of the Americ an Statistic al Asso ciation , 82(397):8–19, 1987. [312] S. S. W asserman. Sto chastic Mo dels for Dir e cte d Gr aphs . PhD thesis, Departmen t of Statistics, Harv ard Universit y , 1977. [313] S. S. W asserman. Analyzing so cial netw orks as sto chastic pro cesses. Journal of the A meric an Statistic al Asso ciation , 75(370):280–294, 1980. 90 [314] S. S. W asserman and C. Anderson. Sto c hastic a p osteriori blo ckmodels: Construction and assessmen t. So cial Networks , 9(1):1–36, 1987. [315] S. S. W asserman and K. F aust. So cial Network Analysis: Metho ds and Applic ations . Cam bridge Univ ersit y Press, 1994. [316] S. S. W asserman and P . E. Pattison. Logit mo dels and logistic regression for so cial net w orks: I. An in tro duction to Marko v graphs and p ∗ . Psychometrika , 61(3):401–425, 1996. [317] S. S. W asserman, G. L. Robins, and D. Steinley . Statistical mo dels for netw orks: A brief review of some recent research. In E. M. Airoldi, D. M. Blei, S. E. Fienberg, A. Goldenberg, E. P . Xing, and A. X. Zheng, editors, Statistic al Network Analysis: Mo dels, Issues and New Dir e ctions , v olume 4503 of L e ctur e Notes in Computer Scienc e . Springer Berlin / Heidelb erg, 2007. [318] D. 
J. W atts. Smal l Worlds: The Dynamics of Networks b etwe en Or der and R andom- ness . Princeton Universit y Press, 1999. [319] D. J. W atts. Six De gr e es: The Scienc e of a Conne cte d A ge . W. W. Norton & Compan y , New Y ork, 2003. [320] D. J. W atts and S. H. Strogatz. Collective dynamics of ‘small-w orld’ netw orks. Natur e , 393(6684):440–442, 1998. [321] H. C. White. Searc h parameters for the small w orld problem. So cial F or c es , 49(2): 259–264, 1970. [322] H. C. White, S. A. Bo orman, and R. L. Breiger. So cial structure from m ultiple net- w orks. I. Blo c kmo dels of roles and positions. The Americ an Journal of So ciolo gy , 81 (4):730–780, 1976. [323] R. J. Williams and N. D. Martinez. Simple rules yield complex fo o d webs. Natur e , 404(6774):180–183, 2000. [324] W. Willinger, D. Alderson, and J. C. Doyle. Mathematics and the in ternet: A source of enormouse confusion and great p otential. Notic es of the Americ an Mathematic al So ciety , 56(5):586–599, 2009. [325] C. Wiuf and M. P . H. Stumpf. Binomial subsampling. Journal of the R oyal So ciety, Series A , 462(2068):1181–1195, 2006. [326] C. Wiuf, M. Brameier, O. Hagb erg, and M. P . H. Stumpf. A likelihoo d approach to analysis of netw ork data. Pr o c e e dings of the National A c ademy of Scienc es , 103(20): 7566–7570, 2006. 91 [327] S. L. W ong, L. V. Zhang, A. H. Y. T ong, Z. Li, D. S. Goldb erg, O. D. King, G. Lesage, M. Vidal, B. Andrews, H. Bussey , C. Bo one, and F. P . Roth. Combining biologi- cal net works to predict genetic in teractions. Pr o c e e dings of the National A c ademy of Scienc es , 101(44):15682–15687, 2004. [328] H. Y u, P . Braun, M. A. Yildirim, I. Lemmens, K. V enk atesan, J. Sahalie, T. Hirozane- Kishik a w a, F. Gebreab, N. Li, N. Simonis, T. Hao, J. F. Rual, A. Dricot, A. V azquez, R. R. Murra y , C. Simon, L. T ardivo, S. T am, N. Svrzik apa, C. F an, A. S. de Smet, A. Mot yl, M. E. Hudson, J. Park, X. Xin, M. E. Cusic k, T. Mo ore, C. Bo one, M. 
Sny- der, F. P . Roth, A.-L. Barab´ asi, J. T av ernier, D. E. Hill, and M. Vidal. High-qualit y binary protein interaction map of the yeast in teractome netw ork. Scienc e , 322(5898): 104–110, 2008. [329] G. U. Y ule. A mathematical theory of ev olution, based on the conclusions of Dr. J. C. Willis, F.R.S. Philosophic al T r ansactions of the R oyal So ciety of L ondon, Series B, Containing Pap ers of a Biolo gic al Char acter , 213:21–87, 1925. [330] W. W. Zac hary . An information ﬂow mo del for conﬂict and ﬁssion in small groups. Journal of A nthr op olo gic al R ese ar ch , 33:452–473, 1977. [331] A. Zheng and A. Golden b erg. A generativ e mo del for dynamic con textual friendship net w orks. T ec hnical rep ort, Mac hine Learning Departmen t, Carnegie Mellon Univer- sit y , 2006. [332] X. Zhu, M. Gerstein, and M. Snyder. Getting connected: Analysis and principles of biological net w orks. Genes Development , 21(9):1010–1024, 2007. [333] B. J. H. Zijlstra, M. A. J. v an Duijn, and T. A. B. Snijders. The multilev el p 2 mo del: A random eﬀects mo del for the analysis of multiple so cial net w orks. Metho dolo gy , 2 (1):42–47, 2006. 92
