Active Discovery of Network Roles for Predicting the Classes of Network Nodes

Activ e Disco v ery of Net w ork Roles for Predicting the Classes of Net w ork No des Leto P eel 1 , ∗ 1 Dep artment of Computer Scienc e, University of Color ado, Boulder, CO 80309 No des in real w orld netw orks often ha ve class lab els, or underlying attributes, that are related to the wa y in which they connect to other no des. Sometimes this relationship is simple, for instance no des of the same class are ma y be more likely to b e connected. In other cases, how ever, this is not true, and the wa y that no des link in a netw ork exhibits a diﬀerent, more complex relationship to their attributes. Here, w e consider netw orks in which we know how the nodes are connected, but w e do not know the class lab els of the no des or how class lab els relate to the netw ork links. W e wish to iden tify the best subset of no des to label in order to learn this relationship betw een no de attributes and netw ork links. W e can then use this discov ered relationship to accurately predict the class lab els of the rest of the netw ork no des. W e present a mo del that iden tiﬁes groups of no des with similar link patterns, whic h we call network r oles , using a generative blockmodel. The mo del then predicts labels by learning the mapping f rom net work roles to class lab els using a maximum margin classiﬁer. W e choose a subset of no des to lab el according to an iterativ e margin-based active learning strategy . By integrating the disco very of netw ork roles with the classiﬁer optimisation, the active learning pro cess can adapt the netw ork roles to better represen t the net w ork for node classiﬁcation. W e demonstrate the mo del b y exploring a selection of real world net works, including a marine fo od w eb and a netw ork of English words. W e sho w that, in con trast to other netw ork classiﬁers, this mo del achiev es go o d classiﬁcation accuracy for a range of netw orks with diﬀerent relationships b etw een class lab els and netw ork links. I. INTR ODUCTION Man y naturally occurring netw orks can b e decomp osed in to sets of no des that link to the rest of the net w ork in similar wa ys. These sets of equiv alent no des not only share link structure but often share common attributes. F or example, assortative communities in online bidding net works corresp ond to the main user groups according to common interest [27], in blog net works blogs tend to link to others of the same p olitical view [1], and in biolog- ical net w orks proteins tend to link to others that perform similar functions [9]. In addition, there are types of dis- assortativ e link patterns where no des that are dissimilar prefer to link to each other. F or example, in a netw ork of sexual relationships linked entities tend to b e of the opp osite gender [5], sp ecies in a fo o d web tend to eat or- ganisms of a diﬀerent sp ecies [2], and in an adjacency net- w ork of English words adjectives tend to precede nouns. When we encoun ter a new netw ork, we may not ini- tially know ho w link patterns relate to the attributes of no des. The netw ork could contain relations that are as- sortativ e, disassortative or a mixture therein. If we wish to p erform learning tasks suc h as classiﬁcation, it is im- p ortan t to understand the relationship b et w een net work links and no de attributes. One wa y w e can analyse the link patterns in netw orks is to iden tify groups of no des that link in equiv alent wa ys. W e call these groups of no des with similar link patterns network r oles . By analysing the pattern of links within and betw een net work roles w e can understand the o v erall net work structure. ∗ leto.peel@colorado.edu Class lab els, on the other hand, represen t attributes or descriptions of the no des. Consider the aforementioned w ord netw ork, in which a link indicates that words are adjacen t in text and the direction of the link indicates the w ord order. Let eac h of the words b e assigned one of tw o p ossible class labels, “noun” or “adjective”. W ords la- b elled adjective are descriptive words and words lab elled noun are naming words. Here the class lab el tells us something ab out the w ord, but without prior knowledge of the language, it does not tell us ho w it links with other w ords. Therefore, class lab els tell us something ab out the no des and netw ork roles tell us about how no des link to eac h other. Our goal is to iden tify the relationship b et w een the tw o. In English we can use netw ork roles to describ e the link pattern that nouns follow adjectives. Ho wev er, in the F renc h language adjectives may come b efore or after a noun so while they hav e the same class labels, they are describ ed by a diﬀeren t set of roles. F urthermore, in F renc h certain types of adjectives such as gr and , b e au and b on come before the noun while others suc h as am ´ eric ain , noir , r ond usually come after the noun. In this case, no des of the same class displa y heterogeneity as they do not all link to the netw ork in the same wa y , therefore we can use multiple roles to represen t the heterogeneous link patterns of this class. A graphical representation of some p ossible role and class combinations are sho wn in Figure 1. Each node has a discrete class attribute (indicated by colour). The diﬀerence b et w een the netw orks is how the no des link to the rest of the netw ork. No des of the same role (i.e. hav e similar link patterns) are enclosed in a b o x. The top left netw ork illustrates assortative homogeneous classes, while the net w ork in the bottom left shows disassortativ e 2 Homogeneous Heterogeneous Assortative Disassortative FIG. 1: F our examples of classes and roles in netw orks. No de colour indicates class lab el. No des of that hav e similar link patterns are enclosed in a b ox; we call these netw ork roles. A homogeneous class is made up of a single role while a heterogeneous one contains multiple roles. classes. Both of these cases are homogeneous b ecause the links of each class is describ ed by a single role. In con- trast, the link structure of a heterogeneous class cannot b e adequately represented b y a single role and instead m ultiple roles are required to describ e the link patterns of each class (right). In this w ork w e consider the scenario where netw orks links are kno wn but the no de class labels are not. Al- though the lab els of the netw ork no des are not immedi- ately a v ailable, we can query the netw ork for more in- formation (e.g. by conducting ﬁeld work or lab exp eri- men ts etc.). The task is to query a subset of the lab els to discov er net work roles, which can not only b e used to understand the relationship betw een classes, but can also b e used to predict missing class lab els. Our approac h uses only the link information and is ap- plicable to a range of assortative and disassortative struc- tures with no a priori assumptions ab out the structure t yp e. W e also do not assume that all no des of a par- ticular class link to the rest of the net work in the same w ay . W e achiev e this b y using a v ariant of the sto chastic blo c kmo del [14, 24] to model the probabilities of observ- ing links b etw een a pair of roles. The use of a blo c kmo del giv es us the ﬂexibilit y to mo del a wide range of net work structures, e.g. assortative, disassortative or a mixture of the tw o. It also allows for directed relations. The particular blo ckmodel v arian t we use allows for individual no des to hav e mixed role memberships. This means that a class con tain multiple roles, and also that roles can b e shared across classes. F or example, consider a predator-prey net work where no des represen t sp ecies and directed links represent who-e ats-whom , w e may b e able to lab el no des as carnivores, omnivores, herbivores and plants. While the carniv ore and herbivore classes could b e represen ted by the distinct roles { e ats-animals , N x D X R Y Y R A N x N N x K N x K N x 1 N x 1 In depen den t (ii d) Netw o r k Di m ens i on ali t y R edu c tio n R o le Discovery C l a ss i ﬁ c a tio n C l a ss i ﬁ c a tio n FIG. 2: A comparison of an iid classiﬁcation pro cess with dimensionalit y reduction (top) and the netw ork classiﬁcation pro cess of our mo del (b ottom). In the iid case the rows of the input feature matrix, X , are indep enden t while the adjacency matrix, A , of the netw ork is not. e ats-plants } , a no de lab elled as omnivore would take on a mixture of these roles. T o learn the relationship b etw een net work roles and class lab els, we require some alignmen t of the classes and roles, i.e. so we can iden tify a mapping from roles to classes. W e achiev e this with a blo c kmo del that incorp o- rates a maximum margin classiﬁer (e.g. Supp ort V ector Mac hine [11]). This new mo del allows us to use kno wn no de lab els to inﬂuence the discov ery of roles to improv e their alignment to the class lab els. Therefore acquiring more no de lab els improv es the correspondence b et ween roles and classe s. T o determine which no des to acquire to eﬃcien tly disco v er net work roles, w e emplo y a margin- based active learning strategy . Once we ha ve trained our mo del, we can use the net- w ork roles w e hav e discov ered to make predictions ab out the node lab els w e do not know. This classiﬁcation pro- cess is somewhat analogous to p erforming dimensional- it y reduction (e.g. Principal Comp onent Analysis) on iid data b efore p erforming classiﬁcation. By representing the netw ork with roles, w e ac hieve similar goals to the iid case, i.e. reduce dimensions, remov e linear correla- tions. How ev er, we also gain an additional b eneﬁt of conditional indep endence, since although the no des in the net work are b y nature dep enden t on each other, once w e condition on the net work roles the class lab els become conditionally indep endent of each other. W e test our metho d on a selection of real world net- w orks and show that, in con trast to previous work, it 3 p erforms well for a range of netw ork structures and even when no des with the same class lab el connect to the rest of the net work in diﬀerent wa ys. I I. RELA TED WORK In this w ork we examine the problem of ho w to explore a netw ork in order to discov er the relationship b etw een no de lab els and netw ork links. Related to this problem are the tasks of collective classiﬁcation and active learn- ing in net works. These methods make predictions ab out no de lab els, but unlike our work, they do not explicitly try to identify the type of relationship b et ween links and lab els. The problem of classifying no des in netw orks (referred to as collective classiﬁcation) has received a lot of at- ten tion in recent years (e.g. [8, 15 – 17, 19, 29, 33, 39]). These metho ds consider the hard problem of making a join t, or c ol le ctive , prediction ab out the class lab els of no des. This is hard b ecause for N no des with C p os- sible class lab els, there are C N diﬀeren t wa ys to label the no des collectively . In the activ e learning setting, the task is to select a goo d subset of no des to lab el in order to b est predict the remaining no des. Of these collective classiﬁcation metho ds, the relev ant ones are those that are applicable to the univ ariate setting, i.e. they do not use attributes of the nodes as features for classiﬁcation. Most collectiv e classiﬁers mak e assumptions, either ex- plicitly or implicitly , that do not necessarily hold for all net works. The most signiﬁcant one being assortativity (homophily), the is the assumption that linked nodes are more likely to b e of the same class. This assumption do es not hold in all cases as not all class lab els are re- lated to the netw ork structure in this wa y . The Marko v assumption that node labels are conditionally indep en- den t given the lab els of its neigh b ours (e.g. [17, 20, 22]), is also frequently used. This assumption allows one com- binatorially hard collective problem to b e broken in to man y easier related problems that can b e solved locally at the node level in conjunction with iterativ e algorithms to propagate these predictions around the netw ork. W e mak e a similar, but less restrictive assumption that the lab els of al l no des are conditionally indep enden t of eac h other, given a set of unobserv ed latent v ariables that w e infer from the netw ork structure. I II. ROLE DISCO VER Y IN NETW ORKS W e presen t a mo del for disco vering net work roles (groups of nodes with similar link patterns) to help us understand the relationship b etw een the class labels of no des and the netw ork links. Blo c kmo dels are a t yp e of probabilistic generative net work model and are a natural c hoice for this task since they can mo del a wide range of net work structure types, e.g. assortative, bipartite, core- p eriphery , hierarc hical. Our mo del is based on a mixed membership blo ck- mo del [25] and therefore assumes that each no de a b e- longs to a distribution o ver roles. The nodes’ distribution o ver roles determines the probability of a link existing b et w een an y pair of nodes. This assumption allows us to treat the links in the netw ork as b eing conditionally indep enden t of eac h other given the netw ork roles. Since the blo c kmo del is a type of probabilistic gener- ativ e mo del, we can sp ecify it according to the assumed data generating pro cess: 1. F or a given netw ork dra w a distribution o v er the p ossible K 2 net work role interactions 1 π ∼ Diric hlet( α ) 2. F or each role k ∈ { 1 , 2 , ..., K } : • Draw a distribution ov er no des φ ∼ Diric hlet( β ) 3. F or each interaction i ∈ { 1 , 2 , ..., I } : • Draw a role-role in teraction pair z i = ( z s , z r ) i , z i ∼ Categorical( π ) • Draw a sender no de s i ∼ Categorical( φ z s ) • Draw a receiv er no de r i ∼ Categorical( φ z r ) F or a given netw ork, G = ( V , E ), this blo c kmo del de- ﬁnes a likelihoo d function ov er the N = |V | no des and M = |E | links or interactions. The model assigns eac h link, { s i , r i } , in the netw ork a latent v ariable z i represen t- ing a distribution o ver p ossible role pairs. Each no de’s distribution ov er roles, ¯ z v , is then given by a normalised sum o v er the indicator v ectors of length K describing the net work role of the sender ( z s i ) and receiv er ( z r i ) no des: ¯ z v = 1 n v X i : s i = v z s i + X i : r i = v z r i ! . (1) Exact ev aluation of the blo c kmo del likelihoo d, p ( { s, r }| α, β ), is intractable since it requires a sum ov er all possible com binations of the latent v ariables, z . Using a distribution e q and Jensen’s inequalit y , we approximate 1 Conceptually , we can think of π as a matrix with dimensions K × K such that each element π k 1 ,k 2 represents the probability of observing a link from role k 1 to role k 2 . 4 the log lik eliho o d with a more tractable low er b ound 2 : log p ( { s, r }| α, β ) = log Z Z X z p ( π , φ, z , { s, r }| α, β ) d π d φ = log Z Z X z p ( π , φ, z , { s, r }| α, β ) e q ( π , φ, z ) e q ( π , φ, z ) d π d φ ≥ Z Z X z e q ( π , φ, z ) log p ( π , φ, z , { s, r }| α, β ) e q ( π , φ, z ) d π d φ = E e q [log p ( π , φ, z , { s, r }| α , β )] − E e q [log e q ( π , φ, z )] , (2) where E e q is the exp ectation under the e q distribution. Since the Diric hlet and Categorical distributions are con- jugate, we can solve the integration analytically to give us a tigh ter b ound on the log likelihoo d [32]: L ( q ; α, β ) , E q [log p ( z , { s, r }| α, β )] − E q [log q ( z )] (3) ≥ E e q [log p ( π , φ, z , { s, r }| α , β )] − E e q [log e q ( π , φ, z )] . IV. SUPER VISED ROLE DISCO VER Y Our goal is to b etter understand the relationship b e- t ween netw ork link structure and the no de class lab els. In the previous section w e discussed how a blockmodel can b e used to identify net work roles and that netw ork roles can be used to describe how the net w ork links. Therefore to ac hieve our goal we need to ﬁnd a mapping b et ween the net work roles and the class lab els. T o ﬁnd this mapping w e use a maximum margin classiﬁer, e.g. supp ort vec- tor machine (SVM) [11]). W e choose maxim um margin classiﬁers because along with their strong generalisation guaran tees [36], they hav e also demonstrated empirical success on a broad range of iid tasks. Additionally , they ha ve b een eﬀectiv e when used in conjunction with prob- abilistic mo dels to solve problems including optical char- acter recognition [31] and do cumen t classiﬁcation [38]. Eﬀectiv e metho ds for margin-based active learning in iid data hav e also been established [13, 28, 34]. A classiﬁer is a function F : X → Y that maps a fea- ture v ector x ∈ X to a lab el y ∈ Y . Since the blo ckmodel assigns each no de, v , a vector, ¯ z v , represen ting its distri- bution ov er roles, we can use ¯ z v as a the input feature v ector. The classiﬁer can then learn the relationship b e- t ween roles and class lab els. In this work we restrict F to b e a function of the form: F ( ¯ z v , η ) = arg max y ∈Y η T y ¯ z v , (4) 2 This is true for any choice of distribution, e q , and is an equality when e q is equal to the posterior distribution. corresp onding to a linear classiﬁer in whic h η is a ma- trix of co eﬃcients with dimensions K × C and C = |Y | is the num ber of diﬀerent class lab els. A classiﬁer with go od generalisation prop erties is one that maximises the margin, i.e. the distance b etw een the training p oin ts and the separating linear decision boundary [36]. In the case when C = 2 the supp ort v ector machine pro vides an ef- fectiv e metho d for learning a binary maxim um margin classiﬁer. Solving the multi-class ( C > 2) case corre- sp onds to the following constrained optimisation [10]: min η ,ξ 1 2 || η || 2 + D N N X v =1 ξ v ∀ v , y s.t. : ( η T y v ¯ z v − η T y ¯ z v ≥ 1 − δ y ,y v − ξ v ξ v ≥ 0 , (5) where D is a p ositiv e regularisation constan t, δ p,q is the Kronec ker delta and 1 − δ y ,y v represen ts our loss func- tion, which equals 0 for the correct prediction and 1 for an y other prediction. T he slack v ariables, { ξ v } are used for classes that cannot b e separated by a linear classiﬁer b y allowing some misclassiﬁcation in the training data. Since w e minimize ov er the ξ v ’s, when the classes are linearly separable then ξ v = 0 ∀ v . The regularisation pa- rameter, D , controls the scale of the training misclassiﬁ- cation p enalty and can b e set b y using cross-v alidation. Here, the multiclass margin, by which the true class y v is fav oured ov er another class y , is given by: η T y v ¯ z v − η T y ¯ z . (6) The constrain ts in Eq.(5) ensure that all training in- stances lie on the correct side of the decision boundary , if p ossible, and a ξ v > 0 indicates that training instance v is misclassiﬁed. Solving Eq.(5) 3 requires the introduction of Lagrange m ultipliers, b µ , and solving the Lagrangian: L ( η , ξ , b µ ) = 1 2 || η || 2 + D N N X v =1 ξ v + N X v =1 X y ∈Y b µ y v [ η T y v ¯ z v − η T y ¯ z v + 1 − δ y ,y v − ξ v ] ∀ v , y s.t. : b µ y v ≥ 0 . (7) Within this framew ork the simplest approach would then b e: ﬁrst, infer the no de roles using the blo ckmodel, and second, train the SVM classiﬁer using the kno wn class lab els and the inferred roles. Ho wev er, the lik eli- ho od function of blo ckmodels often contains a large n um- b er of lo cal optima p ertaining to go o d but distinct solu- tions [12]. As a result, optimising Eq.(3) migh t not result 3 In practice this inv olves solving an equiv alent dual formulation, which we omit for brevity . 5 ALGORITHM 1: Inference algorithm for Maximum-Margin Blo c kmo del Initialise λ, η randomly . Initialise b µ = 0. while relative improv ement in L > 10 − 6 do for i = 1 to |E | do up date λ i using Eq.(11). end for up date b µ, η by optimising Eq.(8). end while Predict unknown lab els b y v using Eq.(15) in the disco very of roles that relate to the class lab els 4 . Instead, we in tro duce the maximum-margin blo c kmo del (MaxBM), that treats the training of the classiﬁer and the inference of the roles as a joint optimisation problem giv en b y: min q ,η,ξ −L ( q ; α, β ) + 1 2 || η || 2 + D N N X v =1 ξ v ∀ v , y s.t. : ( E q [ η T y v ¯ z v − η T y ¯ z v ] ≥ 1 − δ y ,y v − ξ v ξ v ≥ 0 , (8) The ﬁrst part of the optimisation corresp onds to the negativ e lo wer b ound on the lik eliho od of the blockmodel (Eq.(3)), while the second part corresp onds to the mar- gin based classiﬁer (Eq.(5)). F orm ulating the problem lik e this means that the mo del will av oid lo cally optimal solutions (i.e. net work roles) if they do not help with the classiﬁcation task. As with other blo ckmodels, the MaxBM treats the presence or absence of each link as b eing conditionally in- dep enden t given the latent role assignmen t. Additionally , the mo del assumes that giv en the role assignmen t, the class lab els are conditionally indep enden t of each other and the net work structure. V. INFERENCE W e ﬁt the MaxBM mo del to the observed netw ork us- ing an exp ectation-maximisation (EM) style pro cedure (see Algorithm 1). In the exp ectation step, we infer the laten t v ariables (i.e. the netw ork roles z ), and in the max- imisation step, w e learn the mo del parameters (i.e. the classiﬁcation coeﬃcients η ). Once the algorithm has con- v erged, w e can use the inferred mo del to make predictions ab out the unknown no de lab els. 4 In the worst case the inferred roles could b e orthogonal to the class labels. A. Inferring the netw ork roles W e infer the net work roles, z , using v ariational Ba yesian (VB) inference. VB inference [4] has the ad- v antage o ver sampling-based methods due to con vergence that is faster and easier to diagnose. In VB, a v aria- tional p osterior distribution is used to lo wer b ound the log likelihoo d (see Eq.(2)). This v ariational distribution is restricted to a set of tractable distributions to appro x- imate the true posterior distribution. Most frequen tly this restriction is that the distribution is fully factorised, kno wn as a mean-ﬁeld approximation. As we ha ve in tegrated out the parameters π and φ , our v ariational p osterior distribution, q , is ov er the latent roles, z , only . This distribution takes the form: q ( z ) = Y i q ( z i | λ i ) , (9) where each λ i is a | K × K | -dimensional v ariational pa- rameter for a categorical distribution o ver pairs of roles for link i . W e optimise ov er q ( z ) by optimising q ( { z i } ) for each edge in the net work until conv ergence. Since our model is comp osed of probabilit y distributions from the conjugate exp onen tial family the up dates take a particular general form [30]: q ( z i ) ∝ exp( E ∼ q z i [log p ( Y , Z )]) (10) where Y represen ts the observed data and E ∼ q z i is the exp ectation under all q z 0 i for all i 0 6 = i . Exact calculation of the exp ectation in (10) is computationally exp ensiv e and therefore we use a ﬁrst order T aylor expansion to appro ximate the up date equations [3, 30]. The netw ork roles are therefore up dated according to: λ i,k 1 ,k 2 ∝  d ¬ i k 1 ,k 2 + α k 1 ,k 2  ( n ¬ i s i ,k 1 + β )( n ¬ i r i ,k 2 + β ) ( n ¬ i · ,k 1 + N β )( n ¬ i · ,k 2 + N β + δ k 1 ,k 2 ) × exp 1 n s i X y µ y s i E [ η y s i ,k 1 − η y ,k 1 ] + 1 n r i X y µ y r i E [ η y r i ,k 2 − η y ,k 2 ] ! (11) where d k 1 ,k 2 is the count of links from role k 1 to role k 2 , and n v ,k is the num b er of times no de v in teracts as role k . The ﬁrst line of (11) relates to the unsup ervised part of the up date. This part of the up date is used to ﬁt a blo c kmo del without the use of class lab els and can b e used as an alternative to the Gibbs sampling metho d used in [25]. The last tw o terms are due to the max- margin formulation of Eq.(8) and are non-zero for the no des that lie on the decision b oundary , i.e. the supp ort v ectors. These create a bias in the mo del that encour- ages it to disco ver net work roles that make more accurate predictions on these diﬃcult examples. 6 B. Learning the classiﬁcation co eﬃcients As a consequence of the conditional indep endence of the class lab els given the netw ork roles, it is p ossible to use standard SVM optimisation algorithms 5 , such as [35], to obtain the optimal µ and η . How ever, diﬀerent to the standard SVM case the classiﬁer inputs (netw ork roles) are not observed ﬁxed v alues, but are instead la- ten t v ariables represen ted by our p osterior distribution. W e represent our curren t approximation to the p osterior distribution with a series of exp ectations, sp eciﬁcally: E q [ z i ] = λ i , (12) E q [ η T y ¯ z v ] = η y n v T " X i : s i = v λ s i + X i : r i = v λ r i # , (13) where n v is the degree of no de v and λ v is a K -length v ector representing the marginal probability of sender or receiv er positions, i.e.: λ s i = " X k 2 λ i, 1 ,k 2 , X k 2 λ i, 2 ,k 2 , · · · , X k 2 λ i,K,k 2 # T , λ r i = " X k 1 λ i,k 1 , 1 , X k 1 λ i,k 1 , 2 , · · · , X k 1 λ i,k 1 ,K # T . (14) C. Predicting class lab els Once the inference algorithm has con verged and we ha ve an estimate of the netw ork roles and classiﬁer co ef- ﬁcien ts, w e can then p erform classiﬁcation on the unla- b elled no des in the netw ork according to: b y v = arg max y ∈Y η T y ¯ z v . (15) VI. A CTIVE LEARNING In man y practical scenarios, the lab els of the netw ork no des ma y not be immediately a v ailable at training time. Acquisition of these lab els inv olves some cost; either the cost of consulting an exp ert, conducting an inv estigation or carrying out lab oratory exp eriments. In order to min- imise the cost incurred, w e wish to select the most in- formativ e examples. This is done using activ e learning. Activ e learning is a form of sup ervised learning that in- v olves in teractively selecting the nodes to lab el as part of the learning pro cess. In this setting, netw ork no des start oﬀ unlab elled and the active learning algorithm aims to 5 W e use a mo diﬁed version of SVM multiclass : http://svmlight. joachims.org/svm_multiclass.html c ho ose the b est subset of no des to lab el in order to im- pro ve the classiﬁcation of the remaining no des. This is undertak en in a greedy manner. At eac h stage of the algorithm a single unlab elled no de is selected to explore and is added to the training set. This selection o ccurs based on the output of the classiﬁer and an active learn- ing criterion. As our metho d incorp orates a maximum margin clas- siﬁer it is p ossible to tak e adv antage of activ e learning criteria developed for supp ort vector machines. This in volv es querying a sample in relation to the decision b oundary . W e employ a simple strategy that chooses no des that hav e the smallest margin [34] as intuitiv ely these represent the examples where the classiﬁer is most uncertain. W e use the multiclass margin [28] as the query function: Q multiclass = arg min v ∈{ 1 ,...,N } η T b y v ¯ z v − η T e y v ¯ z v . (16) where b y v represen ts the predicted class for no de v , as giv en in (15), and e y v = arg max y 6 = b y η T y ¯ z v is the second most likely class label. The query function (16) therefore selects the node that has the smallest margin betw een the top t wo comp eting class lab els. VI I. EXPERIMENT AL RESUL TS F our real world netw orks were used to test our Max- BM approac h to learning the relationship betw een net- w ork links and no de classes: 6 a so cial netw ork, a word net work, a marine fo o d web, and a citation netw ork. The ﬁrst dataset is Zachary’s Karate Club [37], a social net work of friendships b etw een 34 members of a k arate club. The club split in to tw o factions, one led b y the instructor and the other by the club president. The class lab els corresp ond to the faction each no de b elonged to after the split. The second net work is comprised of 112 frequently o c- curring nouns and adjectives in the no v el David Copp er- ﬁeld b y Charles Dic kens [23]. It is a directed net w ork where the links indicate adjacent w ords and the order they app ear in the text. The third netw ork is a fo od w eb of consumer-resource in teractions b et w een 488 sp ecies in the W eddell Sea [7]. The directed edges link each predator to its prey . W e p erform tw o classiﬁcation tasks on this dataset, accord- ing to the attributes feeding type and habitat; these at- tributes partition the netw ork in diﬀerent wa ys. F eeding t yp e takes one of C = 6 classes: primary pro ducer, car- niv orous, carnivorous/necro v orous, detrivorous, herbivo- rous/detriv orous and omnivorous. Habitat has C = 5 6 A Python implementation of the Max-BM model along with the datasets used are av ailable at http://gdriv.es/letopeel/code. html 7 24 25 26 27 20 21 22 23 28 29 1 3 2 5 4 7 6 9 8 11 10 13 12 15 14 17 16 33 18 31 30 34 19 32 1 2 3 4 6 7 11 5 17 13 4 22 18 8 12 1 2 14 20 3 24 30 26 25 27 23 21 19 15 16 28 33 32 34 29 31 10 9 Inter-faction Instructor President 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 2 3 4 1 2 3 4 log (number of links) 0.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0 FIG. 3: Left: The k arate club netw ork. The node colour reﬂects the class lab el, i.e. whic h faction the no de b elongs to, while the link colour reﬂects the assignment of roles. Midd le: A visual representation of the distribution of inferred roles (columns) for each no de (rows) in the netw ork. Right: The role interaction matrix. The diagonal blo c ks indicate that the netw ork is assortativ e. classes, namely b en thic, b en thop elagic, demersal, land- based, and p elagic. Finally , the fourth net work is the muc h studied Cora citation net w ork [29] containing 2708 machine learning pap ers with directed links indicating citations, and C = 7 class lab els indicating one of 7 sub ject areas. F or each netw ork the algorithm was initialised by ac- quiring the lab el of one randomly chosen no de from each class. It wou ld b e p ossible to learn the num b er of classes as the netw ork is explored by adding a new class when- ev er a new label was encoun tered. It w ould also be p ossi- ble to optimise the num b er of netw ork roles using cross- v alidation at each stage. In this work, ho w ever, the n um- b er of roles was ﬁxed, since w e found in practise that as long as w e had a suﬃcient n um b er of roles, the exact setting of this parameter did not hav e a substantial ef- fect on the results. The reason for this is that the mo del only uses the roles it needs to represent the link patterns that corresp ond to classes (see Sec. VI I B 1). W e there- fore set the num b er of netw ork roles to twice the num b er of classes, i.e. K = 2 C . Eac h stage of the learning process consisted of carrying out the inference pro cedure in Section V to conv ergence, follo wed b y the selection of a new no de to be lab elled. At eac h stage the margin cost parameter D was optimised using cross-v alidation. The rep orted results are av eraged o ver 50 random initialisations. In the following, we examine the netw ork roles discov- ered and the p erformance of the algorithm in discov ering these roles. A. Role Discov ery When confronted with a new netw ork dataset it is im- p ortan t to understand the patterns of complex interac- tions betw een the nodes. This is useful to further our understanding of how a system works, and also to b etter understand the data so that w e can build better algo- rithms for prediction. In this section w e examine the net work roles found using the MaxBM mo del and how they can b e used to understand the structure of the net- w ork in relation to the attributes of no des. First, we consider the k arate club netw ork. This is a w ell-studied netw ork, largely due to its simple structure and small size. Figure 3 (left) sho ws a visualisation of the net work. The no de colours indicate the factions (which w e use as class lab els) to which the no des b elong to after the split. No de 1 and no de 34 represent the Instructor and club Presiden t resp ectively . In Figure 3 (righ t) we see a visualisation of the logarithm of the n um b er of links that o ccurs b et ween the roles. All the links o ccur on the diagonal so this tells us that the netw ork links are assortativ e. Figure 3 (middle) sho ws the no de mem b erships to roles. In this visualisation, the rows corresp ond to no des in the netw ork and the columns corresp ond to the K = 4 roles. The rows in the top half are the no des in the Presi- den t’s faction, while the ro ws in the b ottom belong to the Instructor’s faction. Notice that the fourth column do es not contain any membership. Although the mo del had 4 roles av ailable, it determined that three were suﬃcient to represen t the netw ork structure. This visualisation sho ws the relationship b etw een roles and classes. W e can also interpret the roles in terms of the link patterns they represen t. Role 1 and Role 3 represen t interaction within the Instructor and Presiden t factions resp ectiv ely , while 8 Role 2 represen t interactions across factions. The second netw ork is the w ord net work of adjectives and nouns. Figure 4 sho ws the inferred role mem b erships (left) and role interactions (right). This is a net work that has been previously describ ed as being approximately bi- partite [21, 23] due to the tendency for adjectives to pre- cede nouns in the English language. W e observ e this relationship in Roles 1 and 4, which accoun t for the ma- jorit y of the nouns and adjectives resp ectiv ely . Roles 2 and 3 are therefore used by the mo del to account for the w ords that do not follow this simple rule. The third netw ork is the fo o d web of consumer- resource in teractions (i.e. who-eats-whom) and is shown in Figure 5. As there are t wo classiﬁcation tasks we c ho ose to examine the netw ork roles in relation to the feeding type class lab el. This is b ecause it is the easier of the tw o to interpret without the use of more detailed domain sp eciﬁc knowledge. W e see in Fig. 5(right) that the ma jorit y of the in teractions lie on the oﬀ-diagonal, indicating that the netw ork is disassortative. This seems reasonable as it tells us that in general, sp ecies of one t yp e tend to eat sp ecies of a diﬀerent t yp e. Lo oking closer at the role memberships (Fig. 5(left)), we see ho w the class lab els relate to netw ork roles. W e see that Role 1 is exclusively comp osed of primary producers. Roles 2 to 5 (green ov erlay) corresp ond to herbiv ore species and Roles 6 to 12 (blue o v erlay) corresp ond to carnivore sp ecies. W e might therefore think of these roles as b eing e ats-plant s and e ats-animals resp ectiv ely and the indi- vidual roles b eing another lev el of organisation within these roles. The omniv ore sp ecies are distributed across all of these roles, since we know that omnivores eat b oth plan ts and animals. B. Classiﬁcation Performance W e sho wed in the previous section that the MaxBM can b e used to disco ver the pattern b etw een net work links and no de attributes. Now we examine the p erformance of the mo del in disco vering this pattern as w e go through the pro cess of acquiring lab els. Each time we acquire the lab el for a new no de, w e quantify the performance according to the accuracy of the mo del predictions on the rest of the net work. W e compare our algorithm (MaxBM+Margin) with t wo baseline strategies: r andom , where no des were queried in a random order, and de gr e e , where no des where queried in order of largest degree. W e also compare against three other blo ckmodel approaches: a standard blo c kmo del using m utual information strategy( Mo or e et al.(MI) ) as describ ed in [21], the blo ckmodel of [26] using entrop y ( Pe el (Entr opy) ), and a SVM using the roles disco vered using an unsupervised blockmodel [25] ( BM+SVM (Entr opy) ). In addition, w e also compare against a selection of uni- v ariate collectiv e classiﬁcation and net work based active learning methods from the literature. A summary of 1 2 3 4 Adjective Noun 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 2 3 4 1 2 3 4 log (number of links) 0.0 0.8 1.6 2.4 3.2 4.0 4.8 5.6 FIG. 4: L eft: The distribution of inferred roles (columns) for the no des (rows) in the word netw ork. Right: The role in teraction matrix. 1 2 3 4 5 6 7 8 9 10 11 12 Primary Carniv Carn/Nec Detriv Herb/Det Omniv eats-plants eats-animals 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 1 2 3 4 5 6 7 8 9 10 11 12 1 2 3 4 5 6 7 8 9 10 11 12 log (number of links) 0 1 2 3 4 5 6 7 8 FIG. 5: L eft: The distribution of inferred roles (columns) for the no des (rows) in the fo o d web netw ork. R ight: The role in teraction matrix. these classiﬁcation and active learning selection strate- gies is giv en b elow 7 : • Macskassy & Pr ovost (Entr opy) : The weigh ted v ote relational neigh b our classiﬁer [19] and select- ing the no de that has the highest entrop y predic- tion. • Lu & Geto or (Entr opy) : The netw ork-only version of the count-link logistic regression classiﬁer [17] used with the ab ov e en tropy metho d. • Bilgic & Geto or (Entr opy) : The naiv e Bay es clas- siﬁer and selecting the no de with the highest sum of entrop y ov er its neighbours and itself [6]. • Zhu et al. (ERM) : The Gaussian harmonic clas- siﬁer using empirical risk minimisation (ERM) to select the next no de [39]. • Macskassy (ERM) : The same as abov e but using heuristics to select a subset of no des to ev aluate using the ERM metho d [18]. 7 W e use the netkit [20] implementations a v ailable at http:// netkit- srl.sourceforge.net/ 9 0 5 10 15 20 25 30 No. of nodes queried 20 30 40 50 60 70 80 90 100 Percentage of nodes classified correctly Karate Max-BM (Margin) Max-BM (Random) Max-BM (Degree) Moore et al. (MI) Peel (Entropy) BM + SVM (Margin) 0 20 40 60 80 100 No. of nodes queried Word 0 100 200 300 400 No. of nodes queried Food Web (Feeding) 0 100 200 300 400 No. of nodes queried Food Web (Habitat) FIG. 6: A comparison of the p erformance of lab elling no des for the blo c kmo del-based approaches. 0 5 10 15 20 25 30 No. of nodes queried 0 20 40 60 80 100 Percentage of nodes classified correctly Karate Max-BM (Margin) Macskassy&Provost (Entropy) Lu and Getoor (Entropy) Bilgic & Getoor (Entropy) Macskassy (ERM) Zhu et al (ERM) 0 20 40 60 80 100 No. of nodes queried Word 0 100 200 300 400 No. of nodes queried Food Web (Feeding) 0 100 200 300 400 No. of nodes queried Food Web (Habitat) FIG. 7: Performance of lab elling no des using collective classiﬁer approaches. 1. Kar ate club network Figures 6 and 7(outer left) show the performance of the diﬀeren t algorithms on the k arate club net work. It can b e seen that although our margin-based approach initially performs worse than random, after exploring ab out 10 no des it has almost p erfect classiﬁcation ac- curacy . Ho wev er, we see that a num b er of the other ap- proac hes out-p erform our metho d, particularly the Mac- skassy & Pr ovost(Entr opy) and Macskassy (ERM) ap- proac hes. The reason for this b eing that these mo dels ha ve strong assortativity assumptions and, as w e show ed in the previous section, this is an assortativ e netw ork. While our approach needs to learn the structure of the net work, these collective classiﬁers only require a goo d c hoice of no des to mak e accurate predictions. 2. Wor d network Figures 6 and 7(cen tre left) sho w, at each stage, the prop ortion of unexplored nodes in the w ord net work that are lab elled correctly by eac h of the metho ds. Here it can be seen that the accuracy of the MaxBM (Mar- gin) metho d quickly rises to 90% and after querying just 20 no des is close to 100% accurate. T o start with, the blo c kmo del metho d, Pe el (Entr opy) , performs almost as w ell as the margin approach, but the performance drops sligh tly after exploring about half of the net work. W e can also see that the MaxBM metho d p erforms better than all the other approac hes, since those that ac hiev e 100% accuracy , require the lab els of at least half of the net work b efore they do. In [21] they observe that, b ecause adjectives in the English language tend to precede nouns, no des with high out-degree can b e classiﬁed as adjectives and no des with high in-degree as nouns. Consequen tly they ﬁnd that their mutual information approach, Mo or e et al. (MI) , tends to query no des with almost equal in- and out- de- gree ﬁrst; these represen t no de labels ab out whic h the mo del is most uncertain. W e p erformed a similar analysis using our MaxBM ap- proac h. W e found that the MaxBM quickly disco vered the main roles represen ting the nouns that follow ed ad- jectiv es, i.e. Roles 1 and 4 in Fig. 4; these no des tended to b e queried last. With Roles 1 and 4 in place, the model c ho oses no des that don’t ﬁt this pattern to determine ho w to b est assign the remaining roles. T o understand why the algorithm chose this ordering, w e considered the degree of each no de not only in terms 10 of its in- and out- degree, but in terms of its in- and out- degree to e ach class . W e found that the ma jority of the adjectives in the netw ork not only had a higher out-degree than in-degree, but also a higher out-degree to nouns than out-degree to adjectives. That is to sa y , these adjectiv es preceded nouns more than they preceded other adjectives. These no des tended to be queried last. In contrast, adjectives that did not link to the rest of the net work in this w ay were queried earlier since the model w as more uncertain ab out their classiﬁcation. A similar analysis of the nouns sho ws that the subset of nouns ex- plored last nev er follow nouns and hav e a relatively high in-degree. 3. F o o d web In the fo o d web net work w e ha ve t wo classiﬁcation tasks. The feeding t yp e task is shown in Figures 6 and 7(cen tre right) and the habitat task is in Figures 6 and 7(outer right). W e did not run the Zhu et al. (ERM) metho d on this netw ork due to the long running time. F or b oth tasks the margin strategy with the MaxBM mo del and the Pe el (Entr opy) metho ds outp erform the other algorithms. Again, w e examine the query order under the MaxBM (Mar gin) for b oth classiﬁcation tasks. F or the classiﬁca- tion of feeding type, the algorithm tends to query primary pro ducers last and the omnivores early on. As we saw in Fig.5, the primary pro ducer is a homogeneous class and is represen ted as a role deﬁned as having no outgoing links (Role 1). Omniv ores, on the other hand, are the hardest to distinguish from the rest of the netw ork as they ha v e the greatest v ariation in link patterns. F or the habitat classiﬁcation task, the last half of the net work no des to b e explored tend to b e of the b enthic and pelagic classes. In [21] it was suggested that the div ersity of the sp ecies contributed to the misclassiﬁca- tion of more than half of the benthic species. In con- trast to previous work, the evidence suggests that our approac h performs well even when the diversit y is large. The MaxBM mo del can capture this diversit y b ecause it allo ws for class heterogeneity , i.e. the mo del can use mul- tiple roles to mo del the v ariety of link patterns within a class. F urthermore, the fact that the b en thic no des are queried later on suggests that not only do es this mo del predict the b en thic class lab els more accurately , but it do es so with greater conﬁdence. 4. Cor a citation network W e do not run the Mo or e et al.(MI) or the Zhu et al.(ERM) learning algorithms as they do not scale well to netw orks of this size. Figure 8 shows the accuracy of predicting the unlab elled no des as the netw ork is ex- plored. Unlik e for the other netw orks, we do not explore 0 200 400 600 800 1000 1200 No. of nodes queried 0 20 40 60 80 100 Percentage of nodes classified correctly Cora Max-BM (Margin) Macskassy&Provost(Entropy) Lu and Getoor (Entropy) Bilgic & Getoor (Entropy) Macskassy (ERM) Peel (Entropy) FIG. 8: Performance of lab elling no des within the Cora citation netw ork. the entire netw ork and instead terminate the learning pro cess once half of the no des hav e b een queried. W e see that the Pe el (Entr opy) method performs b y far the worst. It can b e seen that the MaxBM (Mar- gin) metho d p erforms b etter, but it is the Macskassy & Pr ovost(Entr opy) and Macskassy (ERM) metho ds that p erform best. By the time half of the netw ork has b een explored all of the methods except Pe el (Entr opy) and Lu & Geto or (Entr opy) reach 90% accuracy . This is a similar pattern to that observed in the k arate club net- w ork. Both the k arate club so cial net work and the Cora citation netw ork are assortativ e and homogeneous [26]. Beneﬁt of sup ervise d r ole disc overy Finally , w e make a general comment on the b eneﬁt gained by allo wing the class lab els to inﬂuence the role disco very pro cess. The BM+SVM metho d is similar to our MaxBM metho d, except that it p erforms the role disco very and classiﬁer training as separate steps and the role discov ery do es not use the kno wn class label informa- tion. In all ﬁve of the classiﬁcation tasks we see that the BM+SVM metho d p erformed w orse than the MaxBM mo del. This sho ws that the netw ork roles are adapting as a result of the acquisition of class lab els during the ac- tiv e learning process and that there is measurable b eneﬁt to incorp orating lab els in to the inference pro cess. VI II. DISCUSSION In this work we ha ve considered the problem of identi- fying the b est subset of no des to lab el, in order to discov er net work roles that describ e the relationship b etw een net- w ork links and node lab els. W e constructed an inter- pretable and ﬂexible mo del based on blo c kmo dels and margin-based classiﬁcation. W e assessed its p erformance at disco vering roles by its abilit y to accurately predict the lab els across the rest of the netw ork. W e build on previous work based on generativ e blo ck- mo dels and active learning to explore an unlab elled net- w ork. In contrast to previous w ork, w e do not assume that no des with the same lab el all connect to the rest of 11 the net work in the same w a y . By allowing for heterogene- it y within classes we can mo del a wider range of netw ork structures, while still attaining go o d classiﬁcation accu- racy on net works with simple class structures. W e hav e compared our mo del, based on classiﬁcation accuracy , to the related class of models known as collec- tiv e classiﬁers. W e see that for simple assortativ e net- w orks, some of these algorithms outp erform ours. If we kno w a priori how classes in a netw ork are distributed relativ e to their link structure and w e kno w that this relationship is assortative, then our metho d is not the metho d to use. How ever, if the relationship b et ween class lab els is y et to be disco vered then, as w e hav e demon- strated, our MaxBM mo del provides a go o d approach to explore the netw ork. F urthermore, if the link patterns are not assortativ e, then our method gives b etter classi- ﬁcation p erformance than previous metho ds. IX. A CKNOWLEDGMENTS W e thank Chr is Aicher, Mustafa Bilgic, Aaron Clauset, Abigail Jacobs, Dan Larremore, Sofus Macsk assy and Suzy Moat for helpful comments and Cris Mo ore and Y ao jia Zh u for useful discussions and providing the W ed- dell Sea dataset. The author was supp orted by the UK EPSR C-funded Eng. Do ctorate Centre in Virtual Envi- ronmen ts, Imaging and Visualisation (EP/G037159/1) [1] L. Adamic and N. Glance. The political blogosphere and the 2004 u.s. election: Divided they blog. In In LinkKDD ’05: Pr o c e e dings of the 3r d international workshop on Link Disc overy , pages 36–43, 2005. [2] S. Allesina and M. Pascual. F o o d web mo dels: a plea for groups. Ec olo gy L etters , 12(7):652–662, 2009. [3] A. Asuncion, M. W elling, P . Smyth, and Y. W. T eh. On smo othing and inference for topic mo dels. In Pr o c e e d- ings of the International Confer enc e on Unc ertainty in Artiﬁcial Intel ligenc e , 2009. [4] H. A ttias. A v ariational bay esian framew ork for graphical mo dels. In In A dvances in Neur al Information Pr o c essing Systems 12 , pages 209–215. MIT Press, 2000. [5] P . S. Bearman, J. Mo o dy , and K. Sto v el. Chains of Aﬀec- tion: The Structure of Adolescent Romantic and Sexual Net works. Americ an Journal of So ciolo gy , 110(1):44–91, July 2004. [6] M. Bilgic and L. Geto or. Link-based activ e learning. In NIPS Workshop on Analyzing Networks and L e arn- ing with Gr aphs , 2009. [7] U. Brose, L. Cushing, E. L. Berlo w, T. Jonsson, C. Banasek-Rich ter, L.-F. Bersier, J. L. Blanc hard, T. Brey , S. R. Carp enter, M.-F. C. Blandenier, et al. Bo dy sizes of consumers and their resources. Ec olo gy , 86(9):2545, 2005. [8] C. Burfoot, S. Bird, and T. Baldwin. Collective classiﬁ- cation of congressional ﬂo or-debate transcripts. In Pr o- c e e dings of the 49th Annual Me eting of the Asso ciation for Computational Linguistics: Human L anguage T e ch- nolo gies - V olume 1 , pages 1506–1515. Association for Computational Linguistics, 2011. [9] J. Chen and B. Y uan. Detecting functional modules in the yeast protein-protein interaction netw ork. Bioinfor- matics , 22(18):2283–2290, 2006. [10] K. Crammer and Y. Singer. On the algorithmic imple- men tation of multiclass kernel-based vector machines. J. Mach. L e arn. R es. , 2:265–292, Mar. 2002. [11] N. Cristianini and J. Shaw e-T aylor. A n Intr o duction to Supp ort V e ctor Machines and Other Kernel-b ase d L e arn- ing Methods . Cam bridge Universit y Press, 1 edition, 2000. [12] B. H. Go od, Y.-A. de Montjo ye, and A. Clauset. Perfor- mance of mo dularit y maximization in practical contexts. Phys. R ev. E , 81(4):046106, Apr 2010. [13] S. Har-Peled, D. Roth, and D. Zimak. Maximum mar- gin coresets for active and noise tolerant learning. In Pr o c e e dings of the 20th international joint c onfer enc e on Artiﬁc al intel ligence , pages 836–841, San F rancisco, CA, USA, 2007. Morgan Kaufmann Publishers Inc. [14] P . W. Holland, K. B. Laskey , and S. Leinhardt. Sto c has- tic blo c kmo dels: First steps. So cial Networks , 5(2):109– 137, 1983. [15] X. Kong, X. Shi, and P . S. Y u. Multi-lab el collectiv e classiﬁcation. In Pr o c e e dings of the Eleventh SIAM In- ternational Confer enc e on Data Mining , pages 618–629. SIAM / Omnipress, 2011. [16] A. Kuw adek ar and J. Neville. Relational activ e learning for join t collectiv e classiﬁcation mo dels. In Pr oc e e dings of the 28th International Confer ence on Machine L e arning (ICML-11) , pages 385–392. ACM, June 2011. [17] Q. Lu and L. Geto or. Link-based classiﬁcation. In Pr o c e e dings of the International Confer enc e on Machine L e arning (ICML) , 2003. [18] S. A. Macsk assy . Using graph-based metrics with empir- ical risk minimization to sp eed up active learning on net- w orked data. In Pr o c e e dings of the 15th ACM SIGKDD International Confer enc e on Know le dge Discovery and Data Mining , pages 597–606. ACM, 2009. [19] S. A. Macsk assy and F. Prov ost. A simple relational classiﬁer. In Pro c e e dings of the Se c ond Workshop on Multi-R elational Data Mining (MRDM-2003) at KDD- 2003 , pages 64–76, 2003. [20] S. A. Macsk assy and F. Prov ost. Classiﬁcation in net- w orked data: A to olkit and a univ ariate case study . J. Mach. L e arn. R es. , 8:935–983, May 2007. [21] C. Mo ore, X. Y an, Y. Zh u, J.-B. Rouquier, and T. Lane. Activ e learning for no de classiﬁcation in assortative and disassortativ e netw orks. In Pr oc e e dings of the 17th ACM SIGKDD international c onfer enc e on Know le dge disc ov- ery and data mining , pages 841–849. ACM, 2011. [22] J. Neville and D. Jensen. Iterative classiﬁcation in rela- tional data. pages 13–20. AAAI Press, 2000. [23] M. E. J. Newman. Finding communit y structure in net- w orks using the eigenv ectors of matrices. Phys. R ev. E , 74(3):036104, Sep 2006. [24] K. Nowic ki and T. A. B. Snijders. Estimation and Pre- diction for Sto chastic Blockstructures. Journal of the Americ an Statistic al Asso ciation , 96(455), 2001. 12 [25] J. Parkkinen, J. Sinkkonen, A. Gyenge, and S. Kaski. A blo c k mo del suitable for sparse graphs. In Pr o c e e dings of the 7th International Workshop on Mining and L e arn- ing with Gr aphs (MLG 2009) , Leuven, Belgium, July 2-4 2009. Extended Abstract. [26] L. Peel. T opological feature based classiﬁcation. In Pr o- c e e dings of the 14th International Confer enc e on Infor- mation F usion , 2011. [27] J. Reichardt and S. Bornholdt. Clustering of sparse data via netw ork communities—a protot yp e study of a large online mark et. Journal of Statistic al Me chanics: The ory and Exp eriment , 2007(06):P06016, 2007. [28] D. Roth and K. Small. Margin-based active learning for structured output spaces. In J. Frnkranz, T. Scheﬀer, and M. Spiliop oulou, editors, Machine L e arning: ECML 2006, 17th Eu r op e an Confer enc e on Machine L e arning, Berlin, Germany, Septemb er 18-22, 2006, Pr o c e e dings , v olume 4212, pages 413–424. Springer, 2006. [29] P . Sen, G. M. Namata, M. Bilgic, L. Getoor, B. Gal- lagher, and T. Eliassi-Rad. Collective classiﬁcation in net work data. AI Magazine , 29(3):93–106, 2008. [30] J. Sung, Z. Ghahramani, and S.-Y. Bang. Latent-space v ariational bay es. IEEE T r ans. Pattern Anal. Mach. In- tel l. , 30(12):2236–2242, 2008. [31] B. T ask ar, C. Guestrin, and D. Koller. Max-margin mark ov netw orks. In S. Thrun, L. Saul, and B. Sch¨ olk opf, editors, A dvanc es in Neural Information Pr o cessing Sys- tems 16 . MIT Press, Cambridge, MA, 2004. [32] Y. W. T eh, D. Newman, and M. W elling. A col- lapsed v ariational Ba y esian inference algorithm for latent Diric hlet allocation. In A dvances in Neur al Information Pr o c essing Systems , volume 19, 2007. [33] Y. Tian, T. Huang, and W. Gao. Robust collective clas- siﬁcation with contextual dependency netw ork models. In Pr o c e e dings of the Se c ond international c onfer enc e on A dvanc e d Data Mining and Applic ations , pages 173–180. Springer-V erlag, 2006. [34] S. T ong and D. Koller. Support vector machine active learning with applications to text classiﬁcation. Journal of Machine L e arning R ese ar ch , 2:45–66, 2001. [35] I. Tso chan taridis, T. Joachims, T. Hofmann, and Y. Al- tun. Large margin methods for structured and interde- p enden t output v ariables. J. Mach. L earn. R es. , 6:1453– 1484, Dec. 2005. [36] V. V apnik. Statistic al le arning the ory . Wiley , 1998. [37] W. W. Zac hary . An Information Flow Mo del for Conﬂict and Fission in Small Groups. Journal of A nthr op olo gic al R ese ar ch , 33(4):452–473, 1977. [38] J. Zhu, A. Ahmed, and E. P . Xing. MedLDA: Maximum Margin Sup ervised T opic Mo dels. Journal of Machine L e arning Rese ar ch , 13:2237–2278, 2012. [39] X. Zh u, J. Laﬀerty , and Z. Ghahramani. Com bining ac- tiv e learning and semi-sup ervised learning using gaussian ﬁelds and harmonic functions. In ICML 2003 workshop on The Continuum fr om L ab ele d to Unlab ele d Data in Machine L e arning and Data Mining , pages 58–65, 2003.

Active Discovery of Network Roles for Predicting the Classes of Network Nodes

Original Paper

Comments & Academic Discussion

Leave a Comment

Original Paper

Related Papers

Comments & Academic Discussion

Leave a Comment