Dynamic sparse graphs with overlapping communities

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the [Original Paper Viewer] below or the Original ArXiv Source.

Dynamic community detection in networks addresses the challenge of tracking how groups of interconnected nodes evolve, merge, and dissolve within time-evolving networks. Here, we propose a novel statistical framework for sparse networks with power-law degree distribution and dynamic overlapping community structure. Using a Bayesian nonparametric framework, we build on the idea of representing the graph as an exchangeable point process on the plane. We base the model construction on vectors of completely random measures and a latent Markov process for the time-evolving node affiliations. This construction provides a flexible and interpretable approach to modeling dynamic communities, naturally generalizing existing overlapping block models to the sparse and scale-free regimes. We provide the asymptotic properties of the model concerning sparsity and power-law behavior and propose inference through an approximate procedure, which we validate empirically. We show how the model can uncover interpretable community trajectories in a real-world network.


💡 Research Summary

The paper tackles the problem of dynamic community detection in networks that are both sparse and exhibit power‑law degree distributions, while allowing nodes to belong to multiple communities simultaneously. Existing approaches assume dense graphs, static community structure, or hard (non‑overlapping) memberships, and they typically rely on exchangeable random arrays, which cannot generate sparse graphs. To overcome these limitations, the authors propose a Bayesian non‑parametric model, dynSNetOC, built on the framework of exchangeable point processes introduced by Caron and Fox (2017).

In dynSNetOC each node is represented as an atom of a completely random measure (CRM) placed on the plane; the CRM is constructed from a Lévy measure that corresponds to a generalized gamma process (GGP) with infinite activity. This choice guarantees that the expected degree scales as n^{1‑α} (0 < α < 1) and that the degree distribution follows a power‑law tail P(d) ∝ d^{-(1+α)}. The model introduces a vector of latent affiliation scores θ_{i,t,k} for node i, community k, and time t. These scores evolve according to a Markov process: at each time step they are updated by drawing Gamma‑distributed increments and mixing them with the previous scores, preserving the same marginal Gamma distribution while inducing temporal correlation.
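The idea of a Markov process that keeps the same Gamma marginal at every time step can be illustrated with a standard latent-count construction. This is a minimal sketch and not necessarily the update used in dynSNetOC: the function name `simulate_gamma_chain` and the parameters `a`, `b`, and `phi` are assumptions made for illustration. Given θ_t ~ Gamma(a, b), draw z ~ Poisson(φ θ_t) and then θ_{t+1} ~ Gamma(a + z, b + φ); by Gamma-Poisson conjugacy the marginal stays Gamma(a, b), and the lag-one correlation is φ / (b + φ).

```python
import numpy as np

def simulate_gamma_chain(a, b, phi, n_steps, n_chains, rng):
    """Simulate stationary Markov chains with Gamma(shape=a, rate=b) marginals.

    Illustrative construction (an assumption, not the paper's exact update):
    a latent Poisson count z links consecutive scores, and phi controls how
    strongly theta[t] depends on theta[t - 1].
    """
    theta = np.empty((n_steps, n_chains))
    # Start each chain from the stationary Gamma(a, b) distribution.
    theta[0] = rng.gamma(shape=a, scale=1.0 / b, size=n_chains)
    for t in range(1, n_steps):
        # Latent count carrying information about the previous score.
        z = rng.poisson(phi * theta[t - 1])
        # Conjugate draw: the marginal of theta[t] is again Gamma(a, b).
        theta[t] = rng.gamma(shape=a + z, scale=1.0 / (b + phi))
    return theta

rng = np.random.default_rng(0)
theta = simulate_gamma_chain(a=2.0, b=1.0, phi=5.0,
                             n_steps=20, n_chains=50_000, rng=rng)

# Empirical checks: the last time step should still look like Gamma(2, 1).
final_mean = theta[-1].mean()
final_var = theta[-1].var()
lag1_corr = np.corrcoef(theta[0], theta[1])[0, 1]
```

A larger `phi` gives smoother trajectories, while `phi` close to zero makes consecutive scores nearly independent, which is one way such a chain can accommodate both gradual drift and abrupt changes.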

Edge formation is defined through a Poisson process: the number of directed multiedges from i to j at time t follows a Poisson distribution with rate λ_{ij}^{(t)} = ∑_{k} θ_{i,t,k} θ_{j,t,k} w_{i}^{(t,k)} w_{j}^{(t,k)}, where w_{i}^{(t,k)} are the CRM weights associated with node i and community k. This formulation naturally captures overlapping community structure because the contribution of each community to the edge intensity is additive, and it respects sparsity because the Poisson rates are driven by the sparse CRM.
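For a single time step and a finite set of nodes, the rate formula above can be sketched as follows. The array names, the helper `edge_rates`, and the Gamma draws used to fill in the scores and weights are placeholders chosen for illustration; in the model the weights come from the CRM, which this sketch does not simulate.

```python
import numpy as np

def edge_rates(theta, w):
    """Poisson rates lambda[i, j] = sum_k theta[i,k] * theta[j,k] * w[i,k] * w[j,k].

    theta, w: arrays of shape (n_nodes, n_communities) for one time step.
    """
    phi = theta * w          # per-node, per-community interaction strength
    return phi @ phi.T       # sums the additive community contributions over k

rng = np.random.default_rng(1)
n_nodes, n_comms = 5, 3

# Placeholder values (assumption): any positive arrays would do here.
theta = rng.gamma(2.0, 1.0, size=(n_nodes, n_comms))
w = rng.gamma(0.5, 1.0, size=(n_nodes, n_comms))

rates = edge_rates(theta, w)
# Directed multiedge counts; the diagonal holds self-loop counts.
counts = rng.poisson(rates)
```

Because each community adds its own non-negative term to the rate, a pair of nodes that share several communities receives a higher expected number of edges than a pair that shares only one.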

The authors provide rigorous asymptotic analysis showing that the model indeed yields sparse graphs with the desired power‑law degree behavior. They also prove that the Markov dynamics preserve the marginal distribution of the affiliation scores, allowing for smooth evolution as well as abrupt changes such as community merging or splitting.

For inference, exact Bayesian posterior computation is intractable due to the infinite‑dimensional nature of the CRM and the non‑conjugate Poisson‑Gamma coupling. The paper therefore introduces an approximate inference scheme that combines variational Bayes with a truncated stick‑breaking representation of the CRM. The variational distribution over the affiliation scores is taken to be Gaussian, while the CRM atoms and weights are approximated by a finite set of latent variables. The conjugacy between Poisson counts and Gamma priors is exploited to obtain closed‑form updates for expected rates, and a small MCMC step is used to refine hyper‑parameters.
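The closed-form updates mentioned above rest on the textbook conjugacy between Poisson counts and a Gamma prior. The sketch below shows only that generic identity, not the paper's inference scheme; the function name `gamma_poisson_update` and the example numbers are assumptions.

```python
def gamma_poisson_update(a, b, counts):
    """Posterior of a Poisson rate under a Gamma(shape=a, rate=b) prior.

    For i.i.d. Poisson(rate) counts the posterior is
    Gamma(a + sum(counts), b + len(counts)).
    """
    return a + sum(counts), b + len(counts)

# Prior Gamma(2, 1) and four observed counts summing to 8.
a_post, b_post = gamma_poisson_update(2.0, 1.0, [3, 0, 4, 1])
posterior_mean = a_post / b_post   # expected rate under the posterior
print(a_post, b_post, posterior_mean)  # prints "10.0 5.0 2.0"
```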

Empirical validation is performed on both synthetic and real‑world datasets. In synthetic experiments, the model accurately recovers the true hyper‑parameters, degree distribution, and time‑varying community trajectories, outperforming dynamic mixed‑membership SBMs and other baselines that cannot simultaneously model sparsity and overlapping memberships. Real‑world applications include a financial transaction network and a citation network. In both cases dynSNetOC discovers interpretable community trajectories: emerging industry sectors in the financial data and evolving research topics in the citation data, while faithfully reproducing the observed sparsity and heavy‑tailed degree patterns. Competing methods either over‑densify the graph, miss overlapping structure, or fail to capture the power‑law tail.

In summary, the paper makes three major contributions: (1) a principled statistical construction of dynamic sparse graphs with overlapping communities using exchangeable point processes and CRMs; (2) a theoretical characterization of sparsity and power‑law behavior under this construction; and (3) a scalable approximate inference algorithm that enables practical analysis of large, time‑evolving networks. The work opens new avenues for studying complex dynamic systems in sociology, biology, finance, and beyond, where both sparsity and overlapping community structure are essential features.

