Constructing and Sampling Graphs with a Prescribed Joint Degree Distribution
One of the most influential recent results in network analysis is that many natural networks exhibit a power-law or log-normal degree distribution. This has inspired numerous generative models that match this property. However, more recent work has shown that while these generative models do have the right degree distribution, they are not good models for real-life networks because they differ on other important metrics such as conductance. We believe this is, in part, because many of these real-world networks have very different joint degree distributions, i.e., the probability that a randomly selected edge will be between nodes of degree k and l. Assortativity is a sufficient statistic of the joint degree distribution, and it has been previously noted that social networks tend to be assortative, while biological and technological networks tend to be disassortative. We suggest that understanding the relationship between network structure and the joint degree distribution of graphs is an interesting avenue of further research. Important tools for such studies are algorithms that can generate random instances of graphs with the same joint degree distribution. This is the main topic of this paper, and we study the problem from both a theoretical and a practical perspective. We provide an algorithm for constructing simple graphs from a given joint degree distribution, and a Markov Chain Monte Carlo method for sampling them. We also show that the state space of simple graphs with a fixed degree distribution is connected via end point switches. We empirically evaluate the mixing time of this Markov Chain by using experiments based on the autocorrelation of each edge. These experiments show that our Markov Chain mixes quickly on real graphs, allowing our techniques to be used in practice.
💡 Research Summary
The paper addresses the fundamental problem of generating and sampling simple graphs that exactly match a prescribed joint degree matrix (JDM), i.e., the exact count of edges between vertices of degree k and degree l. While many network models focus solely on reproducing the marginal degree distribution, recent evidence shows that this is insufficient to capture important structural properties such as conductance, clustering, and assortativity. The authors therefore ask three core questions: (1) When does a given JDM correspond to a realizable labeled graph? (2) If realizable, how can one construct a simple graph with that JDM? (3) How can one uniformly sample from the space of all simple graphs sharing the same JDM?
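As a concrete illustration of the definition above, the following minimal Python sketch counts the edges between each pair of degree classes in a small graph. The function name `joint_degree_matrix` and the dictionary representation keyed by `(k, l)` with `k <= l` are assumptions made for this example, not notation from the paper.

```python
from collections import Counter

def joint_degree_matrix(edges):
    """Count edges between degree classes; keys are (k, l) with k <= l."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    jdm = Counter()
    for u, v in edges:
        k, l = sorted((degree[u], degree[v]))
        jdm[(k, l)] += 1
    return jdm

# Path a-b-c-d: the two end vertices have degree 1, the inner two degree 2.
path = [("a", "b"), ("b", "c"), ("c", "d")]
jdm = joint_degree_matrix(path)
print(dict(jdm))  # {(1, 2): 2, (2, 2): 1}
```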
First, the paper derives necessary and sufficient graphicality conditions for a JDM. By summing over rows and columns of the JDM one can recover the total number of edges m and the degree vector D (the number of vertices of each degree). The authors extend the classic Erdős‑Gallai theorem from degree sequences to JDMs, showing that a JDM is graphical if and only if the derived degree sequence satisfies Erdős‑Gallai and the edge counts between degree classes are compatible.
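The bookkeeping described above can be sketched in Python. The summary does not spell out what "compatible" means, so this sketch assumes the conditions usually stated for joint degree matrices: every degree class must contain a whole number of vertices, there can be at most n_k * n_l edges between distinct classes k and l, and at most C(n_k, 2) edges inside class k. The names `class_sizes` and `is_realizable` are hypothetical, and degrees are assumed to be positive integers.

```python
from math import comb

def class_sizes(jdm):
    """Return {k: n_k}, or None if some class size is not an integer."""
    endpoints = {}
    for (k, l), count in jdm.items():
        # An edge of type (k, l) has one endpoint in class k and one in
        # class l, so an edge of type (k, k) contributes two to class k.
        endpoints[k] = endpoints.get(k, 0) + count
        endpoints[l] = endpoints.get(l, 0) + count
    sizes = {}
    for k, total in endpoints.items():
        if total % k:
            return None
        sizes[k] = total // k
    return sizes

def is_realizable(jdm):
    """Check the assumed counting conditions for a simple graph."""
    sizes = class_sizes(jdm)
    if sizes is None:
        return False
    for (k, l), count in jdm.items():
        limit = comb(sizes[k], 2) if k == l else sizes[k] * sizes[l]
        if count > limit:
            return False
    return True

jdm = {(1, 2): 2, (2, 2): 1}       # the path on four vertices
print(sum(jdm.values()))           # total number of edges m: 3
print(class_sizes(jdm))            # {1: 2, 2: 2}
print(is_realizable(jdm))          # True
print(is_realizable({(2, 2): 2}))  # False: two degree-2 vertices share at most one edge
```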
To construct a graph, the authors introduce a configuration model tailored to JDMs. Every vertex of degree k is split into k “mini‑vertices”, and every edge of type (k,l) is split into a pair of “mini‑endpoints”, one labelled k and one labelled l. Each class‑k mini‑vertex is joined to every class‑k mini‑endpoint, forming a complete bipartite component called the k‑neighborhood; the whole configuration graph is the disjoint union of these k‑neighborhoods. Any perfect matching in this bipartite graph corresponds, after merging paired mini‑endpoints, to a pseudograph that exactly realises the target JDM. The existence of a perfect matching follows from Hall’s theorem, establishing sufficiency of the graphicality conditions.
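One way to picture this construction is the short Python sketch below. It is an illustration under stated assumptions rather than the authors' implementation: vertices are named `(k, i)` for the i-th vertex of degree k, the function name `jdm_pseudograph` is hypothetical, and the perfect matching inside each k-neighborhood is obtained by shuffling the class-k mini-vertices against the class-k mini-endpoints.

```python
import random
from collections import Counter

def jdm_pseudograph(jdm, seed=None):
    """Match mini-vertices to mini-endpoints class by class.

    Returns a list of edges between vertices named (k, i); self-loops
    and repeated edges may occur, so the result is a pseudograph.
    """
    rng = random.Random(seed)
    slots = {}  # degree label k -> list of (edge id, side) mini-endpoints
    edge_id = 0
    for (k, l), count in sorted(jdm.items()):
        for _ in range(count):
            slots.setdefault(k, []).append((edge_id, 0))
            slots.setdefault(l, []).append((edge_id, 1))
            edge_id += 1
    ends = [[None, None] for _ in range(edge_id)]
    for k, class_slots in sorted(slots.items()):
        n_k, remainder = divmod(len(class_slots), k)
        if remainder:
            raise ValueError(f"degree class {k} has a fractional size")
        # k mini-vertices for each of the n_k vertices of degree k
        mini_vertices = [(k, i) for i in range(n_k) for _ in range(k)]
        rng.shuffle(mini_vertices)
        for (edge, side), vertex in zip(class_slots, mini_vertices):
            ends[edge][side] = vertex
    return [tuple(pair) for pair in ends]

target = {(1, 2): 2, (2, 2): 1}
graph = jdm_pseudograph(target, seed=1)
degree = Counter(v for edge in graph for v in edge)  # a loop counts twice
realised = Counter(tuple(sorted((u[0], v[0]))) for u, v in graph)
print(all(degree[v] == v[0] for v in degree), realised == Counter(target))
```

Because every class-k mini-endpoint is matched to a mini-vertex of some degree-k vertex, each edge of type (k, l) joins a degree-k vertex to a degree-l vertex by construction, which is why the final check passes regardless of the seed.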
To obtain a simple graph (no self‑loops, no multi‑edges), the authors define an “end‑point switch” operation: select two edges (u‑v) and (x‑y) and swap their endpoints to (u‑y) and (x‑v). Such a switch always preserves every vertex degree; it preserves the JDM when the exchanged endpoints v and y have the same degree, because each edge then still joins the same pair of degree classes. Switches can also eliminate forbidden structures. The authors prove that the state space of all simple graphs with the given JDM is connected under such switches.
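The effect of a switch on the JDM can be checked on a toy graph. The sketch below is illustrative only; `jdm_of` and `endpoint_switch` are hypothetical helper names. It shows that exchanging two endpoints of equal degree leaves the JDM unchanged, while exchanging endpoints of different degree keeps all vertex degrees but alters the JDM.

```python
from collections import Counter

def jdm_of(edges):
    """Joint degree counts keyed by (k, l) with k <= l."""
    degree = Counter(v for edge in edges for v in edge)
    return Counter(tuple(sorted((degree[u], degree[v]))) for u, v in edges)

def endpoint_switch(edges, i, j):
    """Replace edges (u, v) and (x, y) with (u, y) and (x, v)."""
    (u, v), (x, y) = edges[i], edges[j]
    switched = list(edges)
    switched[i], switched[j] = (u, y), (x, v)
    return switched

# Path a-b-c-d-e-f: a and f have degree 1, the rest degree 2.
path = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e"), ("e", "f")]

same_degree = endpoint_switch(path, 0, 3)   # exchanges b and e, both of degree 2
mixed_degree = endpoint_switch(path, 0, 4)  # exchanges b (degree 2) and f (degree 1)

print(jdm_of(same_degree) == jdm_of(path))   # True
print(jdm_of(mixed_degree) == jdm_of(path))  # False
```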
Based on this connectivity, two Markov chains are designed. The first chain operates on the space of pseudographs generated by the configuration model; each step randomly rewires a pair of mini‑endpoints, which is essentially the classic configuration‑model chain. Using prior results (Kannan‑Mihail‑Vempala, Taylor) the authors show this chain mixes in polynomial time. The second chain works directly on simple graphs: at each step it picks two edges uniformly at random and attempts an end‑point switch, rejecting the move if it would create a self‑loop, a multi‑edge, or disconnect the graph. While a rigorous mixing‑time bound for this simple‑graph chain remains open, the authors conduct extensive experiments to estimate mixing empirically.
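A single step of the simple-graph chain might look like the following Python sketch. It is a simplified illustration, not the authors' code: it assumes integer vertex labels, it only exchanges endpoints of equal degree so that the JDM is preserved, and it omits the connectivity check mentioned above for brevity. The names `chain_step` and `jdm_of` are hypothetical.

```python
import random
from collections import Counter

def jdm_of(edges, degree):
    """Joint degree counts for a list of frozenset edges."""
    return Counter(tuple(sorted(degree[v] for v in e)) for e in edges)

def chain_step(edges, edge_set, degree, rng):
    """Attempt one end-point switch in place; return True if accepted."""
    i, j = rng.sample(range(len(edges)), 2)
    u, v = rng.sample(sorted(edges[i]), 2)  # random orientation of edge i
    x, y = rng.sample(sorted(edges[j]), 2)  # random orientation of edge j
    if degree[v] != degree[y]:
        return False  # exchanging v and y would change the JDM
    a, b = frozenset((u, y)), frozenset((x, v))
    if len(a) < 2 or len(b) < 2:
        return False  # would create a self-loop
    if a in edge_set or b in edge_set:
        return False  # would create a multi-edge
    edge_set.difference_update((edges[i], edges[j]))
    edge_set.update((a, b))
    edges[i], edges[j] = a, b
    return True

pairs = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0), (0, 3), (2, 6)]
edges = [frozenset(p) for p in pairs]
edge_set = set(edges)
degree = Counter(v for e in edges for v in e)  # unchanged by any switch
before = jdm_of(edges, degree)

rng = random.Random(42)
accepted = sum(chain_step(edges, edge_set, degree, rng) for _ in range(2000))
print(accepted, jdm_of(edges, degree) == before)
```

Rejected moves leave the graph unchanged and still count as a step of the chain, which is the usual way to keep such a chain lazy.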
The experimental methodology relies on autocorrelation analysis. For each edge, the presence/absence indicator is treated as a binary time series generated by the Markov chain. Autocorrelation is computed as a function of lag; the lag at which autocorrelation falls below a small threshold (e.g., 0.01) is taken as an empirical mixing time. Across a variety of real networks—social, biological, and technological—the authors find that autocorrelation decays rapidly, typically within 5 000–20 000 steps, indicating fast mixing in practice.
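The per-edge diagnostic can be sketched as follows. This is a minimal illustration using a standard sample autocorrelation estimator, which is an assumption because the summary does not say which estimator the authors use. The function names, the synthetic "sticky" indicator series, and the `max_lag` cutoff are all hypothetical.

```python
import random

def autocorrelation(series, lag):
    """Sample autocorrelation of a numeric series at the given lag."""
    n = len(series)
    mean = sum(series) / n
    variance = sum((s - mean) ** 2 for s in series)
    if variance == 0:
        return 0.0  # constant series: treated as uncorrelated in this sketch
    covariance = sum(
        (series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag)
    )
    return covariance / variance

def first_lag_below(series, threshold, max_lag):
    """Smallest lag whose |autocorrelation| is under threshold, else None."""
    for lag in range(1, max_lag + 1):
        if abs(autocorrelation(series, lag)) < threshold:
            return lag
    return None

# Synthetic edge indicator: the edge flips between present (1) and
# absent (0) with probability 0.1 per step, so nearby samples correlate.
rng = random.Random(0)
state, indicator = 1, []
for _ in range(5000):
    if rng.random() < 0.1:
        state = 1 - state
    indicator.append(state)

print(autocorrelation(indicator, 1))
print(first_lag_below(indicator, threshold=0.01, max_lag=200))
```

In an actual experiment the indicator series would come from recording, after every chain step, whether a particular vertex pair is joined by an edge.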
The paper also compares the proposed JDM‑based generation to traditional degree‑distribution‑only models (e.g., preferential attachment, Kronecker graphs). By reconstructing the JDM of several real networks and generating graphs that match it, the authors demonstrate that key structural metrics such as clustering coefficient, conductance, core numbers, and assortativity are reproduced far more accurately than with models that only match the marginal degree distribution.
In summary, the contributions are: (1) a complete set of graphicality criteria for joint degree matrices; (2) a constructive algorithm (the JDM configuration model) that builds a pseudograph satisfying any realizable JDM; (3) an end‑point switch operation proving connectivity of the simple‑graph space; (4) two Markov chains for sampling pseudographs (with provable rapid mixing) and simple graphs (with empirically fast mixing); and (5) a practical autocorrelation‑based framework for determining when to stop the chain. These results provide a solid foundation for future work on higher‑order network models (e.g., d‑K series), dynamic networks, and rigorous statistical testing of network hypotheses that require preserving joint degree structure.