Exact Graph Learning via Integer Programming

Notice: This research summary and analysis were automatically generated using AI technology. For exact details, refer to the original arXiv paper.

Learning the dependence structure among variables in complex systems is a central problem across medical, natural, and social sciences. These structures can be naturally represented by graphs, and the task of inferring such graphs from data is known as graph learning or as causal discovery if the graphs are given a causal interpretation. Existing approaches typically rely on restrictive assumptions about the data-generating process, employ greedy oracle algorithms, or solve approximate formulations of the graph learning problem. As a result, they are either sensitive to violations of central assumptions or fail to guarantee globally optimal solutions. We address these limitations by introducing a nonparametric graph learning framework based on nonparametric conditional independence testing and integer programming. We reformulate the graph learning problem as an integer-programming problem and prove that solving the integer-programming problem provides a globally optimal solution to the original graph learning problem. Our method leverages efficient encodings of graphical separation criteria, enabling the exact recovery of larger graphs than was previously feasible. We provide an implementation in the openly available R package ‘glip’ which supports learning (acyclic) directed (mixed) graphs and chain graphs. From the resulting output one can compute representations of the corresponding Markov equivalence classes or weak equivalence classes. Empirically, we demonstrate that our approach is faster than other existing exact graph learning procedures for a large fraction of instances and graphs of various sizes. GLIP also achieves state-of-the-art performance on simulated data and benchmark datasets across all aforementioned classes of graphs.


💡 Research Summary

The paper addresses the fundamental problem of learning the dependence structure among variables in complex systems, a task commonly referred to as graph learning or causal discovery when a causal interpretation is desired. Traditional approaches fall into three broad categories: (i) constraint‑based methods that iteratively remove edges based on conditional independence (CI) tests (e.g., PC, FCI), (ii) score‑based methods that optimize a global fit score under strong distributional assumptions (e.g., BIC for linear Gaussian models), and (iii) hybrid or approximate methods that combine aspects of both but do not guarantee global optimality. These methods either rely on restrictive assumptions, are sensitive to the quality of CI tests, or cannot assure that the returned graph is the best possible given the data.

The authors propose a novel, non‑parametric framework called GLIP (Graph Learning via Integer Programming) that unifies constraint‑based and score‑based ideas through integer linear programming (ILP). The key steps are: (1) run CI tests of the user's choice on the data and collect the resulting p‑values; (2) define a “disagreement score” for each tested triple (a pair of variables together with a conditioning set), which penalizes a graph when the separation it implies contradicts the test outcome at a chosen significance threshold (such as 0.05); (3) formulate the graph learning problem as an ILP that minimizes the weighted sum of these disagreement scores subject to graphical separation constraints that encode the Markov properties of the target graph class (DAGs, ADMGs, chain graphs, etc.). The objective function thus directly measures the inconsistency between a candidate graph and the empirical CI information, while the constraints guarantee that any feasible solution respects the appropriate separation criteria.
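The three steps above can be illustrated on a toy problem. The following is a minimal sketch, not the paper's method: it replaces the ILP with brute-force enumeration of all DAGs on three variables, uses unit weights, and counts a disagreement whenever d-separation in the candidate graph differs from the test decision "p > alpha". The function names, the p-values, and the unit weighting are assumptions made for illustration only.

```python
from itertools import combinations, product

def is_acyclic(nodes, edges):
    """Repeatedly strip nodes without incoming edges; fail if none exist."""
    remaining = set(nodes)
    while remaining:
        sources = {v for v in remaining
                   if not any((u, v) in edges for u in remaining)}
        if not sources:
            return False
        remaining -= sources
    return True

def d_separated(edges, x, y, cond):
    """d-separation via the moralized ancestral graph of {x, y} and cond."""
    relevant = {x, y} | set(cond)
    changed = True
    while changed:  # close the set under taking parents (ancestors)
        changed = False
        for u, v in edges:
            if v in relevant and u not in relevant:
                relevant.add(u)
                changed = True
    sub = [(u, v) for u, v in edges if u in relevant and v in relevant]
    undirected = {frozenset(e) for e in sub}
    for v in relevant:  # moralize: connect parents of a common child
        parents = [u for u, w in sub if w == v]
        for a, b in combinations(parents, 2):
            undirected.add(frozenset((a, b)))
    frontier, seen = [x], {x}
    while frontier:  # search for a path from x to y that avoids cond
        cur = frontier.pop()
        if cur == y:
            return False
        for e in undirected:
            if cur in e:
                (other,) = e - {cur}
                if other not in seen and other not in cond:
                    seen.add(other)
                    frontier.append(other)
    return True

def learn_exact(nodes, pvalues, alpha=0.05):
    """Return the minimal disagreement score and all DAGs attaining it."""
    pairs = list(combinations(nodes, 2))
    best_score, best_graphs = None, []
    for choice in product((0, 1, 2), repeat=len(pairs)):
        edges = set()
        for (a, b), c in zip(pairs, choice):
            if c == 1:
                edges.add((a, b))
            elif c == 2:
                edges.add((b, a))
        if not is_acyclic(nodes, edges):
            continue
        # Unit-weight disagreement: graph separation vs. test decision.
        score = sum(d_separated(edges, x, y, s) != (p > alpha)
                    for (x, y, s), p in pvalues.items())
        if best_score is None or score < best_score:
            best_score, best_graphs = score, [frozenset(edges)]
        elif score == best_score:
            best_graphs.append(frozenset(edges))
    return best_score, best_graphs

# Hypothetical p-values consistent with a collider X -> Z <- Y.
pvalues = {
    ("X", "Y", ()): 0.60, ("X", "Y", ("Z",)): 0.001,
    ("X", "Z", ()): 0.001, ("X", "Z", ("Y",)): 0.001,
    ("Y", "Z", ()): 0.001, ("Y", "Z", ("X",)): 0.001,
}
best_score, best_graphs = learn_exact(["X", "Y", "Z"], pvalues)
print(best_score, best_graphs)
```

Enumeration is only feasible for a handful of nodes, which is the motivation for handing the same objective to an integer-programming solver instead.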

A major technical contribution is the encoding of separation constraints. Prior exact methods (e.g., Eberhardt et al., 2025) introduced a variable for every possible path in the underlying graph, leading to factorial growth (≈ d! variables for d nodes) that limited practical applicability to graphs with at most 6 nodes. GLIP instead introduces variables only for the shortest connecting paths, exploiting the fact that the existence of any connecting path implies the existence of a shortest one. Consequently, the number of ILP variables grows linearly with the number of input p‑values, dramatically reducing problem size and enabling exact learning for graphs with up to 9 nodes when learning Markov equivalence classes and up to 14 nodes when learning weak equivalence classes (conditioning sets of size ≤ 1). The authors prove (Theorem 5) that solving the ILP yields a globally optimal solution to the original graph‑learning problem (GL), and they establish correctness for each graph class in separate results (Theorems 2‑4).
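To give a rough sense of the two growth rates, the following sketch compares the number of simple paths between two fixed nodes of a complete graph on d nodes with the number of CI statements (a pair plus a conditioning set of bounded size). This is illustrative arithmetic under stated assumptions, not the variable count of either encoding: the complete graph is the worst case for path enumeration, and the helper names are hypothetical.

```python
from math import comb, factorial

def simple_paths_between_two_nodes(d):
    """Simple paths between two fixed nodes of the complete graph on d nodes.

    A path visits an ordered selection of k of the other d - 2 nodes,
    so the count is the sum over k of (d - 2)! / (d - 2 - k)!.
    """
    inner = d - 2
    return sum(factorial(inner) // factorial(inner - k)
               for k in range(inner + 1))

def ci_statements(d, max_cond_size):
    """Pairs {X, Y} combined with conditioning sets S of size <= max_cond_size."""
    return comb(d, 2) * sum(comb(d - 2, k) for k in range(max_cond_size + 1))

for d in (6, 9, 14):
    print(d, simple_paths_between_two_nodes(d), ci_statements(d, 1))
```

For d = 6 there are 65 such paths per pair; for d = 14 with conditioning sets of size at most 1 there are 1183 CI statements in total, while the per-pair path count is in the hundreds of millions.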

The implementation is provided as an open‑source R package “glip”. Users supply CI test results (e.g., from partial‑correlation or kernel‑based CI tests), and the package constructs the ILP and hands it to a modern MIP solver (Gurobi, CPLEX, or open‑source alternatives). After solving, GLIP can output the learned graph, a representation of its Markov equivalence class (e.g., CPDAG, PAG), or weak equivalence class structures. The framework is modular: the objective function can be swapped for alternative loss formulations, and the separation encoding can be adapted to any graph class that admits a path‑based global Markov property.
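As an illustration of the kind of CI test result a user might supply, here is a minimal sketch of a partial-correlation test with the Fisher z-transform for a single conditioning variable. It assumes joint Gaussianity and takes sample correlations as given; the correlation values and sample size are made up, and the input format expected by the glip package is not shown here because the summary does not describe it.

```python
from math import atanh, sqrt
from statistics import NormalDist

def partial_corr(r_xy, r_xz, r_yz):
    """Partial correlation of X and Y given a single variable Z."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz**2) * (1 - r_yz**2))

def fisher_z_pvalue(r, n, cond_size):
    """Two-sided p-value for H0: (partial) correlation is zero.

    n is the sample size and cond_size the number of conditioning variables.
    """
    z = atanh(r) * sqrt(n - cond_size - 3)
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical correlations for a common cause Z of X and Y (X <- Z -> Y).
n = 200
r_xy, r_xz, r_yz = 0.25, 0.5, 0.5

p_marginal = fisher_z_pvalue(r_xy, n, cond_size=0)
p_given_z = fisher_z_pvalue(partial_corr(r_xy, r_xz, r_yz), n, cond_size=1)
print(p_marginal, p_given_z)
```

Here the marginal test rejects independence of X and Y at the 0.05 level, while the test given Z does not, which is the pattern a graph with Z as a common cause implies.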

Empirical evaluation includes (a) synthetic linear Gaussian models with varying edge strengths (including near‑faithfulness violations), (b) standard Bayesian network benchmarks, and (c) real‑world datasets. In simulations, GLIP outperforms the FCI algorithm when strong correlations cause FCI to erroneously delete edges, because GLIP optimizes a global disagreement score rather than making greedy edge deletions. Compared to exact methods based on answer‑set programming (ASP) (Hyttinen et al., 2014, 2017), GLIP is faster on a majority of graphs with up to 9 nodes under a wall‑time limit of 600 seconds. For weak equivalence learning, GLIP remains competitive up to 14 nodes. The authors also propose a “warm‑start” strategy: a fast approximate method (e.g., GES or PC) provides an initial feasible graph, which GLIP then refines within a user‑specified time budget, often improving the solution even if the global optimum is not reached.

The paper acknowledges scalability limits: beyond roughly 15–20 nodes, ILP solving becomes computationally intensive. Nevertheless, the authors argue that the presented minimal‑length encoding is a substantial step forward, reducing the variable count from factorial in the number of nodes to linear in the number of input p‑values, and that future work could incorporate decomposition, column generation, or parallel solving to push the boundary further.

In summary, the contribution of this work lies in (i) formulating non‑parametric graph learning as an exact ILP problem, (ii) introducing an efficient minimal‑path encoding that dramatically reduces problem size, (iii) providing theoretical guarantees of global optimality for multiple graph families, and (iv) delivering a practical, open‑source tool that outperforms existing exact methods on a range of synthetic and real datasets. This positions GLIP as a compelling option for researchers requiring exact, assumption‑lean causal discovery, especially in settings where sample size permits reliable CI testing and computational resources allow for moderate‑scale integer programming.

