Learning to Relax Nonconvex Quadratically Constrained Quadratic Programs
Quadratically constrained quadratic programs (QCQPs) are ubiquitous in optimization: such problems arise in applications from operations research, power systems, signal processing, chemical engineering, and portfolio theory, among others. Despite their flexibility in modeling real-life situations and the recent effort to understand their properties, nonconvex QCQPs are hard to solve in practice. Most approaches in the literature are based on either Linear Programming (LP) or Semidefinite Programming (SDP) relaxations, each of which works very well on some problem subclasses but performs poorly on others. In this paper, we develop a relaxation selection procedure for nonconvex QCQPs that adaptively decides whether an LP- or SDP-based approach is expected to be more beneficial, based on the instance structure. The proposed methodology relies on machine learning methods with features derived from spectral properties and sparsity patterns of the data matrices; once trained, the prediction model applies to any instance with an arbitrary number of variables and constraints. We develop classification and regression models under different feature-design setups, including a dimension-independent representation, and evaluate them on both synthetically generated instances and benchmark instances from MINLPLib. Our computational results demonstrate the effectiveness of the proposed approach for predicting the more favorable relaxation across diverse QCQP families.
💡 Research Summary
This paper addresses a practical yet under‑explored decision problem in non‑convex quadratically constrained quadratic programming (QCQP): given a particular instance, should one employ a linear programming (LP) relaxation based on McCormick envelopes or a semidefinite programming (SDP) relaxation (including a strengthened SDP′ that exploits variable bounds)? Solving both relaxations and picking the tighter bound is computationally expensive, especially for large‑scale problems. The authors therefore propose a supervised‑learning framework that predicts, from instance characteristics alone, which relaxation is expected to yield a stronger bound.
The methodology proceeds in three stages. First, the QCQP data (matrices A_k, vectors b_k, scalars c_k, and variable bounds l, u) are transformed into a set of engineered features. The features capture three aspects that the literature suggests influence relaxation strength: (i) spectral properties (minimum/maximum eigenvalues, eigenvalue distribution statistics) of each quadratic matrix, (ii) sparsity patterns (proportion of non‑zero off‑diagonal entries, average non‑zeros per row/column), and (iii) bound information (presence of finite bounds, average bound width, variability).
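To make the first stage concrete, here is a minimal sketch of how such spectral and sparsity descriptors could be computed for one quadratic matrix. The function name and the exact choice of statistics are illustrative assumptions, not the paper's feature list.

```python
import numpy as np

def spectral_sparsity_features(A, tol=1e-10):
    """Illustrative feature vector for one symmetric matrix A of a QCQP.

    Captures the kinds of descriptors the summary mentions (eigenvalue
    statistics, off-diagonal sparsity); the paper's actual features may
    differ.
    """
    n = A.shape[0]
    eigs = np.linalg.eigvalsh(A)          # eigenvalues of symmetric A
    off = A - np.diag(np.diag(A))         # off-diagonal part
    nnz_off = np.count_nonzero(np.abs(off) > tol)
    return np.array([
        eigs.min(),                       # most negative curvature
        eigs.max(),                       # largest eigenvalue
        eigs.mean(),
        eigs.std(),
        np.mean(eigs < -tol),             # fraction of negative eigenvalues
        nnz_off / max(n * (n - 1), 1),    # off-diagonal density
    ])
```

A diagonal (fully convex) matrix would yield zero off-diagonal density and no negative eigenvalues, whereas an indefinite dense matrix would score high on both counts.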
Second, three feature‑design schemes are introduced to handle the dimensionality issue. The fully dimension‑dependent (fDD) scheme retains all raw statistics and therefore scales with the number of variables n and constraints m; it yields the richest representation but can only be applied to problem families with the same dimensions as the training set. The semi‑dimension‑dependent (sDD) scheme aggregates per‑variable statistics (e.g., mean and variance across rows) so that the feature vector no longer depends on n while still assuming a fixed m. Finally, the dimension‑independent (DI) scheme further aggregates across constraints, producing a fixed‑size vector that can be used for any QCQP regardless of n or m.
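The DI scheme amounts to pooling a variable number of per-constraint feature vectors into one fixed-size vector. The sketch below shows one plausible pooling (mean/std/min/max across constraints); the paper's exact pooling statistics are an assumption here.

```python
import numpy as np

def dimension_independent_features(per_constraint_feats):
    """Pool per-constraint feature rows into a fixed-size vector.

    per_constraint_feats: array of shape (m, d), one row per constraint.
    Returns a vector of length 4*d regardless of m -- a sketch of the
    dimension-independent (DI) idea described above.
    """
    F = np.asarray(per_constraint_feats, dtype=float)
    return np.concatenate([
        F.mean(axis=0),   # average over constraints
        F.std(axis=0),    # spread over constraints
        F.min(axis=0),
        F.max(axis=0),
    ])
```

Because the output length depends only on d, instances with 5 constraints and instances with 500 constraints map to the same feature space, which is what lets a single trained model serve arbitrary problem sizes.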
Third, the authors train both classical supervised models (random forests, XGBoost, Lasso regression) and a graph‑neural‑network (GNN) that directly ingests the bipartite graph formed by variables and constraints. The classical models use the engineered features, while the GNN learns representations from the raw adjacency and coefficient matrices. Two prediction tasks are considered: (a) binary classification (LP is stronger vs. SDP is stronger) and (b) regression of the absolute gap between the two relaxations’ objective values.
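For the classification task, the training loop is conceptually straightforward. The following sketch uses scikit-learn's random forest on synthetic stand-in data; the feature dimension, labeling rule, and split are invented for illustration and do not reproduce the paper's experiments.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Stand-in data: X holds (e.g., dimension-independent) feature vectors,
# y encodes which relaxation gave the stronger bound
# (0 = LP stronger, 1 = SDP stronger). Labels here are synthetic.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 12))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)  # hypothetical labeling rule

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X[:150], y[:150])                 # train on the first 150 instances
acc = clf.score(X[150:], y[150:])         # held-out accuracy
```

The regression variant would replace the classifier with a regressor (e.g., `RandomForestRegressor`) and target the gap between the two relaxations' objective values instead of a binary label.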
Experimental evaluation comprises two data sources. Synthetic instances are generated with controlled eigenvalue spectra (positive‑semidefinite, indefinite, negative‑semidefinite) and varying sparsity levels, allowing a systematic stress test of the models. Real‑world benchmarks are drawn from MINLPLib, covering a diverse set of non‑convex QCQPs from power systems, chemical engineering, finance, and graph theory. Results show that the DI model attains classification accuracies above 85 % and regression mean absolute errors below 5 % of the typical objective range, despite being applicable to any problem size. The fDD model performs best (≈92 % accuracy) when the test instances share the same dimensions as the training set. Interestingly, the handcrafted‑feature models consistently outperform the GNN in the DI setting, suggesting that domain‑specific spectral and sparsity descriptors are highly informative.
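Generating synthetic matrices with a controlled eigenvalue signature is a standard construction: draw a random orthogonal basis and assign eigenvalues with the desired signs. The sketch below illustrates this idea; the paper's actual generator (including its sparsity control) is not reproduced.

```python
import numpy as np

def random_qcqp_matrix(n, spectrum="indefinite", seed=None):
    """Symmetric n x n matrix with a controlled eigenvalue signature:
    'psd' (all positive), 'nsd' (all negative), or 'indefinite' (mixed).
    An illustrative stand-in for the synthetic generator described above.
    """
    rng = np.random.default_rng(seed)
    # Random orthogonal basis via QR of a Gaussian matrix.
    Q, _ = np.linalg.qr(rng.normal(size=(n, n)))
    lam = rng.uniform(0.1, 1.0, size=n)   # magnitudes bounded away from 0
    if spectrum == "nsd":
        lam = -lam
    elif spectrum == "indefinite":
        lam = lam * rng.choice([-1.0, 1.0], size=n)
    # spectrum == "psd": keep all eigenvalues positive
    return Q @ np.diag(lam) @ Q.T
```

Sweeping `spectrum` and the eigenvalue magnitudes lets one stress-test whether the predictor has actually learned the convexity/sparsity cues rather than artifacts of a single instance family.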
Beyond predictive performance, the authors quantify the downstream impact on a full optimization pipeline. By selecting the predicted stronger relaxation a priori, they avoid solving the weaker one and achieve an average 30 % reduction in total solve time across the test suite. This demonstrates that a lightweight machine‑learning predictor can translate into tangible computational savings for global optimization algorithms that repeatedly solve relaxations (e.g., spatial branch‑and‑bound).
The paper’s contributions are threefold: (1) a principled feature engineering strategy that captures structural properties known to affect LP and SDP strength, (2) a dimension‑independent meta‑model that generalizes across problem sizes, and (3) empirical evidence that such a predictor can meaningfully accelerate QCQP solution workflows. The authors suggest several avenues for future work, including extending the framework to higher‑order SDP hierarchies, incorporating other convex relaxations such as second‑order cone programming, and developing online learning schemes that adapt the predictor as new instances are solved. Overall, the study showcases how data‑driven decision making can be integrated into the core of non‑convex optimization, moving beyond heuristic rule‑of‑thumb choices toward systematic, instance‑specific algorithm selection.