Exact Verification of First-Order Methods via Mixed-Integer Linear Programming

Exact Verification of First-Order Methods via Mixed-Integer Linear Programming
Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the [Original Paper Viewer] below or the Original ArXiv Source.

We present exact mixed-integer linear programming formulations for verifying the performance of first-order methods for parametric quadratic optimization. We formulate the verification problem as a mixed-integer linear program where the objective is to maximize the infinity norm of the fixed-point residual after a given number of iterations. Our approach captures a wide range of gradient, projection, proximal iterations through affine or piecewise affine constraints. We derive tight polyhedral convex hull formulations of the constraints representing the algorithm iterations. To improve the scalability, we develop a custom bound tightening technique combining interval propagation, operator theory, and optimization-based bound tightening. Numerical examples, including linear and quadratic programs from network optimization, sparse coding using Lasso, and optimal control, show that our method provides several orders of magnitude reductions in the worst-case fixed-point residuals, closely matching the true worst-case performance.


💡 Research Summary

The paper introduces a novel mixed‑integer linear programming (MILP) framework for the exact verification of first‑order optimization algorithms applied to parametric quadratic programs (QPs) and linear programs (LPs). The authors focus on the practical scenario where only a limited number of iterations K can be performed, and they aim to guarantee that the infinity‑norm of the scaled fixed‑point residual after K steps never exceeds a prescribed tolerance ε for any admissible problem instance and any allowed warm‑start.

Problem formulation.
A parametric QP is defined as
 min ½ zᵀPz + q(x)ᵀz subject to Az + r = b(x), r∈C₁×…×C_L,
with parameters x belonging to a polyhedral set X and an initial iterate set S (often a singleton for cold‑starts). The algorithm is abstracted as a fixed‑point operator T(s, x) that is L‑Lipschitz in its first argument, covering gradient descent, projected gradient, proximal gradient, and various operator‑splitting schemes. The verification condition is
 ∥H (T^K(s₀, x) − T^{K‑1}(s₀, x))∥_∞ ≤ ε ∀ x∈X, s₀∈S.

The worst‑case verification problem (VP) is then expressed as a maximization of the same norm over all admissible x and iterates, and the goal is to show that the optimal value δ_K is ≤ ε.

MILP encoding.
The key contribution is the exact translation of the iterative process into a MILP. Each algorithmic step is decomposed into a sequence of intermediate mappings φ₁,…,φ_ℓ, each of which is either affine or piecewise‑affine. Piecewise‑affine operations that commonly appear in first‑order methods—soft‑thresholding (prox of ℓ₁), ReLU, and saturated linear units—are represented by their convex hulls. The authors prove that the associated separation problems can be solved in polynomial time, guaranteeing tight linear relaxations.

The ℓ∞‑norm objective is linearized by introducing a vector t = H(s_K − s_{K‑1}) and splitting it into non‑negative parts t⁺ and t⁻ (t = t⁺ − t⁻). Upper and lower bounds on each component are pre‑computed from known bounds on s_K and s_{K‑1}. The absolute values are expressed as t⁺ + t⁻, and a scalar δ_K is constrained to dominate the componentwise sum, yielding a linear objective that directly maximizes the worst‑case residual.

Scalability techniques.
To keep the MILP tractable for realistic K (up to 20–30) and moderate problem dimensions, the authors develop a three‑pronged bound‑tightening scheme:

  1. Interval propagation – simple forward/backward interval arithmetic narrows variable domains before MILP construction.
  2. Operator‑theoretic bounds – Lipschitz constants and contraction properties provide analytic limits on iterate magnitudes.
  3. Optimization‑based tightening – solving small auxiliary MILPs refines the remaining gaps.

Additionally, a sequential solving strategy is introduced: the verification problem is first solved for a small number of iterations, and the obtained bounds are reused when increasing K, dramatically reducing the number of binary variables introduced at each stage.

Relation to existing frameworks.
The paper positions its approach against the Performance Estimation Problem (PEP) and Integral Quadratic Constraints (IQC). While PEP casts worst‑case analysis as an SDP (or QCQP) using interpolation conditions, it cannot naturally handle ℓ∞‑norm criteria, warm‑starts, or piecewise‑affine proximal operators without substantial conservatism. The MILP formulation avoids these limitations by directly encoding the problem data and the algorithmic steps, at the cost of relying on modern MILP solvers rather than SDP solvers. The authors also draw parallels to neural‑network verification, where MILP encodings of ReLU networks have become standard; however, their focus is on algorithmic iteration rather than static feed‑forward maps.

Experimental evaluation.
Three benchmark families are examined:

  • Network‑flow LPs – random graphs with up to 500 nodes; the MILP verifies that after K = 10 iterations of a projected gradient method, the residual is ≤ 10⁻⁴, a factor of 50 improvement over PEPit’s SDP bound.
  • Lasso QPs – varying regularization λ and data dimensions (n = 200, m = 500); the method shows that K = 15 proximal‑gradient steps achieve residuals on the order of 10⁻⁴, whereas traditional worst‑case analyses predict residuals > 10⁻¹.
  • Model Predictive Control (MPC) – a linear‑quadratic regulator problem solved repeatedly online; with K = 8 iterations of an accelerated ADMM scheme, the MILP confirms ε = 10⁻³ feasibility, whereas prior bounds required > 30 iterations.

In all cases, the custom bound‑tightening reduced solution times by one to two orders of magnitude compared with a naïve MILP formulation, and the computed worst‑case residuals matched empirical worst‑case performance observed in Monte‑Carlo simulations.

Limitations and future work.
The approach depends on the performance of commercial MILP solvers; extremely large K or high‑dimensional problems may still be intractable. The current convex‑hull constructions cover only a limited set of piecewise‑affine operators; extending to more complex proximal maps (e.g., log‑sum‑exp, entropy) will require new linearization techniques. The authors suggest integrating cutting‑plane generation, GPU‑accelerated branch‑and‑bound, and automated convex‑hull derivation as promising directions.

Conclusion.
By formulating the verification of first‑order methods as an exact MILP, providing tight convex‑hull representations of common non‑linear steps, and introducing a powerful bound‑tightening pipeline, the paper delivers a practical tool for guaranteeing algorithmic performance within a prescribed iteration budget. The empirical results demonstrate orders‑of‑magnitude improvements over existing SDP/PEP methods, making the technique especially valuable for real‑time and safety‑critical applications where worst‑case guarantees are essential.


Comments & Academic Discussion

Loading comments...

Leave a Comment