Mixed-Integer Programming for Change-point Detection

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the [Original Paper Viewer] below or the Original ArXiv Source.

We present a new mixed-integer programming (MIP) approach for offline multiple change-point detection by casting the problem as a globally optimal piecewise linear (PWL) fitting problem. Our main contribution is a family of strengthened MIP formulations whose linear programming (LP) relaxations admit integral projections onto the segment assignment variables, which encode the segment membership of each data point. This property yields provably tighter relaxations than existing formulations for offline multiple change-point detection. We further extend the framework to two settings of active research interest: (i) multidimensional PWL models with shared change-points, and (ii) sparse change-point detection, where only a subset of dimensions undergo structural change. Extensive computational experiments on benchmark real-world datasets demonstrate that the proposed formulations achieve reductions in solution times under both $\ell_1$ and $\ell_2$ loss functions in comparison to the state-of-the-art.


💡 Research Summary

This paper introduces a mixed‑integer programming (MIP) framework for offline multiple change‑point detection by formulating the problem as a globally optimal piecewise‑linear (PWL) fitting task. The authors’ central contribution is a family of strengthened MIP formulations whose linear programming (LP) relaxations possess an integral projection onto the space of segment‑assignment variables. In other words, when the feasible region of the LP relaxation is projected onto the segment‑assignment variables, every extreme point of that projection is binary. This yields provably tighter relaxations than earlier formulations. An LP optimum may still be fractional, because the objective also depends on the remaining variables, so branching is reduced rather than eliminated.
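To make the "globally optimal PWL fitting" objective concrete, the following minimal sketch computes an exact optimum for small instances. It is not the paper's MIP: it swaps in a plain dynamic program for the discontinuous ℓ₂ case with a fixed number of segments, which can serve as a reference when checking a solver's output. The names `segment_sse`, `best_segmentation`, and `min_len` are hypothetical.

```python
def segment_sse(x, y, i, j):
    """Squared error of the least-squares line through points i..j-1."""
    xs, ys = x[i:j], y[i:j]
    n = j - i
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((a - mx) ** 2 for a in xs)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    return sum((b - my - slope * (a - mx)) ** 2 for a, b in zip(xs, ys))


def best_segmentation(x, y, k, min_len=2):
    """Best split of the data into exactly k contiguous segments.

    Returns (total_cost, starts), where starts[s] is the index of the
    first observation of segment s.
    """
    n = len(x)
    inf = float("inf")
    # cost[s][j]: best cost of covering the first j points with s segments
    cost = [[inf] * (n + 1) for _ in range(k + 1)]
    back = [[-1] * (n + 1) for _ in range(k + 1)]
    cost[0][0] = 0.0
    for s in range(1, k + 1):
        for j in range(s * min_len, n + 1):
            for i in range((s - 1) * min_len, j - min_len + 1):
                if cost[s - 1][i] == inf:
                    continue
                c = cost[s - 1][i] + segment_sse(x, y, i, j)
                if c < cost[s][j]:
                    cost[s][j] = c
                    back[s][j] = i
    # Walk the back-pointers to recover the segment start indices.
    starts, j = [], n
    for s in range(k, 0, -1):
        j = back[s][j]
        starts.append(j)
    return cost[k][n], starts[::-1]


x = list(range(10))
y = [0, 1, 2, 3, 4, 10, 8, 6, 4, 2]
cost, starts = best_segmentation(x, y, 2)
print(starts, cost)
```

A reference of this kind covers only the simplest setting. The ℓ₁ loss, continuity constraints, and the multivariate extensions described in the summary are where an MIP formulation is the more natural tool.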

The work begins by reviewing existing MIP‑based change‑point models, notably the “Basic” formulation of Goldberg et al. (2021) and the “Alternate” formulation of Rebennack & Krasko (2020). Both rely on binary variables δ_{j,t} indicating whether observation t belongs to segment j, but their LP relaxations admit non‑integral δ‑values, leading to extensive branch‑and‑bound effort. To overcome this, the authors introduce an extended representation of the segment‑assignment component. By adding auxiliary constraints that enforce a specific polyhedral structure, they prove that the projection of the LP feasible region onto the δ‑space is an integral polytope, that is, all of its vertices are binary. Consequently, the relaxation excludes fractional δ‑points that the earlier formulations admit, which reduces the size of the search tree.
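As an illustration of what the δ_{j,t} variables encode, here is a small sketch that builds the assignment matrix from a list of segment start indices and checks the properties a valid assignment needs: binary entries, exactly one segment per observation, and segments visited in order. The function names and the `starts` representation are assumptions made for this example; the summary does not give the paper's actual constraints.

```python
def assignment_matrix(n_points, starts):
    """delta[j][t] = 1 if observation t belongs to segment j, else 0."""
    bounds = list(starts) + [n_points]
    return [
        [1 if bounds[j] <= t < bounds[j + 1] else 0 for t in range(n_points)]
        for j in range(len(starts))
    ]


def is_valid_assignment(delta, tol=1e-9):
    """Check binary entries, one segment per point, and ordered segments."""
    n_segments, n_points = len(delta), len(delta[0])
    prev = 0
    for t in range(n_points):
        column = [delta[j][t] for j in range(n_segments)]
        if any(min(abs(v), abs(v - 1)) > tol for v in column):
            return False  # fractional entry
        if abs(sum(column) - 1) > tol:
            return False  # not exactly one segment
        seg = max(range(n_segments), key=lambda j: column[j])
        if t == 0 and seg != 0:
            return False  # the first observation must open segment 0
        if seg < prev or seg > prev + 1:
            return False  # segments must appear in order, without skips
        prev = seg
    return True


delta = assignment_matrix(6, [0, 2, 5])
for row in delta:
    print(row)

# Each column sums to 1, yet the third observation is split between
# two segments. An LP relaxation can return a point like this one.
fractional = [[1, 1, 0.5, 0], [0, 0, 0.5, 1]]
print(is_valid_assignment(delta), is_valid_assignment(fractional))
```

The fractional matrix shows why the sum‑to‑one constraint alone is not enough: it is satisfied, but the point is not a valid segmentation.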

Two concrete strengthened models are presented: the Extended Basic and the Extended Alternate formulations. Both retain the original decision variables for slopes, intercepts, break‑points, and fitted values, but replace the original δ‑constraints with the new extended set. Big‑M constants are derived analytically from the data to guarantee numerical stability while preserving the integrality property.
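The summary does not reproduce the paper's big‑M derivation, so the sketch below shows only one plausible data‑driven bound, under stated assumptions. It assumes an ℓ₂ fit and an indicator constraint of the typical form |y_t − (a_j x_t + b_j)| ≤ e_t + M(1 − δ_{j,t}). A least‑squares slope is a weighted average of pairwise slopes, and the fitted line passes through the segment's mean point, so the residual of any such line at any observation is at most (range of y) + (largest pairwise slope) × (range of x). The names `fit_line` and `data_driven_big_m` are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for one segment."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((a - mx) ** 2 for a in xs)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    return slope, my - slope * mx


def data_driven_big_m(x, y):
    """Upper bound on the residual of any segment's least-squares line
    evaluated at any observation (assumption: l2 loss, see lead-in)."""
    n = len(x)
    slopes = [
        abs((y[k] - y[i]) / (x[k] - x[i]))
        for i in range(n)
        for k in range(i + 1, n)
        if x[k] != x[i]
    ]
    max_slope = max(slopes, default=0.0)
    return (max(y) - min(y)) + max_slope * (max(x) - min(x))


x = [0, 1, 2, 3, 4, 5]
y = [1.0, 3.0, 2.0, 7.0, 6.5, 9.0]
print(data_driven_big_m(x, y))
```

A bound of this kind is valid but usually loose. Tighter constants strengthen the LP relaxation, which is presumably why the paper derives its own.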

Beyond the univariate case, the paper extends the methodology to two important multivariate settings. First, a shared‑change‑point model forces all dimensions to share the same break‑point locations while allowing each dimension its own slope and intercept. Second, a sparse change‑point model introduces binary activation variables per dimension, enabling the detection of change‑points that affect only a subset of the variables. Both extensions inherit the integral‑projection property because the underlying segment‑assignment structure remains unchanged.
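The shared‑change‑point idea can be illustrated without a solver. The sketch below is a simplification, not the paper's formulation: it places a single shared break by exhaustive search under ℓ₂ loss, fitting a separate line to each dimension on each side and summing the errors. `line_sse`, `shared_single_changepoint`, and `min_len` are hypothetical names.

```python
def line_sse(xs, ys):
    """Squared error of the least-squares line through (xs, ys)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((a - mx) ** 2 for a in xs)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    return sum((b - my - slope * (a - mx)) ** 2 for a, b in zip(xs, ys))


def shared_single_changepoint(x, series, min_len=2):
    """Find one break index shared by all dimensions.

    series holds one list of observations per dimension. Each dimension
    gets its own slope and intercept on either side of the break.
    Returns (start_of_second_segment, total_cost).
    """
    n = len(x)
    best_start, best_cost = None, float("inf")
    for s in range(min_len, n - min_len + 1):
        total = sum(
            line_sse(x[:s], y[:s]) + line_sse(x[s:], y[s:]) for y in series
        )
        if total < best_cost:
            best_start, best_cost = s, total
    return best_start, best_cost


x = list(range(8))
series = [
    [0, 1, 2, 3, 10, 10, 10, 10],  # dimension 1
    [5, 5, 5, 5, 0, 1, 2, 3],      # dimension 2
]
print(shared_single_changepoint(x, series))
```

In the sparse setting described above, only some dimensions would refit after the break. The binary activation variables select which ones, and this sketch does not model that choice.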

Theoretical contributions include: (i) a polyhedral proof that the LP relaxations of the extended formulations have an integral projection onto the segment‑assignment variables; (ii) a systematic construction of tight big‑M parameters; and (iii) an analysis showing that the strengthened models dominate existing MIP formulations in terms of LP bound quality.

Extensive computational experiments are conducted on a suite of real‑world benchmark datasets covering power‑system monitoring, manufacturing process control, healthcare time series, and other domains. Experiments are performed under both ℓ₁ (absolute) and ℓ₂ (squared) loss functions, and with optional continuity constraints on the fitted function. The results demonstrate that the proposed formulations achieve 30 %–70 % reductions in solution time compared with the Basic and Alternate baselines, while delivering identical or slightly better objective values. The advantage is most pronounced when the number of change‑points is large or when the data dimension increases. In the multivariate shared‑change‑point setting, solution time scales sub‑linearly with the number of dimensions, and in the sparse setting the method outperforms ℓ₀‑penalized approaches in both precision and recall.

The paper concludes that strengthening the segment‑assignment component of MIP models yields a practically viable, globally optimal approach to change‑point detection. It also outlines future research directions: (a) incremental or online MIP schemes for streaming data, (b) extensions to piecewise‑polynomial or non‑linear segment models, and (c) integration with distributed MIP solvers for massive datasets. Overall, the work bridges the gap between rigorous optimization theory and the pressing need for fast, exact change‑point detection in modern data‑intensive applications.

