A grid-based methodology for fast online changepoint detection
We propose a grid-based methodology for online changepoint detection that allows offline changepoint tests to be applied to sequentially observed data. The methodology achieves low update and storage costs by testing for changepoints over a dynamically updating grid of candidate changepoint locations. For a broad class of test statistics, including those based on empirical averages and certain likelihood ratios, we show that the resulting online procedure has update and storage costs that grow at most logarithmically with the sample size. We further show that finite-sample power guarantees for the offline test translate directly into non-asymptotic upper bounds on the detection delay, under a mild robustness assumption. Building upon the methodology, we construct methods for detecting changes in the mean and in the covariance matrix of multivariate data, and prove near-optimal non-asymptotic upper bounds on their detection delays. The effectiveness of the methodology is supported by a simulation study, where we compare its performance for detecting mean changes with that of state-of-the-art online methods. To illustrate its practical applicability, we use the methodology to detect structural changes in currency exchange rates in real time.
💡 Research Summary
The paper introduces a novel grid‑based framework for online changepoint detection that enables the direct use of offline test statistics in a streaming setting while keeping computational and memory costs logarithmic in the sample size. The authors begin by formalizing the online changepoint problem for a single unknown change point τ in an infinite sequence of (possibly multivariate) independent observations. An online detector is defined as an extended stopping time τ̂, with false‑alarm probability FA(τ̂) = P_∞(τ̂ < ∞) and detection delay D = τ̂ − τ as performance metrics. The key challenge is to evaluate an offline test statistic for many candidate changepoint locations without incurring linear‑time updates or storage.
To address this, the authors propose a dynamic geometric grid G(t) that evolves with time. Unlike the static geometric grid G_stat(t) = {1, 2, 4, …, 2^⌊log₂(t−1)⌋}, which yields O(log t) candidates but requires O(t) stored cumulative sums, the dynamic grid partitions the index set {1,…,t−1} into intervals of lengths 1, 2, 4, 8, … and selects exactly two indices from each interval (the left‑half and right‑half representatives). As new data arrive, the selected indices are either cyclically shifted or discarded, preserving the necessary cumulative sums while keeping the number of stored values bounded by 3 log t. Lemma 1 establishes three crucial properties: (i) geometric spacing, which guarantees that for any distance d ≤ t/2 there exists a grid point g with d/2 ≤ g ≤ d; (ii) logarithmic cardinality, |G(t)| < 3 log t; (iii) a recycling property, (G(t+1) \ {1}) − 1 ⊆ G(t), that ensures efficient updates.
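As a rough illustration of the static grid only (not the paper's dynamic grid, whose update rule is not spelled out here), the sketch below builds the powers-of-two lag set and checks the geometric spacing idea on it. The function names `static_geometric_grid` and `covering_lag` are hypothetical and chosen for this example.

```python
def static_geometric_grid(t: int) -> list[int]:
    """Candidate lags {1, 2, 4, ..., 2^floor(log2(t-1))} at time t.

    Hypothetical helper: a sketch of the static grid described above.
    Returns an empty list when fewer than two observations are available.
    """
    if t < 2:
        return []
    top = (t - 1).bit_length() - 1  # floor(log2(t - 1))
    return [1 << k for k in range(top + 1)]


def covering_lag(d: int) -> int:
    """Largest power of two g with g <= d, which implies d/2 < g <= d."""
    if d < 1:
        raise ValueError("d must be a positive integer")
    return 1 << (d.bit_length() - 1)


if __name__ == "__main__":
    t = 100
    grid = static_geometric_grid(t)
    print(grid)  # powers of two up to t - 1
    # Every true lag d is approximated within a factor of two by a grid lag.
    for d in (1, 3, 17, 64, 99):
        print(d, covering_lag(d))
```

The grid has about log₂ t elements, but evaluating a statistic at lag g still needs the cumulative sum at index t − g, and those indices move with t. That is why the static grid requires O(t) stored sums, which is the cost the dynamic grid is designed to avoid.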
The methodology proceeds by computing, at each time t, the offline test statistic Tt(g) for every g∈G(t) and taking the maximum T(t)=maxg∈G(t)Tt(g). If the underlying offline test is robust to constant‑factor misspecification of the changepoint location—a condition satisfied by many statistics such as the squared CUSUM for mean changes or certain likelihood‑ratio forms—then the power loss incurred by using the sparse grid is bounded by a constant factor. Consequently, finite‑sample power guarantees for the offline test translate directly into non‑asymptotic upper bounds on the detection delay of the online procedure.
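The decision rule described above (maximize an offline statistic over the grid, then compare with a threshold) can be sketched as follows. This is a naive reference loop under stated assumptions: it stores all data, so it does not achieve the logarithmic costs of the paper's method, and `first_alarm`, `power_of_two_lags`, `mean_gap`, and the threshold value are all hypothetical names and choices for illustration.

```python
from typing import Callable, Iterable, List, Optional, Sequence


def power_of_two_lags(t: int) -> List[int]:
    """Lags 1, 2, 4, ... not exceeding t - 1 (a static stand-in for G(t))."""
    return [1 << k for k in range((t - 1).bit_length())] if t >= 2 else []


def mean_gap(data: Sequence[float], g: int) -> float:
    """Toy offline statistic: |mean of last g points - mean of the rest|."""
    before, after = data[:-g], data[-g:]
    return abs(sum(after) / len(after) - sum(before) / len(before))


def first_alarm(
    stream: Iterable[float],
    stat: Callable[[Sequence[float], int], float],
    grid: Callable[[int], List[int]],
    threshold: float,
) -> Optional[int]:
    """Return the first time t at which max over grid lags exceeds threshold.

    Returns None if the stream ends without an alarm.
    """
    data: List[float] = []
    for t, y in enumerate(stream, start=1):
        data.append(y)
        lags = grid(t)
        if not lags:
            continue  # need at least two observations to split the sample
        if max(stat(data, g) for g in lags) > threshold:
            return t
    return None


if __name__ == "__main__":
    stream = [0.0] * 20 + [5.0] * 20  # mean shifts after observation 20
    print(first_alarm(stream, mean_gap, power_of_two_lags, threshold=4.0))
```

Passing the statistic and the grid as functions mirrors the modular structure in the text: any offline test that tolerates a constant-factor error in the changepoint location could in principle be substituted for `mean_gap`.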
Two concrete applications are developed:
- Mean‑change detection (univariate) – Observations follow Y_i = μ1 + Z_i before τ and Y_i = μ2 + Z_i after τ, with Z_i sub‑Gaussian. The CUSUM statistic C(t,g) measures the weighted difference between empirical means before and after a candidate change at t−g. Using the dynamic grid, the online detector requires O(log t) arithmetic operations and O(log t) stored cumulative sums per time step. The authors prove a near‑optimal minimax detection delay of order log(1/α)/Δ², where Δ = |μ2 − μ1| and α is the allowed false‑alarm level, matching known lower bounds up to constants.
- Covariance‑change detection (multivariate) – For p‑dimensional Gaussian data with a change in the covariance matrix from Σ1 to Σ2, a likelihood‑ratio‑type statistic based on the log‑determinant and trace terms is employed. The same dynamic grid yields logarithmic update and storage costs even when p ≫ t. The detection delay bound scales as O(p log p / ‖Σ1^(−1/2) Σ2 Σ1^(−1/2) − I‖_F²), so the dimension enters the bound only through the factor p log p.
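For the mean-change case, a common textbook form of the CUSUM statistic is C(t,g) = sqrt(g(t−g)/t) · |mean of the last g observations − mean of the first t−g observations|. The sketch below computes it from cumulative sums. The exact normalization used in the paper is an assumption here, and `prefix_sums` and `cusum` are hypothetical names.

```python
import math
from itertools import accumulate
from typing import List, Sequence


def prefix_sums(ys: Sequence[float]) -> List[float]:
    """Return [0, Y_1, Y_1 + Y_2, ...] so that prefix[k] sums the first k points."""
    return list(accumulate(ys, initial=0.0))


def cusum(prefix: Sequence[float], t: int, g: int) -> float:
    """Standard CUSUM for a candidate change at t - g (assumed normalization).

    Uses only prefix[t - g] and prefix[t], so each evaluation is O(1).
    """
    if not 1 <= g < t:
        raise ValueError("need 1 <= g < t")
    s = t - g  # number of observations before the candidate change
    mean_before = prefix[s] / s
    mean_after = (prefix[t] - prefix[s]) / g
    return math.sqrt(g * s / t) * abs(mean_after - mean_before)


if __name__ == "__main__":
    ys = [0.0, 0.0, 0.0, 0.0, 2.0, 2.0, 2.0, 2.0]  # mean shifts after index 4
    prefix = prefix_sums(ys)
    for g in (1, 2, 4):
        print(g, cusum(prefix, len(ys), g))
```

Because each evaluation touches only two cumulative sums, the per-step cost is proportional to the number of grid lags, and the storage cost is determined by which cumulative sums must be retained. This is where the choice of grid matters.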
Extensive simulations compare the proposed mean‑change detector against state‑of‑the‑art online methods such as Page‑CUSUM, GLR‑OCT, and MdFOCuS. The new method achieves comparable or higher power, lower average detection delay, and dramatically reduced memory usage (often an order of magnitude less). A real‑world case study monitors daily currency exchange rates (e.g., USD/EUR, JPY/USD) and detects structural changes in the covariance matrix in real time. Detected changepoints align with known macro‑economic events, illustrating practical relevance for finance and risk management.
The paper concludes that the dynamic geometric grid offers a general-purpose bridge from offline changepoint tests to online monitoring, preserving statistical guarantees while keeping per-update computation and storage at O(log t). Limitations include the assumption of independent observations and the need for the offline test to be robust to location misspecification. Future work is suggested on extending the framework to dependent time series (ARMA, GARCH), automatic tuning of grid parameters, and handling nonlinear or multiple changepoints. Overall, the methodology represents a significant step toward scalable, statistically sound online changepoint detection.