Conformal changepoint localization

Notice: This research summary and analysis were generated automatically using AI. For authoritative details, please refer to the original arXiv source.

We study the problem of offline changepoint localization in a distribution-free setting. One observes a vector of data with a single changepoint, assuming that the data before and after the changepoint are iid (or more generally exchangeable) from arbitrary and unknown distributions. The goal is to produce a finite-sample confidence set for the index at which the change occurs without making any other assumptions. Existing methods often rely on parametric assumptions, tail conditions, or asymptotic approximations, or only produce point estimates. In contrast, our distribution-free algorithm, CONformal CHangepoint localization (CONCH), only leverages exchangeability arguments to construct confidence sets with finite sample coverage. By proving a conformal Neyman-Pearson lemma, we derive principled score functions that yield informative (small) sets. Moreover, with such score functions, the normalized length of the confidence set shrinks to zero under weak assumptions. We also establish a universality result showing that any distribution-free changepoint localization method must be an instance of CONCH. Experiments suggest that CONCH delivers precise confidence sets even in challenging settings involving images or text.


💡 Research Summary

The paper tackles the offline changepoint localization problem under the weakest possible assumptions: the data before and after an unknown changepoint are merely exchangeable (and independent of each other), with no parametric, tail, or asymptotic conditions imposed. The authors introduce CONCH (Conformal Changepoint Localization), a framework that produces finite‑sample, distribution‑free confidence sets for the true changepoint index.

CONCH works by first defining a “changepoint plausibility” (CPP) score S: Xⁿ → ℝⁿ⁻¹, where Sₜ(x) quantifies how plausible it is that the changepoint occurs at t. For each candidate t, a restricted permutation group Πₜ is formed: permutations that shuffle observations within the left segment (indices ≤ t) and within the right segment (indices > t) but never mix the two sides. Under the exchangeability assumption, if t equals the true changepoint ξ, applying any permutation in Π_ξ leaves the joint distribution of the data unchanged, making the rank of S_ξ(x) among {S_ξ(π(x)) : π ∈ Π_ξ} uniformly distributed. Consequently, the p‑value
pₜ = (1/|Πₜ|) ∑_{π ∈ Πₜ} 𝟙{Sₜ(π(x)) ≤ Sₜ(x)}

is super‑uniform under the null hypothesis H₀,ₜ (“t is the true changepoint”). Thresholding these p‑values at level α yields the confidence set C_{1‑α} = {t : pₜ > α}, which satisfies P(ξ ∈ C_{1‑α}) ≥ 1 − α for any sample size. This is formalized in Theorem 3.1.
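As an illustration, the construction above can be sketched in a few lines of Python. The windowed mean-difference score below is a hypothetical stand-in for a real CPP score, not the paper's optimal likelihood-ratio score, and the exact enumeration of Πₜ is only feasible for toy-sized segments:

```python
from itertools import permutations

def windowed_score(x, t, w=2):
    # Hypothetical CPP score: gap between the means of the w points just
    # before and just after the candidate split t. Local windows (rather
    # than whole-segment means) keep the score sensitive to within-segment
    # permutations; a whole-segment mean would be Pi_t-invariant and useless.
    left, right = x[max(0, t - w):t], x[t:t + w]
    return abs(sum(left) / len(left) - sum(right) / len(right))

def exact_p_value(x, t):
    # Exact conformal p-value for H_{0,t}: enumerate the restricted group
    # Pi_t (shuffle within x[:t] and within x[t:], never across the split).
    # |Pi_t| = t! * (n - t)!, so this only works for tiny examples.
    observed = windowed_score(x, t)
    count = total = 0
    for left in permutations(x[:t]):
        for right in permutations(x[t:]):
            total += 1
            if windowed_score(list(left) + list(right), t) <= observed:
                count += 1
    return count / total

def confidence_set(x, alpha=0.1):
    # Keep every candidate split whose p-value exceeds alpha.
    return [t for t in range(1, len(x)) if exact_p_value(x, t) > alpha]

x = [0.1, -0.2, 0.0, 5.1, 4.9, 5.2]   # mean shift after the 3rd point
print(confidence_set(x))              # the true split t = 3 is covered
```

In practice one replaces the exact enumeration with Monte-Carlo sampling of permutations, as discussed later in the summary.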

The choice of CPP score is crucial for the size of C_{1‑α}. The authors prove a “Conformal Neyman–Pearson Lemma” (Lemma 4.2) that characterizes the optimal score: essentially the likelihood‑ratio between post‑ and pre‑change distributions. While this optimal score requires oracle knowledge of the two distributions, the paper proposes practical near‑optimal alternatives, such as non‑parametric two‑sample test statistics (Kolmogorov–Smirnov, Wasserstein distance) or the log‑odds output of a binary classifier trained to distinguish pre‑ from post‑change data. Proposition 4.1 shows that symmetric scores (invariant under Πₜ) yield trivial p‑values equal to one, and that any monotone transformation of a score can only enlarge the confidence set, guiding the design of informative scores.
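A stdlib-only sketch of one such nonparametric choice, a Kolmogorov–Smirnov statistic computed on windows flanking the candidate split (the helper names and window size are illustrative assumptions, not the paper's implementation):

```python
def ks_statistic(a, b):
    # Two-sample Kolmogorov-Smirnov statistic: largest vertical gap
    # between the two empirical CDFs.
    def ecdf(sample, v):
        return sum(1 for s in sample if s <= v) / len(sample)
    return max(abs(ecdf(a, v) - ecdf(b, v)) for v in sorted(set(a) | set(b)))

def ks_cpp_score(x, t, w=20):
    # Hypothetical windowed CPP score: KS distance between the (up to) w
    # points before the candidate split t and the w points after it.
    # The windows make the score non-invariant under within-segment
    # permutations, avoiding the trivial p-values of Proposition 4.1.
    return ks_statistic(x[max(0, t - w):t], x[t:t + w])
```

A classifier-based score could be built analogously, e.g. from the log-odds of a model trained to separate x[:t] from x[t:].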

Theoretical analysis proceeds to show that, under mild regularity (e.g., a positive Kullback–Leibler divergence between the two distributions), the normalized length |C_{1‑α}|/n converges to zero as n → ∞ (Theorem 5.2). When the true likelihood ratio is known, the confidence set length remains Oₚ(1) (Theorem 5.1), demonstrating that the optimal score attains the best possible scaling.

A universality result (Theorem 6.1) proves that any distribution‑free changepoint confidence set must be an instance of the CONCH framework. This enables a simple calibration procedure (Algorithm 2) that converts any heuristic confidence set into a valid finite‑sample set by re‑computing p‑values within the appropriate permutation groups. The framework also extends to multiple changepoints (Algorithm 3) by applying the same principle to each segment produced by a “nice” segmentation algorithm.

Computationally, exact evaluation of pₜ would require enumerating all |Πₜ| permutations, which is infeasible for large n. The authors recommend Monte‑Carlo approximation of the p‑values (Equation A.1) and prove that the approximation retains the super‑uniform property (Theorem A.1). A randomized version yields exactly uniform p‑values (Theorem A.2).
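A minimal sketch of such a Monte-Carlo approximation, assuming the standard add-one correction (the paper's Equation A.1 may differ in details):

```python
import random

def mc_p_value(x, t, score, n_perms=999, seed=0):
    # Monte-Carlo estimate of the restricted-permutation p-value: sample
    # permutations uniformly from Pi_t by shuffling the left and right
    # segments independently, never mixing them.
    rng = random.Random(seed)
    observed = score(x, t)
    # Counting the identity permutation ("1 +" in numerator and
    # denominator) keeps the estimate super-uniform under the null,
    # the standard finite-sample correction for permutation tests.
    count = 1
    for _ in range(n_perms):
        left, right = list(x[:t]), list(x[t:])
        rng.shuffle(left)
        rng.shuffle(right)
        if score(left + right, t) <= observed:
            count += 1
    return count / (n_perms + 1)
```

With n_perms sampled permutations the cost per candidate t is O(n_perms · cost(score)) instead of t!·(n−t)! score evaluations.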

Empirical studies cover synthetic data with various dimensions and distributional shifts, as well as real‑world image (Fashion‑MNIST) and text (sentiment reviews) datasets. CPP scores derived from simple mean differences, KS statistics, and deep‑learning classifiers are evaluated. Across all settings, CONCH produces substantially tighter confidence sets than recent distribution‑free methods such as MCP‑Localization, SMUCE, or bootstrap‑based approaches, while maintaining the nominal coverage (≈95%). Notably, even subtle, high‑dimensional shifts in images and text are localized accurately, illustrating the method’s robustness.

In summary, the paper delivers a principled, fully distribution‑free method for changepoint localization that offers finite‑sample guarantees, an optimality theory for score selection, convergence guarantees, a universality theorem, and practical algorithms for both single and multiple changepoint scenarios. This work substantially advances the theoretical foundations and practical applicability of non‑parametric changepoint analysis.

