The MAPS Algorithm: Fast model-agnostic and distribution-free prediction intervals for supervised learning
A fundamental problem in modern supervised learning is computing reliable conditional prediction intervals in high-dimensional settings: existing methods often rely on restrictive modelling assumptions, do not scale as predictor dimension increases, or only guarantee marginal (population-level) rather than conditional (individual-level) coverage. We introduce the $\textit{lifted predictive model}$ (LPM), a new conditional representation, and propose the MAPS (Model-Agnostic Prediction Sets) algorithm that produces distribution-free conditional prediction intervals and adapts to any trained predictive model. Our procedure is bootstrap-based, scales to high-dimensional inputs and accounts for heteroscedastic errors. We establish the theoretical properties of the LPM, connect prediction accuracy to interval length, and provide sufficient conditions for asymptotic conditional coverage. We evaluate the finite-sample performance of MAPS in a simulation study, and apply our method to simulation-based inference and image classification. In the former, MAPS provides the first approach for debiasing neural Bayes estimators and constructing valid confidence intervals for model parameters given the estimators, at any desired level. In the latter, it provides the first approach that accounts for uncertainty in model calibration and label prediction.
💡 Research Summary
The paper addresses a central challenge in modern supervised learning: constructing reliable conditional prediction intervals (PIs) for high‑dimensional data without imposing restrictive model assumptions. Existing approaches—conformal prediction, split conformal, model‑free bootstrap, or Bayesian posterior methods—either guarantee only marginal (population‑level) coverage, require strong homoscedasticity or symmetry of errors, or become computationally infeasible as the predictor dimension grows.
To overcome these limitations the authors introduce the "lifted predictive model" (LPM). Instead of modeling the full conditional distribution of the response $Y$ given the high-dimensional covariate $X$, the LPM focuses on the relationship between $Y$ and the point prediction $\hat f(X)$ produced by any pre-trained model $\hat f$. Formally, they write
$$Y = \hat f(X) + \varepsilon,$$
where $\varepsilon$ may depend on $X$ and have an arbitrary continuous distribution (heteroscedastic, skewed, etc.). By conditioning on the scalar $\hat f(X)$ rather than on $X$ itself, the dimensionality problem disappears, and the conditional distribution of the residual $\varepsilon$ can be estimated non-parametrically.
The MAPS (Model-Agnostic Prediction Sets) algorithm builds on this insight. It requires an independent calibration set that was not used to train $\hat f$. For each calibration observation it computes the residual $r_i = Y_i^{\text{cal}} - \hat f(X_i^{\text{cal}})$. A bootstrap procedure then repeatedly resamples the residuals, recomputes their empirical quantiles, and aggregates the quantiles across $B$ bootstrap replicates to obtain two functions $\hat c_1(x)$ and $\hat c_2(x)$. The final prediction interval for a new point $x$ is $[\hat f(x) + \hat c_1(x),\ \hat f(x) + \hat c_2(x)]$.
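The bootstrap-and-aggregate step can be sketched in a few lines of NumPy. This is a deliberately simplified, *marginal* version for intuition only: the paper's actual algorithm produces quantile functions that vary with $\hat f(x)$, whereas here the residual quantiles are global, the aggregation across replicates is assumed to be a simple average, and the function name `maps_interval` is ours, not the authors'.

```python
import numpy as np

rng = np.random.default_rng(0)

def maps_interval(f_hat, X_cal, y_cal, x_new, alpha=0.1, B=500):
    # Residuals on the held-out calibration set: r_i = Y_i - f_hat(X_i)
    residuals = y_cal - f_hat(X_cal)
    n = len(residuals)
    lo_q, hi_q = [], []
    for _ in range(B):
        # Resample residuals with replacement; record the empirical quantiles
        r_b = rng.choice(residuals, size=n, replace=True)
        lo_q.append(np.quantile(r_b, alpha / 2))
        hi_q.append(np.quantile(r_b, 1 - alpha / 2))
    # Aggregate quantiles across the B replicates (here: simple averaging,
    # an assumption -- the paper may aggregate differently)
    c1, c2 = float(np.mean(lo_q)), float(np.mean(hi_q))
    pred = f_hat(x_new)
    return pred + c1, pred + c2

# Toy usage: a "pre-trained" linear model with heteroscedastic noise
X = rng.uniform(0.0, 1.0, 2000)
y = 2.0 * X + rng.normal(0.0, 0.1 + 0.2 * X, size=2000)
f_hat = lambda x: 2.0 * x
lo, hi = maps_interval(f_hat, X, y, x_new=0.5)
```

Because the noise variance grows with $X$, a conditional version (binning or smoothing the residuals by $\hat f(X)$ before taking quantiles) would yield narrower intervals for small $x$ and wider ones for large $x$; the marginal sketch above averages over that heteroscedasticity.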