Learning with Exact Invariances in Polynomial Time
We study the statistical-computational trade-offs for learning with exact invariances (or symmetries) using kernel regression. Traditional methods, such as data augmentation, group averaging, canonicalization, and frame-averaging, either fail to provide a polynomial-time solution or are not applicable in the kernel setting. However, with oracle access to the geometric properties of the input space, we propose a polynomial-time algorithm that learns a classifier with \emph{exact} invariances. Moreover, our approach achieves the same excess population risk (or generalization error) as the original kernel regression problem. To the best of our knowledge, this is the first polynomial-time algorithm to achieve exact (not approximate) invariances in this context. Our proof leverages tools from differential geometry, spectral theory, and optimization. A key result in our development is a new reformulation of the problem of learning under invariances as optimizing an infinite number of linearly constrained convex quadratic programs, which may be of independent interest.
💡 Research Summary
The paper addresses a fundamental question in machine learning: how to incorporate exact group invariances into kernel regression without incurring prohibitive computational costs. The authors consider a smooth, compact, boundaryless Riemannian manifold \(M\) of dimension \(d\) and a finite isometric group \(G\) acting on \(M\). The target regression function \(f^\star\) is assumed to be \(G\)-invariant, i.e., \(f^\star(gx)=f^\star(x)\) for all \(g\in G\). Standard kernel ridge regression (KRR) in a Sobolev reproducing kernel Hilbert space (RKHS) yields optimal statistical rates \(O(n^{-s/(s+d/2)})\) but does not respect the invariance. A naïve solution is to average the kernel over the group, producing an invariant kernel \(K_{\text{inv}}\). While theoretically sound, computing \(K_{\text{inv}}\) requires \(\Omega(n^2|G|)\) time, which is infeasible when \(|G|\) grows factorially (e.g., permutation groups).
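To make the \(\Omega(n^2|G|)\) cost concrete, here is a minimal sketch (not from the paper) of the naïve group-averaged kernel. The toy instance, an RBF kernel on \(\mathbb{R}^3\) with \(G\) the group of coordinate permutations, is an illustrative assumption; the nested loop over all \(|G|\) group elements is exactly what becomes infeasible when \(|G|=d!\).

```python
import itertools
import numpy as np

def averaged_kernel(k, X, group_actions):
    """Naive group-averaged kernel: K_inv(x, x') = (1/|G|) sum_g k(x, g.x').

    Requires Theta(n^2 * |G|) kernel evaluations -- infeasible when G is,
    e.g., the symmetric group S_d with |G| = d! elements.
    """
    n = len(X)
    K = np.zeros((n, n))
    for act in group_actions:  # one full n x n pass per group element
        Xg = np.array([act(x) for x in X])
        K += np.array([[k(X[i], Xg[j]) for j in range(n)] for i in range(n)])
    return K / len(group_actions)

# Toy instance (assumed, for illustration): RBF kernel on R^3,
# G = all 3! = 6 coordinate permutations.
rbf = lambda x, y: np.exp(-np.sum((x - y) ** 2))
actions = [lambda x, p=p: x[list(p)] for p in itertools.permutations(range(3))]
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))
K_inv = averaged_kernel(rbf, X, actions)
```

Because the average runs over the whole group, the resulting Gram matrix is symmetric and unchanged if any input point is replaced by a permuted copy, i.e., the kernel is exactly \(G\)-invariant.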
The core contribution is a polynomial-time algorithm that achieves exact invariance and retains the optimal statistical risk. The method leverages the spectral decomposition of the Laplace–Beltrami operator \(\Delta_M\). Because \(\Delta_M\) commutes with every isometric group action, its eigenfunctions \(\{\phi_{\lambda,\ell}\}\) can be chosen such that each group element \(g\) acts on the eigenspace associated with eigenvalue \(\lambda\) via an orthogonal matrix \(R_g(\lambda)\). Expanding any function in this basis transforms the invariance constraint into a set of linear equations \(R_g(\lambda)\alpha_\lambda=\alpha_\lambda\) for the coefficient vectors \(\alpha_\lambda\).
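The constraint set \(\{R_g(\lambda)\alpha_\lambda=\alpha_\lambda \text{ for all } g\in G\}\) is the fixed subspace of the representation, and its orthogonal projector is the group average \(P=\frac{1}{|G|}\sum_g R_g(\lambda)\). A minimal sketch (assumed setup, not the paper's code), using the cyclic group of order 3 acting on \(\mathbb{R}^3\) by coordinate shifts as a toy eigenspace:

```python
import numpy as np

def fixed_subspace_projector(Rs):
    """Orthogonal projector onto {a : R a = a for all R in the group}.

    For an orthogonal representation, the group average is idempotent
    (P @ P = P) and symmetric, i.e., an orthogonal projector.
    """
    return sum(Rs) / len(Rs)

# Toy group (assumed): cyclic group of order 3 acting by cyclic shifts.
S = np.roll(np.eye(3), 1, axis=0)        # shift matrix, S @ S @ S = I
Rs = [np.eye(3), S, S @ S]
P = fixed_subspace_projector(Rs)

# Projecting any coefficient vector yields one satisfying R_g a = a.
a = P @ np.array([1.0, 2.0, 3.0])
```

For cyclic shifts the fixed subspace is the span of the all-ones vector, so the projection of \((1,2,3)\) is the constant vector \((2,2,2)\).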
With this representation, the original infinite-dimensional constrained regression problem is reformulated as an infinite collection of finite-dimensional, linearly constrained convex quadratic programs (QPs), one per eigenvalue. Each QP has the form
\[
\min_{\alpha_\lambda}\; q_\lambda(\alpha_\lambda) \quad \text{subject to} \quad R_g(\lambda)\,\alpha_\lambda = \alpha_\lambda \quad \text{for all } g \in G,
\]
where \(q_\lambda\) is a convex quadratic combining the empirical squared loss with the RKHS-norm penalty restricted to the eigenspace of \(\lambda\).
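One standard way to solve such a linearly constrained QP is to parameterize \(\alpha_\lambda\) by an orthonormal basis of the fixed subspace, which eliminates the constraints and leaves an unconstrained ridge problem. The sketch below assumes a hypothetical ridge-type objective \(\|y-\Phi\alpha\|^2+\rho\|\alpha\|^2\) within a single eigenspace; the feature matrix `Phi`, data `y`, and regularizer `reg` are illustrative, not the paper's notation.

```python
import numpy as np

def solve_invariant_qp(Phi, y, Rs, reg):
    """Solve min_a ||y - Phi a||^2 + reg * ||a||^2  s.t.  R a = a for all R in Rs.

    The constraints are eliminated by restricting a to an orthonormal basis B
    of the fixed subspace (the eigenvalue-1 eigenspace of the group-average
    projector), leaving an unconstrained ridge regression in the reduced
    coordinates.
    """
    P = sum(Rs) / len(Rs)                      # projector onto fixed subspace
    eigval, eigvec = np.linalg.eigh(P)         # projector eigenvalues are 0 or 1
    B = eigvec[:, eigval > 0.5]                # basis of the fixed subspace
    A = Phi @ B                                # features in reduced coordinates
    c = np.linalg.solve(A.T @ A + reg * np.eye(B.shape[1]), A.T @ y)
    return B @ c                               # satisfies R a = a by construction

# Toy data (assumed) with the cyclic-shift group on R^3.
S = np.roll(np.eye(3), 1, axis=0)
Rs = [np.eye(3), S, S @ S]
rng = np.random.default_rng(1)
Phi, y = rng.normal(size=(20, 3)), rng.normal(size=20)
a = solve_invariant_qp(Phi, y, Rs, reg=0.1)
```

Since the solution is returned as `B @ c`, it lies in the fixed subspace and hence satisfies every invariance constraint exactly, not approximately.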