Learning with Exact Invariances in Polynomial Time
We study the statistical-computational trade-offs for learning with exact invariances (or symmetries) using kernel regression. Traditional methods, such as data augmentation, group averaging, canonicalization, and frame-averaging, either fail to provide a polynomial-time solution or are not applicable in the kernel setting. However, with oracle access to the geometric properties of the input space, we propose a polynomial-time algorithm that learns a classifier with \emph{exact} invariances. Moreover, our approach achieves the same excess population risk (or generalization error) as the original kernel regression problem. To the best of our knowledge, this is the first polynomial-time algorithm to achieve exact (not approximate) invariances in this context. Our proof leverages tools from differential geometry, spectral theory, and optimization. A key result in our development is a new reformulation of the problem of learning under invariances as optimizing an infinite number of linearly constrained convex quadratic programs, which may be of independent interest.
💡 Research Summary
The paper addresses a fundamental question in machine learning: how to incorporate exact group invariances into kernel regression without incurring prohibitive computational costs. The authors consider a smooth, compact, boundaryless Riemannian manifold \(M\) of dimension \(d\) and a finite isometric group \(G\) acting on \(M\). The target regression function \(f^\star\) is assumed to be \(G\)-invariant, i.e., \(f^\star(gx)=f^\star(x)\) for all \(g\in G\). Standard kernel ridge regression (KRR) in a Sobolev reproducing kernel Hilbert space (RKHS) yields optimal statistical rates \(O(n^{-s/(s+d/2)})\) but does not respect the invariance. A naïve solution is to average the kernel over the group, producing an invariant kernel \(K_{\text{inv}}\). While theoretically sound, computing \(K_{\text{inv}}\) requires \(\Omega(n^2|G|)\) time, which is infeasible when \(|G|\) grows factorially (e.g., permutation groups).
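To make the \(\Omega(n^2|G|)\) cost concrete, here is a minimal sketch (not from the paper) of the naïve group-averaged kernel. The toy instance, an RBF kernel on \(\mathbb{R}^3\) with \(G\) the group of coordinate permutations, is an illustrative assumption; the nested loop over all \(|G|\) group elements is exactly what becomes infeasible when \(|G|=d!\).

```python
import itertools
import numpy as np

def averaged_kernel(k, X, group_actions):
    """Naive group-averaged kernel: K_inv(x, x') = (1/|G|) sum_g k(x, g.x').

    Requires Theta(n^2 * |G|) kernel evaluations -- infeasible when G is,
    e.g., the symmetric group S_d with |G| = d! elements.
    """
    n = len(X)
    K = np.zeros((n, n))
    for act in group_actions:  # one full n x n pass per group element
        Xg = np.array([act(x) for x in X])
        K += np.array([[k(X[i], Xg[j]) for j in range(n)] for i in range(n)])
    return K / len(group_actions)

# Toy instance (assumed, for illustration): RBF kernel on R^3,
# G = all 3! = 6 coordinate permutations.
rbf = lambda x, y: np.exp(-np.sum((x - y) ** 2))
actions = [lambda x, p=p: x[list(p)] for p in itertools.permutations(range(3))]
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))
K_inv = averaged_kernel(rbf, X, actions)
```

Because the average runs over the whole group, the resulting Gram matrix is symmetric and unchanged if any input point is replaced by a permuted copy, i.e., the kernel is exactly \(G\)-invariant.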
The core contribution is a polynomial-time algorithm that achieves exact invariance and retains the optimal statistical risk. The method leverages the spectral decomposition of the Laplace–Beltrami operator \(\Delta_M\). Because \(\Delta_M\) commutes with every isometric group action, its eigenfunctions \(\{\phi_{\lambda,\ell}\}\) can be chosen such that each group element \(g\) acts on the eigenspace associated with eigenvalue \(\lambda\) via an orthogonal matrix \(R_g(\lambda)\). Expanding any function in this basis transforms the invariance constraint into a set of linear equations \(R_g(\lambda)\alpha_\lambda=\alpha_\lambda\) for the coefficient vectors \(\alpha_\lambda\).
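The constraint set \(\{R_g(\lambda)\alpha_\lambda=\alpha_\lambda \text{ for all } g\in G\}\) is the fixed subspace of the representation, and its orthogonal projector is the group average \(P=\frac{1}{|G|}\sum_g R_g(\lambda)\). A minimal sketch (assumed setup, not the paper's code), using the cyclic group of order 3 acting on \(\mathbb{R}^3\) by coordinate shifts as a toy eigenspace:

```python
import numpy as np

def fixed_subspace_projector(Rs):
    """Orthogonal projector onto {a : R a = a for all R in the group}.

    For an orthogonal representation, the group average is idempotent
    (P @ P = P) and symmetric, i.e., an orthogonal projector.
    """
    return sum(Rs) / len(Rs)

# Toy group (assumed): cyclic group of order 3 acting by cyclic shifts.
S = np.roll(np.eye(3), 1, axis=0)        # shift matrix, S @ S @ S = I
Rs = [np.eye(3), S, S @ S]
P = fixed_subspace_projector(Rs)

# Projecting any coefficient vector yields one satisfying R_g a = a.
a = P @ np.array([1.0, 2.0, 3.0])
```

For cyclic shifts the fixed subspace is the span of the all-ones vector, so the projection of \((1,2,3)\) is the constant vector \((2,2,2)\).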
With this representation, the original infinite-dimensional constrained regression problem is reformulated as an infinite collection of finite-dimensional, linearly constrained convex quadratic programs (QPs), one per eigenvalue. Each QP has the form
\[
\min_{\alpha_\lambda}\; q_\lambda(\alpha_\lambda) \quad \text{subject to} \quad R_g(\lambda)\,\alpha_\lambda = \alpha_\lambda \quad \text{for all } g \in G,
\]
where \(q_\lambda\) is a convex quadratic combining the empirical squared loss with the RKHS-norm penalty restricted to the eigenspace of \(\lambda\).
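One standard way to solve such a linearly constrained QP is to parameterize \(\alpha_\lambda\) by an orthonormal basis of the fixed subspace, which eliminates the constraints and leaves an unconstrained ridge problem. The sketch below assumes a hypothetical ridge-type objective \(\|y-\Phi\alpha\|^2+\rho\|\alpha\|^2\) within a single eigenspace; the feature matrix `Phi`, data `y`, and regularizer `reg` are illustrative, not the paper's notation.

```python
import numpy as np

def solve_invariant_qp(Phi, y, Rs, reg):
    """Solve min_a ||y - Phi a||^2 + reg * ||a||^2  s.t.  R a = a for all R in Rs.

    The constraints are eliminated by restricting a to an orthonormal basis B
    of the fixed subspace (the eigenvalue-1 eigenspace of the group-average
    projector), leaving an unconstrained ridge regression in the reduced
    coordinates.
    """
    P = sum(Rs) / len(Rs)                      # projector onto fixed subspace
    eigval, eigvec = np.linalg.eigh(P)         # projector eigenvalues are 0 or 1
    B = eigvec[:, eigval > 0.5]                # basis of the fixed subspace
    A = Phi @ B                                # features in reduced coordinates
    c = np.linalg.solve(A.T @ A + reg * np.eye(B.shape[1]), A.T @ y)
    return B @ c                               # satisfies R a = a by construction

# Toy data (assumed) with the cyclic-shift group on R^3.
S = np.roll(np.eye(3), 1, axis=0)
Rs = [np.eye(3), S, S @ S]
rng = np.random.default_rng(1)
Phi, y = rng.normal(size=(20, 3)), rng.normal(size=20)
a = solve_invariant_qp(Phi, y, Rs, reg=0.1)
```

Since the solution is returned as `B @ c`, it lies in the fixed subspace and hence satisfies every invariance constraint exactly, not approximately.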