Input-Label Correlation Governs a Linear-to-Nonlinear Transition in Random Features under Spiked Covariance
Random feature models (RFMs), two-layer networks with a randomly initialized fixed first layer and a trained linear readout, are among the simplest nonlinear predictors. Prior asymptotic analyses in the proportional high-dimensional regime show that, under isotropic data, RFMs reduce to noisy linear models and offer no advantage over classical linear methods such as ridge regression. Yet RFMs frequently outperform linear baselines on structured real data. We show that this tension is explained by a correlation-driven phase transition: under spiked-covariance designs, the interaction between anisotropy and input-label correlation determines whether the RFM behaves as an effectively linear predictor or exhibits genuinely nonlinear gains. Concretely, we establish a universality principle under anisotropy and characterize the RFM generalization error via an equivalent noisy polynomial model. The effective degree of this polynomial, equivalently, which Hermite orders of the activation survive, is governed by the strength of input-label correlation, yielding an explicit boundary in the correlation-spike-magnitude plane. Below the boundary, the RFM collapses to a linear surrogate and can underperform strong linear baselines; above it, higher-order terms persist and the RFM achieves a clear nonlinear advantage. Numerical simulations and real-data experiments corroborate the theory and delineate the transition between these two regimes.
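The setup described above can be sketched numerically. The following is a minimal illustration, not the authors' experimental code: it assumes a single-spike covariance of the form I + θγγᵀ, a tanh activation, and a hypothetical single-index teacher whose direction `xi` has correlation `rho` with the spike direction. All function names (`spiked_gaussian`, `ridge_fit`, `random_features`, `compare_rfm_to_ridge`), the noise level, and the default sizes are assumptions made for illustration. It fits a ridge readout on fixed random features and a plain ridge regression on the raw inputs, and returns both test errors.

```python
import numpy as np

def spiked_gaussian(m, n, theta, gamma, rng):
    """Draw m samples with covariance I_n + theta * gamma gamma^T (gamma unit-norm)."""
    z = rng.standard_normal((m, n))   # isotropic part
    s = rng.standard_normal((m, 1))   # scalar coordinate along the spike
    return z + np.sqrt(theta) * s * gamma[None, :]

def ridge_fit(F, y, lam):
    """Closed-form ridge regression: solve (F^T F + lam I) a = F^T y."""
    p = F.shape[1]
    return np.linalg.solve(F.T @ F + lam * np.eye(p), F.T @ y)

def random_features(X, W, activation=np.tanh):
    """Fixed random first layer followed by a pointwise nonlinearity."""
    return activation(X @ W.T)

def compare_rfm_to_ridge(n=100, m=400, p=300, theta=10.0, rho=0.8,
                         lam=1e-2, seed=0):
    """Return (rfm_test_mse, linear_test_mse) on one synthetic draw."""
    rng = np.random.default_rng(seed)

    # Spike direction gamma (unit norm).
    gamma = rng.standard_normal(n)
    gamma /= np.linalg.norm(gamma)

    # Hypothetical teacher direction xi with <xi, gamma> = rho.
    u = rng.standard_normal(n)
    u -= (u @ gamma) * gamma
    u /= np.linalg.norm(u)
    xi = rho * gamma + np.sqrt(1.0 - rho**2) * u

    # Random first-layer weights; they stay fixed (never trained).
    W = rng.standard_normal((p, n)) / np.sqrt(n)

    def sample(m_samples):
        X = spiked_gaussian(m_samples, n, theta, gamma, rng)
        # Assumed nonlinear single-index labels with small additive noise.
        y = np.tanh(X @ xi) + 0.1 * rng.standard_normal(m_samples)
        return X, y

    X_tr, y_tr = sample(m)
    X_te, y_te = sample(m)

    # RFM: train only the linear readout on the random features.
    a = ridge_fit(random_features(X_tr, W), y_tr, lam)
    rfm_mse = np.mean((random_features(X_te, W) @ a - y_te) ** 2)

    # Linear baseline: ridge regression directly on the inputs.
    b = ridge_fit(X_tr, y_tr, lam)
    lin_mse = np.mean((X_te @ b - y_te) ** 2)
    return rfm_mse, lin_mse
```

Sweeping `rho` and `theta` in `compare_rfm_to_ridge` gives a rough, finite-size way to probe how the two errors compare across the plane. The sketch makes no attempt to match the proportional asymptotic regime or the scaling of the spike analyzed in the paper.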
💡 Research Summary
This paper investigates when random feature models (RFMs)—two‑layer networks with a fixed random first layer and a trained linear readout—provide genuine nonlinear benefits over classical linear predictors in the proportional high‑dimensional regime. Prior asymptotic analyses under isotropic data (covariance Iₙ) have shown that RFMs are asymptotically equivalent to a matched noisy linear model, implying no advantage regardless of the activation function. However, real‑world data often exhibit low‑dimensional structure and anisotropic covariance. The authors consider a spiked‑covariance data model: x ∼ N(0, Iₙ + θ γγᵀ) with a single spike direction γ, spike magnitude θ ≈ n^β (β∈