Localized Sparse Principal Component Analysis of Multivariate Time Series in Frequency Domain
Principal component analysis has been a main tool in multivariate analysis for estimating a low dimensional linear subspace that explains most of the variability in the data. However, in high-dimensional regimes, naive estimates of the principal loadings are not consistent and difficult to interpret. In the context of time series, principal component analysis of spectral density matrices can provide valuable, parsimonious information about the behavior of the underlying process, particularly if the principal components are interpretable in that they are sparse in coordinates and localized in frequency bands. In this paper, we introduce a formulation and consistent estimation procedure for interpretable principal component analysis for high-dimensional time series in the frequency domain. An efficient frequency-sequential algorithm is developed to compute sparse-localized estimates of the low-dimensional principal subspaces of the signal process. The method is motivated by and used to understand neurological mechanisms from high-density resting-state EEG in a study of first episode psychosis.
💡 Research Summary
This paper tackles the challenge of performing interpretable principal component analysis (PCA) on high‑dimensional multivariate time series in the frequency domain. Classical PCA works well when the dimension p is fixed, but in modern applications p often exceeds the sample size n, leading to inconsistent and non‑interpretable eigenvectors. Moreover, for time‑series data the spectral density matrix varies with frequency, and a naïve application of sparse PCA either ignores the continuity across frequencies or fails to capture the fact that useful signal power is often confined to specific frequency bands.
The authors introduce a novel formulation that simultaneously enforces sparsity across variables and localization across frequency bands. Sparsity is defined via the number of non‑zero diagonal entries of the projection matrix onto the principal subspace, while localization is achieved by selecting a subset Ω of frequencies where the summed power of the top d eigenvalues exceeds a data‑driven threshold. The goal is to estimate a rank‑d projection matrix A(ω) = ∑_{j=1}^d U_j(ω)U_j(ω)† that maximizes the integrated trace ∫_Ω tr(A(ω)f(ω)) dω, where f(ω) denotes the spectral density matrix of the process, subject to the sparsity and localization constraints above.
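The formulation above can be sketched in code under simplifying assumptions. The snippet below is a minimal illustration, not the paper's estimator: it uses a crude smoothed-periodogram spectral estimate, a fixed power fraction in place of the paper's data-driven localization threshold, and hard thresholding of loadings in place of its sparsity formulation. All function and parameter names are hypothetical.

```python
import numpy as np

def sparse_localized_pca(X, d=2, power_frac=0.9, load_thresh=0.2):
    """Illustrative sketch: per-frequency PCA of a multivariate series X (n x p),
    keeping frequencies where the top-d eigenvalues carry most of the power
    (localization) and zeroing small loading coordinates (sparsity)."""
    n, p = X.shape
    # DFT of each centered channel; rfft keeps frequencies in [0, 0.5].
    Z = np.fft.rfft(X - X.mean(axis=0), axis=0)           # (K, p) complex
    freqs = np.fft.rfftfreq(n)
    # Raw cross-periodogram at each Fourier frequency: Z(w) Z(w)^H / n.
    f_hat = np.einsum('kj,kl->kjl', Z, Z.conj()) / n      # (K, p, p)
    # Smooth each entry across neighboring frequencies (Hermitian is preserved).
    kernel = np.ones(5) / 5
    for j in range(p):
        for l in range(p):
            f_hat[:, j, l] = np.convolve(f_hat[:, j, l], kernel, mode='same')
    # Eigendecomposition of each Hermitian spectral matrix (ascending eigenvalues).
    eigvals, eigvecs = np.linalg.eigh(f_hat)
    top_power = eigvals[:, -d:].sum(axis=1)
    # Localization: keep frequencies whose top-d power exceeds a fraction of
    # total power there (a stand-in for the paper's data-driven threshold).
    omega = top_power >= power_frac * np.maximum(eigvals.sum(axis=1), 1e-12)
    # Sparsity: hard-threshold small coordinates of the top-d eigenvectors,
    # then renormalize the surviving columns.
    U = eigvecs[:, :, -d:]
    U = np.where(np.abs(U) < load_thresh, 0.0, U)
    norms = np.linalg.norm(U, axis=1, keepdims=True)
    U = U / np.maximum(norms, 1e-12)
    return freqs, omega, U
```

Each frequency is treated separately here; the paper's frequency-sequential algorithm instead couples estimates across frequencies so that the selected band Ω and the sparse loadings vary smoothly.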