AdapDISCOM: An Adaptive Sparse Regression Method for High-Dimensional Multimodal Data With Block-Wise Missingness and Measurement Errors

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the [Original Paper Viewer] below or the Original ArXiv Source.

Multimodal high-dimensional data are increasingly prevalent in biomedical research, yet they are often compromised by block-wise missingness and measurement errors, posing significant challenges for statistical inference and prediction. We propose AdapDISCOM, a novel adaptive direct sparse regression method that simultaneously addresses these two pervasive issues. Building on the DISCOM framework, AdapDISCOM introduces modality-specific weighting schemes to account for heterogeneity in data structures and error magnitudes across modalities. We establish the theoretical properties of AdapDISCOM, including model selection consistency and convergence rates under sub-Gaussian and heavy-tailed settings, and develop robust and computationally efficient variants (AdapDISCOM-Huber and Fast-AdapDISCOM). Extensive simulations demonstrate that AdapDISCOM consistently outperforms existing methods such as DISCOM, SCOM, and CoCoLasso, particularly under heterogeneous contamination and heavy-tailed distributions. Finally, we apply AdapDISCOM to Alzheimer's Disease Neuroimaging Initiative (ADNI) data, demonstrating improved prediction of cognitive scores and reliable selection of established biomarkers, even with substantial missingness and measurement errors. AdapDISCOM provides a flexible, robust, and scalable framework for high-dimensional multimodal data analysis under realistic data imperfections.

💡 Research Summary

This paper introduces AdapDISCOM, an adaptive direct sparse regression framework designed for high‑dimensional multimodal data that suffer simultaneously from block‑wise missingness and additive measurement errors. Building on the previously proposed DISCOM method, the authors augment the covariance‑based approach with modality‑specific weighting parameters, allowing the estimator to reflect heterogeneity in both data structure and noise magnitude across different modalities (e.g., genetics, structural MRI, PET, clinical questionnaires).

Methodology
The authors start from the linear model $y = X\beta + \varepsilon$, where the true predictor matrix $X$ is not fully observed. Instead, the observed matrix $Z$ is corrupted by additive errors with modality‑specific variances $\gamma_k^2$ and by block‑wise missingness (entire modalities are missing for some subjects). They first recall the CoCoLasso correction for measurement error, $\hat\Sigma = \hat\Sigma^{\text{raw}} - \gamma^2 I$, which subtracts the error variance that additive noise adds to the diagonal of the naive covariance, and the DISCOM estimator for missing data, $\hat\Sigma = \alpha_1 \hat\Sigma_I + \alpha_2 \hat\Sigma_C + \alpha_3 I$.
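
To make the DISCOM-style combination above concrete, here is a minimal Python sketch. It is not the authors' implementation: the function names `pairwise_covariance` and `discom_covariance`, the simulated data, and the weight values are all illustrative assumptions. It estimates each covariance entry from the subjects for whom both variables are observed, splits the result into intra-modality and cross-modality parts, and combines them with the three weights.

```python
import numpy as np


def pairwise_covariance(Z):
    """Covariance from all available pairs; NaN marks a missing entry.

    Each column is centered on its observed values, then entry (j, l)
    averages the products over the rows where both j and l are observed.
    """
    observed = ~np.isnan(Z)
    col_means = np.nanmean(Z, axis=0)
    centered = np.where(observed, Z - col_means, 0.0)
    counts = observed.astype(float).T @ observed.astype(float)
    sums = centered.T @ centered
    # Guard against division by zero for pairs that are never co-observed.
    return sums / np.maximum(counts, 1.0)


def discom_covariance(Z, blocks, a1, a2, a3):
    """Illustrative combination a1 * Sigma_I + a2 * Sigma_C + a3 * I.

    `blocks` holds one modality label per column (an assumed encoding).
    """
    sigma = pairwise_covariance(Z)
    same_block = blocks[:, None] == blocks[None, :]
    sigma_intra = np.where(same_block, sigma, 0.0)  # within-modality entries
    sigma_cross = np.where(same_block, 0.0, sigma)  # between-modality entries
    p = Z.shape[1]
    return a1 * sigma_intra + a2 * sigma_cross + a3 * np.eye(p)


# Simulated example: two modalities, each missing for a subset of subjects.
rng = np.random.default_rng(0)
n, p = 200, 6
blocks = np.array([0, 0, 0, 1, 1, 1])
Z = rng.normal(size=(n, p))
Z[:60, blocks == 1] = np.nan    # modality 1 missing for the first 60 subjects
Z[150:, blocks == 0] = np.nan   # modality 0 missing for the last 50 subjects

sigma_hat = discom_covariance(Z, blocks, a1=0.9, a2=0.6, a3=0.1)
print(sigma_hat.shape)
```

Cross-modality entries rest on fewer co-observed subjects than intra-modality entries (here 90 versus 140 or 150), which is the usual motivation for giving them a smaller weight.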

AdapDISCOM generalizes DISCOM by introducing separate intra‑modality weights $\alpha_k$ for each modality $k$, a cross‑modality weight $\alpha_C$, and a global ridge weight $\alpha_p$. The resulting covariance estimator combines the per-modality intra‑modality terms, the cross‑modality term, and the ridge term under these weights.
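
The estimator's formula is not reproduced in this summary. By analogy with the DISCOM form $\alpha_1 \hat\Sigma_I + \alpha_2 \hat\Sigma_C + \alpha_3 I$, one plausible form is sketched below. It is an assumption, not the authors' exact definition: $K$ (the number of modalities) and $\hat\Sigma_I^{(k)}$ (the block of $\hat\Sigma_I$ belonging to modality $k$, zero elsewhere) are notation introduced here for illustration.

```latex
\hat\Sigma_{\text{Adap}}
  \;=\; \sum_{k=1}^{K} \alpha_k \, \hat\Sigma_I^{(k)}
  \;+\; \alpha_C \, \hat\Sigma_C
  \;+\; \alpha_p \, I
```

Under this reading, setting $\alpha_k = \alpha_1$ for every $k$, $\alpha_C = \alpha_2$, and $\alpha_p = \alpha_3$ recovers the DISCOM estimator, since $\sum_k \hat\Sigma_I^{(k)} = \hat\Sigma_I$.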
