Bayesian Adaptive Lasso
We propose the Bayesian adaptive Lasso (BaLasso) for variable selection and coefficient estimation in linear regression. The BaLasso adapts to the signal level by applying different amounts of shrinkage to different coefficients. Furthermore, we provide a model selection machinery for the BaLasso by assessing the posterior conditional mode estimates, motivated by the hierarchical Bayesian interpretation of the Lasso. Our formulation also permits prediction using a model averaging strategy. We discuss other variants of this new approach and provide a unified framework for variable selection using flexible penalties. The attractiveness of the method is demonstrated via extensive simulation studies and data analysis.
💡 Research Summary
The paper introduces the Bayesian Adaptive Lasso (BaLasso), a hierarchical Bayesian extension of the classic Lasso that allows coefficient‑specific shrinkage by assigning each regression coefficient its own penalty parameter. Starting from the observation that the Lasso corresponds to a Laplace prior, the authors represent this prior as a scale‑mixture of normals with an exponential mixing distribution. By introducing individual scale parameters τ_j² for each coefficient β_j and corresponding penalty parameters λ_j, the model can automatically apply stronger shrinkage to irrelevant variables and lighter shrinkage to strong signals.
The hierarchical model consists of: (i) a normal likelihood y|X,β,σ² ~ N(Xβ,σ²I); (ii) a normal prior β|σ²,τ² ~ N(0,σ²D_τ) with D_τ = diag(τ₁²,…,τ_p²); (iii) an exponential (or inverse‑Gaussian) prior for τ_j² conditional on λ_j; and (iv) a Gamma hyper‑prior for λ_j² (λ_j² ~ Gamma(r,δ)). An improper prior π(σ²) ∝ 1/σ² is used for the error variance.
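The hierarchy above can be simulated forward to see how coefficient-specific penalties arise. A minimal generative sketch (the settings `n=50`, `p=5`, `r=1`, `delta=0.1` are illustrative, not from the paper; σ² is fixed at 1 for simulation since its prior is improper):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, r, delta = 50, 5, 1.0, 0.1  # hypothetical sizes and hyper-parameters

# lam_j^2 ~ Gamma(r, rate=delta)  (numpy's gamma takes a scale, so pass 1/delta)
lam2 = rng.gamma(r, 1.0 / delta, size=p)
# tau_j^2 | lam_j ~ Exponential(rate = lam_j^2 / 2), i.e. scale = 2 / lam_j^2
tau2 = rng.exponential(2.0 / lam2)
# sigma^2 has the improper prior 1/sigma^2; fix a value for forward simulation
sigma2 = 1.0
# beta | sigma^2, tau^2 ~ N(0, sigma^2 * D_tau) with D_tau = diag(tau_1^2,...,tau_p^2)
beta = rng.normal(0.0, np.sqrt(sigma2 * tau2))
# y | X, beta, sigma^2 ~ N(X beta, sigma^2 I)
X = rng.standard_normal((n, p))
y = X @ beta + np.sqrt(sigma2) * rng.standard_normal(n)
```

Large λ_j values produce small τ_j², which in turn concentrate β_j near zero; this is the mechanism by which the model shrinks irrelevant coefficients harder than strong signals.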
Because all full conditional distributions have standard forms, a Gibbs sampler can be constructed: β follows a multivariate normal, σ² an inverse‑Gamma, each 1/τ_j² an inverse‑Gaussian, and each λ_j² a Gamma. This sampler is computationally efficient and yields posterior samples for all parameters.
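The full conditionals just listed can be sketched as a compact Gibbs sampler. This is a minimal illustrative implementation assuming the standard Bayesian-Lasso conditional forms (uncentered data, the improper prior π(σ²) ∝ 1/σ²); the function name `balasso_gibbs` and default hyper-parameters are my own, not from the paper:

```python
import numpy as np

def balasso_gibbs(X, y, n_iter=2000, r=1.0, delta=0.1, seed=0):
    """Sketch of a Gibbs sampler for the BaLasso hierarchy.

    Full conditionals (all standard forms):
      beta     | .  ~ N(A^{-1} X'y, sigma2 * A^{-1}),  A = X'X + D_tau^{-1}
      sigma2   | .  ~ InvGamma((n+p)/2, ||y - Xb||^2/2 + b' D_tau^{-1} b / 2)
      1/tau_j^2| .  ~ InvGaussian(sqrt(lam_j^2 sigma2 / b_j^2), lam_j^2)
      lam_j^2  | .  ~ Gamma(r + 1, rate = delta + tau_j^2 / 2)
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    XtX, Xty = X.T @ X, X.T @ y
    beta, sigma2 = np.zeros(p), 1.0
    tau2, lam2 = np.ones(p), np.ones(p)
    draws = {"beta": [], "sigma2": [], "lam2": []}
    for _ in range(n_iter):
        # beta | rest: multivariate normal via Cholesky of A^{-1}
        A = XtX + np.diag(1.0 / tau2)
        L = np.linalg.cholesky(np.linalg.inv(A))  # explicit inverse: fine for a sketch
        mean = np.linalg.solve(A, Xty)
        beta = mean + np.sqrt(sigma2) * (L @ rng.standard_normal(p))
        # sigma2 | rest: inverse-Gamma (draw a Gamma on the precision scale)
        resid = y - X @ beta
        rate = 0.5 * (resid @ resid + np.sum(beta**2 / tau2))
        sigma2 = 1.0 / rng.gamma(0.5 * (n + p), 1.0 / rate)
        # 1/tau_j^2 | rest: inverse-Gaussian (numpy calls it 'wald')
        mu = np.sqrt(lam2 * sigma2 / np.maximum(beta**2, 1e-12))
        tau2 = 1.0 / rng.wald(mu, lam2)
        # lam_j^2 | rest: conjugate Gamma update from the Gamma(r, delta) hyper-prior
        lam2 = rng.gamma(r + 1.0, 1.0 / (delta + 0.5 * tau2))
        draws["beta"].append(beta.copy())
        draws["sigma2"].append(sigma2)
        draws["lam2"].append(lam2.copy())
    return {k: np.asarray(v) for k, v in draws.items()}
```

Each coefficient carries its own λ_j chain, so the posterior for an irrelevant β_j is shrunk by a larger penalty than that of a strong signal, which is exactly the adaptivity the model is designed to deliver.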
Two strategies for estimating the penalty vector λ are proposed. The empirical Bayes (EB) approach maximizes the marginal likelihood via an EM‑type algorithm in which the E‑step expectation E_{λ^(k−1)}[τ_j² | y] is approximated by the average of Gibbs samples; the M‑step update λ_j^(k) = √(2 / E_{λ^(k−1)}[τ_j² | y]) is then iterated until convergence. Alternatively, the fully Bayesian approach places the Gamma(r, δ) hyper‑prior on each λ_j² and samples the penalties jointly with the other parameters within the Gibbs sampler.
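The EB update has a one-line form once the Gibbs draws are in hand. A sketch (the helper name `eb_lambda_update` is mine; it assumes a `(n_samples, p)` array of τ_j² draws taken at the current λ, and the exponential mixing density (λ_j²/2)·exp(−λ_j² τ_j²/2), whose M‑step maximizer is λ_j² = 2 / E[τ_j² | y]):

```python
import numpy as np

def eb_lambda_update(tau2_draws):
    """One EM iteration for the penalty vector lambda (sketch).

    tau2_draws: array of shape (n_samples, p), Gibbs draws of tau_j^2
    obtained under the current penalties. The Monte Carlo average over
    rows approximates the E-step expectation E[tau_j^2 | y]; the M-step
    maximizer is lam_j = sqrt(2 / E[tau_j^2 | y]).
    """
    return np.sqrt(2.0 / tau2_draws.mean(axis=0))
```

In practice one would alternate: run the Gibbs sampler at the current λ, average the τ_j² draws, apply this update, and repeat until the λ_j stabilize.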