On weight and variance uncertainty in neural networks for regression tasks


We investigate the problem of weight uncertainty originally proposed by Blundell et al. (2015, "Weight uncertainty in neural networks," ICML) in the context of neural networks designed for regression tasks, and we extend their framework by incorporating variance uncertainty into the model. Our analysis demonstrates that explicitly modeling uncertainty in the variance parameter can significantly enhance the predictive performance of Bayesian neural networks: by placing a full posterior distribution over the variance, the model generalizes better than approaches that treat the variance as fixed or deterministic. We evaluate the generalization capability of the proposed approach on a function approximation example and further validate it on the riboflavin genetic dataset. Our exploration covers both fully connected dense networks and dropout neural networks, using Gaussian and spike-and-slab priors respectively for the network weights, providing a comprehensive assessment of how variance uncertainty affects model performance across different architectural choices.


💡 Research Summary

This paper extends the “Bayes by Backprop” framework of Blundell et al. (2015) by treating the observation variance in regression tasks as a random variable rather than a fixed hyper‑parameter. The authors introduce a latent variance parameter S, assign it a Gaussian prior, and approximate its posterior together with the network weights using a mean‑field diagonal Gaussian variational distribution. Positivity of the variance is ensured by a soft‑plus transformation g(S) = log(1 + exp(S)). The resulting evidence lower bound (ELBO) incorporates expectations over both weights and variance, and stochastic gradient descent with the re‑parameterization trick is used to optimize the variational parameters (μ_w, ρ_w for weights and μ_L, ρ_L for variance).
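The reparameterized sampling step can be sketched as follows. This is a minimal NumPy illustration of the draws described above, not the authors' implementation; the function name `sample_gaussian` and the parameter values are purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def softplus(x):
    # g(x) = log(1 + exp(x)): maps an unconstrained real to a positive value.
    return np.log1p(np.exp(x))

def sample_gaussian(mu, rho, rng):
    # Reparameterization trick: theta = mu + softplus(rho) * eps, eps ~ N(0, 1),
    # so gradients can flow through (mu, rho) while the noise stays external.
    eps = rng.standard_normal(size=np.shape(mu))
    return mu + softplus(rho) * eps

# Illustrative variational parameters for one weight and the latent variance S.
mu_w, rho_w = 0.5, -3.0   # weight: mean and pre-scale
mu_L, rho_L = 0.0, -3.0   # latent variance: mean and pre-scale

w = sample_gaussian(mu_w, rho_w, rng)   # one Monte Carlo weight draw
S = sample_gaussian(mu_L, rho_L, rng)   # unconstrained variance draw
sigma2 = softplus(S)                    # positive observation variance g(S)
```

In a training loop, one such joint draw of (w, S) per minibatch is plugged into the ELBO, and the gradient with respect to the μ and ρ parameters is taken through the deterministic transform.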

Two network families are examined: (i) fully‑connected dense networks with Gaussian priors on weights, and (ii) dropout networks where a spike‑and‑slab prior models the probability of a weight being exactly zero, thereby capturing structural uncertainty. In both cases the proposed variance‑uncertainty model (VBNET‑UNC) is compared against a fixed‑variance baseline (VBNET‑FIXED) and conventional dropout‑BNN.
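The spike‑and‑slab draw amounts to mixing a Gaussian “slab” with a point mass at zero, which is what gives the dropout networks their structural sparsity. A minimal NumPy sketch, with a hypothetical function name and parameter values not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(1)

def sample_spike_slab(pi, mu, sigma, size, rng):
    # With probability pi a weight is drawn from the Gaussian "slab"
    # N(mu, sigma^2); otherwise it is exactly zero (the "spike"),
    # mimicking dropout-style structural uncertainty.
    keep = rng.random(size) < pi
    slab = rng.normal(mu, sigma, size)
    return np.where(keep, slab, 0.0)

# 1000 illustrative weight draws with inclusion probability 0.5.
w = sample_spike_slab(pi=0.5, mu=0.0, sigma=1.0, size=1000, rng=rng)
```

Roughly half of the sampled weights are exactly zero, so each forward pass effectively uses a random subnetwork.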

Experiments consist of (a) a synthetic one‑dimensional nonlinear function with added Gaussian noise, demonstrating that the fixed‑variance model yields overly narrow predictive intervals, whereas the variance‑uncertainty model produces intervals that correctly cover the true noise level; and (b) the high‑dimensional riboflavin gene expression dataset, where after PCA preprocessing the VBNET‑UNC achieves lower mean‑squared prediction error (MSPE) and higher 95% coverage probability than all baselines, including Lasso regression. The marginalization over variance induces heavy‑tailed predictive distributions, improving robustness to outliers and providing more reliable uncertainty quantification.
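The coverage metric used in these comparisons is simply the fraction of held‑out targets that land inside their predictive intervals. A small sketch with illustrative data (not the paper's datasets), where exact 95% Gaussian intervals should recover roughly 0.95:

```python
import numpy as np

def empirical_coverage(y_true, lower, upper):
    # Fraction of targets inside their predictive intervals;
    # well-calibrated 95% intervals should score close to 0.95.
    y_true = np.asarray(y_true)
    inside = (y_true >= np.asarray(lower)) & (y_true <= np.asarray(upper))
    return float(inside.mean())

# Sanity check on synthetic N(0, 1) targets with exact 95% intervals.
rng = np.random.default_rng(0)
y = rng.standard_normal(50_000)
cov = empirical_coverage(y, -1.96, 1.96)
```

A fixed‑variance model that underestimates the noise level shrinks `lower`/`upper` and drives this number well below the nominal 0.95, which is exactly the failure mode the synthetic experiment exhibits.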

Key contributions are: (1) explicit Bayesian treatment of global observation variance, separating epistemic and aleatoric uncertainties in limited‑data regimes; (2) a scalable variational inference scheme that avoids the computational burden of MCMC while remaining applicable to deep architectures; (3) integration of spike‑and‑slab dropout to jointly model weight and variance uncertainty; and (4) empirical evidence that modeling variance uncertainty enhances both predictive accuracy and calibrated uncertainty estimates in realistic, high‑dimensional regression problems.
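The heavy‑tail claim can be checked numerically: marginalizing a Gaussian over a random variance yields a Gaussian scale mixture, whose excess kurtosis exceeds that of any fixed‑variance Gaussian. A sketch under an illustrative log‑normal variance distribution (chosen for simplicity, not the paper's exact posterior):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000

# Fixed-variance predictive: y ~ N(0, 1).
y_fixed = rng.standard_normal(n)

# Variance-uncertain predictive: draw sigma^2 per sample (log-normal here,
# purely illustrative), then y | sigma^2 ~ N(0, sigma^2). Marginally this
# is a Gaussian scale mixture with heavier-than-Gaussian tails.
sigma2 = np.exp(rng.normal(0.0, 0.7, size=n))
y_mixed = rng.standard_normal(n) * np.sqrt(sigma2)

def excess_kurtosis(x):
    # Sample excess kurtosis: ~0 for a Gaussian, positive for heavy tails.
    z = (x - x.mean()) / x.std()
    return float(np.mean(z**4) - 3.0)

k_fixed = excess_kurtosis(y_fixed)   # close to 0
k_mixed = excess_kurtosis(y_mixed)   # clearly positive
```

The positive excess kurtosis of the mixture is what makes the predictive distribution more forgiving of outliers than a fixed‑variance Gaussian likelihood.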

