Skip to Main content Skip to Navigation
Preprints, Working Papers, ...

Bayesian neural networks increasingly sparsify their units with depth

Mariia Vladimirova 1 Julyan Arbel 1 Pablo Mesejo 2
1 MISTIS - Modelling and Inference of Complex and Structured Stochastic Systems
Inria Grenoble - Rhône-Alpes, Grenoble INP - Institut polytechnique de Grenoble - Grenoble Institute of Technology, LJK - Laboratoire Jean Kuntzmann
Abstract : We investigate deep Bayesian neural networks with Gaussian priors on the weights and ReLU-like nonlinearities, shedding light on novel sparsity-inducing mechanisms at the level of the units of the network, both pre-and post-nonlinearities. The main thrust of the paper is to establish that the units prior distribution becomes increasingly heavy-tailed with depth. We show that first layer units are Gaussian, second layer units are sub-Exponential, and we introduce sub-Weibull distributions to characterize the deeper layers units. Bayesian neural networks with Gaussian priors are well known to induce the weight decay penalty on the weights. In contrast, our result indicates a more elaborate regularization scheme at the level of the units, ranging from convex penalties for the first two layers-weight decay for the first and Lasso for the second to non convex penalties for deeper layers. Thus, despite weight decay does not allow for the weights to be set exactly to zero, sparse solutions tend to be selected for the units from the second layer onward. This result provides new theoretical insight on deep Bayesian neural networks, underpinning their natural shrinkage properties and practical potential.
Complete list of metadata

Cited literature [26 references]  Display  Hide  Download
Contributor : Julyan Arbel <>
Submitted on : Tuesday, December 11, 2018 - 3:44:57 AM
Last modification on : Tuesday, May 11, 2021 - 11:37:38 AM
Long-term archiving on: : Tuesday, March 12, 2019 - 12:38:26 PM


Files produced by the author(s)


  • HAL Id : hal-01950657, version 1



Mariia Vladimirova, Julyan Arbel, Pablo Mesejo. Bayesian neural networks increasingly sparsify their units with depth. 2018. ⟨hal-01950657⟩



Record views


Files downloads