Bayesian neural networks increasingly sparsify their units with depth

Mariia Vladimirova 1, Julyan Arbel 1, Pablo Mesejo 2
1 MISTIS - Modelling and Inference of Complex and Structured Stochastic Systems
Inria Grenoble - Rhône-Alpes, LJK - Laboratoire Jean Kuntzmann, INPG - Institut National Polytechnique de Grenoble
Abstract: We investigate deep Bayesian neural networks with Gaussian priors on the weights and ReLU-like nonlinearities, shedding light on novel sparsity-inducing mechanisms at the level of the units of the network, both pre- and post-nonlinearities. The main thrust of the paper is to establish that the prior distribution of the units becomes increasingly heavy-tailed with depth. We show that first-layer units are Gaussian, second-layer units are sub-Exponential, and we introduce sub-Weibull distributions to characterize the units of deeper layers. Bayesian neural networks with Gaussian priors are well known to induce the weight decay penalty on the weights. In contrast, our result indicates a more elaborate regularization scheme at the level of the units, ranging from convex penalties for the first two layers (weight decay for the first and Lasso for the second) to non-convex penalties for deeper layers. Thus, although weight decay does not allow the weights to be set exactly to zero, sparse solutions tend to be selected for the units from the second layer onward. This result provides new theoretical insight into deep Bayesian neural networks, underpinning their natural shrinkage properties and practical potential.
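
The claim that unit priors grow heavier-tailed with depth can be probed numerically. The following is a minimal sketch, not the authors' code: it draws many networks from an i.i.d. Gaussian weight prior, pushes one fixed input through ReLU layers, and reports the excess kurtosis of a single pre-nonlinearity unit per layer as a crude proxy for tail heaviness. The function names, the all-ones input, the narrow width of 5, the absence of biases, and the 1/fan_in variance scaling are all assumptions made for illustration.

```python
import numpy as np


def sample_preactivations(n_draws=200_000, width=5, depth=3, seed=0):
    """Draw n_draws networks from the prior; return one pre-activation unit per layer.

    Assumptions (not from the source): fixed all-ones input, equal layer widths,
    no biases, weights i.i.d. N(0, 1 / fan_in).
    """
    rng = np.random.default_rng(seed)
    x = np.ones(width)                          # one fixed input vector
    h = np.broadcast_to(x, (n_draws, width))    # same input for every prior draw
    first_units = []
    for _ in range(depth):
        fan_in = h.shape[1]
        # Independent Gaussian weight matrix for every prior draw.
        w = rng.normal(0.0, 1.0 / np.sqrt(fan_in), size=(n_draws, fan_in, width))
        g = np.einsum("ni,nij->nj", h, w)       # pre-nonlinearity units
        first_units.append(g[:, 0])             # keep the first unit of this layer
        h = np.maximum(g, 0.0)                  # ReLU gives post-nonlinearity units
    return first_units


def excess_kurtosis(z):
    """Sample excess kurtosis; zero for a Gaussian, positive for heavier tails."""
    z = z - z.mean()
    return float(np.mean(z**4) / np.mean(z**2) ** 2 - 3.0)


kurt = [excess_kurtosis(u) for u in sample_preactivations()]
for layer, k in enumerate(kurt, start=1):
    print(f"layer {layer}: excess kurtosis = {k:.2f}")
```

Under these assumptions, the first-layer value should sit near zero (a Gaussian unit) and the values should grow with depth. Kurtosis is only a moment-based summary and does not by itself establish a sub-Exponential or sub-Weibull tail bound. A narrow width is used because, in this setup, the effect weakens as the layers widen.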

https://hal.archives-ouvertes.fr/hal-01950657
Contributor: Julyan Arbel
Submitted on: Tuesday, 11 December 2018 - 03:44:57
Last modified on: Thursday, 27 December 2018 - 13:14:58
Document(s) archived on: Tuesday, 12 March 2019 - 12:38:26

File

BNN.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-01950657, version 1

Citation

Mariia Vladimirova, Julyan Arbel, Pablo Mesejo. Bayesian neural networks increasingly sparsify their units with depth. 2018. 〈hal-01950657〉
