Bregman Neural Networks - Archive ouverte HAL
Preprint, Working Paper. Year: 2022

Bregman Neural Networks

Abstract

We present a framework based on bilevel optimization for learning multilayer, deep data representations. On the one hand, the lower-level problem finds a representation by successively minimizing layer-wise objectives made of the sum of a prescribed regularizer, a fidelity term, and a linear function depending on the representation found at the previous layer. On the other hand, the upper-level problem optimizes over the linear functions to yield a linearly separable final representation. We show that, by choosing the fidelity term as the quadratic distance between two successive layer-wise representations, the bilevel problem reduces to the training of a feedforward neural network. By contrast, elaborating on Bregman distances, we devise a novel neural network architecture that additionally involves the inverse of the activation function, reminiscent of the skip connections used in ResNets. Numerical experiments suggest that the proposed Bregman variant benefits from better learning properties and more robust prediction performance.
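The abstract contrasts a standard feedforward layer with a Bregman layer in which the inverse activation plays the role of a skip connection. The sketch below is a hypothetical illustration (not the authors' implementation): it assumes a layer update of the form x_next = σ(σ⁻¹(x) + Wx + b) with σ the sigmoid, whose inverse is the logit, alongside the usual feedforward update x_next = σ(Wx + b).

```python
import numpy as np

def sigmoid(z):
    """Sigmoid activation; maps R into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def logit(x):
    """Inverse of the sigmoid; requires entries of x in (0, 1)."""
    return np.log(x) - np.log(1.0 - x)

def bregman_layer(x, W, b):
    # Hypothetical Bregman-style update: the inverse-activation term
    # acts like a skip connection carrying x forward through the layer.
    return sigmoid(logit(x) + W @ x + b)

def feedforward_layer(x, W, b):
    # Standard feedforward layer, shown for comparison.
    return sigmoid(W @ x + b)
```

One consequence of this form: with W = 0 and b = 0, the Bregman layer reduces exactly to the identity map (σ(σ⁻¹(x)) = x), much as a ResNet block with zeroed weights passes its input through unchanged, whereas the plain feedforward layer collapses every input to σ(0) = 0.5.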
Main file: HAL___Bregman_Neural_Network.pdf (703.1 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-03132512 , version 1 (05-02-2021)
hal-03132512 , version 2 (10-02-2022)
hal-03132512 , version 3 (16-02-2022)

Identifiers

  • HAL Id : hal-03132512 , version 3

Cite

Jordan Frecon, Saverio Salzo, Massimiliano Pontil, Gilles Gasso. Bregman Neural Networks. 2022. ⟨hal-03132512v3⟩
178 views
298 downloads
