Bregman Neural Networks - Archive ouverte HAL
Preprint, Working Paper. Year: 2022

Bregman Neural Networks

Abstract

We present a framework based on bilevel optimization for learning multilayer, deep data representations. On the one hand, the lower-level problem finds a representation by successively minimizing layer-wise objectives made of the sum of a prescribed regularizer, a fidelity term, and a linear function depending on the representation found at the previous layer. On the other hand, the upper-level problem optimizes over the linear functions to yield a linearly separable final representation. We show that, by choosing the fidelity term as the quadratic distance between two successive layer-wise representations, the bilevel problem reduces to the training of a feedforward neural network. By contrast, elaborating on Bregman distances, we devise a novel neural network architecture that additionally involves the inverse of the activation function, reminiscent of the skip connections used in ResNets. Numerical experiments suggest that the proposed Bregman variant benefits from better learning properties and more robust prediction performance.
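The abstract contrasts a standard feedforward layer with a Bregman layer in which the inverse activation plays the role of a skip connection. The sketch below is a hypothetical illustration (not the authors' implementation): it assumes a layer update of the form x_next = σ(σ⁻¹(x) + Wx + b) with σ the sigmoid, whose inverse is the logit, alongside the usual feedforward update x_next = σ(Wx + b).

```python
import numpy as np

def sigmoid(z):
    """Sigmoid activation; maps R into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def logit(x):
    """Inverse of the sigmoid; requires entries of x in (0, 1)."""
    return np.log(x) - np.log(1.0 - x)

def bregman_layer(x, W, b):
    # Hypothetical Bregman-style update: the inverse-activation term
    # acts like a skip connection carrying x forward through the layer.
    return sigmoid(logit(x) + W @ x + b)

def feedforward_layer(x, W, b):
    # Standard feedforward layer, shown for comparison.
    return sigmoid(W @ x + b)
```

One consequence of this form: with W = 0 and b = 0, the Bregman layer reduces exactly to the identity map (σ(σ⁻¹(x)) = x), much as a ResNet block with zeroed weights passes its input through unchanged, whereas the plain feedforward layer collapses every input to σ(0) = 0.5.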
Main file: HAL___Bregman_Neural_Network.pdf (703.1 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-03132512 , version 1 (05-02-2021)
hal-03132512 , version 2 (10-02-2022)
hal-03132512 , version 3 (16-02-2022)

Identifiers

  • HAL Id : hal-03132512 , version 3

Cite

Jordan Frecon, Saverio Salzo, Massimiliano Pontil, Gilles Gasso. Bregman Neural Networks. 2022. ⟨hal-03132512v3⟩
178 views
298 downloads
