Conference paper - Year: 2021

Growing Neural Networks Achieve Flatter Minima

Abstract

Deep neural networks of sizes commonly encountered in practice have been proven to converge towards a global minimum. The flatness of the loss surface in a neighborhood of such minima is often linked with better generalization performance. In this paper, we present a new model of growing neural networks in which we incrementally add neurons throughout the learning phase. We study the characteristics of the minima found by such a network compared to those obtained with standard feedforward neural networks. This analysis shows that a neural network grown with our procedure converges towards a flatter minimum than a standard neural network with the same number of parameters trained from scratch. Furthermore, our results confirm the link between flatter minima and better generalization performance, as the grown models tend to outperform the standard ones. We validate this approach both with small neural networks and with large deep learning models that are state of the art in Natural Language Processing tasks.
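
The sketch below is a minimal, hypothetical PyTorch illustration of the general idea described in the abstract, not the procedure from the paper: a small one-hidden-layer network is widened by adding freshly initialised neurons between training phases, and the flatness of the resulting minimum is compared against a same-width baseline trained from scratch, using the average loss increase under small random weight perturbations as a rough flatness proxy. All function names, widths, and hyperparameters are illustrative assumptions.

# Hypothetical sketch (not the authors' procedure): grow a one-hidden-layer
# network by adding neurons during training, then estimate the flatness of
# the final minimum as the average loss increase under small random weight
# perturbations. Widths, step counts, and noise scale are arbitrary choices.
import copy
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy regression data.
X = torch.randn(256, 10)
y = X.sum(dim=1, keepdim=True)

def make_net(hidden):
    return nn.Sequential(nn.Linear(10, hidden), nn.ReLU(), nn.Linear(hidden, 1))

def grow(net, extra):
    # Return a wider copy of `net`, keeping the trained weights and adding
    # `extra` freshly initialised hidden neurons.
    old_in, old_out = net[0], net[2]
    new = make_net(old_in.out_features + extra)
    with torch.no_grad():
        new[0].weight[: old_in.out_features] = old_in.weight
        new[0].bias[: old_in.out_features] = old_in.bias
        new[2].weight[:, : old_in.out_features] = old_out.weight
        new[2].bias.copy_(old_out.bias)
    return new

def train(net, steps=200, lr=1e-2):
    opt = torch.optim.SGD(net.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(steps):
        opt.zero_grad()
        loss = loss_fn(net(X), y)
        loss.backward()
        opt.step()
    return loss.item()

def sharpness(net, sigma=1e-2, trials=20):
    # Average loss increase under Gaussian weight perturbations:
    # lower values indicate a flatter minimum.
    loss_fn = nn.MSELoss()
    base = loss_fn(net(X), y).item()
    deltas = []
    for _ in range(trials):
        pert = copy.deepcopy(net)
        with torch.no_grad():
            for p in pert.parameters():
                p.add_(sigma * torch.randn_like(p))
        deltas.append(loss_fn(pert(X), y).item() - base)
    return sum(deltas) / trials

# Grown model: start small and add neurons between training phases.
net = make_net(4)
for _ in range(3):
    train(net)
    net = grow(net, extra=4)
train(net)

# Baseline: same final width (16 hidden neurons), trained from scratch.
baseline = make_net(16)
train(baseline, steps=800)

print("grown    sharpness:", sharpness(net))
print("baseline sharpness:", sharpness(baseline))

Under the paper's hypothesis, the grown model would report a lower sharpness value than the from-scratch baseline; the perturbation-based measure here is only one of several common flatness proxies and is not necessarily the one used by the authors.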
Main file: Papier.pdf (256.88 KB)
Figures: Fig.jpg (16.37 KB), Fig.png (95.88 KB)

Dates and versions

hal-03402267, version 1 (10-11-2021)

Licence

Attribution

Identifiers

HAL Id: hal-03402267
DOI: 10.1007/978-3-030-86340-1_18

Cite

Paul Caillon, Christophe Cerisara. Growing Neural Networks Achieve Flatter Minima. ICANN 2021 - 30th International Conference on Artificial Neural Networks, Sep 2021, Bratislava, Slovakia. pp.222-234, ⟨10.1007/978-3-030-86340-1_18⟩. ⟨hal-03402267⟩
