Nonlinear Acceleration of Deep Neural Networks

Damien Scieur; Edouard Oyallon; Alexandre d'Aspremont; Francis Bach

Preprint, Working Paper. Year: 2018

Damien Scieur

Role: Author
PersonId : 1002357

Statistical Machine Learning and Parsimony

Université Paris Sciences et Lettres

Edouard Oyallon

Role: Author
PersonId : 179157
IdHAL : edouard-oyallon
ORCID : 0000-0002-4826-7527
IdRef : 228745500

Département d'informatique - ENS Paris

Centre de vision numérique

Alexandre d'Aspremont

Role: Author

Statistical Machine Learning and Parsimony

Université Paris Sciences et Lettres

Francis Bach

Role: Author
PersonId : 863086

Statistical Machine Learning and Parsimony

Université Paris Sciences et Lettres

Abstract

Regularized nonlinear acceleration (RNA) is a generic extrapolation scheme for optimization methods with marginal computational overhead. It aims to improve convergence using only the iterates of simple iterative algorithms. So far, however, its application to optimization was theoretically limited to gradient descent and other single-step algorithms. Here, we adapt RNA to a much broader setting that includes stochastic gradient with momentum and Nesterov's fast gradient. We use it to train deep neural networks and empirically observe that extrapolated networks are more accurate, especially in the early iterations. A straightforward application of our algorithm when training ResNet-152 on ImageNet produces a top-1 test error of 20.88%, improving on the reference classification pipeline by 0.8%. Furthermore, the code runs offline in this case, so it never negatively affects performance.
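
As a rough illustration of the kind of extrapolation the abstract describes, here is a minimal sketch of the basic RNA step in its commonly described form for single-step methods. It is not the generalized variant this paper adapts to momentum methods, and it is not the authors' code. The function name `rna_extrapolate`, the parameter `reg`, its default value, and the Gram-matrix normalization are all assumptions made for illustration.

```python
import numpy as np


def rna_extrapolate(iterates, reg=1e-10):
    """Sketch of a basic RNA step (hypothetical helper, not the paper's code).

    iterates: sequence of k+1 parameter vectors x_0, ..., x_k (each of size d).
    reg:      regularization strength (an assumed default, tune in practice).
    Returns a weighted combination of the iterates whose weights sum to one.
    """
    X = np.asarray(iterates, dtype=float)      # shape (k+1, d)
    R = np.diff(X, axis=0)                     # residuals x_{i+1} - x_i, shape (k, d)

    # Gram matrix of the residuals, scaled so that `reg` is a relative quantity.
    # Assumes at least one residual is nonzero.
    G = R @ R.T
    G = G / np.linalg.norm(G, 2)

    # Solve (G + reg * I) z = 1, then normalize so the weights sum to one.
    k = R.shape[0]
    z = np.linalg.solve(G + reg * np.eye(k), np.ones(k))
    c = z / z.sum()

    # Combine the later iterates x_1, ..., x_k (combining x_0, ..., x_{k-1}
    # instead is another common choice).
    return c @ X[1:]
```

Because the function only reads stored iterates, it can be applied after the fact to saved checkpoints (flattened into vectors) without changing the optimizer that produced them, which is consistent with the offline use mentioned in the abstract.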

Subject areas

Optimization and Control [math.OC]

Main file

nonlinear_acceleration_of_deep_neural_networks.pdf (416.49 KB)

Origin: Files produced by the author(s)

https://hal.science/hal-01799269

Submitted on: Thursday, May 24, 2018, 15:01:21

Last modified on: Saturday, April 20, 2024, 03:09:08

Long-term archiving on: Saturday, August 25, 2018, 14:34:53

Dates and versions

hal-01799269 , version 1 (24-05-2018)

Identifiers

HAL Id: hal-01799269, version 1

Cite

Damien Scieur, Edouard Oyallon, Alexandre d'Aspremont, Francis Bach. Nonlinear Acceleration of Deep Neural Networks. 2018. ⟨hal-01799269⟩

Collections

ENS-PARIS CNRS INRIA CVN CENTRALESUPELEC INRIA2 TDS-MACS PSL UNIV-PARIS-SACLAY GS-ENGINEERING GS-COMPUTER-SCIENCE

202 views

663 downloads