Nonlinear Acceleration of Deep Neural Networks - HAL Open Archive
Preprint, Working Paper. Year: 2018

Nonlinear Acceleration of Deep Neural Networks

Accélération Non-linéaire des Réseaux de Neurones Profonds

Abstract

Regularized nonlinear acceleration (RNA) is a generic extrapolation scheme for optimization methods, with marginal computational overhead. It aims to improve convergence using only the iterates of simple iterative algorithms. However, its theoretical analysis was so far limited to gradient descent and other single-step algorithms. Here, we adapt RNA to a much broader setting, including stochastic gradient with momentum and Nesterov's fast gradient. We use it to train deep neural networks and empirically observe that extrapolated networks are more accurate, especially in the early iterations. A straightforward application of our algorithm when training ResNet-152 on ImageNet produces a top-1 test error of 20.88%, an improvement of 0.8% over the reference classification pipeline. Furthermore, the code runs offline in this case, so it never negatively affects performance.
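The abstract describes RNA as an extrapolation scheme that combines the iterates of an existing algorithm. A minimal sketch of the core idea, assuming the standard RNA recipe (residuals between consecutive iterates, a regularized least-squares solve for mixing coefficients that sum to one); function and parameter names here are illustrative, not the authors' code:

```python
import numpy as np

def rna_extrapolate(xs, lam=1e-8):
    """Hedged sketch of regularized nonlinear acceleration (RNA).

    xs  : list of k+1 iterates x_0, ..., x_k (1-D NumPy arrays)
          produced by some underlying optimizer.
    lam : regularization weight (illustrative default).

    Returns a linear combination of the iterates whose coefficients
    minimize the norm of the combined residuals, with Tikhonov
    regularization, subject to the coefficients summing to one.
    """
    X = np.stack(xs, axis=1)             # d x (k+1) matrix of iterates
    R = X[:, 1:] - X[:, :-1]             # residuals r_i = x_{i+1} - x_i
    RR = R.T @ R                         # k x k Gram matrix
    RR = RR / np.linalg.norm(RR)         # normalize so lam is scale-free
    k = RR.shape[0]
    z = np.linalg.solve(RR + lam * np.eye(k), np.ones(k))
    c = z / z.sum()                      # coefficients sum to one
    return X[:, 1:] @ c                  # extrapolated point
```

Applied offline to snapshots of network weights, as the abstract suggests, a routine like this only post-processes stored iterates, so it cannot slow down the training run itself.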
Main file: nonlinear_acceleration_of_deep_neural_networks.pdf (416.49 KB)
Origin: files produced by the author(s)

Dates and versions

hal-01799269, version 1 (24-05-2018)

Identifiers

  • HAL Id: hal-01799269, version 1

Cite

Damien Scieur, Edouard Oyallon, Alexandre d'Aspremont, Francis Bach. Nonlinear Acceleration of Deep Neural Networks. 2018. ⟨hal-01799269⟩
202 views
663 downloads
