Adaptive scaling of the learning rate by second order automatic differentiation

Frédéric de Gournay; Alban Gossard

Preprint, Working Paper. Year: 2022


Frédéric de Gournay

Role: Author
PersonId: 7855
IdHAL: frederic-de-gournay
ORCID: 0000-0003-4721-3137
IdRef: 24492418X

Institut de Mathématiques de Toulouse UMR5219

Institut National des Sciences Appliquées - Toulouse

Alban Gossard

Role: Author
PersonId: 743847
IdHAL: albangossard
ORCID: 0000-0001-6782-5080

Institut de Mathématiques de Toulouse UMR5219

Université Toulouse III - Paul Sabatier

Abstract

In the context of the optimization of Deep Neural Networks, we propose to rescale the learning rate using a new technique of automatic differentiation. This technique relies on the computation of the curvature, a second-order quantity whose computational complexity lies between that of the gradient and that of the Hessian-vector product. If (1C,1M) denotes, respectively, the computational time and memory footprint of the gradient method, the new technique increases the overall cost to either (1.5C,2M) or (2C,1M). This rescaling has the appealing characteristic of having a natural interpretation: it allows the practitioner to choose between exploration of the parameter set and convergence of the algorithm. The rescaling is adaptive: it depends on the data and on the direction of descent. The numerical experiments highlight the different exploration/convergence regimes.
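To make the notion of curvature concrete, here is a minimal sketch in plain Python. It assumes that "curvature" means the second directional derivative of the loss along the descent direction, and it computes that quantity with second-order truncated Taylor arithmetic. The names `Jet`, `directional_derivatives`, `newton_like_step` and `toy_loss` are hypothetical and do not come from the paper. The Newton-like step is only one plausible way to use the curvature, not necessarily the authors' rescaling rule, and the sketch does not reproduce the cost figures quoted in the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Jet:
    """Truncated Taylor expansion a + b*t + c*t**2 in a scalar t."""
    a: float
    b: float = 0.0
    c: float = 0.0

    def __add__(self, other):
        o = _lift(other)
        return Jet(self.a + o.a, self.b + o.b, self.c + o.c)

    __radd__ = __add__

    def __sub__(self, other):
        o = _lift(other)
        return Jet(self.a - o.a, self.b - o.b, self.c - o.c)

    def __mul__(self, other):
        o = _lift(other)
        # Product of two polynomials, dropping terms of order t**3 and above.
        return Jet(
            self.a * o.a,
            self.a * o.b + self.b * o.a,
            self.a * o.c + self.b * o.b + self.c * o.a,
        )

    __rmul__ = __mul__


def _lift(x):
    """Promote a plain number to a constant Jet."""
    return x if isinstance(x, Jet) else Jet(float(x))


def directional_derivatives(f, theta, d):
    """Return (f(theta), slope, curvature) of t -> f(theta + t*d) at t = 0."""
    out = f([Jet(p, di) for p, di in zip(theta, d)])
    # The t**2 coefficient is half of the second derivative.
    return out.a, out.b, 2.0 * out.c


def newton_like_step(slope, curvature):
    """Step size minimizing the quadratic model along d (assumed rule)."""
    if curvature <= 0.0:
        raise ValueError("quadratic model has no minimizer along this direction")
    return -slope / curvature


def toy_loss(theta):
    """Hypothetical non-quadratic loss: (t0*t1 - 1)**2 + t0**2."""
    t0, t1 = theta
    r = t0 * t1 - 1
    return r * r + t0 * t0


if __name__ == "__main__":
    value, slope, curvature = directional_derivatives(
        toy_loss, [1.0, 2.0], [-1.0, -1.0]
    )
    print(value, slope, curvature, newton_like_step(slope, curvature))
```

A single evaluation of the loss on `Jet` inputs yields the value, the slope and the curvature along the chosen direction together, without forming the Hessian.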

Domains

Neural networks [cs.NE]

Main file

Adaptive scaling of the learning rate by second order automatic differentiation.pdf (4.34 MB)

Origin: Files produced by the author(s)

Contributor: Alban Gossard

https://hal.science/hal-03748574

Submitted on: Tuesday, 25 October 2022, 17:55:30

Last modified on: Tuesday, 16 January 2024, 16:26:50

Dates and versions

hal-03748574, version 1 (09-08-2022)

hal-03748574, version 2 (25-10-2022)

Identifiers

HAL Id: hal-03748574, version 2
arXiv: 2210.14520

Cite

Frédéric de Gournay, Alban Gossard. Adaptive scaling of the learning rate by second order automatic differentiation. 2022. ⟨hal-03748574v2⟩


Collections

UNIV-TLSE2 CNRS INSA-TOULOUSE IMT UT1-CAPITOLE GENCI INSA-GROUPE UNIV-UT3 UT3-TOULOUSEINP

105 views

44 downloads
