A Generic Approach for Escaping Saddle points

Sashank J Reddi 1 Manzil Zaheer 1 Suvrit Sra 2 Barnabas Poczos 1 Francis Bach 3 Ruslan Salakhutdinov 1 Alexander J Smola 4
3 SIERRA - Statistical Machine Learning and Parsimony
DI-ENS - Département d'informatique de l'École normale supérieure, ENS Paris - École normale supérieure - Paris, CNRS - Centre National de la Recherche Scientifique, Inria de Paris
Abstract: A central challenge in using first-order methods to optimize nonconvex problems is the presence of saddle points. First-order methods often get stuck at saddle points, which greatly degrades their performance. Typically, escaping from saddle points requires second-order methods. However, most work on second-order methods relies extensively on expensive Hessian-based computations, making them impractical in large-scale settings. To tackle this challenge, we introduce a generic framework that minimizes Hessian-based computations while provably converging to second-order critical points. Our framework carefully alternates between a first-order and a second-order subroutine, using the latter only close to saddle points, and yields convergence results competitive with the state of the art. Empirical results suggest that our strategy also performs well in practice.
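The alternation described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes plain gradient descent as the first-order subroutine and an exact eigendecomposition of the Hessian as the second-order subroutine, and the function name `escape_saddle_sketch`, the tolerances, the step size, and the toy objective are all hypothetical choices made for illustration.

```python
import numpy as np

def escape_saddle_sketch(grad, hess, x0, step=0.1, grad_tol=1e-4,
                         curv_tol=1e-3, max_iter=10_000):
    """Alternate gradient steps with occasional negative-curvature steps.

    Hypothetical illustration only: the Hessian is evaluated solely when
    the gradient is already small, i.e. near a critical point.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) > grad_tol:
            # First-order subroutine (assumed here: one gradient descent step).
            x = x - step * g
            continue
        # Gradient is small: inspect curvature (second-order subroutine).
        eigvals, eigvecs = np.linalg.eigh(hess(x))  # eigenvalues in ascending order
        if eigvals[0] >= -curv_tol:
            # No significant negative curvature: approximate
            # second-order critical point.
            return x
        v = eigvecs[:, 0]
        if g @ v > 0:
            v = -v  # pick the sign that does not increase f to first order
        x = x + step * abs(eigvals[0]) * v
    return x

# Toy objective (an assumption for this sketch):
# f(x) = 0.5*x[0]**2 - 0.5*x[1]**2 + 0.25*x[1]**4,
# with a saddle point at the origin and minima at (0, +1) and (0, -1).
def toy_grad(x):
    return np.array([x[0], -x[1] + x[1] ** 3])

def toy_hess(x):
    return np.diag([1.0, -1.0 + 3.0 * x[1] ** 2])

x_final = escape_saddle_sketch(toy_grad, toy_hess, x0=[1.0, 0.0])
```

Started from (1, 0), gradient descent alone would converge to the saddle at the origin, because the second coordinate never moves. In this sketch the curvature check is triggered only once the gradient is small, which mirrors the idea of using the second-order subroutine only close to saddle points.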
Document type:
Preprint, working paper

Contributor: Francis Bach
Submitted on: Thursday, November 30, 2017 - 07:56:23
Last modified on: Thursday, January 11, 2018 - 06:28:04

Full-text link


  • HAL Id: hal-01652150, version 1
  • arXiv: 1709.01434



Sashank J Reddi, Manzil Zaheer, Suvrit Sra, Barnabas Poczos, Francis Bach, et al. A Generic Approach for Escaping Saddle points. 2017. 〈hal-01652150〉


