Journal article. Journal of Machine Learning Research, 2014.

Adaptivity of averaged stochastic gradient descent to local strong convexity for logistic regression

Abstract

In this paper, we consider supervised learning problems such as logistic regression and study the stochastic gradient method with averaging, in the usual stochastic approximation setting where observations are used only once. We show that after $N$ iterations, with a constant step-size proportional to $1/(R^2 \sqrt{N})$, where $N$ is the number of observations and $R$ is the maximum norm of the observations, the convergence rate is always of order $O(1/\sqrt{N})$, and improves to $O(R^2 / \mu N)$, where $\mu$ is the lowest eigenvalue of the Hessian at the global optimum (when this eigenvalue is greater than $R^2/\sqrt{N}$). Since $\mu$ does not need to be known in advance, this shows that averaged stochastic gradient is adaptive to \emph{unknown local} strong convexity of the objective function. Our proof relies on the generalized self-concordance properties of the logistic loss and thus extends to all generalized linear models with uniformly bounded features.
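The method analyzed in the abstract can be sketched as follows: a single pass over the data with a constant step size $\gamma \propto 1/(R^2\sqrt{N})$, returning the running average of the iterates rather than the last one. This is a minimal illustrative sketch (function and variable names are not from the paper, and the proportionality constant `step_scale` is an assumed free parameter, set to 1 here):

```python
import numpy as np

def averaged_sgd_logistic(X, y, step_scale=1.0):
    """Single-pass averaged SGD for logistic regression with labels in {-1, +1}.

    Uses the constant step size gamma = step_scale / (R^2 * sqrt(N)),
    where N is the number of observations and R the maximum observation
    norm, and returns the average of the iterates (Polyak-Ruppert averaging).
    Illustrative sketch only, not the paper's reference implementation.
    """
    N, d = X.shape
    R = np.max(np.linalg.norm(X, axis=1))       # maximum norm of the observations
    gamma = step_scale / (R ** 2 * np.sqrt(N))  # constant step size
    w = np.zeros(d)        # current SGD iterate
    w_bar = np.zeros(d)    # running average of the iterates
    for n in range(N):
        x_n, y_n = X[n], y[n]                   # each observation is used only once
        # Gradient of the logistic loss log(1 + exp(-y w.x)) at w
        margin = y_n * x_n.dot(w)
        grad = -y_n * x_n / (1.0 + np.exp(margin))
        w -= gamma * grad
        w_bar += (w - w_bar) / (n + 1)          # online update of the average
    return w_bar
```

Note that the step size requires no knowledge of $\mu$: the averaging alone is what yields the faster $O(R^2/\mu N)$ rate when local strong convexity happens to hold.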
Main file: bach14a.pdf (327.97 KB). Origin: files produced by the author(s).

Dates and versions

hal-00804431 , version 1 (25-03-2013)
hal-00804431 , version 2 (26-10-2013)
hal-00804431 , version 3 (15-03-2014)

Identifiers

Cite

Francis Bach. Adaptivity of averaged stochastic gradient descent to local strong convexity for logistic regression. Journal of Machine Learning Research, 2014, 15, pp. 595-627. ⟨hal-00804431v3⟩