Conference papers

Tight Nonparametric Convergence Rates for Stochastic Gradient Descent under the Noiseless Linear Model

Raphaël Berthier 1, 2 Francis Bach 2, 1 Pierre Gaillard 2, 1, 3 
2 SIERRA - Statistical Machine Learning and Parsimony
DI-ENS - Département d'informatique - ENS Paris, CNRS - Centre National de la Recherche Scientifique, Inria de Paris
3 Thoth - Apprentissage de modèles à partir de données massives
Inria Grenoble - Rhône-Alpes, LJK - Laboratoire Jean Kuntzmann
Abstract : In the context of statistical supervised learning, the noiseless linear model assumes that there exists a deterministic linear relation $Y = \langle \theta_*, \Phi(U) \rangle$ between the random output $Y$ and the random feature vector $\Phi(U)$, a potentially non-linear transformation of the inputs $U$. We analyze the convergence of single-pass, fixed step-size stochastic gradient descent on the least-squares risk under this model. The convergence of the iterates to the optimum $\theta_*$ and the decay of the generalization error follow polynomial convergence rates with exponents that both depend on the regularities of the optimum $\theta_*$ and of the feature vectors $\Phi(u)$. We interpret our result in the reproducing kernel Hilbert space framework. As a special case, we analyze an online algorithm for estimating a real function on the unit interval from the noiseless observation of its value at randomly sampled points; the convergence depends on the Sobolev smoothness of the function and of a chosen kernel. Finally, we apply our analysis beyond the supervised learning setting to obtain convergence rates for the averaging process (also known as the gossip algorithm) on a graph depending on its spectral dimension.
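The single-pass, fixed step-size recursion described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the function name `sgd_noiseless`, the finite dimension, the standard Gaussian features standing in for $\Phi(U)$, the zero initialization, and the step size. With a fresh sample $x_n$ and noiseless output $y_n = \langle \theta_*, x_n \rangle$, each step applies $\theta_n = \theta_{n-1} - \gamma \, (\langle \theta_{n-1}, x_n \rangle - y_n) \, x_n$. This well-conditioned finite-dimensional toy converges quickly and does not reproduce the polynomial nonparametric rates studied in the paper.

```python
import random


def sgd_noiseless(theta_star, n_steps, step_size, seed=0):
    """Single-pass, fixed step-size SGD on the least-squares risk.

    Illustrative assumptions (not from the paper): features are standard
    Gaussian vectors in finite dimension, and the iterate starts at zero.
    Returns the final iterate and the squared distance to theta_star
    after each step.
    """
    rng = random.Random(seed)
    d = len(theta_star)
    theta = [0.0] * d
    errors = []
    for _ in range(n_steps):
        # Draw a fresh feature vector; each sample is used exactly once.
        x = [rng.gauss(0.0, 1.0) for _ in range(d)]
        # Noiseless linear model: the output is exactly <theta_star, x>.
        y = sum(ts * xi for ts, xi in zip(theta_star, x))
        # Stochastic gradient of 0.5 * (<theta, x> - y)^2 is residual * x.
        residual = sum(t * xi for t, xi in zip(theta, x)) - y
        theta = [t - step_size * residual * xi for t, xi in zip(theta, x)]
        errors.append(sum((t - ts) ** 2 for t, ts in zip(theta, theta_star)))
    return theta, errors


if __name__ == "__main__":
    theta, errors = sgd_noiseless([1.0, -2.0, 0.5, 0.0, 3.0],
                                  n_steps=2000, step_size=0.05)
    print("squared error after first step:", errors[0])
    print("squared error after last step:", errors[-1])
```

Because the observations carry no noise, the stochastic gradient vanishes at $\theta_*$, which is why a constant step size can drive the error to zero here without averaging or step-size decay.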
Contributor : Raphaël Berthier
Submitted on : Monday, October 26, 2020 - 5:20:05 PM
Last modification on : Wednesday, June 8, 2022 - 12:50:06 PM




  • HAL Id : hal-02866755, version 2
  • ARXIV : 2006.08212


Raphaël Berthier, Francis Bach, Pierre Gaillard. Tight Nonparametric Convergence Rates for Stochastic Gradient Descent under the Noiseless Linear Model. NeurIPS '20 - 34th International Conference on Neural Information Processing Systems, Dec 2020, Vancouver, Canada. pp.2576--2586. ⟨hal-02866755v2⟩


