Theory of reproducing kernels, Transactions of the American Mathematical Society, vol.68, issue.3, pp.337-404, 1950. ,
DOI : 10.1090/S0002-9947-1950-0051437-7
Adaptivity of averaged stochastic gradient descent to local strong convexity for logistic regression, J. Mach. Learn. Res., vol.15, issue.1, pp.595-627, 2014. ,
URL : https://hal.archives-ouvertes.fr/hal-00804431
Non-strongly-convex smooth stochastic approximation with convergence rate O(1/n), Advances in Neural Information Processing Systems, 2013. ,
URL : https://hal.archives-ouvertes.fr/hal-00831977
A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems, SIAM Journal on Imaging Sciences, vol.2, issue.1, pp.183-202, 2009. ,
DOI : 10.1137/080716542
An alternative point of view on Lepski's method, Lecture Notes-Monograph Series, pp.113-133, 2001. ,
The tradeoffs of large scale learning, Advances in Neural Information Processing Systems, 2008. ,
Optimal Rates for the Regularized Least-Squares Algorithm, Foundations of Computational Mathematics, vol.7, issue.3, pp.331-368, 2007. ,
DOI : 10.1007/s10208-006-0196-8
Better mini-batch algorithms via accelerated gradient methods, Advances in Neural Information Processing Systems, 2011. ,
Best Choices for Regularization Parameters in Learning Theory: On the Bias-Variance Problem, Foundations of Computational Mathematics, vol.2, issue.4, pp.413-418, 2002. ,
DOI : 10.1007/s102080010030
Model Selection for Regularized Least-Squares Algorithm in Learning Theory, Foundations of Computational Mathematics, vol.5, issue.1, pp.59-85, 2005. ,
DOI : 10.1007/s10208-004-0134-1
Averaged least-mean-squares: bias-variance trade-offs and optimal sampling distributions, Proceedings of the International Conference on Artificial Intelligence and Statistics, 2015. ,
Optimal distributed online prediction using mini-batches, J. Mach. Learn. Res., vol.13, issue.1, pp.165-202, 2012. ,
First-order methods of smooth convex optimization with inexact oracle, Mathematical Programming, vol.146, issue.1-2, pp.37-75, 2014. ,
DOI : 10.1007/s10107-013-0677-5
Non-parametric stochastic approximation with large step sizes, Annals of Statistics, 2015. ,
URL : https://hal.archives-ouvertes.fr/hal-01053831
Random Iterative Models, 1997. ,
DOI : 10.1007/978-3-662-12880-0
Regularization of inverse problems, 1996. ,
From averaging to acceleration, there is only a step-size, Proceedings of the International Conference on Learning Theory (COLT), 2015. ,
URL : https://hal.archives-ouvertes.fr/hal-01136945
Smoothing Spline ANOVA Models, 2013. ,
On the Averaged Stochastic Approximation for Linear Regression, SIAM Journal on Control and Optimization, vol.34, issue.1, pp.31-61, 1996. ,
DOI : 10.1137/S0363012992226661
A distribution-free theory of nonparametric regression, 2006. ,
DOI : 10.1007/b97848
The Elements of Statistical Learning, 2009. ,
Random Design Analysis of Ridge Regression, Foundations of Computational Mathematics, vol.14, issue.3, pp.569-600, 2014. ,
DOI : 10.1007/s10208-014-9192-1
Stochastic Approximation and Recursive Algorithms and Applications, 2003. ,
An optimal method for stochastic composite optimization, Mathematical Programming, vol.133, issue.1-2, pp.365-397, 2012. ,
DOI : 10.1007/s10107-010-0434-y
Concentration Inequalities and Model Selection, Lecture Notes in Mathematics, 2007. ,
Generalized Linear Models, Monographs on Statistics and Applied Probability, 1989. ,
Robust Stochastic Approximation Approach to Stochastic Programming, SIAM Journal on Optimization, vol.19, issue.4, pp.1574-1609, 2009. ,
DOI : 10.1137/070704277
URL : https://hal.archives-ouvertes.fr/hal-00976649
A method of solving a convex programming problem with convergence rate O(1/k^2), Soviet Mathematics Doklady, vol.27, issue.2, pp.372-376, 1983. ,
Introductory Lectures on Convex Optimization, Applied Optimization, vol.87, 2004. ,
DOI : 10.1007/978-1-4419-8853-9
Adaptive restart for accelerated gradient schemes, Foundations of Computational Mathematics, pp.1-18, 2013. ,
Some methods of speeding up the convergence of iteration methods, USSR Computational Mathematics and Mathematical Physics, vol.4, issue.5, pp.1-17, 1964. ,
DOI : 10.1016/0041-5553(64)90137-5
Introduction to Optimization, Translations Series in Mathematics and Engineering, 1987. ,
Acceleration of Stochastic Approximation by Averaging, SIAM Journal on Control and Optimization, vol.30, issue.4, pp.838-855, 1992. ,
DOI : 10.1137/0330046
A stochastic approximation method, The Annals of Mathematical Statistics, vol.22, issue.3, pp.400-407, 1951. ,
Less is More: Nyström Computational Regularization, Advances in Neural Information Processing Systems 28, 2015. ,
Learning with Kernels, 2002. ,
Stochastic convex optimization, Proceedings of the International Conference on Learning Theory (COLT), 2009. ,
Support Vector Machines, Series in Information Science and Statistics, 2008. ,
Online learning as stochastic approximation of regularization paths, IEEE Transactions on Information Theory, issue.99, pp.5716-5735, 2011. ,
Optimal Rates of Aggregation, Proceedings of the Annual Conference on Computational Learning Theory, 2003. ,
DOI : 10.1007/978-3-540-45167-9_23
URL : https://hal.archives-ouvertes.fr/hal-00104867
Introduction to Nonparametric Estimation, 2008. ,
DOI : 10.1007/b13794
Dual averaging methods for regularized stochastic learning and online optimization, J. Mach. Learn. Res., vol.11, pp.2543-2596, 2010. ,
On Early Stopping in Gradient Descent Learning, Constructive Approximation, vol.26, issue.2, pp.289-315, 2007. ,
DOI : 10.1007/s00365-006-0663-2
Online Gradient Descent Learning Algorithms, Foundations of Computational Mathematics, vol.8, issue.5, 2008. ,
DOI : 10.1007/s10208-006-0237-y
Solving large scale linear prediction problems using stochastic gradient descent algorithms, Twenty-first International Conference on Machine Learning, ICML '04, 2004. ,
DOI : 10.1145/1015330.1015332