A. Abdulle, G. Vilmart, and K. C. Zygalakis, High Order Numerical Approximation of the Invariant Measure of Ergodic SDEs, SIAM Journal on Numerical Analysis, vol.52, issue.4, pp.1600-1622, 2014.
DOI : 10.1137/130935616

URL : https://hal.archives-ouvertes.fr/hal-00858088

F. Bach, Adaptivity of averaged stochastic gradient descent to local strong convexity for logistic regression, J. Mach. Learn. Res., vol.15, issue.1, pp.595-627, 2014.
URL : https://hal.archives-ouvertes.fr/hal-00804431

F. Bach and E. Moulines, Non-strongly-convex smooth stochastic approximation with convergence rate O(1/n), Advances in Neural Information Processing Systems (NIPS), 2013.

D. Bertsekas, Nonlinear programming, Athena Scientific, 1995.

L. Bottou and O. Bousquet, The tradeoffs of large scale learning, Advances in Neural Information Processing Systems (NIPS), 2008.

C. Chen, N. Ding, and L. Carin, On the convergence of stochastic gradient MCMC algorithms with high-order integrators, NIPS, pp.2269-2277, 2015.

A. Défossez and F. Bach, Averaged least-mean-squares: bias-variance trade-offs and optimal sampling distributions, Proceedings of the International Conference on Artificial Intelligence and Statistics, 2015.

A. Dieuleveut and F. Bach, Nonparametric stochastic approximation with large step-sizes, The Annals of Statistics, vol.44, issue.4, pp.1363-1399, 2016.
DOI : 10.1214/15-AOS1391

URL : https://hal.archives-ouvertes.fr/hal-01053831

A. Dieuleveut, N. Flammarion, and F. Bach, Harder, Better, Faster, Stronger Convergence Rates for Least-Squares Regression. ArXiv e-prints, 2016.
URL : https://hal.archives-ouvertes.fr/hal-01275431

A. Durmus, U. Şimşekli, E. Moulines, R. Badeau, and G. Richard, Stochastic Gradient Richardson-Romberg Markov Chain Monte Carlo, Advances in Neural Information Processing Systems, pp.2047-2055, 2016.
URL : https://hal.archives-ouvertes.fr/hal-01354064

P. Hartman, Ordinary Differential Equations: Second Edition, Classics in Applied Mathematics. Society for Industrial and Applied Mathematics, 1982.

P. Jain, S. M. Kakade, R. Kidambi, P. Netrapalli, and A. Sidford, Parallelizing Stochastic Approximation Through Mini-Batching and Tail-Averaging. ArXiv e-prints, 2016.

P. Jain, S. M. Kakade, R. Kidambi, P. Netrapalli, and A. Sidford, Accelerating Stochastic Gradient Descent, 2017.

G. L. Jones, On the Markov chain central limit theorem, Probability Surveys, vol.1, issue.0, pp.299-320, 2004.
DOI : 10.1214/154957804100000051

URL : http://arxiv.org/abs/math/0409112

H. Kushner and G. Yin, Stochastic Approximation and Recursive Algorithms and Applications, 2003.

G. Lan, An optimal method for stochastic composite optimization, Mathematical Programming, vol.133, issue.1-2, pp.365-397, 2012.
DOI : 10.1023/A:1021814225969

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.416.1110

E. Levy, Why do partitions occur in Faà di Bruno's chain rule for higher derivatives?, 2006.

L. Ljung, G. C. Pflug, and H. Walk, Stochastic approximation and optimization of random systems. DMV Seminar, 1992.
DOI : 10.1007/978-3-0348-8609-3

J. C. Mattingly, A. M. Stuart, and D. J. Higham, Ergodicity for SDEs and approximations: locally Lipschitz vector fields and degenerate noise, Stochastic Processes and their Applications, vol.101, issue.2, pp.185-232, 2002.
DOI : 10.1016/S0304-4149(02)00150-3

URL : http://doi.org/10.1016/s0304-4149(02)00150-3

S. Meyn and R. Tweedie, Markov Chains and Stochastic Stability, 2009.

S. P. Meyn and R. L. Tweedie, Markov chains and stochastic stability, 1993.

A. Nedić and D. Bertsekas, Convergence rate of incremental subgradient algorithms, Stochastic optimization: algorithms and applications, pp.223-264, 2001.

D. Needell, R. Ward, and N. Srebro, Stochastic gradient descent, weighted sampling, and the randomized Kaczmarz algorithm, Advances in Neural Information Processing Systems 27, pp.1017-1025, 2014.
DOI : 10.1137/120889897

URL : http://arxiv.org/abs/1310.5715

A. Nemirovski, A. Juditsky, G. Lan, and A. Shapiro, Robust Stochastic Approximation Approach to Stochastic Programming, SIAM Journal on Optimization, vol.19, issue.4, pp.1574-1609, 2009.
DOI : 10.1137/070704277

URL : https://hal.archives-ouvertes.fr/hal-00976649

Y. Nesterov, Introductory Lectures on Convex Optimization: A Basic Course. Applied Optimization, 2004.
DOI : 10.1007/978-1-4419-8853-9

Y. Nesterov and J. P. Vial, Confidence level solutions for stochastic programming, Automatica, vol.44, issue.6, pp.1559-1568, 2008.
DOI : 10.1016/j.automatica.2008.01.017

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.34.5840

G. C. Pflug, Stochastic Minimization with Constant Step-Size: Asymptotic Laws, SIAM Journal on Control and Optimization, vol.24, issue.4, pp.655-666, 1986.
DOI : 10.1137/0324039

B. T. Polyak and A. B. Juditsky, Acceleration of Stochastic Approximation by Averaging, SIAM Journal on Control and Optimization, vol.30, issue.4, pp.838-855, 1992.
DOI : 10.1137/0330046

A. Rakhlin, O. Shamir, and K. Sridharan, Making Gradient Descent Optimal for Strongly Convex Stochastic Optimization. ArXiv e-prints, 2011.

H. Robbins and S. Monro, A stochastic approximation method, The Annals of Mathematical Statistics, vol.22, issue.3, pp.400-407, 1951.

D. Ruppert, Efficient estimations from a slowly convergent Robbins-Monro process, 1988.

S. Shalev-Shwartz, Y. Singer, and N. Srebro, Pegasos: Primal Estimated sub-GrAdient SOlver for SVM, Proceedings of the 24th international conference on Machine learning, ICML '07, pp.807-814, 2007.
DOI : 10.1145/1273496.1273598

S. Shalev-Shwartz, O. Shamir, N. Srebro, and K. Sridharan, Stochastic convex optimization, Proceedings of the International Conference on Learning Theory (COLT), 2009.

O. Shamir and T. Zhang, Stochastic Gradient Descent for Non-smooth Optimization: Convergence Results and Optimal Averaging Schemes, Proceedings of the 30th International Conference on Machine Learning, 2013.

differential equations. Stochastic Anal, pp.483-509, 1990.

C. Villani, Optimal transport: old and new. Grundlehren der mathematischen Wissenschaften, 2009.
DOI : 10.1007/978-3-540-71050-9

M. Welling and Y. W. Teh, Bayesian learning via Stochastic Gradient Langevin Dynamics, ICML, pp.681-688, 2011.

D. L. Zhu and P. Marcotte, Co-Coercivity and Its Role in the Convergence of Iterative Schemes for Solving Variational Inequalities, SIAM Journal on Optimization, vol.6, issue.3, pp.714-726, 1996.
DOI : 10.1137/S1052623494250415