A progressive batching L-BFGS method for machine learning. arXiv preprint, 2018.
Extrapolation methods, Applied Numerical Mathematics, vol.15, issue.2, 1994.
DOI : 10.1016/0168-9274(94)00015-8
URL : https://hal.archives-ouvertes.fr/hal-00018524
A Polynomial Extrapolation Method for Finding Limits and Antilimits of Vector Sequences, SIAM Journal on Numerical Analysis, vol.13, issue.5, pp.734-752, 1976.
DOI : 10.1137/0713060
Recent advances in deep learning for speech research at Microsoft, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, pp.8604-8608, 2013.
DOI : 10.1109/ICASSP.2013.6639345
URL : http://research.microsoft.com/pubs/188864/ICASSP-2013-OverviewMSRDeepLearning.pdf
Extrapolating to the limit of a vector sequence, Information linkage between applied mathematics and industry, pp.387-396, 1979.
Chebyshev semi-iterative methods, successive overrelaxation iterative methods, and second order Richardson iterative methods, Numerische Mathematik, vol.3, issue.1, pp.157-168, 1961.
DOI : 10.1007/BF01386014
Accurate, large minibatch SGD: Training ImageNet in 1 hour. arXiv preprint, 2017.
Design of experiments of the NIPS 2003 variable selection benchmark, 2003.
Averaging weights leads to wider optima and better generalization. arXiv preprint, 2018.
Adam: A method for stochastic optimization. arXiv preprint, 2014.
Non-asymptotic analysis of stochastic approximation algorithms for machine learning, Advances in Neural Information Processing Systems, 2011.
URL : https://hal.archives-ouvertes.fr/hal-00608041
Introductory lectures on convex optimization: A basic course, 2013.
DOI : 10.1007/978-1-4419-8853-9
Acceleration of Stochastic Approximation by Averaging, SIAM Journal on Control and Optimization, vol.30, issue.4, pp.838-855, 1992.
DOI : 10.1137/0330046
On the convergence of Adam and beyond, International Conference on Learning Representations, 2018.
Nonlinear acceleration of stochastic algorithms, Advances in Neural Information Processing Systems, pp.3985-3994, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01618379
Regularized nonlinear acceleration, Advances in Neural Information Processing Systems, pp.712-720, 2016.
URL : https://hal.archives-ouvertes.fr/hal-01384682
Nonlinear acceleration of CNNs, Workshop track of International Conference on Learning Representations (ICLR), 2018.
URL : https://hal.archives-ouvertes.fr/hal-01805251
Lecture 6.5 - RMSProp: Divide the gradient by a running average of its recent magnitude, COURSERA: Neural Networks for Machine Learning, pp.26-31, 2012.
Anderson Acceleration for Fixed-Point Iterations, SIAM Journal on Numerical Analysis, vol.49, issue.4, pp.1715-1735, 2011.
DOI : 10.1137/10078356X
URL : http://users.wpi.edu/%7Ewalker/Papers/Walker-Ni%2CSINUM%2CV49%2C1715-1735.pdf
Ensembling neural networks: Many could be better than all, Artificial Intelligence, vol.137, issue.1-2, pp.239-263, 2002.
DOI : 10.1016/S0004-3702(02)00190-X