A. Agarwal, S. Negahban, and M. J. Wainwright, Stochastic optimization and sparse statistical recovery: An optimal algorithm for high dimensions, 2014 48th Annual Conference on Information Sciences and Systems (CISS), pp.1538-1546, 2012.
DOI : 10.1109/CISS.2014.6814157
URL : http://arxiv.org/abs/1207.4421

J. Audibert, Progressive mixture rules are deviation suboptimal, Advances in Neural Information Processing Systems, pp.41-48, 2008.

F. Bunea, A. Tsybakov, and M. Wegkamp, Sparsity oracle inequalities for the Lasso, Electronic Journal of Statistics, vol.1, issue.0, pp.169-194, 2007.
DOI : 10.1214/07-EJS008
URL : https://hal.archives-ouvertes.fr/hal-00160646

O. Catoni, Universal aggregation rules with exact bias bounds, Preprint 510, 1999.

N. Cesa-Bianchi and G. Lugosi, Prediction, learning, and games, 2006.
DOI : 10.1017/CBO9780511546921

N. Cesa-Bianchi, Y. Mansour, and G. Stoltz, Improved second-order bounds for prediction with expert advice, Machine Learning, pp.321-352, 2007.
DOI : 10.1007/s10994-006-5001-7
URL : https://hal.archives-ouvertes.fr/hal-00007539

J. Duchi, S. Shalev-Shwartz, Y. Singer, and T. Chandra, Efficient projections onto the $\ell_1$-ball for learning in high dimensions, Proceedings of the 25th international conference on Machine learning, ICML '08, pp.272-279, 2008.
DOI : 10.1145/1390156.1390191

J. C. Duchi, S. Shalev-Shwartz, Y. Singer, and A. Tewari, Composite objective mirror descent, COLT, pp.14-26, 2010.

D. Foster, S. Kale, and H. Karloff, Online sparse linear regression, Conference on Learning Theory, pp.960-970, 2016.

D. J. Foster, S. Kale, M. Mohri, and K. Sridharan, Parameter-free online learning via model selection, Advances in Neural Information Processing Systems 30, pp.6020-6030, 2017.

Y. Freund and R. E. Schapire, A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting, Journal of Computer and System Sciences, vol.55, issue.1, pp.119-139, 1997.
DOI : 10.1006/jcss.1997.1504
URL : https://doi.org/10.1006/jcss.1997.1504

P. Gaillard and O. Wintenberger, Sparse Accelerated Exponential Weights, 20th International Conference on Artificial Intelligence and Statistics (AISTATS), 2017.
URL : https://hal.archives-ouvertes.fr/hal-01376808

S. Gerchinovitz, Prediction of individual sequences and prediction in the statistical framework: some links around sparse regression and aggregation techniques, 2011.
URL : https://hal.archives-ouvertes.fr/tel-00653550

E. Hazan, Introduction to Online Convex Optimization, Foundations and Trends in Optimization, vol.2, issue.3-4, pp.157-325, 2016.
DOI : 10.1561/2400000013

E. Hazan, A. Agarwal, and S. Kale, Logarithmic regret algorithms for online convex optimization, Machine Learning, pp.169-192, 2007.
DOI : 10.1007/11776420_37
URL : http://www.cs.princeton.edu/~satyen/papers/HKKA2006.pdf

S. Kale, Z. Karnin, T. Liang, and D. Pál, Adaptive feature selection: Computationally efficient online sparse linear regression under RIP, 2017.

J. Kivinen and M. K. Warmuth, Exponentiated Gradient versus Gradient Descent for Linear Predictors, Information and Computation, vol.132, issue.1, pp.1-63, 1997.
DOI : 10.1006/inco.1996.2612
URL : https://doi.org/10.1006/inco.1996.2612

W. M. Koolen and T. van Erven, Second-order quantile methods for experts and combinatorial games, COLT, pp.1155-1175, 2015.

W. M. Koolen, P. Grünwald, and T. van Erven, Combining adversarial guarantees and stochastic fast rates in online learning, Advances in Neural Information Processing Systems, pp.4457-4465, 2016.

J. Langford, L. Li, and T. Zhang, Sparse online learning via truncated gradient, Journal of Machine Learning Research, vol.10, pp.777-801, 2009.

N. Littlestone and M. K. Warmuth, The weighted majority algorithm, Information and Computation, pp.212-261, 1994.
DOI : 10.1006/inco.1994.1009
URL : https://doi.org/10.1006/inco.1994.1009

N. A. Mehta, Fast rates with high probability in exp-concave statistical learning, arXiv preprint, 2016.

A. Rakhlin and K. Sridharan, Online nonparametric regression with general loss functions, arXiv preprint, 2015.

P. Rigollet and A. Tsybakov, Exponential screening and optimal rates of sparse estimation, The Annals of Statistics, pp.731-771, 2011.
DOI : 10.1214/10-aos854
URL : https://hal.archives-ouvertes.fr/hal-00606059

V. Roulet and A. d'Aspremont, Sharpness, restart and acceleration, Advances in Neural Information Processing Systems, pp.1119-1129, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01474362

J. Steinhardt, S. Wager, and P. Liang, The statistics of streaming sparse regression, arXiv preprint arXiv:1412.4182, 2014.

T. van Erven, P. D. Grünwald, N. A. Mehta, M. D. Reid, and R. C. Williamson, Fast rates in statistical and online learning, Journal of Machine Learning Research, vol.16, pp.1793-1861, 2015.

V. Vovk, A Game of Prediction with Expert Advice, Journal of Computer and System Sciences, vol.56, issue.2, pp.153-173, 1998.
DOI : 10.1006/jcss.1997.1556
URL : http://www.cs.rhbnc.ac.uk/research/compint/areas/comp_learn/aa/rob.ps

V. G. Vovk, Aggregating strategies, Proc. of Computational Learning Theory, 1990.
DOI : 10.1016/B978-1-55860-146-8.50032-1

M. J. Wainwright, Sharp Thresholds for High-Dimensional and Noisy Sparsity Recovery Using $\ell _{1}$-Constrained Quadratic Programming (Lasso), IEEE Transactions on Information Theory, vol.55, issue.5, pp.2183-2202, 2009.
DOI : 10.1109/TIT.2009.2016018

O. Wintenberger, Optimal learning with Bernstein online aggregation, Extended version available at arXiv:1404.1356, 2014.
DOI : 10.1007/s10994-016-5592-6
URL : https://hal.archives-ouvertes.fr/hal-00973918

L. Xiao, Dual averaging methods for regularized stochastic learning and online optimization, Journal of Machine Learning Research, vol.11, pp.2543-2596, 2010.

Y. Yang, Combining forecasting procedures: some theoretical results, Econometric Theory, vol.20, issue.01, pp.176-222, 2004.
DOI : 10.2307/2344546
URL : http://www.public.iastate.edu/~yyang/papers/combineforecast.ps

Y. Zhang, M. J. Wainwright, and M. I. Jordan, Lower bounds on the performance of polynomial-time algorithms for sparse linear regression, Conference on Learning Theory, pp.921-948, 2014.

M. Zinkevich, Online convex programming and generalized infinitesimal gradient ascent, Proceedings of the 20th International Conference on Machine Learning, 2003.

S. Łojasiewicz, Une propriété topologique des sous-ensembles analytiques réels, Les équations aux dérivées partielles, pp.87-89, 1963.

S. Łojasiewicz, Sur la géométrie semi- et sous-analytique, Annales de l'Institut Fourier, pp.1575-1595, 1993.