J. Boyan, Least-squares temporal difference learning, Proceedings of the Sixteenth International Conference on Machine Learning, pp.49-56, 1999.

S. Bradtke and A. Barto, Linear least-squares algorithms for temporal difference learning, Machine Learning, pp.33-57, 1996.

E. Candes and T. Tao, Decoding by Linear Programming, IEEE Transactions on Information Theory, vol.51, issue.12, pp.4203-4215, 2005.
DOI : 10.1109/TIT.2005.858979

A. M. Farahmand, M. Ghavamzadeh, C. Szepesvári, and S. Mannor, Regularized policy iteration, Proceedings of Advances in Neural Information Processing Systems 21, pp.441-448, 2008.

A. M. Farahmand, M. Ghavamzadeh, C. Szepesvári, and S. Mannor, Regularized Fitted Q-Iteration for planning in continuous-space Markovian decision problems, 2009 American Control Conference, pp.725-730, 2009.
DOI : 10.1109/ACC.2009.5160611

M. Ghavamzadeh, A. Lazaric, O. Maillard, M. , and R. , LSTD with random projections, Proceedings of Advances in Neural Information Processing Systems 23, pp.721-729, 2010.
URL : https://hal.archives-ouvertes.fr/hal-00943120

T. Hastie, R. Tibshirani, and J. Friedman, The Elements of Statistical Learning: Data Mining, Inference , and Prediction, 2001.

J. Johns, C. Painter-wakefield, and R. Parr, Linear complementarity for regularized policy evaluation and improvement, Proceedings of Advances in Neural Information Processing Systems 23, pp.1009-1017, 2010.

Z. Kolter and A. Ng, Regularization and feature selection in least-squares temporal difference learning, Proceedings of the 26th Annual International Conference on Machine Learning, ICML '09, pp.521-528, 2009.
DOI : 10.1145/1553374.1553442

A. Lazaric, M. Ghavamzadeh, M. , and R. , Finitesample analysis of LSTD, Proceedings of the Twenty-Seventh International Conference on Machine Learning, pp.615-622, 2010.
URL : https://hal.archives-ouvertes.fr/inria-00482189

M. Loth, M. Davy, and P. Preux, Sparse Temporal Difference Learning Using LASSO, 2007 IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning, pp.352-359, 2007.
DOI : 10.1109/ADPRL.2007.368210

URL : https://hal.archives-ouvertes.fr/inria-00117075

S. Mallat, A Wavelet Tour of Signal Processing: Wavelet Analysis & Its Applications, 1999.

P. Massart and C. Meynet, An l1-oracle inequality for Lasso, 2010.
URL : https://hal.archives-ouvertes.fr/inria-00506446

M. Petrik, G. Taylor, R. Parr, and S. Zilberstein, Feature selection using regularization in approximate linear programs for Markov decision processes, Proceedings of the Twenty-Seventh International Conference on Machine Learning, pp.871-878, 2010.

R. Sutton and A. Barto, Reinforcement Learning: An Introduction, IEEE Transactions on Neural Networks, vol.9, issue.5, 1998.
DOI : 10.1109/TNN.1998.712192

S. Van-de-geer and P. Bühlmann, On the conditions used to prove oracle results for the Lasso, Electronic Journal of Statistics, vol.3, issue.0, pp.1360-1392, 2009.
DOI : 10.1214/09-EJS506