L. Breiman, J. Friedman, R. Olshen, and C. Stone, Classification and regression trees, 1984.

H. Daumé III and D. Marcu, Learning as search optimization, Proceedings of the 22nd international conference on Machine learning, ICML '05, pp.169-176, 2005.
DOI : 10.1145/1102351.1102373

G. Dulac-Arnold, L. Denoyer, and P. Gallinari, Text Classification: A Sequential Reading Approach, Proceedings of ECIR, Lecture Notes in Computer Science, pp.411-423, 2011.
DOI : 10.1016/j.patrec.2010.02.015
URL : https://hal.archives-ouvertes.fr/inria-00607185

G. Dulac-Arnold, L. Denoyer, P. Preux, and P. Gallinari, Fast Reinforcement Learning with Large Action Sets Using Error-Correcting Output Codes for MDP Factorization, Proc. of ECML, 2012.
DOI : 10.1007/978-3-642-33486-3_12
URL : https://hal.archives-ouvertes.fr/hal-00747729

B. Efron, T. Hastie, I. Johnstone, and R. Tibshirani, Least angle regression, Annals of Statistics, vol.32, issue.2, 2004.

R. Fan, K. Chang, C. Hsieh, X. Wang, and C. Lin, LIBLINEAR: a library for large linear classification, Journal of Machine Learning Research, vol.9, pp.1871-1874, 2008.

A. Frank and A. Asuncion, UCI machine learning repository, 2010.

R. Gaudel and M. Sebag, Feature selection as a one-player game, Proceedings of ICML, 2010.
URL : https://hal.archives-ouvertes.fr/inria-00484049

S. Girgin and P. Preux, Feature Discovery in Reinforcement Learning Using Genetic Programming, Proceedings of European conference on genetic programming, 2008.
DOI : 10.1007/978-3-540-78671-9_19
URL : https://hal.archives-ouvertes.fr/hal-00826056

R. Greiner, Learning cost-sensitive active classifiers, Artificial Intelligence, vol.139, issue.2, pp.137-174, 2002.
DOI : 10.1016/S0004-3702(02)00209-6

I. Guyon and A. Elisseeff, An introduction to variable and feature selection, Journal of Machine Learning Research, vol.3, issue.78, pp.1157-1182, 2003.

I. Guyon, S. Gunn, and A. Ben-Hur, Result analysis of the NIPS 2003 feature selection challenge, Proceedings of NIPS, 2005.

S. Har-Peled, D. Roth, and D. Zimak, Constraint Classification: A New Approach to Multiclass Classification, Proceedings of NIPS, 2002.
DOI : 10.1007/3-540-36169-3_29

J. Huang, T. Zhang, and D. Metaxas, Learning with structured sparsity, Proceedings of the 26th Annual International Conference on Machine Learning, ICML '09, 2009.
DOI : 10.1145/1553374.1553429

R. Jenatton, J. Y. Audibert, and F. Bach, Structured variable selection with sparsity-inducing norms, Journal of Machine Learning Research, vol.12, pp.2777-2824, 2011.
URL : https://hal.archives-ouvertes.fr/inria-00377732

S. Ji and L. Carin, Cost-sensitive feature acquisition and classification, Pattern Recognition, vol.40, issue.5, pp.1474-1485, 2007.
DOI : 10.1016/j.patcog.2006.11.008

P. H. Kanani and A. K. McCallum, Selecting actions for resource-bounded information extraction using reinforcement learning, Proceedings of the fifth ACM international conference on Web search and data mining, WSDM '12, pp.253-262, 2012.
DOI : 10.1145/2124295.2124328

A. Kapoor and R. Greiner, Learning and Classifying Under Hard Budgets, Proceedings of ECML, pp.170-181, 2005.
DOI : 10.1007/11564096_20

M. G. Lagoudakis and R. Parr, Reinforcement learning as classification: leveraging modern classifiers, Proceedings of ICML, 2003.

A. Lazaric, M. Ghavamzadeh, and R. Munos, Analysis of a classification-based policy iteration algorithm, Proceedings of ICML, pp.607-614, 2010.
URL : https://hal.archives-ouvertes.fr/inria-00482065

Y. LeCun, L. Bottou, and Y. Bengio, Gradient-based learning applied to document recognition, Proceedings of the IEEE, pp.2278-2324, 1998.
DOI : 10.1109/5.726791

J. Louradour and C. Kermorvant, Sample-Dependent Feature Selection for Faster Document Image Categorization, 2011 International Conference on Document Analysis and Recognition, pp.309-313, 2011.
DOI : 10.1109/ICDAR.2011.70

F. Maes, L. Denoyer, and P. Gallinari, Structured prediction with reinforcement learning, Machine Learning, vol.77, issue.2-3, pp.271-301, 2009.
DOI : 10.1007/s10994-009-5140-8
URL : https://hal.archives-ouvertes.fr/hal-01172474

B. Póczos, Y. Abbasi-Yadkori, C. Szepesvári, R. Greiner, and N. Sturtevant, Learning when to stop thinking and do something!, Proceedings of the 26th Annual International Conference on Machine Learning, ICML '09, 2009.
DOI : 10.1145/1553374.1553480

M. L. Puterman, Markov decision processes: discrete stochastic dynamic programming, New York, 1994.
DOI : 10.1002/9780470316887

J. Quinlan, C4.5: programs for machine learning, 1993.

T. Rückstieß, C. Osendorfer, and P. van der Smagt, Sequential feature selection for classification, Australasian conference on artificial intelligence, 2011.

R. Sutton and A. Barto, Reinforcement Learning: An Introduction, MIT Press, 1998.

R. Tibshirani, Regression shrinkage and selection via the lasso, Journal of the Royal Statistical Society. Series B, vol.58, pp.267-288, 1996.

P. Turney, Cost-sensitive classification: empirical evaluation of a hybrid genetic decision tree induction algorithm, The Journal of Artificial Intelligence Research, vol.2, pp.369-409, 1995.

M. Yuan and Y. Lin, Model selection and estimation in regression with grouped variables, Journal of the Royal Statistical Society: Series B (Statistical Methodology), vol.68, issue.1, 2006.
DOI : 10.1198/016214502753479356

H. Zou and T. Hastie, Regularization and variable selection via the elastic net, Journal of the Royal Statistical Society. Series B. Statistical Methodology, vol.67, issue.2, 2005.