Fitted Q-iteration in continuous action-space MDPs, Proceedings of NIPS, pp.9-16, 2007.
URL : https://hal.archives-ouvertes.fr/inria-00185311
Neuro-Dynamic Programming, Athena Scientific, 1996.
(Approximate) iterated successive approximations algorithm for sequential decision processes, Annals of Operations Research, vol.3, issue.3, pp.1-12, 2012.
DOI : 10.1007/s10479-012-1073-x
Tree-based batch mode reinforcement learning, Journal of Machine Learning Research, vol.6, pp.503-556, 2005.
Error propagation for approximate policy and value iteration, Proceedings of NIPS, pp.568-576, 2010.
URL : https://hal.archives-ouvertes.fr/hal-00830154
Approximate policy iteration with a policy language bias: Solving relational Markov decision processes, Journal of Artificial Intelligence Research, vol.25, pp.75-118, 2006.
Classification-based policy iteration with a critic, Proceedings of ICML, pp.1049-1056, 2011.
URL : https://hal.archives-ouvertes.fr/hal-00590972
Reinforcement learning as classification: Leveraging modern classifiers, Proceedings of ICML, pp.424-431, 2003.
Analysis of a classification-based policy iteration algorithm, Proceedings of ICML, pp.607-614, 2010.
URL : https://hal.archives-ouvertes.fr/inria-00482065
Error bounds for approximate policy iteration, Proceedings of ICML, pp.560-567, 2003.
Performance Bounds in $L_p$-norm for Approximate Value Iteration, SIAM Journal on Control and Optimization, vol.46, issue.2, pp.541-561, 2007.
DOI : 10.1137/040614384
Finite-time bounds for fitted value iteration, Journal of Machine Learning Research, vol.9, pp.815-857, 2008.
URL : https://hal.archives-ouvertes.fr/inria-00120882
Modified Policy Iteration Algorithms for Discounted Markov Decision Problems, Management Science, vol.24, issue.11, 1978.
DOI : 10.1287/mnsc.24.11.1127
Approximate modified policy iteration.
URL : https://hal.archives-ouvertes.fr/hal-00758882
Reinforcement Learning Algorithms for MDPs, Wiley Encyclopedia of Operations Research, 2010.
DOI : 10.1002/9780470400531.eorms0714
Performance bound for approximate optimistic policy iteration, 2010.
URL : https://hal.archives-ouvertes.fr/inria-00480952