Fitted-Q Iteration in Continuous Action-Space MDPs, Proc. of NIPS, pp.9-16, 2008.
URL : https://hal.archives-ouvertes.fr/inria-00185311
Rational and Convergent Learning in Stochastic Games, Proc. of IJCAI, pp.1021-1026, 2001.
Classification and Regression Trees, 1984.
A Comprehensive Survey of Multiagent Reinforcement Learning, IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), vol.38, issue.2, pp.156-172, 2008.
DOI : 10.1109/TSMCC.2007.913919
Tree-Based Batch Mode Reinforcement Learning, Journal of Machine Learning Research, pp.503-556, 2005.
Error Propagation for Approximate Policy and Value Iteration, Proc. of NIPS, pp.568-576, 2010.
URL : https://hal.archives-ouvertes.fr/hal-00830154
Classification-Based Policy Iteration with a Critic, Proc. of ICML, pp.1049-1056, 2011.
URL : https://hal.archives-ouvertes.fr/hal-00590972
Strategy Iteration Is Strongly Polynomial for 2-Player Turn-Based Stochastic Games with a Constant Discount Factor, Journal of the ACM, vol.60, issue.1, p.1, 2013.
DOI : 10.1145/2432622.2432623
Nash Q-Learning for General-Sum Stochastic Games, Journal of Machine Learning Research, vol.4, pp.1039-1069, 2003.
A New Polynomial-time Algorithm for Linear Programming, Proc. of ACM Symposium on Theory of Computing, pp.302-311, 1984.
Least-squares policy iteration, Journal of Machine Learning Research, pp.1107-1149, 2003.
Reinforcement Learning as Classification: Leveraging Modern Classifiers, Proc. of ICML, pp.424-431, 2003.
Value function approximation in zero-sum Markov games, Proc. of UAI, pp.283-292, 2002.
Analysis of a Classification-Based Policy Iteration Algorithm, Proc. of ICML, pp.607-614, 2010.
URL : https://hal.archives-ouvertes.fr/inria-00482065
Markov games as a framework for multi-agent reinforcement learning, Proc. of ICML, pp.157-163, 1994.
DOI : 10.1016/B978-1-55860-335-6.50027-1
Learning strategies in games by anticipation, Proc. of IJCAI, pp.698-707, 1997.
Finite-time bounds for fitted value iteration, Journal of Machine Learning Research, vol.9, pp.815-857, 2008.
URL : https://hal.archives-ouvertes.fr/inria-00120882
Stochastic Shortest Path Games, SIAM Journal on Control and Optimization, vol.37, issue.3, 1997.
DOI : 10.1137/S0363012996299557
Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.
DOI : 10.1002/9780470316887
On the Use of Non-Stationary Policies for Stationary Infinite-Horizon Markov Decision Processes, Proc. of NIPS, pp.1826-1834, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00758809
Approximate Modified Policy Iteration, Proc. of ICML, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00758882
Stochastic Games, Proceedings of the National Academy of Sciences, vol.39, issue.10, p.1095, 1953.
DOI : 10.1073/pnas.39.10.1095
Discounted Markov games: Generalized policy iteration method, Journal of Optimization Theory and Applications, vol.30, issue.1, pp.125-138, 1978.
DOI : 10.1007/BF00933260
Theory of Games and Economic Behavior, 1947.