M. E. Maceira, V. S. Duarte, D. D. Penna, L. A. Moraes, and A. C. Melo, Ten years of application of stochastic dual dynamic programming in official and agent studies in Brazil - Description of the NEWAVE program, 2008.

L. Kocsis and C. Szepesvári, Bandit Based Monte-Carlo Planning, Lecture Notes in Computer Science, vol. 4212, pp. 282-293, 2006.
DOI : 10.1007/11871842_29
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.102.1296

S. Gelly and D. Silver, Combining online and offline knowledge in UCT, Proceedings of the 24th international conference on Machine learning, ICML '07, 2007.
DOI : 10.1145/1273496.1273531
URL : https://hal.archives-ouvertes.fr/inria-00164003

S. Billouet, J. Hoock, C. Lee, O. Teytaud, and S. Yen, 9x9 Go as black with Komi 7.5: At last some games won against top players in the disadvantageous situation, 2009.
DOI : 10.3233/icg-2009-32412
URL : https://hal.archives-ouvertes.fr/inria-00528770

P. Rolet, M. Sebag, and O. Teytaud, Boosting Active Learning to Optimality: A Tractable Monte-Carlo, Billiard-Based Algorithm, 2009.
DOI : 10.1007/978-3-642-04174-7_20
URL : https://hal.archives-ouvertes.fr/inria-00433866