Finite-time analysis of the multiarmed bandit problem, Machine Learning, vol.47, issue.2/3, pp.235-256, 2002.
DOI : 10.1023/A:1013689704352
The Nonstochastic Multiarmed Bandit Problem, SIAM Journal on Computing, vol.32, issue.1, pp.48-77, 2002.
DOI : 10.1137/S0097539701398375
Convergence of Probability Measures, 1968.
DOI : 10.1002/9780470316962
Online optimization in X-armed bandits, Proceedings of the 23rd Advances on Neural Information Processing Systems (NIPS), pp.201-208, 2009.
URL : https://hal.archives-ouvertes.fr/inria-00329797
Bandit algorithms for tree search, Proceedings of the 23rd Conference on Uncertainty in Artificial Intelligence (UAI), pp.67-74, 2007.
URL : https://hal.archives-ouvertes.fr/inria-00150207
Combinatorial Methods in Density Estimation, 2001.
DOI : 10.1007/978-1-4613-0125-7
PAC Bounds for Multi-armed Bandit and Markov Decision Processes, Proceedings of the 15th Annual Conference on Computational Learning Theory (COLT), pp.255-270, 2002.
DOI : 10.1007/3-540-45435-7_18
Modification of UCT with patterns in Monte-Carlo Go, 2006.
URL : https://hal.archives-ouvertes.fr/inria-00117266
Probability Inequalities for Sums of Bounded Random Variables, Journal of the American Statistical Association, vol.58, issue.301, pp.13-30, 1963.
DOI : 10.1214/aoms/1177730491
Nearly tight bounds for the continuum-armed bandit problem, Proceedings of the 18th Advances on Neural Information Processing Systems (NIPS), pp.697-704, 2004.
Sharp Dichotomies for Regret Minimization in Metric Spaces, Proceedings of the ACM-SIAM Symposium on Discrete Algorithms (SODA), pp.827-846, 2010.
DOI : 10.1137/1.9781611973075.68
Bandit Based Monte-Carlo Planning, Proceedings of the 15th European Conference on Machine Learning (ECML), pp.282-293, 2006.
DOI : 10.1007/11871842_29
Asymptotically efficient adaptive allocation rules, Advances in Applied Mathematics, vol.6, issue.1, pp.4-22, 1985.
DOI : 10.1016/0196-8858(85)90002-8
The budgeted multi-armed bandit problem, Open problems session, Proceedings of the 17th Annual Conference on Computational Learning Theory (COLT), pp.643-645, 2004.
The sample complexity of exploration in the multi-armed bandit problem, Journal of Machine Learning Research, vol.5, pp.623-648, 2004.
On the method of bounded differences, Surveys in Combinatorics, pp.148-188, 1989.
DOI : 10.1017/CBO9781107359949.008
Some aspects of the sequential design of experiments, Bulletin of the American Mathematical Society, vol.58, issue.5, pp.527-535, 1952.
DOI : 10.1090/S0002-9904-1952-09620-8
Eleven tests needed for a recommendation, 2006.