Continuous Upper Confidence Trees

Adrien Couëtoux 1 Jean-Baptiste Hoock 1, 2 Nataliya Sokolovska 1, 2 Olivier Teytaud 1, 2 Nicolas Bonnard 3
2 TAO - Machine Learning and Optimisation
CNRS - Centre National de la Recherche Scientifique : UMR8623, Inria Saclay - Ile de France, UP11 - Université Paris-Sud - Paris 11, LRI - Laboratoire de Recherche en Informatique
Abstract: Upper Confidence Trees are a very efficient tool for solving Markov Decision Processes. Originating in difficult games such as Go, they are surprisingly efficient in high-dimensional problems in particular. They are known to adapt to continuous domains in some cases, in particular continuous action spaces. We present here an extension of Upper Confidence Trees to continuous stochastic problems. We (i) exhibit a deceptive problem on which the classical Upper Confidence Tree approach fails, even with arbitrarily large computational power and with progressive widening; (ii) propose an improvement, termed double-progressive widening, which extends classical progressive widening and manages the compromise between variance (we want infinitely many simulations for each action/state) and bias (we want sufficiently many nodes to avoid a bias introduced by the first nodes); and (iii) discuss its consistency and show experimentally that it performs well on the deceptive problem and on experimental benchmarks. We conjecture that the double-progressive widening trick can be used for other algorithms as well, as a general tool for ensuring a good bias/variance compromise in search algorithms.
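The widening idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the commonly used budget of about C * n^alpha children after n visits, and the names `allowed_children`, `widen_or_reuse`, `double_pw_step`, the constants `c` and `alpha`, and the uniform reuse rule are all hypothetical choices made for illustration. A real Upper Confidence Tree would pick among existing actions with an upper confidence bound instead of uniformly at random.

```python
import math
import random


def allowed_children(visits, c=1.0, alpha=0.5):
    """Widening budget (assumed form): max distinct children after `visits` visits."""
    return math.ceil(c * visits ** alpha)


class Node:
    def __init__(self):
        self.visits = 0
        self.children = {}  # key (action or sampled outcome) -> child Node


def widen_or_reuse(node, sample_new, rng, c=1.0, alpha=0.5):
    """Count one visit, then add a new child if the budget allows, else reuse one."""
    node.visits += 1
    if len(node.children) < allowed_children(node.visits, c, alpha):
        key = sample_new()
        node.children.setdefault(key, Node())
        return key
    # Placeholder reuse rule (assumption): uniform choice among existing children.
    return rng.choice(list(node.children))


def double_pw_step(state_node, sample_action, sample_outcome, rng):
    """Apply widening twice: once over actions, once over random outcomes."""
    action = widen_or_reuse(state_node, sample_action, rng)
    action_node = state_node.children[action]
    outcome = widen_or_reuse(action_node, lambda: sample_outcome(action), rng)
    return action, outcome


rng = random.Random(0)
root = Node()
for _ in range(100):
    double_pw_step(
        root,
        sample_action=lambda: rng.uniform(-1.0, 1.0),       # continuous action
        sample_outcome=lambda a: a + rng.gauss(0.0, 1.0),   # stochastic transition
        rng=rng,
    )
print(len(root.children), "actions after", root.visits, "visits")
```

Because the budget grows sublinearly, each retained action and each retained outcome keeps receiving simulations (limiting variance) while new ones are still added over time (limiting the bias from the first nodes sampled).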
Document type:
Conference paper
LION'11: Proceedings of the 5th International Conference on Learning and Intelligent OptimizatioN, Jan 2011, Italy. pp.TBA, 2011
Cited literature [14 references]

https://hal.archives-ouvertes.fr/hal-00542673
Contributor: Nataliya Sokolovska
Submitted on: Sunday, 24 July 2011 - 12:54:05
Last modified on: Monday, 9 April 2018 - 12:20:03
Document(s) archived on: Sunday, 4 December 2016 - 09:04:16

File

c0mcts.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-00542673, version 2

Citation

Adrien Couëtoux, Jean-Baptiste Hoock, Nataliya Sokolovska, Olivier Teytaud, Nicolas Bonnard. Continuous Upper Confidence Trees. LION'11: Proceedings of the 5th International Conference on Learning and Intelligent OptimizatioN, Jan 2011, Italy. pp.TBA, 2011. 〈hal-00542673v2〉

Metrics

Record views

653

File downloads

1069