Conference Paper. Year: 2011

Continuous Rapid Action Value Estimates

Abstract

In the last decade, Monte-Carlo Tree Search (MCTS) has revolutionized the domain of large-scale Markov Decision Process problems. MCTS most often uses the Upper Confidence Tree (UCT) algorithm to handle the exploration versus exploitation trade-off, while a few heuristics are used to guide exploration in large search spaces. Among these heuristics is Rapid Action Value Estimate (RAVE). This paper extends the RAVE heuristic to continuous action and state spaces. The approach is experimentally validated on two artificial benchmark problems: the treasure hunt game and a real-world energy management problem.
Main file
couetoux.pdf (187.05 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-00642459, version 1 (23-11-2011)

Identifiers

  • HAL Id: hal-00642459, version 1

Cite

Adrien Couetoux, Mario Milone, Matyas Brendel, Hassen Doghmen, Michèle Sebag, et al. Continuous Rapid Action Value Estimates. The 3rd Asian Conference on Machine Learning (ACML 2011), Nov 2011, Taoyuan, Taiwan. pp. 19-31. ⟨hal-00642459⟩