Continuous Rapid Action Value Estimates

Adrien Couetoux; Mario Milone; Matyas Brendel; Hassen Doghmen; Michèle Sebag; Olivier Teytaud

Conference paper, Year: 2011

Adrien Couetoux

Role: Author
PersonId: 910214

Laboratoire de Recherche en Informatique

Mario Milone

Role: Author

Laboratoire de Recherche en Informatique

Matyas Brendel

Role: Author

Machine Learning and Optimisation

Hassen Doghmen

Role: Author

Machine Learning and Optimisation

Michèle Sebag

Role: Author
PersonId: 836537

Machine Learning and Optimisation

Laboratoire de Recherche en Informatique

Olivier Teytaud

Role: Author
PersonId: 581
IdHAL: olivier-teytaud
IdRef: 05971008X

Laboratoire de Recherche en Informatique

Machine Learning and Optimisation

Abstract

In the last decade, Monte-Carlo Tree Search (MCTS) has revolutionized the domain of large-scale Markov Decision Process problems. MCTS most often uses the Upper Confidence Tree (UCT) algorithm to handle the exploration versus exploitation trade-off, while a few heuristics guide the exploration in large search spaces. Among these heuristics is Rapid Action Value Estimate (RAVE). This paper extends the RAVE heuristic to continuous action and state spaces. The approach is experimentally validated on two benchmark problems: the artificial treasure hunt game and a real-world energy management problem.

Keywords

Rapid Action-Value Estimates; continuous domains; reinforcement learning

Domains

Optimization and Control [math.OC]

Main file

couetoux.pdf (187.05 KB)

Origin: Files produced by the author(s)

Contributor: Olivier Teytaud

https://inria.hal.science/hal-00642459

Submitted on: Wednesday, November 23, 2011, 03:31:28

Last modified on: Monday, April 15, 2024, 18:04:11

Long-term archiving on: Friday, November 16, 2012, 11:51:15

Dates and versions

hal-00642459, version 1 (23-11-2011)

Identifiers

HAL Id: hal-00642459, version 1

Cite

Adrien Couetoux, Mario Milone, Matyas Brendel, Hassen Doghmen, Michèle Sebag, et al. Continuous Rapid Action Value Estimates. The 3rd Asian Conference on Machine Learning (ACML2011), Nov 2011, Taoyuan, Taiwan. pp. 19-31. ⟨hal-00642459⟩

Collections

EC-PARIS CNRS INRIA UMR8623 INRIA2 LRI-AO TDS-MACS UNIV-PARIS-SACLAY

358 views

350 downloads