Aggregating optimistic planning trees for solving Markov decision processes

Gunnar Kedenburg 1 Raphael Fonteneau 1, 2 Remi Munos 1
1 SEQUEL - Sequential Learning
LIFL - Laboratoire d'Informatique Fondamentale de Lille, Inria Lille - Nord Europe, LAGIS - Laboratoire d'Automatique, Génie Informatique et Signal
Abstract : This paper addresses the problem of online planning in Markov decision processes using a generative model and under a budget constraint. We propose a new algorithm, ASOP, which is based on the construction of a forest of single successor state planning trees, where each tree corresponds to a random realization of the stochastic environment. The trees are explored using a "safe" optimistic planning strategy which combines the optimistic principle (in order to explore the most promising part of the search space first) and a safety principle (which guarantees a certain amount of uniform exploration). In the decision-making step of the algorithm, the individual trees are aggregated and an immediate action is recommended. We provide a finite-sample analysis and discuss the trade-off between the principles of optimism and safety. We report numerical results on a benchmark problem showing that ASOP performs as well as state-of-the-art optimistic planning algorithms.
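The pipeline outlined in the abstract (a forest of single-successor trees, exploration that mixes optimistic and uniform steps, and aggregation before recommending an action) can be illustrated with a minimal sketch. This is not the authors' ASOP implementation. The toy MDP, the strict alternation between safe and optimistic steps, the averaging of per-tree root-action values, and every name below (`generative_model`, `plan_single_tree`, `asop_like_recommendation`) are assumptions made for illustration only.

```python
import random

GAMMA = 0.9        # discount factor (assumed)
ACTIONS = (0, 1)   # hypothetical two-action problem


def generative_model(state, action, rng):
    """Hypothetical toy MDP with rewards in [0, 1]; not the paper's benchmark."""
    if action == 1:
        reward = 1.0 if rng.random() < 0.8 else 0.0  # noisy, higher on average
    else:
        reward = 0.3                                 # deterministic, lower
    next_state = (state + action) % 3
    return next_state, reward


def plan_single_tree(root_state, budget, rng):
    """Explore one tree in which each (node, action) pair gets a single
    sampled successor, i.e. one random realization of the environment."""
    # Each leaf is (discounted return so far, depth, state, first action taken).
    leaves = [(0.0, 0, root_state, None)]
    best = {a: 0.0 for a in ACTIONS}  # best lower bound seen per root action
    for step in range(budget):
        if step % 2 == 0:
            # Safe step (assumed rule): expand the shallowest leaf.
            idx = min(range(len(leaves)), key=lambda i: leaves[i][1])
        else:
            # Optimistic step: expand the leaf with the largest upper bound
            # return_so_far + GAMMA**depth / (1 - GAMMA).
            idx = max(
                range(len(leaves)),
                key=lambda i: leaves[i][0] + GAMMA ** leaves[i][1] / (1 - GAMMA),
            )
        ret, depth, state, first = leaves.pop(idx)
        for action in ACTIONS:
            nxt, reward = generative_model(state, action, rng)
            child_ret = ret + GAMMA ** depth * reward
            root_action = action if first is None else first
            leaves.append((child_ret, depth + 1, nxt, root_action))
            best[root_action] = max(best[root_action], child_ret)
    return best


def asop_like_recommendation(root_state, n_trees=20, budget=30, seed=0):
    """Average the per-tree root-action values (a simplified aggregation,
    assumed here) and recommend the action with the highest average."""
    rng = random.Random(seed)
    totals = {a: 0.0 for a in ACTIONS}
    for _ in range(n_trees):
        tree_values = plan_single_tree(root_state, budget, rng)
        for a in ACTIONS:
            totals[a] += tree_values[a] / n_trees
    return max(ACTIONS, key=lambda a: totals[a]), totals
```

Calling `asop_like_recommendation(0)` returns the recommended root action together with the averaged values. Fixing `seed` makes the run reproducible, and giving more weight to safe steps trades depth of search for broader uniform coverage.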
Document type : Conference papers
Cited literature : 16 references
Contributor : Rémi Munos
Submitted on : Friday, January 3, 2014 - 7:04:52 PM
Last modification on : Saturday, December 18, 2021 - 3:03:32 AM
Long-term archiving on : Thursday, April 3, 2014 - 10:40:54 PM


  • HAL Id : hal-00923681, version 1


Gunnar Kedenburg, Raphael Fonteneau, Remi Munos. Aggregating optimistic planning trees for solving Markov decision processes. Advances in Neural Information Processing Systems, 2013, United States. pp.2382-2390. ⟨hal-00923681⟩


