Other publications

A Fitted-Q Algorithm for Budgeted MDPs

Nicolas Carrara 1, 2 Romain Laroche 3 Jean-Léon Bouraoui 4, 1 Tanguy Urvoy 1 Olivier Pietquin 5, 2
2 SEQUEL - Sequential Learning
Inria Lille - Nord Europe, CRIStAL - Centre de Recherche en Informatique, Signal et Automatique de Lille (CRIStAL) - UMR 9189
Abstract : We address the problem of budgeted/constrained reinforcement learning in continuous state-space using a batch of transitions. For this purpose, we introduce a novel algorithm called Budgeted Fitted-Q (BFTQ). We carry out some preliminary benchmarks on a continuous 2-D world. They show that BFTQ performs as well as a penalized Fitted-Q algorithm while also allowing one to adapt the trained policy on the fly to a given budget, without the need to engineer reward penalties. We believe that the general principles used to design BFTQ could be used to extend other classical reinforcement learning algorithms to budget-oriented applications.

Cited literature : 16 references

https://hal.archives-ouvertes.fr/hal-01867353
Contributor : Nicolas Carrara
Submitted on : Tuesday, September 4, 2018 - 11:42:34 AM
Last modification on : Thursday, March 26, 2020 - 5:10:03 PM
Document(s) archived on : Wednesday, December 5, 2018 - 2:33:03 PM

File

ncarrara-saferl-uai-2018.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01867353, version 1

Citation

Nicolas Carrara, Romain Laroche, Jean-Léon Bouraoui, Tanguy Urvoy, Olivier Pietquin. A Fitted-Q Algorithm for Budgeted MDPs. 2018. ⟨hal-01867353⟩

Metrics

Record views : 260
File downloads : 266