A Fitted-Q Algorithm for Budgeted MDPs

Nicolas Carrara 1, 2 Romain Laroche 3 Jean-Léon Bouraoui 1 Tanguy Urvoy 1 Olivier Pietquin 4, 2
2 SEQUEL - Sequential Learning
Inria Lille - Nord Europe, CRIStAL - Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189
Abstract : We address the problem of budgeted/constrained reinforcement learning in continuous state spaces using a batch of transitions. For this purpose, we introduce a novel algorithm called Budgeted Fitted-Q (BFTQ). We carry out some preliminary benchmarks on a continuous 2-D world. They show that BFTQ performs as well as a penalized Fitted-Q algorithm while also allowing one to adapt the trained policy on the fly for a given budget, without the need to engineer the reward penalties. We believe that the general principles used to design BFTQ could be used to extend other classical reinforcement learning algorithms to budget-oriented applications.
Document type :
Other publications
Cited literature [16 references]

https://hal.archives-ouvertes.fr/hal-01867353
Contributor : Nicolas Carrara
Submitted on : Tuesday, September 4, 2018 - 11:42:34 AM
Last modification on : Friday, March 12, 2021 - 4:50:03 PM
Long-term archiving on : Wednesday, December 5, 2018 - 2:33:03 PM

File

ncarrara-saferl-uai-2018.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01867353, version 1

Citation

Nicolas Carrara, Romain Laroche, Jean-Léon Bouraoui, Tanguy Urvoy, Olivier Pietquin. A Fitted-Q Algorithm for Budgeted MDPs. 2018. ⟨hal-01867353⟩

Metrics

Record views

466

File downloads

431