A Fitted-Q Algorithm for Budgeted MDPs

Nicolas Carrara; Romain Laroche; Jean-Léon Bouraoui; Tanguy Urvoy; Olivier Pietquin

Conference paper, Year: 2018



Nicolas Carrara

Role: Author
PersonId : 1036158

Orange Labs [Lannion]

Sequential Learning

Romain Laroche

Role: Author
PersonId : 1012067

Maluuba

Jean-Léon Bouraoui

Role: Author

Orange Labs [Lannion]

Tanguy Urvoy

Role: Author

Orange Labs [Lannion]

Olivier Pietquin

Role: Author
PersonId : 4024
IdHAL : olivier-pietquin
ORCID : 0000-0002-5386-465X
IdRef : 142821861

Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189

Abstract

We address the problem of budgeted reinforcement learning in continuous state spaces using a batch of transitions. To this end, we introduce a novel algorithm called Budgeted Fitted-Q (BFTQ). Benchmarks show that BFTQ performs as well as a regular Fitted-Q algorithm in a continuous 2-D world, but also allows one to choose the amount of budget that fits a given task without the need to engineer the rewards. We believe that the general principles used to design BFTQ can be applied to extend other classical reinforcement learning algorithms to budget-oriented applications.

Keywords

Budgeted-MDP; Fitted-Q; Reinforcement Learning

Domains

Artificial Intelligence [cs.AI]

Main file

ewrl_14_2018_paper_67.pdf (7.3 MB)

Origin: Files produced by the author(s)


https://hal.science/hal-01928092

Submitted on: Tuesday, November 20, 2018, 13:44:34

Last modified on: Wednesday, January 24, 2024, 09:54:23

Dates and versions

hal-01928092 , version 1 (20-11-2018)

Identifiers

HAL Id : hal-01928092 , version 1

Cite

Nicolas Carrara, Romain Laroche, Jean-Léon Bouraoui, Tanguy Urvoy, Olivier Pietquin. A Fitted-Q Algorithm for Budgeted MDPs. EWRL 2018 - 14th European workshop on Reinforcement Learning, Oct 2018, Lille, France. ⟨hal-01928092⟩


Collections

CNRS INRIA CRISTAL INRIA2 CRISTAL-SEQUEL UNIV-LILLE

101 views

70 downloads
