Particle filter-based policy gradient for POMDPs

Pierre-Arnaud Coquelin; Romain Deguest; Rémi Munos

Conference paper. Year: 2008


Pierre-Arnaud Coquelin

Role: Author
PersonId : 844357

Sequential Learning

Romain Deguest

Role: Author
PersonId : 837681

Centre de Mathématiques Appliquées - Ecole Polytechnique

Rémi Munos

Role: Author
PersonId : 836863

Sequential Learning

Abstract

Our setting is a Partially Observable Markov Decision Process with continuous state, observation and action spaces. Decisions are based on a Particle Filter for estimating the belief state given past observations. We consider a policy gradient approach for parameterized policy optimization. For that purpose, we investigate sensitivity analysis of the performance measure with respect to the parameters of the policy, focusing on Finite Difference (FD) techniques. We show that the naive FD is subject to variance explosion because of the non-smoothness of the resampling procedure. We propose a more sophisticated FD method which overcomes this problem and establish its consistency.
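
The naive finite-difference estimator mentioned in the abstract can be illustrated on a toy problem. The following is a minimal sketch only: the one-dimensional model, the linear policy `theta * mean(particles)`, the noise levels, and the names `rollout_return` and `naive_fd_gradient` are all assumptions made for illustration and are not taken from the paper. It shows the naive estimator, not the improved FD method the authors propose.

```python
import numpy as np


def rollout_return(theta, seed, n_particles=100, horizon=20):
    """Simulate one episode of a toy 1-D POMDP and return the total reward.

    Hypothetical model: additive actions, tanh observations, quadratic cost.
    """
    rng = np.random.default_rng(seed)  # same seed => common random numbers
    x = rng.normal(0.0, 1.0)                        # hidden state
    particles = rng.normal(0.0, 1.0, n_particles)   # samples from the prior belief
    total = 0.0
    for _ in range(horizon):
        # Policy: act on the particle estimate of the belief mean.
        a = theta * particles.mean()
        # Hidden-state transition, then a noisy nonlinear observation.
        x = x + a + rng.normal(0.0, 0.1)
        y = np.tanh(x) + rng.normal(0.0, 0.5)
        total += -x ** 2                            # reward: stay near the origin
        # Particle filter: propagate, weight by observation likelihood.
        particles = particles + a + rng.normal(0.0, 0.1, n_particles)
        logw = -0.5 * ((y - np.tanh(particles)) / 0.5) ** 2
        w = np.exp(logw - logw.max())
        w /= w.sum()
        # Multinomial resampling: the selected indices are a piecewise-constant
        # function of the weights, hence a non-smooth function of theta.
        idx = np.searchsorted(np.cumsum(w), rng.random(n_particles))
        idx = np.minimum(idx, n_particles - 1)      # guard against round-off
        particles = particles[idx]
    return total


def naive_fd_gradient(theta, h, n_episodes=200, base_seed=0):
    """Central finite difference of the return, averaged over episodes.

    Returns the mean estimate and the sample variance of the per-episode
    difference quotients.
    """
    diffs = np.array([
        (rollout_return(theta + h, base_seed + k)
         - rollout_return(theta - h, base_seed + k)) / (2.0 * h)
        for k in range(n_episodes)
    ])
    return diffs.mean(), diffs.var(ddof=1)
```

Calling `naive_fd_gradient(-0.5, h)` for decreasing values of `h` gives one way to observe how the sample variance of the difference quotients behaves as the step size shrinks. Even with a shared seed, a small change in `theta` can flip which particles are resampled, so the return is not a smooth function of `theta` in this sketch.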

Domains

Modeling and simulation

Main file

gradient_POMDP_nips08.pdf (128.61 KB)

Origin: Files produced by the author(s)

Contributor: Rémi Munos

https://hal.science/hal-00830173

Submitted on: Tuesday, June 4, 2013, 15:16:18

Last modified on: Friday, March 24, 2023, 14:52:57

Long-term archiving on: Thursday, September 5, 2013, 04:22:59

Dates and versions

hal-00830173, version 1 (04-06-2013)

Identifiers

HAL Id: hal-00830173, version 1

Cite

Pierre-Arnaud Coquelin, Romain Deguest, Rémi Munos. Particle filter-based policy gradient for POMDPs. Advances in Neural Information Processing Systems, 2008, Canada. ⟨hal-00830173⟩


Collections

X UNIV-LILLE3 CNRS INRIA X-CMAP X-DEP-MATHA LAGIS CMAP UVSQ INRIA2 TDS-MACS

529 views

88 downloads
