Particle filter-based policy gradient for pomdps - HAL open archive
Conference paper, Year: 2008

Particle filter-based policy gradient for pomdps

Pierre-Arnaud Coquelin
  • Role: Author
  • PersonId : 844357
Rémi Munos
  • Role: Author
  • PersonId : 836863

Abstract

Our setting is a Partially Observable Markov Decision Process with continuous state, observation and action spaces. Decisions are based on a Particle Filter for estimating the belief state given past observations. We consider a policy gradient approach for parameterized policy optimization. For that purpose, we investigate sensitivity analysis of the performance measure with respect to the parameters of the policy, focusing on Finite Difference (FD) techniques. We show that the naive FD is subject to variance explosion because of the non-smoothness of the resampling procedure. We propose a more sophisticated FD method which overcomes this problem and establish its consistency.
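As a rough illustration of the naive FD estimator mentioned in the abstract, the following minimal sketch runs a bootstrap particle filter inside a simulated episode and differentiates a Monte Carlo estimate of the performance by central finite differences. Everything concrete here is an assumption made for illustration and is not taken from the paper: the one-dimensional toy dynamics in `step`, the Gaussian noise levels, the policy (action proportional to the particle mean), the reward, and all function names. The improved FD method proposed by the authors is not reproduced.

```python
import math
import random


def resample(particles, weights, rng):
    """Multinomial resampling: draw len(particles) particles with
    probability proportional to their weights (with replacement)."""
    return rng.choices(particles, weights=weights, k=len(particles))


def step(x, a, rng):
    """Hypothetical toy dynamics: the action's effect shrinks far from 0."""
    return x + a / (1.0 + x * x) + rng.gauss(0.0, 0.1)


def run_episode(theta, seed, horizon=20, n_particles=100):
    """Return the total reward of one episode under policy parameter theta."""
    rng = random.Random(seed)          # fixed seed = common random numbers
    x = rng.gauss(1.0, 1.0)            # hidden state
    particles = [rng.gauss(1.0, 1.0) for _ in range(n_particles)]
    total = 0.0
    for _ in range(horizon):
        y = x + rng.gauss(0.0, 0.5)    # noisy observation of the state
        # Weight particles by the Gaussian observation likelihood
        # (the tiny constant keeps the total weight strictly positive).
        weights = [math.exp(-0.5 * ((y - p) / 0.5) ** 2) + 1e-300
                   for p in particles]
        particles = resample(particles, weights, rng)
        # Policy: act on a feature of the belief (here, its mean).
        a = -theta * sum(particles) / n_particles
        x = step(x, a, rng)
        particles = [step(p, a, rng) for p in particles]
        total -= x * x                 # reward is -x^2
    return total


def estimate_performance(theta, seeds):
    """Monte Carlo estimate of the expected total reward J(theta)."""
    return sum(run_episode(theta, s) for s in seeds) / len(seeds)


def naive_fd_gradient(theta, h, seeds):
    """Central finite difference of the Monte Carlo estimate of J."""
    j_plus = estimate_performance(theta + h, seeds)
    j_minus = estimate_performance(theta - h, seeds)
    return (j_plus - j_minus) / (2.0 * h)


if __name__ == "__main__":
    seeds = range(50)
    for h in (0.1, 0.01, 0.001):
        print(h, naive_fd_gradient(0.5, h, seeds))
```

In this sketch the same seeds are used on both sides of the difference, so the only source of discrepancy between the two runs is the parameter perturbation. Because `resample` selects particles by comparing fixed random draws against cumulative weights, its output is piecewise constant in the weights: an arbitrarily small change in `theta` can switch which particle is selected. Dividing such jumps by a small `h` is one way to see why a naive estimator of this kind may behave poorly as `h` shrinks.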
Main file
gradient_POMDP_nips08.pdf (128.61 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-00830173, version 1 (04-06-2013)

Identifiers

  • HAL Id: hal-00830173, version 1

Cite

Pierre-Arnaud Coquelin, Romain Deguest, Rémi Munos. Particle filter-based policy gradient for pomdps. Advances in Neural Information Processing Systems, 2008, Canada. ⟨hal-00830173⟩
529 views
85 downloads