Solving Hidden-Semi-Markov-Mode Markov Decision Problems

Emmanuel Hadoux; Aurélie Beynier; Paul Weng

doi:10.1007/978-3-319-11508-5_15

Conference paper, Year: 2014



Emmanuel Hadoux

Role: Author
PersonId: 6716
IdHAL: emmanuel-hadoux
IdRef: 192282492

Systèmes Multi-Agents

Aurélie Beynier

Role: Author
PersonId: 9272
IdHAL: aurelie-beynier
IdRef: 113330804

Systèmes Multi-Agents

Paul Weng

Role: Author

DECISION

Abstract

Hidden-Mode Markov Decision Processes (HM-MDPs) were proposed to represent sequential decision-making problems in non-stationary environments that evolve according to a Markov chain. In this paper, we introduce Hidden-Semi-Markov-Mode Markov Decision Processes (HS3MDPs), a generalization of HM-MDPs to the more realistic case of non-stationary environments that evolve according to a semi-Markov chain. Like HM-MDPs, HS3MDPs form a subclass of Partially Observable Markov Decision Processes. Therefore, large instances of HS3MDPs (and HM-MDPs) can be solved with an online algorithm, Partially Observable Monte Carlo Planning (POMCP), which is based on Monte Carlo Tree Search and uses particle filters to approximate belief states. We propose a first adaptation of POMCP that solves HS3MDPs more efficiently by exploiting their structure. Our empirical results show that this first adaptation reaches higher cumulative rewards than the original algorithm. However, on larger instances, POMCP may run out of particles. To address this issue, we propose a second adaptation of POMCP that replaces particle filters with exact representations of beliefs. Our empirical results indicate that this second version reaches high cumulative rewards faster than the first adaptation and remains efficient even on large problems.
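The abstract does not spell out the model, so the following is only a minimal sketch of what an exact belief over a hidden semi-Markov mode could look like, not the authors' algorithm. It assumes that the hidden part of the state is a pair (mode, remaining steps in that mode), that the current mode governs the observed transition, and that a new mode and duration are drawn only when the remaining steps reach zero. The names `update_belief`, `likelihood`, `mode_trans`, and `duration` are hypothetical.

```python
from collections import defaultdict


def update_belief(belief, likelihood, mode_trans, duration):
    """One exact Bayes-filter step over hidden (mode, remaining_steps) pairs.

    All argument names and the model structure are assumptions of this sketch.
    belief:     {(mode, remaining): probability}
    likelihood: {mode: P(observed state transition | mode)}
    mode_trans: {mode: {next_mode: probability}}, applied when remaining == 0
    duration:   {next_mode: {remaining: probability}} for a freshly drawn mode
    """
    new_belief = defaultdict(float)
    for (mode, remaining), p in belief.items():
        # Reweight by how well this mode explains the observed transition.
        w = p * likelihood[mode]
        if w == 0.0:
            continue
        if remaining > 0:
            # The mode persists; only its remaining duration decreases.
            new_belief[(mode, remaining - 1)] += w
        else:
            # The duration is exhausted: draw a new mode and a new duration.
            for next_mode, pm in mode_trans[mode].items():
                for h, ph in duration[next_mode].items():
                    new_belief[(next_mode, h)] += w * pm * ph
    total = sum(new_belief.values())
    if total == 0.0:
        raise ValueError("observed transition has zero probability under the belief")
    # Normalize so the updated belief sums to one.
    return {key: value / total for key, value in new_belief.items()}
```

Because the belief is stored as an explicit distribution rather than a set of sampled particles, it cannot run out of particles; the cost is one pass over every (mode, remaining) pair with non-zero probability at each step.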

Domains

Artificial Intelligence [cs.AI]

Main file

SUM14.pdf (352.52 KB)

Origin: Files produced by the author(s)


https://hal.science/hal-01200812

Submitted on: Thursday, 17 September 2015, 11:42:02

Last modified on: Tuesday, 11 April 2023, 15:16:28

Long-term archiving on: Tuesday, 29 December 2015, 07:43:15

Dates and versions

hal-01200812, version 1 (17-09-2015)

Identifiers

HAL Id: hal-01200812, version 1
DOI: 10.1007/978-3-319-11508-5_15

Cite

Emmanuel Hadoux, Aurélie Beynier, Paul Weng. Solving Hidden-Semi-Markov-Mode Markov Decision Problems. Scalable Uncertainty Management, Sep 2014, Oxford, United Kingdom. pp.176-189, ⟨10.1007/978-3-319-11508-5_15⟩. ⟨hal-01200812⟩


Collections

UPMC CNRS LIP6 SORBONNE-UNIVERSITE SU-SCIENCES ANR

114 views

395 downloads
