
Markov Decision Processes with Functional Rewards

Olivier Spanjaard (1), Paul Weng (1)
(1) DECISION, LIP6 - Laboratoire d'Informatique de Paris 6
Abstract: Markov decision processes (MDPs) have become one of the standard models for decision-theoretic planning problems under uncertainty. In their standard form, rewards are assumed to be numerical additive scalars. In this paper, we propose a generalization of this model in which rewards are allowed to be functional. The value of a history is computed recursively by composing the reward functions. We show that several variants of MDPs presented in the literature can be instantiated in this setting. We then identify sufficient conditions on these reward functions for dynamic programming to be valid. To demonstrate the potential of our framework, we conclude the paper with several illustrative examples.
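
As a rough illustration of the idea described in the abstract (not taken from the paper itself), the following Python sketch computes the value of a history by composing per-step reward functions backwards from the end of the history. The history format, the indexing of reward functions by (state, action) pairs, and the name history_value are assumptions introduced purely for illustration; standard additive scalar rewards are recovered when each reward function simply adds a scalar to the future value.

# Illustrative sketch (assumed interface, not the authors' code):
# the value of a history is obtained by composing per-step reward
# functions rather than summing scalar rewards.

def history_value(history, reward_fns, terminal_value=0.0):
    """Value of a history h = (s_0, a_0, s_1, ..., s_n), computed as
    f_0(f_1(... f_{n-1}(terminal_value) ...)), where f_t is the reward
    function attached to transition (s_t, a_t, s_{t+1})."""
    value = terminal_value
    # Compose from the last transition back to the first.
    for (s, a, s_next) in reversed(history):
        value = reward_fns[(s, a)](value)  # hypothetical indexing by (state, action)
    return value

# Standard additive MDP rewards are the special case f_t(v) = r(s_t, a_t) + v:
if __name__ == "__main__":
    rewards = {("s0", "a"): 1.0, ("s1", "a"): 2.0}
    reward_fns = {k: (lambda v, r=r: r + v) for k, r in rewards.items()}
    history = [("s0", "a", "s1"), ("s1", "a", "s2")]
    print(history_value(history, reward_fns))  # 3.0, the usual additive sum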
Document type: Conference papers

Cited literature: 15 references

https://hal.archives-ouvertes.fr/hal-01216435
Contributor: Lip6 Publications
Submitted on: Friday, June 30, 2017 - 5:19:41 PM
Last modification on: Thursday, March 21, 2019 - 1:09:59 PM
Document(s) archived on: Monday, January 22, 2018 - 11:01:40 PM

File: miwai2013-1.pdf (produced by the author(s))


Citation

Olivier Spanjaard, Paul Weng. Markov Decision Processes with Functional Rewards. 7th Multi-Disciplinary International Workshop on Artificial Intelligence, MIWAI 2013, Dec 2013, Krabi, Thailand. pp.269-280, ⟨10.1007/978-3-642-44949-9_25⟩. ⟨hal-01216435⟩


Metrics: 114 record views, 155 file downloads