Journal articles

Strong Uniform Value in Gambling Houses and Partially Observable Markov Decision Processes

Abstract : In several standard models of dynamic programming (gambling houses, MDPs, POMDPs), we prove the existence of a robust notion of value for the infinitely repeated problem, namely the strong uniform value. This solves two open problems. First, this shows that for any ε > 0, the decision-maker has a pure strategy σ which is ε-optimal in any n-stage problem, provided that n is big enough (this result was only known for behavior strategies, that is, strategies which use randomization). Second, for any ε > 0, the decision-maker can guarantee the limit of the n-stage value minus ε in the infinite problem where the payoff is the expectation of the inferior limit of the time average payoff.

https://hal.archives-ouvertes.fr/hal-01395429
Contributor : Xavier Venel
Submitted on : Thursday, November 10, 2016 - 4:28:56 PM
Last modification on : Friday, April 29, 2022 - 10:13:03 AM
Long-term archiving on : Tuesday, March 21, 2017 - 10:15:25 AM

File

Revision_venel_ziliotto5.pdf
Files produced by the author(s)

Citation

Xavier Venel, Bruno Ziliotto. Strong Uniform Value in Gambling Houses and Partially Observable Markov Decision Processes. SIAM Journal on Control and Optimization, Society for Industrial and Applied Mathematics, 2016, 54 (4), pp.1983-2008. ⟨10.1137/15M1043340⟩. ⟨hal-01395429⟩
