Conference papers

Reducing the Number of Queries in Interactive Value Iteration

Abstract : To tackle the potentially hard task of defining the reward function in a Markov Decision Process (MDP), a new approach called Interactive Value Iteration (IVI) was recently proposed by Weng and Zanuttini (2013). This solving method, which interweaves elicitation and optimization phases, computes a (near) optimal policy without knowing the precise reward values. The procedure as originally presented can be improved to reduce the number of queries needed to determine an optimal policy. The key insights are that (1) queries should be delayed as much as possible, to avoid asking queries that might not be necessary to determine the best policy; (2) queries should be asked in a priority order, because the answers to some queries can resolve other queries; and (3) queries can be avoided by using heuristic information to guide the process. Following these ideas, a modified IVI algorithm is presented, and experimental results show a significant decrease in the number of queries issued.
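The idea of avoiding unnecessary queries can be illustrated with a minimal sketch. It is not the algorithm from the paper: it assumes, for illustration only, that each policy value is a vector counting how often each of k ordered reward levels is obtained, and that the unknown rewards satisfy 0 <= r_1 <= ... <= r_k. Under that assumption, one vector is at least as good as another for every admissible reward whenever all suffix sums of their difference are non-negative, so the user only needs to be asked when this test is inconclusive in both directions. The names `dominates`, `LazyComparator`, and `ask_user` are hypothetical.

```python
from typing import Callable, Sequence

Vector = Sequence[float]


def dominates(diff: Vector) -> bool:
    """Return True if diff . r >= 0 for every reward 0 <= r_1 <= ... <= r_k.

    Writing r_i as a sum of non-negative increments shows this holds
    exactly when every suffix sum of diff is non-negative.
    """
    suffix = 0.0
    for d in reversed(diff):
        suffix += d
        if suffix < 0:
            return False
    return True


class LazyComparator:
    """Compare value vectors, querying the user only as a last resort."""

    def __init__(self, ask_user: Callable[[Vector, Vector], bool]):
        self.ask_user = ask_user  # hypothetical oracle: True if v1 is preferred
        self.queries = 0          # number of queries actually issued

    def prefers(self, v1: Vector, v2: Vector) -> bool:
        diff = [a - b for a, b in zip(v1, v2)]
        if dominates(diff):                 # v1 >= v2 for all admissible rewards
            return True
        if dominates([-d for d in diff]):   # v2 >= v1 for all admissible rewards
            return False
        self.queries += 1                   # undecided: the query cannot be avoided
        return self.ask_user(v1, v2)


# Usage: the "user" is simulated with hidden rewards (an assumption of this demo).
hidden = [0.0, 1.0, 3.0]
simulated_user = lambda v1, v2: (
    sum(a * r for a, r in zip(v1, hidden)) >= sum(b * r for b, r in zip(v2, hidden))
)
comparator = LazyComparator(simulated_user)
resolved_without_query = comparator.prefers([0, 1, 2], [1, 1, 1])
resolved_with_query = comparator.prefers([0, 4, 0], [0, 0, 1])
```

In this demo the first comparison is settled by the dominance test alone, while the second depends on the actual reward values and therefore triggers a query.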


https://hal.archives-ouvertes.fr/hal-01213280
Contributor : Lip6 Publications
Submitted on : Friday, June 30, 2017 - 6:35:13 PM
Last modification on : Thursday, November 21, 2019 - 12:00:07 AM
Document(s) archived on : Monday, January 22, 2018 - 10:16:44 PM

File

IEIVI.pdf
Files produced by the author(s)

Citation

Hugo Gilbert, Olivier Spanjaard, Paolo Viappiani, Paul Weng. Reducing the Number of Queries in Interactive Value Iteration. 4th International Conference on Algorithmic Decision Theory (ADT 2015), Sep 2015, Lexington, KY, United States. pp.139-152, ⟨10.1007/978-3-319-23114-3_9⟩. ⟨hal-01213280⟩
