Conference paper, 2015

Multi-Objective MDPs with Conditional Lexicographic Reward Preferences

Abstract

Sequential decision problems that involve multiple objectives are prevalent. Consider, for example, the driver of a semi-autonomous car who may want to optimize competing objectives such as travel time and the effort associated with manual driving. We introduce a rich model called Lexicographic MDP (LMDP) and a corresponding planning algorithm called LVI that generalize previous work by allowing for conditional lexicographic preferences with slack. We analyze the convergence characteristics of LVI and establish its game-theoretic properties. The performance of LVI in practice is tested on a realistic benchmark problem in the domain of semi-autonomous driving. Finally, we demonstrate how GPU-based optimization can improve the scalability of LVI and other value iteration algorithms for MDPs.
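To make the lexicographic idea concrete, below is a minimal sketch of value iteration with slack over ordered objectives, in Python with NumPy. The function name lvi, its parameters, and the exact slack semantics are illustrative assumptions, not the authors' implementation; in particular, the paper's conditional preferences allow the objective ordering to depend on the state, which this sketch omits.

```python
import numpy as np

def lvi(P, rewards, slack, gamma=0.95, eps=1e-6):
    """Sketch of lexicographic value iteration with slack (illustrative).

    P:       (S, A, S) transition tensor.
    rewards: list of (S, A) reward arrays, highest-priority objective first.
    slack:   per-objective tolerances (eta_i); the paper's exact slack
             semantics may differ -- this is an assumption.
    """
    S, A, _ = P.shape
    mask = np.ones((S, A), dtype=bool)        # admissible actions per state
    for R, eta in zip(rewards, slack):
        V = np.zeros(S)
        while True:
            # Bellman backup restricted to the currently admissible actions.
            Q = np.where(mask, R + gamma * P @ V, -np.inf)
            V_new = Q.max(axis=1)
            if np.abs(V_new - V).max() < eps:
                break
            V = V_new
        # Keep only actions within slack eta of this objective's best value,
        # so lower-priority objectives choose among near-optimal actions.
        mask &= Q >= (V_new[:, None] - eta)
    # Greedy policy for the lowest-priority objective over surviving actions.
    return Q.argmax(axis=1)
```

Because each backup is pure array arithmetic, the same code ports essentially unchanged to a GPU array library such as CuPy, which illustrates the kind of scaling the abstract's GPU remark refers to; the paper's actual GPU implementation is presumably lower-level.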
Main file
hal-01191876.pdf (6.33 MB)
Origin: Files produced by the author(s)

Dates and versions

hal-01191876, version 1 (08-09-2015)

Identifiers

  • HAL Id: hal-01191876, version 1

Cite

Kyle Hollins Wray, Shlomo Zilberstein, Abdel-Illah Mouaddib. Multi-Objective MDPs with Conditional Lexicographic Reward Preferences. Twenty-Ninth AAAI Conference on Artificial Intelligence (AAAI-15), Jan 2015, Austin, United States. pp. 3418-3424. ⟨hal-01191876⟩