Adaptive Artificial Companions learning from users' feedback

Abir-Beatrice Karami; Karim Sehaba; Benoit Encelle

doi:10.1177/1059712316634062

Journal article, Adaptive Behavior, Year: 2016


Abir-Beatrice Karami

Role: Author
PersonId : 5643
IdHAL : abir-b-karami
ORCID : 0000-0003-1972-5629
IdRef : 160681960

Equipe MAD - Laboratoire GREYC - UMR6072

Karim Sehaba

Role: Author
PersonId : 5239
IdHAL : karim-sehaba
IdRef : 111143624

Situated Interaction, Collaboration, Adaptation and Learning

Benoit Encelle

Role: Author
PersonId : 7106
IdHAL : benoit-encelle
ORCID : 0000-0002-0734-6480
IdRef : 103924787

Situated Interaction, Collaboration, Adaptation and Learning

Abstract

Until recently, proposals for intelligent service companions, such as robots, were mostly independent of the user and the environment. Our work is part of the FUI-RoboPopuli project, which aims to endow entertainment companion robots with adaptive and social behavior. More precisely, we focus on the capacity of a robotic system to learn how to personalize and adapt its behavior and actions to its interaction situation, which describes (a) the current user(s) and (b) the current environment settings. Our approach is based on Markov Decision Processes (MDPs), which are widely used in adaptive robot applications. To obtain relevant adaptive behavior as quickly as possible, whatever the interaction situation, several approaches have been proposed to decrease the sample complexity required to learn the MDP model, including its reward function. To this end, we propose two algorithms that learn the MDP reward function by analyzing interaction traces (i.e., the interaction history between the robot and its users, including their feedback on the robot's actions). The first algorithm is direct and certain, but does not particularly exploit its knowledge to adapt to unknown situations. The second is able to detect the importance of certain situation information in the adaptation process. This detection is used to generalize the learned reward function to unknown situations (i.e., unknown users and/or environment settings). In this paper, we present both learning algorithms, simulated experiments, and an experiment with the EMOX robot, which was upgraded during the FUI-RoboPopuli project. The results of these experiments show that the proposed algorithms are able to learn, through interactions with simulated and real situations, a reward function that leads to adapted and personalized behavior.
We also present a scaling analysis in which we define the main parameters of the proposed representation and study how strongly each of them affects learning and convergence complexity.
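
To make the idea of learning a reward function from interaction traces more concrete, the following is a minimal illustrative sketch in Python. It is not the authors' algorithms, which this record does not detail. It assumes a hypothetical trace format of (situation, action, feedback) triples, where a situation is a tuple of attribute values such as (user, environment) and feedback is a number. The function names `learn_direct_reward` and `generalize_reward`, and the averaging rule itself, are assumptions chosen for illustration only.

```python
from collections import defaultdict

# Hypothetical trace format (an assumption, not taken from the paper):
# each trace is (situation, action, feedback), where situation is a tuple
# of attribute values, e.g. (user, environment), and feedback is a number.


def learn_direct_reward(traces):
    """Average the observed feedback for each exact (situation, action) pair."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for situation, action, feedback in traces:
        totals[(situation, action)] += feedback
        counts[(situation, action)] += 1
    return {key: totals[key] / counts[key] for key in totals}


def generalize_reward(traces, situation, action, attribute_index):
    """Estimate a reward for an unseen situation from one shared attribute.

    Only traces with the same action and the same value at attribute_index
    are averaged. Returns None when no trace matches.
    """
    matches = [
        feedback
        for seen_situation, seen_action, feedback in traces
        if seen_action == action
        and seen_situation[attribute_index] == situation[attribute_index]
    ]
    if not matches:
        return None
    return sum(matches) / len(matches)


traces = [
    (("alice", "quiet"), "play_music", 1.0),
    (("alice", "quiet"), "play_music", 0.0),
    (("bob", "noisy"), "play_music", -1.0),
]
direct_reward = learn_direct_reward(traces)
# "carol" never appears in the traces, so only the environment attribute
# (index 1) can be used to estimate her reaction.
unseen_estimate = generalize_reward(traces, ("carol", "noisy"), "play_music", 1)
```

In this sketch the direct estimate only covers situations that appear in the traces, while the second function falls back on a single attribute assumed to be informative. Deciding which attributes actually matter is the part the abstract attributes to the second algorithm, and it is not reproduced here.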

Keywords

Adaptive behavior; Personalization; Learning from users’ feedback; Interaction traces; Markov Decision Processes (MDPs); Companion robots

Domains

Artificial Intelligence [cs.AI]; Robotics [cs.RO]; Machine Learning [cs.LG]; Information Retrieval [cs.IR]


https://hal.science/hal-01261874

Submitted on: Monday, January 25, 2016, 20:54:58

Last modified on: Wednesday, March 27, 2024, 09:16:03

Dates and versions

hal-01261874 , version 1 (25-01-2016)

Identifiers

HAL Id : hal-01261874 , version 1
DOI : 10.1177/1059712316634062

Cite

Abir-Beatrice Karami, Karim Sehaba, Benoit Encelle. Adaptive Artificial Companions learning from users' feedback. Adaptive Behavior, 2016, 24 (2), pp.69-86. ⟨10.1177/1059712316634062⟩. ⟨hal-01261874⟩


Collections

CNRS UNIV-LYON1 UNIV-LYON2 INSA-LYON EC-LYON LIRIS GREYC GREYC-MAD COMUE-NORMANDIE LABEXIMU ENSICAEN UNICAEN INSA-GROUPE UDL

341 views

0 downloads
