Learning Sequences of Policies by using an Intrinsically Motivated Learner and a Task Hierarchy

Nicolas Duminy (1, 2), Alexandre Manoury (3, 4), Sao Mai Nguyen (3, 4), Cédric Buche (5), Dominique Duhaut (6, 2)
1 Lab-STICC_UBS_CID_IHSEV
Lab-STICC - Laboratoire des sciences et techniques de l'information, de la communication et de la connaissance
3 Lab-STICC_IMTA_CID_IHSEV
Lab-STICC - Laboratoire des sciences et techniques de l'information, de la communication et de la connaissance
5 Lab-STICC_ENIB_CID_IHSEV
Lab-STICC - Laboratoire des sciences et techniques de l'information, de la communication et de la connaissance
6 Lab-STICC_UBS_CID_IHSEV
Lab-STICC - Laboratoire des sciences et techniques de l'information, de la communication et de la connaissance
Abstract: Our goal is to propose an algorithm that lets robots learn sequences of actions, also called policies, in order to achieve complex tasks. In this paper we consider multiple, hierarchical tasks of varying difficulty. To tackle this high-dimensional learning problem, we propose a new algorithm, named Socially Guided Intrinsic Motivation for Sequence of Actions through Hierarchical Tasks (SGIM-SAHT), which is based on intrinsic motivation and uses different learning strategies. We then present two implementations of this algorithm that address the challenge in different ways: Socially Guided Intrinsic Motivation with Procedure Babbling (SGIM-PB), which relies on a "procedures" framework, and Continual Hierarchical Intrinsically Motivated Exploration (CHIME), which relies on planning and on learning a dynamic representation of the environment. We compare the two implementations and show, through two experiments, how efficiently they learn sequences of actions and dynamically adapt to their environment. We also discuss the benefits of implementing a fully unified version of SGIM-SAHT that combines the features of both implementations.
Document type:
Conference paper
Workshop on Continual Unsupervised Sensorimotor Learning, ICDL-EpiRob 2018, Sep 2018, Tokyo, Japan

https://hal.archives-ouvertes.fr/hal-01887073
Contributor: Alexandre Manoury
Submitted on: Wednesday, 3 October 2018 - 15:48:23
Last modified on: Thursday, 18 October 2018 - 08:06:01

File

icdl-epirob-2018_CameraReady.p...
Files produced by the author(s)

Identifiers

  • HAL Id: hal-01887073, version 1

Citation

Nicolas Duminy, Alexandre Manoury, Sao Mai Nguyen, Cédric Buche, Dominique Duhaut. Learning Sequences of Policies by using an Intrinsically Motivated Learner and a Task Hierarchy. Workshop on Continual Unsupervised Sensorimotor Learning, ICDL-EpiRob 2018, Sep 2018, Tokyo, Japan. 〈hal-01887073〉

Metrics

Record views: 40

File downloads: 19