ELSIM: End-to-end learning of reusable skills through intrinsic motivation

Arthur Aubret; Laëtitia Matignon; Salima Hassas

doi:10.1007/978-3-030-67661-2_32

Conference paper, Year: 2020

Arthur Aubret

Role: Author
PersonId: 176995
IdHAL: arthur-aubret
ORCID: 0000-0003-3495-4323

Systèmes Cognitifs et Systèmes Multi-Agents

Laëtitia Matignon

Role: Author
PersonId: 3290
IdHAL: laetitia-matignon
ORCID: 0000-0001-7126-8715
IdRef: 134644239

Systèmes Cognitifs et Systèmes Multi-Agents

Salima Hassas

Role: Author
PersonId: 3291
IdHAL: salima-hassas
ORCID: 0000-0002-1387-2866
IdRef: 083298398

Systèmes Cognitifs et Systèmes Multi-Agents

Abstract

Taking inspiration from developmental learning, we present a novel reinforcement learning architecture that hierarchically learns and represents self-generated skills in an end-to-end way. With this architecture, an agent focuses only on task-rewarded skills while keeping the skill learning process bottom-up. This bottom-up approach makes it possible to learn skills that (1) are transferable across tasks and (2) improve exploration when rewards are sparse. To do so, we combine a previously defined mutual information objective with a novel curriculum learning algorithm, creating an unlimited and explorable tree of skills. We test our agent on simple gridworld environments to understand and visualize how the agent distinguishes between its skills. We then show that our approach scales to more difficult MuJoCo environments, in which our agent builds a representation of skills that improves on a baseline in both transfer learning and exploration when rewards are sparse.
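
For reference, the mutual information that objectives of this kind maximize has the standard form below. The symbols G (a skill identifier) and S (the states visited while following that skill) are assumptions chosen for illustration; the abstract does not say which variables the objective is computed over.

```latex
% Standard definition of mutual information between two random variables.
% G and S are illustrative names, not notation taken from this record.
I(G; S) = H(G) - H(G \mid S) = H(S) - H(S \mid G)
```

Read this way, maximizing I(G; S) favors skills that are diverse (high H(G)) and identifiable from the states they reach (low H(G | S)).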

Domains

Computer Science [cs] / Artificial Intelligence [cs.AI]

https://hal.science/hal-02902573

Submitted on: Monday, July 20, 2020, 09:45:24

Last modified on: Wednesday, March 27, 2024, 09:28:03

Dates and versions

hal-02902573 , version 1 (20-07-2020)

Identifiers

HAL Id: hal-02902573, version 1
arXiv: 2006.12903
DOI: 10.1007/978-3-030-67661-2_32

Cite

Arthur Aubret, Laëtitia Matignon, Salima Hassas. ELSIM: End-to-end learning of reusable skills through intrinsic motivation. European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD), Sep 2020, Ghent, Belgium. ⟨10.1007/978-3-030-67661-2_32⟩. ⟨hal-02902573⟩

Collections

CNRS UNIV-LYON1 UNIV-LYON2 INSA-LYON EC-LYON LIRIS INSA-GROUPE UDL
