Unsupervised Learning and Exploration of Reachable Outcome Space

Giuseppe Paolo; Alban Laflaquière; Alexandre Coninx; Stephane Doncieux

doi:10.1109/icra40945.2020.9196819

Conference paper, IEEE International Conference on Robotics and Automation (ICRA), Year: 2020

Giuseppe Paolo

Role: Author
PersonId : 735825
IdHAL : gpaolo
ORCID : 0000-0003-4201-5967
IdRef : 261849867

Institut des Systèmes Intelligents et de Robotique

Alban Laflaquière

Role: Author

Alexandre Coninx

Role: Author
PersonId : 184690
IdHAL : alex-coninx
ORCID : 0000-0001-7992-8183
IdRef : 166602183

Stephane Doncieux

Role: Author
PersonId : 3909
IdHAL : stephane-doncieux
ORCID : 0000-0003-1541-054X
IdRef : 089428617

Abstract

Reinforcement learning in sparse-reward settings with very little prior knowledge is challenging because there is no signal to guide the learning process. In such situations, a good search strategy is fundamental. At the same time, it is highly desirable not to have to adapt the algorithm to every single problem. Here we introduce TAXONS, a Task Agnostic eXploration of Outcome spaces through Novelty and Surprise algorithm. Based on a population-based divergent-search approach, it learns a set of diverse policies directly from high-dimensional observations, without any task-specific information. TAXONS builds a repertoire of policies while training an autoencoder on the high-dimensional observation of the final state of the system, which yields a low-dimensional outcome space. The learned outcome space, combined with the reconstruction error, is used to drive the search for new policies. Results show that TAXONS can find a diverse set of controllers that covers a good part of the ground-truth outcome space, despite having no information about that space.
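
The two search signals named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the block-averaging `toy_encode` and `toy_decode` functions are hypothetical stand-ins for the trained autoencoder, novelty is assumed to be the mean distance to the k nearest neighbours in the encoded outcome space, and surprise is assumed to be the mean squared reconstruction error. How the two scores are combined during selection is not stated in the abstract, so the sketch only computes them side by side.

```python
import math


def toy_encode(observation, block=4):
    """Hypothetical stand-in for the autoencoder's encoder: block averages."""
    return [
        sum(observation[i:i + block]) / block
        for i in range(0, len(observation), block)
    ]


def toy_decode(feature, block=4):
    """Hypothetical stand-in for the decoder: repeat each value `block` times."""
    return [value for value in feature for _ in range(block)]


def novelty(feature, other_features, k=3):
    """Mean Euclidean distance from `feature` to its k nearest neighbours.

    Assumption: novelty is measured in the low-dimensional encoded space.
    """
    if not other_features:
        return float("inf")  # nothing to compare against yet
    distances = sorted(math.dist(feature, other) for other in other_features)
    nearest = distances[:k]
    return sum(nearest) / len(nearest)


def surprise(observation, reconstruction):
    """Mean squared error between an observation and its reconstruction."""
    squared = [(x - y) ** 2 for x, y in zip(observation, reconstruction)]
    return sum(squared) / len(observation)


def score_policy(observation, repertoire_features, k=3):
    """Return (novelty, surprise) for the final observation of one policy."""
    feature = toy_encode(observation)
    return (
        novelty(feature, repertoire_features, k),
        surprise(observation, toy_decode(feature)),
    )


if __name__ == "__main__":
    # Features of policies already stored in a hypothetical repertoire.
    repertoire = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
    final_observation = [1, 3, 1, 3, 0, 0, 0, 0]
    print(score_policy(final_observation, repertoire, k=2))
```

A policy whose final observation is far from the repertoire in the encoded space scores high on novelty, and one the stand-in codec reconstructs poorly scores high on surprise; either signal could then rank candidates for addition to the repertoire.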

Domains

Computer Science [cs]; Artificial Intelligence [cs.AI]; Robotics [cs.RO]

https://hal.science/hal-02951255

Submitted on: Monday, 28 September 2020, 15:52:10

Last modified on: Thursday, 1 February 2024, 14:24:35

Dates and versions

hal-02951255 , version 1 (28-09-2020)

Identifiers

HAL Id : hal-02951255 , version 1
ARXIV : 1909.05508
DOI : 10.1109/icra40945.2020.9196819

Cite

Giuseppe Paolo, Alban Laflaquière, Alexandre Coninx, Stephane Doncieux. Unsupervised Learning and Exploration of Reachable Outcome Space. IEEE International Conference on Robotics and Automation (ICRA), 2020, Paris, France. ⟨10.1109/icra40945.2020.9196819⟩. ⟨hal-02951255⟩

Collections

CNRS ISIR SORBONNE-UNIVERSITE SU-SCIENCES ANR ISIR_AMAC
