Learning of Binocular Fixations using Anomaly Detection with Deep Reinforcement Learning

François Jean de La Bourdonnaye; Céline Teulière; Thierry Chateau; Jochen Triesch

doi:10.1109/IJCNN.2017.7965928

Conference paper. Year: 2017

François Jean de La Bourdonnaye

Role: Author
PersonId : 1022855

Institut Pascal

Céline Teulière

Role: Author
PersonId : 8681
IdHAL : cteuliere
IdRef : 149645163

Institut Pascal

Thierry Chateau

Role: Author
PersonId : 8056
IdHAL : thierry-chateau
IdRef : 154402176

Institut Pascal

Jochen Triesch

Role: Author

Frankfurt Institute for Advanced Studies

Goethe-Universität Frankfurt am Main

Abstract

Owing to their ability to learn complex behaviors in high-dimensional state-action spaces, deep reinforcement learning algorithms have attracted much interest in the robotics community. For a practical reinforcement learning implementation on a robot, the algorithm has to be provided with an informative reward signal that makes it easy to discriminate between the values of nearby states. To address this issue, prior information (e.g., in the form of a geometric model) or human supervision is often assumed. This paper proposes a method to learn binocular fixations without such prior information. Instead, it uses an informative reward that requires little supervised information. The reward computation is based on an anomaly detection mechanism that uses convolutional autoencoders. These detectors estimate an object's pixel position in a weakly supervised way. This position estimate is affected by noise, which makes the reward signal noisy. We first show that this noise affects both the learning speed and the resulting policy. Then, we propose a method to partially remove the noise using regression on the detection change given sensor data. The binocular fixation task is learned in a simulated environment on a training set of objects with various shapes and colors. The learned policy is compared with another policy learned with a highly informative and noiseless reward signal. The tests are carried out on the training set and on a test set of new objects. We observe similar performance, showing that the environment-encoding step can replace the prior information.
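
The idea of deriving a reward from an anomaly-based position estimate can be illustrated with a minimal sketch. It is not the authors' implementation. The trained convolutional autoencoder is replaced by a `reconstruction` array passed in by the caller, on the assumption that the object is reconstructed poorly and therefore stands out in the error map. The function names, the fixed `threshold`, the centroid rule, and the reward formula (negative normalized distance of the estimated position from each image center) are all hypothetical choices made for illustration.

```python
import numpy as np


def anomaly_centroid(image, reconstruction, threshold):
    """Estimate an object's pixel position (row, col) as the centroid of
    pixels whose absolute reconstruction error exceeds `threshold`.

    `reconstruction` stands in for the output of an autoencoder (assumption).
    Returns None when no pixel is flagged as anomalous.
    """
    error = np.abs(np.asarray(image, dtype=float)
                   - np.asarray(reconstruction, dtype=float))
    if error.ndim == 3:
        # Average the error over color channels to get one map per image.
        error = error.mean(axis=2)
    mask = error > threshold
    if not mask.any():
        return None
    rows, cols = np.nonzero(mask)
    return np.array([rows.mean(), cols.mean()])


def fixation_reward(left_pos, right_pos, image_shape):
    """Hypothetical fixation reward in [-1, 0].

    The reward is 0 when the estimated object position lies at the center
    of both camera images and decreases as it moves away from the centers.
    A missing detection in either image yields the minimum reward.
    """
    if left_pos is None or right_pos is None:
        return -1.0
    height, width = image_shape
    center = np.array([(height - 1) / 2.0, (width - 1) / 2.0])
    diagonal = np.hypot(height - 1, width - 1)
    distance = (np.linalg.norm(left_pos - center)
                + np.linalg.norm(right_pos - center))
    # Each distance is at most half the diagonal, so the sum is at most one diagonal.
    return -distance / diagonal


# Usage: a 21x21 empty background with a 3x3 bright object in each camera image.
background = np.zeros((21, 21))
left_image = background.copy()
left_image[9:12, 9:12] = 1.0    # object centered in the left image
right_image = background.copy()
right_image[0:3, 0:3] = 1.0     # object in a corner of the right image

left_pos = anomaly_centroid(left_image, background, threshold=0.5)
right_pos = anomaly_centroid(right_image, background, threshold=0.5)
reward = fixation_reward(left_pos, right_pos, background.shape)
```

Because the centroid depends on which pixels cross the threshold, any noise in the error map shifts the position estimate and therefore the reward, which is the kind of reward noise the abstract refers to.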

Domains

Automatic Control / Robotics

Main file

fdlb5.pdf (5.1 MB)

Origin: Files produced by the author(s)

https://hal.science/hal-01635610

Submitted on: Tuesday, January 30, 2018, 09:20:30

Last modified on: Saturday, April 22, 2023, 04:24:31

Dates and versions

hal-01635610 , version 1 (27-11-2017)

hal-01635610 , version 2 (30-01-2018)

Identifiers

HAL Id : hal-01635610 , version 2
DOI : 10.1109/IJCNN.2017.7965928

Cite

François Jean de La Bourdonnaye, Céline Teulière, Thierry Chateau, Jochen Triesch. Learning of Binocular Fixations using Anomaly Detection with Deep Reinforcement Learning. International Joint Conference on Neural Networks, May 2017, Anchorage, AK, United States. ⟨10.1109/IJCNN.2017.7965928⟩. ⟨hal-01635610v2⟩

Collections

PRES_CLERMONT CNRS INSTITUT_PASCAL TDS-MACS
