There Is No Turning Back: A Self-Supervised Approach for Reversibility-Aware Reinforcement Learning

Nathan Grinsztajn; Johan Ferret; Olivier Pietquin; Philippe Preux; Matthieu Geist

Conference paper, Year: 2021

Nathan Grinsztajn

Role: Author
PersonId: 1083320

Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189

Scool

Johan Ferret

Role: Author

Olivier Pietquin

Role: Author

Philippe Preux

Role: Author

Matthieu Geist

Role: Author
PersonId: 790158
IdRef: 142341819

Abstract

We propose to learn to distinguish reversible from irreversible actions for better informed decision-making in Reinforcement Learning (RL). From theoretical considerations, we show that approximate reversibility can be learned through a simple surrogate task: ranking randomly sampled trajectory events in chronological order. Intuitively, pairs of events that are always observed in the same order are likely to be separated by an irreversible sequence of actions. Conveniently, learning the temporal order of events can be done in a fully self-supervised way, which we use to estimate the reversibility of actions from experience, without any priors. We propose two different strategies that incorporate reversibility in RL agents, one strategy for exploration (RAE) and one strategy for control (RAC). We demonstrate the potential of reversibility-aware agents in several environments, including the challenging Sokoban game. In synthetic tasks, we show that we can learn control policies that never fail and reduce to zero the side-effects of interactions, even without access to the reward function.
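The ordering intuition in the abstract can be illustrated with a small tabular sketch. This is not the authors' method, which the abstract describes as a learned ranking task; it simply counts, over random trajectories in a hypothetical four-state chain, how often one state is observed before another. The environment, the function names, and the use of exhaustive pair counting instead of random pair sampling are all assumptions made for illustration.

```python
import random
from collections import defaultdict

# Hypothetical toy chain (an assumption, not from the paper):
# 0 <-> 1 is reversible, 1 -> 2 is one-way, 2 <-> 3 is reversible.
TRANSITIONS = {0: [1], 1: [0, 2], 2: [3], 3: [2]}


def sample_trajectory(length, rng):
    """Random walk of `length` states, always starting in state 0."""
    state, states = 0, [0]
    for _ in range(length - 1):
        state = rng.choice(TRANSITIONS[state])
        states.append(state)
    return states


def count_orderings(trajectories):
    """Count, for each ordered pair (a, b), how often a occurs before b."""
    before = defaultdict(int)
    for states in trajectories:
        for i in range(len(states)):
            for j in range(i + 1, len(states)):
                if states[i] != states[j]:
                    before[(states[i], states[j])] += 1
    return before


def precedence(before, a, b):
    """Empirical fraction of observations in which a precedes b."""
    forward, backward = before[(a, b)], before[(b, a)]
    total = forward + backward
    # With no observations of the pair, fall back to an uninformative 0.5.
    return 0.5 if total == 0 else forward / total


rng = random.Random(0)
trajectories = [sample_trajectory(20, rng) for _ in range(200)]
before = count_orderings(trajectories)

for pair in [(0, 1), (1, 2), (2, 3)]:
    print(pair, round(precedence(before, *pair), 3))
```

In this toy chain, state 1 is always seen before state 2 and never after it, so the estimated precedence for that pair sits at the extreme, which flags the step between them as likely irreversible. Pairs joined by reversible moves are seen in both orders and get intermediate values.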

Domains

Machine Learning [cs.LG], Artificial Intelligence [cs.AI]

Main file

Reversibility_Aware_Reinforcement_Learning__NeurIPS_.pdf (1.25 MB)

Origin: Files produced by the author(s)

https://hal.science/hal-03454640

Submitted on: Monday, 29 November 2021, 12:05:55

Last modified on: Wednesday, 24 January 2024, 09:54:23

Dates and versions

hal-03454640, version 1 (29-11-2021)

Identifiers

HAL Id: hal-03454640, version 1

Cite

Nathan Grinsztajn, Johan Ferret, Olivier Pietquin, Philippe Preux, Matthieu Geist. There Is No Turning Back: A Self-Supervised Approach for Reversibility-Aware Reinforcement Learning. Neural Information Processing Systems (2021), Dec 2021, Virtual, France. ⟨hal-03454640⟩

Collections

CNRS INRIA GRID5000 CRISTAL INRIA2 UNIV-LILLE SILECS CRISTAL-SCOOL

55 views

65 downloads
