A Coherence Approach to Learning from Reward. Application to the Reactive Navigation of a simulated Mobile Robot - HAL open archive
Conference paper. Year: 1999

A Coherence Approach to Learning from Reward. Application to the Reactive Navigation of a simulated Mobile Robot

Frédéric Davesne
Claude Barret
  • Role: Author
  • PersonId : 856229

Abstract

This paper describes a new kind of learning agent, the Constraint based Memory Unit (CbMU). The framework is the incremental building of a complex behavior, given a set of basic tasks and a set of perceptive constraints that must be fulfilled to achieve the behavior; the decision problem may be non-Markovian. At each time step, one of the basic tasks is executed, so that the complex behavior is a temporal sequence of elementary tasks. A CbMU can be modelled as an adaptive switch which learns to choose, among its set of output channels, the one to be activated (given its perceptive data and a short-term memory) in order to respect a particular constraint. An output channel may be linked either to the firing of a basic task or to the activation of another CbMU; this allows a hierarchical decision process involving different levels of context. The dynamics of the system are learnt by means of a perceptive graph, and the cycles detected by the short-term memory of a CbMU are used as sub-goals to build internal contexts. The learning procedure of a CbMU is a reinforcement-learning-inspired algorithm based on a heuristic which does not need internal parameters. It is achieved by a consistency law between the binary values of the connected nodes of the perceptive graph, inspired by the AI minimax algorithm. In this article, an example of programming with CbMUs is given, using a simulated Khepera robot. The objective is to build a goal-reaching behavior, formulated as a high-level strategy composed of logical rules using perceptive primitives. Four CbMUs are created, each one dedicated to the exploitation of particular perceptive data, and five basic tasks are used.
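The switch structure and the minimax-inspired consistency idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, since the abstract gives no algorithmic details. The class layout, the `choice` lookup table (standing in for whatever a CbMU actually learns), the percept strings, the task names, and the `consistent_value` rule are all assumptions made for illustration. The short-term memory and the per-unit perceptive data mentioned in the abstract are left out.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union

# A basic task is represented by its name only (hypothetical names below).
BasicTask = str


@dataclass
class CbMU:
    """Illustrative switch: maps a percept to one of its output channels."""
    name: str
    # Each output channel leads either to a basic task or to another CbMU.
    channels: List[Union[BasicTask, "CbMU"]]
    # Assumed lookup table standing in for the learnt channel choice;
    # percepts missing from the table fall back to channel 0.
    choice: Dict[str, int] = field(default_factory=dict)

    def select(self, percept: str) -> BasicTask:
        """Follow activated channels down the hierarchy to a basic task."""
        target = self.channels[self.choice.get(percept, 0)]
        if isinstance(target, CbMU):
            return target.select(percept)
        return target


def consistent_value(successors: Dict[str, List[int]]) -> int:
    """Assumed minimax-style rule on binary node values.

    `successors` maps each channel to the binary values (1 = constraint
    can still be respected, 0 = it cannot) of the nodes that channel may
    lead to; every list must be non-empty. The node is worth 1 if at
    least one channel leads only to 1-valued nodes (max over channels
    of min over outcomes).
    """
    return max(min(values) for values in successors.values())


# Two-level hierarchy: the top unit either goes forward or delegates
# obstacle handling to a lower-level unit.
avoid = CbMU("avoid", ["turn_left", "turn_right"], {"obstacle_left": 1})
top = CbMU("top", ["go_forward", avoid],
           {"obstacle_left": 1, "obstacle_right": 1})

print(top.select("clear"))           # go_forward
print(top.select("obstacle_left"))   # turn_right
print(top.select("obstacle_right"))  # turn_left
print(consistent_value({"a": [1, 0], "b": [1, 1]}))  # 1
```

In this sketch the hierarchy is resolved by plain recursion, so the sequence of basic tasks over time is simply the sequence of `select` results for successive percepts. A learning version would update `choice` whenever the binary values of connected nodes become inconsistent. How the paper does this is not stated in the abstract.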
Main file
EWLR1999_FD_CB.pdf (256.44 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-03239021 , version 1 (27-05-2021)

Identifiers

  • HAL Id : hal-03239021 , version 1

Cite

Frédéric Davesne, Claude Barret. A Coherence Approach to Learning from Reward. Application to the Reactive Navigation of a simulated Mobile Robot. 4th European Workshop on Reinforcement Learning (EWRL’99), Oct 1999, Lugano, Switzerland. ⟨hal-03239021⟩

Collections

CEA UNIV-EVRY
26 Views
6 Downloads
