Learning from Reward as an emergent property of Physics-like interactions between neurons in an artificial neural network.
Abstract
We study a class of artificial neural networks in which a physics-like conservation law on the activity of connected neurons is imposed at each time step. We postulate that the resulting changes in network activity can be interpreted as a learning capability, provided the conservation law is chosen judiciously. We illustrate this claim by modeling a rat's behavior in a labyrinth: exploring the labyrinth creates connections between neurons (latent learning), whereas discovering food triggers a one-step backpropagation process over the activities (reinforcement learning). We give theoretical results about our learning algorithm, CbL, and show that it is intrinsically faster than Q-Learning.
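For readers unfamiliar with the baseline named above, the following is a minimal sketch of standard tabular Q-Learning, not of CbL, whose update rule is not given in this abstract. The one-dimensional corridor standing in for the labyrinth, the function name `q_learning_corridor`, and all hyperparameter values are assumptions chosen only for illustration.

```python
import random


def q_learning_corridor(n_states=6, episodes=200, alpha=0.5,
                        gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-Learning on a hypothetical 1-D corridor.

    The agent starts in cell 0; food (reward 1) is in the last cell.
    Environment and hyperparameters are illustrative assumptions.
    """
    rng = random.Random(seed)
    moves = (-1, +1)                      # action 0 = left, action 1 = right
    q = [[0.0, 0.0] for _ in range(n_states)]
    goal = n_states - 1

    for _ in range(episodes):
        s = 0
        while s != goal:
            # Epsilon-greedy choice; ties are broken at random.
            if rng.random() < epsilon or q[s][0] == q[s][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[s][0] > q[s][1] else 1

            # Walls at both ends: the position is clamped to the corridor.
            s_next = min(max(s + moves[a], 0), goal)
            r = 1.0 if s_next == goal else 0.0

            # Bootstrapped target; the terminal cell has no future value.
            target = r if s_next == goal else r + gamma * max(q[s_next])
            q[s][a] += alpha * (target - q[s][a])
            s = s_next
    return q


q_table = q_learning_corridor()
# Greedy action per non-terminal cell (0 = left, 1 = right).
greedy_policy = [0 if left > right else 1 for left, right in q_table[:-1]]
```

In this sketch the reward reaches cells far from the food only one cell per visit, because each update looks a single step ahead. This slow backward propagation is the property against which speed comparisons with Q-Learning are usually made.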
Source: Files produced by the author(s)