Avoiding catastrophic forgetting by coupling two reverberating neural networks
Abstract
Gradient descent learning procedures are the most commonly used in neural network modeling. When these algorithms (e.g., backpropagation) are applied to sequential learning tasks, a major drawback, termed catastrophic forgetting (or catastrophic interference), generally arises: when a network that has already learned a first set of items is then trained on a second set, the newly learned information may completely destroy the information learned previously. To avoid this implausible failure, we propose a two-network architecture in which new items are learned by a first network concurrently with internal pseudo-items originating from a second network. Because these pseudo-items are shown to reflect the structure of the items previously learned by the first network, the model implements a mechanism that refreshes the old information. The crucial point is that this refreshing mechanism is based on reverberating neural networks, which need only random stimulation to operate. The model thus provides a means to dramatically reduce retroactive interference while preserving the essentially distributed nature of the information, and it suggests an original but plausible means to "copy and paste" a distributed memory from one place in the brain to another.
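As a rough illustration of the idea summarized above, the sketch below shows one way pseudo-items could be generated from random stimulation and interleaved with new items. It is a minimal sketch under stated assumptions, not the architecture of the paper: the `AutoAssociator` class, the `pseudo_items` helper, the layer sizes, the number of reverberation cycles, and the learning rate are all hypothetical choices made for illustration.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class AutoAssociator:
    """Hypothetical one-hidden-layer network trained by plain gradient descent."""

    def __init__(self, n_units, n_hidden, rng):
        self.w1 = rng.normal(0.0, 0.5, (n_units, n_hidden))
        self.w2 = rng.normal(0.0, 0.5, (n_hidden, n_units))

    def forward(self, x):
        h = sigmoid(x @ self.w1)
        return h, sigmoid(h @ self.w2)

    def train(self, inputs, targets, epochs=200, lr=0.5):
        for _ in range(epochs):
            h, y = self.forward(inputs)
            # Backpropagate the squared error through both sigmoid layers.
            d_out = (y - targets) * y * (1.0 - y)
            d_hid = (d_out @ self.w2.T) * h * (1.0 - h)
            self.w2 -= lr * h.T @ d_out / len(inputs)
            self.w1 -= lr * inputs.T @ d_hid / len(inputs)


def pseudo_items(net, n_items, n_cycles, rng):
    """Assumed reverberation scheme: start from random binary input,
    re-inject the output as input for n_cycles, then return the final
    (input, output) pairs as pseudo-items."""
    x = rng.integers(0, 2, (n_items, net.w1.shape[0])).astype(float)
    for _ in range(n_cycles):
        _, x = net.forward(x)
    _, y = net.forward(x)
    return x, y


rng = np.random.default_rng(0)
old_items = rng.integers(0, 2, (4, 8)).astype(float)
new_items = rng.integers(0, 2, (4, 8)).astype(float)

net1 = AutoAssociator(8, 6, rng)
net2 = AutoAssociator(8, 6, rng)
net1.train(old_items, old_items)

# Transfer what net1 knows to net2 using only pseudo-items from net1.
px, py = pseudo_items(net1, 32, 3, rng)
net2.train(px, py)

# net1 learns the new items interleaved with pseudo-items from net2.
qx, qy = pseudo_items(net2, 32, 3, rng)
net1.train(np.vstack([new_items, qx]), np.vstack([new_items, qy]))
```

In this sketch the old items are never shown again after the first training phase; whatever net1 retains about them during the second phase comes only from the pseudo-items produced by net2.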