Reinforcement learning model in a probabilistically rewarded task
Abstract
Adapting resource-seeking behavior is of primary importance for survival, and balancing exploration with exploitation of discovered resources is at the core of adaptation to the environment. The reinforcement learning framework was elaborated to formalize such reward-seeking behavior, and biologically plausible models based on it have flourished recently. Among them, a neural network model was developed to investigate the functions of the anterior cingulate cortex (ACC) and the dorsolateral prefrontal cortex (DLPFC), involved in action valuation and action selection, respectively (Khamassi et al., 2010). This model proposes a method, inspired by the meta-learning literature (Doya, 2002), to regulate exploration dynamically and thereby solve the exploration/exploitation trade-off online. The model performed well in a deterministic problem-solving task (PST). Our goal was to demonstrate that the model generalizes to a more ecological PST with probabilistically dispensed rewards. The model was tested first with its preset learning rate, exploration rate, and initial action values, and then with parameters obtained by a search of the parameter space. The preset parameter values proved good, though not optimal, for the new task. Interestingly, the model's performance depends strongly on the initial action values.
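The abstract's core ingredients (a learning rate, an exploration rate governing softmax action selection, and initial action values) can be illustrated with a minimal sketch. This is not the authors' ACC/DLPFC network; it is a generic delta-rule value learner on a probabilistic two-armed bandit, with hypothetical parameter names (`alpha`, `beta`, `q_init`) chosen for illustration. Fixed `beta` stands in for the dynamically regulated exploration rate the model implements.

```python
import math
import random

def softmax_choice(values, beta):
    """Softmax action selection: beta is the inverse temperature
    (high beta = exploit, low beta = explore)."""
    exps = [math.exp(beta * v) for v in values]
    total = sum(exps)
    r = random.random()
    cum = 0.0
    for action, e in enumerate(exps):
        cum += e / total
        if r < cum:
            return action
    return len(exps) - 1

def run_bandit(reward_probs, alpha=0.1, beta=3.0, q_init=0.5,
               n_trials=1000, seed=0):
    """Delta-rule value learning on a probabilistic bandit.
    Each arm pays reward 1 with its own probability; q_init sets the
    initial action values, whose influence on performance the abstract
    highlights (optimistic initialization encourages early exploration)."""
    random.seed(seed)
    q = [q_init] * len(reward_probs)
    total_reward = 0.0
    for _ in range(n_trials):
        a = softmax_choice(q, beta)
        r = 1.0 if random.random() < reward_probs[a] else 0.0
        q[a] += alpha * (r - q[a])  # prediction-error update
        total_reward += r
    return q, total_reward
```

With enough trials the learned values track the reward probabilities, so the agent comes to prefer the richer arm; sweeping `alpha`, `beta`, and `q_init` is the kind of parameter-space search the abstract describes.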
Main file: NEUROCOMP2010_0046_3d394d938a755edf24c9a86282b91b2f.pdf (374 KB)
Origin: Files produced by the author(s)