M. F. Rushworth and T. E. Behrens, Choice, uncertainty and value in prefrontal and cingulate cortex, Nature Neuroscience, vol.11, issue.4, pp.389-397, 2008.

N. D. Daw, J. P. O'Doherty, P. Dayan, B. Seymour, and R. J. Dolan, Cortical substrates for exploratory decisions in humans, Nature, vol.441, issue.7095, pp.876-879, 2006.

N. Schweighofer and K. Doya, Meta-learning in reinforcement learning, Neural Networks, vol.16, issue.1, pp.5-9, 2003.

M. R. Nassar, R. C. Wilson, B. Heasly, and J. I. Gold, An Approximately Bayesian Delta-Rule Model Explains the Dynamics of Belief Updating in a Changing Environment, J. Neurosci, vol.30, issue.37, pp.12366-12378, 2010.

R. C. Wilson, A. Geana, J. M. White, E. A. Ludvig, and J. D. Cohen, Humans use directed and random exploration to solve the explore-exploit dilemma, J. Exp. Psychol. Gen, vol.143, issue.6, pp.2074-2081, 2014.

W. Schultz, P. Dayan, and P. R. Montague, A neural substrate of prediction and reward, Science, vol.275, pp.1593-1599, 1997.

W. Schultz, Updating dopamine reward signals, Current Opinion in Neurobiology, vol.23, issue.2, pp.229-238, 2013.

M. Watabe-Uchida, N. Eshel, and N. Uchida, Neural Circuitry of Reward Prediction Error, Annu. Rev. Neurosci, vol.40, issue.1, pp.373-394, 2017.

L. Coddington and J. T. Dudman, The timing of action determines reward prediction signals in identified midbrain dopamine neurons, Nature Neuroscience, vol.21, issue.11, pp.1563-1573, 2018.

H. M. Bayer and P. W. Glimcher, Midbrain Dopamine Neurons Encode a Quantitative Reward Prediction Error Signal, Neuron, vol.47, issue.1, pp.129-141, 2005.

G. Morris, A. Nevet, D. Arkadir, E. Vaadia, and H. Bergman, Midbrain dopamine neurons encode decisions for future action, Nat. Neurosci, vol.9, issue.8, pp.1057-1063, 2006.

M. R. Roesch, D. J. Calu, and G. Schoenbaum, Dopamine neurons encode the better option in rats deciding between differently delayed or sized rewards, Nat. Neurosci, vol.10, issue.12, pp.1615-1639, 2007.

M. Matsumoto and O. Hikosaka, Two types of dopamine neuron distinctly convey positive and negative motivational signals, Nature, vol.459, issue.7248, pp.837-841, 2009.

D. Centonze, B. Picconi, P. Gubellini, G. Bernardi, and P. Calabresi, Dopaminergic control of synaptic plasticity in the dorsal striatum, Eur. J. Neurosci, vol.13, issue.6, pp.1071-1077, 2001.

J. N. Reynolds, B. I. Hyland, and J. R. Wickens, A cellular mechanism of reward-related learning, Nature, vol.413, issue.6851, pp.67-70, 2001.

E. M. Izhikevich, Solving the distal reward problem through linkage of STDP and dopamine signaling, Cereb. Cortex, vol.17, issue.10, pp.2443-2452, 2007.

V. D. Costa, V. L. Tran, J. Turchi, and B. B. Averbeck, Dopamine modulates novelty seeking behavior during decision making, Behav. Neurosci, vol.128, issue.5, pp.556-566, 2014.

D. M. Haluk and S. B. Floresco, Ventral striatal dopamine modulation of different forms of behavioral flexibility, Neuropsychopharmacology, vol.34, issue.8, pp.2041-2052, 2009.

S. B. Flagel, A selective role for dopamine in stimulus-reward learning, Nature, vol.469, issue.7328, pp.53-57, 2011.

G. K. Papageorgiou, M. Baudonnat, F. Cucca, and M. E. Walton, Mesolimbic Dopamine Encodes Prediction Errors in a State-Dependent Manner, Cell Rep, vol.15, issue.2, pp.221-229, 2016.

N. L. Jenni, J. D. Larkin, and S. B. Floresco, Prefrontal Dopamine D1 and D2 Receptors Regulate Dissociable Aspects of Decision Making via Distinct Ventral Striatal and Amygdalar Circuits, J. Neurosci, vol.37, issue.26, pp.6200-6213, 2017.

J. Salamone, M. Correa, S. Mingote, and S. Weber, Beyond the reward hypothesis: alternative functions of nucleus accumbens dopamine, Curr. Opin. Pharmacol, vol.5, issue.1, pp.34-41, 2005.

C. W. Berridge and A. F. Arnsten, Psychostimulants and motivated behavior: Arousal and cognition, Neurosci. Biobehav. Rev, vol.37, issue.9, pp.1976-1984, 2013.

C. M. Stopper, M. T. Tse, D. R. Montes, C. R. Wiedman, and S. B. Floresco, Overriding Phasic Dopamine Signals Redirects Action Selection during Risk/Reward Decision Making, Neuron, vol.84, issue.1, pp.177-189, 2014.

Y. Niv, N. D. Daw, D. Joel, and P. Dayan, Tonic dopamine: Opportunity costs and the control of response vigor, Psychopharmacology (Berl), vol.191, issue.3, pp.507-520, 2007.

J. Naudé, Nicotinic receptors in the ventral tegmental area promote uncertainty-seeking, Nat. Neurosci, 2015.

M. J. Frank, B. B. Doll, J. Oas-Terpstra, and F. Moreno, The neurogenetics of exploration and exploitation: Prefrontal and striatal dopaminergic components, Nature Neuroscience, vol.12, issue.8, pp.1062-1068, 2009.

W. K. Zajkowski, M. Kossut, and R. C. Wilson, A causal role for right frontopolar cortex in directed, but not random, exploration, eLife, vol.6, pp.1-18, 2017.

I. Cogliati-Dezza, A. J. Yu, A. Cleeremans, and W. Alexander, Learning the value of information and reward over time when solving exploration-exploitation problems, Sci. Rep, vol.7, issue.1, p.16919, 2017.

M. D. Humphries, M. Khamassi, and K. Gurney, Dopaminergic control of the exploration-exploitation trade-off via the basal ganglia, Front. Neurosci, vol.6, pp.1-14, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00688928

R. Sutton and A. Barto, Reinforcement Learning: An Introduction, 1998.

K. Doya, Modulators of decision making, Nat. Neurosci, vol.11, issue.4, pp.410-416, 2008.

M. Khamassi, P. Enel, P. F. Dominey, and E. Procyk, Medial prefrontal cortex and the adaptive regulation of reinforcement learning parameters, Prog. Brain Res, vol.202, pp.441-464, 2013.
URL : https://hal.archives-ouvertes.fr/hal-01628829

J. A. Beeler, N. Daw, C. R. Frazier, and X. Zhuang, Tonic dopamine modulates exploitation of reward learning, Front. Behav. Neurosci, vol.4, p.170, 2010.

E. Lee, M. Seo, O. Dal-Monte, and B. B. Averbeck, Injection of a Dopamine Type 2 Receptor Antagonist into the Dorsal Striatum Disrupts Choices Driven by Previous Outcomes, But Not Perceptual Inference, J. Neurosci, vol.35, issue.16, pp.6298-6306, 2015.

C. Eisenegger, Role of dopamine D2 receptors in human reinforcement learning, Neuropsychopharmacology, vol.39, issue.10, pp.2366-2375, 2014.

L. K. Krugel, G. Biele, P. N. Mohr, S. Li, and H. R. Heekeren, Genetic variation in dopaminergic neuromodulation influences the ability to rapidly and flexibly adapt decisions, Proc. Natl. Acad. Sci. USA, vol.106, issue.42, pp.17951-17957, 2009.

B. B. Averbeck, Theory of Choice in Bandit, Information Sampling and Foraging Tasks, PLoS Comput. Biol, vol.11, issue.3, pp.1-28, 2015.

F. Lesaint, O. Sigaud, S. B. Flagel, T. E. Robinson, and M. Khamassi, Modelling Individual Differences in the Form of Pavlovian Conditioned Approach Responses: A Dual Learning Systems Approach with Factored Representations, PLoS Comput. Biol, vol.10, issue.2, 2014.

N. D. Daw, Trial-by-trial data analysis using computational models, Decis. Making, Affect. Learn. Atten. Perform. XXIII, pp.1-26, 2011.

B. B. Averbeck and V. D. Costa, Motivational neural circuits underlying reinforcement learning, Nat. Neurosci, vol.20, issue.4, pp.505-512, 2017.

S. J. Gershman and B. G. Tzovaras, Dopaminergic genes are associated with both directed and random exploration, Neuropsychologia, vol.120, pp.97-104, 2018.

A. Dickinson, J. Smith, and J. Mirenowicz, Dissociation of Pavlovian and instrumental incentive learning under dopamine antagonists, Behav. Neurosci, vol.114, issue.3, pp.468-483, 2000.

M. F. Barbano, M. Le-Saux, and M. Cador, Involvement of dopamine and opioids in the motivation to eat: influence of palatability, homeostatic state, and behavioral paradigms, Psychopharmacology (Berl), vol.203, issue.3, pp.475-487, 2009.

Y. Niv, Cost, benefit, tonic, phasic: What do response rates tell us about dopamine and motivation?, Ann. N. Y. Acad. Sci, vol.1104, pp.357-376, 2007.

J. A. Beeler, C. R. Frazier, and X. Zhuang, Putting desire on a budget: dopamine and energy expenditure, reconciling reward and resources, Front. Integr. Neurosci, vol.6, p.49, 2012.

S. Kakade and P. Dayan, Dopamine: generalization and bonuses, Neural Netw, issue.4-6, pp.549-559, 2002.

K. Katahira, The relation between reinforcement learning parameters and the influence of reinforcement history on choice behavior, J. Math. Psychol, vol.66, pp.59-69, 2015.

T. E. Behrens, M. W. Woolrich, M. E. Walton, and M. F. Rushworth, Learning the value of information in an uncertain world, Nat. Neurosci, vol.10, issue.9, pp.1214-1235, 2007.

M. Jepma, Catecholaminergic Regulation of Learning Rate in a Dynamic Environment, PLoS Comput. Biol, vol.12, issue.10, p.1005171, 2016.

K. N. Gurney, M. Humphries, R. Wood, T. J. Prescott, and P. Redgrave, Testing computational hypotheses of brain systems function: a case study with the basal ganglia, Network, vol.15, issue.4, pp.263-290, 2004.

A. A. Grace, S. B. Floresco, Y. Goto, and D. J. Lodge, Regulation of firing of dopaminergic neurons and control of goal-directed behaviors, Trends Neurosci, vol.30, issue.5, pp.220-227, 2007.

S. Q. Park, Adaptive coding of reward prediction errors is gated by striatal coupling, Proc. Natl. Acad. Sci. USA, vol.109, pp.4285-4289, 2012.

A. Lak, W. R. Stauffer, and W. Schultz, Dopamine neurons learn relative chosen value from probabilistic rewards, eLife, vol.5, 2016.

M. Guitart-Masip, U. R. Beierholm, R. Dolan, E. Duzel, and P. Dayan, Vigor in the Face of Fluctuating Rates of Reward: An Experimental Examination, J. Cogn. Neurosci, vol.23, issue.12, pp.3933-3938, 2011.

P. N. Tobler, C. D. Fiorillo, and W. Schultz, Adaptive coding of reward value by dopamine neurons, Science, vol.307, pp.1642-1645, 2005.

K. M. Diederen, Dopamine Modulated Adaptive Prediction Error Coding in the Human Midbrain and Striatum, J. Neurosci, vol.37, issue.7, pp.1708-1720, 2017.

W. Schultz, Neuronal Reward and Decision Signals: From Theories to Data, Physiol. Rev, vol.95, issue.3, pp.853-951, 2015.

M. Pessiglione, B. Seymour, G. Flandin, R. J. Dolan, and C. D. Frith, Dopamine-dependent prediction errors underpin reward-seeking behaviour in humans, Nature, vol.442, issue.7106, pp.1042-1045, 2006.

V. D. Costa, V. L. Tran, J. Turchi, and B. B. Averbeck, Reversal learning and dopamine: a Bayesian perspective, J. Neurosci, vol.35, issue.6, pp.2407-2416, 2015.

T. Shiner, Dopamine, salience, and response set shifting in prefrontal cortex, Cereb. Cortex, vol.25, issue.10, pp.3629-3639, 2015.

P. Smittenaar, Decomposing effects of dopaminergic medication in Parkinson's disease on probabilistic action selection - learning or performance?, Eur. J. Neurosci, vol.35, issue.7, pp.1144-1151, 2012.

M. Ito and K. Doya, Validation of decision-making models and analysis of decision variables in the rat basal ganglia, J. Neurosci, vol.29, issue.31, pp.9861-9874, 2009.

S. Palminteri, V. Wyart, and E. Koechlin, The Importance of Falsification in Computational Cognitive Modeling, Trends Cogn. Sci, vol.21, issue.6, pp.425-433, 2017.

Scientific Reports, vol.9, p.6770, 2019.