R.B.: Idea of the model, specification of the model and tests, implementation of the model, tests, data analysis, analysis of results, writing-up. G.B.: Idea of the model, specification of the model and tests, analysis of results, writing-up. E.C. and A.B.: Specification of the model and tests, analysis of results, writing-up.

R.B. carried out this work thanks to the support of the A*MIDEX grant, French Government "Investissements d'Avenir" program. E.C. received funding from the European Union, Horizon 2020 Research and Innovation Program, under Grant Agreement n°713010, Project "GOAL-Robots - Goal-based Open-ended Autonomous Learning Robots". A.B. was supported by the French National Agency.

A. Dickinson and B. Balleine, Motivational control of goal-directed action, Animal Learning & Behavior, vol.22, issue.1, pp.1-18, 1994.

B. W. Balleine and A. Dickinson, Goal-directed instrumental action: contingency and incentive learning and their cortical substrates, Neuropharmacology, vol.37, issue.4, pp.407-419, 1998.

R. Dolan and P. Dayan, Goals and Habits in the Brain, Neuron, vol.80, issue.2, pp.312-325, 2013.

R. S. Sutton and A. G. Barto, Reinforcement learning: an introduction, 1998.

M. M. Botvinick, Y. Niv, and A. Barto, Hierarchically organized behavior and its neural foundations: A reinforcement-learning perspective, Cognition, vol.113, issue.3, pp.262-280, 2008.

B. W. Balleine, A. Dezfouli, M. Ito, and K. Doya, Hierarchical control of goal-directed action in the cortical-basal ganglia network, Current Opinion in Behavioral Sciences, vol.5, pp.1-7, 2015.

F. Mannella, K. Gurney, and G. Baldassarre, The nucleus accumbens as a nexus between values and goals in goal-directed behavior: a review and a new hypothesis, Frontiers in Behavioral Neuroscience, vol.7, 2013.

J. Ribas-Fernandes, A. Solway, C. Diuk, J. T. McGuire, A. G. Barto et al., A neural signature of hierarchical reinforcement learning, Neuron, vol.71, issue.2, pp.370-379, 2011.

H. H. Yin, S. B. Ostlund, B. J. Knowlton, and B. W. Balleine, The role of the dorsomedial striatum in instrumental conditioning, European Journal of Neuroscience, vol.22, issue.2, pp.513-523, 2005.

N. D. Daw, J. P. O'Doherty, P. Dayan, B. Seymour, and R. J. Dolan, Cortical substrates for exploratory decisions in humans, Nature, vol.441, issue.7095, pp.876-879, 2006.

K. Mehlhorn, B. R. Newell, P. M. Todd, M. D. Lee, K. Morgan et al., Unpacking the exploration-exploitation tradeoff: A synthesis of human and animal literatures, Decision, vol.2, issue.3, pp.191-215, 2015.

A. Brovelli, N. Laksiri, B. Nazarian, M. Meunier, and D. Boussaoud, Understanding the Neural Computations of Arbitrary Visuomotor Learning through fMRI and Associative Learning Theory, Cerebral Cortex, vol.18, issue.7, pp.1485-1495, 2008.

M. Jahanshahi, I. Obeso, J. C. Rothwell, and J. A. Obeso, A fronto-striato-subthalamic-pallidal network for goal-directed and habitual inhibition, Nature Reviews Neuroscience, vol.16, issue.12, pp.719-732, 2015.

D. Caligiore, M. A. Arbib, C. R. Miall, and G. Baldassarre, The super-learning hypothesis: Integrating learning processes across cortex, cerebellum and basal ganglia, Neuroscience and Biobehavioral Reviews, vol.100, pp.19-34, 2019.

H. Helmholtz, Concerning the perceptions in general, Treatise on physiological optics, vol.III, pp.214-230, 1866.

P. Dayan, G. E. Hinton, R. M. Neal, and R. S. Zemel, The Helmholtz machine, Neural Computation, vol.7, issue.5, pp.889-904, 1995.

K. Doya, S. Ishii, A. Pouget, and R. Rao, The Bayesian Brain: Probabilistic Approaches to Neural Coding, 2007.

K. Friston, The free-energy principle: a unified brain theory?, Nature Reviews Neuroscience, vol.11, issue.2, pp.127-138, 2010.

T. L. Griffiths, C. Kemp, and J. B. Tenenbaum, Bayesian models of cognition, 2008.

M. Toussaint and A. Storkey, Probabilistic inference for solving discrete and continuous state Markov Decision Processes, Proceedings of the 23rd international conference on Machine learning, pp.945-952, 2006.

M. Botvinick and M. Toussaint, Planning as inference, Trends in Cognitive Sciences, vol.16, issue.10, pp.485-488, 2012.

H. J. Kappen, V. Gómez, and M. Opper, Optimal control as a graphical model inference problem, Machine Learning, vol.87, pp.159-182, 2012.

C. M. Bishop, Pattern recognition and machine learning, 2006.

D. Kappel, B. Nessler, and W. Maass, STDP Installs in Winner-Take-All Circuits an Online Approximation to Hidden Markov Model Learning, PLoS Computational Biology, vol.10, issue.3, p.1003511, 2014.

R. P. Rao, B. A. Olshausen, and M. S. Lewicki, Probabilistic models of the brain: Perception and neural function, 2002.

M. Jones and B. C. Love, Bayesian Fundamentalism or Enlightenment? On the explanatory status and theoretical contributions of Bayesian models of cognition, Behavioral and Brain Sciences, vol.34, issue.4, pp.169-188, 2011.

W. Maass, Networks of spiking neurons: the third generation of neural network models, Neural Networks, vol.10, issue.9, pp.1659-1671, 1997.

L. Buesing, J. Bill, B. Nessler, and W. Maass, Neural Dynamics as Sampling: A Model for Stochastic Computation in Recurrent Networks of Spiking Neurons, PLoS Computational Biology, vol.7, issue.11, p.1002211, 2011.

A. E. Orhan and W. J. Ma, Efficient probabilistic inference in generic neural networks trained with non-probabilistic feedback, Nature Communications, vol.8, p.138, 2017.

A. Pouget, J. M. Beck, W. J. Ma, and P. E. Latham, Probabilistic brains: knowns and unknowns, Nature Neuroscience, vol.16, issue.9, pp.1170-1178, 2013.

W. Maass, On the computational power of winner-take-all, Neural Computation, vol.12, issue.11, pp.2519-2535, 2000.

B. Nessler, M. Pfeiffer, L. Buesing, and W. Maass, Bayesian Computation Emerges in Generic Cortical Microcircuits through Spike-Timing-Dependent Plasticity, PLoS Computational Biology, vol.9, issue.4, p.1003037, 2013.

J. Bill, L. Buesing, S. Habenschuss, B. Nessler, W. Maass et al., Distributed Bayesian Computation and Self-Organized Learning in Sheets of Spiking Neurons with Local Lateral Inhibition, PLOS ONE, vol.10, issue.8, p.134356, 2015.

E. A. Rückert, G. Neumann, M. Toussaint, and W. Maass, Learned graphical models for probabilistic planning provide a new class of movement primitives, Frontiers in Computational Neuroscience, vol.6, 2013.

E. Rueckert, D. Kappel, D. Tanneberg, D. Pecevski, and J. Peters, Recurrent Spiking Networks Solve Planning Tasks, Scientific Reports, vol.6, issue.1, 2016.

D. Tanneberg, A. Paraschos, J. Peters, and E. Rueckert, Deep spiking networks for model-based planning in humanoids, 2016 IEEE-RAS 16th International Conference on Humanoid Robots (Humanoids), IEEE, pp.656-661, 2016.

N. D. Daw, Y. Niv, and P. Dayan, Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control, Nature Neuroscience, vol.8, issue.12, pp.1704-1711, 2005.

G. Viejo, M. Khamassi, A. Brovelli, and B. Girard, Modeling choice and reaction time during arbitrary visuomotor learning through the coordination of adaptive working memory and reinforcement learning, Frontiers in Behavioral Neuroscience, vol.9, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01215419

R. M. Klein, Inhibition of return, Trends in Cognitive Sciences, vol.4, issue.4, pp.138-147, 2000.
URL : https://hal.archives-ouvertes.fr/inserm-00000089

R. M. Neal and G. E. Hinton, A view of the EM algorithm that justifies incremental, sparse, and other variants, in: Learning in Graphical Models, pp.355-368, 1998.

R. Jolivet, A. Rauch, H. R. Lüscher, and W. Gerstner, Predicting spike timing of neocortical pyramidal neurons by simple threshold models, Journal of Computational Neuroscience, vol.21, issue.1, pp.35-49, 2006.

Y. Dan and M. M. Poo, Spike timing-dependent plasticity of neural circuits, Neuron, vol.44, issue.1, pp.23-30, 2004.

D. Feldman, The Spike-Timing Dependence of Plasticity, Neuron, vol.75, issue.4, pp.556-571, 2012.

H. Markram, W. Gerstner, and P. J. Sjöström, Spike-Timing-Dependent Plasticity: A Comprehensive Overview, Frontiers in Synaptic Neuroscience, vol.4, 2012.

S. Zappacosta, F. Mannella, M. Mirolli, and G. Baldassarre, General differential Hebbian learning: Capturing temporal relations between events in neural networks and the brain, PLoS Computational Biology, vol.14, issue.8, p.1006227, 2018.

D. P. Kingma and M. Welling, Auto-Encoding Variational Bayes, 2013.

I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning, Boston, MA, 2017.

H. Jaeger, The 'echo state' approach to analysing and training recurrent neural networks - with an erratum note, p.48, 2001.

W. Maass, T. Natschläger, and H. Markram, Real-time computing without stable states: a new framework for neural computation based on perturbations, Neural Computation, vol.14, issue.11, pp.2531-2560, 2002.

J. J. Gibson, The Ecological Approach to Visual Perception, 1979.

G. Baldassarre, W. Lord, G. Granato, and V. G. Santucci, An embodied agent learning affordances with intrinsic motivations and solving extrinsic tasks with attention and one-step planning, Frontiers in Neurorobotics, vol.13, issue.45, 2019.

R. C. O'Reilly and M. J. Frank, Making working memory work: a computational model of learning in the prefrontal cortex and basal ganglia, Neural Computation, vol.18, issue.2, pp.283-328, 2006.

F. Mannella, M. Mirolli, and G. Baldassarre, Goal-Directed Behavior and Instrumental Devaluation: A Neural System-Level Computational Model, Frontiers in Behavioral Neuroscience, vol.10, issue.181, pp.1-27, 2016.

A. Brovelli, J. M. Badier, F. Bonini, F. Bartolomei, O. Coulon et al., Dynamic reconfiguration of visuomotor-related functional connectivity networks, Journal of Neuroscience, vol.37, issue.4, pp.839-853, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01464162

A. Brovelli, D. Chicharro, J. M. Badier, H. Wang, and V. Jirsa, Characterization of Cortical Networks and Corticocortical Functional Connectivity Mediating Arbitrary Visuomotor Mapping, Journal of Neuroscience, vol.35, issue.37, pp.12643-12658, 2015.
URL : https://hal.archives-ouvertes.fr/hal-02087533

N. Kriegeskorte, M. Mur, and P. A. Bandettini, Representational similarity analysis - connecting the branches of systems neuroscience, Frontiers in Systems Neuroscience, vol.2, p.4, 2008.