S. V. Albertin, A. B. Mulder, E. Tabuchi, M. B. Zugaro, and S. I. Wiener, Lesions of the medial shell of the nucleus accumbens impair rats in finding larger rewards, but spare reward-seeking behavior, Behavioural brain research, vol.117, issue.1-2, pp.173-183, 2000.
URL : https://hal.archives-ouvertes.fr/hal-00618317

R. E. Ambrose, B. E. Pfeiffer, and D. J. Foster, Reverse replay of hippocampal place cells is uniquely modulated by changing reward, Neuron, vol.91, issue.5, pp.1124-1136, 2016.

L. Aubin, M. Khamassi, and B. Girard, Prioritized sweeping neural DynaQ with multiple predecessors, and hippocampal replays, Living Machines, 2018.
URL : https://hal.archives-ouvertes.fr/hal-01709275

A. G. Barto, Adaptive critics and the basal ganglia, Models of Information Processing in the Basal Ganglia, pp.215-232, 1995.

T. Bast, I. A. Wilson, M. P. Witter, and R. G. Morris, From rapid place learning to behavioral performance: a key role for the intermediate hippocampus, PLoS biology, vol.7, issue.4, p.1000089, 2009.

S. Bavard, M. Lebreton, M. Khamassi, G. Coricelli, and S. Palminteri, Reference point and range-adaptation produce both rational and irrational choices in human reinforcement learning, Nature Communications, 2018.

G. Buzsáki, Two-stage model of memory trace formation: A role for "noisy" brain states, Neuroscience, vol.31, issue.3, pp.551-570, 1989.

G. Buzsáki, Hippocampal sharp wave-ripple: A cognitive biomarker for episodic memory and planning, Hippocampus, vol.25, issue.10, pp.1073-1188, 2015.

G. Buzsáki, Z. Horvath, R. Urioste, J. Hetke, and K. Wise, High-frequency network oscillation in the hippocampus, Science, vol.256, issue.5059, pp.1025-1027, 1992.

R. Chavarriaga, T. Strösslin, D. Sheynikhovich, and W. Gerstner, A computational model of parallel navigation systems in rodents, Neuroinformatics, vol.3, issue.3, pp.223-241, 2005.

Z. Chen and M. A. Wilson, Deciphering neural codes of memory during sleep, Trends in Neurosciences, 2017.

A. S. Dave and D. Margoliash, Song replay during sleep and computational rules for sensorimotor vocal learning, Science, vol.290, issue.5492, pp.812-816, 2000.

N. D. Daw, S. J. Gershman, B. Seymour, P. Dayan, and R. J. Dolan, Model-based influences on humans' choices and striatal prediction errors, Neuron, vol.69, issue.6, pp.1204-1215, 2011.

N. D. Daw, Y. Niv, and P. Dayan, Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control, Nature neuroscience, vol.8, issue.12, p.1704, 2005.

G. de Lavilléon, M. M. Lacroix, L. Rondi-Reig, and K. Benchenane, Explicit memory creation during sleep demonstrates a causal role of place cells in navigation, Nature neuroscience, vol.18, issue.4, pp.493-495, 2015.

K. Diba and G. Buzsáki, Forward and reverse hippocampal place-cell sequences during ripples, Nature neuroscience, vol.10, issue.10, p.1241, 2007.

S. Diekelmann and J. Born, The memory function of sleep, Nature Reviews Neuroscience, vol.11, issue.2, p.114, 2010.

L. Dollé, R. Chavarriaga, A. Guillot, and M. Khamassi, Interactions of spatial strategies producing generalization gradient and blocking: A computational approach, PLoS computational biology, vol.14, issue.4, p.1006092, 2018.

L. Dollé, D. Sheynikhovich, B. Girard, R. Chavarriaga, and A. Guillot, Path planning versus cue responding: a bio-inspired model of switching between navigation strategies, Biological cybernetics, vol.103, issue.4, pp.299-317, 2010.

D. R. Euston, M. Tatsuno, and B. L. McNaughton, Fast-forward playback of recent memory sequences in prefrontal cortex during sleep, Science, vol.318, issue.5853, pp.1147-1150, 2007.

D. J. Foster, Replay comes of age, Annual review of neuroscience, vol.40, pp.581-602, 2017.

D. J. Foster and M. A. Wilson, Reverse replay of behavioural sequences in hippocampal place cells during the awake state, Nature, vol.440, issue.7084, pp.680-683, 2006.

G. Girardeau, K. Benchenane, S. I. Wiener, G. Buzsáki, and M. B. Zugaro, Selective suppression of hippocampal ripples impairs spatial memory, Nature neuroscience, vol.12, issue.10, pp.1222-1223, 2009.
URL : https://hal.archives-ouvertes.fr/hal-00599372

S. N. Gomperts, F. Kloosterman, and M. A. Wilson, VTA neurons coordinate with the hippocampal reactivation of spatial experience, eLife, vol.4, p.5360, 2015.

S. C. Goodroe, J. M. Starnes, and T. I. Brown, The complex nature of hippocampal-striatal interactions in spatial navigation, Frontiers in Human Neuroscience, vol.12, p.250, 2018.

A. S. Gupta, M. A. van der Meer, D. S. Touretzky, and A. D. Redish, Hippocampal Replay Is Not a Simple Function of Experience, Neuron, vol.65, issue.5, pp.695-705, 2010.

T. Hafting, M. Fyhn, S. Molden, M. Moser, and E. I. Moser, Microstructure of a spatial map in the entorhinal cortex, Nature, vol.436, issue.7052, pp.801-806, 2005.

J. C. Houk, J. L. Adams, and A. G. Barto, A model of how the basal ganglia generate and use neural signals that predict reinforcement, Models of Information Processing in the Basal Ganglia, pp.249-271, 1995.

M. D. Humphries and T. J. Prescott, The ventral basal ganglia, a selection mechanism at the crossroads of space, strategy, and reward, Progress in neurobiology, vol.90, issue.4, pp.385-417, 2010.

S. P. Jadhav, C. Kemere, P. W. German, and L. M. Frank, Awake hippocampal sharp-wave ripples support spatial memory, Science, vol.336, issue.6087, pp.1454-1458, 2012.

D. Ji and M. A. Wilson, Coordinated memory replay in the visual cortex and hippocampus during sleep, Nature neuroscience, vol.10, issue.1, p.100, 2007.

A. Johnson and A. D. Redish, Neural ensembles in CA3 transiently encode paths forward of the animal at a decision point, Journal of Neuroscience, vol.27, issue.45, pp.12176-12189, 2007.

A. Johnson, M. A. van der Meer, and A. D. Redish, Integrating hippocampus and striatum in decision-making, Current opinion in neurobiology, vol.17, pp.692-697, 2007.

L. P. Kaelbling, M. L. Littman, and A. W. Moore, Reinforcement learning: A survey, Journal of artificial intelligence research, vol.4, pp.237-285, 1996.

M. P. Karlsson and L. M. Frank, Awake replay of remote experiences in the hippocampus, Nature neuroscience, vol.12, issue.7, p.913, 2009.

M. Keramati, A. Dezfouli, and P. Piray, Speed/accuracy trade-off between the habitual and the goal-directed processes, PLoS computational biology, vol.7, issue.5, p.1002055, 2011.

M. Khamassi and M. D. Humphries, Integrating cortico-limbic-basal ganglia architectures for learning model-based and model-free navigation strategies, Frontiers in Behavioral Neuroscience, vol.6, p.79, 2012.
URL : https://hal.archives-ouvertes.fr/hal-01219958

S. Lammel, D. I. Ion, J. Roeper, and R. C. Malenka, Projection-specific modulation of dopamine neuron synapses by aversive and rewarding stimuli, Neuron, vol.70, issue.5, pp.855-862, 2011.

C. S. Lansink, P. M. Goltstein, J. V. Lankelma, R. N. Joosten, B. L. McNaughton et al., Preferential reactivation of motivationally relevant information in the ventral striatum, Journal of Neuroscience, vol.28, issue.25, pp.6372-6382, 2008.

C. S. Lansink, P. M. Goltstein, J. V. Lankelma, B. L. McNaughton, and C. M. Pennartz, Hippocampus leads ventral striatum in replay of place-reward information, PLoS Biology, vol.7, issue.8, 2009.

M. Lebreton, S. Jorge, V. Michel, B. Thirion, and M. Pessiglione, An automatic valuation system in the human brain: evidence from functional neuroimaging, Neuron, vol.64, issue.3, pp.431-439, 2009.
URL : https://hal.archives-ouvertes.fr/hal-00504100

A. K. Lee and M. A. Wilson, Memory of sequential experience in the hippocampus during slow wave sleep, Neuron, vol.36, issue.6, pp.1183-1194, 2002.

B. Lee, R. Gentry, G. Bissonette, R. Herman, J. Mallon et al., Manipulating the revision of reward value during the intertrial interval increases sign tracking and dopamine releases, PLoS Biology, 2018.

F. Lesaint, O. Sigaud, S. Flagel, T. Robinson, and M. Khamassi, Modelling individual differences observed in pavlovian autoshaping in rats using a dual learning systems approach and factored representations, PLoS Computational Biology, vol.10, issue.2, p.1003466, 2014.

L. Lin, Self-improving reactive agents based on reinforcement learning, planning and teaching, Machine learning, vol.8, issue.3/4, pp.69-97, 1992.

A. Lopez-persem, The Brain Valuation System and its role in decision-making, 2016.

N. Maingret, G. Girardeau, R. Todorova, M. Goutierre, and M. Zugaro, Hippocampo-cortical coupling mediates memory consolidation during sleep, Nature neuroscience, vol.19, issue.7, pp.959-964, 2016.

D. Margoliash and T. P. Brawn, Sleep and learning in birds: Rats! there's more to sleep, Sleep and Brain Activity, pp.109-146, 2012.

D. Marr, Simple memory: a theory for archicortex, Philosophical Transactions of the Royal Society of London B: Biological Sciences, vol.262, pp.23-81, 1971.

L. Martinet, D. Sheynikhovich, K. Benchenane, and A. Arleo, Spatial learning and action planning in a prefrontal cortical network model, PLoS computational biology, vol.7, issue.5, p.1002045, 2011.
URL : https://hal.archives-ouvertes.fr/hal-00683093

J. L. McClelland, B. L. McNaughton, and R. C. O'Reilly, Why there are complementary learning systems in the hippocampus and neocortex: insights from the successes and failures of connectionist models of learning and memory, Psychological review, vol.102, issue.3, p.419, 1995.

W. S. McCulloch and W. Pitts, A logical calculus of the ideas immanent in nervous activity, The bulletin of mathematical biophysics, vol.5, issue.4, pp.115-133, 1943.

V. Mnih, K. Kavukcuoglu, D. Silver, A. A. Rusu, J. Veness et al., Human-level control through deep reinforcement learning, Nature, vol.518, issue.7540, pp.529-533, 2015.

A. W. Moore and C. G. Atkeson, Prioritized sweeping: Reinforcement learning with less data and less time, Machine learning, vol.13, issue.1, pp.103-130, 1993.

G. Morris, A. Nevet, D. Arkadir, E. Vaadia, and H. Bergman, Midbrain dopamine neurons encode decisions for future action, Nature neuroscience, vol.9, issue.8, p.1057, 2006.

J. O'Doherty, P. Dayan, J. Schultz, R. Deichmann, K. Friston et al., Dissociable roles of ventral and dorsal striatum in instrumental conditioning, Science, vol.304, issue.5669, pp.452-454, 2004.

J. O'Keefe and J. Dostrovsky, The hippocampus as a spatial map: Preliminary evidence from unit activity in the freely-moving rat, Brain research, vol.34, issue.1, pp.171-175, 1971.

H. F. Ólafsdóttir, D. Bush, and C. Barry, The role of hippocampal replay in memory and planning, Current Biology, vol.28, issue.1, pp.37-50, 2018.

H. F. Ólafsdóttir, F. Carpenter, and C. Barry, Coordinated grid and place cell replay during rest, Nature neuroscience, vol.19, issue.6, p.792, 2016.

H. F. Ólafsdóttir, F. Carpenter, and C. Barry, Task demands predict a dynamic switch in the content of awake hippocampal replay, Neuron, vol.96, issue.4, pp.925-935, 2017.

C. Padoa-Schioppa and J. A. Assad, Neurons in the orbitofrontal cortex encode economic value, Nature, vol.441, issue.7090, p.223, 2006.

S. Palminteri, M. Khamassi, M. Joffily, and G. Coricelli, Medial prefrontal cortex and the adaptive regulation of reinforcement learning parameters, Nature Communications, vol.6, p.8096, 2015.

A. E. Papale, M. C. Zielinski, L. M. Frank, S. P. Jadhav, and A. D. Redish, Interplay between Hippocampal Sharp-Wave-Ripple Events and Vicarious Trial and Error Behaviors in Decision Making, Neuron, vol.92, issue.5, pp.1-8, 2016.

C. Pavlides and J. Winson, Influences of hippocampal place cell firing in the awake state on the activity of these cells during subsequent sleep episodes, The Journal of neuroscience : the official journal of the Society for Neuroscience, vol.9, issue.8, pp.2907-2918, 1989.

J. Peng and R. J. Williams, Efficient learning and planning within the Dyna framework, Adaptive Behavior, vol.1, issue.4, pp.437-454, 1993.

M. Pessiglione, B. Seymour, G. Flandin, R. J. Dolan, and C. D. Frith, Dopamine-dependent prediction errors underpin reward-seeking behaviour in humans, Nature, vol.442, issue.7106, p.1042, 2006.

A. Peyrache, M. Khamassi, K. Benchenane, S. I. Wiener, and F. P. Battaglia, Replay of rule-learning related neural patterns in the prefrontal cortex during sleep, Nature Neuroscience, vol.12, issue.7, pp.919-926, 2009.
URL : https://hal.archives-ouvertes.fr/hal-00551868

G. Pezzulo, F. Rigoli, and F. Chersi, The mixed instrumental controller: using value of information to combine habitual choice and mental simulation, Frontiers in psychology, vol.4, 2013.

B. E. Pfeiffer, The content of hippocampal "replay", 2017.

B. E. Pfeiffer and D. J. Foster, Hippocampal place-cell sequences depict future paths to remembered goals, Nature, vol.497, issue.7447, p.74, 2013.

B. E. Pfeiffer and D. J. Foster, Autoassociative dynamics in the generation of sequences of hippocampal place cells, Science, vol.349, issue.6244, pp.180-183, 2015.

I. Pohl, Bi-directional search, Machine intelligence, vol.6, p.10, 1971.

A. D. Redish, Vicarious trial and error, Nature Reviews Neuroscience, vol.17, issue.3, pp.147-159, 2016.

M. R. Roesch, D. J. Calu, and G. Schoenbaum, Dopamine neurons encode the better option in rats deciding between differently delayed or sized rewards, Nature neuroscience, vol.10, issue.12, p.1615, 2007.

D. K. Roumis and L. M. Frank, Hippocampal sharp-wave ripples in waking and sleeping states, Current opinion in neurobiology, vol.35, pp.6-12, 2015.

W. Schultz, P. Dayan, and P. R. Montague, A neural substrate of prediction and reward, Science, vol.275, pp.1593-1599, 1997.

R. S. Sutton, Integrated architectures for learning, planning, and reacting based on approximating dynamic programming, Proceedings of the seventh international conference on machine learning, pp.216-224, 1990.

R. S. Sutton, Generalization in reinforcement learning: Successful examples using sparse coarse coding, Advances in neural information processing systems, pp.1038-1044, 1996.

R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction, 1998.

Y. K. Takahashi, M. R. Roesch, R. C. Wilson, K. Toreson, P. O'Donnell et al., Expectancy-related changes in firing of dopamine neurons depend on orbitofrontal cortex, Nature neuroscience, vol.14, issue.12, p.1590, 2011.

W. Tang, J. D. Shin, L. M. Frank, and S. P. Jadhav, Hippocampal-prefrontal reactivation during learning is stronger in awake as compared to sleep states, Journal of Neuroscience, pp.2217-2291, 2017.

J. S. Taube, R. U. Muller, and J. B. Ranck, Head-direction cells recorded from the postsubiculum in freely moving rats. I. Description and quantitative analysis, Journal of Neuroscience, vol.10, issue.2, pp.420-435, 1990.

J. S. Taube, R. U. Muller, and J. B. Ranck, Head-direction cells recorded from the postsubiculum in freely moving rats. II. Effects of environmental manipulations, Journal of Neuroscience, vol.10, issue.2, pp.436-447, 1990.

G. Tesauro, Temporal difference learning and TD-Gammon, Communications of the ACM, vol.38, pp.58-68, 1995.

A. Thierry, Y. Gioanni, E. Dégénétais, and J. Glowinski, Hippocampo-prefrontal cortex pathway: Anatomical and electrophysiological characteristics, Hippocampus, vol.10, issue.4, pp.411-419, 2000.

E. C. Tolman, Cognitive maps in rats and men, Psychological review, vol.55, issue.4, pp.189-208, 1948.

G. Viejo, M. Khamassi, A. Brovelli, and B. Girard, Modeling choice and reaction time during arbitrary visuomotor learning through the coordination of adaptive working memory and reinforcement learning, Frontiers in behavioral neuroscience, vol.9, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01215419

P. Voorn, L. J. Vanderschuren, H. J. Groenewegen, T. W. Robbins, and C. M. Pennartz, Putting a spin on the dorsal-ventral divide of the striatum, Trends in neurosciences, vol.27, issue.8, pp.468-474, 2004.

M. P. Walker and R. Stickgold, Sleep, memory, and plasticity, Annu. Rev. Psychol, vol.57, pp.139-166, 2006.

C. Watkins, Learning from delayed rewards, 1989.

A. M. Wikenheiser and A. D. Redish, The balance of forward and backward hippocampal sequences shifts across behavioral states, Hippocampus, vol.23, issue.1, pp.22-29, 2013.

A. M. Wikenheiser and A. D. Redish, Hippocampal Sequences and the Cognitive Map, Analysis and Modeling of Coordinated Multineuronal Activity, pp.105-129, 2015.

M. A. Wilson and B. L. McNaughton, Reactivation of hippocampal ensemble memories during sleep, Science, vol.265, issue.5172, pp.676-679, 1994.

C. Wu, D. Haggerty, C. Kemere, and D. Ji, Hippocampal awake replay in fear memory retrieval, Nature neuroscience, vol.20, issue.4, p.571, 2017.

X. Wu and D. J. Foster, Hippocampal replay captures the unique topological structure of a novel environment, Journal of Neuroscience, vol.34, issue.19, pp.6459-6469, 2014.

H. H. Yin and B. J. Knowlton, The role of the basal ganglia in habit formation, Nature Reviews Neuroscience, vol.7, issue.6, p.464, 2006.