A. Canu and A. Mouaddib, Collective Decision-Theoretic Planning for Planet Exploration, 2011 IEEE 23rd International Conference on Tools with Artificial Intelligence, pp.289-296, 2011.
DOI : 10.1109/ICTAI.2011.51
URL : https://hal.archives-ouvertes.fr/hal-00969311

A. Dutech and M. Samuelides, Un algorithme d'apprentissage par renforcement pour les processus décisionnels de Markov partiellement observés : apprendre une extension sélective du passé, 2003.
DOI : 10.3166/ria.17.559-589

J. Ferber, Les systèmes multi-agents : vers une intelligence collective, InterEditions, 1995.

S. Hoet and N. Sabouret, Apprentissage par renforcement d'actes de communication dans un contexte multiagent, Revue d'Intelligence Artificielle, pp.159-188, 2010.
URL : https://hal.archives-ouvertes.fr/hal-01298805

T. Kasai, H. Tenmoto, and A. Kamiya, Learning of communication codes in multi-agent reinforcement learning problem, 2008 IEEE Conference on Soft Computing in Industrial Applications, pp.1-6, 2008.
DOI : 10.1109/SMCIA.2008.5045926

P. L. Lanzi, Adaptive agents with reinforcement learning and internal memory, Proceedings of the 6th International Conference on the Simulation of Adaptive Behavior (SAB2000), pp.333-342, 2000.

A. McCallum, Reinforcement learning with selective perception and hidden state, 1996.

F. S. Melo and M. Veloso, Learning of coordination: exploiting sparse interactions in multiagent systems, Proceedings of The 8th International Conference on Autonomous Agents and Multiagent Systems (AAMAS '09), International Foundation for Autonomous Agents and Multiagent Systems, pp.773-780, 2009.

A. Nuxoll and J. E. Laird, Extending cognitive architecture with episodic memory, Proceedings of the National Conference on Artificial Intelligence, pp.1560-1564, 2007.

N. Sabouret, A Model of Requests about actions for active components in the semantic web, Proceedings of the STarting AI Researchers Symposium, 2002.

M. Tan, Multi-Agent Reinforcement Learning: Independent vs. Cooperative Agents, Proceedings of the 10th International Conference on Machine Learning, pp.330-337, 1993.
DOI : 10.1016/B978-1-55860-307-3.50049-6

M. T. Todd, Y. Niv, and J. D. Cohen, Learning to use working memory in partially observable environments through dopaminergic reinforcement, Advances in Neural Information Processing Systems 21, pp.1689-1696, 2009.

C. J. C. H. Watkins, Learning from delayed rewards, 1989.

P. Xuan, V. Lesser, and S. Zilberstein, Communication decisions in multi-agent cooperation, Proceedings of the fifth international conference on Autonomous agents, AGENTS '01, pp.616-623, 2001.
DOI : 10.1145/375735.376469

E. A. Zilli and M. E. Hasselmo, Modeling the role of working memory and episodic memory in behavioral tasks, Hippocampus, pp.193-209, 2008.
DOI : 10.1002/hipo.20382