F. Varela, H. Maturana, and R. Uribe, Autopoiesis: The organization of living systems, its characterization and a model, Biosystems, vol.5, issue.4, pp.187-196, 1974.

M. Vaillant-Molina, L. Newell, I. Castellanos, L. E. Bahrick, and R. Lickliter, Intersensory redundancy impairs face perception in early development, International Conference on Infant Studies, 2006.

G. Fanelli, J. Gall, and L. Van Gool, Hough transform-based mouth localization for audio-visual speech recognition, British Machine Vision Conference, 2009.

J. Bonnal, S. Argentieri, P. Danès, and J. Manhès, Speaker localization and speech extraction with the EAR sensor, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp.670-675, 2009.

B. Garcia, M. Bernard, S. Argentieri, and B. Gas, Sensorimotor learning of sound localization for an autonomous robot, Forum Acusticum, 2014.

A. Laflaquière, S. Argentieri, B. Gas, and E. Castillo-Castañeda, Space dimension perception from the multimodal sensorimotor flow of a naive robotic agent, Intelligent Robots and Systems (IROS), pp.1520-1525, 2010.

K. Ui-Hyun, O. Hiroshi, and G., Improved binaural sound localization and tracking for unknown time-varying number of speakers, Advanced Robotics, vol.27, issue.15, pp.1161-1173, 2013.

B. Burger, I. Ferrané, and F. Lerasle, Multimodal interaction abilities for a robot companion, Computer Vision Systems, pp.549-558, 2008.

A. Droniou, S. Ivaldi, and O. Sigaud, Deep unsupervised network for multimodal perception, representation and classification, Robotics and Autonomous Systems, vol.71, issue.0, pp.83-98, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01083521

A. Giraud, E. Truy, and R. Frackowiak, Imaging plasticity in cochlear implant patients, Audiol Neurootol, vol.6, pp.381-393, 2001.

A. Pitti, A. Blanchard, M. Cardinaux, and P. Gaussier, Gain-field modulation mechanism in multimodal networks for spatial perception, 12th IEEE-RAS International Conference on Humanoid Robots (Humanoids), pp.297-302, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00762739

S. Boucenna, P. Gaussier, P. Andry, and L. Hafemeister, Imitation as a communication tool for online facial expression learning and recognition, Intelligent Robots and Systems (IROS), pp.5323-5328, 2010.
URL : https://hal.archives-ouvertes.fr/hal-00522773

P. K. Kuhl and A. N. Meltzoff, Infant vocalizations in response to speech: Vocal imitation and developmental change, The Journal of the Acoustical Society of America, vol.100, pp.2425-2438, 1996.

A. Streri, M. Coulon, and B. Guellaï, The foundations of social cognition: Studies on face/voice integration in newborn infants, International Journal of Behavioral Development, vol.1, issue.5, 2012.
URL : https://hal.archives-ouvertes.fr/hal-01022559

C. Kitamura, B. Guellaï, and J. Kim, Motherese by eye and ear: Infants perceive visual prosody in point-line displays of talking heads, PLoS ONE, vol.9, issue.10, p.e111467, 2014.
URL : https://hal.archives-ouvertes.fr/hal-01478469

J. Sanchez-Riera, Capacités audiovisuelles en robot humanoïde NAO, 2013.

G. Tzanetakis, A. Ermolinskyi, and P. Cook, Beyond the query-by-example paradigm: New query interfaces for music information retrieval, Proc. Int. Computer Music Conference, pp.177-183, 2002.

P. D., Content-based methods for the management of digital music, International Conference on Acoustics, Speech, and Signal Processing, vol.IV, pp.2437-2440, 2000.

S. Boucenna, S. Anzalone, E. Tilmont, D. Cohen, and M. Chetouani, Learning of social signatures through imitation game between a robot and a human partner, IEEE Transactions on Autonomous Mental Development, vol.6, issue.3, pp.213-225, 2014.

S. M. Anzalone, S. Boucenna, S. Ivaldi, and M. Chetouani, Evaluating the engagement with social robots, International Journal of Social Robotics, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01158293

P. Gaussier and S. Zrehen, PerAc: A neural architecture to control artificial animals, Robotics and Autonomous Systems, vol.16, issue.2-4, pp.291-320, 1995.

R. P. Lippmann, An introduction to computing with neural nets, ASSP Magazine, IEEE, vol.4, issue.2, pp.4-22, 1987.

D. McNeill, Hand and mind: What gestures reveal about thought, 2005.

D. Lewkowicz, Y. Delevoye-Turrell, D. Bailly, P. Andry, and P. Gaussier, Reading motor intention through mental imagery, Adaptive Behavior, p.1059712313501347, 2013.
URL : https://hal.archives-ouvertes.fr/hal-00870420

L. Cohen, W. Abbassi, M. Chetouani, and S. Boucenna, Intention inference learning through the interaction with a caregiver, Joint IEEE International Conferences on Development and Learning and Epigenetic Robotics, pp.153-154, 2014.

E. Partanen, T. Kujala, R. Näätänen, A. Liitola, A. Sambeth et al., Learning-induced neural plasticity of speech processing before birth, Proceedings of the National Academy of Sciences, vol.110, issue.37, pp.15145-15150, 2013.

M. Cheour, P. Leppänen, and N. Kraus, Mismatch negativity (MMN) as a tool for investigating auditory discrimination and sensory memory in infants and children, Clin Neurophysiol, vol.111, pp.4-16, 2001.

L. Bernstein, S. Eberhardt, and M. Demorest, Single-channel vibrotactile supplements to visual perception of intonation and stress, J Acoust Soc Am, vol.85, 1989.

P. H. Graf, E. Cosatto, V. Strom, and H. F., Visual prosody: Facial movements accompanying speech, Proceedings of the Fifth IEEE International Conference on Automatic Face and Gesture Recognition, 2002.

K. Munhall, J. Jones, D. Callan, T. Kuratate, and E. Vatikiotis-Bateson, Visual prosody and speech intelligibility: Head movement improves auditory speech perception, Psychol Sci, vol.15, issue.2, pp.133-137, 2004.

E. J. Krahmer and M. Swerts, More about brows: A cross-linguistic analysis-by-synthesis study, in From Brows to Trust: Evaluating Embodied Conversational Agents, vol.15, pp.191-216, 2004.