C. Abry, V. Ducey-Kaufmann, A. Vilain, and C. Lalevée, When the babble syllable feeds the foot in a point, The syllable in speech production: Perspectives on the frame content theory, pp.460-472, 2008.

C. Abry, A. Vilain, and J. Schwartz, Introduction: Vocalize to Localize? A call for better crosstalk between auditory and visual communication systems researchers, Interaction Studies: Social Behaviour and Communication in Biological and Artificial Systems, vol.5, issue.3, pp.313-325, 2004.
DOI : 10.1075/bct.13.02abr

M. A. Arbib, From monkey-like action recognition to human language: An evolutionary framework for neurolinguistics, Behavioral and Brain Sciences, vol.28, issue.02, pp.105-167, 2005.
DOI : 10.1017/S0140525X05000038

M. A. Arbib, Interweaving protosign and protospeech: Further developments beyond the mirror, Interaction Studies, vol.6, pp.145-171, 2005.
DOI : 10.1075/bct.13.08arb

Q. D. Atkinson, Phonemic Diversity Supports a Serial Founder Effect Model of Language Expansion from Africa, Science, vol.332, issue.6027, pp.346-349, 2011.
DOI : 10.1126/science.1199295

S. Baron-Cohen, Mindblindness: An essay on autism and theory of mind, 1997.

A. Berrah, Evolution d'une société artificielle d'agents de parole: un modèle pour l'émergence des structures phonétiques, 1998.

A. Berrah, H. Glotin, R. Laboissière, P. Bessière, and L. Boë, From form to formation of phonetic structures: An evolutionary computing perspective, ICML '96 workshop on evolutionary computing and machine learning, pp.23-29, 1996.
URL : https://hal.archives-ouvertes.fr/hal-00019460

P. Bessière, C. Laugier, and R. Siegwart, Probabilistic reasoning and decision making in sensory-motor systems, Springer tracts in advanced robotics, vol.46, 2008.
DOI : 10.1007/978-3-540-79007-5

P. Bessière, E. Mazer, J. Ahuactzin, and K. Mekhnacha, Bayesian programming, Chapman and Hall/CRC, https://www.crcpress.com/Bayesian-Programming/Bessiere-Mazer-Ahuactzin-Mekhnacha, 2013.

L. Boë, Vowel spaces of newly-born infants and adults consequences for ontogenesis and phylogenesis, The 14th international congress of phonetic sciences, pp.2501-2504, 1999.

L. Boë, P. Badin, L. Ménard, G. Captier, B. Davis et al., Anatomy and control of the developing human vocal tract: A response to Lieberman, Journal of Phonetics, vol.41, issue.5, pp.379-392, 2013.

L. Boë, P. Bessière, N. Ladjili, and N. Audibert, Simple combinatorial considerations challenge Ruhlen's mother tongue theory, The syllable in speech production, pp.63-92, 2008.

L. Boë, J. Heim, K. Honda, S. Maeda, P. Badin et al., The vocal tract of newborn humans and Neanderthals: Acoustic capabilities and consequences for the debate on the origin of language. A reply to Lieberman (2007a), Journal of Phonetics, vol.35, issue.4, pp.564-581, 2007.
DOI : 10.1016/j.wocn.2007.06.006

L. Boë, N. Vallée, P. Badin, J. Schwartz, and C. Abry, Tendencies in phonological structures: The influence of substance on form, Les Cahiers de l'ICP, pp.35-55, 2000.

C. P. Browman and L. Goldstein, Articulatory Phonology: An Overview, Phonetica, vol.49, issue.3-4, pp.155-180, 1992.
DOI : 10.1159/000261913

C. P. Browman and L. M. Goldstein, Towards an articulatory phonology, Phonology Yearbook, vol.3, pp.219-252, 1986.

R. Carlson, B. Granström, and D. Klatt, Vowel perception: The relative salience of selected acoustic manipulations, STL-QPSR, vol.34, pp.19-35, 1979.

D. L. Cheney and R. M. Seyfarth, How vervet monkeys perceive their grunts: Field playback experiments, Animal Behaviour, vol.30, issue.3, pp.739-751, 1982.
DOI : 10.1016/S0003-3472(82)80146-2

N. Chomsky, Aspects of the theory of syntax, 1965.

N. Clements, Feature economy as a phonological universal Feature economy in sound systems, Proceedings of the 15th International Congress of Phonetic Sciences, pp.371-374, 2003.

M. C. Corballis, From hand to mouth: The origins of language, 2002.

B. de Boer, Self-organization in vowel systems, Journal of Phonetics, vol.28, issue.4, pp.441-465, 2000.
DOI : 10.1006/jpho.2000.0125

B. de Boer and W. Zuidema, Multi-Agent Simulations of the Evolution of Combinatorial Phonology, Adaptive Behavior, vol.18, issue.2, pp.141-154, 2010.
DOI : 10.1177/1059712309345789

S. Demange and S. Ouni, An episodic memory-based solution for the acoustic-to-articulatory inversion problem, The Journal of the Acoustical Society of America, vol.133, issue.5, pp.2921-2930, 2013.
DOI : 10.1121/1.4798665
URL : https://hal.archives-ouvertes.fr/hal-00834556

R. L. Diehl, A. J. Lotto, and L. L. Holt, Speech Perception, Annual Review of Psychology, vol.55, issue.1, pp.149-179, 2004.
DOI : 10.1146/annurev.psych.55.090902.142028

P. F. Dominey, Towards a construction-based framework for development of language, event perception and social cognition: Insights from grounded robotics and simulation, Neurocomputing, vol.70, issue.13-15, p.2288, 2007.
DOI : 10.1016/j.neucom.2006.02.030

L. Fadiga, L. Fogassi, G. Pavesi, and G. Rizzolatti, Motor facilitation during action observation: A magnetic stimulation study, Journal of Neurophysiology, vol.73, issue.6, pp.2608-2611, 1995.

C. A. Fowler, An event approach to the study of speech perception from a direct-realist perspective, Journal of Phonetics, vol.14, issue.1, pp.3-28, 1986.

M. Gell-Mann and M. Ruhlen, The origin and evolution of word order, Proceedings of the National Academy of Sciences, pp.17290-17295, 2011.
DOI : 10.1073/pnas.1113716108

M. Gentilucci and M. C. Corballis, From manual gesture to speech: A gradual transition, Neuroscience & Biobehavioral Reviews, vol.30, issue.7, pp.949-960, 2006.
DOI : 10.1016/j.neubiorev.2006.02.004

E. Gilet, J. Diard, and P. Bessière, Bayesian Action-Perception Computational Model: Interaction of Production and Recognition of Cursive Letters, PLoS ONE, vol.34, issue.7, p.20387, 2011.
DOI : 10.1371/journal.pone.0020387.s001
URL : https://hal.archives-ouvertes.fr/hal-00645868

S. Giulivi, D. Whalen, L. M. Goldstein, H. Nam, and A. G. Levitt, An Articulatory Phonology Account of Preferred Consonant-Vowel Combinations, Language Learning and Development, vol.18, issue.3, pp.202-225, 2011.
DOI : 10.1080/15475441.2011.564569

S. Goldin-Meadow and C. Butcher, Pointing toward two-word speech in young children, Pointing: Where language, culture, and cognition meet, pp.85-107, 2003.

T. L. Griffiths and M. L. Kalish, Language Evolution by Iterated Learning With Bayesian Agents, Cognitive Science, vol.31, issue.3, pp.441-480, 2007.
DOI : 10.1080/15326900701326576

F. H. Guenther, Cortical interactions underlying the production of speech sounds, Journal of Communication Disorders, vol.39, issue.5, pp.350-365, 2006.
DOI : 10.1016/j.jcomdis.2006.06.013

F. H. Guenther, M. Hampson, and D. Johnson, A theoretical investigation of reference frames for the planning of speech movements, Psychological Review, vol.105, issue.4, pp.611-633, 1998.
DOI : 10.1037/0033-295X.105.4.611-633

S. Harnad, The symbol grounding problem, Physica D: Nonlinear Phenomena, vol.42, issue.1-3, pp.335-346, 1990.

M. D. Hauser, N. Chomsky, and W. T. Fitch, The faculty of language: What is it, who has it, and how did it evolve?, Science, vol.298, issue.5598, pp.1569-1579, 2002.

J. R. Hurford, Biological evolution of the Saussurean sign as a component of the language acquisition device, Lingua, vol.77, issue.2, pp.187-222, 1989.
DOI : 10.1016/0024-3841(89)90015-6

E. T. Jaynes, Probability Theory: The Logic of Science, 2003.
DOI : 10.1017/CBO9780511790423

C. Kemp and J. Tenenbaum, The discovery of structural form, Proceedings of the National Academy of Sciences of the United States of America, pp.10687-10692, 2008.
DOI : 10.1073/pnas.0802631105

D. Klatt, Prediction of perceived phonetic distance from critical-band spectra: A first step, ICASSP '82. IEEE International Conference on Acoustics, Speech, and Signal Processing, p.1278, 1982.
DOI : 10.1109/ICASSP.1982.1171512

J. L. Konczak, F. Berthouze, H. Kaplan, H. Kozima, J. Yano et al., On the notion of motor primitives in humans and robots, Proceedings of the Fifth International Workshop on Epigenetic Robotics: Modeling Cognitive Development in Robotic Systems, pp.47-53, 2005.

K. P. Körding, U. Beierholm, W. J. Ma, S. Quartz, J. B. Tenenbaum et al., Causal Inference in Multisensory Perception, PLoS ONE, vol.93, issue.9, p.943, 2007.
DOI : 10.1371/journal.pone.0000943.s003

D. A. Leavens and K. A. Bard, Environmental Influences on Joint Attention in Great Apes: Implications for Human Cognition, Journal of Cognitive Education and Psychology, vol.10, issue.1, pp.9-31, 2011.
DOI : 10.1891/1945-8959.10.1.9

O. Lebeltel, P. Bessière, J. Diard, and E. Mazer, Bayesian Robot Programming, Autonomous Robots, vol.16, issue.1, pp.49-79, 2004.
DOI : 10.1023/B:AURO.0000008671.38949.43
URL : https://hal.archives-ouvertes.fr/inria-00189723

A. M. Liberman and I. G. Mattingly, The motor theory of speech perception revised, Cognition, vol.21, issue.1, pp.1-36, 1985.
DOI : 10.1016/0010-0277(85)90021-6

A. M. Liberman and I. G. Mattingly, A specialization for speech perception, Science, vol.243, issue.4890, pp.489-494, 1989.
DOI : 10.1126/science.2643163

A. M. Liberman and D. H. Whalen, On the relation of speech to language, Trends in Cognitive Sciences, vol.4, issue.5, pp.187-196, 2000.
DOI : 10.1016/S1364-6613(00)01471-6

P. Lieberman, The biology and evolution of language, 1984.

P. Lieberman, Vocal tract anatomy and the neural bases of talking, Journal of Phonetics, vol.40, issue.4, 2012.
DOI : 10.1016/j.wocn.2012.04.001

J. Liljencrants and B. Lindblom, Numerical Simulation of Vowel Quality Systems: The Role of Perceptual Contrast, Language, vol.48, issue.4, pp.839-862, 1972.
DOI : 10.2307/411991

B. Lindblom, Can the models of evolutionary biology be applied to phonetic problems, Proceedings of the 10th international congress of phonetic sciences, pp.67-81, 1984.

B. Lindblom, Phonetic universals in vowel systems, Experimental phonology, pp.13-44, 1986.

B. Lindblom, Explaining Phonetic Variation: A Sketch of the H&H Theory, Speech production and speech modelling, pp.403-439, 1990.
DOI : 10.1007/978-94-009-2037-8_16

P. MacNeilage and B. Davis, Motor mechanisms in speech ontogeny: phylogenetic, neurobiological and linguistic implications, Current Opinion in Neurobiology, vol.11, issue.6, pp.696-700, 2001.
DOI : 10.1016/S0959-4388(01)00271-9

P. F. MacNeilage, The frame/content theory of evolution of speech production, Behavioral and Brain Sciences, vol.21, issue.04, pp.499-511, 1998.
DOI : 10.1017/S0140525X98001265

P. F. MacNeilage and B. L. Davis, On the Origin of Internal Structure of Word Forms, Science, vol.288, issue.5465, pp.527-531, 2000.
DOI : 10.1126/science.288.5465.527

I. Maddieson, Patterns of sounds, 1984.
DOI : 10.1017/CBO9780511753459

I. Maddieson, Typological patterns-geographical distribution and phonetic explanation, Conference on the phonetics-phonology interface, 2001.

I. Maddieson and K. Precoda, Updating UPSID, The Journal of the Acoustical Society of America, p.19, 1989.
DOI : 10.1121/1.2027403

S. Maeda, Compensatory articulation during speech: Evidence from the analysis and synthesis of vocal tract shapes using an articulatory model, Speech production and speech modelling, pp.131-149, 1989.

M. B. Manser and L. B. Fletcher, Vocalize to Localize: A test on functionally referential alarm calls, Interaction Studies, vol.5, issue.3, pp.327-344, 2004.
DOI : 10.1075/bct.13.03man

R. K. Moore, Spoken language processing: Piecing together the puzzle, Speech Communication, vol.49, issue.5, pp.418-435, 2007.
DOI : 10.1016/j.specom.2007.01.011
URL : https://hal.archives-ouvertes.fr/hal-00499174

C. Moulin-Frier, Rôle des relations perception-action dans la communication parlée et l'émergence des systèmes phonologiques: étude, modélisation computationnelle et simulations, 2011.

C. Moulin-Frier, R. Laurent, P. Bessière, J. Schwartz, and J. Diard, Adverse conditions improve distinguishability of auditory, motor, and perceptuo-motor theories of speech perception: An exploratory Bayesian modelling study, Language and Cognitive Processes, vol.27, issue.7-8, pp.1240-1263, 2012.
DOI : 10.1037/0096-1523.18.3.603

C. Moulin-Frier and P. Oudeyer, Curiosity-driven phonetic learning, 2012 IEEE International Conference on Development and Learning and Epigenetic Robotics (ICDL), 2012.
DOI : 10.1109/DevLrn.2012.6400583
URL : https://hal.archives-ouvertes.fr/hal-00762795

C. Moulin-Frier and P. Oudeyer, Exploration strategies in developmental robotics: A unified probabilistic framework, 2013 IEEE Third Joint International Conference on Development and Learning and Epigenetic Robotics (ICDL), 2013.
DOI : 10.1109/DevLrn.2013.6652535
URL : https://hal.archives-ouvertes.fr/hal-00860641

C. Moulin-Frier and P. Oudeyer, The role of intrinsic motivations in learning sensorimotor vocal mappings: A developmental robotics study, Proceedings of Interspeech, 2013.
URL : https://hal.archives-ouvertes.fr/hal-00860655

C. Moulin-Frier, J. Schwartz, J. Diard, and P. Bessière, Emergence of a language through deictic games within a society of sensori-motor agents in interaction, The eighth international seminar on speech production, p.8, 2008.
URL : https://hal.archives-ouvertes.fr/hal-00370575

C. Moulin-Frier, J. Schwartz, J. Diard, and P. Bessière, A unified theoretical Bayesian model of speech communication, The first conference on Applied Digital Human Modeling, 2010.
URL : https://hal.archives-ouvertes.fr/hal-01059208

C. Moulin-Frier, J. Schwartz, J. Diard, and P. Bessière, Emergence of articulatory-acoustic systems from deictic interaction games in a "Vocalize to Localize" framework, Primate communication and human language: Vocalisations, gestures, imitation and deixis in humans and non-humans, Advances in interaction studies series, 2011.
URL : https://hal.archives-ouvertes.fr/hal-00961125

J. I. Myung and M. A. Pitt, Optimal experimental design for model discrimination, Psychological Review, vol.116, issue.3, p.499, 2009.
DOI : 10.1037/a0016104

H. Nam, L. M. Goldstein, S. Giulivi, A. G. Levitt, and D. Whalen, Computational simulation of CV combination preferences in babbling, Journal of Phonetics, vol.41, issue.2, pp.63-77, 2013.
DOI : 10.1016/j.wocn.2012.11.002

J. Ohala, Moderator's introduction to symposium on phonetic universals in phonological systems and their explanation, Proceedings of the ninth international congress of phonetic sciences, pp.181-185, 1979.

M. Oliphant, The dilemma of Saussurean communication, Biosystems, vol.37, issue.1-2, pp.31-38, 1996.
DOI : 10.1016/0303-2647(95)01543-4

P. Oudeyer, The self-organization of speech sounds, Journal of Theoretical Biology, vol.233, issue.3, pp.435-449, 2005.
DOI : 10.1016/j.jtbi.2004.10.025
URL : https://hal.archives-ouvertes.fr/inria-00001176

P. Oudeyer, Self-organization in the evolution of speech, Studies in the evolution of language, 2006.
URL : https://hal.archives-ouvertes.fr/hal-00818204

P. Oudeyer, Aux sources de la parole, 2013.

J. Pickles, An introduction to the physiology of hearing, 2012.

C. Pradalier, F. Colas, and P. Bessière, Expressing Bayesian fusion as a product of distributions: applications in robotics, Proceedings 2003 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2003), pp.1851-1856, 2003.
DOI : 10.1109/IROS.2003.1248913
URL : https://hal.archives-ouvertes.fr/hal-00089247

G. Rizzolatti and M. A. Arbib, Language within our grasp, Trends in Neurosciences, vol.21, issue.5, pp.188-194, 1998.
DOI : 10.1016/S0166-2236(98)01260-0
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.169.2919

G. Rizzolatti, L. Fadiga, V. Gallese, and L. Fogassi, Premotor cortex and the recognition of motor actions, Cognitive Brain Research, vol.3, issue.2, pp.131-141, 1996.
DOI : 10.1016/0926-6410(95)00038-0

D. Roy, Semiotic schemas: A framework for grounding language in action and perception, Artificial Intelligence, vol.167, issue.1-2, pp.170-205, 2005.
DOI : 10.1016/j.artint.2005.04.007

A. C. Roy and M. A. Arbib, The syntactic motor system, Gesture, vol.5, issue.1-2, pp.7-37, 2005.
DOI : 10.1075/gest.5.1-2.03roy

M. Ruhlen, The origin of language: Tracing the evolution of the mother tongue, 1996.

R. Schroeder, B. S. Atal, and J. L. Hall, Optimizing digital speech coders by exploiting masking properties of the human ear, The Journal of the Acoustical Society of America, vol.66, issue.6, pp.1647-1652, 1979.
DOI : 10.1121/1.383662

J. Schwartz, A. Basirat, L. Ménard, and M. Sato, The Perception-for-Action-Control Theory (PACT): A perceptuo-motor theory of speech perception, Journal of Neurolinguistics, vol.25, issue.5, pp.336-354, 2012.
DOI : 10.1016/j.jneuroling.2009.12.004
URL : https://hal.archives-ouvertes.fr/hal-00442367

J. Schwartz, L. Boë, and C. Abry, Linking the Dispersion-Focalization Theory (DFT) and the Maximum Utilization of the Available Distinctive Features (MUAF) principle in a Perception-for-Action-Control Theory (PACT), Experimental approaches to phonology, pp.104-124, 2007.
URL : https://hal.archives-ouvertes.fr/hal-00195315

J. Schwartz, L. Boë, P. Badin, and T. R. Sawallis, Grounding stop place systems in the perceptuo-motor substance of speech: On the universality of the labial-coronal-velar stop series, Journal of Phonetics, vol.40, issue.1, pp.20-36, 2012.
DOI : 10.1016/j.wocn.2011.10.004
URL : https://hal.archives-ouvertes.fr/hal-00640400

J. Schwartz, L. Boë, N. Vallée, and C. Abry, The Dispersion-Focalization Theory of vowel systems, Journal of Phonetics, vol.25, issue.3, pp.255-286, 1997.
DOI : 10.1006/jpho.1997.0043

J. Schwartz, L. Boë, N. Vallée, and C. Abry, Major trends in vowel system inventories, Journal of Phonetics, vol.25, issue.3, pp.233-253, 1997.
DOI : 10.1006/jpho.1997.0044

J. E. Serkhane, Un bébé androïde vocalisant: Etude et modélisation des mécanismes d'exploration vocale et d'imitation orofaciale dans le développement de la parole, 2005.

J. Serkhane, J. Schwartz, and P. Bessière, Building a talking baby robot: A contribution to the study of speech acquisition and evolution, Interaction Studies, vol.6, issue.2, pp.253-286, 2005.
URL : https://hal.archives-ouvertes.fr/hal-00186575

J. Serkhane, J. Schwartz, L. Boë, B. Davis, and C. Matyear, Infants' vocalizations analyzed with an articulatory model: A preliminary report, Journal of Phonetics, vol.35, issue.3, pp.321-340, 2007.
DOI : 10.1016/j.wocn.2006.10.002
URL : https://hal.archives-ouvertes.fr/hal-00175956

J. I. Skipper, V. van Wassenhove, H. C. Nusbaum, and S. L. Small, Hearing Lips and Seeing Voices: How Cortical Areas Supporting Speech Production Mediate Audiovisual Speech Perception, Cerebral Cortex, vol.17, issue.10, pp.2387-2399, 2007.
DOI : 10.1093/cercor/bhl147

L. Steels, The Artificial Life Roots of Artificial Intelligence, Artificial Life, vol.1, issue.1_2, pp.89-125, 1994.
DOI : 10.1109/TSMC.1973.4309272

L. Steels, The Synthetic Modeling of Language Origins, Evolution of Communication An international multidisciplinary journal, vol.1, issue.1, pp.1-34, 1997.
DOI : 10.1075/eoc.1.1.02ste

L. L. Steels, The spontaneous self-organization of an adaptive language The symbol grounding problem has been solved. so what's next, Machine intelligence Symbols and embodiment: Debates on meaning and cognition, pp.205-224, 1999.

K. Stevens, The quantal nature of speech: Evidence from articulatory-acoustic data, Human communication: A unified view, pp.51-66, 1972.

K. Stevens, On the quantal nature of speech, Journal of Phonetics, vol.17, issue.1, pp.3-45, 1989.

K. Stevens and S. Keyser, Quantal theory, enhancement and overlap, Journal of Phonetics, vol.38, issue.1, pp.10-19, 2010.
DOI : 10.1016/j.wocn.2008.10.004
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.651.6723

M. Studdert-Kennedy and L. Goldstein, Launching language: The gestural origin of discrete infinity, Language evolution: The states of the art, 2003.

H. M. Sussman, D. Fruchter, J. Hilbert, and J. Sirosh, Linear correlates in the speech signal: The orderly output constraint, Behavioral and Brain Sciences, vol.21, issue.02, pp.241-259, 1998.
DOI : 10.1017/S0140525X98001174

J. B. Tenenbaum, C. Kemp, T. L. Griffiths, and N. D. Goodman, How to Grow a Mind: Statistics, Structure, and Abstraction, Science, vol.331, issue.6022, pp.1279-1285, 2011.
DOI : 10.1126/science.1192788

M. Tomasello, M. Carpenter, J. Call, T. Behne, and H. Moll, Understanding and sharing intentions: The origins of cultural cognition, Behavioral and Brain Sciences, vol.28, issue.05, pp.675-690, 2005.
DOI : 10.1017/S0140525X05000129

N. Vallée, Systèmes vocaliques: de la typologie aux prédictions, Thèse de Doctorat en Sciences du Langage, 1994.

N. Vallée, S. Rossato, and I. Rousset, Favoured syllabic patterns in the world's languages and sensorimotor constraints, Approaches to phonological complexity, pp.111-139, 2009.
DOI : 10.1515/9783110223958.111

A. Vilain, C. Abry, P. Badin, and S. Brosda, From idiosyncratic pure frames to variegated babbling: Evidence from articulatory modelling, Proceedings of the 14th International congress of phonetic sciences, pp.2497-2500, 1999.

V. Volterra, M. C. Caselli, O. Capirci, and E. Pizzuto, Gesture and the emergence and development of language, Beyond nature-nurture: Essays in honor of Elizabeth Bates, pp.3-40, 2005.

W. Zuidema and G. Westermann, Evolution of an Optimal Lexicon under Constraints from Embodiment, Artificial Life, vol.9, issue.4, pp.387-402, 2003.
DOI : 10.1016/S0004-3702(98)00066-6