M. Brainard and A. Doupe, What songbirds teach us about learning, Nature, vol.417, issue.6886, p.351, 2002.

C. Heyes, Causes and consequences of imitation, Trends in cognitive sciences, vol.5, issue.6, pp.253-261, 2001.

C. Heyes, What's social about social learning?, Journal of Comparative Psychology, vol.126, issue.2, p.193, 2012.

P. Kuhl, Early language acquisition: cracking the speech code, Nature reviews neuroscience, vol.5, issue.11, p.831, 2004.

P. Oudeyer, The self-organization of speech sounds, Journal of Theoretical Biology, vol.233, issue.3, pp.435-449, 2005.
URL : https://hal.archives-ouvertes.fr/inria-00001176

D. Wolpert and M. Kawato, Multiple paired forward and inverse models for motor control, Neural Networks, vol.11, issue.7-8, pp.1317-1329, 1998.

P. Oudeyer, F. Kaplan, and V. Hafner, Intrinsic motivation systems for autonomous mental development, IEEE transactions on evolutionary computation, vol.11, issue.2, pp.265-286, 2007.

A. Doupe and P. Kuhl, Birdsong and human speech: common themes and mechanisms, Annual review of neuroscience, vol.22, issue.1, pp.567-631, 1999.

C. Moulin-frier, S. Nguyen, and P. Oudeyer, Self-organization of early vocal development in infants and machines: the role of intrinsic motivation, Frontiers in psychology, vol.4, p.1006, 2014.
URL : https://hal.archives-ouvertes.fr/hal-00927940

T. Imada, Y. Zhang, M. Cheour, S. Taulu, A. Ahonen et al., Infant speech perception activates broca's area: a developmental magnetoencephalography study, Neuroreport, vol.17, issue.10, pp.957-962, 2006.

S. Robinson, M. Blumberg, M. Lane, and L. Kreber, Spontaneous motor activity in fetal and infant rats is organized into discrete multilimb bouts, Behavioral neuroscience, vol.114, issue.2, p.328, 2000.

P. Wallace and I. Whishaw, Independent digit movements and precision grip patterns in 1-5-month-old human infants: hand-babbling, including vacuous then self-directed hand and digit movements, precedes targeted reaching, Neuropsychologia, vol.41, issue.14, pp.1912-1918, 2003.

M. Chakraborty and E. Jarvis, Brain evolution by brain pathway duplication, Philosophical Transactions of the Royal Society B: Biological Sciences, vol.370, issue.1684, p.20150056, 2015.

E. Jarvis, Evolution of vocal learning and spoken language, Science, vol.366, issue.6461, pp.50-54, 2019.

A. Friederici, The brain basis of language processing: from structure to function, Physiological reviews, vol.91, issue.4, pp.1357-1392, 2011.

P. Kuhl, A new view of language acquisition, Proceedings of the National Academy of Sciences, vol.97, issue.22, pp.11850-11857, 2000.

R. H. Hahnloser and A. Kotowicz, Auditory representations and memory in birdsong learning, Current opinion in neurobiology, vol.20, issue.3, pp.332-339, 2010.

F. Theunissen, K. Sen, and A. Doupe, Spectral-temporal receptive fields of nonlinear auditory neurons obtained using natural sounds, Journal of Neuroscience, vol.20, issue.6, pp.2315-2331, 2000.

G. Keller and R. H. Hahnloser, Neural processing of auditory feedback during vocal practice in a songbird, Nature, vol.457, issue.7226, p.187, 2009.

V. Gallese, L. Fadiga, L. Fogassi, and G. Rizzolatti, Action recognition in the premotor cortex, Brain, vol.119, issue.2, pp.593-609, 1996.

E. Oztop, M. Kawato, and M. Arbib, Mirror neurons and imitation: A computationally guided review, Neural Networks, vol.19, issue.3, pp.254-271, 2006.

J. Prather, S. Peters, S. Nowicki, and R. Mooney, Precise auditory-vocal mirroring in neurons for learned vocal communication, Nature, vol.451, issue.7176, p.305, 2008.

G. Rizzolatti, L. Fadiga, V. Gallese, and L. Fogassi, Premotor cortex and the recognition of motor actions, Cognitive brain research, vol.3, issue.2, pp.131-141, 1996.

A. Tramacere, K. Wada, K. Okanoya, A. Iriki, and P. Ferrari, Auditory-motor matching in vocal recognition and imitative learning, Neuroscience, 2019.

R. Hahnloser and S. Ganguli, Vocal learning with inverse models, Principles of Neural Coding, pp.547-564, 2013.

M. Sizemore and D. Perkel, Premotor synaptic plasticity limited to the critical period for song learning, Proceedings of the National Academy of Sciences, vol.108, issue.42, p.17492, 2011.

R. Darshan, W. Wood, S. Peters, A. Leblois, and D. Hansel, A canonical neural mechanism for behavioral variability, Nature communications, vol.8, p.15415, 2017.

N. Giret, J. Kornfeld, S. Ganguli, and R. H. Hahnloser, Evidence for a causal inverse model in an avian cortico-basal ganglia circuit, Proceedings of the National Academy of Sciences, vol.111, pp.6063-6068, 2014.
URL : https://hal.archives-ouvertes.fr/hal-01101365

M. Kawato, Internal models for motor control and trajectory planning, Current opinion in neurobiology, vol.9, issue.6, pp.718-727, 1999.

A. Liberman and I. Mattingly, The motor theory of speech perception revised, Cognition, vol.21, issue.1, pp.1-36, 1985.

C. Fowler, Speech perception as a perceptuo-motor skill, Neurobiology of Language, pp.175-184, 2016.

S. Wilson, A. Saygin, M. Sereno, and M. Iacoboni, Listening to speech activates motor areas involved in speech production, Nature Neuroscience, vol.7, issue.7, pp.701-702, 2004.

J. Schwartz, A. Basirat, L. Ménard, and M. Sato, The perception-for-action-control theory (PACT): A perceptuo-motor theory of speech perception, Journal of Neurolinguistics, vol.25, issue.5, pp.336-354, 2012.

C. Boettiger and A. Doupe, Developmentally restricted synaptic plasticity in a songbird nucleus required for song learning, Neuron, vol.31, issue.5, pp.809-818, 2001.

L. Ding and D. Perkel, Long-term potentiation in an avian basal ganglia nucleus essential for vocal learning, Journal of Neuroscience, vol.24, issue.2, pp.488-494, 2004.

W. Mehaffey and A. Doupe, Naturalistic stimulation drives opposing heterosynaptic plasticity at two inputs to songbird cortex, Nature neuroscience, vol.18, issue.9, p.1272, 2015.

R. Legenstein, S. Chase, A. Schwartz, and W. Maass, A reward-modulated hebbian learning rule can explain experimentally observed network reorganization in a brain control task, Journal of Neuroscience, vol.30, issue.25, pp.8400-8410, 2010.

W. Schultz, Predictive reward signal of dopamine neurons, Journal of neurophysiology, vol.80, issue.1, pp.1-27, 1998.

J. Goldberg, M. Farries, and M. Fee, Basal ganglia output to the thalamus: still a paradox, Trends in neurosciences, vol.36, issue.12, pp.695-705, 2013.

A. Andalman and M. Fee, A basal ganglia-forebrain circuit in the songbird biases motor output to avoid vocal errors, Proceedings of the National Academy of Sciences, vol.106, p.12518, 2009.

C. Scharff and F. Nottebohm, A comparative study of the behavioral deficits following lesions of various parts of the zebra finch song system: implications for vocal learning, Journal of Neuroscience, vol.11, issue.9, pp.2896-2913, 1991.

G. Fant, Acoustic theory of speech production: with calculations based on X-ray studies of Russian articulations, vol.2, 2012.

P. Birkholz, A survey of self-oscillating lumped-element models of the vocal folds, Studientexte zur Sprachkommunikation: Elektronische Sprachsignalverarbeitung, pp.47-58, 2011.

K. Ishizaka and J. Flanagan, Synthesis of voiced sounds from a two-mass model of the vocal cords, Bell system technical journal, vol.51, issue.6, pp.1233-1268, 1972.

B. Erath, M. Zanartu, K. Stewart, M. Plesniak, D. Sommer et al., A review of lumped-element models of voiced speech, Speech Communication, vol.55, issue.5, pp.667-690, 2013.

P. Ladefoged, Elements of acoustic phonetics, 1996.

G. Westerman and E. Miranda, Modelling the development of mirror neurons for auditory-motor integration, Journal of new music research, vol.31, issue.4, pp.367-375, 2002.

B. de Boer, Self-organization in vowel systems, Journal of phonetics, vol.28, issue.4, pp.441-465, 2000.

B. de Boer, The origins of vowel systems, vol.1, 2001.

S. Maeda, Compensatory articulation in speech: analysis of x-ray data with an articulatory model, First European Conference on Speech Communication and Technology, 1989.

P. Boersma, Functional phonology: Formalizing the interactions between articulatory and perceptual drives, Holland Academic Graphics, The Hague, vol.11, 1998.

A. Warlaumont and M. Finnegan, Learning to produce syllabic speech sounds via reward-modulated neural plasticity, PloS one, vol.11, issue.1, p.145096, 2016.

P. Birkholz, Vocaltractlab-towards high-quality articulatory speech synthesis, 2019.

P. Birkholz, D. Jackèl, and B. Kröger, Construction and control of a three-dimensional vocal tract model, ICASSP, vol.1, 2006.

A. Philippsen, R. Reinhart, and B. Wrede, Learning how to speak: Imitation-based refinement of syllable production in an articulatory-acoustic model, ICDL-EpiRob, pp.195-200, 2014.

A. Philippsen, R. Reinhart, and B. Wrede, Goal babbling of acoustic-articulatory models with adaptive exploration noise, ICDL-EpiRob, pp.72-78, 2016.

M. Murakami, B. Kröger, P. Birkholz, and J. Triesch, Seeing [u] aids vocal learning: Babbling and imitation of vowels using a 3d vocal tract model, reinforcement learning, and reservoir computing, ICDL-EpiRob, pp.208-213, 2015.

I. Howard and P. Birkholz, Modelling vowel acquisition using the birkholz synthesizer, Studientexte zur Sprachkommunikation: Elektronische Sprachsignalverarbeitung, pp.304-311, 2019.

J. Gudhnason, D. Mehta, and T. Quatieri, Evaluation of speech inverse filtering techniques using a physiologically based synthesizer, ICASSP, pp.4245-4249, 2015.

S. Prom-on, P. Birkholz, and Y. Xu, Training an articulatory synthesizer with continuous acoustic data, INTERSPEECH, pp.349-353, 2013.

S. Maeda, Compensatory articulation during speech: Evidence from the analysis and synthesis of vocal-tract shapes using an articulatory model, Speech production and speech modelling, pp.131-149, 1990.

I. Howard and M. Huckvale, Training a vocal tract synthesiser to imitate speech using distal supervised learning, SPECOM, vol.2, pp.159-162, 2005.

C. Moulin-frier and P. Oudeyer, Curiosity-driven phonetic learning, ICDL-EpiRob, pp.1-8, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00762795

I. Howard and P. Messum, A computational model of infant speech development, SPECOM, pp.756-765, 2007.

F. Guenther, S. Ghosh, A. Nieto-castanon, and J. Tourville, A neural model of speech production, Speech production: Models, phonetic processes and techniques, pp.27-40, 2006.

J. Tourville and F. Guenther, The diva model: A neural theory of speech acquisition and production, Language and cognitive processes, vol.26, issue.7, pp.952-981, 2011.

G. Bailly, Learning to speak. sensori-motor control of speech movements, Speech Communication, vol.22, issue.2-3, pp.251-267, 1997.

S. Forestier, Y. Mollard, and P. Oudeyer, Intrinsically motivated goal exploration processes with automatic curriculum learning, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01651233

J. Acevedo-valle, V. Hafner, and C. Angulo, Social reinforcement in artificial prelinguistic development: A study using intrinsically motivated exploration architectures, IEEE Transactions on Cognitive and Developmental Systems, 2018.

S. Maeda, Vtcalcs

I. Howard and P. Messum, Modeling the development of pronunciation in infant speech acquisition, Motor Control, vol.15, issue.1, pp.85-117, 2011.

B. Kröger, J. Kannampuzha, and C. Neuschaefer-rube, Towards a neurocomputational model of speech production and perception, Speech Communication, vol.51, issue.9, pp.793-809, 2009.

C. Lyon, C. Nehaniv, and J. Saunders, Interactive language learning by robots: The transition from babbling to word forms, PloS one, vol.7, issue.6, p.38236, 2012.

Free Software Foundation, 2007.

H. Liu and Y. Xu, Learning model-based f0 production through goal-directed babbling, ISCSLP, pp.284-288, 2014.

L. Alonso, J. Alliende, F. Goller, and G. Mindlin, Low-dimensional dynamical model for the diversity of pressure patterns used in canary song, Physical Review E, vol.79, issue.4, p.41929, 2009.

C. Elemans, J. Rasmussen, C. Herbst, D. Düring, S. Zollinger et al., Universal mechanisms of sound production and control in birds and mammals, Nature communications, vol.6, p.8978, 2015.

M. Fee, B. Shraiman, B. Pesaran, and P. Mitra, The role of nonlinear dynamics of the syrinx in the vocalizations of a songbird, Nature, vol.395, issue.6697, p.67, 1998.

R. Laje, T. Gardner, and G. Mindlin, Neuromuscular control of vocalizations in birdsong: a model, Physical Review E, vol.65, issue.5, p.51921, 2002.

G. Mindlin, The physics of birdsong production, Contemporary physics, vol.54, pp.91-96, 2013.

M. Trevisan, J. Mendez, and G. Mindlin, Respiratory patterns in oscine birds during normal respiration and song production, Physical Review E, vol.73, issue.6, p.61911, 2006.

I. Yildiz and S. Kiebel, A hierarchical neuronal model for generation and online recognition of birdsongs, PLoS Computational Biology, vol.7, issue.12, p.1002303, 2011.

R. Alonso, M. Trevisan, A. Amador, F. Goller, and G. Mindlin, A circular model for song motor control in serinus canaria, Frontiers in computational neuroscience, vol.9, p.41, 2015.

K. Srivastava, C. Elemans, and S. Sober, Multifunctional and context-dependent control of vocal acoustics by individual muscles, Journal of Neuroscience, vol.35, issue.42, p.14183, 2015.

D. Düring, B. Knörlein, and C. Elemans, In situ vocal fold properties and pitch prediction by dynamic actuation of the songbird syrinx, Scientific reports, vol.7, issue.1, p.11296, 2017.

S. Sober, M. Wohlgemuth, and M. Brainard, Central contributions to acoustic variation in birdsong, Journal of Neuroscience, vol.28, issue.41, p.10370, 2008.

A. Amador, Y. Perl, G. Mindlin, and D. Margoliash, Elemental gesture dynamics are encoded by song premotor cortical neurons, Nature, vol.495, issue.7439, p.59, 2013.

K. Doya and T. Sejnowski, A computational model of birdsong learning by auditory experience and auditory feedback, Central auditory processing and neural modeling, pp.77-88, 1998.

I. Fiete, M. Fee, and H. Seung, Model of birdsong learning based on gradient estimation by dynamic perturbation of neural conductances, Journal of neurophysiology, vol.98, issue.4, pp.2038-2057, 2007.

S. Boari, Y. Perl, A. Amador, D. Margoliash, and G. Mindlin, Automatic reconstruction of physiological gestures used in a model of birdsong production, Journal of neurophysiology, vol.114, issue.5, pp.2912-2922, 2015.

Y. Teramoto, D. Takahashi, P. Holmes, and A. Ghazanfar, Vocal development in a waddington landscape, eLife, vol.6, p.20782, 2017.

S. Forestier and P. Oudeyer, A unified model of speech and tool use early development, 39th Annual Conference of the Cognitive Science Society, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01583301

T. Troyer and A. Doupe, An associational model of birdsong sensorimotor learning i. efference copy and the learning of song syllables, Journal of Neurophysiology, vol.84, issue.3, pp.1204-1223, 2000.

M. Barnaud, J. Schwartz, P. Bessière, and J. Diard, Computer simulations of coupled idiosyncrasies in speech perception and speech production with cosmo, a perceptuo-motor bayesian model of speech communication, PloS one, vol.14, issue.1, p.210302, 2019.
URL : https://hal.archives-ouvertes.fr/hal-01994708

L. Cohen and A. Billard, Social babbling: The emergence of symbolic gestures and words, Neural Networks, 2018.

S. Pagliarini, X. Hinaut, and A. Leblois, A bio-inspired model towards vocal gesture learning in songbird, ICDL-EpiRob, 2018.
URL : https://hal.archives-ouvertes.fr/hal-01906459

Carnegie Mellon University, The CMU pronouncing dictionary

S. Najnin and B. Banerjee, A predictive coding framework for a developmental agent: Speech motor skill acquisition and speech production, Speech Communication, vol.92, pp.24-41, 2017.

K. Doya and T. Sejnowski, A computational model of avian song learning, The new cognitive neurosciences, 2000.

R. Reinhart, Reservoir computing with output feedback, 2011.

M. Jordan and D. Rumelhart, Forward models: Supervised learning with a distal teacher, Cognitive science, vol.16, issue.3, pp.307-354, 1992.

M. Pickering and S. Garrod, An integrated theory of language production and comprehension, Behavioral and brain sciences, vol.36, issue.4, pp.329-347, 2013.

M. Kawato, Feedback-error-learning neural network for supervised motor learning, Advanced neural computers, pp.365-372, 1990.

M. Rolf, J. Steil, and M. Gienger, Goal babbling permits direct learning of inverse kinematics, IEEE Transactions on Autonomous Mental Development, vol.2, issue.3, pp.216-229, 2010.

T. Sejnowski, Storing covariance with nonlinearly interacting neurons, Journal of mathematical biology, vol.4, issue.4, pp.303-321, 1977.

C. Moulin-frier, J. Diard, J. Schwartz, and P. Bessière, Cosmo (communicating about objects using sensory-motor operations): A bayesian modeling framework for studying speech communication and the emergence of phonological systems, Journal of Phonetics, vol.53, pp.5-41, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01230175

S. Forestier and P. Oudeyer, Curiosity-driven development of tool use precursors: a computational model, 38th Annual Conference of the Cognitive Science Society, pp.1859-1864, 2016.
URL : https://hal.archives-ouvertes.fr/hal-01354013

A. Baranes and P. Oudeyer, Active learning of inverse models with intrinsically motivated goal exploration in robots, Robotics and Autonomous Systems, vol.61, issue.1, pp.49-73, 2013.
URL : https://hal.archives-ouvertes.fr/hal-00788440

J. Liljencrants and B. Lindblom, Numerical simulation of vowel quality systems: The role of perceptual contrast, Language, vol.48, issue.4, pp.839-862, 1972.

A. Dhawale, M. Smith, and B. Ölveczky, The role of variability in motor learning, Annual review of neuroscience, vol.40, pp.479-498, 2017.

D. Wolpert, Z. Ghahramani, and J. Flanagan, Perspectives and problems in motor learning, Trends in Cognitive Sciences, vol.5, issue.11, pp.487-494, 2001.

A. Laversanne-finot, A. Péré, and P. Oudeyer, Curiosity driven exploration of learned disentangled goal spaces, 2018.
URL : https://hal.archives-ouvertes.fr/hal-01891598

A. Ghazanfar and D. Liao, Constraints and flexibility during vocal development: insights from marmoset monkeys, Current opinion in behavioral sciences, vol.21, pp.27-32, 2018.

L. Baird, Residual algorithms: Reinforcement learning with function approximation, Machine Learning Proceedings, pp.30-37, 1995.

O. Barzelay, M. Furst, and O. Barak, A new approach to model pitch perception using sparse coding, PLoS computational biology, vol.13, issue.1, p.1005338, 2017.

G. Beckers, Bird speech perception and vocal production: a comparison with humans, Human Biology, vol.83, issue.2, pp.191-213, 2011.

P. Birkholz, Vocaltractlab-towards high-quality articulatory speech synthesis, 2014.

S. W. Bottjer and D. R. Sengelaub, Cell death during development of a forebrain nucleus involved with vocal learning in zebra finches, Journal of neurobiology, vol.20, issue.7, pp.609-618, 1989.

M. Brainard and A. Doupe, Auditory feedback in learning and maintenance of vocal behaviour, Nature Reviews Neuroscience, vol.1, issue.1, p.31, 2000.

T. Donath, U. Natke, and K. Kalveram, Effects of frequency-shifted auditory feedback on voice F0 contours in syllables, The Journal of the Acoustical Society of America, vol.111, issue.1, pp.357-366, 2002.

J. A. Jones and K. G. Munhall, Perceptual calibration of F0 production: Evidence from feedback perturbation, The Journal of the Acoustical Society of America, vol.108, issue.3, pp.1246-1251, 2000.

Y. Lim, R. Lagoy, B. Shinn-Cunningham, and T. Gardner, Transformation of temporal sequences in the zebra finch auditory system, eLife, vol.5, p.e18205, 2016.

A. Lotto, K. Kluender, and L. Holt, Perceptual compensation for coarticulation by japanese quail (coturnix coturnix japonica), The Journal of the Acoustical Society of America, vol.102, issue.2, pp.1134-1140, 1997.

R. Mooney, Neural mechanisms for learned birdsong, Learning & Memory, vol.16, issue.11, pp.655-669, 2009.

Microsoft Developer Network, Microsoft Speech API (SAPI) 5.4, Microsoft.

. Py-oudeyer, The self-organization of speech sounds, Journal of Theoretical Biology, vol.233, issue.3, pp.435-449, 2005.

. Py-oudeyer, V. V. Kaplan, and . Hafner, Intrinsic motivation systems for autonomous mental development, IEEE transactions on evolutionary computation, vol.11, issue.2, pp.265-286, 2007.

E. Oztop, M. Kawato, and M. A. Arbib, Mirror neurons and imitation: A computationally guided review, Neural Networks, vol.19, issue.3, pp.254-271, 2006.

E. Oztop, M. Kawato, and M. A. Arbib, Mirror neurons: functions, mechanisms and models, Neuroscience letters, vol.540, pp.43-55, 2013.

S. Pagliarini, X. Hinaut, and A. Leblois, A bio-inspired model towards vocal gesture learning in songbird, ICDL-EpiRob, 2018.
URL : https://hal.archives-ouvertes.fr/hal-01906459

A. K. Philippsen, R. F. Reinhart, and B. Wrede, Learning how to speak: Imitation-based refinement of syllable production in an articulatory-acoustic model, ICDL-EpiRob, pp.195-200, 2014.

A. K. Philippsen, R. F. Reinhart, and B. Wrede, Goal babbling of acoustic-articulatory models with adaptive exploration noise, Development and Learning and Epigenetic Robotics, pp.72-78, 2016.

M. J. Pickering and S. Garrod, An integrated theory of language production and comprehension, Behavioral and brain sciences, vol.36, issue.4, pp.329-347, 2013.

J. F. Prather, S. Peters, S. Nowicki, and R. Mooney, Precise auditory-vocal mirroring in neurons for learned vocal communication, Nature, vol.451, issue.7176, p.305, 2008.

S. Prom-on, P. Birkholz, and Y. Xu, Training an articulatory synthesizer with continuous acoustic data, INTERSPEECH, pp.349-353, 2013.

R. F. Reinhart, Reservoir computing with output feedback, 2011.

G. Rizzolatti, L. Fadiga, V. Gallese, and L. Fogassi, Premotor cortex and the recognition of motor actions, Cognitive brain research, vol.3, issue.2, pp.131-141, 1996.

S. R. Robinson, M. Blumberg, M. Lane, and L. Kreber, Spontaneous motor activity in fetal and infant rats is organized into discrete multilimb bouts, Behavioral neuroscience, vol.114, issue.2, p.328, 2000.

M. Rolf, J. J. Steil, and M. Gienger, Goal babbling permits direct learning of inverse kinematics, IEEE Transactions on Autonomous Mental Development, vol.2, issue.3, pp.216-229, 2010.

C. Scharff and F. Nottebohm, A comparative study of the behavioral deficits following lesions of various parts of the zebra finch song system: implications for vocal learning, Journal of Neuroscience, vol.11, issue.9, pp.2896-2913, 1991.

W. Schultz, Predictive reward signal of dopamine neurons, Journal of neurophysiology, vol.80, issue.1, pp.1-27, 1998.

J. Schwartz, A. Basirat, L. Ménard, and M. Sato, The perception-for-action-control theory (PACT): A perceptuo-motor theory of speech perception, Journal of Neurolinguistics, vol.25, issue.5, pp.336-354, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00442367

T. J. Sejnowski, Storing covariance with nonlinearly interacting neurons, Journal of mathematical biology, vol.4, issue.4, pp.303-321, 1977.

M. Sizemore and D. J. Perkel, Premotor synaptic plasticity limited to the critical period for song learning, Proceedings of the National Academy of Sciences, vol.108, issue.42, pp.17492-17497, 2011.

S. J. Sober, M. J. Wohlgemuth, and M. S. Brainard, Central contributions to acoustic variation in birdsong, Journal of Neuroscience, vol.28, issue.41, pp.10370-10379, 2008.

K. H. Srivastava, C. P. H. Elemans, and S. J. Sober, Multifunctional and context-dependent control of vocal acoustics by individual muscles, Journal of Neuroscience, vol.35, issue.42, pp.14183-14194, 2015.

Y. Teramoto, D. Y. Takahashi, P. Holmes, and A. A. Ghazanfar, Vocal development in a Waddington landscape, eLife, vol.6, p.20782, 2017.

F. E. Theunissen, K. Sen, and A. J. Doupe, Spectral-temporal receptive fields of nonlinear auditory neurons obtained using natural sounds, Journal of Neuroscience, vol.20, issue.6, pp.2315-2331, 2000.

A. Tramacere, K. Wada, K. Okanoya, A. Iriki, and P. F. Ferrari, Auditory-motor matching in vocal recognition and imitative learning, Neuroscience, 2019.

M. A. Trevisan, J. M. Mendez, and G. B. Mindlin, Respiratory patterns in oscine birds during normal respiration and song production, Physical Review E, vol.73, issue.6, p.061911, 2006.

T. W. Troyer and A. J. Doupe, An associational model of birdsong sensorimotor learning I. Efference copy and the learning of song syllables, Journal of Neurophysiology, vol.84, issue.3, pp.1204-1223, 2000.

P. S. Wallace and I. Q. Whishaw, Independent digit movements and precision grip patterns in 1-5-month-old human infants: hand-babbling, including vacuous then self-directed hand and digit movements, precedes targeted reaching, Neuropsychologia, vol.41, issue.14, pp.1912-1918, 2003.

A. S. Warlaumont and M. K. Finnegan, Learning to produce syllabic speech sounds via reward-modulated neural plasticity, PloS one, vol.11, issue.1, p.145096, 2016.

G. Westermann and E. R. Miranda, Modelling the development of mirror neurons for auditory-motor integration, Journal of new music research, vol.31, issue.4, pp.367-375, 2002.

S. M. Wilson, A. P. Saygin, M. I. Sereno, and M. Iacoboni, Listening to speech activates motor areas involved in speech production, Nature Neuroscience, vol.7, issue.7, pp.701-702, 2004.

D. M. Wolpert, Z. Ghahramani, and J. R. Flanagan, Perspectives and problems in motor learning, Trends in Cognitive Sciences, vol.5, issue.11, pp.487-494, 2001.

D. M. Wolpert and M. Kawato, Multiple paired forward and inverse models for motor control, Neural networks, vol.11, issue.7-8, pp.1317-1329, 1998.

I. B. Yildiz and S. J. Kiebel, A hierarchical neuronal model for generation and online recognition of birdsongs, PLoS Computational Biology, vol.7, issue.12, p.1002303, 2011.