

Appendix D. List of publications

International conferences

L. Ardaillon, G. Degottex, and A. Roebel, A multi-layer F0 model for singing voice synthesis using a B-spline representation with intuitive controls, 2015.

G. Degottex, L. Ardaillon, and A. Roebel, Simple multi frame analysis methods for estimation of amplitude spectral envelope estimation in singing voice, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.4975-4979, 2016.
DOI : 10.1109/ICASSP.2016.7472624

URL : https://hal.archives-ouvertes.fr/hal-01498324

L. Ardaillon, C. Chabot-Canet, and A. Roebel, Expressive control of singing voice synthesis using musical contexts and a parametric F0 model, Interspeech 2016, pp.1250-1254, 2016.

L. Feugère, C. d'Alessandro, S. Delalez, L. Ardaillon, and A. Roebel, Evaluation of Singing Synthesis: Methodology and Case Study with Concatenative and Performative Systems, Interspeech 2016, pp.1245-1249, 2016.

L. Ardaillon and A. Roebel, A mouth opening effect based on pole modification for expressive singing voice transformation, Interspeech 2017, 2017.

National conferences

L. Ardaillon, A. Roebel, and C. Chabot-Canet, Modélisation des paramètres de contrôle pour la synthèse de voix chantée, CFA/VISHNO 2016, 2016.

Contribution to a journal paper

G. Degottex and L. Ardaillon, Multi-Frame Amplitude Envelope Estimation for Modification of Singing Voice, IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol.24, issue.7, pp.1242-1254, 2016.

Seminars

C. Chabot-Canet, L. Ardaillon, and A. Roebel, Analyse du style vocal et modélisation pour la synthèse de chant expressif: l'exemple d'Edith Piaf, colloque international "La voix dans les chansons: approches musicologiques", p.3, 2016 (proceedings waiting for publication).

L. Ardaillon and A. Roebel, Synthèse concaténative de la voix chantée, Journées des Jeunes Chercheurs en Audition, Acoustique musicale et Signal (JJCAAS), 2014.

L. Ardaillon and A. Roebel, A multi-layer F0 model for singing voice synthesis using a B-spline representation with intuitive controls, 2016.

Summer, Sciences et Voix : expressions, usages et prises en charge de l'instrument vocal humain, pp.26-30, 2016.

Master's thesis (supervision)

M. Dickerson, Modification expressive de la voix chantée, IRCAM, 2016.

Undergraduate internship's report (supervision)

L. Sébal, Stage sur le projet ChaNTeR, 2014.

6.16 Original recording of a rough (shouted) voice

6.17 Synthesis of shouted voice from sound 6.16 using PaN without roughness

6.18 Synthesis of shouted voice from sound 6.16 using PaN with original jitter and shimmer

6.19 Original recording of a rough (shouted) voice by MS singer

6.20 Synthesis of shouted voice from sound 6.19 using PaN without roughness

6.21 Synthesis of shouted voice from sound 6.19 using PaN with original jitter scaled with a factor 0

6.22 Synthesis of shouted voice from sound 6.19 using PaN with original jitter scaled with a factor 2

6.23 Original recording of "clean" loud voice without roughness by MS singer

6.24 Jitter and shimmer extracted from sound 6.19 applied on sound 6.23 using PaN

7.1 Extract of synthesis for the opera I.D. by Arnaud Petit with the EL database (with MIDI musical accompaniment)

7.2 A cappella version of the song "Les feuilles d'Interspeech" submitted to the singing synthesis challenge at the Interspeech 2017 conference, using the PaN engine, RT database, and Le Roux style model

7.4 Synthesis of the song "Les feuilles d'Interspeech" with the PaN engine and the MS database, used for the evaluation in

7.5 Synthesis of the song "Les feuilles d'Interspeech" with the PaN engine and the MS database, used for the evaluation in

7.6 Synthesis of the song "Au temps d'Interspeech" with the SVP engine and the RT database, used for the evaluation in

7.7 Synthesis of the song "Au temps d'Interspeech" with the PaN engine and the MS database, used for the evaluation in

Bibliography

V. Arora, L. Behera, and P. Sircar, Singing Voice Synthesis For Indian Classical Raga System, Signals and Systems Conference, 2009.

L. Ardaillon, Modélisation des paramètres de contrôle pour la synthèse de voix chantée, CFA/VISHNO 2016, pp.2241-2247, 2016.

L. Ardaillon and C. Chabot-canet, Expressive Control of Singing Voice Synthesis Using Musical Contexts and a Parametric F0 Model, Interspeech 2016, pp.1250-1254, 2016.
DOI : 10.21437/Interspeech.2016-1317

URL : https://hal.archives-ouvertes.fr/hal-01449835

C. d'Alessandro and B. Doval, Voice quality modification for emotional speech synthesis, Eighth European Conference on Speech Communication and Technology, 2003.

C. d'Alessandro and B. Doval, Experiments in voice quality modification of natural speech signals: The spectral approach, The Third ESCA/COCOSDA Workshop (ETRW) on Speech Synthesis, 1998.

L. Ardaillon and G. Degottex, A multi-layer F0 model for singing voice synthesis using a B-spline representation with intuitive controls, Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, pp.3375-3379, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01251898

B. S. Atal and S. L. Hanauer, Speech Analysis and Synthesis by Linear Prediction of the Speech Wave, The Journal of the Acoustical Society of America, vol.50, issue.2B, pp.637-655, 1971.
DOI : 10.1121/1.1912679

M. Akagi and H. Kitakaze, Perception of synthesized singing voices with fine fluctuation in their fundamental frequency contours, Sixth International Conference on Spoken Language Processing, 2000.

P. Alku, Glottal wave analysis with Pitch Synchronous Iterative Adaptive Inverse Filtering, Speech Communication, vol.11, pp.109-118, 1992.
DOI : 10.1016/0167-6393(92)90005-R

A. Alonso, Model d'Expressivitat Emocional per a un Sintetitzador de Veu Cantada, 2004.

L. Ardaillon and A. Roebel, A Mouth Opening Effect Based on Pole Modification for Expressive Singing Voice Transformation, Interspeech 2017, 2017.
DOI : 10.21437/Interspeech.2017-1453

URL : https://hal.archives-ouvertes.fr/hal-01534671

L. Ardaillon, Synthèse du chant, UPMC, 2013.

L. Bailly, Interaction entre cordes vocales et bandes ventriculaires en phonation : exploration in-vivo, modélisation physique, 2009.

B. Battey, Bézier Spline Modeling of Pitch-Continuous Melodic Expression and Ornamentation, Computer Music Journal, vol.28, issue.4, pp.25-39, 2004.

J. Bonada and M. Blaauw, Generation of growl-type voice qualities by spectral morphing, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, pp.6910-6914
DOI : 10.1109/ICASSP.2013.6639001

URL : http://mtg.upf.edu/system/files/publications/icassp2013_growl.pdf

M. Blaauw and J. Bonada, A Singing Synthesizer Based on PixelCNN

J. Bonada and M. Blaauw, Expressive Singing Synthesis based on Unit Selection for the Singing Synthesis Challenge, pp.1230-1234, 2016.
DOI : 10.21437/interspeech.2016-872

URL : http://repositori.upf.edu/bitstream/10230/32188/1/Bonada_Interspeech2016_expr.PDF

M. Blaauw and J. Bonada, A Neural Parametric Singing Synthesizer, Interspeech 2017, 2017.
DOI : 10.21437/Interspeech.2017-1420

URL : http://arxiv.org/pdf/1704.03809

N. Barbot, O. Boëffard, and D. Lolive, F0 stylisation with a free-knot B-spline model and simulated-annealing optimization, Ninth European Conference on Speech Communication and Technology, pp.325-328, 2005.
URL : https://hal.archives-ouvertes.fr/hal-01199085

F. Béchet, Lia_phon: Un système complet de phonétisation de textes, Traitement automatique des langues, vol.42, pp.47-67, 2001.

J. G. Beerends, Perceptual Objective Listening Quality Assessment (POLQA), The Third Generation ITU-T Standard for End-to-End Speech Quality Measurement Part II-Perceptual Model, Journal of the Audio Engineering Society, vol.61, issue.6, 2013.

G. Beller, Analyse et modèle génératif de l'expressivité : application à la Parole et à l'Interprétation musicale, 2009.

N. Bernardoni, Vocal tract resonances in singing: variation with laryngeal mechanism for male operatic singers in chest and falsetto registers, The Journal of the Acoustical Society of America, vol.135, issue.1, pp.491-501, 2014.
URL : https://hal.archives-ouvertes.fr/hal-01056878

G. Berndtsson, The KTH rule system for singing synthesis, Computer Music Journal, vol.20, issue.1, pp.76-91, 1996.

R. Bresin and A. Friberg, Synthesis and decoding of emotionally expressive music performance, IEEE SMC'99 Conference Proceedings. 1999 IEEE International Conference on Systems, Man, and Cybernetics (Cat. No.99CH37028), pp.317-322, 1999.
DOI : 10.1109/ICSMC.1999.812420

P. Birkholz, Articulatory Synthesis of Singing, Proceedings of Interspeech, pp.4001-4004, 2007.

E. Björkner, Musical Theater and Opera Singing - Why So Different? A Study of Subglottal Pressure, Voice Source, and Formant Frequency Characteristics, Journal of Voice, vol.22, issue.5, pp.533-540, 2008.

P. Boersma and G. Kovacic, Spectral characteristics of three styles of Croatian folk singing, The Journal of the Acoustical Society of America, vol.119, issue.3, pp.1805-1816, 2006.
DOI : 10.1121/1.2168549

J. Bonada and A. Loscos, Sample-based singing voice synthesizer by spectral concatenation, 2003.

A. W. Black, Perfect synthesis for all the people all of the time, Proceedings of 2002 IEEE Workshop on Speech Synthesis, pp.167-170, 2002.

N. Bogaards, Sound Analysis and Processing with AudioSculpt 2, International Computer Music Conference (ICMC)
URL : https://hal.archives-ouvertes.fr/hal-01161198

L. Bohl, Decision Trees for Phonological Rules in Continuous Speech, International Conference on Acoustics, Speech, and Signal Processing, pp.185-188, 1991.

J. Bonada, Singing Voice Synthesis Combining Excitation plus Resonance and Sinusoidal plus Residual Models, 2001.

J. Bonada, Spectral Approach to the Modeling of the Singing Voice, 2001.

J. Bonada, Spectral Processing, pp.393-445, 2011.

J. Bonada, High quality voice transformations based on modeling radiated voice pulses in frequency domain, Proc. Digital Audio Effects (DAFx). 3, pp.291-295, 2004.

J. Bonada, Voice Processing and synthesis by performance sampling and spectral models, p.251, 2008.

J. Bonada, Wide-band harmonic sinusoidal modeling, Proc of the 11th Int Conference on Digital Audio Effects, 2008.

H. Borchani, A survey on multi-output regression, Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, vol.5, issue.5, pp.216-233, 2015.

URL : http://oa.upm.es/40804/1/INVE_MEM_2015_204213.pdf

L. Breiman, Classification and regression trees, 1984.

M. E. Bestebreurtje and H. K. Schutte, Resonance Strategies for the Belting Style: Results of a Single Female Subject Study, Journal of Voice, vol.14, issue.2, pp.194-204, 2000.

J. Bretos and J. Sundberg, Measurements of vibrato parameters in long sustained crescendo notes as sung by ten sopranos, Journal of Voice, vol.17, issue.3, pp.343-352, 2003.
DOI : 10.1067/S0892-1997(03)00006-7

J. Bonada and X. Serra, Synthesis of the Singing Voice by Performance Sampling and Spectral Models, IEEE Signal Processing Magazine, vol.24, pp.67-79, 2007.
DOI : 10.1109/MSP.2007.323266

URL : http://mtg.upf.edu/system/files/publications/IEEESP-SingingVoiceSynthesis_FINAL.pdf

A. W. Black, P. Taylor, and R. Caley, The Festival Speech Synthesis System -system documentation, 2001.

P. Cano, Voice Morphing System for Impersonating in Karaoke Applications, 2000.

S. Canazza, Modeling and Control of Expressiveness in Music Performance, Proceedings of the IEEE, vol.92, pp.686-701, 2004.
DOI : 10.1109/JPROC.2004.825889

N. Campbell and A. W. Black, Prosody and the Selection of Source Units for Concatenative Synthesis, pp.279-292, 1997.
DOI : 10.1007/978-1-4612-1894-4_22

M. Campedel-Oudot, O. Cappé, and E. Moulines, Estimation of the spectral envelope of voiced sounds using a penalized likelihood approach, IEEE Transactions on Speech and Audio Processing, vol.9, issue.5, pp.469-481, 2001.
DOI : 10.1109/89.928912

A. Camacho and J. G. Harris, A sawtooth waveform inspired pitch estimator for speech and music, The Journal of the Acoustical Society of America, vol.124, issue.3, pp.1638-1652, 2008.
DOI : 10.1121/1.2951592

URL : http://www.cise.ufl.edu/~acamacho/publications/dissertation.pdf

C. Chabot-Canet, Les feuilles mortes ou les avatars d'une chanson culte : aborder les phénomènes vocaux interprétatifs dans la chanson française à travers la pratique de la reprise, pp.28-33, 2008.

C. Chabot-Canet, Interprétation, phrasé et rhétorique vocale dans la chanson française depuis 1950 : expliciter l'indicible de la voix, 2013.

D. Campbell, E. Jones, and M. Glavin, Audio quality assessment techniques: A review, and recent developments, Signal Processing, vol.89, issue.8, pp.1489-1500, 2009.
DOI : 10.1016/j.sigpro.2009.02.015

A. de Cheveigné and H. Kawahara, YIN, a fundamental frequency estimator for speech and music, The Journal of the Acoustical Society of America, vol.111, issue.4, pp.1917-1930, 2002.
URL : https://hal.archives-ouvertes.fr/hal-01106271

O. Cappé and E. Moulines, Regularization techniques for discrete cepstrum estimation, IEEE Signal Processing Letters, vol.3, issue.4, pp.100-102, 1996.
DOI : 10.1109/97.489060

P. R. Cook, Real-Time Performance Controllers for Synthesized Singing, Proceedings of the International Conference on New Interfaces for Musical Expression (NIME), pp.236-237, 2005.

P. R. Cook, Synthesis of the singing voice using a physically parameterized model of the human vocal tract, 1989.

P. R. Cook, SPASM, a Real-Time Vocal Tract Physical Model Controller; and Singer, the Companion Software Synthesis System, Computer Music Journal, vol.17, issue.1, pp.30-44, 1993.
DOI : 10.2307/3680568

P. R. Cook, Toward the Perfect Audio Morph? Singing Voice Synthesis and Processing, Proceedings of the 1st International Conference on Digital Audio Effects (DAFX), Barcelona, 1998.

R. E. Crochiere, A Weighted Overlap-Add Method of Short-time Fourier Analysis/Synthesis, IEEE Transactions on Acoustics, Speech, and Signal Processing, vol.28, issue.1, pp.99-102, 1980.

G. Carlsson and J. Sundberg, Formant frequency tuning in singing, Journal of Voice, vol.6, issue.3, pp.256-260, 1992.
DOI : 10.1016/S0892-1997(05)80150-X

B. Doval, C. d'Alessandro, and N. Henrich, The voice source as a causal/anticausal linear filter, ISCA Tutorial and Research Workshop on Voice Quality: Functions, Analysis and Synthesis, 2003.

B. Doval, C. d'Alessandro, and N. Henrich, The Spectrum of Glottal Flow Models, Acta Acustica united with Acustica, vol.92, pp.1026-1046, 2006.
URL : https://hal.archives-ouvertes.fr/hal-00368131

C. d'Alessandro, The Pitch of Short-duration Vibrato Tones, The Journal of the Acoustical Society of America, vol.95, p.1617, 1994.

G. Degottex and L. Ardaillon, Multi-Frame Amplitude Envelope Estimation for Modification of Singing Voice, IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol.24, issue.7, pp.1242-1254, 2016.
DOI : 10.1109/TASLP.2016.2551863

URL : https://hal.archives-ouvertes.fr/hal-01448760

G. Degottex and L. Ardaillon, Simple multi frame analysis methods for estimation of amplitude spectral envelope estimation in singing voice, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.4975-4979, 2016.
DOI : 10.1109/ICASSP.2016.7472624

URL : https://hal.archives-ouvertes.fr/hal-01498324

R. B. Dannenberg and I. Derenyi, Combining instrument and performance models for high-quality music synthesis, Journal of New Music Research, vol.27, issue.3, pp.211-238, 1998.

G. Degottex, Glottal source and vocal-tract separation, p.181, 2010.
URL : https://hal.archives-ouvertes.fr/tel-00554763

G. Degottex, A Time Regularization Technique for Discrete Spectral Envelopes Through Frequency Derivative, IEEE Signal Processing Letters, vol.22, issue.7, pp.978-982, 2015.
DOI : 10.1109/LSP.2014.2380412

J. C. Devaney, Characterizing singing voice fundamental frequency trajectories, IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp.73-76, 2011.

P. Depalle, G. Garcia, and X. Rodet, The recreation of a castrato voice, Farinelli's voice, Proceedings of 1995 Workshop on Applications of Signal Processing to Audio and Acoustics, pp.242-245, 1995.
DOI : 10.1109/ASPAA.1995.483000

J. Dang and K. Honda, Acoustic characteristics of the piriform fossa in models and humans, The Journal of the Acoustical Society of America, vol.101, issue.1, pp.456-465, 1997.
DOI : 10.1121/1.417990

M. Dickerson, Modification expressive de voix chantée, UPMC, 2016.

T. Dutoit and H. Leich, MBR-PSOLA: Text-To-Speech synthesis based on an MBE re-synthesis of the segments database, Speech Communication, vol.13, issue.3-4, 1993.
DOI : 10.1016/0167-6393(93)90042-J

S. Davis and P. Mermelstein, Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences, IEEE Transactions on Acoustics, Speech, and Signal Processing, vol.28, issue.4, pp.357-366, 1980.

M. Dong, Spectral Transformation of Singing Vowels by Dynamic Frequency Warping, 2011.

G. Degottex, A. Roebel, and X. Rodet, Phase Minimization for Glottal Model Estimation, IEEE Transactions on Audio, Speech, and Language Processing, vol.19, issue.5, pp.1080-1090, 2011.
DOI : 10.1109/TASL.2010.2076806

URL : https://hal.archives-ouvertes.fr/hal-01106851

G. Degottex, A. Roebel, and X. Rodet, Pitch transposition and breathiness modification using a glottal source model and its adapted vocal-tract filter, 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.5128-5131, 2011.
DOI : 10.1109/ICASSP.2011.5947511

URL : https://hal.archives-ouvertes.fr/hal-01106641

G. Degottex and Y. Stylianou, A Full-Band Adaptive Harmonic Representation of Speech, Thirteenth Annual Conference of the International Speech Communication Association, 2012.

À. Calzada Defez, J. C. Socoró, and R. Clark, Parametric model for vocal effort interpolation with Harmonics Plus Noise Models, 8th ISCA Speech Synthesis Workshop, pp.25-30, 2013.

T. Dutoit, The MBROLA project: towards a set of high quality speech synthesizers free of use for non commercial purposes, Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96, pp.1393-1396, 1996.
DOI : 10.1109/ICSLP.1996.607874

A. El-Jaroudi and J. Makhoul, Discrete all-pole modeling, IEEE Transactions on Signal Processing, vol.39, pp.411-423, 1991.
DOI : 10.1109/78.80824

URL : http://hil.t.u-tokyo.ac.jp/~kameoka/SAP/papers/El-Jaroudi1991__Discrete-All_Pole_Modeling.pdf

G. Fant, The LF-model revisited. Transformations and frequency domain analysis, STL-QPSR, vol.36, pp.119-156, 1995.

G. Fant, The voice source in connected speech, Speech Communication, vol.22, issue.2-3, 1997.
DOI : 10.1016/S0167-6393(97)00017-4

A. Friberg, R. Bresin, and J. Sundberg, Overview of the KTH rule system for musical performance, Advances in Cognitive Psychology, vol.2, issue.2, pp.145-161, 2009.
DOI : 10.2478/v10053-008-0052-x

URL : https://doi.org/10.2478/v10053-008-0052-x

L. Feugère, Evaluation of Singing Synthesis: Methodology and Case Study with Concatenative and Performative Systems, Interspeech 2016, pp.1245-1249, 2016.
DOI : 10.21437/Interspeech.2016-1248

L. Feugère, Cantor Digitalis: chironomic parametric synthesis of singing, EURASIP Journal on Audio, Speech, and Music Processing, vol.22, issue.1, 2017.

L. Feugère, Synthèse par règles de la voix chantée contrôlée par le geste et applications musicales, 2013.

J. L. Flanagan and R. M. Golden, Phase Vocoder, Bell Labs Technical Journal, vol.45, issue.9, pp.1493-1509, 1966.
DOI : 10.1121/1.1939800

URL : http://asa.scitation.org/doi/pdf/10.1121/1.1939800

H. Fujisaki and K. Hirose, Analysis of voice fundamental frequency contours for declarative sentences of Japanese, Journal of the Acoustical Society of Japan (E), vol.5, issue.4, pp.233-242, 1984.
DOI : 10.1250/ast.5.233

URL : https://www.jstage.jst.go.jp/article/ast1980/5/4/5_4_233/_pdf

G. Fant, J. Liljencrants, and Q. Lin, A four-parameter model of glottal flow, pp.1-13, 1985.

Q. Fu and P. Murphy, Adaptive inverse filtering for high accuracy estimation of the glottal source, ISCA Tutorial and Research Workshop on Non-Linear Speech Processing, 2003.

J. Ortolà i Font, Musical and phonetic controls in a singing voice synthesizer, 2001.

I. Fónagy, La vive voix: essais de psycho-phonétique, 1983.

G. D. Forney Jr., The Viterbi algorithm, Proceedings of the IEEE, pp.302-309, 1973.

C. A. Fowler, "Perceptual centers" in speech production and perception, Attention, Perception, & Psychophysics, vol.25, issue.5, pp.375-388, 1979.

M. Florentine, A. N. Popper, and R. R. Fay, Loudness, 2011.

A. Friberg, Generating Musical Performances with Director Musices, Computer Music Journal, vol.24, issue.3, pp.23-29, 2000.

C. J. Frisbie, Anthropological and Ethnomusicological Implications of a Comparative Analysis of Bushmen and African Pygmy Music, Ethnology, vol.10, issue.3, pp.265-290, 1971.

A. Friberg, Generative Rules for Music Performance: A Formal Description of a Rule System, Computer Music Journal, vol.15, issue.2, pp.56-71, 1991.
DOI : 10.2307/3680917

S. Farner, A. Röbel, and X. Rodet, Natural transformation of type and nature of the voice for extending vocal repertoire in high-fidelity applications, Audio Engineering Society Conference: 35th International Conference: Audio for Games, 2009.
URL : https://hal.archives-ouvertes.fr/hal-01106356

T. Fux, Vers un système indiquant la distance d'un locuteur par transformation de sa voix, 2012.

H. Fastl and E. Zwicker, Psychoacoustics: Facts and Models, 1990.
DOI : 10.1007/978-3-540-68888-4

M. Garnier, Vocal tract adjustments in the high soprano range, The Journal of the Acoustical Society of America, vol.127, issue.6, pp.3771-3780, 2010.
DOI : 10.1121/1.3419907

URL : https://hal.archives-ouvertes.fr/hal-00480078

A. Giovanni, Nonlinear behavior of vocal fold vibration: The role of coupling between the vocal folds, Journal of Voice, vol.13, issue.4, pp.465-476, 1999.
DOI : 10.1016/S0892-1997(99)80002-2

E. Gómez, Melodic characterization of monophonic recordings for expressive tempo transformations, Proceedings of Stockholm Music Acoustics Conference, 2003.

W. Goebl, E. Pampalk, and G. Widmer, Exploring expressive performance trajectories: six famous pianists play six Chopin pieces, Proceedings of the 8th international conference on music perception and cognition, pp.505-509, 2004.

T. Galas and X. Rodet, An improved cepstral method for deconvolution of source-filter systems with discrete spectra: Application to musical sound signals, 1990.

T. Galas and X. Rodet, Generalized Discrete Cepstral Analysis for Deconvolution of Source-Filter System with Discrete Spectra, IEEE ASSP Workshop on Applications of Signal Processing to Audio and Acoustics, 1991.

P. Gramming, Relationship between changes in voice pitch and loudness, Journal of Voice, vol.2, issue.2, pp.118-126, 1988.
DOI : 10.1016/S0892-1997(88)80067-5

Y. Hsiao and D. Childers, A new approach to formant estimation and modification based on pole interaction, Thirtieth Asilomar Conference on Signals, Systems and Computers, pp.783-787, 1996.

N. Henrich, Period-doubling occurrences in singing: the "bassu" case in traditional Sardinian "A Tenore" singing, 2006.
URL : https://hal.archives-ouvertes.fr/hal-00371458

H. Hallqvist, F. M. B. Lã, and J. Sundberg, Soul and Musical Theater: A Comparison of Two Vocal Styles, Journal of Voice, vol.31, issue.2, pp.229-235, 2017.
DOI : 10.1016/j.jvoice.2016.05.020

C. Hamon, E. Moulines, and F. Charpentier, A diphone synthesis system based on time-domain prosodic modifications of speech, International Conference on Acoustics, Speech, and Signal Processing, pp.238-241, 1989.
DOI : 10.1109/ICASSP.1989.266409

S. Huber and A. Roebel, On the use of voice descriptors for glottal source shape parameter estimation, Computer Speech & Language, vol.28, issue.5, pp.1170-1194, 2014.
DOI : 10.1016/j.csl.2013.09.006

URL : https://hal.archives-ouvertes.fr/hal-00865343

S. Huber and A. Roebel, Voice quality transformation using an extended source-filter speech model, 12th Sound and Music Computing Conference (SMC), pp.69-76, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01185324

N. Henrich, J. Smith, and J. Wolfe, Vocal tract resonances in singing: Strategies used by sopranos, altos, tenors, and baritones, The Journal of the Acoustical Society of America, vol.129, issue.2, pp.1024-1035, 2011.
DOI : 10.1121/1.3518766

URL : https://hal.archives-ouvertes.fr/hal-00569451

J. E. Huber, Formants of children, women, and men: The effects of vocal intensity variation, The Journal of the Acoustical Society of America, vol.106, issue.3, pp.1532-1542, 1999.

S. Huber, Voice Conversion by modelling and transformation of extended voice characteristics, 2015.
URL : https://hal.archives-ouvertes.fr/tel-01263614

Y. Ikemiya, K. Itoyama, and H. G. Okuno, Transcribing vocal expression from polyphonic music, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.3151-3155, 2014.
DOI : 10.1109/ICASSP.2014.6854176

URL : http://winnie.kuis.kyoto-u.ac.jp/members/ikemiya/paper/icassp-2014-ikemiya.pdf

Y. Ikemiya, K. Itoyama, and H. G. Okuno, Transferring Vocal Expression of F0 Contour Using Singing Voice Synthesizer, International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems, pp.250-259, 2014.
DOI : 10.1007/978-3-319-07467-2_27

URL : http://winnie.kuis.kyoto-u.ac.jp/members/okuno/Public/IEAAIE2014%81[Ikemiya.pdf

ISO, International Standard ISO 226: Normal Equal-Loudness Level Contours, 2003.

ITU-T, Recommendation ITU-T P.800.2: Mean opinion score interpretation and reporting

J. Janer, J. Bonada, and M. Blaauw, Performance-driven control for sample-based singing voice synthesis, Proc. of DAFx, pp.41-44, 2006.

K. Jensen, Envelope model of isolated musical sounds, Proceedings of the 2nd COST G-6 Workshop on Digital Audio Effects (DAFx99), Trondheim, 1999.

T. Jones, Objective assessment of hoarseness by measuring jitter, Clinical Otolaryngology, vol.26, issue.1, pp.29-32, 2001.

E. Joliveau, J. Smith, and J. Wolfe, Vocal tract resonances in singing: The soprano voice, The Journal of the Acoustical Society of America, vol.116, issue.4, pp.2434-2439, 2004.
DOI : 10.1121/1.1791717

T. Kako, Automatic identification for singing style based on sung melodic contour characterized in phase plane, pp.393-398, 2009.

H. Kawahara, SparkNG: Interactive MATLAB tools for introduction to speech production, perception and processing fundamentals and application of the aliasing-free L-F model component, Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, pp.1180-1181, 2016.

H. Kawahara, Speech representation and transformation using adaptive interpolation of weighted spectrum: vocoder revisited, 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing, pp.1303-1306, 1997.
DOI : 10.1109/ICASSP.1997.596185

H. Kawahara, Y. Agiomyrgiannakis, and H. Zen, Using instantaneous frequency and aperiodicity detection to estimate F0 for high-quality speech synthesis, 9th ISCA Speech Synthesis Workshop, pp.221-228, 2016.
DOI : 10.21437/SSW.2016-36

URL : http://arxiv.org/pdf/1605.07809

H. Kenmochi, Singing synthesis as a new musical instrument, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.5385-5388
DOI : 10.1109/ICASSP.2012.6289138

Kin, Visualising Singing Style Under Common Musical Events Using Pitch-Dynamics Trajectories and Modified TRACLUS Clustering, 13th International Conference on Machine Learning and Applications (ICMLA), pp.237-242, 2014.

D. H. Klatt, Software for a cascade/parallel formant synthesizer, The Journal of the Acoustical Society of America, vol.67, issue.3, pp.971-995, 1980.

A. Kirke and E. R. Miranda, Guide to computing for expressive music performance, 2012.
DOI : 10.1007/978-1-4471-4123-5

H. Kawahara, I. Masuda-Katsuse, and A. de Cheveigné, Restructuring speech representations using a pitch adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds, Speech Communication, vol.27, issue.3, pp.187-207, 1999.
URL : https://hal.archives-ouvertes.fr/hal-01105608

H. Kenmochi and H. Ohshita, VOCALOID - Commercial singing synthesizer based on sample concatenation, pp.4009-4010, 2007.

M. Kob, Physical modeling of the singing voice, 2002.

M. Kob, Analysis and modelling of overtone singing in the sygyt style, Applied Acoustics, vol.65, issue.12, pp.1249-1259, 2004.
DOI : 10.1016/j.apacoust.2004.04.010

P. Kabal and R. Ramachandran, The computation of line spectral frequencies using Chebyshev polynomials, IEEE Transactions on Acoustics, Speech, and Signal Processing, vol.34, issue.6, pp.1419-1426, 1986.
DOI : 10.1109/TASSP.1986.1164983

URL : http://users.rowan.edu/~ravi/journal/jour_1986_01.pdf

E. Klabbers and R. Veldhuis, On the reduction of concatenation artefacts in diphone synthesis, pp.1983-1986, 1998.

J. Latorre and M. Akamine, Multilevel parametric-base F0 model for speech synthesis, Ninth Annual Conference of the International Speech Communication Association -Interspeech, pp.2274-2277, 2008.

A. Lagier, The shouted voice: A pilot study of laryngeal physiology under extreme aerodynamic pressure, Logopedics Phoniatrics Vocology, vol.42, issue.4, 2016.

W.-H. Lai, F0 Control Model for Mandarin Singing Voice Synthesis, 2007 Second International Conference on Digital Telecommunications (ICDT'07), 2007.
DOI : 10.1109/ICDT.2007.14

P. Lanchantin, Automatic Phoneme Segmentation with Relaxed Textual Constraints, Proceedings of the Sixth International Language Resources and Evaluation (LREC'08), 2008.
URL : https://hal.archives-ouvertes.fr/hal-01161385

A. Loscos and J. Bonada, Emulating rough and growl voice in spectral domain, Proc. of the 7th Int. Conference on Digital Audio Effects, 2004.

J.-S. Liénard and C. Barras, Fine-grain voice strength estimation from vowel spectral cues, Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, pp.128-132, 2013.

D. Lolive, N. Barbot, and O. Boeffard, B-Spline Model Order Selection With Optimal MDL Criterion Applied to Speech Fundamental Frequency Stylization, IEEE Journal of Selected Topics in Signal Processing, vol.4, issue.3, pp.571-581, 2010.
DOI : 10.1109/JSTSP.2010.2048236

J. Laroche and M. Dolson, Phase-vocoder: about this phasiness business, Proceedings of 1997 Workshop on Applications of Signal Processing to Audio and Acoustics, 1997.
DOI : 10.1109/ASPAA.1997.625603

URL : http://www.ee.columbia.edu/~dpwe/papers/LaroD97-phasiness.pdf

J. Laroche and M. Dolson, New phase-vocoder techniques for pitch-shifting, harmonizing and other exotic effects, Proceedings of the 1999 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics. WASPAA'99 (Cat. No.99TH8452), pp.91-94, 1999.
DOI : 10.1109/ASPAA.1999.810857

URL : http://www.ee.columbia.edu/~dpwe/papers/LaroD99-pvoc.pdf

J. Liénard and M. Benedetto, Effect of vocal effort on spectral properties of vowels, The Journal of the Acoustical Society of America, vol.106, issue.1, pp.411-433, 1999.
DOI : 10.1121/1.428140

Lee, M. Dong, and H. Li, A study of F0 modelling and generation with lyrics and shape characterization for singing voice synthesis, 2012 8th International Symposium on Chinese Spoken Language Processing, pp.150-154, 2012.
DOI : 10.1109/ISCSLP.2012.6423491

P. Lanchantin, G. Degottex, and X. Rodet, A HMM-based speech synthesis system using a new glottal source and vocal-tract separation method, IEEE International Conference on Acoustics Speech and Signal Processing, pp.4630-4633, 2010.
DOI : 10.1109/icassp.2010.5495550

URL : https://hal.archives-ouvertes.fr/hal-01161230

Lee, Generalized F0 modelling with absolute and relative pitch features for singing voice synthesis, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.429-432, 2012.
DOI : 10.1109/ICASSP.2012.6287908

S. Lee, A comparative study of spectral transformation techniques for singing voice synthesis, Proceedings of the Annual Conference of the International Speech Communication Association, 2014.

M. E. Lee, Acoustic Models for the Analysis and Synthesis of the Singing Voice, 2005.

Lindestad, Voice Source Characteristics in Mongolian "Throat Singing" Studied with High-Speed Imaging Technique, Acoustic Spectra, and Inverse Filtering, Journal of Voice, vol.15, issue.1, pp.78-85, 2001.
DOI : 10.1016/S0892-1997(01)00008-X

Lolive, Comparing B-Spline and Spline Models for F0 Modelling, In: Lecture Notes in Computer Science, vol.4188, pp.423-430, 2006.
DOI : 10.1007/11846406_53

URL : https://hal.archives-ouvertes.fr/hal-01199086

M. Liuni and A. Röbel, Phase vocoder and beyond, pp.73-89, 2013.
URL : https://hal.archives-ouvertes.fr/hal-01250848

M. W. Macon, A singing voice synthesis system based on sinusoidal modeling, IEEE International Conference on Acoustics, Speech, and Signal Processing, pp.435-438, 1997.

M. W. Macon, Concatenation-based MIDI-to-Singing Voice Synthesis, 1997.

Mizuno, M. Abe, and T. Hirokawa, Waveform-based speech synthesis approach with a formant frequency modification, IEEE International Conference on Acoustics Speech and Signal Processing, pp.195-198, 1993.
DOI : 10.1109/ICASSP.1993.319267

J. Makhoul, Linear prediction: A tutorial review, Proceedings of the IEEE, pp.561-580, 1975.
DOI : 10.1109/PROC.1975.9792

Marcus, Acoustic determinants of perceptual center (P-center) location, Perception & Psychophysics, vol.30, issue.3, pp.247-256, 1981.
DOI : 10.3758/BF03214280

URL : https://link.springer.com/content/pdf/10.3758%2FBF03214280.pdf

R. Maher and J. Beauchamp, An investigation of vocal vibrato for synthesis, Applied Acoustics, vol.30, issue.2-3, pp.219-245, 1990.
DOI : 10.1016/0003-682X(90)90045-V

Mayor, J. Bonada, and A. Loscos, The Singing Tutor: Expression Categorization and Segmentation of the Singing Voice, Proceedings of the AES 121st Convention, 2006.

E. Maestre, J. Bonada, and O. Mayor, Modeling musical articulation gestures in singing voice performances, Proceedings of the AES 121st Convention, 2006.

E. Moulines and F. Charpentier, Pitch-synchronous waveform processing techniques for text-to-speech synthesis using diphones, Speech Communication, vol.9, issue.5-6, 1990.
DOI : 10.1016/0167-6393(90)90021-Z

J. Makhoul and A. El-jaroudi, Time-scale modification in medium to low rate speech coding, ICASSP '86. IEEE International Conference on Acoustics, Speech, and Signal Processing, pp.1705-1708, 1986.
DOI : 10.1109/ICASSP.1986.1169252

J. D. Markel and A. Gray, Linear Prediction of Speech, Springer-Verlag, vol.12, 1976.

E. Moulines and J. Laroche, Non-parametric techniques for pitch-scale and time-scale modification of speech, Speech Communication, vol.16, pp.175-205, 1995.
DOI : 10.1016/0167-6393(94)00054-E

R. C. Maher, Control of Synthesized Vibrato during Portamento Musical Pitch Transitions, Journal of the Audio Engineering Society, vol.56, issue.1/2, pp.18-27, 2008.

Molina, Parametric model of spectral envelope to synthesize realistic intensity variations in singing voice, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.634-638, 2014.
DOI : 10.1109/ICASSP.2014.6853673

J. J. Moré, The Levenberg-Marquardt algorithm: implementation and theory, In: Numerical Analysis, pp.105-116, 1978.

R. J. McAulay and T. F. Quatieri, Speech Analysis/Synthesis Based on a Sinusoidal Representation, IEEE Transactions on Acoustics, Speech, and Signal Processing, vol.34, issue.4, pp.744-754, 1986.

Miller and H. Schutte, Formant tuning in a professional baritone, Journal of Voice, vol.4, issue.3, pp.231-237, 1990.
DOI : 10.1016/S0892-1997(05)80018-9

J. Muñoz, Acoustic and Perceptual Indicators of Normal and Pathological Voice, Folia Phoniatrica et Logopaedica, vol.55, issue.2, pp.102-114, 2003.

Nakamura, HMM-Based singing voice synthesis and its application to Japanese and English, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.265-269, 2014.
DOI : 10.1109/ICASSP.2014.6853599

Comparative Studies on Vocal Expressions in Japanese Traditional and Western Classical-Style Singing, Using a Common Verse, Proc. ICA, pp.295-296, 2004.


Neubauer and H. Herzel, Calls out of chaos: the adaptive significance of nonlinear phenomena in mammalian vocal production, Animal Behaviour, vol.63, issue.3, pp.407-418, 2002.

O. Nieto, Voice Transformations for Extreme Vocal Effects, 2008.

M. Nishimura, Singing Voice Synthesis Based on Deep Neural Networks, Interspeech 2016, pp.2478-2482, 2016.
DOI : 10.21437/Interspeech.2016-1027

T. L. Nwe and H. Li, Exploring Vibrato-Motivated Acoustic Features for Singer Identification, IEEE Transactions on Audio, Speech, and Language Processing, vol.15, issue.2, pp.519-530, 2007.

K. I. Nordstrom, Transforming Perceived Vocal Effort and Breathiness Using Adaptive Pre-Emphasis Linear Prediction, IEEE Transactions on Audio, Speech, and Language Processing, vol.16, pp.1087-1096, 2008.

Nose, HMM-based expressive singing voice synthesis with singing style control and robust pitch modeling, Computer Speech & Language, vol.34, issue.1, pp.308-322, 2015.
DOI : 10.1016/j.csl.2015.04.001

Nakata and S. E. Trehub, Expressive timing and dynamics in infant-directed and non-infant-directed singing, Psychomusicology: Music, Mind and Brain, vol.21, issue.1-2, pp.130-138, 2010.
DOI : 10.1037/h0094003

Obin, MeLos: Analysis and Modelling of Speech Prosody and Speaking Style, p.266, 2011.
URL : https://hal.archives-ouvertes.fr/tel-00694687

J. J. Odell, The Use of Context in Large Vocabulary Speech Recognition, 1995.

Ohishi, Statistical Modeling of F0 Dynamics in Singing Voices Based on Gaussian Processes with Multiple Oscillation Bases, pp.2598-2601, 2010.

Ohishi, A Stochastic Model of Singing Voice F0 Contours for Characterizing Expressive Dynamic Components, 13th Annual Conference of the International Speech Communication Association (INTERSPEECH), pp.474-477, 2012.

A. V. Oppenheim, Speech Analysis-Synthesis System Based on Homomorphic Filtering, The Journal of the Acoustical Society of America, vol.45, issue.2, pp.458-465, 1969.
DOI : 10.1121/1.1911395

Obin, A. Roebel, and G. Bachman, On automatic voice casting for expressive speech: Speaker recognition vs. speech classification, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.950-954, 2014.
DOI : 10.1109/ICASSP.2014.6853737

URL : https://hal.archives-ouvertes.fr/hal-00943796

Oura, Recent development of the HMM-based singing voice synthesis system - Sinsy, 7th ISCA Workshop on Speech Synthesis (SSW-7), pp.211-216, 2010.

Oura, Pitch adaptive training for HMM-based singing voice synthesis, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.5377-5380, 2012.
DOI : 10.1109/ICASSP.2012.6289136

Obin, C. Veaux, and P. Lanchantin, Making sense of variations: Introducing alternatives in speech synthesis, Proceedings of the 6th International Conference on Speech Prosody (SP2012), pp.179-182, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00663837

M. Panteli, Towards the characterization of singing styles in world music, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.636-640, 2017.
DOI : 10.1109/ICASSP.2017.7952233

Park, Time Course of the First Formant Bandwidth, Annual Meeting of the Berkeley Linguistics Society, pp.213-224, 2002.
DOI : 10.3765/bls.v28i1.3836

O. Perrotin and C. d'Alessandro, Vocal Effort Modification for Singing Synthesis, Interspeech 2016, pp.1235-1239, 2016.
DOI : 10.21437/Interspeech.2016-1096

URL : https://hal.archives-ouvertes.fr/hal-01712564

C. Pedersen, Leja ordering LSFs for accurate estimation of predictor coefficients, Proceedings of the Annual Conference of the International Speech Communication Association, pp.2545-2548, 2011.

Peeters, A large set of audio features for sound description (similarity and classification) in the CUIDADO project, 2004.

M. Pfleiderer, Vocal pop pleasures. Theoretical, analytical and empirical approaches to voice and singing in popular music, IASPM@Journal, vol.1, issue.1, pp.1-16, 2010.
DOI : 10.5429/2079-3871(2010)v1i1.7en

Portnoff, Implementation of the digital phase vocoder using the fast Fourier transform, IEEE Transactions on Acoustics, Speech, and Signal Processing, vol.24, issue.3, pp.243-248, 1976.
DOI : 10.1109/TASSP.1976.1162810

M. D. Plumpe, T. F. Quatieri, and D. A. Reynolds, Modeling of the glottal flow derivative waveform with application to speaker identification, IEEE Transactions on Speech and Audio Processing, vol.7, issue.5, pp.569-585, 1999.
DOI : 10.1109/89.784109

E. Prame, Measurements of the vibrato rate of ten singers, The Journal of the Acoustical Society of America, vol.96, issue.4, pp.1979-1984, 1994.

Pantazis, O. Rosec, and Y. Stylianou, On the properties of a time-varying quasi-harmonic model of speech, Proceedings of the Annual Conference of the International Speech Communication Association, pp.1044-1047, 2008.

Pantazis, O. Rosec, and Y. Stylianou, Adaptive AM-FM Signal Decomposition With Application to Speech Analysis, IEEE Transactions on Audio, Speech, and Language Processing, vol.19, issue.2, pp.290-300, 2011.
DOI : 10.1109/TASL.2010.2047682

Puckette, Phase-locked vocoder, Proceedings of 1995 Workshop on Applications of Signal Processing to Audio and Acoustics, pp.222-225, 1995.
DOI : 10.1109/ASPAA.1995.482995

Quinlan, C4.5: Programs for Machine Learning, 1993.

Raitio, HMM-Based Speech Synthesis Utilizing Glottal Inverse Filtering, IEEE Transactions on Audio, Speech, and Language Processing, vol.19, issue.1, pp.153-165, 2011.
DOI : 10.1109/TASL.2010.2045239

URL : https://www.research.ed.ac.uk/portal/files/15268997/HMM_Based_Speech_Synthesis_Utilizing_Glottal.pdf

ITU-R Recommendation BS.1284-1, General methods for the subjective assessment of sound quality, pp.1-13, 2003.

Roubeau and N. Henrich, Laryngeal Vibratory Mechanisms: The Notion of Vocal Register Revisited, Journal of Voice, vol.23, issue.4, pp.425-438, 2009.
DOI : 10.1016/j.jvoice.2007.10.014

URL : https://hal.archives-ouvertes.fr/hal-00319915

Ruinskiy and Y. Lavner, Stochastic models of pitch jitter and amplitude shimmer for voice modification, 2008 IEEE 25th Convention of Electrical and Electronics Engineers in Israel, pp.489-493, 2008.
DOI : 10.1109/EEEI.2008.4736577

A. Roebel and S. Maller, Transforming vibrato extent in monophonic sounds, Proc. of the 14th Int. Conference on Digital Audio Effects (DAFx-11), 2011.
URL : https://hal.archives-ouvertes.fr/hal-01161310

A. Röbel, Frequency-Slope Estimation and Its Application to Parameter Estimation for Non-Stationary Sinusoids, Computer Music Journal, vol.32, issue.2, pp.68-79, 2008.
DOI : 10.1162/comj.2008.32.2.68

A. Röbel, A Shape-Invariant Phase Vocoder For Speech Transformation, 13th International Conference on Digital Audio Effects (DAFx), 2010.

X. Rodet, Synthesis and processing of the singing voice, Proc. 1st IEEE Benelux Workshop on Model based Processing and Coding of Audio, pp.99-108, 2002.
URL : https://hal.archives-ouvertes.fr/hal-01105758

X. Rodet, Transformation et synthèse de la voix parlée et de la voix chantée, In: Parole et musique, 2009.

A. Roebel, Analysis and modification of excitation source characteristics for singing voice synthesis, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.5381-5384, 2012.
DOI : 10.1109/ICASSP.2012.6289137

URL : https://hal.archives-ouvertes.fr/hal-01106709

A. Roebel, A new approach to transient processing in the phase vocoder, Proc. of the 6th Int. Conference on Digital Audio Effects (DAFx-03), 2003.
URL : https://hal.archives-ouvertes.fr/hal-01161124

Rodet, Y. Potard, and J. Barriere, The CHANT Project: From the Synthesis of the Singing Voice to Synthesis in General, Computer Music Journal, vol.8, issue.3, pp.15-31, 1984.
DOI : 10.2307/3679810

A. Röbel and X. Rodet, Efficient spectral envelope estimation and its application to pitch shifting and envelope preservation, Proc. of the 8th Int. Conference on Digital Audio Effects (DAFx'05), Madrid, Spain, 2005.

X. Rodet and A. Roebel, Real time signal transposition with envelope preservation in the phase vocoder, Proc. International Computer Music Conference (ICMC'05), pp.672-675, 2005.
URL : https://hal.archives-ouvertes.fr/hal-01161347

A. Röbel, F. Villavicencio, and X. Rodet, On cepstral and all-pole based spectral envelope modeling with unknown model order, Pattern Recognition Letters, vol.28, issue.11, pp.1343-1350, 2007.
DOI : 10.1016/j.patrec.2006.11.021

S. Roucos and A. Wilgus, High quality time-scale modification for speech, ICASSP '85. IEEE International Conference on Acoustics, Speech, and Signal Processing, pp.493-496, 1985.
DOI : 10.1109/ICASSP.1985.1168381

Saitou, Analysis of Acoustic Features Affecting "Singing-ness" and Its Application to Singing-Voice Synthesis from Speaking-Voice, 8th International Conference on Spoken Language Processing - INTERSPEECH, 2004.

Saino, An HMM-based singing voice synthesis system, 9th International Conference on Spoken Language Processing - Interspeech, pp.2274-2277, 2006.

T. Saitou, Speech-to-Singing Synthesis: Converting Speaking Voices to Singing Voices by Controlling Acoustic Features Unique to Singing Voices, 2007 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, pp.215-218, 2007.
DOI : 10.1109/ASPAA.2007.4393001

URL : http://staff.aist.go.jp/m.goto/PAPER/WASPAA2007saitou.pdf

K. Sakakibara, Growl Voice in Ethnic and Pop Styles, Proceedings of the International Symposium on Musical Acoustics, 2004.

S. Sakai, Additive Modeling of English F0 Contour for Speech Synthesis, Proceedings (ICASSP '05), IEEE International Conference on Acoustics, Speech, and Signal Processing, pp.277-280, 2005.
DOI : 10.1109/ICASSP.2005.1415104

URL : http://www.sls.csail.mit.edu/sls/publications/2005/sakai_f0_icassp05.pdf

J. L. Santacruz, Spectral Envelope Transformation in Singing Voice for Advanced Pitch Shifting, Applied Sciences, vol.6, issue.11, p.368, 2016.

Smialek, P. Depalle, and D. Brackett, A spectrographic analysis of vocal techniques in extreme metal for musicological analysis, pp.88-93, 2012.

X. Serra, A system for sound analysis/transformation/synthesis based on a deterministic plus stochastic decomposition, 1989.
DOI : 10.2307/3680788

E. Schoonderwaldt and A. Friberg, Toward a rule-based model for violin vibrato, Current Research Directions in Computer Music, pp.61-64, 2001.

M. Schroder and M. Grice, Expressing vocal effort in concatenative synthesis, Proc. 15th international conference of phonetic sciences (ICPhS), pp.2589-2592, 2003.

Saitou and M. Goto, Acoustic and Perceptual Effects of Vocal Training in Amateur Male Singing, 10th Annual Conference of the International Speech Communication Association - Interspeech, pp.832-835, 2009.

A. Stan and M. Giurgiu, A superpositional model applied to F0 parameterization using DCT for text-to-speech synthesis, 2011 6th Conference on Speech Technology and Human-Computer Dialogue (SpeD), 2011.
DOI : 10.1109/SPED.2011.5940734

Shih, Prosody Control for Speaking and Singing Styles, 7th European Conference on Speech Communication and Technology - Eurospeech, pp.669-672, 2001.

Shirota, Integration of speaker and pitch adaptive training for HMM-based singing voice synthesis, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.2578-2582, 2014.
DOI : 10.1109/ICASSP.2014.6854062

X. Serra and J. Smith, Spectral Modeling Synthesis: A Sound Analysis/Synthesis System Based on a Deterministic Plus Stochastic Decomposition, Computer Music Journal, vol.14, issue.4, p.12, 1990.
DOI : 10.2307/3680788

Shiga and S. King, Estimating the spectral envelope of voiced speech using multi-frame analysis, 2003.

Shiga and S. King, Estimation of voice source and vocal tract characteristics based on multi-frame analysis, pp.1749-1752, 2003.

J. Sundberg, F. M. B. Lã, and B. P. Gill, Formant Tuning Strategies in Professional Male Opera Singers, Journal of Voice, vol.27, issue.3, pp.278-288, 2013.
DOI : 10.1016/j.jvoice.2012.12.002

H. K. Schutte and D. G. Miller, Belting and Pop, Nonclassical Approaches to the Female Middle Voice: Some Preliminary Considerations, Journal of Voice, vol.7, issue.2, pp.142-150, 1993.

Smith, On an Unusual Mode of Chanting by Certain Tibetan Lamas, The Journal of the Acoustical Society of America, vol.41, issue.5, p.1262, 1967.
DOI : 10.1121/1.1910466

J. P. H. van Santen, T. Mishra, and E. Klabbers, Estimating Phrase Curves in the General Superpositional Intonation Model, In: 5th ISCA Speech Synthesis Workshop, pp.61-66, 2004.

Y. Sasaki and H. Okamura, Harmonics-to-noise ratio and psychophysical measurement of the degree of hoarseness, Journal of Speech and Hearing Research, vol.27, pp.2-6, 1984.

Stables, Towards a model for the humanisation of pitch drift in singing voice synthesis, 2011.

Saino, M. Tachibana, and H. Kenmochi, A Singing Style Modeling System for Singing Voice Synthesizers, pp.2894-2897, 2010.

Stylianou, Applying the harmonic plus noise model in concatenative speech synthesis, IEEE Transactions on Speech and Audio Processing, vol.9, issue.1, pp.21-29, 2001.
DOI : 10.1109/89.890068

Saitou, M. Unoki, and M. Akagi, Extraction of F0 dynamic characteristics and development of F0 control model in singing voice, Proceedings of the 2002 International Conference on Auditory Display, 2002.

Saitou, M. Unoki, and M. Akagi, Development of an F0 control model based on F0 dynamic characteristics for singing-voice synthesis, Speech Communication, vol.46, issue.3-4, pp.405-417, 2005.
DOI : 10.1016/j.specom.2005.01.010

J. Sundberg, Level and Center Frequency of the Singer's Formant, Journal of Voice, vol.15, issue.2, pp.176-186, 2001.
DOI : 10.1016/S0892-1997(01)00019-4

J. Sundberg, The KTH synthesis of singing, Advances in Cognitive Psychology, vol.2, issue.2-3, 2006.
DOI : 10.2478/v10053-008-0051-y

URL : https://doi.org/10.2478/v10053-008-0051-y

J. Sundberg, Synthesising Singing, Proceedings SMC'07, 4th Sound and Music Computing Conference, Lefkada, Greece, pp.9-13, July 2007.

J. Sundberg, The Journal of the Acoustical Society of America, vol.87, issue.1, 1990.
DOI : 10.1121/1.399243

Tamura, Adaptation of pitch and spectrum for HMM-based speech synthesis using MLLR, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221), pp.805-808, 2001.
DOI : 10.1109/ICASSP.2001.941037

Tamura, Text-to-speech synthesis with arbitrary speaker's voice from average voice, 7th European Conference on Speech Communication and Technology - Eurospeech, pp.345-348, 2001.

Taylor, A. W. Black, and R. Caley, The Architecture of the Festival Speech Synthesis System, Proc. 3rd ESCA Workshop on Speech Synthesis, pp.147-151, 1998.

Thibault and P. Depalle, Adaptive processing of singing voice timbre, Canadian Conference on Electrical and Computer Engineering 2004 (IEEE Cat. No.04CH37513), pp.871-874, 2004.
DOI : 10.1109/CCECE.2004.1345253

Tilmanne and T. Dutoit, Continuous Control of Style and Style Transitions through Linear Interpolation in Hidden Markov Model Based Walk Synthesis, In: Transactions on Computational Science XVI, pp.34-54, 2012.
DOI : 10.1007/978-3-642-32663-9_3

Traunmüller and A. Eriksson, Acoustic effects of variation in vocal effort by men, women, and children, The Journal of the Acoustical Society of America, vol.107, issue.6, pp.3438-3451, 2000.
DOI : 10.1121/1.429414

Terhardt, On the perception of periodic sound fluctuations (roughness), Acta Acustica united with Acustica, vol.30, pp.201-213, 1974.

Tigges, Observation and modelling of glottal biphonation, Acta Acustica united with Acustica, vol.83, issue.4, pp.707-714, 1997.

Thalén and J. Sundberg, Describing different styles of singing: A comparison of a female singer's voice source in "Classical", "Pop", "Jazz" and "Blues", Logopedics Phoniatrics Vocology, vol.12, issue.2, pp.82-93, 2001.
DOI : 10.1016/S0892-1997(98)80021-0

Tsai, Aggressiveness of the Growl-Like Timbre: Acoustic Characteristics, Musical Implications, and Biomechanical Mechanisms, Music Perception, pp.209-222, 2010.
DOI : 10.1525/mp.2010.27.3.209

Turk, Voice Quality Interpolation for Emotional Text-to-Speech Synthesis, 9th European Conference on Speech Communication and Technology, 2005.

Titze and A. S. Worley, Modeling source-filter interaction in belting and high-pitched operatic male singing, The Journal of the Acoustical Society of America, vol.126, issue.3, p.1530, 2009.
DOI : 10.1121/1.3160296

URL : http://europepmc.org/articles/pmc2757425?pdf=render

Teutenberg, C. Watson, and P. Riddle, Modelling and synthesizing F0 contours with the discrete cosine transform, IEEE International Conference on Acoustics, Speech and Signal Processing, pp.3973-3976, 2008.
DOI : 10.1109/icassp.2008.4518524

M. Umbert, J. Bonada, and M. Blaauw, Generating singing voice expression contours based on unit selection, Stockholm Music Acoustics Conference (SMAC), pp.315-320, 2013.

M. Umbert, J. Bonada, and M. Blaauw, Systematic database creation for expressive singing voice synthesis control, 8th ISCA Workshop on Speech Synthesis, pp.213-216, 2013.

M. Umbert, Expression Control in Singing Voice Synthesis: Features, approaches, evaluation, and challenges, IEEE Signal Processing Magazine, vol.32, issue.6, pp.55-73, 2015.
DOI : 10.1109/MSP.2015.2424572

M. Umbert, Expression Control of Singing Voice Synthesis: Modeling Pitch and Dynamics with Unit Selection and Statistical Approaches, 2015.

Uneson, Burcas - A Simple Concatenation-based MIDI-to-Singing Voice Synthesis System for Swedish, 2002.

A. van den Oord, WaveNet: A generative model for raw audio, arXiv, 2016.

J. van den Berg, Myoelastic-aerodynamic theory of voice production, Journal of Speech, Language, and Hearing Research, vol.1, issue.3, pp.227-244, 1958.

, Joint cost for unit selection speech synthesis, 2004.

A. Verma and A. Kumar, Introducing Roughness in Individuality Transformation through Jitter Modeling and Modification, Proceedings (ICASSP '05), IEEE International Conference on Acoustics, Speech, and Signal Processing, pp.5-8, 2005.
DOI : 10.1109/ICASSP.2005.1415036

Valbret, E. Moulines, and J. P. Tubach, Voice transformation using PSOLA technique, Speech Communication, vol.11, issue.2-3, 1992.

Villavicencio, A. Röbel, and X. Rodet, Improving LPC Spectral Envelope Extraction of Voiced Speech by True-Envelope Estimation, 2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings, pp.869-872, 2006.
DOI : 10.1109/ICASSP.2006.1660159

URL : https://hal.archives-ouvertes.fr/hal-01161354

Villavicencio, A. Röbel, and X. Rodet, All-Pole Spectral Envelope Modelling with Order Selection for Harmonic Signals, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '07, 2007.
DOI : 10.1109/ICASSP.2007.366613

H. Wakita, Direct estimation of the vocal tract shape by inverse filtering of acoustic speech waveforms, IEEE Transactions on Audio and Electroacoustics, vol.21, issue.5, pp.417-427, 1973.
DOI : 10.1109/TAU.1973.1162506

G. Widmer and W. Goebl, Computational Models of Expressive Music Performance: The State of the Art, Journal of New Music Research, vol.33, issue.3, pp.203-216, 2004.
DOI : 10.1080/0929821042000317804

Wise, Yodel species: a typology of falsetto effects in popular music vocal styles, Radical Musicology, vol.2, p.57, 2007.

, Improving the modeling of the noise part in the harmonic plus noise model of speech, IEEE International Conference on Acoustics, Speech and Signal Processing, pp.4609-4612, 2008.

Yoshimura, Speaker interpolation for HMM-based speech synthesis system, The Journal of the Acoustical Society of Japan (E), vol.21, issue.4, pp.199-206, 2000.
DOI : 10.1250/ast.21.199

Yoshimura, Mixed excitation for HMM-based speech synthesis, 7th European Conference on Speech Communication and Technology Eurospeech'01, pp.2263-2266, 2001.

Yoshimura, Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis, 6th European Conference on Speech Communication and Technology, 1999.

Yeh, A. Roebel, and X. Rodet, Multiple fundamental frequency estimation and polyphony inference of polyphonic music signals, IEEE Transactions on Audio, Speech and Language Processing, vol.18, issue.6, pp.1116-1126, 2010.
URL : https://hal.archives-ouvertes.fr/hal-01106539

U. Zölzer, DAFX: Digital Audio Effects, 2011.
DOI : 10.1002/9781119991298

Zen, K. Tokuda, and A. W. Black, Statistical parametric speech synthesis, Speech Communication, vol.51, issue.11, pp.1039-1064, 2009.
DOI : 10.1016/j.specom.2009.04.004

URL : https://hal.archives-ouvertes.fr/hal-00746106

Zwicker, Ein Verfahren zur Berechnung der Lautstärke, Acta Acustica united with Acustica, pp.304-308, 1960.

E. Zwicker, Subdivision of the Audible Frequency Range into Critical Bands, The Journal of the Acoustical Society of America, vol.33, issue.2, p.248, 1961.