M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen et al., TensorFlow: large-scale machine learning on heterogeneous distributed systems, 2016.

M. Bisani and H. Ney, Joint-sequence models for grapheme-to-phoneme conversion, Speech Commun, vol.50, issue.5, pp.434-451, 2008.
URL : https://hal.archives-ouvertes.fr/hal-00499203

N. Campbell, Loudness, spectral tilt, and perceived prominence in dialogues, Proceedings ICPhS, vol.95, pp.676-679, 1995.

N. Campbell, A. Esposito, M. Faundez-Zanuy, and E. Keller, On the use of nonverbal speech sounds in human communication, Verbal and Nonverbal Communication Behaviours, vol.4775, pp.117-128, 2007.

W. N. Campbell, Prosodic encoding of English speech, Second International Conference on Spoken Language Processing, 1992.

A. C. Cohn, C. Fougeron, and M. K. Huffman, The Oxford Handbook of Laboratory Phonology, pp.103-114, 2012.

J. Cole, Y. Mo, and M. Hasegawa-Johnson, Signal-based and expectation-based factors in the perception of prosodic prominence, Lab. Phonol, vol.1, issue.2, pp.425-452, 2010.

S. Galliano, E. Geoffrois, D. Mostefa, K. Choukri, J. F. Bonastre et al., The ESTER phase II evaluation campaign for the rich transcription of French broadcast news, pp.1149-1152, 2005.

M. Heldner, On the reliability of overall intensity and spectral emphasis as acoustic correlates of focal accents in Swedish, J. Phon, vol.31, issue.1, pp.39-62, 2003.

P. E. Honnet, A. Lazaridis, P. N. Garner, and J. Yamagishi, The SIWIS French speech synthesis database – Design and recording of a high quality French database for speech synthesis, 2017.

D. Kingma and J. Ba, Adam: a method for stochastic optimization, 2014.

K. Li and H. Meng, Automatic lexical stress and pitch accent detection for L2 English speech using multi-distribution deep neural networks, Speech Commun, 2016.

K. Li, S. Zhang, M. Li, W. K. Lo, and H. M. Meng, Prominence model for prosodic features in automatic lexical stress and pitch accent detection, pp.2009-2012, 2011.

L. Narupiyakul, V. Keselj, N. Cercone, and B. Sirinaovakul, Focus to emphasize tone analysis for prosodic generation, Comput. Math. Appl, vol.55, issue.8, pp.1735-1753, 2008.

E. Nöth, A. Batliner, A. Kießling, R. Kompe, and H. Niemann, Verbmobil: the use of prosody in the linguistic components of a speech understanding system, IEEE Trans. Speech Audio Process, vol.8, issue.5, pp.519-532, 2000.

D. Povey, A. Ghoshal, G. Boulianne, L. Burget, O. Glembek et al., The Kaldi speech recognition toolkit, IEEE 2011 Workshop on Automatic Speech Recognition and Understanding, 2011.

E. Shriberg, A. Stolcke, D. Hakkani-Tür, and G. Tür, Prosody-based automatic segmentation of speech into sentences and topics, Speech Commun, vol.32, issue.1, pp.127-154, 2000.

A. M. Sluijter, S. Shattuck-Hufnagel, K. N. Stevens, and V. van Heuven, Supralaryngeal resonance and glottal pulse shape as correlates of prosodic stress and accent in American English, 1995.

A. M. Sluijter and V. J. van Heuven, Spectral balance as an acoustic correlate of linguistic stress, J. Acoust. Soc. Am, vol.100, issue.4, pp.2471-2485, 1996.

B. M. Streefkerk, L. C. Pols, and L. ten Bosch, Automatic detection of prominence (as defined by listeners' judgements) in read aloud Dutch sentences, 1998.

J. Tepperman and S. Narayanan, Automatic syllable stress detection using prosodic features for pronunciation evaluation of language learners, Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2005), vol.1, p.937, 2005.

D. van Kuijk and L. Boves, Acoustic characteristics of lexical stress in continuous telephone speech, Speech Commun, vol.27, issue.2, pp.95-111, 1999.

B. Wheatley, G. Doddington, C. Hemphill, J. Godfrey, E. Holliman et al., Robust automatic time alignment of orthographic transcriptions with unconstrained speech, 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing, vol.1, pp.533-536, 1992.

C. W. Wightman and M. Ostendorf, Automatic labeling of prosodic patterns, IEEE Trans. Speech Audio Process, vol.2, issue.4, pp.469-481, 1994.

K. Yu, F. Mairesse, and S. Young, Word-level emphasis modelling in HMM-based speech synthesis, 2010 IEEE International Conference on Acoustics Speech and Signal Processing (ICASSP), pp.4238-4241, 2010.

M. D. Zeiler, M. Ranzato, R. Monga, M. Mao, K. Yang et al., On rectified linear units for speech processing, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.3517-3521, 2013.

J. Zhao, H. Yuan, J. Liu, and S. Xia, Automatic lexical stress detection using acoustic features for computer assisted language learning, Proceedings of the APSIPA ASC, pp.247-251, 2011.

Y. Zhu, J. Liu, and R. Liu, Automatic lexical stress detection for English learning, Proceedings of the 2003 International Conference on Natural Language Processing and Knowledge Engineering, pp.728-733, 2003.