World Health Organization, 2019.

B. Dodd and R. Campbell (Eds.), Hearing by eye: The psychology of lip-reading, 1987.

G. H. Nicholls and D. Ling, Cued speech and the reception of spoken language, Journal of Speech, Language, and Hearing Research, vol.25, issue.2, pp.262-269, 1982.

R. O. Cornett, Cued speech, American Annals of the Deaf, vol.112, issue.1, pp.3-13, 1967.

C. J. LaSasso, K. L. Crain, and J. Leybaert, Cued Speech and Cued Language Development for Deaf and Hard of Hearing Children, 2010.

W. C. Stokoe, Sign language structure: An outline of the visual communication systems of the American deaf, Journal of Deaf Studies and Deaf Education, vol.10, issue.1, pp.3-37, 2005.

S. K. Liddell and R. E. Johnson, American Sign Language: The phonological base, Sign Language Studies, vol.64, pp.195-277, 1989.

C. Valli and C. Lucas, Linguistics of American sign language: an introduction, 2000.

S. E. and R. , An examination of cued speech as a tool for language, literacy, and bilingualism for children who are deaf or hard of hearing, 2007.

V. P. Minotto, C. R. Jung, and B. Lee, Multimodal multi-channel on-line speaker diarization using sensor fusion through SVM, IEEE Transactions on Multimedia, vol.17, issue.10, pp.1694-1705, 2015.

O. Wu, H. Zuo, W. Hu, and B. Li, Multimodal web aesthetics assessment based on structural SVM and multitask fusion learning, IEEE Transactions on Multimedia, vol.18, issue.6, pp.1062-1076, 2016.

M. Tang, X. Wu, P. Agrawal, S. Pongpaichet, and R. Jain, Integration of diverse data sources for spatial PM2.5 data interpolation, IEEE Transactions on Multimedia, vol.19, issue.2, pp.408-417, 2017.

V. Attina, D. Beautemps, M. Cathiard, and M. Odisio, A pilot study of temporal organization in cued speech production of French syllables: rules for a cued speech synthesizer, Speech Communication, vol.44, issue.1, pp.197-214, 2004.

V. Attina, M. Cathiard, and D. Beautemps, Temporal measures of hand and speech coordination during French cued speech production, International Gesture Workshop, pp.13-24, 2005.
URL : https://hal.archives-ouvertes.fr/hal-00371993

P. Heracleous, D. Beautemps, and N. Aboutabit, Cued speech automatic recognition in normal-hearing and deaf subjects, Speech Communication, vol.52, issue.6, pp.504-512, 2010.
URL : https://hal.archives-ouvertes.fr/hal-00535529

P. Heracleous, D. Beautemps, and N. Hagita, Continuous phoneme recognition in cued speech for French, Proceedings of the 20th European Signal Processing Conference (EUSIPCO), pp.2090-2093, 2012.

L. Liu, T. Hueber, G. Feng, and D. Beautemps, Visual recognition of continuous cued speech using a tandem CNN-HMM approach, Interspeech, pp.2643-2647, 2018.
URL : https://hal.archives-ouvertes.fr/hal-01978344

Y. Lecun, Y. Bengio, and G. Hinton, Deep learning, Nature, vol.521, issue.7553, pp.436-444, 2015.

I. Goodfellow, Y. Bengio, and A. Courville, Deep learning, vol.1, 2016.

G. Potamianos, C. Neti, G. Gravier, A. Garg, and A. W. Senior, Recent advances in the automatic recognition of audiovisual speech, Proceedings of the IEEE, vol.91, issue.9, pp.1306-1326, 2003.

T. Burger, A. Caplier, and S. Mancini, Cued speech hand gestures recognition tool, European Signal Processing Conference (EUSIPCO), pp.1-4, 2005.
URL : https://hal.archives-ouvertes.fr/hal-00328131

S. Stillittano, V. Girondel, and A. Caplier, Lip contour segmentation and tracking compliant with lip-reading application constraints, Machine vision and applications, pp.1-18, 2013.
URL : https://hal.archives-ouvertes.fr/hal-00741453

L. Liu, G. Feng, and D. Beautemps, Extraction automatique de contour de lèvre à partir du modèle CLNF, Actes des 31èmes Journées d'Etude de la Parole, 2016.

L. Liu, G. Feng, and D. Beautemps, Automatic tracking of inner lips based on CLNF, Proc. IEEE-ICASSP, pp.5130-5134, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01504342

L. Liu, G. Feng, and D. Beautemps, Inner lips parameter estimation based on adaptive ellipse model, 14th International Conference on Auditory-Visual Speech Processing, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01848508

D. Svozil, V. Kvasnicka, and J. Pospichal, Introduction to multi-layer feed-forward neural networks, Chemometrics and intelligent laboratory systems, vol.39, pp.43-62, 1997.

N. Aboutabit, D. Beautemps, and L. Besacier, Hand and lip desynchronization analysis in French cued speech: Automatic temporal segmentation of hand flow, Proc. IEEE-ICASSP, vol.1, pp.I-I, 2006.
URL : https://hal.archives-ouvertes.fr/hal-00261527

L. Liu, G. Feng, and D. Beautemps, Automatic temporal segmentation of hand movement for hand position recognition in French cued speech, Proc. IEEE-ICASSP, pp.3061-3065, 2018.

N. Aboutabit, Reconnaissance de la Langue Française Parlée Complétée (LPC): décodage phonétique des gestes main-lèvres, 2007.

D. Beautemps, L. Girin, N. Aboutabit, G. Bailly, L. Besacier et al., TELMA: Telephony for the hearing-impaired people. From models to user tests, pp.201-208, 2007.

D. V. Abreu, T. K. Tamura, and R. D. Eavey, Podcasting: contemporary patient education, Ear, Nose & Throat Journal, vol.87, issue.4, p.208, 2008.

J. Naylor, Magix Movie Edit Pro, 2014.

A. Tinwell, M. Grimshaw, and D. A. Nabi, The effect of onset asynchrony in audio-visual speech and the uncanny valley in virtual characters, International Journal of Mechanisms and Robotic Systems, vol.2, issue.2, pp.97-110, 2015.

G. Gibert, G. Bailly, D. Beautemps, F. Elisei, and R. Brun, Analysis and synthesis of the three-dimensional movements of the head, face, and hand of a speaker using cued speech, The Journal of the Acoustical Society of America, vol.118, issue.2, pp.1144-1153, 2005.

F. Béchet, LIA_PHON: un système complet de phonétisation de textes, Traitement automatique des langues, vol.42, pp.47-67, 2001.

S. J. Young, The HTK hidden Markov model toolkit: Design and philosophy, 1993.

J. Shi and C. Tomasi, Good features to track, Proc. IEEE-CVPR, pp.593-600, 1994.

C. Stauffer and W. E. L. Grimson, Adaptive background mixture models for real-time tracking, Proc. IEEE-CVPR, vol.2, pp.246-252, 1999.

S. J. Young, J. J. Odell, and P. C. Woodland, Tree-based state tying for high accuracy acoustic modelling, Proceedings of the workshop on Human Language Technology, pp.307-312, 1994.

J. Schwartz, P. Escudier, and P. Teissier, Multimodal speech: Two or three senses are better than one, Language and Speech Processing, pp.377-415, 2009.