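Construction rapide, performante et mutualisée de systèmes de reconnaissance et de synthèse de la parole pour de nouvelles langues (Rapid, high-performance, and shared construction of speech recognition and speech synthesis systems for new languages)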

Abstract : This thesis studies the joint construction of speech recognition and speech synthesis systems for new languages, with the goals of accuracy and rapid development. The rapid development of voice technologies for new languages is driving scientific ambitions and is now considered strategic by industrial players. However, language development research is led by a few research centers, each working on a limited number of languages, even though these technologies share many common points. Our study focuses on building and sharing tools between systems for creating lexicons, learning phonetic rules, and taking advantage of imperfect data. Our contributions concern the selection of relevant data for training acoustic models, the joint development of phonetizers and pronunciation lexicons for speech recognition and synthesis, and the use of neural models for phonetic transcription from text and from the speech signal. In addition, we present an approach for automatically detecting phonetic transcription errors in annotated speech databases. This study shows that it is possible to significantly reduce the quantity of annotated data needed to develop new text-to-speech systems, which in turn reduces data collection time when creating new systems. Finally, we study an application case by jointly building speech recognition and speech synthesis systems for a new language.
Document type :
Theses
Cited literature [163 references]

https://hal.archives-ouvertes.fr/tel-02446915
Contributor : Kévin Vythelingum
Submitted on : Tuesday, January 21, 2020 - 11:38:31 AM
Last modification on : Thursday, January 23, 2020 - 10:51:26 AM

File

kevin_vythelingum_thesis_final...
Files produced by the author(s)

Identifiers

  • HAL Id : tel-02446915, version 1

Citation

Kévin Vythelingum. Construction rapide, performante et mutualisée de systèmes de reconnaissance et de synthèse de la parole pour de nouvelles langues. Informatique et langage [cs.CL]. Le Mans Université, 2019. Français. ⟨tel-02446915⟩
