Conference paper, Year: 2016

Weakly-supervised text-to-speech alignment confidence measure

Abstract

This work proposes a new confidence measure for evaluating the output of text-to-speech alignment systems. Such alignment is a key component of many applications, including semi-automatic corpus anonymization, lip syncing, film dubbing, corpus preparation for speech synthesis, and the training of acoustic models for speech recognition. The confidence measure exploits deep neural networks that are trained on large corpora without direct supervision. It is evaluated on an open-source spontaneous speech corpus, where it outperforms a confidence score derived from a state-of-the-art text-to-speech aligner. We further show that the measure can be used to fine-tune the output of this aligner and improve the quality of the resulting alignment.
Main file
170_Paper.pdf (358.79 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-01378355, version 1 (10-10-2016)

Identifiers

  • HAL Id: hal-01378355, version 1

Cite

Guillaume Serrière, Christophe Cerisara, Dominique Fohr, Odile Mella. Weakly-supervised text-to-speech alignment confidence measure. International Conference on Computational Linguistics (COLING), Dec 2016, Osaka, Japan. ⟨hal-01378355⟩
