Weakly-supervised text-to-speech alignment confidence measure

Guillaume Serrière; Christophe Cerisara; Dominique Fohr; Odile Mella

Conference paper, Year: 2016

Guillaume Serrière

Role: Author
PersonId : 990245

Natural Language Processing : representations, inference and semantics

Christophe Cerisara

Role: Author
PersonId : 2353
IdHAL : christophe-cerisara
IdRef : 102700168

Natural Language Processing : representations, inference and semantics

Dominique Fohr

Role: Author
PersonId : 15652
IdHAL : dominique-fohr
IdRef : 031092942

Speech Modeling for Facilitating Oral-Based Communication

Odile Mella

Role: Author
PersonId : 15902
IdHAL : odile-mella
IdRef : 12011903X

Speech Modeling for Facilitating Oral-Based Communication

Abstract

This work proposes a new confidence measure for evaluating the output of text-to-speech alignment systems. Such alignment is a key component of many applications, including semi-automatic corpus anonymization, lip syncing, film dubbing, corpus preparation for speech synthesis, and acoustic model training for speech recognition. The confidence measure exploits deep neural networks that are trained on large corpora without direct supervision. It is evaluated on an open-source spontaneous speech corpus and outperforms a confidence score derived from a state-of-the-art text-to-speech aligner. We further show that this confidence measure can be used to fine-tune the output of this aligner and improve the quality of the resulting alignment.

Domains

Computation and Language [cs.CL]

Main file

170_Paper.pdf (358.79 KB)

Origin: Files produced by the author(s)

https://hal.science/hal-01378355

Submitted on: Monday, October 10, 2016, 09:04:15

Last modified on: Monday, September 11, 2023, 17:41:18

Long-term archiving on: Saturday, February 4, 2017, 00:20:24

Dates and versions

hal-01378355 , version 1 (10-10-2016)

Identifiers

HAL Id: hal-01378355, version 1

Cite

Guillaume Serrière, Christophe Cerisara, Dominique Fohr, Odile Mella. Weakly-supervised text-to-speech alignment confidence measure. International Conference on Computational Linguistics (COLING), Dec 2016, Osaka, Japan. ⟨hal-01378355⟩

Collections

CNRS INRIA GRID5000 UNIV-LORRAINE INRIA2 LORIA LORIA-NLPKD SILECS
