Conference paper, Year: 2006

A joint prosody evaluation of French text-to-speech synthesis systems

Abstract

This paper reports on prosodic evaluation in the framework of the EVALDA/EvaSy project for the evaluation of French text-to-speech (TTS) synthesis. Prosody is evaluated using a prosodic transplantation paradigm: intonation contours generated by the synthesis systems are transplanted onto a common segmental content. Both diphone-based synthesis and natural speech are used. Five TTS systems are tested along with a natural voice. The test is a paired preference test (with 19 subjects), using 7 sentences. The results indicate that natural speech consistently ranks first (with an average preference rate of 80%), followed by a selection-based system (72%) and a diphone-based system (58%). However, rather large variations in judgements are observed among subjects and sentences, and in some cases synthetic speech is preferred to natural speech. These results show the remarkable improvement achieved by the best selection-based synthesis systems in terms of prosody. In this way, a new paradigm for evaluating the prosodic component of TTS systems has been successfully demonstrated.
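
As a rough illustration of how a preference rate can be derived from a paired preference test, the sketch below counts, for each system, the fraction of its comparisons in which it was preferred. The abstract does not describe the exact scoring procedure, so this definition is an assumption; the function name `preference_rates` and the toy votes are hypothetical and are not data from the study.

```python
from collections import defaultdict


def preference_rates(votes):
    """Return, per system, the share of its paired comparisons that it won.

    `votes` is an iterable of (system_a, system_b, winner) triples, one per
    listener judgement. This scoring rule is an assumption, not the paper's
    documented procedure.
    """
    wins = defaultdict(int)
    appearances = defaultdict(int)
    for system_a, system_b, winner in votes:
        if winner not in (system_a, system_b):
            raise ValueError(f"winner {winner!r} is not one of the compared systems")
        appearances[system_a] += 1
        appearances[system_b] += 1
        wins[winner] += 1
    return {system: wins[system] / appearances[system] for system in appearances}


# Hypothetical toy judgements (illustrative only, not results from the paper).
votes = [
    ("natural", "selection", "natural"),
    ("natural", "diphone", "natural"),
    ("selection", "diphone", "selection"),
    ("natural", "selection", "selection"),
]
rates = preference_rates(votes)
for system, rate in sorted(rates.items(), key=lambda item: -item[1]):
    print(f"{system}: {rate:.0%}")
```

In a real analysis the same tally could be broken down by subject and by sentence, which is where the abstract reports large variations in judgements.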

Domains

Linguistics
Main file
LREC06_EVASY_PROSODIE_v4.pdf (56.02 KB)

Dates and versions

hal-00103557, version 1 (04-10-2006)

Identifiers

  • HAL Id: hal-00103557, version 1

Cite

Marie-Neige Garcia, Christophe d'Alessandro, Gérard Bailly, Philippe Boula de Mareüil, Michel Morel. A joint prosody evaluation of French text-to-speech synthesis systems. 5th edition of the International Conference on Language Resources and Evaluation (LREC 2006), May 2006, Genoa, Italy. pp.307-310. ⟨hal-00103557⟩