Is Text-to-Speech Synthesis Ready for Use in Computer-Assisted Language Learning?

Zöe Handley

doi:10.1016/j.specom.2008.12.004

Journal article, Speech Communication, Year: 2009

Role: Corresponding author
PersonId: 890343

Abstract

Text-to-Speech (TTS) synthesis, the generation of speech from text input, offers another means of providing spoken language input to learners in Computer-Assisted Language Learning (CALL) environments. Indeed, many potential benefits (ease of creation and editing of speech models, generation of speech models and feedback on demand, etc.) and uses (talking dictionaries, talking texts, dictation, pronunciation training, dialogue partner, etc.) of TTS synthesis in CALL have been put forward. Yet the use of TTS synthesis in CALL is not widely accepted, and only a few applications have found their way onto the market. One potential reason for this is that TTS synthesis has not been adequately evaluated for this purpose. Previous evaluations of TTS synthesis for use in CALL have addressed only its comprehensibility. Yet CALL places demands on the comprehensibility, naturalness, accuracy, register and expressiveness of the output of TTS synthesis. In this paper, these aspects of the output quality of four state-of-the-art French TTS synthesis systems are evaluated with respect to the three different roles that TTS synthesis systems may assume within CALL applications, namely: (1) reading machine, (2) pronunciation model and (3) conversational partner (Handley and Hamel, 2005). The results of this evaluation suggest that the best TTS synthesis systems are ready for use in applications in which they 'add value' to CALL, i.e. exploit the unique capacity of TTS synthesis to generate speech models on demand. An example of such an application is a dialogue partner. In order to fully meet the requirements of CALL, further attention needs to be paid to accuracy and naturalness, in particular at the prosodic level, and to expressiveness.

Keywords

CALL; Speech synthesis; TTS synthesis; Evaluation

Main file

PEER_stage2_10.1016%2Fj.specom.2008.12.004.pdf (457.47 KB)

Origin: Files produced by the author(s)

Contributor: Hal Peer

https://hal.science/hal-00558516

Submitted on: Saturday, 22 January 2011, 02:51:50

Last modified on: Saturday, 22 January 2011, 02:51:50

Long-term archiving on: Friday, 2 December 2016, 14:30:54

Dates and versions

hal-00558516, version 1 (22-01-2011)

Identifiers

HAL Id: hal-00558516, version 1
DOI: 10.1016/j.specom.2008.12.004

Cite

Zöe Handley. Is Text-to-Speech Synthesis Ready for Use in Computer-Assisted Language Learning? Speech Communication, 2009, 51 (10), pp.906. ⟨10.1016/j.specom.2008.12.004⟩. ⟨hal-00558516⟩

Collections

PEER

91 views

608 downloads
