Discourse phrases classification: direct vs. narrative audio speech

Marie Tahon; Damien Lolive

Conference paper, Year: 2018


Marie Tahon

Role: Author
PersonId: 9821
IdHAL: marie-tahon
ORCID: 0000-0002-6782-0332
IdRef: 165065532

Laboratoire d'Informatique de l'Université du Mans

Damien Lolive

Role: Author
PersonId: 5088
IdHAL: damien-lolive
ORCID: 0000-0002-1110-5444
IdRef: 13017498X

Expressiveness in Human Centered Data/Media

Abstract

In the field of storytelling, speech synthesis is trying to move from a neutral, machine-like voice to an expressive one. For parametric and unit-selection systems, building new features or cost functions is necessary to allow better control of expressivity. The present article investigates the classification of direct versus narrative discourse phrases in order to build a new expressivity score. Different models are trained on different speech units (syllables, words and discourse phrases) from an audiobook with 3 sets of features. Classification experiments are conducted on the Blizzard corpus, which consists of English children's audiobooks and contains various characters and emotional states. The experiments show that the fusion of SVM classifiers trained with different prosodic and phonological feature sets increases the classification rate from 67.4% with 14 prosodic features to 71.8% with the 3 merged sets. In addition, a decision threshold achieves promising results for expressive speech synthesis, depending on the strength of the constraint required on expressivity: 71.8% with 100% of the words, 79.9% with 50% and 82.6% with 25%.
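The two mechanisms named in the abstract, classifier fusion and a decision threshold that trades coverage for accuracy, can be illustrated with a minimal sketch. The abstract does not say how the SVM outputs are fused, so averaging signed decision scores is an assumption made here for illustration only. The function names (`fuse_scores`, `accuracy_at_coverage`) are hypothetical, and the scores are invented toy values, not results from the Blizzard corpus.

```python
from typing import List, Sequence, Tuple


def fuse_scores(score_sets: Sequence[Sequence[float]]) -> List[float]:
    """Average signed decision scores from several classifiers, item by item.

    Assumption: score-level averaging; the paper's actual fusion rule may differ.
    """
    n_items = len(score_sets[0])
    if any(len(scores) != n_items for scores in score_sets):
        raise ValueError("all classifiers must score the same items")
    return [
        sum(scores[i] for scores in score_sets) / len(score_sets)
        for i in range(n_items)
    ]


def accuracy_at_coverage(
    scores: Sequence[float], labels: Sequence[int], coverage: float
) -> Tuple[float, float]:
    """Keep the `coverage` fraction of items with the largest |score|.

    Returns (threshold, accuracy), where threshold is the smallest |score|
    among the kept items and accuracy is computed on the kept items only.
    """
    if not 0 < coverage <= 1:
        raise ValueError("coverage must be in (0, 1]")
    n_keep = max(1, round(coverage * len(scores)))
    # Rank items from most to least confident.
    ranked = sorted(zip(scores, labels), key=lambda pair: abs(pair[0]), reverse=True)
    kept = ranked[:n_keep]
    threshold = abs(kept[-1][0])
    # A non-negative score is read as "direct" (+1), a negative one as "narrative" (-1).
    correct = sum(1 for score, label in kept if (1 if score >= 0 else -1) == label)
    return threshold, correct / n_keep


# Toy data (invented): +1 = direct speech, -1 = narrative speech, one entry per word.
labels = [1, 1, 1, 1, -1, -1, -1, -1]
# Hypothetical signed scores from three classifiers, one per feature set.
prosodic = [0.9, 0.2, -0.4, 0.6, -0.8, 0.3, -0.1, -0.7]
feature_set_b = [0.7, 0.4, 0.2, -0.2, -0.6, -0.1, 0.3, -0.9]
feature_set_c = [1.1, -0.3, 0.5, 0.2, -1.0, -0.5, 0.1, -0.5]

fused = fuse_scores([prosodic, feature_set_b, feature_set_c])

for coverage in (1.0, 0.5, 0.25):
    threshold, accuracy = accuracy_at_coverage(fused, labels, coverage)
    print(f"coverage={coverage:.2f} threshold={threshold:.2f} accuracy={accuracy:.3f}")
```

Raising the threshold keeps fewer words but only the ones the fused classifier is most confident about, which is the kind of trade-off the abstract reports for 100%, 50% and 25% of the words.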

Keywords

discourse phrases, narration, classification, audiobook, expressive speech synthesis

Domains

Computation and Language [cs.CL]; Artificial Intelligence [cs.AI]

Main file

SP18_paper_95.pdf (211.29 KB)

Origin: Files produced by the author(s)


https://hal.science/hal-01790910

Submitted on: Monday, 14 May 2018, 10:37:19

Last modified on: Tuesday, 3 October 2023, 09:49:12

Long-term archiving on: Tuesday, 25 September 2018, 09:29:03

Dates and versions

hal-01790910 , version 1 (14-05-2018)

Identifiers

HAL Id: hal-01790910, version 1

Cite

Marie Tahon, Damien Lolive. Discourse phrases classification: direct vs. narrative audio speech. Speech Prosody, Jun 2018, Poznan, Poland. ⟨hal-01790910⟩


Collections

INSTITUT-TELECOM UNIV-RENNES1 CNRS INRIA UNIV-LEMANS INSA-RENNES ENSSAT IRISA CENTRALESUPELEC IRISA-D6 UR1-MATH-STIC LIUM LIUM-LST UR1-UFR-ISTIC UNIV-RENNES UR1-MATH-NUM

741 views

225 downloads
