SynPaFlex-Corpus: An Expressive French Audiobooks Corpus Dedicated to Expressive Speech Synthesis

Aghilas Sini; Damien Lolive; Gaëlle Vidal; Marie Tahon; Elisabeth Delais-Roussarie

Conference paper, Year: 2018

Aghilas Sini

Role: Author

Université de Rennes

MEDIA ET INTERACTIONS

Expressiveness in Human Centered Data/Media

Damien Lolive

Role: Author
PersonId : 5088
IdHAL : damien-lolive
ORCID : 0000-0002-1110-5444
IdRef : 13017498X

Université de Rennes

MEDIA ET INTERACTIONS

Expressiveness in Human Centered Data/Media

Gaëlle Vidal

Role: Author
PersonId : 871680

MEDIA ET INTERACTIONS

Expressiveness in Human Centered Data/Media

Marie Tahon

Role: Author
PersonId : 9821
IdHAL : marie-tahon
ORCID : 0000-0002-6782-0332
IdRef : 165065532

Laboratoire d'Informatique de l'Université du Mans

Elisabeth Delais-Roussarie

Role: Author
PersonId : 179303
IdHAL : elisabeth-delais-roussarie
ORCID : 0000-0002-4517-1503
IdRef : 076329828

Laboratoire de Linguistique de Nantes

Abstract

This paper presents an expressive French audiobook corpus containing eighty-seven hours of good-quality audio, recorded by a single amateur speaker reading audiobooks of different literary genres. It departs from existing audiobook corpora, which usually provide only a few hours of mono-genre, multi-speaker speech. The motivation for building such a corpus is to explore expressiveness from different perspectives, such as discourse styles, prosody, and pronunciation, and at different levels of analysis (syllable, prosodic and lexical words, prosodic and syntactic phrases, utterance, or paragraph). This will support the development of models that better control expressiveness in speech synthesis and that adapt pronunciation and prosody to specific discourse settings (changes in discourse perspective, indirect vs. direct style, etc.). To this end, the corpus has been annotated automatically and provides information such as phone labels, phone boundaries, syllables, words, and morpho-syntactic tags. Moreover, a significant part of the corpus has also been annotated manually to encode direct/indirect speech information and emotional content. The corpus is already usable for prosody studies and TTS purposes and is available to the community.

Keywords

Speech Synthesis; Speech Resource/Database; Prosody

Domains

Computer Science [cs]; Artificial Intelligence [cs.AI]

Main file

723.pdf (293.18 KB)

Origin: Files produced by the author(s)

https://hal.science/hal-01826690

Submitted on: Wednesday, 25 July 2018, 09:49:33

Last modified on: Tuesday, 3 October 2023, 09:49:25

Long-term archiving on: Monday, 1 October 2018, 06:04:10

Dates and versions

hal-01826690 , version 1 (25-07-2018)

Identifiers

HAL Id : hal-01826690 , version 1

Cite

Aghilas Sini, Damien Lolive, Gaëlle Vidal, Marie Tahon, Elisabeth Delais-Roussarie. SynPaFlex-Corpus: An Expressive French Audiobooks Corpus Dedicated to Expressive Speech Synthesis. Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), May 2018, Miyazaki, Japan. ⟨hal-01826690⟩

Collections

UNIV-NANTES INSTITUT-TELECOM UNIV-RENNES1 CNRS INRIA UNIV-LEMANS INSA-RENNES ENSSAT IRISA LLING CENTRALESUPELEC IRISA-D6 UR1-MATH-STIC LIUM LIUM-LST UR1-UFR-ISTIC UNIV-RENNES INSA-GROUPE ANR UR1-MATH-NUM NANTES-UNIVERSITE

843 views

502 downloads
