Predicting unseen articulations from multi-speaker articulatory models

Gopal Ananthakrishnan; Pierre Badin; Julián Andrés Valdés Vargas; Olov Engwall

Conference paper, Year: 2010

Gopal Ananthakrishnan

Role: Author
PersonId : 874603

Department of Speech, Music and Hearing [KTH Stockholm]

Pierre Badin

Role: Author
PersonId : 4918
IdHAL : pierrebadin
ORCID : 0000-0001-7440-820X
IdRef : 117976687

GIPSA - Machines parlantes, Gestes oro-faciaux, Interaction Face-à-face, Communication augmentée

Julián Andrés Valdés Vargas

Role: Author
PersonId : 874604

GIPSA - Machines parlantes, Gestes oro-faciaux, Interaction Face-à-face, Communication augmentée

Olov Engwall

Role: Author
PersonId : 874605

Department of Speech, Music and Hearing [KTH Stockholm]

Abstract

To study inter-speaker variability, this work assesses the generalization capabilities of data-based multi-speaker articulatory models. We use several three-mode factor analysis techniques to model the variations of midsagittal vocal tract contours obtained from MRI images of three French speakers articulating 73 vowels and consonants. Articulations of a given speaker for phonemes absent from the training set are then predicted by inverting the models from measurements of the same phonemes articulated by the other speakers. On average, the prediction RMSE was 5.25 mm for tongue contours and 3.3 mm for 2D midsagittal vocal tract distances. This study also establishes a methodology for determining the optimal number of factors for such models.
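The leave-one-phoneme-out prediction setup described in the abstract can be illustrated with a minimal sketch. It does not reproduce the three-mode factor analysis techniques used in the paper: as a simplified stand-in, it applies principal component regression from the other speakers' contours to the target speaker's contour, then scores the held-out phoneme with RMSE. The array layout, the function names `predict_unseen` and `rmse`, the number of coordinates, the number of factors, and the synthetic data are all assumptions made for illustration; only the counts of three speakers and 73 phonemes come from the abstract.

```python
import numpy as np


def rmse(pred, true):
    """Root-mean-square error between two coordinate vectors."""
    return float(np.sqrt(np.mean((pred - true) ** 2)))


def predict_unseen(data, target, held_out, n_factors=3):
    """Predict one speaker's contour for a phoneme left out of training.

    data: array of shape (n_speakers, n_phonemes, n_coords) (assumed layout).
    Simplified stand-in: principal component regression, NOT the
    three-mode factor analysis of the paper.
    """
    n_speakers, n_phonemes, _ = data.shape
    train = np.delete(np.arange(n_phonemes), held_out)
    others = [s for s in range(n_speakers) if s != target]

    # Other speakers' coordinates side by side: (n_phonemes, n_others * n_coords)
    src = np.concatenate([data[s] for s in others], axis=1)
    tgt = data[target]
    src_mean = src[train].mean(axis=0)
    tgt_mean = tgt[train].mean(axis=0)

    # Principal axes of the other speakers' training articulations
    _, _, vt = np.linalg.svd(src[train] - src_mean, full_matrices=False)
    basis = vt[:n_factors].T
    scores = (src[train] - src_mean) @ basis

    # Least-squares map from factor scores to the target speaker's contours
    weights, *_ = np.linalg.lstsq(scores, tgt[train] - tgt_mean, rcond=None)

    # Invert: project the held-out measurement, then map to the target speaker
    held_scores = (src[held_out] - src_mean) @ basis
    return tgt_mean + held_scores @ weights


# Synthetic data (hypothetical): shared latent factors, speaker-specific loadings
rng = np.random.default_rng(0)
n_speakers, n_phonemes, n_coords, n_latent = 3, 73, 20, 3
latent = rng.normal(size=(n_phonemes, n_latent))
data = np.stack([
    latent @ rng.normal(size=(n_latent, n_coords)) + rng.normal(size=n_coords)
    for _ in range(n_speakers)
])
data += 0.01 * rng.normal(size=data.shape)  # small measurement noise

prediction = predict_unseen(data, target=0, held_out=10, n_factors=n_latent)
error = rmse(prediction, data[0, 10])
print(f"held-out RMSE: {error:.4f}")
```

Repeating the call for different values of `n_factors` and comparing the held-out error is one simple way to compare model sizes, though the paper's own selection methodology may differ.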

Keywords

Multi-speaker Articulatory Model Factor analysis

Domains

Information and communication sciences

https://hal.science/hal-00508267

Submitted on: Monday, 2 August 2010, 17:11:02

Last modified on: Thursday, 4 April 2024, 21:18:15

Dates and versions

hal-00508267 , version 1 (02-08-2010)

Identifiers

HAL Id : hal-00508267 , version 1

Cite

Gopal Ananthakrishnan, Pierre Badin, Julián Andrés Valdés Vargas, Olov Engwall. Predicting unseen articulations from multi-speaker articulatory models. Interspeech 2010 - 11th Annual Conference of the International Speech Communication Association, Sep 2010, Makuhari, Japan. pp.n.c. ⟨hal-00508267⟩

Collections

UGA CNRS GIPSA GIPSA-DPC GIPSA-MAGIC
