Conference paper, Year: 2007

Intelligibility of natural and 3D-cloned German speech

Abstract

We investigate the intelligibility of natural visual and audiovisual speech compared to re-synthesized speech movements rendered by a talking head. This talking head is created using the speaker cloning methodology of the Institut de la Communication Parlée in Grenoble (now the Speech and Cognition department of GIPSA-Lab). A German speaker with colored markers on the face was recorded audiovisually using multiple cameras. The three-dimensional coordinates of the markers were extracted and parameterized. Spoken vowel-consonant-vowel (VCV) sequences were then visually re-synthesized. A perception experiment was carried out to measure the visual and audiovisual intelligibility of natural and synthesized video, using the original audio with and without added noise. Identification scores show that the clone recovers almost 70% of the intelligibility gain provided by the original face. Part of this loss is due to visual cues missing from the present synthesis, notably the lack of a tongue.
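One plausible reading of "recovering almost 70% of the intelligibility gain" is a ratio of gains over the audio-only baseline: the improvement the clone brings, divided by the improvement the natural face brings. The sketch below illustrates that reading only. The function name, the formula, and the scores are assumptions made for illustration; they are not taken from the paper, whose exact computation is not given in the abstract.

```python
def gain_recovered(audio_only: float, av_natural: float, av_clone: float) -> float:
    """Fraction of the natural face's audiovisual gain that the clone recovers.

    All arguments are identification scores (proportion correct, 0..1)
    measured under the same audio condition. Assumed formula, not the paper's.
    """
    natural_gain = av_natural - audio_only
    if natural_gain <= 0:
        raise ValueError("natural face provides no gain over audio alone")
    return (av_clone - audio_only) / natural_gain


# Hypothetical scores chosen for illustration; NOT results from the paper.
example = gain_recovered(audio_only=0.40, av_natural=0.80, av_clone=0.68)
print(f"{example:.0%}")
```

Under this reading, a value of 1.0 would mean the clone is as helpful as the original face, and 0.0 would mean it adds nothing to the audio alone.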
Main file
sf_AVSP07.pdf (1008.63 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-00169563, version 1 (04-09-2007)

Identifiers

  • HAL Id: hal-00169563, version 1

Cite

Sascha Fagel, Gérard Bailly, Frédéric Elisei. Intelligibility of natural and 3D-cloned German speech. AVSP 2007 - 6th International Conference on Auditory-Visual Speech Processing, Aug 2007, Hilvarenbeek, Netherlands. pp. 56-61. ⟨hal-00169563⟩
127 views
43 downloads
