From 3-D speaker cloning to text-to-audiovisual speech

Sascha Fagel; Frédéric Elisei; Gérard Bailly

Conference Papers Year : 2008

Sascha Fagel

Function : Author

Institut für Sprache und Kommunikation

Frédéric Elisei

Function : Author
PersonId : 17769
IdHAL : frederic-elisei
ORCID : 0000-0002-1295-3445

GIPSA-Services

Gérard Bailly

Function : Author
PersonId : 444
IdHAL : gerard-bailly
ORCID : 0000-0002-6053-0818
IdRef : 033792135

GIPSA - Machines Parlantes, Agents Communicants & Interaction Face-à-face

Abstract

Visible speech movements were motion-captured and parameterized. Coarticulated targets were extracted from VCV (vowel-consonant-vowel) sequences and modeled so that arbitrary German utterances can be generated by target interpolation. The system was extended to synthesize English utterances by a mapping to German phonemes. An evaluation by means of a modified rhyme test reveals that adding the synthetic videos of isolated words to audio-only presentation increases the recognition scores from 27 % to 47.5 %.

Domains

Signal and Image Processing

Main file

sf_IS08.pdf (101.34 KB)

Origin : Files produced by the author(s)

Contributor : Gérard Bailly

https://hal.science/hal-00361886

Submitted on : Monday, February 16, 2009, 7:31:58 PM

Last modification on : Thursday, April 4, 2024, 8:57:57 PM

Long-term archiving on : Tuesday, June 8, 2010, 10:33:58 PM

Dates and versions

hal-00361886 , version 1 (16-02-2009)

Identifiers

HAL Id : hal-00361886 , version 1

Cite

Sascha Fagel, Frédéric Elisei, Gérard Bailly. From 3-D speaker cloning to text-to-audiovisual speech. Interspeech 2008 - 9th Annual Conference of the International Speech Communication Association, Sep 2008, Brisbane, Australia. pp.2325. ⟨hal-00361886⟩

Collections

UGA CNRS GIPSA GIPSA-DPC GIPSA-MPACIF
