Improving the conversion of whispered speech recorded by a NAM sensor into audible speech

Viet-Anh Tran 1 Gérard Bailly 2 Hélène Loevenbruck 3 Christian Jutten 4
2 GIPSA-MPACIF - MPACIF
GIPSA-DPC - Département Parole et Cognition
3 GIPSA-PMD - PMD
GIPSA-DPC - Département Parole et Cognition
4 GIPSA-SIGMAPHY - SIGMAPHY
GIPSA-DIS - Département Images et Signal
Abstract : The NAM-to-speech conversion proposed by Toda and colleagues, which converts Non-Audible Murmur (NAM) to audible speech through a statistical mapping trained on aligned corpora, is a very promising technique, but its performance is still insufficient. In this paper, we present our current work to improve the intelligibility and naturalness of the speech synthesized from whispered speech with this technique. The first system aims to improve F0 estimation and the voicing decision: a simple neural network detects voiced segments in the whisper, while a GMM trained on voiced segments estimates a continuous melodic contour. The second system attempts to integrate visual information to improve spectral estimation, F0 estimation, and the voicing decision.
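
The GMM-based mapping mentioned in the abstract can be illustrated with a minimal sketch. It assumes a joint GMM over a scalar input feature x (standing in for a whisper feature) and a scalar target y (standing in for F0), and it uses the conditional expectation E[y | x] as the estimate. This is a generic textbook form of GMM regression, not necessarily the estimator used by the authors or by Toda and colleagues. The function names, dictionary keys, and toy parameters below are all hypothetical.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def gmm_map(x, components):
    """Return E[y | x] under a joint GMM over (x, y).

    Each component is a dict with the hypothetical keys
    weight, mean_x, mean_y, var_x and cov_xy.
    """
    # Unnormalized posterior probability of each component given x.
    likelihoods = [
        c["weight"] * gaussian_pdf(x, c["mean_x"], c["var_x"]) for c in components
    ]
    total = sum(likelihoods)
    estimate = 0.0
    for c, lik in zip(components, likelihoods):
        # Per-component linear regression of y on x.
        cond_mean = c["mean_y"] + c["cov_xy"] / c["var_x"] * (x - c["mean_x"])
        # Weight each regression by the normalized component posterior.
        estimate += (lik / total) * cond_mean
    return estimate


# Toy parameters, invented for illustration only (not taken from the paper).
COMPONENTS = [
    {"weight": 0.5, "mean_x": -1.0, "mean_y": 100.0, "var_x": 1.0, "cov_xy": 5.0},
    {"weight": 0.5, "mean_x": 1.0, "mean_y": 200.0, "var_x": 1.0, "cov_xy": 5.0},
]

f0_estimate = gmm_map(0.0, COMPONENTS)
```

In a real system x and y would be feature vectors, the parameters would be fitted on aligned whisper and speech frames, and a separate classifier (a neural network in the abstract) would decide which frames are voiced before the contour is used.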

https://hal.archives-ouvertes.fr/hal-00339058
Contributor : Gérard Bailly
Submitted on : Saturday, November 15, 2008 - 5:53:40 PM
Last modification on : Monday, July 8, 2019 - 3:10:46 PM
Long-term archiving on : Monday, June 7, 2010 - 8:40:11 PM

File

vat_JEP08.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-00339058, version 1

Citation

Viet-Anh Tran, Gérard Bailly, Hélène Loevenbruck, Christian Jutten. Amélioration de la conversion de voix chuchotée enregistrée par capteur NAM vers la voix audible. 27e Journées d'Etudes sur la Parole, JEP'2008, Jun 2008, Avignon, France. pp.110-113. ⟨hal-00339058⟩
