Improvement to a NAM captured whisper-to-speech system

Viet-Anh Tran 1 Gérard Bailly 2 Hélène Loevenbruck 3 Christian Jutten 4
2 GIPSA-MPACIF - MPACIF
GIPSA-DPC - Département Parole et Cognition
3 GIPSA-PMD - PMD
GIPSA-DPC - Département Parole et Cognition
4 GIPSA-SIGMAPHY - SIGMAPHY
GIPSA-DIS - Département Images et Signal
Abstract : This paper investigates new techniques to improve whisper-to-speech conversion in the framework of silent speech telephone communication. A preliminary method for converting Non-Audible Murmur (NAM) to modal speech, based on statistical mapping trained on aligned corpora, has previously been proposed. Although this technique is very promising, its performance is still insufficient because of the difficulty of estimating F0 from unvoiced speech. Two distinct modifications are proposed here to improve the naturalness of the synthesized speech. In the first modification, LDA (Linear Discriminant Analysis) is used instead of PCA (Principal Component Analysis) to reduce the dimensionality of the input spectral vectors. In addition, the influence of long-term variation of spectral information on pitch estimation is examined. The second modification is an attempt to integrate visual information as a complementary input to improve spectral estimation, F0 estimation and the voicing decision.
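The difference between the two dimensionality-reduction methods named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the 2-D points, the two class labels, and all function names below are hypothetical, chosen only to show that PCA follows the direction of largest overall variance while two-class LDA (the Fisher discriminant) follows the direction that best separates labelled classes.

```python
# Toy comparison of PCA and two-class LDA (Fisher discriminant) on 2-D data.
# All data and names are hypothetical; this is not the paper's pipeline.

def mean(points):
    n = len(points)
    return [sum(p[0] for p in points) / n, sum(p[1] for p in points) / n]

def scatter(points, center):
    """Entries (sxx, sxy, syy) of the 2x2 scatter matrix around `center`."""
    sxx = sum((p[0] - center[0]) ** 2 for p in points)
    sxy = sum((p[0] - center[0]) * (p[1] - center[1]) for p in points)
    syy = sum((p[1] - center[1]) ** 2 for p in points)
    return sxx, sxy, syy

def normalize(v):
    norm = (v[0] ** 2 + v[1] ** 2) ** 0.5
    return [v[0] / norm, v[1] / norm]

def pca_direction(points):
    """Leading eigenvector of the total scatter matrix (labels are ignored)."""
    sxx, sxy, syy = scatter(points, mean(points))
    # Largest eigenvalue of a symmetric 2x2 matrix, in closed form.
    lam = (sxx + syy) / 2 + (((sxx - syy) / 2) ** 2 + sxy ** 2) ** 0.5
    if abs(sxy) > 1e-12:
        return normalize([sxy, lam - sxx])
    return [1.0, 0.0] if sxx >= syy else [0.0, 1.0]

def lda_direction(class0, class1):
    """Fisher direction w = Sw^{-1} (m1 - m0), with Sw the within-class scatter."""
    m0, m1 = mean(class0), mean(class1)
    a0, a1 = scatter(class0, m0), scatter(class1, m1)
    sxx, sxy, syy = a0[0] + a1[0], a0[1] + a1[1], a0[2] + a1[2]
    det = sxx * syy - sxy ** 2
    dx, dy = m1[0] - m0[0], m1[1] - m0[1]
    # Explicit inverse of the 2x2 matrix [[sxx, sxy], [sxy, syy]] applied to d.
    return normalize([(syy * dx - sxy * dy) / det, (sxx * dy - sxy * dx) / det])

def fisher_ratio(class0, class1, w):
    """Squared distance between projected means over within-class scatter."""
    z0 = [p[0] * w[0] + p[1] * w[1] for p in class0]
    z1 = [p[0] * w[0] + p[1] * w[1] for p in class1]
    mu0, mu1 = sum(z0) / len(z0), sum(z1) / len(z1)
    within = sum((z - mu0) ** 2 for z in z0) + sum((z - mu1) ** 2 for z in z1)
    return (mu1 - mu0) ** 2 / within

# Hypothetical data: large variance along x, class difference along y.
class0 = [(-4, 0.1), (-2, -0.1), (0, 0.0), (2, 0.1), (4, -0.1)]
class1 = [(-4, 1.1), (-2, 0.9), (0, 1.0), (2, 1.1), (4, 0.9)]

pca_w = pca_direction(class0 + class1)
lda_w = lda_direction(class0, class1)
pca_score = fisher_ratio(class0, class1, pca_w)
lda_score = fisher_ratio(class0, class1, lda_w)
print("PCA direction:", pca_w, "separation:", pca_score)
print("LDA direction:", lda_w, "separation:", lda_score)
```

On this toy data the PCA axis lies almost entirely along x, where the variance is large but the classes overlap, whereas the LDA axis lies almost entirely along y, where the classes differ. Real spectral vectors have many more dimensions, and the abstract does not state which class labels the LDA is trained on.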
Cited literature [14 references]

https://hal.archives-ouvertes.fr/hal-00333288
Contributor : Gérard Bailly
Submitted on : Saturday, November 15, 2008 - 12:33:41 PM
Last modification on : Monday, July 8, 2019 - 3:08:48 PM
Long-term archiving on : Monday, June 7, 2010 - 7:26:33 PM

File

vat_IS08.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-00333288, version 1

Citation

Viet-Anh Tran, Gérard Bailly, Hélène Loevenbruck, Christian Jutten. Improvement to a NAM captured whisper-to-speech system. 9th Annual Conference of the International Speech Communication Association (Interspeech 2008), Sep 2008, Brisbane, Australia. pp.1465-1468. ⟨hal-00333288⟩
