Exploiting visual information for NAM recognition
Journal article, IEICE Electronics Express, Year: 2009

Abstract

Non-audible murmur (NAM) is unvoiced speech received through body tissue using special acoustic sensors (i.e., NAM microphones) attached behind the talker's ear. Although NAM has different frequency characteristics from normal speech, automatic speech recognition (ASR) can still be performed on it using conventional methods. With a NAM microphone, body transmission and the loss of lip radiation act as a low-pass filter; as a result, the higher frequency components of the NAM signal are attenuated. The decrease in NAM recognition performance is attributed to this spectral reduction. To address the loss of lip radiation, visual information extracted from the talker's facial movements is fused with NAM speech. Experimental results revealed a relative improvement of 39% when fused NAM speech and facial information were used, compared with using NAM speech alone. Results also showed that the improvement in recognition rate depends on the place of articulation.
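
The abstract does not say how the NAM and visual streams are fused. As an illustration only, the sketch below shows one common option, feature-level fusion by frame-wise concatenation, and it should not be read as the authors' method. The function names `align_visual_to_audio` and `fuse_features`, the frame rates, and the feature dimensions are all hypothetical. The visual stream is assumed to have fewer frames than the NAM stream, so each visual frame is repeated until both streams have one vector per NAM frame.

```python
from typing import List, Sequence

Frame = List[float]


def align_visual_to_audio(visual: Sequence[Frame], n_audio_frames: int) -> List[Frame]:
    """Return one visual frame per audio frame by repeating visual frames.

    Assumption: both streams cover the same time span, so audio frame i
    maps to visual frame floor(i * n_visual / n_audio_frames).
    """
    if not visual:
        raise ValueError("visual stream is empty")
    n_visual = len(visual)
    return [
        list(visual[min(i * n_visual // n_audio_frames, n_visual - 1)])
        for i in range(n_audio_frames)
    ]


def fuse_features(nam: Sequence[Frame], visual: Sequence[Frame]) -> List[Frame]:
    """Concatenate each NAM feature vector with its time-aligned visual vector."""
    aligned = align_visual_to_audio(visual, len(nam))
    return [list(a) + v for a, v in zip(nam, aligned)]


# Hypothetical toy data: 4 NAM frames (2 features each), 2 visual frames (1 feature each).
nam_frames = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
visual_frames = [[0.1], [0.9]]
fused = fuse_features(nam_frames, visual_frames)
```

Each fused vector has the NAM features followed by the visual features, so a recognizer trained on the fused stream sees both modalities in every frame. Other schemes, such as combining the scores of separate per-modality models, are equally plausible given only the abstract.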
Main file
IEICE-ELEX-HAL.pdf (162.19 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-00357985, version 1 (09-02-2009)

Cite

Panikos Heracleous, Denis Beautemps, Viet-Anh Tran, Hélène Loevenbruck, Gérard Bailly. Exploiting visual information for NAM recognition. IEICE Electronics Express, 2009, 6 (2), pp.77-82. ⟨10.1587/elex.6.77⟩. ⟨hal-00357985⟩