Towards Speaker and Environmental Robustness in ASR: The HIWIRE Project

Alexandros Potamianos; Ghazi Bouselmi; Dimitrios Dimitriadis; Dominique Fohr; Roberto Gemello; Irina Illina; Franco Mana; Petros Maragos; M. Matassoni; Vassilis Pitsikalis; J. Ramirez; E. Sanchez-Soto; J. Segura; P. Svaizer

Conference paper, Year: 2006

Alexandros Potamianos

Role: Author

Department of Electronic and Computer Engineering [Crete]

Ghazi Bouselmi

Role: Author
PersonId: 836336

Analysis, perception and recognition of speech

Dimitrios Dimitriadis

Role: Author

School of Electrical and Computer Engineering [Athens]

Dominique Fohr

Role: Author
PersonId: 15652
IdHAL: dominique-fohr
IdRef: 031092942

Analysis, perception and recognition of speech

Roberto Gemello

Role: Author

LOQUENDO

Irina Illina

Role: Author
PersonId: 15663
IdHAL: irina-illina
IdRef: 120731746

Analysis, perception and recognition of speech

Franco Mana

Role: Author

LOQUENDO

Petros Maragos

Role: Author

School of Electrical and Computer Engineering [Athens]

M. Matassoni

Role: Author

Istituto Trentino di Cultura

Vassilis Pitsikalis

Role: Author

School of Electrical and Computer Engineering [Athens]

J. Ramirez

Role: Author

Universidad de Granada = University of Granada

E. Sanchez-Soto

Role: Author

Department of Electronic and Computer Engineering [Crete]

J. Segura

Role: Author

Universidad de Granada = University of Granada

P. Svaizer

Role: Author

Istituto Trentino di Cultura

Abstract

In this paper, we present algorithms for dealing with variability and mismatch in speech recognition due to environmental conditions and non-native speaker populations. The proposed algorithms cover a broad spectrum of ideas, including robust feature extraction, feature compensation and speech enhancement. Specifically, the following algorithms are presented and evaluated: beamforming for multi-microphone speech recognition, robust modulation and fractal features, Teager energy cepstrum coefficients, parametric feature equalization, speech enhancement, and acoustic modeling for non-native speech recognition. The problems of feature fusion and voice activity detection are also discussed. Evaluation results on the AURORA databases under the auspices of the HIWIRE project show that significant gains can be achieved under adverse or mismatched conditions using these algorithms. Relative error rate reductions of up to 50% were shown for multi-microphone speech recognition, robust feature combination and speech enhancement. A 30-40% reduction was shown for parametric feature equalization and non-native acoustic models.

Keywords

speech recognition; noisy speech; robustness

Domains

Human-Computer Interaction [cs.HC]

Main file

hiwire_sriv06.pdf (214.99 KB)

Contributor: Dominique Fohr

https://hal.science/hal-00110502

Submitted on: Monday, 30 October 2006, 12:34:20

Last modified on: Friday, 24 March 2023, 14:52:48

Long-term archiving on: Tuesday, 6 April 2010, 21:16:16

Dates and versions

hal-00110502, version 1 (30-10-2006)

Identifiers

HAL Id: hal-00110502, version 1

Cite

Alexandros Potamianos, Ghazi Bouselmi, Dimitrios Dimitriadis, Dominique Fohr, Roberto Gemello, et al. Towards Speaker and Environmental Robustness in ASR: The HIWIRE Project. SRIV'06 ITRW on Speech Recognition and Intrinsic Variation, May 2006, Toulouse, France. ⟨hal-00110502⟩

Collections

CNRS INRIA UNIV-LORRAINE INRIA2 LORIA
