Feature Subset Selection for Improved Native Accent Identification

Tingyao Wu; Jacques Duchateau; Jean-Pierre Martens; Dirk Van Compernolle

doi:10.1016/j.specom.2009.08.010

Journal article, Speech Communication, Year: 2009

Tingyao Wu

Role: Author

Jacques Duchateau

Role: Author

Jean-Pierre Martens

Role: Corresponding author
PersonId: 900905

Dirk Van Compernolle

Role: Author

Abstract

In this paper, we develop methods to identify the accents of native speakers. Accent identification differs from other speaker classification tasks because accents may differ in only a limited number of phonemes, and the differences can be quite subtle. We show that in such cases it is essential to select a small subset of discriminative features that can be reliably estimated, while discarding non-discriminative and noisy features. For identification purposes, a speaker is modeled by a supervector containing the mean values of the features for all phonemes. Initial accent models are obtained as class means of the speaker supervectors. Feature subset selection is then performed by applying ANOVA (Analysis of Variance), LDA (Linear Discriminant Analysis), SVM-RFE (Support Vector Machine - Recursive Feature Elimination), or their hybrids, which reduces the dimensionality of the speaker vector and, more importantly, significantly enhances recognition performance. We also compare the performance of GMM, LDA and SVM as classifiers on the full feature set or on a reduced feature subset. The methods are tested on a Flemish read speech database with speakers classified into 5 regions. The difficulty of the task is confirmed by a human listening experiment. We show that a relative improvement of more than 20% in accent recognition rate can be achieved with feature subset selection, irrespective of the choice of classifier. Finally, we show that the construction of speaker-based supervectors significantly enhances results over a reference GMM system that uses the raw feature vectors directly as input, in both text-dependent and text-independent conditions.
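
The pipeline outlined in the abstract (per-phoneme mean supervectors, class-mean accent models, ANOVA-based feature subset selection) can be illustrated with a minimal sketch. This is not the authors' implementation: all function names are hypothetical, filling unseen phonemes with zeros is an assumption, and the nearest-class-mean decision with squared Euclidean distance is a stand-in for the GMM, LDA and SVM classifiers that the paper compares. Only the Python standard library is used.

```python
from statistics import fmean

def speaker_supervector(frames, phoneme_ids, n_phonemes):
    """Concatenate the per-phoneme mean feature vectors of one speaker.

    frames: list of feature vectors; phoneme_ids: one phoneme label per frame.
    Assumption of this sketch: phonemes without frames contribute zeros.
    """
    n_features = len(frames[0])
    supervector = []
    for p in range(n_phonemes):
        rows = [f for f, pid in zip(frames, phoneme_ids) if pid == p]
        if rows:
            supervector.extend(fmean(col) for col in zip(*rows))
        else:
            supervector.extend([0.0] * n_features)
    return supervector

def class_means(X, y):
    """Accent model per class: the mean of that class's supervectors."""
    means = {}
    for label in sorted(set(y)):
        rows = [x for x, lab in zip(X, y) if lab == label]
        means[label] = [fmean(col) for col in zip(*rows)]
    return means

def anova_f_scores(X, y):
    """One-way ANOVA F statistic for each supervector dimension (y is a list)."""
    means = class_means(X, y)
    counts = {label: y.count(label) for label in means}
    df_between = len(means) - 1
    df_within = len(X) - len(means)
    scores = []
    for j in range(len(X[0])):
        grand = fmean(x[j] for x in X)
        ss_between = sum(counts[c] * (means[c][j] - grand) ** 2 for c in means)
        ss_within = sum((x[j] - means[lab][j]) ** 2 for x, lab in zip(X, y))
        # Small constant guards against division by zero for constant features.
        scores.append((ss_between / df_between) / (ss_within / df_within + 1e-12))
    return scores

def select_top_k(X, y, k):
    """Indices of the k dimensions with the highest F statistic, in ascending order."""
    scores = anova_f_scores(X, y)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])

def project(X, idx):
    """Keep only the selected dimensions of each supervector."""
    return [[x[j] for j in idx] for x in X]

def predict(x, means):
    """Assign x to the class whose mean is closest (squared Euclidean distance)."""
    return min(means, key=lambda c: sum((a - b) ** 2 for a, b in zip(x, means[c])))
```

In this sketch, a typical use would be to call `select_top_k` on the training supervectors, apply `project` to both training and test supervectors, build the accent models with `class_means`, and classify each test speaker with `predict`. The value of `k` is a free parameter here; the abstract does not state how many features are retained.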

Keywords

Accent Identification; Language Identification; Feature Selection; Gaussian Mixture Model; Linear Discriminant Analysis; Support Vector Machine

Main file

PEER_stage2_10.1016%2Fj.specom.2009.08.010.pdf (555.06 KB)

Origin: Files produced by the author(s)

https://hal.science/hal-00592582

Submitted on: Friday, May 13, 2011, 02:54:27

Last modified on: Friday, May 13, 2011, 02:54:27

Long-term archiving on: Saturday, December 3, 2016, 13:17:00

Dates and versions

hal-00592582, version 1 (13-05-2011)

Identifiers

HAL Id: hal-00592582, version 1
DOI: 10.1016/j.specom.2009.08.010

Cite

Tingyao Wu, Jacques Duchateau, Jean-Pierre Martens, Dirk Van Compernolle. Feature Subset Selection for Improved Native Accent Identification. Speech Communication, 2009, 52 (2), pp.83. ⟨10.1016/j.specom.2009.08.010⟩. ⟨hal-00592582⟩

Collections

PEER

67 views

258 downloads
