Combination of Cepstral and Phonetically Discriminative Features for Speaker Verification

Achintya Sarkar; Cong-Thanh Do; Viet-Bac Le; Claude Barras

doi:10.1109/LSP.2014.2323432

Journal article, IEEE Signal Processing Letters, Year: 2014


Achintya Sarkar

Role: Author

Laboratoire d'Informatique pour la Mécanique et les Sciences de l'Ingénieur

Cong-Thanh Do

Role: Author

Laboratoire d'Informatique pour la Mécanique et les Sciences de l'Ingénieur

Viet-Bac Le

Role: Author

Vocapia Research [Orsay]

Claude Barras

Role: Author
PersonId : 17217
IdHAL : claude-barras
IdRef : 165065583

Laboratoire d'Informatique pour la Mécanique et les Sciences de l'Ingénieur

Abstract

Most speaker recognition systems rely on short-term acoustic cepstral features to extract speaker-relevant information from the signal. However, phonetically discriminative features, extracted by a bottleneck multi-layer perceptron (MLP) over longer stretches of time, can provide complementary information and have been adopted in speech transcription systems. We compare speaker verification performance using cepstral features, discriminative features, and a concatenation of both followed by dimension reduction. We consider two speaker recognition systems, one based on maximum likelihood linear regression (MLLR) super-vectors and the other a state-of-the-art i-vector system with two session variability compensation schemes. Experiments are reported on a standard configuration of the NIST SRE 2008 and 2010 databases. The results show that the phonetically discriminative MLP features retain speaker-specific information that is complementary to the short-term cepstral features. Performance improves with both score-domain and feature-domain fusion, and the speaker verification equal error rate (EER) is reduced by up to 50% relative compared to the best i-vector system using only cepstral features.
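The two fusion strategies and the EER metric named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The function names, the use of PCA as the dimension reduction, the equal-weight score sum, and the feature dimensions in the usage note are all assumptions made for illustration.

```python
import numpy as np


def fuse_features(cepstral, bottleneck, out_dim):
    """Feature-domain fusion: concatenate two frame-aligned streams, then PCA.

    Assumption: PCA stands in for the dimension reduction step; the abstract
    does not say which reduction was applied at this point.
    """
    if cepstral.shape[0] != bottleneck.shape[0]:
        raise ValueError("feature streams must have the same number of frames")
    stacked = np.hstack([cepstral, bottleneck])
    centered = stacked - stacked.mean(axis=0)
    # Rows of vt are the principal directions, sorted by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:out_dim].T


def fuse_scores(scores_a, scores_b, weight=0.5):
    """Score-domain fusion: weighted sum of two systems' trial scores.

    Assumption: a fixed weight; in practice it would be tuned on held-out data.
    """
    scores_a = np.asarray(scores_a, dtype=float)
    scores_b = np.asarray(scores_b, dtype=float)
    return weight * scores_a + (1.0 - weight) * scores_b


def equal_error_rate(target_scores, nontarget_scores):
    """Approximate EER: operating point where miss and false-alarm rates are closest."""
    target_scores = np.asarray(target_scores, dtype=float)
    nontarget_scores = np.asarray(nontarget_scores, dtype=float)
    thresholds = np.sort(np.concatenate([target_scores, nontarget_scores]))
    best_gap, eer = np.inf, 1.0
    for t in thresholds:
        miss = np.mean(target_scores < t)            # targets rejected
        false_alarm = np.mean(nontarget_scores >= t)  # impostors accepted
        gap = abs(miss - false_alarm)
        if gap < best_gap:
            best_gap, eer = gap, (miss + false_alarm) / 2.0
    return eer
```

As a usage example, calling `fuse_features` on a hypothetical 39-dimensional cepstral stream and a 30-dimensional bottleneck stream with `out_dim=40` returns one 40-dimensional vector per frame. Calling `equal_error_rate` on the output of `fuse_scores` gives the EER of the fused system, which can be compared against the EER of each system alone.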

Keywords

Speaker verification; i-vector; multi-layer perceptron; bottleneck features; PCA; LDA; PLDA

Domains

Computer Science [cs]; Signal and Image Processing [eess.SP]

Main file

double-final.pdf (225.27 KB)

Origin: Files produced by the author(s)


https://hal.science/hal-01690336

Submitted on: Monday, January 22, 2018, 22:41:48

Last modified on: Saturday, October 7, 2023, 21:36:20

Long-term archiving on: Thursday, May 24, 2018, 10:45:28

Dates and versions

hal-01690336 , version 1 (22-01-2018)

Identifiers

HAL Id : hal-01690336 , version 1
DOI : 10.1109/LSP.2014.2323432

Cite

Achintya Sarkar, Cong-Thanh Do, Viet-Bac Le, Claude Barras. Combination of Cepstral and Phonetically Discriminative Features for Speaker Verification. IEEE Signal Processing Letters, 2014, 21 (9), pp. 1040-1044. ⟨10.1109/LSP.2014.2323432⟩. ⟨hal-01690336⟩


Collections

CNRS LIMSI SORBONNE-UNIVERSITE LISN

260 views

164 downloads
