Digital Signal Processing Constrained temporal structure for text-dependent speaker verification

Anthony Larcher; Jean-François Bonastre; John S.D. Mason

doi:10.1016/j.dsp.2013.07.007

Journal article, Digital Signal Processing, Year: 2013


Anthony Larcher

Role: Author
PersonId : 20105
IdHAL : anthony-larcher
ORCID : 0000-0003-4398-0224
IdRef : 139544569

Laboratoire Informatique d'Avignon

Jean-François Bonastre

Role: Author
PersonId : 172421
IdHAL : jean-francois-bonastre
ORCID : 0000-0001-7741-3346
IdRef : 079112978

Laboratoire Informatique d'Avignon

John S.D. Mason

Role: Author

School of Engineering, Swansea University

Abstract

In the context of mobile devices, speaker recognition engines may suffer from ergonomic constraints and a limited amount of computing resources. Although GMM/UBM systems are effective in classical contexts, they show their limitations when the quantity of speech data is restricted. In contrast, the proposed GMM/UBM extension addresses situations characterised by limited enrolment data and only the computing power typically found on modern mobile devices. A key contribution is to harness the temporal structure of speech using client-customised pass-phrases and new Markov model structures. The additional temporal information is then used with Viterbi decoding to enhance discrimination, increasing the gap between client and imposter scores. Experiments on the MyIdea database are presented with a standard GMM/UBM configuration acting as a benchmark. When imposters do not know the client pass-phrase, a relative gain of up to 65% in terms of equal error rate (EER) is achieved over the GMM/UBM baseline configuration. The results highlight the potential of this new approach, which offers a good balance between complexity and recognition accuracy.
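The abstract reports its result as a relative gain in EER. As a minimal sketch of how such a figure could be computed, the Python below estimates an EER from two sets of verification scores and derives a relative gain from two EER values. The function names, the threshold sweep, and all scores and EER values are illustrative assumptions, not data or code from the paper.

```python
import numpy as np

def equal_error_rate(client_scores, impostor_scores):
    """Approximate the EER by sweeping a threshold over all observed scores.

    Assumption: a trial is accepted when its score is >= the threshold.
    """
    client = np.asarray(client_scores, dtype=float)
    impostor = np.asarray(impostor_scores, dtype=float)
    thresholds = np.sort(np.concatenate([client, impostor]))
    # False rejection rate: fraction of client scores below the threshold.
    frr = np.array([(client < t).mean() for t in thresholds])
    # False acceptance rate: fraction of impostor scores at or above it.
    far = np.array([(impostor >= t).mean() for t in thresholds])
    # The EER is taken where the two error rates are closest.
    i = int(np.argmin(np.abs(far - frr)))
    return float((far[i] + frr[i]) / 2.0)

def relative_gain(baseline_eer, new_eer):
    """Relative EER reduction of a new system over a baseline."""
    return (baseline_eer - new_eer) / baseline_eer

# Hypothetical toy scores, for illustration only.
toy_client = [2.0, 3.0, 4.0, 5.0]
toy_impostor = [0.0, 1.0, 2.5, 3.5]
print(equal_error_rate(toy_client, toy_impostor))  # 0.25
```

With hypothetical EERs of 0.20 for a baseline and 0.07 for a new system, `relative_gain(0.20, 0.07)` evaluates to approximately 0.65, which is the form in which a 65% relative gain would be stated.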

Keywords

Speaker recognition; Text-dependent; Password; Embedded application

Domains

Computer Science [cs]

Main file

Constrained_Temporal_Structure_for_Text-Dependent_Speaker_Verification.pdf (456.38 KB)

Origin: Files produced by the author(s)


https://hal.science/hal-01317964

Submitted on: Monday, 19 November 2018, 10:35:54

Last modified on: Wednesday, 9 June 2021, 15:26:02

Long-term archiving on: Wednesday, 20 February 2019, 13:27:29

Dates and versions

hal-01317964 , version 1 (19-11-2018)

Identifiers

HAL Id : hal-01317964 , version 1
DOI : 10.1016/j.dsp.2013.07.007

Cite

Anthony Larcher, Jean-François Bonastre, John S.D. Mason. Digital Signal Processing Constrained temporal structure for text-dependent speaker verification. Digital Signal Processing, 2013, ⟨10.1016/j.dsp.2013.07.007⟩. ⟨hal-01317964⟩


Collections

UNIV-AVIGNON LIA
