Conference paper, Year: 2020

Unsupervised regularization of the embedding extractor for robust language identification

Abstract

State-of-the-art spoken language identification systems consist of three modules: a frame-level feature extractor, a segment-level embedding extractor, and a final classifier. The performance of these systems degrades when there is a mismatch between training and testing data. Most domain adaptation methods focus on adapting the final classifier. In this article, we propose a model-based unsupervised domain adaptation of the segment-level embedding extractor. The approach consists of modifying the loss function used to train the embedding extractor: we introduce a regularization term based on the maximum mean discrepancy (MMD) loss. Experiments were performed on the RATS corpus, with transmission channel mismatch between telephone and radio channels. We obtained the same language identification performance as supervised training on the target domains, without using labeled data from these domains.
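To make the idea of an MMD-based regularization term concrete, here is a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify the kernel, the estimator, or how the term is weighted. The Gaussian kernel, the `bandwidth` and `weight` parameters, and the function names `gaussian_kernel`, `mmd_squared`, and `regularized_loss` are all assumptions made for illustration.

```python
import numpy as np


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel matrix between the rows of x and the rows of y.

    The kernel choice and bandwidth are assumptions; the abstract names neither.
    """
    sq_dists = (
        np.sum(x ** 2, axis=1)[:, None]
        + np.sum(y ** 2, axis=1)[None, :]
        - 2.0 * x @ y.T
    )
    # Guard against tiny negative values caused by floating-point rounding.
    sq_dists = np.maximum(sq_dists, 0.0)
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))


def mmd_squared(source, target, bandwidth=1.0):
    """Biased estimate of the squared MMD between two sets of embeddings.

    source: (n, d) embeddings from the labeled (source) domain.
    target: (m, d) embeddings from the unlabeled (target) domain.
    """
    k_ss = gaussian_kernel(source, source, bandwidth).mean()
    k_tt = gaussian_kernel(target, target, bandwidth).mean()
    k_st = gaussian_kernel(source, target, bandwidth).mean()
    return k_ss + k_tt - 2.0 * k_st


def regularized_loss(classification_loss, source_emb, target_emb, weight=1.0):
    """Supervised loss on source data plus a weighted MMD penalty.

    Only the source domain needs labels; the target embeddings enter
    through the MMD term alone. The weight is a hypothetical hyperparameter.
    """
    return classification_loss + weight * mmd_squared(source_emb, target_emb)
```

In this sketch, `source_emb` and `target_emb` would be batches of segment-level embeddings from the two channels, and `classification_loss` would be the usual language classification loss computed on the labeled source batch. In actual training the same computation would be written in an automatic differentiation framework so that the penalty can be backpropagated through the embedding extractor.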
Main file
odyssey_corrections_publication.pdf (929.36 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-02544156, version 1 (16-04-2020)

Identifiers

  • HAL Id: hal-02544156, version 1

Cite

Raphaël Duroselle, Denis Jouvet, Irina Illina. Unsupervised regularization of the embedding extractor for robust language identification. Odyssey 2020 - The Speaker and Language Recognition Workshop, Nov 2020, Tokyo, Japan. ⟨hal-02544156⟩
