
Unsupervised regularization of the embedding extractor for robust language identification

Raphaël Duroselle 1 Denis Jouvet 1 Irina Illina 1
1 MULTISPEECH - Speech Modeling for Facilitating Oral-Based Communication
Inria Nancy - Grand Est, LORIA - NLPKD - Department of Natural Language Processing & Knowledge Discovery
Abstract : State-of-the-art spoken language identification systems consist of three modules: a frame-level feature extractor, a segment-level embedding extractor and a final classifier. The performance of these systems degrades when the training and testing data are mismatched. Most domain adaptation methods focus on adapting the final classifier. In this article, we propose a model-based unsupervised domain adaptation of the segment-level embedding extractor. The approach consists of modifying the loss function used for training the embedding extractor: we introduce a regularization term based on the maximum mean discrepancy loss. Experiments were performed on the RATS corpus with transmission channel mismatch between telephone and radio channels. We obtained the same language identification performance as supervised training on the target domains, but without using labeled data from these domains.
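As an illustration of the regularization idea described in the abstract, the sketch below adds a maximum mean discrepancy (MMD) penalty between source-domain and target-domain embeddings to a classification loss. It is not the authors' implementation: the Gaussian kernel, the biased estimator, and the names `rbf_kernel`, `mmd2`, `regularized_loss`, `weight` and `bandwidth` are all assumptions chosen for the example.

```python
import numpy as np


def rbf_kernel(x, y, bandwidth):
    """Gaussian kernel matrix between rows of x and rows of y (assumed kernel choice)."""
    sq_dist = (
        np.sum(x ** 2, axis=1)[:, None]
        + np.sum(y ** 2, axis=1)[None, :]
        - 2.0 * x @ y.T
    )
    sq_dist = np.maximum(sq_dist, 0.0)  # guard against tiny negative values from rounding
    return np.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd2(source, target, bandwidth=1.0):
    """Biased estimate of the squared MMD between two sets of embeddings."""
    k_ss = rbf_kernel(source, source, bandwidth)
    k_tt = rbf_kernel(target, target, bandwidth)
    k_st = rbf_kernel(source, target, bandwidth)
    return k_ss.mean() + k_tt.mean() - 2.0 * k_st.mean()


def regularized_loss(classification_loss, source_emb, target_emb,
                     weight=1.0, bandwidth=1.0):
    """Supervised loss on labeled source data plus an unsupervised MMD penalty.

    Only the embeddings of the target domain are needed, not their labels.
    `weight` is a hypothetical trade-off hyperparameter.
    """
    return classification_loss + weight * mmd2(source_emb, target_emb, bandwidth)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    telephone = rng.normal(size=(64, 8))    # stand-in for source-domain embeddings
    radio = rng.normal(size=(64, 8)) + 2.0  # stand-in for shifted target-domain embeddings
    print(mmd2(telephone, radio, bandwidth=3.0))
    print(regularized_loss(0.5, telephone, radio, weight=1.0, bandwidth=3.0))
```

In an actual training setup the penalty would be computed on mini-batches inside an automatic-differentiation framework so that its gradient reaches the embedding extractor; NumPy is used here only to keep the computation easy to read.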
Document type :
Conference papers

Cited literature : 36 references
Contributor : Raphaël Duroselle
Submitted on : Thursday, April 16, 2020 - 9:27:36 AM
Last modification on : Monday, February 15, 2021 - 1:48:34 PM


Files produced by the author(s)


  • HAL Id : hal-02544156, version 1


Raphaël Duroselle, Denis Jouvet, Irina Illina. Unsupervised regularization of the embedding extractor for robust language identification. Odyssey 2020 - The Speaker and Language Recognition Workshop, Nov 2020, Tokyo, Japan. ⟨hal-02544156⟩


