Conference papers

Joint optimization of diffusion probabilistic-based multichannel speech enhancement with far-field speaker verification

Sandipana Dowerah 1 Romain Serizel 1 Denis Jouvet 1 M Mohammadamini 2 Driss Matrouf 2 
1 MULTISPEECH - Speech Modeling for Facilitating Oral-Based Communication
Inria Nancy - Grand Est, LORIA - NLPKD - Department of Natural Language Processing & Knowledge Discovery
Abstract : Today's smart devices that use speaker verification are increasingly equipped with multiple microphones, which helps resolve spatial ambiguity and improves directivity. However, unlike other speech-based applications, the performance of speaker verification degrades in far-field scenarios due to the adverse effects of a noisy environment and room reverberation. This paper presents a novel multichannel speech enhancement module based on the diffusion probabilistic model. It is used as the front-end of the ECAPA-TDNN speaker verification system in far-field scenarios under noisy-reverberant conditions. The proposed system follows a two-stage training approach. In the first stage, the speech enhancement and speaker verification modules are trained individually. In the second stage, the two modules are combined and trained jointly. We use a similarity-preserving knowledge distillation loss that guides the network to produce activations for enhanced signals similar to those for clean speech signals. Joint optimization with the knowledge distillation loss achieves the best performance both on an evaluation set composed of synthetic clips similar to those used in training and on unseen recorded clips from the VOiCES dataset.
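The abstract does not state the formula for the similarity-preserving knowledge distillation loss. As an illustration only, the sketch below follows the commonly used formulation of similarity-preserving distillation: for a batch of b inputs, build the b-by-b matrix of pairwise activation similarities, L2-normalize each row, and penalize the squared Frobenius distance between the two matrices divided by b squared. Here the "teacher" activations are assumed to come from clean speech and the "student" activations from enhanced signals. The function names, the plain-list data layout, and the choice of normalization are assumptions and are not taken from the paper.

```python
import math


def gram_row_normalized(acts):
    """Return the pairwise similarity matrix G = A A^T with L2-normalized rows.

    `acts` is a list of b activation vectors (one flattened vector per clip).
    """
    b = len(acts)
    gram = [
        [sum(x * y for x, y in zip(acts[i], acts[j])) for j in range(b)]
        for i in range(b)
    ]
    normalized = []
    for row in gram:
        norm = math.sqrt(sum(v * v for v in row))
        # Guard against an all-zero row (e.g. silent input with zero activations).
        normalized.append([v / norm if norm > 0 else 0.0 for v in row])
    return normalized


def sp_kd_loss(clean_acts, enhanced_acts):
    """Similarity-preserving distillation loss between two batches of activations.

    Hypothetical helper: squared Frobenius distance between the row-normalized
    similarity matrices, divided by b^2.
    """
    if len(clean_acts) != len(enhanced_acts):
        raise ValueError("both batches must contain the same number of clips")
    b = len(clean_acts)
    g_clean = gram_row_normalized(clean_acts)
    g_enh = gram_row_normalized(enhanced_acts)
    squared_distance = sum(
        (g_clean[i][j] - g_enh[i][j]) ** 2 for i in range(b) for j in range(b)
    )
    return squared_distance / (b * b)


if __name__ == "__main__":
    clean = [[1.0, 0.0], [0.0, 1.0]]      # two clips with dissimilar activations
    enhanced = [[1.0, 0.0], [1.0, 0.0]]   # enhancement collapsed them together
    print(sp_kd_loss(clean, clean))       # identical similarity structure
    print(sp_kd_loss(clean, enhanced))    # mismatched similarity structure
```

Because the loss compares similarity structure within a batch rather than the activations themselves, it is unchanged when one batch of activations is uniformly rescaled, and it does not require the two sets of activations to have the same dimensionality.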
Document type : Conference papers
Contributor : Sandipana Dowerah
Submitted on : Thursday, October 27, 2022 - 11:02:25 AM
Last modification on : Thursday, December 8, 2022 - 9:01:33 AM


  • HAL Id : hal-03671583, version 2


Sandipana Dowerah, Romain Serizel, Denis Jouvet, M Mohammadamini, Driss Matrouf. Joint optimization of diffusion probabilistic-based multichannel speech enhancement with far-field speaker verification. IEEE SLT 2022, Jan 2023, Doha, Qatar. ⟨hal-03671583v2⟩


