
Speaker detection in the wild: Lessons learned from JSALT 2019

Abstract : This paper presents the problems addressed, and the solutions explored, at the JSALT workshop for speaker detection with a single microphone in adverse scenarios. The main focus was to handle a wide range of conditions, from meetings to speech in the wild. We describe the research threads we explored and a set of modules that proved successful in these scenarios. The ultimate goal was speaker detection, but our first finding was that effective diarization improves detection, and that omitting the diarization stage degrades performance. All the configurations we studied agree on this point and share a common backbone that includes diarization as a preceding stage. With this backbone, we analyzed the following problems: voice activity detection, handling of noisy signals, domain mismatch, improvements to clustering, and the overall impact of earlier stages on the final speaker detection. In this paper, we show partial results for speaker diarization to give a better understanding of the problem, and we present the final results for speaker detection.
Document type :
Conference papers

Contributor : Emmanuel Dupoux
Submitted on : Friday, December 20, 2019 - 3:24:19 PM
Last modification on : Tuesday, March 2, 2021 - 10:25:03 AM
Long-term archiving on : Saturday, March 21, 2020 - 7:57:49 PM




  • HAL Id : hal-02417632, version 1
  • ARXIV : 1912.00938


Paola García, Jesus Villalba, Hervé Bredin, Jun Du, Diego Castan, et al. Speaker detection in the wild: Lessons learned from JSALT 2019. Odyssey 2020 The Speaker and Language Recognition Workshop, Nov 2020, Tokyo, Japan. ⟨hal-02417632⟩


