Conference papers

Segmentation et Regroupement en Locuteurs d'une collection de documents audio

Abstract : We study speaker diarization across a collection of audio documents, where the goal is to detect speakers who appear in several shows. In our approach, each show is first processed independently; the shows are then processed collectively to group speakers who appear in more than one of them. Two clustering methods are studied for this collection-level step: one uses the NCLR (normalized cross likelihood ratio) metric, and the other, inspired by the i-vector techniques used in speaker verification, is expressed as an ILP (integer linear programming) problem. Both methods were evaluated on two sets of 15 shows from ESTER 2. The i-vector method performs slightly worse than the NCLR method, but it is on average 17 times faster to compute. It is therefore suitable for processing large volumes of data.

Contributor : HAKIM AMOKRANE
Submitted on : Monday, April 3, 2017 - 9:54:37 PM
Last modification on : Tuesday, December 8, 2020 - 9:51:09 AM
Long-term archiving on : Tuesday, July 4, 2017 - 2:56:14 PM


Publisher files allowed on an open archive


  • HAL Id : hal-01450722, version 1



Grégor Dupuy, Mickael Rouvier, Sylvain Meignier, Yannick Estève. Segmentation et Regroupement en Locuteurs d'une collection de documents audio. 29e Journées d’Études sur la Parole (JEP'12), 2012, Grenoble, France. pp. 433-440. ⟨hal-01450722⟩


