Conference papers

Segmentation et Regroupement en Locuteurs d'une collection de documents audio

Abstract : We study speaker diarization across a collection of audio documents, where the goal is to detect speakers who appear in several shows. In our approach, each show is first processed independently; the collection is then processed as a whole to group the speakers who appear in several shows. Two clustering methods are studied for this collection-level step: one uses the NCLR metric, and the other, inspired by the i-vector techniques used in speaker verification, is expressed as an ILP (integer linear programming) problem. Both methods were evaluated on two sets of 15 shows from ESTER 2. The i-vector method performs slightly worse than the NCLR method, but it is on average 17 times faster, which makes it suitable for processing large volumes of data.

https://hal.archives-ouvertes.fr/hal-01450722
Contributor : Hakim Amokrane
Submitted on : Monday, April 3, 2017 - 9:54:37 PM
Last modification on : Thursday, April 6, 2017 - 10:00:42 AM
Document(s) archived on : Tuesday, July 4, 2017 - 2:56:14 PM

File

F12-1055.pdf
Publisher files allowed on an open archive

Identifiers

  • HAL Id : hal-01450722, version 1

Citation

Grégor Dupuy, Mickael Rouvier, Sylvain Meignier, Yannick Estève. Segmentation et Regroupement en Locuteurs d'une collection de documents audio. 29e Journées d’Études sur la Parole (JEP'12), 2012, Grenoble, France. pp.433 - 440. ⟨hal-01450722⟩

Metrics

  • Record views : 193
  • File downloads : 88