Speaker Utterances tying among speaker segmented audio documents using hierarchical classification: towards speaker indexing of audio databases

Abstract : Speaker indexing of an audio database consists in organizing the audio data according to the speakers present in the database. It comprises three steps: (1) segmentation by speakers of each audio document; (2) speaker tying among the various segmented portions of the audio documents; and (3) generation of a speaker-based index. This paper focuses on the second step, the speaker tying task, which has not been addressed in the literature. The result of this task is a classification of the segmented acoustic data into clusters; each cluster should represent one speaker. This paper investigates hierarchical classification approaches for speaker tying. Two new discriminant dissimilarity measures and a new bottom-up algorithm are also proposed. The experiments are conducted on a subset of the Switchboard database, a conversational telephone database, and show that the proposed method achieves very satisfactory speaker tying across audio documents, with a good level of cluster purity, but with a number of clusters significantly higher than the number of speakers.
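To make the speaker tying step more concrete, here is a minimal Python sketch of a generic bottom-up (agglomerative) classification over a precomputed dissimilarity matrix, followed by a cluster purity computation. It is not the algorithm or the dissimilarity measures proposed in the paper, which the abstract does not specify. The average-linkage rule, the stopping threshold, the function names `agglomerative_tying` and `purity`, and the toy dissimilarity values are all assumptions made for illustration.

```python
from collections import Counter


def agglomerative_tying(dissim, threshold):
    """Bottom-up clustering of segments from a symmetric dissimilarity matrix.

    Assumption: average linkage and a fixed stopping threshold; the paper's
    own measures and merging rule are not reproduced here.
    """
    clusters = [[i] for i in range(len(dissim))]

    def linkage(a, b):
        # Average pairwise dissimilarity between two clusters.
        return sum(dissim[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = linkage(clusters[x], clusters[y])
                if best is None or d < best[0]:
                    best = (d, x, y)
        d, x, y = best
        if d > threshold:
            break  # The closest pair is too dissimilar: stop merging.
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters


def purity(clusters, speakers):
    """Fraction of segments that belong to the majority speaker of their cluster."""
    majority = sum(
        Counter(speakers[i] for i in c).most_common(1)[0][1] for c in clusters
    )
    return majority / len(speakers)


# Hypothetical toy data: four segments, two true speakers.
dissim = [
    [0.0, 0.2, 0.9, 0.8],
    [0.2, 0.0, 0.85, 0.9],
    [0.9, 0.85, 0.0, 0.6],
    [0.8, 0.9, 0.6, 0.0],
]
speakers = ["A", "A", "B", "B"]

clusters = agglomerative_tying(dissim, threshold=0.5)
score = purity(clusters, speakers)
```

With this toy data and a threshold of 0.5, the two segments of speaker A are merged while the two segments of speaker B stay separate, so every cluster is pure but there are three clusters for two speakers. This is the same kind of trade-off the abstract reports: good purity with more clusters than speakers.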
Document type :
Conference papers

https://hal.archives-ouvertes.fr/hal-01434586
Contributor : Sylvain Meignier
Submitted on : Wednesday, March 29, 2017 - 10:15:30 AM
Last modification on : Saturday, June 15, 2019 - 12:24:17 PM
Long-term archiving on : Friday, June 30, 2017 - 12:11:39 PM

File

mei-icslp2002.pdf
Publisher files allowed on an open archive

Identifiers

  • HAL Id : hal-01434586, version 1

Citation

Sylvain Meignier, Jean-François Bonastre, Ivan Magrin-Chagnolleau. Speaker Utterances tying among speaker segmented audio documents using hierarchical classification: towards speaker indexing of audio databases. ISCA International Conference on Spoken Language Processing (ICSLP 2002), 2002, Denver, CO, United States. pp.577--580. ⟨hal-01434586⟩

Metrics

Record views : 100
File downloads : 40