Conference papers

Spoken WordCloud: Clustering Recurrent Patterns in Speech

Abstract : The automatic summarization of speech recordings is typically carried out as a two-step process: the speech is first decoded using an automatic speech recognition system, and the resulting text transcripts are then processed to create the summary. However, this approach might not be suitable under adverse acoustic conditions or for languages with limited training resources. To address these limitations, we propose an automatic speech summarization method based on the automatic discovery of patterns in the speech: recurrent acoustic patterns are first extracted from the audio, then clustered and ranked according to the number of repetitions in the recording. This approach allows us to build what we call a "Spoken WordCloud" because of its similarity to text-based word clouds. We present an algorithm that achieves a cluster purity of up to 90% and an inverse purity of 71% in preliminary experiments using a small dataset of connected spoken words.
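
The two evaluation measures named in the abstract, and the ranking of clusters by repetition count, can be illustrated with a minimal sketch. It assumes the commonly used definitions (purity as the fraction of items carrying the majority label of their cluster, inverse purity as the same quantity with the roles of clusters and reference labels swapped); the paper itself may define or weight them differently. The function names and the toy data are hypothetical and are not taken from the paper.

```python
from collections import Counter


def purity(cluster_ids, labels):
    """Fraction of items that carry the majority label of their cluster.

    Assumed (common) definition; not necessarily the paper's exact formula.
    """
    if len(cluster_ids) != len(labels):
        raise ValueError("cluster_ids and labels must have the same length")
    if not labels:
        raise ValueError("at least one item is required")
    # Count how often each reference label occurs inside each cluster.
    per_cluster = {}
    for cluster, label in zip(cluster_ids, labels):
        per_cluster.setdefault(cluster, Counter())[label] += 1
    # Sum the size of the majority label in every cluster, then normalize.
    majority_total = sum(max(c.values()) for c in per_cluster.values())
    return majority_total / len(labels)


def inverse_purity(cluster_ids, labels):
    """Purity with clusters and reference labels swapped."""
    return purity(labels, cluster_ids)


def rank_clusters(cluster_ids):
    """Return (cluster, repetition count) pairs, most repeated first."""
    return Counter(cluster_ids).most_common()


# Hypothetical toy data: six discovered patterns, their assigned clusters,
# and the word each pattern actually corresponds to.
toy_clusters = [0, 0, 0, 0, 1, 2]
toy_words = ["one", "one", "two", "two", "two", "three"]

toy_purity = purity(toy_clusters, toy_words)
toy_inverse_purity = inverse_purity(toy_clusters, toy_words)
toy_ranking = rank_clusters(toy_clusters)
```

In this toy case, cluster 0 mixes two words, which lowers purity, while the word "two" is split across clusters 0 and 1, which lowers inverse purity. The ranking puts cluster 0 first because it has the most repetitions, which is the ordering a word cloud would use to size its entries.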
Document type :
Conference papers

Cited literature : 13 references

https://hal.archives-ouvertes.fr/hal-00582799
Contributor : Rémi Flamary
Submitted on : Monday, April 4, 2011 - 10:41:42 AM
Last modification on : Thursday, February 7, 2019 - 5:35:21 PM
Long-term archiving on : Thursday, November 8, 2012 - 1:15:28 PM

File

flamary_cmbi2011.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-00582799, version 1

Citation

Rémi Flamary, Xavier Anguera, Nuria Oliver. Spoken WordCloud: Clustering Recurrent Patterns in Speech. International Workshop on Content-Based Multimedia Indexing, Jun 2011, Madrid, Spain. ⟨hal-00582799⟩
