
Spoken WordCloud: Clustering Recurrent Patterns in Speech

Abstract: The automatic summarization of speech recordings is typically carried out as a two-step process: the speech is first decoded using an automatic speech recognition system, and the resulting text transcripts are processed to create the summary. However, this approach may not be suitable under adverse acoustic conditions or for languages with limited training resources. To address these limitations, we propose in this paper an automatic speech summarization method based on the automatic discovery of patterns in the speech: recurrent acoustic patterns are first extracted from the audio and then clustered and ranked according to their number of repetitions in the recording. This approach allows us to build what we call a "Spoken WordCloud" because of its similarity to text-based word clouds. We present an algorithm that achieves a cluster purity of up to 90% and an inverse purity of 71% in preliminary experiments on a small dataset of connected spoken words.
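The cluster-and-rank step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' algorithm: the threshold-based single-linkage clustering, the union-find bookkeeping, and the pluggable `similarity` function are all assumptions; in the paper the patterns are acoustic segments compared with an audio similarity measure, whereas here any objects with a pairwise similarity will do.

```python
from collections import defaultdict

def cluster_patterns(patterns, similarity, threshold=0.8):
    """Group discovered patterns by single-linkage clustering:
    two patterns fall in the same cluster if they are connected by
    a chain of pairs whose similarity meets the threshold.
    (Union-find with path halving; a sketch, not the paper's method.)"""
    parent = list(range(len(patterns)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(patterns)):
        for j in range(i + 1, len(patterns)):
            if similarity(patterns[i], patterns[j]) >= threshold:
                ri, rj = find(i), find(j)
                if ri != rj:
                    parent[rj] = ri  # merge the two clusters

    clusters = defaultdict(list)
    for i in range(len(patterns)):
        clusters[find(i)].append(patterns[i])
    return list(clusters.values())

def rank_clusters(clusters):
    """Rank clusters by repetition count (cluster size) -- the
    weight that would size each word in the spoken word-cloud."""
    return sorted(clusters, key=len, reverse=True)

# Toy usage: strings stand in for acoustic patterns, and the
# similarity is exact match (a hypothetical stand-in measure).
patterns = ["hello", "hello", "world", "hello", "world", "data"]
sim = lambda a, b: 1.0 if a == b else 0.0
ranked = rank_clusters(cluster_patterns(patterns, sim, threshold=0.5))
```

With this toy input, the most-repeated pattern ("hello", three occurrences) ranks first, so it would appear largest in the resulting word cloud.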
Document type: Conference papers

Cited literature: 13 references
Contributor: Rémi Flamary
Submitted on: Monday, April 4, 2011 - 10:41:42 AM
Last modification on: Wednesday, March 2, 2022 - 10:10:09 AM
Long-term archiving on: Thursday, November 8, 2012 - 1:15:28 PM




HAL Id: hal-00582799, version 1


Rémi Flamary, Xavier Anguera, Nuria Oliver. Spoken WordCloud: Clustering Recurrent Patterns in Speech. International Workshop on Content-Based Multimedia Indexing, Jun 2011, Madrid, Spain. ⟨hal-00582799⟩


