Conference paper, year: 2011

Spoken WordCloud: Clustering Recurrent Patterns in Speech

Abstract

The automatic summarization of speech recordings is typically carried out as a two-step process: the speech is first decoded using an automatic speech recognition system, and the resulting text transcripts are then processed to create the summary. However, this approach might not be suitable under adverse acoustic conditions or for languages with limited training resources. To address these limitations, we propose in this paper an automatic speech summarization method based on the automatic discovery of patterns in the speech: recurrent acoustic patterns are first extracted from the audio and are then clustered and ranked according to the number of repetitions in the recording. This approach allows us to build what we call a "Spoken WordCloud" because of its similarity to text-based word clouds. We present an algorithm that achieves a cluster purity of up to 90% and an inverse purity of 71% in preliminary experiments using a small dataset of connected spoken words.
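To make the two reported figures concrete, here is a minimal sketch of cluster purity and inverse purity as they are commonly defined for clustering evaluation, together with a ranking of clusters by number of members. It is not the authors' code: the function names, the toy cluster assignments, and the toy word labels are all assumptions for illustration, and the paper may use a slightly different formulation of either measure.

```python
from collections import Counter


def purity(clusters, labels):
    """Share of items that carry the majority label of their cluster.

    clusters[i] is the (hypothetical) cluster id assigned to pattern i,
    labels[i] is the reference word for the same pattern.
    """
    if len(clusters) != len(labels):
        raise ValueError("clusters and labels must have the same length")
    if not clusters:
        raise ValueError("empty input")
    # Count how often each (cluster, label) pair occurs.
    pair_counts = Counter(zip(clusters, labels))
    # For every cluster keep the size of its largest single-label group.
    best = {}
    for (cluster_id, _label), count in pair_counts.items():
        best[cluster_id] = max(best.get(cluster_id, 0), count)
    return sum(best.values()) / len(clusters)


def inverse_purity(clusters, labels):
    """Purity with the roles of clusters and reference labels swapped."""
    return purity(labels, clusters)


def rank_clusters(clusters):
    """Return (cluster id, size) pairs, largest cluster first."""
    return Counter(clusters).most_common()


# Toy data (assumed): six detected patterns, three clusters, two words.
clusters = [0, 0, 0, 1, 1, 2]
labels = ["madrid", "madrid", "speech", "speech", "speech", "speech"]

print(purity(clusters, labels))          # 5 of 6 items match their cluster majority
print(inverse_purity(clusters, labels))  # 4 of 6 items sit in their word's main cluster
print(rank_clusters(clusters))
```

High purity means each cluster mostly contains one word, while high inverse purity means each word is mostly gathered in one cluster. In a word-cloud rendering, the sizes returned by `rank_clusters` would play the role that word frequency plays in a text-based word cloud.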
Main file
flamary_cmbi2011.pdf (185.67 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-00582799, version 1 (04-04-2011)

Identifiers

  • HAL Id: hal-00582799, version 1

Cite

Rémi Flamary, Xavier Anguera, Nuria Oliver. Spoken WordCloud: Clustering Recurrent Patterns in Speech. International Workshop on Content-Based Multimedia Indexing, Jun 2011, Madrid, Spain. ⟨hal-00582799⟩
125 views
400 downloads
