Spoken WordCloud: Clustering Recurrent Patterns in Speech

Rémi Flamary; Xavier Anguera; Nuria Oliver

Conference paper, Year: 2011


Rémi Flamary

Role: Author
PersonId: 22
IdHAL: remi-flamary
ORCID: 0000-0002-4212-6627
IdRef: 188395008

Laboratoire d'Informatique, de Traitement de l'Information et des Systèmes

Xavier Anguera

Role: Author

Telefonica Investigación y Desarrollo

Nuria Oliver

Role: Author

Telefonica Investigación y Desarrollo

Abstract

The automatic summarization of speech recordings is typically carried out as a two-step process: the speech is first decoded using an automatic speech recognition system, and the resulting text transcripts are then processed to create the summary. However, this approach may not be suitable under adverse acoustic conditions or for languages with limited training resources. To address these limitations, we propose an automatic speech summarization method based on the automatic discovery of patterns in the speech: recurrent acoustic patterns are first extracted from the audio and then clustered and ranked according to the number of repetitions in the recording. This approach allows us to build what we call a "Spoken WordCloud" because of its similarity to text-based word clouds. We present an algorithm that achieves a cluster purity of up to 90% and an inverse purity of 71% in preliminary experiments using a small dataset of connected spoken words.
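The two evaluation figures quoted in the abstract can be illustrated with a minimal sketch. It assumes the common definitions of purity (the fraction of items that carry the majority label of their cluster) and inverse purity (the same measure with the roles of clusters and labels swapped); the paper may define them differently. The function names, the toy cluster assignments, and the word labels below are hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict


def purity(cluster_ids, word_labels):
    """Fraction of items that carry the majority label of their cluster.

    Assumed (common) definition; not necessarily the one used in the paper.
    """
    if len(cluster_ids) != len(word_labels):
        raise ValueError("cluster_ids and word_labels must have the same length")
    if not cluster_ids:
        raise ValueError("at least one item is required")
    # Count how often each label occurs inside each cluster.
    label_counts = defaultdict(Counter)
    for cluster, label in zip(cluster_ids, word_labels):
        label_counts[cluster][label] += 1
    majority_total = sum(max(counts.values()) for counts in label_counts.values())
    return majority_total / len(cluster_ids)


def inverse_purity(cluster_ids, word_labels):
    """Purity with clusters and labels swapped: penalizes words split across clusters."""
    return purity(word_labels, cluster_ids)


def rank_clusters(cluster_ids):
    """Order clusters by number of members, largest first (word-cloud style ranking)."""
    return Counter(cluster_ids).most_common()


# Hypothetical toy data: six discovered patterns, three clusters.
clusters = [0, 0, 0, 1, 1, 2]
labels = ["madrid", "madrid", "paris", "madrid", "madrid", "rome"]

print(purity(clusters, labels))          # 5 of 6 items match their cluster's majority label
print(inverse_purity(clusters, labels))  # 4 of 6: "madrid" is split over two clusters
print(rank_clusters(clusters))           # [(0, 3), (1, 2), (2, 1)]
```

In this toy case the purity is higher than the inverse purity because one word is spread over two clusters, which is the same direction as the gap between the 90% and 71% figures reported above.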

Domains

Machine Learning [cs.LG]

Main file

flamary_cmbi2011.pdf (185.67 KB)

Origin: Files produced by the author(s)

https://hal.science/hal-00582799

Submitted on: Monday, April 4, 2011, 10:41:42

Last modified on: Friday, December 22, 2023, 15:16:05

Long-term archiving on: Thursday, November 8, 2012, 13:15:28

Dates and versions

hal-00582799 , version 1 (04-04-2011)

Identifiers

HAL Id: hal-00582799, version 1

Cite

Rémi Flamary, Xavier Anguera, Nuria Oliver. Spoken WordCloud: Clustering Recurrent Patterns in Speech. International Workshop on Content-Based Multimedia Indexing, Jun 2011, Madrid, Spain. ⟨hal-00582799⟩

Collections

INSA-ROUEN LITIS COMUE-NORMANDIE UNIROUEN UNILEHAVRE INSA-GROUPE

125 Views

400 Downloads
