Audio word similarity for clustering with zero resources based on iterative HMM classification

Amélie Royer; Guillaume Gravier; Vincent Claveau

doi:10.1109/ICASSP.2016.7472697

Conference paper. Year: 2016

Amélie Royer

Role: Author

Creating and exploiting explicit links between multimedia fragments

Guillaume Gravier

Role: Author
PersonId: 1046
IdHAL: guig
ORCID: 0000-0002-2266-5682
IdRef: 110355415

Creating and exploiting explicit links between multimedia fragments

Vincent Claveau

Role: Author
PersonId: 5270
IdHAL: vincent-claveau
ORCID: 0000-0002-3459-0550
IdRef: 075988216

Creating and exploiting explicit links between multimedia fragments

Abstract

Recent work on zero-resource word discovery makes intensive use of audio fragment clustering to find repeating speech patterns. In the absence of acoustic models, the clustering step traditionally relies on dynamic time warping (DTW) to compare two samples and thus suffers from the known limitations of this technique. We propose a new sample comparison method, called 'similarity by iterative classification', that exploits the modeling capacities of hidden Markov models (HMM) with no supervision. The core idea relies on the use of HMMs trained on randomly labeled data and exploits the fact that similar samples are more likely to be classified together by a large number of random classifiers than dissimilar ones. The resulting similarity measure is compared to DTW on two tasks, namely nearest neighbor retrieval and clustering, showing that the generalization capabilities of probabilistic machine learning significantly benefit audio word comparison and overcome many of the limitations of DTW-based comparison.
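
The core idea stated in the abstract (counting how often two samples are classified together by many classifiers trained on random labels) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The paper uses HMMs over variable-length audio sequences, whereas the sketch below swaps in a nearest-centroid classifier on fixed-length vectors so that it stays short and dependency-free. The function names and the parameters `n_classes`, `n_rounds`, and `seed` are assumptions made for illustration and do not come from the paper.

```python
import random


def fit_centroids(samples, labels, n_classes):
    """Stand-in for per-class HMM training (assumption): one mean vector per class."""
    dim = len(samples[0])
    sums = [[0.0] * dim for _ in range(n_classes)]
    counts = [0] * n_classes
    for x, y in zip(samples, labels):
        counts[y] += 1
        for d in range(dim):
            sums[y][d] += x[d]
    # Classes that received no sample in this round are simply dropped.
    return {
        c: [s / counts[c] for s in sums[c]]
        for c in range(n_classes)
        if counts[c] > 0
    }


def predict(centroids, x):
    """Assign x to the class whose centroid is closest (squared Euclidean distance)."""
    return min(
        centroids,
        key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])),
    )


def iterative_classification_similarity(samples, n_classes=3, n_rounds=200, seed=0):
    """Fraction of rounds in which each pair of samples receives the same predicted class."""
    rng = random.Random(seed)
    n = len(samples)
    together = [[0] * n for _ in range(n)]
    for _ in range(n_rounds):
        # 1. Draw random labels, ignoring any true structure in the data.
        labels = [rng.randrange(n_classes) for _ in range(n)]
        # 2. Train one classifier on these random labels.
        model = fit_centroids(samples, labels, n_classes)
        # 3. Re-classify every sample with the trained model.
        predicted = [predict(model, x) for x in samples]
        # 4. Count the pairs that end up in the same class.
        for i in range(n):
            for j in range(n):
                if predicted[i] == predicted[j]:
                    together[i][j] += 1
    return [[count / n_rounds for count in row] for row in together]
```

Calling `iterative_classification_similarity(samples)` on a list of equal-length feature vectors returns a symmetric matrix with values in [0, 1] and ones on the diagonal. A matrix of this kind could then be passed to a nearest neighbor search or a clustering algorithm, the two tasks named in the abstract.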

Keywords

zero-resource speech processing; word discovery; audio words; clustering; unsupervised learning; acoustic similarity; dynamic time warping

Domains

Artificial Intelligence [cs.AI]; Computation and Language [cs.CL]; Information Retrieval [cs.IR]

Main file

ICASSP2016.pdf (199.45 KB)

Origin: Files produced by the author(s)

Contributor: Vincent Claveau

https://hal.science/hal-01394757

Submitted on: Wednesday, November 9, 2016, 17:01:34

Last modified on: Friday, March 24, 2023, 14:53:03

Long-term archiving on: Wednesday, March 15, 2017, 04:25:31

Dates and versions

hal-01394757, version 1 (09-11-2016)

Identifiers

HAL Id: hal-01394757, version 1
DOI: 10.1109/ICASSP.2016.7472697

Cite

Amélie Royer, Guillaume Gravier, Vincent Claveau. Audio word similarity for clustering with zero resources based on iterative HMM classification. International Conference on Acoustics, Speech and Signal Processing, ICASSP, Mar 2016, Shanghai, China. pp. 5340-5344, ⟨10.1109/ICASSP.2016.7472697⟩. ⟨hal-01394757⟩

Collections

INSTITUT-TELECOM UNIV-RENNES1 CNRS INRIA INSA-RENNES IRISA CENTRALESUPELEC IRISA-D6 INRIA2 UR1-MATH-STIC UR1-UFR-ISTIC UNIV-RENNES UR1-MATH-NUM

