A K-nearest neighbours approach to unsupervised spoken term discovery

Unsupervised spoken term discovery is the task of finding recurrent acoustic patterns in speech without any annotations. Current approaches consist of two steps: (1) discovering similar patterns in speech, and (2) partitioning those pairs of acoustic tokens using graph clustering methods. We propose a new approach for the first step. Previous systems used various approximation algorithms to make the search tractable on large amounts of data. Our approach is based on an optimized k-nearest neighbours (KNN) search coupled with a fixed word embedding algorithm. The results show that the KNN algorithm is robust across languages, consistently outperforms the DTW-based baseline, and is competitive with current state-of-the-art spoken term discovery systems.
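
The first step described in the abstract (pairing similar acoustic segments through a nearest-neighbour search over fixed-size embeddings) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the embedding is a simple uniform downsampling of frames, the distance is cosine distance, the search is exhaustive rather than optimized, and the names `embed`, `cosine_distance` and `knn_pairs` are hypothetical.

```python
import math


def embed(frames, n_samples=4):
    """Map a variable-length sequence of feature frames to a fixed-size vector.

    Assumption for illustration: pick n_samples evenly spaced frames
    (n_samples >= 2) and concatenate them. The paper's embedding may differ.
    """
    if not frames:
        raise ValueError("segment must contain at least one frame")
    last = len(frames) - 1
    picked = [frames[round(i * last / (n_samples - 1))] for i in range(n_samples)]
    return [x for frame in picked for x in frame]


def cosine_distance(u, v):
    """Return 1 - cosine similarity; 1.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm if norm else 1.0


def knn_pairs(embeddings, k=1):
    """Exhaustive k-nearest-neighbour search.

    Returns (query_index, neighbour_index, distance) triples, k per query.
    """
    pairs = []
    for i, u in enumerate(embeddings):
        dists = sorted(
            (cosine_distance(u, v), j) for j, v in enumerate(embeddings) if j != i
        )
        pairs.extend((i, j, d) for d, j in dists[:k])
    return pairs


# Toy data: four "segments" of different lengths with 2-dimensional frames.
# Segments 0 and 1 resemble each other, as do segments 2 and 3.
segments = [
    [[1.0, 0.0]] * 6,
    [[1.0, 0.1]] * 9,
    [[0.0, 1.0]] * 5,
    [[0.1, 1.0]] * 8,
]
embeddings = [embed(s) for s in segments]
pairs = knn_pairs(embeddings, k=1)
for i, j, d in pairs:
    print(f"segment {i} -> nearest segment {j} (cosine distance {d:.3f})")
```

The resulting pairs would then be handed to the second step, graph clustering, which this sketch does not cover. An exhaustive search like this one scales quadratically with the number of segments, which is why an optimized KNN search matters on large corpora.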

Keywords

Unsupervised, Spoken term discovery, Word discovery, Word segmentation

Domains

Cognitive science, Linguistics, Computation and Language [cs.CL]

Main file

Thual_2018_A_K-Nearest_Neighbours_Approch_to_Unsupervised_Spoken_Term_Discovery.SLT.pdf (285 KB)

Origin: Files produced by the author(s)

Contributor: Emmanuel Dupoux

https://hal.science/hal-01947953

Submitted on: Friday, 7 December 2018, 14:35:33

Last modified on: Friday, 19 April 2024, 16:18:55

Long-term archiving on: Friday, 8 March 2019, 15:06:43

Dates and versions

hal-01947953, version 1 (07-12-2018)

Identifiers

HAL Id: hal-01947953, version 1

Cite

Alexis Thual, Corentin Dancette, Julien Karadayi, Juan Benjumea, Emmanuel Dupoux. A K-nearest neighbours approach to unsupervised spoken term discovery. IEEE Spoken Language Technology SLT-2018, Dec 2018, Athens, Greece. ⟨hal-01947953⟩

Collections

ENS-PARIS CNRS INRIA EHESS LSCP DEC INRIA2 PSL ANR
