A K-nearest neighbours approach to unsupervised spoken term discovery

Abstract : Unsupervised spoken term discovery is the task of finding recurrent acoustic patterns in speech without any annotations. Current approaches consist of two steps: (1) discovering similar patterns in speech, and (2) partitioning the resulting pairs of acoustic tokens using graph clustering methods. We propose a new approach for the first step. Previous systems used various approximation algorithms to make the search tractable on large amounts of data. Our approach is based on an optimized k-nearest neighbours (KNN) search coupled with a fixed word embedding algorithm. The results show that the KNN algorithm is robust across languages, consistently outperforms the DTW-based baseline, and is competitive with current state-of-the-art spoken term discovery systems.
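The first step described in the abstract (mapping speech segments to fixed-size vectors, then searching for nearest neighbours) can be illustrated with a minimal sketch. It is not the authors' implementation: the `embed` function below uses plain uniform downsampling as a hypothetical stand-in for the paper's fixed embedding, the search is brute force rather than optimized, and the function names, the cosine distance, and the `n_samples` parameter are all assumptions made for illustration.

```python
import math


def embed(frames, n_samples=3):
    """Map a variable-length list of feature frames to a fixed-size vector.

    Assumption: uniform downsampling (pick n_samples evenly spaced frames
    and concatenate them). Requires n_samples >= 2 and a non-empty segment.
    """
    last = len(frames) - 1
    picked = [frames[round(i * last / (n_samples - 1))] for i in range(n_samples)]
    return [x for frame in picked for x in frame]


def cosine_distance(u, v):
    """Cosine distance between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm


def knn(query, vectors, k):
    """Return the indices of the k vectors closest to query (brute force)."""
    order = sorted(range(len(vectors)), key=lambda i: cosine_distance(query, vectors[i]))
    return order[:k]


# Toy segments of different durations; each frame is a 2-dimensional feature.
seg_a = [[1.0, 0.0]] * 3
seg_b = [[1.0, 0.0]] * 5   # same pattern as seg_a, but longer
seg_c = [[0.0, 1.0]] * 3   # a different pattern

index = [embed(seg_b), embed(seg_c)]
nearest = knn(embed(seg_a), index, k=1)
```

Because every segment is reduced to a vector of the same length, segments of different durations can be compared directly, and the pairs returned by the search would then feed the graph clustering step. A brute-force search like this one scales linearly with the number of indexed segments per query, which is why an optimized KNN search matters on large corpora.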

Contributor : Emmanuel Dupoux
Submitted on : Friday, December 7, 2018 - 2:35:33 PM
Last modification on : Thursday, January 3, 2019 - 3:10:16 PM
Long-term archiving on : Friday, March 8, 2019 - 3:06:43 PM


Files produced by the author(s)


  • HAL Id : hal-01947953, version 1



Alexis Thual, Corentin Dancette, Julien Karadayi, Juan Benjumea, Emmanuel Dupoux. A K-nearest neighbours approach to unsupervised spoken term discovery. IEEE Spoken Language Technology SLT-2018, Dec 2018, Athens, Greece. ⟨hal-01947953⟩


