Unsupervised Speech Unit Discovery Using K-means and Neural Networks - HAL open archive
Conference paper. Year: 2017

Unsupervised Speech Unit Discovery Using K-means and Neural Networks

Abstract

Unsupervised discovery of sub-lexical units in speech is an active research problem. In this paper, we report experiments in which we segment speech into phones and then cluster the segments using k-means and a Convolutional Neural Network. This yields an annotation of the corpus in pseudo-phones, which in turn allows us to find pseudo-words. We compare results for two segmentations, manual and automatic. To check the portability of our approach, we compare results across three languages (English, French and Xitsonga). The originality of our work lies in using neural networks in an unsupervised way that differs from the common auto-encoder-based method for unsupervised speech unit discovery. On the Xitsonga corpus, for instance, we obtained phone-level purity scores of 46% and 42% with manual and automatic segmentation, respectively, using 30 pseudo-phones. Based on the inferred pseudo-phones, we discovered about 200 pseudo-words.
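
As a rough illustration of the clustering and evaluation steps mentioned in the abstract, the following sketch clusters a handful of segment feature vectors with a plain k-means and scores the clusters against reference phone labels using purity. Everything in it is an assumption made for illustration: the 2-D feature vectors, the phone labels, the initialization from the first k points, and the purity definition (fraction of segments that carry the majority reference label of their cluster), which is the commonly used one and may differ in detail from the paper's. The Convolutional Neural Network stage is not shown.

```python
from collections import Counter


def kmeans(points, k, n_iter=20):
    """Plain Lloyd's k-means. The first k points serve as initial
    centroids, an illustrative choice rather than the paper's setup."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centroid
        # (squared Euclidean distance).
        labels = [
            min(range(k),
                key=lambda c: sum((x - y) ** 2
                                  for x, y in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(dim) / len(members)
                                for dim in zip(*members)]
    return labels


def purity(cluster_ids, reference_labels):
    """Fraction of items whose reference label is the majority label
    of the cluster they were assigned to."""
    by_cluster = {}
    for c, ref in zip(cluster_ids, reference_labels):
        by_cluster.setdefault(c, Counter())[ref] += 1
    majority_total = sum(max(cnt.values()) for cnt in by_cluster.values())
    return majority_total / len(cluster_ids)


# Hypothetical per-segment features and reference phone labels.
segments = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2),
            (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
phones = ["a", "a", "s", "s", "s", "s"]

labels = kmeans(segments, k=2)   # pseudo-phone index per segment
score = purity(labels, phones)   # agreement with the reference phones
print(labels, round(score, 3))
```

In this toy setting the cluster indices play the role of pseudo-phones. Purity rewards clusters dominated by a single reference phone, and it tends to rise as the number of clusters grows, so scores are best compared at a fixed number of pseudo-phones.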
Main file
Manenti_22281.pdf (196.11 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-02559766, version 1 (30-04-2020)

Identifiers

  • HAL Id: hal-02559766, version 1
  • OATAO : 22281

Cite

Céline Manenti, Thomas Pellegrini, Julien Pinquier. Unsupervised Speech Unit Discovery Using K-means and Neural Networks. 5th International Conference on Statistical Language and Speech Processing (SLSP 2017), Oct 2017, Le Mans, France. pp.169-180. ⟨hal-02559766⟩
40 views
108 downloads
