Learning Word Embeddings: Unsupervised Methods for Fixed-size Representations of Variable-length Speech Segments

Nils Holzenberger; Mingxing Du; Julien Karadayi; Rachid Riad; Emmanuel Dupoux

doi:10.21437/Interspeech.2018-2364

Conference paper. Year: 2018

Nils Holzenberger

Role: Author
PersonId: 1329136
ORCID: 0000-0002-0844-1391

Johns Hopkins University

Mingxing Du

Role: Author

Laboratoire de sciences cognitives et psycholinguistique

Julien Karadayi

Role: Author

École des hautes études en sciences sociales

Apprentissage machine et développement cognitif

Rachid Riad

Role: Author

Laboratoire de sciences cognitives et psycholinguistique

Apprentissage machine et développement cognitif

Emmanuel Dupoux

Role: Author
PersonId: 757939
ORCID: 0000-0002-7814-2952

Laboratoire de sciences cognitives et psycholinguistique

Apprentissage machine et développement cognitif

Abstract

Fixed-length embeddings of words are very useful for a variety of tasks in speech and language processing. Here we systematically explore two methods of computing fixed-length embeddings for variable-length sequences. We evaluate their susceptibility to phonetic and speaker-specific variability on English, a high-resource language, and Xitsonga, a low-resource language, using two evaluation metrics: ABX word discrimination and ROC-AUC on same-different phoneme n-grams. We show that a simple downsampling method supplemented with length information can outperform the variable-length input feature representation on both evaluations. Recurrent autoencoders, trained without supervision, can yield even better results at the expense of increased computational complexity.
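To make the downsampling idea and the same-different metric concrete, here is a minimal sketch in Python with NumPy. It is an illustration under stated assumptions, not the authors' implementation: the function names `downsample_embed` and `same_different_auc`, the choice of `k = 10` evenly spaced frames, the 13-dimensional input features, and the encoding of length as a single appended dimension are all hypothetical, and the paper's exact sampling scheme and length encoding may differ.

```python
import numpy as np


def downsample_embed(features, k=10, append_length=True):
    """Map a (T, D) feature sequence to a fixed-size vector.

    Hypothetical helper: keeps k frames at evenly spaced positions and
    flattens them; optionally appends the segment length T as one extra
    dimension. The sampling scheme and length encoding are assumptions.
    """
    features = np.asarray(features, dtype=float)
    n_frames = features.shape[0]
    # k evenly spaced frame indices from the first to the last frame.
    idx = np.round(np.linspace(0, n_frames - 1, num=k)).astype(int)
    embedding = features[idx].reshape(-1)
    if append_length:
        embedding = np.append(embedding, float(n_frames))
    return embedding


def same_different_auc(distances, is_same):
    """ROC-AUC for a same-different task, given pairwise distances.

    Computed as the probability that a randomly chosen "same" pair has a
    smaller distance than a randomly chosen "different" pair, with ties
    counted as one half.
    """
    distances = np.asarray(distances, dtype=float)
    is_same = np.asarray(is_same, dtype=bool)
    same, diff = distances[is_same], distances[~is_same]
    wins = (same[:, None] < diff[None, :]).sum()
    ties = (same[:, None] == diff[None, :]).sum()
    return (wins + 0.5 * ties) / (same.size * diff.size)


# Two segments of different durations (random stand-ins for real features).
rng = np.random.default_rng(0)
short_segment = rng.normal(size=(37, 13))
long_segment = rng.normal(size=(95, 13))

e_short = downsample_embed(short_segment)
e_long = downsample_embed(long_segment)
# Both embeddings have shape (131,): 10 frames * 13 dims + 1 length value.
```

Because every segment maps to a vector of the same size, pairs of segments can be compared with an ordinary vector distance (for example cosine distance), and those distances can be passed to `same_different_auc` together with labels saying whether each pair shares the same phoneme n-gram.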

Keywords

unsupervised speech processing; audio word embeddings; ABX discrimination; same-different classification; representation learning

Domains

Linguistics; Cognitive sciences

Main file

Holzenberger_DKRD_2018_fixed_length_embeddings_for_words.Interspeech.pdf (505.44 KB)

Origin: Files produced by the author(s)

https://hal.science/hal-01888708

Submitted on: Friday, 7 December 2018, 14:36:29

Last modified on: Friday, 19 April 2024, 16:18:55

Long-term archiving on: Friday, 8 March 2019, 14:50:24

Dates and versions

hal-01888708, version 1 (07-12-2018)

Identifiers

HAL Id: hal-01888708, version 1
DOI: 10.21437/Interspeech.2018-2364

Cite

Nils Holzenberger, Mingxing Du, Julien Karadayi, Rachid Riad, Emmanuel Dupoux. Learning Word Embeddings: Unsupervised Methods for Fixed-size Representations of Variable-length Speech Segments. Interspeech 2018, Sep 2018, Hyderabad, India. ⟨10.21437/Interspeech.2018-2364⟩. ⟨hal-01888708⟩

Collections

ENS-PARIS CNRS INRIA EHESS LSCP DEC INRIA2 PSL ANR
