Robust Training of Vector Quantized Bottleneck Models

Adrian Łańcucki; Jan Chorowski; Guillaume Sanchez; Ricard Marxer; Nanxin Chen; Hans J G A Dolfing; Sameer Khurana; Tanel Alumäe; Antoine Laurent

Conference paper, Year: 2020

Adrian Łańcucki

Role: Author

University of Wrocław [Poland]

Jan Chorowski

Role: Author

University of Wrocław [Poland]

Guillaume Sanchez

Role: Author
PersonId : 1068521

DYNamiques de l’Information

Ricard Marxer

Role: Author
PersonId : 19391
IdHAL : ricard-marxer
ORCID : 0000-0001-5099-5059
IdRef : 240437713

Laboratoire d'Informatique et des Systèmes (LIS) (Marseille, Toulon)

Nanxin Chen

Role: Author

China Agricultural University

Hans J G A Dolfing

Role: Author

Sameer Khurana

Role: Author

Tanel Alumäe

Role: Author

Tallinn University of Technology

Antoine Laurent

Role: Author
PersonId : 13586
IdHAL : antoine-laurent
ORCID : 0000-0002-2653-1008
IdRef : 147099072

Laboratoire d'Informatique de l'Université du Mans

Abstract

In this paper we demonstrate methods for reliable and efficient training of discrete representations using Vector-Quantized Variational Auto-Encoder models (VQ-VAEs). Discrete latent variable models have been shown to learn nontrivial representations of speech, applicable to unsupervised voice conversion and reaching state-of-the-art performance on unit discovery tasks. For unsupervised representation learning, they have become viable alternatives to continuous latent variable models such as the Variational Auto-Encoder (VAE). However, training deep discrete variable models is challenging due to the inherent non-differentiability of the discretization operation. We focus on VQ-VAE, a state-of-the-art discrete bottleneck model shown to perform on par with its continuous counterparts. It quantizes encoder outputs with online k-means clustering. We show that codebook learning can suffer from poor initialization and non-stationarity of the clustered encoder outputs. We demonstrate that these problems can be overcome by increasing the learning rate for the codebook and by periodic data-dependent codeword re-initialization. As a result, we achieve more robust training across different tasks and significantly increase the usage of latent codewords, even for large codebooks. This has practical benefits, for instance in unsupervised representation learning, where large codebooks may lead to disentanglement of latent representations.
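
The two remedies named in the abstract (a larger codebook learning rate and periodic data-dependent codeword re-initialization) can be illustrated with a minimal pure-Python sketch. The abstract does not give the exact procedure, so everything here is an assumption made for illustration: the function names, the `codebook_lr` parameter, the choice to reset codewords with zero usage, and the rule of re-seeding them from randomly sampled encoder outputs are hypothetical and not necessarily the authors' method.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(batch, codebook):
    """Map each encoder output to the index of its nearest codeword."""
    return [min(range(len(codebook)), key=lambda k: sq_dist(v, codebook[k]))
            for v in batch]


def usage_counts(assignments, num_codewords):
    """Count how many vectors were assigned to each codeword."""
    counts = [0] * num_codewords
    for k in assignments:
        counts[k] += 1
    return counts


def kmeans_step(batch, codebook, assignments, codebook_lr):
    """Online k-means update: move each used codeword toward the mean of its
    assigned vectors. A larger codebook_lr (hypothetical name) lets codewords
    follow drifting encoder outputs more quickly."""
    for k, codeword in enumerate(codebook):
        members = [v for v, a in zip(batch, assignments) if a == k]
        if members:
            mean = [sum(col) / len(members) for col in zip(*members)]
            codebook[k] = [c + codebook_lr * (m - c)
                           for c, m in zip(codeword, mean)]


def reinit_unused(batch, codebook, counts, rng):
    """Assumed re-initialization rule: replace every codeword with zero usage
    by a randomly sampled encoder output. Returns the number of resets."""
    dead = [k for k, n in enumerate(counts) if n == 0]
    for k in dead:
        codebook[k] = list(rng.choice(batch))
    return len(dead)


# Toy encoder outputs: two tight clusters.
batch = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
# Poor initialization: only the first codeword lies near the data.
codebook = [[0.0, 0.5], [100.0, 100.0], [-100.0, -100.0]]

counts_before = usage_counts(quantize(batch, codebook), len(codebook))
num_reset = reinit_unused(batch, codebook, counts_before, random.Random(0))
assignments = quantize(batch, codebook)
counts_after = usage_counts(assignments, len(codebook))
kmeans_step(batch, codebook, assignments, codebook_lr=0.5)

print("codewords used before re-initialization:", sum(n > 0 for n in counts_before))
print("codewords reset:", num_reset)
print("codewords used after re-initialization:", sum(n > 0 for n in counts_after))
```

In this toy setting every vector initially collapses onto the single codeword near the data, so the other two are never selected. Re-seeding them from the batch places them where encoder outputs actually lie, which is the effect on codeword usage that the abstract describes. In a real model the reset would presumably run only every so many training steps, using statistics gathered over many batches.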

Keywords

VQ-VAE, k-means, discrete information bottleneck

Domains

Computation and Language [cs.CL], Artificial Intelligence [cs.AI], Neural and Evolutionary Computing [cs.NE], Machine Learning [cs.LG]

Main file

robust_vq_arxiv.pdf (449.62 KB)

Origin: Files produced by the author(s)

https://hal.science/hal-02912027

Submitted on: Wednesday, 5 August 2020, 09:41:37

Last modified on: Friday, 22 March 2024, 18:24:03

Long-term archiving on: Monday, 30 November 2020, 14:36:37

Dates and versions

hal-02912027, version 1 (05-08-2020)

Identifiers

HAL Id: hal-02912027, version 1

Cite

Adrian Łańcucki, Jan Chorowski, Guillaume Sanchez, Ricard Marxer, Nanxin Chen, et al. Robust Training of Vector Quantized Bottleneck Models. IJCNN 2020, Jul 2020, Glasgow, United Kingdom. ⟨hal-02912027⟩

Collections

UNIV-TLN CNRS UNIV-AMU UNIV-LEMANS LIUM LIS-LAB INCIAM

118 views

148 downloads
