Conference paper, Year: 2020

A Recurrent Variational Autoencoder for Speech Enhancement

Abstract

This paper presents a generative approach to speech enhancement based on a recurrent variational autoencoder (RVAE). The deep generative speech model is trained using clean speech signals only, and it is combined with a nonnegative matrix factorization noise model for speech enhancement. We propose a variational expectation-maximization algorithm where the encoder of the RVAE is fine-tuned at test time, to approximate the distribution of the latent variables given the noisy speech observations. Compared with previous approaches based on feed-forward fully-connected architectures, the proposed recurrent deep generative speech model induces a posterior temporal dynamic over the latent variables, which is shown to improve the speech enhancement results.
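To make the combination of a speech variance model with a nonnegative matrix factorization (NMF) noise model more concrete, here is a minimal NumPy sketch. It is not the authors' implementation: the speech variance `speech_var` is a random stand-in for what a trained RVAE decoder would output, the encoder fine-tuning step of the variational expectation-maximization algorithm is left out entirely, and the Itakura-Saito multiplicative updates and the Wiener-like gain are assumptions chosen as a common way to fit and use such a variance model. All function names are hypothetical.

```python
import numpy as np

def is_divergence(power, variance):
    """Itakura-Saito divergence between an observed power spectrogram and a variance model."""
    ratio = power / variance
    return float(np.sum(ratio - np.log(ratio) - 1.0))

def fit_noise_nmf(power, speech_var, rank=4, n_iter=50, seed=0, eps=1e-10):
    """Fit noise variance W @ H so that speech_var + W @ H explains `power`.

    The speech variance is held fixed here (assumption); only the NMF noise
    parameters are updated, with multiplicative updates using exponent 1/2.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = power.shape
    W = rng.random((n_freq, rank)) + eps
    H = rng.random((rank, n_frames)) + eps
    for _ in range(n_iter):
        v = speech_var + W @ H  # total variance of the noisy mixture
        W *= np.sqrt(((power / v**2) @ H.T) / ((1.0 / v) @ H.T))
        v = speech_var + W @ H  # refresh with the updated W
        H *= np.sqrt((W.T @ (power / v**2)) / (W.T @ (1.0 / v)))
    return W, H

def wiener_gain(speech_var, noise_var):
    """Wiener-like time-frequency gain (assumed enhancement step)."""
    return speech_var / (speech_var + noise_var)

# Synthetic data: a stand-in speech variance plus low-rank noise variance.
rng = np.random.default_rng(1)
n_freq, n_frames = 16, 20
speech_var = rng.random((n_freq, n_frames)) + 0.1  # placeholder for a decoder output
true_noise_var = rng.random((n_freq, 2)) @ rng.random((2, n_frames)) + 0.05
# Power of a zero-mean complex Gaussian coefficient is exponentially distributed.
power = (speech_var + true_noise_var) * rng.exponential(size=(n_freq, n_frames)) + 1e-12

W0, H0 = fit_noise_nmf(power, speech_var, n_iter=0)  # initial point only
W, H = fit_noise_nmf(power, speech_var, n_iter=50)   # same seed, after updates
gain = wiener_gain(speech_var, W @ H)
```

Multiplying `gain` element-wise with the noisy short-time Fourier transform would give a speech estimate under these assumptions. In the approach summarized above, the speech variance would additionally depend on latent variables whose distribution is approximated by the fine-tuned encoder.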
Main file
LAGH_2019.pdf (391.16 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-02329000 , version 1 (23-10-2019)
hal-02329000 , version 2 (07-02-2020)

Cite

Simon Leglaive, Xavier Alameda-Pineda, Laurent Girin, Radu Horaud. A Recurrent Variational Autoencoder for Speech Enhancement. ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), May 2020, Barcelona, Spain. ⟨hal-02329000v1⟩