Conference papers

A variance modeling framework based on variational autoencoders for speech enhancement

Simon Leglaive 1 Laurent Girin 2 Radu Horaud 1 
1 PERCEPTION - Interpretation and Modelling of Images and Videos
Inria Grenoble - Rhône-Alpes, Grenoble INP - Institut polytechnique de Grenoble - Grenoble Institute of Technology, LJK - Laboratoire Jean Kuntzmann
Abstract: In this paper, we address the problem of enhancing speech signals in noisy mixtures using a source separation approach. We explore the use of neural networks as an alternative to a popular speech variance model based on supervised non-negative matrix factorization (NMF). More precisely, we use a variational autoencoder as a speaker-independent supervised generative speech model, highlighting the conceptual similarities that this approach shares with its NMF-based counterpart. To avoid generalization issues with respect to noisy recording environments, we use a supervised model only for the target speech signal, while the noise model is based on unsupervised NMF. We develop a Monte Carlo expectation-maximization algorithm for inferring the latent variables in the variational autoencoder and estimating the unsupervised model parameters. Experiments show that the proposed method outperforms a semi-supervised NMF baseline and a state-of-the-art fully supervised deep learning approach.
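To make the idea of a "variance model" concrete, the following is a minimal sketch of how per-bin speech and noise variances are commonly turned into an enhancement filter. It is not taken from the paper: the Wiener-style gain, the function names (`nmf_variance`, `wiener_gain`, `enhance`), the matrix sizes, and the random placeholder values are all assumptions made for illustration. In particular, `speech_var` is a random stand-in for what a trained variational autoencoder decoder would output, and `W` and `H` are random stand-ins for the NMF noise parameters that the abstract says are estimated without supervision.

```python
import random

def nmf_variance(W, H):
    """Noise variance as the product of two non-negative matrices, W (F x K) and H (K x N)."""
    F, K, N = len(W), len(H), len(H[0])
    return [[sum(W[f][k] * H[k][n] for k in range(K)) for n in range(N)]
            for f in range(F)]

def wiener_gain(speech_var, noise_var):
    """Element-wise gain v_s / (v_s + v_n); lies strictly in (0, 1) for positive variances."""
    return [[vs / (vs + vn) for vs, vn in zip(row_s, row_n)]
            for row_s, row_n in zip(speech_var, noise_var)]

def enhance(mixture_stft, speech_var, noise_var):
    """Apply the gain to each complex time-frequency coefficient of the mixture."""
    gain = wiener_gain(speech_var, noise_var)
    return [[g * x for g, x in zip(row_g, row_x)]
            for row_g, row_x in zip(gain, mixture_stft)]

random.seed(0)
F, N, K = 4, 3, 2  # hypothetical sizes: frequency bins, time frames, NMF rank

# Placeholder for the speech variance a VAE decoder would produce (assumption).
speech_var = [[random.uniform(0.1, 1.0) for _ in range(N)] for _ in range(F)]

# Placeholder non-negative NMF factors for the noise model (assumption).
W = [[random.uniform(0.1, 1.0) for _ in range(K)] for _ in range(F)]
H = [[random.uniform(0.1, 1.0) for _ in range(N)] for _ in range(K)]
noise_var = nmf_variance(W, H)

# Placeholder complex STFT coefficients of the noisy mixture (assumption).
mixture = [[complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(N)]
           for _ in range(F)]

speech_est = enhance(mixture, speech_var, noise_var)
```

Because the gain is a ratio of positive variances, every estimated coefficient has a magnitude no larger than the corresponding mixture coefficient. The sketch leaves out everything that makes the proposed method distinctive, namely training the autoencoder and the Monte Carlo expectation-maximization inference.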

Contributor: Simon Leglaive
Submitted on: Thursday, July 12, 2018 - 11:24:57 AM
Last modification on: Friday, February 4, 2022 - 3:29:53 AM
Long-term archiving on: Monday, October 15, 2018 - 10:45:33 PM





Simon Leglaive, Laurent Girin, Radu Horaud. A variance modeling framework based on variational autoencoders for speech enhancement. MLSP 2018 - IEEE 28th International Workshop on Machine Learning for Signal Processing, Sep 2018, Aalborg, Denmark. pp.1-6, ⟨10.1109/MLSP.2018.8516711⟩. ⟨hal-01832826⟩


