Speech enhancement with variational autoencoders and alpha-stable distributions

Simon Leglaive 1 Umut Simsekli 2 Antoine Liutkus 3 Laurent Girin 4 Radu Horaud 1
1 PERCEPTION - Interpretation and Modelling of Images and Videos
Inria Grenoble - Rhône-Alpes, LJK - Laboratoire Jean Kuntzmann, INPG - Institut National Polytechnique de Grenoble
3 ZENITH - Scientific Data Management
LIRMM - Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier, CRISAM - Inria Sophia Antipolis - Méditerranée
4 GIPSA-CRISSP - CRISSP
GIPSA-DPC - Département Parole et Cognition
Abstract: This paper focuses on single-channel semi-supervised speech enhancement. We learn a speaker-independent deep generative speech model using the framework of variational autoencoders. The noise model remains unsupervised because we do not assume prior knowledge of the noisy recording environment. In this context, our contribution is a noise model based on alpha-stable distributions, instead of the more conventional Gaussian non-negative matrix factorization approach found in previous studies. We develop a Monte Carlo expectation-maximization algorithm for estimating the model parameters at test time. Experimental results show the superiority of the proposed approach in terms of both perceptual quality and intelligibility of the enhanced speech signal.
Document type:
Conference paper
IEEE International Conference on Acoustics Speech and Signal Processing (ICASSP), May 2019, Brighton, United Kingdom. IEEE, pp.1-5, 2019, 〈https://2019.ieeeicassp.org〉

https://hal.inria.fr/hal-02005106
Contributor: Simon Leglaive
Submitted on: Friday, February 8, 2019 - 11:25:30
Last modified on: Wednesday, February 20, 2019 - 01:28:49

File

LSLGH-icassp2019.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-02005106, version 1

Citation

Simon Leglaive, Umut Simsekli, Antoine Liutkus, Laurent Girin, Radu Horaud. Speech enhancement with variational autoencoders and alpha-stable distributions. IEEE International Conference on Acoustics Speech and Signal Processing (ICASSP), May 2019, Brighton, United Kingdom. IEEE, pp.1-5, 2019, 〈https://2019.ieeeicassp.org〉. 〈hal-02005106〉
