Conference paper, Year: 2019

Semi-supervised multichannel speech enhancement with variational autoencoders and non-negative matrix factorization

Abstract

In this paper, we address speaker-independent multichannel speech enhancement in unknown noisy environments. Our work is based on a well-established multichannel local Gaussian modeling framework. We propose to use a neural network for modeling the speech spectro-temporal content. The parameters of this supervised model are learned using the framework of variational autoencoders. The noisy recording environment is assumed to be unknown, so the noise spectro-temporal modeling remains unsupervised and is based on non-negative matrix factorization (NMF). We develop a Monte Carlo expectation-maximization algorithm, and we experimentally show that the proposed approach outperforms its NMF-based counterpart, where speech is modeled using supervised NMF.
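To make the unsupervised noise model mentioned in the abstract more concrete, here is a minimal sketch of generic single-channel NMF on a power spectrogram, using the standard multiplicative updates for the Itakura-Saito divergence. It is not the authors' implementation: the paper's method is a multichannel Monte Carlo expectation-maximization algorithm, and the function names `is_nmf` and `is_divergence`, the rank, and the iteration count are all illustrative assumptions.

```python
import numpy as np

def is_nmf(V, rank, n_iter=200, eps=1e-12, seed=0):
    """Approximate a non-negative power spectrogram V (freq x frames) by W @ H.

    Uses the commonly used multiplicative updates for the Itakura-Saito
    divergence. Hypothetical helper, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    W = rng.random((n_freq, rank)) + eps    # spectral templates
    H = rng.random((rank, n_frames)) + eps  # temporal activations
    for _ in range(n_iter):
        V_hat = W @ H + eps
        W *= ((V / V_hat**2) @ H.T) / ((1.0 / V_hat) @ H.T)
        V_hat = W @ H + eps
        H *= (W.T @ (V / V_hat**2)) / (W.T @ (1.0 / V_hat))
    return W, H

def is_divergence(V, V_hat, eps=1e-12):
    """Itakura-Saito divergence summed over all time-frequency bins."""
    R = (V + eps) / (V_hat + eps)
    return float(np.sum(R - np.log(R) - 1.0))
```

In a semi-supervised setting like the one the abstract describes, a factorization of this kind would be fitted to the noise at test time, while the speech model is trained beforehand; how the two are combined is specific to the paper and is not shown here.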
Main file
LGH-icassp2019.pdf (501.89 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-02005102, version 1 (08-02-2019)
hal-02005102, version 2 (30-04-2019)

Cite

Simon Leglaive, Laurent Girin, Radu Horaud. Semi-supervised multichannel speech enhancement with variational autoencoders and non-negative matrix factorization. IEEE International Conference on Acoustics Speech and Signal Processing (ICASSP 2019), May 2019, Brighton, United Kingdom. pp.101-105, ⟨10.1109/ICASSP.2019.8683704⟩. ⟨hal-02005102v1⟩
