Fusion methods for speech enhancement and audio source separation

Abstract: A wide variety of audio source separation techniques exist and can already tackle many challenging industrial issues. However, in contrast with other application domains, fusion principles were rarely investigated in audio source separation despite their demonstrated potential in classification tasks. In this paper, we propose a general fusion framework which takes advantage of the diversity of existing separation techniques in order to improve separation quality. We obtain new source estimates by summing the individual estimates given by different separation techniques weighted by a set of fusion coefficients. We investigate three alternative fusion methods which are based on standard non-linear optimization, Bayesian model averaging or deep neural networks. Experiments conducted for both speech enhancement and singing voice extraction demonstrate that all the proposed methods outperform traditional model selection. The use of deep neural networks for the estimation of time-varying coefficients notably leads to large quality improvements, up to 3 dB in terms of signal-to-distortion ratio (SDR) compared to model selection.
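
As a rough illustration of the weighted-sum fusion described in the abstract, the Python sketch below combines the estimates of two separation techniques with a fixed set of fusion coefficients and compares the result with model selection, which amounts to a one-hot coefficient vector. The function names, toy signals, and coefficient values are hypothetical and are not taken from the paper. The three coefficient estimation methods the paper studies are not reproduced here, and a plain squared error stands in for the SDR metric.

```python
from typing import List, Sequence


def fuse_estimates(estimates: Sequence[Sequence[float]],
                   weights: Sequence[float]) -> List[float]:
    """Return the weighted sum of several source estimates.

    estimates: one signal per separation technique, all of equal length.
    weights:   one fusion coefficient per technique (hypothetical values).
    """
    if len(estimates) != len(weights):
        raise ValueError("need exactly one fusion coefficient per estimate")
    n_samples = len(estimates[0])
    if any(len(e) != n_samples for e in estimates):
        raise ValueError("all estimates must have the same length")
    return [sum(w * e[t] for w, e in zip(weights, estimates))
            for t in range(n_samples)]


def squared_error(reference: Sequence[float],
                  estimate: Sequence[float]) -> float:
    """Sum of squared differences; a simple stand-in for SDR in this toy."""
    return sum((r - s) ** 2 for r, s in zip(reference, estimate))


# Toy data (assumption): two estimates whose errors have opposite signs.
reference = [1.0, -2.0, 3.0, 0.5]
estimate_a = [1.2, -1.8, 3.2, 0.7]   # reference + 0.2
estimate_b = [0.8, -2.2, 2.8, 0.3]   # reference - 0.2

# Model selection keeps a single technique: one-hot coefficients.
selection = fuse_estimates([estimate_a, estimate_b], [1.0, 0.0])

# Fusion uses both techniques: here, equal static coefficients.
fusion = fuse_estimates([estimate_a, estimate_b], [0.5, 0.5])

print("selection error:", squared_error(reference, selection))
print("fusion error:   ", squared_error(reference, fusion))
```

In this toy case the two estimates err in opposite directions, so averaging cancels most of the error. With real separators the gain depends on how diverse their errors are and on how well the coefficients are estimated.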
Document type:
Journal article
IEEE Transactions on Audio, Speech and Language Processing, Institute of Electrical and Electronics Engineers, 2016, 〈10.1109/TASLP.2016.2553441〉

Cited literature [55 references]

https://hal.archives-ouvertes.fr/hal-01120685
Contributor: Xabier Jaureguiberry
Submitted on: Saturday, 9 April 2016 - 11:39:20
Last modified on: Wednesday, 13 March 2019 - 11:45:46
Document(s) archived on: Tuesday, 15 November 2016 - 00:14:14

File

taslp16.pdf
Files produced by the author(s)

Citation

Xabier Jaureguiberry, Emmanuel Vincent, Gaël Richard. Fusion methods for speech enhancement and audio source separation. IEEE Transactions on Audio, Speech and Language Processing, Institute of Electrical and Electronics Engineers, 2016, 〈10.1109/TASLP.2016.2553441〉. 〈hal-01120685v4〉
