HAL open archive
Report (Research Report), Year: 2014

Fusion Methods for Audio Source Separation

French title: Méthodes de fusion pour la séparation de source audio

Abstract

A wide variety of audio source separation techniques exist and can already tackle many challenging industrial problems. However, in contrast to other application domains, fusion principles have rarely been investigated in audio source separation, despite their demonstrated potential in classification tasks. In this paper, we propose a general fusion framework that takes advantage of the diversity of existing separation techniques to improve separation quality. Our approaches obtain a new source estimate by summing the individual estimates given by different separation techniques, weighted by a set of fusion coefficients. We investigate three alternative fusion methods, based on standard non-linear optimization, Bayesian model averaging, or deep neural networks. Experiments conducted on both speech enhancement and singing-voice extraction demonstrate that the proposed methods lead to diverse separation performance, yet all outperform traditional model selection. Notably, using deep neural networks to estimate time-varying coefficients yields large quality improvements, up to +3.3 dB in signal-to-distortion ratio (SDR) compared to model selection. As such, our fusion framework is a practical and efficient way to remove the need to choose and carefully tune a separation system, and it further allows existing techniques to be adapted to given separation problems and objectives.
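
The weighted-sum principle described in the abstract can be illustrated with a minimal sketch. This is not the report's implementation: the function names `fuse` and `sdr_db`, the toy signals, and the fixed equal weights are all assumptions made for illustration. The report estimates the fusion coefficients by non-linear optimization, Bayesian model averaging, or deep neural networks, and the SDR below is a simplified energy ratio rather than necessarily the exact metric used in the experiments.

```python
import math


def fuse(estimates, weights):
    """Weighted sum of equal-length source estimates (lists of samples)."""
    if len(estimates) != len(weights):
        raise ValueError("one weight per estimate is required")
    length = len(estimates[0])
    if any(len(e) != length for e in estimates):
        raise ValueError("estimates must share the same length")
    return [
        sum(w * e[t] for w, e in zip(weights, estimates))
        for t in range(length)
    ]


def sdr_db(reference, estimate):
    """Simplified SDR in dB: 10*log10(||s||^2 / ||s - s_hat||^2)."""
    signal = sum(s * s for s in reference)
    error = sum((s - e) ** 2 for s, e in zip(reference, estimate))
    if error == 0:
        return math.inf
    return 10.0 * math.log10(signal / error)


# Toy data (hypothetical): one true source and two imperfect estimates
# whose errors are nearly uncorrelated with each other.
reference = [math.sin(0.1 * t) for t in range(200)]
est_a = [s + 0.3 * math.cos(0.7 * t) for t, s in enumerate(reference)]
est_b = [s + 0.3 * math.sin(1.3 * t) for t, s in enumerate(reference)]

# Static, equal fusion coefficients (an assumption; the report also
# considers time-varying coefficients).
fused = fuse([est_a, est_b], [0.5, 0.5])

print("SDR of estimate A:", sdr_db(reference, est_a))
print("SDR of estimate B:", sdr_db(reference, est_b))
print("SDR of fused estimate:", sdr_db(reference, fused))
```

In this toy setting the errors of the two estimates partly average out, so the fused estimate scores a higher SDR than either input. Selecting a single best estimate, as in model selection, corresponds to the special case where one weight is 1 and the others are 0.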
Main file
journal14_vs.pdf (335.23 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-01120685, version 1 (04-03-2015)
hal-01120685, version 2 (20-10-2015)
hal-01120685, version 3 (20-02-2016)
hal-01120685, version 4 (09-04-2016)

Identifiers

  • HAL Id: hal-01120685, version 1

Cite

Xabier Jaureguiberry, Emmanuel M. Vincent, Gael Richard. Fusion Methods for Audio Source Separation. [Research Report] Télécom ParisTech; Inria Nancy, équipe Multispeech. 2014. ⟨hal-01120685v1⟩