Reducing Interference with Phase Recovery in DNN-based Monaural Singing Voice Separation

Paul Magron; Konstantinos Drossos; Stylianos Ioannis Mimilakis; Tuomas Virtanen

Preprint, Working Paper. Year: 2018

Paul Magron

Role: Author
PersonId: 1085197
ORCID: 0000-0002-8561-0961

Tampere University of Technology [Tampere]

Konstantinos Drossos

Role: Author

Tampere University of Technology [Tampere]

Stylianos Ioannis Mimilakis

Role: Author

Fraunhofer Institute for Digital Media Technology [Ilmenau]

Tuomas Virtanen

Role: Author

Tampere University of Technology [Tampere]

Abstract

State-of-the-art methods for monaural singing voice separation estimate the magnitude spectrum of the voice in the short-time Fourier transform (STFT) domain by means of deep neural networks (DNNs). The resulting magnitude estimate is then combined with the mixture's phase to retrieve the complex-valued STFT of the voice, which is then synthesized into a time-domain signal. However, when the sources overlap in time and frequency, the STFT phase of the voice differs from the mixture's phase, which results in interference and artifacts in the estimated signals. In this paper, we investigate recent phase recovery algorithms that tackle this issue and can further enhance the separation quality. These algorithms exploit phase constraints that originate from a sinusoidal model or from consistency, a property that is a direct consequence of the STFT redundancy. Experiments conducted on real music songs show that these algorithms are effective at reducing interference in the estimated voice compared to the baseline approach.
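The baseline step described in the abstract (pairing a magnitude estimate with the mixture's phase) can be illustrated on a single time-frequency bin. The following is a minimal sketch, not the authors' implementation: the function names and the bin values are hypothetical, and the voice magnitude is assumed to be known exactly (an oracle) so that any remaining error comes from the phase alone.

```python
import cmath

def combine_with_mixture_phase(voice_magnitude, mixture_bin):
    """Baseline estimate for one bin: given magnitude, mixture's phase."""
    return cmath.rect(voice_magnitude, cmath.phase(mixture_bin))

def baseline_error(voice_bin, accompaniment_bin):
    """Absolute error of the baseline estimate in one time-frequency bin.

    The true voice magnitude is used (oracle assumption), so the error
    is caused only by borrowing the phase of the mixture.
    """
    mixture_bin = voice_bin + accompaniment_bin
    estimate = combine_with_mixture_phase(abs(voice_bin), mixture_bin)
    return abs(estimate - voice_bin)

# Hypothetical complex STFT values for a single bin.
voice = cmath.rect(1.0, 0.3)

# No overlap: the mixture equals the voice, so the phases coincide.
no_overlap = baseline_error(voice, 0j)

# Overlap: the accompaniment is active in the same bin and shifts the phase.
overlap = baseline_error(voice, cmath.rect(0.8, 2.0))

print(f"error without overlap: {no_overlap:.3e}")
print(f"error with overlap:    {overlap:.3e}")
```

Even with a perfect magnitude, the estimate deviates from the true voice bin as soon as another source is active in that bin. This is the gap that phase recovery algorithms aim to close.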

Keywords

phase recovery; deep neural networks; MaD TwinNet; Wiener filtering; monaural singing voice separation

Domains

Signal and Image Processing [eess.SP]

Main file

phase-recovery-dnn (1).pdf (1.07 MB)

https://hal.science/hal-01741278

Submitted on: Friday, March 23, 2018, 09:31:08

Last modified on: Friday, February 2, 2024, 16:55:28

Long-term archiving on: Thursday, September 13, 2018, 07:59:38

Dates and versions

hal-01741278, version 1 (23-03-2018)

hal-01741278, version 2 (15-06-2018)

Identifiers

HAL Id: hal-01741278, version 1

Cite

Paul Magron, Konstantinos Drossos, Stylianos Ioannis Mimilakis, Tuomas Virtanen. Reducing Interference with Phase Recovery in DNN-based Monaural Singing Voice Separation. 2018. ⟨hal-01741278v1⟩
