Harmonic-Percussive Source Separation with Deep Neural Networks and Phase Recovery

Abstract : Harmonic/percussive source separation (HPSS) consists in separating the pitched instruments from the percussive parts in a music mixture. In this paper, we propose to apply the recently introduced Masker-Denoiser with twin networks (MaD TwinNet) system to this task. MaD TwinNet is a deep learning architecture that has reached state-of-the-art results in monaural singing voice separation. Herein, we use it to estimate the magnitude spectrogram of the percussive source. Then, we retrieve the complex-valued short-time Fourier transform (STFT) of the sources by means of a phase recovery algorithm, which minimizes the reconstruction error and enforces the phase of the harmonic part to follow a sinusoidal phase model. Experiments conducted on realistic music mixtures show that this novel separation system outperforms the previous state-of-the-art kernel additive model approach.
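
The two stages described in the abstract (a magnitude estimate for the percussive source, followed by phase recovery for both sources) can be illustrated with a minimal NumPy sketch. It is not the authors' algorithm. It assumes the percussive magnitude is already available (in the paper it comes from MaD TwinNet), takes the harmonic magnitude as the non-negative remainder of the mixture magnitude, propagates the harmonic phase with the standard sinusoidal relation (a phase advance of 2π · hop · ν between frames, with ν a normalized frequency), and splits the remaining mixing error equally between the two sources in a single step. The names `hpss_sketch`, `sinusoidal_phase_advance`, `mag_perc`, `norm_freq` and `hop` are hypothetical.

```python
import numpy as np


def sinusoidal_phase_advance(phase_prev, norm_freq, hop):
    """Phase of a sinusoid one frame later: previous phase + 2*pi*hop*nu."""
    return phase_prev + 2.0 * np.pi * hop * norm_freq


def hpss_sketch(X, mag_perc, norm_freq, hop):
    """Illustrative two-source reconstruction (assumption, not the paper's method).

    X         : complex mixture STFT, shape (n_freq, n_frames)
    mag_perc  : estimated percussive magnitude, same shape
    norm_freq : normalized frequency per bin and frame (cycles/sample), same shape
    hop       : hop size in samples
    """
    mag_mix = np.abs(X)
    # Clip so the harmonic magnitude (the remainder) is never negative.
    mag_perc = np.minimum(mag_perc, mag_mix)
    mag_harm = mag_mix - mag_perc

    # Percussive source: keep the mixture phase.
    P = mag_perc * np.exp(1j * np.angle(X))

    # Harmonic source: start from the mixture phase in the first frame,
    # then propagate it frame by frame with the sinusoidal model.
    phase_h = np.angle(X).copy()
    for t in range(1, X.shape[1]):
        phase_h[:, t] = sinusoidal_phase_advance(phase_h[:, t - 1],
                                                 norm_freq[:, t], hop)
    H = mag_harm * np.exp(1j * phase_h)

    # Split the mixing error equally so that the estimates sum to the mixture.
    error = X - (H + P)
    return H + error / 2.0, P + error / 2.0


# Toy usage on random data (no audio involved).
rng = np.random.default_rng(0)
X_demo = rng.standard_normal((5, 8)) + 1j * rng.standard_normal((5, 8))
mag_perc_demo = 0.5 * np.abs(X_demo)
norm_freq_demo = rng.uniform(0.0, 0.5, size=(5, 8))
H_demo, P_demo = hpss_sketch(X_demo, mag_perc_demo, norm_freq_demo, hop=4)
```

In this sketch the equal split of the error guarantees that the two estimates add up to the mixture, but it also changes their magnitudes; an iterative procedure that alternates between restoring the magnitudes and redistributing the error would stay closer to the estimated spectrograms.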
Document type :
Preprints, Working Papers, ...

Cited literature : 36 references

https://hal.archives-ouvertes.fr/hal-01812225
Contributor : Paul Magron
Submitted on : Tuesday, July 24, 2018 - 10:03:17 AM
Last modification on : Wednesday, June 17, 2020 - 11:30:03 AM
Document(s) archived on : Thursday, October 25, 2018 - 1:17:40 PM

File

hpss-dnns-phase.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01812225, version 2

Citation

Konstantinos Drossos, Paul Magron, Stylianos Ioannis Mimilakis, Tuomas Virtanen. Harmonic-Percussive Source Separation with Deep Neural Networks and Phase Recovery. 2018. ⟨hal-01812225v2⟩

Metrics

Record views : 147
File downloads : 318