Conference paper, Year: 2021

Phase recovery with Bregman divergences for audio source separation

Paul Magron
Pierre-Hugo Vial
  • Role: Author
  • PersonId : 1113887
  • IdRef : 27051290X

Abstract

Time-frequency audio source separation is usually achieved by estimating the short-time Fourier transform (STFT) magnitude of each source, and then applying a phase recovery algorithm to retrieve time-domain signals. In particular, the multiple input spectrogram inversion (MISI) algorithm has shown good performance in several recent works. This algorithm minimizes a quadratic reconstruction error between magnitude spectrograms. However, this loss does not properly account for some perceptual properties of audio, and alternative discrepancy measures such as beta-divergences have been preferred in many settings. In this paper, we propose to reformulate phase recovery in audio source separation as a minimization problem involving Bregman divergences. To optimize the resulting objective, we derive a projected gradient descent algorithm. Experiments conducted on a speech enhancement task show that this approach outperforms MISI for several alternative losses, which highlights their relevance for audio source separation applications.
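
The beta-divergences mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function name `beta_divergence` and the toy magnitude values are assumptions made for illustration, and only the standard textbook definition of the beta-divergence is used. It covers the Itakura-Saito (beta = 0), Kullback-Leibler (beta = 1), and general cases, where beta = 2 gives half the squared Euclidean distance, i.e. a quadratic loss.

```python
import math


def beta_divergence(x, y, beta):
    """Sum of element-wise beta-divergences d_beta(x | y).

    x and y are sequences of strictly positive numbers of equal length
    (for example, magnitude spectrogram entries).
    """
    total = 0.0
    for xi, yi in zip(x, y):
        if beta == 1:
            # Kullback-Leibler divergence (generalized form)
            total += xi * math.log(xi / yi) - xi + yi
        elif beta == 0:
            # Itakura-Saito divergence
            total += xi / yi - math.log(xi / yi) - 1.0
        else:
            # General case; beta = 2 gives 0.5 * (xi - yi) ** 2
            total += (
                xi ** beta
                + (beta - 1) * yi ** beta
                - beta * xi * yi ** (beta - 1)
            ) / (beta * (beta - 1))
    return total


# Hypothetical toy values standing in for target and estimated magnitudes.
target = [1.0, 2.0, 0.5]
estimate = [0.8, 2.5, 0.5]

for beta in (0, 1, 2):
    print(beta, beta_divergence(target, estimate, beta))
```

Each of these divergences is zero when the two inputs coincide and positive otherwise, which is what makes them usable as alternative discrepancy measures between magnitude spectrograms.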
Main file
main_arxiv.pdf (424.54 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-03049800 , version 1 (10-12-2020)
hal-03049800 , version 2 (09-02-2021)

Cite

Paul Magron, Pierre-Hugo Vial, Thomas Oberlin, Cédric Févotte. Phase recovery with Bregman divergences for audio source separation. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Jun 2021, Toronto, Canada. ⟨hal-03049800v2⟩