Multichannel audio source separation: variational inference of time-frequency sources from time-domain observations

Abstract : Many methods for multichannel audio source separation are based on probabilistic approaches in which the sources are modeled as latent random variables in a time-frequency (TF) domain. For reverberant mixtures, most methods approximate the time-domain convolutive mixing process in the TF domain, assuming short mixing filters. The TF latent sources are then inferred from the TF mixture observations. In this paper, we propose to infer the latent TF sources from the time-domain observations. This approach allows us to model the convolutive mixing process exactly. The inference procedure relies on a variational expectation-maximization algorithm. In significant reverberation conditions, we show that our approach leads to a Signal-to-Distortion Ratio improvement of 5.5 dB.
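
As an illustration of the time-domain convolutive mixing process mentioned in the abstract, here is a minimal NumPy sketch of the forward mixing model only, x_i(t) = sum_j (a_ij * s_j)(t). It is not the authors' implementation and does not cover the variational expectation-maximization inference. The function name convolutive_mix, the array shapes, and the toy exponentially decaying filters are assumptions made for this example.

```python
import numpy as np


def convolutive_mix(sources, filters):
    """Exact time-domain convolutive mixture (hypothetical helper).

    sources: array of shape (J, T), J source signals of T samples each.
    filters: array of shape (I, J, L), filter from source j to microphone i.
    Returns an array of shape (I, T + L - 1).
    """
    sources = np.asarray(sources, dtype=float)
    filters = np.asarray(filters, dtype=float)
    n_mics, n_src, filt_len = filters.shape
    if sources.shape[0] != n_src:
        raise ValueError("filters and sources disagree on the number of sources")
    n_samples = sources.shape[1]
    mixture = np.zeros((n_mics, n_samples + filt_len - 1))
    for i in range(n_mics):
        for j in range(n_src):
            # Full linear convolution: no short-filter approximation is made.
            mixture[i] += np.convolve(filters[i, j], sources[j])
    return mixture


# Toy data (assumed for illustration): 2 sources, 2 microphones, 64-tap filters.
rng = np.random.default_rng(0)
s = rng.standard_normal((2, 1000))
a = rng.standard_normal((2, 2, 64)) * np.exp(-np.arange(64) / 16.0)
x = convolutive_mix(s, a)
```

The mixture is longer than the sources by the filter length minus one sample, which is the reverberation tail that a per-frequency-bin multiplicative approximation cannot represent once the filters are long relative to the analysis window.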
Document type :
Conference papers

Cited literature [24 references]

https://hal.archives-ouvertes.fr/hal-01416347
Contributor : Roland Badeau
Submitted on : Wednesday, January 11, 2017 - 12:19:43 PM
Last modification on : Thursday, October 17, 2019 - 12:36:10 PM
Long-term archiving on : Friday, April 14, 2017 - 1:38:31 PM

File

LeglaiveBadeauRichard.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01416347, version 1

Citation

Simon Leglaive, Roland Badeau, Gaël Richard. Multichannel audio source separation: variational inference of time-frequency sources from time-domain observations. 42nd International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, Mar 2017, New Orleans, LA, United States. ⟨hal-01416347⟩
