Two Multimodal Approaches for Single Microphone Source Separation

Abstract : This paper addresses single microphone source separation via Nonnegative Matrix Factorization (NMF) by exploiting video information. The audio and video modalities produced by a single speaker usually vary together over time: a change in one usually corresponds to a change in the other. The activation coefficient matrices of their NMF decompositions are therefore expected to be similar. Based on this similarity, the activation coefficient matrix of the video modality is used as an initialization for audio source separation via NMF. In addition, the same similarity is used as a post-processing step to cluster the rows of the activation coefficient matrix obtained from randomly initialized NMF. Simulation results confirm the effectiveness of the proposed multimodal approaches in single microphone source separation.
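
The two ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a Euclidean cost with standard multiplicative updates, synthetic nonnegative data, audio and video features sampled on the same frame grid, and a simple correlation rule for the clustering step. The names `nmf`, `assign_rows`, and all data shapes are hypothetical.

```python
import numpy as np

def nmf(V, rank, H_init=None, n_iter=200, eps=1e-9, seed=0):
    """Euclidean NMF, V ~ W @ H, via multiplicative updates (assumed cost)."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + eps
    if H_init is None:
        H = rng.random((rank, V.shape[1])) + eps
    else:
        # The addition creates a new array, so the caller's data is untouched.
        H = np.asarray(H_init, dtype=float) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def assign_rows(H_audio, H_video_per_speaker):
    """Label each audio activation row with the speaker whose video
    activation rows correlate best with it (hypothetical rule)."""
    labels = []
    for row in H_audio:
        scores = [max(abs(np.corrcoef(row, v)[0, 1]) for v in Hv)
                  for Hv in H_video_per_speaker]
        labels.append(int(np.argmax(scores)))
    return np.array(labels)

# Synthetic stand-ins: each speaker's audio and video share the same activations.
rng = np.random.default_rng(1)
r, T = 3, 120
H1, H2 = rng.random((r, T)), rng.random((r, T))
video1 = rng.random((20, r)) @ H1
video2 = rng.random((20, r)) @ H2
mixture = rng.random((64, r)) @ H1 + rng.random((64, r)) @ H2

# Video activations, one NMF per speaker.
_, Hv1 = nmf(video1, r)
_, Hv2 = nmf(video2, r)

# Approach 1: initialize the audio activations with the stacked video activations.
W, H = nmf(mixture, 2 * r, H_init=np.vstack([Hv1, Hv2]))
source1_est = W[:, :r] @ H[:r]
source2_est = W[:, r:] @ H[r:]
rel_err = np.linalg.norm(mixture - W @ H) / np.linalg.norm(mixture)

# Approach 2: random initialization, then cluster the rows using the video activations.
W_rand, H_rand = nmf(mixture, 2 * r, seed=2)
labels = assign_rows(H_rand, [Hv1, Hv2])
```

In the first approach, the row order of `H` is inherited from the initialization, so the first `r` components are attributed to speaker 1 and the rest to speaker 2. In the second approach, `labels` decides which components of `W_rand` and `H_rand` are summed to reconstruct each speaker.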
Document type :
Conference papers

https://hal.archives-ouvertes.fr/hal-01400542
Contributor : Bertrand Rivet
Submitted on : Tuesday, November 22, 2016 - 10:25:45 AM
Last modification on : Monday, July 8, 2019 - 3:08:11 PM
Long-term archiving on : Tuesday, March 21, 2017 - 12:34:22 AM

File

1570251892.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01400542, version 1

Citation

Farnaz Sedighin, Massoud Babaie-Zadeh, Bertrand Rivet, Christian Jutten. Two Multimodal Approaches for Single Microphone Source Separation. 24th European Signal Processing Conference (EUSIPCO 2016), Aug 2016, Budapest, Hungary. pp.110-114. ⟨hal-01400542⟩
