Conference papers

"Sparsification" of audio signals using the MDCT/IntMDCT and a psychoacoustic model - Application to informed audio source separation

Abstract : Sparse representations have proved a very useful tool in a variety of domains, e.g. speech/music source separation. As strictly sparse representations (in the sense of l0) are often impossible to achieve, other ways of studying signal sparsity have been proposed. In this paper, we revisit the irrelevance filtering analysis-synthesis approach proposed in (Balazs et al., IEEE Trans. ASLP, 18(1), 2010), where the time-frequency (TF) coefficients that are below some masking threshold are set to zero. Instead of using the Gabor transform and a specific psychoacoustic model, we use tools directly inspired by perceptual audio coding, for instance MPEG-AAC. We show that significantly better "sparsification performance" is obtained on music signals, at lower computational cost. We then apply the sparsification process to the informed source separation (ISS) problem and show that it significantly decreases the computational cost at the ISS decoder.
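The thresholding step described in the abstract (zeroing TF coefficients that fall below a masking threshold) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names are hypothetical, the MDCT is a direct O(N^2) textbook form with a sine window, and `toy_masking_threshold` is a flat placeholder level that merely stands in for a real psychoacoustic model such as the one used in MPEG-AAC.

```python
import math


def mdct_frame(frame):
    """Direct O(N^2) MDCT of one frame of 2N samples, using a sine window.

    Returns N coefficients. Illustrative only; real codecs use fast algorithms.
    """
    two_n = len(frame)
    n_half = two_n // 2
    windowed = [
        x * math.sin(math.pi * (n + 0.5) / two_n) for n, x in enumerate(frame)
    ]
    return [
        sum(
            windowed[n]
            * math.cos(math.pi / n_half * (n + 0.5 + n_half / 2) * (k + 0.5))
            for n in range(two_n)
        )
        for k in range(n_half)
    ]


def toy_masking_threshold(coeffs, offset_db=30.0):
    """ASSUMPTION: flat threshold offset_db below the frame's peak magnitude.

    This is a placeholder, NOT a psychoacoustic model.
    """
    peak = max(abs(c) for c in coeffs)
    level = peak * 10.0 ** (-offset_db / 20.0)
    return [level] * len(coeffs)


def sparsify(coeffs, thresholds):
    """Set to zero every coefficient whose magnitude is below its threshold."""
    return [c if abs(c) >= t else 0.0 for c, t in zip(coeffs, thresholds)]


def zero_fraction(coeffs):
    """Fraction of coefficients that are exactly zero."""
    return sum(1 for c in coeffs if c == 0.0) / len(coeffs)


if __name__ == "__main__":
    # One 64-sample frame of a pure tone (arbitrary test signal).
    frame = [math.sin(2.0 * math.pi * 5.0 * n / 64.0) for n in range(64)]
    coeffs = mdct_frame(frame)
    sparse = sparsify(coeffs, toy_masking_threshold(coeffs))
    print("fraction of zeroed coefficients:", zero_fraction(sparse))
```

In this sketch the fraction of zeroed coefficients serves as a simple sparsity measure; the paper's actual thresholds and evaluation procedure may differ.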
Cited literature : 15 references

https://hal.archives-ouvertes.fr/hal-00695730
Contributor : Laurent Girin
Submitted on : Wednesday, May 9, 2012 - 4:50:22 PM
Last modification on : Tuesday, December 8, 2020 - 10:41:07 AM
Long-term archiving on : Friday, November 30, 2012 - 11:30:21 AM

File

AES42_JP_LG.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-00695730, version 1

Citation

Jonathan Pinel, Laurent Girin. "Sparsification" of audio signals using the MDCT/IntMDCT and a psychoacoustic model - Application to informed audio source separation. AES 42nd International Conference: Semantic Audio, Jul 2011, Ilmenau, Germany. pp.179-188. ⟨hal-00695730⟩

Metrics

Record views

256

File downloads

265