Auditory time-frequency masking: Psychoacoustical measures and application to the analysis-synthesis of sound signals

Abstract : Many audio applications, such as sound analysis-synthesis tools and audio codecs, require signal representations that support the analysis, processing, and synthesis of non-stationary signals. Most rely on time-frequency (TF) representations, such as the Gabor and wavelet transforms, which decompose any real-world sound into a set of elementary functions (or “atoms”) that are well localized in the TF domain. With the aim of adapting these representations to human auditory perception, the present study investigated auditory masking in the TF domain. Masking has been extensively investigated with simultaneous (frequency masking) and non-simultaneous (temporal masking) presentation of masker and target, but only a few studies have examined the TF relations of masking between masker and target. Because those studies involved stimuli that are not maximally compact in the TF plane (i.e., they were temporally and/or spectrally broad), their results are not suitable for predicting masking effects between TF atoms. In this study, we investigated auditory TF masking with masker and target signals having minimum spread in the TF plane, namely Gaussian-shaped sinusoids (referred to as Gaussians). The masker had a carrier frequency of 4 kHz and a level of 60 dB SL. Masker and target were separated in frequency, in time, or both. The results of the TF conditions provide the TF spread of masking for stimuli that are maximally concentrated in the TF domain. The results of the simultaneous and non-simultaneous conditions showed that a simple superposition of frequency and temporal masking functions does not accurately represent the measured TF masking function for Gaussian maskers. Two additional experiments examined the effects of masker level and masker frequency in simultaneous conditions. Decreasing the masker level from 60 to 30 dB SL resulted in a reversal of the asymmetry of the masking patterns and a narrowing of the frequency spread of masking. The frequency spread of masking at 0.75 kHz was similar to that obtained at 4 kHz when compared on an ERB scale, which is compatible with the constant-Q frequency analysis performed by the human auditory system. Finally, a first attempt was made to implement the gathered masking data in a sound signal processing algorithm that removes the perceptually irrelevant atoms from the TF representations of audio signals. Potential applications of such an approach include audio codecs and sound analysis-synthesis tools.
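To make the comparison "on an ERB scale" concrete, the short Python sketch below converts frequencies in Hz to ERB units and back. It assumes the widely used Glasberg and Moore (1990) approximation of the equivalent rectangular bandwidth; the abstract does not state which ERB formula the thesis uses, and the function names and the 2-ERB offset are hypothetical choices made purely for illustration.

```python
import math


def erb_bandwidth_hz(f_hz):
    """ERB (in Hz) of the auditory filter centred at f_hz.

    Assumption: Glasberg and Moore (1990) approximation.
    """
    return 24.7 * (4.37 * f_hz / 1000.0 + 1.0)


def erb_number(f_hz):
    """Position of f_hz on the ERB-number scale (same assumed formula)."""
    return 21.4 * math.log10(4.37 * f_hz / 1000.0 + 1.0)


def erb_number_to_hz(n_erb):
    """Inverse of erb_number: ERB-number back to frequency in Hz."""
    return (10.0 ** (n_erb / 21.4) - 1.0) * 1000.0 / 4.37


def shift_by_erb(f_hz, delta_erb):
    """Frequency lying delta_erb ERB units above (or below) f_hz."""
    return erb_number_to_hz(erb_number(f_hz) + delta_erb)


if __name__ == "__main__":
    # The two masker frequencies mentioned in the abstract.
    for masker_hz in (750.0, 4000.0):
        bandwidth = erb_bandwidth_hz(masker_hz)
        q_factor = masker_hz / bandwidth  # quality factor of the auditory filter
        target_hz = shift_by_erb(masker_hz, 2.0)  # illustrative +2 ERB offset
        print(
            f"masker {masker_hz:7.1f} Hz | ERB {bandwidth:6.1f} Hz | "
            f"Q {q_factor:4.2f} | +2 ERB target {target_hz:7.1f} Hz"
        )
```

Under this assumed formula, the same offset in ERB units corresponds to a much larger offset in Hz at 4 kHz than at 0.75 kHz, while the ratio of centre frequency to bandwidth changes far less. This is the sense in which similar masking spreads on an ERB scale are compatible with an approximately constant-Q analysis.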

https://tel.archives-ouvertes.fr/tel-00553006
Contributor : Thibaud Necciari
Submitted on : Thursday, January 6, 2011 - 12:33:06 PM
Last modification on : Monday, March 4, 2019 - 2:04:04 PM
Long-term archiving on : Thursday, April 7, 2011 - 2:55:12 AM

Identifiers

  • HAL Id : tel-00553006, version 1

Citation

Thibaud Necciari. Auditory time-frequency masking: Psychoacoustical measures and application to the analysis-synthesis of sound signals. Acoustics [physics.class-ph]. Université de Provence - Aix-Marseille I, 2010. English. ⟨tel-00553006⟩
