Modelling and transformation of sound textures and environmental sounds

Abstract : The processing of environmental sounds has become an important topic in many areas. Environmental sounds consist largely of a class of sounds called sound textures, which are usually non-sinusoidal, noisy and stochastic. Several studies have suggested that humans recognize sound textures through statistics that characterize the envelopes of auditory critical bands. Existing synthesis algorithms can impose some of these statistical properties to a certain extent, but most of them are computationally intensive. We propose a new analysis-synthesis framework that combines a statistical description built from perceptually important statistics with an efficient mechanism for adapting statistics in the time-frequency domain. The quality of the resynthesised sound is at least as good as that of state-of-the-art methods, while the computation time is lower. The statistical description is based on the STFT. If certain conditions are met, it can also be adapted to other filter-bank-based time-frequency representations (TFR). The adaptation of statistics exploits the connection between the statistics on the TFR and the spectra of the time-frequency domain coefficients. It is possible to adapt only a part of the cross-correlation functions, which lets the synthesis process focus on important statistics and ignore irrelevant ones, providing extra flexibility. The proposed algorithm opens several perspectives. It could be used to generate unseen sound textures from artificially created statistical descriptions. It could also serve as a basis for transformations such as stretching or morphing. Finally, the model could be used to explore semantic control of sound textures.
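The connection the abstract mentions between statistics on a time-frequency representation and the spectra of its coefficients can be illustrated with the standard cross-correlation theorem: the circular cross-correlation of two sequences equals the inverse DFT of one spectrum multiplied by the conjugate of the other. The sketch below is only an illustration under that assumption and is not the thesis's implementation. The function names and the two toy "sub-band envelope" sequences are hypothetical.

```python
import cmath

def dft(x):
    """Naive discrete Fourier transform of a sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spec):
    """Naive inverse discrete Fourier transform."""
    n = len(spec)
    return [sum(spec[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def xcorr_direct(a, b):
    """Circular cross-correlation: r[lag] = sum_t a[t] * conj(b[t - lag])."""
    n = len(a)
    return [sum(complex(a[t]) * complex(b[(t - lag) % n]).conjugate()
                for t in range(n))
            for lag in range(n)]

def xcorr_spectral(a, b):
    """Same quantity computed from the spectra: IDFT(A * conj(B))."""
    spec_a, spec_b = dft(a), dft(b)
    return idft([sa * sb.conjugate() for sa, sb in zip(spec_a, spec_b)])

# Hypothetical toy envelopes: env_b is env_a circularly delayed by 2 frames.
env_a = [1.0, 2.0, 0.5, 0.0, 1.5, 0.2, 0.9, 1.1]
env_b = [env_a[(t - 2) % len(env_a)] for t in range(len(env_a))]

r_direct = xcorr_direct(env_a, env_b)
r_spectral = xcorr_spectral(env_a, env_b)

# The lag with the largest correlation; a delay of 2 frames in env_b
# shows up at lag -2, i.e. lag 6 modulo the length 8.
peak_lag = max(range(len(r_spectral)), key=lambda lag: r_spectral[lag].real)
```

Because the two computations agree, a correlation function can in principle be changed by modifying the spectra of the coefficients. That is the general idea the abstract alludes to, although the actual adaptation procedure is described in the thesis itself.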

Contributor : Abes Star
Submitted on : Friday, July 28, 2017 - 11:01:33 PM
Last modification on : Saturday, December 21, 2019 - 3:30:47 AM
Long-term archiving on: Friday, January 26, 2018 - 11:23:02 PM


  • HAL Id : tel-01263988, version 2


Wei-Hsiang Liao. Modelling and transformation of sound textures and environmental sounds. Sound [cs.SD]. Université Pierre et Marie Curie - Paris VI; National Cheng Kung University (Taiwan), 2015. English. ⟨NNT : 2015PA066725⟩. ⟨tel-01263988v2⟩


