Conference papers

WAV2SHAPE: Hearing the Shape of A Drum Machine

Han Han (1), Vincent Lostanlen (2, 3)
(1) EX-SITU - Extreme Situated Interaction
Inria Saclay - Ile de France, LISN - Laboratoire Interdisciplinaire des Sciences du Numérique, IaH - Interaction avec l'Humain
(3) SIMS - Signal, IMage et Son
LS2N - Laboratoire des Sciences du Numérique de Nantes
Abstract: The timbre of a percussive instrument, such as a drum or a bell, results from a complex interaction between a source and a resonator. The former, denoting the playing technique of the musician, has relatively few degrees of freedom but may change arbitrarily during a musical performance. The latter, representing the instrument as an inert object, encodes the response of a dynamical system with many degrees of freedom yet without any non-stationarity or memory effects. Disentangling and recovering these physical factors from a few instances is a challenging inverse problem in audio signal processing, with numerous applications in musical acoustics as well as structural engineering. We propose to address this problem via a combination of domain-specific knowledge and supervised machine learning. We start by synthesizing a dataset of sounds using the functional transformation method (FTM), a well-studied physical modeling sound synthesis approach that allows flexible control over both the physical properties and the excitation of basic shapes. The FTM formulates the motion of the resonator as a system of partial differential equations with boundary conditions. Then, it computes the resulting sound using additive modal synthesis. To invert this procedure, we represent each percussive sound in a time-invariant feature space by extracting wavelet scattering transform coefficients, and we estimate the physical parameters of both the source and the resonator by multidimensional regression with a neural network. Because our model is differentiable from the waveform domain to the space of physical control parameters, it can be employed as a virtual analog synthesizer without explicit specification of hardware components.
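As an illustration of the additive modal synthesis step mentioned in the abstract, the following is a minimal sketch in Python. It is not the FTM itself and not the authors' implementation: it only sums exponentially decaying sinusoids, one per mode. The function name `modal_synthesis` and all mode frequencies, decay rates, and amplitudes are hypothetical values chosen for illustration; in the FTM these quantities would be derived from the physical parameters of the resonator.

```python
import math


def modal_synthesis(freqs_hz, decays_per_s, amps, duration_s=1.0, sample_rate=22050):
    """Additive modal synthesis: sum of exponentially decaying sinusoids.

    Each mode k contributes amps[k] * exp(-decays_per_s[k] * t) * sin(2*pi*freqs_hz[k]*t).
    This is a generic sketch, not the FTM derivation of the modes.
    """
    n_samples = int(duration_s * sample_rate)
    signal = [0.0] * n_samples
    for freq, decay, amp in zip(freqs_hz, decays_per_s, amps):
        for n in range(n_samples):
            t = n / sample_rate
            signal[n] += amp * math.exp(-decay * t) * math.sin(2.0 * math.pi * freq * t)
    return signal


# Hypothetical mode parameters (assumed for illustration, not taken from the paper).
modes_hz = [180.0, 287.0, 385.0]   # modal frequencies in Hz
decays = [6.0, 9.0, 14.0]          # decay rates in 1/s; higher modes fade faster here
amps = [1.0, 0.6, 0.4]             # modal amplitudes, which depend on the excitation

drum = modal_synthesis(modes_hz, decays, amps, duration_s=0.5, sample_rate=8000)
```

In this picture, the resonator determines the frequencies and decay rates, while the source (the strike) mostly shapes the amplitudes. The inverse problem described in the abstract amounts to recovering the underlying physical parameters from the waveform alone.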
Contributor: Claude Inserra
Submitted on: Wednesday, May 26, 2021 - 1:20:08 PM
Last modification on: Friday, August 5, 2022 - 2:54:51 PM
Long-term archiving on: Friday, August 27, 2021 - 7:32:08 PM


Han Han, Vincent Lostanlen. WAV2SHAPE: Hearing the Shape of A Drum Machine. Forum Acusticum, Dec 2020, Lyon, France. pp.647-654, ⟨10.48465/fa.2020.0087⟩. ⟨hal-03234049⟩


