Conference papers

StyleWaveGAN: Style-based synthesis of drum sounds using generative adversarial networks for higher audio quality

Antoine Lavault (1, 2), Axel Roebel (2), Matthieu Voiry (1)
2: Analyse et synthèse sonores [Paris], STMS - Sciences et Technologies de la Musique et du Son
Abstract : In this paper we introduce StyleWaveGAN, a style-based drum sound generator that is a variation of StyleGAN, a state-of-the-art image generator. By conditioning StyleWaveGAN on the type of drum, we are able to synthesize waveforms faster than real time on a GPU, directly in CD quality and up to a duration of 1.5 s, while retaining some control over the generation. We also introduce an alternative to the progressive growing of GANs and experiment with the effect of dataset balancing for generative tasks. The experiments are carried out on an augmented subset of a publicly available dataset composed of different drums and cymbals. We evaluate against two recent drum generators, WaveGAN and NeuroDrum, demonstrating significantly improved generation quality using two quality measures: the Fréchet Audio Distance and a perceptual test.
Document type :
Conference papers

https://hal.archives-ouvertes.fr/hal-03693960
Contributor : Antoine Lavault
Submitted on : Monday, June 13, 2022 - 11:40:00 AM
Last modification on : Saturday, June 25, 2022 - 3:34:24 AM

File

eusipco_22.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-03693960, version 1

Citation

Antoine Lavault, Axel Roebel, Matthieu Voiry. StyleWaveGAN: Style-based synthesis of drum sounds using generative adversarial networks for higher audio quality. 30th European Signal Processing Conference (EUSIPCO 2022), Aug 2022, Belgrade, Serbia. ⟨hal-03693960⟩
