
Towards Universal Neural Vocoding with a Multi-band Excited WaveNet

Axel Roebel 1, Frederik Bous 1
1 Analyse et synthèse sonores [Paris]
STMS - Sciences et Technologies de la Musique et du Son
Abstract : This paper introduces the Multi-Band Excited WaveNet, a neural vocoder for speaking and singing voices. It aims to advance the state of the art towards a universal neural vocoder, that is, a model that can generate voice signals from arbitrary mel spectrograms extracted from voice signals. Following the success of the DDSP model and the development of the recently proposed excitation vocoders, we propose a vocoder structure consisting of multiple specialized DNNs that are combined with dedicated signal processing components. All components are implemented as differentiable operators and therefore allow joint optimization of the model parameters. To demonstrate the capacity of the model to reproduce high-quality voice signals, we evaluate the model on single and multi speaker/singer datasets. We conduct a subjective evaluation demonstrating that the models support a wide range of domain variations (unseen voices, languages, expressivity), achieving perceptual quality comparable to that of a state-of-the-art universal neural vocoder while using significantly smaller training datasets and significantly fewer parameters. We also demonstrate remaining limits of the universality of neural vocoders, e.g., the creation of saturated singing voices.
Document type :
Preprints, Working Papers, ...
Contributor : Axel Roebel
Submitted on : Thursday, February 3, 2022 - 7:42:55 PM
Last modification on : Tuesday, March 15, 2022 - 3:22:02 AM

Links full text


  • HAL Id : hal-03555987, version 1
  • ARXIV : 2110.03329


Axel Roebel, Frederik Bous. Towards Universal Neural Vocoding with a Multi-band Excited WaveNet. 2021. ⟨hal-03555987⟩


