The Zero Resource Speech Challenge 2019: TTS without T

Abstract : We present the Zero Resource Speech Challenge 2019, which proposes to build a speech synthesizer without any text or phonetic labels: hence, TTS without T (text-to-speech without text). We provide raw audio for a target voice in an unknown language (the Voice dataset), but no alignment, text or labels. Participants must discover subword units in an unsupervised way (using the Unit Discovery dataset) and align them to the voice recordings in a way that works best for the purpose of synthesizing novel utterances from novel speakers, similar to the target speaker's voice. We describe the metrics used for evaluation, a baseline system consisting of unsupervised subword unit discovery plus a standard TTS system, and a topline TTS using gold phoneme transcriptions. We present an overview of the 19 submitted systems from 10 teams and discuss the main results.

https://hal.archives-ouvertes.fr/hal-02274112
Contributor : Emmanuel Dupoux
Submitted on : Thursday, August 29, 2019 - 3:18:26 PM
Last modification on : Saturday, August 31, 2019 - 1:11:43 AM

File

1904.11469.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-02274112, version 1

Citation

Ewan Dunbar, Robin Algayres, Julien Karadayi, Mathieu Bernard, Juan Benjumea, et al. The Zero Resource Speech Challenge 2019: TTS without T. Interspeech 2019, Sep 2019, Graz, Austria. ⟨hal-02274112⟩
