Hider-Finder-Combiner: An Adversarial Architecture for General Speech Signal Modification

Jacob J Webber; Olivier Perrotin; Simon King

doi:10.21437/interspeech.2020-2558

Conference paper, Year: 2020



Jacob J Webber

Role: Author

The Centre for Speech Technology Research [Edinburgh]

Olivier Perrotin

Role: Author
PersonId : 21602
IdHAL : operrotin
ORCID : 0000-0002-9909-6078
IdRef : 189243848

GIPSA - Cognitive Robotics, Interactive Systems, & Speech Processing

Simon King

Role: Author

The Centre for Speech Technology Research [Edinburgh]

Abstract

We introduce a prototype system for modifying an arbitrary parameter of a speech signal. Unlike signal processing approaches that require dedicated methods for different parameters, our system can, in principle, modify any control parameter that the signal can be annotated with. Our system comprises three neural networks. The 'hider' removes all information related to the control parameter, outputting a hidden embedding. The 'finder' is an adversary used to train the 'hider', attempting to detect the value of the control parameter from the hidden embedding. The 'combiner' network recombines the hidden embedding with a desired new value of the control parameter. The input to and output of the system are mel-spectrograms, and we employ a neural vocoder to generate the output speech waveform. As a proof of concept, we use F0 as the control parameter. The system was evaluated in terms of control parameter accuracy and naturalness against a high-quality signal processing method of F0 modification that also works in the spectrogram domain. We also show that, with modifications only to training data, the system is capable of modifying the first and second vocal tract formants, showing progress towards universal signal modification.
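The data flow between the three networks described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: each network is reduced to a single random linear layer, the dimensions `N_MEL` and `N_HIDDEN` are hypothetical, the control parameter is treated as one normalised scalar per frame, and the training objective (reconstruction loss minus a weighted finder loss) is one common way to set up an adversary rather than the authors' stated loss.

```python
import random

random.seed(0)

N_MEL = 8      # hypothetical number of mel bins per frame
N_HIDDEN = 4   # hypothetical size of the hidden embedding


def linear(n_in, n_out):
    """Random weight matrix standing in for a trained network."""
    return [[random.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]


def matvec(weights, x):
    """Matrix-vector product: one linear layer without bias."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


W_HIDER = linear(N_MEL, N_HIDDEN)          # mel frame -> hidden embedding
W_FINDER = linear(N_HIDDEN, 1)             # hidden embedding -> control estimate
W_COMBINER = linear(N_HIDDEN + 1, N_MEL)   # embedding + control -> mel frame


def hider(mel_frame):
    """Map a mel frame to an embedding meant to carry no control information."""
    return matvec(W_HIDER, mel_frame)


def finder(hidden):
    """Adversary: try to recover the control value from the embedding."""
    return matvec(W_FINDER, hidden)[0]


def combiner(hidden, control):
    """Rebuild a mel frame from the embedding plus a chosen control value."""
    return matvec(W_COMBINER, hidden + [control])


def mse(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def losses(mel_frame, control, adv_weight=1.0):
    """Return (finder_loss, hider_combiner_loss) for one annotated frame."""
    hidden = hider(mel_frame)
    finder_loss = (finder(hidden) - control) ** 2
    recon_loss = mse(combiner(hidden, control), mel_frame)
    # Assumed objective: reconstruct well while making the finder fail.
    return finder_loss, recon_loss - adv_weight * finder_loss


def modify(mel_frame, new_control):
    """Inference: hide the original control value, then recombine with a new one."""
    return combiner(hider(mel_frame), new_control)


frame = [random.random() for _ in range(N_MEL)]
shifted = modify(frame, new_control=0.5)
```

In this sketch the finder is only needed during training; at inference time `modify` uses the hider and combiner alone, and the resulting mel frames would be passed to a vocoder to obtain a waveform.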

Keywords

speech synthesis, adversarial networks, speech modification

Domains

Signal and image processing [eess.SP]; Human-computer interaction [cs.HC]

Main file

Webber2020.pdf (2.41 MB)

Origin: Files produced by the author(s)


https://hal.science/hal-02987956

Submitted on: Wednesday, 4 November 2020, 12:27:48

Last modified on: Thursday, 4 April 2024, 21:23:52

Long-term archiving on: Friday, 5 February 2021, 18:26:06

Dates and versions

hal-02987956 , version 1 (04-11-2020)

Identifiers

HAL Id : hal-02987956 , version 1
DOI : 10.21437/interspeech.2020-2558

Cite

Jacob J Webber, Olivier Perrotin, Simon King. Hider-Finder-Combiner: An Adversarial Architecture for General Speech Signal Modification. Interspeech 2020 - 21st Annual Conference of the International Speech Communication Association, Oct 2020, Shanghai (Virtual Conf), China. ⟨10.21437/interspeech.2020-2558⟩. ⟨hal-02987956⟩


Collections

UGA CNRS GIPSA PERSYVAL-LAB GIPSA-CRISSP GIPSA-PPC ANR

