Report (Research Report) Year: 2020

Speaker information modification in the VoicePrivacy 2020 toolchain

Abstract

This paper presents a study of the baseline system of the VoicePrivacy 2020 challenge. This baseline relies on a voice conversion system that aims to separate speaker identity from linguistic content for a given speech utterance. To generate an anonymized speech waveform, the neural acoustic model and the neural waveform model use the extracted linguistic content together with a selected pseudo-speaker identity. The linguistic content is estimated using bottleneck features extracted from a triphone classifier, while the speaker information is extracted and then modified to target a pseudo-speaker identity in the x-vector space. In this work, we first propose to replace the triphone-based bottleneck feature extractor, which requires supervised training, with an end-to-end Automatic Speech Recognition (ASR) system. Within this framework, we explore the use of adversarial and semi-adversarial training to learn linguistic features while masking speaker information. Finally, we explore several anonymization schemes to determine which module benefits the most from the generated pseudo-speaker identities.
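The abstract states that speaker information is modified to target a pseudo-speaker identity in the x-vector space, but it does not say how that identity is chosen. The sketch below illustrates one plausible scheme under stated assumptions: rank a pool of speaker vectors by cosine distance from the source speaker, keep the farthest ones, and average a random subset of them. The function names, the cosine metric, the pool, and the parameters `n_farthest` and `n_average` are all hypothetical and are not taken from the report.

```python
import math
import random


def cosine_distance(a, b):
    """Return 1 - cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def pseudo_speaker(source, pool, n_farthest, n_average, rng):
    """Build a pseudo-speaker vector for `source` from a pool of vectors.

    Hypothetical scheme (an assumption, not the report's confirmed method):
    keep the `n_farthest` pool vectors most distant from the source,
    draw `n_average` of them at random, and average them per dimension.
    """
    ranked = sorted(pool, key=lambda v: cosine_distance(source, v), reverse=True)
    candidates = ranked[:n_farthest]
    chosen = rng.sample(candidates, n_average)
    dim = len(source)
    return [sum(v[i] for v in chosen) / n_average for i in range(dim)]


# Toy data standing in for real x-vectors (which have hundreds of dimensions).
rng = random.Random(0)
pool = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(20)]
source = pool[0]
target = pseudo_speaker(source, pool, n_farthest=10, n_average=5, rng=rng)
```

In a full anonymization pipeline, a vector such as `target` would replace the source speaker's x-vector before the acoustic and waveform models synthesize the output; which modules receive it is the question the report's last experiment addresses.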
Main file
MultiSpeech.pdf (256.39 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-02995855, version 1 (09-11-2020)

Identifiers

  • HAL Id: hal-02995855, version 1

Cite

Pierre Champion, Denis Jouvet, Anthony Larcher. Speaker information modification in the VoicePrivacy 2020 toolchain. [Research Report] INRIA Nancy, équipe Multispeech; LIUM - Laboratoire d'Informatique de l'Université du Mans. 2020. ⟨hal-02995855⟩
273 views
268 downloads
