Conference papers

Study on Acoustic Model Personalization in a Context of Collaborative Learning Constrained by Privacy Preservation

Salima Mdhaffar 1 Marc Tommasi 2 Yannick Estève 1 
2 MAGNET - Machine Learning in Information Networks
Inria Lille - Nord Europe, CRIStAL - Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189
Abstract : This paper investigates approaches for improving the performance of a speech recognition system for a given speaker using no more than 5 min of speech from that speaker, and without exchanging data from other users/speakers. Inspired by the federated learning paradigm, we consider speakers who each have access to a personalized database of their own speech, learn an acoustic model, and collaborate with other speakers in a network to improve their models. Several local personalizations are explored, depending on how the aggregation mechanisms are performed. We study the impact of adaptively selecting a subset of speakers' models based on a notion of similarity. We also investigate the effect of weighted averaging of fine-tuned and global models. In our approach, only neural acoustic model parameters are exchanged; no audio data is exchanged. Because speakers do not communicate their personal data, the proposed approach tends to preserve their privacy. Experiments conducted on the TEDLIUM 3 dataset show that the best improvement is obtained by averaging a subset of different acoustic models fine-tuned on several user datasets. Applied to HMM/TDNN acoustic models, our approach quickly and significantly improves ASR performance in terms of WER (for instance, on one of our two evaluation datasets, from 14.84% to 13.45% with less than 5 min of speech per speaker).
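The two aggregation ideas named in the abstract (averaging a similarity-selected subset of speakers' models, and weighted averaging of a fine-tuned model with a global model) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not specify the similarity measure, the subset size, or the interpolation weight, so the cosine similarity over flattened parameters, the parameter `k`, the weight `alpha`, the `Params` representation, and all function names below are assumptions chosen for illustration only.

```python
import math
from typing import Dict, List

# Hypothetical representation: each model is a dict mapping a layer name
# to a flat list of parameter values. Real acoustic models would use tensors.
Params = Dict[str, List[float]]


def flatten(p: Params) -> List[float]:
    """Concatenate all parameters in a fixed (sorted by name) order."""
    return [x for name in sorted(p) for x in p[name]]


def cosine_similarity(a: Params, b: Params) -> float:
    """Assumed similarity measure; the abstract only says 'a notion of similarity'."""
    u, v = flatten(a), flatten(b)
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def select_similar(local: Params, peers: List[Params], k: int) -> List[Params]:
    """Keep the k peer models most similar to the local speaker's model."""
    ranked = sorted(peers, key=lambda p: cosine_similarity(local, p), reverse=True)
    return ranked[:k]


def average(models: List[Params]) -> Params:
    """Element-wise unweighted mean of models that share the same layout."""
    n = len(models)
    return {
        name: [sum(m[name][i] for m in models) / n for i in range(len(values))]
        for name, values in models[0].items()
    }


def interpolate(fine_tuned: Params, global_model: Params, alpha: float) -> Params:
    """Weighted average: alpha * fine_tuned + (1 - alpha) * global_model."""
    return {
        name: [alpha * f + (1 - alpha) * g
               for f, g in zip(fine_tuned[name], global_model[name])]
        for name in fine_tuned
    }


if __name__ == "__main__":
    # Toy values, not data from the paper.
    local = {"w": [1.0, 0.0]}
    peers = [{"w": [0.9, 0.1]}, {"w": [0.0, 1.0]}, {"w": [1.0, 0.2]}]
    subset = select_similar(local, peers, k=2)
    print(average(subset))
    print(interpolate({"w": [1.0, 3.0]}, {"w": [3.0, 1.0]}, alpha=0.5))
```

Only parameter values pass between speakers in this sketch, which mirrors the abstract's statement that model parameters, and no audio data, are exchanged.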
Document type :
Conference papers
Complete list of metadata
Contributor : Yannick Estève
Submitted on : Tuesday, October 12, 2021 - 1:46:19 PM
Last modification on : Tuesday, November 22, 2022 - 2:26:16 PM
Long-term archiving on : Thursday, January 13, 2022 - 6:03:17 PM



Salima Mdhaffar, Marc Tommasi, Yannick Estève. Study on Acoustic Model Personalization in a Context of Collaborative Learning Constrained by Privacy Preservation. SPECOM 2021 - 23rd International Conference on Speech and Computer, Sep 2021, St Petersburg, Russia. pp.426 - 436, ⟨10.1007/978-3-030-87802-3_39⟩. ⟨hal-03369206⟩


