An Adaptive Method for Cross-Recording Speaker Diarization

Gaël Le Lan; Delphine Charlet; Anthony Larcher; Sylvain Meignier

doi:10.1109/TASLP.2018.2844025

Journal article, IEEE/ACM Transactions on Audio, Speech and Language Processing, Year: 2018

Gaël Le Lan

Role: Author
PersonId: 751878
IdHAL: gael-le-lan
ORCID: 0000-0002-1493-5777

Orange Labs [Belfort]

Delphine Charlet

Role: Author
PersonId: 1005321

Orange Labs [Lannion]

Anthony Larcher

Role: Author
PersonId: 20105
IdHAL: anthony-larcher
ORCID: 0000-0003-4398-0224
IdRef: 139544569

Laboratoire d'Informatique de l'Université du Mans

Sylvain Meignier

Role: Author
PersonId: 11674
IdHAL: sylvain-meignier
ORCID: 0000-0001-7687-073X
IdRef: 182269086

Laboratoire d'Informatique de l'Université du Mans

Abstract

State-of-the-art speaker diarization and linking systems rely heavily on between-recording variability compensation methods to process large collections of recordings accurately. Variability is estimated on sizeable training datasets, which must be labeled by speaker. One major problem of such systems is the acoustic mismatch between training and target data, which degrades performance. Most collections contain many speakers speaking in various acoustic conditions. In this paper, we investigate how unlabeled speakers can help improve between-recording variability estimation and thereby overcome the mismatch issue. We propose a scalable unsupervised adaptation framework for two types of variability compensation. The proposed framework consists of adapting a state-of-the-art diarization and linking system, trained on out-of-domain data, using the data of the collection itself. Experiments under mismatched conditions are run on two French television shows, while the initial training dataset is composed of radio recordings. Results indicate that the proposed adaptation framework reduces the cross-recording diarization error rate (DER) by 13% on average for variable collection sizes.

Keywords

speaker diarization; speaker linking; domain adaptation

Domains

Artificial Intelligence [cs.AI]

Main file

journal_final.pdf (3.55 MB)

Origin: Files produced by the author(s)

https://hal.science/hal-01818495

Submitted on: Wednesday, November 28, 2018, 20:27:20

Last modified on: Tuesday, May 17, 2022, 11:18:02

Dates and versions

hal-01818495, version 1 (28-11-2018)

Identifiers

HAL Id: hal-01818495, version 1
DOI: 10.1109/TASLP.2018.2844025

Cite

Gaël Le Lan, Delphine Charlet, Anthony Larcher, Sylvain Meignier. An Adaptive Method for Cross-Recording Speaker Diarization. IEEE/ACM Transactions on Audio, Speech and Language Processing, 2018, 14, pp.1-12. ⟨10.1109/TASLP.2018.2844025⟩. ⟨hal-01818495⟩

Collections

UNIV-LEMANS LIUM LIUM-LST
