Deep complementary features for speaker identification in TV broadcast data

Mateusz Budnik; Laurent Besacier; Ali Khodabakhsh; Cenk Demiroglu

doi:10.21437/Odyssey.2016-21

Conference paper. Year: 2016

Mateusz Budnik

Role: Author

Laboratoire d'Informatique de Grenoble

Laurent Besacier

Role: Author
PersonId : 1521
IdHAL : laurent-besacier
ORCID : 0000-0001-7411-9125
IdRef : 079377017

Groupe d’Étude en Traduction Automatique/Traitement Automatisé des Langues et de la Parole

Institut universitaire de France

Ali Khodabakhsh

Role: Corresponding author
PersonId : 987033

Istanbul University

Laboratoire d'Informatique de Grenoble

Cenk Demiroglu

Role: Author

Istanbul University

Abstract

This work investigates the use of a Convolutional Neural Network approach and its fusion with more traditional systems, such as Total Variability Space, for speaker identification in TV broadcast data. The former uses spectrograms for training, while the latter is based on MFCC features. The dataset poses several challenges, such as significant class imbalance and the presence of background noise and music. Even though the performance of the Convolutional Neural Network is lower than that of the state-of-the-art system, it complements that system and gives better results through fusion. Different fusion techniques are evaluated using both early and late fusion.

Domains

Computation and Language [cs.CL]

Main file

odyssey-deep-complementary.pdf (215.79 KB)

Origin: Files produced by the author(s)

Contributor: Laurent Besacier

https://hal.science/hal-01350068

Submitted on: Friday, 29 July 2016, 15:30:20

Last modified on: Monday, 15 April 2024, 11:25:23

Dates and versions

hal-01350068 , version 1 (29-07-2016)

Identifiers

HAL Id : hal-01350068 , version 1
DOI : 10.21437/Odyssey.2016-21

Cite

Mateusz Budnik, Laurent Besacier, Ali Khodabakhsh, Cenk Demiroglu. Deep complementary features for speaker identification in TV broadcast data. Odyssey Workshop 2016, Jun 2016, Bilbao, Spain. ⟨10.21437/Odyssey.2016-21⟩. ⟨hal-01350068⟩

Collections

UGA CNRS LIG LIG_TDCGE_GETALP LIG_TDCGE_MRIM POLYTECH-GRENOBLE ANR LIG_SIDCH
