An EM Algorithm for Joint Source Separation and Diarisation of Multichannel Convolutive Speech Mixtures

Dionyssos Kounades-Bastian 1 Laurent Girin 2, 1 Xavier Alameda-Pineda 3, 1 Sharon Gannot 4 Radu Horaud 1
1 PERCEPTION - Interpretation and Modelling of Images and Videos
Inria Grenoble - Rhône-Alpes, LJK - Laboratoire Jean Kuntzmann, INPG - Institut National Polytechnique de Grenoble
2 GIPSA-CRISSP - CRISSP
GIPSA-DPC - Département Parole et Cognition
Abstract: We present a probabilistic model for joint source separation and diarisation of multichannel convolutive speech mixtures. We build upon the local Gaussian model (LGM) framework with non-negative matrix factorization (NMF). Diarisation is introduced as a temporal labeling of each source in the mixture as active or inactive at the short-term frame level. We devise an EM algorithm in which the source separation process is aided by the diarisation state, since the latter indicates which sources are actually present in the mixture. The diarisation state is tracked with a Hidden Markov Model (HMM) whose emission probabilities are calculated from the estimated source signals. The proposed EM algorithm achieves separation performance comparable to that of a state-of-the-art LGM-NMF method, while outperforming a state-of-the-art speaker diarisation pipeline.
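To illustrate the HMM tracking step mentioned in the abstract, here is a minimal sketch of forward-backward smoothing over frame-level activity states. It is not the authors' algorithm: the abstract does not give the state space, the transition model, or the emission formula, so the two-state (inactive/active) single-source setup, the function name `forward_backward`, and all numerical values below are assumptions chosen for illustration only. In the paper the emission probabilities come from the estimated source signals; here they are simply made-up numbers.

```python
import numpy as np

def forward_backward(emission, transition, prior):
    """Posterior state probabilities of a discrete-state HMM.

    emission:   (T, K) likelihood of each frame under each state
    transition: (K, K) row-stochastic, transition[i, j] = p(z_t = j | z_{t-1} = i)
    prior:      (K,) initial state distribution
    """
    T, K = emission.shape
    alpha = np.zeros((T, K))
    beta = np.ones((T, K))

    # Forward pass, normalised at every frame to avoid numerical underflow.
    alpha[0] = prior * emission[0]
    alpha[0] /= alpha[0].sum()
    for t in range(1, T):
        alpha[t] = emission[t] * (alpha[t - 1] @ transition)
        alpha[t] /= alpha[t].sum()

    # Backward pass, also normalised per frame.
    for t in range(T - 2, -1, -1):
        beta[t] = transition @ (emission[t + 1] * beta[t + 1])
        beta[t] /= beta[t].sum()

    # Per-frame posteriors; renormalising removes the arbitrary scale factors.
    gamma = alpha * beta
    gamma /= gamma.sum(axis=1, keepdims=True)
    return gamma

# Hypothetical example: state 0 = source inactive, state 1 = source active.
# Emission likelihoods are invented; they do not come from the paper.
emission = np.array([[0.9, 0.1]] * 3 + [[0.1, 0.9]] * 3)
transition = np.array([[0.9, 0.1],
                       [0.1, 0.9]])   # assumed "sticky" activity
prior = np.array([0.5, 0.5])

gamma = forward_backward(emission, transition, prior)
activity = gamma.argmax(axis=1)       # frame-level active/inactive decision
```

In an EM setting, posteriors such as `gamma` would typically be recomputed at each E-step from the current source estimates; how the authors couple this with the separation step is described in the paper itself, not in this sketch.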
Document type:
Conference paper
IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Mar 2017, New Orleans, United States. 2017

Cited literature: 14 references

https://hal.inria.fr/hal-01430761
Contributor: Team Perception
Submitted on: Tuesday, 10 January 2017 - 11:24:31
Last modified on: Thursday, 15 June 2017 - 09:08:44
Document(s) archived on: Tuesday, 11 April 2017 - 14:14:03

File

diarisation_camready.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01430761, version 1

Citation

Dionyssos Kounades-Bastian, Laurent Girin, Xavier Alameda-Pineda, Sharon Gannot, Radu Horaud. An EM Algorithm for Joint Source Separation and Diarisation of Multichannel Convolutive Speech Mixtures. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Mar 2017, New Orleans, United States. 2017. 〈hal-01430761〉

