Online Localization and Tracking of Multiple Moving Speakers in Reverberant Environment

Xiaofei Li 1 Yutong Ban 1 Laurent Girin 2, 1 Xavier Alameda-Pineda 1 Radu Horaud 1
1 PERCEPTION - Interpretation and Modelling of Images and Videos
Inria Grenoble - Rhône-Alpes, LJK - Laboratoire Jean Kuntzmann, INPG - Institut National Polytechnique de Grenoble
2 GIPSA-MAGIC - MAGIC
GIPSA-DPC - Département Parole et Cognition
Abstract: This paper addresses online localization and tracking of multiple speakers in reverberant environments. We propose to use the direct-path relative transfer function (DP-RTF), a feature that encodes inter-channel direct-path information and is robust against reverberation, hence well suited for reliable localization. A complex Gaussian mixture model (CGMM) is then used, in which each component weight represents the probability that an active speaker is present at the corresponding candidate source direction. Exponentiated gradient descent updates these weights online by minimizing a combination of the negative log-likelihood and an entropy term. The entropy term imposes sparsity on the number of audio sources, since in practice only a few speakers are simultaneously active. The outputs of this online localization process are then used as observations within a Bayesian filtering process whose computation is made tractable via an instance of variational expectation-maximization. Birth and sleeping processes account for the intermittent nature of speech. The method is thoroughly evaluated using several datasets.
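The online weight update described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `eg_update`, the step size `eta`, the entropy weight `gamma`, and the use of generic per-direction likelihood values (in place of the complex Gaussian components of the CGMM) are all hypothetical choices made for the example.

```python
import numpy as np

def eg_update(w, lik, eta=0.05, gamma=0.1, eps=1e-12):
    """One exponentiated-gradient step on the probability simplex.

    w     : current weights over candidate directions, shape (D,), sums to 1
    lik   : likelihood of the current observation under each candidate
            direction, shape (D,) (assumed given; hypothetical input)
    eta   : step size (assumed value)
    gamma : weight of the entropy penalty (assumed value)
    """
    w = np.clip(w, eps, None)              # avoid log(0)
    mix = float(np.dot(w, lik)) + eps      # mixture likelihood sum_d w_d * lik_d
    grad_nll = -lik / mix                  # gradient of -log(mix) w.r.t. w
    grad_ent = -(1.0 + np.log(w))          # gradient of H(w) = -sum_d w_d log w_d
    grad = grad_nll + gamma * grad_ent     # gradient of NLL + gamma * entropy
    z = -eta * grad
    z -= z.max()                           # shift for numerical stability
    w_new = w * np.exp(z)                  # multiplicative (exponentiated) update
    return w_new / w_new.sum()             # renormalize onto the simplex

# Usage: five candidate directions, the observation favours direction 2.
weights = np.full(5, 0.2)
likelihoods = np.array([0.1, 0.1, 1.0, 0.1, 0.1])
for _ in range(200):
    weights = eg_update(weights, likelihoods)
print(np.round(weights, 3))
```

The multiplicative form keeps the weights non-negative and the normalization keeps them summing to one, so no explicit projection step is needed. Penalizing entropy pushes mass toward a few directions, which matches the sparsity argument made in the abstract.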
Document type:
Preprint, Working paper
Submitted to Journal on Selected Topics in Signal Processing. 2018

Cited literature: 38 references

https://hal.inria.fr/hal-01851985
Contributor: Team Perception
Submitted on: Tuesday, 31 July 2018 - 13:49:18
Last modified on: Monday, 20 August 2018 - 15:45:56

File

SSLT_JSTSP.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-01851985, version 1

Citation

Xiaofei Li, Yutong Ban, Laurent Girin, Xavier Alameda-Pineda, Radu Horaud. Online Localization and Tracking of Multiple Moving Speakers in Reverberant Environment. Submitted to Journal on Selected Topics in Signal Processing. 2018. 〈hal-01851985〉
