Online Localization of Multiple Moving Speakers in Reverberant Environments

Xiaofei Li 1, Bastien Mourgue 1, Laurent Girin 2, Sharon Gannot 3, Radu Horaud 1
1 PERCEPTION - Interpretation and Modelling of Images and Videos
Inria Grenoble - Rhône-Alpes, LJK - Laboratoire Jean Kuntzmann, INPG - Institut National Polytechnique de Grenoble
2 GIPSA-CRISSP - CRISSP
GIPSA-DPC - Département Parole et Cognition
Abstract: This paper addresses the problem of online localization of multiple moving speakers in reverberant environments. The direct-path relative transfer function (DP-RTF), defined as the ratio between the first taps of the convolutive transfer functions (CTFs) of two microphones, encodes the inter-channel direct-path information and is therefore used as a localization feature that is robust to reverberation. The CTF estimation is based on the cross-relation method. In this work, the recursive least-squares method is proposed to solve the cross-relation problem because of its relatively low computational cost and good convergence rate. The DP-RTF feature estimated at each time-frequency bin is assumed to correspond to a single speaker. A complex Gaussian mixture model is used to assign each observed feature to one of several speakers. The recursive expectation-maximization algorithm is adopted to update the model parameters online. The method is evaluated on a new dataset containing multiple moving speakers, in which the ground-truth speaker trajectories are recorded with a motion capture system.
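As an illustration of the recursive least-squares idea mentioned in the abstract, the following is a minimal sketch of a generic, exponentially weighted RLS update for a complex-valued linear model d = x^T w. It is not the authors' cross-relation formulation: the function names (rls_init, rls_update, demo), the forgetting factor lam, the initialization constant delta, and the synthetic data are all assumptions made for this example. It only shows how each new observation refines the estimate at a fixed per-step cost, without re-solving the whole least-squares problem.

```python
import random


def rls_init(dim, delta=1e3):
    """Zero weights and a scaled identity as the initial inverse correlation matrix."""
    w = [0j] * dim
    P = [[(delta if i == j else 0.0) + 0j for j in range(dim)] for i in range(dim)]
    return w, P


def rls_update(w, P, x, d, lam=0.98):
    """One RLS step for the model d ~ sum_i x[i] * w[i] (hypothetical helper).

    lam is the forgetting factor (0 < lam <= 1); smaller values track
    time-varying parameters faster at the price of noisier estimates.
    """
    dim = len(w)
    xc = [xi.conjugate() for xi in x]
    # P x* and the scalar lam + x^T P x*
    Px = [sum(P[i][j] * xc[j] for j in range(dim)) for i in range(dim)]
    denom = lam + sum(x[i] * Px[i] for i in range(dim))
    k = [p / denom for p in Px]                        # gain vector
    e = d - sum(x[i] * w[i] for i in range(dim))       # a priori error
    w_new = [w[i] + k[i] * e for i in range(dim)]
    # x^T P, used in the rank-one update of the inverse correlation matrix
    xP = [sum(x[i] * P[i][j] for i in range(dim)) for j in range(dim)]
    P_new = [[(P[i][j] - k[i] * xP[j]) / lam for j in range(dim)] for i in range(dim)]
    return w_new, P_new


def demo(n_steps=200, seed=0):
    """Recover a synthetic 2-tap complex filter from noiseless data; return the max error."""
    rng = random.Random(seed)
    w_true = [0.5 + 0.2j, -0.3j]                       # made-up values for the demo
    w, P = rls_init(len(w_true))
    for _ in range(n_steps):
        x = [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in w_true]
        d = sum(xi * wi for xi, wi in zip(x, w_true))
        w, P = rls_update(w, P, x, d)
    return max(abs(a - b) for a, b in zip(w, w_true))
```

Calling demo() runs 200 updates on synthetic data and returns the largest absolute deviation between the estimated and the true coefficients. A forgetting factor below 1 is what would let such an estimator follow slowly changing parameters, which is the property that matters for moving speakers.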
Document type:
Conference paper
SAM 2018 - The Tenth IEEE Workshop on Sensor Array and Multichannel Signal Processing, Jul 2018, Sheffield, United Kingdom. pp.1-5
Cited literature [23 references]

https://hal.inria.fr/hal-01795462
Contributor: Team Perception
Submitted on: Friday, May 18, 2018 - 14:45:56
Last modified on: Tuesday, July 10, 2018 - 01:18:23

File

Xiaofei_SAM2018.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-01795462, version 1

Citation

Xiaofei Li, Bastien Mourgue, Laurent Girin, Sharon Gannot, Radu Horaud. Online Localization of Multiple Moving Speakers in Reverberant Environments. SAM 2018 - The Tenth IEEE Workshop on Sensor Array and Multichannel Signal Processing, Jul 2018, Sheffield, United Kingdom. pp.1-5. 〈hal-01795462〉
