X. Alameda-Pineda and R. Horaud, Vision-guided robot hearing, The International Journal of Robotics Research, vol.34, issue.4-5, pp.437-456, 2015.
DOI : 10.1177/0278364914548050

URL : https://hal.archives-ouvertes.fr/hal-00990766

X. A. Miro, S. Bozonnet, N. Evans, C. Fredouille, G. Friedland et al., Speaker Diarization: A Review of Recent Research, IEEE Transactions on Audio, Speech, and Language Processing, vol.20, issue.2, pp.356-370, 2012.
DOI : 10.1109/TASL.2011.2125954

S. Bae and K. Yoon, Robust Online Multi-object Tracking Based on Tracklet Confidence and Online Discriminative Appearance Learning, 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp.1218-1225, 2014.
DOI : 10.1109/CVPR.2014.159

A. Deleforge, R. Horaud, Y. Y. Schechner, and L. Girin, Co-Localization of Audio Sources in Images Using Binaural Features and Locally-Linear Regression, IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol.23, issue.4, pp.718-731, 2015.
DOI : 10.1109/TASLP.2015.2405475

URL : https://hal.archives-ouvertes.fr/hal-01112834

D. Gatica-Perez, G. Lathoud, J. Odobez, and I. McCowan, Audiovisual Probabilistic Tracking of Multiple Speakers in Meetings, IEEE Transactions on Audio, Speech and Language Processing, vol.15, issue.2, pp.601-616, 2007.
DOI : 10.1109/TASL.2006.881678

I. D. Gebru, X. Alameda-Pineda, F. Forbes, and R. Horaud, EM Algorithms for Weighted-Data Clustering with Application to Audio-Visual Scene Analysis, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.38, issue.12, 2016.
DOI : 10.1109/TPAMI.2016.2522425

URL : https://hal.archives-ouvertes.fr/hal-01261374

I. D. Gebru, S. Ba, G. Evangelidis, and R. Horaud, Audiovisual speech-turn detection and tracking, The Twelfth International Conference on Latent Variable Analysis and Signal Separation, 2015.
DOI : 10.1007/978-3-319-22482-4_17

URL : https://hal.archives-ouvertes.fr/hal-01163659

V. Khalidov, F. Forbes, and R. Horaud, Conjugate Mixture Models for Clustering Multimodal Data, Neural Computation, vol.23, issue.2, pp.517-557, 2011.
DOI : 10.1162/NECO_a_00074

URL : https://hal.archives-ouvertes.fr/inria-00590267

E. Kidron, Y. Y. Schechner, and M. Elad, Cross-Modal Localization via Sparsity, IEEE Transactions on Signal Processing, vol.55, issue.4, pp.1390-1404, 2007.
DOI : 10.1109/TSP.2006.888095

V. Kilic, M. Barnard, W. Wang, and J. Kittler, Audio Assisted Robust Visual Tracking With Adaptive Particle Filtering, IEEE Transactions on Multimedia, vol.17, issue.2, pp.186-200, 2015.
DOI : 10.1109/TMM.2014.2377515

S. Kotz and S. Nadarajah, Multivariate t Distributions and their Applications, 2004.

V. P. Minotto, C. R. Jung, and B. Lee, Multimodal on-line speaker diarization using sensor fusion through SVM, IEEE Transactions on Multimedia, 2015.

S. Naqvi, M. Yu, and J. Chambers, A Multimodal Approach to Blind Source Separation of Moving Sources, IEEE Journal of Selected Topics in Signal Processing, vol.4, issue.5, pp.895-910, 2010.
DOI : 10.1109/JSTSP.2010.2057198

A. Noulas, G. Englebienne, and B. J. Kröse, Multimodal Speaker Diarization, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.34, issue.1, pp.79-93, 2012.
DOI : 10.1109/TPAMI.2011.47

G. Potamianos, C. Neti, G. Gravier, A. Garg, and A. W. Senior, Recent advances in the automatic recognition of audiovisual speech, Proceedings of the IEEE, pp.1306-1326, 2003.

M. E. Sargin, Y. Yemez, E. Erzin, and M. A. Tekalp, Audiovisual Synchronization and Fusion Using Canonical Correlation Analysis, IEEE Transactions on Multimedia, vol.9, issue.7, pp.1396-1403, 2007.
DOI : 10.1109/TMM.2007.906583

J. Sohn, N. S. Kim, and W. Sung, A statistical model-based voice activity detection, IEEE Signal Processing Letters, vol.6, issue.1, pp.1-3, 1999.
DOI : 10.1109/97.736233

J. Sun, A. Kabán, and J. M. Garibaldi, Robust mixture clustering using Pearson type VII distribution, Pattern Recognition Letters, vol.31, issue.16, pp.2447-2454, 2010.
DOI : 10.1016/j.patrec.2010.07.015

B. D. Van Veen and K. M. Buckley, Beamforming: a versatile approach to spatial filtering, IEEE ASSP Magazine, vol.5, issue.2, pp.4-24, 1988.
DOI : 10.1109/53.665