S. E. Tranter and D. A. Reynolds, An Overview of Automatic Speaker Diarization Systems, IEEE Transactions on Audio, Speech, and Language Processing, vol.14, issue.5, pp.1557-1565, 2006.

X. Anguera, S. Bozonnet, N. Evans, C. Fredouille, G. Friedland et al., Speaker diarization: A Review of Recent Research, IEEE Transactions on Audio, Speech, and Language Processing, vol.20, issue.2, pp.356-370, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00733397

C. Barras, X. Zhu, S. Meignier, and J. L. Gauvain, Multi-Stage Speaker Diarization of Broadcast News, IEEE Transactions on Audio, Speech, and Language Processing, vol.14, issue.5, pp.1505-1512, 2006.

M. Rouvier, G. Dupuy, P. Gay, E. Khoury, T. Merlin et al., An Open-source State-of-the-art Toolbox for Broadcast News Diarization, Interspeech 2013, 14th Annual Conference of the International Speech Communication Association, 2013.
URL : https://hal.archives-ouvertes.fr/hal-01433449

M. A. Siegler, U. Jain, B. Raj, and R. M. Stern, Automatic Segmentation, Classification and Clustering of Broadcast News Audio, Proc. DARPA Speech Recognition Workshop, 1997.

S. Chen and P. Gopalakrishnan, Speaker, Environment and Channel Change Detection and Clustering via the Bayesian Information Criterion, Proc. DARPA Broadcast News Transcription and Understanding Workshop, vol.8, pp.127-132, 1998.

R. Yin, H. Bredin, and C. Barras, Speaker Change Detection in Broadcast TV using Bidirectional Long Short-Term Memory Networks, Interspeech 2017, 18th Annual Conference of the International Speech Communication Association, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01690244

G. Gelly and J. Gauvain, Minimum Word Error Training of RNN-based Voice Activity Detection, 16th Annual Conference of the International Speech Communication Association, pp.2650-2654, 2015.

C. Barras, X. Zhu, S. Meignier, and J. Gauvain, Improving Speaker Diarization, RT-04F workshop, 2004.
URL : https://hal.archives-ouvertes.fr/hal-01451540

S. Meignier and T. Merlin, LIUM SpkDiarization: An Open Source Toolkit for Diarization, CMU SPUD Workshop, 2010.
URL : https://hal.archives-ouvertes.fr/hal-01433518

H. Bredin, TristouNet: Triplet Loss for Speaker Turn Embedding, ICASSP 2017, IEEE International Conference on Acoustics, Speech, and Signal Processing, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01830421

G. Wisniewski, H. Bredin, G. Gelly, and C. Barras, Combining Speaker Turn Embedding and Incremental Structure Prediction for Low-Latency Speaker Diarization, 18th Annual Conference of the International Speech Communication Association, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01987809

Q. Wang, C. Downey, L. Wan, P. A. Mansfield, and I. L. Moreno, Speaker Diarization with LSTM, ICASSP 2018, IEEE International Conference on Acoustics, Speech, and Signal Processing, 2018.

X. Zhang, J. Gao, P. Lu, and Y. Yan, A Novel Speaker Clustering Algorithm via Supervised Affinity Propagation, IEEE International Conference on Acoustics, Speech and Signal Processing, pp.4369-4372, 2008.

G. Gelly and J. Gauvain, Spoken Language Identification using LSTM-based Angular Proximity, 18th Annual Conference of the International Speech Communication Association, 2017.

B. J. Frey and D. Dueck, Clustering by Passing Messages Between Data Points, Science, vol.315, issue.5814, pp.972-976, 2007.

A. Giraudel, M. Carré, V. Mapelli, J. Kahn, O. Galibert et al., The REPERE Corpus: A Multimodal Corpus for Person Recognition, LREC, pp.1102-1107, 2012.

G. Gravier, G. Adda, N. Paulson, M. Carré, A. Giraudel et al., The ETAPE Corpus for the Evaluation of Speech-based TV Content Processing in the French Language, LREC 2012, Eighth International Conference on Language Resources and Evaluation, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00712591

H. Bredin, pyannote.metrics: A Toolkit for Reproducible Evaluation, Diagnostic, and Error Analysis of Speaker Diarization Systems, 18th Annual Conference of the International Speech Communication Association, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01836450

B. Mathieu, S. Essid, T. Fillon, J. Prado, and G. Richard, YAAFE, an Easy to Use and Efficient Audio Feature Extraction Software, ISMIR 2010, 11th International Society for Music Information Retrieval Conference, pp.441-446, 2010.

A. Larcher, K. A. Lee, and S. Meignier, An Extensible Speaker Identification Sidekit in Python, ICASSP 2016, IEEE International Conference on Acoustics, Speech, and Signal Processing, pp.5095-5099, 2016.
URL : https://hal.archives-ouvertes.fr/hal-01433157

P. Broux, F. Desnous, A. Larcher, S. Petitrenaud, J. Carrive et al., S4D: Speaker Diarization Toolkit in Python, Interspeech 2018, 19th Annual Conference of the International Speech Communication Association, 2018.
URL : https://hal.archives-ouvertes.fr/hal-01818313

R. O. Duda, P. E. Hart, and D. G. Stork, , 2012.

S. Galliano, E. Geoffrois, D. Mostefa, K. Choukri, J. Bonastre et al., The ESTER Phase II Evaluation Campaign for the Rich Transcription of French Broadcast News, Ninth European Conference on Speech Communication and Technology, 2005.

J. Bergstra, D. Yamins, and D. Cox, Making A Science of Model Search: Hyperparameter Optimization in Hundreds of Dimensions for Vision Architectures, International Conference on Machine Learning, pp.115-123, 2013.

J. Bergstra, D. Yamins, and D. D. Cox, Hyperopt: A Python Library for Optimizing the Hyperparameters of Machine Learning Algorithms, Proceedings of the 12th Python in Science Conference, pp.13-20, 2013.