D. M. Blei, A. Y. Ng, and M. I. Jordan, Latent Dirichlet allocation, The Journal of Machine Learning Research, vol.3, pp.993-1022, 2003.

M. Morchid, R. Dufour, P. Bousquet, M. Bouallegue, G. Linarès et al., Improving dialogue classification using a topic space representation and a Gaussian classifier based on the decision rule, International Conference on Acoustics, Speech and Signal Processing, 2014.
URL : https://hal.archives-ouvertes.fr/hal-01318674

P. Kenny, P. Ouellet, N. Dehak, V. Gupta, and P. Dumouchel, A study of interspeaker variability in speaker verification, IEEE Transactions on Audio, Speech, and Language Processing, vol.16, issue.5, pp.980-988, 2008.

D. A. Reynolds and R. C. Rose, Robust text-independent speaker identification using Gaussian mixture speaker models, IEEE Transactions on Speech and Audio Processing, vol.3, issue.1, pp.72-83, 1995.

N. Dehak, P. J. Kenny, R. Dehak, P. Dumouchel, and P. Ouellet, Frontend factor analysis for speaker verification, IEEE Transactions on Audio, Speech, and Language Processing, vol.19, issue.4, pp.788-798, 2011.

F. Bechet, B. Maza, N. Bigouroux, T. Bazillon, M. El-Beze et al., DECODA: a call-centre human-human spoken conversation corpus, p.12, 2012.

A. Asuncion and D. Newman, UCI machine learning repository, 2007.

P. Bousquet, D. Matrouf, and J. Bonastre, Intersession compensation and scoring methods in the i-vectors space for speaker recognition, INTERSPEECH, pp.485-488, 2011.
URL : https://hal.archives-ouvertes.fr/hal-01313266

D. Garcia-Romero and C. Y. Espy-Wilson, Analysis of i-vector length normalization in speaker recognition systems, INTERSPEECH, pp.249-252, 2011.

J. R. Bellegarda, Exploiting both local and global constraints for multi-span statistical language modeling, Proceedings of the 1998 IEEE International Conference on, vol.2, pp.677-680, 1998.

P. Clarkson and R. Rosenfeld, Statistical language modeling using the CMU-Cambridge toolkit, Eurospeech, vol.97, pp.2707-2710, 1997.

R. Iyer and M. Ostendorf, Relevance weighting for combining multidomain data for n-gram language modeling, Computer Speech & Language, vol.13, issue.3, pp.267-282, 1999.

R. Kneser, J. Peters, and D. Klakow, Language model adaptation using dynamic marginals, Eurospeech, 1997.

J. R. Bellegarda, Exploiting latent semantic information in statistical language modeling, Proceedings of the IEEE, vol.88, issue.8, pp.1279-1296, 2000.

Y. Suzuki, F. Fukumoto, and Y. Sekiguchi, Keyword extraction using term-domain interdependence for dictation of radio news, 17th international conference on Computational linguistics, vol.2, pp.1272-1276, 1998.

G. Salton, Analysis and Retrieval of Information by Computer, 1989.

T. Hofmann, Unsupervised learning by probabilistic latent semantic analysis, Machine Learning, vol.42, issue.1, pp.177-196, 2001.

T. N. Rubin, A. Chambers, P. Smyth, and M. Steyvers, Statistical topic models for multi-label document classification, Machine Learning, vol.88, pp.157-208, 2012.

R. Arun, V. Suresh, C. Veni-Madhavan, and M. N. Murthy, On finding the natural number of topics with latent Dirichlet allocation: Some observations, Advances in Knowledge Discovery and Data Mining, pp.391-402, 2010.

Y. W. Teh, M. I. Jordan, M. J. Beal, and D. M. Blei, Sharing clusters among related groups: Hierarchical Dirichlet processes, NIPS, 2004.

E. Zavitsanos, S. Petridis, G. Paliouras, and G. A. Vouros, Determining automatically the size of learned ontologies, ECAI, vol.178, pp.775-776, 2008.

J. Cao, T. Xia, J. Li, Y. Zhang, and S. Tang, A density-based method for adaptive LDA model selection, Neurocomputing, vol.72, issue.7, pp.1775-1781, 2009.

D. M. Blei and J. Lafferty, Correlated topic models, Advances in neural information processing systems, vol.18, p.147, 2006.

W. Li and A. McCallum, Pachinko allocation: DAG-structured mixture models of topic correlations, 2006.

M. Morchid, M. Bouallegue, R. Dufour, G. Linarès, D. Matrouf et al., I-vector based approach to compact multi-granularity topic spaces representation of textual documents, the Conference of Empirical Methods on Natural Language Processing (EMNLP) 2014. SIGDAT, 2014.
DOI : 10.3115/v1/d14-1051
URL : https://hal.archives-ouvertes.fr/hal-01318651

, I-vector based representation of highly imperfect automatic transcriptions, Conference of the International Speech Communication Association (INTERSPEECH) 2014. ISCA, 2014.

T. L. Griffiths and M. Steyvers, Finding scientific topics, Proceedings of the National Academy of Sciences of the United States of America, vol.101, pp.5228-5235, 2004.
DOI : 10.1073/pnas.0307752101
URL : http://www.pnas.org/content/101/suppl_1/5228.full.pdf

T. Minka and J. Lafferty, Expectation-propagation for the generative aspect model, Proceedings of the Eighteenth conference on Uncertainty in artificial intelligence, pp.352-359, 2002.

S. Geman and D. Geman, Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images, IEEE Transactions on Pattern Analysis and Machine Intelligence, issue.6, pp.721-741, 1984.

G. Heinrich, Parameter estimation for text analysis, 2005.

D. Martínez, O. Plchot, L. Burget, O. Glembek, and P. Matejka, Language recognition in ivectors space, pp.861-864, 2011.

J. Franco-Pedroso, I. Lopez-Moreno, D. T. Toledano, and J. Gonzalez-Rodriguez, ATVS-UAM system description for the audio segmentation and speaker diarization Albayzin 2010 evaluation, FALA VI Jornadas en Tecnología del Habla and II Iberian SLTech Workshop, pp.415-418, 2010.

P. Kenny, G. Boulianne, P. Ouellet, and P. Dumouchel, Joint factor analysis versus eigenchannels in speaker recognition, IEEE Transactions on Audio, Speech, and Language Processing, vol.15, pp.1435-1447, 2007.
DOI : 10.1109/tasl.2006.881693
URL : http://www.crim.ca/perso/patrick.kenny/FASysJ.pdf

D. Matrouf, N. Scheffer, B. G. Fauve, and J. Bonastre, A straightforward and efficient implementation of the factor analysis model for speaker verification, INTERSPEECH, pp.1242-1245, 2007.
URL : https://hal.archives-ouvertes.fr/hal-01318480

E. P. Xing, M. I. Jordan, S. Russell, and A. Ng, Distance metric learning with application to clustering with side-information, Advances in neural information processing systems, pp.505-512, 2002.

G. Linarès, P. Nocéra, D. Massonie, and D. Matrouf, The LIA speech recognition system: from 10xRT to 1xRT, Text, Speech and Dialogue, pp.302-308, 2007.

S. Gunal, Hybrid feature selection for text classification, Turkish Journal of Electrical Engineering & Computer Sciences, vol.20, issue.2, pp.1296-1311, 2012.

W. Zhu and Y. Lin, Using gini-index for feature weighting in text categorization, Journal of Computational Information Systems, vol.9, issue.14, pp.5819-5826, 2013.
DOI : 10.2991/icibet-14.2014.22
URL : https://download.atlantis-press.com/article/11384.pdf

R. R. Larson, Introduction to information retrieval, 2010.

M. Morchid, G. Linarès, M. El-Beze, and R. De Mori, Theme identification in telephone service conversations using quaternions of speech features, Conference of the International Speech Communication Association (INTERSPEECH) 2013. ISCA, 2013.
URL : https://hal.archives-ouvertes.fr/hal-01339930

M. Morchid, R. Dufour, and G. Linarès, A LDA-based topic classification approach from highly imperfect automatic transcriptions, Proceedings of the 9th edition of the Language Resources and Evaluation Conference (LREC), 2014.
DOI : 10.1007/978-3-319-18117-2_44
URL : https://hal.archives-ouvertes.fr/hal-01319771

M. Bouallegue, M. Morchid, R. Dufour, M. Driss, G. Linarès et al., Factor analysis based semantic variability compensation for automatic conversation representation, Conference of the International Speech Communication Association (INTERSPEECH) 2014. ISCA, 2014.
URL : https://hal.archives-ouvertes.fr/hal-01313121

, Subspace gaussian mixture models for dialogues classification, Conference of the International Speech Communication Association (INTERSPEECH) 2014. ISCA, 2014.

M. Morchid, R. Dufour, M. Bouallegue, G. Linarès, and R. De Mori, Theme identification in human-human conversations with features from specific speaker type hidden spaces, Conference of the International Speech Communication Association (INTERSPEECH) 2014. ISCA, 2014.
URL : https://hal.archives-ouvertes.fr/hal-01318666

V. Van Asch, Macro- and micro-averaged evaluation measures, 2013.

C. D. Manning, P. Raghavan, and H. Schütze, Introduction to information retrieval, vol.1, 2008.

Y. Yang and J. O. Pedersen, A comparative study on feature selection in text categorization, ICML, vol.97, pp.412-420, 1997.

G. Forman, An extensive empirical study of feature selection metrics for text classification, The Journal of Machine Learning Research, vol.3, pp.1289-1305, 2003.

S. Gunal, O. N. Gerek, D. G. Ece, and R. Edizkan, The search for optimal feature set in power quality event classification, Expert Systems with Applications, vol.36, issue.7, pp.10-266, 2009.

L. Yongmin, Z. Weidong, and S. Wenqian, Improvement of the decision rule in kNN text categorization, Journal of Computer Research and Development, vol.42, pp.378-382, 2005.