M. Berry and D. Martin, Principal Component Analysis for Information Retrieval, Handbook of Parallel Computing and Statistics. Statistics: A Series of Textbooks and Monographs, 2005.
DOI : 10.1201/9781420028683.ch13

J. Bourgain, On Lipschitz embedding of finite metric spaces in Hilbert space, Israel Journal of Mathematics, vol.52, issue.1-2, 1985.
DOI : 10.1007/BF02776078

V. Claveau and S. Lefèvre, Topic segmentation of TV-streams by mathematical morphology and vectorization, Proceedings of the InterSpeech conference, 2011.
URL : https://hal.archives-ouvertes.fr/hal-00643905

V. Claveau, R. Tavenard, and L. Amsaleg, Vectorisation des processus d'appariement document-requête, 7e conférence en recherche d'informations et applications, CORIA'10, pp.313-324, 2010.

M. Datar, N. Immorlica, P. Indyk, and V. Mirrokni, Locality-sensitive hashing scheme based on p-stable distributions, Proceedings of the twentieth annual symposium on Computational geometry, SCG '04, 2004.
DOI : 10.1145/997817.997857

D. M. Blei, A. Y. Ng, and M. I. Jordan, Latent Dirichlet allocation, Journal of Machine Learning Research, vol.3, issue.4-5, pp.993-1022, 2003.

S. Dumais, Latent semantic analysis, Annual Review of Information Science and Technology, vol.38, issue.1, 2004.
DOI : 10.1002/aris.1440380105

E. Fox and J. Shaw, Combination of multiple searches, Proceedings of the 2nd Text Retrieval Conference (TREC-2), NIST Special Publication, pp.243-252, 1994.

S. Harter, A probabilistic approach to automatic keyword indexing. Part I. On the Distribution of Specialty Words in a Technical Literature, Journal of the American Society for Information Science, vol.26, issue.4, pp.197-206, 1975.
DOI : 10.1002/asi.4630260402

T. Hofmann, Probabilistic latent semantic indexing, Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval, SIGIR '99, 1999.
DOI : 10.1145/312624.312649

J. Lee, Combining multiple evidence from different properties of weighting schemes, Proceedings of the 18th annual international ACM SIGIR conference on Research and development in information retrieval, SIGIR '95, pp.180-188, 1995.
DOI : 10.1145/215206.215358

H. Lejsek, F. Asmundsson, B. Jónsson, and L. Amsaleg, NV-Tree, Proceedings of the 1st ACM International Conference on Multimedia Retrieval, ICMR '11, 2011.
DOI : 10.1145/1991996.1992050
URL : https://hal.archives-ouvertes.fr/hal-00644939

H. P. Luhn, The Automatic Creation of Literature Abstracts, IBM Journal of Research and Development, vol.2, issue.2, 1958.
DOI : 10.1147/rd.22.0159

K. Spärck Jones, A statistical interpretation of term specificity and its application in retrieval, Journal of Documentation, vol.28, issue.1, 1972.

K. Spärck Jones, S. Walker, and S. E. Robertson, A probabilistic model of information retrieval: development and comparative experiments, Information Processing and Management, vol.36, issue.6, 2000.

B. Stein, Principles of hash-based text retrieval, Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval, SIGIR '07, 2007.
DOI : 10.1145/1277741.1277832

S. Vempala, The Random Projection Method, DIMACS Series in Discrete Mathematics and Theoretical Computer Science, AMS, vol.65, 2004.
DOI : 10.1007/978-1-4615-0013-1_16