Modeling and predicting user behavior in sponsored search, Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining, KDD '09, pp.1067-1076, 2009. ,
DOI : 10.1145/1557019.1557135
Observer et évaluer les usages de Gallica Réflexion épistémologique et stratégique URL https, 2014. ,
Latent Dirichlet allocation, Journal of Machine Learning Research, vol.3, pp.993-1022, 2003. ,
Enriching word vectors with subword information. arXiv preprint, 2016. ,
Multimodel inference: understanding AIC and BIC in model selection. Sociological Methods & Research, pp.261-304, 2004. ,
Improving Europeana Search Experience Using Query Logs, International Conference on Theory and Practice of Digital Libraries, pp.384-395, 2011. ,
DOI : 10.1002/(SICI)1097-4571(199708)48:8<741::AID-ASI7>3.0.CO;2-S
Introduction to artificial intelligence. Pearson Education India, 1985. ,
Data Preparation for Mining World Wide Web Browsing Patterns, Knowledge and Information Systems, vol.27, issue.6, pp.5-32, 1999. ,
DOI : 10.1002/9780470316801
Creating meaningful data from web logs for improving the impressiveness of a website by using path analysis method, Expert Systems with Applications, vol.36, issue.3, pp.6635-6644, 2009. ,
DOI : 10.1016/j.eswa.2008.08.067
Web user session reconstruction using integer programming, Proceedings of the ACM International Conference on Web Intelligence and Intelligent Agent Technology, pp.385-388, 2008. ,
Effects of session representation models on the performance of web recommender systems, Proceedings of the 2007 IEEE 23rd International Conference on Data Engineering Workshop, ICDEW '07, pp.931-936, 2007. ,
Maximum likelihood from incomplete data via the EM algorithm, Journal of the Royal Statistical Society. Series B (Methodological), pp.1-38, 1977. ,
Web data extraction, applications and techniques : a survey. Knowledge-based systems, pp.301-323, 2014. ,
DOI : 10.1016/j.knosys.2014.07.007
URL : http://arxiv.org/pdf/1207.0246
Evaluation de l'usage et de la satisfaction de la bibliothèque numérique Gallica et perspectives d'évolution, 2012. ,
Guide d'interopérabilité OAI-PMH pour un référencement des documents numériques dans Gallica, 2015. ,
word2vec explained : deriving Mikolov et al.'s negative-sampling word-embedding method. arXiv preprint arXiv:1402, 2014. ,
How a session is defined in Analytics - Analytics Help, 2016. URL https ,
A web page prediction model based on click-stream tree representation of user behavior, Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining, pp.535-540, 2003. ,
Parameter estimation for text analysis, 2004. ,
Data clustering : a review, ACM computing surveys (CSUR), vol.31, issue.3, pp.264-323, 1999. ,
Implicit feedback for inferring user preference, ACM SIGIR Forum, vol.37, issue.2, pp.18-28, 2003. ,
DOI : 10.1145/959258.959260
Towards electronic persistence using ARK identifiers, ARK motivation and overview, 2003. ,
Distributed representations of sentences and documents, ICML, pp.1188-1196, 2014. ,
Distributed representations of words and phrases and their compositionality, Advances in neural information processing systems, pp.3111-3119, 2013. ,
Expectation-propagation for the generative aspect model, Proceedings of the Eighteenth conference on Uncertainty in artificial intelligence, pp.352-359, 2002. ,
Effective personalization based on association rule discovery from web usage data, Proceeding of the third international workshop on Web information and data management , WIDM '01, pp.9-15, 2001. ,
DOI : 10.1145/502932.502935
Discovery and evaluation of aggregate usage profiles for web personalization, Data Mining and Knowledge Discovery, vol.6, issue.1, pp.61-82, 2002. ,
DOI : 10.1023/A:1013232803866
An Introduction to the Search/Retrieve URL Service (SRU), 2004. ,
An analysis of user behavior in online video streaming, Proceedings of the International Workshop on Very-large-scale Multimedia Corpus, Mining and Retrieval, VLS-MCMR '10, pp.49-54, 2010. ,
Readings in speech recognition. chapter A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition, pp.267-296, 1990. ,
Characterizing typical and atypical user sessions in clickstreams, Proceeding of the 17th international conference on World Wide Web , WWW '08, pp.885-894, 2008. ,
DOI : 10.1145/1367497.1367617
Botnets : A survey, Comput. Netw, vol.57, issue.2, pp.378-403, 2013. ,
Probabilistic topic models. Handbook of latent semantic analysis, pp.424-440 ,
Discovery of Web Robot Sessions Based on Their Navigational Patterns, Data Mining and Knowledge Discovery, vol.6, issue.1, pp.9-35, 2002. ,
DOI : 10.1007/978-3-662-07952-2_9
1 contains the largest number of users. It is characterized by sessions that generally begin with document consultation and include short search phases. The following clusters (figures C.2 ,