This work has been supported by the European Union Horizon 2020 research and innovation programme under grants 825153 (Embeddia) and 770299.

Bibliographical References

E. Aramaki, S. Maskawa, and M. Morita, Twitter catches the flu: detecting influenza epidemics using Twitter, Proceedings of the conference on empirical methods in natural language processing, pp.1568-1576, 2011.

T. M. Bernardo, A. Rajic, I. Young, K. Robiadek, M. T. Pham et al., Scoping review on search queries and social media for disease surveillance: a chronology of innovation, Journal of medical Internet research, vol.15, issue.7, p.147, 2013.

T. Bodnar and M. Salathé, Validating models for disease detection using Twitter, Proceedings of the 22nd International Conference on World Wide Web, pp.699-702, 2013.

J. S. Brownstein, C. C. Freifeld, B. Y. Reis, and K. D. Mandl, Surveillance sans frontières: Internet-based emerging infectious disease intelligence and the HealthMap project, PLoS medicine, vol.5, issue.7, p.151, 2008.

R. P. Bunker and F. Thabtah, A machine learning framework for sport result prediction, Applied computing and informatics, 2017.

L. E. Charles-Smith, T. L. Reynolds, M. A. Cameron, M. Conway, E. H. Lau et al., Using social media for actionable disease surveillance and outbreak management: a systematic literature review, PLoS ONE, vol.10, issue.10, p.139701, 2015.

R. Chunara, C. C. Freifeld, and J. S. Brownstein, New technologies for reporting real-time emergent infections, Parasitology, vol.139, issue.14, pp.1843-1851, 2012.

N. Collier, S. Doan, A. Kawazoe, R. M. Goodwin, M. Conway et al., BioCaster: detecting public health rumors with a web-based text mining system, Bioinformatics, vol.24, issue.24, pp.2940-2941, 2008.

N. Collier, N. T. Son, and N. M. Nguyen, Omg u got flu? analysis of shared health messages for biosurveillance, Journal of biomedical semantics, vol.2, issue.5, p.9, 2011.

N. Collier, Towards cross-lingual alerting for bursty epidemic events, Journal of Biomedical Semantics, vol.2, issue.5, p.10, 2011.

C. P. Cooper, K. P. Mallon, S. Leadbetter, L. A. Pollack, and L. A. Peipins, Cancer Internet search activity on a major search engine, United States, Journal of medical Internet research, vol.7, issue.3, p.36, 2001.

A. Culotta, Detecting influenza outbreaks by analyzing Twitter messages, 2010.

E. Diaz-Aviles, A. Stewart, E. Velasco, K. Denecke, and W. Nejdl, Epidemic intelligence for the crowd, by the crowd, Sixth International AAAI Conference on Weblogs and Social Media, 2012.

M. Du, P. von Etter, M. Kopotev, M. Novikov, N. Tarbeeva et al., Building support tools for Russian-language information extraction, International Conference on Text, Speech and Dialogue, pp.380-387, 2011.

J. Ginsberg, M. H. Mohebbi, R. S. Patel, L. Brammer, M. S. Smolinski et al., Detecting influenza epidemics using search engine query data, Nature, vol.457, issue.7232, p.1012, 2009.

F. Hogenboom, F. Frasincar, U. Kaymak, and F. de Jong, An overview of event extraction from text, DeRiVE@ISWC, pp.48-57, 2011.

A. G. Huff, N. Breit, T. Allen, K. Whiting et al., Evaluation and verification of the global rapid identification of threats system for infectious diseases in textual data sources, Interdisciplinary perspectives on infectious diseases, 2016.

A. Lamb, M. J. Paul, and M. Dredze, Separating fact from fear: Tracking flu infections on Twitter, Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp.789-795, 2013.

C. Le, P. Prasad, A. Alsadoon, L. Pham, and A. Elchouemi, Text classification: Naïve Bayes classifier with sentiment lexicon, IAENG International Journal of Computer Science, vol.46, issue.2, pp.141-148, 2019.

G. Lejeune, R. Brixtel, A. Doucet, and N. Lucas, Multilingual event extraction for epidemic detection, Artificial intelligence in medicine, vol.65, pp.131-143, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01294127

J. Li and C. Cardie, Early stage influenza detection from Twitter, 2013.

M. L. McHugh, Interrater reliability: the kappa statistic, Biochemia medica, vol.22, issue.3, pp.276-282, 2012.

R. Misra, News category dataset, p.6, 2018.

J. O'Shea, Digital disease detection: A systematic review of event-based internet biosurveillance systems, International journal of medical informatics, vol.101, pp.15-22, 2017.

M. J. Paul, A. Sarker, J. S. Brownstein, A. Nikfarjam, M. Scotch et al., Social media mining for public health monitoring and surveillance, Biocomputing 2016: Proceedings of the Pacific symposium, pp.468-479, 2016.

P. M. Polgreen, Y. Chen, D. M. Pennock, F. D. Nelson, and R. A. Weinstein, Using internet searches for influenza surveillance, Clinical infectious diseases, vol.47, issue.11, pp.1443-1448, 2008.

J. Pomikálek, Removing boilerplate and duplicate content from web corpora, Masarykova univerzita, Fakulta informatiky, 2011.

P. J. Rousseeuw, Silhouettes: a graphical aid to the interpretation and validation of cluster analysis, Journal of computational and applied mathematics, vol.20, pp.53-65, 1987.

A. Sadilek, H. Kautz, and V. Silenzio, Predicting disease transmission from geo-tagged micro-blog data, Twenty-Sixth AAAI Conference on Artificial Intelligence, 2012.

M. Salathé, C. C. Freifeld, S. R. Mekaru, A. F. Tomasulo, and J. S. Brownstein, Influenza A (H7N9) and the importance of digital epidemiology, The New England journal of medicine, vol.369, issue.5, p.401, 2013.

T. Vogels, O. Ganea, and C. Eickhoff, Web2Text: Deep structured boilerplate removal, European Conference on Information Retrieval, pp.167-179, 2018.

World Health Organization, Early detection, assessment and response to acute public health events: implementation of early warning and response with a focus on event-based surveillance: interim version, 2014.

L. Zhan and X. Jiang, Survey on event extraction technology in information extraction research area, 2019 IEEE 3rd Information Technology, Networking, Electronic and Automation Control Conference (IT-NEC), pp.2121-2126, 2019.

X. Zhou, J. Ye, and Y. Feng, Tuberculosis surveillance by analyzing Google Trends, IEEE transactions on biomedical engineering, vol.58, issue.8, pp.2247-2254, 2011.