Added-Value of Automatic Multilingual Text Analysis for Epidemic Surveillance
Abstract
The early detection of disease outbreaks is an important objective of epidemic surveillance. Web news is one of the information sources used to detect epidemic events as soon as possible, but analyzing the tens of thousands of articles published daily is costly. Recently, automatic systems have been developed for epidemiological surveillance. The main issue for these systems is to process more languages at a limited cost. However, existing systems mainly process major languages (English, French, Russian, Spanish, etc.). Thus, when the first news report of a disease is in a minor language, the event is detected later. In this paper, we test an automatic style-based method designed to fill the gaps of existing automatic systems. It is parsimonious in resources and specifically designed for multilingual settings. The events detected by the human-moderated ProMED mail between November 2011 and January 2012 are used as a reference dataset and compared to events detected in 17 languages by the system DAnIEL2 from web articles of the same time window. We show how being able to process press articles in less widely spoken languages allows quicker detection of epidemic events in some regions of the world.
Origin: Files produced by the author(s)