Filtering news for epidemic surveillance: towards processing more languages with fewer resources - HAL open archive
Conference paper, Year: 2010

Filtering news for epidemic surveillance: towards processing more languages with fewer resources

Abstract

Processing content for security purposes is becoming increasingly important, since any local danger can have global consequences. Being able to collect and analyse information in different languages is a major challenge. This paper addresses multilingual solutions for the analysis of press articles for epidemiological surveillance. The system described here relies on pragmatics and stylistics, abandoning the "bag of sentences" approach in favour of discourse repetition patterns. It needs only light resources (compared to existing systems), so that new languages can be processed easily. We present results in English, French and Chinese, three languages with quite different characteristics. These results show that simple rules allow the selection of relevant documents in a specialized database, improving the reliability of information extraction.
Main file
ACTI-LEJEUNE-2010-2.pdf (5.26 MB)
Origin: Files produced by the author(s)

Dates and versions

hal-01067156, version 1 (23-09-2014)

Identifiers

  • HAL Id: hal-01067156, version 1

Cite

Gaël Lejeune, Antoine Doucet, Roman Yangarber, Nadine Lucas. Filtering news for epidemic surveillance: towards processing more languages with fewer resources. 4th International Workshop on Cross-Lingual Information Access (CLIA 2010), Aug 2010, Beijing, China. 8 p. ⟨hal-01067156⟩
92 views
20 downloads
