Conference paper. Year: 2013

Detecting salient events in large corpora by a combination of NLP and data mining techniques (poster)

Abstract

We present a framework and a system that extract "salient" events relevant to a query from a large collection of documents and place those events along a timeline. Each event is represented by a sentence extracted from the collection. The method combines linguistic modeling (of the meanings of temporal adverbials), symbolic natural language processing (cascades of morpho-lexical transducers), and data mining (namely, sequential pattern mining under constraints). Our experiments indicate that the method is useful for this task. The system was applied to a corpus of French newswires provided by Agence France-Presse (AFP). Evaluation was performed in partnership with French newswire agency journalists.
Main file
ACTI-BATTISTELLI-2013-1.pdf (5.73 MB)
Origin: Files produced by the author(s)

Dates and versions

hal-01023926, version 1 (15-07-2014)

Identifiers

  • HAL Id: hal-01023926, version 1

Cite

Delphine Battistelli, Thierry Charnois, Jean-Luc Minel, Charles Teissedre. Detecting salient events in large corpora by a combination of NLP and data mining techniques (poster). Supplementary Proceedings of the 14th International Conference on Intelligent Text Processing and Computational Linguistics (CICLing 2013), Mar 2013, Samos, Greece. ⟨hal-01023926⟩
