Coupling Knowledge-Based and Data-Driven Systems for Named Entity Recognition - HAL open archive
Conference paper, Year: 2012

Coupling Knowledge-Based and Data-Driven Systems for Named Entity Recognition

Abstract

Within Information Extraction tasks, Named Entity Recognition has received much attention over recent decades. From symbolic / knowledge-based to data-driven / machine-learning systems, many approaches have been explored. Our work may be viewed as an attempt to bridge the gap from the data-driven perspective back to the knowledge-based one. We use a knowledge-based system, based on manually implemented transducers, that achieves satisfactory performance. It has the indisputable advantage of being modular. However, such a hand-crafted system requires substantial effort to adapt to dedicated tasks. In this context, we implemented a pattern extractor that extracts symbolic knowledge, using hierarchical sequential pattern mining over annotated corpora. To assess the accuracy of mined patterns, we designed a module that recognizes Named Entities in texts by determining their most probable boundaries. Instead of considering Named Entity Recognition as a labeling task, it relies on complex context-aware features provided by lower-level systems and considers the tagging task as a Markovian process. Using these systems, coupling the knowledge-based system with extracted patterns is straightforward and leads to a competitive hybrid NE-tagger. We report experiments using this system and compare it to other hybridization strategies along with a baseline CRF model.

Main file
CouplingPatternTransducers.pdf (198.09 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-00788166 , version 1 (14-02-2013)
hal-00788166 , version 2 (07-09-2020)

Identifiers

  • HAL Id : hal-00788166 , version 2

Cite

Damien Nouvel, Jean-Yves Antoine, Nathalie Friburger, Arnaud Soulet. Coupling Knowledge-Based and Data-Driven Systems for Named Entity Recognition. Workshop W4 Hybrid'12 : Innovative Hybrid Approaches to Process Textual Data, Apr 2012, Avignon, France. pp.69-77. ⟨hal-00788166v2⟩
134 views
214 downloads
