| HAL: inria-00617068, version 1 |
| TALN'2011 - Traitement Automatique des Langues Naturelles, Montpellier : France (2011) |
|
|
|
|
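| Coopération de méthodes statistiques et symboliques pour l'adaptation non-supervisée d'un système d'étiquetage en entités nommées (Cooperation of statistical and symbolic methods for the unsupervised adaptation of a named entity tagging system) |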
|
|
| Frédéric Béchet 1, Benoît Sagot 2 |
|
|
| (2011) |
|
|
| Named entity recognition and typing is performed by both symbolic and probabilistic systems. We report on an experiment in which the rule-based system NP, a high-precision system that was developed on AFP news corpora and relies on the Aleda named entity database, interacts with LIANE, a high-recall probabilistic system trained on oral transcriptions from the ESTER corpus. We show that a probabilistic system such as LIANE can be adapted to a new type of corpus in an unsupervised way thanks to large-scale corpora automatically annotated by NP. This adaptation does not require any additional manual annotation and illustrates the complementarity between numeric and symbolic techniques for tackling linguistic tasks. |
|
| 1: | Laboratoire d'informatique Fondamentale de Marseille (LIF) |
| CNRS : UMR6166 – Université de la Méditerranée - Aix-Marseille II – Université de Provence - Aix-Marseille I | |
| 2: | ALPAGE (INRIA Rocquencourt) |
| INRIA – Université Paris VII - Paris Diderot | |
| 3: | Medialab AFP (Medialab AFP) |
| Agence France-Presse | |
|
| Domain | : | Computer Science/Computation and Language |
|
|
| Keywords | : | Named entity recognition – domain adaptation – cooperation between probabilistic and symbolic approaches |
|
|
| http://hal.inria.fr/inria-00617068 | |
| oai:hal.inria.fr:inria-00617068 | |
| From: Benoît Sagot | |
| Submitted on: Thursday, 25 August 2011 22:40:32 | |
| Updated on: Monday, 5 September 2011 14:49:55 | |