| HAL : hal-00402321, version 1 |
| Traitement Automatique des Langues Naturelles 2009, Senlis, France (2009) |
|
|
|
|
| Vers une méthodologie d'annotation des entités nommées en corpus ? [Towards a methodology for annotating named entities in corpora?] |
|
|
| Karen Fort (1), Maud Ehrmann (2) |
|
|
| Collaboration(s): Quaero |
|
|
| (2009) |
|
|
| Named entity recognition is now considered a fundamental task, but annotating named entities raises specific difficulties. We list them here, with illustrations taken from manual annotation experiments in microbiology. These issues lead us to ask the fundamental question of what annotators should annotate and, even more importantly, for what purpose. We therefore identify the applications that use named entity recognition and, based on the real needs of those applications, propose a semantic definition of the elements to annotate. Finally, we put forward a number of methodological recommendations to ensure a coherent and reliable annotation scheme. |
|
| 1 : Institut de l'information scientifique et technique (INIST), CNRS : UPS76 |
| 2 : Xerox Research Centre Europe (XRCE), Xerox |
| 3 : Laboratoire d'informatique de Paris-nord (LIPN), CNRS : UMR7030 – Université Paris XIII - Paris Nord |
|
| INIST / LIPN; XRCE; LIPN / Université Paris 13 / CNRS |
|
|
|
|
| Domain: Computer Science / Text and Document Processing |
|
|
| annotation – named entities extraction |
|
|
| hal-00402321, version 1 | |
| http://hal.archives-ouvertes.fr/hal-00402321 | |
| oai:hal.archives-ouvertes.fr:hal-00402321 | |
| Contributor: Karën Fort | |
| Submitted on: Tuesday, 7 July 2009, 11:19:48 | |
| Last modified on: Tuesday, 7 July 2009, 13:42:49 | |