Conference paper, Year: 2010

Influence of Pre-annotation on POS-tagged Corpus Development

Abstract

This article details a series of carefully designed experiments that evaluate the influence of automatic pre-annotation on the manual part-of-speech annotation of a corpus, in terms of both quality and annotation time, with specific attention paid to biases. For this purpose, we manually annotated parts of the Penn Treebank corpus under various experimental setups, either from scratch or starting from various pre-annotations. These experiments confirm and detail the previously observed gain in quality, while showing that biases do appear and should be taken into account. Finally, they demonstrate that even a not-so-accurate tagger can help improve annotation speed.
Main file
lawiv_KFBS_preannot.pdf (258.67 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-00484294, version 1 (18-05-2010)

Identifiers

  • HAL Id: hal-00484294, version 1

Cite

Karen Fort, Benoît Sagot. Influence of Pre-annotation on POS-tagged Corpus Development. The Fourth ACL Linguistic Annotation Workshop, Jul 2010, Uppsala, Sweden. pp.56-63. ⟨hal-00484294⟩
359 views
319 downloads
