Annotating a large corpus with anaphoric links

Abstract : This paper presents a one-million-word French corpus annotated with anaphoric links. The anaphoric expressions selected are mainly grammatical discourse phenomena for which a reliable annotation could be provided. The annotation scheme, defined in XML, encodes the orientation of the anaphoric relation by using a specific element to relate the anaphoric expression to its antecedent(s). A set of five semantic relations is used to type the anaphoric relation. As a rule, the linguistic expressions selected are phrases, but the annotation scheme uses specific elements to deal with descriptive anaphors, which occur in nominal ellipses and demonstrative anaphors. Special cases such as multiple antecedents, discontinuous elements, and ambiguity are discussed.

https://hal.archives-ouvertes.fr/hal-00373327
Contributor : François Trouilleux
Submitted on : Friday, April 3, 2009 - 9:00:38 PM
Last modification on : Tuesday, February 12, 2019 - 10:30:05 AM
Document(s) archived on : Thursday, June 10, 2010 - 6:09:23 PM

File

tutin_daarc2000.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-00373327, version 1

Citation

Agnès Tutin, François Trouilleux, Catherine Clouzot, Éric Gaussier, Annie Zaenen, et al. Annotating a large corpus with anaphoric links. Third International Conference on Discourse Anaphora and Anaphor Resolution (DAARC2000), 2000, United Kingdom. pp.2. ⟨hal-00373327⟩
