Annotating a large corpus with anaphoric links

Abstract : This paper presents a one-million-word French corpus annotated with anaphoric links. The anaphoric expressions selected are mainly grammatical discourse phenomena for which reliable annotation could be provided. The annotation scheme, defined in XML, encodes the orientation of the anaphoric relation by using a specific element to relate the anaphoric expression to its antecedent(s). A set of five semantic relations is used to type the anaphoric relation. As a rule, the linguistic expressions selected are phrases, but the annotation scheme uses specific elements to deal with descriptive anaphors which occur in nominal ellipses and demonstrative anaphors. Special cases such as multiple antecedents, discontinuous elements, and ambiguity are discussed.
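
As a rough illustration of the kind of scheme the abstract describes, the sketch below reads a small XML fragment in which a dedicated element inside the anaphoric expression points back to its antecedent(s) and carries a relation type. The element and attribute names (`exp`, `ptr`, `id`, `src`, `type`), the relation label `coreference`, and the sample sentence are all hypothetical placeholders chosen for this example; the abstract does not give the actual names or the five relations, so this should not be read as the paper's scheme.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: the names exp, ptr, id, src and type, and the
# relation label "coreference", are illustrative assumptions only.
SAMPLE = """
<text>
  <exp id="e1">Marie</exp> et <exp id="e2">Paul</exp> arrivent.
  <exp id="e3"><ptr type="coreference" src="e1 e2"/>Ils</exp> sourient.
</text>
""".strip()


def anaphoric_links(xml_text):
    """Return (anaphor_id, relation, [antecedent_ids]) for each pointer."""
    root = ET.fromstring(xml_text)
    links = []
    for exp in root.iter("exp"):
        # The pointer sits inside the anaphor, so the link is oriented
        # from the anaphoric expression towards its antecedent(s).
        for ptr in exp.findall("ptr"):
            antecedents = ptr.get("src", "").split()
            links.append((exp.get("id"), ptr.get("type"), antecedents))
    return links


links = anaphoric_links(SAMPLE)
```

Storing the antecedent identifiers as a space-separated list is one simple way to accommodate multiple antecedents; discontinuous elements and ambiguity, which the abstract also mentions, are not modelled here.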

Cited literature [6 references]
Contributor : François Trouilleux
Submitted on : Friday, April 3, 2009 - 9:00:38 PM
Last modification on : Tuesday, February 12, 2019 - 10:30:05 AM
Document(s) archived on : Thursday, June 10, 2010 - 6:09:23 PM


Files produced by the author(s)


  • HAL Id : hal-00373327, version 1



Agnès Tutin, François Trouilleux, Catherine Clouzot, Éric Gaussier, Annie Zaenen, et al. Annotating a large corpus with anaphoric links. Third International Conference on Discourse Anaphora and Anaphor Resolution (DAARC2000), 2000, United Kingdom. pp.2. ⟨hal-00373327⟩


