Annotating a large corpus with anaphoric links
Abstract
This paper presents a one-million-word French corpus annotated with anaphoric links. The anaphoric expressions selected are mainly grammatical discourse phenomena for which a reliable annotation could be provided. The annotation scheme, defined in XML, encodes the orientation of the anaphoric relation with a dedicated element that relates each anaphoric expression to its antecedent(s). A set of five semantic relations is used to type the anaphoric relation. As a rule, the linguistic expressions selected are phrases, but the annotation scheme uses specific elements to handle descriptive anaphors that occur in nominal ellipses and demonstrative anaphors. Special cases such as multiple antecedents, discontinuous elements, and ambiguity are discussed.
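To make the idea of an oriented, typed link concrete, here is a minimal sketch of how such an annotation might be read programmatically. The abstract does not give the actual markup, so every name below is an assumption made for illustration: the `exp` and `anaphor` elements, the `id`, `source`, `ante`, and `rel` attributes, and the `coreference` label (the abstract does not list the five relations). The `anaphor` element points from the anaphoric expression to one or more antecedents, which is one possible way to encode both orientation and multiple antecedents.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: element names, attribute names and the relation
# label are illustrative assumptions, not the corpus's actual scheme.
SAMPLE = """
<s>
  <exp id="e1">Marie</exp> est arrivée.
  <exp id="e2">Elle</exp> était fatiguée.
  <anaphor source="e2" ante="e1" rel="coreference"/>
</s>
"""


def read_links(xml_text):
    """Return one dict per link: anaphor text, antecedent texts, relation type."""
    root = ET.fromstring(xml_text)
    # Map each marked expression's id to its surface text.
    spans = {e.get("id"): (e.text or "").strip() for e in root.iter("exp")}
    links = []
    for link in root.iter("anaphor"):
        # Assumed convention: several antecedents are given as
        # whitespace-separated ids in the "ante" attribute.
        antecedent_ids = link.get("ante", "").split()
        links.append({
            "anaphor": spans[link.get("source")],
            "antecedents": [spans[i] for i in antecedent_ids],
            "relation": link.get("rel"),
        })
    return links


links = read_links(SAMPLE)
```

Keeping the link in its own element, separate from the marked expressions, is only one design option; it is used here because it lets a single anaphor reference several antecedents without nesting.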
Origin: Files produced by the author(s)