Lexicon-Based Algorithms for the Automatic Analysis of Natural Language

Abstract : Let us examine the following discourse (D) from the point of view of elementary grammatical analysis:

(D) Two men cleaned the offices, then, they waited for the janitor

This discourse is composed of two members: two simple sentences connected by the conjunction 'then'. One of the elements needed for the interpretation of (D) lies in the nature of the antecedent of the pronoun 'they'. In principle, the pronoun 'they' refers to the noun phrase 'two men' of the first member, but it might indicate a group of persons different from these two men, if (D) is attached to an appropriate context or background. Whether the scene which constitutes the interpretation of (D) includes 3 persons (2 men and 1 janitor) or more depends entirely on the analysis of 'they'.

Such questions of pronoun resolution are trivial for a native speaker of English, but they become of paramount importance when one attempts a computer analysis of texts, and also when a reader who does not know the language of the discourse (D) well tries to understand it. In both situations, interpreting (D) requires detailed dictionaries and grammars that account for the relations between the terms of (D). In this article, we simulate the computation of the search for the antecedent(s) of 'they'. We simplify this procedure by omitting its clerical aspects. In this way, we throw into relief the nature and the amount of information that must be stored in a lexicon-grammar; as we will see, we do not draw the usual line of demarcation between these two components of a language, the lexicon and the grammar.
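
To give a concrete sense of the kind of lexical information such a search relies on, here is a minimal sketch in Python. It is not the procedure of the article: the toy lexicon, its feature names (`number`, `human`, `subject_human`) and the two filtering steps are all assumptions made for illustration. The sketch collects the noun phrases that precede 'they' in (D), keeps those that agree in number, and then applies a hypothetical constraint stored in the entry of the verb, namely that 'waited' takes a human subject.

```python
# Toy lexicon: every entry and feature is an illustrative assumption,
# not data taken from the article.
LEXICON = {
    "two": {"cat": "DET"},
    "the": {"cat": "DET"},
    "men": {"cat": "N", "number": "plural", "human": True},
    "offices": {"cat": "N", "number": "plural", "human": False},
    "janitor": {"cat": "N", "number": "singular", "human": True},
    "they": {"cat": "PRO", "number": "plural"},
    # Hypothetical verb entry: its subject must denote humans.
    "waited": {"cat": "V", "subject_human": True},
}


def noun_phrases(tokens):
    """Yield (start, phrase, noun_entry) for each DET + N pair in tokens."""
    for i in range(len(tokens) - 1):
        det = LEXICON.get(tokens[i], {})
        noun = LEXICON.get(tokens[i + 1], {})
        if det.get("cat") == "DET" and noun.get("cat") == "N":
            yield i, f"{tokens[i]} {tokens[i + 1]}", noun


def agreeing_candidates(tokens, pronoun="they"):
    """Noun phrases located before the pronoun that agree with it in number."""
    pos = tokens.index(pronoun)
    number = LEXICON[pronoun]["number"]
    return [
        (phrase, noun)
        for start, phrase, noun in noun_phrases(tokens)
        if start < pos and noun["number"] == number
    ]


def resolve(tokens, pronoun="they"):
    """Filter the agreeing candidates with the constraint of the next verb."""
    pos = tokens.index(pronoun)
    next_word = tokens[pos + 1] if pos + 1 < len(tokens) else None
    needs_human = LEXICON.get(next_word, {}).get("subject_human", False)
    return [
        phrase
        for phrase, noun in agreeing_candidates(tokens, pronoun)
        if not needs_human or noun["human"]
    ]


# Discourse (D), lower-cased and stripped of punctuation.
DISCOURSE = "two men cleaned the offices then they waited for the janitor".split()

candidates = [phrase for phrase, _ in agreeing_candidates(DISCOURSE)]
resolved = resolve(DISCOURSE)
print(candidates)
print(resolved)
```

Number agreement alone leaves both 'two men' and 'the offices' as candidates; only the information attached to the verb separates them. The sketch deliberately ignores the case raised above, in which 'they' refers to a group supplied by a wider context rather than by (D) itself.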

Contributor : Eric Laporte
Submitted on : Sunday, May 11, 2008 - 3:05:53 PM
Last modification on : Friday, January 4, 2019 - 5:33:24 PM
Long-term archiving on : Friday, May 28, 2010 - 6:56:56 PM


  • HAL Id : halshs-00278305, version 1



Maurice Gross. Lexicon-Based Algorithms for the Automatic Analysis of Natural Language. Theorie und Praxis des Lexikons. Proceeding of the Conference on Theories and Applications of Lexicology and Lexicography, Walter de Gruyter, pp.218-236, 1993. ⟨halshs-00278305⟩


