Abstract: Understanding a text, whether by a human being or by a computer, requires that units of meaning be identified in the text and that rules for composing these units, together with the meanings that correspond to them, provide the complete meaning of the text. We limit ourselves to the lexical and grammatical procedures that lead to the recognition of the word patterns on which the process of understanding is based.
Three groups of sentences (ordinary, frozen and those with support verbs) can be clearly distinguished in European languages.
The systematic description of French verbs (simple sentences) has shown that no two verbs have the same set of syntactic properties; as a consequence, verbs have to be described individually and not in terms of intensional classes.
The proportion in the lexicon of idiomatic, metaphoric and technical sentences that have non-compositional meanings is very high. All these sentences or sentence types have anecdotal origins. Consequently, they must be described individually, that is, without reference to other classes of lexical combinations or to interpretation rules.
The large number of verb-complement combinations that cannot be characterized in terms of semantic (i.e. selectional) restrictions leads to the notion of support verb. The variety of these combinations also implies detailed individual descriptions of nouns.
This method of construction of equivalence classes for elementary sentences could be applied today to a whole language, leading to a coverage of structures so complete that computer analysis of syntactic forms would become possible for texts.
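As a purely illustrative sketch of the table-based description summarized above, each lexical entry can be represented as a row of yes/no syntactic properties, and entries with identical rows can be grouped into one equivalence class. The entry names (`verb_a`, ...), the property columns (`P1`, ...) and the function names are hypothetical placeholders, not data or notation from the study; the values are invented only to show the mechanics.

```python
from collections import defaultdict

# Hypothetical property columns; real tables would use attested syntactic properties.
PROPERTIES = ("P1", "P2", "P3")

# Hypothetical toy table: one row of boolean properties per entry (invented values).
TABLE = {
    "verb_a": (True, False, True),
    "verb_b": (True, True, True),
    "verb_c": (False, True, False),
    "verb_d": (True, False, True),
}


def equivalence_classes(table):
    """Group entries whose property rows are identical."""
    classes = defaultdict(list)
    for entry, row in table.items():
        classes[row].append(entry)
    # Sort members and classes so the result is deterministic.
    return sorted(sorted(members) for members in classes.values())


def all_entries_distinct(table):
    """True when no two entries share the same set of properties."""
    return all(len(members) == 1 for members in equivalence_classes(table))


def add_property(table, values):
    """Return a new table with one extra property column appended to each row."""
    return {entry: row + (values[entry],) for entry, row in table.items()}
```

With three properties, `verb_a` and `verb_d` fall into the same class in this toy table. Appending a fourth, equally hypothetical column on which they differ separates them, which mirrors the observation that a sufficiently detailed description leaves each verb with its own row.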