Conference paper. Year: 2009

Syntactico-semantic classification of verbs: a case study

Aude Grezka

Abstract

This article presents the methodology and the linguistic model used to identify and describe verbal structures in French. The last part discusses two main NLP applications of the resulting dictionary. The present work follows a lexicalist theoretical framework, the model of the "classes of objects" (Gross 1994; Gross 1995; Le Pesant & Mathieu-Colas 1998), in the tradition of the transformational grammar of Z. S. Harris (1976, 1988) and of work on the lexicon-grammar. Lexical items are always interpreted according to their syntactic and semantic characteristics. The "classes of objects" model supports the construction of electronic dictionaries intended for systems that operate on linguistic data. These dictionaries aim at exhaustive coverage of French, among other languages. The goal is to describe the lexicon by means of explicit and reproducible syntactico-semantic properties, so that it can be handled by computerized procedures. Structuring the lexicon into semantic subsets is an old practice, meant to give natural and legible access to the nomenclature. We are currently trying to establish semantic classes supported by linguistic criteria, using a method that is both inductive and internal to each language. Entries are represented by uses. Splitting entries into uses is a preliminary step; we notice, however, in research practice that grouping can in turn act on the splitting, particularly by refining it: the two approaches are interdependent and support one another. As far as verbs are concerned, several recent works set out to draw up semantic and syntactic classes: among others, Levin (1993) for English; Dubois & Dubois-Charlier (1997) and François, Le Pesant & Leeman (eds., 2007) for French. From our point of view, verbal uses represent elementary sentences, distinguished by their argument patterns.
These uses are grouped together in semantic classes, giving a structuring of the lexicon that is close to intuition and therefore facilitates access to lexical information: for example, creation verbs, speech verbs, perception verbs, motion verbs, verbs of hitting, etc. We need to look for the optimal level of semantic and syntactic homogeneity. Reaching uses by means of classes, rather than directly through lists of uses or through graphic units (word lists mask polysemy), improves the legibility of the lexicon. Several large classes were defined beforehand, such as elementary states and modalities; cognition; language and communication; etc. Each of these groups is itself subdivided into finer categories (for example, for language and communication: speech, writing / reading, non-verbal signs, etc.). This typology was conceived as a tool, a temporary starting point. It must be revised, corrected and completed as the description progresses. The final classes cannot be a priori classes; they will emerge from the lexicon. The classification is thus not ontological but linguistic. Verbs are first grouped in the same class according to their meaning: verb senses must be sufficiently close to be grouped together. Such an approach, although based on all kinds of lexicographic works, is largely intuitive and requires rigorous criteria to validate the constitution of a class; here, those criteria are linguistic properties. A methodology that applies to all classes has to be elaborated, drawing on all the linguistic properties (Grezka 2006; Grezka & Martin-Berthet eds. 2007): semantic (class meaning, particular meanings), structural (basic patterns, restructurings), morphological (nouns and associated verbs), distributional (appropriate adverbials).
The various descriptors make it possible both to separate uses and to gather them into semantically homogeneous sets, without neglecting particular features (singular constructions, subtle shades of meaning). The classes thus conceived list frozen structures separately (passer à tabac, casser la figure). Verbal phrases are generally given the same type of description as simple verbs, with few exceptions. Grouping by classes facilitates interaction with the other morphological categories (nouns and adjectives) in order to obtain transverse classes of predicates. In this way, we distinguish morphologically associated nouns and adjectives from semantically associated nouns and adjectives. Descriptions based on this method can be further developed into tools intended for Natural Language Processing systems. At present, about five hundred verb uses belonging to the hyperclass of opinion verbs are being integrated into a linguistic analysis tool called TextBox, developed in our laboratory. This integration process, based on syntactic and semantic patterns, makes it possible to automatically recognize the appropriate meanings in real texts, thus validating (or not) the linguistic descriptions that the lexicographers put forward. Another NLP application that could arise from this work concerns machine translation, since adequate recognition of syntactic and semantic patterns and of their associated meanings will eventually make it possible to automatically translate the relevant structures into a given target language. The present talk, having introduced the methodological aspects developed above, will propose a case study and, in a third part, will set out the applications of this work: in particular, the computerization of a linguistic description by means of a semantic analyzer for the automatic recognition of verbal structures.
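
As an illustration of the idea that entries are represented by uses, each distinguished by its argument pattern and grouped into a semantic class, the following minimal Python sketch may help. Everything in it is an assumption made for illustration: the `VerbUse` structure, the class labels, the toy entries for "prendre" and "dire", and the `recognize` function are hypothetical and are not taken from the dictionary or from TextBox.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class VerbUse:
    """One verb use: an elementary sentence pattern with a single meaning."""
    lemma: str
    subject_class: str   # class of the subject argument
    object_class: str    # class of the object argument
    semantic_class: str  # semantic class the use is grouped under


# Hypothetical toy entries; not taken from the dictionary described above.
LEXICON = [
    VerbUse("prendre", "human", "means_of_transport", "motion"),
    VerbUse("prendre", "human", "medicine", "ingestion"),
    VerbUse("dire", "human", "utterance", "speech"),
]

# Hypothetical mapping from nouns to their object class.
OBJECT_CLASS_OF = {
    "train": "means_of_transport",
    "bus": "means_of_transport",
    "sirop": "medicine",
    "mot": "utterance",
}


def group_by_semantic_class(uses):
    """Gather uses into semantic classes, giving class-based access to the lexicon."""
    grouped = defaultdict(list)
    for use in uses:
        grouped[use.semantic_class].append(use)
    return dict(grouped)


def recognize(lemma, subject_class, object_noun, lexicon=LEXICON):
    """Return the uses of `lemma` whose argument pattern matches the observed arguments."""
    object_class = OBJECT_CLASS_OF.get(object_noun)  # None if the noun is unknown
    return [
        use
        for use in lexicon
        if use.lemma == lemma
        and use.subject_class == subject_class
        and use.object_class == object_class
    ]
```

In this sketch, `recognize("prendre", "human", "train")` and `recognize("prendre", "human", "sirop")` select different uses of the same graphic unit, which is the sense in which a plain word list would mask polysemy while a use-based description does not.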

Domains

Linguistics
File not deposited

Dates and versions

hal-00685855, version 1 (06-04-2012)

Identifiers

  • HAL Id: hal-00685855, version 1

Cite

Aude Grezka. Syntactico-semantic classification of verbs: a case study. Verb Typologies revisited: A Cross-linguistic Reflection on Verbs and Verb Classes, Feb 2009, Ghent, Belgium. ⟨hal-00685855⟩