Trees and after: The concept of text topology: Some applications to verb-form distributions in language corpora

Abstract: The model described here relies on the key concepts of topology, i.e. neighbourhood and equivalence of shape. A linguistic object L is studied in a text T by means of one or several local questions Q. The set of successive local answers is processed so as to provide a global function characterizing the textual space under scrutiny. We begin with short sequences of tenses to illustrate an original way of exploring Emile Benveniste's concepts of history and discourse. We then supply life-size examples of other objects selected for their heuristic value. We go on to demonstrate the model at work on the distribution of strings of finite (F) and non-finite (n) verbal forms in the LOB Corpus of English. A topological chart is produced as a synthetic image mirroring the locations of the relevant linguistic entities throughout the text. All the individual strings concatenating any number of F and n are classified in a table. Alternatively, individual full-text strings can be extracted. We then proceed to refine the notion of lexical distribution in "rafales" (bursts) in a lemmatized corpus of Latin texts, the purpose being to test the stability of the distributions of selected verbs in individual texts and to assess whether a verb's behaviour is related to its semantic status. The final section is devoted to other Latin texts. The use of segments of equal length makes it possible to draw up the narrative profile of each author as revealed by his handling of tenses in main clauses.
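The steps named in the abstract (local questions, a global function built from successive answers, a table of F/n strings, segments of equal length) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the toy data, the choice of a running total as the "global function", and the assumption that each sentence yields one F/n string are all hypothetical, introduced only for illustration.

```python
from collections import Counter

def local_answers(tags, question):
    """Apply a local question Q at each position; 1 if it holds, else 0."""
    return [1 if question(tag) else 0 for tag in tags]

def cumulative_profile(answers):
    """Assumed 'global function': running total of the successive answers."""
    total, profile = 0, []
    for answer in answers:
        total += answer
        profile.append(total)
    return profile

def segment_counts(answers, n_segments):
    """Count positive answers in n_segments segments of (nearly) equal length."""
    size = len(answers)
    bounds = [size * i // n_segments for i in range(n_segments + 1)]
    return [sum(answers[bounds[i]:bounds[i + 1]]) for i in range(n_segments)]

def string_table(sentences):
    """Tabulate the F/n strings, assuming one string per sentence."""
    return Counter("".join(sentence) for sentence in sentences)

# Hypothetical toy data: each inner list is the verb-form sequence of a sentence.
sentences = [["F", "n"], ["F"], ["F", "n", "n"], ["F", "n"]]
tags = [tag for sentence in sentences for tag in sentence]

answers = local_answers(tags, lambda tag: tag == "F")  # Q: "is this form finite?"
profile = cumulative_profile(answers)
counts = segment_counts(answers, 2)
table = string_table(sentences)

print(profile)
print(counts)
print(table.most_common())
```

Plotting `profile` against position would give one possible picture of where finite forms cluster in the text, and comparing `counts` across texts is one way to read the "segments of equal length" idea.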
Document type:
Journal article
Literary and Linguistic Computing, Oxford University Press (OUP), 2007, 22 (2), pp.167-186

https://hal.archives-ouvertes.fr/hal-00555349
Contributor: Sylvie Mellet
Submitted on: Wednesday, 19 January 2011 - 18:22:25
Last modified on: Saturday, 19 November 2016 - 01:11:14
Document(s) archived on: Wednesday, 20 April 2011 - 02:28:11

File

Trees_and_After_auteurs.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-00555349, version 1

Citation

Xuan Luong, Michel Juillard, Sylvie Mellet, Dominique Longrée. Trees and after: The concept of text topology: Some applications to verb-form distributions in language corpora. Literary and Linguistic Computing, Oxford University Press (OUP), 2007, 22 (2), pp.167-186. 〈hal-00555349〉
