Journal article, Studi e saggi linguistici, Year: 2019

L.A.S.L.A. and Collatinus: a convergence in lexica

Abstract

L.A.S.L.A. began a project of lemmatisation and morphosyntactic tagging of Latin texts in 1961. The project is still running, with new texts lemmatised each year. The resulting files have recently been opened to interested scholars and now contain approximately 2,500,000 words, the lemmatisation of which has been checked by a philologist. In the early 2000s, Collatinus was developed by Yves Ouvrard for teaching. Its goal was to generate a complete lexical aid, with a short translation and the morphological analyses of the forms, for any text that could be given to students. Although these two projects look very different, they met a few years ago in the design of a new tool to speed up the lemmatisation of Latin texts at L.A.S.L.A. The tool lemmatises each word along two concurrent paths: it looks the form up among those already analysed in the L.A.S.L.A. files, and it analyses the form with Collatinus. This lemmatisation is followed by a disambiguation step using a second-order hidden Markov model, and the result is presented in a text editor for correction by the philologist.
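
The two-step pipeline outlined in the abstract (candidate analyses drawn from two sources, then second-order disambiguation) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: `ATTESTED`, `toy_analyser`, `candidates` and `disambiguate` are hypothetical names, the data are toy values, and none of it reflects the actual L.A.S.L.A. file format, the Collatinus API, or the authors' implementation. For brevity, emission probabilities are omitted, so the score relies only on tag trigrams.

```python
# Hypothetical toy data: forms already analysed in a corpus, as (lemma, tag) pairs.
ATTESTED = {
    "rosam": [("rosa", "NOUN")],
    "amat": [("amo", "VERB")],
}

def toy_analyser(form):
    """Stand-in for a morphological analyser; returns (lemma, tag) pairs."""
    table = {
        "amat": [("amo", "VERB")],
        "canis": [("canis", "NOUN"), ("cano", "VERB")],
    }
    return table.get(form, [])

def candidates(form):
    """Merge both sources: attested analyses first, then new ones, no duplicates."""
    merged = list(ATTESTED.get(form, []))
    for analysis in toy_analyser(form):
        if analysis not in merged:
            merged.append(analysis)
    # Unknown forms get a placeholder analysis so the search never dead-ends.
    return merged or [(form, "UNK")]

def disambiguate(forms, trigram_logp, default=-10.0):
    """Viterbi search over tag pairs.

    Each state is the pair of the last two tags, so every transition score
    depends on the two preceding tags (a second-order model). Unseen tag
    trigrams receive the `default` log-probability (an assumed penalty).
    """
    start = "<s>"
    # best[(previous_tag, current_tag)] = (score, chosen analyses so far)
    best = {(start, start): (0.0, [])}
    for form in forms:
        new_best = {}
        for (t1, t2), (score, path) in best.items():
            for lemma, tag in candidates(form):
                s = score + trigram_logp.get((t1, t2, tag), default)
                key = (t2, tag)
                if key not in new_best or s > new_best[key][0]:
                    new_best[key] = (s, path + [(form, lemma, tag)])
        best = new_best
    # Return the highest-scoring sequence of (form, lemma, tag) triples.
    return max(best.values(), key=lambda item: item[0])[1]
```

Calling `disambiguate(["canis", "rosam", "amat"], trigram_logp)` with a dictionary of tag-trigram log-probabilities returns one `(form, lemma, tag)` triple per word. In a workflow like the one described, such a proposal would still be handed to the philologist for correction.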
Main file
Article_LASLA_Collatinus_final.pdf (195.34 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-02399878, version 1 (09-12-2019)

Identifiers

  • HAL Id: hal-02399878, version 1

Cite

Philippe Verkerk, Yves Ouvrard, Margherita Fantoli, Dominique Longrée. L.A.S.L.A. and Collatinus: a convergence in lexica. Studi e saggi linguistici, In press. ⟨hal-02399878v1⟩
158 views
238 downloads
