Conference paper. Year: 2014

From GLÀFF to PsychoGLÀFF: a large psycholinguistics-oriented French lexical resource

Abstract

In this paper, we present two French lexical resources, GLÀFF and PsychoGLÀFF. The former, automatically extracted from the collaborative online dictionary Wiktionary, is a large-scale, versatile lexicon suited to Natural Language Processing applications and linguistic studies. The latter, based on GLÀFF, is a lexicon designed specifically for psycholinguistic research. With more than 1.4 million entries, GLÀFF is of unprecedented size. It provides lemmas, main syntactic categories, inflectional features and phonemic transcriptions. PsychoGLÀFF adds information on formal aspects of the lexicon and its distribution. It contains about 340,000 corpus-attested entries (120,000 lemmas). We explain how the resources were created and compare them to other known resources in terms of coverage and quality. For PsychoGLÀFF, the comparison shows an exceptionally large repertoire at comparable quality.

Domains

Linguistics
Main file
CalderoneEtAl2014-Euralex.pdf (1.19 MB)
Origin: Files produced by the author(s)

Dates and versions

hal-01121487, version 1 (01-03-2015)

Identifiers

  • HAL Id: hal-01121487, version 1

Cite

Basilio Calderone, Nabil Hathout, Franck Sajous. From GLÀFF to PsychoGLÀFF: a large psycholinguistics-oriented French lexical resource. Euralex, Jul 2014, Bolzano, Italy. pp.431-446. ⟨hal-01121487⟩
197 views
39 downloads
