Conference papers

FreDist: Automatic construction of distributional thesauri for French

Abstract: In this article we present FreDist, a freely available software package for the automatic construction of distributional thesauri from text corpora, together with an evaluation of various distributional similarity metrics for French. Following the work of Lin (1998) and Curran (2004), we use a large corpus of journalistic text and implement different choices for the three components needed to build a distributional thesaurus: the type of lexical context relation, the weight function, and the measure function. Using the EuroWordNet and WOLF wordnet resources for French as gold-standard references for our evaluation, we obtain the novel result that combining bigram and syntactic dependency context relations yields higher-quality distributional thesauri. In addition, we hope that our software package and a joint release of our best thesauri for French will be useful to the NLP community.
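To make the three components named in the abstract concrete, here is a minimal Python sketch of a distributional thesaurus builder. It is not FreDist's code. The abstract does not say which weight and measure functions were tested, so the choices below are illustrative assumptions: adjacent-word (bigram) contexts as the context relation, positive pointwise mutual information as the weight function, and cosine as the measure function. The names `bigram_contexts` and `build_thesaurus` are hypothetical.

```python
import math
from collections import Counter, defaultdict


def bigram_contexts(tokens):
    """Yield (word, context) pairs; a context is an adjacent word tagged with its side."""
    for i, word in enumerate(tokens):
        if i > 0:
            yield word, ("L", tokens[i - 1])
        if i + 1 < len(tokens):
            yield word, ("R", tokens[i + 1])


def cosine(u, v):
    """Measure function (assumed): cosine similarity between two sparse vectors."""
    dot = sum(x * v[c] for c, x in u.items() if c in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def build_thesaurus(sentences, k=3):
    """Return {word: [(neighbor, score), ...]} holding the k most similar words."""
    # 1. Context relation: count how often each word occurs with each context.
    counts = defaultdict(Counter)
    for sentence in sentences:
        for word, context in bigram_contexts(sentence):
            counts[word][context] += 1

    total = sum(sum(cs.values()) for cs in counts.values())
    word_total = {w: sum(cs.values()) for w, cs in counts.items()}
    context_total = Counter()
    for cs in counts.values():
        context_total.update(cs)

    # 2. Weight function (assumed): positive pointwise mutual information.
    vectors = {}
    for word, cs in counts.items():
        vector = {}
        for context, n in cs.items():
            pmi = math.log((n * total) / (word_total[word] * context_total[context]))
            if pmi > 0:
                vector[context] = pmi
        vectors[word] = vector

    # 3. Rank every other word by the measure function and keep the top k.
    thesaurus = {}
    for word, vector in vectors.items():
        scored = [(cosine(vector, vectors[o]), o) for o in vectors if o != word]
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        thesaurus[word] = [(o, s) for s, o in scored[:k]]
    return thesaurus


# Toy corpus for illustration only (not data from the paper).
sentences = [
    ["le", "chat", "dort"],
    ["le", "chien", "dort"],
    ["la", "voiture", "roule"],
]
thesaurus = build_thesaurus(sentences)
print(thesaurus["chat"][0][0])
```

In this toy corpus "chat" and "chien" occur in identical contexts, so each is the other's nearest neighbor. Swapping in a different context extractor, such as one that yields syntactic dependency relations, would change only step 1, which is the kind of variation the abstract describes evaluating.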

Cited literature [6 references]
Contributor : Enrique Henestroza Anguiano
Submitted on : Tuesday, June 21, 2011 - 11:56:56 AM
Last modification on : Friday, March 27, 2020 - 2:55:27 AM
Document(s) archived on : Thursday, September 22, 2011 - 2:22:54 AM


Files produced by the author(s)


  • HAL Id : hal-00602004, version 1



Enrique Henestroza Anguiano, Pascal Denis. FreDist: Automatic construction of distributional thesauri for French. TALN - 18ème conférence sur le traitement automatique des langues naturelles, Jun 2011, Montpellier, France. pp. 119-124. ⟨hal-00602004⟩


