
Ajuster l'analyse distributionnelle à un corpus spécialisé de petite taille (Adapting distributional analysis to a small specialized corpus)

Abstract : Applying distributional semantic models to medium-size specialized corpora is an important objective for the extraction of lexical and terminological resources. In this context, we seek to optimize the distributional analysis procedure on a 2-million-word corpus of NLP conference proceedings. Our expertise in this field allows us to establish a relevant benchmark for the task, which provides an ideal experimental setup for observing the distributional mechanisms at work. We test several hundred configurations, with parameters ranging from syntactic analysis to similarity measures. The study highlights how widely the results vary, particularly with the part of speech (POS) of the target words, and identifies the best-performing configurations by varying the number, nature and type of the contexts considered.
Document type :
Conference papers

Cited literature [8 references]
Contributor : Franck Sajous
Submitted on : Friday, July 11, 2014 - 9:19:21 AM
Last modification on : Friday, September 18, 2020 - 2:34:36 PM
Long-term archiving on : Saturday, October 11, 2014 - 10:46:05 AM


Files produced by the author(s)


  • HAL Id : hal-01022171, version 1


Cécile Fabre, Nabil Hathout, Franck Sajous, Ludovic Tanguy. Ajuster l'analyse distributionnelle à un corpus spécialisé de petite taille. 21e Conférence sur le Traitement Automatique des Langues Naturelles (TALN 2014), Jun 2014, Marseille, France. pp.266-279. ⟨hal-01022171⟩


