Conference papers

Towards a Variability Measure for Multiword Expressions

Caroline Pasquer (1), Agata Savary (1), Jean-Yves Antoine (1), Carlos Ramisch (2)
1 BDTLN - Bases de données et traitement des langues naturelles
LIFAT - Laboratoire d'Informatique Fondamentale et Appliquée de Tours
2 TALEP - Traitement Automatique du Langage Ecrit et Parlé
LIS - Laboratoire d'Informatique et Systèmes
Abstract : One of the outstanding properties of multi-word expressions (MWEs), especially verbal ones (VMWEs), important both in theoretical models and applications, is their idiosyncratic variability. Some MWEs are always continuous, while some others admit certain types of insertions. Components of some MWEs are rarely or never modified, while some others admit either specific or unrestricted modification. This unpredictable variability profile of MWEs hinders modeling and processing them as "words-with-spaces" on the one hand, and as regular syntactic structures on the other hand. Since variability of MWEs is a matter of scale rather than a binary property, we propose a 2-dimensional language-independent measure of variability dedicated to verbal MWEs based on syntactic and discontinuity-related clues. We assess its relevance with respect to a linguistic benchmark and its utility for the tasks of VMWE classification and variant identification on a French corpus.
Document type :
Conference papers

Contributor : Caroline Pasquer
Submitted on : Tuesday, May 29, 2018 - 10:31:12 AM
Last modification on : Friday, February 4, 2022 - 3:27:09 AM
Long-term archiving on : Thursday, August 30, 2018 - 1:21:30 PM


Files produced by the author(s)


  • HAL Id : hal-01802238, version 1


Caroline Pasquer, Agata Savary, Jean-Yves Antoine, Carlos Ramisch. Towards a Variability Measure for Multiword Expressions. Proceedings of the 16th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL 2018) - Short papers, Jun 2018, New Orleans, United States. ⟨hal-01802238⟩


