The PARSEME Shared Task on Automatic Identification of Verbal Multiword Expressions

Abstract: Multiword expressions (MWEs) are known as a "pain in the neck" for NLP due to their idiosyncratic behaviour. While some categories of MWEs have been addressed by many studies, verbal MWEs (VMWEs), such as to take a decision, to break one's heart or to turn off, have been rarely modelled. This is notably due to their syntactic variability, which hinders treating them as "words with spaces". We describe an initiative meant to bring about substantial progress in understanding, modelling and processing VMWEs. It is a joint effort, carried out within a European research network, to elaborate universal terminologies and annotation guidelines for 18 languages. Its main outcome is a multilingual 5-million-word annotated corpus which underlies a shared task on automatic identification of VMWEs. This paper presents the corpus annotation methodology and outcome, the shared task organisation and the results of the participating systems.
Document type:
Conference paper
MWE 2017 - Proceedings of the 13th Workshop on Multiword Expressions, Apr 2017, Valencia, Spain. pp. 31-47

Cited literature: 34 references

https://hal.archives-ouvertes.fr/hal-01504624
Contributor: Agata Savary
Submitted on: Monday, 10 April 2017 - 14:00:23
Last modified on: Tuesday, 9 October 2018 - 11:46:07

Files

W17-1704.pdf
Publisher files allowed on an open archive

Identifiers

  • HAL Id: hal-01504624, version 1

Citation

Agata Savary, Carlos Ramisch, Silvio Cordeiro, Federico Sangati, Veronika Vincze, et al. The PARSEME Shared Task on Automatic Identification of Verbal Multiword Expressions. MWE 2017 - Proceedings of the 13th Workshop on Multiword Expressions, Apr 2017, Valencia, Spain. pp. 31-47. 〈hal-01504624〉

Metrics

Record views

720

File downloads

274