The Impact of Word Representations on Sequential Neural MWE Identification - HAL open archive
Conference paper, Year: 2019

Abstract

Recent initiatives such as the PARSEME shared task have allowed the rapid development of MWE identification systems. Many of those are based on recent NLP advances, using neural sequence models that take continuous word representations as input. We study two related questions in neural verbal MWE identification: (a) the use of lemmas and/or surface forms as input features, and (b) the use of word-based or character-based embeddings to represent them. Our experiments on Basque, French, and Polish show that character-based representations yield systematically better results than word-based ones. In some cases, character-based representations of surface forms can be used as a proxy for lemmas, depending on the morphological complexity of the language.
Main file
W19-5121.pdf (205.58 KB)
Origin: Publisher files allowed on an open archive

Dates and versions

hal-02318287, version 1 (16-10-2019)

Cite

Nicolas Zampieri, Carlos Ramisch, Geraldine Damnati. The Impact of Word Representations on Sequential Neural MWE Identification. Joint Workshop on Multiword Expressions and WordNet (MWE-WN 2019), Aug 2019, Florence, Italy. pp. 169-175, ⟨10.18653/v1/W19-5121⟩. ⟨hal-02318287⟩