The Impact of Word Representations on Sequential Neural MWE Identification

Nicolas Zampieri; Carlos Ramisch; Geraldine Damnati

doi:10.18653/v1/W19-5121

Conference paper, year: 2019



Nicolas Zampieri

Role: Author

Traitement Automatique du Langage Ecrit et Parlé

Carlos Ramisch

Role: Author
PersonId: 5103
IdHAL: carlos-ramisch
ORCID: 0000-0001-7466-9039
IdRef: 170720802

Traitement Automatique du Langage Ecrit et Parlé

Geraldine Damnati

Role: Author

Orange Labs [Lannion]

Abstract

Recent initiatives such as the PARSEME shared task have allowed the rapid development of MWE identification systems. Many of those are based on recent NLP advances, using neural sequence models that take continuous word representations as input. We study two related questions in neural verbal MWE identification: (a) the use of lemmas and/or surface forms as input features, and (b) the use of word-based or character-based embeddings to represent them. Our experiments on Basque, French, and Polish show that character-based representations yield systematically better results than word-based ones. In some cases, character-based representations of surface forms can be used as a proxy for lemmas, depending on the morphological complexity of the language.

Domains

Computation and Language [cs.CL]

Main file

W19-5121.pdf (205.58 KB)

Origin: Publisher files allowed on an open archive

Contributor: Carlos Ramisch

https://hal.science/hal-02318287

Submitted on: Wednesday, October 16, 2019, 18:44:38

Last modified on: Friday, March 22, 2024, 18:24:04

Long-term archiving on: Friday, January 17, 2020, 17:33:56

Dates and versions

hal-02318287, version 1 (16-10-2019)

Identifiers

HAL Id: hal-02318287, version 1
DOI: 10.18653/v1/W19-5121

Cite

Nicolas Zampieri, Carlos Ramisch, Geraldine Damnati. The Impact of Word Representations on Sequential Neural MWE Identification. Joint Workshop on Multiword Expressions and WordNet (MWE-WN 2019), Aug 2019, Florence, Italy. pp. 169-175, ⟨10.18653/v1/W19-5121⟩. ⟨hal-02318287⟩

Collections

UNIV-TLN CNRS UNIV-AMU LIS-LAB ANR

71 views

190 downloads
