Conference paper, Year: 2011

Multiword Expression Identification with Tree Substitution Grammars: A Parsing tour de force with French

Abstract

Multiword expressions (MWE), a known nuisance for both linguistics and NLP, blur the lines between syntax and semantics. Previous work on MWE identification has relied primarily on surface statistics, which perform poorly for longer MWEs and cannot model discontinuous expressions. To address these problems, we show that even the simplest parsing models can effectively identify MWEs of arbitrary length, and that Tree Substitution Grammars achieve the best results. Our experiments show a 36.4% F1 absolute improvement for French over an n-gram surface statistics baseline, currently the predominant method for MWE identification. Our models are useful for several NLP tasks in which MWE pre-grouping has improved accuracy.
Main file
green demarneffe bauer manning.pdf (183.61 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-01111383, version 1 (05-02-2015)

Identifiers

  • HAL Id: hal-01111383, version 1

Cite

Spence Green, Marie-Catherine de Marneffe, John Bauer, Christopher D. Manning. Multiword Expression Identification with Tree Substitution Grammars: A Parsing tour de force with French. Conference on Empirical Methods in Natural Language Processing, Jul 2011, Edinburgh, United Kingdom. ⟨hal-01111383⟩
