Unsupervised Compositionality Prediction of Nominal Compounds

Abstract: Nominal compounds such as red wine and nut case display a continuum of compositionality, with varying contributions from the components of the compound to its semantics. This article proposes a framework for compound compositionality prediction using distributional semantic models, evaluating to what extent they capture idiomaticity compared to human judgments. For evaluation, we introduce data sets containing human judgments in three languages: English, French, and Portuguese. The results reveal a high agreement between model predictions and human judgments, suggesting that the models are able to incorporate information about idiomaticity. We also present an in-depth evaluation of various factors that can affect prediction, such as model and corpus parameters and compositionality operations. General crosslingual analyses reveal the impact of morphological variation and corpus size on the ability of the model to predict compositionality, and show that a uniform combination of the components gives the best results.
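
The kind of prediction the abstract describes can be illustrated with a minimal sketch: score a compound by the cosine similarity between its own distributional vector and a uniform combination of its components' vectors. This is only an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the three-dimensional vectors are invented toy values rather than corpus-derived embeddings, and the uniform combination is taken here to be the sum of the length-normalized component vectors.

```python
import math


def normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def compositionality_score(compound_vec, first_vec, second_vec):
    """Hypothetical helper: similarity between the compound's vector and
    a uniform (equal-weight) sum of its normalized component vectors."""
    first = normalize(first_vec)
    second = normalize(second_vec)
    combined = [a + b for a, b in zip(first, second)]
    return cosine(compound_vec, combined)


# Toy vectors, invented for illustration only (assumption, not real data).
vectors = {
    "red_wine": [0.9, 0.8, 0.1],
    "red": [1.0, 0.2, 0.0],
    "wine": [0.3, 1.0, 0.1],
    "nut_case": [0.0, 0.1, 1.0],
    "nut": [1.0, 0.1, 0.1],
    "case": [0.2, 1.0, 0.2],
}

red_wine_score = compositionality_score(
    vectors["red_wine"], vectors["red"], vectors["wine"]
)
nut_case_score = compositionality_score(
    vectors["nut_case"], vectors["nut"], vectors["case"]
)

print(f"red wine: {red_wine_score:.3f}")
print(f"nut case: {nut_case_score:.3f}")
```

With these toy values, the compound whose vector lies close to the combination of its parts (red wine) scores higher than the idiomatic one (nut case). In an evaluation like the one the abstract outlines, such scores would be compared against human compositionality judgments.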
Document type: Journal articles

https://hal.archives-ouvertes.fr/hal-02318196
Contributor: Carlos Ramisch
Submitted on: Wednesday, October 16, 2019 - 4:58:39 PM
Last modification on: Tuesday, October 22, 2019 - 2:38:05 PM

File

coli_a_00341.pdf
Publisher files allowed on an open archive

Citation

Silvio Cordeiro, Aline Villavicencio, Marco Idiart, Carlos Ramisch. Unsupervised Compositionality Prediction of Nominal Compounds. Computational Linguistics, Massachusetts Institute of Technology Press (MIT Press), 2019, 45 (1), pp.1-57. ⟨10.1162/coli_a_00341⟩. ⟨hal-02318196⟩