Journal article in Proceedings of the Society for Computation in Linguistics, 2020

What do you mean, BERT? Assessing BERT as a Distributional Semantics Model

Abstract

Contextualized word embeddings, i.e. vector representations for words in context, are naturally seen as an extension of previous non-contextual distributional semantic models. In this work, we focus on BERT, a deep neural network that produces contextualized embeddings and has set the state of the art in several semantic tasks, and study the semantic coherence of its embedding space. While showing a tendency towards coherence, BERT does not fully live up to the natural expectations for a semantic vector space. In particular, we find that the position of the sentence in which a word occurs, while having no meaning correlates, leaves a noticeable trace on the word embeddings and disturbs similarity relationships.
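As an illustration of the setup the abstract describes, the following is a minimal sketch, assuming the Hugging Face transformers library and the bert-base-uncased checkpoint, of how contextualized embeddings can be extracted and compared with cosine similarity. The example sentences and the use of the last hidden layer are choices made for this sketch, not details taken from the paper.

    import torch
    from transformers import AutoModel, AutoTokenizer

    # Illustrative only: extract a contextual embedding for "bank" in two
    # different sentences and compare the vectors. Model choice and layer
    # choice are assumptions for this example, not the authors' protocol.
    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModel.from_pretrained("bert-base-uncased")
    model.eval()

    def word_embedding(sentence: str, word: str) -> torch.Tensor:
        """Return the last-layer embedding of the first subtoken of `word`."""
        inputs = tokenizer(sentence, return_tensors="pt")
        with torch.no_grad():
            hidden = model(**inputs).last_hidden_state[0]  # (seq_len, 768)
        tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
        idx = tokens.index(tokenizer.tokenize(word)[0])
        return hidden[idx]

    a = word_embedding("She sat by the bank of the river.", "bank")
    b = word_embedding("He deposited cash at the bank.", "bank")
    print(f"cosine similarity: {torch.cosine_similarity(a, b, dim=0).item():.3f}")

In a coherent semantic space, such similarities should track meaning (here, the two senses of "bank" should be relatively distant); the paper probes whether confounds such as sentence position also shift these vectors.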
Main file: 1911.05758.pdf (399.83 KB)

Dates and versions

hal-02484933, version 1 (25-02-2020)

Identifiers

Cite

Timothee Mickus, Mathieu Constant, Denis Paperno, Kees van Deemter. What do you mean, BERT? Assessing BERT as a Distributional Semantics Model. Proceedings of the Society for Computation in Linguistics, 2020, 3, ⟨10.7275/t778-ja71⟩. ⟨hal-02484933⟩