
Plongements Interprétables pour la Détection de Biais Cachés (Interpretable Embeddings for the Detection of Hidden Biases)

Abstract : Many current semantic NLP tasks rely on semi-automatically collected data, which are often prone to unwanted artifacts that may negatively affect the models trained on them. With the recent shift towards more complex and less interpretable pre-trained general-purpose models, these biases may cause undesirable correlations to be integrated into end-user applications. Recently, a few methods have been proposed to train word embeddings with better interpretability. We propose a simple setup that exploits these representations to preemptively detect easy-to-learn lexical correlations in various datasets. We evaluate a few popular interpretable embedding models for English for this purpose, using both an intrinsic evaluation and a large set of downstream semantic tasks, and we use the embeddings’ interpretability to diagnose potential biases in the associated datasets.
Document type :
Conference papers
Contributor : Yannick Parmentier
Submitted on : Wednesday, June 23, 2021 - 11:43:39 PM
Last modification on : Monday, July 5, 2021 - 10:02:02 AM


Publisher files allowed on an open archive


Distributed under a Creative Commons Attribution 4.0 International License


  • HAL Id : hal-03265888, version 1


Tom Bourgeade, Philippe Muller, Tim van de Cruys. Plongements Interprétables pour la Détection de Biais Cachés. Traitement Automatique des Langues Naturelles (TALN 2021), 2021, Lille, France. pp.64-80. ⟨hal-03265888⟩


