Evaluating self-attention interpretability through human-grounded experimental protocol
Conference paper, 2023

Evaluating self-attention interpretability through human-grounded experimental protocol

Abstract

Attention mechanisms have played a crucial role in the development of complex architectures such as Transformers in natural language processing. However, Transformers remain hard to interpret and are considered black boxes. In this paper we assess how attention coefficients from Transformers, when properly aggregated, help provide classifier interpretability. A fast and easy-to-implement way of aggregating attention is proposed to build local feature importance. A human-grounded experiment is conducted to evaluate this approach and compare it to other usual interpretability methods. The experimental protocol relies on the capacity of an interpretability method to provide explanations in line with human reasoning. The experiment design includes measuring reaction times and correct response rates of human subjects. Attention performs comparably to usual interpretability methods and significantly better than a random baseline with respect to average participant reaction time and accuracy. Moreover, data analysis highlights that predictions made with high probability yield more relevant explanations. This work shows how self-attention can be aggregated and used to explain Transformer classifiers. The low computational cost of attention compared to other interpretability methods and its availability by design within Transformer classifiers make it particularly beneficial. Finally, the quality of its explanations depends strongly on the certainty of the associated classifier prediction.
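To illustrate the kind of attention aggregation described in the abstract, here is a minimal sketch, not the authors' exact procedure, assuming a Hugging Face BERT-style sequence classifier: attention from the [CLS] token is averaged over layers and heads to yield a per-token importance score.

```python
# Minimal sketch of aggregating self-attention into local feature importance.
# Assumptions (not from the paper): a BERT-style classifier via Hugging Face
# transformers, and uniform averaging over layers and heads.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = "bert-base-uncased"  # hypothetical backbone choice
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(
    model_name, output_attentions=True
)
model.eval()

def attention_importance(text: str):
    """Return (token, score) pairs from aggregated self-attention."""
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    # outputs.attentions: tuple of per-layer tensors, each (batch, heads, seq, seq)
    attn = torch.stack(outputs.attentions)    # (layers, batch, heads, seq, seq)
    attn = attn.mean(dim=(0, 2)).squeeze(0)   # average layers and heads -> (seq, seq)
    cls_row = attn[0]                         # attention from [CLS] to every token
    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
    return list(zip(tokens, cls_row.tolist()))

# Usage example: tokens with the highest aggregated attention scores.
scores = attention_importance("The movie was surprisingly good.")
print(sorted(scores, key=lambda kv: -kv[1])[:5])
```

Because the attention weights are produced during the forward pass anyway, this kind of explanation comes at essentially no extra computational cost, unlike perturbation- or gradient-based methods.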
Main file: Evaluating_self_attention_interpretability_through_human_grounded_experimental_protocol___Springer_xAI.pdf (1.07 MB)
Origin: Files produced by the author(s)

Dates and versions

hal-04311790, version 1 (28-11-2023)

Identifiers

Cite

Milan Bhan, Nina Achache, Victor Legrand, Annabelle Blangero, Nicolas Chesneau. Evaluating self-attention interpretability through human-grounded experimental protocol. First World Conference on Explainable Artificial Intelligence (xAI), Jul 2023, Lisbon, Portugal. pp. 26-46, ⟨10.1007/978-3-031-44070-0_2⟩. ⟨hal-04311790⟩