CA-Stream: Attention-based pooling for interpretable image recognition

Felipe Torres; Hanwei Zhang; Ronan Sicre; Stéphane Ayache; Yannis Avrithis

Conference Papers Year : 2024

Felipe Torres

Function : Author
PersonId : 1376426

éQuipe d'AppRentissage de MArseille

Hanwei Zhang

Function : Author
PersonId : 1376427

éQuipe d'AppRentissage de MArseille

Ronan Sicre

Function : Author
PersonId : 1067988

éQuipe d'AppRentissage de MArseille

Stéphane Ayache

Function : Author
PersonId : 16733
IdHAL : stephane-ayache
ORCID : 0000-0003-2982-7127
IdRef : 129313254

éQuipe d'AppRentissage de MArseille

Yannis Avrithis

Function : Author
PersonId : 20705
IdHAL : yannis-avrithis
ORCID : 0000-0001-7476-4482
IdRef : 253126193

Institute of Advanced Research in Artificial Intelligence [Vienna]

Abstract

Explanations obtained from transformer-based architectures in the form of raw attention can be seen as a class-agnostic saliency map. Additionally, attention-based pooling serves as a form of masking in the feature space. Motivated by this observation, we design an attention-based pooling mechanism intended to replace Global Average Pooling (GAP) at inference. This mechanism, called Cross-Attention Stream (CA-Stream), comprises a stream of cross-attention blocks interacting with features at different network depths. CA-Stream enhances interpretability in models while preserving recognition performance.
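
To make the contrast with GAP concrete, the following is a minimal sketch of single-query cross-attention pooling in plain Python. It is an illustration only, not the authors' implementation: the function names, the scaled dot-product scoring, the absence of learned projections, and the toy feature values are all assumptions. GAP weights every spatial position equally, whereas attention-based pooling weights each position by its similarity to a query vector, so the weights themselves can be read as a class-agnostic saliency map.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def global_average_pooling(features):
    """GAP: every spatial position contributes with weight 1/p."""
    p = len(features)
    d = len(features[0])
    return [sum(f[j] for f in features) / p for j in range(d)]


def cross_attention_pooling(query, features):
    """Hypothetical single-head pooling: one query attends over p positions.

    query:    list of d floats (stands in for a learned class-token vector)
    features: list of p feature vectors, each of length d
    Returns the pooled vector and the attention weights (one per position).
    """
    d = len(query)
    # Scaled dot-product similarity between the query and each position.
    scores = [
        sum(q * x for q, x in zip(query, f)) / math.sqrt(d) for f in features
    ]
    attn = softmax(scores)
    # Attention-weighted sum of the features, i.e. soft masking in feature space.
    pooled = [sum(a * f[j] for a, f in zip(attn, features)) for j in range(d)]
    return pooled, attn


# Toy feature map: 4 spatial positions, 2 channels (made-up values).
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [3.0, 0.0]]
query = [1.0, 0.0]

pooled, attn = cross_attention_pooling(query, features)
gap = global_average_pooling(features)
```

With an all-zero query every score is equal, the weights become uniform, and the pooled vector coincides with GAP; a non-zero query shifts weight toward the positions that align with it. A stream in the sense of the abstract would chain several such blocks over features taken at different depths, a step this sketch leaves out.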

Keywords

Explainable AI; XAI; interpretability; attention-based models; image classification

Domains

Computer Vision and Pattern Recognition [cs.CV]

Main file

WACV_ICCVround2-4.pdf (2.29 MB)

Origin : Files produced by the author(s)

Contributor : Ronan Sicre

https://hal.science/hal-04551613

Submitted on : Thursday, April 18, 2024, 4:20:42 PM

Last modification on : Saturday, April 20, 2024, 3:32:48 AM

Dates and versions

hal-04551613 , version 1 (18-04-2024)

Identifiers

HAL Id : hal-04551613 , version 1

Cite

Felipe Torres, Hanwei Zhang, Ronan Sicre, Stéphane Ayache, Yannis Avrithis. CA-Stream: Attention-based pooling for interpretable image recognition. XAI4CV workshop (CVPR), Jun 2024, Seattle, WA, United States. ⟨hal-04551613⟩

Collections

UNIV-TLN CNRS UNIV-AMU GENCI LIS-LAB AMIDEX ANR INCIAM
