Training Set Class Distribution Analysis for Deep Learning Model - Application to Cancer Detection

Deep learning models, specifically convolutional neural networks (CNNs), have been used successfully in many tasks, including medical image classification. The effectiveness of a CNN depends on the availability of a large training data set, which is generally costly to obtain for new applications or new cases. However, there is little concrete guidance on how to build such a training set. In this research, we analyze the impact of different class distributions in the training data on a CNN model. We consider the task of cancer detection from histopathological images for cancer diagnosis and derive some useful hypotheses about the distribution of classes in the training data. We found that using all the training data leads to the best recall-precision trade-off, whereas by training with a reduced number of examples from some classes, it is possible to steer the model toward a desired accuracy on a given class.
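
The class-biased training described in the abstract can be illustrated with a minimal sketch that keeps only a chosen fraction of the examples of some classes before training. This is not the authors' code: the function names `subsample_by_class` and `class_distribution`, the labels `cancer` and `non_cancer`, and the example counts are all hypothetical, chosen only to show how a training set with a modified class distribution might be drawn.

```python
import random
from collections import defaultdict


def subsample_by_class(labels, keep_fraction, seed=0):
    """Return sorted indices of a training subset with per-class keep fractions.

    labels: sequence of class labels, one per training example.
    keep_fraction: dict mapping a class label to the fraction (0 to 1) of its
        examples to keep; classes not listed are kept in full.
    """
    rng = random.Random(seed)

    # Group example indices by their class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    kept = []
    for label, indices in by_class.items():
        fraction = keep_fraction.get(label, 1.0)
        if not 0.0 <= fraction <= 1.0:
            raise ValueError(f"keep fraction for {label!r} must be in [0, 1]")
        # Draw the reduced set for this class uniformly at random.
        n_keep = round(fraction * len(indices))
        kept.extend(rng.sample(indices, n_keep))
    return sorted(kept)


def class_distribution(labels, indices):
    """Count how many of the selected examples belong to each class."""
    counts = defaultdict(int)
    for idx in indices:
        counts[labels[idx]] += 1
    return dict(counts)


if __name__ == "__main__":
    # Hypothetical labels: 100 positive and 300 negative examples.
    labels = ["cancer"] * 100 + ["non_cancer"] * 300
    # Keep every "cancer" example but only half of the "non_cancer" ones.
    subset = subsample_by_class(labels, {"non_cancer": 0.5})
    print(class_distribution(labels, subset))
```

Passing an empty `keep_fraction` keeps all the training data, the setting the abstract reports as giving the best recall-precision trade-off. Lowering the fraction for selected classes yields the reduced, class-biased sets.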

Keywords

Medical information retrieval; Image segmentation and classification; Deep learning; Class-biased training

Domains

Artificial Intelligence [cs.AI]; Machine Learning [cs.LG]

Main file

reshma_26163.pdf (407.71 KB)

Origin: Files produced by the author(s)

Contributor: Open Archive Toulouse Archive Ouverte (OATAO)

https://hal.science/hal-02891748

Submitted on: Tuesday, July 7, 2020, 09:55:59

Last modified on: Thursday, December 14, 2023, 13:48:02

Long-term archiving on: Friday, November 27, 2020, 12:22:02

Dates and versions

hal-02891748, version 1 (07-07-2020)

Identifiers

HAL Id: hal-02891748, version 1
OATAO : 26163

Cite

Ismat Ara Reshma, Margot Gaspard, Camille Franchet, Pierre Brousset, Emmanuel Faure, et al. Training Set Class Distribution Analysis for Deep Learning Model - Application to Cancer Detection. 1st International Conference on Advances in Signal Processing and Artificial Intelligence (ASPAI 2019), Mar 2019, Barcelona, Spain. pp.123-127. ⟨hal-02891748⟩

Collections

INSERM UNIV-TLSE2 CNRS SMS UT1-CAPITOLE IRIT IRIT-SIG IRIT-REVA IRIT-GD IRIT-CISO IRIT-UT2J TOULOUSE-INP UNIV-UT3 UT3-TOULOUSEINP