MOCA-I: Discovering Rules and Guiding Decision Maker in the Context of Partial Classification in Large and Imbalanced Datasets

Abstract: This paper focuses on modeling and implementing, as a multi-objective optimization problem, a Pittsburgh classification rule mining algorithm adapted to large and imbalanced datasets such as those encountered in hospital data. We pair this algorithm with an original post-processing method based on the ROC curve to help the decision maker choose the most interesting rules. After introducing the problems raised by hospital data, such as class imbalance, data volume, and inconsistency, we present MOCA-I, a Pittsburgh model adapted to this kind of problem. We propose implementing it as a dominance-based local search, in contrast to existing multi-objective approaches based on genetic algorithms. We then introduce the post-processing method used to sort and filter the obtained classifiers. Our approach is compared with state-of-the-art classification rule mining algorithms and gives results that are as good or better while using fewer parameters. It is then compared with C4.5 and C4.5-CS on hospital data with a larger set of attributes, where it gives the best results.
Document type:
Conference paper
Learning and Intelligent OptimizatioN Conference (LION 7), Jan 2013, Catania, Italy. pp.37-51, 2013, Lecture Notes in Computer Science
Cited literature [21 references]

https://hal.archives-ouvertes.fr/hal-00806757
Contributor: Julie Jacques
Submitted on: Tuesday, 2 April 2013 - 11:57:13
Last modified on: Thursday, 21 February 2019 - 10:52:49
Document(s) archived on: Wednesday, 3 July 2013 - 04:06:11

File

2013-01-03_lion2013.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-00806757, version 1

Citation

Julie Jacques, Julien Taillard, David Delerue, Laetitia Jourdan, Clarisse Dhaenens. MOCA-I: Discovering Rules and Guiding Decision Maker in the Context of Partial Classification in Large and Imbalanced Datasets. Learning and Intelligent OptimizatioN Conference (LION 7), Jan 2013, Catania, Italy. pp.37-51, 2013, Lecture Notes in Computer Science. 〈hal-00806757〉

Metrics

Record views

401

File downloads

268