Hierarchical Multi-Label Classification Using Web Reasoning for Large Datasets

Abstract: Extracting valuable data from large volumes of data is one of the main challenges in Big Data. This paper presents a Hierarchical Multi-Label Classification (HMC) process called Semantic HMC. The process aims to extract valuable data from very large data sources by automatically learning a label hierarchy and classifying data items. The Semantic HMC process is composed of five scalable steps: Indexation, Vectorization, Hierarchization, Resolution and Realization. The first three steps automatically construct a label hierarchy from statistical analysis of the data. This paper focuses on the last two steps, which classify items according to the label hierarchy. The process is implemented as a scalable, distributed application and deployed on a Big Data platform. A quality evaluation compares the approach with state-of-the-art multi-label classification algorithms dedicated to the same goal. The Semantic HMC approach outperforms state-of-the-art approaches in some areas.
Document type: Journal article
Open Journal Of Semantic Web, Research Online Publishing (RonPub), 2016, 〈10.19210/1006.3.1.1〉

https://hal.archives-ouvertes.fr/hal-01356375
Contributor: Thomas Hassan
Submitted on: Thursday, 25 August 2016 - 15:43:56
Last modified on: Wednesday, 12 September 2018 - 01:27:46

Citation

Rafael Peixoto, Thomas Hassan, Christophe Cruz, Aurélie Bertaux, Nuno Silva. Hierarchical Multi-Label Classification Using Web Reasoning for Large Datasets. Open Journal Of Semantic Web, Research Online Publishing (RonPub), 2016, 〈10.19210/1006.3.1.1〉. 〈hal-01356375〉
